Code Generation Cost: Per-Function, Per-File, Per-PR
Generating one function costs about a cent. Generating an entire file costs $0.05-$0.40 depending on the model. A multi-file PR runs anywhere from under $0.10 on a small model to $8 on Claude Opus without caching. That's the cost ladder for AI-assisted coding.
Detailed Explanation
The Three Tiers of Code Generation Cost
Per-function generation (~50 lines of code)
Typical context: 200-line file as context, generate one function (50 lines).
- Input: ~3,000 tokens (file + system prompt) on GPT-4o
- Output: ~500 tokens (the function)
- Cost: 3K × $2.50/1M + 500 × $10/1M = $0.0075 + $0.005 = $0.0125
At 100 functions per day per developer, that's $1.25/day — well under the cost of a coffee. For a 50-engineer team, $62.50/day or $1,875/month.
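The per-function arithmetic above reduces to a one-line formula. A minimal sketch in Python, with GPT-4o's $2.50/$10.00 per-million-token prices hard-coded as assumptions:

```python
def generation_cost(input_tokens, output_tokens,
                    in_price_per_m, out_price_per_m):
    """Dollar cost of one LLM call, given per-million-token prices."""
    return (input_tokens * in_price_per_m
            + output_tokens * out_price_per_m) / 1_000_000

# Per-function tier: 200-line file as context, ~50-line function generated.
per_function = generation_cost(3_000, 500, 2.50, 10.00)
print(f"${per_function:.4f} per function")       # → $0.0125 per function
print(f"${per_function * 100:.2f} per dev-day")  # 100 functions → $1.25
```

Scaling to a team is just multiplication: 50 engineers × $1.25/day × 30 days ≈ $1,875/month.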
Per-file generation (~300 lines)
Generating a complete file from a description, with related files as context:
- Input: ~10,000 tokens (3-5 related files + spec) on Claude Opus 4.7
- Output: ~3,000 tokens (the new file)
- Cost: 10K × $15/1M + 3K × $75/1M = $0.375
A three-file refactor, with the context growing as edited files accumulate: ~$1.50. A complex feature touching 10 files: ~$5.
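The same formula covers the per-file tier; only the token counts and prices change. A sketch assuming Claude Opus's $15/$75 per-million prices:

```python
def call_cost(in_tok, out_tok, in_price_per_m, out_price_per_m):
    """Dollar cost of one LLM call, prices quoted per million tokens."""
    return (in_tok * in_price_per_m + out_tok * out_price_per_m) / 1e6

# Per-file tier: 3-5 related files + spec in, one ~300-line file out.
one_file = call_cost(10_000, 3_000, 15, 75)
print(f"${one_file:.3f} per file")  # → $0.375 per file
```

Multi-file estimates run higher than a simple multiple of this number because the context grows as each edited file joins the prompt.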
Per-PR generation (multi-file, with tests)
This is where serious agent products live. A PR typically involves:
- Reading 10-30 files for context: ~50,000 tokens of input
- Multiple LLM turns to plan, edit, test, and refine: ~5-15 turns
- Generated code + tests + commit message: ~10,000 tokens output
Without caching: 50K × 10 turns × $15/1M + 10K × $75/1M = $8.25 per PR (Claude Opus).
With caching (assume an 80% cache hit on the file context, cached reads billed at 0.1×): 400K × $1.50/1M + 100K × $15/1M + 10K × $75/1M = **$2.85** per PR.
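A sketch of the per-PR arithmetic, treating the Opus prices, the 0.1× cached-read multiplier, and the 80% hit rate as assumptions:

```python
def pr_cost(context_tok, turns, out_tok, in_price_per_m, out_price_per_m,
            cache_hit_rate=0.0, cached_read_mult=0.1):
    """Multi-turn agent cost; cache hits are billed at a reduced read rate."""
    total_in = context_tok * turns
    fresh = total_in * (1 - cache_hit_rate) * in_price_per_m
    cached = total_in * cache_hit_rate * in_price_per_m * cached_read_mult
    out = out_tok * out_price_per_m
    return (fresh + cached + out) / 1e6

print(f"${pr_cost(50_000, 10, 10_000, 15, 75):.2f}")                      # → $8.25
print(f"${pr_cost(50_000, 10, 10_000, 15, 75, cache_hit_rate=0.8):.2f}")  # → $2.85
```

This sketch ignores cache-write surcharges (Anthropic bills cache writes at a premium over normal input), so real bills land slightly higher.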
Why Claude Opus dominates here
Coding tasks are the workload where Opus's quality premium most reliably justifies the price gap vs GPT-4o. Fewer regressions, fewer compile errors, and better adherence to codebase conventions translate into less human cleanup.
For high-volume background tasks (auto-formatting, lint fixes, simple refactors), drop to GPT-4o mini or Claude Haiku 4.5 and the per-PR cost drops to under $0.10.
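Running the same PR workload through a small model shows why background tasks get routed there. A back-of-envelope sketch assuming GPT-4o mini's $0.15/$0.60 per-million prices:

```python
# Same PR shape: ~500K input tokens across turns, ~10K output tokens,
# priced at GPT-4o mini's assumed $0.15 / $0.60 per million tokens.
in_cost = 500_000 * 0.15 / 1e6   # $0.075
out_cost = 10_000 * 0.60 / 1e6   # $0.006
print(f"${in_cost + out_cost:.3f} per PR")  # → $0.081 per PR
```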
Caching is non-negotiable
Code agents read the same files multiple times during a task. Without prompt caching, the bill on a 10-turn refactor is genuinely painful. Anthropic's prompt caching, which bills cached reads at 0.1× the normal input price, is the difference between roughly $8 and $3 per PR.
Use Case
Apply when sizing the budget for an internal coding-agent rollout, when deciding which model to use for which class of code task, or when justifying the cost of a developer-AI subscription.
Related Topics
- Agent Loops: Why a 'Simple' Task Costs 50K Tokens (Caching & long context)
- Claude Prompt Caching: 80% Bill Reduction in One Setting (Caching & long context)
- Long-Context Costs: What 128K Tokens Actually Cost Per Call (Caching & long context)
- Monthly Budget Estimation: Build a 30-Day Forecast in 5 Minutes (Operational)
- Cost Optimization Strategies: 10 Techniques to Cut Your LLM Bill (Operational)