How to Calculate AI API Costs (2026 Guide)

Calculating AI API cost is simple in theory and tricky in practice. This guide walks through the formula, a worked example, and the real-world factors — retries, tool calls, caching, system prompts — that separate a toy estimate from a number you can budget against.

Updated 2026-06-19

The core formula

Most LLM APIs bill for two things: input tokens (your prompt) and output tokens (the model's reply), each at a price quoted per million tokens. With token counts measured per request, the base monthly cost is:

monthly = requests × (input_tokens × input_price + output_tokens × output_price) ÷ 1,000,000

That's it for a naive estimate. The art is in getting the inputs right and adding what production actually costs.
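As a minimal sketch, the formula above translates directly into a small Python function. The function and parameter names are illustrative, not part of any provider's SDK; prices are assumed to be in dollars per one million tokens and token counts are per request.

```python
def base_monthly_cost(requests: int,
                      input_tokens: int,
                      output_tokens: int,
                      input_price_per_m: float,
                      output_price_per_m: float) -> float:
    """Naive monthly cost in dollars.

    Token counts are per request; prices are dollars per 1,000,000 tokens.
    """
    per_request = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m)
    return requests * per_request / 1_000_000
```

Dividing once at the end mirrors the formula and avoids scattering the per-million conversion across every term.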

Step 1 — Estimate tokens, not words

One token is roughly ¾ of an English word (~4 characters). Count the tokens in your typical prompt and your typical response. Don't forget the system prompt: it's sent on every request, so a 1,000-token system prompt across a million requests is a billion input tokens. Use the token counter to measure real text.
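For a quick first pass before measuring with a real tokenizer, the 4-characters-per-token rule of thumb can be sketched as below. This is only a heuristic: actual counts depend on the model's tokenizer and on the language of the text, and both helper names are hypothetical.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (heuristic, not a tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def monthly_system_prompt_tokens(system_prompt_tokens: int, requests: int) -> int:
    """Input tokens spent on the system prompt alone, since it is sent on every request."""
    return system_prompt_tokens * requests
```

With a 1,000-token system prompt and one million requests, the second helper returns one billion tokens, matching the figure above.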

Step 2 — Pick the right price

Prices can vary by as much as 100× across models. A frontier model can be 50× the price of a small model that may handle your task just as well. Always check the current pricing, because rates change often.

Step 3 — Add the multipliers production introduces

  • Retries: failed or invalid calls are re-run. A 10% retry rate adds roughly 10% to the bill.
  • Tool calls: each tool call is an extra round-trip that re-sends context.
  • Embeddings: RAG apps pay to embed queries and documents.
  • Human review: regulated or high-risk output needs a person to check it.
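The multipliers above can be layered onto the base cost roughly as follows. This is a sketch under simplifying assumptions, flagged in the comments: each tool call is treated as re-sending the full input context once, and a retried call is assumed to cost the same as the original and not be retried again. Embedding and human-review costs are passed in as flat monthly dollar amounts. All names are illustrative.

```python
def with_production_overhead(input_cost: float,
                             output_cost: float,
                             retry_rate: float = 0.0,
                             tool_calls_per_request: float = 0.0,
                             embedding_cost: float = 0.0,
                             review_cost: float = 0.0) -> float:
    """Monthly cost in dollars after adding production overheads."""
    # Assumption: every tool call re-sends the whole input context one more time.
    model_cost = input_cost * (1 + tool_calls_per_request) + output_cost
    # Assumption: a retry costs as much as the original call and succeeds.
    model_cost *= 1 + retry_rate
    # Embeddings and human review are billed separately from the model calls.
    return model_cost + embedding_cost + review_cost
```

For example, a $45 input and $60 output base with a 10% retry rate and no other overhead comes to $115.50, which is the 10% uplift described in the first bullet.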

Step 4 — Subtract the discounts

Prompt caching slashes the cost of repeated context; the Batch API takes ~50% off asynchronous jobs. Model both in the API cost calculator.
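A rough way to model both discounts is sketched below. The cache discount rate, the share of input tokens that hit the cache, and whether the two discounts can be combined all vary by provider, so every default here is an assumption to be replaced with figures from the provider's pricing page. The function name is hypothetical.

```python
def apply_discounts(input_cost: float,
                    output_cost: float,
                    cached_fraction: float = 0.0,
                    cache_discount: float = 0.5,
                    batch_fraction: float = 0.0,
                    batch_discount: float = 0.5) -> float:
    """Monthly cost in dollars after caching and batch discounts.

    cached_fraction: share of input tokens served from the prompt cache.
    cache_discount:  price reduction on cached input tokens (assumed value).
    batch_fraction:  share of traffic sent through the batch endpoint.
    batch_discount:  price reduction on batched traffic (~50% per the text).
    """
    # Caching only reduces the input side of the bill.
    input_after_cache = input_cost * (1 - cached_fraction * cache_discount)
    total = input_after_cache + output_cost
    # Assumption: the batch discount applies on top of the cached price.
    return total * (1 - batch_fraction * batch_discount)
```

Sending all of a $105 workload through batch with no caching would halve it to $52.50 under these assumptions.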

Worked example

200,000 requests/month, 1,500 input + 500 output tokens, GPT-4o mini ($0.15 / $0.60 per 1M):

  • Input: 200,000 × 1,500 ÷ 1e6 × $0.15 = $45.00
  • Output: 200,000 × 500 ÷ 1e6 × $0.60 = $60.00
  • Base total: $105.00/month, before retries, caching and margin.

Add a 15% safety margin ($105.00 × 1.15 = $120.75) and you'd budget about $121/month. Now you have a defensible number.
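The worked example can be reproduced with a few lines of Python, which makes it easy to swap in your own volumes and prices. The helper name and the dictionary keys are illustrative; the prices are the per-million rates quoted above and should be checked against current pricing.

```python
def budget_breakdown(requests: int,
                     input_tokens: int,
                     output_tokens: int,
                     input_price_per_m: float,
                     output_price_per_m: float,
                     margin: float = 0.15) -> dict:
    """Return input, output, base and margin-adjusted monthly costs in dollars."""
    input_cost = requests * input_tokens / 1_000_000 * input_price_per_m
    output_cost = requests * output_tokens / 1_000_000 * output_price_per_m
    base = input_cost + output_cost
    return {
        "input": round(input_cost, 2),
        "output": round(output_cost, 2),
        "base": round(base, 2),
        "budget": round(base * (1 + margin), 2),
    }


# 200,000 requests/month, 1,500 input + 500 output tokens, $0.15 / $0.60 per 1M
example = budget_breakdown(200_000, 1_500, 500, 0.15, 0.60)
print(example)
```

The breakdown keeps input and output costs separate, which is useful later because caching discounts and tool-call overhead mostly affect the input side.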
