Question 1

Why are output tokens more expensive than input tokens?

Accepted Answer

Output tokens are generated one at a time, each needing its own pass through the model, whereas prompt tokens can be processed in parallel. Providers typically price output 2–5× higher to reflect that, so cutting an output token saves more than cutting an input token.
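To make the asymmetry concrete, here is a minimal sketch of the per-request arithmetic. The prices ($3 and $15 per million tokens) and token counts are hypothetical placeholders, not any provider's actual rates, and `request_cost` is an illustrative helper name.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Dollar cost of one request; prices are in dollars per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical prices: $3 per 1M input tokens, $15 per 1M output tokens (5x ratio).
INPUT_PRICE = 3.0
OUTPUT_PRICE = 15.0

base = request_cost(2000, 500, INPUT_PRICE, OUTPUT_PRICE)

# Remove 200 tokens from the prompt versus 200 tokens from the answer.
shorter_prompt = request_cost(1800, 500, INPUT_PRICE, OUTPUT_PRICE)
shorter_answer = request_cost(2000, 300, INPUT_PRICE, OUTPUT_PRICE)

prompt_saving = base - shorter_prompt
answer_saving = base - shorter_answer

print(f"saving from shorter prompt: ${prompt_saving:.4f}")
print(f"saving from shorter answer: ${answer_saving:.4f}")
```

With these assumed prices, removing the same number of tokens from the answer saves five times as much as removing them from the prompt, matching the price ratio.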

Question 2

Are AI API costs predictable?

Accepted Answer

Cost per request is predictable once you know the token counts, but total spend scales with usage, retries and context length. Add a safety margin and monitor token usage in production; a single long-context feature can multiply costs, because every input token is billed on every call.
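One way to turn that advice into a budget figure is sketched below. The function name, the 5% retry rate, the 20% safety margin and the traffic numbers are all assumptions chosen for illustration; substitute values measured from your own logs.

```python
def monthly_spend(requests_per_day, cost_per_request,
                  retry_rate=0.0, safety_margin=0.0, days=30):
    """Rough monthly budget in dollars.

    retry_rate:    fraction of requests that are re-sent (0.05 = 5%)
    safety_margin: extra headroom added on top of the estimate (0.20 = 20%)
    """
    billed_requests = requests_per_day * days * (1 + retry_rate)
    return billed_requests * cost_per_request * (1 + safety_margin)

# Hypothetical workload: 10,000 requests/day at $0.0135 each.
naive = monthly_spend(10_000, 0.0135)
budgeted = monthly_spend(10_000, 0.0135, retry_rate=0.05, safety_margin=0.20)

print(f"naive estimate:    ${naive:,.2f}")
print(f"budgeted estimate: ${budgeted:,.2f}")
```

The gap between the two figures is the part of the bill that a per-request estimate alone does not show.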

Question 3

How do I reduce AI API costs?

Accepted Answer

Use a smaller model for simple steps, cap output length, enable prompt caching for repeated context, batch non-urgent jobs, trim system prompts, and only retrieve the RAG chunks you actually need.
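As an illustration of the last point, the sketch below keeps only the most relevant retrieved chunks that fit a token budget. The `Chunk` type, the `select_chunks` helper, the relevance scores and the thresholds are hypothetical; a real retriever will supply its own scores and token counts.

```python
from typing import NamedTuple

class Chunk(NamedTuple):
    text: str
    score: float  # relevance score from the retriever, higher is better
    tokens: int   # token count of the chunk text

def select_chunks(chunks, token_budget, min_score=0.0):
    """Greedily keep the highest-scoring chunks that fit within token_budget."""
    selected = []
    used = 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        if chunk.score < min_score:
            break  # everything after this point is even less relevant
        if used + chunk.tokens <= token_budget:
            selected.append(chunk)
            used += chunk.tokens
    return selected

retrieved = [
    Chunk("pricing table", 0.9, 300),
    Chunk("long changelog", 0.8, 500),
    Chunk("unrelated FAQ", 0.4, 100),
    Chunk("rate limits", 0.7, 300),
]

kept = select_chunks(retrieved, token_budget=700, min_score=0.5)
kept_tokens = sum(c.tokens for c in kept)
print([c.text for c in kept], kept_tokens)
```

Every chunk left out is input that is not billed on that request, so the budget and score threshold act as direct cost controls.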

Question 4

What is a good AI cost per user?

Accepted Answer

It depends on price point, but for a paid SaaS many teams target AI cost under 20–30% of revenue per user. Use the SaaS pricing calculator to check your margin.
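The check itself is a single division. The sketch below uses a hypothetical $20/month plan and $4.50 of monthly AI cost per user, and takes 30% as the upper end of the target range mentioned above; `ai_cost_share` is an illustrative helper, not part of the calculator.

```python
def ai_cost_share(ai_cost_per_user, revenue_per_user):
    """Fraction of per-user revenue that is spent on AI API calls."""
    if revenue_per_user <= 0:
        raise ValueError("revenue_per_user must be positive")
    return ai_cost_per_user / revenue_per_user

# Hypothetical figures: $4.50 AI cost per user per month on a $20 plan.
share = ai_cost_share(4.50, 20.00)
within_target = share <= 0.30

print(f"AI cost is {share:.1%} of revenue; within target: {within_target}")
```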

Question 5

Do these prices include caching and batch discounts?

Accepted Answer

Only if you set the cached % and batch % fields. By default the calculator assumes full standard pricing.
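The calculator's exact formula is not reproduced here, so the following is only a sketch of how cached % and batch % fields could feed into a per-request cost. The discount factors (cached input billed at 10% of the input price, batch requests billed at 50%) and the prices are assumptions; check your provider's current pricing.

```python
def effective_cost(input_tokens, output_tokens,
                   input_price_per_m, output_price_per_m,
                   cached_pct=0.0, batch_pct=0.0,
                   cached_price_factor=0.1,  # assumed: cached input costs 10% of normal
                   batch_price_factor=0.5):  # assumed: batch requests cost 50% of normal
    """Average dollar cost per request; both discounts default to off (0%)."""
    cached = input_tokens * cached_pct / 100
    uncached = input_tokens - cached
    input_cost = (uncached + cached * cached_price_factor) * input_price_per_m / 1_000_000
    output_cost = output_tokens * output_price_per_m / 1_000_000
    batch_share = batch_pct / 100
    # Blend the standard rate and the batch rate by the share of batched traffic.
    return (input_cost + output_cost) * ((1 - batch_share) + batch_share * batch_price_factor)

# Hypothetical prices: $3 per 1M input tokens, $15 per 1M output tokens.
standard = effective_cost(2000, 500, 3.0, 15.0)
with_cache = effective_cost(2000, 500, 3.0, 15.0, cached_pct=50)
with_both = effective_cost(2000, 500, 3.0, 15.0, cached_pct=50, batch_pct=100)

print(f"standard: ${standard:.4f}  cached: ${with_cache:.4f}  cached+batch: ${with_both:.4f}")
```

Leaving both fields at 0 reproduces full standard pricing, which mirrors the default described above.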

AI API Cost Calculator
