Prompt Caching Savings Calculator

See how much prompt caching saves when you reuse a large, stable prompt prefix — system instructions, few-shot examples or retrieved context. Compare cost with and without caching and find the cache hit rate where it pays off.

Pricing updated 2026-06-19. Estimates only.

The formula

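Without caching = requests × (static + dynamic) × input price + output cost

With caching = hits × (static × cached price + dynamic × input price) + misses × (static × input price × write multiplier + dynamic × input price) + output cost

Break-even hit rate = write penalty ÷ (per-hit saving + write penalty)

Here static and dynamic are token counts per request, and hits + misses = requests. Per static token, write penalty = input price × (write multiplier − 1) and per-hit saving = input price − cached price. Output cost is the same in both cases, so it does not affect the break-even point.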
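The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name `caching_costs`, its parameter names, and the example prices (3.00 per million input tokens, 0.30 per million cached tokens, a 1.25× write multiplier) are all assumptions chosen for the example, not any provider's published rates.

```python
def caching_costs(requests, hit_rate, static_tokens, dynamic_tokens,
                  input_price, cached_price, write_multiplier,
                  output_cost=0.0):
    """Return (cost_without_caching, cost_with_caching).

    Prices are per token. output_cost is the total output spend, which is
    the same whether or not the prefix is cached.
    """
    hits = requests * hit_rate
    misses = requests - hits

    without = requests * (static_tokens + dynamic_tokens) * input_price + output_cost

    # Hits pay the cached price for the static prefix; misses pay the
    # normal input price times the write multiplier to populate the cache.
    with_cache = (
        hits * (static_tokens * cached_price + dynamic_tokens * input_price)
        + misses * (static_tokens * input_price * write_multiplier
                    + dynamic_tokens * input_price)
        + output_cost
    )
    return without, with_cache


# Hypothetical scenario: 1,000 requests, 80% hit rate,
# 10,000 static tokens and 500 dynamic tokens per request.
without, with_cache = caching_costs(
    requests=1000, hit_rate=0.8,
    static_tokens=10_000, dynamic_tokens=500,
    input_price=3e-6, cached_price=0.3e-6, write_multiplier=1.25,
)
print(f"without: {without:.2f}  with: {with_cache:.2f}  "
      f"saved: {(without - with_cache) / without:.1%}")
```

Under these assumed prices the cached run costs roughly 11.40 against 31.50 uncached, a saving of about 64% on input cost.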

Questions

How much can prompt caching save?

It depends on how large your static prefix is and how often it is reused. With a big system prompt and a high hit rate, savings of 50–90% on input cost are common.

Does caching ever cost more?

On a cache miss, some providers charge a small write premium to store the prefix. If your hit rate is too low or the prefix is small, caching can be net-negative; this calculator shows the break-even hit rate.
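As a rough sketch of the break-even calculation, the snippet below works per static token, since in the formula the prefix size scales the write penalty and the per-hit saving equally. The function name `break_even_hit_rate` and the example prices are hypothetical, picked only to illustrate the arithmetic.

```python
def break_even_hit_rate(input_price, cached_price, write_multiplier):
    """Hit rate at which caching costs the same as not caching.

    All prices are per token. Above the returned rate caching saves
    money; below it, the write premium on misses outweighs the savings.
    """
    # Extra cost paid per static token on a miss (cache write premium).
    write_penalty = input_price * (write_multiplier - 1)
    # Cost avoided per static token on a hit.
    per_hit_saving = input_price - cached_price

    denominator = per_hit_saving + write_penalty
    if denominator <= 0:
        raise ValueError("cached reads must be cheaper than cache writes")
    return write_penalty / denominator


# Assumed prices: 3.00 per million input tokens, 0.30 per million
# cached tokens, and a 1.25x multiplier on cache writes.
rate = break_even_hit_rate(3e-6, 0.3e-6, 1.25)
print(f"break-even hit rate: {rate:.1%}")
```

With these assumed numbers the break-even point is a little under 22%. If a provider charges no write premium (a multiplier of 1.0), the function returns 0, meaning caching never costs more under this model.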

What is a cache hit rate?

The share of requests whose prefix is already cached. Higher is better; it depends on traffic patterns and how identical your prefix is across requests.
