The formula
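Without caching = requests × (static + dynamic) × input price + output.
With caching = hits × (static × cached price + dynamic × input price) + misses × (static × input price × write multiplier + dynamic × input price) + output.
Break-even hit rate = write penalty ÷ (per-hit saving + write penalty).
Here static and dynamic are token counts per request, and output is the total output cost, which caching does not change. The write penalty is the extra cost of a miss, static × input price × (write multiplier − 1). The per-hit saving is static × (input price − cached price).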
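As a rough illustration, the two cost formulas above can be sketched in Python. The function names, the per-million-token price convention, and every number in the example (3.00 input, 0.30 cached, 1.25 write multiplier) are assumptions chosen for illustration, not any provider's actual pricing.

```python
PER_MILLION = 1_000_000  # assumption: prices are quoted per million tokens


def cost_without_caching(requests, static_tokens, dynamic_tokens,
                         input_price, output_cost=0.0):
    """Every request pays the full input price on the whole prompt."""
    input_cost = requests * (static_tokens + dynamic_tokens) * input_price
    return input_cost / PER_MILLION + output_cost


def cost_with_caching(hits, misses, static_tokens, dynamic_tokens,
                      input_price, cached_price, write_multiplier,
                      output_cost=0.0):
    """Hits pay the cached price on the static prefix; misses pay the
    input price times the write multiplier on it. Dynamic tokens always
    pay the normal input price."""
    hit_cost = hits * (static_tokens * cached_price
                       + dynamic_tokens * input_price)
    miss_cost = misses * (static_tokens * input_price * write_multiplier
                          + dynamic_tokens * input_price)
    return (hit_cost + miss_cost) / PER_MILLION + output_cost


# Hypothetical workload: 1,000 requests, 10,000 static + 500 dynamic
# tokens each, 90% hit rate. Output cost is left at zero so only the
# input side is compared.
baseline = cost_without_caching(1000, 10_000, 500, input_price=3.0)
cached = cost_with_caching(900, 100, 10_000, 500, input_price=3.0,
                           cached_price=0.3, write_multiplier=1.25)
saving = 1 - cached / baseline
print(f"without: {baseline:.2f}  with: {cached:.2f}  saving: {saving:.1%}")
```

Under these assumed numbers the input cost falls from about 31.50 to about 7.95, a saving of roughly 75%.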
Questions
How much can prompt caching save?
It depends on how large your static prefix is and how often it is reused. With a big system prompt and a high hit rate, savings of 50–90% on input cost are common.
Does caching ever cost more?
On a cache miss, some providers charge a small write premium. If your hit rate is too low or the prefix is small, caching can be net-negative. This calculator shows the break-even hit rate.
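A minimal sketch of the break-even calculation, assuming the simplified model in the formula section. The function name and the example prices are hypothetical. The sketch works per static token, because the static token count appears in both the write penalty and the per-hit saving and so cancels out of the ratio.

```python
def break_even_hit_rate(input_price, cached_price, write_multiplier):
    """Hit rate at which caching costs the same as not caching.

    Prices only need to share a unit (for example, per million tokens).
    """
    write_penalty = input_price * (write_multiplier - 1)  # extra cost of a miss
    per_hit_saving = input_price - cached_price           # discount on a hit
    denominator = per_hit_saving + write_penalty
    if denominator <= 0:
        raise ValueError("caching never pays off with these prices")
    return write_penalty / denominator


# Illustrative prices only: 3.00 input, 0.30 cached, 1.25x write multiplier.
rate = break_even_hit_rate(3.0, 0.3, 1.25)
print(f"break-even hit rate: {rate:.1%}")
```

With these assumed prices the break-even lands near 22%: below that hit rate the write premium outweighs the savings, above it caching comes out ahead. With no write premium (a multiplier of 1.0) the break-even is zero, so any hit at all is a net saving.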
What is a cache hit rate?
The share of requests whose prefix is already cached. Higher is better; it depends on traffic patterns and how identical your prefix is across requests.