RAG App Cost Calculator

Estimate the cost of a retrieval-augmented generation (RAG) app: one-time indexing, ongoing embeddings, vector-database storage, retrieval, and answer generation. See where the money goes and what each answer costs.

Pricing updated 2026-06-19. Estimates only.

The formula

Token prices are quoted per 1,000,000 tokens, which is why every token-based term is divided by 1,000,000.

One-time indexing = documents × tokens/doc × embedding price ÷ 1,000,000

Monthly = re-index cost + query embeddings + generation + reranker + vector DB storage

Generation per query = ((system tokens + chunks × chunk tokens) × input price + answer tokens × output price) ÷ 1,000,000
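The formula can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the function and parameter names are assumptions chosen to mirror the terms above, and all prices are taken to be per 1,000,000 tokens.

```python
PER_MILLION = 1_000_000  # token prices are quoted per 1M tokens


def indexing_cost(documents, tokens_per_doc, embedding_price):
    """One-time cost of embedding the whole corpus."""
    return documents * tokens_per_doc * embedding_price / PER_MILLION


def generation_cost_per_query(system_tokens, chunks, chunk_tokens,
                              answer_tokens, input_price, output_price):
    """Cost of generating one answer from the system prompt plus retrieved chunks."""
    input_tokens = system_tokens + chunks * chunk_tokens
    return (input_tokens * input_price + answer_tokens * output_price) / PER_MILLION


def monthly_cost(reindex_cost, query_embedding_cost, generation_cost,
                 reranker_cost, vector_db_storage_cost):
    """Sum of the five recurring monthly components."""
    return (reindex_cost + query_embedding_cost + generation_cost
            + reranker_cost + vector_db_storage_cost)
```

For a monthly generation figure, multiply `generation_cost_per_query(...)` by the expected number of queries per month before passing it to `monthly_cost`.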

Questions

How much does a RAG app cost?

A small one-time indexing cost (often a few dollars) plus an ongoing per-query cost dominated by answer generation. Embeddings and vector storage are usually cheap by comparison.

Is embedding or generation more expensive?

Generation, by a wide margin. Embedding models are roughly 100× cheaper per token, so optimisation should focus on the generation model and the amount of context you retrieve.
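To see why generation dominates, the sketch below compares the two per-query costs. The prices are hypothetical placeholders, not quotes from any provider; they are chosen only so that the embedding price is 100× lower than the generation input price, matching the rough ratio above. The function names are likewise illustrative.

```python
PER_MILLION = 1_000_000

# Hypothetical prices in dollars per 1M tokens (illustrative only).
EMBEDDING_PRICE = 0.02
INPUT_PRICE = 2.00    # 100x the embedding price
OUTPUT_PRICE = 8.00


def query_embedding_cost(query_tokens):
    """Cost of embedding one user query."""
    return query_tokens * EMBEDDING_PRICE / PER_MILLION


def generation_cost(context_tokens, answer_tokens):
    """Cost of one answer given the total input context and the answer length."""
    return (context_tokens * INPUT_PRICE
            + answer_tokens * OUTPUT_PRICE) / PER_MILLION


def generation_share(query_tokens, context_tokens, answer_tokens):
    """Fraction of the per-query cost that comes from generation."""
    embed = query_embedding_cost(query_tokens)
    gen = generation_cost(context_tokens, answer_tokens)
    return gen / (embed + gen)
```

Under these assumed prices, a 50-token query with 2,500 context tokens and a 300-token answer spends well over 99% of its per-query cost on generation, and halving the retrieved context lowers the generation cost directly. That is why trimming context tends to matter more than switching embedding models.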

How much does a vector database cost?

Managed vector DBs typically run $0–$70/month for small-to-mid workloads; self-hosted options cost only the underlying compute and storage.
