Question 1

How much does a RAG app cost?

Accepted Answer

A small one-time indexing cost (often a few dollars) plus an ongoing per-query cost dominated by answer generation. Embeddings and vector storage are usually cheap by comparison.
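As a rough illustration of that split, the sketch below separates the one-time indexing cost from the per-query cost. The `Prices` class, the function names, and every price in the usage example are hypothetical placeholders, not any vendor's actual rates; substitute the figures from your own provider.

```python
from dataclasses import dataclass


@dataclass
class Prices:
    # USD per 1M tokens. Placeholder fields for illustration only.
    embed_per_m: float
    gen_input_per_m: float
    gen_output_per_m: float


def indexing_cost(corpus_tokens: int, p: Prices) -> float:
    """One-time cost of embedding the whole corpus."""
    return corpus_tokens / 1e6 * p.embed_per_m


def query_cost(query_tokens: int, context_tokens: int,
               answer_tokens: int, p: Prices) -> float:
    """Cost of one query: embed the question, then generate an answer
    from the question plus the retrieved context."""
    embed = query_tokens / 1e6 * p.embed_per_m
    gen_in = (query_tokens + context_tokens) / 1e6 * p.gen_input_per_m
    gen_out = answer_tokens / 1e6 * p.gen_output_per_m
    return embed + gen_in + gen_out


# Assumed example: 50M-token corpus, 2,000 context tokens per query.
prices = Prices(embed_per_m=0.02, gen_input_per_m=2.0, gen_output_per_m=8.0)
one_time = indexing_cost(50_000_000, prices)
per_query = query_cost(50, 2_000, 300, prices)
monthly = per_query * 10_000  # assumed 10,000 queries per month
```

Vector storage is left out here on purpose; it is a separate, usually flat, monthly line item.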

Question 2

Is embedding or generation more expensive?

Accepted Answer

Generation, by a wide margin. Embedding models are roughly 100× cheaper per token, so optimisation should focus on the generation model and the amount of context you retrieve.
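To see how lopsided the split can be, here is a minimal sketch that computes the fraction of a single query's cost that comes from generation. The function name and the prices in the usage line are assumptions chosen to match the "roughly 100×" ratio above, not real vendor rates.

```python
def generation_share(query_tokens: int, context_tokens: int, answer_tokens: int,
                     embed_price: float, gen_in_price: float,
                     gen_out_price: float) -> float:
    """Fraction of one query's cost attributable to generation.

    All prices are USD per 1M tokens (hypothetical values).
    """
    embed = query_tokens * embed_price / 1e6
    gen = ((query_tokens + context_tokens) * gen_in_price
           + answer_tokens * gen_out_price) / 1e6
    return gen / (embed + gen)


# Assumed: 50-token question, 2,000 tokens of context, 300-token answer.
share = generation_share(50, 2_000, 300,
                         embed_price=0.02, gen_in_price=2.0, gen_out_price=8.0)
```

With these placeholder numbers the generation share comes out above 99%, which is why trimming context usually matters more than switching embedding models.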

Question 3

How much does a vector database cost?

Accepted Answer

Managed vector DBs typically run $0–$70/month for small-to-mid workloads; self-hosted options cost only the underlying compute and storage.
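For the self-hosted case, a back-of-the-envelope storage estimate helps size the machine. The sketch below counts only the raw float32 vectors (4 bytes per dimension); index structures, metadata, and replicas add overhead on top, and the function name and example sizes are illustrative assumptions.

```python
def raw_vector_storage_gb(num_vectors: int, dimensions: int,
                          bytes_per_value: int = 4) -> float:
    """Raw storage for the vectors alone, in decimal gigabytes.

    Assumes float32 values (4 bytes each) and ignores index overhead,
    metadata, and replication.
    """
    return num_vectors * dimensions * bytes_per_value / 1e9


# Assumed example: 1M chunks embedded at 1,536 dimensions.
size_gb = raw_vector_storage_gb(1_000_000, 1_536)  # about 6.1 GB
```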

Question 4

How do I lower RAG costs?

Accepted Answer

Retrieve fewer chunks, add a reranker so that fewer but more relevant chunks reach the generation model, cap answer length, cache stable context, use a smaller generation model, and re-index incrementally.
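The "retrieve fewer chunks" step can be sketched as a hard cap on what is passed to the generation model. This is a minimal example under stated assumptions: `trim_context` is a hypothetical helper, scores are whatever your retriever or reranker returns, and whitespace splitting is only a crude stand-in for a real tokenizer.

```python
def trim_context(chunks: list[tuple[float, str]], max_chunks: int,
                 token_budget: int) -> list[str]:
    """Keep the highest-scoring chunks that fit within a token budget.

    chunks: (relevance_score, text) pairs.
    Token counts are approximated by whitespace splitting (an assumption).
    """
    kept: list[str] = []
    used = 0
    ranked = sorted(chunks, key=lambda c: c[0], reverse=True)[:max_chunks]
    for _score, text in ranked:
        n = len(text.split())
        if used + n > token_budget:
            continue  # skip chunks that would exceed the budget
        kept.append(text)
        used += n
    return kept
```

Because generation input is billed per token, halving the context passed in roughly halves the input side of each query's cost.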

Question 5

How often should I re-index?

Accepted Answer

Only when documents change. Incremental updates of changed documents are far cheaper than periodic full rebuilds.
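One common way to detect changed documents is to compare content hashes against those recorded at the last indexing run. The sketch below is an assumption about how that could look, not a prescribed method; `plan_reindex` and the dictionary layout are hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(docs: dict[str, str],
                 seen_hashes: dict[str, str]) -> tuple[list[str], list[str]]:
    """Decide which documents need work since the last indexing run.

    docs: current {doc_id: text}.
    seen_hashes: {doc_id: hash} stored after the previous run.
    Returns (ids to embed again, ids to delete from the index).
    """
    to_embed = [doc_id for doc_id, text in docs.items()
                if seen_hashes.get(doc_id) != content_hash(text)]
    to_delete = [doc_id for doc_id in seen_hashes if doc_id not in docs]
    return to_embed, to_delete
```

Only the ids in `to_embed` incur embedding cost, so a corpus where 1% of documents changed costs about 1% of a full rebuild.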

RAG App Cost Calculator
