The formula
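Conversations = visitors × open rate (%) × conversations per visitor. Messages = conversations × messages per conversation.
Input tokens per message = system prompt + user message + half the conversation history + retrieved RAG chunks. Only half the history is counted because it grows with each turn, so the average message carries roughly half of the final history.
Monthly cost = messages × (input tokens × input price + output tokens × output price) ÷ 1,000,000 + embeddings, where both prices are per million tokens.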
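The formula can be sketched in a few lines of Python. This is a minimal illustration, not a billing tool: the function name, parameter names, and every number in the example (traffic, token counts, and prices) are hypothetical placeholders, and messages are assumed to equal conversations × messages per conversation.

```python
def monthly_cost(
    visitors: float,
    open_rate: float,                  # fraction of visitors who open the chat, e.g. 0.05
    conversations_per_visitor: float,
    messages_per_conversation: float,
    system_tokens: float,
    user_tokens: float,
    history_tokens: float,             # full conversation history; half is counted per message
    rag_tokens: float,                 # retrieved chunks added to each message
    output_tokens: float,
    input_price_per_million: float,    # price per 1,000,000 input tokens
    output_price_per_million: float,   # price per 1,000,000 output tokens
    embeddings_cost: float = 0.0,      # flat monthly embeddings spend
) -> float:
    """Estimate monthly chatbot spend using the formula above."""
    conversations = visitors * open_rate * conversations_per_visitor
    messages = conversations * messages_per_conversation
    input_tokens = system_tokens + user_tokens + history_tokens / 2 + rag_tokens
    token_cost = messages * (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000
    return token_cost + embeddings_cost


# Hypothetical inputs: 10,000 visitors, 5% open the chat, 4 messages per conversation.
cost = monthly_cost(
    visitors=10_000, open_rate=0.05, conversations_per_visitor=1.0,
    messages_per_conversation=4,
    system_tokens=300, user_tokens=50, history_tokens=800, rag_tokens=750,
    output_tokens=200,
    input_price_per_million=1.00, output_price_per_million=4.00,
    embeddings_cost=0.10,
)
print(f"Estimated monthly cost: ${cost:.2f}")  # Estimated monthly cost: $4.70
```

With these placeholder inputs there are 500 conversations and 2,000 messages, each carrying 1,500 input tokens, so the token cost is $4.60 and the embeddings line adds $0.10.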
Questions
How much does an AI chatbot cost per month?
Anywhere from a few dollars for a low-traffic FAQ bot to hundreds of dollars for a high-volume RAG support bot. The main drivers are the number of conversations, the messages per conversation, and how much context each message carries.
Does RAG make a chatbot more expensive?
Yes. Retrieved chunks add input tokens to every message, and you also pay for query embeddings. The extra cost is usually small, but retrieve only the chunks you need and cache stable knowledge.
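To make the RAG overhead concrete, here is a rough sketch of the extra monthly spend that retrieval adds. The function name, the chunk sizes, and the prices are all hypothetical, and the sketch assumes one query embedding per message.

```python
def rag_extra_monthly_cost(
    messages: float,
    chunks_per_message: float,
    tokens_per_chunk: float,
    input_price_per_million: float,               # price per 1,000,000 input tokens
    embedding_cost_per_query: float = 0.0,        # assumed: one query embedding per message
) -> float:
    """Extra monthly cost from retrieved chunks plus query embeddings."""
    extra_input_tokens = chunks_per_message * tokens_per_chunk
    chunk_cost = messages * extra_input_tokens * input_price_per_million / 1_000_000
    return chunk_cost + messages * embedding_cost_per_query


# Hypothetical: 2,000 messages per month, 250-token chunks, $1.00 per million input tokens.
three_chunks = rag_extra_monthly_cost(2_000, 3, 250, 1.00, embedding_cost_per_query=4e-7)
six_chunks = rag_extra_monthly_cost(2_000, 6, 250, 1.00, embedding_cost_per_query=4e-7)
print(f"3 chunks: ${three_chunks:.2f}, 6 chunks: ${six_chunks:.2f}")
```

Under these assumptions the retrieved chunks dominate the overhead and the query embeddings are nearly negligible, which is why trimming the number of chunks per message is the more effective lever.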
What model should a chatbot use?
Most support and FAQ bots work well on a small, fast model. Reserve premium models for complex reasoning or regulated domains where answer quality is critical.