The 30-second version
By default, an AI only knows what it was trained on, plus whatever you paste into the chat. It does not know your contracts, your product manual, or last week's meeting notes. RAG fixes that. You give the AI a library of your documents. When someone asks a question, it searches the library, pulls the relevant pages, and answers from them.
The important part: the AI does not memorize your documents. It looks them up fresh every time. So once an updated document is back in the library, the next answer uses the new version, not a stale copy baked in months ago.
A mental model you can keep
Picture an assistant with a filing cabinet full of your documents. You ask a question, they walk to the cabinet, pull the right folder, read it, and answer. They did not memorize the cabinet. They look it up each time, so the answer is always based on what is in the cabinet right now.
That filing cabinet is RAG. There is a close cousin worth knowing: memory. If RAG is the filing cabinet the assistant reads from, memory is a notebook the assistant keeps about you, jotting down what you prefer and what you decided so they remember it next time. RAG looks things up in a library you maintain. Memory saves things that build up over time. Good systems often use both.
How it works, in plain terms
When you add documents to a RAG system, it breaks them into smaller pieces and files each piece by meaning, not by keyword. That is why you can ask "how do I get my money back" and it finds the page titled "Refund policy," even though the words do not match.
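For readers who want to see the idea in code, here is a minimal sketch of "break into pieces, file by meaning." It rests on loud assumptions: the CONCEPTS table is a hand-made, hypothetical stand-in for a learned embedding model, the sample pages are invented, and the function names are ours. Real systems learn meaning from data rather than from a lookup table.

```python
import math
from collections import Counter

# ASSUMPTION: a tiny hand-written table standing in for a learned embedding
# model. It maps words to a coarse "concept" so that different words with
# the same meaning land in the same place.
CONCEPTS = {
    "refund": "refund", "refunds": "refund", "money": "refund", "back": "refund",
    "shipping": "shipping", "ship": "shipping", "delivery": "shipping",
}

def split_into_chunks(text, max_words=8):
    """Break a document into pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def toy_embed(text):
    """Count the concepts a text mentions (a toy 'meaning' vector)."""
    words = [w.strip(".,?!:").lower() for w in text.split()]
    return Counter(CONCEPTS[w] for w in words if w in CONCEPTS)

def cosine(a, b):
    """Cosine similarity between two concept-count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Invented sample pages, for illustration only.
pages = [
    "Refund policy: refunds are issued within 14 days.",
    "Shipping times: orders ship in two days.",
]
question = "how do I get my money back"

# Pick the page whose meaning vector is closest to the question's.
best = max(pages, key=lambda page: cosine(toy_embed(question), toy_embed(page)))
print(best)
```

The question and the refund page share no words, yet the refund page still wins, because both map to the same concept. That is the behavior a real embedding model gives you at much larger scale.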
When you ask a question, the system finds the handful of pieces most relevant to your question and hands only those to the AI, along with your question. The AI reads them and answers. The quality of the answer depends almost entirely on whether the right pieces got pulled, which is why the real engineering in a good RAG system is the looking-up, not the writing.
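The lookup step described above can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular product does it: overlap_score is a toy word-matching score used only to keep the example self-contained (a real system would compare by meaning), and the prompt wording, function names, and sample pieces are hypothetical.

```python
def overlap_score(question, piece):
    """ASSUMPTION: toy relevance score that counts shared distinct words.
    A real system would score by meaning, not by matching words."""
    q = set(question.lower().split())
    p = set(piece.lower().split())
    return len(q & p)

def retrieve(question, pieces, k=2, score=overlap_score):
    """Return the k pieces that score highest against the question."""
    ranked = sorted(pieces, key=lambda piece: score(question, piece), reverse=True)
    return ranked[:k]

def build_prompt(question, pieces):
    """Combine only the retrieved pieces and the question into one prompt."""
    context = "\n".join(f"[{i}] {piece}" for i, piece in enumerate(pieces, start=1))
    return (
        "Answer using only the numbered excerpts below.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )

# Invented sample pieces, for illustration only.
pieces = [
    "refunds are issued within 14 days of purchase",
    "orders ship within two business days",
    "support is open on weekdays",
    "refunds for sale items are store credit only",
]
question = "when are refunds issued"

top = retrieve(question, pieces, k=2)
prompt = build_prompt(question, top)
print(prompt)
```

Notice that the AI never sees the shipping or support pieces. Whatever retrieve returns is all it has to work with, which is why a weak lookup step cannot be rescued by good writing afterward.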
Where RAG earns its keep, and where it is overkill
RAG is the right tool when you want an AI to answer from a body of knowledge that is yours and that changes: your policies, your product docs, your support history. It is how you get an assistant that actually knows your business instead of a smart stranger who only knows generic facts.
It is overkill when the AI only needs general knowledge it already has, or when the "library" is one short page you could just paste into the prompt. A common question is whether to start with RAG or with fine-tuning. RAG is usually the better first move. Fine-tuning retrains the model on your examples, which is slower, more expensive, and bakes the information in. RAG just looks your documents up, stays current, and is far easier to keep accurate.
The short reality check
RAG does not make an AI smarter. It makes a capable AI answer from the right source instead of from memory or guesswork. It is only as good as two things: the documents you give it and whether it pulls the right ones. Feed it stale or messy documents and it will answer from stale or messy documents, confidently. The win is that its answers become checkable, because you can see what they were based on.
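Here is a small sketch of what "checkable" can look like in practice: every answer carries the names of the documents it was based on, and an updated document changes the next answer. Everything in it is an assumption for illustration. The file name, the word-matching lookup, and echo_generate (a stub that stands in for the AI call) are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    sources: list  # names of the documents the answer was based on

def answer_from_library(question, library, generate):
    """library maps document name -> current text.
    generate is a hypothetical stand-in for the call to the AI."""
    q_words = set(question.lower().split())
    # ASSUMPTION: toy lookup that keeps documents sharing a word with the question.
    hits = {name: text for name, text in library.items()
            if q_words & set(text.lower().split())}
    return Answer(text=generate(question, list(hits.values())), sources=sorted(hits))

def echo_generate(question, passages):
    """Stub 'AI' that simply quotes the passages it was handed."""
    return " ".join(passages) if passages else "No supporting document found."

# Invented sample library, for illustration only.
library = {"refund-policy.txt": "refunds are issued within 14 days"}
before = answer_from_library("how fast are refunds", library, echo_generate)

# The document is updated; nothing about the AI itself changes.
library["refund-policy.txt"] = "refunds are issued within 30 days"
after = answer_from_library("how fast are refunds", library, echo_generate)

print(before.text, before.sources)
print(after.text, after.sources)
```

The second answer reflects the edited document without any retraining, and both answers name their source, so a person can open that file and check. If the file itself is wrong, the answer will be wrong in the same way.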
How this connects to what we build
A lot of what businesses actually want from AI is this: an assistant that knows their documents, their policies, and their history, and answers from them instead of making things up. That is RAG work, and it is a large part of the custom agents and skills we build. The standard is the same as always. It has to save time, cut mistakes, or protect revenue. If retrieval would not do one of those for you, we will tell you so.