How does RAG fit into a prompt?
A RAG prompt has three parts. First, your **instructions** — the role, the answer format, and the grounding rule ("answer only from the context below"). Second, the **retrieved context** — the chunks your retriever pulled for this query, clearly delimited and ideally numbered or tagged with their source. Third, the **question** itself.
The order and the delimiters matter. Put the grounding rule near the question, even if the rest of the instructions come first, so it is the last thing the model reads. Wrap the retrieved chunks in obvious markers (for example, a CONTEXT block with each source labeled), and tell the model what to do when the answer isn't in the context: usually "say you don't know" rather than improvising.
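As a rough illustration, the layout described above could be assembled like this. It is a minimal sketch, not a prescribed template: the `Chunk` type, the `build_rag_prompt` function, the `=== CONTEXT START ===` markers, and the wording of the instructions are all assumptions made for the example, and the retriever that produces the chunks is out of scope.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """A retrieved passage (hypothetical type for this example)."""
    source: str  # label for where the text came from, e.g. a file name
    text: str


def build_rag_prompt(question: str, chunks: list[Chunk]) -> str:
    """Assemble instructions, delimited context, and the question."""
    # Role and answer format go first (illustrative wording).
    instructions = (
        "You are a support assistant. Answer in two or three sentences "
        "and cite the sources you used by their bracketed number."
    )
    # The grounding rule and the fallback are placed right before the question.
    grounding_rule = (
        "Answer only from the context above. If the answer is not in the "
        "context, say you don't know."
    )
    # Number each chunk and label its source so the model can refer to it.
    context_block = "\n\n".join(
        f"[{i}] (source: {chunk.source})\n{chunk.text.strip()}"
        for i, chunk in enumerate(chunks, start=1)
    )
    return (
        f"{instructions}\n\n"
        "=== CONTEXT START ===\n"
        f"{context_block}\n"
        "=== CONTEXT END ===\n\n"
        f"{grounding_rule}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # Made-up chunks standing in for real retriever output.
    example_chunks = [
        Chunk("faq.md", "Refunds are issued within 14 days."),
        Chunk("policy.md", "Shipping is free on orders over $50."),
    ]
    print(build_rag_prompt("How long do refunds take?", example_chunks))
```

The resulting string would be sent as the user message (or split between system and user messages) of whichever model API you use. The exact delimiters matter less than using them consistently and making each source easy to tell apart.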
This is where prompting and retrieval interlock. A perfect retriever still produces bad answers if the prompt lets the model wander off-context; a perfect prompt produces nothing useful if the retriever surfaces the wrong chunks. Treat them as one system.