**Scenario 1 — Customer support bot answering from your help docs.** Information-dominant: model doesn't know your docs. **RAG.** Index docs into vector DB; retrieve relevant chunks per query. Fine-tuning would require constant retraining as docs update.
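The index-then-retrieve loop for Scenario 1 can be sketched in a few lines. This is a toy illustration, not a production design: it swaps the vector DB for an in-memory list and swaps learned embeddings for word-count vectors with cosine similarity. `HELP_DOCS`, `build_index`, `retrieve`, and `build_prompt` are hypothetical names invented for this example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase and split into alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    # Cosine similarity between two word-count vectors (Counters).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(chunks):
    # Stand-in for a vector DB: keep each chunk next to its vector.
    return [(chunk, Counter(tokenize(chunk))) for chunk in chunks]

def retrieve(index, query, k=2):
    # Return the k chunks most similar to the query.
    q = Counter(tokenize(query))
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(query, chunks):
    # Put the retrieved chunks in front of the question.
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only these help-doc excerpts:\n{context}\n\nQuestion: {query}"

# Hypothetical help-doc chunks, for illustration only.
HELP_DOCS = [
    "To reset your password, open Settings and choose Reset Password.",
    "Invoices are emailed on the first day of each month.",
    "To cancel a subscription, contact support from the Billing page.",
]

index = build_index(HELP_DOCS)
question = "How do I reset my password?"
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)
```

When a doc changes, only the index is rebuilt; the model itself is untouched, which is why this scenario favors RAG over retraining.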
**Scenario 2 — Sales chat that needs your brand voice consistently.** Behavior-dominant: voice is the pattern. **Fine-tuning** on 500–2000 example exchanges in brand voice. RAG can't make a model 'sound like you' reliably; it can give it relevant info but voice consistency comes from the weights.
**Scenario 3 — Sales chat that answers about pricing AND uses brand voice.** Both problems. **RAG for pricing data (changes), fine-tuning for voice (stable behavior).** This combination is the strongest pattern for product chat; teams that build only one half tend to underperform.
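One way to picture the split in Scenario 3: pricing facts are looked up per request, while voice comes from whichever model is called. The sketch below assumes a keyword lookup in place of real retrieval and a stub in place of a fine-tuned model; `PRICING`, `retrieve_pricing`, `answer`, and `fake_tuned_model` are hypothetical, and the prices are placeholders.

```python
from typing import Callable

# Placeholder pricing facts; in practice these would live in a store
# that can be updated without touching the model.
PRICING = {
    "starter": "Starter plan: $10 per user per month.",
    "team": "Team plan: $25 per user per month.",
}

def retrieve_pricing(question: str) -> list[str]:
    # Toy retrieval: return facts whose plan name appears in the question.
    q = question.lower()
    return [fact for plan, fact in PRICING.items() if plan in q]

def answer(question: str, tuned_model: Callable[[str], str]) -> str:
    # Retrieval supplies the facts; the tuned model supplies the voice.
    facts = retrieve_pricing(question)
    context = "\n".join(facts) if facts else "No pricing facts found."
    prompt = f"Pricing facts:\n{context}\n\nCustomer: {question}"
    return tuned_model(prompt)

def fake_tuned_model(prompt: str) -> str:
    # Stub standing in for a call to a fine-tuned model.
    return f"[brand voice] {prompt}"

reply = answer("How much is the Team plan?", fake_tuned_model)
```

Changing a price means editing `PRICING` only; changing the voice means swapping the model passed to `answer`.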
**Scenario 4 — Code-generation assistant for your internal codebase.** Information-dominant: model doesn't know your code. **RAG over your codebase** with semantic + symbol-based retrieval. Fine-tuning on your code patterns can help with style consistency, but the primary need is information access, not behavior shaping.
**Scenario 5 — Email-draft generator following a specific 5-step template.** Behavior-dominant: structural pattern. **Fine-tuning** on 500–1000 examples following the template. Could also work with strong prompt engineering + few-shot in user prompt; choose based on volume — high volume justifies fine-tuning's setup cost.
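The volume trade-off in Scenario 5 comes down to break-even arithmetic. The sketch below assumes few-shot prompting costs more per request (the examples are resent on every call) and fine-tuning carries a one-time setup cost; whether that holds depends on your provider's pricing. All dollar figures are made-up inputs, and `break_even_thousands` is a hypothetical helper.

```python
def break_even_thousands(setup_cost, prompt_cost_per_1k, tuned_cost_per_1k):
    """Return the volume (in thousands of requests) at which fine-tuning's
    setup cost is recovered, or None if it is never recovered."""
    saving_per_1k = prompt_cost_per_1k - tuned_cost_per_1k
    if saving_per_1k <= 0:
        # Fine-tuned inference is not cheaper, so setup cost never pays back.
        return None
    return setup_cost / saving_per_1k

# Illustrative numbers only: $500 setup, $10 vs $2 per 1,000 requests.
volume = break_even_thousands(500, 10, 2)
```

With these inputs the break-even point is 62.5 thousand requests; below that volume, few-shot prompting is the cheaper option under the stated assumptions.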
**Scenario 6 — Legal research tool answering from current case law.** Information AND freshness critical. **RAG only** — case law updates frequently, fine-tuning can't keep current. Add periodic prompt-engineering updates for behavior shaping without retraining.
**Scenario 7 — Classification model that picks one of 12 internal categories.** Behavior-dominant: structured output to a fixed schema. **Fine-tuning a small model** (such as DistilBERT or a 7B-class Llama model) often outperforms prompting a frontier LLM on this exact task at roughly 1/100th the cost. RAG isn't relevant; categories are stable.
Picking RAG or fine-tuning based on what's trending leads to expensive architecture mistakes that take 3–6 months to recognize and another 3–6 months to unwind. The misalignment stays invisible until production scale exposes it.
Map problem type to technique instead: RAG for fresh-information needs, fine-tuning for behavior consistency, both when both problems exist. Architecture decisions made this way hold for years instead of needing rework every 6 months.
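The mapping can be written down as a small decision function. This is only a sketch of the rule as stated here; `choose_technique` is a hypothetical name, and the fallback to prompt engineering when neither need applies is an assumption added for completeness.

```python
def choose_technique(needs_fresh_info: bool, needs_consistent_behavior: bool) -> str:
    # Both problems present: combine the two techniques.
    if needs_fresh_info and needs_consistent_behavior:
        return "RAG + fine-tuning"
    # Information problem: the model lacks (or cannot keep up with) the data.
    if needs_fresh_info:
        return "RAG"
    # Behavior problem: voice, template, or fixed output schema.
    if needs_consistent_behavior:
        return "fine-tuning"
    # Assumption: with neither need, plain prompting is the starting point.
    return "prompt engineering"

# Scenarios 1, 2, and 3 expressed as inputs.
support_bot = choose_technique(needs_fresh_info=True, needs_consistent_behavior=False)
brand_voice_chat = choose_technique(needs_fresh_info=False, needs_consistent_behavior=True)
pricing_and_voice = choose_technique(needs_fresh_info=True, needs_consistent_behavior=True)
```

Real projects rarely reduce to two booleans, but forcing the question into this shape makes it harder to choose a technique for reasons unrelated to the problem.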