Phase 1: When GraphRAG wins and when vanilla RAG wins
GraphRAG is not a universal upgrade over standard RAG. It has a significantly higher offline indexing cost (LLM-heavy extraction over every chunk), more complex infrastructure, and higher per-query cost (community summaries are large). The decision depends on your query distribution and corpus characteristics.
**GraphRAG wins when:**
- Queries require multi-hop reasoning: 'What companies are connected to the CEO of X through board memberships?' Dense retrieval cannot follow multi-hop chains: it retrieves documents that mention X, but it does not traverse the relationships between the entities in them.
- Queries ask about corpus-wide patterns: 'What are the major themes in this collection of 500 earnings calls?' This requires synthesizing across the entire corpus, not retrieving similar chunks.
- Queries involve entity relationships: 'What is the relationship between Organization A and Person B across all these reports?' The knowledge graph captures these relationships explicitly.
- Your corpus is a coherent domain (e.g., all SEC filings for a company, all papers in a research area, all contracts in a legal matter) where entities and relationships are meaningful.
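
To make the multi-hop point concrete, here is a minimal sketch of the kind of traversal a knowledge graph allows and similarity search alone does not. The triples, entity names, and function names are hypothetical and purely illustrative; this is not the API of any GraphRAG library.

```python
from collections import deque

# Hypothetical (subject, relation, object) triples of the kind an
# entity-extraction step might produce. All names are made up.
TRIPLES = [
    ("Alice", "ceo_of", "X"),
    ("Alice", "board_member_of", "Acme"),
    ("Alice", "board_member_of", "Globex"),
    ("Bob", "board_member_of", "Globex"),
    ("Bob", "ceo_of", "Initech"),
]


def companies_linked_to_ceo(company, triples):
    """Two-hop query: company -> its CEO -> companies where that CEO sits on the board."""
    ceos = {s for s, rel, o in triples if rel == "ceo_of" and o == company}
    return sorted({o for s, rel, o in triples if rel == "board_member_of" and s in ceos})


def neighbors_within(start, triples, max_hops):
    """Breadth-first search over an undirected view of the graph.

    Returns {entity: hop_distance} for every entity reachable from
    `start` in at most `max_hops` hops, excluding `start` itself.
    """
    adjacency = {}
    for s, _, o in triples:
        adjacency.setdefault(s, set()).add(o)
        adjacency.setdefault(o, set()).add(s)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in adjacency.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)

    del distance[start]
    return distance


print(companies_linked_to_ceo("X", TRIPLES))  # ['Acme', 'Globex']
print(neighbors_within("X", TRIPLES, 2))
```

A dense retriever given the same question would return chunks that mention X; it has no step equivalent to following `ceo_of` and then `board_member_of`.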
**Vanilla RAG wins when:**
- Queries are lookup-style: 'What is the return policy?', 'What does function X do?' Standard retrieval is fast, cheap, and accurate for these.
- Cost is constrained: GraphRAG construction at $100-500 per 1M corpus tokens is expensive for large corpora. At that rate, a 100M-token corpus costs $10K-50K to index.
- Corpus updates frequently: re-indexing after corpus changes requires re-running entity extraction over changed documents (incremental re-indexing is supported in GraphRAG v2 but adds complexity).
- Your team lacks experience with graph data structures: debugging entity extraction errors, community detection failures, and graph query logic requires graph expertise that most ML teams don't have on day 1.
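
The indexing cost figure above is simple arithmetic, sketched below so it can be rerun for other corpus sizes. The $100-500 per 1M tokens range comes from the text; the function name and defaults are illustrative assumptions, and actual cost depends on the model and extraction prompts used.

```python
def indexing_cost_range(corpus_tokens, low_per_million=100.0, high_per_million=500.0):
    """Estimate the (low, high) GraphRAG indexing cost in dollars.

    The per-million-token rates default to the $100-500 range quoted
    in the text; substitute your own measured rates where available.
    """
    millions = corpus_tokens / 1_000_000
    return millions * low_per_million, millions * high_per_million


low, high = indexing_cost_range(100_000_000)
print(f"${low:,.0f} to ${high:,.0f}")  # $10,000 to $50,000
```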
**Decision rule (simplified):** if more than 20% of your queries involve multi-hop reasoning, entity-relationship questions, or corpus-wide synthesis, benchmark GraphRAG. Otherwise, standard hybrid RAG (BM25 + dense retrieval; see hybrid search) is likely the better investment.
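
One way to apply this rule is to hand-label a sample of real queries and compute the share that falls into the graph-friendly categories. The sketch below assumes such labels already exist; the label names and the `recommend` function are hypothetical, and treating a share of exactly 20% as "stay with hybrid RAG" is an assumption of this sketch.

```python
# Hypothetical label names for the three query types named in the rule.
GRAPH_FRIENDLY = {"multi_hop", "entity_relationship", "corpus_wide"}


def recommend(query_labels, threshold=0.20):
    """Return (recommendation, share) for a list of per-query labels.

    `share` is the fraction of queries whose label is graph-friendly.
    A share strictly above `threshold` suggests benchmarking GraphRAG.
    """
    if not query_labels:
        raise ValueError("need at least one labeled query")
    share = sum(label in GRAPH_FRIENDLY for label in query_labels) / len(query_labels)
    if share > threshold:
        return "benchmark GraphRAG", share
    return "stay with hybrid RAG", share


sample = ["lookup"] * 7 + ["multi_hop"] * 2 + ["corpus_wide"]
print(recommend(sample))  # ('benchmark GraphRAG', 0.3)
```

The result is only as good as the sample, so it helps to draw the labeled queries from production logs rather than from queries written for the evaluation.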