Knowledge graph RAG (sometimes called GraphRAG, KG-RAG, or structured RAG) extends retrieval from 'find similar documents' to 'traverse relationships between entities.' For use cases where the answer lives across multiple connected documents rather than in any single one, graph-augmented retrieval is a step change. This post covers when knowledge graph RAG earns its added complexity, how we build these systems, and where they fail.
When plain RAG is not enough
Plain RAG works well when the answer is in one or two chunks. 'What is our refund policy?' retrieves the refund policy document; the LLM answers from it. Straightforward.
Plain RAG breaks down when the answer requires joining information across documents. 'Which customers from Europe bought product X and had support tickets in the last quarter?' requires joining CRM data, the product catalog, and the ticket database. Retrieval alone returns fragments; the LLM has to reconstruct the join in context, and it often does so unreliably.
Knowledge graph RAG handles this explicitly. Entities (customer, product, ticket) are nodes; relationships are edges. The query becomes a graph traversal; retrieval returns connected subgraphs; the LLM reasons over structured context.
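To make the traversal concrete, here is a minimal in-memory sketch of the Europe / product X / support ticket question. The node IDs, relation names, and data are all hypothetical, and the 'last quarter' date filter is left out for brevity. A production system would run the equivalent query in a graph database rather than in Python dicts.

```python
from collections import defaultdict

# Hypothetical toy edges: (source, relation, target).
EDGES = [
    ("cust:acme", "LOCATED_IN", "region:europe"),
    ("cust:acme", "BOUGHT", "product:x"),
    ("cust:acme", "OPENED", "ticket:101"),
    ("cust:globex", "LOCATED_IN", "region:europe"),
    ("cust:globex", "BOUGHT", "product:x"),
    ("cust:initech", "LOCATED_IN", "region:us"),
    ("cust:initech", "BOUGHT", "product:x"),
    ("cust:initech", "OPENED", "ticket:102"),
]


def build_index(edges):
    """Index edges as node -> relation -> set of neighbor nodes."""
    index = defaultdict(lambda: defaultdict(set))
    for src, rel, dst in edges:
        index[src][rel].add(dst)
    return index


def european_buyers_with_tickets(index, product):
    """Customers in Europe who bought `product` and opened at least one ticket."""
    matches = []
    for node, rels in index.items():
        if not node.startswith("cust:"):
            continue
        in_europe = "region:europe" in rels.get("LOCATED_IN", set())
        bought = product in rels.get("BOUGHT", set())
        has_ticket = bool(rels.get("OPENED"))
        if in_europe and bought and has_ticket:
            matches.append(node)
    return sorted(matches)


index = build_index(EDGES)
print(european_buyers_with_tickets(index, "product:x"))  # ['cust:acme']
```

The join that plain RAG leaves to the LLM is done here by following three edge types from each customer node, so the model only has to explain the result.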
Architecture
Dual index. Vector database for semantic retrieval (text chunks, unstructured documents). Knowledge graph (Neo4j, TigerGraph, or similar) for entity relationships. Both populated from the same source data during ingestion.
Query router. An LLM (or a smaller classifier) decides whether the query is semantic (use vector), relational (use graph), or both. 'Show me similar documents' goes to vector. 'How are A and B connected?' goes to graph. 'Find documents about A connected to B' goes to both.
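As a rough illustration of the routing decision, the sketch below uses keyword cues in place of an LLM or trained classifier. The cue lists and the `route` function are assumptions for illustration only; a real router would be a model call with the same three possible outputs.

```python
import re

# Hypothetical cue words; a real router would use an LLM or a classifier.
RELATIONAL_CUES = re.compile(
    r"\b(connected|related|linked|relationship|between)\b", re.IGNORECASE
)
SEMANTIC_CUES = re.compile(r"\b(similar|about|documents?)\b", re.IGNORECASE)


def route(query: str) -> str:
    """Return 'vector', 'graph', or 'both' for a user query."""
    relational = bool(RELATIONAL_CUES.search(query))
    semantic = bool(SEMANTIC_CUES.search(query))
    if relational and semantic:
        return "both"
    if relational:
        return "graph"
    # Default to plain vector retrieval when no relational cue is present.
    return "vector"


for q in (
    "Show me similar documents",
    "How are A and B connected?",
    "Find documents about A connected to B",
):
    print(q, "->", route(q))
```

Defaulting to vector keeps the cheap path as the fallback, so a routing miss degrades to plain RAG rather than to an empty graph result.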
Result merging. When both paths run, results are combined into LLM context. Relationships come from the graph; supporting text comes from the vector store. The LLM sees a structured context with both the facts and the prose.
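One possible shape for the merged context is sketched below. The section labels, the `build_context` name, and the `max_chunks` cap are assumptions; the point is only that graph facts and retrieved passages stay visibly separate in the prompt.

```python
def build_context(triples, chunks, max_chunks=3):
    """Combine graph triples and retrieved text chunks into one prompt context."""
    facts = [f"- {s} {r} {o}" for s, r, o in triples]
    passages = [
        f"[{i}] {text}" for i, text in enumerate(chunks[:max_chunks], start=1)
    ]
    lines = [
        "Facts (from graph):",
        *facts,
        "",
        "Passages (from vector store):",
        *passages,
    ]
    return "\n".join(lines)


context = build_context(
    triples=[("Acme", "BOUGHT", "Product X"), ("Acme", "OPENED", "Ticket 101")],
    chunks=["Acme reported a login failure after upgrading to Product X."],
)
print(context)
```

Numbering the passages lets the LLM cite which passage supports which fact.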
Graph construction from unstructured data
The hard part. Extracting entities and relationships from text requires NLP. Named entity recognition for entities; relation extraction for edges. Modern approach: LLM-based extraction with structured output. Pass chunks through an LLM prompted to extract (entity, relationship, entity) triples.
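The LLM call itself depends on the provider, so this sketch covers only the step after it: validating the structured output before anything reaches the graph. The JSON shape (`subject`, `relation`, `object`) and the `ALLOWED_RELATIONS` schema are hypothetical.

```python
import json

# Hypothetical relation schema; anything outside it is dropped.
ALLOWED_RELATIONS = {"BOUGHT", "OPENED", "WORKS_AT"}


def parse_triples(raw: str):
    """Parse LLM output (a JSON list of objects) into validated triples."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []  # Malformed output: extract nothing rather than guess.
    if not isinstance(items, list):
        return []
    triples = []
    for item in items:
        if not isinstance(item, dict):
            continue
        s, r, o = item.get("subject"), item.get("relation"), item.get("object")
        if not all(isinstance(v, str) for v in (s, r, o)):
            continue
        s, r, o = s.strip(), r.strip().upper(), o.strip()
        if s and o and r in ALLOWED_RELATIONS:
            triples.append((s, r, o))
    return triples


raw_output = (
    '[{"subject": "Acme", "relation": "bought", "object": "Product X"},'
    ' {"subject": "Acme", "relation": "LIKES", "object": "Product Y"}]'
)
print(parse_triples(raw_output))  # [('Acme', 'BOUGHT', 'Product X')]
```

Rejecting relations outside a fixed schema trades recall for a cleaner graph, which usually makes sense because a spurious edge is harder to spot later than a missing one.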
Reliability: expect 70-90% extraction accuracy on typical corpora. Never perfect. For high-stakes applications, human review of the extracted graph is worthwhile. See structured outputs post.
Entity resolution: the same entity mentioned multiple times needs to be recognized. 'Apple Inc.', 'Apple', 'AAPL' are the same node. Standard ER techniques apply; LLM-based resolution works well for noisy inputs.
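A minimal sketch of the simplest form of entity resolution: normalize each surface form, then look it up in an alias table. The `ALIASES` table is hypothetical and hand-written here; in practice it would be generated by standard ER tooling or an LLM pass and reviewed.

```python
import re

# Hypothetical alias table mapping normalized surface forms to a canonical key.
ALIASES = {"aapl": "apple", "apple inc": "apple"}


def canonical(name: str) -> str:
    """Lowercase, strip punctuation, collapse whitespace, then apply aliases."""
    key = re.sub(r"[^\w\s]", "", name).strip().lower()
    key = re.sub(r"\s+", " ", key)
    return ALIASES.get(key, key)


def resolve(triples):
    """Rewrite triples onto canonical entities and drop resulting duplicates."""
    return sorted({(canonical(s), r, canonical(o)) for s, r, o in triples})


triples = [
    ("Apple Inc.", "MAKES", "iPhone"),
    ("AAPL", "MAKES", "iPhone"),
    ("Apple", "MAKES", "iPhone"),
]
print(resolve(triples))  # [('apple', 'MAKES', 'iphone')]
```

Three mentions collapse into one node and one edge. A lookup table cannot tell Apple the company from apple the fruit, which is where context-aware resolution earns its cost.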
GraphRAG (Microsoft) and related approaches
Microsoft's GraphRAG paper and open-source release (2024) popularized a specific approach: community detection on the graph, summaries at each community level, query routing to the right community summary. Effective for very large corpora where plain RAG struggles with aggregate questions.
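The sketch below shows only the general shape of grouping a graph and routing a query to a group. It uses connected components as a crude stand-in for community detection (GraphRAG itself uses hierarchical Leiden clustering and LLM-written summaries), and it picks a community by word overlap with member names. Both function names are hypothetical.

```python
from collections import defaultdict


def communities(edges):
    """Group nodes into connected components (a stand-in for real clustering)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, groups = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:  # Depth-first walk over one component.
            node = stack.pop()
            group.append(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups


def pick_community(query, groups):
    """Route a query to the community whose member names it mentions most."""
    words = set(query.lower().split())
    return max(groups, key=lambda g: len(words & {m.lower() for m in g}))


groups = communities([("acme", "globex"), ("globex", "initech"), ("umbrella", "hooli")])
print(groups)
print(pick_community("what ties umbrella to hooli", groups))
```

In a real pipeline each community would carry a pre-written summary, and the query would be answered from the summaries rather than from member names.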
Not always the right choice. GraphRAG's indexing cost is high (extracting a graph from unstructured text is expensive LLM work). For corpora under 100K documents without strong relational structure, plain RAG is often sufficient and much cheaper.
Use cases that fit
Investigations and research: connecting people, companies, events across documents. Intelligence analysis, journalism, due diligence. The graph structure matches the work.
Enterprise search across structured business data: customers linked to orders linked to products linked to support tickets. Business questions that span these relationships benefit.
Scientific literature: papers cite papers, authors collaborate with authors. Co-citation networks and author graphs add real value beyond semantic retrieval alone.
Regulatory compliance: regulations reference other regulations, amend earlier rules, apply to specific entity types. The graph of references is critical. See compliance post.
Limitations and pitfalls
Indexing cost. Graph construction from unstructured text is 5-20x the cost of plain vector indexing. Budget accordingly or use incremental construction.
Stale graphs. As source data changes, the graph drifts out of sync. Either rebuild periodically (expensive) or apply incremental updates (complex). Mature KG-RAG systems have dedicated pipelines for this.
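One common way to approach incremental updates is to fingerprint each source document and re-extract only what changed. The sketch below plans such an update. The function names are hypothetical, and it assumes every edge in the graph records the ID of the document it was extracted from, so stale edges can be deleted by document.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Stable content hash used to detect changed documents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_update(previous, documents):
    """Compare stored fingerprints with current documents.

    previous:  doc_id -> fingerprint saved by the last indexing run
    documents: doc_id -> current document text
    Returns (changed_or_new_ids, removed_ids, new_fingerprints).
    """
    current = {doc_id: fingerprint(text) for doc_id, text in documents.items()}
    changed = sorted(d for d, h in current.items() if previous.get(d) != h)
    removed = sorted(d for d in previous if d not in current)
    return changed, removed, current


previous = {
    "doc-a": fingerprint("one"),
    "doc-b": fingerprint("two"),
    "doc-c": fingerprint("three"),
}
documents = {"doc-a": "one", "doc-b": "two, revised", "doc-d": "brand new"}

changed, removed, current = plan_update(previous, documents)
print(changed)  # ['doc-b', 'doc-d']
print(removed)  # ['doc-c']
```

The pipeline would then delete edges sourced from `changed` and `removed` documents, re-run extraction on `changed` only, and persist `current` for the next run. Entity resolution across old and new edges is the part that makes this complex in practice.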
Over-engineering. Many teams reach for KG-RAG when plain RAG with good chunking would suffice. Start with plain RAG; add graph structure only when you can name specific questions plain RAG can't answer well.
Rollout
Phase 1: plain RAG, measure where it fails. See RAG patterns post.
Phase 2: build graph on top of existing RAG; route graph-appropriate queries.
Phase 3: merge results; refine as evidence accumulates.
Don't skip Phase 1: you need the baseline.