Knowledge graph RAG (sometimes called GraphRAG, KG-RAG, or structured RAG) extends retrieval from 'find similar documents' to 'traverse relationships between entities.' For use cases where the answer lives across multiple connected documents rather than in any single one, graph-augmented retrieval is a step change. This post covers when knowledge graph RAG earns its added complexity, how we build these systems, and where they fail.

Figure: KG-RAG architecture. Dual index: vector DB for semantic retrieval, knowledge graph for entity relationships. Query router decides which path per question. Results merged for LLM context.

When plain RAG is not enough

Plain RAG works well when the answer is in one or two chunks. 'What is our refund policy?' retrieves the refund policy document; the LLM answers from it. Straightforward.

Plain RAG breaks down when the answer requires joining information across documents. 'Which customers from Europe bought product X and had support tickets in the last quarter?' requires joining CRM data, product catalog, and ticket database. Retrieval alone returns fragments; the LLM has to reconstruct the join in context, often unreliably.

Knowledge graph RAG handles this explicitly. Entities (customer, product, ticket) are nodes; relationships are edges. The query becomes a graph traversal; retrieval returns connected subgraphs; the LLM reasons over structured context.
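To make the traversal concrete, here is a minimal in-memory sketch of the Europe / product X / support ticket question. Everything in it is an assumption for illustration: the toy edge list, the relation names (LOCATED_IN, BOUGHT, OPENED), and the helper functions. A real system would run the equivalent query against a graph database, and the 'last quarter' date filter is left out to keep it short.

```python
from collections import defaultdict

# Hypothetical toy data: (subject, relation, object) edges.
EDGES = [
    ("alice", "LOCATED_IN", "Europe"),
    ("bob", "LOCATED_IN", "North America"),
    ("carol", "LOCATED_IN", "Europe"),
    ("alice", "BOUGHT", "product_x"),
    ("bob", "BOUGHT", "product_x"),
    ("carol", "BOUGHT", "product_y"),
    ("alice", "OPENED", "ticket_101"),
    ("bob", "OPENED", "ticket_102"),
]


def build_index(edges):
    """Index edges as (node, relation) -> set of neighbouring nodes."""
    index = defaultdict(set)
    for subj, rel, obj in edges:
        index[(subj, rel)].add(obj)
    return index


def customers_matching(edges, region, product):
    """Customers in `region` who bought `product` and opened at least one ticket."""
    index = build_index(edges)
    customers = {subj for subj, rel, _ in edges if rel == "LOCATED_IN"}
    return sorted(
        c
        for c in customers
        if region in index[(c, "LOCATED_IN")]
        and product in index[(c, "BOUGHT")]
        and index[(c, "OPENED")]
    )


result = customers_matching(EDGES, "Europe", "product_x")
print(result)
```

The join happens in the traversal, not in the LLM's context window: the model only sees the customers that already satisfy all three conditions.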

Architecture

Dual index. Vector database for semantic retrieval (text chunks, unstructured documents). Knowledge graph (Neo4j, TigerGraph, or similar) for entity relationships. Both populated from the same source data during ingestion.

Query router. An LLM (or a smaller classifier) decides whether the query is semantic (use vector), relational (use graph), or both. 'Show me similar documents' goes to vector. 'How are A and B connected?' goes to graph. 'Find documents about A connected to B' goes to both.
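As a rough illustration of the routing decision, the sketch below uses keyword cues in place of the LLM or classifier. The cue lists and the `route` function are hypothetical and far too crude for production; the point is only the three-way output (vector, graph, both) and the default to vector when nothing relational is detected.

```python
import re

# Hypothetical cue lists; a real router would be an LLM prompt or a trained classifier.
RELATIONAL_CUES = re.compile(
    r"\b(connected|linked|related to|relationship|between)\b", re.IGNORECASE
)
SEMANTIC_CUES = re.compile(
    r"\b(similar|about|documents?|explain)\b", re.IGNORECASE
)


def route(query: str) -> str:
    """Return 'vector', 'graph', or 'both' for a user query."""
    relational = bool(RELATIONAL_CUES.search(query))
    semantic = bool(SEMANTIC_CUES.search(query))
    if relational and semantic:
        return "both"
    if relational:
        return "graph"
    # Default to plain semantic retrieval when no relational cue is found.
    return "vector"


for q in (
    "Show me similar documents",
    "How are A and B connected?",
    "Find documents about A connected to B",
):
    print(q, "->", route(q))
```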

Result merging. When both paths run, results are combined into LLM context. Relationships from the graph; supporting text from the vector store. LLM sees a structured context with both the facts and the prose.
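One possible shape for the merged context, sketched under assumptions: graph results arrive as (subject, relation, object) triples, vector results as dicts with `source`, `text`, and `score` keys, and the FACTS / SUPPORTING TEXT layout is just one convention. None of these names come from a specific library.

```python
def merge_context(triples, passages, max_passages=3):
    """Combine graph facts and vector passages into one prompt context string."""
    # Deduplicate triples while preserving their original order.
    seen = set()
    fact_lines = []
    for triple in triples:
        if triple not in seen:
            seen.add(triple)
            subj, rel, obj = triple
            fact_lines.append(f"- {subj} {rel} {obj}")

    # Keep only the highest-scoring passages from the vector store.
    ranked = sorted(passages, key=lambda p: p["score"], reverse=True)[:max_passages]
    text_lines = [f"[{p['source']}] {p['text']}" for p in ranked]

    return (
        "FACTS\n" + "\n".join(fact_lines)
        + "\n\nSUPPORTING TEXT\n" + "\n".join(text_lines)
    )


context = merge_context(
    triples=[("Acme", "ACQUIRED", "Beta"), ("Acme", "ACQUIRED", "Beta")],
    passages=[
        {"source": "a.txt", "text": "Acme bought Beta.", "score": 0.9},
        {"source": "b.txt", "text": "Unrelated note.", "score": 0.1},
    ],
    max_passages=1,
)
print(context)
```

Keeping facts and prose in separate labelled blocks lets the prompt tell the model which part is structured and which part is supporting evidence.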

Graph construction from unstructured data

The hard part. Extracting entities and relationships from text requires NLP. Named entity recognition for entities; relation extraction for edges. Modern approach: LLM-based extraction with structured output. Pass chunks through an LLM prompted to extract (entity, relationship, entity) triples.
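A minimal sketch of the extraction step, assuming the model is asked for JSON and that `call_llm` is whatever function sends a prompt to your model and returns its text. The prompt wording, the JSON schema, and the function names are illustrative assumptions. The part worth copying is the defensive parsing: malformed output and incomplete triples are dropped rather than written to the graph.

```python
import json

# Hypothetical prompt; doubled braces are literal braces after .format().
PROMPT_TEMPLATE = (
    "Extract (subject, relation, object) triples from the text below. "
    'Respond with JSON only: {{"triples": [{{"subject": "...", '
    '"relation": "...", "object": "..."}}]}}\n\n'
    "Text: {chunk}"
)


def parse_triples(raw_response):
    """Parse the model's JSON reply, keeping only well-formed triples."""
    try:
        payload = json.loads(raw_response)
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, dict):
        return []
    items = payload.get("triples", [])
    if not isinstance(items, list):
        return []

    triples = []
    for item in items:
        if not isinstance(item, dict):
            continue
        parts = [item.get(key) for key in ("subject", "relation", "object")]
        # Reject triples with missing, non-string, or empty fields.
        if all(isinstance(p, str) and p.strip() for p in parts):
            triples.append(tuple(p.strip() for p in parts))
    return triples


def extract(chunk, call_llm):
    """call_llm is an assumed callable: prompt string in, response text out."""
    return parse_triples(call_llm(PROMPT_TEMPLATE.format(chunk=chunk)))


def fake_llm(prompt):
    # Stand-in for a real model call: one valid triple, one incomplete.
    return (
        '{"triples": ['
        '{"subject": "Acme", "relation": "ACQUIRED", "object": "Beta"}, '
        '{"subject": "Acme", "relation": ""}]}'
    )


print(extract("Acme acquired Beta in 2021.", fake_llm))
```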

Reliability: expect 70-90% extraction accuracy on typical corpora, never 100%. For high-stakes applications, human review of the extracted graph is worthwhile. See structured outputs post.

Entity resolution: the same entity mentioned multiple times needs to be recognized. 'Apple Inc.', 'Apple', 'AAPL' are the same node. Standard ER techniques apply; LLM-based resolution works well for noisy inputs.
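The simplest version of this is rule-based: normalize the mention, strip corporate suffixes, and consult an alias table. The sketch below does exactly that for the Apple example. The alias table and suffix list are hypothetical and hand-written; noisy inputs would need fuzzy matching or the LLM-based resolution mentioned above.

```python
import re

# Hypothetical alias table mapping known variants (e.g. tickers) to a canonical key.
ALIASES = {"aapl": "apple"}

# Common corporate suffixes to strip from the end of a mention.
SUFFIXES = re.compile(r"\b(inc|corp|corporation|ltd|llc|co)\.?$", re.IGNORECASE)


def canonical_key(mention):
    """Map a raw mention to the key of the graph node it should attach to."""
    key = mention.strip().lower()
    key = SUFFIXES.sub("", key).strip(" ,.")
    return ALIASES.get(key, key)


def resolve(mentions):
    """Group raw mentions by canonical key."""
    groups = {}
    for mention in mentions:
        groups.setdefault(canonical_key(mention), []).append(mention)
    return groups


resolved = resolve(["Apple Inc.", "Apple", "AAPL", "Microsoft"])
print(resolved)
```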

GraphRAG (Microsoft) and related approaches

Microsoft's GraphRAG paper and open-source release (2024) popularized a specific approach: community detection on the graph, summaries at each community level, query routing to the right community summary. Effective for very large corpora where plain RAG struggles with aggregate questions.
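The shape of that pipeline can be sketched in a few lines, with heavy caveats. Connected components stand in here for a real community-detection algorithm such as Leiden, the 'summary' is a joined list of facts rather than an LLM-written summary, and routing is naive word overlap. All function names are hypothetical; this is not the GraphRAG implementation, only an illustration of detect, summarize, route.

```python
from collections import defaultdict


def communities(edges):
    """Group nodes into connected components (a crude stand-in for community detection)."""
    adjacency = defaultdict(set)
    for a, _, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


def summarize(group, edges):
    """Placeholder summary; a real system would ask an LLM to summarize the community."""
    return "; ".join(f"{a} {rel} {b}" for a, rel, b in edges if a in group)


def route_to_community(query, groups):
    """Return the index of the community sharing the most entity names with the query."""
    words = set(query.lower().replace("?", "").split())
    return max(
        range(len(groups)),
        key=lambda i: len(words & {name.lower() for name in groups[i]}),
    )


edges = [
    ("acme", "ACQUIRED", "beta"),
    ("beta", "SUPPLIES", "gamma"),
    ("delta", "PARTNERS_WITH", "epsilon"),
]
groups = communities(edges)
summaries = [summarize(g, edges) for g in groups]
chosen = route_to_community("Who does delta partner with?", groups)
print(summaries[chosen])
```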

Not always the right choice. GraphRAG's indexing cost is high (extracting a graph from unstructured text is expensive LLM work). For corpora under 100K documents without strong relational structure, plain RAG is often sufficient and much cheaper.

Use cases that fit

Investigations and research: connecting people, companies, events across documents. Intelligence analysis, journalism, due diligence. The graph structure matches the work.

Enterprise search across structured business data: customers linked to orders linked to products linked to support tickets. Business questions that span these relationships benefit.

Scientific literature: papers cite papers, authors collaborate with authors. Co-citation networks and author graphs add real value beyond semantic retrieval alone.

Regulatory compliance: regulations reference other regulations, amend earlier rules, apply to specific entity types. The graph of references is critical. See compliance post.

Limitations and pitfalls

Indexing cost. Graph construction from unstructured text is 5-20x the cost of plain vector indexing. Budget accordingly or use incremental construction.
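For budgeting, the multiplier translates directly into a range. The helper below is a trivial sketch; the function name is made up and the 200.0 baseline is a placeholder, not a measured cost.

```python
def graph_indexing_budget(vector_indexing_cost, low_multiplier=5.0, high_multiplier=20.0):
    """Return the (low, high) cost range for graph construction.

    The default multipliers are the 5-20x range quoted above.
    """
    return (vector_indexing_cost * low_multiplier, vector_indexing_cost * high_multiplier)


# Placeholder baseline: suppose plain vector indexing costs 200.0 (any currency unit).
low, high = graph_indexing_budget(200.0)
print(f"Budget between {low:.0f} and {high:.0f}")
```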

Stale graphs. As source data changes, the graph drifts out of sync. Either rebuild periodically (expensive) or apply incremental updates (complex). Mature KG-RAG systems have dedicated pipelines for this.
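One common way to keep incremental updates tractable is to fingerprint each source document and re-extract only what changed. The sketch below assumes documents are available as an id-to-text mapping and that the previous run's fingerprints were stored somewhere; the function names are hypothetical. It only plans the update. Removing a changed document's stale triples from the graph is the complex part and is not shown.

```python
import hashlib


def fingerprint(text):
    """Stable content hash used to detect changed documents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_update(previous, documents):
    """Decide which documents need re-extraction.

    previous:  doc_id -> fingerprint recorded on the last indexing run
    documents: doc_id -> current text
    Returns (changed_or_new_ids, removed_ids, new_fingerprints).
    """
    current = {doc_id: fingerprint(text) for doc_id, text in documents.items()}
    changed = sorted(d for d, h in current.items() if previous.get(d) != h)
    removed = sorted(d for d in previous if d not in current)
    return changed, removed, current


previous = {
    "a": fingerprint("old text of a"),
    "b": fingerprint("unchanged text of b"),
    "c": fingerprint("document that was deleted"),
}
documents = {
    "a": "new text of a",
    "b": "unchanged text of b",
    "d": "brand new document",
}
changed, removed, current = plan_update(previous, documents)
print(changed, removed)
```

Only the documents in `changed` go back through LLM extraction, which is where most of the indexing cost sits.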

Over-engineering. Many teams reach for KG-RAG when plain RAG with good chunking would suffice. Start with plain RAG; add graph structure only when you can name specific questions plain RAG can't answer well.

Rollout

Phase 1: plain RAG, measure where it fails. See RAG patterns post. Phase 2: build graph on top of existing RAG; route graph-appropriate queries. Phase 3: merge results; refine as evidence accumulates. Don't skip Phase 1 — you need the baseline.
