eazyware
Engineering·July 29, 2024·11 min read

GraphRAG patterns: Microsoft GraphRAG and alternatives

GraphRAG extracts entities and communities from documents; queries combine community summaries with local context. Where it wins, where it does not.

Kushal R.
Engineering lead

If knowledge graph RAG is the what, GraphRAG patterns are the how: specific architectural patterns that have emerged from 2024-2025 open-source work and production deployments. Global summaries, community detection, query-focused exploration, and hybrid approaches each fit different query types. This post catalogs those patterns and gives deployment guidance.

Build-time vs query-time
Figure: Microsoft GraphRAG, two-phase design.
Phase 1, offline indexing: extract entities and relations from docs; build the graph and detect communities; LLM-generate community summaries. This is an expensive one-time cost per corpus.
Phase 2, query time: global mode aggregates community summaries; local mode traverses the graph near entities; hybrid combines both. A query classifier picks the mode.
When GraphRAG earns its complexity: global queries ("what are the main themes across these 500 reports?"); multi-hop questions spanning documents; corpora stable enough that the offline indexing cost amortizes. Skip it for freeform chat over frequently changing docs and for simple single-doc lookup.
Build-time: extract entities, build graph, detect communities, generate summaries. Query-time: route query, traverse relevant subgraph, merge context, LLM synthesis.

Build-time phases

Entity extraction. An LLM reads each chunk, identifies entities (people, companies, products, events), and outputs structured triples. Quality depends on the prompt and the domain: for well-known entity types, modern LLMs are generally reliable; for specialized domains, you may need few-shot examples or domain-specific extractors.
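
Whatever model does the extraction, its output needs validating before it touches the graph. A minimal sketch, assuming the LLM is prompted to return a JSON array of objects with `subject`, `relation`, and `object` keys. That schema, the `Triple` class, and `parse_triples` are illustrative names, not part of any GraphRAG library.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    object: str


def parse_triples(raw: str) -> list[Triple]:
    """Parse an LLM response into triples, dropping malformed items."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []  # unparseable output yields no triples; the caller may retry
    if not isinstance(items, list):
        return []
    triples = []
    for item in items:
        if not isinstance(item, dict):
            continue
        fields = [item.get(k) for k in ("subject", "relation", "object")]
        # Keep only items where all three fields are non-empty strings.
        if all(isinstance(f, str) and f.strip() for f in fields):
            triples.append(Triple(*(f.strip() for f in fields)))
    return triples


# Hypothetical model output: one valid triple, one incomplete item.
raw_response = (
    '[{"subject": "Acme", "relation": "acquired", "object": "Widgets Inc"},'
    ' {"subject": "Acme"}]'
)
print(parse_triples(raw_response))
```

Dropping incomplete items silently is a design choice; logging the drop rate per document is a cheap signal that the extraction prompt needs work.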

Entity resolution. Merge mentions that refer to the same real-world thing. This is critical for a useful graph: without resolution, 'Apple' mentioned in 100 documents can become 100 separate nodes, and variants such as 'Apple Inc.' never link up.
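
A minimal sketch of the cheapest resolution layer: string normalization plus a hand-curated alias table. The suffix list, the `aliases` mapping, and the function names are assumptions for illustration; production resolvers typically add embedding similarity or an LLM pass on top.

```python
import re
from collections import defaultdict


def normalize(name: str) -> str:
    """Lowercase, drop punctuation and common corporate suffixes."""
    name = re.sub(r"[^\w\s]", "", name.lower())
    name = re.sub(r"\b(inc|corp|corporation|ltd|llc)\b", "", name)
    return " ".join(name.split())


def resolve(mentions: list[str], aliases: dict[str, str]) -> dict[str, list[str]]:
    """Group raw mentions under one canonical key each.

    `aliases` maps a normalized alias to a normalized canonical name.
    """
    groups = defaultdict(list)
    for mention in mentions:
        key = normalize(mention)
        key = aliases.get(key, key)  # fold known aliases into the canonical key
        groups[key].append(mention)
    return dict(groups)


mentions = ["Apple", "Apple Inc.", "apple", "Apple Computer", "Microsoft Corp"]
aliases = {"apple computer": "apple"}  # hand-curated, hypothetical
print(resolve(mentions, aliases))
```

String matching alone cannot tell Apple the company from apple the fruit, so treat this as a first pass that shrinks the candidate set rather than a complete resolver.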

Relationship extraction. Connect entities via typed relationships (works at, acquired, references, etc.). This step is again LLM-based; domain-specific fine-tuning helps at scale.

Community detection. Run Louvain or Leiden on the graph. Hierarchical variants produce nested communities: small tight-knit clusters sit inside larger ones. Each level gets its own summaries.
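
Louvain optimizes modularity, and Leiden is commonly run with the same objective. The sketch below does not implement either algorithm; it only computes the modularity score of a candidate partition on an undirected, unweighted graph, which is useful for sanity-checking what a detection library returns. The function name and the toy graph are illustrative.

```python
from collections import defaultdict


def modularity(edges: list[tuple[str, str]], community_of: dict[str, int]) -> float:
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints in the community
    degree = defaultdict(int)  # sum of node degrees per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    # Q = sum over communities of (intra edges / m) - (degree sum / 2m)^2
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by a single bridge edge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_cluster = {node: 0 for node in two_clusters}

print(modularity(edges, two_clusters))  # 5/14, about 0.357
print(modularity(edges, one_cluster))   # 0.0
```

Splitting at the bridge scores higher than lumping everything together, which is the behavior a detection algorithm is searching for across the whole graph.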

Community summarization. LLM-generated summary of each community. 'This cluster of 40 entities relates to semiconductor supply chain risks; key players include...' Hierarchical summaries enable efficient query routing.

Query-time patterns

Query-focused exploration. For specific entity questions ('tell me about X'), start at the entity node, traverse neighbors, gather context. Bounded BFS to N hops away. LLM synthesizes from retrieved subgraph.
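
The bounded traversal can be sketched in a few lines. This assumes the graph is available as an in-memory adjacency mapping; `bounded_bfs` and the toy graph are illustrative, and a graph database would express the same thing as a variable-length path query.

```python
from collections import deque


def bounded_bfs(adjacency, start, max_hops):
    """Return {node: hop_distance} for nodes within max_hops of start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # at the hop limit: do not expand further
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


adjacency = {
    "X": ["A", "B"],
    "A": ["X", "C"],
    "B": ["X"],
    "C": ["A", "D"],
    "D": ["C"],
}
print(bounded_bfs(adjacency, "X", 2))  # D is 3 hops away and is excluded
```

Keeping the hop distance alongside each node lets the context builder rank nearer entities ahead of distant ones when the prompt budget is tight.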

Global summary queries. For aggregate questions ('what are the themes'), route to community summaries at the appropriate level. For a 100K-document corpus, top-level communities give a high-altitude view; descend for details.
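
One way to pick the "appropriate level" is by token budget: use the finest level whose summaries all fit in the prompt. The sketch below assumes that rule; it is one heuristic among several, and the `CommunitySummary` fields and `pick_level` name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CommunitySummary:
    community_id: str
    level: int   # 0 = top of the hierarchy, larger = finer-grained
    text: str
    tokens: int


def pick_level(summaries, token_budget):
    """Return summaries from the finest level that fits within token_budget."""
    by_level = {}
    for s in summaries:
        by_level.setdefault(s.level, []).append(s)
    for level in sorted(by_level, reverse=True):  # try the finest level first
        if sum(s.tokens for s in by_level[level]) <= token_budget:
            return by_level[level]
    # Nothing fits: fall back to the top level, even though it exceeds the budget.
    return by_level[min(by_level)] if by_level else []


summaries = [
    CommunitySummary("root", 0, "Semiconductor supply chain overview", 300),
    CommunitySummary("fabs", 1, "Foundry capacity risks", 400),
    CommunitySummary("tools", 1, "Lithography equipment vendors", 400),
    CommunitySummary("policy", 1, "Export control developments", 400),
]
print([s.community_id for s in pick_level(summaries, token_budget=1000)])
print([s.community_id for s in pick_level(summaries, token_budget=2000)])
```

With a 1,000-token budget only the top-level summary fits; at 2,000 tokens the three finer-grained summaries are used instead.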

Hybrid. Mix semantic retrieval with graph traversal. Start with vector retrieval to find entry points; from each, explore the graph neighborhood. Covers both 'find relevant' and 'explore from here.'
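
A minimal sketch of the hybrid flow, assuming node embeddings are already computed and held in memory. The vectors here are made-up two-dimensional toys, and `hybrid_retrieve` is an illustrative name; a real system would use a vector index and a graph store.

```python
import math
from collections import deque


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_retrieve(query_vec, node_vecs, adjacency, top_k=2, max_hops=1):
    """Vector search picks entry nodes; graph expansion adds their neighborhood."""
    ranked = sorted(node_vecs, key=lambda n: cosine(query_vec, node_vecs[n]), reverse=True)
    entry_points = ranked[:top_k]
    seen = {n: 0 for n in entry_points}  # node -> hops from nearest entry point
    queue = deque(entry_points)
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return entry_points, set(seen)


node_vecs = {"tsmc": [1.0, 0.0], "asml": [0.9, 0.1], "bakery": [0.0, 1.0]}
adjacency = {"tsmc": ["asml", "nvidia"], "asml": ["tsmc"], "nvidia": ["tsmc"], "bakery": []}

entry, context_nodes = hybrid_retrieve([1.0, 0.05], node_vecs, adjacency, top_k=1, max_hops=1)
print(entry, sorted(context_nodes))
```

Note that `nvidia` has no embedding in this toy data and is still retrieved, because it is one hop from the entry point. That is the 'explore from here' half of the pattern.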

Implementation choices

Graph store. Neo4j is mature and widely deployed. TigerGraph and Memgraph are alternatives. Simpler setups can use Postgres with graph extensions (Apache AGE) or in-memory graphs with NetworkX. For sub-10M node graphs, Postgres is often sufficient; beyond that, dedicated graph databases earn their complexity.

Index strategy. Index entity nodes by name and aliases, and relationships by type. For large graphs, also index community IDs so community-level queries stay fast.

LLM for summarization. Use cheaper models for community summaries: each one runs only once at build time, but there are many of them. Reserve frontier models for final query synthesis.

Cost considerations

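Build-time cost is dominated by LLM extraction. For a 100K-document corpus, assuming roughly one 1K-token chunk per document: ~100M tokens of extraction work. At $0.50/1M tokens (mid-tier model), that's $50 of raw inference. Add community summarization: 10-20K summaries × 1K tokens = 10-20M tokens, or another $5-10 at the same rate. Total build: roughly $55-60 for 100K docs at this flat rate; budget more for output-token pricing, prompt overhead, retries, and longer documents.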
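
The estimate above can be reproduced in a few lines. This is a sketch under a flat-rate assumption: one price for every token, one 1K-token chunk per document, and no prompt overhead or output-token premium. `build_cost_usd` and its parameters are illustrative names.

```python
def build_cost_usd(num_docs, tokens_per_doc, num_summaries,
                   tokens_per_summary, usd_per_million_tokens):
    """Flat-rate estimate: returns (extraction, summarization, total) in USD."""
    extraction_tokens = num_docs * tokens_per_doc
    summary_tokens = num_summaries * tokens_per_summary
    extraction_usd = extraction_tokens / 1_000_000 * usd_per_million_tokens
    summary_usd = summary_tokens / 1_000_000 * usd_per_million_tokens
    return extraction_usd, summary_usd, extraction_usd + summary_usd


# Low and high ends of the 10-20K summary range, at $0.50 per 1M tokens.
low = build_cost_usd(100_000, 1_000, 10_000, 1_000, 0.50)
high = build_cost_usd(100_000, 1_000, 20_000, 1_000, 0.50)
print(low)   # (50.0, 5.0, 55.0)
print(high)  # (50.0, 10.0, 60.0)
```

At a single flat $0.50/1M rate, extraction comes to $50 and summarization to $5-10; any higher summarization figure implies a higher effective rate for that step, for example from output-token pricing or prompts that repeat context.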

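Query-time cost for local queries is similar to plain RAG: one LLM call per query plus graph traversal overhead, which is negligible. Global queries that aggregate many community summaries (Microsoft GraphRAG does this map-reduce style) can take many LLM calls per query, so budget for them separately.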

Incremental updates matter. Rebuilding from scratch on every corpus change is expensive. Design for incremental updates: new documents are extracted and added to the graph, and only the affected communities are re-summarized.
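
A minimal sketch of the bookkeeping this implies, assuming you keep an entity-to-community mapping from the last full build. `affected_communities` and the sample IDs are hypothetical.

```python
def affected_communities(new_doc_entities, community_of):
    """Split a new document's entities into dirty communities and unseen entities."""
    dirty = set()       # communities whose summaries need regenerating
    unassigned = set()  # entities not yet placed in any community
    for entity in new_doc_entities:
        if entity in community_of:
            dirty.add(community_of[entity])
        else:
            unassigned.add(entity)
    return dirty, unassigned


community_of = {"TSMC": "c1", "ASML": "c1", "Acme": "c2"}
dirty, unassigned = affected_communities(["TSMC", "ASML", "NewCo"], community_of)
print(dirty, unassigned)  # c2 is untouched and keeps its existing summary
```

Unseen entities still need a home. Attaching them to a neighbor's community is a cheap interim step, with a periodic full re-run of community detection to correct drift.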

When GraphRAG earns complexity

Corpora over 100K documents where aggregate questions matter. 'What are the main themes' in a small corpus doesn't need community summaries; plain RAG handles it. At scale, community summaries are often the only practical way to answer.

Relational question workloads. If users ask 'how is X connected to Y' frequently, the graph structure pays off. For factoid questions, don't bother.

Evolving corpora with stable entity spaces. News, regulatory filings, research literature: new documents mostly mention existing entities, so the graph keeps accumulating structure over time. See the knowledge graph RAG post.

Common pitfalls

Over-extraction. Pulling too many entities creates a cluttered graph with many weak connections. Prefer quality over quantity in extraction prompts. See the RAG patterns post.

Stale summaries. Community summaries built six months ago may describe relationships that have since changed. Invalidate summaries when the underlying entities change significantly.

Ignoring evals. GraphRAG complexity demands evaluation rigor. Build eval sets specifically for graph-dependent questions; measure whether added complexity helps. See eval infrastructure.
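
A minimal sketch of such a comparison: score each pipeline per question type so graph-dependent questions are reported separately from factoids. The pipelines here are stubs, the eval set is made up, and substring matching is a deliberately crude scorer; swap in your own answer functions and grading.

```python
from collections import defaultdict


def accuracy_by_type(answer_fn, eval_set):
    """Per-question-type accuracy using case-insensitive substring match."""
    hits, totals = defaultdict(int), defaultdict(int)
    for question, expected, qtype in eval_set:
        totals[qtype] += 1
        if expected.lower() in answer_fn(question).lower():
            hits[qtype] += 1
    return {t: hits[t] / totals[t] for t in totals}


# Hypothetical eval set: (question, expected substring, question type).
eval_set = [
    ("Who acquired Widgets Inc?", "Acme", "factoid"),
    ("How is Acme connected to Globex?", "Widgets Inc", "multi_hop"),
]


def plain_rag(question):  # stub standing in for a chunk-retrieval pipeline
    return "Acme acquired Widgets Inc." if "acquired" in question else "No connection found."


def graph_rag(question):  # stub standing in for a graph-backed pipeline
    if "acquired" in question:
        return "Acme acquired Widgets Inc."
    return "Acme owns Widgets Inc, a Globex supplier."


print(accuracy_by_type(plain_rag, eval_set))
print(accuracy_by_type(graph_rag, eval_set))
```

If the graph pipeline only matches plain RAG on the multi-hop slice of a real eval set, the added complexity is not paying for itself.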
