Multi-tenancy is the hardest part of shipping AI in B2B SaaS. Traditional isolation handles databases, auth, and row-level access. AI adds new surfaces (retrieval indexes, context assembly, cache keys, logs), each of which can leak data across tenants if implemented naively. This post describes the isolation pattern we deploy and the three failures we see most often.
Why AI multi-tenancy is uniquely hard
Traditional SaaS: tenant_id column on rows, ORM middleware enforces filtering. Hard to leak if you use the framework correctly.
AI adds: vector indexes (may or may not encode tenant_id), context assembly (pulling from multiple sources), LLM context window (ephemeral but logged), response caches (keyed by prompt not tenant), system prompts (shared across tenants, contain assumptions). Each is a potential leak point.
The isolation layers
Layer 1 — Authentication carries tenant_id. Every authenticated request knows its tenant. JWT or session token encodes it. Foundation for everything below.
Layer 2 — Retrieval always filters by tenant. Vector DB query includes tenant_id filter. Keyword search includes it. Hybrid search enforces it on both sides. The filter must live in the retrieval library, not in calling code; otherwise developers forget it the third time they write retrieval logic.
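One way to enforce this at the interface level is to make tenant_id a required argument of the only retrieval entry point, so no caller can issue an unfiltered query. The sketch below is a minimal illustration, not a real backend: `Doc`, `TenantScopedRetriever`, and the word-overlap scoring are hypothetical stand-ins for a vector or hybrid search client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    tenant_id: str
    text: str


class TenantScopedRetriever:
    """Hypothetical wrapper: the only search entry point requires tenant_id."""

    def __init__(self, docs: list[Doc]) -> None:
        self._docs = list(docs)

    def search(self, *, tenant_id: str, query: str, k: int = 5) -> list[Doc]:
        if not tenant_id:
            raise ValueError("tenant_id is required for every retrieval call")
        # Filter by tenant BEFORE scoring, so other tenants' documents
        # are never even candidates.
        candidates = [d for d in self._docs if d.tenant_id == tenant_id]
        terms = set(query.lower().split())
        # Toy relevance score: number of query words present in the document.
        scored = [(len(terms & set(d.text.lower().split())), d) for d in candidates]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
        return [d for _, d in ranked[:k]]


retriever = TenantScopedRetriever([
    Doc("a1", "tenant_a", "refund policy for enterprise plans"),
    Doc("b1", "tenant_b", "refund policy for starter plans"),
])
hits = retriever.search(tenant_id="tenant_a", query="refund policy")
```

Because `tenant_id` is keyword-only and validated, a feature team that reuses this class cannot forget the filter; they would have to bypass the library entirely, which is easier to catch in review.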
Layer 3 — Context sanitization. The context assembled for the LLM is checked for cross-tenant references. Example failure: the retrieval filter was correct, but one of the retrieved documents contains 'see attached message from Customer B, Tenant 42.' The filter matched on metadata, so the document body still leaks. Mitigation: sanitize during ingestion, or add a sanitization pass at retrieval time.
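A retrieval-time pass could look something like the sketch below. It assumes you maintain a registry of tenant names and aliases (the hypothetical `known_tenants` mapping) and it simply flags any chunk that mentions a tenant other than the requester. Name matching is a crude heuristic that will miss paraphrases and may over-flag common words, so treat it as a starting point rather than a complete defense.

```python
import re


def build_patterns(known_tenants: dict[str, list[str]]) -> dict[str, re.Pattern]:
    """Compile one case-insensitive pattern per tenant from its names and aliases."""
    return {
        tenant_id: re.compile(
            "|".join(rf"\b{re.escape(alias)}\b" for alias in aliases),
            re.IGNORECASE,
        )
        for tenant_id, aliases in known_tenants.items()
        if aliases  # an empty alias list would compile to a match-everything pattern
    }


def sanitize_context(chunks, requesting_tenant, patterns):
    """Split chunks into (kept, flagged); a chunk is flagged if it names another tenant."""
    kept, flagged = [], []
    for chunk in chunks:
        mentions_other = any(
            pattern.search(chunk)
            for tenant_id, pattern in patterns.items()
            if tenant_id != requesting_tenant
        )
        (flagged if mentions_other else kept).append(chunk)
    return kept, flagged


# Hypothetical registry and context, mirroring the failure described above.
patterns = build_patterns({
    "tenant_a": ["Acme Corp"],
    "tenant_42": ["Customer B", "Tenant 42"],
})
chunks = [
    "Our refund window is 30 days.",
    "See attached message from Customer B, Tenant 42.",
]
kept, flagged = sanitize_context(chunks, "tenant_a", patterns)
```

Flagged chunks can be dropped, redacted, or routed to review; which is right depends on how costly a false positive is for your product.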
Layer 4 — Cache isolation. Semantic caching saves real cost but is a cross-tenant leak if naive. Tenant A asks 'what is our policy on X'; Tenant B asks the same. If cache key is just the prompt, Tenant B gets Tenant A's answer. Cache key must include tenant_id.
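A minimal sketch of a caching wrapper that cannot build a key without a tenant. For brevity it uses exact matching on a normalized prompt rather than embedding similarity; for a true semantic cache the same idea would apply by partitioning the similarity lookup per tenant. `TenantScopedCache` and `get_or_compute` are hypothetical names.

```python
import hashlib
from typing import Callable


class TenantScopedCache:
    """Hypothetical cache wrapper: tenant_id is always part of the key."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    @staticmethod
    def _key(tenant_id: str, prompt: str) -> str:
        if not tenant_id:
            raise ValueError("tenant_id is required to build a cache key")
        normalized = " ".join(prompt.lower().split())
        # Hash tenant and prompt separately so a crafted prompt cannot
        # be made to collide with another tenant's key.
        tenant_part = hashlib.sha256(tenant_id.encode()).hexdigest()
        prompt_part = hashlib.sha256(normalized.encode()).hexdigest()
        return f"{tenant_part}:{prompt_part}"

    def get_or_compute(self, tenant_id: str, prompt: str, compute: Callable[[], str]) -> str:
        key = self._key(tenant_id, prompt)
        if key not in self._store:
            self._store[key] = compute()  # only called on a cache miss
        return self._store[key]


cache = TenantScopedCache()
answer_a = cache.get_or_compute("tenant_a", "What is our policy on X?", lambda: "A's policy")
answer_b = cache.get_or_compute("tenant_b", "What is our policy on X?", lambda: "B's policy")
```

Identical prompts from two tenants land in separate entries, so Tenant B's call runs its own computation instead of receiving Tenant A's cached answer.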
Layer 5 — Audit logs. Every AI request, retrieval, and response logged with tenant_id. Logs queryable by tenant. Retention policies per tenant if required.
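As a rough illustration of tenant-tagged structured logs, the sketch below uses Python's standard `logging` module with a `LoggerAdapter` that stamps tenant_id on every record and a small JSON formatter. The logger name `ai_audit`, the event names, and the helper `audit_logger_for` are assumptions made for the example.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with tenant_id as a top-level field."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "event": record.getMessage(),
            # None here means someone logged without the adapter; worth alerting on.
            "tenant_id": getattr(record, "tenant_id", None),
        })


def audit_logger_for(base: logging.Logger, tenant_id: str) -> logging.LoggerAdapter:
    """Return a logger that attaches tenant_id to every entry it emits."""
    if not tenant_id:
        raise ValueError("tenant_id is required for audit logging")
    return logging.LoggerAdapter(base, {"tenant_id": tenant_id})


# Wire the formatter to an in-memory stream so the output is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
base = logging.getLogger("ai_audit")
base.setLevel(logging.INFO)
base.addHandler(handler)
base.propagate = False

log = audit_logger_for(base, "tenant_a")
log.info("retrieval")
log.info("llm_response")
entries = [json.loads(line) for line in stream.getvalue().splitlines()]
```

With tenant_id as a structured field rather than free text, per-tenant queries and per-tenant retention become filters in whatever log store you ship these entries to.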
The three common failures
1. Retrieval filter missed in one code path. A new feature added; developer wrote their own retrieval query and forgot the tenant filter. Fix: retrieval library enforces tenant_id at the interface level.
2. Shared cache across tenants. Developers added caching; didn't consider tenant dimension. Fix: cache key must include tenant_id, enforced in caching wrapper.
3. Log aggregation across tenants. Logs go to a shared system without tenant labeling. An analyst querying logs to debug one tenant's issue sees entries from every tenant. Fix: tenant_id as a structured log field on every entry.
Testing for tenant isolation
Red-team your own system. Have a user in Tenant A ask questions that should be answerable only from Tenant A's data, and verify that none of Tenant B's data shows up in the retrieved context or the response. Add these as automated CI tests. See red-teaming post.
Specific tests: identical document names in two tenants, verify retrieval returns the right one. Document in Tenant A referencing Tenant B by name, verify Tenant B cannot retrieve it. Identical prompts from two tenants with different expected answers, verify no cache crossover.
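The three tests above might be sketched as plain assert-style test functions, which pytest would collect as written. `ToySystem` is a hypothetical in-memory stand-in with substring retrieval and a tenant-keyed cache; in a real suite you would point the same assertions at a fixture wrapping your actual pipeline.

```python
from dataclasses import dataclass, field


@dataclass
class ToySystem:
    """Hypothetical stand-in: tenant-filtered retrieval plus a tenant-keyed cache."""
    docs: list[tuple[str, str, str]]  # (tenant_id, name, text)
    cache: dict[tuple[str, str], str] = field(default_factory=dict)

    def retrieve(self, tenant_id: str, query: str) -> list[tuple[str, str, str]]:
        q = query.lower()
        return [
            d for d in self.docs
            if d[0] == tenant_id and (q in d[1].lower() or q in d[2].lower())
        ]

    def answer(self, tenant_id: str, prompt: str) -> str:
        key = (tenant_id, prompt)
        if key not in self.cache:
            hits = self.retrieve(tenant_id, prompt)
            self.cache[key] = hits[0][2] if hits else "no answer"
        return self.cache[key]


def make_system() -> ToySystem:
    return ToySystem(docs=[
        ("tenant_a", "policy.md", "Tenant A refunds within 30 days"),
        ("tenant_b", "policy.md", "Tenant B refunds within 14 days"),
        ("tenant_a", "notes.md", "Escalation thread mentioning Tenant B by name"),
    ])


def test_same_document_name_returns_own_tenant_copy():
    hits = make_system().retrieve("tenant_b", "policy.md")
    assert [h[0] for h in hits] == ["tenant_b"]


def test_document_mentioning_other_tenant_is_not_retrievable_by_it():
    hits = make_system().retrieve("tenant_b", "Tenant B")
    assert all(h[0] == "tenant_b" for h in hits)
    assert "notes.md" not in [h[1] for h in hits]


def test_identical_prompts_do_not_share_cache_entries():
    system = make_system()
    assert system.answer("tenant_a", "refunds") == "Tenant A refunds within 30 days"
    assert system.answer("tenant_b", "refunds") == "Tenant B refunds within 14 days"
```

Asking as Tenant A first and Tenant B second matters in the last test: it is the ordering that would surface a prompt-only cache key.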