Feature flags are a core primitive for AI deployment. Gradual rollouts, user-cohort targeting, kill switches on eval regressions — all feature flag capabilities. The difference between traditional feature flags and AI feature flags is mostly in the kill-switch and metric-triggered rollback patterns. This post is the flag architecture we deploy and the specific AI-relevant use cases that justify it.
Four layers
Layer 1 — Per-user overrides. Engineering can force a specific variant for a specific user (themselves, a debug account). Essential for debugging but must be tightly scoped — overrides visible in audit logs, limited to engineers.
Layer 2 — User-cohort targeting. Beta users, tenants on a specific plan, customers in EU region. Targeted rollouts to specific groups before wider release.
Layer 3 — Percentage rollout. 1%, 10%, 50%, 100% of traffic routed to the new variant. See canary deployments post.
Layer 4 — Global kill switch. Instant off across all traffic. The critical safety feature. Eval-triggered auto-toggle, panic button for on-call engineers.
The kill switch is the critical primitive
When a new prompt or model causes issues in production, seconds matter. Rolling forward to find a fix takes minutes to hours. Rolling back via a feature flag takes seconds.
Our default: every AI feature has a kill switch. On-call engineer can disable within 10 seconds. Disabled state falls back to the previous behavior (previous prompt, previous model, or a gracefully-degraded response).
Automated kill switches: eval regression detection triggers the flag off. If live-traffic sampled evals show pass rate dropping below threshold for N consecutive windows, flag auto-flips to off. Engineer gets paged to investigate, not to do an emergency rollback.
AI-specific patterns
Prompt switching. The current production prompt is flagged; a candidate prompt is available behind the flag. Flip to promote; flip back to roll back.
Model routing. User-cohort-based routing (users on Free plan to cheaper model, Enterprise to premium). Implemented as a flag check in the gateway.
Feature disabling. Auto-summarize feature in docs product. Flag off when the summarization quality regresses; users see the feature as temporarily unavailable instead of broken output.
Guardrail toggling. New safety filter behind a flag. Enable for 10% of traffic; measure false-positive rate; adjust before wider rollout. See guardrails post.
Tooling options
LaunchDarkly (commercial), Unleash (OSS), Split (commercial). All have percentage rollouts, user targeting, kill switches, audit logs. Pick based on team size and budget.
Self-built flag systems are common and often justified. A simple Redis-backed flag service with a UI is 1-2 weeks of engineering. For small teams, this is often the right choice.
Key capabilities to insist on: sub-second flag propagation (kill switches depend on fast activation), audit trail (who flipped what when), backup/restore (flag state is critical config).
Flag lifecycle management
Flags accumulate. A codebase with 50+ dormant flags is a maintenance nightmare. Enforce lifecycle:
Create flag with expiration date (3-6 months typical). Regular audit: flags at 100% for 30+ days should be removed (cleanup PR). Flags at 0% for 60+ days should be removed or resurrected (feature abandonment check).
This discipline prevents flag sprawl. The cost of managing 500 flags is much higher than the cost of 50.
Auditability
Every flag change logged. Who, when, what changed. Immutable audit trail. In regulated environments, this log is part of compliance. In all environments, it's essential for incident investigation (why did behavior change on Thursday → someone flipped flag X).