AI code review in CI has gone from 'curiosity' to 'standard practice' at forward-leaning engineering teams over the last 18 months. The tools — CodeRabbit, GitHub's built-in review AI, custom implementations — are genuinely useful. They also produce noise that can undermine trust. This post is how we tune AI code review to add signal without adding friction.

Signal vs noise

High-signal categories vs low-signal. Tuning: system prompt, confidence threshold, dismissal-rate metric, per-file skip rules.

What AI catches well

Null or undefined handling gaps. Race conditions, shared state bugs, async/await mistakes. Security vulnerabilities — secrets in code, SQL injection patterns, authorization gaps, CORS misconfigurations. API contract inconsistencies across files. LLMs spot these reliably.

What AI does badly

Style and naming preferences. Commentary and documentation suggestions. Refactor suggestions for code that already works. All these add clutter, dilute attention from actual bugs, and train engineers to dismiss.

Tuning for signal

System prompt: 'Only flag likely bugs, security issues, or API inconsistencies. Do not suggest style changes, renames, or additional comments. Do not suggest refactors unless they fix a bug.' This alone drops low-signal comments by 60-80%.

Confidence threshold: ask the model to self-rate confidence. Only publish comments with confidence above threshold (70% starting point). Tune based on dismissal rates — if engineers dismiss over 50%, raise the threshold.

Dismissal feedback loop: when engineers dismiss with 'not a bug,' capture the signal. Use it in future prompts or to fine-tune. Convert dismissal from symptom into fix.

File and language scope: skip tests, generated code, vendored dependencies, migrations. Focus AI attention on application code where bugs have real impact.

Integration patterns

Comment inline on PR as a bot. Make sure attributed clearly — pretending it's human is worse than labeling. Don't block merges on AI comments; humans have final say. Track comments-per-PR (should trend down), dismissal rate (below 40% is healthy), and bugs-caught-vs-shipped.

The cultural piece

Some engineers resent AI reviewing their code. Frame it right: AI is a fast first pass for obvious issues so human reviewers can focus on architecture and design. Not 'AI judging your code.' Framing matters.

AI code review in CI: what actually catches bugs

What AI catches well

What AI does badly

Tuning for signal

Integration patterns

The cultural piece

Continue the thread.

AI pair programming: Copilot, Cursor, Claude Code patterns

Building AI-native developer tools: what developers actually want

Why evaluation infrastructure matters more than prompts

Want to talk about this?