AI code review in CI has gone from 'curiosity' to 'standard practice' at forward-leaning engineering teams over the last 18 months. The tools — CodeRabbit, GitHub's built-in review AI, custom implementations — are genuinely useful. They also produce noise that can undermine trust. This post is how we tune AI code review to add signal without adding friction.
What AI catches well
Null or undefined handling gaps. Race conditions, shared state bugs, async/await mistakes. Security vulnerabilities — secrets in code, SQL injection patterns, authorization gaps, CORS misconfigurations. API contract inconsistencies across files. LLMs spot these reliably.
What AI does badly
Style and naming preferences. Commentary and documentation suggestions. Refactor suggestions for code that already works. All these add clutter, dilute attention from actual bugs, and train engineers to dismiss.
Tuning for signal
System prompt: 'Only flag likely bugs, security issues, or API inconsistencies. Do not suggest style changes, renames, or additional comments. Do not suggest refactors unless they fix a bug.' This alone drops low-signal comments by 60-80%.
Confidence threshold: ask the model to self-rate confidence. Only publish comments with confidence above threshold (70% starting point). Tune based on dismissal rates — if engineers dismiss over 50%, raise the threshold.
Dismissal feedback loop: when engineers dismiss with 'not a bug,' capture the signal. Use it in future prompts or to fine-tune. Convert dismissal from symptom into fix.
File and language scope: skip tests, generated code, vendored dependencies, migrations. Focus AI attention on application code where bugs have real impact.
Integration patterns
Comment inline on PR as a bot. Make sure attributed clearly — pretending it's human is worse than labeling. Don't block merges on AI comments; humans have final say. Track comments-per-PR (should trend down), dismissal rate (below 40% is healthy), and bugs-caught-vs-shipped.
The cultural piece
Some engineers resent AI reviewing their code. Frame it right: AI is a fast first pass for obvious issues so human reviewers can focus on architecture and design. Not 'AI judging your code.' Framing matters.