Insurance was built on actuarial models; AI isn't replacing that, it's augmenting the unstructured-text layer around it. Claims narratives, underwriting applications, loss-adjuster notes, and medical records in liability claims are where LLMs add value. This post lays out the split we've seen across our P&C and life insurer engagements: where LLMs fit and where they do not.
Where LLMs fit in insurance workflows
First-notice-of-loss (FNOL) triage
A claim arrives via phone, email, or form. An LLM reads the narrative, extracts structured fields (loss type, estimated severity, urgency indicators, suspected fraud signals), and classifies the claim into an appropriate track. Routing a catastrophic loss straight to a senior adjuster while auto-routing a minor fender-bender saves significant cycle time. In our deployments, FNOL triage reduces median time-to-assignment by 60-80%.
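As a rough illustration of the extract-then-route step, here is a minimal Python sketch. The field names, track names, and dollar thresholds are all assumptions made for the example, not values from our deployments, and the LLM call itself is left out: the sketch starts from the structured fields the model is assumed to return.

```python
from dataclasses import dataclass, field
from enum import Enum


class Track(Enum):
    # Hypothetical claim tracks; a real carrier will have its own taxonomy.
    SENIOR_ADJUSTER = "senior_adjuster"
    SIU_REVIEW = "siu_review"      # special investigations unit
    STANDARD = "standard"
    FAST_TRACK = "fast_track"


@dataclass
class FnolExtraction:
    """Structured fields the LLM is asked to return for one claim narrative."""
    loss_type: str                  # e.g. "auto_collision", "water_damage"
    estimated_severity_usd: float
    urgent: bool                    # e.g. injury reported, home uninhabitable
    fraud_signals: list[str] = field(default_factory=list)


def route(x: FnolExtraction,
          catastrophic_usd: float = 100_000,
          fast_track_usd: float = 2_500) -> Track:
    """Map extracted fields to a claim track. Thresholds are illustrative only."""
    # Urgent or very large losses go to a senior adjuster first.
    if x.urgent or x.estimated_severity_usd >= catastrophic_usd:
        return Track.SENIOR_ADJUSTER
    # Any fraud signal sends the claim for investigation, never to fast track.
    if x.fraud_signals:
        return Track.SIU_REVIEW
    if x.estimated_severity_usd <= fast_track_usd:
        return Track.FAST_TRACK
    return Track.STANDARD
```

Keeping the routing rules in plain code, separate from the model, means the LLM only fills in fields while the track assignment stays deterministic and easy to audit.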
Document-heavy claim review
Complex claims (commercial liability, bodily injury, business interruption) involve hundreds of pages of medical records, invoices, photos, and third-party reports. LLMs extract key facts, flag inconsistencies, and link each one back to its source document. This is a direct time saving for adjusters: 4 hours of reading becomes 45 minutes of reviewing AI-prepared summaries with links to source.
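Links back to source are only useful if they hold up. One way to check them, sketched below in Python, is to require the model to return a verbatim quote with every fact and then verify that the quote really appears on the cited page. The `CitedFact` shape and the `pages` lookup are assumptions for illustration; exact substring matching is a deliberately strict stand-in for whatever fuzzy matching noisy OCR text would need.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CitedFact:
    claim: str    # the fact as stated in the summary
    doc_id: str   # which source document it came from
    page: int     # 1-based page number
    quote: str    # verbatim supporting text the model copied


def _norm(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks don't cause misses."""
    return " ".join(text.lower().split())


def unsupported_facts(facts, pages):
    """Return the facts whose quote is not found on the cited page.

    `pages` maps (doc_id, page) -> extracted page text.
    """
    bad = []
    for f in facts:
        page_text = pages.get((f.doc_id, f.page), "")
        # An empty quote would trivially match, so treat it as unsupported.
        if not f.quote.strip() or _norm(f.quote) not in _norm(page_text):
            bad.append(f)
    return bad
```

Facts that fail the check can be dropped from the summary or shown to the adjuster as unverified, rather than presented with a link that leads nowhere.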
Subrogation and third-party liability research
Given an auto claim, was there a third party at fault? The LLM reads police reports, witness statements, and adjuster notes, extracts the liability narrative, and suggests subrogation candidates. The adjuster validates. The result is a modest but real recovery uplift.
Policy and coverage lookup
Frontline customer service representatives (CSRs) answering coverage questions are supported by RAG over policy documents, endorsements, and state-specific regulations. Response quality improves markedly over keyword search, and training time for new CSRs drops.
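Because coverage answers depend on jurisdiction, one detail worth sketching is filtering by state before ranking, so a CSR never sees another state's endorsement text. The Python below is a toy version under stated assumptions: `Chunk`, `retrieve`, and `build_prompt` are hypothetical names, and the token-overlap score is only a placeholder for the embedding similarity a real retriever would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source: str        # e.g. a form or endorsement name
    states: frozenset  # states where this text applies; empty means all states


def _tokens(s: str) -> set:
    return {w.strip(".,;:?()").lower() for w in s.split()} - {""}


def retrieve(question: str, state: str, chunks, k: int = 3):
    """Filter by jurisdiction first, then rank by a placeholder overlap score."""
    q = _tokens(question)
    eligible = [c for c in chunks if not c.states or state in c.states]
    scored = [(len(q & _tokens(c.text)), c) for c in eligible]
    scored = [(s, c) for s, c in scored if s > 0]
    scored.sort(key=lambda sc: sc[0], reverse=True)  # stable sort keeps input order on ties
    return [c for _, c in scored[:k]]


def build_prompt(question: str, hits) -> str:
    """Assemble a grounded prompt that asks the model to cite its sources."""
    context = "\n".join(f"[{c.source}] {c.text}" for c in hits)
    return (
        "Answer using only the excerpts below and cite the source in brackets. "
        "If the excerpts do not answer the question, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Filtering on metadata before ranking is a design choice: it makes the jurisdiction rule a hard constraint instead of something the similarity score may or may not respect.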
Where LLMs do not fit
Pricing decisions, underwriting decisions, and any decisioning covered by anti-discrimination rules require regulator-explainable models. A GLM, a GBM, or a logistic regression with documented features is what regulators expect. An LLM in the decision loop makes the model indefensible in state insurance department audits.
Fraud graph analysis (detecting organized rings via network anomalies) is a classical graph-ML problem. LLMs can read individual narratives and flag suspicious ones, but the cross-claim, cross-provider pattern detection belongs to specialized ML. See our fraud detection post.
Insurance-specific compliance concerns
Fair claims practices laws (state-level in the US, similar elsewhere) require consistent treatment of claimants. Any AI system that influences claim outcomes must be auditable: which claims did it recommend denying, and on what basis? Log every recommendation with its rationale and the documents that fed the decision.
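A minimal sketch of such a log entry in Python, assuming an append-only JSON Lines file. The field names and the `audit_record` and `append_audit` helpers are hypothetical; the idea shown is to store a SHA-256 hash of each input document so an auditor can later confirm exactly which version the model read.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(claim_id, recommendation, rationale, documents,
                 model_version, prompt_version):
    """Build one audit entry. `documents` maps doc_id -> raw bytes the model saw."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "claim_id": claim_id,
        "recommendation": recommendation,
        "rationale": rationale,
        "model_version": model_version,
        "prompt_version": prompt_version,
        # Hash the documents instead of copying them into the log.
        "documents": {
            doc_id: hashlib.sha256(content).hexdigest()
            for doc_id, content in sorted(documents.items())
        },
    }


def append_audit(path, record):
    """Append the record as one JSON line; existing lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
```

In production the file would more likely be a write-once store or a database table with restricted update rights, but the record shape carries over.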
Model risk management (MRM) standards (SOX-level rigor in most carriers) apply to AI models as they do to pricing models: version control, testing procedures, rollback plans, monitoring. The LLM-specific pieces, such as prompt version control and eval datasets, plug into the existing MRM framework.
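To make the LLM-specific pieces concrete, here is a small Python sketch of a content-derived prompt version and an eval gate that blocks a release below a set accuracy. The function names and the 0.95 default threshold are assumptions for illustration; `predict` stands in for whatever wraps the model call with the candidate prompt.

```python
import hashlib


def prompt_version(template: str) -> str:
    """Content-derived ID: any edit to the prompt text yields a new version."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()[:12]


def eval_accuracy(predict, dataset) -> float:
    """Fraction of labeled examples where predict(text) equals the expected label.

    `dataset` is a list of (text, expected_label) pairs.
    """
    if not dataset:
        raise ValueError("eval dataset is empty")
    correct = sum(1 for text, expected in dataset if predict(text) == expected)
    return correct / len(dataset)


def release_gate(predict, dataset, min_accuracy: float = 0.95) -> bool:
    """Return True only if the candidate meets the accuracy threshold."""
    return eval_accuracy(predict, dataset) >= min_accuracy
```

Recording the prompt version next to each eval result gives the MRM file the same traceability it already has for pricing model versions.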
HIPAA concerns for health claims: medical records in liability or workers' comp claims are PHI when you're handling them. See the healthcare post for the compliant stack.
Our deployment pattern
Start with a narrow, high-volume workflow (FNOL triage is common). Ship with human-in-the-loop review for 100% of AI decisions for the first 8-12 weeks. Collect disagreement data, meaning the cases where the adjuster overrode the AI, and use it to tune. Gradually lift the review requirement to low-confidence cases only. Full autonomous routing takes 6-9 months of this iteration in our experience. Trying to skip ahead is how insurers ship AI that gets pulled six months later after a regulatory complaint.
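The review gate and the disagreement metric can be sketched in a few lines of Python. The `Decision` fields, the `review_all` flag, and the idea of a single confidence threshold are assumptions for illustration; how confidence is estimated is left open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    claim_id: str
    ai_track: str
    confidence: float                  # 0.0 to 1.0, however the system estimates it
    human_track: Optional[str] = None  # set once an adjuster has reviewed


def needs_review(d: Decision, threshold: float, review_all: bool) -> bool:
    """Review everything during the initial period, then only low confidence."""
    return review_all or d.confidence < threshold


def override_rate(decisions) -> float:
    """Share of reviewed decisions where the adjuster changed the AI's track."""
    reviewed = [d for d in decisions if d.human_track is not None]
    if not reviewed:
        return 0.0
    overridden = sum(1 for d in reviewed if d.human_track != d.ai_track)
    return overridden / len(reviewed)
```

Tracking the override rate per confidence band is one way to decide when `review_all` can be switched off and where the threshold should sit.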