Detect Auto Claims Fraud

Predict claims fraud to enable straight-through processing (STP) of auto insurance claim payments.


Business Problem

Insurance companies face many challenges when trying to optimize the efficiency of processing auto insurance claims. On average, it takes about 20 days to process a claim, which often frustrates policyholders. Insurance companies look for ways to increase the efficiency of their claims workflows.

Since increasing the number of claim handlers is expensive, insurance companies have increasingly relied on automation to accelerate the process of paying or denying their auto insurance claims. Automation can increase Straight-Through Processing (STP) by more than 20%, resulting in faster claims processing and improved customer satisfaction.

However, STP brings with it several considerations, the most pressing of which is auto claims fraud. As insurance companies increase the speed at which they process claims, they also increase their exposure to fraudulent claims. Unfortunately, most of the systems widely used to prevent fraudulent claims from being processed either require large amounts of manual labor or rely on static rules.

Intelligent Solution

While Business Rule Management Systems (BRMS) will always be required as they implement mandatory rules related to compliance, AI has the ability to supplement these systems by improving the accuracy of predicting which incoming claims are fraudulent.

By learning from historical cases of fraud and their associated features, AI applies its learnings to new claims to assess whether they fall under the same fraudulent patterns. Unlike BRMS, which are static and have hard-coded rules, AI generates a probabilistic prediction and gives transparency on the unique drivers of fraud for each suspicious claim.

This not only allows investigators to route and triage claims by their likelihood of fraud, but also accelerates the review process because investigators know which aspects of a claim to evaluate. The probabilistic predictions also allow investigators to set thresholds that automatically approve or reject claims.

Value Estimation

What has ROI looked like for this use case? 

An STP use case typically involves multiple AI models, each contributing to ROI. For example, fraud detection, claims severity prediction, and litigation likelihood prediction are common applications of machine learning that can augment business rules and human judgment. Insurers implementing fraud detection models have reduced payments to fraud by 15% to 25% annually, saving $1 million to $3 million.

How would I measure ROI for my use case? 

  • Identify the fraudulent claims that machine learning detected but manual processing missed (the manual process's false negatives). Then calculate the amount that would have been paid on these claims if machine learning had not flagged them as fraud. For instance:

100 fraudulent claims * $20,000 each on average = $2 million per year

  • Identify the fraudulent claims that manual investigation detected but machine learning missed, and calculate the amount that would have been paid without manual investigation. For instance:

40 fraudulent claims * $5,000 each on average = $0.2 million per year

  • The difference between these two numbers is the ROI:

$2 million - $0.2 million = $1.8 million per year
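The ROI arithmetic above can be reproduced in a few lines. This is only a sketch using the illustrative figures from this section; the function and variable names are made up for the example, and you would substitute your own claim counts and average payments.

```python
def avoided_loss(num_claims: int, avg_payment: float) -> float:
    """Total payment avoided on a group of fraudulent claims."""
    return num_claims * avg_payment


# Fraud caught by machine learning but missed by manual processing.
model_only = avoided_loss(num_claims=100, avg_payment=20_000)

# Fraud caught by manual investigation but missed by machine learning.
manual_only = avoided_loss(num_claims=40, avg_payment=5_000)

# ROI is the difference between the two amounts.
roi = model_only - manual_only
print(f"Estimated ROI: ${roi:,.0f} per year")
```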

Technical Implementation

Before We Get Started

The first step is to work with executives to identify and prioritize the decisions for which automation will offer the greatest business value. In this example, the executives agree that achieving over 20% STP in claims payment is a critical success factor and that minimizing fraud is one of their top priorities.

Then, working with subject matter experts, the team develops a shared understanding of STP in claims payment and builds the decision logic for claims processing:

  • First, determine which specific decisions to automate. The best practice is to automate simple claims and send more complex claims to a human claims processor.
  • Next, determine which decisions will be based on business rules and which on machine learning. The best practice is that decisions that rely on compliance and business strategy are managed by rules, while decisions that rely on experience, including whether a claim is fraudulent and how much the payment will be, are handled by machine learning.

Once the decision logic is in good shape, it is time to build the business rules and machine learning models. Clarifying the decision logic reveals the true data needs, which helps decision owners see exactly what data and analytics drive decisions.

About the Data  

For illustrative purposes, this guide uses a simulated dataset that resembles the data an insurance company would have. The dataset consists of 10,746 rows and 45 columns. 

Problem Framing

The target variable for this use case is whether or not a claim submitted is fraudulent. It is a binary classification problem. In this dataset 1,746 of 10,746 claims (16%) are fraudulent. 

Below are examples of 44 features that can be used to train a model to identify fraud. They consist of historical data on customer policy details, claims data including free-text description, and internal business rules from national databases. These features help DataRobot extract relevant patterns to detect fraudulent claims.

Beyond the features listed below, it might help to incorporate any additional data your organization may collect that could be relevant to detecting fraudulent claims. For example, DataRobot is able to process image data as a feature together with numeric, categorical, and text features. Images of vehicles after the accident may be useful to detect fraud and help predict severity. DataRobot will automatically differentiate between important and unimportant features.

Sample Feature List
Feature Name | Data Type | Description | Data Source | Example
ID | Numeric | Claim ID | Claim | 156843
DATE | Date | Date of Policy | Policy | 31/01/2013
POLICY_LENGTH | Categorical | Length of Policy | Policy | 12 month
LOCALITY | Categorical | Customer’s locality | Customer | OX29
REGION | Categorical | Customer’s region | Customer | OX
GENDER | Numeric | Customer’s gender | Customer | 1
CLAIM_POLICY_DIFF_A | Numeric | Internal Policy | Policy | 0
CLAIM_POLICY_DIFF_B | Numeric | Internal Policy | Policy | 0
CLAIM_POLICY_DIFF_C | Numeric | Internal Policy | Policy | 1
CLAIM_POLICY_DIFF_D | Numeric | Internal Policy | Policy | 0
CLAIM_POLICY_DIFF_E | Numeric | Internal Policy | Policy | 0
POLICY_CLAIM_DAY_DIFF | Numeric | Number of days since policy taken | Policy, Claim | 94
DISTINCT_PARTIES_ON_CLAIM | Numeric | Number of people on claim | Claim | 4
CLM_AFTER_RNWL | Numeric | Renewal History | Policy | 0
NOTIF_AFT_RENEWAL | Numeric | Renewal History | Policy | 0
CLM_DURING_CAX | Numeric | Cancellation claim | Policy | 0
COMPLAINT | Numeric | Customer complaint | Policy | 0
CLM_before_PAYMENT | Numeric | Claim before premium paid | Policy, Claim | 0
PROP_before_CLM | Numeric | Claim History | Claim | 0
NCD_REC_before_CLM | Numeric | Claim History | Claim | 1
NOTIF_DELAY | Numeric | Delay in notification | Claim | 0
ACCIDENT_NIGHT | Numeric | Night time accident | Claim | 0
NUM_PI_CLAIM | Numeric | Number of personal injury claims | Claim | 0
NEW_VEHICLE_BEFORE_CLAIM | Numeric | Vehicle History | Vehicle, Claim | 0
PERSONAL_INJURY_INDICATOR | Numeric | Personal Injury flag | Claim | 0
CLAIM_TYPE_ACCIDENT | Numeric | Claim details | Claim | 1
CLAIM_TYPE_FIRE | Numeric | Claim details | Claim | 0
CLAIM_TYPE_MOTOR_THEFT | Numeric | Claim details | Claim | 0
CLAIM_TYPE_OTHER | Numeric | Claim details | Claim | 0
CLAIM_TYPE_WINDSCREEN | Numeric | Claim details | Claim | 0
LOCAL_TEL_MATCH | Numeric | Internal Rule Matching | Claim | 0
LOCAL_M_CLM_ADD_MATCH | Numeric | Internal Rule Matching | Claim | 0
LOCAL_M_CLM_PERS_MATCH | Numeric | Internal Rule Matching | Claim | 0
LOCAL_NON_CLM_ADD_MATCH | Numeric | Internal Rule Matching | Claim | 0
LOCAL_NON_CLM_PERS_MATCH | Numeric | Internal Rule Matching | Claim | 0
federal_TEL_MATCH | Numeric | Internal Rule Matching | Claim | 0
federal_CLM_ADD_MATCH | Numeric | Internal Rule Matching | Claim | 0
federal_CLM_PERS_MATCH | Numeric | Internal Rule Matching | Claim | 0
federal_NON_CLM_ADD_MATCH | Numeric | Internal Rule Matching | Claim | 0
federal_NON_CLM_PERS_MATCH | Numeric | Internal Rule Matching | Claim | 0
SCR_LOCAL_RULE_COUNT | Numeric | Internal Rule Matching | Claim | 0
SCR_NAT_RULE_COUNT | Numeric | Internal Rule Matching | Claim | 0
RULE MATCHES | Numeric | Internal Rule Matching | Claim | 0
CLAIM_DESCRIPTION | Text | Customer Claim Text | Claim | this via others themselves inc become within ours slow parking lot fast vehicle roundabout mall not indicating car caravan neck emergency

Data Preparation

Data from the claim table, policy table, customer table, and vehicle table are merged with customer ID as a key. Only data known before or at the time of the claim creation is used, except for the target variable. Each record is a claim.
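As a rough illustration of this merge, the sketch below left-joins policy, customer, and vehicle attributes onto each claim using the customer ID. The in-memory tables, the `customer_id` and `claim_id` keys, and all sample values are hypothetical stand-ins; in practice the join would more likely run in SQL or a dataframe library.

```python
def merge_claim_tables(claims, policies, customers, vehicles):
    """Left-join policy, customer, and vehicle attributes onto each claim.

    `claims` is a list of dicts; the other tables are dicts keyed by
    customer ID. The result has one row per claim.
    """
    rows = []
    for claim in claims:
        key = claim["customer_id"]
        row = dict(claim)  # copy so the source table is not mutated
        for table in (policies, customers, vehicles):
            row.update(table.get(key, {}))  # missing key: keep claim as-is
        rows.append(row)
    return rows


# Hypothetical toy tables, for illustration only.
claims = [
    {"claim_id": 1, "customer_id": "C1", "NUM_PI_CLAIM": 0},
    {"claim_id": 2, "customer_id": "C2", "NUM_PI_CLAIM": 6},
]
policies = {"C1": {"POLICY_LENGTH": "12 month"}, "C2": {"POLICY_LENGTH": "12 month"}}
customers = {"C1": {"REGION": "OX"}, "C2": {"REGION": "OX"}}
vehicles = {"C1": {"NEW_VEHICLE_BEFORE_CLAIM": 0}}

training_rows = merge_claim_tables(claims, policies, customers, vehicles)
print(training_rows[0])
```

Only fields known at or before claim creation should be kept in the merged rows; the target variable is the one exception.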

Model Training

DataRobot Automated Machine Learning takes care of appropriate preprocessing and partitioning of data. When modeling is done, you will see a variety of models sorted on the Leaderboard, where you will be able to select the best model based on accuracy, modeling speed, scoring speed, or interpretability, whichever is important to you. 

We will jump straight to interpreting the model results. For how to use DataRobot from start to finish and how to understand the data science methodologies embedded in its automation, refer to the DataRobot platform documentation.

Interpret Results

Feature Impact reveals that the number of past personal injury claims (NUM_PI_CLAIM) and internal rule matches (LOCAL_M_CLM_PERS_MATCH, RULE_MATCHES, SCR_LOCAL_RULE_COUNT) are among the strongest features in detecting fraudulent claims. 

Feature Effects (partial dependence plot) shows that the larger the number of personal injury claims (NUM_PI_CLAIM), the higher the likelihood of fraud (FRAUD). As expected, when a claim matches internal red flag rules, its likelihood of being fraudulent increases greatly. Interestingly, GENDER and CLAIM_TYPE_MOTOR_THEFT (car theft) are also strong features.

The current data includes CLAIM_DESCRIPTION as text. A Word Cloud reveals that customers who use the term roundabout, for example, are more likely to be committing fraud than those who use the term emergency. (Perhaps fraudsters have a script to invent a fake story?) In the Word Cloud, the size of a word indicates how many rows include the word, i.e., how often it appears. The more red a word is, the higher association it has to claims scored as fraudulent. Blue words are terms associated with claims scored as non-fraudulent.

Prediction Explanations provide up to 10 reasons for each prediction score, and they are available in real time. This gives Special Investigation Unit (SIU) agents and claim handlers useful information to check during their investigation. For example, DataRobot not only predicts that Claim ID 8296 has a 98.5% chance of being fraudulent, but it also explains that this high score is due to a specific internal rule match (LOCAL_M_CLM_PERS_MATCH, RULE_MATCHES) and the policyholder’s 6 previous personal injury claims (NUM_PI_CLAIM). When claim advisors need to deny a claim, they can provide the reasons why by consulting Prediction Explanations.

Evaluate Accuracy

The modeling results show that ENET Blender is the most accurate model, with 0.93 AUC on cross validation. This is an ensemble of 8 single models. The high AUC indicates that the model has learned signals that distinguish fraudulent from non-fraudulent claims, so its results are worth interpreting. (Keep in mind that blenders take longer to score than single models and so may not be ideal for real-time scoring.)

The Leaderboard shows that the modeling accuracy is stable across Validation, Cross Validation, and Holdout. Thus, you can expect to see similar results when you deploy the selected model. 

The steep increase in the average target value in the right side of the Lift Chart reveals that, when the model predicts that a claim has a high probability of being fraudulent (blue line), the claim tends to actually be fraudulent (orange line).

The confusion matrix shows that, of 2,149 claims in the holdout partition, the model predicted 372 claims as fraudulent and 1,777 claims as legitimate. Of the 372 claims predicted as fraud, 275 were actually fraudulent (true positives), and 97 were not (false positives). Of 1,777 claims predicted as non-fraud, 1,703 were actually not fraudulent (true negatives) and 74 were fraudulent (false negatives). Analysts examine this table to determine if the model is accurate enough for business implementation. 
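To help judge whether these counts are good enough, the standard summary metrics can be derived directly from the four cells. The sketch below uses the holdout counts quoted above; the metric definitions are the usual textbook ones, not something specific to DataRobot.

```python
# Holdout confusion-matrix counts quoted in the text.
tp, fp, tn, fn = 275, 97, 1703, 74

total = tp + fp + tn + fn                # all holdout claims
precision = tp / (tp + fp)               # share of flagged claims that are fraud
recall = tp / (tp + fn)                  # share of fraud that was flagged
false_positive_rate = fp / (fp + tn)     # share of legitimate claims flagged
accuracy = (tp + tn) / total             # share of all claims classified correctly

print(f"claims={total}")
print(f"precision={precision:.3f} recall={recall:.3f}")
print(f"false_positive_rate={false_positive_rate:.3f} accuracy={accuracy:.3f}")
```

Precision and recall are usually more informative than accuracy here, because only about 16% of claims are fraudulent.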


To convert model predictions into decisions, you determine the best threshold for classifying a claim as fraudulent or not; you use the ROC Curve tab to do this. The threshold will differ depending on how you want to use the model predictions. For example, if the main use of the fraud detection model is to automate payment, then you want to minimize false negatives (the number of fraudulent claims mistakenly predicted as not fraudulent) by adjusting the threshold that classifies prediction scores as fraud or not. On the other hand, if the main use is to automate the transfer of suspicious claims to SIU, then you want to minimize false positives (the number of non-fraudulent claims mistakenly predicted as fraudulent). Typically, you work within certain constraints. For example, you want to minimize false negatives, but you do not want false positives to exceed 100 claims because of the limited resources of SIU agents. In this case, you lower the threshold just to the point where the number of false positives reaches 100.
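The constrained search described above can be sketched as follows. This is not how the ROC Curve tab is implemented; it is a minimal illustration that assumes a claim is flagged when its score is at or above the threshold, and the function name and toy scores are hypothetical.

```python
def lowest_threshold_within_fp_budget(scores, labels, max_false_positives):
    """Return the lowest threshold whose false-positive count fits the budget.

    A claim is flagged as fraud when score >= threshold.
    labels: 1 = fraudulent, 0 = legitimate.
    Returns None if even the highest score exceeds the budget.
    """
    best = None
    # Walk thresholds from strictest to loosest; false positives can only grow.
    for threshold in sorted(set(scores), reverse=True):
        false_positives = sum(
            1 for s, y in zip(scores, labels) if s >= threshold and y == 0
        )
        if false_positives > max_false_positives:
            break
        best = threshold
    return best


# Hypothetical toy data: seven scored claims with known outcomes.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.20]
labels = [1, 1, 0, 1, 0, 1, 0]

threshold = lowest_threshold_within_fp_budget(scores, labels, max_false_positives=1)
print(threshold)
```

Lowering the threshold as far as the budget allows captures as many fraudulent claims as possible without overloading SIU agents.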

DataRobot allows you to simulate profit in the Profit Curve tab and set the threshold based on profit. You can experiment with the Payoff Matrix by setting the payoff associated with each of the four cells in the confusion matrix. For example, you can set the payoff value for a true positive at $20,000, the average payment associated with fraudulent claims. Assuming that a false positive means that the human investigator will not be able to spend time detecting a real fraudulent claim, the payoff value for a false positive is -$20,000. A true negative leads to auto pay and saves $100 by eliminating manual processing of the claim. A false negative means you miss a fraudulent claim, so you set the payoff value for a false negative to -$20,000. DataRobot then automatically calculates the threshold that maximizes profit. You can also measure DataRobot ROI by creating the same payoff matrix for your existing business process and subtracting the maximum profit of the existing process from that calculated by DataRobot.
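The idea behind a profit-maximizing threshold can be illustrated with a small sketch. The payoff values are the ones given above; the search itself, the function names, and the toy scores are assumptions for illustration and do not reflect how the Profit Curve tab is implemented.

```python
# Payoff per confusion-matrix cell, taken from the example in the text.
PAYOFF = {"tp": 20_000, "fp": -20_000, "tn": 100, "fn": -20_000}


def profit_at(threshold, scores, labels, payoff=PAYOFF):
    """Total payoff when claims with score >= threshold are flagged as fraud."""
    total = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label == 1:
            cell = "tp"
        elif flagged:
            cell = "fp"
        elif label == 1:
            cell = "fn"
        else:
            cell = "tn"
        total += payoff[cell]
    return total


def best_threshold(scores, labels, payoff=PAYOFF):
    """Try every observed score, plus 'flag nothing', and keep the most profitable."""
    candidates = sorted(set(scores)) + [float("inf")]
    return max(candidates, key=lambda t: profit_at(t, scores, labels, payoff))


# Hypothetical toy data: seven scored claims with known outcomes.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.20]
labels = [1, 1, 0, 1, 0, 1, 0]

threshold = best_threshold(scores, labels)
print(threshold, profit_at(threshold, scores, labels))
```

Because a missed fraud costs far more than the $100 saved by auto pay, the most profitable threshold on this toy data is fairly low.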

Once the threshold is set, model predictions are converted into fraud or non-fraud according to the threshold. These classification results are integrated into BRMS and become one of the many factors that determine the final decision.

Business Implementation

Decision Environment 

After a call advisor receives a claim at first notice of loss (FNOL), the rules engine applies the business rules and DataRobot generates scores based on the claim data gathered from the initial report of the accident, coupled with the policy data and vehicle data. For example, the rules engine checks for flags consisting of black-listed auto repair shops and hospitals, and DataRobot scores the likelihood that a given claim is fraudulent. The final decision is based on the combination of sub-decisions from multiple business rules and those from multiple DataRobot models. The decision logic determines which sub-decisions are made by business rules versus DataRobot, and how to combine the sub-decisions to reach the final decision.

For fraud detection, once the right model is selected, DataRobot makes it easy to deploy the model into the desired decision environment. In this tutorial, the fraud likelihood scores are sent to BRMS through the API and post-processed into high risk, medium risk, and low risk based on thresholds. The sub-decision for the fraud detection model is based on the degree of risk: high risk goes to SIU, medium risk goes to claim handlers, and low risk goes to auto pay. This sub-decision is combined with other sub-decisions to form the final decision. In this way, this sub-decision influences the final decision to auto pay, auto deny, send to SIU, or send to claim handlers.
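The post-processing step might look like the sketch below. The two cut-off values are hypothetical placeholders, since the text does not state the thresholds used; in practice they would be chosen from the ROC Curve or Profit Curve analysis and maintained in the BRMS.

```python
# Hypothetical cut-offs; the source does not specify the actual values.
HIGH_RISK_THRESHOLD = 0.80
MEDIUM_RISK_THRESHOLD = 0.30

# Routing per risk band, as described in the text.
ROUTES = {"high": "SIU", "medium": "claim handlers", "low": "auto pay"}


def risk_band(fraud_score: float) -> str:
    """Map a fraud likelihood score in [0, 1] to a risk band."""
    if fraud_score >= HIGH_RISK_THRESHOLD:
        return "high"
    if fraud_score >= MEDIUM_RISK_THRESHOLD:
        return "medium"
    return "low"


def fraud_sub_decision(fraud_score: float) -> str:
    """Return where the fraud model's sub-decision routes the claim."""
    return ROUTES[risk_band(fraud_score)]


for score in (0.985, 0.50, 0.05):
    print(score, risk_band(score), fraud_sub_decision(score))
```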

Decision Maturity 

Automation | Augmentation | Blend 

BRMS + DataRobot makes claims processing much faster through automation by allowing certain claims to be auto paid and auto denied. DataRobot also augments human decisions. Claims are directly sent to SIU or claim handlers with different skills based on the degree of risk. Moreover, human investigators can gain insights from explanations for each score that DataRobot provides.

Model Deployment

Predictions will be deployed through the API and sent to BRMS. All the models built by DataRobot AutoML are immediately ready to be deployed through the API. Using DataRobot MLOps, you can monitor, maintain, and update models in a single platform.

Decision Stakeholders

Decision Executors

The decision logic assigns claims that require manual investigation to claim handlers and SIU agents based on claim complexity. They investigate the claims, referring to the insights provided by DataRobot, and decide whether to pay or deny. Each week, they report a summary of the claims received and their decisions to the decision authors.

Decision Managers

Executives monitor the KPI dashboard, which visualizes the results of following the decision logic. For example, they track the number of fraudulent claims identified and missed. They can discuss with decision authors how to improve the decision logic each week.

Decision Authors

Senior managers in the claims department examine the performance of the decision logic using input from decision executors and decision managers. For example, decision executors report whether the claims routed to them as suspected fraud are reasonable, and decision managers report whether the rate of fraud is as expected. Based on this input, decision authors update the decision logic each week.

Decision Process

Instead of claim handlers manually investigating every claim, business rules and machine learning will identify simple claims that should be auto paid and problematic claims that should be auto denied. The solution combines decisions from rules and from machine learning to arrive at one of the following final decisions:

  • Auto pay
  • Auto deny
  • Send directly to Special Investigation Unit (SIU)
  • Assign to claim handlers

Routing to claim handlers includes intelligent triage, in which claim handlers receive fewer claims, and only those best suited to their skills and experience. For example, more complex claims can be identified and sent to more experienced claim handlers. SIU agents and claim handlers will decide whether to pay or deny the claims after investigation.
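One way to picture how sub-decisions combine into a final decision is the sketch below. The specific rule outcomes (`coverage_valid`, `red_flag_match`, `complex_claim`) and the order of precedence are hypothetical; the real decision logic is defined by the decision authors and lives in the BRMS.

```python
from dataclasses import dataclass


@dataclass
class SubDecisions:
    coverage_valid: bool   # hypothetical compliance rule outcome
    red_flag_match: bool   # e.g. a black-listed repair shop or hospital
    fraud_risk: str        # "high", "medium", or "low" from the fraud model
    complex_claim: bool    # hypothetical complexity flag


def final_decision(sub: SubDecisions) -> str:
    """Combine rule and model sub-decisions into one of four final decisions."""
    if not sub.coverage_valid:
        return "auto deny"                 # mandatory rules take precedence
    if sub.red_flag_match or sub.fraud_risk == "high":
        return "send to SIU"
    if sub.fraud_risk == "medium" or sub.complex_claim:
        return "assign to claim handlers"
    return "auto pay"


simple_claim = SubDecisions(True, False, "low", False)
suspicious_claim = SubDecisions(True, False, "high", False)
print(final_decision(simple_claim), "|", final_decision(suspicious_claim))
```

Keeping this logic outside the production systems makes it easier for decision authors to revise it frequently.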

Model Monitoring 

Each week, decision authors will monitor the fraud detection model and retrain it if data drift reaches a certain threshold. In addition, along with investigators, decision authors can regularly review the model decisions to ensure that data are available for future retraining of the fraud detection model. Based on this review, the decision authors can also update the decision logic. For example, they might add a repair shop to the red flags list or adjust the thresholds that convert fraud scores into high, medium, or low risk.

DataRobot provides tools for managing and monitoring the deployments, including accuracy and data drift. 
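To make the idea of a drift threshold concrete, the sketch below computes the Population Stability Index (PSI), one commonly used drift statistic, for a single numeric feature. It is shown as an assumption for illustration rather than as the metric the platform reports, and the 0.2 retraining cut-off is a common rule of thumb, not a value from this guide.

```python
import math


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a training sample (expected) and a recent sample (actual)."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # eps avoids log(0) and division by zero for empty bins
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def needs_retraining(expected, actual, threshold=0.2):
    """Flag the model for retraining when drift exceeds the chosen threshold."""
    return population_stability_index(expected, actual) > threshold


# Hypothetical samples of one feature at training time and in a later week.
training_sample = [i / 100 for i in range(100)]
shifted_sample = [0.5 + i / 200 for i in range(100)]

print(needs_retraining(training_sample, training_sample))
print(needs_retraining(training_sample, shifted_sample))
```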

Implementation Risks

Business goals should determine decision logic, not data. The project begins with business users building decision logic to improve business processes. Once decision logic is ready, the true data needs will become clear.

Integrating business rules and machine learning into production systems can be problematic. Business rules and machine learning models need to be updated frequently. Externalizing the rules engine and machine learning allows decision authors to make frequent improvements to decision logic. When the rules engine and machine learning are integrated into production systems, updating decision logic becomes difficult because it requires changes to the production systems.

Trying to automate all decisions will not work. It is important to decide which decisions to automate and which decisions to assign to humans. For example, business rules and machine learning cannot identify fraud 100% of the time; human involvement is still necessary for more complex claim cases.
