Reduce Policy Churn For Insurance Renewals

Proactively increase retention by predicting which policies will churn in the coming policy term.


Business Problem

In 2017, the Property and Casualty Insurance industry saw few new customers enter the market: just 1% in auto and 4% in home, for a total of 2% overall (Bain). Due to the industry’s low volume of new policyholders, being able to retain existing customers becomes a significant priority. Although insurers put tremendous effort into accurately assessing risk and offering the lowest prices, Bain reports that more than half of US policyholders lapse because they can receive (on average) a 20% reduction in price elsewhere.

Unfortunately, many of the strategies insurers use today to reduce churn are largely reactive. Retention rates are important KPIs that allow insurers to keep track of their relationships with policyholders; however, KPIs only assess historical performance and don’t help insurers learn which policies will churn in the future. For underwriters to apply the appropriate intervention strategies, it’s critical that insurers understand which policies are at risk of churning.

Intelligent Solution

AI helps insurers proactively increase their retention rates by predicting which policies are likely to churn at their upcoming renewals. After learning the complex patterns behind why policies churned in the past, AI models can apply those patterns to future policies. These models not only show underwriters the general drivers of churn across their portfolio, but also reveal the top reasons for predicted churn on each individual policy. Using these insights, underwriters can address at-risk policies based on their unique attributes.

Senior managers can leverage the aggregated predictions of churn to develop data-driven forecasting on renewals, while pricing actuaries can also use insights from these models to improve the competitiveness of their pricing plans across the various segments of their book.

Value Estimation

How valuable is this use case? 

Depending on the action taken based on model predictions and the size of the book, a 1% improvement in the overall retention rate could mean a significant improvement in renewal income. Take a book with $1 billion of written premium, for example: a 1% improvement in the retention rate amounts to a $10 million increase in gross income from this book of business. In addition, actions guided by model predictions could potentially improve the overall loss ratio; with an improved bottom line, insurers can also profitably grow the top line, improving the health of the book of business.
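The arithmetic in this example can be written as a small helper. This is only a sketch: the function name and parameters are illustrative, and it assumes that retained premium scales linearly with the retention rate (that is, every policy carries the book's average premium).

```python
def retention_premium_uplift(written_premium: float, retention_gain: float) -> float:
    """Extra premium retained when the retention rate improves by `retention_gain`.

    Assumes retained premium scales linearly with the retention rate,
    which is a simplification for illustration.
    """
    return written_premium * retention_gain


# The example from the text: a $1 billion book and a 1% retention improvement.
uplift = retention_premium_uplift(1_000_000_000, 0.01)
print(f"${uplift:,.0f}")  # $10,000,000
```

In practice the uplift depends on which policies are retained, since premiums and loss ratios vary across the book.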

Technical Implementation

About the Data 

For illustrative purposes, we use a synthetic historical dataset from a personal auto line in which we already know whether past policies churned. Insurance churn is usually evaluated at the policy level, so all the features in this dataset are organized at the policy level as well.

Problem Framing

The target variable for this use case is a binary variable: a policy churned (1) or not (0). So this is a binary classification problem.

The features relevant to predicting this target revolve around policy data. Below are several examples of features that may be relevant. Beyond these, we suggest incorporating any additional data your organization collects that could help predict churn. DataRobot will help you distinguish which features are important and which are not.

Sample Feature List

| Feature Name | Data Type | Description | Data Source |
|---|---|---|---|
| Avg Driver Age | Numeric | The average age of all drivers on the policy | Policy |
| Avg Premium per Vehicle | Numeric | Average premium per vehicle for the current policy term | Policy |
| Avg Vehicle Age | Numeric | The average age of all vehicles on the policy | Policy |
| Full Coverage Proportion | Numeric | Percentage of vehicles with full coverage (both liability and physical damage) | Policy |
| Driver Count | Numeric | Number of drivers on the policy | Policy |
| Gender Policy | Categorical | Gender of drivers: 0 = all Female; 1 = all Male; 2 = mixed | Policy |
| Min Driver Age | Numeric | Minimum driver age | Policy |
| Policy Premium | Numeric | Policy premium for the current term | Policy |
| ID | Numeric | Policy ID | Policy |
| Pct Premium Change | Numeric | Relative premium change | Policy |
| Policy Credit Indicator | Categorical | Policy Credit Indicator | Policy |
| Policy Lapse Indicator | Categorical | Whether the policy has had coverage lapses in the past year | Policy |
| BI_Limit | Categorical | Bodily Injury Limit | Policy |
| Multi-Policy Indicator | Categorical | Multiple Policies Indicator | Policy |
| Churn Indicator | Binary Numeric | Target: whether a policy has churned; 1 = Churned; 0 = Not Churned | Policy |
| Tenure | Numeric | Number of years the policy has been insured by the carrier | Policy |
| Tier | Numeric | Underwriting/Pricing Tiers; 1 = best tier; 15 = worst tier | Policy |
| Years Prior Insurer | Numeric | Number of years the policyholder was insured by the prior carrier | Policy |
| Zip Code | Categorical | Zip Code | Policy |
| Vehicle Count | Numeric | Number of vehicles on the policy | Policy |

Data Preparation

Personal auto insurers usually maintain several databases: policy, vehicle, and claims. The necessary features from these separate tables should be joined so that churn is evaluated at the policy level. A policy can have more than one vehicle, and churn is defined as all of the vehicles being removed from the policy, not just one.
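As a rough sketch of this roll-up, the snippet below aggregates vehicle-level rows into one record per policy and flags churn only when every vehicle has been removed. The row layout and column names (`policy_id`, `vehicle_age`, `full_coverage`, `removed`) are assumptions made for illustration, not the schema of any real system.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical vehicle-level rows; column names are illustrative only.
vehicles = [
    {"policy_id": 1, "vehicle_age": 3, "full_coverage": True, "removed": True},
    {"policy_id": 1, "vehicle_age": 9, "full_coverage": False, "removed": False},
    {"policy_id": 2, "vehicle_age": 5, "full_coverage": True, "removed": True},
]


def roll_up_to_policy(vehicle_rows):
    """Aggregate vehicle-level rows into one record per policy."""
    by_policy = defaultdict(list)
    for row in vehicle_rows:
        by_policy[row["policy_id"]].append(row)

    policies = {}
    for policy_id, rows in by_policy.items():
        policies[policy_id] = {
            "vehicle_count": len(rows),
            "avg_vehicle_age": mean(r["vehicle_age"] for r in rows),
            "full_coverage_proportion": mean(int(r["full_coverage"]) for r in rows),
            # A policy churns only when every vehicle on it has been removed.
            "churn": int(all(r["removed"] for r in rows)),
        }
    return policies


policy_table = roll_up_to_policy(vehicles)
```

Policy 1 keeps one vehicle, so it is not counted as churned even though another vehicle was removed.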

Model Training

DataRobot Automated Machine Learning automates many parts of the modeling pipeline. Instead of hand-coding and manually testing dozens of models to find the one that best fits your needs, DataRobot automatically runs dozens of models and finds the most accurate one for you, all in a matter of minutes. In addition to training the models, DataRobot automates other steps in the modeling process such as processing and partitioning the dataset.

While we will jump straight to the model results, DataRobot’s documentation explains how to use the platform from start to finish and describes the data science methodologies embedded in its automation.

Interpret the Results

Feature Impact—Which features are important to the model:

For a selected model, it helps to know which features are its key drivers. The Feature Impact plot ranks the features from most to least important and shows their relative importance. In the example below, BI_Limit is the most important feature for this model, followed by Avg Driver Age, Tier, Vehicle Count, and so forth.

[Figure: Feature Impact plot, DataRobot AI Platform]
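One common way to produce this kind of ranking is permutation importance: shuffle one feature at a time and measure how much the model's error grows. The sketch below is a generic illustration of that idea, not DataRobot's implementation; `toy_model`, the feature names, and the data are all made up for the example.

```python
import random


def toy_model(row):
    """Stand-in scorer (not a DataRobot model): churn risk falls with tenure."""
    return max(0.0, min(1.0, 0.95 - 0.05 * row["tenure"]))


def brier_score(model, rows, targets):
    """Mean squared error between predicted probability and the 0/1 outcome."""
    return sum((model(r) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)


def permutation_impact(model, rows, targets, features, seed=0):
    """Error increase after shuffling each feature, scaled so the top feature is 1."""
    rng = random.Random(seed)
    baseline = brier_score(model, rows, targets)
    raw = {}
    for feature in features:
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        permuted = [{**r, feature: v} for r, v in zip(rows, shuffled)]
        raw[feature] = brier_score(model, permuted, targets) - baseline
    top = max(raw.values()) or 1.0  # avoid dividing by zero if nothing matters
    return {f: v / top for f, v in sorted(raw.items(), key=lambda kv: -kv[1])}


# Synthetic data: short-tenure policies churn; zip_digit is pure noise here.
rows = [{"tenure": t, "zip_digit": t % 3} for t in range(18)]
targets = [1 if r["tenure"] < 6 else 0 for r in rows]
impact = permutation_impact(toy_model, rows, targets, ["tenure", "zip_digit"])
```

Because the toy model ignores `zip_digit`, shuffling it leaves the error unchanged and its impact is zero, while `tenure` ranks first.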

Feature Effects—How does each feature drive the model prediction:

Now that we know which features are important to the model, we can use the Partial Dependence graph to learn how each feature affects the predictions. The Partial Dependence plot for Tenure (see below) shows that the probability of churn decreases monotonically with policy tenure. In other words, the longer a policy stays with a carrier, the less likely it is to churn, all else being equal.

[Figure: Partial Dependence graph for Tenure, DataRobot AI Platform]
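Partial dependence can be sketched in a few lines: force the feature of interest to each value on a grid for every row, then average the model's predictions. The scorer below is a hypothetical stand-in with made-up coefficients, used only to show the mechanics.

```python
def toy_churn_model(row):
    """Stand-in scorer (hypothetical): risk falls with tenure, rises with premium change."""
    score = 0.6 - 0.04 * row["tenure"] + 0.5 * row["pct_premium_change"]
    return max(0.0, min(1.0, score))


def partial_dependence(model, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value for every row."""
    curve = []
    for value in grid:
        preds = [model({**row, feature: value}) for row in rows]
        curve.append((value, sum(preds) / len(preds)))
    return curve


# Illustrative policies; only the two features the toy model uses are included.
policies = [
    {"tenure": 1, "pct_premium_change": 0.10},
    {"tenure": 4, "pct_premium_change": 0.00},
    {"tenure": 9, "pct_premium_change": -0.05},
]
tenure_curve = partial_dependence(toy_churn_model, policies, "tenure", [0, 2, 4, 6, 8, 10])
for tenure, avg_risk in tenure_curve:
    print(tenure, round(avg_risk, 3))
```

With this toy scorer the averaged risk falls at every step of the grid, which is the monotonic pattern the text describes for Tenure.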

Prediction Explanation—What are the drivers for each individual prediction:

People like explanations. When an underwriter sees a very high or very low prediction for policy churn, they may wonder which features are contributing to it. Insights at the level of each prediction can not only help the underwriter understand how a prediction is made, but also increase their confidence in using the model.

By default, DataRobot provides the top 3 Prediction Explanations, and the user can request up to 10. Model predictions and explanations can be downloaded as a CSV file, and you can control which predictions are included in the file by specifying thresholds for high and low predictions.

The graph below shows the top 3 explanations for the 3 highest and 3 lowest predictions. From this graph, you can tell that, in general, the high predictions (i.e., high retention or low churn) are associated with long tenure and higher liability limits, while the low predictions (i.e., low retention or high churn) are associated with younger drivers and higher average premium per vehicle.
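The effect of the high and low thresholds can be illustrated with a short sketch that keeps only the rows at the extremes before writing a CSV. The field names, threshold values, and helper function are assumptions for illustration; they do not reflect the layout of the file DataRobot produces.

```python
import csv
import io


def filter_by_thresholds(predictions, low_threshold, high_threshold):
    """Keep rows whose prediction is at or below the low, or at or above the high, threshold."""
    return [
        row for row in predictions
        if row["prediction"] <= low_threshold or row["prediction"] >= high_threshold
    ]


# Hypothetical scored policies.
predictions = [
    {"policy_id": 101, "prediction": 0.92},
    {"policy_id": 102, "prediction": 0.48},
    {"policy_id": 103, "prediction": 0.07},
]
selected = filter_by_thresholds(predictions, low_threshold=0.10, high_threshold=0.90)

# Write the selected rows to an in-memory CSV.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["policy_id", "prediction"])
writer.writeheader()
writer.writerows(selected)
csv_text = buffer.getvalue()
```

Only policies 101 and 103 end up in the file; the mid-range prediction for policy 102 is left out.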

Evaluate Accuracy

Lift Chart

A Lift Chart is one way to evaluate model accuracy and effectiveness. The Lift Chart below shows how effectively the model differentiates policyholders who are less likely to renew (on the left) from those who are more likely to renew (on the right). The fact that the actual values (orange curve) closely track the predicted values (blue curve) tells us that the model fits the data well.

[Figure: Lift Chart, DataRobot AI Platform]
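The data behind a lift chart can be approximated as follows: sort rows by predicted value, split them into equal-sized bins, and compare the average prediction with the average actual outcome in each bin. This is a generic sketch with made-up numbers, not the platform's exact binning logic.

```python
def lift_chart_bins(predicted, actual, n_bins=10):
    """Sort rows by prediction, split into equal-sized bins, and average each bin."""
    ordered = sorted(zip(predicted, actual), key=lambda pair: pair[0])
    n = len(ordered)
    bins = []
    for b in range(n_bins):
        chunk = ordered[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # more bins than rows
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        mean_actual = sum(a for _, a in chunk) / len(chunk)
        bins.append((mean_pred, mean_actual))
    return bins


# Illustrative predictions and 0/1 outcomes.
predicted = [0.05, 0.10, 0.20, 0.30, 0.55, 0.60, 0.80, 0.90]
actual = [0, 0, 0, 1, 0, 1, 1, 1]
bins = lift_chart_bins(predicted, actual, n_bins=4)
for mean_pred, mean_actual in bins:
    print(round(mean_pred, 3), round(mean_actual, 3))
```

A well-fitting model shows the two averages rising together from the leftmost bin to the rightmost.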

Business Implementation

Decision Environment 

Once you have found the model that best learns the patterns in your data, DataRobot makes it easy to deploy it into your desired decision environment. Decision environments are the ways in which stakeholders in your organization consume the model’s predictions and use them to make decisions that affect the overall process.

Decision Maturity 

Automation | Augmentation | Blend 

Ideally, the policy churn model should be integrated with the insurer’s policy administration system so that a churn score is produced for every renewal policy. Complex business rules typically live in the same system and trigger an underwriter review before a renewal policy is processed; “Policy churn score over xx” can be one of those rules.
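A rule of this kind might look like the sketch below. The threshold is deliberately a parameter: the text leaves the cutoff ("xx") to the insurer, so the value used here is a placeholder, and the policy IDs and scores are invented.

```python
def needs_underwriter_review(churn_score: float, review_threshold: float) -> bool:
    """Flag a renewal for review when its churn score exceeds the threshold."""
    return churn_score > review_threshold


# Placeholder cutoff for illustration only; the real value is a business decision.
EXAMPLE_THRESHOLD = 0.7

# Hypothetical churn scores for upcoming renewals.
renewals = {"POL-001": 0.82, "POL-002": 0.35, "POL-003": 0.71}

# Queue flagged policies for review, highest churn score first.
review_queue = sorted(
    (pid for pid, score in renewals.items()
     if needs_underwriter_review(score, EXAMPLE_THRESHOLD)),
    key=lambda pid: -renewals[pid],
)
```

Ordering the queue by score lets underwriters start with the renewals most at risk.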

Actuaries, product managers, and other management teams may want to receive monthly reports about policy retention, both actual and predicted.
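A monthly report comparing predicted and actual retention could be assembled along these lines. The record layout (month, predicted churn probability, observed churn flag) is an assumption for the example, and predicted retention is taken simply as one minus the average predicted churn probability.

```python
from collections import defaultdict


def monthly_retention_report(records):
    """Compare predicted and actual retention rates per renewal month.

    Each record is (month, predicted_churn_probability, churned), where
    `churned` is 1 or 0 for renewals whose outcome is already known.
    """
    grouped = defaultdict(list)
    for month, churn_prob, churned in records:
        grouped[month].append((churn_prob, churned))

    report = {}
    for month, rows in sorted(grouped.items()):
        n = len(rows)
        report[month] = {
            "policies": n,
            "predicted_retention": 1 - sum(p for p, _ in rows) / n,
            "actual_retention": 1 - sum(c for _, c in rows) / n,
        }
    return report


# Illustrative records only.
records = [
    ("2024-01", 0.10, 0), ("2024-01", 0.40, 1),
    ("2024-02", 0.20, 0), ("2024-02", 0.30, 0),
]
report = monthly_retention_report(records)
```

Large or persistent gaps between the two rates in a given month are a signal to revisit the model or the business rules built on it.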

Model Deployment 

There are several ways the model can be deployed, depending on how ready it is to be deployed.

DataRobot Drag and Drop or REST API: Before the model is fully integrated into production, a pilot can help by 1) testing model performance on new data; 2) surfacing unexpected scenarios so that business rules can be adjusted accordingly; and 3) increasing end users’ confidence in using the model outputs to assist business decisions.

Connection to Other Systems: Once everyone is comfortable with the model and the process, integrating the model into production systems (the policy center, in this case) can maximize its value.

Decision Stakeholders
  • Underwriters
  • Product managers
  • Pricing Actuaries
  • Marketing
  • Senior management team
Decision Process

Underwriters can use the predictions to determine whether any proactive action can prevent a policy from churning. Product managers and pricing actuaries can use the predictions to better understand the competitive position of existing pricing plans, so that product managers can adjust their new business model and actuaries can account for the findings in the next rate review.

Model Monitoring 

Regular reports should be produced and distributed to the different stakeholders. For organizations with dashboard capabilities, model predictions can be integrated with the dashboard tool so that stakeholders can access real-time reports.

If the REST API is used to deploy the model, various metrics such as service health, data drift, and accuracy can all be monitored within DataRobot’s platform.
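To give a sense of what a data drift metric measures, the sketch below computes the Population Stability Index (PSI), one commonly used drift statistic, for a single numeric feature. It is a generic illustration with invented data and bin edges; how the platform itself computes drift is not described here.

```python
import math


def population_stability_index(expected, observed, edges, floor=1e-6):
    """PSI between two samples of one numeric feature, binned on shared edges."""

    def bin_shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # The bin index is the number of edges at or below the value.
            counts[sum(v >= e for e in edges)] += 1
        # Floor empty bins so the logarithm below stays defined.
        return [max(c / len(values), floor) for c in counts]

    exp_shares, obs_shares = bin_shares(expected), bin_shares(observed)
    return sum((o - e) * math.log(o / e) for e, o in zip(exp_shares, obs_shares))


# Illustrative driver ages at training time versus at scoring time.
training_ages = [22, 25, 31, 38, 44, 47, 52, 58, 63, 70]
scoring_ages = [19, 21, 23, 24, 26, 28, 33, 41, 55, 66]
edges = [30, 45, 60]

psi_same = population_stability_index(training_ages, training_ages, edges)
psi_shifted = population_stability_index(training_ages, scoring_ages, edges)
```

Identical samples give a PSI of zero, while the younger scoring population produces a clearly positive value, which would prompt a closer look at the model's inputs.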

If the user chooses to deploy the model outside of DataRobot, DataRobot MLOps can be leveraged to monitor essentially all the models deployed across the organization.

Implementation Risks
  • Failing to make predictions intuitive for underwriters to understand
  • Failing to help underwriters interpret the predictions and understand why the model makes them
  • Failing to build in proper business rules to capture abnormal activities