
How AI and Machine Learning Helps Improve Insurance Pricing

November 3, 2020

Insurance pricing is a never-ending battle. With the advent of comparative raters in the P&C insurance market, prospects can compare prices from many companies instantly, and, not surprisingly, they usually choose the lowest offer. Inaccurate pricing is costly for insurance companies: it hands customers to competitors, reduces retention, and attracts risky customers. This is why actuaries spend hours fine-tuning pricing models.

But how do actuaries actually create an insurance premium? And how can AI be helpful?

When a customer applies for a policy, the insurance company provides a customized premium that reflects the customer’s risk: the likelihood that the customer will file one or more claims and, more specifically, how much those claims will cost during the policy coverage period. To estimate this risk, pricing actuaries rely on predictive models. These models estimate how much the customer is expected to claim, based on information available at underwriting time.

For decades, Generalized Linear Models (GLM) have been the workhorse for building insurance pricing models due to their flexible structure and interpretability. The pricing team would use historical data, including a handful of policy attributes and other information about the customer, to build their GLM models. These models are often split into a frequency and a severity model; the outcome of such an approach is a rating table that is easy to use and understand.

Table 1.2

The rating table above has a base rate of $200 and four rating criteria: age, marital status, gender, and deductible. Each criterion has rating factors associated with it, which are used as multiplicative coefficients. For example, for a single 20-year-old woman with a $100 deductible, the premium would be calculated as:

Premium = $200 (base rate) x 2.03 (20 years old) x 1.12 (Single) x 1.2 (Female) x 1.25 ($100) = $682.08
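
The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the pricing team's actual tooling: the dictionary layout and the function name `compute_premium` are assumptions, and only the factors quoted in the example are included, since the full rating table is not reproduced here.

```python
from math import prod

BASE_RATE = 200.0  # base rate from the example, in dollars

# Only the factors quoted in the worked example; the layout is an assumption.
RATING_FACTORS = {
    "age": {"20": 2.03},
    "marital_status": {"single": 1.12},
    "gender": {"female": 1.2},
    "deductible": {"100": 1.25},
}


def compute_premium(profile, base_rate=BASE_RATE, factors=RATING_FACTORS):
    """Multiply the base rate by one factor per rating criterion."""
    return base_rate * prod(
        factors[criterion][level] for criterion, level in profile.items()
    )


profile = {
    "age": "20",
    "marital_status": "single",
    "gender": "female",
    "deductible": "100",
}
premium = compute_premium(profile)
print(f"Premium: ${premium:.2f}")  # Premium: $682.08
```

Because every criterion contributes a single multiplier, changing one attribute of the customer only changes one factor in the product, which is what makes a rating table easy to read and audit.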

Traditionally, the pricing team would not build a single model that directly predicts the incurred claim amount. They would first build a frequency model that predicts the number of claims, and then a severity model that predicts the average amount of one claim. Each model yields a rating table, and multiplying the two gives the pure premium rating table.
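
As a rough sketch of how the two rating tables combine: the pure premium is the expected number of claims times the expected cost per claim, so the base values and the factors for each rating level multiply. All numbers and names below (`frequency_table`, `severity_table`, `combine_tables`, the age bands) are hypothetical and chosen for illustration only; they do not come from the article's tables.

```python
# Hypothetical frequency model output: expected claims per policy-year.
frequency_table = {
    "base": 0.10,
    "age": {"18-25": 1.5, "26+": 1.0},
}

# Hypothetical severity model output: expected cost per claim, in dollars.
severity_table = {
    "base": 2000.0,
    "age": {"18-25": 1.25, "26+": 1.0},
}


def combine_tables(freq, sev):
    """Multiply base values and matching factors to get a pure premium table."""
    combined = {"base": freq["base"] * sev["base"]}
    for criterion in freq:
        if criterion == "base":
            continue
        combined[criterion] = {
            level: freq[criterion][level] * sev[criterion][level]
            for level in freq[criterion]
        }
    return combined


pure_premium_table = combine_tables(frequency_table, severity_table)

# Pure premium for a hypothetical driver in the 18-25 band.
young_driver = pure_premium_table["base"] * pure_premium_table["age"]["18-25"]
print(pure_premium_table)
print(young_driver)
```

The sketch assumes both tables use the same criteria and levels; in practice the two models may select different variables, in which case a missing factor would be treated as 1.0.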

This methodology has been widely adopted in the insurance industry in many countries for decades. However, like any other approach, it is subject to its own limitations and pain points, and this is why it’s being challenged today.


Insurance data offers abundant features, and these features are often highly correlated with each other. GLMs require manual interaction identification and feature selection, which limits the number of features that can be incorporated in the model and prevents it from capturing the increasingly complex interactions in the data.
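
To give a sense of scale for that manual search, the number of candidate pairwise interactions grows quadratically with the number of features. The feature counts below are hypothetical, and the sketch only counts pairs; it says nothing about which interactions actually matter.

```python
from math import comb


def candidate_pairwise_interactions(n_features):
    """Number of distinct feature pairs that could be tested as interactions."""
    return comb(n_features, 2)


# Hypothetical feature counts, for illustration only.
for n in (10, 50, 200):
    print(n, candidate_pairwise_interactions(n))
# 10 45
# 50 1225
# 200 19900
```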

In addition, the GLM approach requires a great deal of manual calculation and analysis from the pricing team, which can spend weeks or months refining the pricing model of a motor insurance product.

But most importantly, GLMs today are often less accurate than other types of machine learning algorithms, and this is the main reason why pricing teams have started to take a closer look at AI and machine learning. Indeed, inaccurate pricing is costly for insurance companies, with an impact on both portfolio health and retention.

Machine Learning models

Insurtech firms as well as large insurance companies have started to migrate their insurance pricing models away from GLMs. Of course, wide-scale adoption depends on the country and its regulation. The UK, for example, is quite advanced in this space. Instead of using GLMs with rating tables, these companies use random forests, XGBoost, or other models that estimate customer risk more accurately and need less manual work.

The machine learning methodology is similar to the traditional approach: building a model based on historical policy data that predicts the incurred claim amount. But there are differences: the models might not be linear, might not follow a frequency-severity approach, might not provide a rating table, and can use many more variables. Still, machine learning models can remain interpretable and explainable.

In the webinar How AI and Machine Learning Helps Improve Insurance Pricing, you can watch how such machine learning models can be built for a motor pricing product. Starting from raw, dirty data, the webinar covers every step from training an explainable machine learning model through to its implementation.

This new methodology opens up new possibilities for insurance pricing. Watch the webinar to learn in more detail how machine learning is a game-changer for insurance pricing.


About the author
Ming-Li Gridel

Director, Data Science Practice, DataRobot
