Machine Learning and Loss Development: From “Top-Down” to “Bottom-Up”

October 29, 2019

If I were forced to choose the one capability of automated machine learning likely to have the biggest impact on commercial insurance, it would be its ability to predict how individual losses will develop over time. This capability, although organizationally based in the actuarial function, immediately impacts reserving and creates the foundation for informed risk selection, underwriting, pricing, and claims handling.

Typically, estimates of loss development and reserve requirements are established using loss development factors derived from aggregate annual experience for a book of business or class of risk. This is referred to as the “top-down” approach, and it works well enough for financial reserving across the entire portfolio of risk, even though it requires some adjustments.
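To make the top-down idea concrete, here is a minimal sketch of one common way to derive loss development factors from aggregate experience: volume-weighted age-to-age factors on a cumulative loss triangle. The triangle values, function names, and the assumption of no tail factor are all illustrative and do not come from any real book of business.

```python
from typing import List

# Each row is one accident year; each column is cumulative incurred loss
# at successive development ages. Rows are ragged because recent years
# have fewer observed ages. All figures are made up for illustration.
triangle: List[List[float]] = [
    [1000.0, 1500.0, 1650.0],  # oldest accident year, fully observed
    [1200.0, 1800.0],
    [1100.0],                  # most recent accident year
]


def age_to_age_factors(tri: List[List[float]]) -> List[float]:
    """Volume-weighted development factor from each age to the next."""
    n_ages = max(len(row) for row in tri)
    factors = []
    for age in range(n_ages - 1):
        # Use only accident years observed at both this age and the next.
        rows = [row for row in tri if len(row) > age + 1]
        later = sum(row[age + 1] for row in rows)
        earlier = sum(row[age] for row in rows)
        factors.append(later / earlier)
    return factors


def project_ultimates(tri: List[List[float]], factors: List[float]) -> List[float]:
    """Roll each accident year's latest value forward to the final age."""
    ultimates = []
    for row in tri:
        value = row[-1]
        for factor in factors[len(row) - 1:]:
            value *= factor
        ultimates.append(value)
    return ultimates


factors = age_to_age_factors(triangle)
ultimates = project_ultimates(triangle, factors)
# Aggregate reserve: projected ultimate minus losses incurred to date.
reserve = sum(ultimates) - sum(row[-1] for row in triangle)
print(factors, ultimates, round(reserve, 2))
```

Note that the result is a single reserve figure per accident year. Nothing in the calculation says which individual claims will drive the development.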

The top-down approach breaks down, however, when we need to allocate loss development against individual losses, each with unique characteristics that cause them to deviate from other losses. The result is an inaccurate allocation of loss expense, cautious over-reserving, and/or increases in prior year loss development due to under-reserving. Each of these scenarios amounts to inefficient use of capital.

The alternative, once largely theoretical but now fully practical, is to turn the process on its head and institute “bottom-up” loss development and reserving.

From informal polls of approximately two dozen insurance companies, we have seen that more than 90% of companies attempting to build individual loss development (ILD) models using GLMs alone have failed. Automated machine learning helps insurers overcome the limitations of traditional GLMs, apply more modern machine learning and unstructured text models, and successfully build ILD models in several weeks.

Powered by artificial intelligence and machine learning, the bottom-up approach predicts the development of individual claims over time based on their unique attributes (such as injury severity), then adds them up to establish reserve requirements. One might ask: Why is machine learning able to do this to a degree and with a level of precision not previously attained in predictive analytics?
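The "predict each claim, then add them up" structure can be sketched in a few lines. In the sketch below, a per-segment average development multiplier learned from closed claims stands in for the machine learning model; it is a deliberately simple placeholder, not the modeling technique the platform uses. The `Claim` fields, function names, and figures are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Claim:
    claim_id: str
    injury_severity: str             # example claim attribute
    incurred_at_snapshot: float      # incurred loss at a common early valuation age
    final_cost: Optional[float] = None  # known only for closed claims


def fit_segment_multipliers(closed: List[Claim]) -> Dict[str, float]:
    """Stand-in 'model': ratio of final cost to snapshot incurred, per severity."""
    incurred: Dict[str, float] = defaultdict(float)
    final: Dict[str, float] = defaultdict(float)
    for claim in closed:
        incurred[claim.injury_severity] += claim.incurred_at_snapshot
        final[claim.injury_severity] += claim.final_cost
    return {segment: final[segment] / incurred[segment] for segment in incurred}


def predict_final_cost(claim: Claim, multipliers: Dict[str, float]) -> float:
    """Predict one open claim's final cost from its own attributes."""
    return claim.incurred_at_snapshot * multipliers[claim.injury_severity]


# Illustrative data only.
closed_claims = [
    Claim("C1", "minor", 1000.0, 1100.0),
    Claim("C2", "minor", 2000.0, 2200.0),
    Claim("C3", "severe", 10000.0, 20000.0),
    Claim("C4", "severe", 5000.0, 10000.0),
]
open_claims = [
    Claim("O1", "minor", 3000.0),
    Claim("O2", "severe", 4000.0),
]

multipliers = fit_segment_multipliers(closed_claims)
predictions = {c.claim_id: predict_final_cost(c, multipliers) for c in open_claims}

# Bottom-up reserve: sum of per-claim predicted development.
total_reserve = sum(
    predictions[c.claim_id] - c.incurred_at_snapshot for c in open_claims
)
print(predictions, round(total_reserve, 2))
```

Because the reserve is built from per-claim predictions, the same numbers can be rolled up by underwriter, class, or adjuster without a separate allocation step.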

Time values

The key technical breakthrough is the ability of artificial intelligence and machine learning to analyze and manage the shifting impact of different loss variables over the life of a claim.

As a claim progresses from the initial notice of loss to 60, 90, and 180 days, and then from one year to as many as 15 years, the relative impact of each loss variable changes, as do the interactions among the variables. The process is far too complex for even highly skilled staff to develop all the necessary models by hand: it might require 5-15 models for each line of business. Even if you could develop the models, maintaining them would be a nightmare.
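One way to picture why several models are needed per line of business is to look at the training data each would use. The sketch below reshapes closed-claim histories into one training set per valuation age, so that a separate model could be fit for each age. The data layout, the chosen ages, and the rule that a claim contributes only at ages when it was still open are assumptions made for illustration.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

# claim_id -> [(months_since_notice, cumulative_incurred), ...] sorted by month.
# The last record is assumed to be the closing valuation. Figures are made up.
history: Dict[str, List[Tuple[int, float]]] = {
    "A": [(0, 500.0), (4, 2000.0), (14, 2600.0)],
    "B": [(0, 1000.0), (5, 1200.0)],
}
final_costs: Dict[str, float] = {"A": 2600.0, "B": 1200.0}
ages = [3, 6, 12]  # valuation ages in months; one model per age


def incurred_as_of(records: List[Tuple[int, float]], age: int) -> float:
    """Latest cumulative incurred value recorded at or before the given age."""
    months = [month for month, _ in records]
    idx = bisect_right(months, age)
    return records[idx - 1][1] if idx else 0.0


def build_snapshots(
    hist: Dict[str, List[Tuple[int, float]]],
    finals: Dict[str, float],
    valuation_ages: List[int],
) -> Dict[int, List[Tuple[str, float, float]]]:
    """For each age, rows of (claim_id, incurred_as_of_age, final_cost)."""
    snapshots: Dict[int, List[Tuple[str, float, float]]] = {
        age: [] for age in valuation_ages
    }
    for claim_id, records in hist.items():
        closed_at = records[-1][0]
        for age in valuation_ages:
            if age < closed_at:  # claim was still open at this age
                row = (claim_id, incurred_as_of(records, age), finals[claim_id])
                snapshots[age].append(row)
    return snapshots


snapshots = build_snapshots(history, final_costs, ages)
for age, rows in snapshots.items():
    print(age, rows)
```

Each age ends up with a different population of claims and different feature values, which is why the relationship between the inputs and the final cost has to be learned separately at each stage.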

DataRobot’s artificial intelligence and machine learning platform allows users to “drag-and-drop” data, and the system produces highly precise predictions of final claim costs throughout the lifecycle of the claim.

There are at least four approaches available for deriving predictive variables of loss development. These approaches include:

  • Comparative analyses of all comparable claims closed in an accident year;

  • Comparative analyses of claims from multiple accident years, with a “residual modifier” derived to predict the accident year a claim will close;

  • Comparative analyses of all comparable closed claims, regardless of accident year; and

  • The “distance method,” in which we predict the loss development period (time to close) for a claim and determine “distance variables” to measure and manage the projected costs.

These methods form a progression, each more refined than the last. Customers typically start by analyzing closed claims to identify the factors that contributed most significantly at different stages of the adjusting process. That analysis leads, in turn, to development estimates at set time periods (24, 48, 60, and 120 months) that can eventually be applied to open claims.

Each of the methods listed above can be explored in a small fraction of the time it has previously taken to develop predictive models. As users gain more experience, they test more advanced techniques and achieve even more accurate outcomes.

Insurance will never be the same.


About the author
Neal Silbert

General Manager of Insurance, DataRobot

As an insurance industry executive and management consultant, Neal has served as an analytics thought leader and driver of innovation for the last 25 years. Recently, he was the VP of Predictive Analytics at Zurich North America, focusing on bringing the latest advances in predictive analytics to insurance product development.
