Automated Machine Learning—A Gateway for Accelerated Data Science Upskilling
Actuaries have always been the jacks-of-all-trades. Mathematics and insurance knowledge form their professional foundation, but actuaries have also learned from other disciplines, such as law, accounting, marketing, and, of course, data science. But over the last decade or so, many actuaries have been finding it increasingly difficult to keep up with the rapidly developing field of data science. The good news is that this is beginning to change.
In just the past couple of years, a new technology called automated machine learning, or AutoML, has emerged. It has put cutting-edge data science in the hands of actuaries, drastically shortening the learning curve and increasing their ability to apply data science to a myriad of insurance problems.
Actuaries know that insurers have an abundance of use cases they can apply predictive modeling to: identifying leads to solicit in marketing campaigns, identifying which houses to inspect upon renewal, routing the right claims to salvage and subrogation teams to maximize recoveries — the list goes on and on. Essentially, most of these use cases come down to sorting, and this sorting is achieved through data science techniques. The idea is to sort the leads so that the best ones to solicit, inspect, and review are at the top. You only have a small team of home inspectors and claim reviewers, and they’re going to continue inspecting homes and reviewing claims. Therefore, you want to make sure you’re sending them the right homes to inspect and the right claims to review, which means that the sort order is absolutely critical.
However, no algorithm can achieve a perfect ordering. That is, there will always be leads, homes and claims that don’t get to the top of the list. Thus, there are missed opportunities. But the better the sorting is, the fewer missed opportunities you will have, and the more value your inspectors, adjusters and others can add to your organization. Processes like this, where a simple sorting is driving the optimization, exist everywhere in insurance. Insurers know it, and they rightly feel a sense of urgency to optimize these processes. If their competitors are applying these techniques to a vast array of use cases and they are not, they’ll lose. If competitors are framing the problem better than they are, again they will lose. If the competitors are sorting better than they are, again they’ll lose. This is no secret. The headline of an article that appeared on Bloomberg.com on June 28, 2018 read, “AI Will Thrash the Economy Like a ‘Tsunami,’ Allstate CEO Says.”
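The link between sort quality and missed opportunities can be made concrete with a small sketch. The data, the two "models," and the function name below are all hypothetical and chosen only for illustration: ten claims, four of which truly have recovery potential, and a review team that can work only four claims.

```python
def captured_at_capacity(scores, outcomes, capacity):
    """Count the true opportunities among the `capacity` highest-scored items."""
    # Rank item indices from the highest model score to the lowest.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # The team can only work the top `capacity` items on the list.
    worked = ranked[:capacity]
    return sum(outcomes[i] for i in worked)


# Hypothetical outcomes: 1 marks a claim with real recovery potential.
outcomes = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

# Hypothetical scores from two models for the same ten claims.
weak_model = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
strong_model = [0.9, 0.2, 0.1, 0.8, 0.3, 0.7, 0.2, 0.1, 0.6, 0.0]

capacity = 4  # the team can review four claims
total = sum(outcomes)
for name, scores in [("weak", weak_model), ("strong", strong_model)]:
    found = captured_at_capacity(scores, outcomes, capacity)
    print(f"{name}: found {found} of {total}, missed {total - found}")
```

With the same team and the same claims, the weaker ordering in this made-up example surfaces two of the four opportunities, while the stronger ordering surfaces all four.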
To stay ahead of their competition, insurers should do three things: (1) apply these techniques to more business problems than their competitors; (2) frame the problems in a more appropriate way than their competitors; and (3) achieve a better sorting than their competitors. To apply these techniques to more use cases, they need experts to identify those use cases (i.e., they need actuaries), and they can now use AutoML to scale up the volume. To frame the problem in the most appropriate way, they need subject matter experts; again, they need actuaries. And how can an insurer get a better sorting than its competitors? In short, it needs access to a breadth of sophisticated machine learning algorithms.
Let’s take a marketing campaign as an example. There are many ways to sort the leads. Some insurers use a model to sort the leads according to who is most likely to respond to their mail, a sorting that completely ignores profitability. Others might use a model to sort them by some combined measure of profitability and likelihood to respond. You, as the subject matter expert, must determine the criteria that you’ll use to sort them. This is what we call “framing the problem,” and this is where your domain expertise, not your ability to code, is key. There is nobody better qualified to frame the problem than the actuary. But even if two insurers frame the problem the same way, one can still get a better sort order and beat its competitor by trying a variety of modeling algorithms. Traditionally, Generalized Linear Models, or GLMs, have been the primary modeling algorithm used by insurers. But there is a plethora of machine learning (a.k.a. predictive modeling) algorithms that often outperform GLMs, resulting in a better sorting. Hence the pressure actuaries have felt to keep up with data science techniques.
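To see how much the framing alone can change the list, consider a minimal sketch with four hypothetical leads. The field names, the numbers, and the choice of "response probability times profit if the lead responds" as the combined measure are assumptions made for illustration; an actuary might well frame the combined measure differently.

```python
# Hypothetical leads: response probability and profit if the lead responds.
leads = [
    {"id": "A", "p_response": 0.30, "profit_if_responds": 50.0},
    {"id": "B", "p_response": 0.10, "profit_if_responds": 400.0},
    {"id": "C", "p_response": 0.20, "profit_if_responds": 150.0},
    {"id": "D", "p_response": 0.25, "profit_if_responds": -20.0},
]


def rank_by_response(leads):
    """Framing 1: sort by likelihood to respond, ignoring profitability."""
    ordered = sorted(leads, key=lambda lead: lead["p_response"], reverse=True)
    return [lead["id"] for lead in ordered]


def rank_by_expected_value(leads):
    """Framing 2: sort by response probability times profit if they respond."""
    ordered = sorted(
        leads,
        key=lambda lead: lead["p_response"] * lead["profit_if_responds"],
        reverse=True,
    )
    return [lead["id"] for lead in ordered]


print(rank_by_response(leads))        # ['A', 'D', 'C', 'B']
print(rank_by_expected_value(leads))  # ['B', 'C', 'A', 'D']
```

In this toy example, lead B moves from the bottom of the list to the top once profitability enters the criteria, and the unprofitable lead D drops from second to last. No change of algorithm is involved; only the question being asked has changed.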
Many actuaries try to develop their data science skills so that they can build these models themselves. This is frustrating and inefficient. If there’s one quality I’ve seen throughout the actuarial profession, it is pragmatism. Actuaries are keenly aware that their job is not to produce data science as an end unto itself; their job is to produce results that are valuable for the company. Many actuaries try the self-study approach either through textbooks or increasingly via online courses offered by Coursera, Udemy, and others. Most actuaries want to be actuaries; they don’t want to be lawyers, they don’t want to be accountants, and they don’t want to be data scientists either. But the amount of learning involved to produce machine learning models is tantamount to beginning a new career. Until now.
AutoML is accelerating actuaries’ ability to apply the most cutting-edge data science techniques to business problems. Tools like DataRobot provide a GUI (as well as optional programming interfaces via R and Python) that takes in a dataset made up of historical training examples, applies the best practices a data scientist would apply, and automatically produces accurate, interpretable, and deployable predictive models in minutes to hours.
Actuaries are experts who understand the business and their company’s data. In other words, they can identify use cases, they can frame the problem, and they can certainly pull together the appropriate dataset. By using AutoML tools, they don’t have to clean the data, nor do they have to worry about variable selection, encoding variables, finding interactions, finding transforms, overfitting, and so on. Some AutoML tools will try dozens of different approaches, each of which is custom built for your problem, pairing a variety of data preprocessing steps with all the different modeling algorithms that might work on your use case. And all of the models are interpretable; there are no black boxes. I typically take a dataset and produce these models in under an hour. (Deployment might be even simpler, but that’s not really the point of this article.) Some AutoML tools even handle free-form text (e.g., you can incorporate adjuster notes into your claims models).
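The core loop of "try several approaches and keep the one that holds up on unseen data" can be sketched in a few lines. This is a toy illustration only, not how DataRobot or any other AutoML product is implemented: the three candidate approaches, the synthetic territory data, and every function name are assumptions, and real tools search over far richer preprocessing and algorithms. Each candidate is scored by cross-validated Brier score (mean squared error of the predicted probabilities, lower is better).

```python
import random


def fit_constant(train):
    """Baseline: predict the overall training frequency for every record."""
    rate = sum(y for _, y in train) / len(train)
    return lambda x: rate


def fit_by_category(train, prior_weight=0.0):
    """Frequency per category, optionally shrunk toward the overall rate."""
    overall = sum(y for _, y in train) / len(train)
    counts, hits = {}, {}
    for x, y in train:
        counts[x] = counts.get(x, 0) + 1
        hits[x] = hits.get(x, 0) + y

    def predict(x):
        n = counts.get(x, 0)
        if n + prior_weight == 0:
            return overall  # unseen category and no prior: use overall rate
        return (hits.get(x, 0) + prior_weight * overall) / (n + prior_weight)

    return predict


def brier(predict, rows):
    """Mean squared error between predicted probability and 0/1 outcome."""
    return sum((predict(x) - y) ** 2 for x, y in rows) / len(rows)


def cross_validate(fit, rows, k=5):
    """Average out-of-fold Brier score over k interleaved folds."""
    scores = []
    for fold in range(k):
        test = rows[fold::k]
        train = [r for i, r in enumerate(rows) if i % k != fold]
        scores.append(brier(fit(train), test))
    return sum(scores) / k


# Synthetic data (hypothetical): claim indicator by territory, one rare level.
rng = random.Random(0)
true_rates = {"urban": 0.30, "suburban": 0.15, "rural": 0.05, "coastal": 0.50}
rows = []
for _ in range(500):
    territory = rng.choices(list(true_rates), weights=[50, 30, 18, 2])[0]
    rows.append((territory, 1 if rng.random() < true_rates[territory] else 0))

# Candidate approaches, each standing in for one preprocessing/algorithm pair.
candidates = {
    "constant": fit_constant,
    "by_category": fit_by_category,
    "by_category_shrunk": lambda train: fit_by_category(train, prior_weight=10.0),
}

leaderboard = sorted((cross_validate(fit, rows), name) for name, fit in candidates.items())
for score, name in leaderboard:
    print(f"{name}: {score:.4f}")
best_name = leaderboard[0][1]
```

The shrunk candidate is a deliberately actuarial choice: it blends each territory's own experience with the overall rate, much like a credibility weighting. Which candidate tops the leaderboard depends on the data, which is exactly why an automated search evaluates every candidate on held-out records rather than assuming a winner.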
What if the actuary actually does want to learn more technical data science? Or what if an insurer wants their actuarial teams to further develop their data science skills? With some AutoML tools, every step is documented, so you can drill down into the process to learn more about what is happening at each step behind the scenes. Often, you can drill down to the documentation of the algorithms and even all the way to published academic research. This allows actuaries to produce valuable data science results while learning. Learning while you go is a proven approach. Actuaries in the process of gaining their credentials rotate through different areas of an organization because insurers can’t wait until the actuary is done with their exams to begin producing results, and the practical experience gained is invaluable. AutoML enables the actuary to do the same thing with data science. (But lest you be scared away, don’t worry, there are no exams.)
Actuaries are the ideal end user for an AutoML tool. They are experts on the business use cases and they are experts on the data, and that’s exactly the skill set needed to effectively use AutoML. Actuaries can now produce machine learning results, adding value to their organizations, while learning data science. Will this result in lower demand for data scientists? If you think your company will run out of data science problems to solve, then maybe the answer is yes. But I can assure you that your company isn’t running out of data science problems anytime soon. There are plenty of data science problems that AutoML doesn’t handle, so everyone’s job is safe.