Predicting the Risk of Loan Default with AI

September 6, 2017

This post was originally part of the DataRobot Community.

The Problem

Borrowers typically turn to banks for money when purchasing homes. The banks, in turn, often sell those mortgage loans to Fannie Mae. The Federal National Mortgage Association (FNMA), or Fannie Mae, is a government-sponsored enterprise tasked with expanding the secondary mortgage market, with the goal of making more mortgages available to potentially less-qualified borrowers. Fannie Mae works somewhat like a marketplace for mortgages: it purchases mortgages, converts them to securities, and then sells them. Banks and other investors then purchase these mortgage-backed securities.

To succeed in these markets, and to avoid a repeat of the 2008 financial crisis, banks need to be able to accurately estimate the probability that certain loans will default. Banks use both data from Fannie Mae and their own balance sheets as input to the risk models they maintain internally and provide to regulators.

Loan Default Predictions with DataRobot

As 2008 showed, a superficial mortgage risk model may not capture the full picture, leaving banks and the entire system open to catastrophe. Simplifying assumptions that analysts make to build their models on a business’s schedule can turn into disaster when those assumptions don’t hold up. By dramatically accelerating the modeling process without sacrificing accuracy, DataRobot allows banks to understand and analyze default risk at multiple levels with multiple models. DataRobot’s automatic explanatory visualizations make it easy for analysts to spot patterns and prove to regulators that they are effectively managing the risks of their portfolios.


Colin is a loan trader for a bank that invests in mortgages. He sees hundreds of mortgages every day; his job is to decide which mortgages to invest in and which to divest. Colin collects data from Fannie Mae showing the default performance of prior loans; collects economic data from the Fed; and then consolidates the data and prepares it for analysis. With so many mortgages to choose from, and so much information accompanying each, it is not humanly possible for him to model and select the optimal portfolio. Because models become outdated faster than the data science team can tweak them to better suit his requirements, Colin turns to DataRobot to build models that predict mortgage defaults and to score each new mortgage he sees.
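To make Colin's workflow concrete, here is a minimal sketch of the same three steps: join loan records with economic data, fit a default model, and score new mortgages. It does not use DataRobot or its API. It assumes a plain logistic regression fitted by gradient descent as a stand-in, and every field name, value, and helper function below is hypothetical and invented for illustration only.

```python
import math

# Hypothetical toy records: field names and values are invented for illustration.
loans = [
    {"loan_id": 1, "month": "2007-01", "credit_score": 760, "ltv": 0.60, "defaulted": 0},
    {"loan_id": 2, "month": "2007-01", "credit_score": 640, "ltv": 0.95, "defaulted": 1},
    {"loan_id": 3, "month": "2007-06", "credit_score": 720, "ltv": 0.75, "defaulted": 0},
    {"loan_id": 4, "month": "2007-06", "credit_score": 610, "ltv": 0.97, "defaulted": 1},
    {"loan_id": 5, "month": "2008-01", "credit_score": 700, "ltv": 0.80, "defaulted": 0},
    {"loan_id": 6, "month": "2008-01", "credit_score": 650, "ltv": 0.90, "defaulted": 1},
    {"loan_id": 7, "month": "2008-06", "credit_score": 780, "ltv": 0.65, "defaulted": 0},
    {"loan_id": 8, "month": "2008-06", "credit_score": 660, "ltv": 0.92, "defaulted": 1},
]
# Hypothetical economic series keyed by month (illustrative values, in percent).
unemployment_by_month = {"2007-01": 4.6, "2007-06": 4.6, "2008-01": 5.0, "2008-06": 5.6}


def features(loan):
    """Join a loan with its month's economic data and rescale to comparable ranges."""
    return [
        (loan["credit_score"] - 700) / 50,
        (loan["ltv"] - 0.8) / 0.1,
        unemployment_by_month[loan["month"]] - 5.0,
    ]


def predict(weights, bias, x):
    """Logistic model: estimated probability of default for one feature vector."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 / (1 + math.exp(-z))


def fit(rows, labels, lr=0.5, steps=2000):
    """Fit logistic regression by batch gradient descent on the log loss."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    n = len(rows)
    for _ in range(steps):
        errors = [predict(weights, bias, x) - y for x, y in zip(rows, labels)]
        for j in range(len(weights)):
            weights[j] -= lr * sum(e * x[j] for e, x in zip(errors, rows)) / n
        bias -= lr * sum(errors) / n
    return weights, bias


X = [features(loan) for loan in loans]
y = [loan["defaulted"] for loan in loans]
weights, bias = fit(X, y)

# Score mortgages the model has not seen (again, invented examples).
new_loans = [
    {"loan_id": 101, "month": "2008-06", "credit_score": 750, "ltv": 0.65},
    {"loan_id": 102, "month": "2008-06", "credit_score": 620, "ltv": 0.95},
]
scores = {loan["loan_id"]: predict(weights, bias, features(loan)) for loan in new_loans}
for loan_id, p in scores.items():
    print(f"loan {loan_id}: estimated default probability {p:.2f}")
```

A real portfolio model would train on far more loans and features and would be validated on held-out data. The sketch only shows the shape of the pipeline: consolidated inputs go in, and a default probability per mortgage comes out.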

Training data
Prediction data
Data Dictionary

Predict Likelihood of Loan Default

Reduce defaults and minimize risk by predicting the likelihood that a borrower will not repay their loan.

About the author
Linda Haviland

Community Manager
