D.R.I.V.E.: 60 Game MLB Projections – Start of Season Update

July 23, 2020

DataRobot Intelligent Value Estimator for 2020 MLB Team and Player Performance

Recently, we predicted how the 60-game 2020 MLB season would unfold, both for teams and for individual players. Please see our original post for details on how we built the models with DataRobot’s leading artificial intelligence platform.

Who is projected to win each division in the abbreviated season? Who will claim the Wild Card spots? Who will be the top pitchers and batters? This article discusses these questions and more, including how our predictions were made and how things would have differed with 16 playoff teams, as had been planned at one point.

In this post, we update the projections by removing players who opted out, such as David Price, Buster Posey, Nick Markakis, Felix Hernandez, and Ryan Zimmerman. We also remove players who are out for the season, such as Noah Syndergaard, Aroldis Chapman, and Luis Severino.

How did these roster changes affect our predictions?

  • The Giants lose a win with Posey out, but they are still projected for last place.
  • The Dodgers lose a win without David Price, but they are still projected to finish first by four games.
  • The Nationals and Mets swap places by fractions of a win, and both are still projected to tie for second place in the NL East.

Projected Playoff Matchups

The playoffs would be exciting, with a three-way Wild Card tie in each league: the Angels / Indians / Rays in the AL and the Nationals / Mets / Padres in the NL.

Previous 2020 Playoff Matchups

Projected standings are:

Hypothetical 16-Team Bracket:

  • National League:
    • (1) Dodgers* vs. (8) Diamondbacks
    • (2) Braves* vs. (7) Phillies
    • (3) Reds* vs. (6) Nationals
    • (4) Padres vs. (5) Mets
  • American League:
    • (1) Astros* vs. (8) Red Sox
    • (2) Yankees* vs. (7) Rangers
    • (3) Twins* vs. (6) Rays
    • (4) Indians vs. (5) Angels

*Division Winners
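The bracket above follows a simple pairing pattern: the top seed meets the bottom seed, the second seed meets the second-to-last, and so on. As a minimal sketch of that pattern (not the method used to build the projections), the snippet below reproduces the listed matchups from the seed order alone. The seed lists are copied from the bracket, while the function names `first_round_matchups` and `format_bracket` are hypothetical and chosen only for illustration.

```python
# Seed order copied from the bracket above; index 0 is the 1st seed.
NL_SEEDS = ["Dodgers", "Braves", "Reds", "Padres",
            "Mets", "Nationals", "Phillies", "Diamondbacks"]
AL_SEEDS = ["Astros", "Yankees", "Twins", "Indians",
            "Angels", "Rays", "Rangers", "Red Sox"]


def first_round_matchups(seeds):
    """Pair seed 1 with the last seed, seed 2 with the second-to-last, etc.

    Returns a list of (high_seed, high_team, low_seed, low_team) tuples.
    This pairing rule is inferred from the bracket listing (an assumption).
    """
    n = len(seeds)
    if n % 2 != 0:
        raise ValueError("bracket needs an even number of teams")
    return [(i + 1, seeds[i], n - i, seeds[n - 1 - i]) for i in range(n // 2)]


def format_bracket(league, seeds):
    """Render one league's first round in the same style as the list above."""
    lines = [f"{league}:"]
    for hi_seed, hi_team, lo_seed, lo_team in first_round_matchups(seeds):
        lines.append(f"  ({hi_seed}) {hi_team} vs. ({lo_seed}) {lo_team}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_bracket("National League", NL_SEEDS))
    print(format_bracket("American League", AL_SEEDS))
```

Because the pairing depends only on seed order, swapping in a different projected order (for example, after another roster change) would regenerate the matchups without further changes.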

Here is the updated dashboard, where you can see each team and player for the 60-game season:

What’s important to remember about this projection, this season, and the variability of baseball is that even with eight teams per league making the playoffs, the gap between the 1st and 9th seeds is only five wins in the American League and seven wins in the National League. With the actual five-team playoff field in each league, the margin for error will be even smaller.

See our prior post for additional details on the methodology, including how we took roughly 1,500 season-specific statistics for each player, added 2,000 more variables per player, and used DataRobot’s enterprise AI platform to make our predictions.

The Results

Tableau is the leading visualization solution on the market, enabling users across a business (and in this case the public) to get value from DataRobot’s AI-based projections. Our D.R.I.V.E. MLB Tableau dashboard below shows DataRobot’s projections for the 60-game 2020 MLB season, with final win-loss records, division standings, and player performance. Many of our customers also deploy the predictions they get from DataRobot through Tableau*, since it makes for a useful combination of insights and interpretation.

*With the DataRobot and Tableau tech stack, once a model has been built in DataRobot, customers can easily democratize the value of machine learning for insight consumers at large with actionable, intelligent dashboards from Tableau. Visit the Tableau Extension Gallery to get the DataRobot extension.

2020 Major Individual Awards:

MVPs

“What If” There Were 16 Teams in the Playoffs?

At one point, MLB was considering a 16-team playoff scenario. What would that have looked like, given DataRobot’s forecasts? In addition to the teams in the 10-team playoff field, the Rangers and Red Sox would also have made the playoffs in the American League, as would the Phillies and Diamondbacks in the National League.

Playoffs – 16 Teams

Conclusion

The Dodgers, Yankees, Astros, Mike Trout, Alex Bregman, and Mookie Betts are projected to be the top teams and players in the shortened season. With machine learning, future performance in baseball can be predicted from past information. Similarly, any industry can predict future performance where chance, human behavior, and the complexities among various data sources are involved. We hope to hear your thoughts.

About the author
Andrew Engel

General Manager for Sports and Gaming, DataRobot

Andrew Engel is General Manager for Sports and Gaming at DataRobot. He works with DataRobot customers across sports and casinos, including several Major League Baseball, National Basketball Association, and National Hockey League teams. He has worked as a data scientist and led teams of data scientists for over ten years in a wide variety of domains, from fraud prediction to marketing analytics. Andrew received his Ph.D. in Systems and Industrial Engineering with a focus on optimization and stochastic modeling. He worked for Towson University, SAS Institute, the US Navy, Websense (now Forcepoint), Stics, and HP before joining DataRobot in February 2016.

Sarah Khatry

Applied Data Scientist, DataRobot

Sarah is an Applied Data Scientist on the Trusted AI team at DataRobot. Her work focuses on the ethical use of AI, particularly the creation of tools, frameworks, and approaches to support responsible but pragmatic AI stewardship, and the advancement of thought leadership and education on AI ethics.

John Sturdivant

AI Success Director at DataRobot

John has led or advised CEOs in digital transformations across several industries and geographies. He lives in Dallas, TX with his wife and dog. Prior to joining DataRobot, he was Head of Digital and Transformation at TSS, LLC and a consultant at McKinsey & Co.
