Let’s Take It from the Top: DataRobot’s Predictions for 2020 Primetime Emmy Awards
The 72nd Primetime Emmy Awards will take place on Sunday, September 20th. In last year’s post, we used DataRobot to predict the winners of the 2019 Primetime Emmy Awards in the “Outstanding Drama Series” and “Outstanding Comedy Series” categories. We correctly predicted that Game of Thrones would win Outstanding Drama Series, but we missed the mark on Outstanding Comedy Series: Fleabag won, whereas we had predicted that Veep would take the top prize.
Since Game of Thrones is absent from the 2020 Emmy nominations, let’s see if the data says anything about potential underdog winners!
Building the Project
Similar to last year, I used the OMDb API (Open Movie Database API) to fetch historical data and extra information on all the nominees and winners in the “Outstanding Drama Series” and “Outstanding Comedy Series” categories, dating back to the 1966 Emmys. The target we want to predict is “win,” which indicates whether the nominee won its category in a given award year.
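For readers who want to try a similar data pull, here is a minimal sketch of a title lookup against OMDb. It assumes the API's `t` (title), `type`, and `apikey` query parameters; the helper names `build_omdb_url` and `fetch_show` and the `YOUR_API_KEY` placeholder are hypothetical, and this is not necessarily how the dataset for this post was assembled.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

OMDB_URL = "https://www.omdbapi.com/"


def build_omdb_url(title, api_key):
    """Build a lookup URL for a single series title (hypothetical helper)."""
    params = {"t": title, "type": "series", "apikey": api_key}
    return OMDB_URL + "?" + urlencode(params)


def fetch_show(title, api_key):
    """Fetch one show's metadata as a dict. Needs network access and a valid key."""
    with urlopen(build_omdb_url(title, api_key)) as resp:
        return json.load(resp)


# Building the URL does not hit the network; "YOUR_API_KEY" is a placeholder.
url = build_omdb_url("The Good Place", "YOUR_API_KEY")
```

Calling `fetch_show` with a real key should return a JSON object with fields such as the show's actors, writer, and plot, which would then need to be joined to a separately compiled list of nominees and winners per year.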
After pulling the data into the latest version of DataRobot, I followed data science best practices by building an OTV (Out-of-Time Validation, a.k.a. datetime partitioned) project. I also made sure to factor in the recency of the data. In other words, data from the 1960s is not that useful for predicting the winners in 2020; it is more suitable to base newer predictions on data from the 1990s-2010s.
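To make the idea concrete outside of DataRobot, here is a minimal sketch of an out-of-time split plus one possible way to favor recent data. The `Nomination` record, the placeholder show names, the cutoff year, and the exponential `recency_weight` with a 10-year half-life are all assumptions for illustration; DataRobot handles datetime partitioning itself, and the post does not specify how recency was factored in (restricting the training window would be another option).

```python
from dataclasses import dataclass


@dataclass
class Nomination:
    year: int      # award year
    nominee: str   # show name (placeholders below)
    win: int       # 1 if the show won its category that year, else 0


def out_of_time_split(rows, holdout_start_year):
    """Train on years before the cutoff; validate on the cutoff year and later."""
    train = [r for r in rows if r.year < holdout_start_year]
    valid = [r for r in rows if r.year >= holdout_start_year]
    return train, valid


def recency_weight(year, latest_year, half_life=10.0):
    """Exponential decay: a row `half_life` years old counts half as much."""
    return 0.5 ** ((latest_year - year) / half_life)


# Placeholder rows, not real Emmy results.
rows = [
    Nomination(1966, "Show A", 1),
    Nomination(1995, "Show B", 0),
    Nomination(2010, "Show C", 1),
    Nomination(2019, "Show D", 0),
    Nomination(2020, "Show E", 0),
]

train, valid = out_of_time_split(rows, holdout_start_year=2019)
weights = [recency_weight(r.year, latest_year=2020) for r in train]
```

The key property is that validation rows are always later in time than training rows, so the model is scored the same way it will be used: predicting a future ceremony from past ones.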
Looking at the individual features, we can see which ones have the strongest relationship to past wins. Some interesting features that emerge as related to winning an Emmy include:
- The actors who starred in the show
- The writer(s) of the show
- “number_times_won_before,” which is the number of times the show won an Emmy prior to the specific award year
- “nominee,” i.e. the name of the show, most likely due to repeated winners (*cough* Game of Thrones)
- “received_gg_nom_before,” which is a flag indicating whether the show previously received a Golden Globe nomination
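A feature like “number_times_won_before” has to be computed so that each row only sees earlier ceremonies; otherwise the target leaks into the feature. Below is a minimal sketch of one way to do that. The function name `add_prior_win_counts`, the dict-based row layout, and the placeholder shows are assumptions, not the code used for this project.

```python
from collections import defaultdict


def add_prior_win_counts(rows):
    """Return copies of rows with a 'number_times_won_before' field added.

    rows: list of dicts with 'year', 'nominee', and 'win' (0 or 1) keys.
    """
    wins_so_far = defaultdict(int)
    featured = []
    # Walk the rows chronologically so each one only sees earlier years.
    for row in sorted(rows, key=lambda r: r["year"]):
        # Record the count BEFORE adding this year's result (avoids leakage).
        featured.append({**row, "number_times_won_before": wins_so_far[row["nominee"]]})
        wins_so_far[row["nominee"]] += row["win"]
    return featured


# Placeholder rows, not real Emmy results.
rows = [
    {"year": 2003, "nominee": "Show A", "win": 0},
    {"year": 2001, "nominee": "Show A", "win": 1},
    {"year": 2002, "nominee": "Show A", "win": 1},
    {"year": 2001, "nominee": "Show B", "win": 0},
    {"year": 2003, "nominee": "Show B", "win": 1},
]

featured = add_prior_win_counts(rows)
prior_wins = {(r["year"], r["nominee"]): r["number_times_won_before"] for r in featured}
```

The same pattern would apply to a flag such as “received_gg_nom_before,” given a table of Golden Globe nominations by year.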
After Autopilot completed, I picked one of the best-scoring models on the Leaderboard to explore the insights that DataRobot models provide. In a similar vein to last year, I looked at both Feature Impact and Feature Effects to understand the model’s behavior when predicting an Emmy winner.
For Feature Impact, this particular model determined that features such as “nominee,” “writer,” “actors,” “plot,” “received_gg_win_before,” “number_times_won_before,” and “network” were all indicators of a show winning an Emmy.
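DataRobot computes Feature Impact for you, but the intuition behind this kind of ranking can be sketched with permutation importance, a common model-agnostic technique: shuffle one feature at a time and measure how much the model's score drops. The sketch below is an illustration under stated assumptions, not DataRobot's internal code; `toy_model`, the `noise` feature, and the toy rows are hypothetical.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Average drop in accuracy when each feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = {}
    for feature in X[0]:
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[feature] for row in X]
            rng.shuffle(shuffled)
            X_perm = [{**row, feature: value} for row, value in zip(X, shuffled)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances[feature] = sum(drops) / n_repeats
    return importances


def toy_model(row):
    """Hypothetical stand-in model: predicts a win for any previous winner."""
    return int(row["number_times_won_before"] >= 1)


# Toy rows; 'noise' is a feature the model ignores on purpose.
X = [
    {"number_times_won_before": n, "noise": i % 3}
    for i, n in enumerate([0, 0, 0, 0, 1, 2, 3, 1])
]
y = [toy_model(row) for row in X]

importances = permutation_importance(toy_model, X, y)
```

A feature the model relies on shows a positive drop in accuracy when shuffled, while a feature it ignores shows none, which is the sense in which “nominee” or “network” can be called impactful for a given model.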
For Feature Effects, it was interesting to see how indicative a show’s network was of whether the show would win an Emmy. Some networks that have historically broadcast Emmy-winning shows are AMC, NBC, HBO, Amazon, and Netflix. Given how much money players like HBO, Amazon, and Netflix have recently been putting into original content, we should expect to see more Emmy-nominated shows from them in future award shows.
Lastly, rather than looking at a show’s writer(s), I wanted to see whether certain actors improve a show’s chance of winning an Emmy. At least according to the data, Jack McBrayer, Tina Fey, and Lorraine Bracco are great actors to have in the cast if you want an Emmy-winning show. In the word cloud, the pinker/redder the text is, the more likely the nominee is to win an Emmy.
Drum roll! The predicted winners are…
For “Outstanding Drama Series,” we predict The Handmaid’s Tale will win. The following are the top 3 predicted winners, from most likely to least likely:
For “Outstanding Comedy Series,” we predict The Marvelous Mrs. Maisel will win. The following are the top 3 predicted winners, from most likely to least likely:
When we compare our predictions to those from publications such as IndieWire and The Hollywood Reporter, our results align with the experts who predict that The Marvelous Mrs. Maisel will win Outstanding Comedy Series this year. However, most publications, including IndieWire and The Hollywood Reporter, predict that either Succession or Ozark will win Outstanding Drama Series. This goes to show that while machine learning tools can inform us about the historical data, it’s still very important to listen to your gut and to pay attention to what experts are saying. Furthermore, we only had a limited set of features to work with; future research could incorporate additional data such as “number of streaming services the show was broadcast on,” “positive sentiment score of expert reviewers,” and so forth.
What shows are you rooting for? I’m going to be cheering for my personal favorites: Ozark and The Good Place! “Jake Jortles!”