Data Science Fails: There’s No Such Thing As A Free Lunch
When I was young, I took a packed lunch to school every day, and since I grew up in Australia, my packed lunch would include a couple of Vegemite sandwiches. Unless you grew up in Australia, you’ve probably never tasted Vegemite. And judging by one American’s first-taste reaction of “Oh, that’s bad!”, you probably wouldn’t like it if you tried it. But I loved my Vegemite sandwiches, and they were my one-and-only lunchtime choice, no matter what the circumstances.
While this blog post isn’t about Vegemite, it is related to lunch, specifically the no free lunch theorem. In short, the theorem states that, averaged over all possible problems, no learning algorithm outperforms any other, which means that you can’t know in advance which algorithm will work best on your data. For readers with a technical background who want to dig into the theory, I recommend “The Lack of A Priori Distinctions between Learning Algorithms.” However, despite this well-established theorem, it is common practice for data scientists to rely on only a limited number of modeling methods. Typically, companies are trying to roll out projects quickly, leading to tight timeframes, and all too often people end up trying only one algorithm and/or a limited set of preprocessing steps and tuning parameters.
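To make the theorem less abstract, here is a minimal sketch in Python of a toy version of it. Everything in it is an illustrative assumption rather than anything from the cited paper: the three binary features, the five-point training set, and the two hypothetical learners (`majority_learner` and `nearest_neighbor_learner`). The script enumerates every possible way of labeling the eight inputs and scores each learner only on the points it never saw during training.

```python
from itertools import product

# All 8 possible inputs made of three binary features (toy assumption).
INPUTS = list(product([0, 1], repeat=3))
TRAIN_X = INPUTS[:5]   # points the learner is allowed to see
TEST_X = INPUTS[5:]    # off-training-set points used for scoring


def majority_learner(train):
    """Predict the most common training label for every input."""
    ones = sum(label for _, label in train)
    guess = 1 if 2 * ones >= len(train) else 0
    return lambda x: guess


def nearest_neighbor_learner(train):
    """Predict the label of the closest training point (Hamming distance)."""
    def predict(x):
        def distance(pair):
            return sum(a != b for a, b in zip(pair[0], x))
        return min(train, key=distance)[1]
    return predict


def average_off_training_accuracy(learner):
    """Average unseen-point accuracy over every possible labeling of INPUTS."""
    correct = total = 0
    for labels in product([0, 1], repeat=len(INPUTS)):
        target = dict(zip(INPUTS, labels))
        model = learner([(x, target[x]) for x in TRAIN_X])
        correct += sum(model(x) == target[x] for x in TEST_X)
        total += len(TEST_X)
    return correct / total


results = {
    "majority": average_off_training_accuracy(majority_learner),
    "nearest_neighbor": average_off_training_accuracy(nearest_neighbor_learner),
}
print(results)  # {'majority': 0.5, 'nearest_neighbor': 0.5}
```

Averaged over all 256 possible labelings, both learners score exactly 50% on unseen points, no better than a coin flip. Real datasets are not drawn uniformly from all possible labelings, which is exactly why one algorithm can beat another on your data, and why you have to test to find out which one.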
Humans can be biased in choosing algorithms. Sometimes they prefer to run a particular type of algorithm that they are comfortable with, or they will always choose to use a specific approach to treat missing values during data preparation. These are examples of cognitive biases such as status-quo bias and anchoring. In the data science community, there’s often a lot of hype surrounding the latest algorithm, whether it be about “deep learning” or a “decision jungle.” All of this hype and attention can also trigger human bias. The people building AI solutions are human, and just like everyone else, they are subject to biases such as the availability heuristic, the bandwagon effect, and appeal to novelty.
Case Study: Earthquake Aftershocks
The following case study demonstrates the bandwagon effect and appeal to novelty biases. It is a real-life example where the data scientist used a hyped algorithm instead of checking for the best algorithm. By the end of this case study, you’ll surely realize there is no such thing as a free lunch!
In August 2018, Nature published the article “Deep learning of aftershock patterns following large earthquakes.” Using a training dataset containing more than 130,000 mainshock–aftershock pairs, the authors trained a deep learning algorithm to “identify a static-stress-based criterion that forecasts aftershock locations without prior assumptions about fault orientation.” The paper describes the details of the deep learning algorithm, including its inputs and six-layer architecture. After comparing the model’s accuracy against that of well-known physics-based models, the authors concluded that “the neural-network forecast can explain aftershock locations better than can widely used metrics.” The reported results were surprisingly accurate, with an AUC of almost 0.85.
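For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, so 0.5 is coin-flip performance and 1.0 is a perfect ranking. The sketch below computes it by brute force over all pairs. The labels and scores are made-up numbers for illustration, not data from the paper, and the quadratic loop is chosen for readability rather than speed.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 3 positives and 4 negatives with made-up scores.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]

model_auc = auc(labels, scores)               # 11 of 12 pairs ranked correctly
constant_auc = auc(labels, [0.5] * len(labels))  # a model that cannot rank at all

print(round(model_auc, 3), constant_auc)  # 0.917 0.5
```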
Very quickly, news of this article went viral. Headlines included “Google and Harvard team up to use deep learning to predict earthquake aftershocks” and “Artificial intelligence nails predictions of earthquake aftershocks.” The research was even included in the release notes for TensorFlow as an example of what deep learning could do.
However, for me, the article and subsequent hype raised a couple of red flags:
- Forecasting seismic activity is challenging. The accuracy looks too good to be true.
- The paper uses only a single machine learning algorithm, applying only one architecture.
Due to space limitations, we will return to the first identified red flag in a later blog. Here we will focus on the second point, the lack of algorithmic diversity.
At the same time that the Nature article was published, industry analysts were commenting on the hype associated with certain technologies. One industry analyst listed “Deep Neural Networks” as being at the peak of the hype cycle. The authors of the Nature article had only used an algorithm that industry analysts categorized as overhyped!
A year later, Nature published a follow-up article, “One neuron versus deep learning in aftershock prediction,” using the same data but written by different authors. This new article compared the original model against a simple logistic regression model (in effect, a single neuron), concluding that the simpler model “provides comparable or better accuracy.” The authors further concluded that “deep learning does not offer new insights or better accuracy in predicting aftershock patterns.”
Humans can be biased in choosing algorithms. You may have your favorites, or you may be excited to try the coolest and latest algorithms. But to avoid algorithm bias, step aside and let a competition between a champion model and its challengers decide which method is superior. A lack of diversity in model-building usually leads to suboptimal results. A recent benchmarking exercise on a wide range of business use cases concluded: “The diversity of algorithms earning top accuracy rankings demonstrates the need to test as many different algorithms as possible to find the best one for your data.”
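A champion/challenger competition can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production workflow: the XOR-style grid data is synthetic, the three candidate learners (`majority_class`, `threshold_rule`, `nearest_neighbor`) are hypothetical stand-ins for whatever algorithms you would actually try, and `cross_val_accuracy` is a hand-rolled helper rather than a library function. The key idea is that every candidate is scored on the same held-out folds, and the leaderboard, not personal preference, picks the winner.

```python
# Synthetic XOR-style data on a 6x6 grid (illustrative only): the label is 1
# when exactly one of the two coordinates is 3 or more.
DATA = [((i, j), int((i >= 3) != (j >= 3))) for i in range(6) for j in range(6)]


def majority_class(train):
    """Baseline: always predict the most common training label."""
    ones = sum(y for _, y in train)
    guess = int(2 * ones >= len(train))
    return lambda x: guess


def threshold_rule(train):
    """The incumbent 'favorite': split the first feature at its training mean."""
    cut = sum(x[0] for x, _ in train) / len(train)
    return lambda x: int(x[0] >= cut)


def nearest_neighbor(train):
    """Challenger: copy the label of the closest training point."""
    def predict(x):
        def squared_distance(row):
            return (row[0][0] - x[0]) ** 2 + (row[0][1] - x[1]) ** 2
        return min(train, key=squared_distance)[1]
    return predict


def cross_val_accuracy(learner, data, folds=5):
    """Mean held-out accuracy over interleaved folds (row k is in fold k % folds)."""
    scores = []
    for f in range(folds):
        train = [row for k, row in enumerate(data) if k % folds != f]
        test = [row for k, row in enumerate(data) if k % folds == f]
        model = learner(train)
        scores.append(sum(model(x) == y for x, y in test) / len(test))
    return sum(scores) / folds


CANDIDATES = {
    "majority_class": majority_class,
    "threshold_rule": threshold_rule,
    "nearest_neighbor": nearest_neighbor,
}

# Score every candidate on the same folds, then let the numbers decide.
leaderboard = {name: cross_val_accuracy(fn, DATA) for name, fn in CANDIDATES.items()}
champion = max(leaderboard, key=leaderboard.get)
print(champion, leaderboard)
```

On this particular dataset the nearest-neighbor challenger beats the threshold rule, which barely improves on guessing. On data with a simple linear boundary, the ranking could easily reverse, which is the whole point of running the competition instead of assuming the answer.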
Colin Priest is the VP of AI Strategy for DataRobot, where he advises businesses on how to build business cases and successfully manage data science projects. Colin has held a number of CEO and general management roles, where he has championed data science initiatives in financial services, healthcare, security, oil and gas, government and marketing. Colin is a firm believer in data-based decision making and applying automation to improve customer experience. He is passionate about the science of healthcare and does pro-bono work to support cancer research.