Data Science Fails: Change is the Only Constant

November 15, 2019

Heraclitus, a Greek philosopher who lived 2,500 years ago, is credited with saying, “Change is the only constant.” Similarly, in 1789, Benjamin Franklin wrote, “In this world nothing can be said to be certain, except death and taxes.” This post is about how we sometimes forget these two ideas when using artificial intelligence systems.

Just two years ago, the internet was flooded with advertisements for get-rich-quick schemes that involved investing in bitcoin. There were also news stories in mainstream media about bitcoin millionaires, with headlines such as “Meet Erik Finman, the teenage bitcoin millionaire.” That news article told the story of a teenager who bought his first bitcoin for $10 at the age of 12 and later saw rising prices turn it into more than a thousand dollars. Erik made a bet with his parents that he would be a millionaire before the age of 20, and he won.

On December 17, 2017, the bitcoin price hit a new high of $19,783.06, up 1,824% since January 2017. That’s a time span of less than a year! But just five days later, the bubble started to burst, and bitcoin prices dropped by one-third in only 24 hours. By the time prices finished falling, the bitcoin market had lost 80% of its total value.

I saw people using spreadsheets to fit curves to the historical price series and projecting those prices into the future, predicting further price increases and high returns. People assumed the past bitcoin price growth was sure to continue. They were asking me why I hadn’t invested in bitcoin. But their assumption that bitcoin price trends would continue forever was wrong. This is an example of the cognitive bias known as the availability heuristic, whereby humans tend to overestimate the likelihood of events with greater “availability” in memory, which can be influenced by how recent the memories are or how unusual or emotionally charged they may be.
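To see why projecting a fitted curve forward is so fragile, here is a minimal sketch in Python. The prices are synthetic (a hypothetical 28% monthly rise for a year, followed by a hypothetical 15% monthly fall), not real bitcoin data, and the function names are purely illustrative. It fits the same kind of exponential trendline a spreadsheet would, using only the boom months, and compares the projection with what the synthetic series does next.

```python
import math


def fit_log_linear(prices):
    """Least-squares fit of log(price) = intercept + slope * t, for t = 0, 1, 2, ..."""
    n = len(prices)
    times = range(n)
    logs = [math.log(p) for p in prices]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    covariance = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    variance = sum((t - t_mean) ** 2 for t in times)
    slope = covariance / variance
    return y_mean - slope * t_mean, slope


def project(intercept, slope, t):
    """Extend the fitted trend to time t, as a spreadsheet trendline would."""
    return math.exp(intercept + slope * t)


# Synthetic illustration only, NOT real bitcoin prices.
boom = [1000 * 1.28 ** month for month in range(12)]         # months 0..11
bust = [boom[-1] * 0.85 ** month for month in range(1, 13)]  # months 12..23

# Fit on the boom only, which is all the spreadsheet user could see.
intercept, slope = fit_log_linear(boom)
projected = project(intercept, slope, 23)
actual = bust[-1]
overshoot = projected / actual

print(f"fitted monthly growth: {math.exp(slope) - 1:.1%}")
print(f"projected price at month 23: {projected:,.0f}")
print(f"actual (synthetic) price at month 23: {actual:,.0f}")
print(f"the trendline overshoots by a factor of {overshoot:,.0f}")
```

The trendline fits the boom months almost perfectly, and that is exactly the trap: a good fit to the past says nothing about whether the process that generated the past is still in force.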

Cargo Cult and AI

With all the hype about AI, some people have developed unrealistic expectations of it. I’ve lost count of how many times someone has asked me whether I can build an AI that will predict the stock market and make them rich. If it were that easy, at this very moment I’d be lying in a deck chair, sipping cocktails on a tropical island, while an AI quickly and efficiently made millions for me!

A more subtle version of the same logic goes as follows: With humans subject to strong cognitive biases, and with the availability of vast amounts of data that could be processed to better predict market movements, one might think that artificial intelligence is well suited to algorithmic trading. After all, aren’t computers much better at processing data and doing mathematics than humans, making logical decisions purely on the facts, and avoiding irrational exuberance?

But AIs are not a magical and certain route to wealth. Historical data alone is not enough to reliably identify investment strategies that will outperform. Seth Weingram, Director of Client Advisory at Acadian Asset Management, warns that people who use naive AI approaches to predict the sharemarket or interest rates can end up with flawed analysis that leads to financial losses. “You see market-naive folks who are trying to apply these techniques get into trouble,” he says. “There’s a risk that you don’t actually have enough data to meaningfully train your algorithm.”

Case Study: Predicting Equity Market Prices

In 2018, ABC Investments (not their real name) launched a fund that utilizes machine learning to identify sources of potential returns. Machine learning techniques were embedded within the investment process, using quantitative methods to time investments. The launch material promised that machines would “forecast market moves more accurately” and apply “an elegant approach with great return potential.” The fund algorithms also automated ESG (environmental, social, and corporate governance) screening to prevent investments in unethical equities, such as controversial weapons manufacturers. In short, investors in the fund were placing a multimillion-dollar bet that algorithms would be more effective at figuring out the complex world of discretionary investing than a human portfolio manager.

In the fine print at the bottom of the announcement of the AI fund launch, ABC included the standard investment product caveat: “Past performance is not a guide to future results.” This caveat was to prove prophetic. Over the past twelve months, the fund showed poor returns, underperforming the sharemarket. An ABC spokesperson explained the underperformance as “the result of challenging markets and changing behavior of so-called equity factors.” This was a surprising excuse to make, not only due to the promise at fund launch that machines would be more accurate at timing the market, but also because it is well known that equity factors change all the time; they are not stable. One published paper on factor investment concluded that “factors, like markets, have been documented to be extremely hard to time. Most institutions would rather avoid timing decisions given their inherent difficulty.”

Just like many others who have tried and failed, ABC was unable to use AI to predict equity markets with sufficient accuracy to consistently outperform an index.

Why Is Equity Market Price Prediction the Wrong Use Case for AI?

It’s not enough for an algorithm to have worked in the past. A successful AI investment algorithm needs to be equipped to identify when the investment regime is changing, and thus when the time has come to rotate onto different factors. Algorithms that were trained on data sourced from a time period during which only a single investment regime occurred will need to be retrained to learn the new market paradigms, and unless the algorithms recognize the regime change, they will be slow to update their investment strategies.
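As a rough illustration of what “recognizing a regime change” could look like in its simplest form, the sketch below compares the mean return in a recent window with the mean return in the training period and flags a large gap. This is a deliberately crude stand-in under stated assumptions, not a description of how any real fund works: the returns are synthetic, the function names are hypothetical, the threshold of 3.0 is an arbitrary choice, and a shift in the mean is only one of many ways a regime can change.

```python
import math
import statistics


def mean_shift_z(train_returns, recent_returns):
    """Welch-style z-score for the gap between recent and training mean returns."""
    gap = statistics.mean(recent_returns) - statistics.mean(train_returns)
    standard_error = math.sqrt(
        statistics.variance(train_returns) / len(train_returns)
        + statistics.variance(recent_returns) / len(recent_returns)
    )
    return gap / standard_error


def looks_like_new_regime(train_returns, recent_returns, threshold=3.0):
    """Flag the recent window when its mean return is far from the training mean."""
    return abs(mean_shift_z(train_returns, recent_returns)) > threshold


# Synthetic daily returns, for illustration only.
pattern = [0.012, -0.008, 0.005, -0.007, 0.003]
train = pattern * 100                            # 500 "training" days
calm = pattern * 12                              # 60 days that resemble training
stressed = [2 * r - 0.02 for r in pattern] * 12  # 60 days: lower mean, higher volatility

print("calm window flagged:", looks_like_new_regime(train, calm))
print("stressed window flagged:", looks_like_new_regime(train, stressed))
```

The calm window is not flagged and the stressed window is. Note what the check cannot do: it only tells you that the recent past looks different from the training period, not which strategy will work in the new regime.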

Most modern artificial intelligence systems are powered by machine learning. Rather than relying on hundreds of manually entered rules written by experts, machine learning learns by example from historical data. To put it simply, machine learning finds patterns in historical data and assumes that the patterns will apply in the future. But over the past 18 months, investment markets have changed. We are now in an environment of trade wars and inverted yield curves, a very different situation from the one ABC’s machine learning algorithms appear to have been trained on. Not only have the markets changed recently, but they will continue to change in the future.

The problem of market changes and AI is compounded by new “alternative” data sources, where available history tends to be even more limited. Some investment managers are using social media data to understand market sentiment, but Twitter was only founded in 2006, so tweets span a very limited range of market conditions: arguably, apart from 2008-2009, one very long bull market. AIs that use big data will necessarily have been trained only on data from the past several years. Without adequate historical data, an AI cannot accurately choose investment strategies. Equity markets do not remain constant, and there is no certain way to outperform the market consistently over time.

Investing differs from more common AI use cases, such as marketing, in that all market participants are trying to extract signal from the same dataset: there is a finite number of stocks, bonds, and other investments available, and historical data covering these is available to all. Additionally, the length of recent stock market cycles makes it difficult to evaluate alternative strategies. You can’t go back and run tests on them, because market conditions don’t repeat. Adding extra data sources doesn’t remove this limitation. The ability of AI to learn patterns is limited by the number of historical outcomes.

Investment market outcomes are changed by the actions of investment funds. One must not only consider predictions, but also game theory. By adopting new investment strategies created by AIs, you are changing the outcomes, possibly invalidating any predictive accuracy. The situation resembles AI-powered fraud detection because the behavior of the AI that detects fraud signals to fraudsters that they, in turn, need to change their behavior. As the fraudsters change their behavior, the predictions of the AI become less accurate. It becomes an arms race. I know of a fraud detection business that retrains its AIs every three days! However, the difference between fraud detection and investments is that there isn’t enough new market data to retrain an investment fund AI every three days.

Nevertheless, AI Can and Does Add Value to Trading Operations

The point here is not that AI-driven investment is always a data science fail, but rather that proper data science principles need to be followed, and you need to select the right use cases. There are many excellent uses for AI in the securities and investments industry, for example identifying optimal strategies to get trades executed, reducing operational risk, optimizing responses to requests for quotations, predicting changes in market dispersion, forecasting economic indicators, and even identifying the relevance of particular news stories to a stock or bond’s price.

But while some are reaping the benefits of AI to dramatically improve their operational sophistication and effectiveness, many are more pessimistic about the opportunities for using AI to fully automate strategy, stock selection, and timing the market. In a recent interview, Nobel-Prize-winning economist Robert Shiller commented, “I think we still need human oversight.”

Conclusion

There’s nothing unique to investment funds in these examples. The same principles apply to any AI use case where the environment changes or outcomes are uncertain. A best-practice internal AI governance and risk management process will include plans that cover the following general principles:

Choose the right use cases for AI. Most modern AIs rely on pattern recognition, using machine learning techniques to find those patterns. The most successful AI use cases are those in which past patterns can be reliably extended into the future.
AIs can be overconfident, just like humans. Follow best practices to ensure AI humility. Ensure that your AI warns you when it is about to make a decision using data that is outside the range of its training data. Identify decisions where the outcomes are abnormally uncertain. Set up AI processes so that the system automatically triages difficult decisions and edge cases to humans.
Managing an AI can be similar to managing human staff. Just as you would manage human experts using performance indicators and regular training to update their skills, you should not deploy an AI and leave it to run without performance tracking and updates. Best practice is to use MLOps systems to proactively warn when live data is dissimilar to training data or when accuracy is deteriorating.
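The “AI humility” principle above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not DataRobot’s implementation or any particular product’s API: the feature names (yield_spread, volatility), the example values, and the 0.4 to 0.6 “too uncertain” band are all hypothetical. The idea is to remember the range of each input seen during training, then route a decision to a human whenever an input falls outside that range or the model’s predicted probability is close to a coin flip.

```python
from dataclasses import dataclass


@dataclass
class FeatureRange:
    low: float
    high: float


def learn_ranges(training_rows):
    """Record the minimum and maximum seen for each feature during training."""
    ranges = {}
    for row in training_rows:
        for name, value in row.items():
            if name in ranges:
                seen = ranges[name]
                ranges[name] = FeatureRange(min(seen.low, value), max(seen.high, value))
            else:
                ranges[name] = FeatureRange(value, value)
    return ranges


def triage(row, probability, ranges, uncertain_band=(0.4, 0.6)):
    """Return ("automate", []) or ("human_review", reasons) for one decision."""
    reasons = []
    for name, value in row.items():
        seen = ranges.get(name)
        if seen is None:
            reasons.append(f"{name}: not seen in training")
        elif not (seen.low <= value <= seen.high):
            reasons.append(f"{name}={value} outside training range [{seen.low}, {seen.high}]")
    band_low, band_high = uncertain_band
    if band_low <= probability <= band_high:
        reasons.append(f"probability {probability:.2f} is too close to a coin flip")
    return ("human_review" if reasons else "automate"), reasons


# Hypothetical training data: every yield spread seen in training was positive.
training_rows = [
    {"yield_spread": 0.5, "volatility": 12.0},
    {"yield_spread": 2.1, "volatility": 30.0},
    {"yield_spread": 1.2, "volatility": 18.5},
]
ranges = learn_ranges(training_rows)

familiar = triage({"yield_spread": 1.0, "volatility": 20.0}, 0.90, ranges)
inverted = triage({"yield_spread": -0.3, "volatility": 20.0}, 0.90, ranges)
unsure = triage({"yield_spread": 1.0, "volatility": 20.0}, 0.55, ranges)

for label, (decision, reasons) in [("familiar", familiar), ("inverted", inverted), ("unsure", unsure)]:
    print(label, decision, reasons)
```

In this toy example, a negative yield spread (an inverted yield curve) goes to human review even though the model reports 90% confidence, because the model never saw such a value in training. A per-feature range check is only a first line of defense: it will not catch unusual combinations of values that are each individually in range.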

Are your AI governance processes protecting you from overconfident or out-of-date AIs?

About the author

Colin Priest

VP, AI Strategy, DataRobot

Colin Priest is the VP of AI Strategy for DataRobot, where he advises businesses on how to build business cases and successfully manage data science projects. Colin has held a number of CEO and general management roles, where he has championed data science initiatives in financial services, healthcare, security, oil and gas, government and marketing. Colin is a firm believer in data-based decision making and applying automation to improve customer experience. He is passionate about the science of healthcare and does pro-bono work to support cancer research.
