From R&D to ROI: Five Reasons ML Doesn’t Go Into Production – and How to Solve Them
I’ve worked in the data analytics space my entire career; making tools and making sense of data is my passion. The journey has been interesting – and never more so than today, as we live through this new chapter in data analytics and machine learning.
The potential is huge – we are likely witnessing the most revolutionary technology shift we’ll see in our lifetimes. Which raises the question: why are we not seeing it in production? Why do so many machine learning (ML) projects fail?
According to Gartner, 51% of enterprises have started their AI journey, but just 10% of ML solutions get deployed. This is because ML and ML in production are two different beasts, and many people don’t fully understand the barriers they face, questions they should ask, and objectives to strive for.
In the field, we often see people focus on data collection, data cleansing, building out models, accuracy rates, and explainability, assuming those are the keys to success. The reality is that many ML problems are not technology problems at all. Instead, success stems from digging into infrastructure, orchestration, integration, deployment, and core business objectives.
ML isn’t the problem: it’s the people around it, how it’s managed, and how the work is approached. So let’s break this all down into five major problem areas, the key question to ask in each case, and how to solve it.
1. Lack of process
Getting funding for a project or experiment is the easy part. Not thinking through how to get it into production is a major problem. Too many people think about achieving results, but aren’t intentional about a clear pathway to actualizing them.
Teams need to consider how a project will affect the bottom line, how it will be funded, who’ll need to be involved, and whether everyone fully understands these things. The number of projects where instigators say something will go live and “we’ll figure the other stuff out later” is too high – which is why 90% fail.
Ask: How do we get from proof of concept to production?
The solution: Start from how you’ll deliver value to your organization and work back from there. Plan and fund the deployment up front – who’ll take it over and be responsible for it, and how it will be implemented. Set clear deployment criteria around ownership and updates.
Bring in IT/DevOps stakeholders early, because data science and ML teams on their own won’t cut through a large organization’s red tape. And build for repeatability, because the only truth in this space is that tomorrow you’ll have more models in production than you do today.
2. The wrong incentives
When ML efforts are part of innovation mandates, they’re designed to be ‘out there’ – but this is optimization technology. ML is supposed to optimize business processes, increase revenue, and reduce costs. It must align with the organization and enable it to meaningfully do things it couldn’t do before. Innovation alone won’t deliver results to the business.
Ask: What is a justifiable improvement?
The solution: ML needs to be effective, easy to integrate, and usable – not ‘demoware.’ You must be able to say how it will affect the business and whether getting a model into production today will save money. If you find money is being lost for each day the model isn’t live, that’s the incentive to fund it.
It’s useful to start an ML problem by thinking about how to improve a business objective and what your constraints are. Are you focusing on the right things? If you’ll save a specific amount, does that justify the resources? What is the process you’re optimizing and the tech stack you’ll use? Who’ll own it, what are the key drivers for performance, and do you have the right data and modeling tools?
Companies that approach this from the wrong direction fail. You must first consider how value will be added to the business, and only then build out the stack.
3. The wrong teams
ML projects are more likely to fail when an organization doesn’t apply its skillsets in the proper place or at the right time. This might mean asking the wrong people to build things – for example, having data scientists with little engineering experience build infrastructure. Or it might mean a company bridging the communication gap between DevOps and data science teams too late.
Culture is an issue too. Bringing someone into a large enterprise and telling them to deliver can be daunting if they lack experience with the processes of large organizations, and are used to building a model on a laptop and deploying it whenever they like.
Ask: Do I have the right people to make a solution deployable in my org?
The solution: Don’t expect people to do tasks they are not suited for, and don’t rely on finding ‘unicorns’ who know ML, production, DevOps and engineering. They exist but are hard to find, so stop chasing them. Instead, create hybrid high-performing teams that combine DevOps engineers, data scientists and software engineers.
Beyond that, ensure you use software and platforms that enhance your data science and ML teams. And look for tooling to help you with data prep, model training, scaling and ops, to figure out where to hire people or buy a solution that will fill a gap.
4. The wrong technology
All sorts of technology problems can trip up getting ML into production: lack of defined stacks/best practices; not building for repeatability, measurability and auditability; not thinking about access to data.
On the last of those, you’ll hear people argue: “If only I had access to this data, I could build a better model.” Well, if you don’t, you don’t. That’s reality and the difference between production and dev!
Ask: What’s the best ML architecture for my organization?
The solution: Design to execute at scale and for repeatability and efficiency. Tightly integrate components you’ll use in the ML stack through APIs and programmability, but ensure you can swap out, replace and upgrade them when tech, data sources and needs evolve.
Above all, be agile. What you’re using today is not what you’ll use six months from now. But also be open to integration with in-house technologies: the less you need to replace internally in your organization, the less friction you’ll have in getting ML out there. Be tactical about what you need to add.
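As an illustrative sketch (the class and function names here are hypothetical, not from any specific ML stack), the “swap out, replace and upgrade” principle usually comes down to coding serving logic against a stable interface rather than against a concrete model:

```python
from abc import ABC, abstractmethod

class Predictor(ABC):
    """Stable interface that every model implementation must satisfy."""
    @abstractmethod
    def predict(self, features: dict) -> float: ...

class BaselineModel(Predictor):
    """Placeholder first-generation model."""
    def predict(self, features: dict) -> float:
        return 0.0

class UpgradedModel(Predictor):
    """A later model: same interface, different internals."""
    def predict(self, features: dict) -> float:
        return sum(features.values()) / max(len(features), 1)

def serve(model: Predictor, features: dict) -> float:
    # Serving code depends only on the Predictor interface, so a model
    # can be swapped or upgraded without touching this function.
    return model.predict(features)

print(serve(BaselineModel(), {"x": 2.0}))
print(serve(UpgradedModel(), {"x": 2.0, "y": 4.0}))
```

The point of the sketch is the boundary, not the models: as long as replacements honor the interface, the rest of the pipeline doesn’t need to change when the tech, data sources, or needs evolve.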
5. A lack of champions
This final point isn’t unique to ML: when deploying new technology and potentially expensive science experiments, the lack of a champion can be a death knell. ML projects without exec sponsorship rarely see the light of day, regardless of other considerations.
Ask: How do I get buy-in from stakeholders?
The solution: You need someone who’s forward-thinking and who can understand the business case for what you’re trying to do. It also doesn’t hurt if their own personal standing could be improved by backing your project – I’ve genuinely asked stakeholders in the past: “How do I ensure your bonus this year?”
Ultimately, you need to align ML with the organization itself, because without champions, getting ML into production is hard. Everything looks like cost, so figure out how to get buy-in, involve stakeholders up and down the chain of command early on, align values and interests, and collaborate to achieve your goals.
The earlier points can help you here too. Take all five into account and you’ll be on the path to success – one of the 10% that make it to production, rather than the 90% that never go anywhere.