Improving Model Management in Uncertain Times
With the continued unfolding of the COVID-19 pandemic, the world’s economies and societies are going through an extended period of uncertainty. This ongoing volatility brings new challenges for organizations and teams managing predictive models.
Maintaining a grip on production model management and monitoring is tricky under normal circumstances. The current turbulent times highlight this issue even further, with models decaying at an accelerated rate and wreaking havoc on business-critical processes that depend on their predictions. Assessing your ability to manage model risk and monitor models becomes vital for the data science and IT teams overseeing those models, as well as for the business stakeholders who rely on their predictions.
In a recent webinar titled Managing Models in Uncertain Times, we highlight some of the essential practices and tools that cover various stages of model risk management and model monitoring — a discipline commonly known as ML Ops. It includes technologies and practices aimed at providing a scalable and governed means to rapidly deploy and manage ML applications in production.
ML Ops focuses on four critical areas of investment for organizations monitoring their production models:
- Deployment: the convergence of data science and IT teams aimed at publishing a machine learning model into an existing production environment
- Monitoring: the process of assessing model performance and quality over time by monitoring for service health, accuracy, data drift, and many other critical metrics about the model
- Machine learning lifecycle: tools and techniques around model retraining, the testing of champion-challenger models, automated replacement and ongoing maintenance to ensure the continuous performance of the existing production models
- Governance: sets the rules and controls for machine learning models running in production, including approval workflows, access control, change and access logs, and traceability of model results
Investing in all of these areas makes enterprises and their models less susceptible to volatile events, like market crashes, rapid regulatory changes, and disruptions such as the ongoing pandemic. On top of it all, ML Ops improves the productivity and happiness of data science teams by letting them focus on actual model ROI rather than the model’s behavior in the wild.
ML Ops seeks to centralize and automate many of the manual processes involved in deploying and monitoring models. This, in turn, minimizes the risks around model deployment and streamlines the manual steps that remain.
There’s an abundance of issues that a coherent and comprehensive ML Ops framework can solve, such as:
- Data drift occurs when production data diverges from the training data over time, causing the model to lose predictive power. Drift can be abrupt or gradual, so it’s crucial to detect the pattern and correct it before it disrupts the production model’s performance.
- Service issues can disrupt machine learning pipelines at any time, from run-time errors, data errors, and system outages to system throughput and cache load issues.
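To make the drift idea concrete, here is a minimal sketch of one common drift metric, the Population Stability Index (PSI), which compares a feature’s distribution at training time against its distribution in production. This is an illustrative example, not the specific method used by any particular ML Ops product; the 0.2 threshold is a common rule of thumb, not a universal standard.

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare a feature's training (expected) distribution against its
    production (actual) distribution. Larger values indicate stronger drift;
    a common rule of thumb treats PSI > 0.2 as drift worth investigating."""
    # Bin edges are derived from the training distribution
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor the proportions to avoid division by zero and log(0)
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

rng = np.random.default_rng(0)
train = rng.normal(0, 1, 10_000)          # feature at training time
prod_stable = rng.normal(0, 1, 10_000)    # production data, same distribution
prod_drifted = rng.normal(1, 1, 10_000)   # production data after a shift

psi_stable = population_stability_index(train, prod_stable)    # near zero
psi_drifted = population_stability_index(train, prod_drifted)  # well above 0.2
```

Running a check like this on each feature at a regular cadence is one simple way to catch gradual drift before it visibly degrades predictions.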
Move Fast, Fix Things
And this is just the tip of the ML Ops iceberg. ML Ops tools have to be able to alert stakeholders about important model behaviors, like potential data drift. In turn, these alerts need to integrate with communication systems such as Slack and email.
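As a sketch of what such an integration might look like, the snippet below formats a drift alert and posts it to a Slack incoming webhook. The payload shape matches Slack’s incoming-webhook format; the model name, feature name, and webhook URL are all hypothetical placeholders for illustration.

```python
import json
import urllib.request

def build_drift_alert(model_name, feature, psi, threshold=0.2):
    """Format a human-readable drift alert as a Slack webhook payload."""
    return {
        "text": (
            f":warning: Possible data drift on model '{model_name}': "
            f"feature '{feature}' has PSI {psi:.2f} "
            f"(alert threshold {threshold})."
        )
    }

def send_to_slack(webhook_url, payload):
    """POST the alert to a Slack incoming webhook; returns True on success."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status == 200

payload = build_drift_alert("churn-model", "tenure_months", 0.37)
# send_to_slack("https://hooks.slack.com/services/...", payload)
```

The webhook URL is left elided; in practice it would come from your Slack workspace’s app configuration, and the same alert text could just as easily be routed to email.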
Also, many ML Ops processes revolve around model assessment and candidate model management: continuously evaluating and retraining existing models, and running ongoing A/B tests across the model inventory. All of this should be organized with a human in the loop, which ensures the performance, compliance, and stability of the system.
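A champion-challenger comparison of the kind described above can be sketched in a few lines: score both models on the same holdout data and surface a recommendation, leaving the actual replacement decision to a human reviewer. The function name and metric (simple accuracy) are illustrative assumptions, not a prescribed workflow.

```python
import numpy as np

def compare_champion_challenger(y_true, champion_probs, challenger_probs, threshold=0.5):
    """Score the current production model (champion) and a candidate
    (challenger) on the same holdout set. Replacement is only *recommended*;
    a human in the loop approves the actual swap."""
    def accuracy(probs):
        preds = (np.asarray(probs) >= threshold).astype(int)
        return float(np.mean(preds == np.asarray(y_true)))

    champ_acc = accuracy(champion_probs)
    chall_acc = accuracy(challenger_probs)
    return {
        "champion_accuracy": champ_acc,
        "challenger_accuracy": chall_acc,
        "recommend_replacement": chall_acc > champ_acc,
    }

# Tiny illustrative holdout: the challenger classifies every example correctly
y = np.array([0, 1, 1, 0, 1])
champ = np.array([0.4, 0.6, 0.3, 0.2, 0.7])
chall = np.array([0.1, 0.9, 0.8, 0.2, 0.7])
result = compare_champion_challenger(y, champ, chall)
```

In a production setting the comparison would use the business-relevant metric (log loss, revenue impact, and so on) and feed an approval workflow rather than an automatic swap.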
Learn more about how ML Ops can help lower management and business risks around production models by watching the full on-demand webinar.
After that, you can check out the results of some polls we ran during the webinar, or follow up with us and other community members through this community discussion in the Research Center.
Rajiv Shah is a data scientist at DataRobot, where he works with customers to make and implement predictions. Previously, Rajiv has been part of data science teams at Caterpillar and State Farm. He enjoys data science and spends time mentoring data scientists, speaking at events, and having fun with blog posts. He has a PhD from the University of Illinois at Urbana Champaign.
As the head of Model Risk Management at DataRobot, Seph Mard is responsible for model risk management, model validation, and model governance product management and strategy, as well as services. Seph is leading the initiative to bring AI-driven solutions into the model risk management industry by leveraging DataRobot’s superior automated machine learning technology and product offering.