DataRobot has Partnered with Labelbox to Bring Best-In-Class Unstructured Data Labeling Capabilities to our AI Cloud Platform
Thanks to DataRobot, leveraging vast amounts of data to generate AI-powered business insights and outcomes is no longer the stuff of science fiction – by pairing our AI Cloud platform with your enterprise data stack, it’s now possible for business stakeholders to make decisions based on the outputs of AutoML and AutoTS, all while models are centrally monitored and governed using MLOps. To date, however, enterprises’ vast troves of unstructured data – photo, video, text, and more – have remained mostly untapped.
At DataRobot, we are acutely aware of how much diverse data can improve our customers’ businesses. Standard data types such as .CSV files represent less than 20% [1] of all enterprise data. The rest is in complex, unstructured formats such as image, video, natural language, geospatial, and dozens of others.
Representative datasets are essential to any AI project, but current methods of building unstructured datasets are often slow and resource-intensive. DataRobot’s already market-leading AutoML, AutoTS, and MLOps products will only be able to drive more value after fully unlocking the power of data-agnostic AI.
Today, managing unstructured data is an arduous task. From managing the labeling and annotation processes to dealing with resource constraints, unlocking the ability to label unstructured data – and support the processes required to do so at scale – remains immensely challenging.
This is why we’re excited to announce our partnership with Labelbox, the leading provider of unstructured data labeling capabilities. Labelbox’s technology cuts the time required to label complex datasets by a factor of 5 to 10, so a small team no longer needs to iterate for months to deliver accurate training data for high model performance.
Labelbox is the data-centric infrastructure for modern AI teams, allowing them to rapidly create training data and improve model performance with minimal human supervision. Labelbox is primarily designed to help AI teams build and operate production-grade machine learning systems. Tens of thousands of leading AI teams have used Labelbox’s products to date, including hundreds of Fortune 500 companies, non-governmental organizations, and government agencies.
We’re excited to partner with DataRobot to simplify AI development in the enterprise by providing a powerful approach to active learning. By combining DataRobot and Labelbox, ML teams can more easily collaborate on the creation and management of high-quality training data in Labelbox. Afterwards, ML teams can use DataRobot for their model runs, and then use Model Assisted Labeling to label new data, visualize their DataRobot model predictions, and make corrections to their model. This will significantly reduce the time needed to develop production AI applications and bring the power of AI to more enterprises.
In working with Labelbox, we have done more than increase the volume of usable data for our customers – we’ve significantly improved the ability to generate business intelligence from AI.
Labelbox serves as a crucial link between idea and implementation for our customers. The need for AI/ML is clear, so the value of DataRobot is there. However, obtaining labeled data can be a prohibitive prerequisite. Video labeling, facilitated by Labelbox, provides the data for modeling, and tight integration via the Labelbox and DataRobot APIs provides seamless connections from data labeling through modeling, deployment, and prediction.
DataRobot + Labelbox + Snowflake Model-Assisted Labeling Solution
In the previous demo, we start with a training set of movie reviews and sentiment labels in a Snowflake table. DataRobot ingests this training data to produce models that predict if a review is positive, negative, or neutral. We pick the best model and perform Model Assisted Labeling (MAL) in Labelbox to allow reviewers to inspect predictions on a new batch of movie reviews. We make corrections to the model output through Labelbox’s text labeling tool and produce a new training set for DataRobot.
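The review step of that loop can be sketched in a few lines of Python. This is a minimal illustration only, not the integration used in the demo: it does not call the DataRobot, Labelbox, or Snowflake APIs, and every name in it (Prediction, TrainingRow, apply_review, correction_rate) is hypothetical. It assumes the model's predictions and the reviewers' corrections are already available as plain Python objects, and it shows how the two are merged into the next training set.

```python
from dataclasses import dataclass
from typing import Dict, List

# Sentiment classes named in the post.
LABELS = ("positive", "negative", "neutral")


@dataclass(frozen=True)
class Prediction:
    """A model prediction for one unlabeled review (hypothetical shape)."""
    review_id: str
    text: str
    predicted_label: str


@dataclass(frozen=True)
class TrainingRow:
    """One row of the next training set (hypothetical shape)."""
    review_id: str
    text: str
    label: str
    corrected: bool  # True when a reviewer changed the model's label


def apply_review(predictions: List[Prediction],
                 corrections: Dict[str, str]) -> List[TrainingRow]:
    """Merge model pre-labels with reviewer corrections.

    `corrections` maps review_id -> label chosen by a human reviewer.
    Reviews without an entry keep the model's predicted label.
    """
    rows = []
    for p in predictions:
        final_label = corrections.get(p.review_id, p.predicted_label)
        if final_label not in LABELS:
            raise ValueError(f"unknown label {final_label!r} for {p.review_id}")
        rows.append(TrainingRow(
            review_id=p.review_id,
            text=p.text,
            label=final_label,
            corrected=(final_label != p.predicted_label),
        ))
    return rows


def correction_rate(rows: List[TrainingRow]) -> float:
    """Fraction of pre-labels that reviewers had to change."""
    if not rows:
        return 0.0
    return sum(r.corrected for r in rows) / len(rows)


if __name__ == "__main__":
    batch = [
        Prediction("r1", "Loved every minute.", "positive"),
        Prediction("r2", "Two hours I will never get back.", "positive"),
        Prediction("r3", "It was fine.", "neutral"),
    ]
    new_training_set = apply_review(batch, {"r2": "negative"})
    for row in new_training_set:
        print(row)
    print("correction rate:", correction_rate(new_training_set))
```

Tracking which rows were corrected is a design choice of this sketch: a falling correction rate across batches is one simple signal that the model's pre-labels are becoming more reliable.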
As demonstrated, Labelbox’s capabilities pair elegantly with our mission to unleash the full power of human and machine intelligence, allowing ML teams to operate more effectively. Its technology works by leveraging your own model to make labeling easier, more accurate, and faster, in some cases saving ML teams 50-70% on their entire labeling budget by utilizing MAL. Supported labeling types span everything from classification, object detection, and segmentation of video to transcription and global plus local classification of audio.
In the increasingly interoperable universe of AI/ML, plug-and-play integrations with best-in-class solutions have the power to drastically improve the efficiency of ML teams. The AI community has realized that in order to truly unlock the power of augmented intelligence, teams must have access, in an easy-to-use and actionable form, to unstructured enterprise data. To ensure that data is put to use, DataRobot will continue to develop an expanded suite of solutions for multi-tool operations, enabling our customers to be extraordinarily successful.
We are excited to welcome Labelbox into the DataRobot Partner Ecosystem and look forward to continuing to push the limits of what’s possible using DataRobot.
To learn more about what’s possible with DataRobot and Labelbox, check out Labelbox’s blog post on optimizing your entire ML pipeline, watch DataRobot CFDS Joel Gongora’s tweet sentiment classification demo, or contact us directly at firstname.lastname@example.org.