Testing External Datasets in DataRobot
This post was originally part of the DataRobot Community.
This article shows how to upload your own external test datasets to DataRobot to evaluate the performance of your models.
DataRobot handles partitioning automatically, so models are always evaluated on out-of-sample data. Analysts can still upload any number of additional test datasets and compare metric scores across them to check for consistency before deployment.
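The idea of comparing metric scores for consistency can be sketched in a few lines of Python. This is only an illustration, not part of DataRobot: the function name, the 5% tolerance, the dataset names, and the AUC values are all hypothetical and chosen for the example.

```python
def inconsistent_datasets(validation_score, external_scores, rel_tol=0.05):
    """Return names of external datasets whose score differs from the
    validation score by more than rel_tol (relative to the validation score)."""
    flagged = []
    for name, score in external_scores.items():
        if abs(score - validation_score) > rel_tol * abs(validation_score):
            flagged.append(name)
    return flagged


# Made-up AUC values, for illustration only.
validation_auc = 0.80
external_auc = {"q1_holdout": 0.79, "new_region": 0.70}

flagged = inconsistent_datasets(validation_auc, external_auc)
print(flagged)  # ['new_region']
```

A dataset that is flagged this way is not necessarily a problem with the model; it may simply come from a different population than the training data, which is worth investigating before deployment.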
Note: To ensure that you can access this functionality, contact your DataRobot representative for information on enabling the feature.
Uploading an External Dataset
To upload the external dataset, navigate to the Make Predictions page by clicking the Predict tab for your DataRobot model (Figure 1).
Use one of the Import data from options to upload your dataset. When the upload finishes, the “Run External Test” option appears (Figure 2). Click that link.
DataRobot takes a few moments to calculate the accuracy metrics for the related model against this dataset.
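To make concrete what "calculating an accuracy metric against a dataset" means, here is a minimal sketch of one common binary classification metric, log loss, computed from actual labels and predicted probabilities. It is a generic textbook calculation, not DataRobot's implementation, and the metric DataRobot reports depends on your project; the function name and the sample inputs are assumptions for illustration.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Average binary log loss.

    y_true: actual labels (0 or 1) from the external dataset.
    y_prob: predicted probabilities of class 1 for the same rows.
    """
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities so log(0) never occurs.
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical labels and predictions, for illustration only.
score = log_loss([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4])
print(round(score, 4))
```

Lower log loss is better. Note that a score like this can only be computed when the external dataset contains the actual target values.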
Sorting by External Test Data Accuracy
To see and use the newly calculated accuracy, select Menu > Show External Test Column (Figure 3).
You can now sort models by external test scores and calculate scores for more models.
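The sorting behavior described above can be mimicked outside the UI with a small sketch. It assumes you have collected scores into plain Python dictionaries yourself; the model names, the `external_score` key, and the values are hypothetical, and models without a calculated external score are placed last.

```python
def sort_by_external_score(models, higher_is_better=True):
    """Sort models by external test score, keeping unscored models at the end."""
    scored = [m for m in models if m["external_score"] is not None]
    unscored = [m for m in models if m["external_score"] is None]
    scored.sort(key=lambda m: m["external_score"], reverse=higher_is_better)
    return scored + unscored


# Made-up leaderboard entries, for illustration only.
models = [
    {"name": "Model A", "external_score": 0.78},
    {"name": "Model B", "external_score": None},  # external test not run yet
    {"name": "Model C", "external_score": 0.83},
]

ranked = [m["name"] for m in sort_by_external_score(models)]
print(ranked)  # ['Model C', 'Model A', 'Model B']
```

Set `higher_is_better=False` for error metrics such as log loss or RMSE, where a lower score indicates a better model.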
Currently, external test datasets are supported for:
- Binary Classification
- Multiclass Classification
- OTV projects
For more information, search the DataRobot documentation for “Testing with external datasets.”