DataRobot AI Catalog Overview
This post was originally part of the DataRobot Community.
This article describes the DataRobot AI Catalog: what it is, how to access it, and how to use it to find datasets and feature lists suitable for your modeling projects.
The DataRobot AI Catalog is a centralized hub for storing materialized and virtual datasets, which can then be used for model training and batch predictions. You can launch a DataRobot project using data in the AI Catalog from:
- The Browse AI Catalog button
- The AI Catalog tab
Both of these features are on the main page (Figure 1).
Figure 1. Ways to access the DataRobot AI Catalog
Clicking either option takes you to the AI Catalog, where you can browse all the datasets that you have access to (Figure 2).
Figure 2. Sample list of datasets that are accessible via the AI Catalog
At this point, you can locate the desired dataset, click it, and launch a new project by clicking Create project (Figure 3).
Figure 3. Launch a new project with this dataset via the Create project button
Alternatively, you can learn more about the dataset by reading its metadata, which is available in the Info tab (Figure 4).
Dataset metadata includes, for example:
- a short description of what it is about
- the individual who created it
- when it was created
- when it was last modified
- its status, such as when it was snapshotted or profiled
- dimensions of the dataset
Figure 4. Metadata available for each dataset
Users with write permission can edit the dataset's name, description, and tags so that others can better understand the dataset and find it more easily in search.
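To illustrate why descriptive metadata helps with search, here is a minimal sketch in plain Python. It is not DataRobot code: the `CatalogEntry` class, the `search` helper, and the sample entries are all hypothetical, and the matching rule (substring match on name and description, exact match on tags) is an assumption made for the example, not a description of how the AI Catalog search works.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical stand-in for a catalog item's editable metadata."""
    name: str
    description: str = ""
    tags: list = field(default_factory=list)


def search(entries, query):
    """Return entries whose name or description contains the query,
    or that carry a tag equal to the query (case-insensitive)."""
    q = query.lower()
    return [
        e for e in entries
        if q in e.name.lower()
        or q in e.description.lower()
        or any(q == tag.lower() for tag in e.tags)
    ]


# Hypothetical entries: the first has an uninformative file name,
# so it is only discoverable through its description and tags.
entries = [
    CatalogEntry(
        name="export_final_v2.csv",
        description="Hospital readmission records",
        tags=["healthcare"],
    ),
    CatalogEntry(name="loan_defaults.csv"),
]

by_description = search(entries, "readmission")
by_tag = search(entries, "Healthcare")
by_name = search(entries, "loan")
print([e.name for e in by_description])
```

In this sketch, the first entry would be invisible to a search for "readmission" if nobody had filled in its description, which is the practical argument for keeping metadata up to date.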
The Profile tab shows a preview of the data. Clicking any column brings up summary statistics for the feature in that column (Figure 5).
Figure 5. Preview of the data showing the summary statistics for the ‘gender’ column
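For a sense of what a per-column summary can contain, the following plain-Python sketch computes a few simple statistics for a categorical column. It is not DataRobot code: the `summarize_column` helper and the sample rows are hypothetical, and the statistics that the Profile tab actually displays may differ.

```python
from collections import Counter


def summarize_column(rows, column):
    """Return simple summary statistics for one column of a list-of-dicts table."""
    values = [row.get(column) for row in rows]
    # Treat None and empty strings as missing values.
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "unique": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Hypothetical sample data with a 'gender' column.
sample_rows = [
    {"gender": "Female", "age": 34},
    {"gender": "Male", "age": 51},
    {"gender": "Female", "age": 29},
    {"gender": None, "age": 42},
]

summary = summarize_column(sample_rows, "gender")
print(summary)
```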
The Feature Lists tab displays all feature lists that are available for this dataset (Figure 6). You can also create custom feature lists by selecting the features you want to include in your new list and pressing the Create New Feature List From Selection button. DataRobot then opens a pop-up window where you can name your new feature list. Any feature lists found in the Feature Lists tab are automatically imported into any project launched from the AI Catalog. Moreover, everyone with access to this dataset also has access to all associated feature lists.
Figure 6. AI Catalog Feature Lists tab
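Conceptually, a feature list is a named subset of a dataset's columns. The sketch below shows that idea in plain Python. It is not DataRobot code: the `select_features` helper, the "Demographics" list name, and the sample rows are hypothetical and chosen only for illustration.

```python
def select_features(rows, feature_list):
    """Keep only the columns named in feature_list, in that order."""
    # Collect every column name that appears in the data.
    available = set().union(*(row.keys() for row in rows)) if rows else set()
    unknown = [name for name in feature_list if name not in available]
    if unknown:
        raise KeyError(f"features not in dataset: {unknown}")
    return [{name: row.get(name) for name in feature_list} for row in rows]


# Hypothetical dataset and a custom feature list defined on it.
rows = [
    {"gender": "Female", "age": 34, "readmitted": True},
    {"gender": "Male", "age": 51, "readmitted": False},
]
feature_lists = {"Demographics": ["gender", "age"]}

subset = select_features(rows, feature_lists["Demographics"])
print(subset)
```

Because the list is stored by name alongside the dataset, anyone working from the same data can reuse the same selection rather than re-picking columns by hand.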
Finally, the Version History tab displays the different versions that are available for this dataset. You can create a new project from any version of your dataset (Figure 7).
Figure 7. AI Catalog Version History tab
For more information, see the DataRobot public platform documentation.