This post was originally part of the DataRobot Community.
This article describes the DataRobot AI Catalog: what it is, how to access it, and how to use it to find datasets and feature lists suitable for your modeling projects.
The DataRobot AI Catalog is a centralized hub for storing materialized and virtual datasets, which can then be used for model training and batch predictions. You can launch a DataRobot project with data from the AI Catalog through either of the following:
The Browse AI Catalog button
The AI Catalog tab
Both are available on the main page (Figure 1).
Figure 1. Ways to access the DataRobot AI Catalog
Clicking either option takes you to the AI Catalog, where you can browse all the datasets you have access to (Figure 2).
Figure 2. Sample list of datasets that are accessible via the AI Catalog
At this point, you can locate the desired dataset, click it, and launch a new project by clicking Create project (Figure 3).
Figure 3. Launch a new project with this dataset via the Create project button
Alternatively, you can learn more about the dataset by reading its metadata, which is available on the Info tab (Figure 4).
The metadata includes:
a short description of what it is about
the individual who created it
when it was created
when it was last modified
its status, such as when it was snapshotted or profiled
dimensions of the dataset
Figure 4. Metadata available for each dataset
Users with write permission can add or modify the dataset's name, description, and tags. This helps others understand the dataset and makes it easier to find through search.
The Profile tab shows a preview of the data. Clicking any column brings up summary statistics for the feature in that column (Figure 5).
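To make the role of the metadata concrete, here is a minimal Python sketch of a catalog entry and a tag-aware search. It is not the DataRobot API: the `DatasetInfo` class, its field names, the `search` helper, and the sample datasets are all hypothetical and only mirror the Info tab fields listed above.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class DatasetInfo:
    """Hypothetical stand-in for the metadata shown on the Info tab."""
    name: str
    description: str
    created_by: str
    created_at: datetime
    modified_at: datetime
    status: str          # e.g. "snapshotted" or "profiled"
    n_rows: int
    n_columns: int
    tags: set = field(default_factory=set)


def search(catalog, query):
    """Return entries whose name, description, or tags contain the query.

    Matching is a case-insensitive substring test (an assumption of this
    sketch, not a statement about how the AI Catalog search works).
    """
    q = query.lower()
    return [
        d for d in catalog
        if q in d.name.lower()
        or q in d.description.lower()
        or any(q in tag.lower() for tag in d.tags)
    ]


# Made-up sample entries for illustration only.
catalog = [
    DatasetInfo("loans_2019", "Loan applications", "alice",
                datetime(2020, 1, 6), datetime(2020, 2, 3),
                "snapshotted", 10000, 34, {"lending", "credit risk"}),
    DatasetInfo("readmissions", "Hospital readmission records", "bob",
                datetime(2020, 3, 9), datetime(2020, 3, 9),
                "profiled", 5000, 51),
]

before = search(catalog, "healthcare")   # untagged dataset is not found
catalog[1].tags.add("healthcare")        # a user with write permission adds a tag
after = search(catalog, "healthcare")    # the same query now finds it
```

In this sketch, adding a single tag is enough to surface a dataset that neither its name nor its description would have matched.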
Figure 5. Preview of the data showing the summary statistics for the ‘gender’ column
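As a rough illustration of what per-column summary statistics look like, the sketch below computes a few common ones with the Python standard library. The article does not list which statistics the Profile tab reports, so the chosen statistics, the `summarize` helper, and the sample values are assumptions.

```python
import statistics
from collections import Counter


def summarize(values):
    """Summarize one column; None marks a missing value."""
    present = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "unique": len(set(present)),
    }
    is_numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in present
    )
    if is_numeric:
        # Numeric column: report range and central tendency.
        summary.update(
            min=min(present),
            max=max(present),
            mean=statistics.mean(present),
            median=statistics.median(present),
        )
    else:
        # Categorical column: report the most frequent values.
        summary["most_common"] = Counter(present).most_common(3)
    return summary


# Made-up sample columns for illustration only.
gender = ["Female", "Male", "Female", None, "Female"]
age = [34, 51, 29, 42, None]

gender_summary = summarize(gender)
age_summary = summarize(age)
```

For a categorical column such as 'gender', the frequency counts are the informative part; for a numeric column, the range and central tendency are.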
The Feature Lists tab displays all feature lists that are available for this dataset (Figure 6). To create a custom feature list, select the features you want to include and click the Create New Feature List From Selection button. DataRobot opens a pop-up window where you can name the new feature list. Every feature list on the Feature Lists tab is automatically imported into any project launched from the AI Catalog. Moreover, everyone with access to the dataset also has access to all of its associated feature lists.
Figure 6. AI Catalog Feature Lists tab
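Conceptually, a feature list is a named subset of a dataset's columns. The sketch below models that idea in plain Python; it is not DataRobot code. The helper names, the column names, the "All Features" and "Demographics" list names, and the rule that list names must be unique are assumptions made for illustration.

```python
def create_feature_list(name, selected, dataset_columns, existing_lists):
    """Register a named subset of the dataset's columns as a feature list."""
    unknown = [c for c in selected if c not in dataset_columns]
    if unknown:
        raise ValueError(f"columns not in dataset: {unknown}")
    if name in existing_lists:
        # Assumption of this sketch: names are unique per dataset.
        raise ValueError(f"feature list {name!r} already exists")
    chosen = set(selected)
    # Keep the dataset's column order and drop duplicate selections.
    existing_lists[name] = [c for c in dataset_columns if c in chosen]
    return existing_lists[name]


def project_feature_lists(existing_lists):
    """Copy every catalog feature list into a newly launched project."""
    return {name: list(cols) for name, cols in existing_lists.items()}


# Made-up columns and list names for illustration only.
columns = ["age", "gender", "income", "zip_code", "is_bad"]
feature_lists = {"All Features": list(columns)}

create_feature_list("Demographics", ["gender", "age"], columns, feature_lists)
project_lists = project_feature_lists(feature_lists)
```

Because the lists are stored with the dataset rather than with a project, every project created from it starts with the same set.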
Finally, the Version History tab displays the different versions that are available for this dataset. You can create a new project from any version of your dataset (Figure 7).
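The idea of launching a project from a specific dataset version can be sketched as follows. This is a toy model, not the DataRobot API: the classes, the sequential version numbers, and the `create_project` helper are hypothetical, and defaulting to the latest version is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetVersion:
    """One immutable snapshot of a dataset (hypothetical)."""
    version: int
    n_rows: int


@dataclass
class CatalogDataset:
    """A dataset together with its version history (hypothetical)."""
    name: str
    versions: list = field(default_factory=list)

    def add_version(self, n_rows):
        """Append a new version, numbered sequentially from 1."""
        new = DatasetVersion(len(self.versions) + 1, n_rows)
        self.versions.append(new)
        return new

    def get(self, version):
        """Look up a version by number."""
        for v in self.versions:
            if v.version == version:
                return v
        raise KeyError(f"{self.name} has no version {version}")


def create_project(dataset, version=None):
    """Describe a project built from the given version (latest if omitted)."""
    chosen = dataset.versions[-1] if version is None else dataset.get(version)
    return {
        "project_name": f"{dataset.name} (v{chosen.version})",
        "dataset": dataset.name,
        "version": chosen.version,
        "n_rows": chosen.n_rows,
    }


# Made-up history for illustration only.
loans = CatalogDataset("loans_2019")
loans.add_version(n_rows=10000)
loans.add_version(n_rows=12500)

latest_project = create_project(loans)             # uses version 2
older_project = create_project(loans, version=1)   # pins version 1
```

Pinning a project to an explicit version keeps it reproducible even after newer versions of the dataset are added.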