DataRobot AI Catalog Overview

April 17, 2020

This post was originally part of the DataRobot Community.

This article describes the DataRobot AI Catalog: what it is, how to access it, and how to use it to find datasets and feature lists suitable for your modeling projects.

The DataRobot AI Catalog is a centralized hub for storing materialized and virtual datasets, which can then be used for model training and batch predictions. You can launch a DataRobot project using data found in the AI Catalog from:

  • The Browse AI Catalog button
  • The AI Catalog tab

Both of these features are on the main page (Figure 1).

Figure 1. Ways to access the DataRobot AI Catalog

Clicking either option takes you to the AI Catalog, where you can browse all the datasets that you have access to (Figure 2).

Figure 2. Sample list of datasets that are accessible via the AI Catalog

At this point, you can locate the desired dataset, click it, and launch a new project by clicking Create project (Figure 3).

Figure 3. Launch a new project with this dataset via the Create project button
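The same flow can also be approximated in code. The sketch below is not part of the original walkthrough: it assumes the `datarobot` Python client is installed and configured, and that its `Dataset.list()` and `Project.create_from_dataset()` calls behave as described in the client documentation. The helper names and the dataset and project names are hypothetical.

```python
from typing import Any, Iterable, Optional


def find_dataset_by_name(datasets: Iterable[Any], name: str) -> Optional[Any]:
    """Return the first catalog entry whose name matches exactly, or None."""
    for dataset in datasets:
        if dataset.name == name:
            return dataset
    return None


def launch_project_from_catalog(dr: Any, dataset_name: str, project_name: str) -> Any:
    """Look up a catalog dataset by name and create a project from it.

    `dr` is expected to be the imported `datarobot` module (an assumption);
    it is passed in so the helper can also be exercised with a stand-in.
    """
    dataset = find_dataset_by_name(dr.Dataset.list(), dataset_name)
    if dataset is None:
        raise LookupError(f"No AI Catalog dataset named {dataset_name!r}")
    # Assumed client call: creates a project from an existing catalog dataset.
    return dr.Project.create_from_dataset(dataset.id, project_name=project_name)


# Example usage (requires the `datarobot` package and valid credentials):
#   import datarobot as dr
#   dr.Client()  # reads endpoint and token from a config file or environment
#   project = launch_project_from_catalog(dr, "my_dataset", "My catalog project")
```

Passing the client module in as an argument keeps the lookup logic separate from the network calls, so it can be checked without a DataRobot account.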

Alternatively, you can learn more about the dataset by reading its metadata, which is available in the Info tab (Figure 4).

The dataset metadata includes:

  • a short description of what it is about
  • the individual who created it
  • when it was created
  • when it was last modified
  • its status, such as when it was snapshotted or profiled
  • dimensions of the dataset

Figure 4. Metadata available for each dataset

Users with write permission can add or modify the dataset's name, description, and tags in the metadata so that others can better understand the dataset and find it more easily when searching.

The Profile tab provides a preview of the data. Clicking any column brings up the summary statistics for the feature in that column (Figure 5).

Figure 5. Preview of the data showing the summary statistics for the ‘gender’ column

The Feature Lists tab displays all feature lists that are available for this dataset (Figure 6). You can also create custom feature lists by selecting the features you want to include in your new list and clicking the Create New Feature List From Selection button. DataRobot then opens a pop-up window where you can name your new feature list. Any feature lists found in the Feature Lists tab are automatically imported into any project launched from the AI Catalog. Moreover, everyone with access to this dataset also has access to all associated feature lists.

Figure 6. AI Catalog Feature Lists tab

Finally, the Version History tab displays the different versions that are available for this dataset. You can create a new project from any version of your dataset (Figure 7).


Figure 7. AI Catalog Version History tab

More information

See the DataRobot public platform documentation:

Import and Create Projects in the AI Catalog

About the author
Linda Haviland

Community Manager
