Simplifying Big Data and AI with DataRobot and Databricks

June 1, 2018
· 2 min read

Many organizations are looking for ways to apply AI and analytics to their business, which requires attention to every stage, from data preparation to machine learning to deployment. At DataRobot, we’re pleased to announce our partnership with Databricks, which lets us offer companies a robust solution for accelerating analytics innovation and building AI applications.

Founded by the original creators of Apache Spark™, Databricks offers the Unified Analytics Platform, which accelerates innovation by unifying data science, data engineering, and business. Databricks enables organizations to achieve faster time-to-value by creating end-to-end data pipelines that go from ETL and interactive exploration to production all in one place, with performance cited as 10-100x faster than Apache Spark.


Why Databricks and DataRobot

Together, Databricks and DataRobot offer a unique combination of tools that empower AI and machine learning teams — from data scientists to “citizen data scientists” like business analysts, software engineers, and data engineers — to be more productive by providing the resources needed for project success.

Databricks brings the Unified Analytics Platform to DataRobot users, delivering ETL capabilities to cleanse, reformat, join, and optimize the datasets used to build machine learning models. DataRobot brings the power of automated machine learning to Databricks users, allowing them to quickly build, validate, test, and select the best machine learning model for their AI challenges. Within minutes, DataRobot can iterate on thousands of combinations of machine learning models and parameters, work that would take days or weeks to do manually.


The end-to-end workflow when working with Databricks and DataRobot is: 

  1. Read your data into Spark DataFrames and transform your data

  2. Take your Spark DataFrame and convert it to a pandas DataFrame in Python

  3. Send that data to DataRobot to build, train, and evaluate a collection of machine learning models to consider

  4. Validate the model and retrieve model insights in DataRobot

  5. Choose the best model for your business and then use one of DataRobot’s model deployment options to operationalize the model

  6. Move the chosen model into Databricks to run at large scale
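
Steps 1 through 3 above can be sketched in Python roughly as follows. This is a minimal illustration, not the official integration: it assumes PySpark and the DataRobot Python client (the `datarobot` package) are installed, and the function names, the CSV source, the `dropna()` cleanup, and the `max_rows` cap are all hypothetical choices made for the example. Credentials, paths, and the target column are placeholders you would supply.

```python
# Hedged sketch of workflow steps 1-3. Helper names, the CSV source, and the
# row cap are illustrative assumptions, not part of either product.


def load_and_transform(spark, path):
    """Step 1: read a CSV into a Spark DataFrame and apply a simple cleanup."""
    sdf = spark.read.csv(path, header=True, inferSchema=True)
    # Dropping rows with nulls stands in for real cleansing and joining logic.
    return sdf.dropna()


def to_pandas(spark_df, max_rows=100_000):
    """Step 2: convert a Spark DataFrame to a pandas DataFrame."""
    # toPandas() collects every row onto the driver, so cap the row count first.
    return spark_df.limit(max_rows).toPandas()


def start_project(pandas_df, target, project_name, token, endpoint):
    """Step 3: upload the data to DataRobot and run Autopilot on `target`."""
    import datarobot as dr  # imported lazily so the helpers above work without it

    dr.Client(token=token, endpoint=endpoint)
    project = dr.Project.start(
        sourcedata=pandas_df, target=target, project_name=project_name
    )
    project.wait_for_autopilot()  # block until model building finishes
    return project.get_models()   # candidate models to review in steps 4-5
```

In a Databricks notebook, where a `spark` session is already available, usage might look like `models = start_project(to_pandas(load_and_transform(spark, path)), target, name, token, endpoint)`. The row cap in step 2 is a design choice for the sketch: converting to pandas moves the data out of the cluster and into driver memory, so very large datasets may need sampling or aggregation in Spark first.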


Together, Databricks and DataRobot enable data scientists and citizen data scientists to accelerate and scale the development and delivery of machine learning models. Through this collaboration, users of both solutions gain three key capabilities: automated machine learning, robust ETL, and rapid model development and deployment. This greatly increases productivity and removes bottlenecks in both the analytics process and the building of AI applications.

For more information on the integration between Databricks and DataRobot, please read about the detailed workflow process here.


About the Author:

Dan Ganancial leads Partner Marketing at DataRobot, where he is responsible for driving joint marketing initiatives with technology alliance and channel partners. Dan is a marketing professional with more than 10 years of experience in partner, product, and strategic marketing. He has held several roles in sales, business development, and marketing, where he has built a strong record of driving both customer and revenue growth. Follow him on Twitter: @datarobotdan


