The 4 Ways Intelligent Automation Helps Achieve DataOps

August 13, 2019

To accommodate the growing popularity of self-service data preparation among our customers, we focused our most recent 2019 release on fusing the creation of data prep flows with their operationalization. The larger the team of data analysts, data engineers, and other data practitioners, the more data preparation flows are created and reused across the organization. Instead of leaving the operationalization of these flows as a separate task, we wanted the system to handle it intelligently.

Say hello to Intelligent Automation.

When I speak with customers who are using the new release, it is fulfilling to hear them describe the benefits of our recent innovation. They repeatedly point to one of four aspects of Paxata Intelligent Automation.

1) Automated Runtime and Dependency Management

With a single click, Paxata automatically detects and sequences all related data prep projects into a single data flow. For example, take one data preparation project that depends on a dozen other projects, all built through the collaboration of several data practitioners.

Instead of someone manually linking the various data flows together (typically a task for IT or a data engineer), Paxata Intelligent Automation recursively maps all the dependencies. That way, when a single project is set to run, the rest execute in sequence.
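To make the idea of recursively mapping dependencies concrete, here is a minimal Python sketch. It is not Paxata's implementation; it simply shows one common way to do this, a depth-first walk that lists every upstream project before the project that needs it. The function name and the project names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def execution_order(target: str, depends_on: Dict[str, List[str]]) -> List[str]:
    """Return every project needed to run `target`, dependencies first."""
    order: List[str] = []
    done: Set[str] = set()
    in_progress: Set[str] = set()

    def visit(project: str) -> None:
        if project in done:
            return  # already scheduled by an earlier branch
        if project in in_progress:
            raise ValueError(f"circular dependency involving {project!r}")
        in_progress.add(project)
        for upstream in depends_on.get(project, []):
            visit(upstream)  # recurse into each upstream project first
        in_progress.remove(project)
        done.add(project)
        order.append(project)

    visit(target)
    return order


# Hypothetical projects: each key lists the projects it depends on.
depends_on = {
    "customer_master": ["dedupe_accounts", "standardize_addresses"],
    "dedupe_accounts": ["raw_accounts_cleanup"],
    "standardize_addresses": ["raw_accounts_cleanup"],
    "raw_accounts_cleanup": [],
}

print(execution_order("customer_master", depends_on))
# ['raw_accounts_cleanup', 'dedupe_accounts', 'standardize_addresses', 'customer_master']
```

Note that the shared upstream project is scheduled only once, even though two projects depend on it, and a circular dependency is reported rather than looping forever.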

“Paxata’s new Intelligent Automation greatly improves our organization’s ability to accelerate the continuous delivery of our customer master and data quality projects. We can now automate any curated data preparation workflow with a single click, and Paxata intelligently discovers and links all the dependent projects together, which helps us save time and simplify our data operations.” — Byron Hernandez, Senior Data Analyst, Cox Automotive.

2) Visual Impact Analysis

The new Automated Project Flow graph shows how incoming data affects downstream processes. This view gives full visibility into all upstream sources and downstream outputs for any given data preparation project, helping the user visualize the impact before making any changes.

Figure: Data project flowchart starting with a dataset and ending with the final project
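The question behind an impact analysis ("what feeds this project, and what does it feed?") can be sketched as a graph traversal. The Python below is only an illustration under assumed names; the `feeds` mapping, the helper functions, and the dataset and project names are all hypothetical and do not describe how Paxata builds its graph.

```python
from collections import deque
from typing import Dict, List, Set


def reachable(start: str, edges: Dict[str, List[str]]) -> Set[str]:
    """Breadth-first walk returning every node reachable from `start`."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)  # the starting node is never part of its own impact
    return seen


def invert(edges: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Flip every edge so that downstream links become upstream links."""
    inverted: Dict[str, List[str]] = {}
    for source, targets in edges.items():
        for target in targets:
            inverted.setdefault(target, []).append(source)
    return inverted


# Hypothetical flow: each key feeds the items in its list.
feeds = {
    "sales_dataset": ["clean_sales"],
    "clean_sales": ["regional_rollup", "forecast_inputs"],
    "regional_rollup": ["final_report"],
    "forecast_inputs": ["final_report"],
}

downstream = reachable("clean_sales", feeds)
upstream = reachable("clean_sales", invert(feeds))
print(sorted(downstream))  # ['final_report', 'forecast_inputs', 'regional_rollup']
print(sorted(upstream))    # ['sales_dataset']
```

Before changing `clean_sales`, the downstream set tells you which outputs could be affected, and the upstream set tells you which sources it relies on.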

3) Tracking the History and Analyzing Data Snapshots

For an operationalized process, the ability to record what ran, when it ran, and what results it generated, and then to audit and troubleshoot, is very important. Paxata Intelligent Automation has this functionality built in, so business analysts or data engineers can set it and forget it. For every automation job that runs, Paxata automatically snapshots the versions of the datasets and projects that were executed. If you need to go back to the last quarter and analyze which data and transformations produced the answers, you can do so quickly and easily.

Paxata also automatically versions the results so you can whip up a periodic trend report.
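As a rough illustration of what a per-run snapshot makes possible, the Python sketch below keeps a small in-memory history and answers the "what ran last quarter?" question. The class names, fields, and version numbers are assumptions made for this example and are not Paxata's data model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RunRecord:
    run_at: datetime
    project_versions: Dict[str, int]  # project name -> version executed
    dataset_versions: Dict[str, int]  # dataset name -> version read
    result_version: int               # version assigned to the output


class RunHistory:
    """Hypothetical in-memory log with one record per automation run."""

    def __init__(self) -> None:
        self._runs: List[RunRecord] = []

    def record(self, run_at: datetime, project_versions: Dict[str, int],
               dataset_versions: Dict[str, int]) -> RunRecord:
        # Copy the dicts so later changes by the caller cannot alter history.
        rec = RunRecord(run_at, dict(project_versions), dict(dataset_versions),
                        result_version=len(self._runs) + 1)
        self._runs.append(rec)
        return rec

    def latest_as_of(self, moment: datetime) -> Optional[RunRecord]:
        """Return the most recent run at or before `moment`, if any."""
        candidates = [r for r in self._runs if r.run_at <= moment]
        return max(candidates, key=lambda r: r.run_at, default=None)


history = RunHistory()
history.record(datetime(2019, 3, 31), {"customer_master": 4}, {"accounts": 11})
history.record(datetime(2019, 6, 30), {"customer_master": 5}, {"accounts": 12})

q1 = history.latest_as_of(datetime(2019, 4, 15))
print(q1.project_versions, q1.dataset_versions, q1.result_version)
# {'customer_master': 4} {'accounts': 11} 1
```

Because each run stores the exact versions it used, comparing two records is enough to see what changed between reporting periods.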

4) Integration with Other Data Processes

In enterprises where an orchestration engine already exists, Paxata’s Intelligent Automation can be invoked from that engine using REST APIs. This lets data preparation flows run as part of a larger data architecture. The same automated recursive map ensures that all the dependencies are sequenced appropriately for every run.
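For a sense of what such an invocation might look like from an orchestrator task, here is a hedged Python sketch using only the standard library. The host, endpoint path, payload field, and bearer-token scheme are all placeholders invented for illustration; the real routes and authentication should be taken from the product's REST API documentation.

```python
import json
import urllib.request


def build_run_request(base_url: str, project_id: str,
                      token: str) -> urllib.request.Request:
    """Build (but do not send) a POST that asks for a project to be run."""
    # Hypothetical endpoint path and payload shape, not a documented API.
    url = f"{base_url.rstrip('/')}/automation/runs"
    body = json.dumps({"projectId": project_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_run_request("https://paxata.example.com/api",
                            "customer_master", "TOKEN")
print(request.get_method(), request.full_url)
# POST https://paxata.example.com/api/automation/runs

# An orchestrator task would then send it, for example:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(response.status)
```

Keeping request construction separate from sending makes the call easy to unit test inside whatever scheduler or workflow engine triggers it.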

Many of these points reinforce the core principles of DataOps, ensuring the orchestration of data to production use cases and the agile deployment of new changes. To scale data preparation across the enterprise, one can either hire and assemble a team of DataOps professionals or take advantage of intelligent algorithms that facilitate the same process with little or no DataOps oversight. The choice, I think you’ll agree, is an easy one.

About the author

Value-Driven AI

DataRobot is the leader in Value-Driven AI – a unique and collaborative approach to AI that combines our open AI platform, deep AI expertise and broad use-case implementation to improve how customers run, grow and optimize their business. The DataRobot AI Platform is the only complete AI lifecycle platform that interoperates with your existing investments in data, applications and business processes, and can be deployed on-prem or in any cloud environment. DataRobot and our partners have a decade of world-class AI expertise collaborating with AI teams (data scientists, business and IT), removing common blockers and developing best practices to successfully navigate projects that result in faster time to value, increased revenue and reduced costs. DataRobot customers include 40% of the Fortune 50, 8 of top 10 US banks, 7 of the top 10 pharmaceutical companies, 7 of the top 10 telcos, 5 of top 10 global manufacturers.
