Why Data Analytics Should Start with Self-Service Data Prep

Analytic Innovation – Reimagining Data to Insight

January 9, 2019

As Co-Founder and SVP of Strategic Alliances, I have been fortunate to be involved in most of our customer implementations from day one. I am often asked about customer use cases for data preparation and what it takes to succeed in these types of projects. Two things have become clear to me. First, the customers achieving the most transformative business outcomes are not those who use self-service data prep to replace older ETL processes, but those who are reimagining the entire data-to-insights value chain. Second, analytic innovation needs to be a strategic initiative with support at the highest level of the organization, as Michael Gorriz, Group CIO of Standard Chartered Bank, explains:

“Joining the dots across different data sources and extracting value out of data is an important element of the Bank’s strategy as a diverse global bank. Data Prep is helping us transform raw data into an insightful information fabric and this can help increase our speed of decision-making. This investment is an important step in building our capabilities as a data-driven company where employees in all functions are able to harness data at their fingertips.”

While this point of view might seem obvious, statistics bear out that top-performing companies proactively embrace modern data management and analytical processes, while laggards are slow to adopt them. For example, a recent report from Gartner, titled Applied Infonomics: Seven Steps to Monetize Available Information Assets (Nov. 2018) [1], shows that top performers are more than twice as likely as typical performers to have monetized information assets, and eight times more likely than trailing organizations. In my experience, many of the companies that struggle get stuck modernizing individual pieces of technology instead of rethinking the entire value chain.

The Failure of Business Intelligence

Business Intelligence (BI) has grown in popularity over the years, but it has excelled only at answering the same, repeatedly asked questions. Moreover, the processes around BI center on carefully curating data into rigidly modeled structures to get perfect answers to old questions. Changing the questions requires massive rework by IT developers to load more or different data into a newly modeled schema, which can take weeks or months.

There are two challenges with this traditional approach:

  1. How many of us operate within a business or competitive landscape that is static or slow moving?
  2. Simply answering the same question every day means we are exclusively focusing on optimizing that one aspect of the business, while ignoring the other areas that could also be improved.

Reimagining Analytics

Reimagining your analytical value chains and processes means you have to start with the actionable part of the word: IMAGINE. What are the new questions to ask?

For example, is there a way to know exactly when our new users are using our product and for how long? Is this different from our less loyal customers? To answer this type of question, you need a data architecture that is built for exploration and discovery, rather than one that was built to answer a few canned questions.
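To make the example question concrete, here is a minimal Python sketch of how it could be explored once the session data is accessible. The session log, the segment labels (`new`, `less_loyal`), and the `usage_by_segment` helper are all hypothetical and exist only for illustration; they do not describe any particular product.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical session log: (user segment, session start, session end).
SESSIONS = [
    ("new", "2019-01-07 09:05", "2019-01-07 09:35"),
    ("new", "2019-01-07 09:40", "2019-01-07 09:50"),
    ("new", "2019-01-07 20:10", "2019-01-07 20:30"),
    ("less_loyal", "2019-01-07 12:00", "2019-01-07 13:00"),
    ("less_loyal", "2019-01-07 12:30", "2019-01-07 13:10"),
]

FMT = "%Y-%m-%d %H:%M"


def usage_by_segment(sessions):
    """Return the busiest start hour and average session length per segment."""
    starts = defaultdict(lambda: defaultdict(int))  # segment -> hour -> count
    durations = defaultdict(list)                   # segment -> minutes per session
    for segment, start, end in sessions:
        t0 = datetime.strptime(start, FMT)
        t1 = datetime.strptime(end, FMT)
        starts[segment][t0.hour] += 1
        durations[segment].append((t1 - t0).total_seconds() / 60)
    return {
        segment: {
            "peak_hour": max(starts[segment], key=starts[segment].get),
            "avg_minutes": mean(durations[segment]),
        }
        for segment in durations
    }


summary = usage_by_segment(SESSIONS)
for segment, stats in summary.items():
    print(segment, stats)
```

The point is not the code itself but how quickly the question can change: swapping the segment definition or the time grain is a small edit here, whereas in a rigidly modeled schema it would mean a new load and a new model.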

There are three tenets for success with your modern analytics processes:

  1. Put business people in the driver's seat with self-service data preparation. Business analysts need to be able to easily find, profile, standardize, and enrich data that they can then use in their BI tools or predictive models. This requires toolsets designed for business users, because traditional developer tools are too complex to learn. Forcing analysts to keep asking IT for data will keep you in the old world.
  2. Build out your data architecture to support your enterprise's use of all your data. A modern data architecture needs to run in a multi-cloud hybrid environment, and it needs to support processing data at any scale. It also needs to let you spin up resources quickly for a new project and tear them down once the project is complete. And you need to be able to manage, secure, and govern all of the data and data processes across your entire organization.
  3. Embrace a culture of collaboration across business, IT, and data scientists. Business must take the lead, because it knows the business context and the meaning of the data. IT understands the legacy data technologies and how to access and manipulate the data. The data scientists possess the statistical and mathematical knowledge to help you determine the best algorithms for obtaining the insights you seek. Without all three, your data teams will not deliver agility and insight at enterprise scale.
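
The profile, standardize, and enrich steps named in the first tenet can be sketched in a few lines of Python. This is only an illustration of what those steps mean, not a description of how any data prep product implements them. The sample rows, the alias and region lookup tables, and the function names are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical raw customer rows, as an analyst might receive them.
RAW = [
    {"customer": " Acme Corp ", "country": "us", "revenue": "1200"},
    {"customer": "Globex", "country": "USA", "revenue": ""},
    {"customer": "Initech", "country": "de", "revenue": "950"},
]

# Hypothetical lookup tables used to standardize and enrich.
COUNTRY_ALIASES = {"us": "US", "usa": "US", "de": "DE"}
REGION_BY_COUNTRY = {"US": "Americas", "DE": "EMEA"}


def profile(rows):
    """Profile: count empty values per column."""
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if not value.strip():
                missing[column] += 1
    return dict(missing)


def standardize(row):
    """Standardize: trim names, map country spellings to one code, parse revenue."""
    country = COUNTRY_ALIASES.get(row["country"].strip().lower(), "UNKNOWN")
    revenue = float(row["revenue"]) if row["revenue"].strip() else None
    return {"customer": row["customer"].strip(), "country": country, "revenue": revenue}


def enrich(row):
    """Enrich: add a region column derived from the standardized country."""
    return {**row, "region": REGION_BY_COUNTRY.get(row["country"], "UNKNOWN")}


quality_report = profile(RAW)
prepared = [enrich(standardize(row)) for row in RAW]

print(quality_report)
for row in prepared:
    print(row)
```

Self-service tooling aims to let analysts perform these same operations without writing or maintaining code like this.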

Closing thoughts

If you are interested in finding out more about how Data Prep can help reimagine your data to insights process, please contact us. We regularly work with enterprises globally to help them discover use cases where self-service data preparation can bring dramatic changes.

  1. Gartner, Applied Infonomics: Seven Steps to Monetize Available Information Assets. Analysts: Douglas Laney, Alan D. Duncan, Lydia Clougherty Jones, Mike Rollings. November 21, 2018.