Introducing Automated Time Series Anomaly Detection

June 18, 2020
by Fareya Ikram

The DataRobot Automated Time Series product has traditionally been built on a supervised machine learning workflow, in which users forecast future events by specifying a target variable to train on. However, there are cases where we want to infer information from time series data without knowing the target, such as detecting a faulty sensor in a machine or unusually high network activity on a smart home device. To detect anomalous events like these, we need to look at the dataset holistically, because anomalies can occur anywhere.

In Release 6.1 on DataRobot, we introduce Time Series Anomaly Detection, a fully unsupervised machine learning workflow that allows users to detect anomalies without specifying a target variable. 

Types of Anomalies

As you might imagine, anomalies can occur in different forms. We may have a single spike on a flat region like this:

[Figure: Time series data]

We also see clustered sine waves as follows:

[Figure: Clustered sine waves]

Or several different data types layered on top of one another:

[Figure: Layered data types]
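
To make the first case concrete, here is a minimal Python sketch of a single spike on a flat region, scored with a robust z-score (distance from the median in units of the median absolute deviation). This is a generic textbook technique chosen for illustration, not the algorithm behind DataRobot’s blueprints. The data, the function name `robust_z_scores`, and the threshold of 6.0 are all assumptions.

```python
import statistics

def robust_z_scores(values):
    """Score each point by its distance from the median, in units of MAD."""
    median = statistics.median(values)
    abs_dev = [abs(v - median) for v in values]
    mad = statistics.median(abs_dev)
    # Guard against a perfectly flat series, where the MAD is zero.
    scale = mad if mad > 0 else 1.0
    return [d / scale for d in abs_dev]

# Hypothetical data: a nearly flat signal around 10.0 with one spike at index 30.
series = [10.0 + 0.1 * ((i * 7) % 5 - 2) for i in range(60)]
series[30] = 25.0

scores = robust_z_scores(series)

# Illustrative threshold: flag points more than 6 MADs away from the median.
flagged = [i for i, s in enumerate(scores) if s > 6.0]
print(flagged)
```

A simple rule like this handles an isolated spike, but it would not be expected to cope with the clustered or layered patterns shown above, which is where a broader set of detection algorithms becomes useful.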

With DataRobot’s Anomaly Detection for Time Series, we have a new set of blueprints that leverage leading anomaly detection algorithms and are designed to detect a wide array of anomaly types, such as these, right out of the box.

Using Time Series Anomaly Detection

A core belief of DataRobot is that our products should help accelerate productivity for your data scientists and even help democratize data science for non-data scientists, such as business analysts. Time Series Anomaly Detection is no exception. We designed the UI to be as familiar and easy to use as any of our other products. 

To get started, you follow a few basic steps:

[Figure: Choose a prediction target]
  1. Upload your dataset to AI Catalog or directly to your project as usual. On the Autopilot screen, select the “No target?” option.
  2. Next, click through the “Set up time aware modeling” options as usual. You do not have to choose a forecast window for anomaly detection, because anomalies are detected in real time. Click “Start” to begin Autopilot.
  3. Once Autopilot has completed, you will see that models are ranked by “Synthetic AUC”. This metric is generated by binning the most common and the least common values to synthetically label points in time as anomalies. These labels are then used to compute the synthetic AUC for the model.
     [Figure: Leaderboard]
  4. You can also upload a partial or full dataset with labeled anomalies to generate the actual AUC metric. To do so, select a model, click the Predict tab, and upload a dataset with the labels.
  5. Then select “Forecast Range Predictions” and enter the label column name. Click the compute predictions button as shown below:
     [Figure: Forecast range prediction]
  6. Once the predictions are computed, go to the menu and click “Show external test column.” The metric changes from “Synthetic AUC” to “AUC” as follows:
     [Figure: External test column]
  7. You can further investigate anomalies under Evaluate > Anomaly over Time. This feature allows you to flip through different series and backtests to see when anomalies occurred. Additionally, each anomaly is scored between 0 and 1 to indicate how confident the model is that an anomaly occurred at that point in time.
     [Figure: Anomaly over time chart]
  8. As with standard Automated Time Series functionality, you can also create a min or max blender model. For anomaly detection, a max blender takes the highest anomaly score among its component models at each point, so it surfaces the widest set of candidate anomalies. This makes it especially useful for users with a higher tolerance for false positives.
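
The scoring and blending ideas in the steps above can be sketched in a few lines of Python. This post does not spell out DataRobot’s exact procedure, so treat the following as an illustration under stated assumptions rather than the product’s implementation: equal-width bins, a bin treated as “least common” when it holds less than a chosen share of the points, and hypothetical model scores. The names `synthetic_labels`, `auc`, and `blend` are invented for this example.

```python
def synthetic_labels(values, n_bins=5, rare_share=0.2):
    """Label a point 1 (anomaly) if it falls in a sparsely populated bin.

    Assumption: equal-width bins, and a bin counts as "least common" when it
    holds less than `rare_share` of all points. Both choices are illustrative.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # flat series: every point lands in bin 0
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    counts = {b: bins.count(b) for b in set(bins)}
    return [1 if counts[b] / len(values) < rare_share else 0 for b in bins]

def auc(labels, scores):
    """Chance that a random anomaly outscores a random normal point (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one point of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def blend(score_lists, how=max):
    """Combine per-model scores point by point; pass how=min for a min blender."""
    return [how(point_scores) for point_scores in zip(*score_lists)]

# Hypothetical series with one spike (index 4) and one dip (index 8).
values = [10.0, 10.1, 9.9, 10.0, 25.0, 10.1, 9.9, 10.0, -5.0, 10.0]
labels = synthetic_labels(values)

# Hypothetical anomaly scores from two models, each catching only one event.
scores_a = [0.1, 0.2, 0.1, 0.1, 0.9, 0.2, 0.1, 0.1, 0.1, 0.1]
scores_b = [0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.2, 0.1, 0.8, 0.1]
scores_max = blend([scores_a, scores_b], how=max)

print(labels)
print(auc(labels, scores_a), auc(labels, scores_b), auc(labels, scores_max))
```

In this toy case each model ranks only one of the two synthetic anomalies highly, while the max blend ranks both above every normal point. A max blend can only raise scores, however, so at a fixed threshold it flags at least as many points as any single model, which is the trade-off against false positives noted in the last step. Replacing `labels` with real labeled anomalies gives the ordinary AUC instead of the synthetic one.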

In a few easy steps, and assuming you are happy with your model, you are ready to deploy it to detect new anomalies in real time. You can do this in all of the usual ways that you are familiar with from the standard Automated Time Series product.

Get Started with Anomaly Detection Today

At DataRobot, we are proud to bring Anomaly Detection for Automated Time Series to the market. We encourage you to explore this new functionality. You can contact us directly if you are interested.

For existing customers, Anomaly Detection is included with your DataRobot Automated Time Series license. We’ve also included videos in the DataRobot Community to show you these capabilities in more detail and to help you build your first few models. Anomaly Detection is generally available today. So check it out!

About the author
Fareya Ikram

Software Engineer at DataRobot

Fareya works as a software engineer for the Automated Time Series product at DataRobot. She joined DataRobot in February after graduating with a degree in computer science from Worcester Polytechnic Institute. Fareya enjoys working with the product and marketing teams, and learning from customer feedback.
