What are Eureqa Models?
Eureqa models are a powerful addition to the DataRobot Automated Time Series product. They help you find human-readable mathematical formulas that explain the patterns in your data, so you can build accurate, transparent, and actionable models for real-world problems much faster than with other approaches.
AI That Applies the Laws of Physics to Solve Real-World Problems
Eureqa algorithms were designed by DataRobot’s Chief Scientist, Michael Schmidt, back in 2007.
His idea was to develop a genetic algorithm that fits different analytic expressions to training data and returns the best mathematical formula as a machine learning model. This is a fundamentally different approach from traditional supervised machine learning models such as tree-based models, regression, or deep learning. The approach has since been cited in over 800 peer-reviewed publications and used in applications ranging from finance to neuroscience.
In essence, Eureqa models are trained just like any other supervised machine learning model. You provide the algorithm with labeled training data representing historical information, and the algorithm fits an analytic expression to that training data.
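To make the idea of fitting an analytic expression to labeled data concrete, here is a minimal sketch of an evolutionary formula search in plain Python. It is a toy illustration, not Eureqa's actual algorithm: the expression encoding, the mutation rule, the per-node penalty, and every function name (`evaluate`, `mutate`, `score`, `evolve`) are assumptions made for this example.

```python
import math
import random

# Expressions are nested tuples: ("x",), ("const", c), or (op, left, right).
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}
SYMBOLS = {"add": "+", "sub": "-", "mul": "*"}


def evaluate(expr, x):
    """Compute the value of an expression tree at input x."""
    if expr[0] == "x":
        return x
    if expr[0] == "const":
        return expr[1]
    return OPS[expr[0]](evaluate(expr[1], x), evaluate(expr[2], x))


def complexity(expr):
    """Count the nodes in the tree; smaller formulas are easier to read."""
    if expr[0] in ("x", "const"):
        return 1
    return 1 + complexity(expr[1]) + complexity(expr[2])


def to_string(expr):
    """Render the tree as a human-readable formula."""
    if expr[0] == "x":
        return "x"
    if expr[0] == "const":
        return f"{expr[1]:g}"
    return f"({to_string(expr[1])} {SYMBOLS[expr[0]]} {to_string(expr[2])})"


def random_expr(rng, depth=2):
    """Grow a random expression tree up to the given depth."""
    if depth == 0 or rng.random() < 0.3:
        leaf_is_x = rng.random() < 0.5
        return ("x",) if leaf_is_x else ("const", float(rng.randint(-3, 3)))
    op = rng.choice(list(OPS))
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))


def mutate(expr, rng):
    """Replace a randomly chosen subtree with a fresh random one."""
    if expr[0] in ("x", "const") or rng.random() < 0.3:
        return random_expr(rng)
    if rng.random() < 0.5:
        return (expr[0], mutate(expr[1], rng), expr[2])
    return (expr[0], expr[1], mutate(expr[2], rng))


def score(expr, data, penalty=0.01):
    """Mean squared error plus a small charge per node (lower is better)."""
    total = 0.0
    for x, y in data:
        diff = evaluate(expr, x) - y
        total += diff * diff
    mse = total / len(data)
    if not math.isfinite(mse):
        return float("inf")  # discard formulas that overflow
    return mse + penalty * complexity(expr)


def evolve(data, generations=200, pop_size=50, seed=0):
    """Keep the best half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    population = [random_expr(rng) for _ in range(pop_size)]
    history = []  # best score seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda e: score(e, data))
        history.append(score(population[0], data))
        survivors = population[: pop_size // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    return min(population, key=lambda e: score(e, data)), history


# Labeled training data generated from the hidden formula y = x*x + 2.
data = [(float(x), float(x * x + 2)) for x in range(-5, 6)]
best, history = evolve(data)
print(to_string(best), score(best, data))
```

The result is a formula you can read directly rather than a set of opaque weights. Because the best candidates always survive to the next generation, the best score never gets worse as the search proceeds.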
Eureqa is the world’s most successful and well-known algorithm for discovering mathematical relationships in data, with thousands of research publications citing Eureqa in their results. DataRobot contains multiple blueprints that use Eureqa to build predictive models for time series, regression, and classification problems.
Transparent and Extendable
One of the reasons customers love Eureqa models is that the algorithms return human-readable, interpretable analytical expressions, which subject-matter experts can quickly review.
Experts can easily incorporate their domain knowledge too. For example, suppose you know the underlying relationships in the system that you are modeling. In this case, you can give Eureqa a hint, such as the formula for heat transfer or how house prices work in a particular neighborhood. Experts provide known relationships as building blocks or as a starting point to learn from. Eureqa will then incorporate these hints and work from there.
Less Complexity. Better Results.
Eureqa is particularly good at feature selection because it is forced to reduce complexity during the model building process. For example, if the data has 20 different columns used to predict the target variable, the search for a simple expression will result in an expression that uses only the strongest predictors.
It also works very well on small datasets. That’s why Eureqa models are popular with scientific researchers who gather data from physical experiments that don’t produce massive amounts of data.
Smart Feature Selection
Eureqa performs its own specialized feature engineering. Within the given complexity constraints, Eureqa blueprints optimize for the simplest possible model, which limits the number of features a model can use. As a result, Eureqa is effective at identifying only the most relevant features from the start.
As with LASSO, feature selection is embedded in the objective that the algorithm optimizes, rather than being a separate preprocessing step.
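For reference, the first line below is the standard LASSO objective, where the L1 penalty pushes the coefficients of weak features to exactly zero. The second line is only an illustrative analogy: it assumes a single penalized objective over formulas f with a hypothetical complexity measure C(f), which this article does not state as Eureqa's exact objective.

```latex
\hat{\beta} = \arg\min_{\beta} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1

\hat{f} = \arg\min_{f} \; \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2 + \lambda \, C(f)
```

In both cases, a larger value of \lambda trades a little accuracy for a model that uses fewer features.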
Eureqa tests millions of combinations of features in different formulas with limited (or penalized) complexity, and is very effective at reducing the number of features used in the final model.
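The effect of a complexity penalty on feature selection can be sketched with a much smaller search. The example below is an illustration under stated assumptions, not Eureqa's implementation: it only tries linear formulas, it scores each feature subset as least-squares error plus a hypothetical `penalty` per feature, and the helper names (`solve`, `fit_mse`, `best_subset`) are invented here. The data is a small designed experiment with settings of +1 and -1, chosen so the result is easy to check by hand.

```python
from itertools import combinations


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (the chosen columns are independent).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mse(rows, y, subset):
    """Least-squares fit of y on the chosen columns; return mean squared error."""
    if not subset:
        return sum(v * v for v in y) / len(y)
    cols = [[row[j] for row in rows] for j in subset]
    gram = [[sum(p * q for p, q in zip(u, v)) for v in cols] for u in cols]
    rhs = [sum(p * q for p, q in zip(u, y)) for u in cols]
    beta = solve(gram, rhs)
    preds = [sum(b * row[j] for b, j in zip(beta, subset)) for row in rows]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


def best_subset(rows, y, penalty):
    """Try every feature subset; score = error + penalty * number of features."""
    n_features = len(rows[0])
    candidates = [
        s for k in range(n_features + 1) for s in combinations(range(n_features), k)
    ]
    return min(candidates, key=lambda s: fit_mse(rows, y, s) + penalty * len(s))


# Eight runs of a designed experiment with four +1/-1 features.
rows = [
    (1, 1, 1, 1),
    (-1, 1, -1, 1),
    (1, -1, -1, 1),
    (-1, -1, 1, 1),
    (1, 1, 1, -1),
    (-1, 1, -1, -1),
    (1, -1, -1, -1),
    (-1, -1, 1, -1),
]
# Features 0 and 2 are strong, feature 1 is weak, feature 3 is irrelevant.
y = [2 * a + 0.2 * b - 3 * c for a, b, c, d in rows]

print(best_subset(rows, y, penalty=0.1))  # (0, 2)
```

With the penalty set to 0.1, the weak feature 1 reduces the error by only 0.04, which is less than the cost of adding it, so the search keeps just the two strong predictors. With no penalty, the search also keeps the weak feature, because any extra feature can only lower the training error.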