How to Formulate a Machine Learning Question
Machine learning uses data to create a model that addresses a business question you want answered. You first need to understand the problem you want to solve. The format of your question influences what algorithm is used to solve the problem.
For example, say you are an e-commerce marketing manager and you want to run an email campaign to increase the number of products sold to past customers. You can ask different questions to determine your email campaign strategy, and the kind of answer each question calls for indicates the type of machine learning problem. I will give hypothetical question examples for classification, regression, time series, natural language processing, and anomaly detection problems.
Classification
The answer to your question about the email campaign may be categorical:
- “Based on past customer email data, should I email this customer?” Answers to this question would be either “yes” or “no.” Use the answer to determine email recipients.
- “Based on past purchasing patterns, what type of buyer group should the customer be segmented into?” Answers might fall into categories such as “high spender” and “low spender.”
These questions have categorical answers, which makes them classification problems.
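To make the “should I email this customer?” question concrete, here is a minimal sketch of a classifier. Everything in it is an assumption for illustration: the two features (emails opened and purchases made last quarter), the six made-up customers, and the choice of a simple nearest-neighbor vote. A real project would use far more data and a properly evaluated model.

```python
from collections import Counter
import math

# Hypothetical history: (emails opened, purchases) last quarter, and whether
# the customer bought after the previous campaign. All values are made up.
history = [
    ((0, 0), "no"),
    ((1, 0), "no"),
    ((2, 1), "no"),
    ((5, 2), "yes"),
    ((6, 3), "yes"),
    ((8, 4), "yes"),
]

def should_email(customer, k=3):
    """Return "yes" or "no" by majority vote of the k most similar past customers."""
    nearest = sorted(history, key=lambda row: math.dist(row[0], customer))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(should_email((7, 3)))  # an engaged customer
print(should_email((1, 1)))  # a mostly inactive customer
```

The point is the shape of the output: the model returns one of a fixed set of labels, which is what makes this a classification problem.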
Regression
The answer to your question may be numeric:
- “Based on past items per shopping cart, how many items per shopping cart can I expect from this customer?” Use the predicted items per cart to target customers for the email campaign.
- “Based on past transaction amounts, what transaction amount (in dollars) can I expect from this customer?” Use the predicted transaction amount to target customers.
These questions have numeric answers and can be considered regression problems.
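As a rough illustration of a numeric answer, the sketch below fits a straight line with ordinary least squares to predict a transaction amount. The single feature (number of past orders) and the deliberately tidy numbers are assumptions made for this example, not data from a real store.

```python
from statistics import mean

# Hypothetical history: number of past orders and the dollar amount of the
# most recent transaction for five customers. Values are made up.
past_orders = [1, 2, 3, 4, 5]
transaction_amounts = [20.0, 30.0, 40.0, 50.0, 60.0]

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar

slope, intercept = fit_line(past_orders, transaction_amounts)

def predict_amount(n_orders):
    """Predicted transaction amount in dollars for a customer with n_orders past orders."""
    return slope * n_orders + intercept

print(predict_amount(6))  # 70.0 for this toy data
```

Unlike the classification case, the output here is a number on a continuous scale, so you can rank customers by predicted amount rather than sort them into groups.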
Time series
The question you’re asking may have an answer that changes over time:
- “When is the best date and time to send the email?” You would predict email open rates over time by date and hour of the day. Use the time when open rate is predicted to be the highest.
- “If I don’t send the email campaign, what will website traffic be?” You would predict what website traffic would have been without the campaign, then compare it with actual traffic to measure the campaign’s impact and decide whether it was worth it.
When there is a relationship between your target and time, you are typically dealing with a time series problem.
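For the “best time to send” question, one very simple baseline is to average past open rates by hour of day and pick the highest. The sketch below does only that. The log of send hours and open rates is invented, and averaging by hour is a naive stand-in for a real time series model, which would also account for trend, day of week, and holidays.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical log of past campaigns: (hour of day the email was sent, open rate).
open_rate_log = [
    (9, 0.21), (9, 0.25),
    (13, 0.18), (13, 0.16),
    (20, 0.30), (20, 0.28),
]

def best_send_hour(log):
    """Average the open rate per hour and return (best hour, per-hour forecast)."""
    by_hour = defaultdict(list)
    for hour, rate in log:
        by_hour[hour].append(rate)
    forecast = {hour: mean(rates) for hour, rates in by_hour.items()}
    return max(forecast, key=forecast.get), forecast

hour, forecast = best_send_hour(open_rate_log)
print(hour)  # 20 for this toy log
```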
Natural language processing
The answer to your question could have a language component:
- “What keywords and content should I include in the email?” You could use natural language processing to analyze customer reviews to determine whether the sentiment is positive or negative and get ideas for email content.
- “What do customers like about product x?” You could use natural language processing to analyze specific product reviews to decide what attributes to market in the email campaign.
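A minimal sketch of the review analysis described above might score each review against small word lists and then count which positive words come up most. The word lists and the sample reviews are assumptions made up for this example. Production sentiment analysis normally relies on trained language models rather than hand-written lexicons.

```python
import re
from collections import Counter

# Hypothetical, hand-picked word lists; a real lexicon would be much larger.
POSITIVE = {"love", "great", "comfortable", "fast"}
NEGATIVE = {"broke", "slow", "poor", "refund"}

def tokenize(review):
    """Lowercase the review and split it into words."""
    return re.findall(r"[a-z']+", review.lower())

def sentiment(review):
    """Label a review by counting positive words minus negative words."""
    words = tokenize(review)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def praised_terms(reviews):
    """Count positive words across positive reviews, as ideas for email copy."""
    counts = Counter()
    for review in reviews:
        if sentiment(review) == "positive":
            counts.update(w for w in tokenize(review) if w in POSITIVE)
    return counts

reviews = [
    "Love the fit, great and comfortable!",
    "It broke in a week and shipping was slow.",
    "Great price and fast delivery.",
]
print([sentiment(r) for r in reviews])
print(praised_terms(reviews).most_common(2))
```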
Anomaly detection
The answer to your question may require you to distinguish between “normal” and “anomalous” observations:
- “Is the customer review from a bot account?” You could answer this question with anomaly detection.
- “Is this email address fake?” You could also answer this question with anomaly detection.
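To illustrate the bot-account question, the sketch below flags accounts whose review volume is far from the typical value, using a modified z-score based on the median and the median absolute deviation. The account names and counts are hypothetical. The 0.6745 scaling constant and the 3.5 cutoff are a common rule of thumb for this score, not a setting taken from this article.

```python
from statistics import median

# Hypothetical number of reviews posted per day by each account.
reviews_per_day = {
    "acct_a": 1, "acct_b": 2, "acct_c": 1,
    "acct_d": 3, "acct_e": 2, "acct_f": 40,
}

def flag_anomalies(counts, threshold=3.5):
    """Return the accounts whose modified z-score exceeds the threshold."""
    values = list(counts.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return []  # no spread in the data, so nothing can be scored
    return [
        name for name, v in counts.items()
        if 0.6745 * abs(v - med) / mad > threshold
    ]

print(flag_anomalies(reviews_per_day))  # ['acct_f'] for this toy data
```

Note that nobody labeled any account as a bot in advance. The method only learns what “normal” looks like and reports what falls outside it, which is the defining trait of anomaly detection.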
Not surprisingly, a “lack of clear question to answer” appeared as a major barrier for data scientists in Kaggle’s 2017 State of Data Science survey. Involving different parts of the business can help you evaluate machine learning opportunities more thoroughly, from all angles. Once you have formulated the question you want answered, make sure your data is relevant to the problem and ready for machine learning algorithms.