Multiclass Classification

What does Multiclass Classification mean?

Classification problems come in two main types: binary and multiclass. In binary classification, each record belongs to one of two classes; in multiclass classification, each record belongs to exactly one of three or more classes. The algorithm’s goal is to learn a function that, given a new data point, correctly identifies the class into which it falls.
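To make the idea of "a function that maps a data point to one of several classes" concrete, here is a minimal sketch in plain Python. It uses a nearest-centroid rule, chosen only because it is short; it is not the method any particular platform uses. The training points, the class names "A", "B", and "C", and the helper names `fit_centroids` and `predict` are all made up for illustration.

```python
from collections import defaultdict
import math


def fit_centroids(points, labels):
    """Learn one centroid (mean point) per class from labeled training data."""
    sums = {}
    counts = defaultdict(int)
    for point, label in zip(points, labels):
        if label not in sums:
            sums[label] = [0.0] * len(point)
        for i, value in enumerate(point):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, point):
    """Assign the point to the class whose centroid is closest."""
    return min(centroids, key=lambda label: math.dist(centroids[label], point))


# Made-up two-feature training data with three classes (illustrative only).
train_points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8), (9.0, 1.0), (8.8, 1.2)]
train_labels = ["A", "A", "B", "B", "C", "C"]

centroids = fit_centroids(train_points, train_labels)
prediction = predict(centroids, (4.9, 5.1))
print(prediction)
```

The learned function always returns exactly one of the classes seen in training, which is the defining property of a multiclass classifier.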

For example, a multiclass algorithm can determine which parental guideline rating a movie is likely to receive – “PG,” “TV-14,” “R,” “G,” etc. – based on patterns it learns from this sample movie dataset:

[Image: sample movie dataset]

The movies have also been tagged with descriptions – “General Audience,” “Suitable for all ages,” etc. Tagging differs from rating in that each movie receives exactly one rating but can be described by one or more of the tag categories. Predicting a set of tags rather than a single class is known as a multi-label classification problem.
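The difference between the two problem types shows up in how the target is represented. The sketch below is illustrative only: the three movie records are hypothetical, and the tag "Mature themes" is invented for the example (only the first two tags appear in the text above). A multiclass target holds one class index per record, while a multi-label target holds one 0/1 indicator per tag.

```python
# Class lists. The ratings come from the text; "Mature themes" is a
# hypothetical tag added so that every example movie has at least one tag.
RATINGS = ["G", "PG", "R", "TV-14"]
TAGS = ["General Audience", "Suitable for all ages", "Mature themes"]

# Hypothetical records: exactly one rating each, one or more tags each.
movies = [
    {"title": "Movie 1", "rating": "G", "tags": {"General Audience", "Suitable for all ages"}},
    {"title": "Movie 2", "rating": "PG", "tags": {"General Audience"}},
    {"title": "Movie 3", "rating": "R", "tags": {"Mature themes"}},
]

# Multiclass target: a single class index per movie.
multiclass_target = [RATINGS.index(m["rating"]) for m in movies]

# Multi-label target: one 0/1 indicator per tag, several of which may be 1.
multilabel_target = [[int(tag in m["tags"]) for tag in TAGS] for m in movies]

print(multiclass_target)
print(multilabel_target)
```

Each row of the multi-label target can contain several ones, whereas the multiclass target never assigns more than one class to a record.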

Why is Multiclass Classification important?

Multiclass classification broadens the range of practical business problems that analysts can address with machine learning, because many decisions involve more than two possible outcomes. For example, it enables a business to predict which of several products a customer will purchase next, allowing it to estimate expected revenue and adjust business practices and resources accordingly.

DataRobot + Multiclass

The DataRobot automated machine learning platform automatically selects the appropriate classification technique for your target variable and runs a wide array of classification algorithms on your data. The feature impact functionality then shows which inputs are most important for determining classes, and the confusion matrix shows how accurate each algorithm is for your dataset.
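For readers unfamiliar with confusion matrices, the following plain-Python sketch shows how one is built and how overall accuracy can be read from it. It does not use or represent the DataRobot API; the helper names `confusion_matrix` and `accuracy` and the actual and predicted ratings are made up for illustration.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


def accuracy(matrix):
    """Share of records on the diagonal, i.e. predicted class equals actual class."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up actual and predicted ratings for eight movies (illustrative only).
classes = ["G", "PG", "R"]
actual = ["G", "G", "PG", "PG", "R", "R", "R", "PG"]
predicted = ["G", "PG", "PG", "PG", "R", "PG", "R", "G"]

matrix = confusion_matrix(actual, predicted, classes)
for label, row in zip(classes, matrix):
    print(label, row)
print("accuracy:", accuracy(matrix))
```

Off-diagonal cells show which classes are being confused with one another, which a single accuracy figure hides.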