 # Platt Scaling for Classification Models

January 24, 2020

This post was originally part of the DataRobot Community. Visit now to browse discussions and ask questions about DataRobot, AI Cloud, data science, and more.

This article introduces Platt Scaling, a popular calibration method. For many problems it is convenient to obtain a probability P(y=1|x), so that the classifier not only gives an answer but also a degree of certainty about that answer. However, some classification models (such as SVMs and decision trees) do not provide such a probability, or they provide poor probability estimates.

Platt Scaling amounts to training a logistic regression model on the classifier's outputs; it gives a way of transforming the outputs of a non-probabilistic classification model into a probability distribution over classes.

We will see an example where we train an SVM and then train the parameters of an additional sigmoid function to map the SVM outputs into probabilities.

`P(y = 1 | x) = 1 / (1 + exp(A·f(x) + B))`

Here f(x) is the raw SVM output, and A and B are the two scalar parameters learned by the algorithm.
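The mapping itself is just a two-parameter sigmoid. A minimal base-R sketch (the values of `A` and `B` below are illustrative placeholders, not fitted parameters; fitting them is what the rest of the post does):

```r
# Platt's sigmoid: maps a raw classifier score f(x) to P(y = 1 | x)
platt_prob <- function(f, A, B) {
  1 / (1 + exp(A * f + B))
}

# Illustrative parameters: a negative A makes larger raw scores
# map to higher probabilities of the positive class
platt_prob(f = 2, A = -1.5, B = 0.2)
```

Note that with A < 0 the function is increasing in f, which is the usual situation when larger SVM scores indicate the positive class.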

This idea was suggested in *Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods*, published in 1999 by John C. Platt.

### Packages Needed

• `caret`
• `kernlab`

### Functions Needed

1. `createDataPartition()`
2. `train()`
3. `predict()`

We will use `spam`, a Spam Email database that comes with the `kernlab` package.

Dataset description: a dataset collected at Hewlett-Packard Labs that classifies 4601 emails as spam or non-spam. In addition to this class label, there are 57 variables indicating the frequency of certain words and characters in the email.

The last column (i.e., variable 58) indicates the type of the mail and is either “nonspam” or “spam”, (i.e. unsolicited commercial email).

Load the `caret` and `kernlab` packages:

```r
library(caret)
library(kernlab)
```

Load the `spam` data:

```r
data(spam)
```

Create training and test sets:

```r
inTrain <- createDataPartition(y = spam$type, p = 0.75, list = FALSE) # creates test/training partitions; returns training set indices
training <- spam[inTrain,]  # training set
testing  <- spam[-inTrain,] # test set
dim(training)
```

```
##  3451   58
```

Fit predictive models over different tuning parameters:

```r
set.seed(32343) # for reproducibility of results
modelFit <- train(type ~ ., data = training, method = "svmLinear") # 'type' is the label; fit on the 'training' data
modelFit
```
```
## Support Vector Machines with Linear Kernel
##
## 3451 samples
##   57 predictors
##    2 classes: 'nonspam', 'spam'
##
## No pre-processing
## Resampling: Bootstrapped (25 reps)
##
## Summary of sample sizes: 3451, 3451, 3451, 3451, 3451, 3451, ...
##
## Resampling results
##
##   Accuracy  Kappa  Accuracy SD  Kappa SD
##   0.9       0.8    0.007        0.01
##
## Tuning parameter 'C' was held constant at a value of 1
```

### Final model using the best parameters

In the `trainControl()` statement, you must specify `classProbs = TRUE` for class probabilities to be returned.

```r
modelFit <- train(type ~ ., data = training, method = "svmLinear",
                  trControl = trainControl(method = "repeatedcv", repeats = 2,
                                           classProbs = TRUE))
modelFit$finalModel
```
```
## Support Vector Machine object of class "ksvm"
##
## SV type: C-svc  (classification)
##  parameter : cost C = 1
##
## Linear (vanilla) kernel function.
##
## Number of Support Vectors : 686
##
## Objective Function Value : -638
## Training error : 0.067806
## Probability model included.
```

### Make test data predictions

Note: with `type = "prob"`, the returned values are the class probabilities themselves. caret's `predict()` accepts only `"raw"` or `"prob"`; kernlab's native vote counts are not exposed through it.

```r
predictProbs <- predict(modelFit, newdata = testing, type = "prob")
head(predictProbs)
```

```
##    nonspam  spam
## 1 5.22e-05 1.000
## 2 3.07e-01 0.693
## 3 4.24e-01 0.576
## 4 8.45e-03 0.992
## 5 1.22e-01 0.878
## 6 7.43e-02 0.926
```

Train a logistic regression model on the SVM outputs (Platt recommends fitting the sigmoid on data not used to train the SVM; here the test set plays that role):

```r
labels <- testing$type
labels <- as.numeric(labels) - 1  # recode factor: nonspam = 0, spam = 1
processed_data <- data.frame(predictProbs[,2], labels)
LOGISTIC_model <- train(labels ~ ., data = processed_data, method = "glm", family = binomial(logit))
LOGISTIC_model$finalModel
```
```
##
## Call:  NULL
##
## Coefficients:
##       (Intercept)  predictProbs...2.
##             -3.78               8.76
##
## Degrees of Freedom: 1149 Total (i.e. Null);  1148 Residual
## Null Deviance:       1540
## Residual Deviance: 449   AIC: 453
```

Display the Logistic Regression model coefficients:

```r
LOGISTIC_model$finalModel$coefficients
```

```
##       (Intercept) predictProbs...2.
##             -3.78              8.76
```

A and B are now estimated. Note that `glm` places `-(intercept + slope·x)` inside the exponent, so relative to the sigmoid formula above the fitted values correspond to A = -8.76 and B = 3.78.
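To sanity-check the fit, we can plug the printed coefficients back into the sigmoid by hand. A minimal sketch using the rounded values from the output above (self-contained, so the coefficients are hard-coded rather than read from `LOGISTIC_model`):

```r
# Coefficients reported by the logistic regression (rounded)
b0 <- -3.78  # intercept
b1 <-  8.76  # slope on the SVM's spam probability

# Calibrated probability, in glm's parameterization:
# P(spam | p) = 1 / (1 + exp(-(b0 + b1 * p)))
calibrate <- function(p) 1 / (1 + exp(-(b0 + b1 * p)))

calibrate(c(0.05, 0.5, 0.95))
# e.g. a raw SVM score of 0.5 maps to 1 / (1 + exp(-0.6)), roughly 0.65
```

This matches what `predict(LOGISTIC_model, ...)` would compute, and shows how the calibration reshapes the SVM's scores: raw values near 0.5 are pulled toward a moderate spam probability rather than left ambiguous.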
