This post was originally part of the DataRobot Community.
Linear regression is a way to model the relationship between a response variable and one or more explanatory variables. The response is modeled as a linear function of the explanatory variables plus an error term, and the coefficients of that function are estimated from the data.
Package(s) needed: “scatterplot3d” (license: GPL-2)
Simple linear regression (only one explanatory variable)
“mtcars” is a built-in dataset of R that contains fuel consumption and other aspects of car design and performance for 32 cars.
data(mtcars)
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
Fitting the dataset with a simple linear regression where the response variable is “mpg” and the explanatory variable is “wt” (weight) and printing out information about the simple linear regression model.
slm <- lm(mpg ~ wt, data = mtcars)
summary(slm)
##
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -4.543 -2.365 -0.125  1.410  6.873
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   37.285      1.878   19.86  < 2e-16 ***
## wt            -5.344      0.559   -9.56  1.3e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.05 on 30 degrees of freedom
## Multiple R-squared: 0.753, Adjusted R-squared: 0.745
## F-statistic: 91.4 on 1 and 30 DF, p-value: 1.29e-10
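For a single explanatory variable, the least-squares estimates have a closed form: the slope is the covariance of the two variables divided by the variance of the explanatory variable, and the intercept makes the line pass through the point of means. The following is a minimal sketch that reproduces the coefficients above by hand. The names b0, b1, manual and fitted_coefs are introduced here for illustration only.

```r
# Refit the simple model so this block runs on its own
slm <- lm(mpg ~ wt, data = mtcars)

# Closed-form least-squares estimates for one explanatory variable
b1 <- cov(mtcars$wt, mtcars$mpg) / var(mtcars$wt)   # slope
b0 <- mean(mtcars$mpg) - b1 * mean(mtcars$wt)       # intercept

# Compare the hand-computed values with the ones lm() estimated
manual <- c(b0, b1)
fitted_coefs <- unname(coef(slm))
print(rbind(manual, fitted_coefs))
```

The two rows should agree up to floating-point error, which is a quick way to confirm what lm() is estimating in the one-variable case.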
Plotting a graph of “wt” vs. “mpg” and adding a line of best fit.
plot(x = mtcars$wt, y = mtcars$mpg, main = "Car Weight vs. Car MPG", xlab = "Weight", ylab = "MPG", col = "blue")
abline(slm, col = "red")
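The fitted line can also be used to estimate fuel consumption for weights that are not in the dataset. As a sketch, predict() evaluates the model at new values. The weights below (in units of 1000 lbs, as in “mtcars”) are arbitrary illustrative choices inside the observed range, and new_cars is a name introduced here.

```r
# Refit the simple model so this block runs on its own
slm <- lm(mpg ~ wt, data = mtcars)

# Hypothetical weights, in units of 1000 lbs as in mtcars
new_cars <- data.frame(wt = c(2.5, 3.0, 3.5))

# Point estimates plus 95% confidence intervals for the mean mpg
preds <- predict(slm, newdata = new_cars, interval = "confidence")
print(cbind(new_cars, preds))
```

Predictions for weights far outside the range seen in the data would rely on the linear trend continuing, which the model cannot confirm.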
Plotting a residuals vs. fitted graph. The line R draws on this graph is a smoothed trend of the residuals rather than a line of best fit. The residuals scatter around 0, except for a few outliers, which R labels.
plot(slm, 1)
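The labeled points on this graph are the observations with the largest absolute residuals. A short sketch for listing them directly, assuming the three most extreme points are of interest (res and top3 are names introduced here):

```r
# Refit the simple model so this block runs on its own
slm <- lm(mpg ~ wt, data = mtcars)

# Residual = observed mpg minus fitted mpg
res <- resid(slm)

# The three observations furthest from the fitted line, largest first
top3 <- head(sort(abs(res), decreasing = TRUE), 3)
print(round(top3, 3))

# Least squares with an intercept forces the residuals to average to zero
print(mean(res))
```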
Plotting a normal Q-Q graph of the standardized residuals. The points lie close to the diagonal reference line except for a few outliers, which suggests the errors are approximately normally distributed, an assumption of linear regression.
plot(slm, 2)
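A Q-Q graph is a visual check. As a numerical complement that is not part of the original walkthrough, the Shapiro-Wilk test from base R can be applied to the same residuals; a small p-value would be evidence against normality. With only 32 observations the test has limited power, so it is best read alongside the graph rather than instead of it. The names std_res and sw are introduced here.

```r
# Refit the simple model so this block runs on its own
slm <- lm(mpg ~ wt, data = mtcars)

# Standardized residuals, the same quantity shown on the Q-Q graph
std_res <- rstandard(slm)

# Shapiro-Wilk test: the null hypothesis is that the values are normally distributed
sw <- shapiro.test(std_res)
print(sw)
```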
Multiple linear regression (multiple explanatory variables)
Loading the library for 3D scatterplot.
library(scatterplot3d)
Fitting the “mtcars” dataset with a multiple linear regression model, where the response variable is “mpg” and the explanatory variables are “wt” (weight) and “disp” (displacement), and then printing out information about the multiple linear regression model.
mlm <- lm(mpg ~ wt + disp, data = mtcars)
summary(mlm)
##
## Call:
## lm(formula = mpg ~ wt + disp, data = mtcars)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -3.409 -2.324 -0.768  1.772  6.348
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 34.96055    2.16454   16.15  4.9e-16 ***
## wt          -3.35083    1.16413   -2.88   0.0074 **
## disp        -0.01772    0.00919   -1.93   0.0636 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.92 on 29 degrees of freedom
## Multiple R-squared: 0.781, Adjusted R-squared: 0.766
## F-statistic: 51.7 on 2 and 29 DF, p-value: 2.74e-10
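One way to judge whether adding “disp” improves on the simple model is an F-test between the two nested fits. The sketch below uses anova() for this. Because the models differ by a single term, the resulting p-value matches the one reported for “disp” in the coefficient table above. The names cmp and p_disp are introduced here.

```r
# Fit both models so this block runs on its own
slm <- lm(mpg ~ wt, data = mtcars)
mlm <- lm(mpg ~ wt + disp, data = mtcars)

# F-test of the smaller model against the larger one
cmp <- anova(slm, mlm)
print(cmp)

# p-value for the added term
p_disp <- cmp[["Pr(>F)"]][2]
print(p_disp)

# Adjusted R-squared penalizes the extra parameter
print(c(simple = summary(slm)$adj.r.squared,
        multiple = summary(mlm)$adj.r.squared))
```

Here the adjusted R-squared rises only slightly and the p-value for “disp” sits just above 0.05, so the evidence for keeping the second variable is modest.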
Plotting a graph of “wt” and “disp” vs. “mpg” that displays residual errors.
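# Draw the 3D scatterplot; the returned object provides helpers for adding to it
s3d <- scatterplot3d(x = mtcars$wt, y = mtcars$disp, z = mtcars$mpg, main = "Car Weight and Car Displacement vs. Car MPG with Residuals", xlab = "Weight", ylab = "Displacement", zlab = "MPG", color = "blue", pch = 20)
# Overlay the fitted regression plane
s3d$plane3d(mlm, lty = "dotted")
# Convert the observed points and their fitted values to 2D plot coordinates
orig <- s3d$xyz.convert(mtcars$wt, mtcars$disp, mtcars$mpg)
plane <- s3d$xyz.convert(mtcars$wt, mtcars$disp, fitted(mlm))
# Index 1 for negative residuals (blue, dashed), 2 for positive residuals (red, solid)
i.negpos <- 1 + (resid(mlm) > 0)
segments(orig$x, orig$y, plane$x, plane$y, col = c("blue", "red")[i.negpos], lty = (2:1)[i.negpos])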
Related Materials
“scatterplot3d” function
Linear regression
Plotting 3D scatterplots with residuals
DataRobot Documentation portal (Regression Problems section)