Additive Models
Solutions
Load packages and data
library(tidyverse)
library(tidymodels)
library(scatterplot3d)
library(palmerpenguins)
Today
By the end of today you will…
- understand the difference between an additive model and an interaction model
- understand the geometric picture of multiple linear regression
- be able to build, fit and interpret linear models with \(>1\) predictor
Additive vs interaction model
We are going to fit a model that looks at the impact of bill length on flipper length, while also accounting for the species of penguin. In this context, define what an additive model is vs an interaction model.
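additive: The relationship between flipper length and bill length does not depend on species. Every species has the same slope; only the intercept differs.
interaction: The relationship between flipper length and bill length depends on species. Each species can have its own slope as well as its own intercept.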
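The two definitions map onto two model formulas in R. The following is a minimal sketch, not the course's official solution; the object names `penguins_complete`, `fit_additive` and `fit_interaction` are chosen here for illustration. It uses `+` for the additive model and `*` for the interaction model, then lists the coefficient names so the extra slope terms are visible.

```r
library(palmerpenguins)

# Drop rows with missing values (hypothetical object name)
penguins_complete <- na.omit(penguins)

# Additive: one shared slope for bill length, species shifts the intercept
fit_additive <- lm(flipper_length_mm ~ bill_length_mm + species,
                   data = penguins_complete)

# Interaction: species changes both the intercept and the slope
fit_interaction <- lm(flipper_length_mm ~ bill_length_mm * species,
                      data = penguins_complete)

# The interaction fit has two extra coefficients, one slope adjustment
# for each non-baseline species
names(coef(fit_additive))
names(coef(fit_interaction))
```

The interaction fit should report the same four terms as the additive fit plus `bill_length_mm:speciesChinstrap` and `bill_length_mm:speciesGentoo`, which are the species-specific changes in slope relative to the baseline species.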
Plots
additive model
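# Keep only complete rows so the fitted values line up with the rows of the data
penguins <- na.omit(penguins)
# Additive model: common slope, species-specific intercepts
fitlm <- lm(flipper_length_mm ~ bill_length_mm + species, data = penguins)
# Store the fitted values so one line per species can be drawn
penguins$predlm <- predict(fitlm)
ggplot(penguins, aes(x = bill_length_mm,
                     y = flipper_length_mm,
                     color = species)) +
  geom_point() +
  geom_line(aes(y = predlm), linewidth = 1)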
interaction model
# geom_smooth() fits a separate line within each color group (species)
penguins |>
  ggplot(
    aes(x = bill_length_mm,
        y = flipper_length_mm,
        color = species)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
`geom_smooth()` using formula = 'y ~ x'
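Because `color = species` splits the data into groups, `geom_smooth(method = "lm")` draws a separate least-squares line for each species. A rough sketch, using hypothetical object names, of why this matches the interaction model: the per-species slopes implied by a `bill_length_mm * species` fit can be compared with slopes from three separate simple regressions.

```r
library(palmerpenguins)

penguins_complete <- na.omit(penguins)

# Interaction model fitted to all species at once
fit_interaction <- lm(flipper_length_mm ~ bill_length_mm * species,
                      data = penguins_complete)

# Separate simple regressions, one per species
per_species <- lapply(
  split(penguins_complete, penguins_complete$species),
  function(d) coef(lm(flipper_length_mm ~ bill_length_mm, data = d))
)

# Slope for each species implied by the interaction model:
# baseline slope plus that species' slope adjustment
b <- coef(fit_interaction)
slopes_from_interaction <- c(
  Adelie    = b[["bill_length_mm"]],
  Chinstrap = b[["bill_length_mm"]] + b[["bill_length_mm:speciesChinstrap"]],
  Gentoo    = b[["bill_length_mm"]] + b[["bill_length_mm:speciesGentoo"]]
)
slopes_per_species <- sapply(per_species, function(cf) cf[["bill_length_mm"]])

# Compare the two sets of slopes
all.equal(slopes_from_interaction, slopes_per_species)
```

The slopes agree because a model with a full species interaction has enough parameters to give each species its own intercept and slope, so it reproduces the three separate fits.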
Fitting the additive model
To fit the additive model, use the + sign: add species as a second predictor in the linear model code from Monday’s class.
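The code that produced the summary is not shown on this page. The `Call:` line of the output suggests something like the sketch below, which assumes base R's `lm()` and uses the hypothetical object name `additive_fit`. Monday's class may have used a different interface, such as tidymodels' `linear_reg()`.

```r
library(palmerpenguins)

# Same complete-case data as in the plots above
penguins <- na.omit(penguins)

# Additive model: bill length plus species
additive_fit <- lm(flipper_length_mm ~ bill_length_mm + species, data = penguins)

# Print the call, residual summary and coefficient table
summary(additive_fit)
```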
Call:
lm(formula = flipper_length_mm ~ bill_length_mm + species, data = penguins)
Residuals:
Min 1Q Median 3Q Max
-24.8669 -3.4617 -0.0765 3.7020 15.9944
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 147.5633 4.2234 34.940 < 2e-16 ***
bill_length_mm 1.0957 0.1081 10.139 < 2e-16 ***
speciesChinstrap -5.2470 1.3797 -3.803 0.00017 ***
speciesGentoo 17.5517 1.1883 14.771 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 5.833 on 329 degrees of freedom
Multiple R-squared: 0.8283, Adjusted R-squared: 0.8268
F-statistic: 529.2 on 3 and 329 DF, p-value: < 2.2e-16
What are the equations for the three different species in the additive model?
See the slides!
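As a worked sketch based only on the coefficient table above (the slides remain the reference answer), the three equations follow from adding each species offset to the intercept. This assumes Adelie is the baseline level, which is consistent with the output showing only `speciesChinstrap` and `speciesGentoo` terms. Here \(\widehat{\text{flipper}}\) is the predicted flipper length in mm and \(\text{bill}\) is the bill length in mm.

\[
\begin{aligned}
\text{Adelie:} \quad & \widehat{\text{flipper}} = 147.5633 + 1.0957 \times \text{bill} \\
\text{Chinstrap:} \quad & \widehat{\text{flipper}} = (147.5633 - 5.2470) + 1.0957 \times \text{bill} = 142.3163 + 1.0957 \times \text{bill} \\
\text{Gentoo:} \quad & \widehat{\text{flipper}} = (147.5633 + 17.5517) + 1.0957 \times \text{bill} = 165.1150 + 1.0957 \times \text{bill}
\end{aligned}
\]

All three lines share the slope 1.0957 and differ only in their intercepts, which is what makes the model additive.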