Type 3 ANOVA in R -- an easy way to publish wrong tables
In R, so-called “Type I sums of squares” are default. With balanced designs, inferential statistics from Type I, II, and III sums of squares are equal. Type III sums of squares are returned using car::Anova instead of base R anova. But to get the correct Type III statistics, you cannot simply specify car:Anova(m1, type = 3). You also have to set the contrasts in the model matrix to contr.sum in your linear model fit.
Best practice: many google searches will return code that resets the contrasts globally using options. I use code that specifies the contrasts only in the linear model that we want to specify contr.sum. This is much safer, at least if you like to interpret model coefficients using default R contrasts – the coefficients are differences relative to a reference and the interactions are relative to what you would have gotten if things were additive. If I reset the contrasts globally using options, and then, in code further down the document, fit a linear model and interpret the coefficients as if the model was fit with R default contrasts, then my interpretation of the effects would be wrong.
library(data.table)
library(car)
# Loading required package: carData
n <- 5
fd <- data.table(fac1 = rep(c("cn","tr"), each = n*2*2),
fac2 = rep(rep(c("wt", "ko"), each = n*2), 2),
fac3 = rep(rep(rep(c("f","m"), each = n),2),2),
y = rnorm(n*2*2*2))
linear model set with R defaults, using anova – correct Type I
m1 <- lm(y ~ fac1*fac2*fac3, data = fd)
anova(m1)
# Analysis of Variance Table
#
# Response: y
# Df Sum Sq Mean Sq F value Pr(>F)
# fac1 1 0.0064 0.00642 0.0066 0.93562
# fac2 1 0.1998 0.19978 0.2062 0.65283
# fac3 1 0.1578 0.15781 0.1629 0.68921
# fac1:fac2 1 0.8592 0.85918 0.8868 0.35341
# fac1:fac3 1 1.2276 1.22755 1.2670 0.26871
# fac2:fac3 1 0.1080 0.10803 0.1115 0.74062
# fac1:fac2:fac3 1 2.9696 2.96960 3.0650 0.08958 .
# Residuals 32 31.0043 0.96888
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
linear model set with contr.sum, using Anova – correct Type III
type3 <- list(fac1 = contr.sum, fac2 = contr.sum, fac3 = contr.sum)
m2 <- lm(y ~ fac1*fac2*fac3, data = fd, contrasts = type3)
Anova(m2, type = 3) # correct type 3
# Anova Table (Type III tests)
#
# Response: y
# Sum Sq Df F value Pr(>F)
# (Intercept) 0.0420 1 0.0433 0.83643
# fac1 0.0064 1 0.0066 0.93562
# fac2 0.1998 1 0.2062 0.65283
# fac3 0.1578 1 0.1629 0.68921
# fac1:fac2 0.8592 1 0.8868 0.35341
# fac1:fac3 1.2276 1 1.2670 0.26871
# fac2:fac3 0.1080 1 0.1115 0.74062
# fac1:fac2:fac3 2.9696 1 3.0650 0.08958 .
# Residuals 31.0043 32
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
linear model set with R defaults, using Anova – wrong Type III
This is garbage.
m1 <- lm(y ~ fac1*fac2*fac3, data = fd)
Anova(m1, type = 3) # correct type 3
# Anova Table (Type III tests)
#
# Response: y
# Sum Sq Df F value Pr(>F)
# (Intercept) 0.7616 1 0.7860 0.38192
# fac1 0.9845 1 1.0161 0.32101
# fac2 0.1149 1 0.1186 0.73278
# fac3 3.1633 1 3.2649 0.08019 .
# fac1:fac2 0.3171 1 0.3273 0.57128
# fac1:fac3 4.0078 1 4.1366 0.05033 .
# fac2:fac3 2.1052 1 2.1728 0.15024
# fac1:fac2:fac3 2.9696 1 3.0650 0.08958 .
# Residuals 31.0043 32
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1