Type 3 ANOVA in R -- an easy way to publish wrong tables
In R, so-called “Type I sums of squares” are default. With balanced designs, inferential statistics from Type I, II, and III sums of squares are equal. Type III sums of squares are returned using car::Anova
instead of base R anova
. But to get the correct Type III statistics, you cannot simply specify car:Anova(m1, type = 3)
. You also have to set the contrasts in the model matrix to contr.sum
in your linear model fit.
Best practice: many google searches will return code that resets the contrasts globally using options
. I use code that specifies the contrasts only in the linear model that we want to specify contr.sum
. This is much safer, at least if you like to interpret model coefficients using default R contrasts – the coefficients are differences relative to a reference and the interactions are relative to what you would have gotten if things were additive. If I reset the contrasts globally using options
, and then, in code further down the document, fit a linear model and interpret the coefficients as if the model was fit with R default contrasts, then my interpretation of the effects would be wrong.
library(data.table)
library(car)
## Loading required package: carData
n <- 5
fd <- data.table(fac1 = rep(c("cn","tr"), each = n*2*2),
fac2 = rep(rep(c("wt", "ko"), each = n*2), 2),
fac3 = rep(rep(rep(c("f","m"), each = n),2),2),
y = rnorm(n*2*2*2))
linear model set with R defaults, using anova – correct Type I
m1 <- lm(y ~ fac1*fac2*fac3, data = fd)
anova(m1)
## Analysis of Variance Table
##
## Response: y
## Df Sum Sq Mean Sq F value Pr(>F)
## fac1 1 0.0064 0.00642 0.0066 0.93562
## fac2 1 0.1998 0.19978 0.2062 0.65283
## fac3 1 0.1578 0.15781 0.1629 0.68921
## fac1:fac2 1 0.8592 0.85918 0.8868 0.35341
## fac1:fac3 1 1.2276 1.22755 1.2670 0.26871
## fac2:fac3 1 0.1080 0.10803 0.1115 0.74062
## fac1:fac2:fac3 1 2.9696 2.96960 3.0650 0.08958 .
## Residuals 32 31.0043 0.96888
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
linear model set with contr.sum, using Anova – correct Type III
type3 <- list(fac1 = contr.sum, fac2 = contr.sum, fac3 = contr.sum)
m2 <- lm(y ~ fac1*fac2*fac3, data = fd, contrasts = type3)
Anova(m2, type = 3) # correct type 3
## Anova Table (Type III tests)
##
## Response: y
## Sum Sq Df F value Pr(>F)
## (Intercept) 0.0420 1 0.0433 0.83643
## fac1 0.0064 1 0.0066 0.93562
## fac2 0.1998 1 0.2062 0.65283
## fac3 0.1578 1 0.1629 0.68921
## fac1:fac2 0.8592 1 0.8868 0.35341
## fac1:fac3 1.2276 1 1.2670 0.26871
## fac2:fac3 0.1080 1 0.1115 0.74062
## fac1:fac2:fac3 2.9696 1 3.0650 0.08958 .
## Residuals 31.0043 32
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
linear model set with R defaults, using Anova – wrong Type III
This is garbage.
m1 <- lm(y ~ fac1*fac2*fac3, data = fd)
Anova(m1, type = 3) # correct type 3
## Anova Table (Type III tests)
##
## Response: y
## Sum Sq Df F value Pr(>F)
## (Intercept) 0.7616 1 0.7860 0.38192
## fac1 0.9845 1 1.0161 0.32101
## fac2 0.1149 1 0.1186 0.73278
## fac3 3.1633 1 3.2649 0.08019 .
## fac1:fac2 0.3171 1 0.3273 0.57128
## fac1:fac3 4.0078 1 4.1366 0.05033 .
## fac2:fac3 2.1052 1 2.1728 0.15024
## fac1:fac2:fac3 2.9696 1 3.0650 0.08958 .
## Residuals 31.0043 32
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1