Update - This post has been updated

A very skeletal analysis of

Sharon, G., Cruz, N.J., Kang, D.W., Gandal, M.J., Wang, B., Kim, Y.M., Zink, E.M., Casey, C.P., Taylor, B.C., Lane, C.J. and Bramer, L.M., 2019. Human Gut Microbiota from Autism Spectrum Disorder Promote Behavioral Symptoms in Mice. Cell, 177(6), pp.1600-1618.

which got some attention on pubpeer.

Commenters are questioning the result of Fig1G. It is very hard to infer a p-value from plots like these, where the data are multi-level, regardless of if means and some kind of error bar is presented. A much better plot for inferring differences is an effects plot with the CI of the effect. That said, I’ll try to reproduce the resulting p-value.

Caveats

Failure to reproduce or partial reproducibility might be an error in my coding, or my misunderstanding of the author’s methods, or my lack of knowledge of statistics generally or the R functions that I use more specifically.

library(ggplot2)
library(nlme)
library(lmerTest)
library(car)
library(emmeans)
library(lmtest)
library(data.table)
data_path <- "../data/Data from Human Gut Microbiota from Autism Spectrum Disorder Promote Behavioral Symptoms in Mice/Fig1"
#data_path <- "data/Data from Human Gut Microbiota from Autism Spectrum Disorder Promote Behavioral Symptoms in Mice/Fig1"
fn <- "Fig1EFGH_subset8.csv"
file_path <- paste(data_path, fn, sep="/")
mouse <- fread(file_path)
mouse[, Gender:=factor(Gender)]
mouse[, ASD_diagnosis:=factor(ASD_diagnosis, c("NT", "ASD"))]
mouse[, MouseID:=factor(MouseID)]
mouse[, Round:=factor(Round)]

Figure 1G

Estimate of marginal means and Cohen’s d

One could reasonably estimate Cohen’s d several ways given the 2 x 2 design. Here are two

m1 <- lmer(OFT_Distance ~ ASD_diagnosis + (1|Donor), data=mouse)
(m1.emm <- emmeans(m1, specs=c("ASD_diagnosis")))
##  ASD_diagnosis emmean  SE   df lower.CL upper.CL
##  NT              4600 262 5.70     3951     5249
##  ASD             4235 206 6.08     3731     4738
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95
m2 <- lmer(OFT_Distance ~ Gender + ASD_diagnosis + (1|Donor), data=mouse)
(m2.emm <- emmeans(m2, specs=c("ASD_diagnosis")))
##  ASD_diagnosis emmean  SE   df lower.CL upper.CL
##  NT              4603 259 5.68     3960     5246
##  ASD             4233 204 6.07     3735     4731
## 
## Results are averaged over the levels of: Gender 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95

The effect, standardized to cohen’s D using residual standard deviation from model: (NT - ASD)/sigma

coef(summary(m1))["ASD_diagnosisASD", "Estimate"]/sigma(m1)
## [1] -0.3928202
summary(contrast(m2.emm, method="revpairwise"))[1, "estimate"]/sigma(m2)
## [1] -0.3970379

The reported cohen’s D for the distance traveled is doTD-oASD = -0.58

Fig 1G p-value

Method no. 1 (as stated in caption to Fig 1)

The figure caption gives the method as “Hypothesis testing for differences of the means were done by a mixed effects analysis using donor diagnosis and mouse sex as fixed effects and donor ID as a random effect. p values were derived from a chi-square test”. This would be a Likelihood Ratio Test (LRT). The LRT requires that the model be fit by maximum likelhood for the statistic to have any meaning. The default fit in R is REML.

The LRT of the model as specified in the caption of Fig1G

# correct LRT
m1 <- lmer(OFT_Distance ~ Gender + ASD_diagnosis + (1|Donor), REML=FALSE, data=mouse)
m2 <- lmer(OFT_Distance ~ Gender + (1|Donor), REML=FALSE, data=mouse)
lrtest(m2, m1)
## Likelihood ratio test
## 
## Model 1: OFT_Distance ~ Gender + (1 | Donor)
## Model 2: OFT_Distance ~ Gender + ASD_diagnosis + (1 | Donor)
##   #Df  LogLik Df  Chisq Pr(>Chisq)
## 1   4 -1706.7                     
## 2   5 -1706.0  1 1.4994     0.2208

The p-value is .2208

The LRT of the model as specified but using the default lmer settings, which is fit by REML, which results in a meaningless chisq statistic

m1 <- lmer(OFT_Distance ~ Gender + ASD_diagnosis + (1|Donor), data=mouse)
m2 <- lmer(OFT_Distance ~ Gender + (1|Donor), data=mouse)
lrtest(m2, m1)
## Likelihood ratio test
## 
## Model 1: OFT_Distance ~ Gender + (1 | Donor)
## Model 2: OFT_Distance ~ Gender + ASD_diagnosis + (1 | Donor)
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   4 -1695.0                         
## 2   5 -1687.6  1 14.692  0.0001266 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

the p-value is 0.00013. Is this the reported p-value? I don’t know. A problem is that the methods section suggests a different model was used. This leads to:

method no. 2 (as stated in methods section)

“Statistical analysis for behavioral outcomes in fecal transplanted offspring. Comparison of behavioral outcomes between TD Controls and ASD donors were tested using longitudinal linear mixed effects analyses, with test cycles and donors treated as repeated factors. Analyses were performed in SPSS (v 24); a priori alpha = 0.05. All outcomes were tested for normality and transformed as required. Diagonal covariance matrices were used so that intra-cycle and intra-donor correlations were accounted for in the modeling. The donor type (TD versus ASD) was the primary fixed effect measured, and mouse sex was an a priori covariate.”

(I think) it would be unlikely to get the wrong LRT using SPSS. Also the specified model differs from that in the caption. Was the response transformed (note it’s not the reponse that needs to be “normal” but the residuals from the linear model (or equivalently the conditional response))

correct LRT

# correct LRT
m1 <- lmer(OFT_Distance ~ Gender + ASD_diagnosis + (1|Cycle) + (1|Donor), REML=FALSE, data=mouse)
## boundary (singular) fit: see ?isSingular
m2 <- lmer(OFT_Distance ~ Gender + (1|Cycle) + (1|Donor), REML=FALSE, data=mouse)
lrtest(m2, m1)
## Likelihood ratio test
## 
## Model 1: OFT_Distance ~ Gender + (1 | Cycle) + (1 | Donor)
## Model 2: OFT_Distance ~ Gender + ASD_diagnosis + (1 | Cycle) + (1 | Donor)
##   #Df  LogLik Df  Chisq Pr(>Chisq)
## 1   5 -1706.7                     
## 2   6 -1706.0  1 1.4974     0.2211

Hmm. Need to explore the error.

incorrect LRT

# correct LRT
m1 <- lmer(OFT_Distance ~ Gender + ASD_diagnosis + (1|Cycle) + (1|Donor), data=mouse)
m2 <- lmer(OFT_Distance ~ Gender + (1|Cycle) + (1|Donor), data=mouse)
lrtest(m2, m1)
## Likelihood ratio test
## 
## Model 1: OFT_Distance ~ Gender + (1 | Cycle) + (1 | Donor)
## Model 2: OFT_Distance ~ Gender + ASD_diagnosis + (1 | Cycle) + (1 | Donor)
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   5 -1694.9                         
## 2   6 -1687.6  1 14.663  0.0001286 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Basically the same results

Update

  1. Thomas Lumley also explored statistical models based on the author’s two descriptions. The first set are similar to mine above, but using Satterthwaite or Kenward-Rogers ddf on F-test instead of a LRT. I punted on the “diagonal covariance matrices” of method 2, but Lumley ran with it and, with a bit of SPSS manual description, figured out that the result can be reproduced using a GLS that weights the error variances using levels of Donor. This isn’t a mixed model with Donor as a random effect, as stated in the methods. Most importantly, since donor isn’t a random effect, the model uses complete pooling, which ignores the correlated error arising from the clustered design (mice recieving microbes from the same donor). The consequence is a super liberal denominator df for the test, and overly optimistic p-values.
type3 <- list(Gender=contr.sum, ASD_diagnosis=contr.sum) # change the contrasts coding for the model matrix
m2a <- lm(OFT_Distance ~ Gender*ASD_diagnosis, data=mouse, contrasts=type3)
Anova(m2a, type="3")
## Anova Table (Type III tests)
## 
## Response: OFT_Distance
##                          Sum Sq  Df   F value  Pr(>F)    
## (Intercept)          3863414020   1 3828.6270 < 2e-16 ***
## Gender                  1795588   1    1.7794 0.18372    
## ASD_diagnosis           4676457   1    4.6343 0.03252 *  
## Gender:ASD_diagnosis     722345   1    0.7158 0.39851    
## Residuals             203835380 202                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# contrasts statement doesn't work with gls so change variable in data.table
mouse[, Gender.sum := C(Gender, contr.sum)]
mouse[, ASD_diagnosis.sum := C(ASD_diagnosis, contr.sum)]
m2b <- gls(OFT_Distance ~ Gender.sum*ASD_diagnosis.sum, weights=varIdent(form=~1|Donor), data=mouse)
anova(m2b, type="marginal")
## Denom. DF: 202 
##                              numDF  F-value p-value
## (Intercept)                      1 4336.040  <.0001
## Gender.sum                       1    2.378  0.1246
## ASD_diagnosis.sum                1   15.345  0.0001
## Gender.sum:ASD_diagnosis.sum     1    1.871  0.1729

some checks

What is MouseID?

In the pubpeer response, Sharon states “specify linear mixed effects modeling of individual mice, nested within donor, and round of testing”. I assume individual mice is referenced by MouseID. But…

Number of Mice

length(levels(mouse$MouseID))
## [1] 75

Explore a single mouse

mouse[MouseID==1, 1:10]
##     V1 Round Cycle MouseID Gender Donor ASD_diagnosis OFT_Distance
## 1:   0     1     1       1 Female    C1            NT      2763.60
## 2: 566     9     3       1   Male    C1            NT      4565.92
## 3: 632    10     3       1   Male    C4            NT      4615.37
##    OFT_velocity OFT_LrgCntr_freq
## 1:      5.08254              113
## 2:      7.60987              101
## 3:      7.84499              128
# mouse ID=1 is both female and male? So MouseID cannot be the mouse tested behaviorally

Hmm. MouseID=1 is both male and female? Is this true of some/all MouseID?

# check number of mouseID:Gender combinations
unique(paste(mouse$MouseID, mouse$Gender,sep=":"))
##   [1] "1:Female"  "2:Female"  "3:Female"  "4:Female"  "5:Male"   
##   [6] "6:Male"    "7:Male"    "8:Male"    "9:Female"  "22:Male"  
##  [11] "23:Male"   "24:Female" "25:Female" "26:Female" "27:Female"
##  [16] "28:Male"   "29:Male"   "30:Female" "31:Female" "32:Female"
##  [21] "33:Female" "35:Female" "35:Male"   "36:Female" "37:Female"
##  [26] "38:Female" "39:Female" "40:Male"   "41:Male"   "42:Male"  
##  [31] "43:Male"   "44:Male"   "46:Female" "46:Male"   "47:Female"
##  [36] "48:Female" "49:Female" "45:Male"   "47:Male"   "48:Male"  
##  [41] "49:Male"   "50:Male"   "51:Male"   "52:Male"   "53:Male"  
##  [46] "54:Male"   "55:Female" "56:Female" "57:Female" "58:Female"
##  [51] "59:Female" "60:Female" "61:Female" "62:Female" "30:Male"  
##  [56] "31:Male"   "32:Male"   "33:Male"   "34:Male"   "64:Female"
##  [61] "65:Female" "66:Female" "67:Female" "68:Female" "69:Female"
##  [66] "70:Female" "71:Female" "72:Female" "73:Female" "74:Female"
##  [71] "75:Female" "1:Male"    "2:Male"    "3:Male"    "4:Male"   
##  [76] "9:Male"    "10:Male"   "11:Male"   "12:Male"   "13:Male"  
##  [81] "14:Male"   "15:Male"   "16:Male"   "17:Male"   "18:Male"  
##  [86] "19:Male"   "20:Male"   "21:Male"   "24:Male"   "25:Male"  
##  [91] "26:Male"   "27:Male"   "36:Male"   "41:Female" "42:Female"
##  [96] "43:Female" "44:Female" "45:Female" "50:Female" "51:Female"
## [101] "53:Female" "54:Female" "63:Female" "34:Female" "40:Female"
## [106] "52:Female"

So 75 “individual mice” (identified by MouseID) but 106 combinations of MouseID:Gender. So I’m not sure what MouseID identifies – maybe litter (see below). Certainly not an individual mouse that were tested.

What is a cycle v round?

Again, in the pubpeer response, Sharon states “specify linear mixed effects modeling of individual mice, nested within donor, and round of testing

dim(mouse)
## [1] 206  50
length(unique(paste(mouse$MouseID, mouse$Round,sep=":")))
## [1] 204

206 data points (consistant with results) but 204 combinations of MouseID and Round. Which have > 1

mouse[, ID_x_Round:=paste(MouseID, Round, sep="_x_")]
dt <- mouse[, .(N=.N), by=ID_x_Round]
dt[N>1,] # 35 x 1 and 46 x 1
##    ID_x_Round N
## 1:     35_x_1 2
## 2:     46_x_1 2
mouse[MouseID=="35", ]
##     V1 Round Cycle MouseID Gender   Donor ASD_diagnosis OFT_Distance
## 1:  32     1     1      35 Female A24-new           ASD      4089.96
## 2:  33     1     1      35   Male A24-new           ASD      3855.53
## 3: 167     3     1      35   Male      N5            NT      5201.08
## 4: 600     9     3      35   Male      C1            NT      2687.43
## 5: 666    10     3      35 Female      C4            NT      4974.22
##    OFT_velocity OFT_LrgCntr_freq OFT_LrgCntr_duration OFT_LrgCntr_latency
## 1:      7.42706              102              78.6119            0.467133
## 2:      6.84133              115              78.3450            0.867534
## 3:      8.79415              159              88.3217           36.269600
## 4:      4.51749               75              44.5779            6.473140
## 5:      8.42008              138              82.7161            7.507510
##    OFT_SmlCntr_freq OFT_SmlCntr_duration OFT_SmlCntr_latency OFT_Movement
## 1:               45              74.8081             0.00000      62.1817
## 2:               66              88.0214             1.33467      66.0698
## 3:               66             101.5350            45.44540      56.5645
## 4:               31              68.0681             6.84017      36.3844
## 5:               63             147.0140             7.67434      58.0771
##    MB_visible MB_buried MB_percentBuried DSI_Agrsn_freq DSI_Agrsn_duration
## 1:         15         5               25             NA                 NA
## 2:          7        13               65             NA                 NA
## 3:         18         2               10             NA                 NA
## 4:         11         9               45             11             61.295
## 5:         19         1                5              0              0.000
##    DSI_Social_freq DSI_Social_duration DSI_Groom_freq DSI_Groom_Duration
## 1:              NA                  NA             NA                 NA
## 2:              NA                  NA             NA                 NA
## 3:              NA                  NA             NA                 NA
## 4:              12              51.995              6             14.505
## 5:              11              15.110              3              6.040
##    SI_Habit_Social SI_Habit_Cntr SI_Habit_NonSocial SI_Sociability_Social
## 1:        234.1680       181.648            126.793               251.185
## 2:        256.3900       175.175            154.621               225.292
## 3:        205.5390       192.259            181.949               310.410
## 4:         92.8262       151.518            310.344               325.392
## 5:        102.8360       179.980            293.460               235.002
##    SI_Sociability_Cntr SI_Sociability_NonSocial Unnamed: 30 Unnamed: 31
## 1:             70.8709                  257.391          NA          NA
## 2:            116.4500                  245.112          NA          NA
## 3:             52.7194                  219.186          NA          NA
## 4:             33.4001                  210.677          NA          NA
## 5:             79.0457                  218.986          NA          NA
##    Unnamed: 32 USV_freq USV_Duration DSI_USV_median DSI_USV_StdDev
## 1:          NA       NA           NA             NA             NA
## 2:          NA     1063      47.8241             NA             NA
## 3:          NA      519      22.9990             NA             NA
## 4:          NA       NA           NA        0.04745        0.07175
## 5:          NA       NA           NA        0.00620        0.01302
##    DSI_USV_Freq DSI_USV_Duration DSI_USV_Slbl_num DSI_USV_Slbl_Duration
## 1:           NA               NA               NA                    NA
## 2:           NA               NA               NA                    NA
## 3:           NA               NA               NA                    NA
## 4:          154           7.3079               44                    85
## 5:           77           0.4778               42                     5
##    DSI_USV_Slbl_Gap_Length DSI_USV_Slbl_Bout_freq DSI_USV_Slbl_Bout_gap
## 1:                      NA                     NA                    NA
## 2:                      NA                     NA                    NA
## 3:                      NA                     NA                    NA
## 4:                    8079                     28                   501
## 5:                    3170                     26                   501
##    DSI_USV_Slbl_SlblPerBout Unnamed: 45  SI_index Gender.sum
## 1:                       NA          NA -0.610135     Female
## 2:                       NA          NA -2.106700       Male
## 3:                       NA          NA  8.612603       Male
## 4:                        2          NA 10.699649       Male
## 5:                        2          NA  1.763923     Female
##    ASD_diagnosis.sum ID_x_Round
## 1:               ASD     35_x_1
## 2:               ASD     35_x_1
## 3:                NT     35_x_3
## 4:                NT     35_x_9
## 5:                NT    35_x_10
mouse[MouseID=="46", ]
##     V1 Round Cycle MouseID Gender   Donor ASD_diagnosis OFT_Distance
## 1:  43     1     1      46 Female A24-new           ASD      3622.66
## 2:  44     1     1      46   Male A24-new           ASD      3300.87
## 3: 609     9     3      46 Female A24-new           ASD      4195.94
## 4: 677    10     3      46 Female A11-old           ASD      4029.45
##    OFT_velocity OFT_LrgCntr_freq OFT_LrgCntr_duration OFT_LrgCntr_latency
## 1:      6.57209               81              62.9296            25.42540
## 2:      5.63688              105              91.2246             6.27294
## 3:      7.01313              154             112.3120            12.94630
## 4:      6.87479              101              59.4928             1.56823
##    OFT_SmlCntr_freq OFT_SmlCntr_duration OFT_SmlCntr_latency OFT_Movement
## 1:               27              89.4895            25.89260      61.4769
## 2:               43             148.6820             7.34067      58.4853
## 3:               57              59.6263            13.24660      49.4967
## 4:               36              31.4648             8.44177      48.5347
##    MB_visible MB_buried MB_percentBuried DSI_Agrsn_freq DSI_Agrsn_duration
## 1:         18         2               10             NA                 NA
## 2:         16         4               20             NA                 NA
## 3:         14         6               30              0                  0
## 4:         15         5               25              0                  0
##    DSI_Social_freq DSI_Social_duration DSI_Groom_freq DSI_Groom_Duration
## 1:              NA                  NA             NA                 NA
## 2:              NA                  NA             NA                 NA
## 3:              14              22.945              1              7.855
## 4:              15              25.230              2             10.105
##    SI_Habit_Social SI_Habit_Cntr SI_Habit_NonSocial SI_Sociability_Social
## 1:         93.1598       212.346           253.7200               331.798
## 2:        261.5280       199.666           128.3280               323.323
## 3:        158.0580       196.964            94.4611               365.265
## 4:        111.3450       229.096           183.6500               303.804
##    SI_Sociability_Cntr SI_Sociability_NonSocial Unnamed: 30 Unnamed: 31
## 1:             48.3150                  191.458          NA          NA
## 2:             80.0801                  173.040          NA          NA
## 3:             47.9813                  113.113          NA          NA
## 4:             80.0133                  179.613          NA          NA
##    Unnamed: 32 USV_freq USV_Duration DSI_USV_median DSI_USV_StdDev
## 1:          NA       NA           NA             NA             NA
## 2:          NA       69       0.3854             NA             NA
## 3:          NA       NA           NA        0.00550        0.01227
## 4:          NA       NA           NA        0.00677        0.01244
##    DSI_USV_Freq DSI_USV_Duration DSI_USV_Slbl_num DSI_USV_Slbl_Duration
## 1:           NA               NA               NA                    NA
## 2:           NA               NA               NA                    NA
## 3:          154           0.8478               43                    16
## 4:          193           1.3078              154                    18
##    DSI_USV_Slbl_Gap_Length DSI_USV_Slbl_Bout_freq DSI_USV_Slbl_Bout_gap
## 1:                      NA                     NA                    NA
## 2:                      NA                     NA                    NA
## 3:                    7564                     26                   158
## 4:                    2315                     51                   200
##    DSI_USV_Slbl_SlblPerBout Unnamed: 45 SI_index Gender.sum
## 1:                       NA          NA 13.41026     Female
## 2:                       NA          NA 15.13842       Male
## 3:                        2          NA 26.35489     Female
## 4:                        3          NA 12.84512     Female
##    ASD_diagnosis.sum ID_x_Round
## 1:               ASD     46_x_1
## 2:               ASD     46_x_1
## 3:               ASD     46_x_9
## 4:               ASD    46_x_10

In addition to two 35 x 1 (mouse ID x Round), this same MouseID is used for mice of different sex and different donors. Does MouseID=litter?