Bias in pre-post designs -- An example from the Turnbaugh et al (2006) mouse fecal transplant study

This post is motivated by a Twitter link to a recent blog post critical of an old but influential study with impressive citation metrics, “An obesity-associated gut microbiome with increased capacity for energy harvest”. In the post, Matthew Dalby smartly used the available data to reconstruct the final weights of the two groups. He showed that these final weights were nearly the same, which is not good evidence for a treatment effect, given that the treatment was randomized among groups.

This is true, but it doesn’t quite capture the essence of the major problem with the analysis: a simple t-test of differences in percent weight change fails to condition on initial weight. In pre-post designs, groups with smaller initial measures are expected to have more change due to regression to the mean. This is exactly what was observed. In the fecal transplant study, the initial mean weight of the mice infected with ob/ob feces was smaller (by 1.2 SD) than that of the mice infected with +/+ feces and, consequently, the expected difference in the change in weight is not zero but positive (this is the expected difference conditional on an initial difference). More generally, a difference in percent change is not a biased estimate of the parametric difference in percent change, but it is also not an estimate of the treatment effect, except under very limited conditions explained below. If these conditions are not met, a difference in percent change is a biased estimate of the treatment effect, unless it is estimated conditional on (or “adjusted for”) the initial weight.
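The regression-to-the-mean argument above can be checked with a small simulation. The sketch below is not the analysis from the post or the study; it is a minimal illustration in Python under assumed conditions: standardized weights (mean 0, variance 1), a pre-post correlation `rho` of 0.5, 10 animals per group, no treatment effect, and raw change rather than percent change. All function names are hypothetical. Under these assumptions the difference in mean change should depend on the initial difference with a slope near `rho - 1`, so the group that starts lighter is expected to gain more.

```python
import random
from statistics import fmean


def simulate_null_experiment(rng, n=10, rho=0.5):
    """Simulate one two-group pre-post experiment with NO treatment effect.

    Returns (difference in initial means, difference in mean change).
    """
    sd_stable = rho ** 0.5        # SD of the stable, between-animal component
    sd_noise = (1 - rho) ** 0.5   # SD of the occasion-specific noise

    def simulate_group():
        pre, post = [], []
        for _ in range(n):
            stable = rng.gauss(0, sd_stable)
            pre.append(stable + rng.gauss(0, sd_noise))
            post.append(stable + rng.gauss(0, sd_noise))
        return pre, post

    pre_a, post_a = simulate_group()
    pre_b, post_b = simulate_group()
    initial_diff = fmean(pre_a) - fmean(pre_b)
    change_diff = (fmean(post_a) - fmean(pre_a)) - (fmean(post_b) - fmean(pre_b))
    return initial_diff, change_diff


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def regression_to_mean_slope(n_sim=5000, n=10, rho=0.5, seed=1):
    """Slope of the change difference on the initial difference across
    many simulated null experiments."""
    rng = random.Random(seed)
    results = [simulate_null_experiment(rng, n, rho) for _ in range(n_sim)]
    initial_diffs, change_diffs = zip(*results)
    return ols_slope(initial_diffs, change_diffs)


slope = regression_to_mean_slope()
print(f"slope of change difference on initial difference: {slope:.3f}")
```

A negative slope means that, with no treatment effect at all, a group that happens to start lighter is expected to show the larger gain, which is the bias that conditioning on initial weight removes.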

Regression to the mean also has consequences for the hypothesis testing approach taken by the authors. Somewhat perplexingly, a simple t-test of the post-treatment weights, of the pre-post difference in weight, or of the percent change in weight does not have an elevated Type I error rate. This is demonstrated using simulation below. The explanation is, in short, that the Type I error rate is also a function of the initial difference in weight. If the initial difference is near zero, the Type I error rate is much less than the nominal alpha (say, 0.05). But as the initial difference moves away from zero, the Type I error rate becomes much greater than the nominal alpha. Over the space of the initial difference, these rates average to the nominal alpha.
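As a rough illustration of this conditional behavior (a sketch under assumed conditions, not the simulation from the full post), the Python code below repeats a null experiment many times, applies a pooled two-sample t-test to the change scores, and then compares rejection rates between the runs with the smallest and the largest absolute initial differences. The assumptions are mine: standardized weights, a pre-post correlation of 0.5, 10 animals per group (so the hard-coded critical value 2.101 for 18 degrees of freedom applies), and hypothetical function names.

```python
import random
from statistics import fmean

T_CRIT_DF18 = 2.101  # two-sided 5% critical value of Student's t with 18 df


def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance estimate."""
    ma, mb = fmean(a), fmean(b)
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    pooled_var = ss / (len(a) + len(b) - 2)
    se = (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5
    return (ma - mb) / se


def type1_by_initial_difference(n_sim=12000, rho=0.5, seed=1):
    """Rejection rates of a t-test on change scores under the null,
    overall and within the lowest and highest thirds of the absolute
    initial difference between groups."""
    n = 10  # per group; fixed so that df = 18 matches T_CRIT_DF18
    rng = random.Random(seed)
    sd_stable, sd_noise = rho ** 0.5, (1 - rho) ** 0.5

    def simulate_group():
        pre, change = [], []
        for _ in range(n):
            stable = rng.gauss(0, sd_stable)
            before = stable + rng.gauss(0, sd_noise)
            after = stable + rng.gauss(0, sd_noise)
            pre.append(before)
            change.append(after - before)
        return pre, change

    runs = []
    for _ in range(n_sim):
        pre_a, change_a = simulate_group()
        pre_b, change_b = simulate_group()
        abs_initial_diff = abs(fmean(pre_a) - fmean(pre_b))
        rejected = abs(pooled_t(change_a, change_b)) > T_CRIT_DF18
        runs.append((abs_initial_diff, rejected))

    # Sort runs by how far apart the groups started, then split into thirds.
    runs.sort(key=lambda run: run[0])
    third = n_sim // 3

    def rejection_rate(chunk):
        return sum(1 for _, rejected in chunk if rejected) / len(chunk)

    return {
        "overall": rejection_rate(runs),
        "small_initial_diff": rejection_rate(runs[:third]),
        "large_initial_diff": rejection_rate(runs[-third:]),
    }


rates = type1_by_initial_difference()
for label, value in rates.items():
    print(f"{label}: {value:.3f}")
```

If the reasoning above holds, the overall rate should sit near 0.05 while the rate for small initial differences falls below it and the rate for large initial differences rises above it.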

R doodles. Some ecology. Some physiology. Much fake data.
