# A simple ggplot of some measure against depth

Set up

The goal is to plot the measure of something, say O2 levels, against depth (soil or lake), with the measures taken on multiple days.

```r
library(ggplot2)
library(data.table)
```

First, create fake data:

```r
depths <- c(0, seq(10, 100, by = 10))
dates <- c("Jan-18", "Mar-18", "May-18", "Jul-18")
x <- expand.grid(date = dates, depth = depths)
n <- nrow(x)
head(x)
##     date depth
## 1 Jan-18     0
## 2 Mar-18     0
## 3 May-18     0
## 4 Jul-18     0
## 5 Jan-18    10
## 6 Mar-18    10
X <- model.
```
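The excerpt stops before the plot itself. As a rough sketch of where the setup could lead, the code below builds the same date-by-depth grid, attaches a made-up response, and draws one depth profile per date. The column name `o2`, the exponential decline with depth, the noise level, and the seed are all assumptions for illustration, not the post's own choices.

```r
library(ggplot2)

set.seed(1)  # hypothetical seed, only so the fake data are reproducible
depths <- c(0, seq(10, 100, by = 10))
dates <- c("Jan-18", "Mar-18", "May-18", "Jul-18")
x <- expand.grid(date = dates, depth = depths)

# Assumed fake response: O2 declines with depth, plus Gaussian noise
x$o2 <- 10 * exp(-0.02 * x$depth) + rnorm(nrow(x), sd = 0.3)

# Sort within date by depth so geom_path connects points from surface to bottom
x <- x[order(x$date, x$depth), ]

p <- ggplot(x, aes(x = o2, y = depth, color = date)) +
  geom_path() +
  geom_point() +
  scale_y_reverse() +  # put the surface (depth 0) at the top
  labs(x = "O2", y = "Depth", color = "Date")
p
```

Putting depth on a reversed y-axis and using `geom_path()` rather than `geom_line()` keeps each profile drawn in depth order instead of in order of the measured value.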

# Should the model-averaged prediction be computed on the link or response scale in a GLM?

[updated to include additional output from MuMIn, BMA, and BAS] This post is a follow-up to my initial post, which was written as a way for me to put down my thoughts on the recent review “Model averaging in ecology: a review of Bayesian, information‐theoretic and tactical approaches for predictive inference”. It was also written without contacting the authors or discussing the issue with them. This post benefits from a series of e-mails with the lead author Carsten Dormann and the last author Florian Hartig.

# On model averaging the coefficients of linear models

A shorter argument based on a specific example is here.

“What model averaging does not mean is averaging parameter estimates, because parameters in different models have different meanings and should not be averaged, unless you are sure you are in a special case in which it is safe to do so.” – Richard McElreath, p. 196 of the textbook I wish I had learned from, Statistical Rethinking

This is an infrequent but persistent criticism of model-averaged coefficients in the applied statistics literature on model averaging.

# An even more compact defense of coefficient model averaging

A longer, more detailed argument is here.

The parameter that is averaged “needs to have the same meaning in all ‘models’ for the equations to be straightforwardly interpretable; the coefficient of x1 in a regression of y on x1 is a different beast than the coefficient of x1 in a regression of y on x1 and x2.” – David Draper in a comment on Hoeting et al. 1999.

David Draper suggested this example from the textbook by Freedman, Pisani and Purves.
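Draper's point about the coefficient of x1 can be made concrete with a small simulation. This is only a sketch with invented numbers (the effect sizes, the correlation between `x1` and `x2`, the sample size, and the seed are assumptions), and it is not the Freedman, Pisani and Purves example mentioned above.

```r
set.seed(42)  # hypothetical seed
n <- 1000
x1 <- rnorm(n)
x2 <- 0.6 * x1 + rnorm(n, sd = 0.8)  # x2 is correlated with x1
y <- 1 * x1 + 2 * x2 + rnorm(n)      # assumed generating model

# Coefficient of x1 when x2 is omitted: estimates 1 + 2 * 0.6 = 2.2
b_short <- unname(coef(lm(y ~ x1))["x1"])

# Coefficient of x1 when x2 is included: estimates 1
b_full <- unname(coef(lm(y ~ x1 + x2))["x1"])

c(short = b_short, full = b_full)
```

The two fits estimate different quantities: the short regression absorbs the part of the x2 effect that travels with x1, while the full regression estimates the effect of x1 with x2 held fixed.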

# Model-averaged coefficients of a GLM

This is a very quick post commenting on the statement “For linear models, predicting from a parameter-averaged model is mathematically identical to averaging predictions, but this is not the case for non-linear models…For non-linear models, such as GLMs with log or logit link functions g(x), such coefficient averaging is not equivalent to prediction averaging.” from the supplement of Dormann et al., “Model averaging in ecology: a review of Bayesian, information‐theoretic and tactical approaches for predictive inference”.
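One way to see what is at stake in the quoted statement is to compare the two scales directly. The sketch below uses fake Poisson data and two nested models. It assumes Akaike weights as the model weights and sets a coefficient to zero in any model that omits the term, and all simulation settings are invented for illustration. It checks that predictions from averaged coefficients match averaged predictions on the link scale, and then compares them with predictions averaged on the response scale.

```r
set.seed(7)  # hypothetical seed
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- rpois(n, exp(0.5 + 0.4 * d$x1 + 0.15 * d$x2))  # assumed generating model

m1 <- glm(y ~ x1, family = poisson, data = d)
m2 <- glm(y ~ x1 + x2, family = poisson, data = d)

# Akaike weights (one common choice; an assumption here)
aic <- c(AIC(m1), AIC(m2))
w <- exp(-0.5 * (aic - min(aic)))
w <- w / sum(w)

# Averaged coefficients, with x2 set to 0 in the model that omits it
b_avg <- w[1] * c(coef(m1), x2 = 0) + w[2] * coef(m2)

# Linear predictor computed from the averaged coefficients
X <- model.matrix(~ x1 + x2, data = d)
eta_from_coef <- as.numeric(X %*% b_avg)

# Weighted average of each model's predictions, on each scale
eta_avg <- as.numeric(w[1] * predict(m1, type = "link") +
                      w[2] * predict(m2, type = "link"))
mu_avg <- as.numeric(w[1] * predict(m1, type = "response") +
                     w[2] * predict(m2, type = "response"))

all.equal(eta_from_coef, eta_avg)       # link scale: the two routes agree
max(abs(exp(eta_from_coef) - mu_avg))   # response scale: generally not zero
```

With a log link, the response-scale average is a weighted arithmetic mean of the models' fitted means, while back-transforming the averaged linear predictor gives their weighted geometric mean, so the former is never smaller than the latter.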

# On alpha

This post is motivated by Terry McGlynn’s thought-provoking How do we move beyond an arbitrary statistical threshold? I have been struggling with the ideas explored in Terry’s post ever since starting my PhD 30 years ago, and it’s only been in the last couple of years that my own thoughts have begun to gel. This long marination period is largely because of my very classical biostatistical training. My PhD is from the Department of Anatomical Sciences at Stony Brook, but the content was geometric morphometrics, and James Rohlf was my mentor for morphometrics specifically and multivariate statistics more generally.

# Combining data, distribution summary, model effects, and uncertainty in a single plot

A Harrell plot combines a forest plot of estimated treatment effects and uncertainty, a dot plot of raw data, and a box plot of the distribution of the raw data into a single plot. A Harrell plot encourages best practices such as exploration of the distribution of the data and focus on effect size and uncertainty, while discouraging bad practices such as ignoring distributions and focusing on $$p$$-values. Consequently, a Harrell plot should replace the bar plots and Cleveland dot plots that are currently ubiquitous in the literature.

#### R doodles. Some ecology. Some physiology. Much fake data.

Thoughts on R, statistical best practices, and teaching applied statistics to Biology majors

Jeff Walker, Professor of Biological Sciences

University of Southern Maine, Portland, Maine, United States