# What to write, and not write, in a results section — an ever-growing list

“GPP (n=4 per site) increased from the No Wildlife site to the Hippo site but was lowest at the Hippo + WB site (Fig. 6); however, these differences were not significant due to low sample sizes and high variability.” – Subalusky, A.L., Dutton, C.L., Njoroge, L., Rosi, E.J., and Post, D.M. (2018). Organic matter and nutrient inputs from large wildlife influence ecosystem function in the Mara River, Africa. Ecology 99, 2558–2574.

# Paired line plots

ggplot scripts to draw figures like those in the Dynamic Ecology post “Paired line plots (a.k.a. “reaction norms”) to visualize Likert data”.

Load libraries:

    library(ggplot2)
    library(ggpubr)
    library(data.table)

Make some fake data:

    set.seed(3)
    n <- 40
    self <- rbinom(n, 5, 0.25) + 1
    others <- self + rbinom(n, 3, 0.5)
    fd <- data.table(id=factor(rep(1:n, 2)),
                     who=factor(rep(c("self", "others"), each=n)),
                     stigma=c(self, others))

Make a plot with ggplot:

The students are identified by the column “id”.
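The excerpt stops before the plotting code, so here is a minimal sketch of what a paired line plot of this fake data could look like. It is not the post's script. It assumes only `ggplot2`, rebuilds the fake data as a plain `data.frame`, and the transparency and theme are my choices.

```r
library(ggplot2)

# Rebuild the fake data from the post (plain data.frame instead of data.table)
set.seed(3)
n <- 40
self <- rbinom(n, 5, 0.25) + 1
others <- self + rbinom(n, 3, 0.5)
fd <- data.frame(
  id = factor(rep(1:n, 2)),
  # put "self" first on the x-axis (assumed ordering)
  who = factor(rep(c("self", "others"), each = n), levels = c("self", "others")),
  stigma = c(self, others)
)

# One line per student: group = id connects each student's two responses.
# Transparency makes overlapping lines (common with Likert data) look darker.
gg <- ggplot(fd, aes(x = who, y = stigma, group = id)) +
  geom_line(alpha = 0.3) +
  geom_point(alpha = 0.3) +
  theme_minimal()
gg
```

Because Likert responses take only a few values, many lines overlap exactly. Jittering or line width proportional to count are alternatives to transparency.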

# GLM vs. t-tests vs. non-parametric tests if all we care about is NHST

This post has been updated. A skeleton simulation of different strategies for NHST for count data if all we care about is a p-value, as in bench biology, where p-values are used simply to give one confidence that something didn’t go terribly wrong (similar to doing experiments in triplicate – it’s not the effect size that matters, only “we have experimental evidence of a replicable effect”). tl;dr - At least for Type I error at small n, log(response) and Wilcoxon have the best performance over the simulation space.
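To make the idea concrete, here is a rough sketch of a Type I error simulation for count data. It is not the simulation from the post. The negative binomial generator, the parameter values, the `log(y + 1)` transform (to avoid `log(0)`), and the helper `safe_p` are all assumptions made for illustration.

```r
set.seed(1)
n_sim <- 2000  # number of simulated experiments (assumed)
n <- 5         # sample size per group (assumed)
mu <- 10       # same mean count in both groups, so the null is true (assumed)
theta <- 1     # negative binomial size (dispersion) parameter (assumed)

# hypothetical helper: return NA instead of stopping if a test throws an error
safe_p <- function(expr) tryCatch(expr, error = function(e) NA_real_)

p_values <- t(replicate(n_sim, {
  y1 <- rnbinom(n, size = theta, mu = mu)
  y2 <- rnbinom(n, size = theta, mu = mu)
  c(
    t_raw = safe_p(t.test(y1, y2, var.equal = TRUE)$p.value),
    t_log = safe_p(t.test(log(y1 + 1), log(y2 + 1), var.equal = TRUE)$p.value),
    wilcoxon = safe_p(wilcox.test(y1, y2, exact = FALSE)$p.value)
  )
}))

# Type I error: fraction of p-values below 0.05 when there is no true effect
type_1 <- colMeans(p_values < 0.05, na.rm = TRUE)
type_1
```

A fuller comparison would add a negative binomial or quasi-Poisson GLM and vary `n`, `mu`, and `theta` over a grid.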

# Expected covariances in a causal network

This is a skeleton post.

Standardized variables (Wright’s rules)

    n <- 10^5
    # z is the common cause of g1 and g2
    z <- rnorm(n)
    # effects of z on g1 and g2
    b1 <- 0.7
    b2 <- 0.7
    r12 <- b1*b2
    g1 <- b1*z + sqrt(1-b1^2)*rnorm(n)
    g2 <- b2*z + sqrt(1-b2^2)*rnorm(n)
    var(g1) # E(VAR(g1)) = 1
    ## [1] 1.007478
    var(g2) # E(VAR(g2)) = 1
    ## [1] 1.004495
    cor(g1, g2) # E(COR(g1,g2)) = b1*b2
    ## [1] 0.
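The expected values in the code comments follow from a short derivation. Here $e_1$ and $e_2$ are my labels for the two independent `rnorm(n)` noise terms; they are not named in the post. They have unit variance and are independent of $z$ and of each other.

$$
\begin{aligned}
\mathrm{Var}(g_1) &= b_1^2\,\mathrm{Var}(z) + (1 - b_1^2)\,\mathrm{Var}(e_1) = b_1^2 + (1 - b_1^2) = 1 \\
\mathrm{Cov}(g_1, g_2) &= b_1 b_2\,\mathrm{Var}(z) = b_1 b_2 \\
\mathrm{Cor}(g_1, g_2) &= \frac{\mathrm{Cov}(g_1, g_2)}{\sqrt{\mathrm{Var}(g_1)\,\mathrm{Var}(g_2)}} = b_1 b_2 = 0.7 \times 0.7 = 0.49
\end{aligned}
$$

The same argument gives $\mathrm{Var}(g_2) = 1$, which is why the expected correlation equals the product of the two path coefficients.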

# Compute a random data matrix (fake data) without rmvnorm

This is a skeleton post until I have time to flesh it out. The post is motivated by a question on Twitter about creating fake data that has a covariance matrix that simulates a known (given) covariance matrix that has one or more negative (or zero) eigenvalues.

First, some libraries

    library(data.table)
    library(mvtnorm)
    library(MASS)

Second, some functions…

    random.sign <- function(u){
      # this is fastest of three
      out <- sign(runif(u)-0.5) #randomly draws from {-1,1} with probability of each = 0.
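One way to approach the problem, sketched here with base R only, is to eigendecompose the target matrix, clip negative eigenvalues to zero, and transform independent standard normal draws. This is not necessarily the approach the post takes. The target matrix `S` is a made-up example with one negative eigenvalue.

```r
set.seed(1)

# hypothetical target: symmetric with unit diagonal, but not a valid covariance
# matrix (its eigenvalues are 1.9, 1.9 and -0.8)
S <- matrix(c( 1.0, 0.9, -0.9,
               0.9, 1.0,  0.9,
              -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)

e <- eigen(S, symmetric = TRUE)
lambda <- pmax(e$values, 0)  # clip negative eigenvalues to zero

# nearest-in-spirit valid covariance matrix implied by the clipped eigenvalues
S_psd <- e$vectors %*% diag(lambda) %*% t(e$vectors)

# fake data: rows of Z are independent standard normal vectors,
# and X = Z sqrt(Lambda) V' has expected covariance V Lambda V' = S_psd
n <- 10^4
Z <- matrix(rnorm(n * ncol(S)), nrow = n)
X <- Z %*% diag(sqrt(lambda)) %*% t(e$vectors)

round(cov(X), 2)  # compare with round(S_psd, 2)
```

The sample covariance of `X` approximates `S_psd`, not `S`, because no real data can have a covariance matrix with a negative eigenvalue. Clipping also changes the diagonal, so rescaling would be needed if unit variances matter.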

# Reporting effects as relative differences...with a confidence interval

Researchers frequently report results as relative effects, for example, “Male flies from selected lines had 50% larger upwind flight ability than male flies from control lines (Control mean: 117.5 cm/s; Selected mean 176.5 cm/s).” where a relative effect is

$$100 \frac{\bar{y}_B - \bar{y}_A}{\bar{y}_A}$$

If we are to follow best practices, we should present this effect with a measure of uncertainty, such as a confidence interval. The absolute effect is 59.
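One simple way to attach an interval to a relative effect is a percentile bootstrap, sketched below. This is only an illustration and may not be the method the post settles on. The data are fake: the group means come from the quoted example, while the sample size, the standard deviation, and the function name `relative_effect` are assumptions.

```r
set.seed(1)
n <- 20  # flies per group (assumed)
control  <- rnorm(n, mean = 117.5, sd = 30)  # sd is an assumption
selected <- rnorm(n, mean = 176.5, sd = 30)

# relative effect: 100 * (mean_B - mean_A) / mean_A
relative_effect <- function(a, b) 100 * (mean(b) - mean(a)) / mean(a)

estimate <- relative_effect(control, selected)

# percentile bootstrap: resample each group independently, with replacement
n_boot <- 5000
boot <- replicate(n_boot, relative_effect(
  sample(control, replace = TRUE),
  sample(selected, replace = TRUE)
))
ci <- quantile(boot, c(0.025, 0.975))

estimate
ci
```

Resampling both groups matters because the control mean appears in the numerator and the denominator, so its uncertainty enters the ratio twice.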

# Interaction plots with ggplot2

ggpubr is a fantastic resource for teaching applied biostats because it makes ggplot a bit easier for students. I’m not super familiar with all that ggpubr can do, but I’m not sure it includes a good “interaction plot” function. Maybe I’m wrong. But if I’m not, here is a simple function to create a gg_interaction plot. The gg_interaction function returns a ggplot of the modeled means and standard errors and not the raw means and standard errors computed from each group independently.
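To illustrate the distinction between modeled and raw standard errors, here is a rough sketch. It is not the post's `gg_interaction` function. The fake 2 x 2 data, the factor names, and the effect sizes are invented for illustration, and only `ggplot2` and base R are assumed.

```r
library(ggplot2)

set.seed(1)
n <- 10  # replicates per cell (assumed)
fd <- expand.grid(rep = 1:n,
                  genotype = c("wt", "ko"),
                  treatment = c("control", "drug"))
ko <- fd$genotype == "ko"
drug <- fd$treatment == "drug"
# made-up effects, including an interaction of 3
fd$y <- 10 + 2 * ko + 1 * drug + 3 * (ko & drug) + rnorm(nrow(fd))

# factorial linear model; the cell means and SEs come from this fit
fit <- lm(y ~ genotype * treatment, data = fd)

cells <- expand.grid(genotype = levels(fd$genotype),
                     treatment = levels(fd$treatment))
pred <- predict(fit, newdata = cells, se.fit = TRUE)
cells$mean <- pred$fit
cells$se <- pred$se.fit  # modeled SE, based on the pooled residual variance

pd <- position_dodge(0.2)
gg <- ggplot(cells, aes(x = treatment, y = mean,
                        color = genotype, group = genotype)) +
  geom_line(position = pd) +
  geom_point(position = pd) +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se),
                width = 0.1, position = pd)
gg
```

In this balanced design the modeled standard errors are identical across the four cells, whereas standard errors computed separately within each group would generally differ.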

#### R doodles. Some ecology. Some physiology. Much fake data.

Thoughts on R, statistical best practices, and teaching applied statistics to Biology majors. Re-blogged at R-bloggers

Jeff Walker, Professor of Biological Sciences

University of Southern Maine, Portland, Maine, United States