Tuesday, April 19, 2016

Main effects vs. interaction

To find an example of bad statistical analysis in behavioral neuroscience, I turned to an article published in Nature Neuroscience (2011) by Nieuwenhuis, Forstmann & Wagenmakers. This article documented the most prevalent incorrect statistical analyses observed in Science, Nature, Nature Neuroscience, Neuron and The Journal of Neuroscience. Here, I will give an example of one of the more prevalent statistical errors in published papers: comparing p-values between two groups to argue that a treatment had a significant effect in one group but not in the other.

For example, "Optogenetic photoinhibition of the locus coeruleus decreased the amplitude of the target-evoked P3 potential in virally transduced animals (P = 0.012), but not in control animals (P = 0.3)."

Recently we were taught how to use a two-way repeated measures ANOVA to assess main effects and their interaction. In this example, the authors should have performed this analysis and then checked for a significant interaction between the two factors: viral group and photoinhibition status. Instead, this example of bad statistics compares only p-values representing main effects. Here, P = 0.012 corresponds to the asterisk above the bar graph showing reduced P3 amplitude in virally transduced mice. This asterisk does NOT indicate a significant interaction. In other words, it does not show that photoinhibition reduces P3 amplitude significantly more in the virally transduced mice than in the control mice.
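To make the point concrete, here is a rough sketch with made-up numbers. They are not data from the quoted sentence or from any of the surveyed papers; I simply assume a P3 reduction of 25 units (standard error 10) in the transduced group and 10 units (standard error 10) in controls, treat the two groups as independent, and use a normal approximation instead of the full ANOVA. The helper names are mine.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a z statistic under a standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))

def difference_test(effect_a, se_a, effect_b, se_b):
    """z test on the difference between two independent effect estimates."""
    diff = effect_a - effect_b
    # Variances of independent estimates add.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    return diff, se_diff, z, two_sided_p(z)

# Hypothetical effects of photoinhibition on P3 amplitude (assumed values).
effect_transduced, se_transduced = 25.0, 10.0
effect_control, se_control = 10.0, 10.0

# Testing each group on its own: one "significant", one not.
p_transduced = two_sided_p(effect_transduced / se_transduced)
p_control = two_sided_p(effect_control / se_control)

# Testing the difference between the two effects, which is what
# the interaction term is asking about.
diff, se_diff, z_diff, p_interaction = difference_test(
    effect_transduced, se_transduced, effect_control, se_control
)

print(f"transduced alone: p = {p_transduced:.3f}")
print(f"control alone:    p = {p_control:.3f}")
print(f"difference:       {diff:.1f} +/- {se_diff:.1f}, p = {p_interaction:.3f}")
```

With these assumed numbers, the two within-group p-values fall on opposite sides of 0.05, yet the test on the difference between the effects is nowhere near significant. That is the pattern the interaction test is meant to catch.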

1 comment:

  1. Great post, Zibby. Taking your point even further, each t test was testing a null hypothesis about the P3 amplitude in a separate group. Not only does that mean these tests are unrelated (the point you made), but comparing them is mathematically dubious. For the virally infected group, assuming a one-tailed test, the null hypothesis would be that P3 amplitude in controls is equal to or more than in the virally infected group following photoinhibition. The p value that was determined represents the probability of obtaining a result equal to or more extreme than the one observed. If the null hypothesis were assumed to be true for each test, then the resulting p value simply arises from chance. If the experiment were repeated, there would be no expectation that the same p values would be seen. Under the assumptions of null hypothesis testing, p values are uniformly distributed when the null is true, so any p value would be equally likely on a second test.