Wednesday, March 23, 2016

Error Bar Misinterpretation

Nature Methods published an article fairly recently that explains common misconceptions surrounding error bars. If you're like me, you thought error bars were something you could look at and easily understand. Don't they just represent the variation between my replicate samples? Well, first, what are you using your error bars to represent: standard deviation (SD), standard error of the mean (SEM), or a confidence interval (CI)? The article, which provides interactive supplementary data where you can see the raw data used in the discussion as well as generate your own, gives examples of how the same data can look very different (or even not significant) depending on which type of error bar is shown.

A very common misconception is that a gap between error bars means the difference is statistically significant, while overlapping bars mean it is not. That is not the case, and again, it all depends on the type of bars you choose to use.

As an example, figure 1 (above, n=10) shows why error bars of different types cannot be compared directly. The left graph shows what happens to the p-value when the SD, SEM, and 95% CI error bars are adjusted to the same length. The right graph shows what happens to the size of the error bars when the data are adjusted to give a significant p-value (p=0.05). As you can tell, a gap between SEM error bars does not indicate significance, and overlapping SD error bars do not mean that the difference is not significant.
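To make this concrete, here is a rough Python sketch of the "bars just touching" case. It is not the article's own calculation. It assumes two groups of n=10 with equal SDs compared by an ordinary pooled two-sample t-test, and it hardcodes the standard two-sided 95% critical t value for 9 degrees of freedom (2.262). The function names are made up for illustration, and the p-value is obtained by numerically integrating the t density so that only the standard library is needed.

```python
import math

N = 10               # replicates per group, matching n=10 in the figure
DF_TEST = 2 * N - 2  # degrees of freedom of the pooled two-sample t-test
T_CRIT_DF9 = 2.262   # two-sided 95% critical t for N - 1 = 9 df (table value)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=20000):
    """Two-sided p-value: 1 - 2 * (area under the density from 0 to |t|).

    The area is approximated with Simpson's rule (steps must be even).
    """
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


def p_when_bars_touch(half_width_in_sd):
    """p-value when two equal error bars of this half-width (in SD units) just touch."""
    mean_gap = 2 * half_width_in_sd  # distance between the two means, in SD units
    se_diff = math.sqrt(2 / N)       # standard error of the difference, in SD units
    return two_sided_p(mean_gap / se_diff, DF_TEST)


# Half-width of each bar type, expressed in units of the (shared) SD.
half_widths = {
    "SD": 1.0,
    "SEM": 1 / math.sqrt(N),
    "95% CI": T_CRIT_DF9 / math.sqrt(N),
}

p_values = {name: p_when_bars_touch(w) for name, w in half_widths.items()}
for name, p in p_values.items():
    print(f"{name} bars just touching: p = {p:.4f}")
```

Under these assumptions, SEM bars that just touch correspond to a p-value of roughly 0.17 (not significant), 95% CI bars that just touch to roughly 0.005, and SD bars that just touch to well below 0.001. The same visual gap means very different things depending on the bar type.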

When showing data with error bars, it is important to state clearly which measure of uncertainty they represent so that the reader can interpret the results properly.

Short summary:
SD: represents variation of the data and not the error of your measurements.
SEM: represents uncertainty in the mean and its dependency on the sample size.
CI: represents an interval estimate indicating the reliability of a measurement.
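
If it helps to see how the three relate, here is a small Python sketch that computes all of them for one set of replicates. The ten values are made-up numbers for illustration only, the function name is hypothetical, and the 95% CI uses the t-based interval with the table value 2.262 for 9 degrees of freedom, so the sketch only applies to n=10 as written.

```python
import math
import statistics

T_CRIT_DF9 = 2.262  # two-sided 95% critical t for 9 df (table value, n = 10 only)


def summarize(sample):
    """Return mean, SD, SEM, and 95% CI half-width for a sample of 10 values."""
    n = len(sample)
    if n != 10:
        raise ValueError("T_CRIT_DF9 is only valid for n = 10")
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)    # sample SD (n - 1 in the denominator)
    sem = sd / math.sqrt(n)          # shrinks as the sample size grows
    ci_half = T_CRIT_DF9 * sem       # half-width of the 95% CI around the mean
    return {"mean": mean, "sd": sd, "sem": sem, "ci_half": ci_half}


# Made-up replicate measurements, for illustration only.
replicates = [4.8, 5.1, 5.3, 4.9, 5.6, 5.0, 4.7, 5.2, 5.4, 5.0]
stats = summarize(replicates)

print(f"mean   = {stats['mean']:.3f}")
print(f"SD     = {stats['sd']:.3f}")
print(f"SEM    = {stats['sem']:.3f}")
print(f"95% CI = {stats['mean'] - stats['ci_half']:.3f} to {stats['mean'] + stats['ci_half']:.3f}")
```

For the same data, the SEM bar is always the shortest of the three when n=10, the 95% CI bar is a bit more than twice as long as the SEM bar, and the SD bar is the longest. That is why a plot has to say which one it is showing.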