Plain and simple, confidence intervals allow us to express the accuracy of our data. They're suitable for almost any type of biological measurement because you need only three things: the sample mean, the standard deviation, and the sample size. Once you decide what degree of confidence is acceptable for the experiment, you can use those three values to calculate a confidence interval (CI). 95% confidence is standard, though some fields, such as astrophysics or nuclear-weapons engineering, demand a much stricter confidence level. If you choose a conventional CI of 95%, then you are stating the range of values within which there is a 95% chance that the true population value lies.

The example of weather brought up in the Garfield comic is an easy one to think about in this context. If you want to state a high-confidence interval, then you must have many measurements, a small standard deviation in your measurements, or a willingness to accept a very large range of values as your interval. Sure, you can say confidently that the temperature in Atlanta, GA is between -40°F and 200°F at any given time, but a range that wide is meteorologically useless. Say, instead, that you measure the temperature once an hour for 24 hours, so your sample size equals 24. Perhaps your sample mean is 72°F with a standard deviation of ±7°F. Even though your calculated sample mean is exactly 72°F, you didn't measure the temperature continuously all day long, so there's a very real chance that your mean is an inaccurate representation of the actual fluctuations that occurred, particularly if the temperature quickly rose and then dropped again, or vice versa. So you can calculate that you are 95% confident the **true** temperature mean for the past 24 hours is between 69.2°F and 74.8°F.
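The arithmetic behind those numbers can be sketched in a few lines of Python. This is a minimal sketch using the normal approximation (the 69.2°F–74.8°F range above implies a z-multiplier of about 1.96); the function name is my own, and for a sample of only 24 a t-distribution would give a slightly wider interval.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(mean, sd, n, level=0.95):
    """Normal-approximation CI: mean +/- z * sd / sqrt(n).
    For small n, a t-distribution gives a slightly wider interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # ~1.96 for 95%
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin

# The temperature example: 24 hourly readings, mean 72°F, sd 7°F
lo, hi = confidence_interval(72, 7, 24)
print(f"95% CI: {lo:.1f}°F to {hi:.1f}°F")  # 69.2°F to 74.8°F
```

Notice that the margin shrinks as the sample size grows and swells with the standard deviation, which is exactly the trade-off described above.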
You must make many assumptions about your data in order to calculate a confidence interval correctly. Namely, you assume that your sample is a random representation of the population, that your data are independent and unbiased, and that your data were accurately obtained. In the scenario I outlined, a faulty thermometer that reads 5 degrees warmer than the actual temperature would ruin the accuracy of your CI. The reality is that biology isn't always as easy to interpret as the temperature scenario, nor is it always obvious when there are problems with the data. What if your project is so novel that you don't even know that saying the value could be between -40 and 200 is ludicrous? Maybe the value should be between 60 and 80, but the time frame in which you're capturing your data is so large that unknown biology is skewing your results and causing a large standard deviation.

"95% confident" sounds like such a sure thing, but it's important to think critically about what these values are telling you. Is the interval actually biologically relevant?

You give a great example for understanding confidence intervals. I think you make a great point that it is important to understand what the numbers mean in order to determine whether the confidence interval is giving biologically relevant information.

I think the example in the comic is really helpful for understanding what confidence intervals actually mean. For some reason I always have trouble remembering that the higher the percentage in a confidence interval, the wider the interval has to be. The ridiculously wide confidence interval in the comic does a good job of illustrating this. The percentage for a confidence interval between -40 and 200 degrees is probably very close to 100%, which is why this interval has to be so wide.
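The point about higher percentages forcing wider intervals can be checked numerically. A quick sketch under the normal approximation: the margin of error scales linearly with the z-multiplier, and the multiplier grows as the confidence level climbs toward 100%.

```python
from statistics import NormalDist

# z-multipliers for common confidence levels; the interval width is
# proportional to z, so higher confidence always means a wider interval.
for level in (0.90, 0.95, 0.99, 0.999):
    z = NormalDist().inv_cdf(0.5 + level / 2)
    print(f"{level:.1%} confidence -> z = {z:.2f}")
```

The multipliers climb from about 1.64 at 90% to about 3.29 at 99.9%, so a near-100% interval like -40°F to 200°F has to be enormous.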

My PI always says, "You should be able to look at the data and see the difference." Sometimes I think that I should put confidence intervals on all of my data and end my statistical analyses there. Then when I show my PI the data, she can truly "just look at the data" and tell whether there is a "statistical" difference or not.
