Prior to taking this class, I had a lengthy conversation with my PI about statistics.
We debated which statistical methods were appropriate for our experiments. She opted for the classic t-test, and I opted for anything but that.
During this debate she would often throw out statements like,
“We don’t base our conclusions solely on whether or not something is significant.”
“We should be able to tell if a result is significant or not just by looking at the data.”
“We can’t publish without stats.”
“Even if a result is significant, it doesn’t matter if it doesn’t have any biological relevance.”
Looking back on this debate now, I realize my PI was, and still is, a follower of the BOT.
BOT stands for the “Bloody Obvious Test,” a term coined in 1987 by Ian Kitchen. Kitchen noted that there was pressure from journals to use statistics and that p-hacking was a problem.
“but it does seem that too often we labour over their (statistics) use unnecessarily and indeed on other occasions we manipulate them to prove a very thin point.” –Ian Kitchen
Because of these issues, Kitchen proposed the use of the “Bloody Obvious Test.”
The protocol for the BOT is as follows:
Question #1: “Is it bloody obvious that the values are different?”
Answer: Yes. The test is positive, proceed to “Go” and collect $200.
Answer: No. Proceed to question number 2.
Question #2: “Am I making a mountain out of a molehill?”
Kitchen really wanted to drive home the point that statistics were being abused to appease “the gods of statistics” who happened to frequently sit on journal review boards. He wanted to remind scientists that sometimes the easiest and most obvious answer is the right answer. Lastly, he wanted scientists to recognize that statistical significance doesn’t always equal scientific significance.
Sadly, Kitchen didn’t stop these issues from persisting in science today. Scientists are still appeasing “the gods of statistics” because to be successful in science, you have to publish.
As the reality of science publishing seems unlikely to change and the pressure to include stats continues, I propose we optimize the BOT with confidence intervals.
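Confidence intervals are a statistical tool that provides a range in which the true population value may lie. Traditionally, we set confidence intervals at 95%. A 95% confidence interval does not mean there is a 95% chance that one particular interval contains the true population parameter. It means that if we repeated the experiment many times and built an interval each time, about 95% of those intervals would contain the true population parameter of interest.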
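To make the 95% figure concrete, here is a minimal simulation sketch in Python. It draws many samples from a population whose mean we already know, builds a 95% interval from each sample, and counts how often the interval captures that mean. The function names (z_interval, coverage), the population values, and the assumption that the population standard deviation is known are all mine, chosen to keep the example short. With real data you would usually estimate the spread from the sample and use a t-based interval instead.

```python
import random
from statistics import NormalDist, mean


def z_interval(sample, sigma, level=0.95):
    """Interval for the mean of a normal population.

    Assumes the population standard deviation `sigma` is known,
    which is a simplification made for this illustration.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * sigma / len(sample) ** 0.5
    centre = mean(sample)
    return centre - half_width, centre + half_width


def coverage(true_mean=10.0, sigma=2.0, n=8, trials=2000, seed=1):
    """Fraction of simulated experiments whose interval contains true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # One simulated "experiment": n measurements from the population.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma)
        hits += low <= true_mean <= high
    return hits / trials


if __name__ == "__main__":
    print(f"Intervals containing the true mean: {coverage():.1%}")
```

Run as a script, this should print a coverage close to 95%. That long-run hit rate across repeated experiments is what the confidence level describes.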
The addition of CIs would add statistical robustness to the BOT that might appease “the gods of statistics.” It also wouldn’t detract from the initial step of the BOT. We could still ask Question #1 without a pesky p-value getting in the way of our conclusion. Instead, confidence intervals would be to the BOT “as a drunk uses a lamp-post; for support rather than illumination.”