Statistical significance is present in the everyday life of most people: weather predictions, political campaigns, medical studies, quality testing, insurance and the stock market. Most people use statistics unconsciously, noticing patterns in daily circumstances and drawing conclusions based on those patterns. On a greater scale, researchers use statistics to represent their data in a meaningful way.
But what does “significant” mean?
If you opened a dictionary you would find definitions such as “important” or “meaningful”, but saying that research results are significant does not mean that they are important. Rather, a statistically significant result means that the difference seen between two groups is real and not due to chance. In other words, a false rejection of the null hypothesis will occur by chance only below a certain probability, conventionally set at 5%.
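The 5% threshold can be made concrete with a quick simulation: if the null hypothesis is actually true, roughly 5% of experiments will still produce p < 0.05 purely by chance. Below is a minimal sketch, assuming a two-sample z-test with a known standard deviation of 1 and hypothetical group sizes of 50 (the function name and all numbers are illustrative, not from the article):

```python
import math
import random

def z_test_p(a, b, sd=1.0):
    """Two-sided p-value for a two-sample z-test with known standard deviation."""
    se = sd * math.sqrt(1 / len(a) + 1 / len(b))
    z = abs(sum(a) / len(a) - sum(b) / len(b)) / se
    # Normal tail probability via the error function
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

random.seed(42)
trials = 2000
# Both groups drawn from the SAME distribution: the null hypothesis is true,
# so every "significant" result here is a false positive.
false_positives = sum(
    z_test_p([random.gauss(0, 1) for _ in range(50)],
             [random.gauss(0, 1) for _ in range(50)]) < 0.05
    for _ in range(trials)
)
print(false_positives / trials)  # close to 0.05, by construction
```

The false-positive rate hovers around 0.05 precisely because that is what the threshold controls: how often chance alone gets called “significant”.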
It is still unclear where the 5% threshold originated, but the most commonly cited source is the discussion published by Fisher in 1926 on the theoretical basis of experimental design.1
The real question is: what does this p-value tell us in terms of significance in research?
When conducting studies, researchers should keep in mind three main points:
1. The dichotomization of p-values into “significant” and “non-significant” leads to a loss of important information. Two results might both be significant, but that does not imply that their effects are the same.
2. Statistical significance is not directly linked to clinical significance. Because statistical tests are influenced by sample size, a significant study does not always mean that the outcome is clinically meaningful. A large study might be significant without being clinically relevant, while a small study can have a clinically important outcome without being statistically significant.
3. Although it is tempting to rely only on p-values, the weight that researchers give to them should not be overemphasized. The most important questions remain qualitative: study design, sample type, patient selection and bias.
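The second point above can be illustrated numerically. Using a two-sided z-test with a known standard deviation of 1 (a deliberate simplification; the effect sizes and group sizes below are hypothetical), a clinically trivial difference becomes “significant” in a huge trial, while a large difference in a tiny trial does not:

```python
import math

def z_test_p(diff, n, sd=1.0):
    """Two-sided p-value for a mean difference `diff` between two groups
    of size n each, assuming a known standard deviation (illustrative z-test)."""
    z = abs(diff) / (sd * math.sqrt(2 / n))
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

# Huge trial, clinically trivial 0.1-unit effect: statistically significant
print(z_test_p(diff=0.1, n=10_000))  # far below 0.05

# Tiny trial, large 0.8-unit effect: not statistically significant
print(z_test_p(diff=0.8, n=10))      # above 0.05
```

The p-value mixes effect size with sample size, which is exactly why statistical significance cannot stand in for clinical relevance.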
Nowadays, we are overwhelmed by advertising for weight-loss pills, miraculous anti-wrinkle creams and all kinds of aesthetic treatments claiming that you will get significant results based on data collected in clinical trials. What they conveniently forget to mention is what they truly mean by “significant”.
1. Fisher RA. The arrangement of field experiments. J Ministry Agric. 1926;33:503–513.
The ability of scientists to create significance out of nothing significant is a universal problem within the field! A habit has formed among scientists that obtaining significant results equates to doing significant research. Therefore, the pressure to find something significant, all the while neglecting the quality of the research design, sample type and size, is leading to a "significant" era of science that is not significant at all. We must remember that the p-value is defined by a probability of error; more experiments, more stringent experimental design, and larger sample sizes are necessary to answer the question of what is truly significant! Great article, Camilla!
I agree with this. When doing the bad stats assignment, it was almost too easy to find statistical errors in every paper that I encountered (disclaimer: I do not know if there is statistically significant proof that every paper harbors stats errors). It is almost scary to me that published results are out there for the public without good science and good stats. Like you were saying, setting an arbitrary p-value threshold is an exercise we should stop practicing. But how do we fix this problem? Should we remove the term "statistically significant"? It does scare me that some of these papers lead to clinical trials that fail based on bad stats from pre-clinical data. Great insight and great article!