As scientists we like numbers, and simple yet clear tests that give us insight into a biological process. This is also true when it comes to interpreting the statistical significance of our data (the numbers we produced from those simple yet clear tests). For this reason, many scientific papers rely on the p-value as a “clear” way to say, “Look, our data is interesting! Now let’s move on.” Harvey Motulsky writes, “P values and conclusions about statistical significance can be useful, but there is more to statistics than P values and asterisks.” Instead, he suggests a focus on effect size, because that will tell you whether your statistically significant result is scientifically or clinically impactful. Additionally, sample size can have a huge impact on the P value, something many readers of scientific papers do not take into account. It is important to remember that a larger sample size can make a result appear more significant even when the mean and standard deviation are the same as in a smaller data set with a larger P value.
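To make the sample-size point concrete, here is a minimal sketch with made-up summary statistics, assuming a two-sample comparison with equal group sizes and using a normal approximation to the t distribution (illustrative only, not a substitute for a proper t-test):

```python
import math

def approx_p_value(mean_diff, sd, n):
    """Approximate two-sided p-value for comparing two groups of size n
    with the given mean difference and common standard deviation,
    using a normal approximation to the test statistic."""
    se = sd * math.sqrt(2.0 / n)                       # standard error of the difference
    t = mean_diff / se                                 # test statistic grows with sqrt(n)
    phi = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))   # standard normal CDF
    return 2.0 * (1.0 - phi)

# Identical effect (difference of 0.5 units, SD of 1.0), different sample sizes:
p_small = approx_p_value(0.5, 1.0, n=10)
p_large = approx_p_value(0.5, 1.0, n=100)
print(f"n=10:  p = {p_small:.3f}")
print(f"n=100: p = {p_large:.5f}")
```

Only the sample size changed between the two calls, yet the verdict flips from “not significant” (p above 0.05) to “highly significant” (p well below 0.01) — the underlying biology is exactly the same.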
Recently, arguments against P values have started to gain traction in the scientific community, as the journal Basic and Applied Social Psychology banned the use of P values in its manuscripts. This is a great start toward addressing the flaws of a simple P value assessment of significance, but it does not get at the deeper issue. It is great that they have banned P values; what we scientists need, though, is a deeper understanding of why. As this comment published in Nature argues, the P value is just one aspect of the data pipeline that we are getting wrong. In reality, decisions about experimental design, randomization, sample size, and the choice of statistical test have a huge impact on the results of an experiment. This is why it is so important to think entirely through your experiment, and the way you will analyze it, before you start. I do not believe that simply banning P values will fix all our challenges in statistics, but it does bring awareness to all scientists that there needs to be a shift in how we approach significance. Perhaps we should change our thinking from “is there a difference?” to “how big is the difference, and will it be impactful?”
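That shift from “is there a difference?” to “how big is the difference?” has a standard numerical answer in effect-size measures such as Cohen’s d, the standardized mean difference. A minimal sketch with made-up summary statistics — note that, unlike a P value, the formula contains no sample size at all, so collecting more data cannot make the same effect look bigger:

```python
def cohens_d(mean1, mean2, pooled_sd):
    """Cohen's d: the difference between two group means, expressed in
    units of their pooled standard deviation. Sample size does not
    appear in the formula, unlike in a P value."""
    return (mean1 - mean2) / pooled_sd

# A 0.5-unit difference against an SD of 1.0 is a medium effect (d = 0.5),
# whether it was measured in 10 samples or 10,000.
d = cohens_d(5.5, 5.0, 1.0)
print(f"Cohen's d = {d:.2f}")
```

Reporting d (or a similar measure) alongside the P value is one simple way to answer the “how big?” question directly.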