Reproducibility has lately been a hot topic in science ethics. As I thought about our inherent biases and how we might work against human nature to overcome them, I found myself considering the system we are forced to work within. In a career that is competitive and driven by the old motto “publish or perish!”, there is a strong emphasis on both the quantity and the quality of publications. But therein lies the rub: who judges the quality of the work to be published? Peers who are, hypothetically, both the experts and the competition in your field? Publishers more experienced with 20th-century English literature than with the latest super-resolution microscopy techniques? So, as a community, we establish rules to govern what constitutes “quality” research: How many technical and biological replicates did you do? Were your results statistically significant?
When we create these rules and strain to fit them, valuable negative results are lost in lab notebooks as unpublishable. Worse, data are massaged until they fit a “significant” result. The journal Basic and Applied Social Psychology has gone so far as to outright ban p-values from its publications! Novella wrote that the use of p-values encourages lazy thinking, as if the magical value of p < 0.05 must mean your hypothesis is correct, when in reality that threshold is entirely arbitrary. The founders of PubPeer, an open and anonymous forum for discussing scientific work post-publication, describe the difficulties of publishing quality work as “chas[ing] ‘metrics’” in the search for impact while we are “ruled by often incompetent kingmakers”. These metrics are held up as the gold standard for success, but ultimately lead to the failure of good science in favor of winning the numbers game.
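To make the arbitrariness of p < 0.05 concrete, here is a minimal simulation sketch (my own illustration, not from any of the articles cited): run many experiments in which the null hypothesis is actually true, and count how often a “significant” result appears anyway. The sample size and number of trials are arbitrary choices for the demonstration.

```python
# Toy illustration of why "p < 0.05" cannot certify a hypothesis:
# even when NO real effect exists, about 1 in 20 experiments will
# clear the threshold by chance alone.
import math
import random

random.seed(0)  # fixed seed so the demonstration is repeatable

def null_experiment_p_value(n=30):
    """Draw n observations from a true-null normal(0, 1) population and
    return the two-sided p-value (z-test) for the mean differing from 0."""
    xs = [random.gauss(0, 1) for _ in range(n)]
    mean = sum(xs) / n
    z = mean / (1 / math.sqrt(n))  # standard error of the mean is 1/sqrt(n)
    # normal CDF via the error function: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

trials = 5000
false_positives = sum(null_experiment_p_value() < 0.05 for _ in range(trials))
frac = false_positives / trials
print(frac)  # hovers around 0.05: "significant" results from pure noise
```

In other words, the threshold guarantees a steady background rate of spurious “discoveries”; treating it as proof of a hypothesis is exactly the lazy thinking Novella describes.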
The statistics show the system is broken. According to an article in ASBMB Today by Jeremy Berg, 75–90% of important cancer research is not reproducible. Another article found that 25% of randomly selected publications in three cancer journals had major flaws in their imaging data. I cite this specifically because imaging data and statistics seem to me to be the least understood analyses. Reviewers fail to catch problems in these areas not only because researchers can manipulate the data, but also because reviewers often don’t understand the methods well enough to think critically about them.
The system is broken, yes, but reading these articles also made it clear that people are thinking deeply about the problems we face and about the best steps to fix them going forward.