As academic scientists, we are of course invested in our research, and (ideally) want to leave our own mark on our respective fields. The data that we generate and publish contribute not just to our personal edification, but also to the understanding of a topic on a global scale. With the wide accessibility of scientific journals, data are consistently reanalyzed and published findings applied to new experiments in a growing
international research network. Keeping this in mind, it is more critical than
ever before that the research community emphasize the importance of sound (and
complete) data and experimental reproducibility.
In his TED talk, Dan Ariely
discussed how it was more likely for everyone in a room to cheat a little than for just one person to cheat outright. Students would give themselves a “4” in
lieu of the “2” they deserved, presumably so that they would get a slightly
better reward while still maintaining a degree of self-respect. This is sadly
applicable in the research world, in the form of “cherry-picking” data or withholding negative findings from papers. Findings may be fudged or spun in a certain light, or an incomplete story presented: not nearly enough to warrant a retraction, but just enough that the results may not be entirely trustworthy. I
agree with Jared Horvath’s Scientific American article that funding places constant pressure on scientists to focus first on generating marketable data, and only second on generating complete or valid data. However, while funding is a legitimate concern, I do not think it is an excuse to perform unsound research or twist results.
Granted, this is easy for me to say as a graduate student with a guaranteed
stipend, but labs that produce questionable data do more than fail to
contribute to science; they actually detract from ongoing research. False data
can mislead other researchers who use these findings as a baseline for their own projects, which in turn could fail or lead to further misdirection. Withholding negative data could also lead other labs to pursue the same dead ends, wasting precious grant money rediscovering what should already be in the public domain.
After reading some of these articles, I think it should be easier than ever to make sure that research is well executed, given the formation of organizations such as the PLoS ONE New Reproducibility Initiative and PubPeer. While these opportunities should be taken with a grain of salt, they seem like a viable means for experts to help fact-check findings or ensure that results hold true. I do think there is a critical difference between difficult and irreproducible experiments, in that some procedures may have a low success rate because they demand a high level of technical skill or a specialized setup. However, if even a group of experts in the same field cannot
recapitulate a finding, something is likely at fault with the underlying
experimental strategy or the published data.
I agree with your assessment of the current state of scientific reproducibility (or the lack thereof). Though there is a lot of negative press regarding flagrant exaggerations, or worse, blatant forgeries of data, I believe the field is self-correcting. Though a large amount of time, effort, and resources is spent in the self-correction process, it still occurs. One of the most blatant instances of data forgery, the scandal at Duke University involving Anil Potti, is an example of science correcting itself: though the harmful effects of his forged data can never be overstated, in this instance ethical scientists reviewed his data and discovered that it was fraudulent. As a researcher who uses fruit flies, I have the benefit of working in a very collaborative scientific environment. It is customary for labs to share their fly crosses and other reagents with others in the community, and it is because of this that the fruit fly field is so self-correcting: people will not hesitate to speak up if they cannot reproduce your results. I believe that if science continues to be this collaborative, then we can still separate fraudulent data from real data.