As many others on this blog have already discussed, the
reproducibility crisis is a serious concern shared by many scientists. While some
blame the current culture of scientific achievement and others blame a
widespread misapplication of statistics, the exact reasons behind this crisis
are difficult to determine. In all of these discussions on scientific
reproducibility, the question still remains: how do we fix it?
One proposed solution is sharing data. As Jeff Leek
discusses in his article, there is much debate and fear about
the idea of sharing data for increased transparency and discovery. For many
years, scientific findings have been shared in journals, where researchers
present the results and interpretations of their studies via descriptions and
figures. This method has been the fundamental means of scientific progress,
allowing scientists to build on the discoveries of others. However,
it is increasingly debated whether papers are enough: in a highly
connected world, should scientists also share their raw data, allowing others
to truly dive into the analyses performed as well as search for new findings of
their own? While open-source data could open up a new world of discovery, there
are also potential risks: data-sharers could lose an advantage in their field
if others publish findings before them and without credit, and data-analyzers
could misinterpret or improperly use the dataset without proper
training. There are both pitfalls and advantages to data sharing, but as the
science community begins to acknowledge and address the reproducibility crisis,
open-source data is a viable solution.
What does open-source data look like in practice? In the
field of neuroscience, there are several organizations and research groups
pioneering data sharing. One such group, Neurodata Without Borders, attempts to
address the logistical problems of sharing data. One obstacle to open-source
data is that different research groups use very specialized techniques and
store data in various distinct ways that can be difficult for a potential data
analyst to understand. The Neurodata Without Borders pilot project attempts “to
develop a unified, extensible, open-source data format for cellular-based
neurophysiology data.” With a unified format, this organization aims to make
data-sharing accessible and practical for scientists across the globe. In
another pioneering effort to facilitate data sharing, a group of neuroscience
laboratories across the world recently came together to form the “International Brain Lab.” This lab is a large-scale collaboration and reproducibility project,
where laboratories in various locations will use the same tasks and protocols
to develop a standard model of neural processing. The International Brain Lab’s
“standard protocol attempts to address all possible sources of variability…. from
the mice’s diets to the timing and quantity of light they are exposed to each
day and the type of bedding they sleep on. Every experiment will be replicated
in at least one separate lab, using identical protocols, before its results and
data are made public.” With solutions such as these, perhaps the trend of
irreproducibility in science will be replaced with a more positive trend of
collaboration and unity in scientific discovery.