Big data, which refers to large volumes of diverse data created at a high rate, can be leveraged by scientists to gain knowledge previously inaccessible via traditional tools and techniques. The scientific field is rapidly advancing; researchers are performing increasingly complex experiments and generating exponentially larger data sets. Complex mathematical techniques for crunching big data have been developed, but these statistical techniques are not well understood by the greater scientific community (read more here). In fact, the McKinsey group asserts that “the United States
alone faces a shortage of 140,000 to 190,000 people with analytical expertise
and 1.5 million managers and analysts with the skills to understand and make
decisions based on the analysis of big data” (read
the full report here). This
uncertainty forces scientists either to (1) rely on outdated and often inappropriate
statistical techniques, or (2) apply new statistical techniques without formal
training, increasing the likelihood of errors.
I find myself at this crossroads, searching for a third path where I can
gain the statistical training I need to analyze the complex data I am
generating in my laboratory. As a graduate student in the field of
neuroscience, I expected my classes to prepare me adequately for everything I
would encounter at the bench. However, I have realized that I need additional training
in statistics and programming that goes beyond the scope of my required
coursework. This raises the question: how
many of my fellow graduate students feel the same way? I believe that more rigorous
statistical training should be required of all scientists, regardless of whether they are in the beginning stages of their training or approaching their retirement, because the future of science lies in big data.