I picked up my first pipette in 2009 in a virology lab that focused on a single domain of a single protein of a single virus. The majority of our assays were in vitro, and even cell culture was used sparingly. We essentially studied a single cog of a machine outside the context of that machine, and this was standard practice in the field.
Today I am sifting through massive piles of RNA sequencing, proteomics, and metabolomics data in an effort to see the machine more clearly. As our high-throughput and analytical methods have progressed, scientific research has gained not only the ability to investigate the intricacies of biology, but also the opportunity to examine the bigger picture. With that opportunity, however, come significant challenges. We can spend months chasing down artifacts or trying to piece together seemingly contradictory data with little success. Statistics is our only guide in this process, and many of us have only rudimentary training in the nuances of analysis necessary to navigate these mountains of data.
The days of “your favorite protein” may be coming to an end, replaced by a more complex and holistic approach at the bench. Amin et al. describe this new approach in an excellent article about systems biology and the rise of big data. They summarize the “omics” approach in the figure below, and argue that research must move away from focusing on any one aspect of biology and begin to integrate all approaches into the investigation of a scientific question.
With this more complex approach comes a need for more advanced statistics in graduate and even undergraduate education. Big data has become an integral part of our scientific lives, and increasingly an important aspect of our personal lives as well. In the midst of this contested presidential election, popular interest in big data is incredibly high, and interested onlookers would do well to familiarize themselves with the art and science of statistics.