One reason may be that different technologies have continued to steer the field toward more quantitative results, with more data measured on a continuous scale. Many original virus publications from the early 1900s demonstrate infection through fairly binary data: a dead or living mouse, or a specimen able to grow in one cell type and not another. Consider one of the first papers on a classic influenza assay, the 1942 Hirst paper "The Quantitative Determination of Influenza Virus and Antibodies by Means of Red Blood Cell Agglutination".

There, the data was presented by Dr. George Hirst as follows:

Here the data is presented as + signs that correspond to groupings of 0-25%, 25-50%, 50-75%, and 100% agglutination, as judged by the viewer's comparison to agglutination controls of known values. In this case, the data is assessed on a coarse categorical scale rather than a continuous one. These values are of course sufficient to show that with increasing dilutions of serum the agglutination inhibition is lost, until agglutination is near that of the negative controls. The data could even be quantified, but no numeric quantification is done and no statistics are assessed. Still, virologists in 1942 as well as those now would all come to the conclusion that specific substances in this serum inhibited agglutination at lower dilutions.
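As a rough illustration of what "quantifying" this kind of dilution data could look like, here is a minimal Python sketch. It assumes each replicate read has been reduced to an endpoint titer (the reciprocal of the highest serum dilution that still inhibits agglutination) and summarizes the replicates on the log2 scale. The titers and the function names `geometric_mean_titer` and `log2_spread` are invented for illustration; none of them come from the Hirst paper.

```python
import math
import statistics

# Hypothetical endpoint titers from replicate reads of one serum.
# These numbers are invented for illustration, not taken from Hirst (1942).
titers = [64, 128, 64, 256]


def geometric_mean_titer(values):
    """Average the titers on the log2 scale, then transform back."""
    log2_values = [math.log2(v) for v in values]
    return 2 ** statistics.mean(log2_values)


def log2_spread(values):
    """Sample standard deviation in log2 units (twofold dilution steps)."""
    log2_values = [math.log2(v) for v in values]
    return statistics.stdev(log2_values)


gmt = geometric_mean_titer(titers)
spread = log2_spread(titers)
print(f"geometric mean titer: {gmt:.1f}")
print(f"spread: {spread:.2f} twofold dilution steps")
```

Working on the log2 scale is a common choice for twofold dilution series, since each step in the series is then one unit apart. Whether it suits a given assay is a judgment call for the experimenter.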

As the basic questions of virology have been answered and we continue to explore more subtle nuances, both methods and evaluations of results have changed, while the understanding and use of statistics may still be evolving.
Classical virology entails techniques of viral isolation, cytopathic effects, and animal survival studies. As techniques advanced, such as serum antibody titers and imaging of RNA, DNA, or protein blots, results became either binary or quantitative. Even though these methods are still used today, virologists are often still assessing an approach from a binary perspective, while our data has changed. The statistical methods and data presentation previously used remain the same, while ever more powerful quantitative data is presented, such as qRT-PCR and next-generation sequencing. Virologists seem content with giving a number or measurement that shows a numerical difference without testing for a statistical difference. There is a *de facto* approach that any difference must be important to the whole system.

One example is the western blot, which is used to measure the expression of a specific protein. In many cases, seeing any amount of protein expressed was sufficient, or a reduction in band intensity showed a difference between a wild-type virus and a mutated variant virus. Any virologist could argue that the change in a protein level is important to the virus, or that whatever mutation was selected impacts that protein, but how do we assess the situation accurately and from a statistical perspective? Many times experiments compare a mutant virus to wild type, and the baseline is established as the wild-type level from one measurement rather than an overall mean. The baseline is set not on repeated measures but on the one designated western blot that was selected to demonstrate the impact. Replicates may be done, but since measurements such as protein expression can change with each repeat, each replicate is measured entirely independently of the others. The result is data assessment like that seen in the Gabriel *et al.* PLoS Pathogens paper "Interaction of Polymerase Subunit PB2 and NP with Importin alpha1 Is a Determinant of Host Range of Influenza A Virus".
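To make the replicate problem concrete, here is a minimal Python sketch of one way repeated blots could be assessed together instead of picking a single representative blot. It is only a sketch under stated assumptions: the band intensities are invented, the names `log2_ratios` and `one_sample_t` are hypothetical, and a one-sample t-test on within-blot log ratios is just one reasonable choice, not the method used in any paper mentioned here.

```python
import math
import statistics

# Hypothetical band intensities (arbitrary densitometry units) from three
# independent western blots. Values are invented for illustration only.
blots = [
    {"wild_type": 1000.0, "mutant": 520.0},
    {"wild_type": 1400.0, "mutant": 640.0},
    {"wild_type": 800.0, "mutant": 450.0},
]

# Two-sided 5% critical values of Student's t for small degrees of freedom.
T_CRITICAL_05 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571}


def log2_ratios(replicates):
    """Compare mutant to wild type within each blot, so that blot-to-blot
    differences in overall signal cancel out of the ratio."""
    return [math.log2(r["mutant"] / r["wild_type"]) for r in replicates]


def one_sample_t(values, null_mean=0.0):
    """t statistic for the hypothesis that the mean of values is null_mean."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - null_mean) / standard_error


ratios = log2_ratios(blots)
t_stat = one_sample_t(ratios)  # a log2 ratio of 0 means "no change"
df = len(ratios) - 1
significant = abs(t_stat) > T_CRITICAL_05[df]

print(f"mean log2(mutant / wild type): {statistics.mean(ratios):.2f}")
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
print(f"different from wild type at the 5% level: {significant}")
```

The design choice here is that the baseline is no longer one chosen blot. Every replicate contributes its own wild-type comparison, and the spread between replicates decides whether the difference counts.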

Statistical assessment of data continues to be lacking in virology publications, even to the extent that a field-specific journal has published its own article explaining the importance of correct statistics, tailored to the designs and questions that virologists are often interested in. The statistical methods are available: the Journal of Virology itself published a wonderful flowchart on how to choose appropriate statistical methods for virology experiments, and yet data continues to emerge with little to no statistical assessment attached.

Is the problem less about the available techniques, statistical methods, or information, and more about disregard for a proper statistical approach by the users themselves? Have the classical approaches to virology continued to plague the field's statistics even as techniques and statistical methods have modernized? What is clear is that a large part of the challenge lies with the users themselves and with peer review. If no standard is maintained for statistical significance when it can be assessed, why do these experiments get published? Is the classical approach to the field (any difference is significant!) clouding statistical judgment going forward? And lastly... can you teach an old virologist new tricks?

You bring up a point that we as students will most likely struggle with as we move forward in our graduate school careers. The question of whether old scientists can learn new tricks is not just one of scientists but one of human nature. Humans don't appreciate change, and after they have done something one way for some period of time, it can be nearly impossible to get them to do it another way--it's worked for so many years, why change?

The necessity of statistics in virology and other scientific fields is clear, as you have pointed out. Although a change might be visible to the naked eye, as in the experiments from the 40s, experiments today are not likely to contain changes visible to the naked eye, and if they did, relying on that would be likely to breed bias in the research.

The fact that work is still being published without statistics, even after tools were provided to make statistics more accessible and a "call to arms" urged virologists to start using them, puts the fault largely on those who continue publishing results with no statistics.

Maybe this change will not come from those who publish data or the professors who submit it. Maybe it will have to start with us students, making the changes subtly in the graphs we present to our professors, and then slowly in our work to be published. We might not be able to teach the old scientists new tricks, but we can still slowly change the field to accept the use of statistics.