My dissertation research has focused on defining the immune response of nonhuman primates (NHPs) infected with simian malaria parasites. One of the biggest challenges in NHP research is small sample sizes, which stem from the cost of working with these models. To put it into perspective, one monkey can cost anywhere from $2,000 to $8,000 depending on the species and the specific experiments that will be performed, and each animal costs $8 to $10 a day to house and feed. These costs add up quickly, so researchers are required to limit the number of animals used, and in most cases a monkey study of 3 to 7 animals is considered “well-powered” by the NHP research community. However, as we have learned throughout the course, and based on my experience with the heterogeneous responses of outbred models like NHPs, this sample size is typically not sufficient to rigorously and fairly test most scientific and statistical hypotheses. Further, most of the time the data do not meet the assumptions of most parametric statistical tests, yet most papers use these tests anyway to reach significance, or in other words, “p-hack”. This introduces bias into the NHP literature and is one reason why many people are skeptical of NHP research. To fix this problem, either appropriate funding needs to be available so that NHP studies can be properly powered to fairly assess the question being evaluated, or appropriate nonparametric statistical tests should be used. However, the latter becomes difficult with small sample sizes, as many nonparametric tests require at least 5 subjects to obtain a p-value of less than 0.05.
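To make that sample-size floor concrete, here is a minimal sketch in Python that computes the smallest p-value an exact Wilcoxon matched-pairs (signed-rank) test can return for a given number of pairs. I am using the Wilcoxon test only as one example of a nonparametric test; the function name `wilcoxon_exact_p` is my own for illustration, and the sketch assumes there are no zero differences and no tied absolute differences.

```python
from itertools import product


def wilcoxon_exact_p(diffs, two_sided=True):
    """Exact Wilcoxon signed-rank p-value by enumerating every sign pattern.

    Illustrative helper (not a library function). Assumes no zero
    differences and no ties among the absolute differences.
    """
    n = len(diffs)
    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    # Observed statistic: sum of the ranks attached to positive differences.
    w_obs = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under the null hypothesis each difference is equally likely to be
    # positive or negative, so all 2**n sign patterns are equally likely.
    null = [
        sum(r for r, s in zip(ranks, signs) if s)
        for signs in product([0, 1], repeat=n)
    ]
    upper = sum(w >= w_obs for w in null) / len(null)
    lower = sum(w <= w_obs for w in null) / len(null)
    one_sided = min(upper, lower)
    return min(1.0, 2 * one_sided) if two_sided else one_sided


# Most extreme possible result: every animal changes in the same direction.
for n in range(3, 8):
    diffs = list(range(1, n + 1))
    print(
        n,
        wilcoxon_exact_p(diffs, two_sided=False),
        wilcoxon_exact_p(diffs, two_sided=True),
    )
```

Even when every pair moves in the same direction, the smallest possible one-sided p-value is 1/2^n and the smallest two-sided p-value is 2/2^n. Under these assumptions, a one-sided test first drops below 0.05 at 5 pairs (0.03125) and a two-sided test at 6 pairs (0.03125), so with 4 pairs the test cannot reach p < 0.05 regardless of how large the effect is.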
I have experienced the burn of an underpowered experiment that requires analysis by a nonparametric statistic firsthand. When I conducted one of the first experiments for my PhD, I had a result that was clearly significant based on the “bloody-obvious” test (see image to left). Prior to the experiment, I predicted the phenotype based on the literature and did a power calculation, which indicated that I needed 5 animals to fairly test my hypothesis. My statistical hypothesis was that the mean parasite burden during a primary infection was different from the mean parasite burden during a relapse infection. Unfortunately, during the experiment one of my animals succumbed to the infection before having a relapse, which brought my sample size down to 4 animals. When I performed a Wilcoxon matched-pairs test (which I argue was the appropriate statistical test in this situation), I did not have enough data points to fairly test the hypothesis and got a p-value of 0.0625 despite the phenotype. I should point out that the data were not graphed correctly: the points for each animal should be connected by a line to show that the analysis was paired. When I presented these data, there was a huge debate over whether I had run the statistical test incorrectly, and many people, including PIs, thought I shouldn’t use nonparametric statistics even though this is clearly the appropriate test to perform. In the end, I succeeded in arguing my point and was able to report the phenotype in a publication even though we didn’t reach significance; because of the death of one animal, the study lacked the power needed to assess the data by nonparametric statistical methods. Overall, I think that NHP researchers should embrace nonparametric statistics. Doing so may require more resources to generate significant data, but the benefit of producing reliable data that support reproducible conclusions is key.
In my view, the extra resources are well worth it, even if it means that one has to do fewer experiments or hire one fewer technician. I think that “less is more” when it comes to science, particularly in the realm of NHP research.