Thursday, January 18, 2018

Preventing Population and Gating Bias

One of the primary tools used to generate scientific data in the field of immunology is flow cytometry. The resulting data are typically analyzed in a program called FlowJo, which requires user input to examine the specific populations generated by an experiment. That input comes through a method called “gating,” in which the analyst draws boundaries to isolate a specific population and then compares it against other experimental parameters, each of which requires further gating. Because every gate is a manual decision, each one is an opportunity for bias.

These biases can stem from thoughts along the lines of, “Oh, this is where I should end this gate to include [x] amount of the population,” or, “Gating around this section of this population will make my data significant enough to be published.” In addition to this gating bias, those analyzing flow data may also introduce bias in choosing what to analyze. For example, a scientist may want to investigate the relationship between variables x and y, but decide not to look at x and z, even though z is on their panel, for the sake of simpler gating. This form of bias may prevent the analysis of data that could be significant and play a key role in what is being investigated. For both types of bias, methods have been and are being developed to prevent them from occurring.

There are scripts in R that can generate gates automatically based on control samples, which would prevent gating bias (unless the controls were biased, but that’s just bad science). There are also programs like CITRUS that take your data and use an algorithm to compare each variable against every other variable, reporting which comparisons are significant. These types of methods can help prevent bias generated in the previously mentioned ways, and as scientists we should continue investigating methods that allow us to analyze our data in ways that prevent us from overlooking valuable information or skewing data in our favor.
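To make the idea of control-derived gating concrete, here is a minimal sketch in Python (the scripts mentioned above are in R, and real packages fit gates after compensation and transformation; all numbers, the function names, and the quantile cutoff below are illustrative assumptions, not any real tool’s API):

```python
import numpy as np

def gate_from_control(control, quantile=0.995):
    """Derive a positivity threshold from an unstained/FMO control.

    Instead of a human eyeballing where the gate should end, the cutoff
    is fixed by the control: events brighter than nearly all control
    events are called positive. (Illustrative sketch only.)
    """
    return np.quantile(control, quantile)

def percent_positive(sample, threshold):
    """Fraction of sample events above the control-derived gate."""
    return np.mean(sample > threshold)

rng = np.random.default_rng(0)
# Unstained control: autofluorescence only.
control = rng.normal(loc=100, scale=15, size=10_000)
# Stained sample: 70% negative events plus a 30% bright population.
sample = np.concatenate([
    rng.normal(loc=100, scale=15, size=7_000),
    rng.normal(loc=300, scale=30, size=3_000),
])

threshold = gate_from_control(control)
print(f"threshold = {threshold:.1f}, "
      f"% positive = {100 * percent_positive(sample, threshold):.1f}")
```

The point of the sketch is that the gate boundary is computed once from the control, so the analyst never gets to nudge it toward a more publishable number.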

Flow gating example for you non-flow users:

[Figure: example of flow cytometry gating; image not preserved]

What it means to be scientific

The word “scientific” is one that I have been thinking about a lot recently. It comes up in casual conversation all the time, as in “that’s not very scientific” to mean that something is uncertain, or “she has a very scientific mind” to describe someone who is organized and meticulous. I cringe internally when I hear this phrase, because, I think, science is not very scientific. The public perception of science is that it is methodical, linear, organized; that we as scientists think of a question and that science gives an answer. It is no surprise that they think this, because it’s how we present our work both in academic journals and to the public. We create a story when writing a paper or presenting our work, making it sound like we knew what was going to happen all along and that the conclusions we drew were inevitable. But we know that this is not how science works. In reality, we come up with a question and a possible way to find the answer; our idea doesn’t work, or gives us an answer we weren’t expecting, or the project is more difficult than we anticipated – science is a messy, nonlinear, unplanned process.
This disconnect between the reality of conducting science and the story we tell to the public creates two problems. First, it elevates scientists to a level in our society where we are almost mythical beings, solving the unanswered questions of the universe. Second, it creates public distrust when we start to talk about the problems of bias and irreproducibility in science, or when we have to walk back claims we’ve made. How could such a clear process produce results that can’t be obtained twice? How could such pure people, working to explain the world to everyone else, be biased? While we as scientists have to grapple with how to make our results replicable and what irreproducibility means for our fields, I think that being more honest about how science works would help to prevent the sense of disillusionment the public feels when we are inevitably revealed to be imperfect.
For instance, scientists, doctors, and journalists should stop overstating the impact and significance of our work. As explained in this Vox article, we often describe our findings as miraculous or curative when in fact they have simply moved the field forward one small step. A quick Google search revealed that we’re going to cure HIV in the next three years, when in reality human CRISPR trials are much further off. Instead of telling the public that we’ve cured HIV or any particular disease, we can be more honest and tell them that we’ve unlocked another small piece of the 10,000-piece jigsaw puzzle – that doesn’t make it less exciting that we’re curing HIV in mice, it just makes it less confusing for the public when HIV isn’t cured in 2020. As graduate students, we can be honest with our friends and family about the realities of daily life as a scientist rather than painting a rosier picture of the scientific process. We can also help them to interpret findings in the media and explain when results are truly outstanding versus being an interesting novel finding.

Finally, we can also be more honest with ourselves about the role our discoveries play in the larger scientific community. My favorite analogy for science is that “solving” any question is like pushing a boulder up a mountain. No individual is going to get the boulder to the top, but we will each push the boulder a bit further. This does not mean that our work is not important. This does not mean, for those of us who study problems relevant to human health, that our work will not help real people someday. But if we can be honest with ourselves about the relative insignificance of our individual discoveries, I think it will help to keep science closer to the pure ideal of the pursuit of unbiased knowledge.

Wednesday, January 17, 2018

The fight for publication

In our world of science there are always two main goals: get papers, then get grants. For those to happen, you have to start with a base of good experiments, including new ideas, great experimental design, and significant results. But there is always a battle to get into the best journal, the one with prestige, the one that will get your name out into the science world. To do that we must appease the reviewers, those anonymous peers whose job is to judge and critique your work. The system is made to prevent fraud and produce the best possible scientific research, but is the best science always published in the best journals?

There is always an inherent bias in the publication of papers. The more data you have, the more famous your lab is, the fancier the techniques, the more likely your work is to be published in a higher-tier journal. I am not saying that any of that science is not worthy, because it is truly amazing, but the competition to get into better journals is a part of life in this career. Oftentimes there are beautiful experiments and amazing results in smaller journals – how can science be pure and fair if we are fighting for the fame of a high-tier journal? Shouldn’t the goal be to produce results that are accurate and advance your field?

The new website PubPeer wants to change the world of science publication by starting an anonymous conversation – a kind of worldwide journal club. Even though this discussion still occurs after publication, imagine being able to discuss the world’s best papers with talented scientists all over the world. While this doesn’t remove the inherent bias of the publication process, it may add a new dimension by allowing a more global discussion, and may lead to less fraud in the long term. Personally, hearing other researchers’ opinions of papers at journal club has always helped me to look at both others’ and my own work more critically. The PubPeer discussion, available for the world to see, could add an extra level of review, this time chosen not by a journal but by the actual scientific community – the very same people who are reading the papers we are all trying to publish.

Fighting hypothesis myopia

Irreproducibility downright scares me. When I decided to embark on this journey as a scientist, I thought I knew what I was getting into, but having come across the need to reproduce results in science, I realize that there are sides to this I never considered. I’ve always been afraid of failure, and whether that comes in the form of my work not reproducing expected results or my work not being reproducible by others because of mistakes I’ve made, I know that I will encounter “failure” frequently in this line of work. Much of this fright comes from not being able to step away from the notion that a failed replication does not equal scientific failure. Jeff Leek’s article relaxed me, mainly because it reminded me that success in science comes in many flavors, not just successful replication. He summarizes it well when he states that failure to replicate could stem from an “unusual event” or other “unmodeled confounders”. I know that I am human, and I will make mistakes, but all I can do is focus on my skills and ensure that my procedures are rigorous and my reports are authentic. I should listen to Claude Lévi-Strauss, who once said, “A scientist is not a person who gives the right answers, but one who asks the right questions.”

The biggest trap that could lead to irreproducible studies is this so-called “hypothesis myopia”, of which, I will admit, I have been and possibly still am guilty. The basis of this cognitive fallacy is fixating on and falling in love with a single hypothesis while failing to try to disprove it. Any finding coming as a result of hypothesis myopia could completely distort the way in which we see the world, by making us focus on what we wish the data to be instead of what the data truly show. The solution, as Regina Nuzzo so eloquently phrased it in her Nature article, is to counter biases, which act like an accelerator in the world of science, by pumping the brakes and slowing down to be more skeptical of findings. We must fight hypothesis myopia with further testing, and remember the words of Adam Savage at the San Francisco March for Science last year: “Bias is the enemy of science, but science is also the enemy of bias.”

Positively Negative

Some may say that science is overly positive. Researchers are constantly on the hunt for positive results. There is a certain allure to discovering a new process; it’s a lot more satisfying to say that “x leads to y” instead of “x has no effect on y.” This positivity bias leads to many experiments with negative results being pushed to the side when they do not fit the narrative that the scientist is crafting for publication. As a result, the published literature lacks a full picture of what is (and is not) happening in nature.

To combat this, some journals have been created solely to publish negative data. For example, the Journal of Negative Results – Ecology and Evolutionary Biology aims to publish studies that have scientific rigor but yielded negative results, in an effort to “expand the capacity for formulating generalizations.” These types of journals do seem to be shifting the tide, if only just a little, toward the publication of negative results. The Journal of Negative Results in BioMedicine, which launched in 2002, is now defunct as of September 2017, stating that since its launch, other journals have begun to publish negative or null results alongside reports of positive results. While this is encouraging, many of the high-impact journals place little to no emphasis on the publication of negative results. If the journals with the most visibility do not value null results, then many of the scientists who want to publish in those journals will be biased against prioritizing the publication of their negative data. Matosin et al. (2014) argue that the publication of negative results is not merely making a story out of nothing, and that all data should be published, whether positive or negative, along with a hypothesis to explain the data.
Just as scientists can think of and discuss reasons why something is happening, they should be able to think of and discuss reasons why something is not happening.

Restore Public Faith in Science - Fix Bad Statistics

Stepping out of the ivory tower of academia, we come face-to-face with a public that has a growing distrust of science. When bad science makes its way into the media, it feeds that distrust by handing skeptics leverage – if scientists say it, it must be true, right? For example, we can point to study after study showing that vaccines do not cause autism, but one heavily flawed paper from 1998 has now convinced thousands of new parents that vaccinating their children is unnecessary and even dangerous, despite the eventual retraction of that paper (thanks, Andrew Wakefield). Biomedical science is probably the most publicly discussed field, because the bench results will theoretically make their way to humans as treatments and cures.

The “publish or perish” mentality has driven scientists to value results over process. And who could blame us, when our ability to do research is funded based on our ability to produce novel results, and our skill in gaining this funding is what keeps us employed? With the pressure on to achieve the desired results in the shortest possible time, we arbitrarily decide a sample size of three is enough to detect an effect, if one exists. Results in hand, we open the door for our intrinsic biases to sneak in and permeate our data analysis, hoping to achieve a value of p < 0.05. Further, we encourage this behavior throughout the tiers of the lab – it starts with the PI, who is thankful the data will fit nicely into the grant renewal, and trickles down to the relieved graduate students, who can write the data up into a manuscript to check a box off their graduation requirements.
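The “sample size of three” problem can be made concrete with a quick simulation. The numbers below are illustrative assumptions (a fairly large true effect of one standard deviation, a two-sample t-test), not taken from any particular study:

```python
import numpy as np
from scipy import stats

def power_at_n(n, effect=1.0, trials=2000, alpha=0.05, seed=0):
    """Monte Carlo estimate of two-sample t-test power at group size n.

    `effect` is the true difference in group means, in units of the
    standard deviation. We repeatedly draw two groups, run the test,
    and count how often p < alpha when a real effect exists.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(effect, 1.0, n)
        if stats.ttest_ind(a, b).pvalue < alpha:
            hits += 1
    return hits / trials

print(f"n=3 per group:  power ~ {power_at_n(3):.2f}")
print(f"n=17 per group: power ~ {power_at_n(17):.2f}")
```

With three samples per group, even a one-standard-deviation effect is detected only a small fraction of the time; the textbook rule of thumb of roughly 17 per group is what it takes to reach about 80% power for that effect size. An underpowered design both misses real effects and inflates the ones that do cross p < 0.05.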

This culture results in inadequate training in the statistical design and analysis of experiments, generates science whose results cannot be replicated, and perpetuates a cycle of scientists untrained in recognizing an inherently flawed study. If the people who are considered well-informed are unable to identify these issues, it is almost certain that a layperson would not be able to distinguish between statistically sound and corner-cutting science.

We owe it to the public, and to ourselves, to do better.

The experiment is not working

Science. The word itself has so many different connotations. For myself, science is definitive – or at least it should be. Ideas can change, and the experiments may not explain the full story, but science should be definitive. This is where biases enter the arena. In the pursuit of definitive science, researchers are cherry-picking the data that fit their particular narrative. Researchers think they are speaking the truth and are publishing the truth. However, the other experiments that do not fit the narrative are slid into supplemental figure 8 or left unpublished.

In The Honest Truth about Dishonesty, Dan Ariely describes an experiment with ten-minute conversations between strangers. Afterwards, each stranger would state that they had not lied during the conversation; upon further scrutiny, it would be revealed that they had lied two to three times. In my opinion, this is what happens with the observational biases that occur every day in scientific research. The pressure to succeed, excel, and continually publish has created an environment where the successful experiments are taken to the PI and the experiments that do not fit into the story are not mentioned. PIs are beginning to write papers before all of the research is performed. We are placing huge biases on the science that gets reported.

During my time in research, there have been several instances where experiments from our lab or from different labs were not believed to be correct because they did not fit the narrative the PI was trying to tell. Beyond this, I heard one PI state that an experiment was “not working” because it did not fit their narrative. The experiment was working, but it was not giving the expected result needed to fit within their already published story. They proceeded to repeat the experiment until it gave them the result they wanted three times in a row…

Scientists believe we are communicating truth, but we are slipping in lies.