One of the interesting topics covered in Harvey Motulsky’s
first few chapters is the argument that probability is not intuitive because
people tend to identify patterns even when none are present. He provides the
example of basketball players being perceived as more likely to make or miss
their next shot based on their current “streak” of successful or unsuccessful shots.
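Motulsky’s underlying point, that streaks arise readily in purely random sequences, is easy to check with a short simulation. Below is a minimal sketch, assuming Python with NumPy; the 50% shooter and the 20-shot game are my own illustrative choices, not figures from the book:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def longest_streak(shots):
    """Return the length of the longest run of identical outcomes."""
    best = run = 1
    for prev, cur in zip(shots[:-1], shots[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

# Simulate 10,000 "players", each taking 20 independent 50% shots.
streaks = [longest_streak(rng.integers(0, 2, size=20)) for _ in range(10_000)]
print("mean longest streak:", np.mean(streaks))
print("fraction with a streak of 4+:", np.mean(np.array(streaks) >= 4))
```

In this setup, roughly three out of four simulated players show a streak of at least four makes or misses, even though every shot is independent; a streak by itself is weak evidence of anything.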
As a side note, I think this is a rather poor example of the point, because it
suggests that whether a basketball player makes or misses a basket is based on
random chance as opposed to all the other factors that go into it. A better
example is the randomly generated table provided on page 5, which could be interpreted
as depicting patterns. As Motulsky points out, humans are adept at identifying
patterns because it is evolutionarily advantageous to do so.

I would propose that scientists are even more attuned to hints of patterns than the average person, as we are trained to detect the regularities that point us to the underlying mechanisms governing our world. That also means that we are decidedly prone to introducing bias into our work, even with the best of intentions. If in the course of an
experiment we start to see a trend emerging, we tend to look harder for more data
that fit that trend. I discovered this during my first pilot experiments that
measured disease severity in mice, and I have performed all subsequent
experiments of this type blinded to genotype. Blinding is not a universal fix
for this problem, though.

Another instance of possibly spurious pattern
recognition in data that comes to mind is multimodal populations. If you look
at a scatter plot and see points clustered in what seem to be two groups, it is
tempting to think that perhaps they reflect a bimodal response to a variable.
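One gut check on such a split is to ask whether a two-population model actually fits the data better than a single population. Here is a minimal sketch, assuming Python with scikit-learn; the simulated data and all parameter values are my own illustrative choices:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(seed=1)
# A single, deliberately unimodal population; small samples from it can
# still look clumpy enough to suggest "two groups" by eye.
x = rng.normal(loc=10.0, scale=2.0, size=60).reshape(-1, 1)

for k in (1, 2):
    gm = GaussianMixture(n_components=k, random_state=0).fit(x)
    print(f"{k} component(s): BIC = {gm.bic(x):.1f}")
# With truly unimodal data, the one-component model usually wins (lower
# BIC), which is a hint that the apparent split is in our heads.
```

More formal options exist (e.g., Hartigan’s dip test for unimodality), but even this simple comparison forces the question of whether the second mode earns its keep.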
Flow cytometry is another area where this can occur, as it is often possible to
identify numerous populations that seem to express different combinations of
marker intensity. Given the complexity of biological systems, the possibility that these “patterns” in the data represent truly distinct physiological entities is very real. Especially in more variable systems, such as human studies or experiments with outbred animals, it is not at all unlikely that subsets of individuals could respond differently to treatments because of underlying physiological differences. For instance, studies relevant to our lab’s
work have found that a subset of depressed patients exhibit high levels of inflammatory
markers and that their depressive symptoms can be improved with
anti-inflammatories (Raison 2013). So it is important for scientists to recognize and
pursue patterns that may lead to outcomes like this. But we must also recognize
the potential for bias if we choose to focus only on one perceived population of a dataset that “behaves better,” as well as the potential to miss interesting findings by subdividing populations to the point that we lose experimental rigor.
If we want to see particular patterns, I believe our brains will eventually allow us to see them, whether they exist or not. And to your point, there is a tension as to whether this gleaning of patterns can yield productive, unbiased results. This is where we must let the data speak for themselves and trust the tests designed before the analysis to determine significance. If a pattern turns out not to be significant when tested with a high-powered, low-error-threshold design, then it may be that bias somehow crept in.
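One concrete version of designing the test pre-analysis is to fix the error threshold and power before collecting data, and let those choices dictate the sample size. A minimal sketch, assuming Python with statsmodels; the effect size, alpha, and power below are arbitrary placeholders, not recommendations:

```python
from statsmodels.stats.power import TTestIndPower

# Pre-specify effect size (Cohen's d), significance threshold, and power,
# then solve for the sample size needed per group in a two-sample t-test.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.8, alpha=0.01, power=0.9)
print(f"required n per group: {n_per_group:.1f}")
```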