One of the interesting topics covered in Harvey Motulsky’s
first few chapters is the argument that probability is not intuitive because
people tend to identify patterns even when none are present. He provides the
example of basketball players being perceived as more likely to make or miss
their next shot based on their current “streak” of successful or unsuccessful shots.
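Motulsky’s underlying point, that streaks arise readily in purely random sequences, is easy to check with a short simulation. Below is a minimal sketch, assuming Python with NumPy; the 50% shooter and the 20-shot game are my own illustrative choices, not figures from the book:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def longest_streak(shots):
    """Return the length of the longest run of identical outcomes."""
    best = run = 1
    for prev, cur in zip(shots[:-1], shots[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

# Simulate 10,000 "players", each taking 20 independent 50% shots.
streaks = [longest_streak(rng.integers(0, 2, size=20)) for _ in range(10_000)]
print("mean longest streak:", np.mean(streaks))
print("fraction with a streak of 4+:", np.mean(np.array(streaks) >= 4))
```

In this setup, roughly three out of four simulated players show a streak of at least four makes or misses, even though every shot is independent; a streak by itself is weak evidence of anything.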
As a side note, I think this is a rather poor example of the point, because it
suggests that whether a basketball player makes or misses a basket is based on
random chance as opposed to all the other factors that go into it. A better
example is the randomly generated table provided on page 5, which could be interpreted
as depicting patterns. As Motulsky points out, humans are adept at identifying
patterns because it is evolutionarily advantageous to do so.

I would propose that scientists are even more attuned to hints of patterns than the average person, as we are trained to detect the regularities that point us to the underlying mechanisms governing our world. That also means that we are decidedly prone to introducing bias into our work, even with the best of intentions. If in the course of an
experiment we start to see a trend emerging, we tend to look harder for more data
that fit that trend. I discovered this during my first pilot experiments that
measured disease severity in mice, and I have performed all subsequent
experiments of this type blinded to genotype. Blinding is not a universal fix
for this problem, though.

Another instance of possibly spurious pattern
recognition in data that comes to mind is multimodal populations. If you look
at a scatter plot and see points clustered in what seem to be two groups, it is
tempting to think that perhaps they reflect a bimodal response to a variable.
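One gut check on such a split is to ask whether a two-population model actually fits the data better than a single population. Here is a minimal sketch, assuming Python with scikit-learn; the simulated data and all parameter values are my own illustrative choices:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(seed=1)
# A single, deliberately unimodal population; small samples from it can
# still look clumpy enough to suggest "two groups" by eye.
x = rng.normal(loc=10.0, scale=2.0, size=60).reshape(-1, 1)

for k in (1, 2):
    gm = GaussianMixture(n_components=k, random_state=0).fit(x)
    print(f"{k} component(s): BIC = {gm.bic(x):.1f}")
# With truly unimodal data, the one-component model usually wins (lower
# BIC), which is a hint that the apparent split is in our heads.
```

More formal options exist (e.g., Hartigan’s dip test for unimodality), but even this simple comparison forces the question of whether the second mode earns its keep.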
Flow cytometry is another area where this can occur, as it is often possible to
identify numerous populations that seem to express different combinations of
marker intensity. Given the complexity of biological systems, the possibility that these “patterns” in the data represent truly distinct physiological entities is very real. Especially in more variable systems, such as human studies or experiments with outbred animals, it is not at all unlikely that subsets of individuals could respond differently to treatments because of underlying physiological differences. For instance, studies relevant to our lab’s
work have found that a subset of depressed patients exhibit high levels of inflammatory
markers and that their depressive symptoms can be improved with
anti-inflammatories (Raison 2013). So it is important for scientists to recognize and
pursue patterns that may lead to outcomes like this. But we must also recognize
the potential for bias if we choose to focus only on one perceived population of a dataset that “behaves better,” as well as the potential to miss interesting findings by subdividing populations to the point that we lose experimental rigor.
If we want to see particular patterns, I believe our brains will eventually allow us to see them, whether they exist or not. And to your point, there is a tension as to whether this gleaning of patterns can yield productive, unbiased results. This is where we must let the data speak for themselves and trust the tests designed before the analysis to determine significance. If a pattern turns out not to be significant when tested with a high-powered, low-error-threshold design, then it may be that bias somehow crept in.
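One concrete version of designing the test pre-analysis is to fix the error threshold and power before collecting data, and let those choices dictate the sample size. A minimal sketch, assuming Python with statsmodels; the effect size, alpha, and power below are arbitrary placeholders, not recommendations:

```python
from statsmodels.stats.power import TTestIndPower

# Pre-specify effect size (Cohen's d), significance threshold, and power,
# then solve for the sample size needed per group in a two-sample t-test.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.8, alpha=0.01, power=0.9)
print(f"required n per group: {n_per_group:.1f}")
```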