What if the same set of data could show completely opposite trends depending on
how the data are grouped? The goal of this blog is to document my reaction to
recent articles that address bias and irreproducibility in science, but as I was reading on the topic, I came across a video that I thought was worth sharing. One of
the articles that caught my attention was Jeremy Berg's letter on ASBMB. The
letter outlined what the scientific community should do to enhance the
reliability of scientific research. According to him, the first thing we should
do (and perhaps the most important) is to acknowledge and
take ownership of the problem. The second is that each researcher has a
responsibility to make their own work “as reliable as possible within the
limits imposed by resources and other constraints.”
The second point is what caught my attention the most. The letter goes on to
explain that some published work is the result of one successful experiment out
of ten, and that reviewers have to address “clear flaws and inadequate
information” to improve reliability. However, bias and reproducibility are not
dichotomies in and of themselves. Dan Ariely talked a lot about cheating and
the moral code behind cheating, but in the end he briefly mentioned intuition.
He argued that many of our intuitions are wrong and that it is our
responsibility to test them.
This is where I think Simpson’s Paradox comes into play. In brief, Simpson’s paradox
occurs when a trend appears in several different groups of data, but
disappears or even reverses when the groups are combined. The TED-Ed video by Mark Liddell mentions that one study in the UK (which I looked for but could not find) appeared to show that smokers had a higher survival rate than non-smokers over a 20-year period. Yet, when the participants were divided into age groups, it turned out that the non-smokers were on average much older and thus more likely to die during the trial. In other words, once age was taken into account, non-smokers were actually living longer; it was only the ungrouped data that showed the opposite trend. This raises a lot of questions: Which types of bias should we be aware of when analyzing statistics? What types of motivations could be at play when individuals or companies present statistics? And are there any ethical responsibilities that we face when working with statistics?
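To make the reversal concrete, here is a minimal sketch in Python. The counts are invented purely for illustration and are not taken from the UK study mentioned above; the names `data`, `rates_by_group`, and `overall_rates` are likewise hypothetical. The only assumption carried over from the smoking example is that the non-smokers are concentrated in the older age group.

```python
# Hypothetical counts, invented for illustration (NOT from the UK study).
# Each entry is (participants, survivors after the follow-up period).
data = {
    "young": {"smoker": (800, 720), "non_smoker": (200, 190)},
    "old":   {"smoker": (200, 80),  "non_smoker": (800, 400)},
}


def survival_rate(participants, survivors):
    """Fraction of participants who survived."""
    return survivors / participants


def rates_by_group(data):
    """Survival rate of smokers and non-smokers within each age group."""
    return {
        group: {status: survival_rate(*counts) for status, counts in statuses.items()}
        for group, statuses in data.items()
    }


def overall_rates(data):
    """Survival rate of smokers and non-smokers with the age groups pooled."""
    totals = {}
    for statuses in data.values():
        for status, (participants, survivors) in statuses.items():
            total_p, total_s = totals.get(status, (0, 0))
            totals[status] = (total_p + participants, total_s + survivors)
    return {status: survival_rate(p, s) for status, (p, s) in totals.items()}


by_group = rates_by_group(data)
overall = overall_rates(data)

for group, rates in by_group.items():
    print(group, rates)
print("overall", overall)
```

With these made-up numbers, non-smokers have the higher survival rate inside each age group, yet smokers have the higher rate once the groups are pooled, simply because most of the non-smokers sit in the older, higher-risk group.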
Good stuff. The hardest part of doing science is seeing our biases. Always has been, always will be.
Good stuff, Luis, thanks for sharing! I find it fascinating that this paradox crops up so frequently in daily life. It is difficult to control for confounding factors that are causally relevant to the situation at hand. A real-life example of Simpson's paradox is one that millions of Americans have fallen prey to. Since 2000, the median US wage has risen about 1%; however, during the same period, the median wages for high school dropouts, high school graduates with no college education, people with some college education, and people with Bachelor's or higher degrees have all decreased. How can both be true? Well, it depends on one's viewpoint and on how the data are grouped. An economist might say the headline overall median wage has increased, but an average individual American will confidently say wages have declined. Ultimately, the paradox is prevalent throughout our daily encounters, which further exposes the inherent challenges in statistics.