Monday, April 4, 2016

What's in a correlation?

What is correlation? Or perhaps a better question is, what is a good correlation? The answer isn’t very straightforward. I made up the following data to see if there was a correlation between a person’s midi-chlorian count and the number of soft drinks they consumed during a year.

The correlation statistics are as follows:
r = 0.3483
r² = 0.1213

The p-value is very small, so you could conclude that there is a strong relationship between midi-chlorian count and soft drink consumption. If there really were no relationship between the two, a correlation this strong would be very unlikely to arise by chance. However, the r² value is also very small. This suggests that only about 12% of the variation in midi-chlorian count is explained by variation in yearly soft drink intake.

So, which measure do you look at to judge the correlation? The p-value is really small, which suggests that the correlation is unlikely to occur coincidentally. However, it's important to remember that the p-value is highly dependent on sample size. This study samples 111 individuals, and with a sample size that large, very small effect sizes can become statistically significant. The effect size is small, since r² = 0.1213, so we are left with the question: is an effect size of roughly 12% scientifically important? This is a difficult question to answer, and it's probably best left to the judgment of the scientist or the reader. I think this problem raises an interesting question about the strength of correlations reported in the media. The news is full of correlations between various categories, but how are the strengths of these correlations being judged? Do journalists and scientists look at low p-values and decide that a correlation is strong, or do they look for a high r² value and effect size? An alternative is for journalists to publish the actual data and let readers conclude whether the correlation is strong enough to warrant action or consideration.
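To make the sample-size point concrete, here is a rough sketch that holds r fixed at 0.3483 and changes only the number of individuals. It uses Fisher's z-transformation with a normal approximation, which is my choice for illustration and not necessarily the test behind the statistics above. The function name and the smaller sample size of 20 are made up for the comparison.

```python
import math

def fisher_z_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for H0: no correlation.

    Uses Fisher's z-transformation: atanh(r) * sqrt(n - 3) is treated as
    a standard normal statistic. This is an approximation, not the exact
    t-test that most statistics packages report.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided normal tail area: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))

r = 0.3483  # correlation coefficient reported in the post
print(f"r^2 = {r ** 2:.4f}")  # share of variation explained; same for any n

# n = 111 is the sample size in the post; n = 20 is a hypothetical
# smaller study with exactly the same r.
for n in (20, 111):
    print(f"n = {n:3d}  approx. p = {fisher_z_p_value(r, n):.5f}")
```

Under this approximation the p-value is well below 0.001 with 111 individuals but above 0.05 with only 20, even though r and r² are identical in both cases. The p-value tracks how much data you have as much as it tracks how strong the relationship is.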


  1. Wouldn't this depend on the question in the first place? A good statistical design would be less "discovery" based and would define some kind of acceptable level of change in advance.

  2. I like your point about correlations reported in the media, particularly in our bite-sized information consumption culture. A few months back, a study claiming that preferring black coffee correlates with psychopathic tendencies was reported on by various online news sources including The Independent, Huffington Post, and Jezebel. In an effort to be easily digestible, the articles touched only briefly, if at all, on any aspects of the study other than the clickbait conclusion. There was no mention of the study's numerous limitations, including the very meager compensation offered as an incentive to participate, the use of self-reporting, which is notoriously unreliable, and the sorting of foods into flavor categories that not everyone would associate them with (e.g., cheese was categorized as bitter). Regardless of the limitations, one can find in the actual paper that r-squared is only 0.02 for the correlation of bitter taste preference with psychopathic tendencies, which I find far too low to indicate any sort of connection between the two.

  3. [bang head]The p value here reports out the test of a non-zero slope. The p value doesn't indicate the strength of the correlation. It doesn't say how non-zero the slope is. The p value is low here because there are a lot of X,Y pairs.[/bang head]

    As for the media: they aren't scientists. They're just the messengers. It is up to the scientists to feed them unbiased and reliable information. And it's up to the scientist to point out to the media when, how, and why something is flawed. That's why we get paid the big bucks.

  4. Overall, I think the only opinions that should be trusted in the interpretation of data are those of individuals who are educated and equipped to analyze that data. By that standard, I would say that most members of the media are not equipped to report on scientific data.

    That being said, the responsibility then lies with the scientists to accurately report the data to the media, which I believe is a huge ethical issue in science today. As was discussed in many of the initial blog posts, scientists like to embellish their findings to generate interest, funding, and respect within their field. Therefore, many researchers may prey on the statistical naïveté of the media and general populace to fraudulently promote the success of their science. To do so, I believe many scientists promote p-values and r-squared values as indisputable, easily understood metrics that concretely validate their findings. To eliminate this phenomenon, I believe the media and public should be taught that these statistical parameters do not guarantee the validity of the claims they back, but rather suggest that the relationships they describe deserve more analysis. The ultimate decision on whether the correlation is significant should come from the scientific basis of the claim, not just from the data meeting a set of statistical guidelines.

    Additionally, statistical analyses and guidelines (such as a p-value of 0.05 as the threshold for significance) are all human constructs! How different would the world of science be if a threshold of 0.10 or 0.01 had been chosen instead?

    All in all, I think that statistics offers a great way for scientists to describe and promote their data, but statistics should always be taken with a grain of salt. The strongest discoveries will not rely just on r-squared or p-values, but will instead incorporate these parameters into a sound, scientific explanation of the observed effect.