It is once again a presidential election year, which means the mainstream media is drowning our daily lives in a bloodbath of tortured poll results. Everywhere you look, there is yet another opinion poll showing how the candidates are faring against each other. But before even looking at any of these results, we should be asking how the survey was designed and how the data was actually collected.
The end goal of these political opinion surveys is to infer how the whole population of US voters will vote in November. However, surveying every single American voter is impossible given the constraints of time, money, and each voter's willingness to answer (truthfully). So instead, polls must base their estimates on a sample, a smaller subset of individuals drawn from the greater population. If the sample is chosen by random sampling, then the distribution of opinions in the sample will tend to resemble that of the entire population. Randomization is key here. If any parts of the population are intentionally (or unintentionally) excluded or over-represented in polling, then the sample distribution of opinions will not necessarily resemble that of the whole population.

Problems in proper random sampling have been hotly debated for decades, as discussed by Jill Lepore in a recent article in The New Yorker, and the debates have only grown more contentious in recent years with changes in technology. For example, many opinion polls are conducted by randomly dialing phone numbers and collecting the responses of whoever picks up. However, telephone response rates are currently in the single digits, and people who willingly pick up the phone often have very different political behaviors than the general public. Furthermore, automated dialing of cell phones was banned by the 1991 Telephone Consumer Protection Act, so auto-dialed calls reach only land lines. And once again, the demographics of people who have land lines are not fully representative of the whole population, excluding younger voters in particular.
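To make the effect of sampling bias concrete, here is a small Python simulation with entirely made-up numbers: a hypothetical population of one million voters in which support for a candidate differs sharply by age group, polled once with a proper random sample and once with a "landline-style" sample that never reaches young voters.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 1,000,000 voters, 52% support Candidate A overall,
# but support differs by age group (all numbers invented for illustration).
population = (
    [("young", 1)] * 240_000 +  # 80% of 300k young voters support A
    [("young", 0)] * 60_000 +
    [("older", 1)] * 280_000 +  # 40% of 700k older voters support A
    [("older", 0)] * 420_000
)

# True population support: (240k + 280k) / 1M = 0.52
true_support = sum(vote for _, vote in population) / len(population)

# A proper random sample: every voter is equally likely to be chosen.
random_sample = random.sample(population, 1000)
random_est = sum(vote for _, vote in random_sample) / 1000

# A biased sample that systematically excludes young voters.
older_only = [v for v in population if v[0] == "older"]
biased_sample = random.sample(older_only, 1000)
biased_est = sum(vote for _, vote in biased_sample) / 1000

print(f"true support:   {true_support:.3f}")  # 0.520
print(f"random sample:  {random_est:.3f}")    # lands near 0.52
print(f"biased sample:  {biased_est:.3f}")    # lands near 0.40, badly off
```

No amount of extra respondents fixes the second estimate: a bigger biased sample just converges more precisely on the wrong number.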
After the UK elections in 2015, Nate Silver, the widely beloved political analyst behind FiveThirtyEight, highlighted a few recent examples of inaccurate political polls and some of the reasons to worry about bad sampling, such as the inability to rely on contact by phone, unreliable online polling methods, Americans withholding their true opinion in polls, and herd opinions swaying voters.
Although political polls aspire to completely representative polling, accurate sampling is nearly impossible in our current culture for the reasons above. To compensate, many political polls use methods that would be considered scientific heresy by scientists running hypothesis-driven experiments. In non-probability sampling, pollsters assume that individuals do not all have an equal chance of being selected from the population (e.g. younger voters with cell phones but no land lines are far less likely to be reached by phone-based polling). There are various methods to compensate for this, such as weighting the results of certain respondents more than others. For example, a phone-based poll could weight the opinions of younger or minority voters more heavily than those of the older white voters most likely to have land lines. But remember, although this seems like a practical method to “fix” political polls, it makes for really, really bad science. Hypothesis-driven scientific studies only work on the assumption that all samples were RANDOMLY chosen from the population, i.e. the probability of choosing any one individual is equal for everyone in the population. Using non-probability sampling, or skipping true randomized selection, throws this assumption out the window, runs it over with a semi-truck, and lights the remains on fire. Never trust scientific experiments using non-probability methods; they are scientifically unsound and useless for predicting anything of value about the whole population. And for political polls, investigate the weighting methods used and decide for yourself how much you actually trust the reported results.
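As a rough illustration (with invented numbers, not any real poll's method), here is how one common weighting scheme, post-stratification, works: each respondent is weighted by their group's share of the population divided by their group's share of the sample, so under-reached groups count for more.

```python
from collections import Counter

# Hypothetical phone poll of 1000 respondents that under-reached young
# voters: only 100 of 1000 respondents are young, though young voters are
# (in this made-up scenario) 30% of the population.
# Each entry is (age_group, supports_candidate).
sample = (
    [("young", 1)] * 80 + [("young", 0)] * 20 +
    [("older", 1)] * 360 + [("older", 0)] * 540
)

population_share = {"young": 0.30, "older": 0.70}

n = len(sample)
counts = Counter(group for group, _ in sample)
sample_share = {g: c / n for g, c in counts.items()}

# Weight per respondent = group's population share / group's sample share.
# Young voters get weight 0.30 / 0.10 = 3.0; older get 0.70 / 0.90 ≈ 0.78.
weights = {g: population_share[g] / sample_share[g] for g in counts}

raw_est = sum(v for _, v in sample) / n
weighted_est = (sum(weights[g] * v for g, v in sample)
                / sum(weights[g] for g, _ in sample))

print(f"raw estimate:      {raw_est:.3f}")       # 0.440
print(f"weighted estimate: {weighted_est:.3f}")  # 0.520
```

The weighted estimate recovers the "true" support only because we assumed young voters who answer the phone vote like young voters who don't, which is exactly the kind of untestable assumption that makes this approach scientifically shaky.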
Another important feature of political polls is their sampling error. By pure chance, any sample of a population will have a slightly larger or smaller proportion of subjects voting a certain way than the true population. Pollsters only know the opinions of the small sample, not of the whole population, so the best way to express how well the sample represents the population is to calculate confidence intervals. A poll percentage by itself, without a confidence interval, is useless for drawing conclusions about the population. By pure tradition, almost everyone uses a 95% confidence interval: if you ran the same poll many times, about 95% of the intervals calculated this way would contain the actual population value, and about 5% would miss it. For example, let’s pretend you polled 100 people about whether they would vote for Bernie Sanders or Hillary Clinton in a state’s Democratic primary. If 54 of the 100 sampled voters say they would vote for Bernie, then the 95% CI for the sample is +/-9.77%, meaning the range from 44.23% to 63.77% is our 95%-confidence estimate for the true proportion of the population voting for Bernie.
Increasing your sample size decreases your confidence interval, as illustrated below by Pew Research. This is because the confidence interval is roughly proportional to 1/√n, where n is the number of samples. So, bigger n leads to smaller confidence intervals.
If you repeat the poll above for Hillary vs Bernie with 1000 sampled voters and find that 54% support Bernie, +/-3.09% at 95% confidence, then the range from 50.91% to 57.09% is 95% likely to contain the population percentage of voters supporting Bernie. It’s still the same sample percentage, but the 95% confidence interval is much smaller.
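Both margins of error above come from the standard normal-approximation formula for a sample proportion, z·√(p(1−p)/n), with z = 1.96 for 95% confidence. A quick sketch in Python reproduces the two examples:

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a sample proportion (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# The two Bernie-vs-Hillary examples: same 54%, different sample sizes.
for n in (100, 1000):
    moe = margin_of_error(0.54, n)
    print(f"n={n}: 54% +/- {moe * 100:.2f}% -> "
          f"[{(0.54 - moe) * 100:.2f}%, {(0.54 + moe) * 100:.2f}%]")
# n=100:  54% +/- 9.77% -> [44.23%, 63.77%]
# n=1000: 54% +/- 3.09% -> [50.91%, 57.09%]
```

Tenfold more respondents shrinks the interval by a factor of √10 ≈ 3.16, exactly the 1/√n behavior described above.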
If you want to take a closer look at current political polls, I highly encourage you to browse the compiled list of national polls on FiveThirtyEight and search through the methods sections to see how each poll handles non-probability sampling, weighting, random sampling, and the calculation of confidence intervals. The lack of standards across polls is both horrifying and fascinating, and it makes me question the results of all political polls, especially when the methods aren’t clearly reported.
I’ll leave you with one interesting snippet from the methods section of an NBC/SurveyMonkey poll that actually acknowledges the inherent bias in its selection methods and explains how it compensated for that bias. It’s rare for a poll to acknowledge its bias like this, so enjoy this unusually honest example.