It is once again a presidential election year, which means
the mainstream media is practically covering our daily lives in a bloodbath of tortured
poll results. Everywhere you look, there is yet another new opinion poll
showing how the candidates are faring against each other. However,
before even looking at any of these results, we should be asking how each
survey was designed and how its data was actually
collected.
The end goal of these political opinion surveys is to infer
how the whole population of voters
in the US will vote in November. However, sampling the opinion of every single
American voter is completely impossible given the constraints of time, money,
and the willingness (and truthfulness) of every voter to answer. So instead, polls must
base their estimates on a smaller sample,
a subset of individuals taken from the greater population. If the sample is
chosen by random sampling, then the
distribution of opinions of those people will be similar to that of the entire
population. Randomization is key here. If there are any parts of the population
that are intentionally (or unintentionally) excluded or over-represented in
polling, then the sample distribution of opinions will not necessarily resemble
that of the whole population. Problems in proper random sampling have been
hotly debated for decades as discussed by Jill Lepore in a recent article in The New Yorker, and the debates have
only grown more contentious in recent years with changes in technology. For
example, many opinion polls are conducted by random dialing of phone numbers
and collecting responses of whoever picks up. However, telephone response rates
are currently in the single digits, and people who willingly pick up the phone
often have very different political behaviors from the general public.
Furthermore, automated random dialing of cell phones was banned by the 1991 Telephone
Consumer Protection Act, so these calls go mostly to landlines rather than cell
phones. And once again, the demographics of people who have landlines are not
fully representative of the whole population, excluding younger voters in
particular. After the UK elections in 2015, Nate Silver, the widely
beloved political analyst behind FiveThirtyEight, highlighted a few recent
examples of inaccurate political polls and some of the reasons to worry about
bad sampling, such as the inability to rely on contact by phone, unreliable
online polling methods, Americans withholding their true opinions in polls, and
herd mentality swaying voters' responses.
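To get a feel for how a skewed sampling frame distorts results, here is a quick simulation sketch in Python. Every number in it is invented purely for illustration: a population split 50/50 between two candidates, where older voters lean toward candidate A and are much more likely to have landlines. Polling only landline owners inflates A's apparent support.

```python
import random

random.seed(0)

# Hypothetical population of 100,000 voters, split roughly 50/50 overall between
# candidates A and B. All of these proportions are made up for illustration.
population = []
for _ in range(100_000):
    older = random.random() < 0.40             # 40% of voters are older
    if older:
        supports_a = random.random() < 0.60    # older voters lean toward A
        has_landline = random.random() < 0.80  # and usually have landlines
    else:
        supports_a = random.random() < 0.433   # younger voters lean toward B
        has_landline = random.random() < 0.20  # and rarely have landlines
    population.append((supports_a, has_landline))

true_support = sum(s for s, _ in population) / len(population)

# A properly random sample of 1,000 voters.
random_sample = random.sample(population, 1000)
random_estimate = sum(s for s, _ in random_sample) / len(random_sample)

# A landline-only sample of the same size (roughly what phone polling ends up doing).
landline_voters = [v for v in population if v[1]]
landline_sample = random.sample(landline_voters, 1000)
landline_estimate = sum(s for s, _ in landline_sample) / len(landline_sample)

print(f"True support for A:      {true_support:.1%}")       # ~50%
print(f"Random sample estimate:  {random_estimate:.1%}")    # ~50%, give or take sampling error
print(f"Landline-only estimate:  {landline_estimate:.1%}")  # ~55%, biased toward older voters
```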
Although political polls aspire to have completely
representative polling, an accurate sampling is nearly impossible in our
current culture given the reasons above. To compensate for this, many political
polls use methods that could be considered scientific heresy to scientists working
in hypothesis-driven experiments. With non-probability sampling methods,
pollsters accept that each individual does not have an equal
chance of being selected from the population (e.g., younger voters with cell phones
but no landlines are far less likely to be reached by phone-based polling).
There are various ways to compensate for these unequal selection probabilities,
such as weighting the responses of under-represented groups more heavily.
For example, a phone-based poll could give more weight to the opinions of younger voters or
minority voters than to those of older white voters, who are more likely to
have landlines. But remember: although this seems like a practical method to “fix”
political polls, it makes for really, really bad science. Hypothesis-driven scientific
studies only work on the assumption that all samples were RANDOMLY chosen from
the population, i.e. the probability of choosing any one sample is equal for
all samples in the population. Using non-probability sampling or not using true
randomized selection throws this assumption out the window, runs it over with a
semi-truck, and lights the remains on fire. Never trust scientific experiments that use
non-probability methods; they are scientifically unsound and completely useless
when it comes to predicting anything of value about the whole population. And
for political polls, make sure to investigate the weighting methods used and
determine yourself how much you actually trust the reported results.
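To make the weighting idea above concrete, here is a minimal sketch of that kind of adjustment. The demographic shares and responses are invented for illustration and not taken from any real poll: each respondent's answer is weighted by the ratio of their group's share of the electorate to its share of the sample.

```python
# Hypothetical raw poll of 100 respondents, tagged by age group and candidate
# choice. All counts and population shares below are invented for illustration.
responses = (
    [("older", "Clinton")] * 45 + [("older", "Sanders")] * 30 +    # 75 older respondents
    [("younger", "Clinton")] * 8 + [("younger", "Sanders")] * 17   # 25 younger respondents
)

# Assume (purely for this example) the true electorate is 55% older, 45% younger.
population_share = {"older": 0.55, "younger": 0.45}
sample_share = {
    group: sum(1 for g, _ in responses if g == group) / len(responses)
    for group in population_share
}

# Weight = population share / sample share, so under-represented groups count more.
weights = {group: population_share[group] / sample_share[group]
           for group in population_share}

def weighted_support(candidate):
    total = sum(weights[g] for g, _ in responses)
    favor = sum(weights[g] for g, c in responses if c == candidate)
    return favor / total

unweighted = sum(1 for _, c in responses if c == "Sanders") / len(responses)
print(f"Unweighted Sanders support: {unweighted:.1%}")                   # 47.0%
print(f"Weighted Sanders support:   {weighted_support('Sanders'):.1%}")  # ~52.6%
```

Notice that a 47% unweighted result becomes roughly 52.6% after weighting; the reported number depends heavily on the assumed population shares, which is exactly why the weighting scheme deserves scrutiny.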
Another important feature of political polls is their
sampling error. By pure chance, any sample poll of a population will have a
slightly larger or smaller proportion of subjects voting a certain way compared
to the true population vote. Pollsters only know the opinions of the small
sample, not of the whole population, so the best way to show how sure you are that
your sample represents the whole population is to calculate confidence intervals. A poll result
percentage by itself is completely useless for making inferences about the
population without a confidence interval. By pure tradition, almost everyone uses a confidence level of
95%. This means that the range of values reported has a 95% chance of
containing the actual population value and a 5% chance that it does not contain
the population value. For example, let’s pretend you polled 100 people about whether
they would vote for Bernie Sanders or Hillary Clinton in a state’s Democratic
primary. If 54 of 100 voters in the sample say they would vote for Bernie, then
the 95% CI for the sample is 54% +/- 9.77%, which means that the range from
44.23% to 63.77% has a 95% chance of containing the true population
proportion of voters who will vote for Bernie.
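If you want to check that arithmetic yourself, the standard normal-approximation formula for a 95% margin of error on a sample proportion is 1.96 * sqrt(p * (1 - p) / n). A quick Python sketch:

```python
import math

def margin_of_error_95(p_hat, n):
    """95% margin of error for a sample proportion (normal approximation)."""
    return 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat, n = 0.54, 100                # 54 of 100 sampled voters favor Bernie
moe = margin_of_error_95(p_hat, n)  # ~0.0977, i.e. +/- 9.77 percentage points
print(f"95% CI: {p_hat - moe:.2%} to {p_hat + moe:.2%}")  # ~44.23% to ~63.77%
```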
Increasing your sample size decreases your confidence
interval, as Pew Research has illustrated. This is because the width of the confidence interval is roughly
proportional to the square root of 1/n, where n is the sample size. So,
bigger n leads to smaller confidence intervals.
If you repeat the poll above
for Hillary vs. Bernie with 1,000 sampled voters and again find that 54% support Bernie,
the 95% confidence interval shrinks to +/- 3.09%, so the range from 50.91% to 57.09% is 95% likely to
contain the population percentage of voters supporting Bernie. It’s still the
same sample percentage, but the 95% confidence interval is much smaller.
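You can see the square-root-of-1/n scaling directly by running the same margin-of-error calculation at a few sample sizes:

```python
import math

def margin_of_error_95(p_hat, n):
    return 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)

for n in (100, 1000, 10000):
    print(f"n = {n:>5}: 54% +/- {margin_of_error_95(0.54, n):.2%}")
# n =   100: 54% +/- 9.77%
# n =  1000: 54% +/- 3.09%
# n = 10000: 54% +/- 0.98%
```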
If you want to take a closer look at current political
polls, I highly encourage you to look at the compiled list of national polls on FiveThirtyEight and search
through the methods sections to see how each poll handles non-probability
sampling and weighting, random sampling methods, and the determination of confidence
intervals. The lack of standards across polls is both horrifying and fascinating
and makes me question the results of all political polls, especially when
methods aren’t clearly reported.
I’ll leave you with this one interesting snippet of the
methods section of an NBC/SurveyMonkey poll that actually acknowledges the inherent bias in its selection
methods and explains how it compensated for that bias. It’s rare that a poll will acknowledge
its bias like this, so enjoy this unusually honest example.
Although there are several limitations on polling responses that make it almost impossible to have a completely representative sample, it should be clear to the general public that the results can be skewed by the lack of random sampling. There have been several times this election season when different news outlets have reported polls that ended up being wrong once the primaries were complete. As a member of the general public, I have not heard one of those news outlets describe how inaccurate the sampling might have been after one of those polls was released. I believe this does a disservice to those who follow particular news channels and could lead some viewers to stop watching. I also believe it misguides viewers. Given that there could be pressure to follow the popular vote (or perhaps the opposite), especially if a citizen has not been following the race closely, this could potentially skew election results. Not only does inaccurate, non-random sampling lead to polling errors, but it could sway voters to vote in particular ways, a bias in itself.
I think Emily highlighted the major limitations, and in some cases the inappropriate experimental design, of these polls. I would like to point out an additional issue: in the example from NBC, roughly 2,000 of the people polled are not registered voters, meaning they will not even be voting in the election the polls are trying to predict. It says something about how shallow the sample pool is that they will include even unregistered voters in their analysis.
But my greater comment is this: the argument made above shows the weaknesses of bias and sampling error, but also the lack of knowledge of how statistics affects our daily lives. In 2009, the National Center for Education reported that only 11% of high school students (in public schools) took a statistics course. Then, in higher education, statistics is only required for certain majors. That means many students will graduate with advanced degrees without ever having taken a statistics course. For that reason, many people rely on statisticians to do the work for them. This is no different when it comes to presidential election polling. It is clear from the methods section of the NBC/SurveyMonkey poll that trained statisticians are conducting the surveys and that THEY are aware of the limitations of their polling. However, it is important that they disclose these limitations, and their impacts, to their audiences. As Anna stated, these polls can affect how others decide to cast their vote. Therefore, these polls could be having a much bigger impact on our elections than we see on the surface.
I always knew that these polls were inaccurate to some degree, but I have to admit I did not think it was this bad. I think the general consensus is that most polling practices terribly skew the data even beyond the lack of random sampling. The fact that they're trying to normalize their data based on a non-random sample and weight under-represented votes is clearly a bad practice. If these polls are having trouble finding respondents, then they should just clearly state the limits of their sample and not make overreaching claims. If there were a sample of 10 voters (very bad for a poll, but easier for an example), and 9 of the individuals were Caucasian males and 1 a Caucasian female, then it makes more sense to just say that the poll reflects the views of Caucasian males rather than weight the lone female's opinion and advertise it as encompassing all similar demographic viewpoints. We even want these kinds of specialized polls from time to time (or at least politicians do when trying to appeal to certain constituents).