The Art and Science of Polling
Professor of Government and chair of the Government Department Dan Shea and his colleagues Assistant Professor of Government Carrie LeVan and Visiting Assistant Professor of Government Nicholas Jacobs have been carrying out a series of public opinion polls in Maine and across the country on the upcoming election and a host of related topics. They released the first poll in February and the second just this month. The team plans to conduct another Maine-centered survey after Labor Day, and one more in late October. Shortly after the election, they plan a national wave as well. Colby Magazine staff writer Kardelen Koldas ’15 talked to Shea about how reliable polls are constructed.
When you and your colleagues LeVan and Jacobs initiate a poll, what is the first thing you do?
In the most basic sense, we have a broad conversation about the types of things we’d want to find out and why a survey would make the most sense. Polls are difficult to construct and expensive to implement, and often it’s easier to find good information without a survey. So the first issue we confront is what’s the nature of the question and what is the best way to collect information to answer it.
What comes next?
Once we’ve settled on a poll, we move towards writing the survey—which is what we call the instrument. This is the most complex part of polling. There are a dizzying number of issues to consider, such as the order of questions, the appropriate way to word questions, and the right response options. The goal is to create an unbiased, valid, and reliable instrument. Are we measuring what we think we’re measuring? Are there any leading questions? Do the response options capture what the respondents want to report? It’s an art form as much as it is a science.
Indeed. What’s the process of generating questions?
We move from broad concepts to indicators. Concepts are general questions, broad topics. Indicators are specific measures of that broad topic. An example of a concept would be support for Senator Susan Collins. In this election, that's a really important concept, so we measure it in different ways. When a concept is that important, we will use several different indicators, and if we have it right, those indicators will be highly correlated with each other, which is called construct validity. For example, if Senator Collins is rated highly in a question about her reelection race, she should also be doing well in questions dealing with her approval rating. We often ask, does the question measure what we think it measures?
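As a rough illustration of what a construct-validity check looks like in practice, here is a minimal Python sketch. The responses are hypothetical, not drawn from the Colby poll, and the two indicator names are invented for the example.

```python
# A minimal sketch of a construct-validity check, using hypothetical
# responses (not the actual Colby poll data). Two indicators of the same
# concept -- vote intention and approval rating -- should be strongly
# correlated if they tap the same underlying support.
import numpy as np

# Hypothetical 5-point scales for ten respondents:
# vote_intent: 1 = definitely the opponent ... 5 = definitely Collins
# approval:    1 = strongly disapprove     ... 5 = strongly approve
vote_intent = np.array([5, 4, 2, 1, 5, 3, 4, 2, 1, 5])
approval    = np.array([5, 4, 1, 2, 4, 3, 5, 2, 1, 4])

r = np.corrcoef(vote_intent, approval)[0, 1]
print(f"Correlation between the two indicators: {r:.2f}")
# A strong correlation is consistent with both questions measuring the same
# concept; a weak one suggests a problem with the instrument.
```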
How do you decide on the order of questions?
Well, sometimes it simply makes sense to start from the general and move to more specific. But you wouldn’t want to give respondents a question that would lead them to answer a subsequent question in a different way. We wouldn’t ask, for example, what you think of Janet Mills’s handling of the pandemic and then ask about her overall approval rating. In other words, early questions shouldn’t prime later queries.
How do you ensure your questions will lead to a reliable survey?
We spend hours thinking about the right way to ask different questions. Nick, Carrie, and I, and the many students who are involved, will debate the perfect wording on a particular question for days. We're very careful. Sometimes we use tried-and-true questions, ones that academics or polling firms refined years ago. A good example would be the partisanship battery. Scholars have fine-tuned how to ask about one's party preference, and we use that approach. Good questions are good questions, and there is no sense reinventing the wheel. We also test our instruments before they go into the field. We will take the survey ourselves and ask our students to take it. Even our spouses.
Okay, so now you have the instrument. How does the survey go out to the public?
There are three broad ways to do that: in-person, over the telephone, and online. Each has strengths and weaknesses, but most election polls are done either online or over the phone. We’re using a hybrid model so that we can include respondents over the phone—on both cell and landline—and others from online. You have to be careful about systematic biases when it comes to different modes of implementation. Older voters, for example, are more likely to answer a landline than younger voters. Conversely, younger voters are more skilled at moving through online surveys. There’s a massive move to online, but we think only using that approach would be problematic in Maine.
How do you find the survey takers?
We hire a telephone bank, a firm that makes the phone calls, and we conduct our online element through a marketing firm. We pay very close attention to how the demographics of the sample match the voters in Maine. Every night there is an assessment of how the sampling is going.
What do you mean?
We’re very careful about matching the sample to population parameters, especially partisanship, gender, age, income, and geography. For instance, in our polls we need one-half of the respondents to come from the First and one-half from the Second Congressional District. We need a certain number of men, a certain number of professionals, Democrats, and so forth. That process is called stratified cluster sampling. Within those population estimates, we conduct a random draw. Again, the idea is to match our sample to the actual demographics of Maine voters.
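To show the basic mechanics of drawing at random within strata, here is a minimal Python sketch. The voter frame, the district split, and the quotas are all made up for illustration and do not reflect the poll's actual design.

```python
# A minimal sketch of stratified sampling with invented quotas, not the
# poll's actual design. Voters are grouped into strata (here, the two
# congressional districts) and drawn at random within each stratum so the
# sample matches the target split.
import random

random.seed(42)

# Hypothetical frame of registered voters, each tagged with a district.
frame = [{"id": i, "district": "CD1" if i % 2 == 0 else "CD2"}
         for i in range(10_000)]

# Target: half the sample from each district, as described above.
quotas = {"CD1": 400, "CD2": 400}

sample = []
for district, quota in quotas.items():
    stratum = [v for v in frame if v["district"] == district]
    sample.extend(random.sample(stratum, quota))  # random draw within the stratum

print(len(sample), "respondents drawn;",
      sum(v["district"] == "CD1" for v in sample), "from CD1")
```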
Can the firms always fill those buckets?
Yeah, buckets. That's a good way to say it. Even when we're very careful about matching the sample to voter demographics, it often doesn't wind up perfectly. From there, we go to a complicated weighting process, where each respondent is assigned a weight based on the percentage of that demographic group in our sample and in the population statistics. It's a common practice. We're usually really close, but weighting fine-tunes things.
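A simple way to picture that weighting step is the sketch below. The shares are invented, not Maine's actual figures, and the age groups are just an example; real polls weight on several demographics at once and often use more elaborate methods such as raking.

```python
# A minimal sketch of post-stratification weighting, with invented numbers.
# Each respondent gets a weight equal to the group's share of the population
# divided by its share of the sample, so under-represented groups count more.

# Hypothetical shares (not Maine's actual figures).
population_share = {"18-34": 0.25, "35-54": 0.35, "55+": 0.40}
sample_share     = {"18-34": 0.15, "35-54": 0.35, "55+": 0.50}

weights = {group: population_share[group] / sample_share[group]
           for group in population_share}

for group, w in weights.items():
    print(f"{group}: weight = {w:.2f}")
# 18-34: weight = 1.67  (under-sampled, so these respondents count more)
# 35-54: weight = 1.00
# 55+:   weight = 0.80  (over-sampled, so these respondents count less)
```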
How do you know when you have a large enough sample to draw a larger conclusion from?
Writing the instrument and drawing the right sample are very important and complicated. The sample size is much more mechanical. We have decided that our sample of Maine voters will be somewhere between 800 and 1,000 respondents. That leads to a margin of error of about 3.8 percent at a 95 percent confidence level. In other words, we can be confident our findings match the population within plus or minus 3.8 percent, 95 percent of the time. Yeah, that's a tad complicated. Put simply, that's a really good sample size. You'll see national surveys published in newspapers with smaller sample sizes. We go larger because we're interested in exploring subgroups. That is to say, when we're looking at the difference between the First and Second Congressional District, the sample size drops to about 450. Donald Trump won the Second Congressional District last time, and it will be close in November. So it's important to have a sample large enough to feel confident in our findings. The same can be said about other subgroups, like Republicans, women, rural voters, and so forth. Every time you look at a subgroup, the sample drops.
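For readers who want the arithmetic, the sketch below applies the standard margin-of-error formula for a proportion. It assumes simple random sampling with the worst-case p = 0.5; weighted samples like this one carry an additional design effect, which is one reason a sample of 800 to 1,000 can end up closer to the 3.8 percent figure cited above.

```python
# A minimal sketch of the textbook margin-of-error formula for a proportion:
# 1.96 * sqrt(p * (1 - p) / n) at 95 percent confidence, with p = 0.5 as the
# worst case. Real polls also adjust for the design effect of weighting.
from math import sqrt

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Return the margin of error in percentage points."""
    return 100 * z * sqrt(p * (1 - p) / n)

for n in (1000, 800, 450):
    print(f"n = {n:4d}: +/- {margin_of_error(n):.1f} points")
# n = 1000: +/- 3.1 points
# n =  800: +/- 3.5 points
# n =  450: +/- 4.6 points   (why subgroup estimates are less precise)
```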
In polls, what does the margin of error tell us?
Well, that's a quick and easy way to consider the validity of a survey. And we should pay attention to that. Anything with a margin of error over five percent is probably tenuous. But as I've tried to suggest, the devil is really in the details of the instrument, the sampling, and the appropriate weighting process. Large, sloppy polls are often not as accurate as good surveys with a smaller sample size. There's a lot more to the story than the sample size.
Sometimes election results contradict the polls. Some might say that diminishes the value of polling. What do you think?
We're interested in a lot more than just the horserace issues. Our primary focus is trying to understand political attitudes in Maine and beyond. We're fascinated by a concept called rural resentment, advanced by another political scientist. So Nick, Carrie, the students, and I are exploring a large set of issues. That said, trying to figure out where elections are tight is important for political actors on both sides of the aisle. For example, it wouldn't make a lot of sense for activists to funnel buckets of money into the First Congressional District, because that race is not going to be competitive. But on the other hand, because Colby and other polling firms have shown us that the Senate campaign is tight, activists can focus their resources.
When we come across a poll, what should we pay attention to if we want to assess its reliability?
Look below the surface. Good polling firms publish more than the results; they also publish their methodology. We publish the methodology and a massive crosstab spreadsheet with our results. We want to be as transparent as possible. Again, don't jump to the sample size. Take a look at the instrument and the sampling.
What is one thing the general public should know about political polls?
Polls are snapshots at that particular moment in time. Real world conditions can change rapidly. Savvy consumers of polls should focus on a range of issues and not just sample size.