October 31, 2014

Who is going to win the governor’s race?


Is Schauer tied with Snyder? Down by 5? Somewhere in between?

This may be old hat for some of you, but people keep asking me what’s going on with the polling in the Governor’s race and I thought I would put together a little primer. For reference, just in the last week, NYT/CBS has had Schauer up by 1 point, while the Detroit News/WDIV has Snyder up by 5. A couple of weeks ago, we saw two polls come out on the same day, one with Schauer up by 3, one with Snyder up by 8. EPIC-MRA’s current poll has Snyder up by 2, when last week they had him up by 8.

Why are these estimates so far apart? Aside from sampling error (the variation that happens because you are only asking a subset of people rather than every single voter), and aside from any fingers on the scale, much of this is because different pollsters deal differently with a set of questions and challenges they have to address in order to conduct a poll.

  • Deciding who is a likely voter: For a pre-election poll, pollsters are trying to model something that literally does not exist yet, because exactly who will vote on Tuesday is not completely determined until it happens. This is further complicated by the fact that people believe they are supposed to vote and are reluctant to admit that they aren’t going to vote, so just asking people if they will vote or not leads to far more people saying they will vote than actually do. Most pollsters are not terribly transparent about their methods, but they use a variety of factors to try to predict who is actually likely to vote. Some use external data as a predictor, such as recorded vote history, and others use information from the survey respondents themselves, such as how closely they have been following the election or whether they know where their polling place is, which have been shown to be decent predictors of voting behavior — and theoretically this should catch shifts in who is likely to vote that a model based on past vote history would not. But the main story is that different methods from different pollsters lead to different results, and we won’t know for sure who will actually vote until they do it on Tuesday.
  • Pushing undecideds to answer: What does it mean, one week out, if someone says they are not sure who they will vote for? You can take them at their word that they are undecided (or possibly unlikely to vote at all), or you can ask them which way they are leaning and report them as likely to vote for that candidate. In Michigan right now, the surveys reported with high numbers of undecideds tend to show a greater lead for Snyder. Surveys with lower numbers of undecideds show a closer race. Probably what this means is that Snyder’s voters are more committed than Schauer’s – which makes sense given that Snyder is an incumbent, and only a few months ago Schauer’s name ID was in the 30s.
  • Method of data collection: Whether you are collecting data via phone or online, and what percentage of interviews are on cell phones, has an impact on the results you get, partly because of who is included and who is missed in each sample. Response rates on the phone continue to fall, causing a great deal of worry about the validity of polling estimates – the basic problem is the difficulty of knowing if the people screening your calls are different from the ones who answer. But when the New York Times switched to an online panel for their polling this year, it created a big controversy in the polling world – this time the problem is that the people interviewed are not randomly selected but rather chose to be part of the panel, which causes its own set of problems. Basically, there are problems associated with any option here.
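
Even before any of those choices come into play, sampling error alone can account for a few points of difference between polls. Here is a rough sketch in Python of the standard textbook formula, assuming a simple random sample (which real polls only approximate). The sample size of 600 and the 47/45 split are made-up numbers for illustration, not figures from any of the polls mentioned here.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for one candidate's share p
    in a simple random sample of n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

def lead_margin_of_error(p1, p2, n, z=1.96):
    """Approximate 95% margin of error on the lead (p1 - p2) when both
    shares come from the same sample, so the two estimates move against
    each other."""
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return z * math.sqrt(variance)

# Hypothetical poll: 600 respondents, 47% to 45% (illustrative numbers only).
n = 600
share_moe = margin_of_error(0.47, n)
lead_moe = lead_margin_of_error(0.47, 0.45, n)

print(f"Margin of error on one candidate's share: {share_moe * 100:.1f} points")
print(f"Margin of error on the lead: {lead_moe * 100:.1f} points")
```

Under those assumptions, the margin of error on a single candidate's share is roughly 4 points, and the margin on the lead is nearly twice that, so two honest polls of this size can disagree by several points on the lead from chance alone.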

So all that variation is the rationale behind the polling aggregators, made famous by Nate Silver. The thinking is that by averaging together data that was produced under different circumstances, you will end up with a better measure. But there’s even some variation in the estimates from the polling aggregators: as I type this, Mark Blumenthal’s model at the Huffington Post has Snyder up by 1.7 points while Real Clear Politics has him up by 2.8 and Talking Points Memo has him up by 3.3.
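
To make the averaging idea concrete, here is a minimal sketch in Python using the three most recent poll margins quoted at the top of this post. It is a plain unweighted average and an assumption on my part, not how any of the aggregators above actually work; their models typically also account for things like sample size, how recent a poll is, and each pollster's track record.

```python
from statistics import mean

# Snyder's lead in points, as quoted earlier in this post.
# A negative number means Schauer is ahead.
polls = {
    "NYT/CBS": -1.0,
    "Detroit News/WDIV": 5.0,
    "EPIC-MRA": 2.0,
}

def simple_average(margins):
    """Unweighted mean of the poll margins (a toy aggregator)."""
    return mean(margins.values())

average_lead = simple_average(polls)
spread = max(polls.values()) - min(polls.values())

print(f"Average Snyder lead: {average_lead:.1f} points")
print(f"Gap between the highest and lowest poll: {spread:.1f} points")
```

Even this toy version lands on a small Snyder lead of about 2 points while the individual polls span 6 points, which is the basic appeal of aggregating.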

Ultimately, what you probably care about – what I care about – is who’s going to win on Tuesday. Right now, my read on the data is that it is just too close to call. Nate Cohn at the Upshot argued the other day that there is a strong possibility the polls overall are undercounting likely Democratic voters – it seems plausible and I would be happy if so. Conventional wisdom is that a close race drives up turnout, which likely favors Dems. And Democrats are saying good things about absentee ballot numbers. So Snyder could win, or Schauer could walk away with a victory. And if field efforts ever matter, this is the kind of race where they will. GOTV.

*In the interest of full disclosure, my wife Stephanie works for Schauer’s campaign, as many of you know. Not that it changes my opinion on any of this, and no one at the campaign had any input into this post, but it seemed fair to remind you.