I usually try not to wade too deeply into national poll wonkery in this space (really, I try), but a series of dust-ups over the methods of Democratic firm Public Policy Polling are particularly relevant due to their recent Maine poll and my recent defense of their accuracy.
Plus, Steve Mistler had declared that the flap is “of interest to Maine politics,” so that’s good enough for me.
I’ll try to provide a timeline and summary of what’s going on with PPP here. If you want a more complete picture, you should go read HuffPollster, especially their roundup of pollster opinion from this afternoon, Nate Cohn’s investigation, Nate Silver’s Twitter feed and PPP’s own blog.
Over the past couple of months, pollsters and academic statisticians have questioned PPP’s practice of randomly discarding responses as a way of weighting their sample. This is certainly unorthodox and could add more random error to their results than is necessary. It might be somewhat defensible if the alternative were a great deal of down-weighting and their goal were to more accurately report the poll’s margin of error.
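To make the distinction concrete, here is a minimal sketch of the two approaches. Everything here is hypothetical and mine, not PPP’s actual procedure: the function names, field names, and numbers are invented for illustration.

```python
import random

def discard_to_target(responses, group_key, target_share, rng=None):
    """Randomly discard respondents from an overrepresented group until
    its share of the sample matches the target (hypothetical sketch)."""
    rng = rng or random
    in_group = [r for r in responses if r[group_key]]
    out_group = [r for r in responses if not r[group_key]]
    # Solve k / (k + len(out_group)) = target_share for k, the number
    # of in-group respondents to keep.
    keep = round(target_share * len(out_group) / (1 - target_share))
    return rng.sample(in_group, min(keep, len(in_group))) + out_group

def weight_to_target(responses, group_key, target_share):
    """The conventional alternative: keep every respondent and attach a
    weight so the group's weighted share matches the same target."""
    n = len(responses)
    in_share = sum(1 for r in responses if r[group_key]) / n
    w_in = target_share / in_share
    w_out = (1 - target_share) / (1 - in_share)
    return [dict(r, weight=w_in if r[group_key] else w_out)
            for r in responses]
```

Both approaches hit the same target composition, but discarding throws interviews away, shrinking the effective sample size and widening the true margin of error, while weighting keeps every interview and merely adjusts its influence.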
Then, yesterday, PPP surprised poll watchers everywhere when they announced that they had performed a survey of the recall election in Senate District 3 in Colorado, had found the Democrat losing by a twelve-point margin (which is what actually happened), and had chosen not to publish the results. This raised a lot of questions about how PPP decides which results to release and whether they might be making some of those decisions for business or political reasons. PPP’s defense was that they had never promised to release the poll in the first place, that they were concerned a methodological error had produced an incorrect result, and that they were releasing it now not to brag about their accuracy but to altruistically provide the public with information about the election.
These occurrences, taken separately or together, might be dismissed simply as quirks or differences of opinion over polling methods. An article by Nate Cohn published today in The New Republic, however, raises some serious concerns that can’t be so easily explained away.
The piece examines some discrepancies in PPP’s results, in particular that the racial composition of their samples has varied more than it should across different polls of the same election, and that these demographic shifts consistently brought their results closer to the overall polling averages. This suggests that they may have adjusted their samples to manipulate their results and bring them more in line with expectations. It is especially troubling given a study published in January that seemed to find PPP’s polls were more accurate when other pollsters had polled a race first.
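A toy calculation shows why sample composition matters so much. All of the numbers below are invented for illustration; they are not taken from any actual PPP poll.

```python
def topline(group_shares, group_support):
    """Overall candidate support implied by assumed subgroup shares of
    the sample and assumed subgroup preferences (all hypothetical)."""
    return sum(group_shares[g] * group_support[g] for g in group_shares)

# Suppose Black voters back the Democrat at 90% and white voters at 40%.
support = {"black": 0.90, "white": 0.40}

# At a 12% Black share of the sample, the Democrat polls at 46%...
low = topline({"black": 0.12, "white": 0.88}, support)

# ...but at a 16% Black share, the same preferences yield 48%.
high = topline({"black": 0.16, "white": 0.84}, support)
```

In this toy example, a four-point shift in the sample’s racial composition moves the topline by two points without a single respondent changing their mind, which is why unexplained variation in composition across polls of the same race raises eyebrows.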
After some significant prodding by Cohn, PPP eventually revealed that they use an ad hoc weighting system that takes into account the racial composition of their sample, the reported previous votes of their respondents (for instance, who they voted for in 2008 when conducting 2012 presidential polls), and some ambiguous sense of the pollster’s comfort level with the topline results. They had never before revealed these aspects of their methodology and had apparently been purposely leaving the previous-vote questions they used for weighting off their public releases.
To my mind, none of these practices are methodologically defensible and together they provide far too much opportunity for PPP to put their thumb on the scale and influence the results of their surveys. There’s no evidence that they changed reported outcomes for political reasons, but the bad methods and lack of transparency certainly don’t engender trust in this regard.
There are two general ways of conducting polls. One, often favored by media pollsters, is to dial numbers at random, weight the results by census data and use a series of questions to judge how likely individual respondents are to vote, including only the likeliest in the final sample. The other, usually favored by campaign pollsters, is to start with a list of voters whose voting history is known and who are likely to vote (perhaps asking an additional voter-intent question or two), then weight the final results to the pollster’s assessment of the likely composition of the electorate.
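The second approach can be sketched roughly like this. Again, the field names, party labels, and shares are hypothetical placeholders, not any firm’s actual model:

```python
from collections import Counter

def weighted_topline(responses, strata_key, target_shares):
    """Weight each stratum of respondents to an assumed share of the
    electorate, then tally weighted candidate support (hypothetical)."""
    counts = Counter(r[strata_key] for r in responses)
    n = len(responses)
    # One weight per stratum: assumed share divided by observed share.
    weights = {s: target_shares[s] / (counts[s] / n) for s in counts}
    totals = Counter()
    for r in responses:
        totals[r["vote"]] += weights[r[strata_key]]
    total_w = sum(totals.values())
    return {cand: w / total_w for cand, w in totals.items()}
```

Everything hinges on `target_shares`: change the assumed electorate between polls and the topline moves even if the underlying interviews are identical. That is why a pollster’s model of the electorate is supposed to stay reasonably stable over a campaign.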
Until now, it was assumed that PPP was doing the latter, but that method doesn’t quite work if the pollster changes their view of how the electorate will look from poll to poll as much as they seem to have done.
So how did PPP manage such a good record with a questionable methodology? First, as mentioned, they may have been adjusting some of their results in a somewhat roundabout way to better fit polling averages. Second, their other methods, including gathering large sample sizes and using good voter lists, may have allowed them to get some solid results even while employing some bad weighting practices.
Interestingly enough, while these revelations put their accuracy record nationally and in many states in doubt, Maine would seem to be an exception. Unless there were other questionable weighting practices that haven’t yet been described, the lack of race as a real factor in poll outcomes here would seem to insulate their Maine results from much of this criticism.
This certainly isn’t an example of pollster malfeasance on the level of Research 2000 or Strategic Vision, or an example of severe incompetence on the level of Pharos Research Group or our own Maine Heritage Policy Center, but it is a disappointing discovery about a prolific and heretofore generally well-regarded polling firm. These issues will, unfortunately, contribute to public distrust of polling and make good research and a better understanding of public opinion that much more difficult.