The Truth About Polling Averages and How Misleading They Are

Statistical sampling is often used to measure damages and even demonstrate liability in some False Claims Act cases. Pharmaceutical and medical device companies also use it to demonstrate efficacy. It’s scientific, it works and we can rely on it in many endeavors.

What, then, are we to make of the abuse of this process visited upon us every four years in presidential politics through public polling?

Marketing companies have the right to grab publicity. The surest way to get free publicity is to release a poll within two months of a presidential election. It hardly matters if the poll is accurate, because somebody will publish it.

If you can’t beat them, join them. So, I intend to try to convince my firm to conduct such an exercise. Our poll, if I can convince the powers that be to do it, would be just as valid as any other poll, maybe more than many, but we would make clear it is for entertainment purposes only, not to be relied upon by anyone or put into any “average” analysis.

I intend to do this, so that the whistle I’m blowing today is heard.

No, you can’t file a False Claims Act case or even a securities fraud case based on this. It’s just that this quadrennial publicity festival has allowed junk math to pollute a legitimate discipline we all rely on in this business.

Bad polls are put into polling averages and laundered like dirty money. The publicity encourages pollsters to produce numbers, certain that they too will get national attention if they call 600 people, or say they do. Averaging polls gets publicity for those who average the polls. We won’t learn until six months from now who was right, if anyone bothers to look back.

Yet some of these polls must be wrong, even as the averages are used to make everyone look right. The average also creates a statistical bias towards a close race, which may or may not be correct, because it has to be based on some false data.

As I write this there are two recent polls on Arizona, one sponsored by a national broadcasting company, the other by an Arizona firm. One says Biden is up 52-42 in Arizona. The other says Biden is up 47-44. Both were conducted over about the same period of time. Both will be “averaged” into various services.

Yet one of these polls must be wrong, and both could be. It is not possible for this kind of discrepancy to exist and for both polls to be a true reflection of the electorate in Arizona. Biden can’t simultaneously be at 52 and 47. That’s just too much of a difference to fudge, even through the usual dodges we allow these people to get away with. Or, if it is allowable, what is the point? We never ask pollsters: if you admit your poll can be off by 3 points and the election could be decided by a point, what good is your poll? (It is good for nothing except publicity.)
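For readers who want to see where a figure like “off by 3 points” comes from, here is a rough sketch of the textbook 95% margin of error for a single candidate’s share. It assumes a simple random sample, which real polls only approximate, and the function names are mine, not any pollster’s. The 600-person sample is the figure mentioned earlier in this post.

```python
import math

def margin_of_error(sample_size, share=0.5, z=1.96):
    """Textbook 95% margin of error, in percentage points, for one share.

    Assumes a simple random sample; share=0.5 is the worst case.
    """
    return 100 * z * math.sqrt(share * (1 - share) / sample_size)

def sample_size_for(moe_points, share=0.5, z=1.96):
    """Smallest simple random sample with at most the requested margin."""
    return math.ceil(share * (1 - share) * (z / (moe_points / 100)) ** 2)

# A 600-person poll, the size mentioned above.
print(round(margin_of_error(600), 1))

# Respondents needed to claim "plus or minus 3 points".
print(sample_size_for(3))
```

Under these assumptions a 600-person poll carries a margin of about 4 points on each candidate’s number, and it takes a little over a thousand respondents to get down to 3. Note that this applies to each candidate’s share separately, so the uncertainty on the gap between two candidates is larger still.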

Averaging polls often obscures what the individual poll tells you. In these Arizona numbers the two polls tell you completely different things. One says Biden has a huge lead. Absent an earthquake, Trump can’t win. He’s down 10, and Trump could get all the undecideds and still lose. If Biden is really up that much in Arizona, you also have to wonder about a lot of other states. On the other hand, we have a poll that says Biden has 47 and Trump is within range at 44. Nobody is close enough to 50 to think that lead is safe.

Rather than averaging these and getting Biden at 49.5 (a midpoint that leans on the poll showing Biden way up without any regard to how well it was conducted), don’t you really want to know which of these pollsters to believe? Don’t you want somebody to explain why they could be so different? If the 52-point poll is wrong, why ever use that pollster again? If the 47-point poll is off, why include that polling in any future analysis?
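The arithmetic is easy to reproduce. The sketch below takes a plain, unweighted average of the two Arizona polls quoted above. The labels “poll A” and “poll B” are placeholders of mine, and real aggregators may weight polls in ways this does not attempt to imitate.

```python
from statistics import mean

# Figures quoted in this post; the labels are placeholders.
arizona_polls = [
    {"label": "poll A", "biden": 52, "trump": 42},
    {"label": "poll B", "biden": 47, "trump": 44},
]

def simple_average(polls):
    """Unweighted average of each candidate's share, plus the margin."""
    biden = mean(p["biden"] for p in polls)
    trump = mean(p["trump"] for p in polls)
    return {"biden": biden, "trump": trump, "margin": biden - trump}

avg = simple_average(arizona_polls)
print(avg)

# How far the averaged margin sits from what each poll actually reported.
for p in arizona_polls:
    poll_margin = p["biden"] - p["trump"]
    print(p["label"], "margin", poll_margin,
          "differs from average by", abs(poll_margin - avg["margin"]))
```

The averaged margin lands 3.5 points away from both polls. It matches neither one, and nothing in the calculation asks which pollster did the better job.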

Only one group of analysts (538) at least tries, by banning some polls, rating pollsters with a letter grade and providing some background on the pollsters it publishes.

Everyone else just publishes.  Everyone averages the polls. In doing so, the people who average them look smart. They get to act as if they are analyzing something. They talk about resulting numbers as if they know something new.  They don’t.

Meantime, when the average is taken of numbers in battleground states, which by definition start out as close races, the results invariably revert close to 50-50. Last week there were two polls in Florida among many. The two outliers were conducted over the same period of time. One had Trump up 3 with almost 49%, very close to solid. The other had Biden up 5 at 51%. Average them and you get Biden by 1, a number that again says something entirely different from what either poll said. There were, of course, many other polls showing a close race. The average says it’s a toss up, including results from two polls that say Florida is not necessarily a toss up.
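Here is a small sketch of how that happens with the two Florida outliers. Margins are written from Biden’s side, so Trump up 3 is -3. The 2-point cutoff for calling a race a toss up is an assumption I picked for illustration, not an industry standard, and the sketch averages only these two polls, not the full pile.

```python
from statistics import mean

# Hypothetical cutoff: a lead under 2 points is labeled a toss up.
TOSS_UP_THRESHOLD = 2.0

# Margins from Biden's point of view, as quoted in this post.
florida_margins = {
    "poll with Trump ahead": -3.0,
    "poll with Biden ahead": 5.0,
}

def describe(margin, threshold=TOSS_UP_THRESHOLD):
    """Turn a Biden-minus-Trump margin into a one-phrase reading."""
    if abs(margin) < threshold:
        return "toss up"
    return "Biden lead" if margin > 0 else "Trump lead"

for label, margin in florida_margins.items():
    print(label, "->", describe(margin))

average_margin = mean(florida_margins.values())
print("average", average_margin, "->", describe(average_margin))
```

Each poll on its own reads as a lead for somebody. The average of the two reads as a toss up, a conclusion neither pollster reported.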

Again, though, one or both of these polls have to be wrong. They can’t be 8 points apart and both be right. What if one of these guys is right and everyone else is wrong? Wouldn’t it make sense to try to see who is doing this right, so we might know which to believe and follow? Averaging them relieves any journal of any such responsibility, while everyone gets publicity. If the guy showing Trump up 3 points last week was right, basically everyone else is wrong and we should abandon ship on almost everyone else. That would be worth knowing, instead of averaging that result into the pile and relying on something entirely different to be correct. You may not like that, but it is entirely possible, and it is completely obscured by using such averages.

Yes, most people will tell you Florida is a toss up without the benefit of a single call, based on past results. If somebody wins Florida by 5 points, it’s a landslide. Are any of the polls that say it’s a toss up saying so in part because it is the safe and accepted thing to predict? You could look pretty respectable putting out such a poll on Florida. No matter how you got the numbers, somebody would believe you if you say it’s close there.

For now all you can do is look at the numbers and remember, just because they are numbers does not mean they are correct.  This is not up to the standards of statistical work we would accept in any other area.  There is real danger if this kind of publicity grab undermines reliance on legitimate statistical work prepared for a courtroom or in a laboratory. Standards matter and averaging bad data is not useful for much more than entertainment and publicity.