Outlier Poll Results Are Inevitable. They’re Also Sometimes Right.

This afternoon’s New York Times/Siena College poll doesn’t look much like other polls.

It finds Donald J. Trump ahead by six percentage points among registered voters and three points among likely voters nationally. That’s his best result in a reputable national survey in months. You have to go back to a CNN/SSRS poll in April to find something showing him ahead by six points with registered voters. (A Quinnipiac poll today found Mr. Trump up by four.)

When a poll is considerably different from others, it’s often called an outlier, because it falls outside the range of the other data.

Outliers are no fun for pollsters, but they’re inevitable. Historically, outliers are less accurate when judged against final election results than polls that hew closer to the average of other polls. But outliers often spark a media frenzy — the distinct findings make them seem more surprising or newsworthy. A Selzer/Bloomberg poll from almost this exact date 12 years ago caused a circus when it was released: It showed Barack Obama leading Mitt Romney by 13 points (he won by less than four).

When outlier results come around, it’s generally better to look at the average of polls. As it happens, we released our poll averages for the cycle on Monday: Mr. Trump is ahead by one point over President Biden after including the latest Times/Siena poll. That’s a safer measure of where the race stands.

But that doesn’t necessarily mean the poll should be tossed out altogether. Sometimes, the outliers are right.

Why outliers are inevitable

Outliers are bound to happen.

Just as you can flip a coin “heads” five or six times in a row, simply by chance, you can survey 1,200 people and end up with 30 more Trump voters than you’d get over the longer run, simply by chance.

This kind of error is truly inevitable. It’s inherent in random sampling, and the “margin of error” describes how large this kind of chance error can plausibly be. In theory, the “truth” will be outside of the margin of error one in 20 times. Coincidentally, this is the 20th Times/Siena survey of the 2024 presidential election.
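To put rough numbers on this, here is a minimal Python sketch. It assumes a simple random sample of 1,200 people and the conventional 95 percent confidence level; real polls are weighted, which typically widens the margin, so these figures are illustrative and are not the poll’s reported margin of error. The function names are hypothetical.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Textbook 95% margin of error for one candidate's share,
    assuming a simple random sample (an assumption, not the poll's method)."""
    return z * math.sqrt(p * (1 - p) / n)

def chance_of_at_least_one_miss(num_polls, miss_rate=0.05):
    """Chance that at least one of several independent polls
    lands outside its margin of error."""
    return 1 - (1 - miss_rate) ** num_polls

moe = margin_of_error(1200)               # margin on a candidate's share
extra_share = 30 / 1200                   # 30 extra Trump voters, as a share
p_any = chance_of_at_least_one_miss(20)   # across 20 polls

print(f"Margin of error: {moe:.1%}")
print(f"30 extra voters out of 1,200: {extra_share:.1%}")
print(f"Chance of at least one miss in 20 polls: {p_any:.0%}")
```

Under these assumptions the margin is about 2.8 points on a candidate’s share, so 30 extra voters (2.5 points) sits within what chance alone can produce. And if each poll has a one-in-20 chance of missing, the odds that at least one of 20 polls misses are roughly 64 percent.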

If random sampling error is the reason the Times/Siena poll is an outlier, the next Times/Siena poll might look relatively “normal” in comparison. Indeed, the last Times/Siena national poll in April showed Mr. Trump ahead by two points among registered voters and one point among likely voters, which was similar to the averages at the time.

Looking more closely at the data, there’s a sign that this kind of random sampling error might be at play: It has to do with the response rate among Republicans versus Democrats.

If, hypothetically, an unusual number of Trump voters responded to our survey, simply by chance, we would expect to see an unusually high response rate among Republicans. That is what happened.

Overall, white registered Republicans were 17 percent likelier to respond to this survey than white Democrats (we’re looking at white respondents as a de facto control for race, as nonwhite voters are disproportionately Democratic and respond in lower numbers). This is the first Times/Siena national survey we’ve conducted in which white Republicans were likelier to respond than white Democrats: Our last three polls showed white Democrats at 3, 2 and 1 points likelier to respond.

Even though more Republicans responded to this Times/Siena poll, we weight the results by party registration, so the poll still shows the “correct” number of Democrats and Republicans to reflect the country as a whole. This kind of statistical adjustment reduces the chance that sampling errors yield outliers. But it doesn’t necessarily prevent outliers. It would not be enough if it turns out that Mr. Trump’s supporters were simply likelier to respond to the survey, regardless of party. In other words, weighting by party wouldn’t be enough if the higher response among Trump supporters also yielded too many independents who favor Mr. Trump and too many Democrats who favor him.
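The limits of party weighting can be seen in a toy Python example. Every number here is made up for illustration, and the one-variable weighting shown is far simpler than what pollsters actually do; it is not the Times/Siena procedure. The party shares, support rates and function names are all assumptions.

```python
# Hypothetical population shares by party registration (illustrative only).
TARGETS = {"D": 0.35, "R": 0.33, "I": 0.32}

def unweighted_support(sample):
    """Raw Trump share among everyone who responded."""
    respondents = sum(n for n, _ in sample.values())
    supporters = sum(t for _, t in sample.values())
    return supporters / respondents

def weighted_support(sample, targets):
    """Trump share after weighting each party to its population share."""
    return sum(targets[party] * (t / n) for party, (n, t) in sample.items())

# Each entry is (respondents, Trump supporters among them).
# Case A: too many Republicans responded, but within each party the
# respondents resemble that party's voters overall.
case_a = {"D": (400, 20), "R": (500, 460), "I": (300, 135)}

# Case B: same party mix, but Trump supporters were likelier to respond
# within every party, including Democrats and independents.
case_b = {"D": (400, 28), "R": (500, 470), "I": (300, 150)}

for name, sample in (("A", case_a), ("B", case_b)):
    raw = unweighted_support(sample)
    adj = weighted_support(sample, TARGETS)
    print(f"Case {name}: unweighted {raw:.1%}, weighted {adj:.1%}")
```

In Case A, weighting pulls the estimate back to what the hypothetical population would show. In Case B, the weighted number is still too high for Mr. Trump, because the skew lives inside each party group rather than in the mix of parties.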

It’s at least conceivable that this is what happened, whether by chance or because Mr. Trump’s supporters are newly motivated to express their support in the wake of his criminal conviction — though we might expect that to affect other surveys besides the Times/Siena poll.

The other kind of outlier

In October 2022, we published a poll of Kansas’ Third District that was pretty surprising: It found the Democrat, Sharice Davids, ahead by 14 points in a district Mr. Biden had won by only four.

The result looked like an outlier. There weren’t any public polls, but there had been many private polls showing a close contest. The race was rated as a tossup, and outside groups spent nearly $10 million on the district.

One Democratic pollster told me our numbers were “cray” and sent the laughing-while-crying emoji. An unnamed Democratic strategist told The Kansas City Star: “I don’t know what model the Times is using but this is going to be a very tight race. Our polls show it’s a true tossup at this point.”

The poll also exhibited the telltale sign of a skewed sample: Registered Democrats were nearly 70 percent likelier to respond to the poll than Republicans, the largest gap I can recall seeing.

Taken together, the poll sure looked like an outlier. It looked as if it was probably wrong.

In the end, the numbers were pretty accurate. In a stunner, Ms. Davids won by 12 points.

That’s the hardest thing about outliers: Sometimes they’re right. Yes, outliers are less likely to be accurate. Yes, they might simply be off by chance. But they might also reflect a new shift in the race that other pollsters haven’t yet reported. They might reflect something novel — and correct — about a survey’s methodology.

And the average of other polls might not be as trustworthy as it looks. What if other pollsters found something surprising as well, but then made tweaks to bring their results into line with the consensus? (This is called herding.) What if other pollsters with similar results just didn’t publish their results at all? It’s happened before. And while outliers are less likely to be accurate, the most valuable poll of all is the outlier that’s right.

Looking back, the Times/Siena poll has a record of publishing outlying results that prove to be accurate. When a final Times/Siena poll has differed from the average of other polls by at least four points, it has beaten that average in eight of nine cases. This doesn’t include the aforementioned Kansas survey, as there were no other public polls in the race.

This is a surprisingly strong record. As mentioned earlier, outlying poll results are usually less accurate than the average of other polls. There are only five pollsters with even a modest track record of publishing outliers (at least five outliers, defined as a four-point difference from the average of other polls, in the final month of a general election) that subsequently defeated the average of other polls.

Only the highly regarded poll run by Ann Selzer (defeating the average in 12 of 14 outliers) joins the Times/Siena poll in publishing at least five outliers that beat the average of the others more than two-thirds of the time.
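The bookkeeping behind these tallies can be sketched in a few lines of Python. This is one reasonable reading of the definitions above, not the actual analysis code: an outlier is a poll at least four points from the average of other polls, and it “beats” that average if it lands closer to the final result. The function names and the example margins are hypothetical; only the 8-of-9 and 12-of-14 counts come from the text.

```python
def is_outlier(poll_margin, others_average, threshold=4.0):
    """True if the poll differs from the average of other polls
    by at least `threshold` points."""
    return abs(poll_margin - others_average) >= threshold

def beats_average(poll_margin, others_average, result_margin):
    """True if the poll was closer to the final result than the average was."""
    return abs(poll_margin - result_margin) < abs(others_average - result_margin)

def beat_rate(wins, outliers):
    """Share of outliers that ended up beating the average."""
    return wins / outliers

# Made-up race: poll shows +9, other polls average +3, final result is +8.
print(is_outlier(9, 3))        # six points apart, so it counts as an outlier
print(beats_average(9, 3, 8))  # poll misses by 1, average misses by 5

# Counts reported in the article.
times_siena = beat_rate(8, 9)
selzer = beat_rate(12, 14)
print(f"Times/Siena: {times_siena:.0%}, Selzer: {selzer:.0%}")
```

Both rates clear the two-thirds bar, at roughly 89 percent and 86 percent. With so few outliers per pollster, though, these are small samples.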

Of course, there’s no reason to assume that the Times/Siena poll will continue to outperform the rest in the future. And even if one could assume so, it wouldn’t mean today’s Times/Siena poll was “right,” either. There’s no way to be certain before November whether an unexpected poll result is a statistical fluke or if it’s capturing something about the race that other polls aren’t. That Selzer outlier showing Mr. Obama up 13 points in 2012? It was probably just an outlier. Ms. Selzer defended the poll with a lengthy memo, but her next poll, taken when Mr. Obama had a larger lead after the conventions, showed a much more typical six-point lead for him.
