General Discussion
Within the Margin of Error doesn't mean what some think
Thinking about this because the Romney campaign says their internal polling in Ohio has Romney within the margin of error. That is meant to imply that the race is close, that it could go either way.
But Margin of Error is not a magic threshold below which a poll becomes arbitrary.
Jane 56%
Sally 44%
Margin of error +/- 4%
This is outside the MOE. What this means is that Jane probably has something close to 56%. There is a smaller chance that she has 55% or 57%. There is an even smaller chance that she has 54% or 58%. And so on. The chance of numbers further from 56% slopes away in a bell curve.
And it is 95% likely that Jane's correct total is within 4 points of 56%, that is, between 52% and 60%. The margin of error represents the size of the range with a 95% chance of being correct and, since the numbers within that range follow a bell curve, the numbers at the edges of the MOE are much less likely than the numbers in the middle of the MOE.
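A rough sketch of that bell curve in Python, for anyone who wants to check the numbers. It assumes the poll error is normally distributed and that the stated MOE is the 95% half-width; the helper names `share_distribution` and `prob_between` are made up for this illustration.

```python
from statistics import NormalDist

# z-score that leaves 2.5% in each tail (about 1.96)
Z95 = NormalDist().inv_cdf(0.975)

def share_distribution(estimate, moe):
    """Normal approximation: treat the MOE as the 95% half-width."""
    return NormalDist(mu=estimate, sigma=moe / Z95)

def prob_between(dist, low, high):
    """Probability that the true share lies between low and high."""
    return dist.cdf(high) - dist.cdf(low)

jane = share_distribution(56.0, 4.0)

inside_moe = prob_between(jane, 52, 60)   # the whole +/- 4 range
middle = prob_between(jane, 55, 57)       # two points in the middle
edge = prob_between(jane, 58, 60)         # two points at the upper edge

print(f"52-60: {inside_moe:.3f}")
print(f"55-57: {middle:.3f}")
print(f"58-60: {edge:.3f}")
```

The middle two points carry far more probability than the two points at the edge, even though both stretches sit inside the MOE.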
When someone describes a poll result as being within the margin of error, that sounds like the result is wholly unreliable, whereas if it were outside the margin of error it would be reliable. But the MOE is not a hard-edged thing.
Jane 56%
Sally 44%
Margin of error +/- 6%
In this case the result is within the margin of error, but it is extremely unlikely that Sally is ahead in the race.
There is a 95% chance that Jane has somewhere between 50% and 62%. There is only a 5% chance of Jane being outside that range, and only a 2.5% chance of her being under 50%, that is, of her being outside the range on the low end.
So we have a 97.5% chance that Jane has 50% or more, and it is likely she is ahead big. Jane leading Sally 60%-40% is exactly as likely as Jane leading by only 52%-48%.
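The same arithmetic as a short Python check, again assuming a normal error curve with the +/- 6 MOE as the 95% half-width (an assumption of this sketch, not something the poll itself states):

```python
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)           # about 1.96
jane = NormalDist(mu=56.0, sigma=6.0 / Z95)  # 56% with a +/- 6 MOE

p_below_50 = jane.cdf(50.0)       # chance Jane is really under 50%
p_at_least_50 = 1.0 - p_below_50  # chance Jane has 50% or more

# The curve is symmetric around 56, so points 4 above and 4 below
# have the same density.
density_60 = jane.pdf(60.0)
density_52 = jane.pdf(52.0)

print(f"P(Jane < 50)  = {p_below_50:.3f}")
print(f"P(Jane >= 50) = {p_at_least_50:.3f}")
print(f"density at 60 vs 52: {density_60:.4f} vs {density_52:.4f}")
```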
On the Road
(20,783 posts)
That is an excellent explanation. Even the smartest media people seem to fall into this fallacy. It is so widespread it's difficult to convince people otherwise because, hey, x, y, and z all said so.
Speck Tater
(10,618 posts)
They know that an educated populace will never go along with their bullshit.
unblock
(52,319 posts)
don't listen to the liberal math addict! keep watching paid media for the thrilling twists and turns in this neck-and-neck race!
HopeHoops
(47,675 posts)
But yes, that's an excellent explanation of the circumstances. With a sample size as small as 500 you can achieve a pretty decent statistical prediction (assuming the population sample is well chosen). Generally a little over 1000 is considered the acceptable minimum, but the larger n becomes, the more likely the results are representative of the entire population. The fly in the ointment is that population sample.
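For a sense of scale, here is a minimal sketch of how the margin of error shrinks as n grows. It uses the textbook formula for a simple random sample, z * sqrt(p(1-p)/n), with the worst case p = 0.5; real polls use weighting and likely-voter screens, so treat these as ballpark figures only. The function name `margin_of_error` is just for this example.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error in percentage points for a simple random sample."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

for n in (500, 1000, 2000):
    print(f"n = {n:5d}: +/- {margin_of_error(n):.1f} points")
```

Doubling the sample from 500 to 1000 only trims the MOE from roughly 4.4 to roughly 3.1 points, and none of it helps if the sample itself is badly chosen.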
A few days ago, I had a Republican pollster come to the door (very pleasant woman) and the two on her list were my wife (Democrat) and eldest daughter (Independent). I assumed I wasn't because I had switched to Republican to register a protest vote in the primary (wrote in Jill Stein of the Green party as a "none of the above" vote). We had no seriously contested races on the Democratic ticket.
I pointed out to her that the poll she was conducting was, by definition, statistically flawed due to the exclusion of Republicans. We're all voting for Obama, so they missed 1/3 of our household (voters) with that poll. She understood quite well, but that's what she'd been tasked to do. We were also in total agreement that the voter ID law is pure bullshit and that our problem isn't voter fraud, it's that less than half of eligible voters even bother to show up. She also gave me an absentee ballot for my eldest (senior in college) despite already knowing she would vote for Obama.
yellowcanine
(35,701 posts)
People also don't seem to understand that a whole bunch of close, within-the-MOE polls basically saying the same thing are more reliable than one poll which has a large difference outside the margin of error. This is particularly true for likely voter polls because assumptions about likely voters can be very wrong.
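A rough illustration of why several agreeing polls beat one: averaging independent polls shrinks the combined margin of error. The sketch below uses an inverse-variance weighted average and assumes the polls are independent and unbiased, which is a big assumption given the likely-voter problem just mentioned. The five poll numbers are invented for the example.

```python
import math

def pooled_moe(moes):
    """MOE of an inverse-variance weighted average of independent polls."""
    return 1.0 / math.sqrt(sum(1.0 / m ** 2 for m in moes))

def pooled_estimate(estimates, moes):
    """Inverse-variance weighted average of the poll estimates."""
    weights = [1.0 / m ** 2 for m in moes]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)

# Hypothetical: five polls of the same candidate, each +/- 4 points.
shares = [52.0, 53.0, 51.0, 53.0, 52.0]
moes = [4.0] * 5

estimate = pooled_estimate(shares, moes)
combined = pooled_moe(moes)
print(f"pooled: {estimate:.1f}% +/- {combined:.1f}")
```

Each poll on its own has 50% inside its +/- 4 range, but the pooled range (about 52.2 +/- 1.8) does not. Pooling cannot fix an error that all the polls share, such as a wrong likely-voter model.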
HopeHoops
(47,675 posts)
Jim Lane
(11,175 posts)
You give this example:
Jane 56%
Sally 44%
Margin of error +/- 6%
Then you write: "In this case the result is within the margin of error...."
As you stated, the margin of error means that there's no more than a 5% chance that the correct number differs from the reported number by more than the stated margin of error. BUT that's a margin for each candidate's total, not for the spread. If there's a 95% chance that Jane's total is within the range 50%-62%, and a 95% chance that Sally's is within the range 38%-50%, that doesn't mean that there's a 95% chance that they're tied or that Sally is leading, because both extreme events would have to happen for that to be the case.
Suppose these were independent variables -- Jane is running for the Senate and Sally is running for the House in a different state. Then the probability of both results being within the ranges given above would be .95 x .95 = .9025. But for them to be doing equally well, or for Sally to be ahead, Jane would have to fall below 50% or Sally would have to rise above 50%, and each of those is only a 2.5% one-sided event. The probability that at least one of them happens is 1 - (.975 x .975), or about 4.9%, and that is an upper limit on the probability that they're doing equally well or that Sally is ahead. If you're using a 95% confidence level (as implied by your use of the 95% mark, which I think is indeed the standard in these polls), then Jane's lead is not within the MOE.
The real world is more complicated because Jane and Sally are in the same race. The variables aren't independent -- there's a heavy covariance. Nevertheless, the covariance isn't 100%. Some people might respond Undecided/don't know/not sure, or, in many real polls, might be presented with or might volunteer the name of a minor-party candidate. Therefore, part of the uncertainty affecting each figure is that the correct result might show a lower percentage for that candidate but not a higher percentage for the other leading candidate.
The net result is that, if the MOE is given as 6%, then the MOE on the spread between the candidates is more than 6% (because calculating spread depends on two variables, each of which might be wrong) but less than 12% (because an error in Jane's number is not always matched by an equal and opposite error in Sally's, as it would be if every respondent picked one of the two). In the example you've given, with a spread of precisely 12%, Jane's lead is not within the margin of error. That is to say, if this situation is repeated 100,000 times, Jane will win more than 95,000 of the races.
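Those bounds on the spread can be checked with a short sketch. It assumes a simple random sample where each respondent picks Jane, Sally, or neither, so the variance of the spread is (pJ + pS - (pJ - pS)^2) / n. It also assumes the reported 6% is the usual worst-case figure computed at a 50% share. The function names and the 53%/41% split with 6% undecided are invented for illustration.

```python
import math

Z95 = 1.96

def implied_sample_size(moe, p=0.5):
    """Sample size that would produce the stated MOE (as a fraction) at share p."""
    return (Z95 / moe) ** 2 * p * (1 - p)

def spread_moe(p_jane, p_sally, n):
    """95% MOE on (p_jane - p_sally) when both shares come from one sample."""
    variance = (p_jane + p_sally - (p_jane - p_sally) ** 2) / n
    return Z95 * math.sqrt(variance)

n = implied_sample_size(0.06)                # sample behind a +/- 6% poll

two_way = spread_moe(0.56, 0.44, n)          # everyone picks Jane or Sally
with_undecided = spread_moe(0.53, 0.41, n)   # hypothetical 6% undecided

print(f"implied n: {n:.0f}")
print(f"spread MOE, no undecideds:   {100 * two_way:.1f} points")
print(f"spread MOE, with undecideds: {100 * with_undecided:.1f} points")
```

Under these assumptions the MOE on the spread lands between 6 and 12 points, and much closer to 12, so a 12-point lead sits just outside it.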
Of course, your central point is that "margin of error" doesn't mean "It's impossible for our poll number to be more than this far away from the correct number." In that, you're completely correct, and it's a point often overlooked or misunderstood.
treestar
(82,383 posts)
I was thinking just as the Rs want us to - that if Jane has 52% and the margin of error is 4% that it could be 48% and that Sally was a threat because if she had 48% she could have 52%!