
cthulu2016

(10,960 posts)
Wed Sep 26, 2012, 01:25 PM

Within the Margin of Error doesn't mean what some think

Thinking about this because the Romney campaign says their internal polling in Ohio has Romney within the margin of error. That is meant to imply that the race is close—that it could go either way.

But Margin of Error is not a magic threshold below which a poll becomes arbitrary.


Jane 56%
Sally 44%

Margin of error +/- 4%


This is outside the MOE. What this means is that Jane probably has something close to 56%. There is a smaller chance that she has 55% or 57%. There is an even smaller chance that she has 54% or 58%. And so on. The chance of numbers further from 56% slopes away in a bell curve.

And it is 95% likely that Jane's correct total is within 4 points of 56%—is between 52%-60%. The margin of error represents the size of the range with a 95% chance of being correct and, since the numbers within that range are a bell curve, the numbers at the edges of the MOE are much less likely than the numbers in the middle of the MOE.
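The bell-curve picture can be sketched numerically. This is only a minimal sketch, assuming the poll's error is roughly normal and that the stated MOE is the half-width of a 95% interval; the helper name `share_distribution` is made up for illustration.

```python
from statistics import NormalDist

# z-value that leaves 2.5% in each tail of a normal curve (about 1.96)
Z95 = NormalDist().inv_cdf(0.975)

def share_distribution(reported_pct, moe_pct):
    """Normal approximation to a candidate's true share, in percentage points.

    Assumes the stated margin of error is the half-width of a 95% interval.
    """
    return NormalDist(mu=reported_pct, sigma=moe_pct / Z95)

jane = share_distribution(56.0, 4.0)

# Chance the true share lies inside the margin of error (52% to 60%)
inside_moe = jane.cdf(60.0) - jane.cdf(52.0)

# Height of the bell curve at the edge of the MOE relative to its peak
edge_vs_middle = jane.pdf(60.0) / jane.pdf(56.0)

print(f"P(52% <= true share <= 60%) = {inside_moe:.3f}")
print(f"curve height at 60% relative to 56% = {edge_vs_middle:.2f}")
```

Under these assumptions the curve at 60% is only about 15% as high as it is at 56%, which is the sense in which the edges of the MOE are much less likely than the middle.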

When someone describes a poll result as being within the margin of error, that sounds like the result is wholly unreliable, whereas if it was outside the margin of error it would be reliable. But the MOE is not a hard-edged thing.

Jane 56%
Sally 44%

Margin of error +/- 6%


In this case the result is within the margin of error, but it is extremely unlikely that Sally is ahead in the race.

There is a 95% chance that Jane has somewhere between 50% and 62%. There is only a 5% chance of Jane being outside that range, and only a 2.5% chance of her being under 50%—of her being outside the range on the low end.

So we have a 97.5% chance that Jane has 50% or more, and it is likely she is ahead big. Jane leading Sally 60%-40% is exactly as likely as Jane leading by only 52%-48%, since both are 4 points away from 56%.
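The numbers for the 6% case can be checked the same way. Again a rough sketch, assuming a normal error curve centered on the reported 56% with the MOE as the 95% half-width; the variable names are illustrative only.

```python
from statistics import NormalDist

# z-value that leaves 2.5% in each tail of a normal curve (about 1.96)
Z95 = NormalDist().inv_cdf(0.975)

# Jane's true share under the assumed normal model: 56% reported, MOE +/- 6
jane = NormalDist(mu=56.0, sigma=6.0 / Z95)

p_below_50 = jane.cdf(50.0)      # chance Jane is actually under 50%
p_at_least_50 = 1 - p_below_50   # chance Jane has 50% or more

# The curve is symmetric around 56%, so 60% and 52% are equally likely
density_60 = jane.pdf(60.0)
density_52 = jane.pdf(52.0)

print(f"P(Jane < 50%)  = {p_below_50:.3f}")
print(f"P(Jane >= 50%) = {p_at_least_50:.3f}")
print(f"curve height at 60% vs 52%: {density_60:.4f} vs {density_52:.4f}")
```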

On the Road

(20,783 posts)
1. Thank You --
Wed Sep 26, 2012, 01:27 PM

That is an excellent explanation. Even the smartest media people seem to fall into this fallacy. It is so widespread it's difficult to convince people otherwise because, hey, x, y, and z all said so.

 

Speck Tater

(10,618 posts)
2. Yet another reason why Republicans want the people to remain ignorant.
Wed Sep 26, 2012, 01:27 PM

They know that an educated populace will never go along with their bullshit.

unblock

(52,319 posts)
3. neigh! it's a photo-finish i tells ya! down to the wire!
Wed Sep 26, 2012, 01:49 PM

don't listen to the liberal math addict! keep watching paid media for the thrilling twists and turns in this neck-and-neck race!

 

HopeHoops

(47,675 posts)
4. Math is hard.
Wed Sep 26, 2012, 02:47 PM

But yes, that's an excellent explanation of the circumstances. With a sample size as small as 500 you can achieve a pretty decent statistical prediction (assuming the population sample is well chosen). Generally a little over 1000 is considered the acceptable minimum, but the larger n becomes the more likely the results are representative of the entire population. The fly in the ointment is that population sample.
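To put rough numbers on how sample size drives the margin of error, here is a small sketch. It assumes a simple random sample and uses the textbook worst case of a 50/50 split; real polls weight their samples, so their published MOEs can differ. The function name `margin_of_error` is just for illustration.

```python
import math
from statistics import NormalDist

def margin_of_error(n, p=0.5, confidence=0.95):
    """Margin of error in percentage points for a simple random sample of size n.

    p is the assumed true share; 0.5 is the worst case and the usual default.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 100 * z * math.sqrt(p * (1 - p) / n)

for n in (500, 1000, 2000):
    print(f"n = {n:>4}: MOE = +/- {margin_of_error(n):.1f} points")
```

Under these assumptions a sample of 500 gives roughly +/- 4.4 points and a sample of 1000 roughly +/- 3.1. None of this helps if the sample itself is badly chosen, which is the fly in the ointment.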

A few days ago, I had a Republican pollster come to the door (very pleasant woman) and the two on her list were my wife (Democrat) and eldest daughter (Independent). I assumed I wasn't on it because I had switched to Republican to register a protest vote in the primary (wrote in Jill Stein of the Green Party as a "none of the above" vote). We had no seriously contested races on the Democratic ticket.

I pointed out to her that the poll she was conducting was, by definition, statistically flawed due to the exclusion of Republicans. We're all voting for Obama, so they missed 1/3 of our household (voters) with that poll. She understood quite well, but that's what she'd been tasked to do. We were also in total agreement that the voter ID law is pure bullshit and that our problem isn't voter fraud, it's that less than half of eligible voters even bother to show up. She also gave me an absentee ballot for my eldest (senior in college) despite already knowing she would vote for Obama.

yellowcanine

(35,701 posts)
7. And statistics are particularly baffling for a lot of people.
Wed Sep 26, 2012, 04:48 PM

People also don't seem to understand that a whole bunch of close, within-the-MOE polls basically saying the same thing are more reliable than one poll which has a large difference outside the margin of error. This is particularly true for likely voter polls because assumptions about likely voters can be very wrong.
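One way to see why several agreeing polls beat a single one is to pool them. This is a rough sketch with made-up poll numbers, and it assumes every poll is an independent simple random sample of the same electorate; pooling only shrinks sampling error and does nothing about a shared bad assumption such as a wrong likely-voter screen.

```python
import math
from statistics import NormalDist

# z-value that leaves 2.5% in each tail of a normal curve (about 1.96)
Z95 = NormalDist().inv_cdf(0.975)

def pooled(polls):
    """Combine polls given as (share_pct, sample_size) pairs.

    Returns the sample-size-weighted share and its margin of error, both in
    percentage points, treating the polls as one big simple random sample.
    """
    total_n = sum(n for _, n in polls)
    share = sum(s * n for s, n in polls) / total_n
    p = share / 100
    moe = 100 * Z95 * math.sqrt(p * (1 - p) / total_n)
    return share, moe

# Hypothetical polls, invented for illustration: (candidate's share, sample size)
polls = [(52.0, 600), (51.0, 800), (53.0, 500), (52.0, 700)]

share, moe = pooled(polls)
print(f"pooled share = {share:.1f}%, pooled MOE = +/- {moe:.1f} points")
```

Each of these invented polls on its own has a margin of error of 3.5 to 4.4 points, so a lead of a point or three looks "within the MOE" every time, yet the pooled margin comes out just under 2 points.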

 

HopeHoops

(47,675 posts)
8. My statistics professor in college was amazing. I gained both the mechanics and a deep understanding
Wed Sep 26, 2012, 05:01 PM
 

Jim Lane

(11,175 posts)
5. I think there's a minor error in your explanation
Wed Sep 26, 2012, 04:22 PM

You give this example:

Jane 56%
Sally 44%

Margin of error +/- 6%


Then you write: "In this case the result is within the margin of error...."

As you stated, the margin of error means that there's no more than a 5% chance that the correct number differs from the reported number by more than the stated margin of error. BUT that's a margin for each candidate's total, not for the spread. If there's a 95% chance that Jane's total is within the range 50%-62%, and a 95% chance that Sally's is within the range 38%-50%, that doesn't mean that there's a 5% chance that they're tied or that Sally is leading, because both extreme events would have to happen for that to be the case.

Suppose these were independent variables -- Jane is running for the Senate and Sally is running for the House in a different state. Then the probability of both results being within the ranges given above would be .95 x .95 = .9025. Whenever both are in range, Jane is doing at least as well as Sally, so the probability that Jane is actually doing better than Sally is at least about 90.25%, and the probability that Sally is ahead is at most 9.75%. That is only a loose bound: Sally could pull ahead only if the two errors together wiped out a 12-point gap, which is much less likely than either number alone being off by 6. If you're using a 95% confidence level (as implied by your use of the 95% mark, which I think is indeed the standard in these polls), then Jane's lead is not within the MOE.
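For the hypothetical independent case, the chance that Jane's true number exceeds Sally's can also be worked out directly from the difference of the two totals. This is a sketch under assumptions the post does not spell out: each total has a normal error with the 6-point MOE as its 95% half-width, and the two errors are independent.

```python
import math
from statistics import NormalDist

# z-value that leaves 2.5% in each tail of a normal curve (about 1.96)
Z95 = NormalDist().inv_cdf(0.975)

# Standard deviation of each candidate's total, from a +/- 6 point MOE
sigma = 6.0 / Z95

# Difference of two independent normals: means subtract, variances add
lead = NormalDist(mu=56.0 - 44.0, sigma=math.sqrt(2) * sigma)

p_jane_ahead = 1 - lead.cdf(0.0)
print(f"P(Jane's true number > Sally's) = {p_jane_ahead:.4f}")
```

Under these assumptions the figure comes out well above 99%, noticeably higher than .95 x .95, because Jane can still be ahead when one or both totals fall outside their ranges.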

The real world is more complicated because Jane and Sally are in the same race. The variables aren't independent -- there's a heavy covariance. Nevertheless, the covariance isn't 100%. Some people might respond Undecided/don't know/not sure, or, in many real polls, might be presented with or might volunteer the name of a minor-party candidate. Therefore, part of the uncertainty affecting each figure is that the correct result might show a lower percentage for that candidate but not a higher percentage for the other leading candidate.

The net result is that, if the MOE is given as 6%, then the MOE on the spread between the candidates is more than 6% (because calculating spread depends on two variables, each of which might be wrong) but no more than 12% (because even when the overreporting of Jane's number coincides exactly with the underreporting of Sally's, the two errors can only add up to double the single-candidate figure). In the example you've given, with a spread of precisely 12%, Jane's lead is not within the margin of error. That is to say, if this situation is repeated 100,000 times, Jane will win more than 95,000 of the races.
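The size of the MOE on the spread can be estimated with a standard multinomial sampling model. This is a sketch under assumptions of my own: a simple random sample, a sample size of 267 (chosen only because it gives roughly a 6-point MOE for a single candidate), and a second scenario with 8% undecided that is invented for comparison. The function names are illustrative.

```python
import math
from statistics import NormalDist

# z-value that leaves 2.5% in each tail of a normal curve (about 1.96)
Z95 = NormalDist().inv_cdf(0.975)

def single_moe(p, n):
    """MOE in points for one candidate's share p in a sample of size n."""
    return 100 * Z95 * math.sqrt(p * (1 - p) / n)

def spread_moe(p1, p2, n):
    """MOE in points for the lead p1 - p2 when both come from the same sample.

    Uses the multinomial variance of a difference:
    Var(p1 - p2) = (p1 + p2 - (p1 - p2)**2) / n
    """
    return 100 * Z95 * math.sqrt((p1 + p2 - (p1 - p2) ** 2) / n)

n = 267  # hypothetical sample size giving about a 6-point single-candidate MOE

two_way = spread_moe(0.56, 0.44, n)         # everyone picks Jane or Sally
with_undecided = spread_moe(0.52, 0.40, n)  # 8% undecided, same 12-point lead

print(f"single-candidate MOE:        +/- {single_moe(0.56, n):.1f}")
print(f"spread MOE, two-way race:    +/- {two_way:.1f}")
print(f"spread MOE, 8% undecided:    +/- {with_undecided:.1f}")
```

In a strict two-way race the spread's MOE works out to exactly twice the single-candidate figure, and it drops a little below that once some respondents are undecided or choose someone else.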

Of course, your central point is that "margin of error" doesn't mean "It's impossible for our poll number to be more than this far away from the correct number." In that, you're completely correct, and it's a point often overlooked or misunderstood.

treestar

(82,383 posts)
6. Thank you - that is great to know
Wed Sep 26, 2012, 04:37 PM

I was thinking just as the Rs want us to - that if Jane has 52% and the margin of error is 4% that it could be 48% and that Sally was a threat because if she had 48% she could have 52%!
