
cthulu2016

(10,960 posts)
Tue Aug 28, 2012, 01:20 PM Aug 2012

How big a sample do these polls have? Big enough.

Last edited Tue Aug 28, 2012, 06:29 PM - Edit history (1)

It may be hard to believe that a national poll that samples only 1,000 people is a reliable picture of a nation of 300 million.

But, like many counter-intuitive things in math and science, a sample of 1,000 is plenty, and enlarging the sample would not make the poll dramatically more accurate.

The key is the quality of the sample, not the size. A low quality sample will not yield accurate results in any scenario. It is very difficult to have a reliable sample.

But a good sample will, as a matter of straightforward mathematical fact, yield reliable results at a surprisingly small size... much smaller than intuition suggests would be necessary.

Start flipping a coin and recording the results. You will quickly get the idea that the proportion of heads and tails is roughly 50-50. As you add more flips to the sample, each flip is less useful—one flip moves the percentages less and less as the sample of flips expands.

After you have coin flips down to around 50-50 (say, 49%-51%) there is little point to more flips. You will home in closer and closer to 50%-50%, but 49.5%-50.5% is not a big improvement on 49%-51%. (And even if you flip 1,000,000 times it is quite unlikely that you will get exactly 500,000 heads.)
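For anyone who would rather not flip a real coin, here is a minimal sketch of that experiment, assuming Python's random module is a fair enough coin. The function name, the seed, and the checkpoints are arbitrary choices for illustration.

```python
import random

def running_heads_share(total_flips, checkpoints, seed=2012):
    """Flip a simulated fair coin and record the heads share at each checkpoint."""
    rng = random.Random(seed)  # fixed seed only so the run is repeatable
    wanted = set(checkpoints)
    heads = 0
    shares = {}
    for i in range(1, total_flips + 1):
        heads += rng.random() < 0.5  # True counts as 1 head
        if i in wanted:
            shares[i] = heads / i
    return shares

shares = running_heads_share(100_000, [10, 100, 1_000, 10_000, 100_000])
for n, share in shares.items():
    print(f"{n:>7} flips: heads = {share:.1%}")
```

Each tenfold increase in flips buys a smaller and smaller correction to the heads percentage, which is the diminishing return described above.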

However many flips are in your sample, you express the results with a margin of error. Say your coin flip result is heads 51%-49% with a margin of error of plus or minus 2%. That does not mean that heads 53% or 49% are as likely as 51%. It means that the likeliest number is heads 51%, and that there is a 95% chance that the correct heads number is within 49%-53%. Both 49% and 53% are a good bit less likely than 51%.

Since our coin flip sample wasn't dead-on 50-50 the poll is not perfectly accurate, but it is accurate in what it says. It is very, very likely that heads = 49%-53%, which is correct.

And adding another 1,000 flips will not improve that result much. Say we get to 50.3%-49.7% with a margin of error of 1%. So it is 95% certain that the correct probability of heads is between 49.3% and 51.3%, and the likeliest result is 50.3%.

(I'm making those MOEs up as examples. I don't know the MOE95% on 1,000 coin flips, but there is some number of flips with an MOE of 2% and a larger number of flips with an MOE of 1%.)
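For reference, the usual textbook approximation can fill in those numbers. This is a sketch under stated assumptions, not how any particular pollster computes it: it assumes independent flips (a simple random sample), the normal approximation, and z = 1.96 for 95% confidence. The function names are made up for illustration.

```python
import math

Z_95 = 1.96  # assumed z-score for a 95% confidence level

def margin_of_error(n, p=0.5, z=Z_95):
    """Approximate 95% margin of error for a proportion p measured on n samples."""
    return z * math.sqrt(p * (1 - p) / n)

def flips_needed(moe, p=0.5, z=Z_95):
    """Smallest sample size whose approximate margin of error is at most moe."""
    return math.ceil(z * z * p * (1 - p) / (moe * moe))

for n in (100, 1_000, 2_000, 10_000):
    print(f"{n:>6} flips: +/- {margin_of_error(n) * 100:.1f} points")

print(flips_needed(0.02), "flips for a margin of about 2 points")
print(flips_needed(0.01), "flips for a margin of about 1 point")
```

Under those assumptions 1,000 flips come out at about plus or minus 3.1 points, a 2-point margin takes roughly 2,400 flips, and a 1-point margin takes roughly 9,600. Halving the margin costs about four times the sample, which is why bigger samples pay off so slowly.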

If our poll of flipped coins said that heads=50.3%, the most we can improve that headline number is 0.3%, even with an infinite number of flips.

On the other hand...the quality of the sample is almost everything. If you start out with a lop-sided coin that comes up 60% heads you will never get to 50-50. The sample is unrepresentative of coins.

Put 900,000 blue marbles and 100,000 red marbles in a big jar and stir them up. You won't have to pull out many marbles to get the idea that 90% of them are blue. But if they are not stirred up properly then you would have to count almost the whole million.
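A rough sketch of the jar, assuming the worst case for the unstirred version: all the red marbles were poured in last and sit on top. The scoop size of 1,000 and the seed are arbitrary.

```python
import random

BLUE, RED = 900_000, 100_000
SCOOP = 1_000

def blue_share(marbles):
    """Fraction of the given marbles that are blue."""
    return sum(m == "blue" for m in marbles) / len(marbles)

# Unstirred jar (assumed layout): every red marble sits on top of the blues.
jar = ["red"] * RED + ["blue"] * BLUE

rng = random.Random(0)
top_scoop = jar[:SCOOP]                # scooping off the top of the unstirred jar
stirred_scoop = rng.sample(jar, SCOOP) # a random draw stands in for good stirring

print(f"whole jar:      {blue_share(jar):.1%} blue")
print(f"stirred scoop:  {blue_share(stirred_scoop):.1%} blue")
print(f"unstirred scoop: {blue_share(top_scoop):.1%} blue")
```

The two scoops are the same size. Only the stirring differs, and only the stirred one lands near 90%.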

Unstirred marbles are a bad sample.

If a pollster assumes that black voters will be 7% of the electorate then his poll will be useless no matter how large the sample. If a pollster called only people who are home at 4 PM the result could never be a reliable snapshot of everybody, even if a million people were called at 4 PM.
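To put made-up numbers on the 4 PM problem: the sketch below assumes a candidate has 52% support among all voters but only 44% among people who answer the phone at 4 PM. Both figures are hypothetical, chosen only to show the effect.

```python
import random

TRUE_SUPPORT = 0.52          # hypothetical support among all voters
HOME_AT_4PM_SUPPORT = 0.44   # hypothetical support among people reachable at 4 PM

def poll(n, support, rng):
    """Share of n simulated respondents who back the candidate."""
    return sum(rng.random() < support for _ in range(n)) / n

rng = random.Random(4)
small_good = poll(1_000, TRUE_SUPPORT, rng)            # small, representative
huge_biased = poll(200_000, HOME_AT_4PM_SUPPORT, rng)  # huge, 4 PM callers only

print(f"true support:             {TRUE_SUPPORT:.1%}")
print(f"1,000 representative:     {small_good:.1%}")
print(f"200,000 called at 4 PM:   {huge_biased:.1%}")
```

The huge sample gives a very precise estimate of the wrong group. More calls shrink the sampling noise but leave the gap between 44% and 52% untouched.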

If another pollster has a perfect sample of the electorate, a national poll of only 500 people would usually be quite close to reality. But he would still want 800-1200 to have a low margin of error.

The mathematical parts of the science of polling are quite dependable. All of these poll samples are large enough to say what they say.

It is the composition of the sample that matters. And that varies a great deal from poll to poll. And there is little point in picking a favorite set of assumptions... in picking one pollster to trust... because no pollster always has the right assumptions.

And that is why averaging polls works better. It isn't because the sample is larger, but because the things that affect the quality of the samples are averaged out.

There will usually be at least one poll that is closer to the result than the average of polls. But we do not know which one that will be, and the average of polls will be more reliable than the typical individual poll.
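A toy simulation of that claim, with loud assumptions: each pollster's sample is off by its own random amount (a "house effect" with a 2-point spread, a number invented here), those errors are independent and centered on zero, and sampling noise for 1,000 respondents is approximated as normal. If every pollster leaned the same way, averaging would not help.

```python
import math
import random

TRUE_SHARE = 0.50    # hypothetical true support for a candidate
HOUSE_SD = 0.02      # hypothetical spread of per-pollster sample bias
SAMPLE_SIZE = 1_000
POLLS = 8
TRIALS = 2_000

SAMPLING_SD = math.sqrt(TRUE_SHARE * (1 - TRUE_SHARE) / SAMPLE_SIZE)

def one_poll(rng):
    """One pollster's result: truth + that pollster's bias + sampling noise."""
    return TRUE_SHARE + rng.gauss(0, HOUSE_SD) + rng.gauss(0, SAMPLING_SD)

def simulate(seed=48):
    """Return (typical single-poll miss, typical average miss, share of
    trials in which at least one individual poll beat the average)."""
    rng = random.Random(seed)
    single_error = average_error = 0.0
    some_poll_beat_average = 0
    for _ in range(TRIALS):
        polls = [one_poll(rng) for _ in range(POLLS)]
        misses = [abs(p - TRUE_SHARE) for p in polls]
        avg_miss = abs(sum(polls) / POLLS - TRUE_SHARE)
        single_error += misses[0]
        average_error += avg_miss
        some_poll_beat_average += min(misses) < avg_miss
    return (single_error / TRIALS, average_error / TRIALS,
            some_poll_beat_average / TRIALS)

single, averaged, beat_rate = simulate()
print(f"typical miss, one poll:      {single * 100:.2f} points")
print(f"typical miss, average of {POLLS}:  {averaged * 100:.2f} points")
print(f"trials where some poll beat the average: {beat_rate:.0%}")
```

The last line counts how often some individual poll ended up closer than the average, which is the catch mentioned above: such a poll may exist, but nothing in the data says in advance which one it is.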


Agnosticsherbet

(11,619 posts)
1. I see a lot of discussions about which poll is bad and why...
Tue Aug 28, 2012, 01:48 PM
Aug 2012

Just as I see discussions from the right and left saying why those polls that do not conform with their own ideas are made by unrepentant fugbuckers.

This is an excellent discussion of polls and why they are accurate, and I will point friends who do not understand polling to this, especially those who do not read DU.

 

Spitfire of ATJ

(32,723 posts)
2. Another trick is to poll 50% urban and 50% rural...
Tue Aug 28, 2012, 01:53 PM
Aug 2012

...That implies half the population lives in the country, but let's face it, with agribusiness most of that land is corporate owned so a 50% rural isn't an accurate reflection of overall state-wide opinion.

Jim__

(14,088 posts)
3. Big enough for what?
Tue Aug 28, 2012, 02:00 PM
Aug 2012

In the example that you give, 1,000 voters out of a population of 300,000,000. Yes, across a homogeneous population concerning a simple majoritarian vote, the margin of error is accurate. Is the presidential election across a homogeneous population and is it a simple majoritarian vote? No and no.

Suppose the national poll of 1,000 voters has Romney up by 5 points. What does that tell you about the current state of the race? Suppose, at the same time, polls in all the swing states give Obama a minimum lead in each state of 2%. What does that tell you about the current state of the race?

Right now, selective demographic polls against targeted populations give a lot of information to the campaigns. They tell them what part of their campaign is working and what isn't. They tell them where to focus their efforts.

For the rest of us in the general population? They don't tell us much.

cthulu2016

(10,960 posts)
6. National polling does often offer usefully predictive results
Tue Aug 28, 2012, 02:21 PM
Aug 2012

Though the election is determined on a state-by-state basis, nobody reading this has ever seen a race where the popular vote winner did not get enough people casting votes intending to vote for him to also win the electoral college.

(The Florida butterfly-ballot didn't record those votes, but even Pat Buchanan admits that the majority of votes cast in Florida in 2000 were intended for Gore.)

So yes, the national popular vote is, historically, a fine predictor of the electoral college outcome.

Perfect polls in all 50 states would be even better, but since state polling is, historically (for whatever reasons), more variable than national polling we get a garbage-in/garbage-out problem.

The aggregation of 50 more variable state polls does not necessarily yield a better result.

Yes, a candidate could (legitimately) win the electoral college while getting fewer national votes. It has almost happened a few times.

But that does not, in itself, make state-by-state polling superior. One should usually look at both, and within the limitations of what they purport to say.

A convention bounce will tend to move all the state polls. A targeted TV ad campaign in Nevada will affect only Nevada.

A national poll saying Romney is up by 1% is not really saying Romney will win, it is offering an indirect sense of how Romney would do the day the poll was conducted... that if the sample is perfectly representative of who will vote, that it is 60-70% likely that Romney would get more votes on that day, but not many more.
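A back-of-the-envelope check on that 60-70% figure. This is only a sketch: it assumes a two-candidate race with no undecideds, a perfectly representative sample of 1,000, the normal approximation, and no prior information about the race. The function name is made up.

```python
import math

def prob_really_ahead(lead, n):
    """Rough chance the poll leader is truly ahead, given the polled lead.

    lead is the gap between the two candidates as a fraction (0.01 = 1 point).
    """
    se_share = math.sqrt(0.25 / n)  # standard error of one candidate's share
    se_lead = 2 * se_share          # the gap moves twice as much as one share
    z = lead / se_lead
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF

for points in (1, 2, 4, 7):
    p = prob_really_ahead(points / 100, 1_000)
    print(f"lead of {points} point(s): about {p:.0%} likely to be truly ahead")
```

Under those assumptions a 1-point lead works out to a bit over 60%, not much better than a coin flip, while a 7-point lead is close to certain on the day the poll was taken.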

And news organizations tend to increasingly report 1% differences as a tie, which is a good trend.

People read poll results as much stronger statements than the results themselves claim to be. So in that sense, people should be somewhat less reliant on them, psychologically.

And yes, anyone interested would want to look at a mix of national and state polls. And the closer the national polls the more interest one would have in state polls.

If Obama was up by 7% nationally he wins. Same for Romney. At 1%-2% it becomes more interesting to see state polls in Florida or Virginia or Ohio because in that case the national polls are not really telling us who will win.

Jim__

(14,088 posts)
9. "If Obama was up by 7% nationally he wins." - No.
Tue Aug 28, 2012, 03:00 PM
Aug 2012

At this time in '88, Dukakis was up by 7 nationally. The Bush campaign changed tactics. These polls contain information that is useful to the campaign. There is not much information for the general population.

cthulu2016

(10,960 posts)
11. I assumed it was understood that I was referring to the last pre-election poll
Tue Aug 28, 2012, 06:15 PM
Aug 2012

That's why Donald Trump and Herman Cain are not the Republican nominee... and why John Glenn didn't become president in 1984.

Jim__

(14,088 posts)
12. I thought you were referring to "these" polls.
Tue Aug 28, 2012, 07:08 PM
Aug 2012

People do best to ignore "these" polls - that is, the polls that we are seeing now.

 

HopeHoops

(47,675 posts)
4. Curiously, 1024 will only give you a 4.3% margin of error. Seems weird, but it's true.
Tue Aug 28, 2012, 02:12 PM
Aug 2012

I don't bother with polls until very close to the election, and I prefer sample sizes in the 100K range with a MOE of around 2.3%, but that's the math geek in me speaking.

surrealAmerican

(11,364 posts)
5. ... and yet, there is a minimum sample size.
Tue Aug 28, 2012, 02:21 PM
Aug 2012

In the coin flipping example, if you are calculating the odds based on only three flips, you are guaranteed to get useless data. Do we know what the minimum sample size is for voter polls?

cthulu2016

(10,960 posts)
8. Each sample size has a corresponding margin of error
Tue Aug 28, 2012, 02:28 PM
Aug 2012

It seems like three coin flips would have a MOE95% of about 100%!

Different pollsters aim for MOEs anywhere from 2%-4% and that dictates the sample size they need.

National election polls seem to always be in the 800-1200 sample range. At this point in the cycle there are a lot of likely voter polls that have an overall sample over a thousand to reach a likely voter sample of about 800.

A MOE approaching 5% seems to be discarded. Nobody seems to make it down to 2%. (That's just my impression, I am not looking at any polls while saying that.)
