Then there's the fact that actual response rates on polling are incredibly small. It used to be that pollsters could call folks and, 40% of the time, the person who picked up the phone would go on to give answers that were then used in the report on that poll. Pew reports that now 91% of all calls go unanswered or result in a non-response.
These non-responses end up not being random, due to the vested-interest thesis; to wit, certain groups will answer all polling calls simply to make sure their response is counted. This gives them an unrepresentative importance in the polls. Such groups are mostly on the left.
A reasonable question would be: what are pollsters doing if they no longer have reliable random-sample responses? And the answer would be that they are engaging in AUGURY.
The likely cause of this drop in response rates is CALLER ID. People who get called now have a choice about answering the phone. They aren't answering pollsters' calls!
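The point about non-random response can be made with simple arithmetic. Here is a toy sketch (all numbers hypothetical, not from any actual poll) showing how unequal answer rates skew a phone sample even when the dialing itself is random:

```python
# Toy illustration of non-random response bias: random dialing plus
# unequal answer rates yields a non-random set of respondents.

def respondent_share(pop_a, rate_a, pop_b, rate_b):
    """Return group B's share among those who actually answer the phone."""
    answered_a = pop_a * rate_a
    answered_b = pop_b * rate_b
    return answered_b / (answered_a + answered_b)

# Hypothetical electorate split 50/50, but group B answers twice as often.
share_b = respondent_share(50, 0.09, 50, 0.18)
print(round(share_b, 3))  # group B looks like roughly 2/3 of the sample
```

With a 50/50 electorate and one group answering at double the rate, that group appears as about two-thirds of the respondents, so the raw sample misstates the population before any questions are even asked.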
There can be no better example of how crazy polling has been this year than the recent Pew and National Journal polls. Both were taken over the exact same period of time, and both had their data collected by the same company, Princeton Survey Research Associates.
Pew had the race at 47-47, with Romney having a decided turnout and enthusiasm advantage.
National Journal had it 50-45, in favor of Obama, with the internals favoring Obama.
Likely very similar sets of raw data, two very different assumptions as to the likely electorate.
I believe you are 90% correct, but there are a few more factors involved.
Cell phones rarely get polled.
Caller ID helps to filter out unwanted calls, but a lot of the poor who do get polled cannot afford it, and they are more than willing to participate in the polls if they think it will get them more.
People with jobs are not at home during the day to answer the phone, and at night they filter out calls from numbers they don't recognize.
In short, polling by phone has gotten to be the most inaccurate method available, yet it is the cheapest.
“The secret to random sample surveys is to keep them random. Once you get into weighting, or attempting to fit results to a turnout model, they are no longer random.”
Absolutely true, but random samples will not work in this type of statistical study. A random sample from a population of humans will have a significant, and often hard-to-understand or hard-to-measure, sample selection bias. Sample selection bias is the biggest factor in gutting an otherwise properly constructed statistical study.

I remember at school many years ago we had a presentation from a visiting economist about a study she did on the benefits of a government program providing free pre-natal vitamins. Early on in the presentation I raised my hand and asked whether she had accounted for sample selection bias (meaning that those who already cared about the health of their babies are the ones who would take the time to show up for the free vitamins). I was not trying to be rude; the question just sort of came out. Well, one could tell from the tone in the room for the rest of her presentation that we thought her study was essentially worthless.
A random sample would work if, for example, we want a sample from a series of products being produced at a manufacturing plant. In this case, either a random sample or a periodic sample would be fine.
However, a random sample of voters is not so easy. If we use phones, we will have a bias. If we find voters on the street, we will have a bias, etc. Therefore, the method is to divide the voting population into many sub-categories, get a random sample from each category, and then add the categories together with a weighted average. And then we have the rub: how does one allocate the weights to the various samples? This is where years of experience and prior data are vital to the poll. But still, accurately weighting something that will happen in the future is really not so easy to do.
The most accurate method is to get a random sample of voters as they are leaving a voting booth. Of course, even in this case we may have sample selection bias (a Republican voter at a Democratic box may, for example, not want to participate in the survey; or the voting pattern of early or absentee voters may be very different from the pattern on election day; etc.).
By the way, in our Texas county a young election worker for a local candidate worked up our early-voting turnout numbers. By looking at turnout at the local voting precinct level (voting-box level) and comparing it to prior turnout, we can tell that the precincts with a higher percentage of Republicans are turning out at higher rates than the precincts with a lower percentage of Republicans.