Very nice explanation of sample selection bias.
I would like to hear you discuss a little more how to balance the various samples in order to get an accurate reflection of reality.
This is not my area, but as I understand it, it is generally done with proprietary models built on closely guarded historical data: prior surveys and voting results, correlated with other voting patterns.
For your reading, I just found a very good article summarizing the process:
On a side note: I am convinced that many liberal pollsters are nothing more than political propaganda hacks. On the other hand, I am not sure what to think about the various polls. Nate Silver, certainly no dummy but also a potential political hack, has a very accurate track record (his few misses have been split about evenly between favoring and disfavoring the Republican), and his analysis concerns me. Then again, Nate could be completely genuine in his modeling and still be missing differences in voter turnout and motivation.
Think of it this way: say I run the paint department at a hardware store. The paint is all white. When a customer wants a certain color, I use a machine that adds a formula of different pigments to make that color. Even if I enter the instructions into the machine correctly, if the machine (unknown to me) deposits more of a certain pigment than I thought, the outcome will not be what I expected. That's what polling is like: trying to accurately determine the fractional weighting among up to a hundred different pigments. Yes, within each pigment the sample is random, but those many results need to be mixed together, and it is in this weighted average that the practice is as much art as it is science.
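To put rough numbers on the paint analogy, here is a minimal sketch of that weighted average. Everything in it is made up for illustration: the three groups, their support rates, and both turnout mixes are hypothetical, and real pollsters use far more groups and their own proprietary weights. It only shows how a topline number moves when the mix turns out different from what was assumed, even though each group was measured correctly.

```python
def weighted_estimate(support, weights):
    """Combine per-group support rates into one topline figure.

    support: group -> fraction of that group backing the candidate
    weights: group -> that group's assumed share of the electorate
    """
    total = sum(weights.values())
    return sum(support[g] * weights[g] for g in support) / total


# Hypothetical support within each group (the "pigments"), measured correctly.
support = {"group_a": 0.60, "group_b": 0.45, "group_c": 0.30}

# Hypothetical mix the pollster assumed vs. the mix that actually turned out.
assumed_turnout = {"group_a": 0.30, "group_b": 0.40, "group_c": 0.30}
actual_turnout = {"group_a": 0.36, "group_b": 0.40, "group_c": 0.24}

estimate = weighted_estimate(support, assumed_turnout)
outcome = weighted_estimate(support, actual_turnout)

print(f"poll estimate:  {estimate:.3f}")
print(f"actual outcome: {outcome:.3f}")
print(f"miss:           {outcome - estimate:+.3f}")
```

With these invented numbers, shifting six points of turnout from one group to another moves the topline by about 1.8 points, without any error in the per-group samples themselves.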
I do know, based on our local analysis, that higher-percentage R precincts seem to be voting at a relatively higher pace than lower-percentage R precincts (certainly a good bump for our local R candidates).