Brewster Kahle's Internet Archive
Posted on 10/14/2012 10:31:24 AM PDT by thecodont
Brewster Kahle was a 19-year-old computer science student at the Massachusetts Institute of Technology when a friend posed a simple, yet life-changing question: "What can you do with your life that is worthwhile?"
Kahle came up with two answers. The first, developing a microchip to ensure the privacy of telephone conversations, didn't pan out. But 32 years later, Kahle is still happily pursuing his second big idea - to create the digital-age version of the Great Library of Alexandria.
His Internet Archive - fittingly based in an old Richmond District church that architecturally harks back to the ancient Egyptian library - is building a rich repository of modern digital culture. It's best known for the online Wayback Machine, which provides a searchable online museum of the Internet, archiving more than 150 billion Web pages that have appeared since 1996. The nonprofit archive stretches beyond the Internet. It has recorded 350,000 television news broadcasts, including reports from around the world during the week of the 2001 terrorist attacks, and stores 200,000 digitized books.
The archive's nearly 10 petabytes of material - equivalent to about 10 billion books - also include 900,000 audio files, among them 9,000 fan-made recordings of Grateful Dead concerts. Volunteers are even converting old home movies and stock footage of post-World War II San Francisco into digital form.
It's a mind-boggling, and constantly growing, amount of digital data, and it's all available for free, as the site's welcome says, to "researchers, historians, scholars, and the general public." With 50 times as much data expected to be produced over the next decade, it will be an ever-increasing challenge to capture, catalog and store it.
Read more: http://www.sfgate.com/news/article/Brewster-Kahle-s-Internet-Archive-3946898.php
(Excerpt) Read more at sfgate.com ...
Link to the archive. You can find things the media and others would like to have disappeared.
It would be more useful if it A) archived pages faster and B) didn’t honor the “no robots no follow” crap.