Free Republic

Full-up Google choking on web spam?
The Register | Thursday 4th May 2006 | Andrew Orlowski

Posted on 05/04/2006 12:14:59 PM PDT by nickcarraway

Webmasters have been seething at Google since it introduced its 'Big Daddy' update in January, the biggest revision to the way its search engine operates for years.

Alarm usually accompanies changes to Google's algorithms, as the new rankings can cause websites to be demoted, or disappear entirely. But four months on from the introduction of "Big Daddy," it's clear that the problem is more serious than with any previous revision - and it's getting worse.

Webmasters now report sites not being crawled for weeks, with Google SERPS (search engine results pages) returning old pages, and failing to return results for phrases that used to bear fruitful results.

"Some sites have lost 99 per cent of their indexed pages," reports one member of the Webmaster World forum. "Many cache dates go back to 2004 January." Others report long-extinct pages showing up as "Supplemental Results."

This thread (http://groups.google.com/group/alt.internet.search-engines/browse_thread/thread/5a15acac5a0245ce/b429366a28a29507#b429366a28a29507) is typical of the problems.

With junk web pages so cheap and easy to create, Google is engaged in an arms race with search engine optimizers. Each innovation designed to bring clarity to the web, such as tagging, is rapidly exploited by spammers or site owners wishing to harvest some classified advertising revenue.

Recently, we featured a software tool called Blog Mass Installer that can create 100 Blogger weblogs in 24 minutes. A subterranean industry of sites providing "private label articles," or PLAs, exists to flesh out "content" for these freshly minted sites. And as a result, legitimate sites are often caught in the crossfire.

But the new algorithms may not be solely to blame. Google's chief executive Eric Schmidt has hinted at another reason for the recent chaos. In Google's earnings conference call last month, Schmidt was frank about the extent of the problem.

"Those machines are full," he said (http://www.iht.com/articles/2006/04/21/business/GOOGLE.php). "We have a huge machine crisis."

And there's at least some anecdotal evidence to support the theory that hardware limitations are to blame.

"The issue I have now is Googlebot is SLAMMING my sites since last week, but none of it makes it into the index. If it's old pages being re-indexed or new pages for the first time, they don't show up," writes one webmaster.

The confusion has several consequences which we've rarely seen discussed outside web circles.

Giving Google the benefit of the doubt, and assuming the changes are intentional, one webmaster writes: "In which case Google's index, and hence effectively 'the Web as most people know it' is set to become a whole lot smaller in the coming weeks."

It's barely more than a year since Yahoo! and Google were engaged in a willy-waving exercise to claim who had the largest index. (See My spam-filled search index is bigger than yours! (http://www.theregister.co.uk/2005/08/16/google_yahoo_junk/))

Now size, it seems, doesn't matter.

There's also the intriguing question raised by search engines that are unable to distinguish between nefarious sites and legitimate SEO (search engine optimization) techniques. The search engines can't, we now know, blacklist a range of well-established techniques without causing chaos. In future, will the search engines need to code for backward bug compatibility?

And lingering in the background is the question of whether the explosion of junk content - estimates put robot-generated spam at anywhere between one-fifth and one-third of the Google index - can be tamed.

"At this rate," writes one poster on the Google Sitemaps Usenet group, "in a year the SERPS will be nothing but Amazon affiliates, Ebay auctions, and Wiki clones. Those sites don't seem to be affected one bit by supplemental hell, 301s, and now deindexing."

With $8 billion in the bank, Google is better resourced and more focussed than anyone - but it's still struggling. Financial analysts noted that its R&D expenditure now matches that of a wireline telco.

Only a cynic would suggest that poor SERPs drive desperate businesses to the search engines' own classified ad departments - so if you want to play, you have to pay. Banish that unworthy thought at once.


TOPICS: Business/Economy; Crime/Corruption; Culture/Society; Extended News; Miscellaneous; News/Current Events; US: California
KEYWORDS: google; internet; spam; web

1 posted on 05/04/2006 12:15:01 PM PDT by nickcarraway

To: ShadowAce

ping


2 posted on 05/04/2006 12:15:12 PM PDT by nickcarraway

To: nickcarraway
Ah yes, more complaining instead of going out and building a better mousetrap.

I love Google, I don't care how liberal they are.

3 posted on 05/04/2006 12:16:46 PM PDT by Extremely Extreme Extremist (FR's most controversial FReeper)

To: rdb3; chance33_98; Calvinist_Dark_Lord; Bush2000; PenguinWry; GodGunsandGuts; CyberCowboy777; ...

4 posted on 05/04/2006 12:17:29 PM PDT by ShadowAce (Linux -- The Ultimate Windows Service Pack)

To: Extremely Extreme Extremist

I have to say I find google less efficient these days (privacy issues?), but they are still more on the ball than any of the others. Pretty much tried them all. If anyone has good suggestions, clue me.


5 posted on 05/04/2006 12:22:59 PM PDT by timsbella (Mark Steyn for Prime Minister of Canada!)

To: ShadowAce

Try www.DogPile.com

I am not associated with anyone!


6 posted on 05/04/2006 12:25:02 PM PDT by Freeper (I was culture in the 60's and now with Clinton "running things" I am suddenly Counter-Culture.)

To: timsbella

The best one ever was Northern Lights, but it went to paid only several years ago. I got much more on that one than I ever did on Google, although that was the second best. How I have missed it!


7 posted on 05/04/2006 12:28:01 PM PDT by twigs

To: Freeper

I've used:

dogpile, yahoo, lycos, webcrawler, ixmetasearch, clusty, and dmoz

and I think these are among the better ones, but still, a flawed google produces more good results


8 posted on 05/04/2006 12:28:05 PM PDT by timsbella (Mark Steyn for Prime Minister of Canada!)

To: nickcarraway

I don't think Google "IMPROVED." I think they sold out to the spammers and hackers. The end result is that they created more problems than they supposedly solved...


9 posted on 05/04/2006 12:28:31 PM PDT by Mrs. Darla Ruth Schwerin

To: twigs

Yes, that was a good one. Wasn't it taken over by Overture (which also isn't bad, but not among the very best)? I like HighBeam for news headlines. Very nice. Technorati is good for blogs.


10 posted on 05/04/2006 12:29:40 PM PDT by timsbella (Mark Steyn for Prime Minister of Canada!)

To: Freeper



I thought Dogpile just organizes results from other search engines...?
11 posted on 05/04/2006 12:37:01 PM PDT by macamadamia

To: Mrs. Darla Ruth Schwerin

I agree, they sold out. I used an engine called Wisenut, which is still around, but after it was swallowed up by Looksmart it seemed like they "dumbed it down". I feel the same way about Google. As I posted before, not as good, but still better than the rest.


12 posted on 05/04/2006 12:38:10 PM PDT by timsbella (Mark Steyn for Prime Minister of Canada!)

To: timsbella

I really don't know what happened to my Northern Lights. I just know that I've missed it a lot. I do a lot of genealogical research online and it yielded a lot more of the dead people I needed than any other. It was excellent on current news too.


13 posted on 05/04/2006 12:40:21 PM PDT by twigs

To: nickcarraway
If some of these web sites would spend half that effort on their content and maybe buy a little paid advertising they may be more successful.

Some folks obsess over search engine rankings.

14 posted on 05/04/2006 12:42:10 PM PDT by Flyer (Tag line removed to appease humblegunner)

To: Extremely Extreme Extremist
I love Google, I don't care how liberal they are.

I quit using it because of FR reports as to Google’s leftist tendencies. Plenty of alternatives out there.

15 posted on 05/04/2006 12:43:45 PM PDT by stillonaroll

To: nickcarraway
Only a cynic would suggest that poor SERPs drive desperate businesses to the search engines' own classified ad departments - so if you want to play, you have to pay. Banish that unworthy thought at once.

Far and away the most important part of the article.

16 posted on 05/04/2006 12:45:37 PM PDT by rattrap

To: nickcarraway

I used to use google the way I would use the encyclopedia, now it's more like the yellow pages.


17 posted on 05/04/2006 12:47:43 PM PDT by Old Professer (The critic writes with rapier pen, dips it twice, and writes again.)

To: twigs

If that's your thing, I like to bypass ancestry.com and use rootsweb - you can search the SSDI for free. The Mormons have an amazing site called familysearch.org - traced my husband's family back to the early 1800s and all of my Jewish relatives in Europe! No idea how they gleaned all this stuff. Great international links.


18 posted on 05/04/2006 12:48:13 PM PDT by timsbella (Mark Steyn for Prime Minister of Canada!)

To: Old Professer

http://accoona.com/


19 posted on 05/04/2006 12:51:35 PM PDT by navysealdad

To: stillonaroll
I quit using it because of FR reports as to Google’s leftist tendencies. Plenty of alternatives out there.

Got any good URLs for alternate search engines?

20 posted on 05/04/2006 12:55:41 PM PDT by Centurion2000 (Before I refuse to take your questions, I have an opening statement. - Reagan)

To: nickcarraway
Does any search engine do a good job of searching inside Free Republic? Google has generally been awful if I want to look inside of FR. Yahoo's advanced search is what I have been using recently. Is anything better?
21 posted on 05/04/2006 12:59:09 PM PDT by KarlInOhio (Contrary to those who say that United 93 was released too soon, I fear it was shown far too late.)

To: Extremely Extreme Extremist
I love Google, I don't care how liberal they are.

I tried looking for the terms 911 + hijacker + biography and I got pages and pages and pages of conspiracy theories. It took me longer than I wanted to find some real information.

When you search on 911 hijackers and get "Brad's 911 Investigation"...honest. Then there is a problem.
22 posted on 05/04/2006 1:06:26 PM PDT by Arkinsaw

To: timsbella

Yes, I love familysearch. I've found great stuff there. I, too, avoid ancestry. A lot of my research is now in storage and I'm not going to shell out money until I have everything in front of me. How exciting about the info you've found on your family. My major family is from the burned counties of Virginia, so research is difficult.


23 posted on 05/04/2006 1:17:49 PM PDT by twigs

To: timsbella

That's probably why they did it. They were so far ahead, they figured it'd be okay to "get a little stupid" for $$$$...


24 posted on 05/04/2006 1:21:46 PM PDT by Mrs. Darla Ruth Schwerin

To: Centurion2000

ask.com got a good review in the WSJ. I use Google but mostly out of force of habit.


25 posted on 05/04/2006 1:33:05 PM PDT by Tribune7

To: timsbella

Very cool, thanks for the advice - anyone else have any genealogical advice? I've been thinking about getting back into it and tracing down both my family tree and my husband's family tree. I used to have some pretty good charts, but they've gotten lost over time...


26 posted on 05/04/2006 1:36:06 PM PDT by Kaylee Frye

To: timsbella

I've tried one called brainboost.com that seems interesting.


27 posted on 05/04/2006 1:37:51 PM PDT by isom35

To: nickcarraway
I'm still finding Google the best of a bunch of not very good options, too. My biggest criticism is that they don't seem to weed. Once they get pages in, they seem to stay that way. Often the top hits are sites that have been dead for ages.

What's especially strange about that is that their cached version of a page tends to be pretty up to date. So why aren't their robots reporting when they revisit a site and it's kaput?

28 posted on 05/04/2006 2:54:48 PM PDT by prion (Yes, as a matter of fact, I AM the spelling police)

To: nickcarraway

I've largely given up on Google. I use Yahoo and A9 (which no longer gets its results from Google).

And when I do have to use Google, I do it with Firefox, with the proper extensions installed to ensure I don't see their ads and they don't get to track my actions.


29 posted on 05/04/2006 3:12:16 PM PDT by Dont Mention the War (This tagline is false.)

To: nickcarraway
"Google's index, and hence effectively 'the Web as most people know it' "

I still do over half of my searching with Google, but it's getting less satisfactory all the time. I'm seldom looking for porn or something to buy.

30 posted on 05/04/2006 3:23:33 PM PDT by mrsmith

To: Tribune7
I have been using Clusty as my #1 search, but I get the best kind of results I need by using Ask Jeeves and just asking a question in about 6 or 7 words.
31 posted on 05/04/2006 3:25:44 PM PDT by tubebender (Tagline...I don't need no stinking tagline...)

To: timsbella
At one time Google did have the best engine, but when they started skewing results based on inbound links, sites with incredibly relevant content but no incoming links got pushed to the bottom of the result listings. Naturally a huge market in buying/selling inbound links has sprung up, and those that can buy them or engage in referrer spam tend to get top ranking.

Nor has Google done much to expand beyond regular old listings with things like clustering, the way Vivisimo (my current favorite) does.

I have a great example from just a couple of days ago; I needed tech info on a product. Searched Google and got a bunch of sites selling it. Fired it up through Vivisimo and got the tech spec page right away.
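
The inbound-link effect described above can be sketched with a toy model. This is only an illustration, not Google's actual algorithm: it assumes a ranking that sorts purely by inbound-link count, and the page names, relevance scores, and link counts are all made up.

```python
# Toy model only: rank pages purely by inbound-link count.
# Page names, relevance scores, and link counts are hypothetical.
from dataclasses import dataclass


@dataclass
class Page:
    name: str
    relevance: float    # hypothetical 0..1 content-match score
    inbound_links: int  # hypothetical number of links pointing at the page


def rank_by_links(pages):
    """Sort by inbound links first; relevance only breaks ties."""
    return sorted(pages, key=lambda p: (p.inbound_links, p.relevance), reverse=True)


def rank_by_relevance(pages):
    """Sort by content relevance alone, ignoring links."""
    return sorted(pages, key=lambda p: p.relevance, reverse=True)


pages = [
    Page("tech-spec-page", relevance=0.95, inbound_links=0),
    Page("reseller-a", relevance=0.40, inbound_links=120),
    Page("reseller-b", relevance=0.35, inbound_links=300),
]

link_order = [p.name for p in rank_by_links(pages)]
relevance_order = [p.name for p in rank_by_relevance(pages)]
print(link_order)
print(relevance_order)
```

In this made-up data the most relevant page has no inbound links, so it comes last under link-based ranking and first under relevance-based ranking.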

32 posted on 05/04/2006 3:36:26 PM PDT by Proud_texan (I'm gonna break my rusty cage and run)

To: Proud_texan

I've used vivisimo, and it's not bad. If you want consumer advice, I really like epinions. They have not steered me wrong yet.


33 posted on 05/04/2006 3:38:28 PM PDT by timsbella (Mark Steyn for Prime Minister of Canada!)

To: Dont Mention the War

I've found A9 more than a little "hamstrung" of late. I like their add on features though.


34 posted on 05/04/2006 3:39:59 PM PDT by timsbella (Mark Steyn for Prime Minister of Canada!)

Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.

