Posted on 11/07/2006 8:42:00 PM PST by annie laurie
I was fortunate enough to be the first journalist to get an in-depth demonstration of Powerset, which is a natural-language search engine. Simply put, the technology analyzes the meaning and relationships of words in context so that it can accommodate questions asked in natural language, such as "How're the Giants doing," rather than questions asked in sketchy "keywordese" inputs, like "scores, Giants."
...
Since Powerset indexes a fraction of a fraction of what Google currently handles, we confined our test of Powerset to searching The New York Times and Wikipedia sites and then checked how Google stacked up when doing the same.
Sample input: "What does News Corp. own?" In Powerset, the top 3 results were very relevant -- and specific. They included a link to a document about Fox TV studios owned by News Corp; another was a link to Balkan News, owned by News Corp. Yet another was about Foxtel being owned by News Corp. The same search on Google generated relatively relevant results, such as a link to the Wikipedia page about News Corp. But one result was a link to a Netscape "news" story about President Bush and how the U.S. "does" not torture prisoners.
In capturing the Netscape page, Google implied that its technology found relevancy because the words "news" and "does" were in close proximity. But the result had nothing to do with the company News Corp. In the query, "news" was referring to a company name and "does" referred to an action by that company.
That's the problem with existing search engines, Powerset's founders say. Conventional search engines index words based on how often they're mentioned and how close they are to one another. Where they fall short is that they don't index the relationships between words or the meanings of the words ...
(Excerpt) Read more at marketwatch.com ...
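To make the excerpt's complaint concrete, here is a minimal sketch of proximity-based keyword matching. It is not how Google actually works; the two documents, the tokenizer, and the function names are made up for illustration. It only shows why a page where "news" and "does" happen to sit near each other can match, while a page that really is about News Corp. ownership does not.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def min_gap(tokens, a, b):
    """Smallest distance (in tokens) between any occurrence of a and any of b.

    Returns None if either word is missing from the token list.
    """
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)

# Hypothetical documents, loosely modeled on the results described in the excerpt.
doc_unrelated = "A news story says the U.S. does not torture prisoners."
doc_company = "Foxtel is partly owned by News Corp."

gap_unrelated = min_gap(tokenize(doc_unrelated), "news", "does")
gap_company = min_gap(tokenize(doc_company), "news", "does")

# The unrelated story contains both keywords close together, so it matches.
# The ownership page never uses the word "does", so a pure keyword matcher misses it.
print(gap_unrelated, gap_company)
```

A matcher like this has no idea that "News" is part of a company name or that "does ... own" asks about a relationship, which is the gap the Powerset founders are pointing at.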
Askjeeves cheated. They used human editors to precook results for common questions.
However, ask.com is actually somewhat useful. In Bambi's interview with Pell, he cites searching for "books by children" as an example of a search that is hard for regular search engines but easy for Powerset (because it understands the preposition "by" instead of tossing it as noise). Ask.com actually does better on this problem than Google. If you submit the query to Ask, it shows "Narrow Your Search" links over on the right, which are more or less on target.
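A quick sketch of what "tossing it as noise" means. The stopword list here is an assumption picked for the example, not any real engine's list. Once "by" and "for" are dropped, two queries with opposite meanings reduce to the same keywords.

```python
# Assumed stopword list, chosen only for this illustration.
STOPWORDS = {"a", "an", "the", "of", "by", "for"}

def keywords(query):
    """Split a query on whitespace and drop stopwords, as a naive engine might."""
    return [word for word in query.lower().split() if word not in STOPWORDS]

written_by = keywords("books by children")
written_for = keywords("books for children")

# Both queries collapse to the same keyword list, so the engine
# cannot tell books written by children from books written for them.
print(written_by, written_for)
```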
On the other hand, there are limits. Neither Google nor Ask does very well on "journalist named after a disney character" [Google] [Ask], for instance.
I find that the hard queries are the ones where the mere conjunction of the keywords isn't enough: the keywords themselves don't have much selectivity, but their relationship does. It's a hard problem. It requires the computer to understand the ideas presented in the text at some level, pick out and index the semantic relationships, then understand the query at a similar level and search the index. It will be interesting to see if Powerset is any more successful than the existing attempts.
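Here is a toy sketch of the "index the relationships, then search the index" idea. It skips the hard part entirely: the triples are written by hand (using the ownership facts mentioned in the excerpt) and the query arrives already parsed into a subject and a relation. Extracting both from free text is the real problem, and nothing here claims to be how Powerset does it.

```python
from collections import defaultdict

# Hand-written (subject, relation, object) triples. A real system would
# have to extract these from text; here they are simply assumed.
TRIPLES = [
    ("News Corp", "owns", "Fox TV studios"),
    ("News Corp", "owns", "Balkan News"),
    ("News Corp", "owns", "Foxtel"),
]

def build_index(triples):
    """Map (subject, relation) to the list of objects seen with that pair."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[(subject.lower(), relation)].append(obj)
    return index

def lookup(index, subject, relation):
    """Return the objects related to subject by relation, or an empty list."""
    return index.get((subject.lower(), relation), [])

index = build_index(TRIPLES)

# "What does News Corp. own?" is assumed to be pre-parsed into this pair.
owned = lookup(index, "News Corp", "owns")
print(owned)
```

The lookup is trivial once the relationships exist. The selectivity comes from the pair (subject, relation) rather than from either word alone, which is the point made above.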
Powerset's future: (1) forget about it or (2) get acquired by Google or Microsoft or (3) give up on being a public search engine in favor of producing a more powerful intranet search product or (4) displace Google (least likely).
"HP Deskjet" "ink cartridge" "next day delivery" and "visa"
Surprisingly, only 107 links. Now if you remove the term 'and', the count goes up to 416.
And moreover, when you include 'and', Google tells you it has no effect, yet clearly it does.
I tried your query on Froogle, and it gave no results at all until I got rid of the last two terms. One would think Froogle could tackle the delivery modes and payment methods problems, given they already know the context is shopping.
You would think that Verizon would be smart enough to automatically search for pages containing the phrase "cell phones"... but no... they just give "We are not able to process your request."
The same applies to just about any web site out there. Even Google should be able to handle the request. Try typing http://www.google.com/cell%phones and you get back an error page. Good grief... a search engine site that cannot even be bothered to search for a reasonable answer.
So much to do... so little time...
Let's try Pell's example, books by children. Not bad.
Let's see if it can tell us how long a shake is. Oops, make that time, not shingles. Nope, can't get shingles out of its silicon head.
Now let's google it. Bingo! First hit.
This is a GOOD THING. Google's politics make me yearn for another option.
http://www.google.com/search?q=d'oh
http://www.google.com/search?q=cell+phones+site%3Averizon.com
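For what it's worth, the two URLs above work because the query goes in the q parameter of /search. The earlier cell%phones address is a path rather than a query, and a bare % that is not followed by two hex digits is not valid percent-encoding. A small sketch of building such a URL with Python's standard library follows; the helper name google_search_url is made up for the example.

```python
from urllib.parse import urlencode

def google_search_url(query):
    """Build a search URL with the query percent-encoded in the q parameter."""
    return "http://www.google.com/search?" + urlencode({"q": query})

url = google_search_url("cell phones site:verizon.com")
print(url)  # http://www.google.com/search?q=cell+phones+site%3Averizon.com
```

Spaces become + and the colon becomes %3A, which matches the second URL posted above.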
Their response is
Sorry, Lexxe is not sure if there is a correct answer for your query. Please check the web and cluster results. Thank you.
Thailand?
The answer is:
bluesuitmom
(Opinion) Google is easy enough to use by putting in keywords.
Google is clean and uncluttered, and yet it is more successful at getting advertising revenue than its competitors.