Some online reviews are too good to be true; Cornell computers spot 'opinion spam'
Posted on 07/27/2011 8:36:23 AM PDT by LibWhacker
If you read online reviews before purchasing a product or service, you may not always be reading the truth. Review sites are becoming targets for "opinion spam" -- phony positive reviews created by sellers to help sell their products, or negative reviews meant to downgrade competitors.
The bad news: Human beings are lousy at identifying deceptive reviews. The good news: Cornell researchers are developing computer software that's pretty good at it. In a test on 800 reviews of Chicago hotels, a computer was able to pick out deceptive reviews with almost 90 percent accuracy. In the process, the researchers discovered an intriguing correspondence between the linguistic structure of deceptive reviews and fiction writing.
The work was reported at the 49th annual meeting of the Association for Computational Linguistics in Portland, Ore., June 24, by Claire Cardie, professor of computer science; Jeff Hancock, associate professor of communication; and graduate students Myle Ott and Yejin Choi.
"While this is the first study of its kind, and there's a lot more to be done, I think our approach will eventually help review sites identify and eliminate these fraudulent reviews," Ott said.
The researchers created what they believe to be the first "gold standard" collection of opinion spam by asking a group of people to deliberately write false positive reviews of 20 Chicago hotels. These were compared with an equal number of carefully verified truthful reviews.
As a first step, the researchers submitted a set of reviews to three human judges -- volunteer Cornell undergraduates -- who scored no better than chance in identifying deception. The three did not even agree on which reviews they thought were deceptive, reinforcing the conclusion that they were doing no better than chance. Historically, Ott noted, humans suffer from a "truth bias," assuming that what they are reading is true until they find evidence to the contrary. When people are trained at detecting deception they may become overly skeptical and report deception too often, still scoring at chance levels.
The researchers then applied computer analysis based on subtle features of text. Truthful hotel reviews, for example, are more likely to use concrete words relating to the hotel, like "bathroom," "check-in" or "price." Deceivers write more about things that set the scene, like "vacation," "business trip" or "my husband." Truth-tellers and deceivers also differ in the use of keywords referring to human behavior and personal life, and sometimes in features like the amount of punctuation or frequency of "large words." In parallel with previous analysis of imaginative vs. informative writing, deceivers use more verbs and truth-tellers use more nouns.
Using these approaches, the researchers trained a computer on a subset of true and false reviews, then tested it against the rest of the database. The best results, they found, came from combining keyword analysis with the ways certain words are combined in pairs. Adding these two scores identified deceptive reviews with 89.8 percent accuracy.
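The train-then-test idea described above can be sketched in a few lines. This is only an illustration, not the researchers' software: it swaps in a simple Naive Bayes classifier, pools single words and adjacent word pairs into one feature set rather than adding two separate scores, and uses six invented example reviews. The names `features` and `NaiveBayes` are made up for this sketch.

```python
import math
from collections import Counter

def features(text):
    """Lowercased words plus adjacent word pairs (bigrams)."""
    words = [w.strip('.,!?;:"()') for w in text.lower().split()]
    words = [w for w in words if w]
    pairs = [a + "_" + b for a, b in zip(words, words[1:])]
    return words + pairs

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (a stand-in classifier)."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)
        self.feature_counts = {label: Counter() for label in self.doc_counts}
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = features(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.feature_counts.items():
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[label] / total_docs)
            for f in features(text):
                score += math.log((counts[f] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)

# Invented toy reviews, loosely echoing the word patterns the article mentions.
train_texts = [
    "The bathroom was clean and the check-in was quick. Price was fair.",
    "Small room but good price. Bathroom had low water pressure.",
    "Check-in took ten minutes. The price included parking.",
    "My husband and I loved our vacation here. A wonderful experience!",
    "Perfect for a business trip. My husband said it felt like luxury.",
    "We had an amazing vacation. My family will return on our next business trip.",
]
train_labels = ["truthful"] * 3 + ["deceptive"] * 3

model = NaiveBayes().fit(train_texts, train_labels)

# Held-out examples the model never saw during training.
held_out = [
    "The price was fair and the bathroom was clean.",
    "My husband enjoyed our vacation.",
]
for review in held_out:
    print(model.predict(review), "|", review)
```

With only six training reviews this proves nothing about real-world accuracy; it just shows the shape of the procedure: build word and word-pair counts from a labeled subset, then score reviews the model has not seen.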
Ott cautions that the work so far is only validated for hotel reviews, and for that matter, only reviews of hotels in Chicago. The next step, he said, is to see if the techniques can be extended to other categories, starting perhaps with restaurants and eventually moving to consumer products. He also wants to look at negative reviews.
This sort of software might be used by review sites as a "first-round filter," Ott suggested. If, say, one particular hotel gets a lot of reviews that score as deceptive, the site should investigate further.
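A "first-round filter" of this kind might look something like the sketch below. It assumes each review has already been scored as deceptive or not by some classifier; the function name, the thresholds (`min_reviews`, `max_flagged_share`) and the hotel names are all hypothetical and would need tuning on real data.

```python
from collections import defaultdict

def hotels_to_investigate(scored_reviews, min_reviews=5, max_flagged_share=0.4):
    """Return hotels whose share of reviews flagged as deceptive is unusually high.

    scored_reviews: iterable of (hotel_name, flagged_as_deceptive) pairs.
    Hotels with fewer than min_reviews reviews are skipped as too sparse to judge.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for hotel, is_deceptive in scored_reviews:
        totals[hotel] += 1
        if is_deceptive:
            flagged[hotel] += 1
    return sorted(
        hotel
        for hotel in totals
        if totals[hotel] >= min_reviews
        and flagged[hotel] / totals[hotel] > max_flagged_share
    )

# Invented example data: (hotel, flagged-by-classifier)
scored = (
    [("Hotel A", True)] * 4 + [("Hotel A", False)] * 2
    + [("Hotel B", True)] * 1 + [("Hotel B", False)] * 5
    + [("Hotel C", True)] * 2
)
suspects = hotels_to_investigate(scored)
print(suspects)
```

The output is a list of hotels for a human to look at, not a verdict; that matches the "investigate further" role Ott describes.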
"Ultimately, cutting down on deception helps everyone," Ott said. "Customers need to be able to trust the reviews they read, and sellers need feedback on how best to improve their services."
We research almost everything we purchase anymore and about the only thing you can do, afaik, is look at the number of negative reviews. If there are hundreds of negative reviews, it won't do a company much good to flood the place with a thousand phony good reviews; everyone will still see all the negatives and will get the sense they should probably steer clear. Anyway... Good job!
Also, when you use Google or other search tools to look up a business that you think is sketchy, you can get a lot of phony testimonials.
We need this used on Obama Speeches and Debbie W. Schultz rants.
If you have heard the commercials for Reputation Defender for their On-line Reputation Management, listen carefully. They are basically offering this spamming service.
There’s a website that advertises this service, aimed at doctors, lawyers, etc. They’ll spam the review sites with positive reviews, driving the negative ones down the search list. The theory being nobody looks past the first page on Google.
There's part of the problem with their study. They're using people who are at the peak of their susceptibility to propaganda.
Usually software like this focuses on key words and phrases - and develops filters based on these. Context and the level of sincerity are ‘fluff’ - that are difficult to quantify.
So, all the spammers have to do is evolve the keywords to avoid the filters.
They should use it on Consumer Reports testing anything “green” related to cars.
Geesh...that ain’t so bad. A lot of Freepers here can’t be bothered to read past the first thread on the main page - hence the reason for so many first page duplicates.
...betcha a “political PR consultant” has figured out a way to take advantage of this at FR - assuming they haven’t already.
I use TripAdvisor quite a bit to help decide on which B&B to use when traveling. However, a few years ago I noticed that the “top rated” B&B’s for any given area - ESPECIALLY Washington DC - had tons of suspicious reviews. Best way to tell with the TripAdvisor reviews is to check how many of the “reviewers” have only left one “contribution”. B&B owners have friends and acquaintances leave these glowing reviews to drive the ratings way up.
A lot of times the phony reviews have a lot in common - “home away from home” and, oh here is a perfect example right from the site:
“Another visit to Washington DC, another lovely stay at the beautiful Chester A. Arthur Bed and Breakfast. This charming B&B is my favorite place to stay in Washington because it makes me feel like I’m living in a glorious past. From the moment you walk through the front door you are greeted with a splendor preserved from the 1800’s - gilded chandeliers and period furniture welcome you to another time. The rooms are big and the beds comfortable; I am always impressed with the care of the bathrooms. I stay in a lot of B&B’s and this is one of my favorites - away from, etc.........”
How the heck many bathrooms is this putz using at the place that he can be so impressed with the “care of the bathroomS” ?
We stayed at that place and it was one of our least favorite B&B experiences. The owners were flat out weird, and the “formal” breakfast was the most awkward meal I have ever “attended”. Food was crap. Room was tiny, bed was weird, furniture was mismatched and beat up, bathroom was tiny and none too clean. I notice the anal owners of that B&B are now leaving snotty responses to every review that doesn’t give them the highest rating.
There was a woman from France there at our first torturous breakfast, and after bragging about his award-winning French speaking skill, the host began to jabber at the French lady in whatever he thought was French. She just stared at him like he was a zoo animal, then shrugged her shoulders and started eating.
It was that visit that taught me about spending the time to check the percentage of “excellent” reviews, how many left just one contribution, and had a sign up date the same time as the review. I also decided that while in DC I would not risk another B&B stay there.
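For anyone who wants to do that check systematically, here is a rough sketch of the three numbers described above. Everything in it is an assumption for illustration: the `Review` fields are hypothetical (not TripAdvisor's actual data format), "excellent" is taken to mean a 5 on a 5-point scale, and "same time" is taken to mean the same calendar day.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Review:
    rating: int                  # assumed 1-5 scale, 5 = "excellent"
    reviewer_contributions: int  # total reviews this reviewer has posted
    signup: date                 # date the reviewer's account was created
    posted: date                 # date this review was posted

def suspicion_summary(reviews):
    """Share of reviews that are excellent, one-off, or posted on sign-up day."""
    n = len(reviews)
    if n == 0:
        return {"excellent": 0.0, "one_contribution": 0.0, "same_day_signup": 0.0}
    return {
        "excellent": sum(r.rating == 5 for r in reviews) / n,
        "one_contribution": sum(r.reviewer_contributions == 1 for r in reviews) / n,
        "same_day_signup": sum(r.signup == r.posted for r in reviews) / n,
    }

# Invented sample listing with four reviews.
sample = [
    Review(5, 1, date(2011, 3, 1), date(2011, 3, 1)),
    Review(5, 1, date(2011, 4, 9), date(2011, 4, 9)),
    Review(5, 12, date(2008, 6, 2), date(2011, 5, 20)),
    Review(2, 30, date(2007, 1, 15), date(2011, 6, 3)),
]
summary = suspicion_summary(sample)
print(summary)
```

High values on all three at once are only a reason to read the reviews more skeptically; plenty of honest first-time reviewers sign up just to post one review.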