The scientists, all from the Hebrew University in Jerusalem, carried out their experiments on 66,000 Amazon reviews. They claim to have detected sarcasm in 77 percent of cases, the comparison point being whether three human readers agreed with the computer that the comments were sarcastic.
The main problem with developing such a system was that sarcasm comes in many forms, many of which require specialist knowledge beyond the actual words used. For example, the review title “Defective by design” on an iPod review only makes sense if you realize it is not a literal claim but rather a pun referring to Apple’s reputation for visual design.
Similarly, the claim “This book was really good until…” could be genuine criticism (“This book was really good until the plot twist.”) or sarcasm (“This book was really good until page 2.”).
To develop the system, the scientists isolated the most obviously sarcastic comments by hand, then built up a formula of terms and sentence styles which appeared most often. They also searched for reviews which contained positive words but had a low star rating from the reviewer.
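As a rough illustration of that second step, the sketch below flags reviews whose wording is positive but whose star rating is low. It is not the researchers' code: the word list, the two-star cutoff, the sample reviews and the function name is_sarcasm_candidate are all assumptions made up for this example.

```python
import re

# Hypothetical positive-word list; the article does not say which words were used.
POSITIVE_WORDS = {"great", "good", "love", "excellent", "perfect", "wonderful", "best"}


def is_sarcasm_candidate(text, stars, max_stars=2):
    """Return True when a review uses positive words but has a low star rating.

    The max_stars cutoff is an assumed threshold, not one taken from the study.
    """
    words = set(re.findall(r"[a-z']+", text.lower()))
    return stars <= max_stars and bool(words & POSITIVE_WORDS)


# Made-up sample reviews as (text, star rating) pairs.
reviews = [
    ("This book was really good until page 2.", 1),
    ("Great sound and excellent battery life.", 5),
    ("Stopped working after a week.", 1),
]

# Keep only the reviews where the words and the rating disagree.
candidates = [text for text, stars in reviews if is_sarcasm_candidate(text, stars)]
print(candidates)
```

A filter like this only produces candidates. A one-star review that says the seller was good but the product was broken would be flagged too, which is presumably why the hand-picked examples and sentence patterns were needed as well.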
There’s certainly evidence that sarcasm and irony cause problems for analysts. During the recent British elections, there were semantic analysis studies which aimed to track the sentiments of Twitter users about various candidates. This proved unsuccessful on a day when there was a short-lived craze of blaming every problem under the sun upon one candidate with the hashtag #nickcleggsfault. What the computer systems didn’t realize was that the craze was an ironic response to what was seen as unduly harsh criticism of Clegg by the mainstream media.
Incidentally, I was tempted to end this article with a sarcastic remark. But that would have just been the most original thing ever. Nobody would ever have thought of doing that.