Tech Notes: spam

Wednesday, August 10, 2011

Blog: Researcher Teaches Computers to Detect Spam More Accurately

IDG News Service (08/10/11) Nicolas Zeitler

Georgia Tech researcher Nina Balcan recently received a Microsoft Research Faculty Fellowship for her work on machine learning methods that can be used to build personalized programs that automatically decide whether an email is spam. Balcan's research can also be applied to other data-mining problems. In supervised learning, the user teaches the computer by labeling which emails are spam and which are not, an approach Balcan calls very inefficient. In active learning, the computer analyzes huge collections of unlabeled emails and asks the user only a few questions. Active learning could potentially deliver better results than supervised learning, Balcan says. However, active learning methods are highly sensitive to noise, which could make those gains difficult to achieve. Balcan plans to develop an understanding of when, why, and how different kinds of learning protocols help. "My research connects machine learning, game theory, economics, and optimization," she says.
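To make the contrast between labeling everything and asking only a few questions concrete, here is a minimal sketch of one textbook form of active learning: finding a spam threshold by binary search over a sorted pool. It is not Balcan's method. The function name `find_spam_boundary`, the integer "spam scores", and the `oracle` callback that stands in for the user are all assumptions made for illustration, and the sketch assumes every answer is correct.

```python
def find_spam_boundary(pool, oracle):
    """Locate the first spam item in a pool using few label queries.

    Assumptions (hypothetical setup, not from the article):
    - each item is a numeric spam score,
    - every item at or above some unknown score is spam,
    - oracle(score) returns True for spam and never answers wrongly.
    """
    scores = sorted(pool)
    lo, hi = 0, len(scores)  # the boundary index lies somewhere in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1  # one question put to the user
        if oracle(scores[mid]):
            hi = mid        # spam: the boundary is at mid or earlier
        else:
            lo = mid + 1    # not spam: the boundary is after mid
    return lo, queries


if __name__ == "__main__":
    # 1,000 unlabeled emails; the user privately treats scores >= 700 as spam.
    boundary, queries = find_spam_boundary(range(1000), lambda s: s >= 700)
    print(boundary, queries)
```

Labeling the whole pool would take 1,000 answers, while the search above needs about ten. It also hints at the noise problem: a single wrong answer sends the search into the wrong half of the pool, and nothing later corrects it.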


Monday, July 25, 2011

Blog: Cornell Computers Spot 'Opinion Spam'

Cornell Chronicle (07/25/11) Bill Steele

Cornell University researchers have developed software that can identify opinion spam: phony positive reviews written by sellers to promote their own products, or negative reviews meant to downgrade competitors. In a test on 800 reviews of Chicago-area hotels, the program identified deceptive reviews with almost 90 percent accuracy. The researchers, led by professors Claire Cardie and Jeff Hancock, found that truthful hotel reviews were more likely to contain concrete words about the hotel itself, such as "bathroom," "check-in," or "price," while deceptive reviews contained scene-setting words, such as "vacation," "business trip," and "my husband." In general, deceivers use more verbs and honest reviewers use more nouns. The best results came from combining keyword analysis with an analysis of how certain words are combined in pairs. The next step will be to see whether the system can be extended to other categories, such as restaurants and consumer products, says Cornell graduate student Myle Ott.
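The idea of combining single keywords with word pairs can be sketched as follows. This is only an illustration under stated assumptions: the article does not name the classifier, so a small naive Bayes model with add-one smoothing is swapped in here, and the four training reviews are invented toy sentences built from the example words above, not data from the Cornell study. The names `features` and `NaiveBayes` are hypothetical.

```python
import math
from collections import Counter


def features(text):
    """Return the single words of a review plus its adjacent word pairs."""
    words = text.lower().split()
    pairs = [f"{a}_{b}" for a, b in zip(words, words[1:])]
    return words + pairs


class NaiveBayes:
    """Tiny multinomial naive Bayes (an assumed stand-in classifier)."""

    def __init__(self):
        self.counts = {}       # label -> Counter of feature counts
        self.docs = Counter()  # label -> number of training reviews
        self.vocab = set()     # every feature seen during training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            feats = features(text)
            self.counts.setdefault(label, Counter()).update(feats)
            self.docs[label] += 1
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, counter in self.counts.items():
            total = sum(counter.values())
            score = math.log(self.docs[label] / total_docs)
            for feat in features(text):
                # Add-one smoothing keeps unseen features from zeroing the score.
                score += math.log((counter[feat] + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy reviews for illustration only.
model = NaiveBayes().fit(
    [
        "the bathroom was clean and the price was fair",
        "check-in was quick and the room price was low",
        "my husband and I loved our vacation here",
        "perfect for a business trip with my husband",
    ],
    ["truthful", "truthful", "deceptive", "deceptive"],
)
```

With this toy model, `model.predict("my husband loved the vacation")` leans on both the single words and the pair `my_husband`. Four sentences say nothing about real-world accuracy; the sketch only shows how the two kinds of features can feed one classifier.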

