A Formula for Perfect Sentiment Analysis
December 16, 2009
…0% Humor
+ 0% Sarcasm
+ 0% Emoticons
+ 0% SpellingErrors
+ 0% Misattributions
+ 0% TextMessaging
+ 100% GrammaticalClarity
_______________________________
= 100% Perfect Sentiment Analysis
Sentiment analysis has been around in one form or another for many years. In a traditional sense, market researchers have been using it to better understand consumer preferences as expressed through survey responses. For example, every survey where you’ve written something in the “Other” box has ended up on the desk of someone whose job it was to interpret the message you wrote. Not just the messages that were clear and carefully written, but every single message. Given that any one survey probably had only about 300 verbatims that needed analysis, it was a doable job for a person. You would think that they would interpret every single message correctly.
According to many scientific studies carried out in a variety of academic organizations, 80% to 85% is as accurate as human beings can get when they manually interpret words written by other people. Surprised? Well, think about all the components underlying your messages: the humor that only you and your friends laugh at, the references that only you and your spouse know, the emoticons that you saw someone else use and now misuse yourself, your horrible spelling and grammar. It is indeed a wonder that people are able to get 80% of the interpretations correct.
Fast forward to today, where years of dedicated, scholarly activity have resulted in automated sentiment analysis processes that demonstrate a level of quality and validity that researchers are happy with. It’s not perfect, nor is it quite as accurate as manual scoring by people, but it does a really good job. Now also fast forward to a point in time where the quantity of data is not hundreds, not thousands, but millions of data points from millions of sources in millions of formats. Now is the time to join opportunity and ability. The sheer quantity of data that can be analyzed vastly outweighs the small decreases in validity associated with automated systems. Fortunately for us, millions of data points plus comprehensive sentiment analysis systems mean stable, valid, and meaningful results.
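For the numerically inclined, here is a rough sketch of the quantity-versus-accuracy trade-off. Nearly everything in it is an assumption made for illustration: the 60% true share of positive comments, the 75% accuracy for the automated system, the idea that scoring errors are independent and equally likely in both directions, and the function names themselves. Only the 300 verbatims and the 85% human accuracy echo the figures above. It compares how much random wobble remains in the measured share of positive comments for a small hand-scored survey and for a large automated run.

```python
import math


def observed_positive_share(true_share, accuracy):
    """Share of comments scored positive when each comment is scored
    correctly with probability `accuracy` (symmetric errors assumed)."""
    return true_share * accuracy + (1 - true_share) * (1 - accuracy)


def standard_error(share, n):
    """Standard error of a proportion estimated from n independent comments."""
    return math.sqrt(share * (1 - share) / n)


TRUE_SHARE = 0.60  # hypothetical: 60% of comments are truly positive

scenarios = [
    ("manual scoring", 300, 0.85),          # 300 verbatims, human-level accuracy
    ("automated scoring", 1_000_000, 0.75),  # hypothetical automated accuracy
]

for label, n, accuracy in scenarios:
    share = observed_positive_share(TRUE_SHARE, accuracy)
    se = standard_error(share, n)
    print(f"{label}: n={n}, observed share={share:.3f}, standard error={se:.5f}")
```

Under these assumptions the automated run has a far smaller standard error, which is the sense in which the results become stable. Note that lower accuracy also pulls the observed share toward 50%, and that shift does not shrink with more data, so this sketch speaks only to the stability of the numbers, not to every aspect of validity.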