Thursday, March 29, 2012

My Sentiments on Sentiment Analysis

I have recently started working with sentiment analysis.  This is a new and interesting field in business intelligence and big data.  Sentiment providers promise to be able to quantify how users “feel” about your organization through the interpretation of comments, tweets or other content provided to you. 

There are a few limitations of this type of automated analysis.  Firstly, a computer algorithm has difficulty with figurative speech, sarcasm and the “lingo” of the day.  Surely, a computer will believe “sick” is negative, but what will it think of “that snowboard is sick, dude”? 

Secondly, the brevity of many social media forums leads to less accurate scores.  A computer can glean hundreds of adjectives and verbs from an essay and rate them, but when you are limited to a tweet of 140 characters, there is not much for the computer to score.

Companies like Repustate give you tools to monitor your online sentiment from, for example, your Twitter or Facebook feed.  They also let you access their scores through an API so that you can bring sentiment analysis into your own application.

So I did this with QlikView using the power of QVSource and who provides 50,000 free calls a month.  I will not get into the details of how to do this because it is actually pretty easy.  Instead I want to examine how accurate or “usable” the resulting information looks to be.

Repustate will take the text string you provide and answer back with a number.  A score at or near zero equates to a neutral sentiment.  A positive number indicates a happy or encouraging sentiment and a negative number indicates negative emotion.  Words like “amazing” and “encouraging” should therefore trigger positive scores, and phrases like “pissed off” or “horrible service” should warrant a negative score.
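
To make that reading of the scores concrete, here is a minimal Python sketch that turns a score into a label.  The function name and the width of the “close to zero” band (0.1 here) are my own assumptions for illustration; they are not part of Repustate’s API.

```python
def label_sentiment(score: float, neutral_band: float = 0.1) -> str:
    """Map a numeric sentiment score to a coarse label.

    neutral_band is an assumed cutoff: scores within +/- neutral_band
    of zero are treated as neutral.
    """
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


# Try a few sample scores.
for score in (0.0, -1.0, 3.0):
    print(score, label_sentiment(score))
```

Where you draw the neutral band is a judgment call, and it decides whether a score just below zero counts as neutral or negative.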

So let’s first look at string length. 
1:     Horrible Service
Score = 0
2:     I received horrible service from a company that shall remain nameless.
Score = -1

If you just say “horrible service”, Repustate gives this a neutral score of zero.  But if you lengthen the string to “I received horrible service from a company that shall remain nameless”, the score comes back as -1.  So even when the text is reduced to the presumably negative words, string length is a huge factor.

Let’s look at some other anecdotal examples.  Consider the following two tweets.  Which one is positive and which is negative?
1:     @_____ I agree that the #personalcloud will replace the PC but I am wondering how comfortable people will be with this concept. Security?
2:     @_____ I could not agree more! I am far from technical yet #QlikView allows me to create my own dashboards. #empowers

Oddly, example one received a positive 3.00, at the upper end of our sample, while the second example received a negative 3.00, at the lower end of our sample.  This seems a little contrary to my judgement.  I would say the first example is slightly negative and the second is quite positive.

Consider this third example:
#BioPharma Companies make faster and smarter decisions using #BI

You would think this would result in a positive sentiment due to the phrase “faster and smarter” but it actually resulted in a neutral score of zero.
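
One way to see how little plain text a tweet like this leaves for scoring is to strip out the hashtags, usernames and URLs and look at what remains.  This is a rough Python sketch; the pattern and the function name are my own, and it assumes those elements carry no sentiment, which will not always be true.

```python
import re

# Tweet elements assumed (for this sketch) to carry little sentiment:
# URLs, @usernames and #hashtags.
NOISE = re.compile(r"https?://\S+|[@#]\w+")


def scorable_text(tweet: str) -> str:
    """Remove URLs, @usernames and #hashtags, then collapse whitespace."""
    return " ".join(NOISE.sub(" ", tweet).split())


tweet = "#BioPharma Companies make faster and smarter decisions using #BI"
cleaned = scorable_text(tweet)
print(cleaned)
print(len(tweet), "characters before,", len(cleaned), "after")
```

Note that dropping a hashtag also drops the word inside it, so a tag used as part of the sentence disappears along with the noise.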

Tweets are limited to 140 characters, and many of those characters are likely rendered useless by the hashtags, usernames and URLs.  So let’s take a look at some longer strings and see if we can get some more accurate values.  Both quotes are by Thomas Jefferson.
1:     We hold these truths to be self-evident: that all men are created equal; that they are endowed by their Creator with certain unalienable rights; that among these are life, liberty, and the pursuit of happiness.
Score = 1.00
2:     Enlighten the people generally, and tyranny and oppressions of body and mind will vanish like evil spirits at the dawn of day. 
Score = -1.00

So the first score makes sense, but I think a human would interpret the second example to be positive in sentiment. 

Let’s look at a slightly longer quote about happiness from Sharon Salzberg:
As I go through all kinds of feelings and experiences in my journey through life -- delight, surprise, chagrin, dismay -- I hold this question as a guiding light: "What do I really need right now to be happy?" What I come to over and over again is that only qualities as vast and deep as love, connection and kindness will really make me happy in any sort of enduring way.
Score = -0.103

So the computer rates a quote that is explicitly about happiness with a slightly negative score.  I am not sure I understand the logic there.

Try this made-up text string:
This was supposed to be an amazing and beneficial seminar but I should have known better.
Score = 1.00

You can trick the algorithm fairly easily.
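
To illustrate why, here is a deliberately naive word-counting scorer in Python.  I do not know how Repustate actually computes its scores, so this is not their algorithm; the tiny lexicon and the function name are invented for the example.  It simply shows how any approach that leans on individual words can be fooled by context.

```python
import re

# Tiny hand-made lexicon, purely illustrative (not from any real provider).
LEXICON = {
    "amazing": 1,
    "beneficial": 1,
    "happy": 1,
    "horrible": -1,
    "sick": -1,
}


def naive_score(text: str) -> int:
    """Sum the lexicon values of the words in text, ignoring all context."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


seminar = ("This was supposed to be an amazing and beneficial seminar "
           "but I should have known better.")
print(naive_score(seminar))                         # positive, despite the disappointment
print(naive_score("that snowboard is sick, dude"))  # negative, despite the praise
```

The scorer sees “amazing” and “beneficial” and never notices that “supposed to be” and “should have known better” reverse the meaning.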

These examples were picked because they illustrate some issues with automatic sentiment analysis.  The seemingly odd scoring might not be indicative of the accuracy in a larger set.

But in general, I think we need to be cautious about how much value we attach to sentiment analysis.  The good news is that the algorithms will definitely improve over the next few years as more organizations begin to look at this accumulation of social media content that we can clearly call big data.  How do we analyze this data and create an understanding that will help us make decisions?  We are starting this conversation now and I am sure it will only grow in importance over the next few years.