Hey there,
It’s been a while since we last posted, but we are pretty
excited about our new sentiment analysis transforms, so I thought I would make a
quick post about them.
Sentiment analysis can be described as the use of natural
language processing (NLP) to extract the attitude or opinion of a writer towards a
specific topic. With the overwhelming amount of data being posted on the Internet
every day and no way to read it all, sentiment analysis has become a really
useful tool for extracting and aggregating opinions on a specific topic from
many different sources. The potential uses for sentiment analysis are endless; a
few examples are brand reputation monitoring, market research and
stock-exchange monitoring. The transform that we built takes a Tweet as
its input entity and returns either a positive, neutral or negative entity. This
way a large number of Tweets can easily be categorized according to their
sentiment.
Although sentiment extraction is a relatively new area of
research, there are quite a few methods of going about it and a lot of companies
offering different sentiment analysis APIs. With so many APIs to choose from, it
was quite difficult to decide which one would work best for the transform. At
first I decided to take my top four APIs, aggregate their results and use that as
the output of my transform. The problem with this method was that most of the time
the APIs would return different and often obviously incorrect results (I won’t
mention any names). While some APIs seemed to work well on Tweets about certain
topics, they would fail horribly on others. After much experimentation I settled on
using only AlchemyAPI’s sentiment analysis tool, which seems to work the best out
of all the APIs that I tested, and I tested quite a few, so well done to them.
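To give an idea of what aggregating several APIs can look like, here is a minimal sketch in Python. It assumes a simple majority vote with ties falling back to neutral; the post does not say how the results were actually combined, and the function name and labels are made up for illustration.

```python
from collections import Counter

# Labels assumed for illustration; each API's answer is mapped to one of these.
LABELS = ("positive", "neutral", "negative")


def aggregate_sentiment(votes):
    """Combine per-API labels by majority vote.

    Unknown labels are ignored. A tie for first place, or no usable
    votes at all, falls back to "neutral" (an assumption of this sketch).
    """
    counts = Counter(v for v in votes if v in LABELS)
    if not counts:
        return "neutral"
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "neutral"
    return ranked[0][0]


# Four hypothetical API answers for the same Tweet.
print(aggregate_sentiment(["positive", "positive", "negative", "neutral"]))
```

A vote like this only helps when the individual APIs are right more often than not, which is exactly where the approach above fell down.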
I then built a new machine named Twitter Analyser to use
with the new sentiment analysis transform. This machine takes a phrase as
its input and searches Twitter for Tweets containing that phrase. Hashtags,
links, sentiment and uncommon words are then extracted from the Tweets that
are returned. The uncommon words are extracted with one of our other new
transforms, which checks each word against an ordered list of common words. If the
input word does not occur in the list before a certain threshold, the word is
returned as an entity. The To Words
transform takes two transform settings: the threshold up to which it searches
the list of common words, and the words that the transform should ignore. The
machine runs every 5 minutes to search Twitter for new Tweets. Running the
machine in bubble view makes it easy to see the hashtags, links, words and sentiment
that Tweets on a certain topic have in common. The screenshot of the graph below shows an
example of using the Twitter Analyser machine on the phrase AlchemyAPI:
In this image the entities are sized according to their number of incoming links, so you can see what many Tweets have in common. From the image you can pick out common hashtags such as #ai, #deeplearning and #sentimentanalysis, as well as the common links and words between the Tweets.
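For the curious, the uncommon-word check and the incoming-link sizing can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the transform's actual code: the function names, the tokenising rules and the tiny common-word list are all hypothetical.

```python
import string
from collections import Counter


def uncommon_words(text, common_words, threshold, ignore=()):
    """Return the words in `text` that do not occur within the first
    `threshold` entries of the ordered `common_words` list.

    Hashtags, mentions and links are skipped because the machine
    extracts those separately. Each word is returned once per Tweet.
    """
    frequent = set(common_words[:threshold])
    skip = {w.lower() for w in ignore}
    result = []
    for token in text.lower().split():
        if token.startswith(("#", "@", "http")):
            continue
        word = token.strip(string.punctuation)
        # Only purely alphabetic words are kept (a simplification).
        if word.isalpha() and word not in frequent and word not in skip:
            if word not in result:
                result.append(word)
    return result


def incoming_links(tweets, common_words, threshold, ignore=()):
    """Count how many Tweets link to each uncommon word, which is the
    number a bubble view would use to size the word entity."""
    counts = Counter()
    for tweet in tweets:
        counts.update(uncommon_words(tweet, common_words, threshold, ignore))
    return counts


# Hypothetical common-word list and Tweets, for illustration only.
common = ["the", "is", "a", "for", "great", "api"]
tweets = [
    "The sentiment API is great #ai",
    "Sentiment scoring for the win http://t.co/x",
]
print(incoming_links(tweets, common, threshold=5, ignore=["win"]))
```

Raising the threshold treats more of the list as common, so fewer words come back as entities; the ignore setting is handy for dropping the search phrase itself.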
As always, enjoy responsibly!
Paul
PS: As most of you already know, we have recently released an
update to Maltego [version 3.5.2]. Our YouTube video here gives a quick
breakdown of the new features: https://www.youtube.com/watch?v=QK6PX4Fq5xY&list=UUThOLpqhLFFQN0nStdkyGLg