Across the world today, the US Presidential Election is hijacking conversations, whether in real life (yes, people do still talk in real life), or across social networks like Twitter. Rightly so – with Obama in his final term, his successor will play a similarly huge role when it comes to world affairs, and depending on how you look at things, the current candidates are either promising or terrifying. This is why when you log into Twitter or Facebook, you’ll see countless posts about the latest thing Donald Trump said and another bunch about what the other candidates say.
But in this sea of conversations on social media, how can you avoid being adrift and lost? This is where WSO2’s Election Monitor comes in.
Unlike its April Fools’ prank, this is an actual system you can check out right now. Powered by WSO2’s open-source analytics platform, the Election Monitor collects data from Twitter by listening to what people say on prominent hashtags such as #Election2016. After collecting these tweets, it analyzes them to give you a clear and easy-to-understand overview of what’s being said. Additionally, it collects news articles across the Internet to identify whether the media is portraying a candidate in a positive or negative light.
So what can you get out of the WSO2 Election Monitor? Let’s take a look.
Democrats vs Republicans
The first figure you’ll find as you scroll down is the number of accounts talking about Republicans and the number of accounts talking about the Democrats. With this, you get a rough picture as to how many people are supporting each party, a picture that’s updated regularly within a 24-hour window.
Behind the scenes, WSO2’s Complex Event Processor and Enterprise Service Bus swing into action. A massive number of posts from Twitter (tweets) are collected via the WSO2 Enterprise Service Bus. The Complex Event Processor then identifies the unique accounts within that collection.
After identifying the people, the system calculates which party they support by checking which hashtags they use most frequently. For example, a person setting timelines on fire with the #FeelTheBern hashtag is most likely a Bernie Sanders supporter, whereas a person who features a wall’s worth of #makeamericagreatagain is most likely a Donald Trump supporter. This way, you get a picture of how many Republicans and Democrats are active on Twitter.
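To make the idea concrete, here is a minimal Python sketch of hashtag-based classification. It is not the Election Monitor's actual code: the `PARTY_HASHTAGS` table, the `classify_accounts` function, and the rule of skipping ties are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical hashtag-to-party table; the real system's list is not given in the article.
PARTY_HASHTAGS = {
    "#feelthebern": "Democrat",
    "#makeamericagreatagain": "Republican",
}


def classify_accounts(tweets):
    """Map each account to the party whose hashtags it uses most often.

    `tweets` is an iterable of (account, text) pairs. Accounts that use no
    known hashtag, or that are tied between parties, are left out.
    """
    tallies = {}  # account -> Counter of party hashtag uses
    for account, text in tweets:
        for word in text.lower().split():
            party = PARTY_HASHTAGS.get(word.strip(".,!?:;"))
            if party:
                tallies.setdefault(account, Counter())[party] += 1

    affiliations = {}
    for account, tally in tallies.items():
        ranked = tally.most_common(2)
        # Keep the account only if one party is clearly ahead.
        if len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
            affiliations[account] = ranked[0][0]
    return affiliations


sample = [
    ("alice", "I #FeelTheBern today!"),
    ("alice", "#FeelTheBern again #MakeAmericaGreatAgain"),
    ("bob", "#makeamericagreatagain."),
    ("carol", "Just watching the debate"),
]
affiliations = classify_accounts(sample)
party_totals = Counter(affiliations.values())
print(affiliations)
print(party_totals)
```

Because each account is counted once no matter how often it tweets, the totals reflect the number of people rather than the number of tweets.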
The conversations on Twitter
Currently, the WSO2 Enterprise Service Bus collects approximately 10 tweets per second, which works out to more than 0.8 million tweets each day. In the previous section, we saw how many tweeps supported each party. This section is where you can find out what they are saying. The column on the left shows what they say in real time. Of the other two columns, one shows which tweets are the most popular under the #USElection2016 hashtag and the other showcases popular tweets from the candidates themselves.
You’ll find that the live column doesn’t show all the incoming tweets live from Twitter. There are actually many more tweets that the WSO2 Enterprise Service Bus collects, but showing them all would be pointless, as they’d flash by so fast that you wouldn’t be able to read any of them.
What’s really interesting, though, is how the other two columns work. Both the popular #USElection2016 tweets column and the column showing popular tweets from candidates use a variation of the Hacker News / Reddit ranking algorithms. These algorithms rank each tweet by a metric (such as popularity). As time passes, the ranking of each tweet drops, which gives newer popular tweets more exposure. For a more detailed insight into how the algorithm is implemented, check out the WSO2 Election Monitor blog here.
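The time-decay idea can be sketched in a few lines of Python. This is only an illustration in the spirit of the commonly cited Hacker News formula, where popularity is divided by a power of the item's age. The article does not say which variation the Election Monitor uses, so the choice of retweets as the popularity metric, the `+ 2` offset, and the `gravity` value of 1.8 are assumptions.

```python
def decayed_score(retweets, age_hours, gravity=1.8):
    """Popularity divided by a power of the tweet's age (assumed formula)."""
    return retweets / (age_hours + 2) ** gravity


def rank_tweets(tweets):
    """Sort tweets (dicts with 'text', 'retweets', 'age_hours'), best score first."""
    return sorted(
        tweets,
        key=lambda t: decayed_score(t["retweets"], t["age_hours"]),
        reverse=True,
    )


tweets = [
    {"text": "yesterday's hit", "retweets": 300, "age_hours": 24},
    {"text": "fresh and rising", "retweets": 100, "age_hours": 1},
]
ranked = rank_tweets(tweets)
print([t["text"] for t in ranked])
```

In this sample the one-hour-old tweet ranks first even though it has a third of the retweets, because the older tweet's score has decayed over a full day.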
What people are talking about
By now you’ve seen the people’s conversations on Twitter and the popular things said by the candidates. Scroll a bit further down and you’ll find word clouds. These word clouds give you an even clearer snapshot of exactly what the supporters of each candidate are talking about.
Backstage, the system creates these word clouds by analyzing the tweets of the supporters of each candidate. This is done by using the Ark Tweet NLP library to identify the nouns and adjectives used in the tweets. Without this library, you’d see a lot of useless words such as “the” and “Trump” all the time. Furthermore, as this is updated in real time, the system needs to keep track of how often a very large number of words appear. To do this, it uses a Count-Min Sketch data structure. Look here for technical details of the implementation.
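For readers who haven't met a Count-Min Sketch before, the following Python sketch shows the general technique: a small fixed-size table of counters that estimates how often each word has been seen, without storing the words themselves. It is a generic textbook-style version, not WSO2's implementation. The class name, the `width` and `depth` defaults, and the use of SHA-256 as the hash are assumptions.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter that uses a fixed amount of memory."""

    def __init__(self, width=1000, depth=5):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One cell per row, each chosen by a differently salted hash.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).hexdigest()
            yield row, int(digest, 16) % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Collisions can only inflate a cell, so the minimum is the best guess.
        return min(self.table[row][col] for row, col in self._cells(item))


sketch = CountMinSketch()
for word in ["wall", "economy", "wall", "jobs", "wall"]:
    sketch.add(word)
print(sketch.estimate("wall"), sketch.estimate("jobs"))
```

The estimate is never lower than the true count, only occasionally higher, which is an acceptable trade-off for sizing words in a cloud.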
So far, the WSO2 Election Monitor has shown us the tweets about the US Presidential Election and exactly what people are discussing. But who are the real influencers in these conversations and the community? Just a quick scroll further down, the community graph is where you’ll find the answer to that question.
On the community graph, each person is represented by a node (circle), which is colored based on their party affiliation. The links between the nodes represent the number of retweets between two tweeps on the graph. As a tweep gets retweeted by others, their node grows. This shows who is getting the most attention and so identifies the biggest influencers tweeting about the US Presidential Election.
Behind the scenes, the data for the graph is collected using Spark SQL queries. Once the data has been collected, it’s visualized using the D3.js library. Considering how popular a topic the election is, this graph is limited to showing the top 200 influencers at any given time.
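The shape of the data behind such a graph can be illustrated with plain Python standing in for the Spark SQL queries, which the article does not show. In this sketch, node size is the number of retweets an account received and link weight is the number of retweets between two accounts. The `build_graph` function and the `(retweeter, author)` input format are assumptions.

```python
from collections import Counter


def build_graph(retweets, top_n=200):
    """Return (nodes, links) for the `top_n` most retweeted accounts.

    `retweets` is a list of (retweeter, original_author) pairs.
    """
    received = Counter()
    for retweeter, author in retweets:
        received[author] += 1
        received[retweeter] += 0  # make sure retweeters are candidates too

    top = {name for name, _ in received.most_common(top_n)}
    nodes = {name: received[name] for name in top}  # node size

    links = Counter()  # unordered account pair -> retweets between them
    for retweeter, author in retweets:
        if retweeter in top and author in top and retweeter != author:
            links[frozenset((retweeter, author))] += 1
    return nodes, links


sample = [("a", "x"), ("b", "x"), ("c", "x"), ("x", "y"), ("a", "y")]
nodes, links = build_graph(sample, top_n=2)
print(nodes)
print(dict(links))
```

Capping the graph at the top accounts keeps the picture readable; here `top_n=2` plays the role of the 200-influencer limit.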
The links being shared
When tweeting, lots of people use links either to support their case with facts or just share funny memes, so obviously, there are many links related to the election floating around. By identifying the most popular links over a 24-hour period, the WSO2 Election Monitor shows another aspect of what people are thinking.
Under the hood, the WSO2 Complex Event Processor counts the number of times each URL appears in the tweets the Election Monitor has collected, and the most frequently shared ones are displayed.
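As a rough illustration of this kind of counting, here is a small Python sketch. It is not the Complex Event Processor query itself. The `top_links` function and the simple URL pattern are assumptions, and the 24-hour window is left out for brevity.

```python
import re
from collections import Counter

# Deliberately simple pattern: anything that starts with http:// or https://.
URL_PATTERN = re.compile(r"https?://\S+")


def top_links(tweets, n=10):
    """Return the `n` most frequently shared URLs as (url, count) pairs."""
    counts = Counter()
    for text in tweets:
        for url in URL_PATTERN.findall(text):
            counts[url.rstrip(".,!?)")] += 1  # drop trailing punctuation
    return counts.most_common(n)


sample = [
    "Read this http://example.com/a now",
    "Wow http://example.com/a.",
    "Also worth a look: http://example.com/b",
]
popular = top_links(sample, n=1)
print(popular)
```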
What the media thinks
While its previous components showed us what the people were thinking, the last component of the WSO2 Election Monitor helps you understand what the media is saying. In any election, the media plays a key role in influencing the decisions of voters, and as such it’s important to understand how the media portrays each candidate.
To get an idea of how the media portrays each candidate, the Election Monitor has a line graph that shows the media sentiment over time. This is calculated by collecting the most relevant news stories for a candidate via Google News. Each story is then broken down and the sentences analyzed using the WSO2 Complex Event Processor. By counting the number of positive and negative sentences, a sentiment score is generated, which is used to identify whether an article portrays a candidate in a positive or negative light.
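A toy version of sentence-level scoring might look like the Python sketch below. The article does not give the Election Monitor's word lists or its exact formula, so the tiny `POSITIVE` and `NEGATIVE` lexicons and the score (positive sentences minus negative sentences, divided by their sum) are assumptions chosen only to show the counting step.

```python
import re

# Tiny illustrative lexicons; a real system would use far larger ones.
POSITIVE = {"good", "great", "strong", "wins", "praised"}
NEGATIVE = {"bad", "weak", "scandal", "loses", "criticized"}


def sentence_polarity(sentence):
    """Return 1 for a positive sentence, -1 for a negative one, 0 otherwise."""
    words = re.findall(r"[a-z']+", sentence.lower())
    balance = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (balance > 0) - (balance < 0)


def article_sentiment(text):
    """Score an article between -1 (all negative) and 1 (all positive)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    polarities = [sentence_polarity(s) for s in sentences]
    positive = polarities.count(1)
    negative = polarities.count(-1)
    if positive + negative == 0:
        return 0.0  # no opinionated sentences found
    return (positive - negative) / (positive + negative)


story = (
    "The senator praised the plan. Critics called it weak. "
    "The rally was great! The vote is on Tuesday."
)
score = article_sentiment(story)
print(round(score, 2))
```

A word-list approach like this has no way of spotting sarcasm, which is one reason such scores are best read as a rough trend over many articles.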
The WSO2 Election Monitor is an amazing system. It collects a massive amount of data every second, processes it, and then spits out easy-to-digest facts that give an overview of the US Presidential Election. However, as the WSO2 Election Monitor team points out, sentiment analysis isn’t perfect. It’s a challenge to obtain 100% accuracy in determining whether a person is saying something good or bad, and sarcasm adds a further challenge. Nonetheless, it is a powerful tool, and if you want to get real-time updates from multiple sources to find out what people are saying about the US Presidential Election and what’s happening, it’s a great place to start.