Tuesday, November 8, 2016

US Electorate

With the presidential election cycle coming to an end, I decided to write a post about the application and use of data in understanding the US electorate. The article I am focusing on is “FAQ: Analyzing Social Data to Understand the US Electorate” from Wired magazine. It covers the social analytics firm Networked Insights, which is using data to gauge the feelings and intentions of the American electorate. I will speak briefly about how the data is gathered and analyzed and what signals the firm uses to interpret it. Networked Insights has an analytics engine named “Kairos” which “processes unstructured data from millions of sites, blogs, and social platforms like Twitter and Tumblr. Billions of public posts are then analyzed and classified across 25,000 topics, emotions, and demographics—turning noisy social data into insights.” The firm built four metrics into this engine: “Awareness, Positivity, Negativity and Intent”. It uses a combination of Boolean classifiers, language classifiers and machine learning to interpret the true meaning of the posts.
There are a few issues with this system, and they stem from the fact that language is innately human. Tone is hard to understand when communicating over the internet. Even for people who know the individual who wrote a post, the tone is not always clear. This system needs to process massive amounts of data and broadly represent the emotions and intentions of each individual. When a post is hard to understand even for the friends of the poster, I find it hard to believe this system can accurately decode the underlying message. The article describes the system and the combination of classifiers it uses to decode and analyze messages, but I doubt that language in such an unstructured form can be represented accurately.
Another issue I have is that the system can misrepresent true voting intentions and emotions. Networked Insights uses a variety of platforms to source its material, but we have to look at who is actually posting. My grandparents do not have any social media, and the elderly are known to be one of the most active age groups when it comes to voting. The firm has a system in place to keep its sample diverse and avoid selection bias, but the fact remains that the people who are most active on social media and other platforms are usually younger.
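To make the sampling concern concrete, here is a minimal sketch of how a skewed sample can shift an overall estimate, and how reweighting by each group's share of actual voters changes it. This is not how Kairos works; the age groups, post counts, sentiment scores and voter shares below are all made-up numbers chosen only for illustration.

```python
def raw_mean(groups):
    """Average sentiment weighted by how many posts each group wrote."""
    total_posts = sum(n for n, _ in groups.values())
    return sum(n * score for n, score in groups.values()) / total_posts


def reweighted_mean(groups, voter_share):
    """Average sentiment weighted by each group's share of voters instead."""
    return sum(voter_share[g] * score for g, (_, score) in groups.items())


# Hypothetical data: {age group: (number of posts, mean sentiment score)}
groups = {"18-29": (700, 0.6), "30-64": (250, 0.4), "65+": (50, 0.2)}

# Hypothetical share of actual voters in each age group (sums to 1.0)
voter_share = {"18-29": 0.2, "30-64": 0.5, "65+": 0.3}

print(round(raw_mean(groups), 2))                      # 0.53
print(round(reweighted_mean(groups, voter_share), 2))  # 0.38
```

In this toy example the raw average is pulled toward the younger posters simply because they post more. Reweighting only helps if the few older posters in the sample are representative of older voters in general, which is exactly what is in doubt.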
We also have to look at slang and the role it plays in their system. Slang and acronyms are used constantly on social media, and the same word can mean different things to different people. Someone from Boston may say that “Trump is wicked,” meaning Trump is awesome, but the system could read it as saying that he is a wicked individual. Acronyms also play a role, because a three-letter acronym could stand for a wide variety of things.
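The “wicked” problem can be shown with a toy word-list scorer. This is only a sketch of the general idea, not the classifiers Networked Insights uses; the lexicon, the scores and the regional override table are hypothetical.

```python
# Hypothetical toy lexicon: +1 for positive words, -1 for negative words.
BASE_LEXICON = {"wicked": -1, "awesome": 1, "terrible": -1, "great": 1}

# Hypothetical regional override: in Boston slang, "wicked" reads as praise.
REGIONAL_OVERRIDES = {"boston": {"wicked": 1}}


def score_post(text, region=None):
    """Sum the word scores in a post, applying regional overrides if known."""
    lexicon = dict(BASE_LEXICON)
    if region is not None:
        lexicon.update(REGIONAL_OVERRIDES.get(region.lower(), {}))
    # Lowercase each word and strip common punctuation before the lookup.
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    return sum(lexicon.get(w, 0) for w in words)


post = "Trump is wicked"
print(score_post(post))            # -1 (read as an insult)
print(score_post(post, "Boston"))  # 1 (read as praise)
```

The same three words flip from negative to positive depending on where the system thinks the poster is from, and a real engine would first have to know the poster's region and dialect to get this right.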

This system can show emotions in broad strokes, but it overlooks many citizens and cannot accurately capture what the electorate as a whole is feeling.

"FAQ: Analyzing Social Data to Understand the US Electorate." Wired.com. Conde Nast Digital, n.d. Web. 08 Nov. 2016.


  1. I had previously written a blog post about Hillary Clinton and her use of Big Data in her campaign, so I specifically chose to read and comment on Kevin’s post to see the use of Big Data in a different aspect of the election cycle. To quickly recap, the main idea of my first blog post was the benefit of using Big Data in a campaign and the opportunity cost of not using it as a resource. Hillary Clinton gained traction throughout her campaign, which indicated that she had been using Big Data and using it well, but as we have seen from the results of the election, that traction gave no indication of her eventual defeat. With this in mind, I strongly agree with Kevin’s criticism of Networked Insights’ analytics engine, “Kairos.” Analytics are not necessarily reliable, because of their mathematical nature. Computers are not people, so no matter how much data or how many formulas a computer holds, it cannot always accurately predict the behavior or actions of humans, especially because humans are always changing and emotions cannot be predicted. Kevin is also correct that tone is hard to interpret over the Internet. Most of the time, I can get a grasp of the tone and meaning behind the things that my friends post on social media or send via messaging, but it is not always crystal clear, and I end up questioning what they actually meant. If I, as an actual person, cannot determine a definite tone, how could an analytics engine accurately analyze and interpret it? Lastly, I would also like to comment on the point that was made about slang. I would never have thought of that while reading the article, but it is a very fair point. Is Kairos equipped to handle a situation in which a dictionary word takes on a different meaning in different regional cultures across America? The possible errors in calculation due to slang are an interesting point to think about.
    Another flaw that I would like to add to Kevin’s criticism is that the people who post on social media or elsewhere on the Internet are the people who are more outgoing, more active in politics, or have strong feelings about an issue or person. People who would rather not voice their opinions on the Internet are not considered in these analytics, regardless of whether they have strong opinions or feelings.

  2. As an individual who has a great interest in politics, I found Kevin’s post thought-provoking and interesting. The concept of using the social interactions of the largest voting bloc to gauge possible electoral outcomes is quite brilliant. As Kevin acknowledged, “millennials” are not only the largest demographic able to vote, but also the demographic that uses social media at a ridiculously high volume. Analyzing the millions of tweets, posts, comments and re-blogs can give pundits, analysts and campaign strategists granular insight into the heartbeat of the American people.

    Kevin did a great job of pointing out a major concern of mine when he talked about younger voters being the ones who are extremely active on social media. As we know, the elderly are the most active voting group in our nation. While this use of data is highly insightful, there’s a good chance that the voices of the most likely voters will be overlooked because of their lack of engagement on these platforms. As pointed out in the post, this could give those analyzing the data a false sense of what the electorate is thinking and feeling.

    One thing that’s always in the news is how millennials are an undependable group when it comes to voter turnout. This is startling given how large a group they are. An aspect of data usage that the article didn’t touch on is how strategists can work to stem and hopefully reverse this trend. In the aftermath of the election last week, I have seen countless people on both sides of the aisle post their opinions on Facebook and other social media platforms. While many of these individuals may have cast a vote for one of the four candidates, an overwhelming number of peers in their generation did not. It would be beneficial for candidates of all parties to analyze these potential voters and understand why they don’t vote, so they can try to appeal to them and convince them that they are the right person for the job.

