With the presidential election cycle coming to an end, I decided to write a post about how data is applied to understanding the US electorate. The article I am focusing on is "FAQ: Analyzing Social Data to Understand the US Electorate" from Wired magazine. It covers a social analytics firm called Networked Insights, which is using data to gauge the feelings and intentions of American voters. I will speak briefly about how the data is collected and analyzed and what signals the firm uses to interpret it. Networked Insights has an analytics engine named "Kairos," which "processes unstructured data from millions of sites, blogs, and social platforms like Twitter and Tumblr." According to the article, "Billions of public posts are then analyzed and classified across 25,000 topics, emotions, and demographics," turning "noisy social data into insights." The firm built four metrics into this engine: "Awareness, Positivity, Negativity and Intent." It uses a combination of Boolean classifiers, language classifiers, and machine learning to interpret the true meaning of the posts.
There are a few issues with this system, and they stem from the fact that language is innately human. Tone is hard to read over the internet. Even for people who know the person who wrote a post, the tone is not always clear. This system has to process massive amounts of data and broadly represent the emotions and intentions of each individual. When a post can be hard to interpret even for the poster's own friends, I find it hard to believe the system can accurately decipher the underlying message. The article describes the system and the combination of classifiers it uses to decode and analyze messages, but I doubt that language in such an unstructured form can be represented accurately.
Another issue I have is that the system can misrepresent actual voting behavior and emotions. The firm draws its material from a variety of platforms, but we have to look at who is doing the posting. My grandparents do not use any social media, yet the elderly are known to be one of the most active age groups when it comes to voting. The firm has a process in place to keep its sample diverse and avoid selection bias, but the fact remains that the people who are most active on social media and similar platforms tend to be younger.
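To show why the age skew matters, here is a minimal sketch of how an estimate drawn from posters can differ from one reweighted to match who actually votes. Every number and name in it (`groups`, `sample_share`, `voter_share`, `support`, `estimate`) is made up for illustration, and the reweighting shown is one common way of correcting for selection bias. The article does not say this is what Networked Insights does.

```python
# All figures are hypothetical and chosen only to illustrate the arithmetic.
# sample_share: fraction of social media posts coming from each age group
# voter_share:  fraction of actual voters in each age group
# support:      fraction of each group expressing support for a candidate
groups = {
    "18-29": {"sample_share": 0.50, "voter_share": 0.20, "support": 0.60},
    "30-64": {"sample_share": 0.45, "voter_share": 0.55, "support": 0.50},
    "65+":   {"sample_share": 0.05, "voter_share": 0.25, "support": 0.40},
}


def estimate(groups, share_key):
    """Overall support as a weighted average, weighting groups by share_key."""
    return sum(g[share_key] * g["support"] for g in groups.values())


# Estimate taken straight from the posts (dominated by younger users).
raw = estimate(groups, "sample_share")

# Estimate after reweighting each group to its share of the electorate.
weighted = estimate(groups, "voter_share")

print(f"raw: {raw:.3f}, reweighted: {weighted:.3f}")
```

With these invented figures, the raw estimate overstates support because the youngest group dominates the posts. The reweighting only helps if the few older users who do post are representative of older voters in general, which is exactly what I doubt.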
We also have to look at slang and the role it plays in their system. Slang and acronyms are used constantly on social media, and the same word can mean different things to different people. Someone from Boston may say that "Trump is wicked," meaning Trump is awesome, but the system could read it as saying he is a wicked individual. Acronyms pose the same problem, because a three-letter acronym could stand for a wide variety of things.
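The "wicked" example can be made concrete with a minimal sketch of a word-list sentiment scorer. This is a toy written under my own assumptions, not the Kairos engine: the `LEXICON`, the `REGIONAL_OVERRIDES` table, and the `score` function are all hypothetical, and the Boston override follows the reading of "wicked" used above.

```python
# Hypothetical mini-lexicon: +1 for positive words, -1 for negative words.
LEXICON = {"awesome": 1, "great": 1, "wicked": -1, "terrible": -1}

# Hypothetical regional overrides: here "wicked" counts as praise in Boston.
REGIONAL_OVERRIDES = {"boston": {"wicked": 1}}


def score(post, region=None):
    """Sum word polarities, applying regional overrides when a region is known."""
    lexicon = dict(LEXICON)
    lexicon.update(REGIONAL_OVERRIDES.get(region, {}))
    words = post.lower().replace(".", " ").split()
    return sum(lexicon.get(word, 0) for word in words)


naive = score("Trump is wicked")                    # no regional context
boston = score("Trump is wicked", region="boston")  # poster known to be from Boston
print(naive, boston)
```

The same sentence comes out negative without context and positive once the region is known. The catch is that the fix depends on knowing where the poster is from and how they use the word, which a public post often does not reveal.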
This system can show emotions in broad strokes, but it effectively disenfranchises citizens who are not active online, and it cannot accurately capture what people actually feel.
"FAQ: Analyzing Social Data to Understand the US Electorate." Wired.com. Conde Nast Digital, n.d. Web. 08 Nov. 2016.