10.1201 9781003200154-4 Chapterpdf
4
Introduction to Social Media Analytics
1. Introduction
Social Media Analytics is the art and science of extracting valuable hidden
insights from vast amounts of semi-structured and unstructured social media data
to enable informed and insightful decision-making. It is considered a science,
as it involves systematically identifying, extracting, and analyzing social media
data (such as tweets, shares, likes, and hyperlinks) using sophisticated tools and
techniques. It is also an art, as it involves interpreting and aligning the insights gained with
business goals and objectives. To get value from analytics, one should master both
its art and science.
The science part of social media analytics requires skilled data analysts,
sophisticated tools and technologies, and reliable data. Getting the science right,
however, is not enough. To effectively consider the results and put them into
action, the business must master the other half of analytics, that is, the art of
interpreting and aligning analytics with business objectives and goals. Interpreting
analytics results, for example, requires representing the data in meaningful ways,
having domain-specific knowledge, and training. Analytics should be strategically
aligned to support existing business goals. Without a well-crafted and aligned
social media strategy, the business will struggle to get the desired outcomes
from analytics.
over the Internet horizon in July 2006, and interest in it (that is people searching
for it) has steadily increased since then.
Google Trends also shows that the majority of the interest in social media analytics
is coming from India, Canada, the United States and the United Kingdom, and
users who are interested in social media analytics also searched for a variety of
topics including Google analytics, social media marketing, social media tools,
marketing analytics, Facebook analytics, Twitter analytics, and social media
management. A full list of social media related terms is shown in Table 1. The
scoring shown in Table 1 is on a relative scale where a value of 100 is the most
• Who are our influential social media nodes (e.g., people and organizations)
and their position in the network?
• Which social media platforms are driving the most traffic to our corporate
website?
• Where is the geographical location of our social media customers?
• Which keywords and terms are trending over social media?
• How active is social media in our business and how many people are
connected with us?
• Which websites are linked to our corporate website?
• How are our competitors doing on social media?
When it comes to social media data and using it to generate business value,
the statement at the beginning of the chapter can be no more than correct. In the
context of social media, seeing is no longer believing; rather, analyzing is.
In other words, business (and social and political) decisions should be based on
digging deep into the social media data rather than just by believing what we see
over social media. On the social web, each second tons of data are generated,
which may carry potential business insights; however, not all the social data is
gold. A vast amount of social data is either fake or useless. To separate good data
from bad, social media analytics coupled with human judgement is the answer.
Another visible difference comes from the way the information (i.e., text,
photographs, videos, audio, etc.) is created and consumed. Social media data
originates from the public Internet and is socialized by nature. Socialized data is
provided for the collective good. It is created and consumed using various social
media platforms and social technologies to maintain social and professional
ties (e.g., Facebook, LinkedIn, etc.), and to facilitate knowledge sharing and
management (Wikipedia, blogs, etc.). Socialized data creates awareness (e.g.,
Twitter), or is used to exchange information in the form of text, audio, video,
documents, and graphics, to name a few (Khan 2013). Social media data is social,
informal, and unbounded (i.e., the Internet is its only boundary), unlike conventional
analytics data, which is bureaucratic, formal in nature, controlled by organizations,
and bound or trapped within the organizational network or intranet. More
importantly, the value or impact of socialized data is determined by the extent
to which it is shared with other social entities (e.g., people or organizations): the
more it is shared (i.e., socialized) the greater its value.
For example, the value or effect of information can be measured by the number
of followers it attracts (e.g., on Twitter or Facebook), by page views or
clicks, or by its socio-political impact (e.g., information disseminated using
social media to organize political or social movements may be more effective
in organizing the events). In contrast, the majority of conventional
business data is confined within organizational databases, shared only to a
limited extent, and can serve as a source of competitive advantage.
70 Social Media Analytics in Predicting Consumer Behavior
The following are the six general steps, at the highest level of abstraction,
that involve both the science and art of achieving business value from social
media data.
Step 1: Identification
The identification stage is the art part of the social media analytics value creation
process which is concerned with searching for and identifying the right source
of information for analytical purposes. The numbers and types of users and
information (such as text, conversation, and networks) available over social
media are huge, diverse, multilingual, and noisy. Thus, framing the right question
and knowing what data to analyze is extremely crucial in gaining useful business
insights. The source and type of data to be analyzed should be aligned with
business objectives.
Most of the data for analytics will come from business-owned social media
platforms, such as an official Twitter account, Facebook fan pages, blogs, and
YouTube channels. Some data for analytics, however, will also be harvested from
nonofficial social media platforms, such as Google search engine trends data or
Twitter’s publicly available search stream data. The business objectives that need
to be achieved will play a major role in identifying the sources and type of data
to be mined.
Step 2: Extraction
Once a reliable and mineable source of data is identified, next comes the
extraction of the data. The type (e.g., text, numerical, or network) and size of
data will determine the method and tools suitable for extraction. A small amount of
numerical information, for example, can be extracted manually (e.g., by going to
your Facebook fan page, counting likes, and copying comments), whereas large-scale
automated extraction is done through an API (application programming
interface). Manual data extraction may be practical for small-scale data, but it
is the API-based extraction tools that will help us get the most out of various
social media platforms. Most social media analytics tools use API-based
data mining.
APIs, in simple words, are sets of routines and protocols that social media service
companies (e.g., Twitter and Facebook) have developed to allow users to access
small portions of the data hosted in their databases. The greatest benefit of
APIs is that they allow other entities (e.g., customers, programmers, and other
organizations) to build apps, widgets, websites, and other tools based on open
social media data. Some data, such as social networks and hyperlink networks,
can only be extracted through specialized tools.
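The API-based extraction described above usually means requesting data one page at a time and following a cursor until the platform reports no more results. The following is a minimal sketch of that loop, not the interface of any real platform: the page shape (`data`, `next_cursor`), the `fetch_page` helper, and the sample posts are all assumptions made for illustration, and a fake in-memory source stands in for the HTTP request.

```python
from typing import Callable, Dict, Iterator, Optional

# Hypothetical page shape: {"data": [records...], "next_cursor": str or None}
Page = Dict[str, object]


def extract_all(fetch_page: Callable[[Optional[str]], Page],
                max_pages: int = 100) -> Iterator[dict]:
    """Yield every record from a cursor-paginated source.

    fetch_page(cursor) stands in for one API request; it is an assumed
    helper, not a call from any real platform SDK.
    """
    cursor: Optional[str] = None
    for _ in range(max_pages):       # hard cap so a bad cursor cannot loop forever
        page = fetch_page(cursor)
        for record in page.get("data", []):
            yield record
        cursor = page.get("next_cursor")
        if not cursor:               # no further pages
            break


# Stand-in for a real API: three made-up posts served two per page.
_FAKE_POSTS = [{"id": 1, "likes": 10}, {"id": 2, "likes": 3}, {"id": 3, "likes": 7}]


def fake_fetch_page(cursor: Optional[str]) -> Page:
    start = int(cursor) if cursor else 0
    end = start + 2
    next_cursor = str(end) if end < len(_FAKE_POSTS) else None
    return {"data": _FAKE_POSTS[start:end], "next_cursor": next_cursor}


posts = list(extract_all(fake_fetch_page))
total_likes = sum(p["likes"] for p in posts)
print(len(posts), total_likes)
```

In practice `fake_fetch_page` would be replaced by a function that calls the platform's documented endpoint with valid credentials and respects its rate limits and terms of use.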
Two important issues to bear in mind here are the privacy and ethical issues
related to mining data from social media platforms. Privacy advocacy groups have
long raised serious concerns regarding the large-scale mining of social media
data and warned against transforming social spaces into behavioral laboratories.
The social media privacy issue first came into the spotlight particularly due to
the large-scale “Facebook Experiment” carried out in 2012. In this experiment,
Facebook manipulated the news feeds of hundreds of thousands of people to see whether
emotional contagion occurs without face-to-face interaction (and in the absence of
nonverbal cues) between people in social networks (Kramer et al. 2014). Though
the experiment was consistent with Facebook’s Data Use Policy (Verma 2014) and
helped to promote our understanding of online social behavior, it does, however,
raise serious concerns regarding obtaining informed consent from participants
and allowing them to opt out.
The bottom line here is that data extraction practices should not violate a
user’s privacy and the data extracted should be handled carefully. The policies
should explicitly detail social media ownership regarding both accounts and
activities such as individual and page profiles, platform content, posting activity,
data handling, and extraction, etc.
Step 3: Cleaning
This step involves removing the unwanted data from the automatically extracted
data. Some data may need cleaning, while other data can go into analysis directly.
In the case of text analytics, cleaning, coding, clustering, and filtering may be
needed to get rid of irrelevant textual data using natural language processing
(NLP). Coding and filtering can be performed by machines (i.e., automated) or
can be carried out manually by humans. For example, Discovertext combines
both machine learning and human coding techniques to code, cluster, and classify
social media data (Shulman 2014).
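As a rough illustration of automated cleaning and filtering of text, the sketch below lowercases a post, strips URLs and @-mentions, removes a few stop words, and then keeps only posts that mention a keyword of interest. The stop-word list, the regular expressions, and the sample post are assumptions chosen for the example; they do not describe how Discovertext or any other tool works.

```python
import re

# Tiny illustrative stop-word list; a real project would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "of", "in", "for", "rt"}

URL_RE = re.compile(r"https?://\S+")      # web links
MENTION_RE = re.compile(r"@\w+")          # @username mentions
TOKEN_RE = re.compile(r"[a-z0-9']+")      # words and numbers (drops '#' and punctuation)


def clean_post(text):
    """Return the list of meaningful lowercase tokens in a post."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(text)
    return [t for t in tokens if t not in STOP_WORDS]


def is_relevant(tokens, keywords):
    """Filter step: keep a post only if it contains at least one keyword."""
    return any(t in keywords for t in tokens)


raw = "RT @acme: Loving the new #Acme phone! https://t.co/abc123"
tokens = clean_post(raw)
print(tokens)
print(is_relevant(tokens, {"phone", "battery"}))
```

Human coders can then review the posts that the filter keeps or rejects, which mirrors the combination of machine and manual coding mentioned above.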
Step 4: Analyzing
At this stage, the clean data is analyzed for business insights. Depending on the
layer of social media analytics under consideration and the tools and algorithms
employed, the steps and approaches to take will greatly vary. For example, nodes
in a social media network can be clustered and visualized in a variety of ways
depending on the algorithm employed. The overall objective at this stage is to
extract meaningful insights without the data losing its integrity.
While most analytics tools lay out a step-by-step procedure to analyze social
data, having background knowledge and an understanding of the tools and their
capabilities is crucial in arriving at the right answers.
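To make the network example concrete, here is a minimal sketch of one very simple way to group nodes and rank them: connected components for grouping and degree (number of direct ties) for ranking. This is an assumed, deliberately basic choice of algorithm; dedicated network tools offer many other clustering and centrality methods, and a different algorithm would group the same nodes differently. The account names and ties are made up.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_components(graph):
    """Group nodes that can reach each other through any chain of ties."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:                      # depth-first walk from the start node
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        components.append(sorted(members))
    return components


def degree(graph):
    """Number of direct ties per node, a basic indicator of prominence."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


# Hypothetical "who mentions whom" ties between five accounts.
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("dan", "eve")]
graph = build_graph(edges)
groups = connected_components(graph)
degrees = degree(graph)
print(groups)
print(degrees)
```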
Step 5: Visualization
Data or information visualization is the process of converting numerical data into
a graphical (or visual) format to reveal hidden patterns and causal relationships
in data to help facilitate business decision making. In fact, data visualization is
the use of computer-supported, interactive, visual representations of abstract
data to amplify human understanding (Card et al. 1999), thus enabling us to gain
knowledge about the hidden internal structures and causal relationships in data.
The notion of using visuals to understand data and information is not new.
The use of maps and graphs has been around since the 17th century. However, with
the advent of computer programs and affordable tools, data visualization is easy
to accomplish with minimum effort and skill. Thanks to powerful data visualization
tools (such as Tableau and SAS Visual Analytics), anyone can process and visualize
a large amount of data with the click of a button in no time.
7. Visual Analytics
Visual analytics is the science of analytical reasoning facilitated by interactive
visual interfaces (Thomas and Cook 2005). Visual analytics is becoming an
important part of interactive decision making facilitated by solid visualization
Temporal data (when)—Temporal data visualization slices and dices data on a time
horizon and can reveal longitudinal trends, patterns, and relationships hidden in
the data. Google Trends data, for example, can be used to visually investigate
longitudinal search engine trends.
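Slicing data on a time horizon can be sketched in a few lines: group timestamped posts by day and draw the counts as a simple text chart. The timestamps below are invented for illustration, and the daily bucket size is an assumption; weekly or hourly buckets follow the same pattern.

```python
from collections import Counter
from datetime import datetime


def posts_per_day(timestamps):
    """Count posts per calendar day from ISO 8601 timestamp strings."""
    days = Counter(datetime.fromisoformat(ts).date().isoformat() for ts in timestamps)
    return sorted(days.items())           # chronological order


def text_chart(series):
    """Render (day, count) pairs as one bar of '#' characters per day."""
    return "\n".join(f"{day} {'#' * count} {count}" for day, count in series)


# Made-up posting times over three days.
timestamps = [
    "2024-03-01T09:15:00", "2024-03-01T17:40:00",
    "2024-03-02T08:05:00",
    "2024-03-03T12:00:00", "2024-03-03T12:30:00", "2024-03-03T21:10:00",
]
series = posts_per_day(timestamps)
print(text_chart(series))
```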
Geospatial data (where)—Geospatial data visualization is used to map and locate
data, people, and resources.
One interesting data visualization technique is the word cloud. Word clouds
are images composed of the words used in a particular text or
subject, where the size of each word indicates its frequency or importance. Many
text analytics platforms produce word clouds based on an analysis of the text
examined. Wordle is the most well-known word cloud software and is free to use
at Wordle.net. Words are stacked into a box or some other shape, with the largest
words being the most prevalent, though the information conveyed is mostly descriptive.
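The sizing rule behind a word cloud can be sketched as follows: count how often each word occurs, then scale the counts linearly onto a range of font sizes. The linear scaling, the 10 to 40 point range, and the sample tokens are assumptions for illustration; actual word cloud software may weight and lay out words differently.

```python
from collections import Counter


def word_sizes(tokens, min_pt=10, max_pt=40, top_n=5):
    """Map the most frequent words to font sizes between min_pt and max_pt."""
    counts = Counter(tokens).most_common(top_n)
    if not counts:
        return {}
    highest = counts[0][1]                # most_common sorts by descending count
    lowest = counts[-1][1]
    span = highest - lowest
    sizes = {}
    for word, n in counts:
        # If every word is equally frequent, give them all the largest size.
        scale = (n - lowest) / span if span else 1.0
        sizes[word] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


# Made-up tokens, e.g. the output of a text-cleaning step.
tokens = ["phone"] * 5 + ["battery"] * 3 + ["price"]
sizes = word_sizes(tokens)
print(sizes)
```

The resulting sizes only describe how often words appear; deciding what that frequency means for the business still requires interpretation.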
Other forms of visualization include trees, hierarchies, multidimensional displays
(charts, graphs, tag clouds), 3-D (three-dimensional) views, computer simulations,
infographics, flows, tables, heat maps, plots, etc.
Step 6: Interpretation or Consumption
This step relies on human judgment to interpret valuable knowledge from the
visual data. Meaningful interpretation is of particular importance when we are
dealing with descriptive analytics that leaves room for different interpretations.
While companies are quickly mastering the sophisticated analytical methods, skills,
and techniques needed to convert big data into information, there seems to be a
gap between an organization’s capacity to produce analytical results and its ability
to effectively consume them. For example, a study of 2,719 business executives,
managers, and analytics professionals from around the world found that the greatest
problem with creating business value from analytics is not data management issues
or complex data modeling skills, but translating analytics into business
actions and making business decisions based on the results, rather than producing the
results per se (Kiron et al. 2014).
The study also reported that there are three levels of analytical maturity in
organizations:
1. Analytically Challenged: These organizations lack sophisticated data
management and analytical skills and generally rely more on management
experience in decision making.
2. Analytical Practitioners: Such organizations tend to use analytics for
operational purposes, have “just good enough data” and are working to
become more data driven.
3. Analytical Innovators: These organizations are more strategic
in their use and application of analytics, place greater value on data, and
have higher levels of data management and analytical skills. These are the
organizations that are the most successful in translating analytics results into
business actions and decision making.
Figure 6. Current vs. potential use of social media analytics (Khan 2018)
Table 4 lists some example tools for each layer of social media analytics. These
tools can be used to measure different layers of social media data, especially when
aligned with an organization’s business strategy. The Digital Methods Initiative
(DMI) Internet Studies research group provides a comprehensive list of methods
and tools for social media data (the list can be accessed using this shortened
URL: goo.gl/EiTWi).
Table 4. Examples of social media analytics tools with respect to its layers
Data             Tools
Text             Discovertext, Lexalytics, Tweet Archivist, Twitonomy, Netlytic, LIWC, Voyant
Actions          Lithium, Twitonomy, Google Analytics, SocialMediaMineR
Network          NodeXL, UCINET, Pajek, Netminer, Flocker, Netlytic, Reach, Mentionmapp
Mobile           Countly, Mixpanel, Google Mobile Analytics
Location         Google Fusion Table, TweepsMap, Trendsmap, Followerwonk, Esri Maps, Agos
Hyperlinks       Webometrics Analyst, VOSON
Search Engines   Google Trends, Google Correlate
Multimedia       Crimsonhexagon Image Analytics, YouTube Analytics, SAS Visual Analytics, Google Cloud Vision API, Simply Measured
References
Asur, S. and B.A. Huberman. 2010. Predicting the future with social media. 2010 IEEE/
WIC/ACM International Conference on Web Intelligence and Intelligent Agent
Technology, pp. 492–499: 10.1109/WI-IAT.2010.63
Bekmamedova, N. and G. Shanks. 2014. Social media analytics and business value: A
theoretical framework and case study. 2014 47th Hawaii International Conference on
System Sciences, pp. 3728–3737: 10.1109/HICSS.2014.464
Berinato, S. 2016. Visualizations that really work. Harvard Business Review (June 2016).
Card, S., J. Mackinlay and B. Shneiderman. 1999. Readings in Information Visualization:
Using Vision to Think. Morgan Kaufmann Publishers.
Chen, H., R.H.L. Chiang and V.C. Storey. 2012. Business intelligence and analytics: From
big data to big impact. MIS Q., 36(4), 1165–1188.
Delen, D. and H. Demirkan. 2013. Data, information and analytics as services. Decision
Support Systems, 55(1), 359–363. http://dx.doi.org/10.1016/j.dss.2012.05.044
Keim, D., G. Andrienko, J-D. Fekete, C. Görg, J. Kohlhammer and G. Melancon. 2008.
Visual analytics: Definition, process, and challenges. pp. 154–175. In: A. Kerren,
J.T. Stasko, J-D. Fekete and C. North [Eds.]. Information Visualization. Vol.
4950, Springer-Verlag, Berlin, Heidelberg.
Khan, G.F. 2013. Social media-based systems: An emerging area of information systems
research and practice. Scientometrics, 95(1), 159–180. 10.1007/s11192-012-0831-5
Khan, G.F. 2018. Creating Value with Social Media Analytics: Managing, Aligning, and
Mining Social Media Text, Networks, Actions, Location, Apps, Hyperlinks, Multimedia,
Search Engines Data. CreateSpace Independent Publishing Platform, April 23, 2018.
Kiron, D., P.K. Prentice and R.B. Ferguson. 2014. Raising the bar with analytics. MIT
Sloan Management Review, https://sloanreview.mit.edu/article/raising-the-bar-with-
analytics/
Kramer, A.D.I., J.E. Guillory and J.T. Hancock. 2014. Experimental evidence of massive-
scale emotional contagion through social networks. Proceedings of the National
Academy of Sciences, 111(24), 8788–8790. 10.1073/pnas.1320040111
Lustig, I., B. Dietrich, C. Johnson and C. Dziekan. 2010. The analytics journey: An IBM
view of the structured data analysis landscape: Descriptive, predictive and prescriptive
analytics. Analytics-Magazine, available at: http://www.analytics-magazine.org/
november-december-2010/54-the-analytics-journey.
MarketsandMarkets. 2016. Social Media Analytics Market worth 9.54 Billion USD by
2022. https://www.marketsandmarkets.com/PressReleases/social-media-analytics.asp
Petersen, R. 2012. 166 Cases Studies Prove Social Media Marketing ROI. BarnRaisers.
Syncapse. 2013. The Value of a Facebook Fan 2013: Revisiting Consumer Brand
Currency in Social Media. New York, NY.
Thomas, J.J. and K.A. Cook. 2005. Illuminating the Path: The Research and Development
Agenda for Visual Analytics. IEEE Press.
Verma, I.M. 2014. Editorial Expression of Concern: Experimental evidence of massive
scale emotional contagion through social networks. Proceedings of the National
Academy of Sciences, 111(29), 10779. 10.1073/pnas.1412469111
Wong, P.C. and J. Thomas. 2004. Visual Analytics. IEEE Computer Graphics &
Applications, 24, 20–21.