Data Visualizing With Ai
VALAN P
Introduction
Data visualization
What is data visualization?
Visualization process
important in reports long-term memory. information in a visual, the results are much clearer (see
the graph below).
[Figure: bar chart with one bar each for the United States, Russia, South Africa, Europe, Canada, Australia, and Japan]
• The human mind can see an image for just 13 milliseconds and store the information, provided that it is associated with a concept. Our eyes can take in 36,000 visual messages per hour.
• 40% of nerve fibers are connected to the retina.
We’ll close this introduction with a 2012 reflection by Alberto Cairo, a specialist in information visualization and a leader in the world of data visualization. For the author, a good visual must provide clarity, highlight trends, uncover patterns, and reveal unseen realities:
“We create visuals so that users can analyze data and, from it, discover realities that not even the designer, in some instances, had considered.”
2) Exploring
Some visuals are designed to lend a data set spatial dimensions, or to offer numerous subsets of data in order to raise questions, find answers, and discover opportunities. When the goal of a visual is to explore, the viewers start by familiarizing themselves with the dataset, then identifying an area of interest, asking questions, exploring, and finding several solutions or answers.
2 Available at: https://www.fusioncharts.com/whitepapers/downloads/Principles-of-Data-Visualization.pdf
3 Available at: http://www.r2d3.us/visual-intro-to-machine-learning-part-1/
Data types, relationships,
and visualization formats
2 kinds of data
Before we talk about visuals themselves, we must first understand the different kinds of data that can be visualized and how they relate to one another.
The most common kinds of data are4:
5 Source: HubSpot, Prezi, and Infogram (2018). Presenting Data People Can’t Ignore: How to Communicate Effectively Using Data (p. 10 of 16). Available at: https://offers.hubspot.com/presenting-data-people-cant-ignore.
7 data relationships
Data relationships can be simple, like the progress of a single metric over time (such as visits to a blog over the course of 30 days or the number of users on a social network),
or they can be complex, precisely comparing relationships, revealing structure, and extracting patterns from data. There are seven data relationships to consider:
Ranking: A visualization that relates two or more values with respect to a relative magnitude. For example: a company’s most sold products.
Series over time: Here we can trace the changes in the values of a constant metric over the course of time. For example: monthly sales of a product over the course of two years.
Nominal comparisons: Visualizations that compare quantitative values from different subcategories. For example: product prices in various supermarkets.
Deviation: Examines how each data point relates to the others and, particularly, to what point its value differs from the average. For example: the line of deviation for tickets to an amusement park sold on a rainy versus a normal day.
Distribution: Visualization that shows the distribution of data spatially, often around a central value. For example: the heights of players on a basketball team.
Partial and total relationships: Show a subset of data as compared with a larger total. For example: the percentage of clients that buy specific products.
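Several of these relationships come down to simple arithmetic on the data before anything is drawn. The sketch below is only an illustration: the ticket sales figures and the function names are invented here, not taken from the source. It computes a ranking, each value's deviation from the average, and each value's share of the total.

```python
from statistics import mean

# Hypothetical daily ticket sales; the numbers are invented for illustration.
ticket_sales = {"Mon": 120, "Tue": 95, "Wed": 140, "Thu": 80, "Fri": 165}

def ranking(values):
    """Return the keys ordered from the largest value to the smallest."""
    return sorted(values, key=values.get, reverse=True)

def deviations_from_mean(values):
    """Return each value's signed difference from the average."""
    average = mean(values.values())
    return {key: value - average for key, value in values.items()}

def shares_of_total(values):
    """Return each value as a percentage of the overall total."""
    total = sum(values.values())
    return {key: 100 * value / total for key, value in values.items()}

order = ranking(ticket_sales)
deviation = deviations_from_mean(ticket_sales)
share = shares_of_total(ticket_sales)

for day in order:
    print(f"{day}: {deviation[day]:+.0f} vs. average, {share[day]:.1f}% of total")
```

A ranking feeds a sorted bar chart, the deviations feed a chart centered on the average, and the shares feed a part-to-whole chart.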
11 formats
There are two types of visualizations: static and interactive. Their use depends on the search and data analysis dimension level. Static visuals can only analyze data in one dimension, whereas interactive visuals can analyze it in several.
1. Bar chart
Bar charts are one of the most popular ways of visualizing data because they present a data set in a quickly understood format that enables viewers to identify highs and lows at a glance. They are very versatile, and they are typically used to compare discrete categories, to analyze changes over time, or to compare parts of a whole. The three variations on the bar chart are:
[Figure: bar chart variations. Axis labels: months (Jan to May); categories (Education, Entertainment, Health); percentage scale from 0% to 100%]
2. Histograms
Histograms represent a variable in the form of bars, where the surface of each bar is proportional to the frequency of the values represented.
• Vertical columns
[Figure: histogram; bin labels include 60-80 and >120]
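The counting step behind a histogram can be sketched in a few lines. This is a minimal illustration with invented ages and an assumed fixed bin width of 20; because every bin has the same width, bar height and bar surface are both proportional to frequency.

```python
def histogram_counts(values, bin_width, start=0):
    """Count how many values fall into each half-open bin [low, low + bin_width)."""
    counts = {}
    for value in values:
        index = int((value - start) // bin_width)
        low = start + index * bin_width
        counts[low] = counts.get(low, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical sample; the ages are invented for illustration.
ages = [23, 35, 41, 47, 52, 58, 61, 64, 66, 79]
counts = histogram_counts(ages, bin_width=20)

# Text rendering: one '#' per observation in the bin.
for low, count in counts.items():
    print(f"{low:>3}-{low + 20:<3} {'#' * count}")
```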
3. Pie charts
[Figure: scatter plot; scatter plot with grid]
5. Heat maps
• Mosaic diagram
• Color map
[Figure: mosaic diagram; color map]
6. Line charts
These are used to display changes or trends in data over a period of time. They are especially useful for showcasing relationships, acceleration, deceleration, and volatility in a data set.
[Figure: line chart]
7. Bubble charts
These graphics display three-dimensional data and accentuate data in dispersion diagrams and maps. Their purpose is to highlight nominal comparisons and classification relationships. The size and color of the bubbles represent a dimension that, along with the data, is very useful for visually stressing specific values. The two variations on the bubble chart are:
• The bubble plot: used to show a variable in three dimensions, position coordinates (x, y) and size.
• Bubble map: used to visualize three-dimensional values for geographic regions.
8. Radar charts
These are a form of representation built around a regular polygon that is contained within a circle, where the radii that guide the vertices are the axes over which the values are represented. They are equivalent to graphics with parallel coordinates on polar coordinates. Typically, they are used to represent the behavior of a metric over the course of a set time cycle, such as the hours of the day, months of the year, or days of the week.
[Figure: radar chart]
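The geometry of a radar chart can be sketched as follows: each value gets its own axis, the axes are spread evenly around a circle, and the value sets the distance from the center. The weekday figures, the function name, and the choice to start at 12 o'clock and move clockwise are assumptions made for this illustration.

```python
import math

def radar_vertices(values, max_value):
    """Map each value to an (x, y) point on its own evenly spaced axis."""
    count = len(values)
    points = []
    for index, value in enumerate(values):
        # Start at 12 o'clock and move clockwise, one axis per value.
        angle = math.pi / 2 - 2 * math.pi * index / count
        radius = value / max_value  # normalize so max_value sits on the unit circle
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points

# Hypothetical visits for Monday through Sunday.
visits_by_weekday = [40, 55, 60, 80, 100, 30, 20]
points = radar_vertices(visits_by_weekday, max_value=100)
```

Joining the returned points in order, and closing the shape back to the first point, gives the polygon that the chart displays.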
9. Waterfall charts
These help us understand the cumulative effect of positive and negative values on variables in a sequential fashion.
[Figure: waterfall chart running from Start through steps A to L to End, on a scale from 0 to 400K; legend: Fall, Rise]
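The cumulative logic of a waterfall chart can be sketched as a running total in which each bar floats between the total before and the total after its change. The starting value, the step labels, and the amounts below are invented for illustration.

```python
from itertools import accumulate

def waterfall_steps(start, changes):
    """Return the floating bars of a waterfall chart and the final total.

    Each bar is a tuple (label, bottom, top, kind), where kind is
    "rise" for a positive change and "fall" for a negative one.
    """
    labels = list(changes)
    # Running totals: the start value, then the total after each change.
    running = list(accumulate(changes.values(), initial=start))
    steps = []
    for label, before, after in zip(labels, running, running[1:]):
        kind = "rise" if after >= before else "fall"
        steps.append((label, min(before, after), max(before, after), kind))
    return steps, running[-1]

# Hypothetical sequence of positive and negative changes.
changes = {"A": 50_000, "B": -30_000, "C": 20_000}
steps, end = waterfall_steps(200_000, changes)
```

`accumulate(..., initial=...)` requires Python 3.8 or later.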
10. Tree maps
Tree maps display hierarchical data (in a tree structure) as a set of nested rectangles that occupy surface areas proportional to the value of the variable they represent. Each tree branch is given a rectangle.
[Figure: tree map with rectangles labeled A, B, C, E, and H]
11. Area charts
These represent the relationship of a series over time, but unlike line charts, they can represent volume. The three variations on the area chart are:
• Standard area: used to display or compare a progression over time.
• Stacked area: used to visualize relationships as part of the whole, thus demonstrating the contribution of each category to the cumulative total.
• 100% stacked area: used to communicate the distribution of categories as part of a whole, where the cumulative total does not matter.
[Figure: standard area chart, followed by two further area chart panels with series A, B, and C]
through our visualizations is no easy task. Stephen Few (2009), a specialist in data visualization, proposes taking a practical approach to selecting and using an appropriate graphic:
• Choose a graphic that will capture the viewer’s attention for sure.
• Represent the information in a simple, clear, and precise way (avoid unnecessary flourishes).
• Make it easy to compare data; highlight trends and differences.
• Give the viewer a clear way to explore the graphic and understand its goals; make use of guide tags.
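To make the difference between the stacked and the 100% stacked area variations concrete, here is a minimal sketch of the arithmetic behind each. The three series and their values are invented for illustration, and the function names are our own.

```python
def stack(series):
    """Return the cumulative upper boundary of each series, in order."""
    boundaries = {}
    running = [0.0] * len(next(iter(series.values())))
    for name, values in series.items():
        running = [base + value for base, value in zip(running, values)]
        boundaries[name] = running
    return boundaries

def normalize(series):
    """Rescale every time step so that the categories sum to 100."""
    totals = [sum(column) for column in zip(*series.values())]
    return {
        name: [100 * value / total for value, total in zip(values, totals)]
        for name, values in series.items()
    }

# Hypothetical values for three categories over three periods.
sales = {"A": [10, 20, 30], "B": [30, 20, 10], "C": [10, 10, 10]}

stacked = stack(sales)                 # top edge shows the cumulative total
stacked_100 = stack(normalize(sales))  # top edge is always 100
```

In the stacked version the top boundary is the cumulative total at each period; in the 100% version that total is discarded and only each category's share remains.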
The goal of data visualizations is to help us understand the object they represent. They are a medium for communicating stories and the results of research, as well as a platform for analyzing and exploring data. Therefore, having a sound understanding of how to create data visualizations will help us create meaningful and easy-to-remember reports, infographics, and dashboards. Creating suitable visuals helps us solve problems and analyze a study’s objects in greater detail.
The first step in representing information is trying to understand that data visualization.
Ben Shneiderman gave us a useful starting point in his text “The Visual Information-Seeking Mantra” (1996), which remains a touchstone work in the field. This author suggests a simple methodology for novice users to delve into the world of data visualization and experiment with basic visual representation tasks.5
2. Zoom and filter: The second step involves supplementing the first so that viewers understand the data’s underlying structure. The zoom in/zoom out mechanism enables us to select interesting subsets of data that meet certain criteria while maintaining the sense of position and context.
3. Details on demand: This makes it possible to select a narrower subset of data, enabling the user to interact with the information and use filters by hovering or clicking on the data to pull up additional information.
The chart on the right side summarizes the key points to designing such a graphic, with an eye to human visual perception, so that users can translate an idea into a set of physical attributes.
These attributes are: structure, position, form, size, and color. When properly applied, these attributes can present information effectively and memorably.
[Figure: levels of detail annotated with ZOOM AND FILTER and DETAILS ON DEMAND. 2. Containers: the overall shape of the architecture and technology choices. 3. Components: logical components and their interactions within a container. 4. Classes: component or pattern implementation details.]
5 Shneiderman, B. (1996). The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations. Visual Information Seeking Mantra (p. 336). Available at: https://www.cs.umd.edu/~ben/papers/Shneiderman1996eyes.pdf
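The steps of the mantra can be sketched as three small operations on a data set. The overview step comes from Shneiderman's paper, where the mantra reads "overview first, zoom and filter, then details-on-demand". The records, field names, and function names below are hypothetical and serve only to illustrate the idea.

```python
from statistics import mean

# Hypothetical records; fields and values are invented for illustration.
records = [
    {"id": 1, "region": "North", "product": "Notebooks", "sales": 120},
    {"id": 2, "region": "North", "product": "Entertainment", "sales": 80},
    {"id": 3, "region": "South", "product": "Notebooks", "sales": 200},
    {"id": 4, "region": "South", "product": "Lifestyle", "sales": 50},
]

def overview(rows):
    """Summarize the whole data set before looking at any subset."""
    sales = [row["sales"] for row in rows]
    return {"count": len(rows), "total": sum(sales), "mean": mean(sales)}

def zoom_and_filter(rows, **criteria):
    """Keep only the rows whose fields match every given criterion."""
    return [
        row for row in rows
        if all(row[field] == wanted for field, wanted in criteria.items())
    ]

def details_on_demand(rows, record_id):
    """Return the full record for one item, as a hover or click would."""
    return next((row for row in rows if row["id"] == record_id), None)

summary = overview(records)
south = zoom_and_filter(records, region="South")
detail = details_on_demand(south, 3)
```

In an interactive visual, the same three calls would typically be wired to the initial render, to a filter control, and to a hover or click handler.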
Layout and design: communicative elements
In order to begin designing our reports and statements, it is essential to understand that visual representations are cognitive tools that complement and strengthen our mental ability to encode and decode information6. Meirelles (2014) notes that: “All graphic representation affects our visual perception, because the elements of transmission utilized act as external stimuli, which activate our emotional state and knowledge.” Thus, when our mind visualizes a representation, it
Structuring: the importance of layout
All visual representations begin with a blank dimensional space that will eventually hold the information which will be communicated. The process of spatial coding is a fundamental part of visual representation because it is the medium in which the results of our compositional decisions and the meaning of our visual statement will be visualized, thereby having an impact on the user.
Furthermore, the visual hierarchy of elements plays a role in this encoding process, because the elements’ organization and distribution must have a well-defined hierarchical system in order to communicate effectively (Meirelles: 2014). In a sense, visualizations are paragraphs about data, and they should be treated as such. Words, images, and numbers are part of the information that will be visualized. When all of the elements are integrated in a single structure and visual hierarchy, the infographic or report will organize space properly and communicate effectively, according to your user’s needs.
Edward Tufte (1990) defines “layout” as a scheme for distributing visual elements in order to achieve organization and harmony in the final composition. Layout planning and design serve as a template for applying hierarchy and control to information at varying levels of detail.7 In his book Envisioning Information, Tufte offers several guidelines for information design:
6 Meirelles, I. (2014). “La información en el diseño” (p. 21-22). Barcelona: Parramón.
7 Tufte, E. (1990). Envisioning Information. Cheshire: Graphics Press.
Visual variables
and their semantics
Saturation: this refers to the intensity of a given color’s hue. It varies based on brightness. Darker colors are less saturated, and the less saturated a color is, the closer it gets to gray. In other words, it gets closer to a neutral (hueless) color. The following graphic offers a brief summary of color application.
[Figure: summary of color application; labels include cool colors and saturated colors]
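The idea that a less saturated color drifts toward a neutral gray can be checked with Python's standard `colorsys` module. This sketch assumes the HSV color model, which is only one of several ways to define saturation; the starting color and the function name are our own choices.

```python
import colorsys

def desaturate(rgb, factor):
    """Scale a color's HSV saturation by factor (0 gives gray, 1 leaves it unchanged)."""
    hue, saturation, value = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb(hue, saturation * factor, value)

# RGB components in the range 0.0 to 1.0; a fairly saturated red.
red = (0.8, 0.2, 0.2)

half = desaturate(red, 0.5)   # a paler, pinkish red
gray = desaturate(red, 0.0)   # all three channels equal: a hueless gray
```

With the saturation at zero, the red, green, and blue channels become equal, which is exactly what makes the result a gray.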
Isabel Meirelles (2014) notes that selecting a color palette in order to visualize data is no easy task, and she recommends following Cynthia Brewer’s advice, which uses three different kinds of color schemes, based on the nature of the data:
1. Monochromatic sequential palettes or their analogue
These palettes are great for ordering numeric data that progresses from small to large. It is best to use brighter color gradients for low values and darker ones for higher values.
2. Diverging palettes
These are more suitable for ordering categorical data, and they are more effective when the categorical division is in the middle of the sequence. The change in brightness highlights a critical value in the data, such as the mean or median, or a zero. Colors become darker to represent differences in both directions, based on this meaningful value in the middle of the data.
TIP: Try to emphasize the most important information using arrows and text, circles, rectangles, or contrasting colors. This way, when you visualize your data, your analysis will be more understandable.
TIP: The qualitative color scheme is perfect for visualizing data because it affords a high degree of contrast and helps you draw attention to important points, especially if you use one predominant color and use the second as an accent in your design.
Finally, don’t forget to use palettes that are comprehensible to people who can’t see color. Color blindness is a disability or limited ability that makes it difficult to distinguish certain pairs of colors, such as blue and yellow, or red and green. One strategy for avoiding this problem is to adopt designs that use more than just hue to codify information; create schemes that slightly vary another channel, such as brightness or saturation.
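Sequential and diverging palettes can both be generated by interpolating between anchor colors. The sketch below uses plain linear interpolation in RGB, which is a simplification chosen for brevity rather than the method behind Brewer's published palettes; the anchor colors and function names are hypothetical.

```python
def lerp_color(start, end, t):
    """Linearly interpolate between two RGB colors (0 to 255) for t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))

def to_hex(rgb):
    """Format an (r, g, b) tuple as a #rrggbb string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)

def sequential_palette(light, dark, steps):
    """Light colors for low values, darker colors for high values."""
    return [to_hex(lerp_color(light, dark, i / (steps - 1))) for i in range(steps)]

def diverging_palette(dark_low, neutral, dark_high, steps_per_side):
    """Two dark ends meeting at a light midpoint, such as zero or the mean."""
    low = [lerp_color(dark_low, neutral, i / steps_per_side)
           for i in range(steps_per_side)]
    high = [lerp_color(neutral, dark_high, i / steps_per_side)
            for i in range(1, steps_per_side + 1)]
    return [to_hex(color) for color in low + [neutral] + high]

# Hypothetical anchor colors.
WHITE, NAVY, BRICK = (255, 255, 255), (0, 0, 128), (160, 40, 30)

blues = sequential_palette(WHITE, NAVY, steps=5)
blue_to_red = diverging_palette(NAVY, WHITE, BRICK, steps_per_side=2)
```

Because both ends of the diverging palette are darker than its midpoint, brightness still carries the "distance from the middle" for viewers who cannot tell the two hues apart; whether the two directions themselves remain distinguishable depends on the hues chosen.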
in understanding and limit unnecessary tagging
Symbols and icons are another avenue for visualizing information that goes beyond merely being decorative. They draw strength from their ability to exhibit a general context in an attractive, precise way. Icons illustrate concepts. Viewers can understand what the information is about by just glancing at the illustration.
Alexander Skorka (2018), chief evangelist for the Dapresy Group, recommends using symbols and icons because they simplify communication. Symbols are self-explanatory, and our mind can process icons more easily than text. It is important to consider that an icon’s success depends largely on cultural context, so it is important to select universally understandable images.
That said, they certainly should not be complex illustrations.
                     Singles   Couples   Families
Notebooks            82%       76%       63%
Entertainment        55%       64%       88%
Lifestyle products   77%       73%       54%
The basic elements of the visualization process also involve preattentive attributes. Preattentive attributes are visual
features that facilitate the rapid visual perception of a graphic in a space. Designers use these characteristics to
better uncover relevant information in visuals, because these characteristics attract the eye.
Colin Ware, Director of the Data Visualization Research Lab at the University of New Hampshire, has highlighted
that preattentive attributes can be used as resources for drawing viewers’ immediate attention to certain
parts of visual representations (2004). According to Ware, preattentive processing happens very quickly, typically
in the first 10 milliseconds. This process is the mind’s attempt to rapidly extract basic visual characteristics from
the graphic (stage 1). These characteristics are then consciously processed, along with the perception of the object,
so that the mind can extract patterns (stage 2), ultimately enabling the information to move to the highest level of
perception (stage 3). This makes it possible to find answers to the initial visual question, utilizing the information
saved in our minds. Colin Ware, cited in Meirelles (2014), explains it as follows:
Preattentive attributes enhance object perception and cognition processes, leveraging our mind’s visual capacities.
Good data visualizations deliberately make use of these attributes because they boost the mind’s discovery and
recognition of patterns such as lines, planes, colors, movements, and spatial positioning.9
9 Dondis, D.A. (2015). La sintaxis de la imagen: introducción al alfabeto visual. Editorial Gustavo Gili: Barcelona
Meirelles, I. (2014). La información en el diseño. Barcelona: Parramón.
The visual below lists preattentive attributes that represent
aspects of lines and planes when visualizing and analyzing
graphic representation: shape, color, and spatial position.
[Figure: preattentive attributes of shape: orientation, line length, line width, size]
Gestalt’s principles
Storytelling for social
communication
As we saw at the beginning of this ebook, our mind tends to [...] communication, and it is inherent in every human being.
We cannot live without communicating, without expressing our personalities, emotions, and moods, our worries and fears.
Paul Maclean, cited in María Alejandra Rendón (2009), proposes a “Triune brain” theory, which addresses the structure and behavior of the human mind. For Maclean, the mind consists of three inseparable parts (or distinct brains); none of the three functions independently or separately. They are the reptilian brain, the emotional brain, and the neocortex.
The reptilian brain is home to our unconscious, also known as our instinctive side. It manages survival and our body’s self-regulation. The second part, the emotional brain, is responsible for our emotional processes and basic motivations. Last but not least, the neocortex is our more rational, complex side. It is in charge of driving our systematic and logical thinking.
The triune model is a valuable tool for effectively communicating [...] potential buyers. Understanding and mastering this theory enables us to extract information not just from the neocortex, but from the reptilian and emotional brains as well. This can be useful for qualitative market research methodology, since it utilizes a host of different techniques, including in-depth interviews, ethnographic research, and focus groups. This information is essential if we are aiming towards a scientific framework to talk about neuromarketing.
How, then, can we create stories that use data to communicate insights? Below, we explain three simple sequences for telling a story:
• Influencing people’s emotions by telling a story (drawing in their attention).
• Persuading them through benefits that cover specific needs (benefits/engagement).
• Moving on to concrete steps (call to action).
If you can successfully visualize this sequence, you understand the foundation of all narratives. What that means is that every story we try to tell has a beginning, a developed plot, and a resolution, all building up to the invaluable call to action. If you have a clear notion of how to include the “story” element in your reports, statements, and dashboards, you will successfully create stories that use your data to share insights.
Data storytelling
We all love good stories, and data is one of the best tools for telling them. Millions of pieces of data are generated every day. They could be converted into great stories, but instead they are left unused. It’s time to change all that. It’s time to start telling stories that draw their power from data.
So-called “data storytelling” is nothing more than placing a structured focus on the way we use data to communicate insights. It relies on three key elements: narrative, visualization, and data.
What do we get when we combine these elements?
Data + Narrative
Data can be insights; they are drawn from study and analysis. Their nature can propose the narrative context.
Visualization + Data
The story must motivate. It must have a plot, highs and lows, and an arc of emotional connection in order to draw in and entertain our audience.
Data + Visualization + Narration = Successfully using our data to tell a story, wield influence, and effect the desired change.