DVA Unit 3
Unit III
Syllabus
Data Visualization
In today’s world, a large amount of data is generated every day. Analyzing this data
for trends and patterns can be difficult when the data is in its raw format.
Data visualization addresses this problem.
Data visualization provides a good, organized pictorial representation of the data,
which makes it easier to understand, observe, and analyze. In this tutorial, we will
discuss how to visualize data using Python.
Data visualization is a field in data analysis that deals with visual representation of
data. It graphically plots data and is an effective way to communicate inferences from
data.
Using data visualization, we can get a visual summary of our data. With pictures, maps
and graphs, the human mind has an easier time processing and understanding any
given data.
Data visualization plays a significant role in the representation of both small and large
data sets, but it is especially useful when we have large data sets, in which it is
impossible to see all of our data, let alone process and understand it manually.
Data Visualization
Python provides various libraries for visualizing data. Each library comes
with different features and supports various types of graphs. In this
tutorial, we will be discussing four such libraries.
Matplotlib
Seaborn
Bokeh
Plotly
Matplotlib vs Seaborn
Matplotlib: It works mainly with arrays and data frames, and it treats figures and
axes as objects.
Seaborn: It works with entire datasets and treats the whole dataset as a single unit.
It is considerably more organized and functional than Matplotlib.
A Pie Chart is a circular statistical plot that can display only one series of
data.
The whole area of the chart represents 100% of the given data, and the area
of each slice represents that slice's share of the total.
Pie charts are commonly used in business presentations like sales,
operations, survey results, resources, etc. as they provide a quick
summary.
Pie Chart Advantages and Disadvantages
Pie Chart Advantages
A pie chart is very useful for representing data as parts of a whole. The main
advantages of the pie chart are:
Pie chart is easily understood and comprehended.
Visual representation of data in a pie chart is done as a fractional part of a whole.
Pie chart provides an effective mode of communication to all types of audiences.
Pie chart provides a better comparison of data for the audience.
Pie Chart Disadvantages
Pie charts also have some disadvantages:
With too much data, a pie chart becomes a less effective presentation.
For multiple data sets, we need a series of pie charts to compare them.
It can be difficult for readers to analyze and assimilate the data in a pie chart.
Uses of Pie Chart
Whenever a fraction or fractions are represented as a part of the whole, pie
charts are used. Pie charts are used to compare the data and to analyze
which data is bigger or smaller.
Hence, while dealing with discrete data, pie charts are preferred. Let’s take a
look at the uses of the pie chart:
Pie charts are used to compare the profit and loss in businesses.
In schools, the grades can be easily compared using a pie chart.
The relative sizes of data can be compared using a pie chart.
The marketing and sales data can be compared using a pie chart.
The Matplotlib API has a pie() function in its pyplot module, which creates a pie
chart representing the data in an array.
Syntax: matplotlib.pyplot.pie(data, explode=None, labels=None, colors=None,
autopct=None, shadow=False)
Parameters:
data represents the array of data values to be plotted. The fractional area of each
slice is data/sum(data). In current Matplotlib versions the values are normalized
by default, so the pie is always full. If normalize=False is passed and sum(data)<1,
the data values give the fractional areas directly, and the resulting pie has an
empty wedge of size 1-sum(data).
labels is a sequence of strings that sets the label of each wedge.
colors is a sequence of colors used for the wedges.
autopct is a format string (or a function) used to label each wedge with its
percentage value.
shadow, when set to True, draws a shadow beneath the pie.
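As a minimal sketch of the pie() call described above, the following example plots a small data set with labels and percentage annotations. The fruit names and numbers are hypothetical sample data, and the output file name is an arbitrary choice.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data: number of units sold per fruit
labels = ["Apples", "Bananas", "Cherries", "Dates"]
data = [35, 25, 25, 15]

fig, ax = plt.subplots()
# Each wedge covers data[i] / sum(data) of the circle;
# autopct="%1.1f%%" prints that share with one decimal place.
ax.pie(data, labels=labels, autopct="%1.1f%%")
ax.set_title("Share of units sold")

# Save to a file; in an interactive session plt.show() can be used instead
fig.savefig("pie_basic.png")
```

The same call is available as plt.pie(); the Axes method is used here only so that the figure can be saved explicitly.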
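Explode
To make one of the wedges stand out, use the explode parameter.
The explode parameter, if specified and not None, must be an array with
one value for each wedge. Each value sets how far that wedge is moved out
from the center, as a fraction of the radius.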
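A short sketch of the explode parameter, assuming the same hypothetical fruit data as a running example: only the first wedge is pulled out of the pie.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data
labels = ["Apples", "Bananas", "Cherries", "Dates"]
data = [35, 25, 25, 15]

# One offset per wedge, as a fraction of the radius; 0 leaves a wedge in place
explode = [0.2, 0, 0, 0]

fig, ax = plt.subplots()
ax.pie(data, labels=labels, explode=explode)

# Save to a file; in an interactive session plt.show() can be used instead
fig.savefig("pie_explode.png")
```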
Shadow
Add a shadow to the pie chart by setting the shadow parameter to True.
Colors
You can set the color of each wedge with the colors parameter.
The colors parameter, if specified, must be an array with one value for
each wedge
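The shadow and colors parameters can be combined in one call. The sketch below assumes hypothetical data and an arbitrary choice of named colors, one per wedge.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data
labels = ["Apples", "Bananas", "Cherries", "Dates"]
data = [35, 25, 25, 15]

# One color per wedge, given here as named Matplotlib colors
colors = ["tomato", "gold", "mediumseagreen", "cornflowerblue"]

fig, ax = plt.subplots()
ax.pie(data, labels=labels, colors=colors, shadow=True)

# Save to a file; in an interactive session plt.show() can be used instead
fig.savefig("pie_colors_shadow.png")
```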
Legend
To add a list of explanations for each wedge, use the legend() function.
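A minimal sketch of a pie chart with a legend, again with hypothetical data. The legend title and its placement to the right of the pie are illustrative choices, not requirements.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data
labels = ["Apples", "Bananas", "Cherries", "Dates"]
data = [35, 25, 25, 15]

fig, ax = plt.subplots()
ax.pie(data, labels=labels)

# legend() picks up the wedge labels that were passed to pie()
legend = ax.legend(title="Fruit", loc="center left", bbox_to_anchor=(1.0, 0.5))

# bbox_inches="tight" keeps the legend inside the saved image
fig.savefig("pie_legend.png", bbox_inches="tight")
```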
Scatter plots
Syntax: matplotlib.pyplot.scatter(x_axis_data, y_axis_data, s=None, c=None,
marker=None, cmap=None, alpha=None, linewidths=None, edgecolors=None)
Except for x_axis_data and y_axis_data, all other parameters are optional, with
their default values set to None.
Parameters:
x_axis_data: An array containing data for the x-axis.
y_axis_data: An array containing data for the y-axis.
s: Marker size, which can be a scalar or an array of size equal to the size of x or y.
c: Color or sequence of colors for the markers.
marker: Marker style.
cmap: Colormap name.
linewidths: Width of the marker border.
edgecolors: Marker border color.
alpha: Blending value, ranging between 0 (transparent) and 1 (opaque).
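The parameters listed above can be sketched in a single call. The data below is random and purely illustrative; the seed, the size scaling, and the "viridis" colormap are assumptions made for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical random data, seeded so the plot is reproducible
rng = np.random.default_rng(seed=0)
x = rng.random(50)
y = rng.random(50)
sizes = 300 * rng.random(50)   # one marker size per point
values = rng.random(50)        # numbers that cmap turns into colors

fig, ax = plt.subplots()
points = ax.scatter(
    x, y,
    s=sizes,             # marker size
    c=values,            # color values, mapped through cmap
    cmap="viridis",      # colormap name
    marker="o",          # marker style
    linewidths=1,        # width of the marker border
    edgecolors="black",  # color of the marker border
    alpha=0.6,           # 0 is transparent, 1 is opaque
)
fig.colorbar(points, ax=ax, label="value")

# Save to a file; in an interactive session plt.show() can be used instead
fig.savefig("scatter_parameters.png")
```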
Matplotlib.pyplot.scatter() in Python
There are various ways of creating plots with matplotlib.pyplot.scatter()
in Python, from a plain plot of one data set to plots that combine several
data sets with different markers and colors.
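One possible illustration is a scatter plot that draws two data sets on the same axes, each with its own marker and color. The numbers are hypothetical.

```python
import matplotlib.pyplot as plt

# Two hypothetical data sets measured at the same x positions
x = [1, 2, 3, 4, 5, 6]
y_first = [2, 4, 5, 4, 6, 7]
y_second = [1, 3, 2, 5, 4, 6]

fig, ax = plt.subplots()
# Each call to scatter() adds one collection of markers to the axes
ax.scatter(x, y_first, c="tab:blue", marker="o", s=60)
ax.scatter(x, y_second, c="tab:orange", marker="^", s=60)

# Save to a file; in an interactive session plt.show() can be used instead
fig.savefig("scatter_two_datasets.png")
```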
A legend is an area on the graph that describes each element that makes up
the graph. A graph may be straightforward on its own, but if we include a
title, labels for the X and Y axes, and a legend, it will be clearer.
When we look at these names, we can easily determine what the graph
represents and the type of data it shows.
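As a sketch of this idea, the example below adds a title, axis labels, and a legend to a scatter plot. The study-hours data and all of the text strings are hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical data: exam scores against hours studied for two classes
hours = [1, 2, 3, 4, 5, 6]
class_a = [52, 58, 65, 70, 78, 85]
class_b = [48, 55, 60, 72, 74, 80]

fig, ax = plt.subplots()
# The label given to each scatter() call becomes its legend entry
ax.scatter(hours, class_a, marker="o", label="Class A")
ax.scatter(hours, class_b, marker="^", label="Class B")

ax.set_title("Exam score vs. hours studied")
ax.set_xlabel("Hours studied")
ax.set_ylabel("Exam score")
legend = ax.legend(title="Group")

# Save to a file; in an interactive session plt.show() can be used instead
fig.savefig("scatter_labels_legend.png")
```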
Changing Figure Size
Plots are an effective way of visually representing data and summarizing it.
However, a plot that is not sized well can appear complicated. Matplotlib
provides several ways to represent data, and while making a plot we need to
choose a suitable size for it.
Change Plot Size in Matplotlib in Python
There are several ways to set the size of a plot in Matplotlib, such as the
figsize argument of plt.figure() and plt.subplots(), the set_figwidth() and
set_figheight() methods of a figure, and the figure.figsize entry of
plt.rcParams.
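Three common ways of setting the figure size can be sketched as follows. The sizes are arbitrary and given in inches, and the plotted data is hypothetical.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# 1. Pass figsize=(width, height) in inches when creating the figure
fig1, ax1 = plt.subplots(figsize=(8, 3))
ax1.plot(x, y)

# 2. Resize an existing figure with set_figwidth() and set_figheight()
fig2, ax2 = plt.subplots()
fig2.set_figwidth(4)
fig2.set_figheight(6)
ax2.plot(x, y)

# 3. Change the default size for every figure created afterwards
plt.rcParams["figure.figsize"] = (10, 4)
fig3, ax3 = plt.subplots()
ax3.plot(x, y)

for name, fig in [("fig1", fig1), ("fig2", fig2), ("fig3", fig3)]:
    print(name, fig.get_size_inches())
```

The rcParams setting stays in effect for the rest of the session, so it suits cases where every plot should share one size.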
Color plays a more important role than any other aspect of a visualization.
When used effectively, color adds more value to the plot.
A palette is a flat surface on which a painter arranges and mixes
paints.
Seaborn provides a function called color_palette(), which can be used to
give colors to plots and add more aesthetic value to them.
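A minimal sketch of color_palette(), assuming seaborn is installed: it requests five colors from the built-in "pastel" palette, inspects them, and draws them as swatches with palplot(). The palette name and the number of colors are arbitrary choices.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Ask for five colors from seaborn's built-in "pastel" palette
palette = sns.color_palette("pastel", 5)

# The result behaves like a list of (red, green, blue) tuples with values from 0 to 1
for color in palette:
    print(color)

# The same colors as hexadecimal strings
hex_codes = palette.as_hex()
print(hex_codes)

# palplot() draws the palette as a row of color swatches
sns.palplot(palette)
plt.savefig("palette_pastel.png")
```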
Types of Palettes for color_palette()
Qualitative
Sequential
Diverging
Qualitative
A qualitative palette is used when the variable is categorical in nature and the
color assigned to each group needs to be distinct. Each possible value of the
variable is assigned one color from the qualitative palette within a plot.
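As an illustration of a qualitative palette, the sketch below gives each category of a bar chart its own color. The "Set2" palette name is one possible choice, and the categories and counts are hypothetical.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical categorical data: number of pets by species
categories = ["Cat", "Dog", "Fish", "Bird"]
counts = [12, 18, 7, 5]

# One distinct color per category from the qualitative "Set2" palette
palette = sns.color_palette("Set2", n_colors=len(categories))

fig, ax = plt.subplots()
ax.bar(categories, counts, color=list(palette))
ax.set_ylabel("Count")

# Save to a file; in an interactive session plt.show() can be used instead
fig.savefig("palette_qualitative.png")
```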
Sequential
In a sequential palette, the color moves sequentially from lighter to darker.
When the variable to be colored is numeric or has inherently ordered values,
it can be depicted with a sequential palette.
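A sequential palette can be sketched with the "Blues" palette, one possible choice, applied to ordered levels so that higher levels are drawn darker. The levels and values are hypothetical.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical ordered data: average spend per customer loyalty level
levels = ["Level 1", "Level 2", "Level 3", "Level 4", "Level 5", "Level 6"]
spend = [10, 14, 19, 25, 32, 40]

# Six shades of blue, ordered from light to dark
palette = sns.color_palette("Blues", len(levels))

fig, ax = plt.subplots()
ax.bar(levels, spend, color=list(palette))
ax.set_ylabel("Average spend")

# Save to a file; in an interactive session plt.show() can be used instead
fig.savefig("palette_sequential.png")
```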
Diverging
When the data has mixed values, such as positive and negative (low and high)
values, a diverging palette is best suited for the visualization.
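A diverging palette can be sketched with the "coolwarm" palette, one possible choice, applied to values that lie on both sides of zero. The deviation values are hypothetical.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical data: deviation from a target, ordered from lowest to highest
deviations = [-3, -2, -1, 0, 1, 2, 3]
positions = list(range(len(deviations)))

# Colors run from blue (low) through a neutral middle to red (high)
palette = sns.color_palette("coolwarm", len(deviations))

fig, ax = plt.subplots()
ax.bar(positions, deviations, color=list(palette))
ax.axhline(0, color="black", linewidth=1)  # mark the midpoint of the data
ax.set_ylabel("Deviation from target")

# Save to a file; in an interactive session plt.show() can be used instead
fig.savefig("palette_diverging.png")
```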