R Visualization ADA

The document discusses exploratory data analysis and visualizing data in R. It provides examples of using the plot() function to create basic graphs and customize aspects like markers, lines, labels, and titles. It also discusses other related functions like abline() and par() as well as the grammar of graphics framework for building complex graphs from layers of data.

Uploaded by

HARSHITA RATHORE


Exploratory Data Analysis

Visualizing Data

s.patra@iimkashipur.ac.in

Indian Institute of Management Kashipur

plot() Function

x <- 1:10
y <- log(x)
plot(x,y)
[Figure: scatter plot of y against x]

plot() Options
The shape of the markers: By default, plot markers are small, empty circles. They are also known as plot characters and are set with pch. Change the marker shape by varying pch from 0 to 25 (0 is a square, 1 is a circle, 2 is a triangle, 3 is a plus sign, 4 is a cross, and so on).
Size of the plot markers: This is controlled by the cex parameter, a scaling factor relative to the default size. Set cex to 0.5 to make the markers 50% smaller and to 1.5 to make them 50% larger.
Color of the plot markers: The symbols can be assigned one or many colors with col. The available color names are listed by the colors() function.
Connecting the points with lines: It is often necessary to connect the displayed points with lines. This is done with the type argument of plot(). type = "p" draws only points and "l" only a line. "b" draws both the points and the line segments between them, while "o" overplots the points on a continuous line. "h" draws histogram-like vertical lines and "s" draws steps.
Varying the lines: The line type is specified by the lty parameter (0 to 6) and the line width by the lwd parameter.
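
As a rough sketch of how these options combine, the snippet below draws the same data once per type value in a 2 x 3 grid. It reuses x and y from the first plot() slide; the pdf(NULL) call is an assumption added only so the script also runs without a display, and can be dropped in an interactive session.

```r
# Same data as on the first plot() slide
x <- 1:10
y <- log(x)

# The six 'type' values described above
types <- c("p", "l", "b", "o", "h", "s")

pdf(NULL)                        # assumption: null device, nothing is shown or written
old_par <- par(mfrow = c(2, 3))  # 2 x 3 grid of panels; remember the old layout
for (t in types) {
  plot(x, y,
       type = t,                 # how the points are joined
       pch  = 17,                # filled triangle
       cex  = 1.2,               # markers 20% larger than the default
       col  = "steelblue",       # any name listed by colors()
       lty  = 2,                 # dashed line
       lwd  = 2,                 # double line width
       main = paste0("type = \"", t, "\""))
}
par(old_par)                     # restore the previous layout
invisible(dev.off())             # close the null device
```

In an interactive session, removing the pdf(NULL) and dev.off() lines shows the six panels side by side, which makes the difference between "b" and "o" easy to see.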

R Plot pch Symbols

ggpubr::show_point_shapes()

[Figure: Point shapes available in R, pch values 0 to 25]

plot() Options

plot(x,y,pch = c(0,18),cex = 1.5,col = c('red','blue'),type='o',lty = 3,lwd = 2)

[Figure: plot of y against x with alternating red square and blue diamond markers joined by a dotted line]

Adding Labels & Title
The main title is added using the main argument of plot(). Its font, color, and size can be customized using font.main, col.main and cex.main respectively.
The axis titles are provided using the xlab and ylab arguments. They can be customized using font.lab, col.lab and cex.lab in the same way.
You can also add extra text inside the plot using the text() function, specifying the text to display and the coordinates at which to display it.
text() can also be used to label the data points. The text, in this case, is a vector of labels instead of a single string.
A legend can be added to a graph using R's legend() function. legend() takes as input the position, the label text and the symbols or line styles to be explained.

labelset <- c('one','two','three','four','five','six','seven','eight','nine','ten')
plot(x, y, pch = c(0, 18), cex = 1.5, col = c('red', 'blue'), type = 'o', lty = 3, lwd = 2,
     main = "Graph of y = log(x) vs Graph of y = x-1", col.main = "purple",
     xlab = "X Values", ylab = "Y Values")
text(x + 1, y, labelset, col = 'red')             # label each point, shifted one unit to the right
lines(x, x - 1, col = 'green', lty = 4, lwd = 2)  # add the line y = x - 1
legend('bottomright', inset = 0.05, c("Log", "minus 1"),
       lty = c(3, 4), col = c("red", "green"))    # line types match the two lines drawn above
abline(h = c(4, 6), col = "orange", lty = 2)      # horizontal lines at y = 4 and y = 6 (above the y-range plotted here)

plot() Options

[Figure: "Graph of y = log(x) vs Graph of y = x−1"; x-axis "X Values", y-axis "Y Values"; points labelled one to ten; legend entries Log(x) and x−1]

Other Related Functions
abline(v = 10) draws a vertical line at x = 10.
abline(h = 10) draws a horizontal line at y = 10.
The par() function sets the plot margins through two arguments: mar for the margin and oma for the outer margin area.
Both arguments take four values giving the desired space at the bottom, left, top and right of the chart respectively, measured in lines of text. For instance, par(mar = c(4, 0, 0, 0)) leaves a margin of size 4 at the bottom of the chart only.
par(mfrow = c(2, 2)): creates a 2 x 2 plotting matrix that is filled row by row.

dev.off(): shuts down the specified graphics device (by default the current device).
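
The functions above can be sketched together as follows. This is only an illustration: the margin sizes, reference-line positions and the title text are arbitrary choices, and pdf(NULL) is an assumption used so that the script runs without a display.

```r
x <- 1:10
y <- log(x)

pdf(NULL)                                 # assumption: null device so no window is needed
old_par <- par(mar   = c(4, 4, 2, 1),     # inner margins: bottom, left, top, right (in lines)
               oma   = c(0, 0, 2, 0),     # outer margin: two lines at the top only
               mfrow = c(1, 2))           # one row, two panels

plot(x, y, main = "Horizontal reference")
abline(h = 1, col = "orange", lty = 2)    # horizontal line at y = 1

plot(x, y, main = "Vertical reference")
abline(v = 5, col = "orange", lty = 2)    # vertical line at x = 5

mtext("abline() examples", outer = TRUE)  # text written into the outer margin

current_mar <- par("mar")                 # query the margins currently in effect
par(old_par)                              # restore the previous settings
invisible(dev.off())                      # shut down the device
```

Saving the value returned by par() and restoring it afterwards keeps later plots from silently inheriting the changed margins and layout.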

The Grammar Of Graphics
Grammar: “the fundamental principles or rules of an art or science”
A good grammar will allow us to gain insight into the composition of complicated graphics, and
reveal unexpected connections between seemingly different graphics.
The most important modern work in graphical grammars is “The Grammar of Graphics” by
Wilkinson, Anand, and Grossman (2005). Wickham's layered grammar of graphics, on which ggplot2 is
based, proposes an alternative parameterization of the grammar, built around the idea of
constructing a graphic from multiple layers of data.
The basic idea: independently specify plot building blocks and combine them to create just
about any kind of graphical display you want. Building blocks of a graph include:

Components of Layered Grammar
The data that you want to visualise and a set of aesthetic mappings describing how variables in the data are mapped to aesthetic attributes that you can perceive.
Geometric objects, geoms for short, represent what you actually see on the plot: points, lines, polygons, etc.
Statistical transformations, stats for short, summarize data in many useful ways. For example, binning and counting observations to create a histogram, or summarising a 2d relationship with a linear model. Stats are optional, but very useful.
The scales map values in the data space to values in an aesthetic space, whether it be colour, or size, or shape. Scales draw a legend or axes, which provide an inverse mapping to make it possible to read the original data values from the graph.
A coordinate system, coord for short, describes how data coordinates are mapped to the plane of the graphic. It also provides axes and gridlines to make it possible to read the graph. We normally use a Cartesian coordinate system, but a number of others are available, including polar coordinates and map projections.
A faceting specification describes how to break up the data into subsets and how to display those subsets as small multiples. This is also known as conditioning or latticing/trellising.
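
To make the components concrete, here is a minimal sketch in ggplot2 with one line per building block. It assumes the mpg data set that ships with ggplot2 rather than the marketing data used on the later slides; the variable choices, palette and zoom limits are illustrative only.

```r
library(ggplot2)

# Assumption: the built-in mpg data set stands in for the deck's data
p <- ggplot(data = mpg,                               # data
            aes(x = displ, y = hwy, colour = drv)) +  # aesthetic mappings
  geom_point() +                                      # geom: what is drawn
  stat_smooth(method = "lm", formula = y ~ x,
              se = FALSE) +                           # stat: linear-model summary
  scale_colour_brewer(palette = "Set1") +             # scale: data values to colours
  coord_cartesian(ylim = c(10, 45)) +                 # coord: Cartesian, zoomed on y
  facet_wrap(~ year)                                  # facet: one panel per model year

# The geom and the stat are stored as two separate layers of the plot object
n_layers <- length(p$layers)
```

Calling print(p) renders the plot. Because each block is added with +, any one of them can be swapped or removed without touching the others.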

Data Layer
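library(ggplot2)
library(dplyr)      # %>%, filter(), group_by(), summarise(), mutate()
library(tidyr)      # drop_na()
library(lubridate)  # dmy(), year() used on a later slide
# read.delim() reads tab-separated text
df <- read.delim("datasets/marketing_campaign.csv") %>%
  drop_na()
ggplot(data = df)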

Aesthetic Layer (aes)
The aesthetic layer maps variables in our data onto scales in our graphical visualization, such as
the x and y coordinates.

ggplot(data = df, aes(x = Income, y = MntSweetProducts))

[Figure: empty plot area with Income on the x-axis and MntSweetProducts on the y-axis]

Geometries Layer (geom_)

ggplot(data = df, aes(x = Income, y = MntSweetProducts)) +
  geom_point()

[Figure: scatter plot of MntSweetProducts against Income]

Facets Layer

ggplot(data = df, aes(x = Income, y = MntSweetProducts)) +
  geom_point() +
  facet_wrap(~Education)

[Figure: scatter plots of MntSweetProducts against Income, one panel per Education level (2n Cycle, Basic, Graduation, Master, PhD)]

Statistics Layer
ggplot(data = df, aes(x = Income, y = MntSweetProducts)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE)

## `geom_smooth()` using formula = 'y ~ x'

[Figure: scatter plot of MntSweetProducts against Income with a fitted linear regression line]

Statistics Layer
ggplot(data = df, aes(x = Income, y = MntSweetProducts)) +
  geom_point() +
  facet_wrap(~Education) +
  stat_smooth(method = "lm", se = FALSE)

## `geom_smooth()` using formula = 'y ~ x'

[Figure: scatter plots of MntSweetProducts against Income with a fitted linear regression line, one panel per Education level (2n Cycle, Basic, Graduation, Master, PhD)]

Coordinates Layer
ggplot(data = df, aes(x = Income, y = MntSweetProducts)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE) +
  coord_cartesian(xlim = c(0, 115000), ylim = c(0, 200))

## `geom_smooth()` using formula = 'y ~ x'

[Figure: scatter plot of MntSweetProducts against Income with a regression line, zoomed to Income 0 to 115000 and MntSweetProducts 0 to 200]

Themes Layer
ggplot(data = df, aes(x = Income, y = MntSweetProducts)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE) +
  coord_cartesian(xlim = c(0, 115000), ylim = c(0, 200)) +
  theme_classic()

## `geom_smooth()` using formula = 'y ~ x'

[Figure: the zoomed scatter plot of MntSweetProducts against Income with a regression line, drawn with the classic theme]

More on Aesthetic Mappings
Adding colour to the chart
While you can do data manipulation in aes(), e.g. aes(log(Income),
log(MntSweetProducts)), it is best to do only simple calculations there.
# size = 2 is written inside aes() here, so it is mapped and gets its own legend
ggplot(data = df,
       aes(x = log(Income), y = MntSweetProducts, col = factor(Teenhome), size = 2)) +
  geom_point()

[Figure: scatter plot of MntSweetProducts against log(Income), coloured by factor(Teenhome) (0, 1, 2), with a size legend showing the single value 2]

More on Aesthetic Mappings
Aesthetic mappings can be supplied in the initial ggplot() call, in individual layers, or in some
combination of both. All of these calls create the same plot specification:

## Specification 1
ggplot(data = df, aes(x = Income,
                      y = MntSweetProducts,
                      col = factor(Teenhome),
                      size = 2)) +
  geom_point()
## Specification 2
ggplot(data = df, aes(x = Income, y = MntSweetProducts)) +
  geom_point(aes(col = factor(Teenhome), size = 2))
## Specification 3
ggplot(data = df, aes(x = Income)) +
  geom_point(aes(y = MntSweetProducts, col = factor(Teenhome), size = 2))
## Specification 4
ggplot(data = df) +
  geom_point(aes(x = Income,
                 y = MntSweetProducts,
                 col = factor(Teenhome),
                 size = 2))

More on Aesthetic Mappings
## Specification 1
ggplot(data = df, aes(x = Income, y = MntSweetProducts, col = factor(Teenhome))) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  coord_cartesian(xlim = c(0, 115000), ylim = c(0, 200))

[Figure: scatter plot of MntSweetProducts against Income, coloured by factor(Teenhome) (0, 1, 2), with one regression line per Teenhome level]

More on Aesthetic Mappings
## Specification 2
ggplot(data = df, aes(x = Income, y = MntSweetProducts)) +
  geom_point(aes(col = factor(Teenhome))) +
  geom_smooth(method = "lm", se = FALSE) +
  coord_cartesian(xlim = c(0, 115000), ylim = c(0, 200))

[Figure: scatter plot of MntSweetProducts against Income, coloured by factor(Teenhome) (0, 1, 2), with a single regression line]

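The two specifications differ in where the colour mapping lives: aesthetics given in ggplot() are inherited by every layer, while aesthetics given inside a geom apply to that layer only. A small sketch of the consequence, assuming the built-in mpg data in place of the marketing data, counts the fitted lines in each case:

```r
library(ggplot2)

# Assumption: built-in mpg data instead of the marketing data
# Colour mapped in ggplot(): every layer inherits it
p_global <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Colour mapped in geom_point() only: geom_smooth() sees no grouping
p_local <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(colour = drv)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Count the fitted lines in the smoothing layer (layer 2) of each plot
lines_global <- length(unique(layer_data(p_global, 2)$group))
lines_local  <- length(unique(layer_data(p_local, 2)$group))
```

With the mapping in ggplot() the smoother is fitted once per drv level; with the mapping in geom_point() it is fitted once for all the data.
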
More on Aesthetic Mappings
Setting vs. mapping colours: Instead of mapping an aesthetic property to a variable, you can
set it to a single value by specifying it in the layer parameters. We map an aesthetic to a
variable inside aes() (e.g., aes(colour = factor(Teenhome))) or set it to a constant outside aes()
(e.g., colour = "red"). In the call below the constant is placed inside aes(), so ggplot2 treats
"red" as a variable with a single level: the points take a colour from the default palette and a
legend labelled "red" is drawn. Writing geom_point(colour = "red") sets the colour instead.
## Constant inside aes(): mapped, not set
ggplot(data = df, aes(x = Income, y = MntSweetProducts)) +
  geom_point(aes(col = "red")) +
  geom_smooth(method = "lm", se = FALSE) +
  coord_cartesian(xlim = c(0, 115000), ylim = c(0, 200))

[Figure: scatter plot of MntSweetProducts against Income with a regression line; legend titled "colour" with the single entry "red"]

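The difference between setting and mapping can be checked directly on the built plot. This is a minimal sketch that assumes the built-in mpg data; layer_data() returns the values ggplot2 actually uses for drawing a layer.

```r
library(ggplot2)

# Setting: colour is a layer parameter, so the points are drawn in red
p_set <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(colour = "red")

# Mapping: "red" inside aes() is treated as data with a single level
p_map <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(colour = "red"))

set_colours <- unique(layer_data(p_set)$colour)  # colour used when set
map_colours <- unique(layer_data(p_map)$colour)  # colour chosen by the default scale
```

Here set_colours is "red", whereas map_colours holds whatever the default discrete colour scale assigns to its first level.
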
More on Aesthetic Mappings
## Specification 2
ggplot(data = df, aes(x = Income, y = MntSweetProducts)) +
  geom_point() +
  geom_smooth(aes(colour = "loess"), method = "loess", se = TRUE) +
  geom_smooth(aes(colour = "lm"), method = "lm", se = TRUE) +
  coord_cartesian(xlim = c(0, 115000), ylim = c(0, 200)) +
  theme_classic()

[Figure: scatter plot of MntSweetProducts against Income with two smoothed curves; legend titled "colour" with entries lm and loess]

Visualizing Amounts: Bar Plot

df %>%
  ggplot(aes(x = Education, y = Income)) +
  geom_bar(stat = "identity")

[Figure: bar chart of Income by Education (2n Cycle, Basic, Graduation, Master, PhD)]

Visualizing Distribution: Histogram

df %>%
  filter(Income < 20000) %>%
  ggplot(aes(x = Income)) +
  geom_histogram(binwidth = 2000, fill = "red", color = "blue", alpha = 0.9)

[Figure: histogram of Income (0 to 20000); y-axis count]

Visualizing Distribution: Density Plot

df %>%
  filter(Income < 20000) %>%
  ggplot(aes(x = Income)) +
  geom_density(fill = "green", color = "#e9ecef", alpha = 0.8)

[Figure: density plot of Income (below 20000); y-axis density]

Visualizing Distribution: Box Plot

df %>%
  filter(Income < 20000) %>%
  ggplot(aes(x = Education, y = Income)) +
  geom_boxplot() +
  geom_jitter(color = "black", size = 0.4, alpha = 0.9)

[Figure: box plots of Income by Education (2n Cycle, Basic, Graduation, Master, PhD) with jittered points overlaid]

Visualizing Distribution: Cumulative Distribution

df %>%
  filter(Income < 20000) %>%
  ggplot(aes(x = Income, colour = factor(Teenhome))) +
  stat_ecdf()

[Figure: empirical cumulative distribution (ecdf) of Income, one curve per factor(Teenhome) level (0, 1, 2)]

Visualizing Time Series: Line Chart
df %>%
  group_by(Dt_Customer) %>%
  summarise(No_Customer = n()) %>%     # number of rows per Dt_Customer value
  mutate(Date = dmy(Dt_Customer)) %>%  # parse day-month-year strings (lubridate)
  filter(year(Date) == 2014) %>%
  ggplot(aes(x = Date, y = No_Customer)) +
  geom_line() +
  geom_point()

[Figure: line chart with points of No_Customer against Date for 2014 (axis marks Jan, Apr, Jul)]

More on Facets: facet_grid()

df %>%
  filter(Income < 20000) %>%
  ggplot(aes(x = Education, y = Income)) +
  geom_boxplot() +
  geom_jitter(color = "black", size = 0.4, alpha = 0.9) +
  facet_grid(factor(Teenhome) ~ factor(Kidhome))

[Figure: 3 x 3 grid of box plots of Income by Education; rows are factor(Teenhome) (0, 1, 2), columns are factor(Kidhome) (0, 1, 2)]

Position Adjustments: Stacked Bar Chart
Each geom also has a default position adjustment which specifies a set of “rules” as to how
different components should be positioned relative to each other.

df %>%
  ggplot(aes(x = Education, y = Income, fill = factor(Teenhome))) +
  geom_bar(stat = "identity")

[Figure: stacked bar chart of Income by Education, filled by factor(Teenhome) (0, 1, 2)]

Position Adjustments: Grouped Bar Chart

df %>%
  ggplot(aes(x = Education, y = Income, fill = factor(Teenhome))) +
  geom_bar(stat = "identity", position = "dodge")

[Figure: grouped (side-by-side) bar chart of Income by Education, filled by factor(Teenhome) (0, 1, 2)]

Position Adjustments: Percentage Chart

df %>%
  ggplot(aes(x = Education, y = Income, fill = factor(Teenhome))) +
  geom_bar(stat = "identity", position = "fill")

[Figure: percentage bar chart of Income by Education, each bar rescaled to 1 and filled by factor(Teenhome) (0, 1, 2)]

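The three position adjustments can be compared on one data set. The sketch below is an assumption-laden variant of the slides: it uses the built-in mpg data and lets geom_bar() count rows (its default stat) instead of plotting Income with stat = "identity".

```r
library(ggplot2)

# Assumption: count cars per class and drive type in the built-in mpg data
base <- ggplot(mpg, aes(x = class, fill = drv))

p_stack <- base + geom_bar(position = "stack")  # default: segments stacked
p_dodge <- base + geom_bar(position = "dodge")  # segments side by side
p_fill  <- base + geom_bar(position = "fill")   # stacked and rescaled to 1

# Height reached by the tallest bar under each adjustment
top_stack <- max(layer_data(p_stack)$ymax)
top_dodge <- max(layer_data(p_dodge)$ymax)
top_fill  <- max(layer_data(p_fill)$ymax)
```

Stacking reaches the total count per class, dodging only the largest single class and drive combination, and filling always reaches 1, which is why the y-axis ranges on the three slides differ.
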
More on Coordinates

[Figure: scatter plot of MntSweetProducts against Income (0 to 120000), coloured by factor(Teenhome) (0, 1, 2)]

Axis Transformation
Built-in functions for axis transformations are:

scale_x_log10(), scale_y_log10(): for log10 transformation
scale_x_sqrt(), scale_y_sqrt(): for sqrt transformation
scale_x_reverse(), scale_y_reverse(): to reverse coordinates
coord_trans(x = "log10", y = "log10"): possible values for x and y are "log2", "log10", "sqrt", ...
scale_x_continuous(trans = "log2"), scale_y_continuous(trans = "log2"): another allowed value for the argument trans is "log10"
coord_flip(): flips coordinates

A continuous scale handles numeric data (where there is a continuous set of numbers), whereas a
discrete scale (e.g. scale_x_discrete()) handles categorical data, such as the Education levels.
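
A few of these functions in use, as a sketch on the built-in mtcars data (an assumption; the slides use the marketing data). The last two lines show that a scale transformation is applied to the data itself before drawing.

```r
library(ggplot2)

# Assumption: built-in mtcars data (wt = weight, mpg = miles per gallon)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

p_log  <- p + scale_x_log10()    # x-axis on a log10 scale
p_sqrt <- p + scale_y_sqrt()     # y-axis on a square-root scale
p_rev  <- p + scale_x_reverse()  # x-axis reversed
p_flip <- p + coord_flip()       # x and y swapped

# Values the point layer is drawn with after the scale transformation
x_log  <- as.numeric(layer_data(p_log)$x)
y_sqrt <- as.numeric(layer_data(p_sqrt)$y)
```

Because the scale transforms the data first, any statistic added to p_log (for example a smoother) is computed on the log10 values, whereas a coordinate transformation such as coord_trans() is applied after the statistics.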

Labels & Annotations
Textual labels and annotations (on the plot, axes, geometry, and legend) are an important part
of making a plot understandable and communicating information.

ggplot(df) +
  aes(x = Income, y = MntSweetProducts) +
  geom_point(aes(col = factor(Teenhome)), size = 2) +
  # limits go inside the scale: a separate xlim() would replace this scale and drop its breaks and labels
  scale_x_continuous(limits = c(0, 115000), breaks = seq(0, 150000, 25000),
                     labels = function(x) x / 1000) +
  ylim(c(0, 200)) +
  labs(title = "Income vs Amount of Sweet Products Bought",
       subtitle = "Customer dataset",
       y = "Amount of sweet products",
       x = "Income (in thousand units)",
       color = "Teens at home",
       caption = "Customer Purchase Behaviour")

Labels & Annotations
[Figure: "Income vs Amount of Sweet Products Bought", subtitle "Customer dataset"; x-axis "Income (in thousand units)", y-axis "Amount of sweet products"; legend "Teens at home" (0, 1, 2); caption "Customer Purchase Behaviour"]

Dealing with Colours
ggplot2 lets you customize the colours of shapes through its fill and color arguments. It is
important to understand the difference between the two: color sets the colour of points, lines and
outlines, while fill sets the colour of the interior of shapes such as bars and boxes. Note that
color and colour always have the same effect.
Methods to call a colour

Name: R offers 657 color names. You can list all of them using colors().
rgb(red, green, blue, alpha): The rgb() function builds a color from a quantity of red, green and blue. An additional parameter (alpha) sets the transparency. By default all parameters range from 0 to 1.
Number: A color can also be called by its number. For instance, if you need color number 143, use colors()[143].
Hex code: All colors can be defined by their hex code. A hex code looks like this: #69b3a2. To find the hex code of your colour, use a colour picker.
Colour Libraries: RColorBrewer, paletteer etc.
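
The first four ways of calling a colour can be tried in base R without any packages. The variable names below are illustrative only.

```r
n_names   <- length(colors())                         # number of built-in colour names
by_name   <- "steelblue"                              # 1. by name
by_rgb    <- rgb(1, 0, 0)                             # 2. by rgb(), values from 0 to 1
by_rgb255 <- rgb(105, 179, 162, maxColorValue = 255)  #    same function on a 0 to 255 scale
by_alpha  <- rgb(0, 0, 1, alpha = 0.5)                #    half-transparent blue
by_number <- colors()[143]                            # 3. by position in colors()
by_hex    <- "#69b3a2"                                # 4. by hex code

# col2rgb() converts any of these to red/green/blue components
components <- col2rgb(c(by_name, by_rgb, by_number, by_hex))
```

rgb() returns a hex string, so all four methods end up as values that can be passed to col in base graphics or to colour and fill in ggplot2.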

Dealing with Colours: RColorBrewer Package
There are 3 types of palettes: sequential palettes, diverging palettes and qualitative palettes.

[Figure: the RColorBrewer palettes, listed by name]
Sequential palettes: YlOrRd, YlOrBr, YlGnBu, YlGn, Reds, RdPu, Purples, PuRd, PuBuGn, PuBu, OrRd, Oranges, Greys, Greens, GnBu, BuPu, BuGn, Blues
Qualitative palettes: Set3, Set2, Set1, Pastel2, Pastel1, Paired, Dark2, Accent
Diverging palettes: Spectral, RdYlGn, RdYlBu, RdGy, RdBu, PuOr, PRGn, PiYG, BrBG

Dealing with Colours: RColorBrewer Package

library(RColorBrewer)
ggplot(df) +
  aes(x = Income, y = MntSweetProducts) +
  geom_point(aes(col = factor(Teenhome)), size = 2) +
  scale_colour_brewer(palette = "Set1") +
  # limits go inside the scale: a separate xlim() would replace this scale and drop its breaks and labels
  scale_x_continuous(limits = c(0, 115000), breaks = seq(0, 150000, 25000),
                     labels = function(x) x / 1000) +
  ylim(c(0, 200)) +
  labs(title = "Income vs Amount of Sweet Products Bought",
       subtitle = "Customer dataset",
       y = "Amount of sweet products",
       x = "Income (in thousand units)",
       color = "Teens at home",
       caption = "Customer Purchase Behaviour") +
  theme_classic()

Dealing with Colours: RColorBrewer Package
[Figure: "Income vs Amount of Sweet Products Bought", subtitle "Customer dataset", points coloured with the Set1 palette; x-axis "Income (in thousand units)", y-axis "Amount of sweet products"; legend "Teens at home" (0, 1, 2); caption "Customer Purchase Behaviour"]

Draw a Vertical Line

ggplot(df) +
  aes(x = Income, y = MntSweetProducts) +
  geom_point(aes(col = factor(Teenhome)), size = 2) +
  scale_colour_brewer(palette = "Set1") +
  # limits go inside the scale: a separate xlim() would replace this scale and drop its breaks and labels
  scale_x_continuous(limits = c(0, 115000), breaks = seq(0, 150000, 25000),
                     labels = function(x) x / 1000) +
  ylim(c(0, 200)) +
  geom_vline(xintercept = c(35000, 88000),  # geom_hline() for horizontal lines
             linetype = "dotted",
             color = "green",
             linewidth = 1.5) +             # linewidth replaces size for lines in ggplot2 3.4.0 and later
  labs(title = "Income vs Amount of Sweet Products Bought",
       subtitle = "Customer dataset",
       y = "Amount of sweet products",
       x = "Income (in thousand units)",
       color = "Teens at home",
       caption = "Customer Purchase Behaviour") +
  theme_classic()

Draw a Vertical Line
[Figure: "Income vs Amount of Sweet Products Bought", subtitle "Customer dataset", with dotted green vertical lines at Income 35000 and 88000; x-axis "Income (in thousand units)", y-axis "Amount of sweet products"; legend "Teens at home" (0, 1, 2); caption "Customer Purchase Behaviour"]

Add a text annotation at a particular coordinate

ggplot(df) +
  aes(x = Income, y = MntSweetProducts) +
  geom_point(aes(col = factor(Teenhome)), size = 2) +
  scale_colour_brewer(palette = "Set1") +
  # limits go inside the scale: a separate xlim() would replace this scale and drop its breaks and labels
  scale_x_continuous(limits = c(0, 115000), breaks = seq(0, 150000, 25000),
                     labels = function(x) x / 1000) +
  ylim(c(0, 200)) +
  geom_vline(xintercept = c(35000, 88000),  # geom_hline() for horizontal lines
             linetype = "dotted",
             color = "green",
             linewidth = 1.5) +             # linewidth replaces size for lines in ggplot2 3.4.0 and later
  # annotate() draws the label once; geom_text() with constants would repeat it for every row of df
  annotate("text", x = 5000, y = 175, label = "Scatter plot") +
  labs(title = "Income vs Amount of Sweet Products Bought",
       subtitle = "Customer dataset",
       y = "Amount of sweet products",
       x = "Income (in thousand units)",
       color = "Teens at home",
       caption = "Customer Purchase Behaviour") +
  theme_classic()

Add a text annotation at a particular coordinate
[Figure: "Income vs Amount of Sweet Products Bought", subtitle "Customer dataset", with the text "Scatter plot" drawn inside the panel; x-axis "Income (in thousand units)", y-axis "Amount of sweet products"; legend "Teens at home" (0, 1, 2); caption "Customer Purchase Behaviour"]

Themes Options
There are three types of elements within the Themes Layer: text, line, and rectangle, created with
element_text(), element_line() and element_rect(). Together these three elements control all the
non-data ink in the graph.

Figure 1: Theme Elements

For more details, visit here.
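
As a small sketch of the three element types, the snippet below customizes one text, one line and one rectangle element on top of theme_classic(). It assumes the built-in mpg data; the title text and colour values are arbitrary.

```r
library(ggplot2)

# Assumption: built-in mpg data; styling values are arbitrary
p <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +
  geom_point() +
  labs(title = "Engine size vs highway mileage") +
  theme_classic() +                                             # start from a complete theme
  theme(
    plot.title       = element_text(face = "bold", size = 14),  # text element
    axis.line        = element_line(colour = "grey40"),         # line element
    panel.background = element_rect(fill = "grey95"),           # rectangle element
    legend.position  = "bottom"                                 # non-element theme setting
  )

legend_pos <- p$theme$legend.position  # the setting is stored in the plot object
```

Settings passed to theme() override only the named elements, so everything else keeps the look of theme_classic().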

