R-Unit 4
Histogram
• A histogram represents the frequencies of the values of a variable, bucketed into ranges.
• A histogram is similar to a bar chart, but it groups the values into continuous ranges.
• The height of each bar represents the number of values present in that range.
Syntax:
hist(v,main,xlab,xlim,ylim,breaks,col,border) #ylab is optional
Temperature <- airquality$Temp
hist(Temperature)
# Create data for the graph.
v <- c(9,13,21,8,36,22,12,41,31,33,19)
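A minimal sketch of how the vector v could be passed to hist(). The title, axis label and colours are illustrative choices, not values taken from the original slide.

```r
# Data for the graph (same vector as above).
v <- c(9,13,21,8,36,22,12,41,31,33,19)

# Draw the histogram; main, xlab, col and border are illustrative choices.
h <- hist(v, main = "Histogram of v", xlab = "Value", col = "yellow", border = "blue")

# hist() invisibly returns the bin boundaries and the counts it plotted.
h$breaks
h$counts
```

The counts always add up to the number of values in v, because every value falls into exactly one bin.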
Histogram with non-uniform width
hist(Temperature,
main="Maximum daily temperature",
xlab="Temperature in degrees Fahrenheit",
xlim=c(50,100),
col="chocolate",
border="brown",
breaks=c(55,60,70,75,80,100)
)
Barplot
• A barplot is a tool to visualize the distribution of a qualitative variable, with one bar per
category. We draw a barplot of the qualitative variable size.
Bar Chart Labels, Title and Colors
# Create the data for the chart
R <- c(7,12,28,3,41)
M <- c("Mar","Apr","May","Jun","Jul")
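A possible barplot() call for the two vectors above. The labels, title and colours are assumptions made for illustration, and bar_mid is a hypothetical variable name.

```r
# Data for the chart (same vectors as above).
R <- c(7,12,28,3,41)
M <- c("Mar","Apr","May","Jun","Jul")

# names.arg labels the bars, xlab/ylab label the axes, main sets the title,
# col fills the bars and border colours their outlines.
bar_mid <- barplot(R, names.arg = M, xlab = "Month", ylab = "Revenue",
                   col = "blue", main = "Revenue chart", border = "red")

# barplot() returns the x positions of the bar midpoints.
bar_mid
```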
Group Bar Chart and Stacked Bar Chart
# Create the input vectors.
colors = c("green","orange","brown")
months <- c("Mar","Apr","May","Jun","Jul")
regions <- c("East","West","North")
# Create the matrix of the values.
Values <- matrix(c(2,9,3,11,9,4,8,7,3,12,5,2,8,10,11), nrow = 3, ncol = 5, byrow = TRUE)
barplot(Values, main = "total revenue", names.arg = months, xlab = "month", ylab = "revenue", col = colors, beside=TRUE)
# Add the legend to the chart
legend("topleft", regions, cex = 1.3, fill = colors)
Try the following arguments in barplot()
width=c( )
space=n
axes=FALSE
legend.text=c()
horiz=TRUE
         Team A  Team B  Team C  Team D  Team E
Round 1      34      56      12      89      67
Round 2      12      56      78      45      90
Round 3      14      23      45      25      89
Boxplots
• Boxplots show how the data in a data set is distributed. Three quartiles divide the data set into four
equal parts.
• This graph represents the minimum, maximum, median, first quartile and third quartile in the data set.
• It is also useful for comparing the distribution of data across data sets by drawing a boxplot for each of
them.
• Syntax
boxplot(x, data, notch, varwidth, names, main)
In this example, we use the data set "mtcars", available in the R environment, to create a basic
boxplot.
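A minimal sketch of a basic boxplot on mtcars, assuming the plot compares miles per gallon (mpg) across the number of cylinders (cyl). The labels and the variable name b are illustrative.

```r
# Compare miles per gallon across the cylinder groups of mtcars.
b <- boxplot(mpg ~ cyl, data = mtcars,
             xlab = "Number of Cylinders",
             ylab = "Miles Per Gallon",
             main = "Mileage Data")

# Each column of b$stats holds one group's lower whisker, lower hinge,
# median, upper hinge and upper whisker.
b$stats
```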
Boxplot with Notch
boxplot(mpg ~ cyl, data = mtcars,
xlab = "Number of Cylinders",
ylab = "Miles Per Gallon",
main = "Mileage Data",
notch = TRUE,
varwidth = TRUE,
col = c("green", "red", "blue"),
names = c("High", "Medium", "Low")
)
Use the mtcars dataset available in R by default and produce the following plot.
PlantGrowth is a built-in R dataset. It has 30 rows and two columns. The “weight” column represents the dry
biomass of each plant in grams, while the “group” column describes the experimental treatment that each
plant was given.
The structure of the dataset, as printed by str(PlantGrowth):
'data.frame': 30 obs. of 2 variables:
$ weight: num 4.17 5.58 5.18 6.11 4.5 4.61 5.17 4.53 5.33 5.14 ...
$ group : Factor w/ 3 levels "ctrl","trt1",..: 1 1 1 1 1 1 1 1 1 1 ...
boxplot(weight ~ group,
data = PlantGrowth,
main = "PlantGrowth data",
xlab = "Dried Biomass Weight", # with horizontal = TRUE the x-axis shows the weights
ylab = "Treatment Group",
col = "red",
boxlty = 0,
border="green",
whisklty = 3,
whisklwd = 1.5,
whiskcol="purple",
staplelwd = 1.5,
staplecol="pink",
horizontal = TRUE)
Contingency Table
The following table of values shows a sample of 2300 music listeners classified by age,
education and whether they listen to classical music.
Mosaic Plots
• Mosaic plots give a graphical representation of the successive decompositions of a contingency table.
• Counts are represented by rectangles.
• At each stage of plot creation, the rectangles are split parallel to one of the two axes.
[Figure: mosaic plot panels "Old Versus Young", "Education Level" and "Music Listening"]
The following table gives the number of participants in an event.
Plot this data using a mosaic plot.
participationdata = c(425,1667,45,64)
dim(participationdata) = c(2, 2)
dimnames(participationdata) = list(Sex = c("Female", "Male"), Age = c("Adult", "child"))
participationdetails <- as.table(participationdata)
print(participationdetails)
# The formula ~ Age + Sex changes the plot from Sex vs. Age to Age vs. Sex
mosaicplot(~ Age + Sex, data = participationdetails, col = "lightyellow", main = "Participation Details")
Plot the following Vaccination data using Mosaic plot
, , Sex = Female
        Age
Survived Adult child
     No    109    17
     Yes   316    28
, , Sex = Male
        Age
Survived Adult child
     No   1329    35
     Yes   338    29
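One way to enter the vaccination counts as a three-way table and plot them. This is a sketch: the object name vaccination, the colour and the title are illustrative, and the label "Female" for the first slice is an assumption, because only the "Sex = Male" header is shown in the table.

```r
# Counts in column-major order: Survived varies fastest, then Age, then Sex.
vaccination <- array(c(109, 316, 17, 28,     # first slice (assumed Sex = Female)
                       1329, 338, 35, 29),   # Sex = Male
                     dim = c(2, 2, 2),
                     dimnames = list(Survived = c("No", "Yes"),
                                     Age = c("Adult", "child"),
                                     Sex = c("Female", "Male")))
vaccination <- as.table(vaccination)
print(vaccination)

# Each rectangle's area is proportional to the count in that cell.
mosaicplot(vaccination, col = "steelblue1", main = "Vaccination Data")
```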
music = c(210, 194, 190, 406,
170, 110, 730, 290)
dim(music) = c(2, 2, 2)
dimnames(music) =list(Age = c("Old", "Young"),
Listen = c("Yes", "No"),
Education = c("High", "Low"))
print(music)
mosaicplot(music, col = "steelblue1", main = "Classical Music Listening")
, , Education = High
       Listen
Age     Yes  No
  Old   210 190
  Young 194 406
, , Education = Low
       Listen
Age     Yes  No
  Old   170 730
  Young 110 290
The Titanic dataset is available in the base installation of R. It describes the number of passengers who survived or died,
cross-classified by their class (1st, 2nd, 3rd, Crew), sex (Male, Female), and age (Child, Adult). Create a mosaic plot to
represent the Titanic data.
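A possible starting point for this exercise. Titanic is already stored as a four-way table, so it can be passed to mosaicplot() directly; the title is an illustrative choice.

```r
# Titanic is a 4-dimensional table: Class x Sex x Age x Survived.
dimnames(Titanic)

# color = TRUE shades the rectangles so the last split (Survived) stands out.
mosaicplot(Titanic, main = "Survival on the Titanic", color = TRUE)
```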
Produce a histogram for the gear field of the mtcars dataset.
hist(mtcars$gear)
Binomial Distribution
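The binomial distribution is a discrete probability distribution for the number of successes in a
fixed number of trials, where each trial has only two outcomes, i.e. success or failure.
All its trials are independent and the probability of success remains the same, so the
previous outcome does not affect the next outcome.
Formula:
$P(X = x) = \binom{n}{x} p^{x} (1-p)^{n-x}$, where n is the number of trials, p is the probability of
success on each trial and x is the number of successes.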
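The binomial probability of x successes in n trials is choose(n, x) * p^x * (1 - p)^(n - x). The sketch below evaluates this by hand and compares it with dbinom(); the values n = 25, p = 0.5 and x = 10 are illustrative.

```r
# Illustrative values: 25 trials, success probability 0.5, 10 successes.
n <- 25
p <- 0.5
x <- 10

# Binomial formula written out by hand.
manual <- choose(n, x) * p^x * (1 - p)^(n - x)

# dbinom() evaluates the same probability.
builtin <- dbinom(x, size = n, prob = p)

all.equal(manual, builtin)
```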
Functions for Binomial Distribution
There are four functions for handling the binomial distribution:
Probability Mass Function: dbinom(x, size, prob)
Cumulative Distribution Function: pbinom(q, size, prob, lower.tail = TRUE)
Quantile Function: qbinom(p, size, prob)
Generating random numbers: rbinom(n, size, prob)
where,
x, q is a vector of numbers (numbers of successes).
p is a vector of probabilities.
n is number of observations.
size is the number of trials.
prob is the probability of success of each trial.
lower.tail, if TRUE (the default), gives P(X <= q); if FALSE, it gives P(X > q).
dbinom(): This function gives the probability of each exact number of successes, P(X = x).
pbinom(): This function gives the cumulative probability of an event, P(X <= q). It is a single value
representing the probability.
Examples
Find the probability of getting 10 or fewer heads from 25 tosses of a coin.
pbinom(10,25,0.5) #pbinom(q, size, prob)
0.2121781
rbinom(): This function generates the required number of random values from a binomial distribution
with the given number of trials and probability of success.
Example:
#to draw n random observations from a binomial distribution
rbinom(10,25,0.5)
#sample output, the values differ on every run
8 14 10 12 10 14 16 7 13 12
#Binomial distribution for tossing a coin
cat("probability of getting exactly 19 heads from 25 tosses", dbinom(19, 25, 0.5), "\n")
cat("probability of getting 10 or fewer heads from 25 tosses", pbinom(10, 25, 0.5), "\n") #P(X <= x)
#P(X > x)
cat("probability of more than 10 heads from 25 tosses", pbinom(10, 25, 0.5, lower.tail=FALSE), "\n")
cat("binomial quantile for the probability 0.25", qbinom(0.25, 25, 0.5), "\n")
cat("binomial quantile for the upper-tail probability 0.25", qbinom(0.25, 25, 0.5, lower.tail=FALSE), "\n")
#to draw n random observations from a binomial distribution
cat("10 random observations", rbinom(10, 25, 0.5), "\n")
2. Subha flips a fair coin 20 times. What is the probability that the coin lands on heads exactly 7
times?
#find the probability of 7 successes during 20 trials where the probability of
#success on each trial is 0.5
dbinom(x=7, size=20, prob=.5)
# [1] 0.07392883
3. Raju flips a fair coin 5 times. What is the probability that the coin lands on heads more
than 2 times?
#find the probability of more than 2 successes during 5 trials where the
#probability of success on each trial is 0.5
pbinom(2, size=5, prob=.5, lower.tail=FALSE)
# [1] 0.5
4. Suppose a bowler scores a strike on 30% of his attempts when he bowls. If he bowls
10 times, what is the probability that he scores 4 or fewer strikes?
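A possible solution sketch for problem 4, treating each bowl as a trial with success probability 0.3. The variable name p_at_most_4 is illustrative.

```r
# P(X <= 4) for 10 trials with success probability 0.3.
p_at_most_4 <- pbinom(4, size = 10, prob = 0.3)
p_at_most_4   # roughly 0.85

# Same value obtained by adding the probabilities of 0, 1, 2, 3 and 4 strikes.
sum(dbinom(0:4, size = 10, prob = 0.3))
```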
Examples:
Find the 10th quantile of a binomial distribution with 10 trials and probability of success
on each trial = 0.4
Find the 40th quantile of a binomial distribution with 30 trials and probability of success
on each trial = 0.25
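Both questions can be approached with qbinom(), reading the "10th quantile" and "40th quantile" as the probabilities 0.10 and 0.40 (an assumption about the intended meaning). The names q10 and q40 are illustrative.

```r
# Smallest number of successes k with P(X <= k) >= 0.10, for 10 trials and p = 0.4.
q10 <- qbinom(0.10, size = 10, prob = 0.4)
q10   # 2

# Smallest number of successes k with P(X <= k) >= 0.40, for 30 trials and p = 0.25.
q40 <- qbinom(0.40, size = 30, prob = 0.25)
q40

# Check the defining property of the quantile for the second case.
pbinom(q40, size = 30, prob = 0.25) >= 0.40
pbinom(q40 - 1, size = 30, prob = 0.25) < 0.40
```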
Generate a vector that shows the number of successes of 10 binomial experiments with 100 trials
where the probability of success on each trial is 0.3.
results <- rbinom(10, size=100, prob=.3)
results
# [1] 31 29 28 30 35 30 27 39 30 28
Generate a vector that shows the number of successes of 1000 binomial experiments with 100 trials
where the probability of success on each trial is 0.3.
results <- rbinom(1000, size=100, prob=0.3)
Suppose there are twelve multiple choice questions in an English class quiz. Each question has
five possible answers, and only one of them is correct. Find the probability of having exactly 4
correct answers by random attempts as follows.
#Since only one out of five possible answers is correct, the probability of answering a question
#correctly by random is 1/5=0.2.
dbinom(4, size=12, prob=0.2)
[1] 0.1329
Find the probability of having four or less correct answers if a student attempts to answer every
question at random.
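A sketch for the second part of the quiz problem: "four or less" is the cumulative probability P(X <= 4), so pbinom() applies with the same size and prob as above. The variable name p_four_or_less is illustrative.

```r
# P(X <= 4) for 12 questions answered at random with success probability 0.2.
p_four_or_less <- pbinom(4, size = 12, prob = 0.2)
p_four_or_less   # roughly 0.927

# Equivalent: add the probabilities of 0, 1, 2, 3 and 4 correct answers.
sum(dbinom(0:4, size = 12, prob = 0.2))
```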
A hospital database shows that 65% of the patients suffering from cancer recover from it.
What is the probability that, of 5 randomly chosen patients, 3 will recover?
A bowler takes a wicket on 20% of his attempts when he bowls. If he bowls 5 times, what
is the probability that he takes 4 or fewer wickets?
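A sketch of how these two problems map onto dbinom() and pbinom(). It assumes the first question asks for exactly 3 recoveries out of 5; the variable names are illustrative.

```r
# Hospital: P(X = 3) for 5 patients with recovery probability 0.65.
p_three_recover <- dbinom(3, size = 5, prob = 0.65)
p_three_recover

# Bowler: P(X <= 4) for 5 attempts with success probability 0.2.
# This is everything except taking a wicket on all 5 attempts.
p_four_or_fewer <- pbinom(4, size = 5, prob = 0.2)
p_four_or_fewer
```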
Suppose you have a large population of students that’s 50% female. If students are
assigned to classrooms at random, and you visit 100 classrooms with 20 students each,
then how many girls might you expect to see in each classroom?
rbinom(100,20,0.5)
Bernoulli Distribution
The Bernoulli distribution is a special case of the binomial distribution in which only a single trial is
performed (n=1). It is a discrete probability distribution for a Bernoulli trial (a trial that has only
two outcomes, i.e. either success or failure).
The base installation of R does not provide any functions named for the Bernoulli distribution. For that
reason, we need to install and load the Rlab add-on package first.
dbern()
The dbern() function in R gives the density (probability mass) function of the Bernoulli distribution.
Parameter:
x: vector of quantiles
prob: probability of success on each trial
log: logical; if TRUE, probabilities p are given as log(p)
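Because a Bernoulli trial is a binomial experiment with a single trial, the same probabilities can also be obtained in base R with the binomial functions and size = 1. A minimal sketch, with an illustrative success probability of 0.7:

```r
p <- 0.7   # illustrative probability of success

# Density: P(X = 0) and P(X = 1) for a single trial.
dbinom(c(0, 1), size = 1, prob = p)

# Cumulative probability P(X <= 0).
pbinom(0, size = 1, prob = p)

# Ten simulated Bernoulli trials; each value is 0 (failure) or 1 (success).
trials <- rbinom(10, size = 1, prob = p)
trials
```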
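Geometric Distribution
The dgeom function finds the probability of experiencing a certain number of failures
before the first success in a series of Bernoulli trials, using the following syntax:
dgeom(x, prob)
where:
x: number of failures before first success
prob: probability of success on a given trial
Example: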
A researcher is waiting outside of a library to ask people if they support a certain law. The
probability that a given person supports the law is p = 0.2. What is the probability that the
fourth person the researcher talks to is the first person to support the law?
Solution:
dgeom(x=3, prob=.2)
#0.1024
The probability that the researcher experiences 3 “failures” before the first success is 0.1024.
The pgeom function finds the probability of experiencing a certain number of failures or
fewer before experiencing the first success in a series of Bernoulli trials, using the following
syntax:
pgeom(q, prob)
where:
q: number of failures before first success
prob: probability of success on a given trial
Example 1:
A researcher is waiting outside of a library to ask people if they support a certain law.
The probability that a given person supports the law is p = 0.2. What is the probability
that the researcher will experience 3 or fewer failures before finding someone who supports
the law?
pgeom(q=3, prob=.2)
#0.5904
Example 2:
A researcher is waiting outside of a library to ask people if they support a certain law.
The probability that a given person supports the law is p = 0.2. What is the probability
that the researcher will experience more than 5 failures before finding someone who
supports the law?
1 - pgeom(q=5, prob=.2)
#or, equivalently
pgeom(q=5, prob=.2, lower.tail=FALSE)
#0.262144
qgeom
The qgeom function finds the number of failures that corresponds to a certain
percentile, using the following syntax:
qgeom(p, prob)
where:
p: percentile
prob: probability of success on a given trial
Example:
A researcher is waiting outside of a library to ask people if they support a certain law. The
probability that a given person supports the law is p = 0.2. We will consider a “failure” to
mean that a person does not support the law. How many “failures” would the researcher
need to experience to be at the 90th percentile for number of failures before the first
success?
qgeom(p=.90, prob=0.2)
#10
rgeom
The rgeom function generates a vector of random values that represent the number of
failures before the first success, using the following syntax:
rgeom(n, prob)
where:
n: number of values to generate
prob: probability of success on a given trial
Example:
A researcher is waiting outside of a library to ask people if they support a certain law. The
probability that a given person supports the law is p = 0.2. We will consider a “failure” to
mean that a person does not support the law. Simulate 10 scenarios for how many
“failures” the researcher will experience until she finds someone who supports the law.
rgeom(n=10, prob=.2)
# 1 2 1 10 7 4 1 7 4 1 (one possible result; the values differ on every run)
During the first simulation, the researcher experienced 1 failure before finding someone
who supported the law.
During the second simulation, the researcher experienced 2 failures before finding
someone who supported the law.
During the third simulation, the researcher experienced 1 failure before finding someone
who supported the law.
During the fourth simulation, the researcher experienced 10 failures before finding
someone who supported the law.
A sports marketer randomly selects persons on the street until he encounters someone
who attended a game last season. What is the probability the marketer encounters 3
people who did not attend a game before the first success when p = 0.20 of the
population attended a game?
Simulate the above scenario for 10 times to find how many “failures” the marketer will
experience until he finds someone who attended a game last season.
What is the probability that the marketer experiences 5 or fewer failures before finding
someone who attended a game, when the population probability is p = 0.20?
What is the probability that the marketer experiences more than 5 failures before finding
someone who attended a game, when the population probability is p = 0.20?
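A sketch of how the sports marketer questions map onto the geometric functions, counting a "failure" as a person who did not attend a game. The variable names are illustrative.

```r
p <- 0.20   # probability that a randomly selected person attended a game

# Exactly 3 failures before the first success: 0.8^3 * 0.2.
dgeom(3, prob = p)   # 0.1024

# Ten simulated scenarios: number of failures before the first success.
sims <- rgeom(10, prob = p)
sims

# 5 or fewer failures before the first success.
at_most_5 <- pgeom(5, prob = p)

# More than 5 failures before the first success.
more_than_5 <- pgeom(5, prob = p, lower.tail = FALSE)

c(at_most_5 = at_most_5, more_than_5 = more_than_5)
```

The last two probabilities are complements, so they add up to 1.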
Consider a production line having a 35% defective rate. Let X denote the number of
non-defective products before the first defective product.
(a) Find the probability that there will be 3 non-defective products before the first
defective.
(b) Find the probability that there will be at most 3 non-defective products before the first
defective.
(c) Find the probability that there will be at least 3 non-defective products before the first
defective.
(d) What is the probability of 3 to 5 (inclusive) non-defective products before the first
defective product?
(e) What is the value of c, if P(X≤c)≥0.60? (or) Find the 60th percentile of the given
geometric distribution.
(f) Simulate 100 Geometric distributed random variables with prob=0.35.
a) The probability that there will be 3 non-defective products before the first defective is
dgeom(3,0.35)
b) The probability that there will be at most 3 non-defective products before the first
defective [P(X≤3)] is pgeom(3,0.35) (or) sum(dgeom(0:3, 0.35))
c) The probability that there will be at least 3 non-defective products before the first defective
[P(X≥3)] is pgeom(2, 0.35,lower.tail=FALSE)
e) We need to find the value of c such that P(X≤c)≥0.60. That is, we need to find the 60th
percentile of the given Geometric distribution: qgeom(0.60, 0.35)
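Parts (d) and (f) are not worked out above. A possible sketch, using the same prob = 0.35 as the other solutions; the variable names are illustrative.

```r
p <- 0.35   # probability of a defective product, as used in the solutions above

# (d) P(3 <= X <= 5) = P(X <= 5) - P(X <= 2)
d_by_cdf <- pgeom(5, p) - pgeom(2, p)

# Equivalent: add the probabilities of exactly 3, 4 and 5 non-defective products.
d_by_sum <- sum(dgeom(3:5, p))

c(d_by_cdf = d_by_cdf, d_by_sum = d_by_sum)

# (f) Simulate 100 geometric random values.
sim <- rgeom(100, p)
head(sim)
```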
It is known that 20% of products on a production line are defective. Products are inspected until the first defective is
encountered. Let X = number of inspections to obtain the first defective.
What is the minimum number of inspections that would be necessary so that the probability of observing a
defective is more than 75%?
Choose k so that P(X ≤ k) ≥ .75.
qgeom(.75, .2) + 1 # qgeom counts the failures before the first defective, so add 1 for the inspection that finds it
Poisson Distribution
The Poisson distribution gives the probability of a given number of events occurring in a fixed
interval of time or space.
It applies when the events occur independently of each other at a constant mean rate λ.
The Poisson probability function with mean λ can be calculated with the R dpois function
for any value of x. The following block of code summarizes the arguments of the
function:
dpois(x, lambda)
Example:
It is known that a certain website makes 10 sales per hour. In a given hour, what is the
probability that the site makes exactly 8 sales?
dpois(x=8, lambda=10)
Output
0.112599
The probability of a variable X following a Poisson distribution taking values equal to or
lower than x can be calculated with the ppois function, whose arguments are described
below:
syntax
ppois(q, lambda, lower.tail = TRUE) # If TRUE, probabilities are P(X <= x), or P(X > x) otherwise
Example:
It is known that a certain website makes 10 sales per hour. In a given hour, what is the
probability that the site makes 8 sales or less?
ppois(8, lambda = 10)
# 0.3328197
It is known that a certain website makes 10 sales per hour. In a given hour, what is the
probability that the site makes more than 8 sales?
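"More than 8 sales" is the upper tail P(X > 8), which ppois() returns when lower.tail = FALSE. A minimal sketch; the variable name is illustrative.

```r
# P(X > 8) for a Poisson distribution with mean 10.
p_more_than_8 <- ppois(8, lambda = 10, lower.tail = FALSE)
p_more_than_8   # 0.6671803

# Equivalent: complement of P(X <= 8).
1 - ppois(8, lambda = 10)
```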
The qpois function
The R qpois function returns the Poisson quantiles for a set of
probabilities. It finds the number of successes that corresponds to a
certain percentile, based on an average rate of success.
qpois syntax
qpois(p, lambda, lower.tail = TRUE)
Example:
It is known that a certain website makes 10 sales per hour. How many sales would the
site need to make to be at the 90th percentile for sales in an hour?
qpois(p=.90, lambda=10)
#14
The rpois function
To draw n observations from a Poisson distribution the rpois function can be used. The
following block of code summarizes the arguments of the function.
rpois syntax
rpois(n, lambda)
Example:
Generate a vector of 15 random values that follow a Poisson distribution with a rate
of success equal to 10.
rpois(n=15, lambda=10)
Data from the maternity ward in a certain hospital shows that there is a historical average
of 4.5 babies born in this hospital every day. What is the probability that 6 babies will be
born in this hospital tomorrow?
dpois(6, 4.5)
# Simulate the number of births on each of the 365 days of a year
rpois(365, 4.5)
Consider that the number of visits on a web page is known to follow a Poisson distribution
with mean 15 visits per hour.
What is the probability of getting
a) 10 or fewer visits per hour, P(X ≤ 10)?
b) more than 20 visits per hour, P(X > 20)?
c) fewer than 15 visits per hour, P(X < 15)?
d) more than 10 and at most 20 visits per hour, P(10 < X ≤ 20)?
Solution to d):
ppois(20, lambda = 15) - ppois(10, lambda = 15)
sum(dpois(11:20, lambda = 15)) # Equivalent
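Parts a) to c) follow the same pattern with ppois(). A possible sketch; the variable names are illustrative.

```r
lambda <- 15   # mean visits per hour

# a) P(X <= 10)
p_a <- ppois(10, lambda = lambda)

# b) P(X > 20)
p_b <- ppois(20, lambda = lambda, lower.tail = FALSE)

# c) P(X < 15) = P(X <= 14), because X only takes whole-number values
p_c <- ppois(14, lambda = lambda)

c(a = p_a, b = p_b, c = p_c)
```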
Problem
If there are twelve cars crossing a bridge per minute on average, find the probability of
having seventeen or more cars crossing the bridge in a particular minute.
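A possible solution sketch: "seventeen or more" is P(X ≥ 17) = P(X > 16), the upper tail beyond 16 of a Poisson distribution with mean 12. The variable name is illustrative.

```r
# P(X >= 17) = P(X > 16) for a Poisson distribution with mean 12.
p_17_or_more <- ppois(16, lambda = 12, lower.tail = FALSE)
p_17_or_more   # roughly 0.10

# Equivalent: complement of the probability of 16 or fewer cars.
1 - sum(dpois(0:16, lambda = 12))
```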
Exponential distribution
The exponential distribution is a continuous probability distribution used to model the time
or space between events in a Poisson process, where the events occur continuously and
independently at a constant rate λ. It is used to model the
time we must wait until a certain event occurs.
Function   Description
dexp       Exponential density (probability density function)
pexp       Exponential distribution (cumulative distribution function)
dexp syntax
dexp(x, # X-axis values (> 0)
rate = 1, # Vector of rates (lambdas)
log = FALSE) # If TRUE, probabilities are given as log
Example:
# To calculate the exponential density function of rate 2 for a grid of values in R you can type:
# Grid of values
x <- seq(from = 0, to = 8, by = 0.01)
dexp(x, rate = 2)
The pexp function
The R function that allows you to calculate the probabilities of a random variable X taking
values lower than or equal to x is the pexp function, which has the following syntax:
pexp syntax
pexp(q,
rate = 1,
lower.tail = TRUE, # If TRUE, probabilities are P(X <= x), or P(X > x) otherwise
log.p = FALSE)
Example:
The probability of the variable (of rate 1) taking a value lower or equal to 2 is 0.8646647:
pexp(2) # 0.8646647
Examples:
The time spent on a certain web page is known to have an exponential distribution
with an average of 5 minutes per visit. Consequently, as E(X) = 1/λ, 5 = 1/λ, so λ = 0.2.
ii)To calculate the probability of a visitor spending more than 10 minutes on the site you can
type:
pexp(10, rate = 0.2, lower.tail = FALSE) # 0.1353353 or 13.53%
iii) To calculate the probability of a visitor spending between 2 and 6 minutes is:
pexp(6, rate = 0.2) - pexp(2, rate = 0.2) # 0.3691258 or 36.91%
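The difference of two pexp values is the area under the exponential density between the two limits. The sketch below checks this for the 2 to 6 minute example by integrating dexp numerically; the variable names are illustrative.

```r
rate <- 0.2

# Probability from the cumulative distribution function.
by_cdf <- pexp(6, rate = rate) - pexp(2, rate = rate)

# Same probability as the area under the density between 2 and 6.
by_area <- integrate(dexp, lower = 2, upper = 6, rate = rate)$value

c(by_cdf = by_cdf, by_area = by_area)
```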
The qexp function
The qexp function allows you to calculate the corresponding quantile (percentile) for
any probability p:
qexp syntax
qexp(p,
rate = 1,
lower.tail = TRUE) # If TRUE, probabilities are P(X <= x), or P(X > x) otherwise
Example:
To calculate the quantile for the probability 0.8646647 (Q(0.86)) :
qexp(0.8646647) # 2
qexp(1 - 0.8646647, lower.tail = FALSE) # Equivalent
The rexp function
The rexp function allows you to draw n observations from an exponential distribution.
The syntax of the function is as follows:
rexp syntax
rexp(n, # Number of observations to be generated
rate = 1)
Example:
To draw ten observations from an exponential distribution of rate 1 :
rexp(10)
Solution
The checkout processing rate is equal to one divided by the mean checkout
completion time. Hence the processing rate is 1/3 checkouts per minute. We then
apply the function pexp of the exponential distribution with rate=1/3.
pexp(2, rate=1/3)
[1] 0.48658
The time (in hours) required to repair a machine is an exponentially distributed random
variable with parameter λ=1/2.
Let X denote the time (in hours) required to repair a machine. Given that X∼Exp(λ=1/2):
a) dexp(2.5,rate=0.5)
d) The probability that a repair takes between 2 and 4 hours can be written as P(2<X<4).
pexp(4,rate=0.5) - pexp(2,rate=0.5)
f) 1000 random numbers from the Exponential distribution with the given rate=0.5:
rexp(1000,0.5)
Normal distribution
The normal distribution is a continuous probability distribution that describes how data values are
distributed symmetrically around their mean.
It is the most important probability distribution in statistics because many real-world
quantities approximately follow it.
For example, the height of the population, shoe size, IQ level, and many more.
R has four built-in functions for the normal distribution. They are described below.
dnorm(x, mean, sd)
pnorm(q, mean, sd)
qnorm(p, mean, sd)
rnorm(n, mean, sd)
where,
x, q is a vector of numbers.
p is a vector of probabilities.
n is number of observations (sample size).
mean is the mean value of the distribution. Its default value is zero.
sd is the standard deviation. Its default value is 1.
dnorm
The function dnorm returns the value of the probability density function (pdf) of the normal
distribution given a certain value x, a population mean μ and population standard
deviation σ. The syntax for using dnorm is as follows:
dnorm(x, mean, sd)
Example:
#find the value of the normal distribution pdf at x=10 with mean=20 and sd=5
dnorm(x=10, mean=20, sd=5)
# [1] 0.01079819
pnorm
The function pnorm returns the value of the cumulative distribution function (cdf) of the
normal distribution given a certain value q, a population mean μ and
population standard deviation σ. The syntax for using pnorm is as follows:
pnorm(q, mean, sd)
Example:
Suppose the height of males at a certain school is normally distributed with a mean of
μ=70 inches and a standard deviation of σ = 2 inches. Approximately what percentage of
males at this school are taller than 74 inches?
#find percentage of males that are taller than 74 inches in a population with
#mean = 70 and sd = 2
pnorm(74, mean=70, sd=2, lower.tail=FALSE)
# [1] 0.02275013
Suppose the weight of a certain species of otters is normally distributed with a mean of μ=30 lbs
and a standard deviation of σ = 5 lbs. Approximately what percentage of this species of otters
weigh less than 22 lbs?
#find percentage of otters that weigh less than 22 lbs in a population with
#mean = 30 and sd = 5
pnorm(22, mean=30, sd=5)
# [1] 0.05479929
Suppose the height of plants in a certain region is normally distributed with a mean of μ=13 inches
and a standard deviation of σ = 2 inches. Approximately what percentage of plants in this region
are between 10 and 14 inches tall?
#find percentage of plants that are less than 14 inches tall, then subtract the
#percentage of plants that are less than 10 inches tall, based on a population
#with mean = 13 and sd = 2
pnorm(14, mean=13, sd=2) - pnorm(10, mean=13, sd=2)
# [1] 0.6246553
qnorm
The function qnorm returns the value of the inverse cumulative distribution function (cdf) of the normal
distribution given a certain probability p, a population mean μ and population standard deviation σ.
The syntax for using qnorm is as follows:
qnorm(p, mean, sd)
Example:
#find the Z-score of the 95th percentile of the standard normal distribution
qnorm(.95)
# [1] 1.644854
#find the Z-score of the 10th percentile of the standard normal distribution
qnorm(.10)
# [1] -1.281552
rnorm
The function rnorm generates a vector of normally distributed random variables given a vector length n, a
population mean μ and population standard deviation σ. The syntax for using rnorm is as follows:
rnorm(n, mean, sd)
Example:
#generate a vector of 5 normally distributed random variables with mean=10 and sd=2
five <- rnorm(5, mean = 10, sd = 2)
five
# [1] 10.658117 8.613495 10.561760 11.123492 10.802768
#generate a vector of 1000 normally distributed random variables with mean=50 and sd=15
narrowDistribution <- rnorm(1000, mean = 50, sd = 15)
#generate a vector of 1000 normally distributed random variables with mean=50 and sd=25
wideDistribution <- rnorm(1000, mean = 50, sd = 25)
Suppose that you have a machine that packages rice inside boxes. The process follows a
Normal distribution and it is known that the mean of the weight of each box is 1000
grams and the standard deviation is 10 grams.
Calculate the quantile for probability 0.5 for the above scenario.
qnorm(0.5,1000,10)
EX: When you roll a fair die, the outcomes are 1 to 6. These
outcomes are equally likely, and that is the basis of a uniform distribution.
dunif syntax
dunif(x, # X-axis values (grid of values)
min = 0, # Lower limit of the distribution (a)
max = 1) # Upper limit of the distribution (b)
Example:
Consider that you want to calculate the uniform probability density function in the
interval (1, 3) for a grid of values. For that purpose you can type:
x <- 0:4 # Grid
dunif(x, min = 1, max = 3)
Output
0.0 0.5 0.5 0.5 0.0
The punif function
The punif function calculates the uniform cumulative distribution function, that is, the
probability of a variable X taking a value lower than or equal to x. This function has the following
syntax:
punif syntax
punif(q, # Vector of quantiles
min = 0, # Lower limit of the distribution (a)
max = 1, # Upper limit of the distribution (b)
lower.tail = TRUE) # If TRUE, probabilities are P(X <= x), or P(X > x) otherwise
Example:
The probability of a uniform variable on the interval (0, 1) taking a value
equal to or lower than 0.6 is:
punif(0.6) # 0.6
The qunif function
The qunif function calculates the quantile for any probability (p) of a uniform distribution.
qunif syntax
qunif(p, # Vector of probabilities
min = 0, # Lower limit of the distribution (a)
max = 1, # Upper limit of the distribution (b)
lower.tail = TRUE) # If TRUE, probabilities are P(X <= x), or P(X > x) otherwise
Example:
To calculate the quantile for the probability 0.5 of a uniform distribution on the interval (0,
60):
qunif(0.5, min = 0, max = 60) # 30
The R runif function allows drawing n random observations from a uniform distribution.
runif syntax
runif(n, # Number of observations to be generated
min = 0, # Lower limit of the distribution (a)
max = 1) # Upper limit of the distribution (b)
Example:
To draw ten observations from a uniform distribution on the interval (-1, 1):
runif(n = 10, min = -1, max = 1)
(b) Find the probability that on a given day the amount of coffee dispensed by the machine
will be at most 8.8 liters.
(c) Find the probability that on a given day the amount of coffee dispensed by the machine
will be at least 8.5 liters.
(e) Simulate 1000 Uniform distributed random variables with a=7 and b=10.
(a) The value of the density function at x=7.6
dunif(7.6,min=7,max=10)
(b) The probability that on a given day the amount of coffee dispensed by the machine
will be at most 8.8 liters.
punif(8.8,7,10)
(c) The probability that on a given day the amount of coffee dispensed by the machine
will be at least 8.5 liters.
punif(8.5,7,10,lower.tail=FALSE)
(e) Simulate 1000 Uniform distributed random variables with a=7 and b=10.
runif(1000,7,10)
X is the time (in minutes) that a person has to wait in order to take a flight. If a flight
takes off every hour, X∼U(0,60).
Find the value of the density function at a wait of 20 minutes. (X is continuous, so the probability of
waiting exactly 20 minutes is 0.)
dunif(20,0,60)
Calculate the quantile for the probability 0.5 of a uniform distribution on the interval (0, 60)
qunif(0.5,0,60)
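For the waiting time X∼U(0,60), probabilities over a range of waits come from punif rather than dunif. A minimal sketch; the 20 and 40 minute thresholds and the variable names are illustrative choices, not part of the original problem.

```r
# P(X <= 20): the person waits at most 20 minutes.
p_at_most_20 <- punif(20, min = 0, max = 60)

# P(20 < X <= 40): the wait is between 20 and 40 minutes.
p_between <- punif(40, min = 0, max = 60) - punif(20, min = 0, max = 60)

# The density is constant, 1/60, everywhere on the interval.
dens <- dunif(20, min = 0, max = 60)

c(p_at_most_20 = p_at_most_20, p_between = p_between, dens = dens)
```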