ps project file

Uploaded by

Durga Nandini


INDEX

S.No. | Practical | Date | Remark
1. | Load real-world datasets from sources like CSV files or online repositories. | 6/9/24 |
2. | Calculate descriptive statistics like mean, median, and standard deviation. | 13/9/24 |
3. | Create histograms, boxplots, scatter plots, and bar charts to visualize data distributions, relationships between variables, and identify potential outliers. | 20/9/24 |
4. | Simulate random outcomes and calculate probabilities for the case of coin flips and card draws. | 27/9/24 |
5. | Simulate experimental probabilities and compare them with theoretical probabilities. | 4/10/24 |
6. | Calculate expected value and variance in the context of a single random variable. | 11/10/24 |
7. | Generate and plot probabilities for events in discrete and continuous distributions (Binomial, Poisson, Geometric, and Normal). | 8/10/24 |
8. | Fit simple linear regression models using built-in functions. | 15/10/24 |

PRACTICAL:01
Aim: Load real-world datasets from sources like CSV files or online repositories.
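
The original code for this practical did not survive extraction, so the following is only a minimal sketch of how such loading might look in R. The file contents, the column names, and the commented-out URL are hypothetical placeholders, not the author's data.

```r
# Hypothetical example data written to a temporary CSV so the sketch is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,age", "Asha,21", "Ravi,19", "Meena,20"), csv_path)

# Load the CSV file into a data frame and inspect it.
students <- read.csv(csv_path, stringsAsFactors = FALSE)
print(head(students))
str(students)

# Load a dataset that ships with R (the datasets package).
data(mtcars)
print(dim(mtcars))

# An online source is read the same way by passing a URL (placeholder, not run):
# remote <- read.csv("https://example.com/data.csv")
```

For a real file, replace csv_path with the path to the CSV on disk.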
Output:
PRACTICAL:02
Aim: Calculate descriptive statistics like mean, median, mode and
standard deviation

Central Tendency: - Central tendency is a measure that summarizes a data set with a single value located near its centre. Mean, median, and mode are the most common measures of central tendency.

Mean: - The mean is the average of the data: the sum of all values divided by the number of data points. The mean works best when the data are roughly symmetric (for example, normally distributed) and free of extreme values. For a random sample, the mean estimates the expected value of the underlying distribution.

Median: - The median is the middle value of the sorted data and is also the 50th percentile. Unlike the mean, which is liable to be influenced by outliers, the median is only weakly affected by outliers and skewness, so it is a better measure of centrality when the data are skewed.
Mode: - The mode is the value that occurs with the highest frequency. It is especially useful when the data are non-numeric (categorical), where a mean or median cannot be computed.
Standard Deviation: - The standard deviation is a measure of the dispersion of the values around the mean.

Range: - The range is the difference between the largest and smallest points in the data.
Interquartile Range: - The interquartile range is the difference between the 75th percentile (third quartile) and the 25th percentile (first quartile).
Question: Twenty students, graduates and undergraduates, were enrolled in a statistics course. Their ages were
18, 19, 19, 19, 19, 20, 20, 20, 20, 20, 21, 21, 21, 21, 22, 23, 24, 27, 30, 36.
a) Find the mean and median age of all students.
b) Find the median age of all students under 25 years.
c) Find the modal age of all students.
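
The solution code is missing from the extracted text; one possible way to answer the three parts in R is sketched below. The variable names are illustrative. Base R has no function for the statistical mode, so the sketch tabulates the frequencies instead.

```r
ages <- c(18, 19, 19, 19, 19, 20, 20, 20, 20, 20,
          21, 21, 21, 21, 22, 23, 24, 27, 30, 36)

# a) Mean and median of all students.
mean_age   <- mean(ages)     # 440 / 20 = 22
median_age <- median(ages)   # average of the 10th and 11th values = 20.5

# b) Median age of students under 25 years.
median_under_25 <- median(ages[ages < 25])   # 9th of 17 values = 20

# c) Modal age: the value with the highest frequency.
freq <- table(ages)
modal_age <- as.numeric(names(freq)[which.max(freq)])   # 20 occurs 5 times

# Measures of spread described above.
sd_age    <- sd(ages)            # sample standard deviation (n - 1 denominator)
range_age <- diff(range(ages))   # 36 - 18 = 18
iqr_age   <- IQR(ages)           # third quartile minus first quartile

cat("Mean:", mean_age, " Median:", median_age, "\n")
cat("Median (under 25):", median_under_25, " Mode:", modal_age, "\n")
cat("SD:", sd_age, " Range:", range_age, " IQR:", iqr_age, "\n")
```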
PRACTICAL:03
Aim: Create histograms, boxplots, scatter plots, and bar charts to visualize data
distributions, relationships between variables, and identify potential outliers.

Histogram: A histogram shows the frequency distribution of a continuous variable. It is useful for understanding the distribution of data.
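
As the plotting code was lost, here is a hedged sketch of a histogram for the student ages from Practical 02. The 5-year bin edges, title, and colour are assumptions chosen for illustration.

```r
ages <- c(18, 19, 19, 19, 19, 20, 20, 20, 20, 20,
          21, 21, 21, 21, 22, 23, 24, 27, 30, 36)

# Bins are (15,20], (20,25], ..., (35,40]; hist() also returns the counts.
h <- hist(ages,
          breaks = seq(15, 40, by = 5),
          main = "Histogram of student ages",
          xlab = "Age (years)",
          ylab = "Frequency",
          col = "lightblue")
print(h$counts)
```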

Output:

Boxplot: A boxplot is useful for visualizing the distribution of data, including the median, quartiles, and potential outliers.
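
A possible boxplot of the same ages is sketched below (labels are illustrative). boxplot() flags points lying more than 1.5 times the box length beyond the hinges; for this data those are the ages 30 and 36.

```r
ages <- c(18, 19, 19, 19, 19, 20, 20, 20, 20, 20,
          21, 21, 21, 21, 22, 23, 24, 27, 30, 36)

# Draw the boxplot; the returned list holds the five-number summary and outliers.
b <- boxplot(ages,
             horizontal = TRUE,
             main = "Boxplot of student ages",
             xlab = "Age (years)")
print(b$stats)   # lower whisker, lower hinge, median, upper hinge, upper whisker
print(b$out)     # points drawn individually as potential outliers
```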
Output:

Scatter Plot: A scatter plot helps visualize the relationship between two continuous variables. Since we only have one variable (ages) in this example, we'll generate a simple scatter plot of ages against an index (just to demonstrate).
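
A minimal sketch of that demonstration, assuming the index is simply each student's position in the vector:

```r
ages <- c(18, 19, 19, 19, 19, 20, 20, 20, 20, 20,
          21, 21, 21, 21, 22, 23, 24, 27, 30, 36)

# Position of each observation, used as the x-axis.
index <- seq_along(ages)

plot(index, ages,
     main = "Scatter plot of ages against index",
     xlab = "Index",
     ylab = "Age (years)",
     pch = 19, col = "darkgreen")
```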

Output:
Bar Chart: A bar chart is suitable for categorical data, showing the frequency of each category. In this case, we'll create a bar chart showing the frequency of each age.
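
One way this could be done is to tabulate the ages with table() and pass the result to barplot(); the labels and colour below are illustrative.

```r
ages <- c(18, 19, 19, 19, 19, 20, 20, 20, 20, 20,
          21, 21, 21, 21, 22, 23, 24, 27, 30, 36)

# Frequency of each distinct age.
freq <- table(ages)
print(freq)

barplot(freq,
        main = "Frequency of each age",
        xlab = "Age (years)",
        ylab = "Number of students",
        col = "orange")
```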

Output:
PRACTICAL:04
Aim: Simulate random outcomes and calculate probabilities for the case of coin
flips and card draws

Coin Flips Simulation: In this scenario, we simulate a certain number of coin flips and calculate the probabilities of heads and tails using R's sample() function:
sample(x, size, replace = FALSE, prob = NULL)
• x is the vector of elements from which you are sampling.
• size is the number of samples you wish to take.
• replace determines whether you are sampling with replacement or not. Sampling without replacement means that sample will not pick the same element twice, and this is the default behaviour. Pass replace = TRUE to sample if you wish to sample with replacement.
• prob is a vector of probabilities or weights associated with x. It should be a vector of nonnegative numbers of the same length as x. If the sum of prob is not 1, it will be normalized. If this value is not provided, then each element of x is considered to be equally likely.
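
The simulation code itself is not in the extracted text. A hedged sketch using sample() as described above follows; the seed, the number of flips (1000), and the biased-coin weights are assumptions for illustration.

```r
set.seed(123)     # arbitrary seed so the run is reproducible
n_flips <- 1000   # assumed number of flips

# Fair coin: sample with replacement from the two outcomes.
flips  <- sample(c("Heads", "Tails"), size = n_flips, replace = TRUE)
counts <- table(flips)
probs  <- counts / n_flips
print(probs)

# Biased coin: the prob argument weights the outcomes (70% heads here).
biased <- sample(c("Heads", "Tails"), size = n_flips, replace = TRUE,
                 prob = c(0.7, 0.3))
p_heads_biased <- mean(biased == "Heads")
print(p_heads_biased)
```

Because flips must be drawn many times from only two outcomes, replace = TRUE is required; with the default replace = FALSE, size could not exceed 2.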

Card Draw Simulation: To simulate drawing cards from a shuffled deck, build the 52-card deck as a vector and draw from it with sample().
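
A sketch of one such method is given below. The card labels, the hand size of 5, and the choice of "drawing an ace" as the event are assumptions made for illustration.

```r
set.seed(42)   # arbitrary seed

# Build a 52-card deck: 13 ranks in each of 4 suits.
ranks <- c("A", 2:10, "J", "Q", "K")
suits <- c("Hearts", "Diamonds", "Clubs", "Spades")
deck  <- paste(rep(ranks, times = 4), "of", rep(suits, each = 13))

# Draw a 5-card hand without replacement (no card can repeat).
hand <- sample(deck, size = 5)
print(hand)

# Estimate P(ace) by drawing one card at a time from a full deck, many times.
draws <- sample(deck, size = 10000, replace = TRUE)
p_ace <- mean(startsWith(draws, "A "))
cat("Simulated P(ace):", p_ace, " Theoretical:", 4 / 52, "\n")
```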
PRACTICAL:05
Aim: Simulate experimental probabilities and compare them with theoretical probabilities.
Probability: Mathematical Approach
P(success) = (number of ways to get success) / (total number of possible outcomes)
Probability: Statistical Approach
P(success) = (number of times the event occurred) / (total number of trials of the experiment)

1: Comparison of theoretical and experimental probability for flipping a coin

2: Comparison of theoretical and experimental probability for a die with a sample size of 35
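
Both comparisons can be sketched as follows. The seed and the number of coin flips (1000) are assumptions; the 35 die rolls follow the sample size stated above.

```r
set.seed(2024)   # arbitrary seed

# 1: Coin. Experimental vs theoretical probability of heads.
n_coin <- 1000   # assumed number of flips
coin <- sample(c("H", "T"), size = n_coin, replace = TRUE)
exp_heads  <- mean(coin == "H")
theo_heads <- 1 / 2
cat("Heads: experimental =", exp_heads, " theoretical =", theo_heads, "\n")

# 2: Die rolled 35 times. factor() keeps faces that never appeared (count 0).
n_die <- 35
rolls <- sample(1:6, size = n_die, replace = TRUE)
exp_die  <- as.numeric(table(factor(rolls, levels = 1:6))) / n_die
theo_die <- rep(1 / 6, 6)

comparison <- data.frame(face = 1:6,
                         experimental = round(exp_die, 3),
                         theoretical  = round(theo_die, 3))
print(comparison)
```

With only 35 rolls the experimental values can differ noticeably from 1/6; increasing the number of trials brings them closer to the theoretical value.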
PRACTICAL:06
Aim: Calculate expected value and variance in the context of a single random variable.

Description:
There are two main types of random variables:
1. Discrete Random Variable: Takes on a countable number of distinct values. For
example, the number of heads in a series of coin flips.

a) Expected Value (𝑬[𝑿]): a measure of the central tendency of a random variable. It is the mean of the possible values that the random variable can take, weighted by their probabilities. It is calculated by summing the products of each value and its probability:

$$E(X) = \sum_{i=1}^{n} x_i \, P(x_i)$$

where $x_i$ are the possible values of the random variable $X$ and $P(x_i) = P(X = x_i)$ is the probability of $X$ taking the value $x_i$.

b) Variance (Var(𝑿)): measures the spread or dispersion of a random variable's possible values around the expected value. It quantifies how much the values of the random variable differ from the expected value. A higher variance indicates greater variability in the values. Variance is defined as the expected value of the squared deviation of a random variable from its mean, which is equivalent to:

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$$

Where:

• $E(X^2) = \sum_{i=1}^{n} x_i^2 \, P(x_i)$ is the expected value of $X^2$,

• $E(X)$ is the expected value.

2. Continuous Random Variable: Takes on an infinite number of possible values within a given range. For example, the height of individuals in a population.

a) Expected Value (𝑬[𝑿]): It is calculated by integrating the product of the value and its probability density function over the entire range.

$$E(X) = \int_{-\infty}^{\infty} x \, f(x) \, dx$$

Where:
• $x$ ranges over the possible values of the random variable,
• $f(x)$ is the probability density function.

b) Variance (Var(𝑿)): The variance of a continuous random variable $X$ is defined as:

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$$

Where:

• $E(X^2) = \int_{-\infty}^{\infty} x^2 \, f(x) \, dx$,

• $E(X)$ is the expected value calculated earlier.

For Discrete Random Variable:
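
The worked discrete example is missing from the extracted text. As a stand-in, the sketch below takes X to be the number of heads in two fair coin flips (an assumed example, echoing the coin-flip illustration in the description) and applies the two formulas directly.

```r
# X = number of heads in two fair coin flips (assumed example).
x <- c(0, 1, 2)
p <- c(0.25, 0.50, 0.25)

# The probabilities of a valid distribution must sum to 1.
stopifnot(isTRUE(all.equal(sum(p), 1)))

ex    <- sum(x * p)      # E(X)   = 0*0.25 + 1*0.5 + 2*0.25 = 1
ex2   <- sum(x^2 * p)    # E(X^2) = 0*0.25 + 1*0.5 + 4*0.25 = 1.5
var_x <- ex2 - ex^2      # Var(X) = 1.5 - 1^2 = 0.5

cat("E(X) =", ex, " Var(X) =", var_x, "\n")
```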

For Continuous Random Variable:

Given the continuous random variable X with the probability density function (PDF):
$$f(x) = \begin{cases} 6x(1-x), & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}$$
We are tasked with finding the expected value and variance of X.
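
A sketch of a numerical solution using R's integrate() is shown below; the original solution is not in the extracted text, so this is one possible approach. Integrating by hand gives E(X) = 6(1/3 - 1/4) = 0.5, E(X^2) = 6(1/4 - 1/5) = 0.3, and Var(X) = 0.3 - 0.25 = 0.05, which the numerical results should match.

```r
# PDF on its support (0, 1); it is zero elsewhere, so we integrate over [0, 1].
f <- function(x) 6 * x * (1 - x)

# Sanity check: a valid PDF integrates to 1.
total <- integrate(f, lower = 0, upper = 1)$value

# E(X) and E(X^2) by numerical integration.
ex  <- integrate(function(x) x   * f(x), lower = 0, upper = 1)$value
ex2 <- integrate(function(x) x^2 * f(x), lower = 0, upper = 1)$value

# Var(X) = E(X^2) - [E(X)]^2
var_x <- ex2 - ex^2

cat("Total probability:", total, "\n")
cat("E(X) =", ex, " Var(X) =", var_x, "\n")
```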
PRACTICAL:07
Aim: Generate and plot probabilities for events in discrete and continuous
distributions (Binomial, Poisson, Geometric, and Normal).

Description:
Binomial Distribution
The binomial distribution is a discrete probability distribution that models the
number of successes in a fixed number of independent trials of a binary
experiment. Each trial can result in one of two outcomes: "success" or "failure."
Key Characteristics

1. Number of Trials (𝑛): The total number of independent trials or experiments.

2. Probability of Success (𝑝): The probability of success in a single trial.

3. Probability of Failure (𝑞): The probability of failure in a single trial, calculated as 𝑞 = 1 − 𝑝.

4. Random Variable (X): Represents the number of successes in 𝑛 trials.

Probability Mass Function (PMF)
Formula
$$P(X = x) = \binom{n}{x} p^{x} q^{n-x}, \quad x = 0, 1, \dots, n$$
R provides several built-in functions for handling the binomial distribution.

1. dbinom(x, size, prob): This function gives the probability mass P(X = x) at each point x.

2. pbinom(q, size, prob, lower.tail = TRUE): This function gives the cumulative probability P(X ≤ q).
Parameters:

• 𝑥: Number of successes (can be a vector).

• 𝑞: The quantile (number of successes).

• size: Number of trials.

• prob: Probability of success on each trial.

• lower.tail: If TRUE (default), probabilities are 𝑃(𝑋 ≤ 𝑞); if FALSE, 𝑃(𝑋 > 𝑞).


Question: Compute the binomial probabilities for the experiment of flipping a fair coin 20 times. Also, create a bar plot to visualize the probabilities for each possible outcome (number of heads).
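
A hedged sketch for the coin experiment with dbinom() and pbinom() follows. The specific event "exactly 10 heads" is an assumption chosen for illustration, since the start of the question is missing from the extracted text.

```r
n <- 20    # number of flips
p <- 0.5   # probability of heads for a fair coin

# PMF for every possible number of heads, 0 to 20.
heads <- 0:n
probs <- dbinom(heads, size = n, prob = p)

# Example events (assumed): exactly 10 heads, and at most 10 heads.
p_exactly_10 <- dbinom(10, size = n, prob = p)
p_at_most_10 <- pbinom(10, size = n, prob = p)
cat("P(X = 10)  =", p_exactly_10, "\n")
cat("P(X <= 10) =", p_at_most_10, "\n")

barplot(probs, names.arg = heads,
        main = "Binomial distribution (n = 20, p = 0.5)",
        xlab = "Number of heads", ylab = "Probability",
        col = "skyblue")
```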
Poisson Distribution
The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space, given that these events occur with a known constant mean rate and independently of the time since the last event. It is often called the distribution of rare events.
Key Characteristics

1. Parameter (𝜆): The average number of events in the given interval. It is also the
mean and variance of the distribution.
2. Events: The events must occur independently. That is, the occurrence of one
event does not affect the occurrence of another.
3. Interval: The interval can be time, space, or any other measurable quantity.
4. For a small interval, the probability of the event occurring is proportional to the
size of the interval.
5. The probability of more than one occurrence in the small interval is negligible.

Parameters (for R's built-in dpois(x, lambda, log = FALSE) and ppois(q, lambda, lower.tail = TRUE, log.p = FALSE)):

• x or q: Number of events (can be a vector).

• lambda: Average number of times the event occurs in the interval.
• log: A logical value. If TRUE, the function returns the logarithm of the probability; if FALSE (the default), it returns the actual probability.

• lower.tail: If TRUE (default), probabilities are 𝑃(𝑋 ≤ 𝑞); if FALSE, 𝑃(𝑋 > 𝑞).
A bookstore sells an average of 6 books per hour. What is the probability that the
bookstore sells exactly 4 books in a given hour? Additionally, visualize the
probability mass function (PMF) of the Poisson distribution for the number of books
sold from 0 to 10 in one hour.
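
One way to answer the bookstore question with dpois() is sketched below; plot labels are illustrative. By the Poisson PMF, P(X = 4) = 6^4 e^(-6) / 4! = 54 e^(-6), roughly 0.134.

```r
lambda <- 6   # average number of books sold per hour

# Probability of selling exactly 4 books in an hour.
p_exactly_4 <- dpois(4, lambda = lambda)
cat("P(X = 4) =", p_exactly_4, "\n")

# PMF for 0 to 10 books sold in one hour.
books <- 0:10
pmf   <- dpois(books, lambda = lambda)

barplot(pmf, names.arg = books,
        main = "Poisson distribution (lambda = 6)",
        xlab = "Number of books sold in one hour", ylab = "Probability",
        col = "lightgreen")
```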
Geometric Distribution
The geometric distribution is a discrete probability distribution that models the number of trials required to achieve the first success in a sequence of independent Bernoulli trials (where each trial has two possible outcomes: success or failure). It is particularly useful in scenarios where you want to determine how many attempts it takes before the first success occurs. Note that R's built-in functions use the equivalent convention of counting the number of failures before the first success, which is one less than the number of trials.
Key Characteristics

1. Parameter (𝑝): The probability of success on each trial.

2. Trials: Each trial is independent, meaning the outcome of one trial does not affect
the others.

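Formula
With $x$ counting the failures before the first success:
$$P(X = x) = (1 - p)^{x} \, p, \quad x = 0, 1, 2, \dots$$
R provides several built-in functions for handling the Geometric distribution.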
1. dgeom(x, prob, log = FALSE): Calculates the probability of having exactly x
failures before the first success.
2. pgeom(q, prob, lower.tail = TRUE, log.p = FALSE): Calculates the probability of
having at most q failures before the first success.
In a scenario where the probability of success in a Bernoulli trial is 0.4, what is:
1. The probability of experiencing exactly 5 failures before achieving the first
success?
2. The cumulative probability of experiencing at most 5 failures before the first
success?
3. Additionally, visualize the probability mass function (PMF) of the geometric
distribution for the first 20 trials.
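
A sketch of a solution using dgeom() and pgeom() follows. It assumes "the first 20 trials" means 0 to 19 failures before the first success, since R counts failures rather than trials. By hand, part 1 is 0.6^5 × 0.4 = 0.031104 and part 2 is 1 − 0.6^6 = 0.953344.

```r
p <- 0.4   # probability of success on each trial

# 1. Exactly 5 failures before the first success.
p_exactly_5 <- dgeom(5, prob = p)

# 2. At most 5 failures before the first success.
p_at_most_5 <- pgeom(5, prob = p)

cat("P(X = 5)  =", p_exactly_5, "\n")
cat("P(X <= 5) =", p_at_most_5, "\n")

# 3. PMF for the first 20 trials: x failures means success on trial x + 1.
failures <- 0:19
pmf <- dgeom(failures, prob = p)

barplot(pmf, names.arg = failures + 1,
        main = "Geometric distribution (p = 0.4)",
        xlab = "Trial on which the first success occurs", ylab = "Probability",
        col = "plum")
```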
Normal Distribution
The normal distribution, also known as the Gaussian distribution, is a continuous
probability distribution that is symmetric about its mean, indicating that data near
the mean are more frequent in occurrence than data far from the mean.
Key Characteristics
1. Bell-Shaped Curve: The graph of the normal distribution is bell-shaped and symmetric around the mean. Fifty percent of values lie to the left of the mean and the other fifty percent lie to the right of the mean.

2. Mean (𝜇): The central value of the distribution, which is also its median and mode.

3. Standard Deviation (𝜎): A measure of the dispersion or spread of the distribution. It determines the width of the bell curve.

4. Notation: A normal distribution is often denoted as 𝑁(𝜇, 𝜎²), where 𝜇 is the mean and 𝜎² is the variance.
Parameters (for R's built-in dnorm(x, mean, sd) and pnorm(q, mean, sd)):
• x or q: vector of quantiles.
• mean: the mean of the distribution. Its default value is 0.
• sd: the standard deviation of the distribution. Its default value is 1.
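
No question or code for the normal distribution survives in the extracted text, so the sketch below simply plots the standard normal density with dnorm() and evaluates two probabilities with pnorm(). The choice of mu = 0 and sigma = 1 and of the two events is an assumption for illustration.

```r
mu    <- 0   # mean (assumed: standard normal)
sigma <- 1   # standard deviation

# Density curve over mu +/- 4 standard deviations.
x <- seq(mu - 4 * sigma, mu + 4 * sigma, length.out = 200)
density <- dnorm(x, mean = mu, sd = sigma)

plot(x, density, type = "l", lwd = 2, col = "blue",
     main = "Normal distribution (mean = 0, sd = 1)",
     xlab = "x", ylab = "Density")

# Cumulative probabilities.
p_below_mean <- pnorm(mu, mean = mu, sd = sigma)          # symmetry gives 0.5
p_within_1sd <- pnorm(mu + sigma, mean = mu, sd = sigma) -
                pnorm(mu - sigma, mean = mu, sd = sigma)  # about 0.6827

cat("P(X <= mean) =", p_below_mean, "\n")
cat("P(within 1 sd of the mean) =", p_within_1sd, "\n")
```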
PRACTICAL:08
Aim: Fit simple linear regression models using built-in functions.
Formula:

y = β0 + β1·x + ϵ

Parameters:
1. y is the dependent variable,
2. x is the independent variable,
3. β0 is the intercept,
4. β1 is the slope (coefficient for x),
5. ϵ is the error term.
Plot the wt vs mpg data as a scatter plot, then overlay the regression line in red to show the linear relationship between weight and miles per gallon.
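
The fitting code is missing from the extracted text. The sketch below assumes that wt and mpg refer to the columns of R's built-in mtcars dataset and uses lm() as the built-in fitting function; axis labels are illustrative.

```r
# Assumption: wt and mpg are columns of the built-in mtcars dataset.
data(mtcars)

# Fit mpg = beta0 + beta1 * wt + error.
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))   # intercept (beta0) and slope (beta1)
print(summary(fit)$r.squared)

# Scatter plot with the fitted regression line overlaid in red.
plot(mtcars$wt, mtcars$mpg,
     main = "Simple linear regression: mpg vs wt",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     pch = 19)
abline(fit, col = "red", lwd = 2)
```

The slope is negative, indicating that heavier cars in this dataset tend to have lower fuel efficiency.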

THANK YOU:
