ML2_Math_Algo
RAJAD SHAKYA
Descriptive statistics
● summarize and describe the features of a dataset.
Mean (Average)
● measure of the central tendency of a dataset: the sum of the values divided by the number of values
● Example: mean of 2, 4, 6, 8, 10 = (2 + 4 + 6 + 8 + 10)/5 = 30/5 = 6
○ The median is 90
Question
● Given the dataset: 1, 2, 2, 3, 4, 5, 5, 5, 6, 8. Calculate
the mean, median, and mode.
○ The mean is 41/10 = 4.1
○ The median is (4 + 5)/2 = 4.5
○ The mode is 5
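○ import numpy as np
from statistics import mode
data = [1, 2, 2, 3, 4, 5, 5, 5, 6, 8]
np.mean(data)    # 4.1
np.median(data)  # 4.5
mode(data)       # 5; NumPy itself has no mode function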
○ Mean = 5
○ Variance = (9 + 1 + 1 + 1 + 0 + 0 + 4 + 16)/8 = 32/8 = 4
○ Standard deviation = √4 = 2
Question
● You have the following values: [12, 15, 12, 15, 14,
12, 15, 14]. Compute the variance and standard
deviation.
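○ Variance ≈ 1.73 (population variance: mean = 13.625, sum of squared deviations = 13.875, divided by n = 8)
○ Standard deviation = √1.73 ≈ 1.32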
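The question above can be checked with Python's standard `statistics` module. This is a minimal sketch; it assumes the population form (dividing by n) is intended, and also shows the sample form (dividing by n - 1) for comparison.

```python
from statistics import mean, pvariance, pstdev, variance

values = [12, 15, 12, 15, 14, 12, 15, 14]

mu = mean(values)              # 13.625
pop_var = pvariance(values)    # population variance, divides by n
pop_sd = pstdev(values)        # square root of the population variance
sample_var = variance(values)  # sample variance, divides by n - 1

print(round(pop_var, 2), round(pop_sd, 2), round(sample_var, 2))
```

The sample variance is slightly larger because it divides the same sum of squared deviations by 7 instead of 8.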
Range
● difference between the maximum and minimum
values in a dataset.
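Interquartile Range (IQR)
● IQR = Q3 - Q1: difference between the third and first quartiles, i.e. the spread of the middle 50% of the data.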
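As a small illustration, the range and IQR can be computed for the dataset from the earlier mean/median/mode question. Quartile conventions differ between textbooks and libraries; this sketch assumes linear interpolation over the full dataset (`method="inclusive"`), so other conventions may give slightly different quartiles.

```python
from statistics import quantiles

data = [1, 2, 2, 3, 4, 5, 5, 5, 6, 8]

# Range: maximum minus minimum.
data_range = max(data) - min(data)

# Quartiles with linear interpolation between sorted values.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(data_range, q1, q3, iqr)  # 7 2.25 5.0 2.75
```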
Histogram
● graphical representation of the
distribution of numerical data.
● estimate of the probability
distribution of a continuous
variable.
● Bins: The range of values is divided into
consecutive, non-overlapping intervals
● Frequency: The height of each bar indicates
the number of data points that fall in its bin
Question
● Create a histogram for the dataset:
12, 34, 45, 67, 69, 45, 66, 78, 88, 64, 63, 33, 11, 16
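One possible way to answer the question is to count values per bin and print a text histogram. The bin width of 10 is an assumption made for this sketch; the slides do not specify one. Plotting libraries such as Matplotlib can draw the same counts as bars.

```python
from collections import Counter

data = [12, 34, 45, 67, 69, 45, 66, 78, 88, 64, 63, 33, 11, 16]
bin_width = 10  # assumed bin width, not given in the question

# Map each value to the lower edge of its bin, e.g. 34 -> 30.
counts = Counter((value // bin_width) * bin_width for value in data)

# One row per bin (empty bins included); bar length = frequency.
for low in range(10, 90, bin_width):
    print(f"{low:2d}-{low + bin_width - 1:2d} | {'#' * counts[low]}")
```

With this choice the 60-69 bin is the tallest, holding 5 of the 14 values.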
Boxplot (Box-and-Whisker Plot)
● standardized way of displaying the distribution of
data based on a five-number summary:
● minimum, first quartile (Q1), median, third quartile (Q3),
and maximum
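A five-number summary (minimum, Q1, median, Q3, maximum) is what a boxplot draws. The sketch below computes it for the histogram question's dataset, again assuming the linear-interpolation quartile convention (`method="inclusive"`).

```python
from statistics import quantiles

data = [12, 34, 45, 67, 69, 45, 66, 78, 88, 64, 63, 33, 11, 16]

# quantiles() sorts the data internally and returns Q1, median, Q3.
q1, median, q3 = quantiles(data, n=4, method="inclusive")
summary = (min(data), q1, median, q3, max(data))

print(summary)  # (11, 33.25, 54.0, 66.75, 88)
```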
Covariance
● measure of the joint variability of two random
variables. It indicates the direction of the linear
relationship between variables.
● Cov(X,Y) = Σ (xᵢ - x̄)(yᵢ - ȳ) / n (population form; the sample form divides by n - 1)
Covariance
● Calculate the covariance for the following data:
● X: 2, 8, 18, 20, 28, 30
Y: 5, 12, 18, 23, 45, 50
○ Cov(X,Y) = 947/6 ≈ 157.83 (population covariance, dividing by n = 6)
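The result 157.83 corresponds to the population form of covariance (dividing by n). A minimal sketch that reproduces it; the helper name `population_covariance` is made up for this example.

```python
x = [2, 8, 18, 20, 28, 30]
y = [5, 12, 18, 23, 45, 50]

def population_covariance(xs, ys):
    """Average product of deviations from the two means (divides by n)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / n

cov_xy = population_covariance(x, y)
print(round(cov_xy, 2))  # 157.83
```

The positive sign indicates that X and Y tend to increase together.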
Covariance Matrix
● a square matrix that gives the covariance between
each pair of components (or elements) of a given
random vector; its diagonal entries are the variances
of the components
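To make the definition concrete, here is a sketch that builds the 2 x 2 covariance matrix for the X and Y data from the covariance question. It assumes the population form (dividing by n); the helper `cov` is illustrative only.

```python
def cov(xs, ys):
    """Population covariance of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n

# Each inner list holds the observations of one component.
components = [
    [2, 8, 18, 20, 28, 30],   # X
    [5, 12, 18, 23, 45, 50],  # Y
]

# Entry (i, j) is the covariance between component i and component j.
cov_matrix = [[cov(a, b) for b in components] for a in components]

for row in cov_matrix:
    print([round(v, 2) for v in row])
```

The matrix is symmetric, and the diagonal holds Var(X) and Var(Y) because the covariance of a variable with itself is its variance.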
Correlation
● statistical measure that expresses the extent to which two
variables are linearly related.
● the scaled form of covariance: ρ(X,Y) = Cov(X,Y) / (σX σY)
● Positive Correlation: Indicates that as one variable increases,
the other variable also increases.
● Negative Correlation: Indicates that as one variable increases,
the other variable decreases.
● Zero Correlation: Indicates no linear relationship between the
variables.
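The formula ρ(X,Y) = Cov(X,Y) / (σX σY) can be applied directly to the X and Y data from the covariance question. This sketch uses population forms throughout; using sample forms for both covariance and standard deviations would give the same ratio.

```python
from math import sqrt

x = [2, 8, 18, 20, 28, 30]
y = [5, 12, 18, 23, 45, 50]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Covariance and standard deviations, all dividing by n.
cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
sigma_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
sigma_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)

rho = cov_xy / (sigma_x * sigma_y)
print(round(rho, 3))  # roughly 0.95, a strong positive correlation
```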
Correlation
Pearson Correlation Coefficient
● value of the coefficient lies between -1 and +1.
import numpy as np
# Example usage
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r = np.corrcoef(x, y)[0, 1]  # Pearson r via NumPy; about 1 because y = 2x
Probability
● P(A) = Number of favorable outcomes /
Total number of outcomes
○ Dice Roll:
■ Sample Space: {1, 2, 3, 4, 5, 6}
■ Event: Rolling an even number (i.e., {2, 4, 6}),
so P(even) = 3/6 = 1/2
● P(A∩B) = P(A) × P(B) when A and B are independent events
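The favorable-over-total definition can be checked by listing outcomes for the dice-roll example. A minimal sketch, assuming a fair die so that all outcomes are equally likely:

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
event_even = {outcome for outcome in sample_space if outcome % 2 == 0}

# Classical probability: favorable outcomes / total outcomes.
p_even = Fraction(len(event_even), len(sample_space))
print(p_even)  # 1/2
```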
Conditional Probability
● probability of an event occurring given that
another event has already occurred.
● P(A∣B) = P(A∩B) / P(B), defined when P(B) > 0
Question 1
● Two fair six-sided dice are rolled. What is the
probability that the first die shows a 3 and the
second die shows an even number?
Event A: First die shows 3, P(A) = 1/6
Event B: Second die shows an even number, P(B) = 3/6 = 1/2
The two dice are independent, so the joint probability is
P(A∩B) = P(A) × P(B) = 1/6 × 1/2 = 1/12
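The answer to Question 1 can be confirmed by enumerating all 36 equally likely outcomes of two fair dice and comparing the count with the product P(A) × P(B). A short sketch:

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 (first, second) pairs

# Favorable: first die shows 3 and second die is even.
favorable = [(a, b) for a, b in outcomes if a == 3 and b % 2 == 0]
p_joint = Fraction(len(favorable), len(outcomes))

p_a = Fraction(1, 6)  # first die shows 3
p_b = Fraction(3, 6)  # second die is even
print(p_joint, p_joint == p_a * p_b)  # 1/12 True
```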
Question 2
● A card is drawn from a standard deck of 52
cards. What is the probability that the card is a
King given that it is a face card?
Face cards: J, Q, K in each of 4 suits = 12 cards, of which 4 are Kings
P(King∣Face) = 4/12 = 1/3
Question 3
● The probability that it rains on a given day is 0.3,
and the probability that there is traffic on a given
day is 0.2. The probability that it rains and there
is traffic on the same day is 0.1. What is the
probability that it rains given that there is traffic?
P(Rain∣Traffic) = P(Rain∩Traffic) / P(Traffic)
= 0.1 / 0.2
= 0.5
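Question 3 is a direct application of P(A∣B) = P(A∩B) / P(B). A small sketch using exact fractions; the helper name `conditional` is made up for this example.

```python
from fractions import Fraction

def conditional(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) is zero."""
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b

p_rain_and_traffic = Fraction(1, 10)  # 0.1
p_traffic = Fraction(2, 10)           # 0.2

p_rain_given_traffic = conditional(p_rain_and_traffic, p_traffic)
print(p_rain_given_traffic)  # 1/2
```

Note that P(Rain) = 0.3 is not needed for this particular question.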
Bayes' Theorem
● describes how to update the probability of a
hypothesis based on new evidence.
● P(A∣B)=P(B∣A)⋅P(A) / P(B)
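As an illustration, Bayes' theorem can be applied to the card example from Question 2, with A = "card is a King" and B = "card is a face card". This sketch reuses only the counts from that question; the function name `bayes` is hypothetical.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A | B) from P(B | A), P(A) and P(B)."""
    return p_b_given_a * p_a / p_b

p_king = Fraction(4, 52)          # P(A): 4 Kings in the deck
p_face = Fraction(12, 52)         # P(B): 12 face cards in the deck
p_face_given_king = Fraction(1)   # P(B | A): every King is a face card

p_king_given_face = bayes(p_face_given_king, p_king, p_face)
print(p_king_given_face)  # 1/3
```

The result matches the direct count 4/12 = 1/3.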
Skewness
● measures the asymmetry
of the data distribution.
● A positive skew indicates
a longer tail on the right.
● A negative skew indicates
a longer tail on the left.
Kurtosis
● measures the
“tailedness” of the
data distribution.
● A high kurtosis
indicates heavy tails,
while a low kurtosis
indicates light tails.
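Skewness and kurtosis are commonly defined through central moments. The sketch below assumes the population (biased) moment forms and reports plain kurtosis rather than excess kurtosis, so a normal distribution would score about 3. Libraries may apply different corrections.

```python
def central_moment(data, k):
    """k-th central moment: mean of (x - mean) ** k."""
    mu = sum(data) / len(data)
    return sum((x - mu) ** k for x in data) / len(data)

def skewness(data):
    """Third central moment scaled by the standard deviation cubed."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    """Fourth central moment scaled by the variance squared."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 12]  # one large value stretches the right tail

print(skewness(symmetric))          # 0.0
print(skewness(right_tailed) > 0)   # True
print(kurtosis(right_tailed) > kurtosis(symmetric))  # True
```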
Gaussian / Normal Distribution
● Properties:
○ Symmetry: The normal distribution is symmetric
about the mean.
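○ 68-95-99.7 Rule: about 68%, 95%, and 99.7% of values lie
within 1, 2, and 3 standard deviations of the mean, respectively.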
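The 68-95-99.7 figures are the shares of a normal distribution lying within 1, 2, and 3 standard deviations of the mean. They can be reproduced from the error function, since P(|X - μ| < kσ) = erf(k / √2). A minimal check:

```python
from math import erf, sqrt

def prob_within(k):
    """P(|X - mu| < k * sigma) for a normally distributed X."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(prob_within(k) * 100, 1))  # 68.3, 95.4, 99.7
```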
Central Limit Theorem (CLT)
● Given a sufficiently large sample of independent draws from a
population with a finite mean and variance, the
sample mean will be approximately normally
distributed, regardless of the original population
distribution.
● This theorem justifies the use of the normal
distribution in many statistical methods and
hypothesis tests, even when the original data is not
normally distributed.
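A small simulation can illustrate the theorem. The sketch below is built on assumptions chosen for the example: an exponential population (mean 1, standard deviation 1, strongly right-skewed), samples of size 50, and 2000 repetitions. The CLT suggests the sample means should centre near 1 with a spread near 1/√50 ≈ 0.14.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

sample_size = 50    # assumed size of each sample
num_samples = 2000  # assumed number of repeated samples

# Each entry is the mean of one sample drawn from the skewed population.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(sample_size)) / sample_size
    for _ in range(num_samples)
]

print(round(mean(sample_means), 2))    # close to the population mean, 1
print(round(pstdev(sample_means), 2))  # close to 1 / sqrt(50)
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the underlying data is far from normal.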
Standardization
● process of converting a normal distribution to a
standard normal distribution (mean = 0, standard
deviation = 1) using the z-score: z = (x - μ) / σ
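Standardization subtracts the mean and divides by the standard deviation, z = (x - μ) / σ. The sketch below applies it to the values from the variance question, assuming the population standard deviation is used; the resulting z-scores have mean 0 and standard deviation 1 up to floating-point rounding.

```python
from statistics import mean, pstdev

values = [12, 15, 12, 15, 14, 12, 15, 14]

mu = mean(values)
sigma = pstdev(values)  # population standard deviation

# z-score: how many standard deviations each value lies from the mean.
z_scores = [(v - mu) / sigma for v in values]

print([round(z, 2) for z in z_scores])
```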