Probability & Statistics
Statistics
The science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making more
effective decisions.
Constant
A characteristic that assumes a fixed value from individual to individual (place to place, time to time, person to
person). e.g. π, days in a week, months in a year etc.
Variable
A characteristic that varies either in quality or quantity from individual to individual (place to place, time to time,
person to person). e.g. age of a person, days in a month, temperature, gender of students, IQ of students etc.
There are two types of variables:
1. Qualitative Variable: A characteristic that varies only in quality from individual to individual (place to
place, time to time, person to person). It is also called an Attribute. e.g. gender of students, eye color etc.
2. Quantitative Variable: A characteristic that varies only in quantity from individual to individual (place
to place, time to time, person to person). It is also called a Scale Variable.
There are two types of Quantitative Variables:
a. Discrete Variable: A variable that can’t assume each and every value between two specified limits.
It is usually the result of counting. e.g. family size, students in a class, dress size, shoe size etc.
b. Continuous Variable: A variable that can assume each and every value between two specified limits.
It is usually the result of measurement. e.g. height, weight, temperature, age etc.
Levels of Measurement of a Variable
1. Nominal Level: Categorical data that can’t be arranged in any specific order. e.g. colors, code numbers on
players’ shirts etc.
2. Ordinal Level: Categorical data that can be ranked. e.g. intelligence, students’ grades, education level
etc.
3. Interval Level: Interval level data is similar to ordinal level data in that the categories can be ranked, with the
difference that the categories are quantities, so differences between values are meaningful. This level has the
additional characteristic that there is no natural zero. e.g. temperature (in °C or °F), calendar year etc.
4. Ratio Level: Ratio level data possesses the characteristics of interval level data, with the difference that there
is a natural zero. e.g. age, height, weight.
Population
Population is the total information related to the characteristic under investigation.
Sample
A sample is a representative part of the population, selected in the belief that it reflects the characteristics of
the entire population.
Methods of Selecting a Representative Sample
1. Non-Probability Sampling Method
2. Probability Sampling Method
i. Simple Random Sample: A sample selected so that each item or person in the population has the
same chance of being included.
ii. Systematic Random Sample: A random starting point is selected, and then every kth member of the
population is selected.
iii. Stratified Random Sample: A population is divided into subgroups, called strata, and a sample is
randomly selected from each stratum.
iv. Cluster Random Sample: A population is divided into clusters using naturally occurring geographic
or other boundaries. Then, clusters are randomly selected and a sample is collected by randomly
selecting from each cluster.
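The four probability sampling methods above can be sketched in Python. This is only an illustration: the population of 100 numbered units, the seed, the sample sizes, and the way strata and clusters are formed are all assumptions made for the example, not part of the notes.

```python
import random

rng = random.Random(42)            # fixed seed (assumed) so the sketch is repeatable
population = list(range(1, 101))   # hypothetical population of 100 numbered units

# Simple random sample: every unit has the same chance of being included.
simple = rng.sample(population, 10)

# Systematic random sample: random starting point, then every k-th unit.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

# Stratified random sample: divide into strata, then sample from each stratum.
strata = {
    "low": [u for u in population if u <= 50],
    "high": [u for u in population if u > 50],
}
stratified = [u for group in strata.values() for u in rng.sample(group, 5)]

# Cluster random sample: select whole clusters at random, then sample within them.
clusters = [population[i:i + 20] for i in range(0, 100, 20)]   # 5 clusters of 20
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [u for c in chosen_clusters for u in rng.sample(c, 5)]
```

Note that the stratified sample draws from every stratum, while the cluster sample draws only from the clusters that were selected.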
Reasons to Sample
To contact the whole population would be time-consuming.
The cost of studying all the items in a population may be prohibitive.
The physical impossibility of checking all items in the population.
The destructive nature of some tests.
The sample results are adequate.
Parameter
A measure computed from a population. e.g. Mean (μ), Variance (σ²) etc.
Statistic
A measure computed from a sample. e.g. Mean (X̅), Variance (S²) etc.
Average
A single value that represents a whole data set (either population or sample).
Three types of Averages are:
1. Arithmetic Mean: The arithmetic mean (A.M.) is the sum of all values divided by the number of values in the data.
Sample Mean = X̅ = ∑x / n
2. Median: It is the central value of the data once arranged in order (an array).
3. Mode: It is the most repeated value in the data.
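As a small worked illustration of the three averages, the sketch below uses Python's standard `statistics` module on a made-up sample; the data values are assumptions chosen only so the arithmetic is easy to follow.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]          # hypothetical sample, n = 6

mean = sum(data) / len(data)        # sample mean = sum of x / n = 30 / 6 = 5.0
median = statistics.median(data)    # middle of the ordered data: (3 + 5) / 2 = 4.0
mode = statistics.mode(data)        # most repeated value: 3
```

With an even number of values, the median is taken as the mean of the two central values.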
Characteristics of Good Average
It is based on all the values of the data.
It should not be affected by extreme values (outliers).
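The effect of an extreme value can be seen in a short sketch. The two data sets below are hypothetical and differ only in their last value.

```python
import statistics

data = [10, 12, 13, 15, 20]           # hypothetical data without an outlier
with_outlier = [10, 12, 13, 15, 200]  # same data with the last value made extreme

# The mean uses every value, so the outlier moves it from 14 to 50.
mean_shift = statistics.mean(with_outlier) - statistics.mean(data)

# The median depends only on the central value, so it stays at 13.
median_shift = statistics.median(with_outlier) - statistics.median(data)
```

Here the mean satisfies the first characteristic but not the second, while the median does the reverse.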
Suitable Average for different Data
For quantitative data, the mean or median is suitable.
For qualitative data, the median or mode is suitable.
For uniform and large data, the mean is suitable.
For small data sets or data with outliers, the median is suitable.
For nominal data, the mode is suitable.
For ordinal data, the median is suitable.
Dispersion
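Dispersion measures how widely the values of a data set are spread around the average. The smaller the dispersion,
the more reliably the average represents the data.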
Variance
Sample Variance = S² = ∑(x − X̅)² / (n − 1)
Standard Deviation
Standard Deviation (S) is the positive square root of the variance.
Range
Range = Largest value − Smallest value
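The three measures of dispersion can be computed step by step as in the sketch below. The sample values are assumed for illustration; the standard library functions `statistics.variance` and `statistics.stdev` use the same n − 1 divisor and can serve as a cross-check.

```python
import math
import statistics

data = [2, 4, 4, 6, 9]                 # hypothetical sample, n = 5
n = len(data)
mean = sum(data) / n                   # 25 / 5 = 5.0

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in data) / (n - 1)   # 28 / 4 = 7.0

std_dev = math.sqrt(variance)          # positive square root of the variance
data_range = max(data) - min(data)     # 9 - 2 = 7
```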
Random Experiment
An experiment whose outcome cannot be predicted before the experiment is performed is called a Random
Experiment.
Random Variable
A variable measured as the result of an experiment. By chance, the variable can have different values.
Sample Space
The Sample Space (S) is the set of all possible outcomes of a random experiment. e.g.
If a coin is tossed; S = {H, T} and n(S) = 2
If a die is rolled; S = {1, 2, 3, 4, 5, 6} and n(S) = 6
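Sample spaces like these can be written as Python sets, with n(S) given by `len`. The combined experiment (tossing a coin and rolling a die together) is an extra example assumed here to show how a larger sample space is built.

```python
from itertools import product

coin = {"H", "T"}            # sample space for one coin toss
die = set(range(1, 7))       # sample space for one die roll

# Assumed combined experiment: toss the coin and roll the die together.
coin_and_die = set(product(coin, die))

n_coin = len(coin)               # n(S) = 2
n_die = len(die)                 # n(S) = 6
n_combined = len(coin_and_die)   # n(S) = 2 * 6 = 12
```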
Event
An event is a subset of the sample space, i.e. a set of outcomes of the random experiment.
The types of events are:
1. Simple Event: An event that contains only one sample point of the sample space. e.g. if a die
is rolled, the number is divisible by 5.
2. Compound Event: An event that contains more than one sample point of the sample
space. e.g. if a die is rolled, an even number appears.
3. Mutually-Exclusive Events: Two events E1 & E2 of the same random experiment are mutually exclusive if
they contain no common element. E1 ∩ E2 = Φ
4. Equally-Likely Events: Two events E1 & E2 of the same random experiment are equally likely if they have the
same chance of occurring. When all outcomes are equally likely, this means n(E1) = n(E2)
5. Exhaustive Events: Two events E1 & E2 of the same random experiment are exhaustive if E1 ∪ E2 = S
6. Impossible Event: An event that can never occur. e.g. if a die is rolled, 7 appears.
7. Independent Events: Two events are independent if the occurrence or non-occurrence of either of them does not
affect the probability of the other. e.g. successive draws when sampling with replacement.
8. Dependent Events: Two events are dependent if the occurrence or non-occurrence of either of them affects
the probability of the other. e.g. successive draws when sampling without replacement.
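Several of these event types can be checked with set operations for a single die roll. This is a minimal sketch; the variable names are chosen for the example only.

```python
S = set(range(1, 7))                       # sample space for one die roll

simple = {x for x in S if x % 5 == 0}      # divisible by 5: one sample point, {5}
compound = {x for x in S if x % 2 == 0}    # even number: {2, 4, 6}
odd = S - compound                         # odd number: {1, 3, 5}
impossible = {x for x in S if x == 7}      # 7 appears: empty set

mutually_exclusive = compound.isdisjoint(odd)   # even ∩ odd is empty
exhaustive = (compound | odd) == S              # even ∪ odd equals S
equally_likely = len(compound) == len(odd)      # n(E1) = n(E2) for a fair die
```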
Probability
The chance of the occurrence of an event is called its Probability.
P(E) = n(E) / n(S)
Where n(S) is the total number of possible outcomes of the random experiment & n(E) is the number of sample
points that belong to event E. This formula applies when all outcomes are equally likely.
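The ratio n(E) / n(S) can be computed exactly with Python's `fractions.Fraction`. The helper `probability` below is a hypothetical function written for this sketch, and it assumes equally likely outcomes.

```python
from fractions import Fraction

def probability(event, sample_space):
    """Classical probability n(E) / n(S), assuming equally likely outcomes."""
    event = set(event) & set(sample_space)   # keep only points that lie in S
    return Fraction(len(event), len(sample_space))

S = set(range(1, 7))                  # one roll of a fair die
p_even = probability({2, 4, 6}, S)    # 3 / 6 = 1/2
p_seven = probability({7}, S)         # impossible event: 0
```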
Rules of Probability
Addition Rule
If A & B are any two events, the probability that either A occurs or B occurs is:
P (A OR B) = P(A∪B) = P(A) + P(B) – P(A∩B)
If A & B are Mutually-Exclusive, then P(A∩B) = 0 and P(A∪B) = P(A) + P(B)
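The addition rule can be verified by counting on a single die roll. The events A (even number) and B (number greater than 3) are assumed for this example.

```python
from fractions import Fraction

S = set(range(1, 7))    # one roll of a fair die
A = {2, 4, 6}           # even number
B = {4, 5, 6}           # number greater than 3

def p(event):
    """n(E) / n(S) for equally likely outcomes."""
    return Fraction(len(event), len(S))

direct = p(A | B)                     # count A ∪ B = {2, 4, 5, 6} directly: 4/6
by_rule = p(A) + p(B) - p(A & B)      # 3/6 + 3/6 - 2/6 = 4/6
```

Subtracting P(A∩B) corrects for the outcomes 4 and 6, which would otherwise be counted twice.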
Product Rule for Independent Events
If A & B are any two independent events, the probability that both A & B occur is:
P (A AND B) = P(A∩B) = P(A) · P(B)
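As an illustration, tossing a coin and rolling a die are independent, so the product rule can be checked against a direct count over the combined sample space. The choice of events (a head, and a six) is an assumption of this sketch.

```python
from fractions import Fraction
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]
S = list(product(coin, die))          # 12 equally likely (coin, die) outcomes

A = [o for o in S if o[0] == "H"]     # the coin shows a head
B = [o for o in S if o[1] == 6]       # the die shows a six
both = [o for o in S if o[0] == "H" and o[1] == 6]

p_A = Fraction(len(A), len(S))        # 6/12 = 1/2
p_B = Fraction(len(B), len(S))        # 2/12 = 1/6
p_both = Fraction(len(both), len(S))  # 1/12, which equals p_A * p_B
```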
Product Rule for Dependent Events
If A & B are any two dependent events, the probability that both A & B occur is:
P (A AND B) = P(A∩B) = P(A) · P(B|A) ……… where P(B|A) = n(A∩B) / n(A)
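A common illustration is drawing two cards without replacement. The sketch below assumes a standard 52-card deck, with A = "first card is an ace" and B = "second card is an ace", and counts ordered pairs of distinct cards to obtain n(A) and n(A∩B).

```python
from fractions import Fraction
from itertools import permutations

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(r, s) for r in ranks for s in suits]        # 52 cards

S = list(permutations(deck, 2))                      # ordered draws: 52 * 51 = 2652
A = [d for d in S if d[0][0] == "A"]                 # first card is an ace: 4 * 51 = 204
A_and_B = [d for d in A if d[1][0] == "A"]           # both cards are aces: 4 * 3 = 12

p_A = Fraction(len(A), len(S))                       # 204/2652 = 1/13
p_B_given_A = Fraction(len(A_and_B), len(A))         # n(A∩B) / n(A) = 12/204 = 3/51
p_both = p_A * p_B_given_A                           # 1/221
```

The result agrees with the direct count n(A∩B) / n(S) = 12/2652.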
Probability Distribution
A listing of all the outcomes of an experiment and the probability associated with each outcome.
Mean of Probability Distribution = E(X) = μ = ∑X P(X)
Variance of Probability Distribution = σ² = ∑X² P(X) − (∑X P(X))²
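The two formulas can be applied to a small distribution. The example below assumes X is the number of heads in two tosses of a fair coin, so P(0) = 1/4, P(1) = 1/2 and P(2) = 1/4.

```python
from fractions import Fraction

# Assumed distribution: number of heads in two tosses of a fair coin.
dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

total = sum(dist.values())                        # probabilities must sum to 1

mean = sum(x * p for x, p in dist.items())        # sum of X P(X) = 1
mean_of_squares = sum(x ** 2 * p for x, p in dist.items())   # sum of X² P(X) = 3/2
variance = mean_of_squares - mean ** 2            # 3/2 - 1² = 1/2
```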