Normal Distribution & CLT
Normal Distribution
The normal distribution is a continuous distribution described by a bell-shaped curve. It approximates
many natural phenomena, such as height, weight and GRE scores.
As the mean µ changes, the location of the distribution shifts; as the standard deviation σ changes, the
distribution becomes narrower (smaller σ) or wider (larger σ).
- Distribution is symmetric
- Mean, median and mode are all equal
- Range of x is unbounded
- Empirical rule applies: about 68%, 95% and 99.7% of values lie within 1, 2 and 3 standard deviations of the mean.
Source: https://upload.wikimedia.org/wikipedia/commons/thumb/a/a9/Empirical_Rule.PNG/350px-Empirical_Rule.PNG
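The empirical rule percentages can be checked numerically. The following is a minimal sketch using Python's standard library; the helper name prob_within is an illustrative choice, not part of the original notes.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def prob_within(k):
    """Probability that a normal observation lies within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Coverage for 1, 2 and 3 standard deviations.
coverage = {k: prob_within(k) for k in (1, 2, 3)}
for k, p in coverage.items():
    print(f"within {k} standard deviation(s): {p:.4f}")
```

Because every normal distribution can be standardized, the same coverage applies for any µ and σ.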
Standardized Values (z-scores): A z-score identifies the relative distance of an observation from the mean
and is independent of the units of measurement.
The z-score for the ith observation in a data set is calculated as follows:
z_i = (x_i - x̄) / s
where x̄ is the sample mean and s is the sample standard deviation.
A negative z-score indicates that x_i lies to the left of the mean, and a positive z-score indicates that
x_i lies to the right of the mean. Since the formula divides by the standard deviation, s, the z-score
represents the distance from the mean in units of standard deviations.
A z-score of 1.0 means that the observation is one standard deviation to the right of the mean.
A z-score of -1.5 means that the observation is 1.5 standard deviations to the left of the mean.
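As a small illustration of the calculation, the sketch below standardizes a made-up data set. The values in heights_cm and the function name z_scores are hypothetical and chosen only for demonstration.

```python
from statistics import mean, stdev

def z_scores(data):
    """Return the z-score of each observation: (x - sample mean) / sample standard deviation."""
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    return [(x - x_bar) / s for x in data]

# Hypothetical heights in centimetres, invented for this example.
heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0]
scores = z_scores(heights_cm)

for x, z in zip(heights_cm, scores):
    side = "left of" if z < 0 else "right of" if z > 0 else "at"
    print(f"x = {x:.0f}: z = {z:+.2f} ({side} the mean)")
```

The z-scores would be the same if the heights were recorded in inches, which is what "independent of the units of measurement" means in practice.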
PREPARED BY: FAIRUZ CHOWDHURY
Problems:
Otis Elevator reported that the number of hours lost per week last year due to employees' illness was
approximately normally distributed with a mean of 60 hours and a standard deviation of 15 hours. For a
given week, determine the following probabilities.
For 2014, the Dow Jones 30 stock index companies had average earnings per share of $5.40 with a
standard deviation of $3.52.
a. What is the probability that a company will have earnings per share that exceed $12?
b. What is the probability that earnings per share will be negative?
c. One of the companies in the Dow Jones 30 is stating that their earnings per share are in the 90th
percentile among their peers. What would you expect their earnings per share to be using a normal
distribution?
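One possible way to work parts a to c is sketched below, assuming earnings per share follow a normal distribution with mean 5.40 and standard deviation 3.52, as the problem suggests. The variable names are illustrative only.

```python
from statistics import NormalDist

# Assumed model: earnings per share ~ Normal(mean = 5.40, sd = 3.52).
eps = NormalDist(mu=5.40, sigma=3.52)

# a. P(EPS > 12): area to the right of 12.
p_above_12 = 1.0 - eps.cdf(12.0)

# b. P(EPS < 0): area to the left of 0.
p_negative = eps.cdf(0.0)

# c. 90th percentile: the value with 90% of the area to its left.
eps_90th = eps.inv_cdf(0.90)

print(f"a. P(EPS > 12) = {p_above_12:.4f}")
print(f"b. P(EPS < 0)  = {p_negative:.4f}")
print(f"c. 90th percentile = {eps_90th:.2f}")
```

Under these assumptions the answers come out to roughly 0.03, 0.06 and $9.91. The same results follow from z-scores and a standard normal table, for example z = (12 - 5.40) / 3.52 for part a.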
Example:
We take random numbers from this interval and draw 25 samples, each of size 10. Looking at the means of
the 25 samples, we find that the statistic varies from sample to sample, and the histogram of the sample
means shows this variation as well. The reason behind this variation is sampling error.
Sampling error occurs because a sample is only a subset of the total population; we can minimize it, but
we can't avoid it. Finally, we look at the average and standard deviation of the sample means, which
come to 5.011 and 0.8166 respectively.
We repeat the exercise with a larger sample size of 25, again using 25 samples. While the mean of the
sample means is still close to 5, their standard deviation has fallen to 0.45. As we increase the sample
size to 100 and then 500, the mean stays close to 5 while the standard deviation keeps shrinking. This is
consistent with the standard error of the mean, σ/√n, which decreases as the sample size n grows.
Furthermore, the comparative histograms show the sample means clustering more tightly around the
expected value as the sample size increases.
Thus, we can conclude that the distribution of sample means appears to take the shape of a normal
distribution for larger samples, which is what the Central Limit Theorem (CLT) describes.
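The experiment described above can be imitated with a short simulation. This is only a sketch: the notes do not state the interval used, so a uniform distribution on 0 to 10 (expected value 5) is assumed here, and the function name sample_means and the seed are arbitrary choices. The numbers it prints will not match the 5.011 and 0.8166 reported above.

```python
import random
from statistics import mean, stdev

def sample_means(sample_size, num_samples=25, low=0.0, high=10.0, rng=random):
    """Draw num_samples samples of the given size and return the mean of each sample."""
    return [
        mean([rng.uniform(low, high) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

rng = random.Random(42)  # fixed seed so the run is repeatable
results = {}
for n in (10, 25, 100, 500):
    means = sample_means(n, rng=rng)
    # Store (average of the sample means, standard deviation of the sample means).
    results[n] = (mean(means), stdev(means))
    print(f"n = {n:>3}: mean of sample means = {results[n][0]:.3f}, sd = {results[n][1]:.3f}")
```

With only 25 samples per sample size the estimates are noisy, but the standard deviation of the sample means should generally fall as n grows while their average stays near 5.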
Furthermore, the theorem states that if the population is normally distributed, then the sampling
distribution of the mean will be normal for any sample size.
This allows us to use the idea of calculating probabilities for normal distribution to draw conclusions about
sample means.
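To illustrate, the sketch below computes a probability about a sample mean. It borrows the mean of 60 hours and standard deviation of 15 hours from the Otis Elevator problem; the sample size of 9 weeks and the threshold of 65 hours are hypothetical values introduced only for this example.

```python
from math import sqrt
from statistics import NormalDist

population_mean = 60.0  # hours lost per week (from the Otis Elevator problem)
population_sd = 15.0    # hours (from the Otis Elevator problem)
sample_size = 9         # hypothetical: number of weeks sampled
threshold = 65.0        # hypothetical: sample-mean value of interest

# The sampling distribution of the mean has standard deviation sigma / sqrt(n).
standard_error = population_sd / sqrt(sample_size)
sampling_dist = NormalDist(mu=population_mean, sigma=standard_error)

# Probability that the mean of the sampled weeks exceeds the threshold.
p_mean_above = 1.0 - sampling_dist.cdf(threshold)
print(f"standard error = {standard_error:.2f}")
print(f"P(sample mean > {threshold:.0f}) = {p_mean_above:.4f}")
```

Here the standard error is 5 hours, so 65 hours is one standard error above the mean and the probability is about 0.16, compared with about 0.37 for a single week exceeding 65 hours.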