
Stats Lecture 06. Normal Distribution Data

The document discusses the normal distribution and standard normal distribution. It defines key terms like mean, standard deviation, z-score, and properties of the normal curve. The normal distribution is a theoretical model that is bell-shaped and symmetrical. It is defined by its mean and standard deviation. The standard normal distribution transforms empirical distributions to have a mean of 0 and standard deviation of 1. The document provides tables and graphs to illustrate areas under the normal curve and finding probabilities based on z-scores.

Normal Distribution

Shair Muhammad Hazara


MSPH (Health Services Academy, NIH, Islamabad)
MSBE (Dow University of Health Sciences Karachi)
BSN (PRN) The Aga Khan University, Karachi
E-mail address: hazara_27@Hotmail.com
Objectives

• By the end of this session the students should be able to:
– Understand the concept of the normal distribution & standard normal distribution.
– Differentiate between the sample mean and the population mean.
What is the Normal Distribution (Curve)?

• It’s a theoretical model. The normal distribution plays a very important role in statistical inference.
• A frequency polygon or histogram that is unimodal, smooth, and symmetrical (no empirical distribution has a shape that perfectly matches this ideal model).
• Since the distribution is unimodal and symmetrical, it is bell-shaped.
The Nature of the Normal Distribution:
Properties of the Normal Distribution

• Unimodal
– One mode
• Symmetrical
– Left and right halves are mirror images
• Bell-shaped
– With maximum height at the mean, median, and mode
• Continuous
– There is a value of Y for every value of X
• Asymptotic
– The farther the curve goes from the mean, the closer it gets to the X axis, but it never touches it
History of the Normal Curve

• The scores of many variables are normally distributed
– Normal Distribution
– Gaussian Distribution

Sir Francis Galton (1822-1911)
Carl Friedrich Gauss (1777-1855)
Normal Distribution & its Properties

• It was first discovered by De Moivre (1733). It is also called the Gaussian distribution, after another mathematician (Gauss).
• Many real-life observations follow the normal distribution (or are very close to being normally distributed).
• Two parameters define the normal distribution: the mean (µ) and the standard deviation (σ).
Many Normal Distributions

There are an infinite number of normal distributions.

By varying the parameters µ and σ, we obtain different normal distributions.
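To see how changing µ and σ changes the curve, here is a minimal Python sketch. It assumes the standard normal density formula, which the slides do not write out, and the three (µ, σ) pairs are made up purely for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Illustrative (mu, sigma) pairs; each pair defines a different normal curve.
for mu, sigma in [(0, 1), (0, 2), (3, 1)]:
    peak = normal_pdf(mu, mu, sigma)  # the curve is highest at its mean
    print(f"mu={mu}, sigma={sigma}: peak height {peak:.4f}")
```

Changing µ shifts the curve along the X axis without changing its shape, while a larger σ gives a lower, wider curve.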
The Normal Distribution:
The Most Important One in Statistics

• It’s important because…
– Many variables have approximately normal distributions.
– It’s used to approximate many discrete distributions.
– Many statistical methods use the normal distribution even when the data are not bell-shaped.
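Theoretical Normal Distribution

• ±1 s = about 68%
• ±2 s = about 95%
• ±3 s = about 99.7%

[Figure: normal curve over a horizontal axis from -5 to 5 standard deviations; 68.26% of the area lies within ±1 s, 95.44% within ±2 s, and 99.72% within ±3 s.]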
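The percentages above can be checked numerically. The sketch below uses Python's `math.erf`; the helper name `area_within` is an assumption introduced here, not something from the slides. The results may differ from the slide's figures in the last digit, presumably because the slide values come from a rounded table.

```python
import math

def area_within(k):
    """Area under a normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Roughly 0.6827, 0.9545 and 0.9973 for k = 1, 2, 3.
    print(f"within ±{k} SD: {area_within(k):.4f}")
```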
Finding Probabilities

Probability is the area under the curve!  P(c ≤ X ≤ d) = ?

[Figure: density curve f(X) plotted against X, with the points c and d marked on the X axis.]
Standard Normal Distribution
The Z ...

1. The Z-score for an observation is the number of standard deviations that it falls from the mean.

2. It expresses this distance in a standardized way, specifically in standard deviation units.

3. It re-scales empirical distributions to have a mean = 0 and a standard deviation = 1.
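As a rough illustration of the re-scaling in point 3, the sketch below standardizes a small made-up data set using Python's `statistics` module. The data values and the helper name `standardize` are assumptions for illustration only.

```python
import statistics

def standardize(values):
    """Convert raw values to z-scores using the data's own mean and SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD, so the z-scores have SD exactly 1
    return [(v - mean) / sd for v in values]

scores = [52, 61, 47, 70, 65, 58]  # made-up data for illustration
z_scores = standardize(scores)

# After standardizing, the mean is 0 and the SD is 1 (up to floating-point error).
print(round(statistics.fmean(z_scores), 10), round(statistics.pstdev(z_scores), 10))
```

Standardizing changes only the location and scale of the data. It does not change the shape of the distribution.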
Standard Normal Distribution

• ±1 z = about 68%
• ±2 z = about 95%
• ±3 z = about 99.7%

[Figure: standard normal curve over a z axis from -5 to 5; 68.26% of the area lies within ±1 z, 95.44% within ±2 z, and 99.72% within ±3 z.]
Z-Score

• For each fixed number z, the probability within z standard deviations of the mean is the area under the normal curve between µ - zσ and µ + zσ.

Z-Score
• For z = 1:
68% of the area (probability) of a normal distribution falls between µ - 1σ and µ + 1σ.

Z-Score
• For z = 2:
95% of the area (probability) of a normal distribution falls between µ - 2σ and µ + 2σ.

Z-Score
• For z = 3:
99.7% of the area (probability) of a normal distribution falls between µ - 3σ and µ + 3σ.
Area between 0 and z

z   0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
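The entries of this table can be regenerated rather than looked up. The sketch below assumes the table gives the area between 0 and z under the standard normal curve, as its title says; the function name `area_0_to_z` is introduced here for illustration.

```python
import math

def area_0_to_z(z):
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return 0.5 * math.erf(z / math.sqrt(2))

# Rebuild the z = 1.4 row: second decimal places 0.00 through 0.09.
row = [round(area_0_to_z(1.4 + j / 100), 4) for j in range(10)]
print(row)
```

The third entry of the printed row corresponds to z = 1.42 and should match the table value 0.4222.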
Find the area under the curve to the left of 1.42

The "area under the curve" is the shaded portion, and the question asks for the area to the "left" of 1.42, so everything to the left of 1.42 should be shaded.
P(Z < 1.42)

When the z-score of 1.42 is looked up in the table of areas between 0 and z, the value returned is 0.4222. Since we want the whole area to the left, we add the 0.5 that lies below z = 0: 0.5 + 0.4222 = 0.9222.
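Find the area under the curve, P(Z > -0.42)
The -0.42 is a single value, so it is the z-score looked up in the table and is represented by the vertical line. Z is a variable, meaning it can take on many values, and corresponds to the shaded area. So another way of looking at this is "the shaded area covers all values of z greater than -0.42".

By symmetry, the area between -0.42 and 0 equals the table value for z = 0.42, which is 0.1628. Adding the 0.5 that lies above z = 0 gives P(Z > -0.42) = 0.1628 + 0.5 = 0.6628.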
Find the percent of the data between -1.75 and 2.05
Here there are two z-values, but each of those is a single number, so they are represented by vertical lines on the graph. The "data" is the shaded portion of the graph, so the shaded portion is between z = -1.75 and z = 2.05.

Area between -1.75 and 2.05: from the table, the area between 0 and 1.75 is 0.4599 and the area between 0 and 2.05 is 0.4798, so the total is 0.4599 + 0.4798 = 0.9397, or about 94% of the data.
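The three table exercises above (left of 1.42, right of -0.42, and between -1.75 and 2.05) can be cross-checked without a table. This is a sketch using `math.erf`; the helper name `phi` for the cumulative area is an assumption made here.

```python
import math

def phi(z):
    """Cumulative area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

left_of_1_42 = phi(1.42)               # P(Z < 1.42)
right_of_minus_0_42 = 1 - phi(-0.42)   # P(Z > -0.42)
between = phi(2.05) - phi(-1.75)       # P(-1.75 < Z < 2.05)

print(f"{left_of_1_42:.4f} {right_of_minus_0_42:.4f} {between:.3f}")
```

The results should be close to 0.9222, 0.6628, and 0.94. Small differences from table-based answers come from the table being rounded to four decimal places.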
Z-Scores and the Standard Normal Distribution

• When a random variable has a normal distribution and its values are converted to z-scores by subtracting the mean and dividing by the standard deviation, the z-scores have the standard normal distribution.
Z-Score for a Value of a Random Variable

• The z-score for a value x of a random variable is the number of standard deviations that x falls from the mean µ.
• It is calculated as:

z = (x - µ) / σ
Steps for calculating probability using the Z-score

1. Sketch a bell-shaped curve and indicate the mean and the value(s) of x of interest.
2. Shade the area (which represents the probability) you are interested in obtaining.
3. Use the Z-score formula to calculate the Z-value(s) for the value(s) of x of interest.
4. Look up the Z-values in the table to find the corresponding area(s). Because the table lists only areas between 0 and z, use the symmetry of the curve for negative Z-values.
5. Calculate the area.
6. Interpret the result.
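Steps 3 to 5 can be sketched as a small Python function. The sketching and interpretation steps remain manual. The function names and the check values (mean 50, standard deviation 10) are hypothetical, chosen only to illustrate the calculation.

```python
import math

def phi(z):
    """Cumulative area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def normal_probability(low, high, mu, sigma):
    """P(low < X < high) for X normal with mean mu and standard deviation sigma."""
    z_low = (low - mu) / sigma       # step 3: convert each x to a z-score
    z_high = (high - mu) / sigma
    return phi(z_high) - phi(z_low)  # steps 4-5: find the areas and combine them

# Hypothetical check: X within one SD of the mean, with mu = 50 and sigma = 10.
print(round(normal_probability(40, 60, 50, 10), 4))  # about 0.6827
```

Using the cumulative area `phi` instead of a table of areas between 0 and z means negative z-values need no separate symmetry step.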
Example

• Scores on the verbal or math portion of the SAT are approximately normally distributed with mean µ = 500 and standard deviation σ = 100. The scores range from 200 to 800.
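Theoretical Normal Distribution

[Figure: normal curve with the z axis from -5 to 5 shown above the SAT score axis 200, 300, 400, 500, 600, 700, 800 (z = -3 to 3); 68.26% of the area lies within ±1, 95.44% within ±2, and 99.72% within ±3 standard deviations.]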
Example

• If your verbal SAT (Scholastic Aptitude Test, a standardized test used in the USA) score was x = 700, how many standard deviations from the mean was it? Find the z-score for x = 700.
Example

• If your verbal SAT score was x = 700, how many standard deviations from the mean was it? Find the z-score for x = 700.

z = (x - µ) / σ = (700 - 500) / 100 = +2
Now find out

• What percentage of SAT scores was higher than yours?
Area between 0 and z

z   0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990

Z = 2
Area = 0.4772 (area between 0 and z = 2)

0.5 - 0.4772 = 0.0228, and 0.0228 × 100 = 2.28%
• 2.28% of SAT scores were higher than yours.
Z = 2
Area = 0.4772 (area between 0 and z = 2)
0.5 + 0.4772 = 0.9772, and 0.9772 × 100 = 97.72%
• 97.72% of SAT scores were lower than yours.
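The SAT calculation can be reproduced in a few lines of Python. This is a sketch that uses the mean, standard deviation, and score from the example; `phi` is a helper name introduced here for the cumulative area.

```python
import math

def phi(z):
    """Cumulative area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 500, 100   # SAT mean and SD from the example
x = 700                # your score

z = (x - mu) / sigma
pct_below = 100 * phi(z)
pct_above = 100 - pct_below

print(f"z = {z:+.0f}, below: {pct_below:.2f}%, above: {pct_above:.2f}%")
# prints: z = +2, below: 97.72%, above: 2.28%
```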
Comparing variables with very different observed units of measure

• Example of comparing an SAT score to an American College Testing (ACT) score
Mary’s ACT score is 26.
Jason’s SAT score is 900. Who did better?
• The mean SAT score is 1000 with a standard deviation of 100 SAT points.
• The mean ACT score is 22 with a standard deviation of 2 ACT points.
Let’s find the z-scores

Jason: z = (900 - 1000) / 100 = -1
Mary: z = (26 - 22) / 2 = +2
• From these findings, we gather that Jason’s score is
1 standard deviation below the mean SAT score and
Mary’s score is 2 standard deviations above the
mean ACT score.
• Therefore, Mary’s score is relatively better.
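The comparison of Jason and Mary can be written as a short Python sketch. The function name `z_score` is an assumption; the means, standard deviations, and scores are the ones given in the example.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

jason = z_score(900, 1000, 100)  # SAT: mean 1000, SD 100
mary = z_score(26, 22, 2)        # ACT: mean 22, SD 2

better = "Mary" if mary > jason else "Jason"
print(jason, mary, better)  # prints: -1.0 2.0 Mary
```

Because z-scores are unit-free, the two scores can be compared directly even though SAT and ACT points are on very different scales.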
Central Limit Theorem

• Even when the population is not normally distributed, the sampling distribution of the sample mean will approximate a normal distribution.

• The approximation becomes better as the sample size gets larger.
Sampling Distribution of Sample Means (N=2)
This distribution is only roughly normal.

Sampling Distribution of Sample Means (N=3)
This distribution is closer to normal than the one above.

Sampling Distribution of Sample Means (N=4)
This distribution is closer still to normal.

Sampling Distribution of Sample Means (N=5)
This distribution is very close to normal.

Sampling Distribution of Sample Means (N=6)
This distribution is even closer to normal.
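The trend in these sampling distributions can be imitated with a small simulation. The sketch below is not the data behind the slides' figures. It assumes a skewed (exponential) population and uses skewness as a rough measure of how far the distribution of sample means is from the symmetric normal shape. The function names, seed, and sample sizes are illustrative choices.

```python
import random
import statistics

def sample_means(n, reps, rng):
    """Means of `reps` samples of size n drawn from a skewed (exponential) population."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

def skewness(values):
    """Moment-based skewness; 0 for a perfectly symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (2, 6, 30):
    means = sample_means(n, 5000, rng)
    print(f"N={n}: skewness of sample means = {skewness(means):.2f}")
```

The skewness should shrink toward 0 as N grows, which is the numerical counterpart of the histograms looking more and more normal.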
Another example

• Suppose hemoglobin level in adults is approximately normally distributed with mean 12.7 and standard deviation 2.8.

– A) What proportion of adults would you expect to have an HB level between 10 & 13?
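A) Suppose hemoglobin level in adults is approximately normally distributed with mean 12.7 and standard deviation 2.8

[Figure: normal curve of hemoglobin level with the horizontal axis marked at 4.3, 7.1, 9.9, 12.7, 15.5, 18.3, 21.1, that is, the mean 12.7 and 1, 2, and 3 standard deviations of 2.8 on either side.]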


Example

• A) What proportion of adults would you expect to have an HB level between 10 & 13?

For x = 10: z = (x - µ) / σ = (10 - 12.7) / 2.8 = -0.96
For x = 13: z = (x - µ) / σ = (13 - 12.7) / 2.8 = 0.107, taken as 0.10 for the table lookup
Area between 0 and z

z   0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
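1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441

Example

• The area between z = -0.96 and 0 is 0.3315; the area between 0 and z = 0.10 is 0.0398.

• 0.3315 + 0.0398 = 0.3713, and 0.3713 × 100 = 37.13%

• About 37.13% of adults would be expected to have a hemoglobin level between 10 & 13.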
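As a cross-check on the hemoglobin example, the sketch below computes the same proportion without rounding the z-scores. The helper name `phi` is an assumption; the mean 12.7 and standard deviation 2.8 come from the example.

```python
import math

def phi(z):
    """Cumulative area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 12.7, 2.8          # hemoglobin mean and SD from the example
z_low = (10 - mu) / sigma      # about -0.96
z_high = (13 - mu) / sigma     # about 0.107
proportion = phi(z_high) - phi(z_low)

print(f"z_low = {z_low:.2f}, z_high = {z_high:.2f}, proportion = {proportion:.3f}")
```

The unrounded calculation gives a proportion of roughly 0.375, slightly higher than the table-based 0.3713. The gap comes mainly from using z = 0.10 in the table lookup for a z-score that is closer to 0.11.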
Any Question ??

Thank You!!!!
