4 - The Shape of The Distribution


 Symmetry

 Skewness
 Kurtosis

THE SHAPE OF THE DISTRIBUTION
Prepared by:
Danah A. Mama, RPm
SHAPE OF THE DISTRIBUTION
• Simply refers to the characteristics of the frequency distribution of the scores (e.g., as seen in a histogram).
IS IT SYMMETRICAL OR SKEWED?
• Symmetrical: can be cut down the center to form two mirror images
• In reality, we will never get a perfectly symmetrical distribution, but we would like our data to be as close to symmetrical as possible
IS IT SYMMETRICAL OR SKEWED?
Normal Distribution
• Has a single peak (center) and two tails that extend out equally, forming a bell shape or bell curve
• Perfectly symmetrical, bell-shaped (normal) curve: identical mean, median, and mode
IS IT SYMMETRICAL OR SKEWED?
Bimodal Distribution
• Two peaks that lie roughly symmetrically on either side of the center point
• Not a desirable characteristic, and difficult to detect numerically
SKEWNESS
• One of the two tails of the distribution is disproportionately longer than the other.
• This property can distort the averages we use in our analyses and make them inaccurate representations of our data, which can lead to misleading conclusions.
• To correctly determine the skewness of the distribution, look at which tail is longer.
SKEWNESS
• Positively skewed
• The longer tail extends to the right: more scores lie to the right of the mode than to the left
• The mean is typically greater than the median, and the median is greater than the mode.
SKEWNESS
• Negatively skewed
• The longer tail extends to the left: more scores lie to the left of the mode than to the right
• The mean is typically less than the median, and the median is less than the mode.
MAGNITUDE OF SKEWNESS

Skewness                              Magnitude
Approximately symmetrical (slight)    -0.5 to 0.5
Moderately skewed                     -1 to -0.5 (left); 0.5 to 1 (right)
Significantly asymmetrical (strong)   Less than -1 (left); greater than 1 (right)
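The cut-offs listed above can be turned into a small lookup, sketched here in Python. The function name `classify_skewness` is hypothetical, and treating a value that falls exactly on a boundary (0.5 or 1) as belonging to the milder category is an assumption, since the ranges in the table touch at those points.

```python
def classify_skewness(skew: float) -> str:
    """Label a skewness coefficient using the cut-offs listed above.

    Assumption: a value exactly on a boundary (0.5 or 1) is placed
    in the milder category.
    """
    magnitude = abs(skew)
    if magnitude <= 0.5:
        return "approximately symmetrical"
    direction = "right" if skew > 0 else "left"
    if magnitude <= 1.0:
        return f"moderately skewed ({direction})"
    return f"significantly asymmetrical ({direction})"


for value in (0.125, -0.8, 1.6):
    print(value, "->", classify_skewness(value))
```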
Descriptive Statistics
                           Sleep Hours
Valid                      93
Missing                    0
Mean                       6.452
Std. Deviation             1.212
Skewness                   0.125
Std. Error of Skewness     0.250
Kurtosis                   -0.444
Std. Error of Kurtosis     0.495
Shapiro-Wilk               0.961
P-value of Shapiro-Wilk    0.007
Minimum                    4.000
Maximum                    9.500
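One common way to read output like this, offered here as a rule of thumb rather than something the slides state, is to divide each statistic by its standard error and compare the result with ±1.96. The sketch below applies that idea to the Sleep Hours values and also checks the Shapiro-Wilk p-value against the conventional 0.05 level. The 1.96 and 0.05 thresholds are assumptions.

```python
# Values copied from the Sleep Hours output above.
skewness, se_skewness = 0.125, 0.250
kurtosis, se_kurtosis = -0.444, 0.495
shapiro_p = 0.007

# Assumed rule of thumb: |statistic / standard error| > 1.96
# suggests a notable departure from normality.
z_skew = skewness / se_skewness
z_kurt = kurtosis / se_kurtosis
skew_flagged = abs(z_skew) > 1.96
kurt_flagged = abs(z_kurt) > 1.96

# Assumed convention: p < 0.05 on Shapiro-Wilk rejects normality.
shapiro_rejects = shapiro_p < 0.05

print(f"z for skewness: {z_skew:.3f}, flagged: {skew_flagged}")
print(f"z for kurtosis: {z_kurt:.3f}, flagged: {kurt_flagged}")
print(f"Shapiro-Wilk rejects normality: {shapiro_rejects}")
```

Here the skewness and kurtosis ratios both fall inside ±1.96 while the Shapiro-Wilk test rejects normality, which shows why it helps to look at the histogram as well as the numbers.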
KURTOSIS
• “Tailedness” or “peakedness”
• Indicates how steep or flat a curve is compared with the normal (bell-shaped) curve.
• It provides information about the shape of a distribution, specifically the distribution's tails.
• Kurtosis is important in data analysis as it helps identify the presence of outliers and the overall shape of the distribution.
KURTOSIS
Mesokurtic Distribution (Normal)
• A mesokurtic distribution has a kurtosis value close to 0 (on the excess-kurtosis scale, where the normal curve scores 0).
• It has a bell-shaped curve similar to the normal distribution.
• Example: The height of adult males in a population.
Leptokurtic Distribution (Steep)
• A leptokurtic distribution has a kurtosis value greater than 0.
• It has a higher, sharper peak and heavier tails compared to a normal distribution.
• Example: Daily returns on financial assets, which tend to show heavy tails.
• Implications: Leptokurtic distributions have a higher probability of extreme values or outliers.
Platykurtic Distribution (Shallow)
• A platykurtic distribution has a kurtosis value less than 0.
• It has a lower, flatter peak and thinner tails compared to a normal distribution.
• Example: The distribution of test scores after a curve is applied.
• Implications: Platykurtic distributions have a lower probability of extreme values or outliers.
In computer analysis…
■ A positive value of kurtosis means that the curve is steep, or leptokurtic (Kurtosis > 0)
■ A zero value of kurtosis means that the curve is mesokurtic, like the normal curve (Kurtosis = 0)
■ A negative value of kurtosis means that the curve is flat, or platykurtic (Kurtosis < 0)
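To make the sign convention concrete, here is a minimal Python sketch that computes skewness and excess kurtosis from their moment-based definitions. The function names and the two small datasets are hypothetical. Statistical packages often apply small-sample corrections, so their output may differ slightly from these uncorrected values.

```python
from statistics import fmean


def central_moment(scores, k):
    """k-th central moment: the mean of (score - mean) ** k."""
    mean = fmean(scores)
    return fmean([(x - mean) ** k for x in scores])


def skewness(scores):
    """Moment-based skewness: 0 for a symmetrical distribution."""
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5


def excess_kurtosis(scores):
    """Moment-based kurtosis minus 3, so the normal curve scores 0."""
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3


# Hypothetical illustration data, not taken from the slides.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [2, 3, 3, 3, 4, 4, 5, 6, 9, 21]

for name, data in (("symmetric", symmetric), ("right_skewed", right_skewed)):
    print(name, round(skewness(data), 3), round(excess_kurtosis(data), 3))
```

The evenly spread `symmetric` scores give zero skewness and a negative (flat) kurtosis, while the long right tail in `right_skewed` gives positive skewness and a positive (steep, heavy-tailed) kurtosis.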
Conclusion
 Kurtosis is a valuable measure of the shape
of a distribution, providing information about
its peakedness and tailedness.
 Understanding kurtosis is important in data
analysis, as it helps identify the presence of
outliers and compare the distribution shapes
between datasets.
 Kurtosis has various applications in fields
such as finance, psychology, and quality
control.
 MEAN
 MEDIAN
 MODE
MEASURES OF CENTRAL TENDENCY
Prepared by:
Danah A. Mama, RPm
CENTRAL TENDENCY
• What are the most typical and likely scores in the distribution of measurements?
• Central tendency: the statistical measure that identifies a single value as representative of an entire dataset.
• It helps summarize a set of data by providing a central point around which values tend to cluster.
Importance of Central Tendency
 Data Summarization: Simplifies large
datasets.
 Comparison: Helps in comparing
different datasets.
 Decision Making: Aids in making
informed decisions based on data.
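ARITHMETIC MEAN
• Is the average of a dataset.
• Sum of all of the scores in the distribution divided by the number of scores
• Formula: $\bar{X} = \frac{\sum X}{N}$, where $\sum X$ is the sum of all the scores and $N$ is the number of scores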
MEDIAN
• The middle score of a set when the scores are ordered from the smallest to the largest.
• Equivalent to the 50th percentile
• When there is an odd number of scores, the median is simply the middle score
• When there is an even number of scores, the median is the mean of the two middle scores
• When several scores share the same value, each occurrence of that value is counted.
MODE
• The most frequently occurring value in the dataset
• The only measure that we can use on qualitative or categorical data as well as numerical score data
• A dataset can have one mode, more than one mode, or no mode at all.
• A distribution with two modes is bimodal; one with several modes is multimodal
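The three measures can be tried out with Python's standard `statistics` module. The datasets below are hypothetical and chosen to cover the cases described above: an odd and an even number of scores, repeated values, and categorical data.

```python
from statistics import mean, median, multimode

# Hypothetical scores, not taken from the slides.
odd_scores = [7, 3, 5, 3, 9]        # ordered: 3, 3, 5, 7, 9
even_scores = [7, 3, 5, 3, 9, 6]    # ordered: 3, 3, 5, 6, 7, 9

print(mean(odd_scores))     # 27 / 5 = 5.4
print(median(odd_scores))   # middle score of five: 5
print(median(even_scores))  # mean of the two middle scores: (5 + 6) / 2 = 5.5

# Repeated values are counted each time they appear.
print(multimode(odd_scores))  # [3]

# The mode also works on categorical data, and there can be more than one.
symptoms = ["anxiety", "low mood", "anxiety", "insomnia", "low mood"]
print(multimode(symptoms))  # ['anxiety', 'low mood']

# When every value occurs once, multimode returns all of them;
# in the terms used above, such a dataset has no mode.
print(multimode([1, 2, 3]))  # [1, 2, 3]
```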
Identifying outliers statistically
• Outliers: atypical scores that are unusually large or small
• Distort any trend in the data
• Put the analysis at risk of erroneous conclusions
• To find them, inspect tables of frequencies or scatterplots
• Or calculate the interquartile range (IQR)
Calculate Interquartile Range
1. Arrange the scores from smallest to largest.
2. Delete the lowest 25% of the scores and the highest 25% of the scores.
3. IQR = Largest remaining score – Smallest remaining score
4. Multiply the IQR by 1.5.

Note:
• Outliers among the low scores are defined as any score which is smaller than the smallest score in the interquartile range minus 1.5 x IQR
• Outliers among the high scores are defined as any score which is bigger than the largest score in the interquartile range plus 1.5 x IQR
Calculate Interquartile Range
1. Scores: 120, 115, 65, 140, 122, 142, 125, 135, 122, 138, 144, 118
2. Ordered: 65, 115, 118, 120, 122, 122, 125, 135, 138, 140, 142, 144
3. Middle 50% (lowest 3 and highest 3 deleted): 120, 122, 122, 125, 135, 138
4. IQR = 138 – 120 = 18, and 18 x 1.5 = 27
5. 120 – 27 = 93
6. 138 + 27 = 165
7. Scores NOT between 93 and 165 are outliers!
8. Thus, 65 is an outlier.
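The procedure can be sketched in Python as follows. The function name `iqr_outliers` is hypothetical, and rounding down when 25% of the scores is not a whole number is an assumption, since the slides only show a case with 12 scores. Note that this trimming approach follows the slides and can give slightly different cut-offs from the quartile methods built into statistical software.

```python
def iqr_outliers(scores, multiplier=1.5):
    """Find outliers with the trimming procedure described above.

    Assumption: when 25% of the scores is not a whole number,
    the number of scores deleted from each end is rounded down.
    """
    ordered = sorted(scores)
    trim = len(ordered) // 4                    # lowest and highest 25%
    middle = ordered[trim:len(ordered) - trim]  # middle 50% of the scores
    iqr = middle[-1] - middle[0]
    lower = middle[0] - multiplier * iqr
    upper = middle[-1] + multiplier * iqr
    outliers = [x for x in ordered if x < lower or x > upper]
    return iqr, lower, upper, outliers


scores = [120, 115, 65, 140, 122, 142, 125, 135, 122, 138, 144, 118]

# Ordinary outliers: IQR multiplied by 1.5.
iqr, lower, upper, outliers = iqr_outliers(scores)
print(iqr, lower, upper, outliers)  # 18 93.0 165.0 [65]

# Extreme outliers: IQR multiplied by 3.
_, lower3, upper3, extreme = iqr_outliers(scores, multiplier=3)
print(lower3, upper3, extreme)  # 66 192 [65]
```

For these scores the IQR is 138 – 120 = 18, so the 1.5 x IQR limits are 93 and 165, and 65 is flagged under both the ordinary and the extreme criterion.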
Calculate Interquartile Range
1. Extreme outliers are identified in much
the same way but the interquartile range
is multiplied by 3 (rather than 1.5)
2. It would be usual practice to delete
outliers from your data.
3. You might also wish to compare the
outcome of the analysis with the
complete data and with outliers
excluded.
COMPARING MEASURES OF CENTRAL TENDENCY
• Differences among the measures occur with skewed distributions
• Pattern:
• Mode will remain at the highest point in the distribution
• Median will be pulled slightly out into the skewed tail
• Mean will be pulled the farthest out
• So, the mean is more sensitive to skew than the median or mode, and in cases of extreme skew, the mean may no longer be appropriate to use.
• In the media, the median is usually reported to summarize the center of skewed distributions
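A small, hypothetical dataset with one very high score illustrates this pattern. The sketch below compares the three measures with and without that score; the data are made up for illustration only.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores: one value (21) sits far out
# in the right tail.
skewed = [2, 3, 3, 3, 4, 4, 5, 6, 9, 21]
trimmed = [x for x in skewed if x != 21]

print(mode(skewed), median(skewed), mean(skewed))  # 3 4.0 6
print(mode(trimmed), median(trimmed), round(mean(trimmed), 2))  # 3 4 4.33

# The mode stays at the peak, the median sits a little further into
# the tail, and the mean is pulled the farthest out.
```

Removing the single extreme score leaves the mode and median where they were but moves the mean from 6 to about 4.33, which is why the median is usually preferred for skewed data.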
When to Use Each Measure
 Mean: Best for normally distributed data
without outliers.
 Median: Best for skewed distributions or
when outliers are present.
 Mode: Useful for categorical data or to
identify the most common item.
Practical Applications in Psychology
 Personality Assessment
 Mean: Used to determine the average score on
personality tests to assess where an individual falls
compared to the norm.
 Median: Helpful in analyzing skewed distributions of
personality traits, as the median is less affected by
outliers.

 Clinical Psychology
 Mode: Identifies the most common symptoms or disorders
in a patient population, aiding in diagnosis and treatment
planning.
 Mean: Used to track changes in symptom severity over
time, such as the average depression score before and
after therapy.
Practical Applications in Psychology
 Cognitive Psychology
 Median: Helpful in analyzing reaction time
data, which is often skewed, to determine
the typical response speed.
 Mode: Identifies the most frequent types
of errors or strategies used by
participants in cognitive tasks.
Practical Applications in Psychology
 Developmental Psychology
 Median: Useful for determining milestones like the
median age of first steps or first words, as the
median is less influenced by early or late
developers.
 Mode: Identifies the most frequent developmental
stages or behaviors observed in a sample of
children.

 Social Psychology
 Mean: Used to calculate the average number of
friends or social interactions for individuals in a
study.
 Mode: Identifies the most common types of social
relationships or interactions observed.
Conclusion
 Measures of central tendency provide
valuable insights into datasets.
 Understanding when and how to use
mean, median, and mode is essential
for effective data analysis.
