
2 Frequency Distribution

PPT of Statistical Analysis for MPhil in Psychology
Copyright
© © All Rights Reserved

Frequency DISTRIBUTION

Instructor: Dr. Irum Naqvi


• After collecting data, the first task for a researcher is to organize and simplify the data so that it is possible to get a general overview of the results.
• This is the goal of descriptive statistical techniques.
• One method for simplifying and organizing data is to construct a frequency distribution.

• A frequency distribution is an organized tabulation showing exactly how many individuals are located in each category on the scale of measurement.
A frequency distribution presents an
Interpreting frequency distributions
Central Location
• Gravitational center → Mean
• Middle value → Median
Spread
• Range and inter-quartile range
• Standard deviation and variance
Shape
• Symmetry
• Modality
• Kurtosis and Skewness
Frequency Distribution Table
A frequency distribution table consists of at least two columns: one listing categories on the scale of measurement (X) and another for frequency (f).
In the X column, values are listed from the highest to lowest, without skipping any.
For the frequency column, tallies are determined for each value (how often each X value occurs in the data set). These tallies are the frequencies for each X value.
The sum of the frequencies should equal N.
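The tallying step can be sketched in a few lines of Python. This is only an illustration: the raw scores below are hypothetical, chosen so that they reproduce the example table used in this deck, and the helper name `frequency_table` is an assumption, not part of the lecture. It assumes whole-number scores.

```python
from collections import Counter

def frequency_table(scores):
    """Return (X, f) rows, listing every X from highest to lowest."""
    counts = Counter(scores)
    # Counter gives 0 for values that never occur, so no X value is skipped.
    return [(x, counts[x]) for x in range(max(scores), min(scores) - 1, -1)]

# Hypothetical raw scores (N = 10) matching the example table.
scores = [5, 4, 4, 3, 3, 3, 2, 2, 2, 1]
table = frequency_table(scores)

print("X  f")
for x, f in table:
    print(f"{x}  {f}")

# The sum of the frequencies should equal N.
assert sum(f for _, f in table) == len(scores)
```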
Calculate ∑X, ∑fX and ∑X² from the table

X     f     X²    fX
5     1     25    5
4     2     16    8
3     3     9     9
2     3     4     6
1     1     1     1
Frequency Distribution Tables
A third column can be used for the proportion (p) for each category: p = f/N. The sum of the p column should equal 1.00.
• A fourth column can display the percentage P(%) of the distribution corresponding to each X value. The percentage is found by multiplying p by 100:
P(%) = (f/N)(100)
The sum of the percentage column is 100%.
X     f     X²    fX
5     1     25    5
4     2     16    8
3     3     9     9
2     3     4     6
1     1     1     1
∑     15    10    55    30   (column totals for X, f, X², fX)

X     f     p = f/N    P(%) = (f/N)(100)
5     1     .10        10
4     2     .20        20
3     3     .30        30
2     3     .30        30
1     1     .10        10
∑     15    10    1.00    100   (column totals)
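The proportion and percentage columns follow directly from f and N. A minimal sketch, assuming the same five-row table; the layout of `rows` is an illustrative choice.

```python
import math

# (X, f) rows from the frequency distribution table.
table = [(5, 1), (4, 2), (3, 3), (2, 3), (1, 1)]
n = sum(f for _, f in table)

rows = []
for x, f in table:
    p = f / n              # proportion: p = f/N
    pct = 100 * f / n      # percentage: P(%) = (f/N)(100)
    rows.append((x, f, p, pct))

print("X  f  p     %")
for x, f, p, pct in rows:
    print(f"{x}  {f}  {p:.2f}  {pct:.0f}")

# The p column should sum to 1.00 and the percentage column to 100.
assert math.isclose(sum(r[2] for r in rows), 1.0)
assert math.isclose(sum(r[3] for r in rows), 100.0)
```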
Regular Frequency Distribution
• When a frequency distribution table lists all of the individual categories (X values) it is called a regular frequency distribution.

Grouped Frequency Distribution

• Sometimes, however, a set of scores covers a wide range of values. In these situations, a list of all the X values would be quite long, too long to be a "simple" presentation of the data.
• To remedy this situation, a grouped frequency distribution table is used.
• In a grouped table, the X column lists groups of scores, called class intervals, rather than individual values.
• These intervals all have the same width, usually a simple number such as 2, 5, 10, and so on.
• Each interval begins with a value that is a multiple of the interval width. The interval width is selected so that the table will have approximately ten intervals.
Procedure of Constructing a Grouped Frequency Distribution
Step 1 Determine the classes. Find the highest
and lowest value. Find the range. Select the
number of classes desired.
Step 2 Find the width or Group Size by
dividing the range by the number of
classes desired and rounding up. Select a
starting point (usually the lowest value or
any convenient number less than the lowest
value); add the width to get the lower limits.
Find the upper class limits. Find the
boundaries.
Step 3 Tally the data.
Step 4 Find the numerical frequencies from
the tallies.
Example: Leaves. Alex measured the lengths of leaves on the oak tree (to the nearest cm):
9,16,13,7,8,4,18,10,17,18,9,12,5,9,9,16,1,8,17,1,
10,5,9,11,15,6,14,9,1,12,5,16,4,16,8,15,14,17
• Let's group them. To get started, put the numbers in order, then find the smallest and largest values in your data.
• Calculate the range (range = largest − smallest).
In order the lengths are:
1,1,1,4,4,5,5,5,6,7,8,8,8,9,9,9,9,9,9,10,10,11,12,12,
13,14,14,15,15,16,16,16,16,17,17,17,18,18
• The smallest value (the "minimum") is 1 cm
• The largest value (the "maximum") is 18 cm
• The range is 18 − 1 = 17 cm
Group Size
• Now calculate an approximate group size by dividing the range by how many groups you would like.
• Then round that group size up to some simple value (like 2 instead of 1.83, or 5 instead of 4.26).
• Let us say we want about 5 groups. Divide the range by 5: 17/5 = 3.4. Then round that up to 4.
Start Value
• Pick a starting value that is less than or equal to the smallest value. Try to make it a multiple of the group size if you can. In our case a start value of 0 makes sense.
Groups
• Now calculate the list of groups. (We must go up to or past the largest value.)
• Starting at 0 and with a group size of 4 we get: 0, 4, 8, 12, 16
• Write down the groups. Include the end value of each group, which must be less than the start of the next group: 0-3, 4-7, 8-11, 12-15, 16-19
Apparent limits: The endpoints of an interval.
Real limits: Calculated from the apparent limits using the unit of measurement (± 0.5).
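The whole grouping procedure for the leaf data can be sketched as follows. Two simplifications are assumptions of this sketch rather than part of the lecture: "rounding up to a simple value" is reduced to rounding up to the next whole number, and the real limits assume a measurement unit of 1 cm (hence ± 0.5).

```python
import math

lengths = [9, 16, 13, 7, 8, 4, 18, 10, 17, 18, 9, 12, 5, 9, 9, 16, 1, 8, 17, 1,
           10, 5, 9, 11, 15, 6, 14, 9, 1, 12, 5, 16, 4, 16, 8, 15, 14, 17]

desired_groups = 5
data_range = max(lengths) - min(lengths)          # 18 - 1 = 17
width = math.ceil(data_range / desired_groups)    # 3.4 rounded up to 4
start = (min(lengths) // width) * width           # multiple of the width at or below the minimum

grouped = []
lower = start
while lower <= max(lengths):                      # go up to or past the largest value
    upper = lower + width - 1                     # apparent upper limit
    f = sum(lower <= x <= upper for x in lengths)
    # Row: apparent limits, real limits, frequency.
    grouped.append((lower, upper, lower - 0.5, upper + 0.5, f))
    lower += width

for lo, hi, real_lo, real_hi, f in grouped:
    print(f"{lo}-{hi}  real limits {real_lo} to {real_hi}  f = {f}")
```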
Percentiles and Percentile Ranks
The relative location of individual scores
within a distribution can be described by
percentiles and percentile ranks.
The percentile rank of a particular score
is defined as the percentage of individuals
in the distribution with scores at or below
the particular value.
When a score is identified by its percentile rank, the score is called a percentile.
When a desired percentile or percentile rank is located between two known values, it is possible to estimate the desired value using the process of interpolation.
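Interpolation can be illustrated with the leaf-length example from this deck (N = 38, group size 4). The sketch below estimates the 50th percentile; it assumes scores are spread evenly between the real limits of the interval, and the function name and parameters are hypothetical.

```python
def percentile_by_interpolation(pct, lower_real, width, f, cf_below, n):
    """Estimate the score at a given percentile inside one class interval.

    Assumes scores are spread evenly between the interval's real limits.
    """
    target = pct / 100 * n                # cumulative frequency we need to reach
    fraction = (target - cf_below) / f    # how far into the interval that lies
    return lower_real + fraction * width

# Leaf example: the interval with real limits 7.5 to 11.5 holds f = 12 scores,
# 10 scores lie below it, and N = 38.
median_estimate = percentile_by_interpolation(50, 7.5, 4, 12, 10, 38)
print(median_estimate)  # 10.5
```

Here the 50th percentile needs a cumulative frequency of 19, which is 9 of the 12 scores (three quarters) into the interval, so the estimate is 7.5 + 0.75 × 4 = 10.5 cm.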
• To find percentiles and percentile ranks, two new columns are placed in the frequency distribution table: one is for cumulative frequency (Cf) and the other is for cumulative percentage (c%).
• The cumulative frequencies show the number of individuals located at or below each score. To find percentiles, we must convert these frequencies into percentages. The resulting values are called cumulative percentages because they show the percentage of individuals who are accumulated as you move up the scale.

Apparent limits   Real limits (class boundaries)   f    Cf                 c% = (Cf/N)(100%)
0-3               -0.5 to 3.5                      3    3                  3/38 × 100 = 8
4-7               3.5 to 7.5                       7    3+7 = 10           26
8-11              7.5 to 11.5                      12   3+7+12 = 22        58
12-15             11.5 to 15.5                     7    3+7+12+7 = 29      76
16-19             15.5 to 19.5                     9    3+7+12+7+9 = 38    100

The values in the c% column represent the percentage of individuals who are located in and below each category. For example, 76% of the individuals (29 out of 38) had scores of X = 15.5 or lower. Notice that each cumulative percentage value is associated with the upper real limit of its interval.
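The two cumulative columns can be computed with a running total. A minimal sketch for the leaf data; rounding c% to the nearest whole percent is an assumption of this sketch.

```python
from itertools import accumulate

intervals = ["0-3", "4-7", "8-11", "12-15", "16-19"]
freqs = [3, 7, 12, 7, 9]
n = sum(freqs)                                        # N = 38

cum_freq = list(accumulate(freqs))                    # running total of f
cum_pct = [round(100 * cf / n) for cf in cum_freq]    # c% = (Cf/N)(100), rounded

print("Interval  f   Cf  c%")
for label, f, cf, c in zip(intervals, freqs, cum_freq, cum_pct):
    print(f"{label:>8}  {f:>2}  {cf:>2}  {c:>3}")
```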
Exercises
1. Arrange the following data in ascending order. Calculate the proportion of each value.
(a) 7, 2, 10, 14, 0, 6, 15, 24, 8, 3
(b) 4.6, 8.1, 2.0, 3.5, 0.7, 9.3, 1.4, 0.8

2. Arrange the following data in descending order. Calculate the proportion and percentage of each value.
(a) 14, 2, 0, 10, 6, 1, 22, 13, 28, 4, 8, 16
(b) 1.2, 3.5, 0.1, 0.3, 2.4, 8.6, 5.0, 3.7, 0.7, 0.9
Shape
A graph shows the shape of the distribution.
A distribution is symmetrical if the left side of
the graph is (roughly) a mirror image of the
right side.
One example of a symmetrical distribution is
the bell-shaped normal distribution.
On the other hand, distributions are skewed
when scores pile up on one side of the
distribution, leaving a "tail" of a few extreme
values on the other side.
In a positively skewed distribution, the
scores tend to pile up on the left side of the
distribution with the tail tapering off to the
right.
Skewness assesses the extent to which a variable's distribution is symmetrical. If the distribution of responses for a variable stretches toward the right or left tail of the distribution, then the distribution is characterized as skewed.
A negative skewness indicates a greater number of larger values (scores pile up on the right, with the tail to the left), whereas a positive skewness indicates a greater number of smaller values (scores pile up on the left, with the tail to the right).
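To make the sign of skewness concrete, the sketch below computes a simple moment-based skewness (third central moment divided by the 1.5 power of the second). This is one common definition; statistical packages often report a bias-corrected version, so their values can differ slightly. The two data sets are hypothetical.

```python
from statistics import mean, median

def skewness(data):
    """Moment-based skewness m3 / m2**1.5 (population form, no bias correction)."""
    m = mean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

right_tailed = [1, 1, 1, 2, 2, 3, 10]       # scores pile up on the left, tail to the right
left_tailed = [-x for x in right_tailed]    # mirror image, tail to the left

for name, data in [("right-tailed", right_tailed), ("left-tailed", left_tailed)]:
    print(name, "skewness:", round(skewness(data), 2),
          "mean:", round(mean(data), 2), "median:", median(data))
```

In the right-tailed set the skewness is positive and the mean lies above the median; in the mirrored set both relations reverse.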

As a general guideline, a skewness value between −1 and +1 is considered excellent, but a value between −2 and +2 is generally considered acceptable. Values beyond −2 and +2 are considered indicative of substantial non-normality.
• Positively Skewed: In a distribution that is positively skewed, the values are more concentrated towards the left side, and the right tail is spread out. The few extreme high values pull the mean towards the right tail, so in this distribution typically Mean > Median > Mode. The values themselves do not have to be positive.

• Negatively Skewed: In a negatively skewed distribution, the data points are more concentrated towards the right-hand side of the distribution, and the left tail is spread out. The few extreme low values pull the mean towards the left tail, so typically Mean < Median < Mode.
Kurtosis is a measure of whether the distribution is too peaked (a very narrow distribution with most of the responses in the center).
• When kurtosis is reported relative to the normal distribution (excess kurtosis), a positive value indicates a distribution more peaked than normal. In contrast, a negative kurtosis indicates a shape flatter than normal.
Analogous to the skewness, the general guideline is that if the kurtosis is greater than +2, the distribution is too peaked; a kurtosis of less than −2 indicates a distribution that is too flat.
When both skewness and kurtosis are close to zero, the pattern of responses is considered a normal distribution.
On the unadjusted scale, the expected value of kurtosis is 3. This is observed in a normal distribution.
A kurtosis greater than three indicates positive kurtosis. In this case, the excess kurtosis (kurtosis − 3) ranges from 0 to infinity.
Further, a kurtosis less than three means negative kurtosis. Kurtosis can never fall below 1, so negative excess kurtosis ranges from −2 to 0.
The greater the value of kurtosis, the higher the peak.
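The two kurtosis scales are easy to confuse, so here is a small sketch that computes both: the unadjusted kurtosis (about 3 for a normal distribution) and the excess kurtosis (kurtosis minus 3, about 0 for a normal distribution). It uses plain moment formulas without bias correction, which is an assumption; software output may differ slightly. The data sets are hypothetical.

```python
def kurtosis(data):
    """Moment-based kurtosis m4 / m2**2; about 3 for a normal distribution."""
    n = len(data)
    m = sum(data) / n
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return m4 / m2 ** 2

def excess_kurtosis(data):
    """Kurtosis measured relative to the normal distribution."""
    return kurtosis(data) - 3

flat = [1, 2, 3, 4, 5]                # evenly spread scores
peaked = [1, 3, 3, 3, 3, 3, 3, 5]     # most scores in the center
two_point = [0, 1]                    # flattest possible case

for name, data in [("flat", flat), ("peaked", peaked), ("two-point", two_point)]:
    print(name, round(kurtosis(data), 2), round(excess_kurtosis(data), 2))
```

The flat set gives a negative excess kurtosis, the peaked set a positive one, and the two-point set reaches the lower bound of −2.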
Frequency Distribution Graphs
In a frequency distribution graph, the score categories (X values) are listed on the X axis and the frequencies are listed on the Y axis.
When the score categories consist of numerical scores from an interval or ratio scale, the graph should be either a histogram or a polygon.
Frequency distribution graphs are useful
because they show the entire set of scores.
At a glance, you can determine the highest
score, the lowest score, and where the
scores are centered. The graph also shows
Histograms
• In a histogram, a bar is centered above each score (or class interval) so that the height of the bar corresponds to the frequency and the width extends to the real limits, so that adjacent bars touch.
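As a rough text-only stand-in for a histogram, the sketch below prints one bar per class interval with a length equal to its frequency. It draws the bars horizontally and ignores real limits, so it only conveys the shape; a plotting library would be the usual choice for a proper figure. The function name is hypothetical and the frequencies are those of the leaf-length example.

```python
def text_histogram(intervals, freqs, symbol="#"):
    """Return one line per class interval; bar length equals the frequency."""
    label_width = max(len(label) for label in intervals)
    return [f"{label:>{label_width}} | {symbol * f} ({f})"
            for label, f in zip(intervals, freqs)]

intervals = ["0-3", "4-7", "8-11", "12-15", "16-19"]
freqs = [3, 7, 12, 7, 9]

lines = text_histogram(intervals, freqs)
for line in lines:
    print(line)
```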
Polygons
In a polygon, a dot is centered above each score so that the height of the dot corresponds to the frequency. The dots are then connected by straight lines.
An additional line
is drawn at each
end to bring the
Bar graphs
• When the score categories (X values) are measurements from a nominal or an ordinal scale, the graph should be a bar graph.
A bar graph is just like a histogram except that gaps or spaces are left between adjacent bars.
Relative frequency
Many populations are so large that it is impossible to know the exact number of individuals (frequency) for any specific category.
• In these situations, population distributions can be shown using relative frequency instead of the absolute number of individuals for each category.
• Males and females living in Pakistan: Consensus data and general trends the two numbers are very close but female
Stem-and-Leaf Displays
• A stem-and-leaf display provides a very efficient method for obtaining and displaying a frequency distribution.
• Each score is divided into a stem (the first digit or digits) and a leaf (the last digit).
• Finally, you go through the list of scores, one at a time, and list the stems in one column and write the leaf for each score beside its stem.
• The resulting display provides an organized picture of the entire distribution. The number of leaves beside each stem corresponds to the frequency, and the individual leaves identify the individual scores.
• A stem and leaf display is similar to a grouped frequency distribution table, however the stem
Data as an ordered array (n = 10):
05 11 21 24 27 28 30 42 50 52
Divide each data point into
• Stem values (first one or two digits)
• Leaf values (next digit)
Draw stem-like axis from lowest to highest stem
In this example
• Stem values → tens place
• Leaf values → ones place
• e.g., 21 has a stem value of 2 and leaf value of 1

0|5
1|1
2|1478
3|0
4|2
5|02
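The same display can be produced programmatically. A minimal sketch, assuming non-negative whole-number scores with the tens place as the stem and the ones place as the leaf; the function name is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Return display lines using tens as stems and ones as leaves."""
    leaves = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)
        leaves[stem].append(str(leaf))
    lines = []
    # List every stem from lowest to highest, even one with no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        lines.append(f"{stem}|{''.join(leaves.get(stem, []))}")
    return lines

data = [5, 11, 21, 24, 27, 28, 30, 42, 50, 52]
display = stem_and_leaf(data)
for line in display:
    print(line)
```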
Stem and Leaf Plot with
Decimals
Homework Exercises
1. Arrange the following data in ascending order. Calculate the proportion of each value.
(a) 7, 2, 10, 14, 0, 6, 15, 24, 8, 3
(b) 4.6, 8.1, 2.0, 3.5, 0.7, 9.3, 1.4, 0.8

2. Arrange the following data in descending order. Calculate the proportion and percentage of each value.
(a) 14, 2, 0, 10, 6, 1, 22, 13, 28, 4, 8, 16
(b) 1.2, 3.5, 0.1, 0.3, 2.4, 8.6, 5.0, 3.7, 0.7, 0.9
3. Question
Pulse rates (per minute) of 25 persons were recorded as
61, 75, 71, 72, 70, 65, 77, 72, 67, 80, 77,
62, 71, 74, 79, 67, 80, 77, 62, 71, 74, 61,
70, 80, 72, 59, 78, 71, 72.
Construct a frequency table expressing the data in the inclusive form, taking class intervals of equal width, then establish the class boundaries. Calculate the Cf and c%.
4. Question
The frequency distribution of weights (in kg) of 40 persons is given below.

Weights (in kg)   30 - 35   35 - 40   40 - 45   45 - 50   50 - 55
Frequency         6         13        14        4         3

(a) What is the lower limit of the fourth class interval? ___Answer:
(b) What is the class size of each class interval? ___Answer:
(c) Which class interval has the highest frequency? ___Answer:
Below are the weights, in ounces, of some
tangerines.
2.5 3.1 3.5 3.4 3.9 2.7 4.3 2.9 2.9 3.5 4.0
5.1 5.5 6.0 5.6 4.5 4.6 4.7 4.5 4.5 3.9 5.2
(a) In the space below, construct a stem
and leaf diagram for the tangerines.
