DLD Lab Task
Assignment 1
Subject:
Basic of Bio-Statistics
Book code:
MPH-705-BB
Submitted To:
Mr. Mudassar Mushtaq Abassi
Submitted By:
Iqra Azam
Topic Name:
Measures of Location
Measures of Central Tendency
Semester:
1st Semester Group-B
Introduction:
In statistics, and especially in biostatistics, which deals with biological data,
understanding the central tendency of a dataset is crucial. A measure of central
tendency is a single value that attempts to describe a set of data by identifying the
central position within that set. As such, measures of central tendency are
sometimes called measures of central location. This assignment explores three key
measures: the mean, the median, and the mode. Each provides a different perspective
on the "middle" of the data.
The mean, median, and mode are all valid measures of central tendency, but under
different conditions some are more appropriate than others. The following sections
look at each measure in turn, how to calculate it, and the conditions under which it
is most appropriate.
1. Mean (Arithmetic):
The mean, also known as the average, is calculated by adding all the values in a
dataset and dividing by the total number of values.
Formula: Mean (x̄) = (Σx) / n, where Σ (sigma) denotes the sum over all values x,
and n is the number of data points.
The mean is affected by extremely high or low values, called outliers, and may not
be the appropriate average to use in these situations. It cannot be computed for
data in a frequency distribution that has an open-ended class. When repeated samples
are taken from the same population, it varies less than the median or mode. The sum
of the deviations from the mean is 0.
Advantages: Easy to understand and interpret; uses every data point, so it is
sensitive to a change in any value.
Disadvantages: Can be distorted by outliers (extreme values).
Example: The blood glucose values of family Z are 156, 123, 142, 173, and 97.
Mean = (156 + 123 + 142 + 173 + 97) / 5 = 691 / 5 = 138.2
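The calculation can also be checked with a short Python sketch. It uses the five values that are summed in the example above (156, 123, 142, 173, 97) and also checks the property that the deviations from the mean add up to 0. The variable names are illustrative only.

```python
# Blood glucose values of family Z, as summed in the worked example.
glucose = [156, 123, 142, 173, 97]

# Mean = (sum of all values) / (number of values)
mean_value = sum(glucose) / len(glucose)

# Deviation of each value from the mean; these should add up to 0
# (up to floating-point rounding).
deviations = [x - mean_value for x in glucose]

print("Mean:", mean_value)  # 138.2
print("Sum of deviations:", round(sum(deviations), 10))
```

Replacing 97 with a much larger value (an outlier) and running the sketch again shows how strongly a single extreme reading pulls the mean.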
2. Median:
The median is the "middle" value of the data: the midpoint of the values after they
have been ordered from the smallest to the largest. There are as many values above
the median as below it in the data array. There is a unique median for each data
set. It is not affected by extremely large or small values and is therefore a
valuable measure of location when such values occur.
Calculation (after arranging the data in order):
o For an odd number of data points: Median = [(n + 1) / 2]th observation.
o For an even number of data points: Median = Average of the (n / 2)th and
[(n / 2) + 1]th observations.
Advantages of using median: Much less affected by outliers than the mean.
Disadvantages of using median: Depends only on the middle position of the ordered
data, so it conveys less information about the individual values than the mean,
especially for large datasets.
Example: The heights of four basketball players, in inches, are: 76, 73, 80, 75.
Solution: Arranging the data in ascending order gives: 73, 75, 76, 80. Since n = 4
is even, the median lies at the (n + 1)/2 = (4 + 1)/2 = 2.5th position, that is,
halfway between the 2nd and 3rd values: (75 + 76)/2 = 75.5.
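The odd and even rules above can be sketched in Python as follows. The function name median_by_position is a hypothetical helper written for this illustration; the result is compared against the standard library's statistics.median as a cross-check.

```python
from statistics import median


def median_by_position(values):
    """Return the median using the odd/even position rule."""
    if not values:
        raise ValueError("median of an empty dataset is undefined")
    ordered = sorted(values)  # arrange the data in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th observation (index mid, counting from 0).
        return ordered[mid]
    # Even n: average of the (n / 2)th and (n / 2 + 1)th observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


heights = [76, 73, 80, 75]  # basketball players' heights in inches
print(median_by_position(heights))  # 75.5
print(median_by_position(heights) == median(heights))  # True
```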
3. Mode:
The mode is the most frequently occurring value in a dataset. The mode can be used
when the data are nominal, such as religious preference, gender, or political
affiliation. The mode is not always unique. A data set can have more than one mode,
or the mode may not exist for a data set.
Calculation: Identify the value that appears most often.
Advantages: Useful for identifying the most common value, especially in
categorical data.
Disadvantages: There can be multiple modes (bimodal or multimodal data), or no
mode at all if all values appear with equal frequency.
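A minimal Python sketch of the mode is given below. It returns every most frequent value, so it covers the unimodal, multimodal, and "no mode" cases described above. The function name find_modes and all of the sample data (blood groups and numbers) are hypothetical and are used only for illustration.

```python
from collections import Counter


def find_modes(values):
    """Return a sorted list of modes; an empty list means no mode exists."""
    counts = Counter(values)  # frequency of each distinct value
    if not counts:
        return []
    highest = max(counts.values())
    # If there are several distinct values and all appear equally often,
    # the dataset is treated as having no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical nominal data: blood groups of seven patients.
blood_groups = ["A", "O", "B", "O", "AB", "O", "A"]
print(find_modes(blood_groups))     # ['O']

print(find_modes([1, 1, 2, 2, 3]))  # [1, 2]  (bimodal)
print(find_modes([1, 2, 3]))        # []      (no mode)
```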