Descriptive Statistics MBA

Descriptive statistics are used to describe basic features of data through simple summaries. There are three major characteristics examined: distribution, central tendency (mean, median, mode), and dispersion (range, variance, standard deviation). The normal distribution is a bell-shaped curve where the mean, median and mode are equal and about 68% of values fall within one standard deviation of the mean. Descriptive statistics provide simple descriptions of data, while inferential statistics are used to make generalizations beyond the sample data.

Uploaded by

Kritika Jaiswal

Descriptive Statistics

Descriptive statistics are used to describe the basic features of the data in a study. They
provide simple summaries about the sample and the measures. Together with simple graphics
analysis, they form the basis of virtually every quantitative analysis of data. Descriptive
statistics are typically distinguished from inferential statistics. With descriptive statistics you
are simply describing what is or what the data shows. With inferential statistics, you are
trying to reach conclusions that extend beyond the immediate data alone. For instance, we use
inferential statistics to try to infer from the sample data what the population might think. Or,
we use inferential statistics to make judgments of the probability that an observed difference
between groups is a dependable one or one that might have happened by chance in this study.

There are three major characteristics of a single variable that we tend to look at:

• The distribution
• The central tendency
• The dispersion

In most situations, we would describe all three of these characteristics for each of the
variables in our study.

The Distribution: Data can be "distributed" or spread out in different ways. They can be spread
out more on the left or more on the right, or they can be all jumbled up.

But there are many cases where the data tend to cluster around a central value with no bias left or
right, giving a bell-shaped curve.

This bell-shaped distribution is the Normal Distribution. It is often called a "Bell Curve"
because it looks like a bell.
The Normal Distribution has the following properties:
• It is a probability distribution, describing the likelihood that an event will occur.
• mean = median = mode
• It is symmetric about the center.
• 50% of values are less than the mean and 50% are greater than the mean.

Normal Distribution: The Concept

The normal distribution is a probability distribution that is symmetric about its mean, with most
of the values situated near the mean. Values are equally likely to fall above or below the mean.
Values cluster close to the mean and their frequency then tails off symmetrically on both sides.
The normal distribution is also known as the "Gaussian distribution" or "bell curve".

The normal distribution is produced by the normal density function,

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}$$

In this exponential function e is the constant 2.71828…, μ is the mean, and σ is the standard deviation.
The probability of a random variable falling within any given range of values is equal to the
proportion of the area enclosed under the function's graph between the given values and
above the x-axis. Because the denominator (σ√(2π)), known as the normalizing coefficient,
causes the total area enclosed by the graph to be exactly equal to unity, probabilities can be
obtained directly from the corresponding area; for example, an area of 0.5 corresponds to a
probability of 0.5. Tables were generated in the 19th century for the special case of μ = 0 and σ =
1, known as the standard normal distribution, and these tables can be used for any normal
distribution after the variables are suitably rescaled by subtracting their mean and dividing by
their standard deviation, (x − μ)/σ.
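
The density function and the rescaling step can be sketched in a few lines of Python. This is only an illustration: the function names and the example values (a mean of 100 and a standard deviation of 15) are assumptions, not part of the original notes. The cumulative probability is obtained from the error function in the standard library instead of a printed table.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))  # normalizing coefficient
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def z_score(x, mu, sigma):
    """Rescale x to the standard normal scale: (x - mu) / sigma."""
    return (x - mu) / sigma

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the density to the left of x, via the error function."""
    return 0.5 * (1.0 + math.erf(z_score(x, mu, sigma) / math.sqrt(2.0)))

# Hypothetical example: a variable with mean 100 and standard deviation 15.
print(z_score(130, 100, 15))     # two standard deviations above the mean
print(normal_cdf(100, 100, 15))  # half of the area lies below the mean
```

Because any normal variable reduces to the standard normal after rescaling, `normal_cdf(115, 100, 15)` and `normal_cdf(1)` give the same probability.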

Measures of Shape
As noted earlier, the normal distribution is bell shaped. The shape of a distribution is
assessed by examining skewness and kurtosis.
Skewness:
Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution,
or data set, is symmetric if it looks the same to the left and right of the center point. Skewness
is the tendency of deviation from the mean to be larger in one direction than in another. A
positively skewed distribution has a "tail" which is pulled in the positive direction. A
negatively skewed distribution has a "tail" which is pulled in the negative direction.

It is calculated by the formula:

$$\text{skewness} = \frac{\sum_{i=1}^{N} (X_i - \mu)^3}{N\sigma^3}$$

where μ is the mean, σ is the standard deviation, and N is the number of data points. A
normal distribution has a skewness of 0.
Kurtosis:
Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution.
That is, data sets with high kurtosis tend to have a distinct peak near the mean, decline rather
rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean
rather than a sharp peak. A uniform distribution would be the extreme case.
A normal distribution is a mesokurtic distribution. A pure leptokurtic distribution has a higher
peak than the normal distribution and has heavier tails. A pure platykurtic distribution has a
lower peak than a normal distribution and lighter tails.

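It is calculated by the formula:

$$\text{kurtosis} = \frac{\sum_{i=1}^{N} (X_i - \mu)^4}{N\sigma^4} - 3$$

where μ is the mean, σ is the standard deviation, and N is the number of data points. With the
3 subtracted (often called excess kurtosis), a normal distribution has kurtosis equal to 0.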
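
As a rough sketch, both shape measures can be computed from central moments. The code assumes the population (divide-by-N) definitions and subtracts 3 from kurtosis so that a normal distribution scores 0. Statistical packages may apply small-sample corrections and report slightly different values. The function names are illustrative.

```python
def central_moment(data, k):
    """Average of (x - mean)**k over all data points."""
    mu = sum(data) / len(data)
    return sum((x - mu) ** k for x in data) / len(data)

def skewness(data):
    """Third central moment divided by the cube of the standard deviation."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Fourth central moment divided by the squared variance, minus 3."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3.0

symmetric = [1, 2, 3, 4, 5]
right_tailed = [15, 20, 21, 20, 36, 15, 25, 15]  # one large value pulls the tail right

print(skewness(symmetric))         # zero for a symmetric data set
print(skewness(right_tailed))      # positive: tail pulled in the positive direction
print(excess_kurtosis(symmetric))  # negative: flatter than a normal distribution
```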

Need of Normal Distribution

Many things actually are normally distributed, or very close to it. For example, height and intelligence are
approximately normally distributed; measurement errors also often have a normal distribution.

The normal distribution is easy to work with mathematically. In many practical cases, the methods
developed using normal theory work quite well even when the distribution is not normal.

There is a very strong connection between the size of a sample N and the extent to which a sampling
distribution approaches the normal form. Many sampling distributions based on large N can be
approximated by the normal distribution even though the population distribution itself is definitely not
normal.
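
A small simulation can make this connection concrete. The sketch below is an illustration under assumed settings (a uniform population, samples of size 30, 2000 repetitions, a fixed random seed); none of these choices come from the original notes. The population is flat, not bell shaped, yet the means of repeated samples cluster around the population mean roughly as a normal distribution would.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_means(sample_size, n_samples):
    """Draw n_samples samples from a uniform(0, 1) population and return their means."""
    return [
        mean([random.random() for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

means = sample_means(sample_size=30, n_samples=2000)
centre = mean(means)
spread = pstdev(means)

# Share of sample means lying within one standard deviation of their centre.
share_within_one_sd = sum(abs(m - centre) <= spread for m in means) / len(means)

print(round(centre, 3), round(spread, 3), round(share_within_one_sd, 3))
```

For a normal distribution the last figure would be about 0.68, so a value near that is what the sampling distribution of the mean is expected to show here.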
Central Tendency: The central tendency of a distribution is an estimate of the "center" of a
distribution of values. There are three major types of estimates of central tendency:

• Mean
• Median
• Mode

The Mean or average is probably the most commonly used method of describing central
tendency. It is given by the formula:

μ = ∑X / N

μ = Mean

X = Value of the variable for each respondent

N = Number of respondents

To compute the mean, add up all the values and divide by the number of values.
For example, the mean or average quiz score is determined by summing all the scores and
dividing by the number of students taking the exam. Consider the test score
values:

15, 20, 21, 20, 36, 15, 25, 15

The sum of these 8 values is 167, so the mean is 167/8 = 20.875.

The Median is a measure of central tendency given as the value above which half of the
values fall and below which half of the values fall. The median is the 50th percentile. The data
are arranged in ascending or descending order, and the middle value is the median if the number
of values is odd. If the number of values is even, the median is found by adding the two middle
values and dividing their sum by 2.

If we order the 8 scores shown above, we would get:

15,15,15,20,20,21,25,36

There are 8 scores, and scores #4 and #5 represent the halfway point. Since both of these scores
are 20, the median is 20.

The mode is the most frequently occurring value in the set of scores. To determine the mode,
you might again order the scores as shown above, and then count each one. The most
frequently occurring value is the mode. In the example given above, the value 15 occurs three
times and is the mode. In some distributions there is more than one modal value. For
instance, in a bimodal distribution there are two values that occur most frequently.

Note: Notice that for the same set of 8 scores we got three different values -- 20.875, 20,
and 15 -- for the mean, median and mode respectively. If the distribution is truly normal
(i.e., bell-shaped), the mean, median and mode are all equal to each other.
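
The three measures for the 8 test scores can be checked with Python's standard `statistics` module. This is a minimal sketch. `multimode` is used instead of `mode` so that a bimodal data set would return both modal values.

```python
from statistics import mean, median, multimode

scores = [15, 20, 21, 20, 36, 15, 25, 15]

print(sum(scores))        # 167
print(mean(scores))       # 20.875
print(median(scores))     # 20.0, the average of the 4th and 5th ordered scores
print(multimode(scores))  # [15], the value that occurs three times
```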

Dispersion: Dispersion refers to the spread of the values around the central tendency. There
are three common measures of dispersion, the range, variance and the standard deviation.

The range is simply the highest value minus the lowest value. In our example distribution,
the high value is 36 and the low is 15, so the range is 36 - 15 = 21.

Variance is the measure of the dispersion or deviation of a set of data points around their
mean value. It is the average of the squared deviations from the mean and is denoted by the
symbol σ². To calculate the variance, first calculate the mean, then subtract the mean from
each value, square each result, and take the average of the squared results.
It is calculated using the formula:

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

Example: Five people have Rs. 600, 470, 170, 430 and 300. Find the variance.
Answer: Mean (μ) = (600 + 470 + 170 + 430 + 300)/5 = 1970/5 = 394.
Variance = [(600 − 394)² + (470 − 394)² + (170 − 394)² + (430 − 394)² + (300 − 394)²]/5 = 108,520/5 = 21,704

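The Standard Deviation is a more detailed estimate of dispersion than the range because
it reflects how every score in the set relates to the mean, not just the two extreme values. It
is the square root of the variance, so it is expressed in the same units as the data. It is
calculated as:

Standard Deviation:

$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$

In the above example of variance, the standard deviation would be √21,704 ≈ Rs. 147.32.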
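
The worked example can be reproduced step by step in Python. The sketch assumes the population definition used in the text (dividing by N). The standard library's `pvariance` and `pstdev` follow the same definition, whereas `variance` and `stdev` divide by N − 1 and would give larger values.

```python
import math
from statistics import pstdev, pvariance

amounts = [600, 470, 170, 430, 300]  # rupees held by the five people

mu = sum(amounts) / len(amounts)                   # 394.0
squared_deviations = [(x - mu) ** 2 for x in amounts]
variance = sum(squared_deviations) / len(amounts)  # 21704.0
std_dev = math.sqrt(variance)

print(mu, variance, round(std_dev, 2))

# The library functions agree with the manual calculation.
print(pvariance(amounts), round(pstdev(amounts), 2))
```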
Note: Once you know the mean and standard deviation of a normally distributed population, you
can tell what percentage of the data points lie within a given distance of the mean. In a
normal distribution:
• About 68% of the distribution lies within one standard deviation of the mean.
• About 95% of the distribution lies within two standard deviations of the mean.
• About 99.7% of the distribution lies within three standard deviations of the mean.
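
These percentages follow from the area under the normal curve. As a brief sketch, the share of a normal distribution within k standard deviations of the mean equals erf(k/√2), which the standard library can evaluate directly. The function name is illustrative.

```python
import math

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

coverage = {k: share_within(k) for k in (1, 2, 3)}
for k, share in coverage.items():
    print(f"within {k} standard deviation(s): {share * 100:.1f}%")
```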
Presenting the Univariate Data Analysis

A basic way of presenting univariate data is to create a frequency distribution of the
individual cases, which shows how many of the cases observed in the sample take each value
(or fall in each category) of the variable studied. The frequency (f) of a particular
observation is the number of times the observation occurs in the data. The distribution of a
variable is the pattern of frequencies of the observations. It can be presented as a table, a
histogram, a bar chart or a similar form of graphical representation.
Frequency distributions can show either the actual number of observations falling in each
range or the percentage of observations. When the frequencies are expressed as percentages,
the distribution is called a relative frequency distribution. Frequency distribution
tables can be used for both categorical and numeric variables. Continuous variables should
only be used with class intervals.
A sample distribution table and a bar chart for a univariate analysis are presented below.

Age range    Frequency    Percent (%)

under 18         10            5
18–29            50           25
29–45            40           20
45–65            40           20
over 65          60           30

Valid cases: 200
Missing cases: 0
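
The percentage column is the relative frequency: each count divided by the number of valid cases. The sketch below recomputes it from the frequencies in the table and prints a simple text bar chart. The helper name and the scale of one `#` per five cases are arbitrary choices made for illustration.

```python
# Frequencies copied from the age-range table above.
frequencies = {
    "under 18": 10,
    "18–29": 50,
    "29–45": 40,
    "45–65": 40,
    "over 65": 60,
}

def relative_frequencies(counts):
    """Convert raw counts to percentages of the total number of valid cases."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}

percentages = relative_frequencies(frequencies)

for label, n in frequencies.items():
    bar = "#" * (n // 5)  # one '#' per five cases
    print(f"{label:<10}{n:>5}{percentages[label]:>7.1f}%  {bar}")
```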

Apart from frequency distribution tables, descriptive statistics also includes the measures of
central tendency, dispersion and shape.
Dealing with Missing data: In some situations respondents, knowingly or unknowingly, do not
answer certain questions. The unanswered items are known as missing data.
The most common approach to dealing with missing data is listwise deletion, also known as
complete case analysis, in which we simply omit the cases with missing data and run our
analyses on what remains. This approach reduces the sample size and can bias estimates of
population parameters. Another approach is pairwise deletion, in which each element of the
intercorrelation matrix is estimated using all available data. If one participant reports his
income and expenditure but not his age, he is included in the correlation of income and
expenditure, but not in the correlations involving age. This approach also has disadvantages:
the parameter estimates are based on different sets of data, with different sample sizes and
different standard errors. Some researchers substitute the mean value of the variable for the
missing data. Others use regression analysis to estimate the missing values. Missing data
should also be coded with caution: the code assigned to missing data must be a number that
cannot occur as a valid value of the variable in the survey. All other methods of presenting
univariate data have been explained in the data processing chapter.
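
The difference between these approaches is easiest to see on a tiny data set. The records below are hypothetical survey answers invented for illustration, with `None` marking a missing response. The helper functions are a sketch of each idea, not a substitute for a statistical package.

```python
from statistics import mean

# Hypothetical survey records; None marks a question the respondent skipped.
records = [
    {"income": 50, "expenditure": 30, "age": 25},
    {"income": 60, "expenditure": 45, "age": None},
    {"income": None, "expenditure": 20, "age": 40},
    {"income": 80, "expenditure": 50, "age": 35},
]

def listwise(rows):
    """Keep only the cases with no missing values at all."""
    return [r for r in rows if all(v is not None for v in r.values())]

def pairwise(rows, a, b):
    """Keep every case that has both variables a and b, whatever else is missing."""
    return [(r[a], r[b]) for r in rows if r[a] is not None and r[b] is not None]

def mean_substitute(rows, key):
    """Replace missing values of one variable with the mean of the observed values."""
    fill = mean(r[key] for r in rows if r[key] is not None)
    return [r[key] if r[key] is not None else fill for r in rows]

print(len(listwise(records)))                           # cases left after listwise deletion
print(len(pairwise(records, "income", "expenditure")))  # cases usable for this pair
print(len(pairwise(records, "income", "age")))          # a different, smaller set
print(mean_substitute(records, "age"))
```

The two pairwise counts differ, which is the drawback noted above: each correlation ends up resting on a different sample.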
