TDA1

The document outlines the competencies and course contents for a transportation data analysis course, including statistical methods such as mean, median, mode, and measures of dispersion. It discusses grouped data, five-number summary, skewness, kurtosis, and various statistical techniques for analyzing transportation data. Additionally, it covers the importance of understanding data distribution and variability in the context of civil engineering.

Uploaded by

Sreeja Tallam


Dr. Arpan Mehar
arpan@nitw.ac.in
Associate Professor

Transportation Division
Department of Civil Engineering
NATIONAL INSTITUTE OF TECHNOLOGY WARANGAL
• Course Competencies
CO1 Select a suitable method for processing and presentation of transportation data
CO2 Apply probability distributions to analyze transportation data
CO3 Choose appropriate hypothesis testing measures
CO4 Analyze multivariate transportation data
CO5 Differentiate various curve fitting techniques
CO6 Develop Time Series models
Course contents

• Data description and presentation

• Probability laws and distributions

• Statistical inference and tests of significance

• Regression and Correlation

• Parameter estimation and Curve fitting

• Sampling techniques

• Time series and models

Mean
• The sum of a collection of numbers divided by the number of values in the collection
• Add together all the data values, then divide by the total number of values

• Advantages:
• Most popular measure in fields such as business, engineering and computer science
• It is unique - there is only one answer
• Useful when comparing sets of data
• Disadvantages:
• Affected by extreme values (outliers)
Median
• The median is the 'middle value' of the data set once the values are sorted (for an odd number of values)
• For an even number of values, the median is the sum of the two middle values divided by two
• Advantages
• Extreme values (outliers) do not affect the median as strongly as they do the mean
• Useful when comparing sets of data
• It is unique - there is only one answer
• Disadvantage
• Not as popular as the mean
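The effect of an outlier on the mean and the median can be checked with a short sketch. The speeds below are hypothetical values chosen for illustration, not data from these notes.

```python
from statistics import mean, median

# Hypothetical spot speeds in km/h (illustration only).
speeds = [42, 45, 47, 50, 52]
speeds_with_outlier = speeds + [120]   # one extreme observation added

mean_before, median_before = mean(speeds), median(speeds)
mean_after, median_after = mean(speeds_with_outlier), median(speeds_with_outlier)

# The mean shifts by more than 10 km/h, the median by only 1.5 km/h.
print(mean_before, median_before)
print(mean_after, median_after)
```

With an even number of values the median is the average of the two middle values, which is why it becomes 48.5 after the outlier is added.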
Mode
• The mode is the value (or values) that occurs most frequently in the data set

• Advantage
• Extreme values (outliers) do not affect the mode
• Disadvantages
• Not as popular as mean and median
• Not necessarily unique - there may be more than one
• When there is more than one mode, the data set is called bimodal (two modes) or multimodal

• Note: If no value is repeated in the data set, then there is no mode for that set of data (the mode is of no use)
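A minimal sketch of the three cases (one mode, more than one mode, no repeated value), using hypothetical values and the standard-library function `statistics.multimode` (Python 3.8 or later):

```python
from statistics import multimode

# Hypothetical values (illustration only).
one_mode = multimode([40, 50, 50, 60, 70])        # a single most frequent value
two_modes = multimode([40, 50, 50, 60, 60, 70])   # two values tie for most frequent
no_repeats = multimode([40, 50, 60])              # nothing repeats, so every value ties

print(one_mode, two_modes, no_repeats)
```

When no value repeats, `multimode` returns every value, which is the programmatic sign that the mode carries no information for that data set.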
Grouped data
Grouped data are data formed by aggregating individual observations of a variable into groups

Speed (kmph)  10  20  36  40  50  56  60  70  72  80  88  92  95
Freq.          1   1   3   4   3   2   4   4   1   1   2   3   1

Mean of the grouped data
Speed observed (xi)    Number of vehicles (fi)    fixi
10                     1                          10
20                     1                          20
36                     3                          108
40                     4                          160
50                     3                          150
56                     2                          112
60                     4                          240
70                     4                          280
72                     1                          72
80                     1                          80
88                     2                          176
92                     3                          276
95                     1                          95
Total                  Σfi = 30                   Σfixi = 1779

Mean = Σfixi/Σfi = 1779/30 = 59.3 kmph
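The totals in the frequency table can be reproduced with a few lines. This is a minimal sketch; the variable names are illustrative.

```python
# Speed (km/h) -> number of vehicles, copied from the frequency table.
speed_freq = {10: 1, 20: 1, 36: 3, 40: 4, 50: 3, 56: 2, 60: 4,
              70: 4, 72: 1, 80: 1, 88: 2, 92: 3, 95: 1}

total_vehicles = sum(speed_freq.values())                  # Σfi
weighted_sum = sum(x * f for x, f in speed_freq.items())   # Σfixi
grouped_mean = weighted_sum / total_vehicles               # Σfixi / Σfi

print(total_vehicles, weighted_sum, grouped_mean)
```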
Range and Class Interval
• Range:
The smallest value subtracted from the largest value in the data set

• Class interval:
A simple formula based on Sturges' rule defines the class interval

i = Range/(1 + 3.322 log10 N), where N is the number of observations
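As a rough check, Sturges' rule can be applied to the speed data in the frequency table (largest speed 95, smallest 10, N = 30). Rounding the result up to a convenient multiple of 5 is an assumption made here for illustration, not a step stated in the notes.

```python
import math

def sturges_class_interval(data_range, n_obs):
    """Class interval i = Range / (1 + 3.322 * log10(N))."""
    return data_range / (1 + 3.322 * math.log10(n_obs))

data_range = 95 - 10   # largest minus smallest speed in the table
n_obs = 30             # total number of vehicles, Σfi

interval = sturges_class_interval(data_range, n_obs)

# Assumption: round up to the next multiple of 5 for a convenient class width.
rounded_interval = math.ceil(interval / 5) * 5

print(round(interval, 2), rounded_interval)
```

The raw value is a little over 14, and the rounded width of 15 matches the 10-25, 25-40, ... classes used in the grouped tables.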
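Grouped data mean: Direct method

Speed class interval    Number of veh (fi)    Mid value (xi)    fixi
10-25                   2                     17.5              35.0
25-40                   3                     32.5              97.5
40-55                   7                     47.5              332.5
55-70                   6                     62.5              375.0
70-85                   6                     77.5              465.0
85-100                  6                     92.5              555.0
Total                   Σfi = 30                                Σfixi = 1860.0

Mean = Σfixi/Σfi = 1860/30 = 62.0 kmph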
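Grouped Data Mean: (Assumed Mean Method)
• If the numerical values of xi and fi are large, finding the product of xi and fi becomes a time-consuming process
• The assumed mean method works instead with the deviations di = xi – a from an assumed mean a (here a = 47.5)

Class interval    Number of veh (fi)    Mid speed (xi)    Deviation di = xi – 47.5    Product fidi
10-25             2                     17.5              -30                         -60
25-40             3                     32.5              -15                         -45
40-55             7                     47.5              0                           0
55-70             6                     62.5              15                          90
70-85             6                     77.5              30                          180
85-100            6                     92.5              45                          270
Total             Σfi = 30                                                            Σfidi = 435

Mean = a + Σfidi/Σfi = 47.5 + 435/30 = 62.0 kmph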
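Grouped data mean: Step Deviation Method
(taking a = 47.5, as in the assumed mean method, and h = 15, the class width)

Class interval of speed    Freq. (fi)    Class mark (xi)    di = xi – a    ui = (xi – a)/h    fiui
10-25                      2             17.5               -30            -2                 -4
25-40                      3             32.5               -15            -1                 -3
40-55                      7             47.5               0              0                  0
55-70                      6             62.5               15             1                  6
70-85                      6             77.5               30             2                  12
85-100                     6             92.5               45             3                  18
Total                      Σfi = 30                                                           Σfiui = 29

Mean = a + h × Σfiui/Σfi = 47.5 + 15 × 29/30 = 62.0 kmph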
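The direct, assumed mean and step deviation methods are three routes to the same grouped mean. The sketch below checks that on the speed classes from the tables, assuming a = 47.5 (the assumed mean used in the notes) and h = 15 (the class width).

```python
# (lower limit, upper limit, frequency) for each speed class, from the tables.
classes = [(10, 25, 2), (25, 40, 3), (40, 55, 7),
           (55, 70, 6), (70, 85, 6), (85, 100, 6)]

mids = [(lo + hi) / 2 for lo, hi, _ in classes]   # class marks xi
freqs = [f for _, _, f in classes]                # frequencies fi
n = sum(freqs)                                    # Σfi

a = 47.5   # assumed mean
h = 15     # class width (assumption: all classes share this width)

# Direct method: Σfixi / Σfi
direct = sum(f * x for f, x in zip(freqs, mids)) / n
# Assumed mean method: a + Σfidi / Σfi, with di = xi - a
assumed = a + sum(f * (x - a) for f, x in zip(freqs, mids)) / n
# Step deviation method: a + h * Σfiui / Σfi, with ui = (xi - a) / h
step = a + h * sum(f * (x - a) / h for f, x in zip(freqs, mids)) / n

print(direct, assumed, step)
```

All three give 62.0 kmph. This differs from the 59.3 kmph obtained from the individual speeds because grouping replaces every observation by its class mark.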
Five Number Summary
• The five-number summary is a descriptive statistic that
provides information about a set of observations.
• It consists of the five most important sample percentiles
(The five-number summary provides a concise summary of
the distribution of the observations)

• (1) Sample minimum (smallest observation)

• (2) Lower quartile or first quartile (25th percentile)
• (3) Median (middle value) (50th percentile)
• (4) Upper quartile or third quartile (75th percentile)
• (5) Sample maximum (largest observation)
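A five-number summary can be sketched with the standard library. The data are the 30 spot speeds from the frequency table in the grouped data section. Quartile conventions differ between textbooks and software; the "inclusive" interpolation used here is an assumption, so other methods may give slightly different Q1 and Q3.

```python
from statistics import quantiles

# Speed (km/h) -> number of vehicles, from the grouped data section.
speed_freq = {10: 1, 20: 1, 36: 3, 40: 4, 50: 3, 56: 2, 60: 4,
              70: 4, 72: 1, 80: 1, 88: 2, 92: 3, 95: 1}

# Expand the table back into the individual observations.
speeds = sorted(x for x, f in speed_freq.items() for _ in range(f))

# Assumption: quartiles by linear interpolation, "inclusive" method.
q1, q2, q3 = quantiles(speeds, n=4, method="inclusive")

five_number_summary = (min(speeds), q1, q2, q3, max(speeds))
print(five_number_summary)
```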
Q1. Which box-plot best reflects the data represented in
the following 5 number summary?

Minimum: 62.00,

1st Quartile: 66.25

Median: 72.00

3rd Quartile: 75.50

Maximum: 89.00
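Spread of Data
• Range
• Inter-quartile range: the easiest way to summarise the spread of data
[Figure: box plot for small cars, vertical axis 0 to 120. Labelled values: maximum 98.7, upper quartile 70.99, lower quartile 54.9, minimum 30.29; range = 68.41, inter-quartile range = 16.09]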
• Center of Data
Graphically, the center of the data is located at the mean, median or mode

• Spread of Data
The spread of the data refers to the variability of the data
Measure of dispersion

• Dispersion is also known as variation, spread or scatter

• It helps to find how much individual items vary from an appropriate measure of central tendency (mean, median or mode)

• It is also defined as the degree to which numerical data tend to spread about an average value
Types of measures of dispersion

• Absolute measures of dispersion

• Relative measures of dispersion

• Absolute measures of dispersion

(1) Range

(2) Quartile deviation (semi-inter-quartile range)

(3) Mean deviation

(4) Standard deviation

• Range
- Difference between the largest and the smallest value

• Inter-quartile range and semi-inter-quartile range (quartile deviation)
- Inter-quartile range = Q3 – Q1; quartile deviation = (Q3 – Q1)/2

• Mean deviation or average deviation
- Arithmetic mean of the absolute deviations of all values, taken from a central value (mean, median or mode)

• Standard deviation (root mean square deviation)
- Measures the absolute dispersion or variability of a distribution
- A small standard deviation means a high degree of uniformity in the observations
- It is the positive square root of the average of squared deviations taken from the arithmetic mean.
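The four absolute measures can be computed together in a short sketch. The data are the 30 spot speeds from the frequency table in the grouped data section. The quartile method ("inclusive") and the use of the population form of the standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev, quantiles

# Speed (km/h) -> number of vehicles, from the grouped data section.
speed_freq = {10: 1, 20: 1, 36: 3, 40: 4, 50: 3, 56: 2, 60: 4,
              70: 4, 72: 1, 80: 1, 88: 2, 92: 3, 95: 1}
speeds = sorted(x for x, f in speed_freq.items() for _ in range(f))

# (1) Range: largest minus smallest value.
data_range = max(speeds) - min(speeds)

# (2) Quartile deviation: half the inter-quartile range.
q1, _, q3 = quantiles(speeds, n=4, method="inclusive")
quartile_deviation = (q3 - q1) / 2

# (3) Mean deviation: average absolute deviation from the mean.
centre = mean(speeds)
mean_deviation = sum(abs(x - centre) for x in speeds) / len(speeds)

# (4) Standard deviation: root mean square deviation from the mean.
std_dev = pstdev(speeds)

print(data_range, quartile_deviation, round(mean_deviation, 2), round(std_dev, 2))
```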
• Relative measures of dispersion

• Coefficient of range = (L – S)/(L + S), where L is the largest and S the smallest value

• Coefficient of quartile deviation = (Q3 – Q1)/(Q3 + Q1)

• Coefficient of mean deviation = (Mean deviation about mean)/Mean

• Coefficient of standard deviation = Standard deviation/Mean
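Two of the relative measures can be evaluated directly from a five-number summary. The sketch below uses the summary quoted in the box-plot question earlier in these notes (minimum 62.00, Q1 66.25, median 72.00, Q3 75.50, maximum 89.00); the function names are illustrative.

```python
def coefficient_of_range(largest, smallest):
    """(L - S) / (L + S)."""
    return (largest - smallest) / (largest + smallest)

def coefficient_of_quartile_deviation(q1, q3):
    """(Q3 - Q1) / (Q3 + Q1)."""
    return (q3 - q1) / (q3 + q1)

# Five-number summary from the box-plot question.
minimum, q1, median, q3, maximum = 62.00, 66.25, 72.00, 75.50, 89.00

coef_range = coefficient_of_range(maximum, minimum)
coef_qd = coefficient_of_quartile_deviation(q1, q3)

print(round(coef_range, 4), round(coef_qd, 4))
```

Both coefficients are unit-free ratios, which is what allows data sets measured in different units or on different scales to be compared.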

Variance
• Ungrouped data
• 61, 63, 65, 66, 67, 68
A B C
1 1 2
1 1 3
1 1 4
2 1 5
2 1 6
1 1 7
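For the ungrouped values 61, 63, 65, 66, 67, 68 the variance can be checked with the standard library. The notes do not say whether the population form (divide by N) or the sample form (divide by N – 1) is intended, so this sketch shows both.

```python
from statistics import mean, pvariance, variance

data = [61, 63, 65, 66, 67, 68]

data_mean = mean(data)                  # 390 / 6
population_variance = pvariance(data)   # Σ(x - mean)² / N
sample_variance = variance(data)        # Σ(x - mean)² / (N - 1)

print(data_mean, population_variance, sample_variance)
```

The squared deviations from the mean of 65 sum to 34, giving 34/6 for the population form and 34/5 for the sample form.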
Variance
• Grouped data
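For grouped data the variance is commonly computed from the class marks as Σfi(xi – mean)²/Σfi. The formula on the original slide did not survive, so the population form used below is an assumption. The classes are the speed classes from the grouped-mean tables.

```python
import math

# (lower limit, upper limit, frequency) for each speed class.
classes = [(10, 25, 2), (25, 40, 3), (40, 55, 7),
           (55, 70, 6), (70, 85, 6), (85, 100, 6)]

mids = [(lo + hi) / 2 for lo, hi, _ in classes]   # class marks xi
freqs = [f for _, _, f in classes]                # frequencies fi
n = sum(freqs)

grouped_mean = sum(f * x for f, x in zip(freqs, mids)) / n

# Assumption: population form, dividing by Σfi rather than Σfi - 1.
grouped_variance = sum(f * (x - grouped_mean) ** 2
                       for f, x in zip(freqs, mids)) / n
grouped_std_dev = math.sqrt(grouped_variance)

print(grouped_mean, grouped_variance, round(grouped_std_dev, 2))
```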
Coefficient of Variation (CV)

• It is a relative measure of dispersion

• It has great practical significance and is the best measure for comparing the variability of two series
• If the CV is greater for one group of data, that group has more variation (less consistency)
• Always expressed as a percentage (%): CV = (Standard deviation/Mean) × 100
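A minimal sketch of comparing two series by CV. Series A is the ungrouped data from the variance section; series B is a hypothetical series added only for comparison. The population standard deviation is assumed.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """CV in percent: (standard deviation / mean) * 100."""
    return pstdev(values) / mean(values) * 100

series_a = [61, 63, 65, 66, 67, 68]   # ungrouped data from the variance section
series_b = [40, 55, 60, 72, 88, 95]   # hypothetical series (illustration only)

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)

# The series with the larger CV is the less consistent one.
more_variable = "A" if cv_a > cv_b else "B"
print(round(cv_a, 2), round(cv_b, 2), more_variable)
```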
Some properties of Unimodal Curve
Shape of Data or Distribution
• The pattern of values in the data, showing their frequency of outcomes relative to each other

- Whether the values vary a lot or a little about the most common values

- Whether that variation tends to lie more above or below the most common values

- Whether there are unusually large or small values in the data
Characteristics of Distribution
• Symmetry (Unimodality)
- A symmetric distribution can be divided at the center so that
each half is a mirror image of the other
• Non Symmetrical (Bimodal)
- Distributions of data with two clear peaks are called bimodal
• Skewness
- Distributions with fewer observations on the right (a longer right tail) are said to be skewed right
- Distributions with fewer observations on the left (a longer left tail) are said to be skewed left
• Uniform
- When the observations in a set of data are equally spread
across the range
- A uniform distribution has no clear peaks
Unusual Features
• Gaps
- Gaps refer to areas of a distribution where there are no observations

• Outliers
- Extreme values that differ greatly from the other observations
• Interpret the box plot
Examples:
Skewness

• Skewness is lack of symmetry (Riggleman and Frisbee)

• Skewness refers to asymmetry in the shape of a frequency distribution (Morris, H.)

• A distribution is said to be skewed when the mean and the median fall at different points in the distribution and the center of gravity is shifted to one side or the other (Garret)

• When a series is not symmetrical it is said to be asymmetrical or skewed (Croxton and Cowden)
Test of skewness

A distribution is skewed if any of the following holds:

• 1. Mean ≠ Median ≠ Mode

• 2. Q3 – Median ≠ Median – Q1

• 3. Sum of positive deviations ≠ sum of negative deviations (deviations taken from the median)

• 4. Frequencies on either side of the mode are unequal

• 5. The graph of the data does not give a symmetrical, bell-shaped (normal) curve

• Absolute Measure
Absolute skewness = Mean – Mode
or = Mean – Median
or = (Q3 – Q2) – (Q2 – Q1)

- expressed in the units of the data

- cannot be used for comparison

An absolute measure of skewness cannot be used for purposes of comparison because the same amount of skewness has different meanings in a distribution with small variation and in a distribution with large variation.
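The quartile form of the absolute measure can be tried on the five-number summary quoted in the box-plot question earlier in these notes (Q1 = 66.25, median = 72.00, Q3 = 75.50). The function name is illustrative.

```python
def absolute_quartile_skewness(q1, q2, q3):
    """(Q3 - Q2) - (Q2 - Q1), in the units of the data."""
    return (q3 - q2) - (q2 - q1)

# Quartiles from the box-plot question.
q1, q2, q3 = 66.25, 72.00, 75.50

skew = absolute_quartile_skewness(q1, q2, q3)
print(skew)
```

The result is negative because the median sits closer to Q3 than to Q1. Being in data units, the number by itself does not show whether that asymmetry is large or small, which is the limitation noted above.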
• Karl Pearson's coefficient

SKp = (Mean – Mode)/Standard deviation

The value usually lies between –1 and +1

Using the empirical relation Mode = 3 Median – 2 Mean:

SKp = [Mean – (3 Median – 2 Mean)]/Standard deviation

SKp = 3(Mean – Median)/Standard deviation
• Consider the following speed parameters

Parameters    6-lane (A)    4-lane (B)

Mean          100           90
Median        90            80
S.D           10            10

State whether each of the following is true or false:
(a) Distribution A has the same degree of variation as distribution B
(b) Both distributions have the same degree of skewness
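One way to approach the exercise is to compare the coefficient of variation and Karl Pearson's coefficient, 3(Mean – Median)/Standard deviation, for the two roads. Reading "degree of variation" as the CV is an assumption; the notes do not say which measure is intended.

```python
def coefficient_of_variation(mean, sd):
    """CV in percent: (standard deviation / mean) * 100."""
    return sd / mean * 100

def pearson_skewness(mean, median, sd):
    """Karl Pearson's coefficient: 3 * (mean - median) / sd."""
    return 3 * (mean - median) / sd

# Speed parameters from the exercise table.
roads = {
    "6-lane": {"mean": 100, "median": 90, "sd": 10},
    "4-lane": {"mean": 90, "median": 80, "sd": 10},
}

results = {
    name: {
        "cv": coefficient_of_variation(p["mean"], p["sd"]),
        "skewness": pearson_skewness(p["mean"], p["median"], p["sd"]),
    }
    for name, p in roads.items()
}

for name, r in results.items():
    print(name, round(r["cv"], 2), r["skewness"])
```

Under this reading the standard deviations are equal but the CVs are not (about 10% against about 11.1%), while the skewness coefficients are identical.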
KURTOSIS or CONVEXITY
• Kurtosis is the measure of 'peakedness' or 'flatness' of the distribution of a real-valued random variable

• Kurtosis is a descriptor of the shape of a probability distribution

• High kurtosis means that more of the variance is due to infrequent extreme deviations (heavier tails)

• A common measure of kurtosis was suggested by Karl Pearson

Definitions of Kurtosis

• Kurtosis refers to the degree of peakedness of the hump of the distribution (C.M. Mayess)

• It indicates the degree to which the curve of a frequency distribution is peaked or flat topped (Croxton and Cowden)

• Degree of kurtosis of a distribution is measured relative to the peakedness of a normal curve (Simpson and Kafka)
Types of Pearson's kurtosis

- Provide a comparison of the shape of a given distribution with respect to the normal distribution

(1) Leptokurtic (Positive)

(2) Mesokurtic (Normal)

(3) Platykurtic (Negative)
Measure of Kurtosis based on Moment

• β2 = µ4/(µ2)^2

If β2 > 3 Leptokurtic

If β2 = 3 Normal (meso)

If β2 < 3 Platykurtic
Measure of Skewness based on Moment

• The coefficients β1 and β2 are used for measuring skewness and kurtosis

β1 = (µ3)^2/(µ2)^3
β1 is never negative, so the direction of skewness is read from the sign of µ3:
µ3 > 0 (positively skewed)
µ3 < 0 (negatively skewed)

β2 = µ4/(µ2)^2
β2 > 3 (Lepto)
β2 = 3 (Meso)
β2 < 3 (Platy)
• Calculate the first four moments about an arbitrary origin and also find the values of β1 and β2

Speed class (low)    Speed class (high)    Freq.
0                    2                     2
2                    4                     9
4                    6                     10
6                    8                     5
8                    10                    3
10                   12                    2
12                   14                    1
[Figure: frequency plot; x-axis: Speed (m/s), 0 to 15; y-axis: Frequency, 0 to 12]
• Find the skewness and Kurtosis of the data
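A sketch of the moment calculation for the speed-class table, assuming β1 = µ3²/µ2³ and β2 = µ4/µ2². The arbitrary origin A = 5 (the mark of the modal class) is a choice made here for illustration; the central moments do not depend on it.

```python
# (lower limit, upper limit, frequency) from the speed-class table.
classes = [(0, 2, 2), (2, 4, 9), (4, 6, 10), (6, 8, 5),
           (8, 10, 3), (10, 12, 2), (12, 14, 1)]

mids = [(lo + hi) / 2 for lo, hi, _ in classes]   # class marks
freqs = [f for _, _, f in classes]
n = sum(freqs)

origin = 5.0   # assumption: arbitrary origin A

def raw_moment(r):
    """r-th moment about the arbitrary origin: Σf(x - A)^r / N."""
    return sum(f * (x - origin) ** r for f, x in zip(freqs, mids)) / n

m1, m2, m3, m4 = (raw_moment(r) for r in (1, 2, 3, 4))

# Convert moments about A into central moments (about the mean).
mu2 = m2 - m1 ** 2
mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4

beta1 = mu3 ** 2 / mu2 ** 3   # skewness coefficient (sign taken from mu3)
beta2 = mu4 / mu2 ** 2        # kurtosis coefficient

print(m1, m2, m3, m4)
print(mu2, mu3, mu4)
print(round(beta1, 3), round(beta2, 3))
```

Here µ3 is positive, so the distribution is positively skewed, and β2 comes out slightly above 3, which is close to mesokurtic and marginally leptokurtic.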
