Chapter 01

Uploaded by

mohamedalbialy312

Statistics

Chapter 01 – Overview and Descriptive Statistics

Probability vs. Statistics

 The probability of an event is the likelihood of it occurring, e.g., when a
fair coin is tossed, the probability of getting a head is 1/2.
 Statistics deals with a set of data, e.g., finding the most frequently
occurring item in a set of data. It is the science of learning from data.
 A toy example:
 Probability is starting with an animal and figuring out what
footprints it will make.
 Statistics is seeing a footprint and guessing the animal.

Probability vs. Statistics Cont’d.

 Suppose we have information about a population, and we want to
know about samples we could take from that population. Probability
addresses these questions.
Probability: Given the information in the pail, what is in your hand?
 Suppose we have sample data, and we want to know about the
population the sample came from. Statistics uses sample data to make
inferences about the population the sample came from.
Statistics: Given the information in your hand, what is in the pail?

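
The pail picture can be sketched in a few lines of Python. This is only an illustration: the pail contents (6 red and 4 blue marbles) and the function names are assumptions, not part of the slides. The first function reasons from a known pail to a hand (probability); the second reasons from an observed hand back to the pail (statistics), using a simple proportional estimate.

```python
from fractions import Fraction
from math import comb

# Hypothetical pail: 6 red and 4 blue marbles (numbers chosen for illustration).
RED, BLUE = 6, 4


def prob_red_in_hand(k, hand_size, red=RED, blue=BLUE):
    """Probability: the pail is known; how likely is a hand with exactly k red marbles?"""
    total = red + blue
    return Fraction(comb(red, k) * comb(blue, hand_size - k), comb(total, hand_size))


def estimate_red_in_pail(red_in_hand, hand_size, pail_size):
    """Statistics: the hand is known; estimate how many red marbles the pail holds."""
    return pail_size * red_in_hand / hand_size


print(prob_red_in_hand(2, 3))          # 1/2
print(estimate_red_in_pail(2, 3, 10))  # an estimate, not the true count
```

The estimate need not equal the true count of red marbles; quantifying that uncertainty is the job of inferential statistics.
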
What is a Population?

 A population is the entire set of items from which you draw data for
a statistical study. It can be a group of individuals, a set of items, etc. It
makes up the data pool for a study.
 An example of a population would be the entire student body at a
school.

What is a Sample?
 A sample is a smaller, more manageable representation of a larger
group: a subset of a population that reflects the characteristics of
that population.

[Figure: a sample shown as a subset of the population]

What is a Sample? Cont’d.
 Samples are used when:
  The population is too large to collect data from every member.
  The data collected is not reliable.
  The population is hypothetical and unlimited in size.
 Take the example of a study that documents the results of a new
medical procedure. It is unknown how the procedure will affect
people across the globe, so a test group is used to find out how
people react to it.

What is a Sample? Cont’d.

 A sample should generally:
 Represent the different variations present in the population and
satisfy a well-defined selection criterion.
 Be unbiased with respect to the properties of the objects being selected.
 Be random, so that the objects of study are chosen fairly.

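
One common way to meet the "random and unbiased" requirements is a simple random sample, in which every member of the population is equally likely to be chosen. The sketch below assumes a hypothetical population of 500 student ID numbers; the function name and the seed are illustrative only.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k distinct items so that every item is equally likely to be chosen."""
    rng = random.Random(seed)  # a local generator keeps the draw reproducible
    return rng.sample(population, k)


# Hypothetical population: student ID numbers 1..500 (made up for illustration).
student_ids = list(range(1, 501))
sample = simple_random_sample(student_ids, k=25, seed=1)
print(sorted(sample))
```

Fixing the seed only makes the example repeatable; in a real study the draw would not be chosen by the investigator.
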
Descriptive vs. Inferential Statistics
 There are two main branches in the field of statistics:
① Descriptive statistics aims to describe a body of raw data using
summary statistics, graphs, tables, etc.
 Say we have a set of raw data that shows the test scores of 1000
students at a particular school. We might be interested in the average
test score along with the distribution of test scores.
② Inferential statistics uses a small sample of data to draw inferences
about the larger population that the sample came from.

Descriptive vs. Inferential Statistics Cont’d.

 Say we are interested in understanding the political preferences of
millions of people in a country. It would take too long and be too
expensive to survey every individual in the country, so we would
instead survey a smaller group of, say, 1000 individuals and use the
results of the survey to draw inferences about the population.

Descriptive vs. Inferential Statistics Cont’d.
 The relationship between the two disciplines can be summarized by
saying that probability reasons from the population to the sample
(deductive reasoning), whereas inferential statistics reasons from the
sample to the population (inductive reasoning).

Pictorial and Tabular Methods in Descriptive Statistics

Stem-and-Leaf displays
 Consider a numerical dataset 𝑥1 , 𝑥2 , … , 𝑥𝑛 for which each 𝑥𝑖 consists
of at least two digits. A quick way to obtain an informative visual
representation of the dataset is to construct a stem-and-leaf display.
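
As a rough sketch of how such a display is built, the Python function below splits each value into a stem (all digits but the last) and a leaf (the last digit). It assumes non-negative integer data; the dataset is made up for illustration and is not taken from the slides.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return display lines: leading digit(s) as the stem, last digit as the leaf."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups[stem].append(leaf)
    lines = []
    # Keep empty stems so that gaps in the data stay visible.
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines


# Hypothetical two-digit data, made up for illustration.
data = [34, 41, 45, 47, 52, 52, 58, 63, 66, 81]
print("\n".join(stem_and_leaf(data)))
```

The empty stem 7 shows up as a row with no leaves, which is how a gap in the data appears in the display.
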

Stem-and-Leaf displays Cont’d.
Example. The average number of hours of sleep per day over a two-week
period for a sample of 253 college students*.

[Figure: stem-and-leaf display of the sleep data, with a bell-shaped curve]

 Numbers in the Low Group end with a second digit of 0, 1, 2, 3, or 4.
 Numbers in the High Group end with a second digit of 5, 6, 7, 8, or 9.
*Individuals in this age group need about 8.4 hours of sleep per day.

Stem-and-Leaf displays Cont’d.

 A stem-and-leaf display discloses the following aspects of the data:
  Identification of a typical or representative value.
  Extent of spread about the typical value.
  Presence of any gaps in the data.
  Extent of symmetry in the distribution of values.
  Number and locations of peaks.
  Presence of outliers, i.e., values far from the rest of the data.
 As a rule of thumb, a display with between 5 and 20 stems is
recommended.

Dotplots

 A dotplot is an attractive summary of numerical data when the dataset is
reasonably small or there are relatively few distinct data values. Each
observation is represented by a dot above the corresponding location on a
horizontal measurement scale. When a value occurs more than once, there is
a dot for each occurrence, and these dots are stacked vertically.
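
The stacking rule can be sketched as a text-only dotplot in Python. The data below are hypothetical integers chosen for illustration (they are not the state percentages from the example), and `*` stands in for a dot.

```python
from collections import Counter


def text_dotplot(values):
    """Return dotplot lines: one stacked '*' per occurrence above an integer scale."""
    counts = Counter(values)
    scale = range(min(values), max(values) + 1)
    rows = []
    for level in range(max(counts.values()), 0, -1):  # top row first
        cells = [" *" if counts[v] >= level else "  " for v in scale]
        rows.append(" ".join(cells).rstrip())
    rows.append(" ".join(f"{v:>2}" for v in scale))  # horizontal measurement scale
    return rows


# Hypothetical observations, made up for illustration.
data = [28, 30, 30, 31, 31, 31, 33, 35]
print("\n".join(text_dotplot(data)))
```
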
Example. There is growing concern in the U.S. that not enough students are
graduating from college. America used to be number 1 in the world for the
percentage of adults with college degrees, but it has recently dropped to 16th.
Here are data on the percentage of 25- to 34-year-olds in each state who had
some type of post-secondary degree as of 2010 (listed in alphabetical order,
with Washington D.C. included):

[Data table not recovered]

Dotplots Cont’d.

[Figure: dotplot of the 51 measures]

** Note. A dotplot can be quite cumbersome to construct and can look crowded when
the number of observations is large. Now, let’s look at other interesting methods!
Histograms
 A numerical variable is discrete if its set of possible values either is
finite or else can be listed in an infinite sequence (one in which there is
a first number, a second number, and so on).
 A discrete variable 𝑥 almost always results from counting, in which case
possible values are 0, 1, 2, 3, … or some subset of these integers.
 A numerical variable is continuous if its possible values consist of an
entire interval on the number line.
 Continuous variables arise from making measurements. For example, if 𝑥 is
the pH of a chemical substance, then in theory 𝑥 could be any number
between 0 and 14, e.g., 7.0, 7.03, 7.032, and so on.

Histograms Cont’d.
 Consider data consisting of observations on a discrete variable 𝑥. The
frequency of any 𝑥 value is the number of times that value occurs in the
dataset. The relative frequency of a value is the fraction or proportion of
times the value occurs:
$$\text{relative frequency of a value} = \frac{\text{number of times the value occurs}}{\text{number of observations in the dataset}}$$
Example. Suppose that our dataset consists of 200 observations
(students) on 𝑥 = the number of courses a college student is taking this
term. If 70 of these 𝑥 values are 3, then:
Histograms Cont’d.
 Frequency of the 𝑥 value 3: 70
 Relative frequency of the 𝑥 value 3: $\frac{70}{200} = .35$
 Multiplying a relative frequency by 100 gives a percentage; in the
college-course example, 35% of the students in the sample are taking
three courses. The relative frequencies, or percentages, are usually of
more interest than the frequencies themselves.
 In theory, the relative frequencies should sum to 1, but in practice the
sum may differ slightly from 1 because of rounding.

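
The frequency and relative-frequency calculation can be sketched in Python as follows. Only the total of 200 observations and the 70 threes come from the example; the other counts are hypothetical filler so that the dataset is complete.

```python
from collections import Counter


def relative_frequencies(values):
    """Map each distinct value to its relative frequency (count / number of observations)."""
    counts = Counter(values)
    n = len(values)
    return {x: counts[x] / n for x in sorted(counts)}


# 200 observations of x = number of courses. Only the 70 threes come from the
# example; the remaining counts are hypothetical.
courses = [2] * 20 + [3] * 70 + [4] * 60 + [5] * 40 + [6] * 10
rel = relative_frequencies(courses)
print(rel[3])             # 0.35
print(sum(rel.values()))  # close to 1, up to floating-point rounding
```
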
Histograms Cont’d.

Example. How unusual is a no-hitter* or a one-hitter in a major league
baseball game, and how frequently does a team get more than 10, 15, or even
20 hits? The table below is a frequency distribution for the number of hits per
team per game for all nine-inning games that were played between 1989 and
1993.
*In baseball, a no-hitter is a game in which a team was not able to record a single hit through conventional means.
Histograms Cont’d.

[Table: frequency distribution of the number of hits 𝑥 per team per game; values not recovered]

Histograms Cont’d.

 Proportion of games with at most two hits = relative frequency for
𝑥 = 0 + relative frequency for 𝑥 = 1 + relative frequency for 𝑥 = 2
= .0010 + .0037 + .0108 = .0155
 Similarly, proportion of games with between 5 and 10 hits (inclusive) =
.0752 + .1026 + ⋯ + .1015 = .6361
 That is, roughly 64% of all these games resulted in between 5 and 10
(inclusive) hits.

Histogram Shapes
 Histograms come in a variety of shapes. A unimodal histogram is one
that rises to a single peak and then declines. A bimodal histogram has two
different peaks. Bimodality can occur when the dataset consists of
observations on two quite different kinds of individuals or objects.
Example. Consider a large dataset consisting of driving times for cars
traveling between San Luis Obispo, California, and Monterey, California
(exclusive of stopping time for sightseeing, eating, etc.). This histogram
would show two peaks: one for those cars that took the inland route
(roughly 2.5 hours) and another for those cars traveling up the coast
(3.5 to 4 hours).
 A histogram with more than two peaks is said to be multimodal.
Histogram Shapes Cont’d.
 A histogram is symmetric if the left half is a mirror image of the right half (b).
A unimodal histogram is positively skewed if the right tail is stretched out
compared with the left (c) and negatively skewed if the left tail is stretched
out compared with the right (a).
 For positively skewed data, large positive outliers exist which tend to
“pull” the mean upward.
 For a negatively skewed distribution, large negative outliers exist which
tend to “pull” the mean downward.

Skewness is simply a reflection of a dataset in which activity is heavily condensed in one range and less condensed in another.
Histogram Shapes Cont’d.
 Draw a histogram to represent the following data: 5, 3, 3, 6, 4, 3, 5, 4, 7, 3, 3, 5,
3, 6, 4, 3, 4, and then draw a histogram to represent the following data: 7, 4, 6, 7,
5, 7, 6, 3, 4, 7, 5, 6, 6, 7, 7, 5, 7.

[Figure: the first histogram is right skewed; the second is left skewed]

Histogram Shapes Cont’d.

Right Skewed Histogram | Left Skewed Histogram
Also known as a positively skewed histogram. | Also known as a negatively skewed histogram.
Mean > Median > Mode. | Mean < Median < Mode.
The peak of the graph lies on the left side of the center. | The peak of the graph lies on the right side of the center.

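
The two datasets from the histogram exercise can be used to check the mean, median, and mode ordering. The sketch below prints a text histogram for each dataset (the `text_histogram` helper is an illustrative name) and then the three measures of location.

```python
from collections import Counter
from statistics import mean, median, mode

# The two datasets given in the histogram exercise.
right_skewed = [5, 3, 3, 6, 4, 3, 5, 4, 7, 3, 3, 5, 3, 6, 4, 3, 4]
left_skewed = [7, 4, 6, 7, 5, 7, 6, 3, 4, 7, 5, 6, 6, 7, 7, 5, 7]


def text_histogram(values):
    """One line per integer value, with one '#' per occurrence."""
    counts = Counter(values)
    return [f"{x} | {'#' * counts[x]}" for x in range(min(values), max(values) + 1)]


for name, data in [("first", right_skewed), ("second", left_skewed)]:
    print(name, "dataset")
    print("\n".join(text_histogram(data)))
    print(f"mean={mean(data):.2f} median={median(data)} mode={mode(data)}")
```

For the first dataset the mean exceeds the median, which exceeds the mode; for the second the ordering is reversed, in line with the comparison above.
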
Measures of Location

The Mean
 For a given set of numbers $x_1, x_2, \ldots, x_n$, the most familiar and useful
measure of the center is the mean, or arithmetic average, of the set. We will
often refer to the arithmetic average as the sample mean and denote it by $\bar{x}$:
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

The Mean Cont’d.

$$\bar{x} = \frac{229.0}{14} = 16.36$$

The sample mean can be regarded as the balance point of the distribution
of observations.

The median

$$\tilde{x} = \frac{66.4 + 67.4}{2} = 66.90$$

The sample median is very insensitive to outliers. If the two largest $x_i$
are increased from 75.7 and 79.0 to 85.7 and 89.0, respectively, $\tilde{x}$
would be unaffected. Thus, in the treatment of outlying data values, $\bar{x}$
and $\tilde{x}$ are at opposite ends of a spectrum.

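
A small Python sketch of this contrast is given below. The dataset is hypothetical: only the two middle values (66.4 and 67.4) echo the slide's example, and the other values are made up. Increasing the two largest observations by 10 leaves the median where it was but moves the mean.

```python
from statistics import mean, median

# Hypothetical observations in increasing order (not the slide's dataset).
data = [61.2, 63.5, 66.4, 67.4, 70.1, 72.8]

# Inflate the two largest values by 10 to mimic the outlier scenario.
shifted = data[:-2] + [x + 10 for x in data[-2:]]

print(median(data), median(shifted))  # the median does not move
print(mean(data), mean(shifted))      # the mean is pulled upward
```
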
Measures of Variability

The Variance

Try to validate it yourself!

The Variance Cont’d.

 The variance is unchanged when a constant $c$ is added to (or subtracted from)
each data value. This is intuitive, since adding or subtracting $c$ shifts the location
of the dataset but leaves distances between data values unchanged.
 Multiplication of each $x_i$ by $c$ results in $s^2$ being multiplied by a factor of $c^2$.
These properties can be proved by noting that $\bar{y} = \bar{x} + c$ when
$y_i = x_i + c$, and $\bar{y} = c\bar{x}$ when $y_i = c x_i$.

The Variance Cont’d.

$$s^2 = \frac{S_{xx}}{n-1} = 31.41$$

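
Both properties can be checked numerically. The sketch below assumes the usual definition $S_{xx} = \sum (x_i - \bar{x})^2$, so that $s^2 = S_{xx}/(n-1)$; the sample and the constant $c$ are made up for illustration.

```python
from statistics import mean, variance


def sample_variance(values):
    """s^2 = Sxx / (n - 1), with Sxx the sum of squared deviations from the mean."""
    x_bar = mean(values)
    s_xx = sum((v - x_bar) ** 2 for v in values)
    return s_xx / (len(values) - 1)


# Hypothetical sample and constant, made up for illustration.
x = [4.0, 7.0, 7.5, 9.0, 12.5]
c = 3.0

s2 = sample_variance(x)
s2_shifted = sample_variance([xi + c for xi in x])  # add c to every value
s2_scaled = sample_variance([c * xi for xi in x])   # multiply every value by c

print(s2, s2_shifted, s2_scaled)
print(variance(x))  # the standard library uses the same n - 1 divisor
```
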
Boxplots

Boxplots Cont’d.
 Find the median, lower quartile and upper quartile of the following
numbers: 12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25.
 First, arrange the data in ascending order:
5, 7, 12, 14, 15, 22, 25, 30, 36, 42, 53
 Median (middle value) = 22
 Lower Quartile (middle value of the lower half) = 12
 Upper Quartile (middle value of the upper half) = 36
 If there is an even number of data items, then we need to get the
average of the middle numbers.

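
The worked example can be reproduced with a short Python function. It follows the "middle value of each half" rule used above and, when the number of observations is odd, leaves the overall median out of both halves; that choice is an assumption that matches this example, and other quartile conventions give slightly different answers.

```python
from statistics import median


def quartiles(values):
    """Return (lower quartile, median, upper quartile) via the median of each half."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # For odd n, skip the middle value so it belongs to neither half.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return median(lower), median(data), median(upper)


print(quartiles([12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25]))  # (12, 22, 36)
```
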
Boxplots Cont’d.
The following data consist of observations on the time until failure
(1000s of hours) for a sample of turbo-chargers from one type of
engine.

[Data and boxplot not recovered]

Boxplots Cont’d.
Example. The following is a sample of TN (total nitrogen) loads (kg
N/day) from a particular location, displayed in increasing order.

[Data not recovered]

Boxplots Cont’d.
 Relevant summary quantities are:
$\tilde{x} = 92.17$, lower 4th $= 45.64$, upper 4th $= 167.79$
$f_s = 122.15$, $1.5 f_s = 183.225$, $3 f_s = 366.45$
 Subtracting $1.5 f_s$ from the lower 4th gives a negative number, and none of
the observations are negative, so there are no outliers on the lower end of
the data. Yet,
upper 4th $+ 1.5 f_s = 351.015$, upper 4th $+ 3 f_s = 534.24$
 Thus, the four largest observations (563.92, 690.11, 826.54, and
1529.35) are extreme outliers, and 352.09, 371.47, 444.68, and 460.86
are mild outliers.

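
The outlier rule used here can be sketched as a small Python function: a value more than $1.5 f_s$ beyond the nearest fourth is a mild outlier, and more than $3 f_s$ beyond it is an extreme outlier. The function and constant names are illustrative; the fourths and the eight observations are the ones quoted above.

```python
def classify(value, lower_fourth, upper_fourth):
    """Label a value 'extreme', 'mild', or 'none' using the 1.5*fs and 3*fs fences."""
    fs = upper_fourth - lower_fourth  # fourth spread
    # Distance beyond the nearest fourth (negative when the value is inside the box).
    distance = max(lower_fourth - value, value - upper_fourth)
    if distance > 3 * fs:
        return "extreme"
    if distance > 1.5 * fs:
        return "mild"
    return "none"


LOWER_4TH, UPPER_4TH = 45.64, 167.79  # summary values quoted above
for v in [352.09, 371.47, 444.68, 460.86, 563.92, 690.11, 826.54, 1529.35]:
    print(v, classify(v, LOWER_4TH, UPPER_4TH))
```
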
Boxplots Cont’d.
 When the median is in the middle of the box, and the whiskers are about
the same length on both sides of the box, the distribution is symmetric.
 When the median is closer to the bottom of the box, and the whisker is
shorter on the lower end of the box, the distribution is positively
skewed* (skewed right).
 When the median is closer to the top of the box, and the whisker is
shorter on the upper end of the box, the distribution is negatively
skewed** (skewed left).
*If your whisker extends out in the direction of the larger numbers, your data are positively skewed.
**If your whisker extends out to the smaller numbers, your data are negatively skewed.


Brainstorming

Do you think boxplots are a good choice for multimodal data?

Brainstorming
 To see why boxplots are ill-suited for multimodal data, let’s consider an
example. Imagine our dataset consists of these values: 30, 30, 30, 62, 87, 115,
115, 115, 172, 209, 214. In this example, we have two modes: 30 and 115 both
occur three times. A boxplot reports only the median, the quartiles, and the
extremes, so the two peaks do not appear in it.

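
A quick Python check makes the point concrete. The five numbers a boxplot is drawn from are computed below with the "median of each half" rule (an assumption; other quartile conventions exist), next to the modes of the same data.

```python
from statistics import median, multimode

data = [30, 30, 30, 62, 87, 115, 115, 115, 172, 209, 214]

ordered = sorted(data)
half = len(ordered) // 2  # n = 11 is odd, so the middle value is left out of both halves
five_number = (
    ordered[0],                  # minimum
    median(ordered[:half]),      # lower quartile
    median(ordered),             # median
    median(ordered[half + 1:]),  # upper quartile
    ordered[-1],                 # maximum
)

print(five_number)      # (30, 30, 115, 172, 214)
print(multimode(data))  # [30, 115]
```

Nothing in the five-number summary says that 30 and 115 each occur three times; a histogram or dotplot would show both peaks.
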
Symmetrical Distribution
 The distribution of male heights is roughly symmetric and has no skew. The
average height of a male in the United States is roughly 69.1 inches, with
about as many men shorter than this as taller.
 Notice that the vertical line inside the box representing the median is equally close
to the first and third quartiles, which means the distribution is symmetrical and has
no skew.

Right-Skewed Distribution
 The distribution of annual household incomes in the United States is right-skewed.
Most households earn between $40k and $80k per year, but there is a long right tail on
the distribution representing households earning much more.
 Notice that the vertical line inside the box representing the median is much closer to
the first quartile than the third quartile, meaning the distribution is right-skewed.

Left-Skewed Distribution
 The distribution of ages at death in most populations is left-skewed. Most people
live to be between 70 and 80 years old, with progressively fewer dying at younger ages.
 Notice that the vertical line inside the box representing the median is much closer to
the third quartile than the first, meaning the distribution is left-skewed.
