Stats10_Chapter 4
Scatterplot
• Scatterplots are the best way to start observing the
relationship and the ideal way to picture associations
between two numerical variables.
• In a scatterplot, you can see patterns, trends,
relationships, and even the occasional extraordinary
value sitting apart from the others.
Scatterplot - Shape
• If the points appear as a cloud or swarm of
points stretched out in a generally consistent,
straight form, the form of the relationship is
linear.
Scatterplot - Shape
• Linear shape?
[Four example scatterplots, labeled A through D]
Scatterplot - Strength
• If there does not appear to be a lot of scatter, there is a strong relationship between the two variables.
• If there appears to be some scatter, there is a weak relationship between the two variables.
• If there appears to be a lot of scatter, there is little or no relationship between the two variables.
Exercise
• Interpret the scatterplot in terms of shape,
direction, and strength.
Correlation Coefficient
• The correlation coefficient (𝑟) (also called
Pearson’s correlation coefficient) gives us a
numerical measurement of the strength of the
linear relationship between the explanatory and
response variables.
• Formula:
r = (1 / (n − 1)) Σ z_x z_y , where z_x = (x − x̄) / s_x and z_y = (y − ȳ) / s_y
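As a rough Python sketch (not part of the original slides), the z-score form of the formula can be computed directly for a small data set; the numbers below are placeholders:

    from math import sqrt

    def pearson_r(x, y):
        # Sample means and sample standard deviations (dividing by n - 1).
        n = len(x)
        mean_x, mean_y = sum(x) / n, sum(y) / n
        s_x = sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
        s_y = sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
        # z-score form: r = (1 / (n - 1)) * sum of z_x * z_y
        return sum(((xi - mean_x) / s_x) * ((yi - mean_y) / s_y)
                   for xi, yi in zip(x, y)) / (n - 1)

    print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # about 1.0 for a perfectly linear pattern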
Correlation Coefficient : Properties
• The sign of a correlation coefficient gives the
direction of the association.
Correlation Coefficient : Properties
• Correlation has no units.
• Whether arm length and height are measured in centimeters or in inches, the correlation coefficient is the same.
• Correlation measures the strength of the “linear”
association between the two numerical variables.
Correlation Coefficient : Properties
• Correlation is sensitive to outliers. A single outlying value can make a small correlation large or make a large one small.
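A quick illustration in Python (made-up numbers, not from the slides) of how one outlying point can drag a strong correlation down:

    import numpy as np

    x = [1, 2, 3, 4, 5]
    y = [2.0, 4.1, 5.9, 8.2, 9.9]       # nearly linear: r is close to 1
    x_with_outlier = x + [6]
    y_with_outlier = y + [0.0]          # one extreme point added

    print(np.corrcoef(x, y)[0, 1])                            # strong positive correlation
    print(np.corrcoef(x_with_outlier, y_with_outlier)[0, 1])  # much weaker after the outlier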
Correlation vs. Causation
• Mexican lemon imports prevent highway deaths?
Correlation vs. Causation
• A high correlation between two variables does
NOT imply causation.
Exercise
Suppose that we have x and y as shown in the table below. Calculate and interpret their correlation coefficient. Note that the mean of x (x̄) is 3 and the SD of x (s_x) is 1.83, while the mean of y (ȳ) is 4 and the SD of y (s_y) is 2.58.

i   x_i   y_i
1   1     1
2   2     3
3   4     5
4   5     7

x̄ = 3, ȳ = 4, s_x = 1.83, s_y = 2.58
Modeling Linear Trends
• Correlation says “there seems to be a linear association between these two variables,” but it doesn't tell us what that association is.
Modeling Linear Trends
• Modeling linear trends with a linear equation:
• If the model fits the data well, the model can describe the data and be used to make predictions about real-world situations.
• The line that describes the relationship between 𝑥 and
𝑦 is called the regression line.
Regression Line
• It is a tool for making predictions about future
observed values.
• It provides us with a useful way of summarizing
a linear relationship.
Residual
• The linear model won't be perfect, regardless of the
line we draw.
• Some points will be above the line and some will be
below the line.
• The estimate made from a linear model is the predicted value, denoted as ŷ.
• The difference between the observed value and its
associated predicted value is called the residual.
Residual
• Blue dots: observed (actual) y values
• Red dots: predicted y values (ŷ)
• Residual = y − ŷ (observed minus predicted)
Residual
• A negative residual means the predicted value is bigger than the actual value (an overestimate).
• A positive residual means the predicted value is smaller than the actual value (an underestimate).
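A tiny sketch (made-up numbers, not from the slides) showing how the sign of a residual matches this interpretation:

    # Residual for a single observation: observed y minus predicted y.
    intercept, slope = 2.0, 0.5
    x_obs, y_obs = 10, 8

    y_hat = intercept + slope * x_obs   # predicted value from the line: 7.0
    residual = y_obs - y_hat            # 1.0: positive, so the model underestimated y
    print(y_hat, residual)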
Finding the “Best” Regression Line
• Ultimately, we want to find the best-fitting model, which minimizes the residuals.
• We need a good ‘measure’ of the overall amount of residuals:
• 1) Sum of residuals? Some residuals are positive, others are negative, and on average they cancel each other out. Not a good way!
• 2) Sum of squared residuals: similar to what we did with deviations, we square the residuals and add the squares (see the sketch after this list).
• The smaller the sum, the better the fit.
• The line of best fit is the line for which the sum of the squared residuals is smallest.
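As a sketch (placeholder data, not from the slides), the comparison can be written as a small helper that scores any candidate line by its sum of squared residuals:

    # Sum of squared residuals for a candidate line y-hat = intercept + slope * x.
    # The smaller the sum, the better the line fits the points.
    def sum_squared_residuals(x, y, intercept, slope):
        return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    x = [1, 2, 4, 5]
    y = [1, 3, 5, 7]
    print(sum_squared_residuals(x, y, 0.0, 1.5))   # a reasonable guess
    print(sum_squared_residuals(x, y, -0.2, 1.4))  # the least-squares line: smaller sum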
Finding the “Best” Regression Line
• Among possible regression lines, find the line that minimizes the sum of squared residuals.
[Scatterplot of Mothers' age vs. Fathers' age, with Fathers' age (0 to 60) on the vertical axis and Mothers' age (0 to 50) on the horizontal axis, with several candidate regression lines drawn through the points]
• Which one looks the best?
Finding the “Best” Regression Line
Line Color   Linear Equation        Sum of squared residuals
Black        ŷ = 11.54 + 0.68x      38905.1
Green        ŷ = 1 + x              54545
Red          ŷ = 15 + 0.6x          47887.32
Regression Line
• Ultimately, we want to write the regression line as:
ŷ = intercept + slope × x
by finding the intercept and slope that give the line with the smallest sum of squared residuals.
Regression Line : Slope
• The slope can be calculated using the correlation
coefficient, 𝑟, and the standard deviations of the
explanatory (independent) variable (𝒔𝒙 ) and the
response (dependent) variable (𝒔𝒚 ):
slope = r × (s_y / s_x)
Regression Line : Slope
• Example : The mean fat content of the 30 Burger
King menu items is 23.5g with a standard deviation
of 16.4g, and the mean protein content of these
items is 17.2g with a standard deviation of 14g. If the
correlation between protein and fat contents is
0.83, calculate the slope of the linear model for
predicting fat content from protein content.
Regression Line : Slope
                          Fat (y)   Protein (x)
Mean                      23.5      17.2
Standard deviation        16.4      14
Correlation coefficient        0.83
• Find the slope of the linear model for predicting fat content from protein content:
slope = r × (s_y / s_x) = 0.83 × (16.4 / 14) ≈ 0.97
• The intercept is then found from the means:
intercept = ȳ − slope × x̄
Regression Line : Intercept
• Example (continued): with ȳ = 23.5 (mean fat) and x̄ = 17.2 (mean protein),
intercept = ȳ − slope × x̄ = 23.5 − 0.97 × 17.2 ≈ 6.8
Regression Line
• Putting the slope and intercept together:
predicted fat = 6.8 + 0.97 × protein
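A short Python sketch (not in the slides) that reproduces the slope and intercept from the summary statistics:

    # Least-squares slope and intercept from the Burger King protein/fat summaries.
    r = 0.83             # correlation between protein (x) and fat (y)
    s_x, s_y = 14, 16.4  # standard deviations of protein and fat (grams)
    x_bar, y_bar = 17.2, 23.5

    slope = r * s_y / s_x              # about 0.97 g of fat per gram of protein
    intercept = y_bar - slope * x_bar  # about 6.8 g of fat when protein is 0
    print(slope, intercept)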
Exercise
• The table shows the heights and weights of some
people. We want to predict weights from height.
Height (inches) Weight (pounds)
60 105
66 140
72 185
70 145
62 120
c) Find the best-fitting line and plot it on the scatterplot. Interpret the slope and the intercept in the context of the data.
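Purely as an illustration (not the exercise solution from the slides), the same steps can be carried out in Python using the slope and intercept formulas above:

    # Least-squares line for predicting weight (pounds) from height (inches).
    from statistics import mean, stdev

    height = [60, 66, 72, 70, 62]
    weight = [105, 140, 185, 145, 120]
    n = len(height)

    h_bar, w_bar = mean(height), mean(weight)
    s_h, s_w = stdev(height), stdev(weight)
    r = sum(((h - h_bar) / s_h) * ((w - w_bar) / s_w)
            for h, w in zip(height, weight)) / (n - 1)

    slope = r * s_w / s_h              # pounds of weight per extra inch of height
    intercept = w_bar - slope * h_bar  # predicted weight at height 0 (not meaningful by itself)
    print(r, slope, intercept)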
Measure of Goodness of fit
• If the correlation between 𝑥 and 𝑦 is 𝑟 = 1 or 𝑟 = −1,
then the model would predict 𝑦 perfectly, and the
residuals would all be zero. Hence, all of the variability
in the response variable would be explained by the
explanatory variable.
R² : Coefficient of determination
• In the Burger King menu model, r = 0.83, which is not perfect. In other words, protein cannot explain 100% of fat’s variation. Then how much?
• B) R² = 100%: the variation in y is perfectly explained by x.
• C) R² = 60.5%: some portion (60.5%) of the variation in y is explained by x.
R² : Properties
• R² = 0 means that none of the variance in y is explained by x.
• R² = 1 means that all of the variance in y is explained by x.
• While the correlation coefficient is between -1 and 1, R² is between 0 and 1:
0 ≤ R² ≤ 1
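As a sketch (reusing the small data set from the earlier correlation exercise and its least-squares line, computed here for illustration), R² can be obtained from the residuals:

    # R^2 = 1 - (sum of squared residuals) / (total sum of squares of y);
    # for a simple linear regression this equals the squared correlation r^2.
    x = [1, 2, 4, 5]
    y = [1, 3, 5, 7]
    y_bar = sum(y) / len(y)

    intercept, slope = -0.2, 1.4  # least-squares line for these points
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    print(1 - ss_res / ss_tot)    # close to 1: x explains most of the variation in y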
Exercise
• The correlation between weight and gas mileage of
cars is -0.96. Which of the following is the correct value
and interpretation of the linear model for predicting
gas mileage from weight?
a) -0.71
b) 0.71
c) 0.25
d) 7.07
Residual Plot
• The linear model assumes that the relationship
between the two variables is a perfect straight
line. The residuals are the part of the data that
has not been modeled.
Data = Model + Residual
or equivalently
Residual = Data – Model.
Residual Plot : Model Assessment
• Bad examples (the linear model does not fit):
• Fan-shaped residuals: the variance of the residuals grows as x gets larger (or vice versa).
• Curved residuals: residuals are negative for small x, then positive, and finally slightly negative again for large x.
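For reference, a minimal matplotlib sketch (not from the slides) of how a residual plot is drawn; fan shapes or curves in such a plot signal a poor linear fit:

    # Residual plot: residuals (data - model) plotted against x.
    # A well-fitting linear model leaves no pattern, just random scatter around zero.
    import matplotlib.pyplot as plt

    x = [1, 2, 4, 5]
    y = [1, 3, 5, 7]
    intercept, slope = -0.2, 1.4  # fitted line for these points

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    plt.scatter(x, residuals)
    plt.axhline(0)                # reference line at residual = 0
    plt.xlabel("x")
    plt.ylabel("Residual")
    plt.show()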
Extrapolation
• Do not extrapolate beyond the data, i.e., do not make a
prediction for an x value outside the range of the data –
the linear model may no longer hold outside that range
• For example, if the BK Broiler chicken sandwich has a
protein content of 75 grams, what is the predicted fat
content?
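A small sketch of why this is a problem: the arithmetic still produces a number, but 75 grams of protein lies far beyond the menu items used to fit the model:

    # Plugging protein = 75 g into the Burger King model gives a numeric answer,
    # but it is an extrapolation and should not be trusted.
    intercept, slope = 6.8, 0.97
    protein = 75
    predicted_fat = intercept + slope * protein
    print(predicted_fat)  # about 79.6 g of fat: a number, but not a reliable one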