York University Adms2320 Chapter 16 Example


© Michael Rochon 2011 – All Rights Reserved

Consider the following data values of variables x and y:

x 2 4 6 8 10 13
y 7 11 17 21 27 36

1. Determine the least squares regression line.


2. Find the predicted value of y for x = 9.
3. What does the value of the slope of the regression line tell you?
4. Calculate the coefficient of determination, and describe what this statistic tells
you about the relationship between the two variables.
5. Calculate Se (the standard error of estimate) and interpret the value.
6. Can we conclude at the 5% significance level that higher cigarette
consumption (x = number of cigarettes smoked) causes higher consumption of
coffee (y = number of cups of coffee drunk per week)?
7. Calculate the coefficient of correlation.

A)

I suggest putting the information that you require into tabular form. It lets you
organize the information, produce the statistics you need, and gives you a
quick way to check the work over. The following quantities will be needed:

  i     x_i    y_i    x_i^2    y_i^2    x_i y_i
  1      2      7       4        49       14
  2      4     11      16       121       44
  3      6     17      36       289      102
  4      8     21      64       441      168
  5     10     27     100       729      270
  6     13     36     169      1296      468

 n=6   ∑=43  ∑=119   ∑=389   ∑=2925   ∑=1066


$$
\bar{x} = \frac{43}{6} = 7.1667, \qquad \bar{y} = \frac{119}{6} = 19.8333 \quad \text{(need this later)}
$$
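The column totals and means can be double-checked with a few lines of Python. This is only a sketch for verifying the table; the variable names are illustrative and not part of the original example.

```python
# Data values from the example.
x = [2, 4, 6, 8, 10, 13]
y = [7, 11, 17, 21, 27, 36]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(xi ** 2 for xi in x)               # sum of x_i squared
sum_y2 = sum(yi ** 2 for yi in y)               # sum of y_i squared
sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # sum of cross products

x_bar = sum_x / n
y_bar = sum_y / n

print(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 6 43 119 389 2925 1066
print(round(x_bar, 4), round(y_bar, 4))         # 7.1667 19.8333
```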

First we will calculate the Covariance:


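Covariance

$$
\begin{aligned}
\operatorname{cov}(x, y) &= \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i y_i - \frac{\sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n}\right] \\
&= \frac{1}{6-1}\left[1066 - \frac{(43)(119)}{6}\right] \\
&= \frac{1}{5}\left[1066 - 852.8333\right] \\
&= 0.2\,[213.1667] \\
&= 42.6333
\end{aligned}
$$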
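As a quick check on the arithmetic, the shortcut covariance formula can be written as a small function. The function name `sample_covariance` is an assumption made for this sketch.

```python
def sample_covariance(x, y):
    """Shortcut formula: [sum(x*y) - sum(x)*sum(y)/n] / (n - 1)."""
    n = len(x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    return (sum_xy - sum(x) * sum(y) / n) / (n - 1)

x = [2, 4, 6, 8, 10, 13]
y = [7, 11, 17, 21, 27, 36]

cov_xy = sample_covariance(x, y)
print(round(cov_xy, 4))  # 42.6333
```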

Next we will need the variance of x:


Variance

$$
\begin{aligned}
s_x^2 &= \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right] \\
&= \frac{1}{5}\left[389 - \frac{(43)^2}{6}\right] \\
&= 0.20\,[389 - 308.1667] \\
&= 16.1667
\end{aligned}
$$
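The same shortcut can be sketched in Python and compared against the standard library's `statistics.variance`, which computes the sample variance (divisor n - 1). The helper name `sample_variance` is illustrative.

```python
import statistics

def sample_variance(values):
    """Shortcut formula: [sum(v^2) - (sum(v))^2 / n] / (n - 1)."""
    n = len(values)
    sum_sq = sum(v ** 2 for v in values)
    return (sum_sq - sum(values) ** 2 / n) / (n - 1)

x = [2, 4, 6, 8, 10, 13]

var_x = sample_variance(x)
print(round(var_x, 4))                   # 16.1667
print(round(statistics.variance(x), 4))  # 16.1667, same value from the library
```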

Now we are ready to calculate b1, the sample estimate of the slope β1:


Slope Coefficient

$$
b_1 = \frac{\operatorname{cov}(x, y)}{s_x^2} = \frac{42.6333}{16.1667} = 2.6371
$$

It is now time to calculate the y-intercept:


Y-Intercept

$$
\begin{aligned}
b_0 &= \bar{y} - b_1 \bar{x} \\
&= 19.8333 - 2.6371(7.1667) \\
&= 0.934
\end{aligned}
$$

Put the prior two calculations together and you get:


Least Squares Line is

$$
\hat{y} = 0.934 + 2.6371x
$$
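The whole of part A can be reproduced in one short function. This is a minimal sketch that follows the covariance-over-variance route used above; `least_squares_line` is a name chosen for illustration.

```python
def least_squares_line(x, y):
    """Return (b0, b1) for the line y-hat = b0 + b1 * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    cov_xy = (sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n) / (n - 1)
    var_x = (sum(xi ** 2 for xi in x) - sum_x ** 2 / n) / (n - 1)
    b1 = cov_xy / var_x                  # slope
    b0 = sum_y / n - b1 * (sum_x / n)    # intercept: y-bar minus b1 times x-bar
    return b0, b1

x = [2, 4, 6, 8, 10, 13]
y = [7, 11, 17, 21, 27, 36]

b0, b1 = least_squares_line(x, y)
print(f"y-hat = {b0:.4f} + {b1:.4f}x")  # y-hat = 0.9340 + 2.6371x
```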

B) When x = 9, the predicted value of y is:

$$
\hat{y} = 0.934 + 2.6371(9) = 24.6679
$$
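The prediction is a single evaluation of the fitted line. The sketch below uses the rounded coefficients from the hand calculation; `predict` is an illustrative helper name.

```python
def predict(b0, b1, x_new):
    """Evaluate the fitted line y-hat = b0 + b1 * x at x_new."""
    return b0 + b1 * x_new

# Coefficients rounded as in the hand calculation.
y_hat = predict(0.934, 2.6371, 9)
print(round(y_hat, 4))  # 24.6679
```

With unrounded coefficients the prediction comes out near 24.668, so the rounding only affects the fourth decimal place.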

C) The slope of the regression line tells us that if x increases by one unit,
y will increase by 2.6371 units on average.

D) Coefficient of Determination (remember, this is one of our assessment
techniques)

$$
R^2 = \frac{[\operatorname{cov}(x, y)]^2}{s_x^2\, s_y^2}
$$

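First we must calculate the variance of y:

$$
\begin{aligned}
s_y^2 &= \frac{1}{n-1}\left[\sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}\right] \\
&= \frac{1}{5}\left[2925 - \frac{(119)^2}{6}\right] \\
&= 0.2\,[2925 - 2360.1667] \\
&= 112.9667
\end{aligned}
$$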

Now we have all the information to answer the question:

$$
\begin{aligned}
R^2 &= \frac{[\operatorname{cov}(x, y)]^2}{s_x^2\, s_y^2} \\
&= \frac{(42.6333)^2}{(16.1667)(112.9667)} \\
&= \frac{1817.5983}{1826.2987} \\
&= 0.9952
\end{aligned}
$$

Interpretation: This means that 99.52% of the variation in the
dependent variable y is explained by the variation in the
independent variable x. The remaining 0.48% is unexplained.
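The coefficient of determination can be recomputed from the raw data as a check. This is a sketch only; the helper `shortcut_cov` is a hypothetical name, and it returns a variance when both arguments are the same list.

```python
def shortcut_cov(a, b):
    """Sample covariance via the shortcut formula (a variance when a is b)."""
    n = len(a)
    return (sum(ai * bi for ai, bi in zip(a, b)) - sum(a) * sum(b) / n) / (n - 1)

x = [2, 4, 6, 8, 10, 13]
y = [7, 11, 17, 21, 27, 36]

cov_xy = shortcut_cov(x, y)
var_x = shortcut_cov(x, x)
var_y = shortcut_cov(y, y)

r_squared = cov_xy ** 2 / (var_x * var_y)
print(round(var_y, 4))      # 112.9667
print(round(r_squared, 4))  # 0.9952
```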

E) Se is the standard error of estimate (another one of our assessment techniques
for a regression model). In order to find this value we need to
calculate the Sum of Squares for Error (SSE):

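$$
\begin{aligned}
SSE &= (n-1)\left(s_y^2 - \frac{[\operatorname{cov}(x, y)]^2}{s_x^2}\right) \\
&= 5\left(112.9667 - \frac{1817.5983}{16.1667}\right) \\
&= 5(112.9667 - 112.4285) \\
&= 5(0.5382) \\
&= 2.691
\end{aligned}
$$

$$
s_\varepsilon = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{2.691}{4}} = 0.8202
$$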

Interpretation: You want this number to be small. As the standard
error approaches zero, the points lie nearer to the regression line. In
essence, if the standard error is equal to zero, then all the points
fall on the line. In our case the standard error is very small compared
with the mean of y (19.8333), so the model fits well.
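SSE can also be obtained directly as the sum of squared residuals, which gives a useful cross-check on the shortcut formula. The sketch below computes it both ways; variable names are illustrative. At full precision the results are about 2.689 for SSE and 0.820 for the standard error, which differ from the hand-rounded 2.691 and .8202 only through rounding of intermediate values.

```python
import math

x = [2, 4, 6, 8, 10, 13]
y = [7, 11, 17, 21, 27, 36]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Sample covariance and variances (definitional form, divisor n - 1).
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
var_y = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)

# Route 1: shortcut formula for SSE.
sse_shortcut = (n - 1) * (var_y - cov_xy ** 2 / var_x)

# Route 2: sum of squared residuals around the fitted line.
b1 = cov_xy / var_x
b0 = y_bar - b1 * x_bar
sse_residuals = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

s_e = math.sqrt(sse_shortcut / (n - 2))  # standard error of estimate
print(round(sse_shortcut, 3), round(sse_residuals, 3), round(s_e, 3))
```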

F) Testing the Slope (this is the most important of our assessment techniques, as
there is no subjectivity in the analysis)

Hypothesis

$$
H_0: \beta_1 = 0 \qquad H_1: \beta_1 > 0
$$

(The alternative is "greater than" because we want to know if higher cigarette
consumption causes higher consumption of coffee.)

Rejection Region

$$
t > t_{\alpha,\nu} = t_{.05,4} = 2.132
$$

The critical value comes from the t table in the text, where ν = n - k - 1 (k is the
number of independent variables you have), so ν = 6 - 1 - 1 = 4.
Alpha (given in the original question) is 5%, thus α = .05; this also means that the
confidence level is 95%.

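Test Statistic

$$
SSE = (n-1)\left(s_y^2 - \frac{[\operatorname{cov}(x, y)]^2}{s_x^2}\right) = 2.691 \quad \text{(already have this)}
$$

$$
s_\varepsilon = \sqrt{\frac{SSE}{n-2}} = 0.8202 \quad \text{(already have this)}
$$

$$
s_{b_1} = \frac{s_\varepsilon}{\sqrt{(n-1)\,s_x^2}} = \frac{0.8202}{\sqrt{(5)(16.1667)}} = 0.0912
$$

$$
t = \frac{b_1 - \beta_1}{s_{b_1}} = \frac{2.6371 - 0}{0.0912} = 28.9156
$$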

Conclusion
Reject H0, since t = 28.9156 > 2.132. There is enough statistical evidence to infer
that the number of cigarettes smoked is a factor in the amount of coffee a smoker drinks.
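The test of the slope can be sketched end to end in Python. The critical value 2.132 is hard-coded from the t table, as in the worked solution, and `T_CRITICAL` and the other names are assumptions of this sketch. Carried at full precision the test statistic comes out at roughly 28.92 rather than the hand-rounded 28.9156; the conclusion is the same.

```python
import math

x = [2, 4, 6, 8, 10, 13]
y = [7, 11, 17, 21, 27, 36]
n = len(x)
T_CRITICAL = 2.132  # t_{.05,4}, read from a t table (one-tail, 4 degrees of freedom)

sum_x, sum_y = sum(x), sum(y)
sxx = sum(v * v for v in x) - sum_x ** 2 / n                    # (n - 1) * s_x^2
syy = sum(v * v for v in y) - sum_y ** 2 / n                    # (n - 1) * s_y^2
sxy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n      # (n - 1) * cov(x, y)

b1 = sxy / sxx                      # slope estimate
sse = syy - sxy ** 2 / sxx          # same as (n - 1)(s_y^2 - cov^2 / s_x^2)
s_e = math.sqrt(sse / (n - 2))      # standard error of estimate
s_b1 = s_e / math.sqrt(sxx)         # standard error of the slope
t_stat = (b1 - 0) / s_b1            # H0: beta_1 = 0

reject_h0 = t_stat > T_CRITICAL     # one-tail test, H1: beta_1 > 0
print(round(s_b1, 4), round(t_stat, 2), reject_h0)
```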

G)
Coefficient of Correlation

$$
\begin{aligned}
r &= \frac{\operatorname{cov}(x, y)}{s_x s_y} \\
&= \frac{42.6333}{(4.0207)(10.6286)} \\
&= \frac{42.6333}{42.7344} \\
&= 0.9976
\end{aligned}
$$

The coefficient of correlation measures the strength of the linear association
between two variables. We can only interpret r precisely when it is 1, 0, or -1; this
is why we default to using the coefficient of determination for the assessment
analysis of a model.
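As a final check, r can be computed from the raw data, and squaring it should reproduce the coefficient of determination from part D. This is a sketch with illustrative names.

```python
import math

x = [2, 4, 6, 8, 10, 13]
y = [7, 11, 17, 21, 27, 36]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))  # std. dev. of x
s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))  # std. dev. of y

r = cov_xy / (s_x * s_y)
print(round(s_x, 4), round(s_y, 4))  # 4.0208 10.6286
print(round(r, 4))                   # 0.9976
print(round(r ** 2, 4))              # 0.9952, the coefficient of determination
```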
