CH 11
[Scatter plot of y versus x]
d) $\hat{\beta}_1 = 0.00416$
11-3 a)
Regression Analysis: Rating Points versus Meters per Att
Analysis of Variance
Source DF SS MS F P
Regression 1 1669.0 1669.0 61.02 0.000
Residual Error 30 820.5 27.4
Total 31 2489.5
The model is $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$.
$S_{xx} = 1323.648 - \frac{(204.74)^2}{32} = 13.696$
$S_{xy} = 17516.34 - \frac{(204.74)(2714.1)}{32} = 151.1889$
$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{151.1889}{13.696} = 11.039$
$\hat{\beta}_0 = \frac{2714.1}{32} - (11.039)\left(\frac{204.74}{32}\right) = 14.187$
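As an informal check, the same least-squares arithmetic can be reproduced with a short Python sketch; the summary sums are the ones quoted above, and the variable names are chosen only for illustration.

```python
# Least-squares estimates for Exercise 11-3 from the summary sums quoted above.
n = 32
sum_x, sum_y = 204.74, 2714.1
sum_x2, sum_xy = 1323.648, 17516.34

Sxx = sum_x2 - sum_x ** 2 / n        # about 13.70
Sxy = sum_xy - sum_x * sum_y / n     # about 151.19
b1 = Sxy / Sxx                       # slope, about 11.04
b0 = sum_y / n - b1 * (sum_x / n)    # intercept, about 14.19
print(b0, b1)
```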
$\hat{y} = 14.2 + 11x$
b) $\hat{\sigma}^2 = MS_E = \frac{SS_E}{n-2} = \frac{820.5}{30} = 27.35$
c) $-\hat{\beta}_1 = -11$
d) $\frac{1}{11} \times 10 = 0.91$
e) $\hat{y} = 14.2 + 11(6.59) = 86.69$
11-4 a)
Regression Analysis - Linear model: Y = a+bX
Dependent variable: SalePrice Independent variable: Taxes
---------------------------------------------------------------------------
Parameter     Estimate    Standard Error    T Value    Prob. Level
Intercept     13.3202     2.57172           5.17948    .00003
Slope         3.32437     0.390276          8.518      .00000
---------------------------------------------------------------------------
Analysis of Variance
Source    Sum of Squares    Df    Mean Square    F-Ratio    Prob. Level
$\hat{\sigma}^2 = 8.76775$
If the calculations were to be done by hand, use Equations (11-7) and (11-8).
$\hat{y} = 13.3202 + 3.32437x$
$\hat{y} = 31.9496$
$e = y - \hat{y} = 28.9 - 31.9496 = -3.0496$
d) All the points would lie along a 45 degree line. That is, the regression model would estimate the
values exactly. At this point, the graph of observed vs. predicted indicates that the simple linear
regression model provides a reasonable fit to the data.
[Plot of predicted versus observed values]
11-5 a)
Regression Analysis - Linear model: Y = a+bX
Dependent variable: Usage Independent variable: Temperature
----------------------------------------------------------------------------
Parameter    Estimate    Standard Error    T    Prob.
$\hat{\sigma}^2 = 3$
If the calculations were to be done by hand, use Equations (11-7) and (11-8).
$\hat{y} = 130 + 7.59x$
$\hat{y} = 190.72$
11-6 a)
The regression equation is
MPG = 39.2 - 0.0402 Engine Displacement
Analysis of Variance
Source DF SS MS F P
Regression 1 385.18 385.18 27.49 0.000
Residual Error 19 266.24 14.01
Total 20 651.41
$\hat{\sigma}^2 = 14.01$
$\hat{y} = 39.2 - 0.0402x$
c) $\hat{y} = 34.2956$
11-7 a)
[Scatter plot of y versus x]
Analysis of Variance
Source DF SS MS F P
Regression 1 322.50 322.50 44.03 0.000
Error 11 80.57 7.32
Total 12 403.08
$\hat{\sigma}^2 = 7.3212$
$\hat{y} = -16.5093 + 0.0693554x$
11-8 a)
[Scatter plot of y versus x]
Yes, a linear regression would seem appropriate, but one or two points might be outliers.
Analysis of Variance
Source DF SS MS F P
Regression 1 92.934 92.934 53.50 0.000
Residual Error 18 31.266 1.737
Total 19 124.200
c) $\hat{y} = 5.5541$ at x = 90
11-9 a)
[Scatter plot of y versus x]
Analysis of Variance
Source DF SS MS F P
Regression 1 20329 20329 51.04 0.000
Error 7 2788 398
Total 8 23117
d) $\hat{y} = 156.883$, $e = 15.1175$
11-10 a)
[Scatter plot of y versus x]
Yes, a simple linear regression model seems appropriate for these data.
Analysis of Variance
Source DF SS MS F P
Regression 1 1273.5 1273.5 92.22 0.000
Error 16 220.9 13.8
Total 17 1494.5
b) $\hat{\sigma}^2 = 13.81$
$\hat{y} = 0.470467 + 20.5673x$
11-11 a)
Yes, a simple linear regression (straight-line) model seems plausible for this situation.
Analysis of Variance
Source DF SS MS F P
Regression 1 72222688 72222688 156.67 0.000
Residual Error 18 8297849 460992
Total 19 80520537
b) $\hat{\sigma}^2 = 460992$
$\hat{y} = 18090.2 - 254.55x$
d) If there were no error, the values would all lie along the 45° line. The plot indicates age is a
reasonable regressor variable.
11-12 a)
The regression equation is
Porosity = 55.6 - 0.0342 Temperature
Analysis of Variance
Source DF SS MS F P
Regression 1 136.68 136.68 1.77 0.241
Residual Error 5 386.65 77.33
Total 6 523.33
$\hat{y} = 55.63 - 0.03416x$
$\hat{\sigma}^2 = 77.33$
c) $\hat{y} = 4.39$, $e = 7.012$
d)
The simple linear regression model does not seem appropriate because the scatter plot does not show a linear relationship between the variables.
11-13 a)
The regression equation is
BOD = 0.658 + 0.178 Time
Analysis of Variance
Source DF SS MS F P
Regression 1 13.344 13.344 161.69 0.000
Residual Error 9 0.743 0.083
Total 10 14.087
$\hat{y} = 0.658 + 0.178x$
$\hat{\sigma}^2 = 0.083$
c) 0.178(3) = 0.534
e)
All the points would lie along the 45 degree line y = yˆ . That is, the regression model would estimate
the values exactly. At this point, the graph of observed vs. predicted indicates that the simple linear
regression model provides a reasonable fit to the data.
11-14 a)
The regression equation is
Deflection = 32.0 - 0.277 Stress level
Analysis of Variance
Source DF SS MS F P
Regression 1 45.154 45.154 40.38 0.000
Residual Error 7 7.827 1.118
Total 8 52.981
$\hat{\sigma}^2 = 1.118$
c) $(-0.277)(5) = -1.385$
d) $\frac{1}{0.277} = 3.61$
e) $\hat{y} = 32.05 - 0.277(75) = 11.275$, $e = y - \hat{y} = 12.534 - 11.275 = 1.259$
11-15
It is possible to fit these data with a straight-line model, but the fit is not good. The scatter plot shows curvature.
a)
The regression equation is
y = 2.02 + 0.0287 x
Analysis of Variance
11-13
Applied Statistics and Probability for Engineers, 5th edition February 23, 2010
Source DF SS MS F P
Regression 1 1.3280 1.3280 52.42 0.000
Residual Error 25 0.6333 0.0253
Total 26 1.9613
$\hat{y} = 2.02 + 0.0287x$
$\hat{\sigma}^2 = 0.0253$
c)
If the relationship between length and age were deterministic, the points would fall on the 45-degree line $y = \hat{y}$. Because the points in this plot vary substantially from this line, it appears that age is not a perfect predictor of length.
b) $\hat{\beta}_1 = 0.00749$
11-17 Let x = engine displacement (cm³) and x_old = engine displacement (in³).
b) $\hat{\beta}_1 = -0.0025$
11-19 a) The slopes of both regression models will be the same, but the intercept will be shifted.
b) $\hat{y} = 2132.41 - 36.9618x$
11-20 a) The least squares estimate minimizes $\sum (y_i - \beta x_i)^2$. Setting the derivative with respect to $\beta$ equal to zero, we obtain
$2\sum (y_i - \beta x_i)(-x_i) = 2\left[-\sum y_i x_i + \beta \sum x_i^2\right] = 0$
Therefore, $\hat{\beta} = \frac{\sum y_i x_i}{\sum x_i^2}$.
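A minimal Python sketch of the no-intercept estimator derived above; the data values are made up purely to illustrate the formula.

```python
# Through-the-origin least squares: beta_hat = sum(x*y) / sum(x^2).
# The x, y values below are illustrative only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
print(beta_hat)
```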
[Scatter plot of chloride versus watershed]
Section 11-4
11-21 a) $T_0 = \frac{\hat{\beta}_0 - \beta_0}{se(\hat{\beta}_0)} = \frac{12.857}{1.032} = 12.4583$
$T_1 = \frac{\hat{\beta}_1 - \beta_1}{se(\hat{\beta}_1)} = \frac{2.3445}{0.115} = 20.387$
11-22 a) $T_0 = \frac{\hat{\beta}_0 - \beta_0}{se(\hat{\beta}_0)} = \frac{26.753}{2.373} = 11.2739$
P-value = 2 P($T_{14}$ > 11.2739), so P-value < 2(0.0005) = 0.001
$T_1 = \frac{\hat{\beta}_1 - \beta_1}{se(\hat{\beta}_1)} = \frac{1.4756}{0.1063} = 13.8815$
P-value = 2 P($T_{14}$ > 13.8815), so P-value < 2(0.0005) = 0.001
Degrees of freedom of the residual error = 15 − 1 = 14
Sum of squares for regression = total sum of squares − residual sum of squares = 1500 − 94.8 = 1405.2
$MS_{\text{Regression}} = \frac{SS_{\text{Regression}}}{1} = \frac{1405.2}{1} = 1405.2$
$F_0 = \frac{MS_R}{MS_E} = \frac{1405.2}{7.3} = 192.4932$
3) $H_1: \beta_1 \ne 0$
4) $\alpha = 0.01$
5) The test statistic is $f_0 = \frac{MS_R}{MS_E} = \frac{SS_R/1}{SS_E/(n-2)}$
$SS_R = \hat{\beta}_1 S_{xy} = (-2.3298017)(-59.057143) = 137.59$
$SS_E = S_{yy} - SS_R = 159.71429 - 137.59143 = 22.123$
$f_0 = \frac{137.59}{22.123/12} = 74.63$
8) Since 74.63 > 9.33, reject H0 and conclude that compressive strength is significant in predicting intrinsic permeability of concrete at $\alpha = 0.01$. We can therefore conclude that the model specifies a useful linear relationship between these two variables.
P-value ≅ 0.000002
b) $\hat{\sigma}^2 = MS_E = \frac{SS_E}{n-2} = \frac{22.123}{12} = 1.8436$ and $se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} = \sqrt{\frac{1.8436}{25.3486}} = 0.2696$
c) $se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]} = \sqrt{1.8436\left[\frac{1}{14} + \frac{3.0714^2}{25.3486}\right]} = 0.9043$
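The quantities for this exercise can be reproduced with the following Python sketch; the summary statistics are the ones quoted above, and the code is only an illustration of the formulas, not part of the original solution.

```python
from math import sqrt

# Summary statistics quoted above for Exercise 11-23 (n = 14).
n = 14
b1 = -2.3298017            # estimated slope
Sxy = -59.057143
Syy = 159.71429
Sxx = 25.3486
xbar = 3.0714

ss_r = b1 * Sxy                                   # about 137.59
ss_e = Syy - ss_r                                 # about 22.12
f0 = ss_r / (ss_e / (n - 2))                      # about 74.6
sigma2 = ss_e / (n - 2)                           # MS_E, about 1.84
se_b1 = sqrt(sigma2 / Sxx)                        # about 0.270
se_b0 = sqrt(sigma2 * (1 / n + xbar ** 2 / Sxx))  # about 0.904
print(ss_r, ss_e, f0, sigma2, se_b1, se_b0)
```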
11-24 a) 1) The parameter of interest is the regressor variable coefficient, β1.
2) H 0 : β1 = 0
3) H1 : β1 ≠ 0
4) α = 0.01
5) The test statistic is
$f_0 = \frac{MS_R}{MS_E} = \frac{SS_R/1}{SS_E/(n-2)}$
$SS_R = \hat{\beta}_1 S_{xy} = (0.0041612)(141.445) = 0.5886$
$SS_E = S_{yy} - SS_R = \left(8.86 - \frac{12.75^2}{20}\right) - 0.5886 = 0.143275$
$f_0 = \frac{0.5886}{0.143275/18} = 73.95$
8) Since 73.95 > 8.29, reject H0 and conclude the model specifies a useful relationship at $\alpha = 0.01$.
P-value ≅ 0.000001
b) $se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} = \sqrt{\frac{0.00796}{33991.6}} = 4.8391 \times 10^{-4}$
$se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]} = \sqrt{0.00796\left[\frac{1}{20} + \frac{73.9^2}{33991.6}\right]} = 0.04091$
11-25 a)
Regression Analysis: Rating Pts versus Yds per Att
Analysis of Variance
Source DF SS MS F P
Regression 1 1672.5 1672.5 61.41 0.000
Residual Error 30 817.1 27.2
Total 31 2489.5
b) $\hat{\sigma}^2 = 27.2$
$se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} = \sqrt{\frac{27.2}{16.422}} = 1.287$
$se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]} = \sqrt{27.2\left[\frac{1}{32} + \frac{7^2}{16.422}\right]} = 9.056$
3) $H_1: \beta_1 \ne 10$
4) $\alpha = 0.01$
5) The test statistic is $t_0 = \frac{\hat{\beta}_1 - \beta_{1,0}}{se(\hat{\beta}_1)}$
3) $H_1: \beta_1 \ne 0$
5) The test statistic is $t_0 = \frac{\hat{\beta}_1}{se(\hat{\beta}_1)}$
3) $H_1: \beta_1 \ne 0$
4) $\alpha = 0.05$
5) The test statistic is $f_0 = \frac{MS_R}{MS_E} = \frac{SS_R/1}{SS_E/(n-2)}$
c) $se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} = \sqrt{\frac{8.7675}{57.5631}} = 0.39027$
$se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]} = \sqrt{8.7675\left[\frac{1}{24} + \frac{6.4049^2}{57.5631}\right]} = 2.5717$
3) $H_1: \beta_0 \ne 0$
5) The test statistic is $t_0 = \frac{\hat{\beta}_0}{se(\hat{\beta}_0)}$
3) $H_1: \beta_1 \ne 0$
4) $\alpha = 0.05$
5) The test statistic is $f_0 = \frac{MS_R}{MS_E} = \frac{SS_R/1}{SS_E/(n-2)}$
3) $H_1: \beta_1 \ne 10$
5) The test statistic is $t_0 = \frac{\hat{\beta}_1 - \beta_{1,0}}{se(\hat{\beta}_1)}$
P-value < 0.005. Reject H0 and conclude that the intercept should be included in the model.
$f_0 = \frac{MS_R}{MS_E} = \frac{385.18}{14.01} = 27.49$
$F_{0.01,1,19} = 8.18$
Reject the null hypothesis and conclude that the slope is not zero. The exact P-value is P = 0.0000463.
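Assuming SciPy is available, the critical value and the exact P-value quoted for this F statistic can be checked with a short sketch (f0 = 27.49 with 1 and 19 degrees of freedom, as above):

```python
from scipy import stats

f0, df1, df2 = 27.49, 1, 19
f_crit = stats.f.ppf(0.99, df1, df2)   # F_{0.01,1,19}, about 8.18
p_value = stats.f.sf(f0, df1, df2)     # about 4.6e-05
print(f_crit, p_value)
```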
(d) H 0 : β 0 = 0; H1 : β 0 ≠ 0
H1 : β1 ≠ 0
α = 0.05
f 0 = 44.0279
f.05,1,11 = 4.84
f 0 > fα ,1,11
c) H 0 : β 0 = 0
H1 : β 0 ≠ 0
α = 0.05
t0 = −1.67718
t.025,11 = 2.201
$|t_0| \not> t_{\alpha/2,11}$, so do not reject H0.
H1 : β1 ≠ 0
α = 0.05
f 0 = 53.50
f.05,1,18 = 4.414
f 0 > fα ,1,18
c) H 0 : β 0 = 0
H1 : β 0 ≠ 0
α = 0.05
t0 = − 5.079
t.025,18 = 2.101
| t0 |> tα / 2,18
H1 : β1 ≠ 0
α = 0.01
f 0 = 156.67
f.01,1,18 = 8.285
f 0 > fα ,1,18
c) H 0 : β1 = −206.84
H1 : β1 ≠ −206.84
α = 0.01
$t_0 = \frac{-254.55 - (-206.84)}{20.34} = -2.3456$
$t_{0.005,18} = 2.878$
$|t_0| \not> t_{\alpha/2,18}$, so do not reject H0.
d) H 0 : β 0 = 0
H1 : β 0 ≠ 0
α = 0.01
$t_0 = 58.20$
$t_{0.005,18} = 2.878$; because $|t_0| > t_{0.005,18}$, reject H0.
e) H 0 : β 0 = 17236.89
H1 : β 0 > 17236.89
α = 0.01
$t_0 = \frac{18090.2 - 17236.89}{310.8} = 2.7455$
$t_{0.01,18} = 2.552$; because $t_0 > t_{0.01,18}$, reject H0.
H1 : β1 ≠ 0
α = 0.05
f 0 = 92.224
f 0.05,1,16 = 4.49
f 0 > fα ,1,16
d) H 0 : β 0 = 0
H1 : β 0 ≠ 0
α = 0.05
t0 = 0.243
t0.025,16 = 2.12
$t_0 \not> t_{\alpha/2,16}$
Therefore, do not reject H0. There is not sufficient evidence to conclude that the intercept differs
from zero. Based on this test result, the intercept could be removed from the model.
b) $\hat{\sigma}^2 = 0.083$
The standard errors for the parameters can be obtained from the computer output or calculated as
follows.
$se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} = \sqrt{\frac{0.083}{420.91}} = 0.014$
$se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]} = \sqrt{0.083\left[\frac{1}{11} + \frac{10.09^2}{420.91}\right]} = 0.1657$
3) $H_1: \beta_0 \ne 0$
4) $\alpha = 0.01$
5) The test statistic is $t_0 = \frac{\hat{\beta}_0}{se(\hat{\beta}_0)}$
6) Reject H0 if $t_0 < -t_{\alpha/2,n-2}$ where $-t_{0.005,9} = -3.250$ or $t_0 > t_{\alpha/2,n-2}$ where $t_{0.005,9} = 3.250$
c) $\hat{\sigma}^2 = 1.118$
$se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} = \sqrt{\frac{1.118}{588}} = 0.0436$
$se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]} = \sqrt{1.118\left[\frac{1}{9} + \frac{65.67^2}{588}\right]} = 2.885$
11-35 a) H 0 : β1 = 0
H1 : β1 ≠ 0
α = 0.01
Because the P-value = 0.310 > α = 0.01, fail to reject H0. There is not sufficient evidence of a linear
relationship between these two variables.
Analysis of Variance
Source DF SS MS F P
Regression 1 36.68 36.68 1.20 0.310
Residual Error 7 214.83 30.69
Total 8 251.51
b) $\hat{\sigma}^2 = 30.69$, $se(\hat{\beta}_1) = 0.2340$, $se(\hat{\beta}_0) = 9.141$ from the computer output
c) $se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right]} = \sqrt{30.69\left[\frac{1}{9} + \frac{38.256^2}{560.342}\right]} = 9.141$
11-36 $t_0 = \frac{\hat{\beta}_1}{\sqrt{\hat{\sigma}^2 / S_{xx}}}$. After the transformation, $\hat{\beta}_1^* = \frac{b}{a}\hat{\beta}_1$, $S_{xx}^* = a^2 S_{xx}$, $\bar{x}^* = a\bar{x}$, $\hat{\beta}_0^* = b\hat{\beta}_0$, and $\hat{\sigma}^{*2} = b^2\hat{\sigma}^2$, so the value of the test statistic $t_0$ is unchanged.
11-37 $d = \frac{|10 - 12.5|}{5.5\sqrt{31/16.422}} = 0.331$
Assume α = 0.05, from Chart VII (e) and interpolating between the curves for n = 30 and n = 40,
β ≅ 0.55
11-38 a) $\frac{\hat{\beta}}{\sqrt{\hat{\sigma}^2 / \sum x_i^2}}$ has a t distribution with n − 1 degrees of freedom.
$42.1885 \pm (2.179)\sqrt{1.844\left(\frac{1}{14} + \frac{(2.5 - 3.0714)^2}{25.3486}\right)}$
$42.1885 \pm 2.179(0.3943)$
$41.3293 \le \hat{\mu}_{Y|x_0} \le 43.0477$
$\hat{y}_0 \pm t_{0.005,12}\sqrt{\hat{\sigma}^2\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}$
$42.1885 \pm 3.055\sqrt{1.844\left(1 + \frac{1}{14} + \frac{(2.5 - 3.0714)^2}{25.348571}\right)}$
$42.1885 \pm 3.055(1.4056)$
$37.8944 \le y_0 \le 46.4826$
It is wider because it depends on both the errors associated with the fitted model and the future observation.
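A short Python sketch reproduces both intervals from the summary values used above (confidence interval for the mean response and prediction interval, with the t quantiles quoted in the solution); it is an illustration of the interval formulas, not the manual's own code.

```python
from math import sqrt

# Values used in the interval calculations above (n = 14, x0 = 2.5).
n, x0, xbar = 14, 2.5, 3.0714
Sxx, sigma2 = 25.3486, 1.844
y_hat = 42.1885
t_ci, t_pi = 2.179, 3.055   # t_{0.025,12} and t_{0.005,12}, as quoted above

half_ci = t_ci * sqrt(sigma2 * (1 / n + (x0 - xbar) ** 2 / Sxx))
half_pi = t_pi * sqrt(sigma2 * (1 + 1 / n + (x0 - xbar) ** 2 / Sxx))
print(y_hat - half_ci, y_hat + half_ci)   # about (41.33, 43.05)
print(y_hat - half_pi, y_hat + half_pi)   # about (37.89, 46.48)
```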
$0.0041612 \pm (2.878)(0.000484)$
$0.0027682 \le \beta_1 \le 0.0055542$
$0.3299892 \pm (2.878)(0.04095)$
$0.212250 \le \beta_0 \le 0.447728$
$\hat{\mu}_{Y|x_0} = 0.683689$
$\hat{\mu}_{Y|x_0} \pm t_{0.005,18}\sqrt{\hat{\sigma}^2\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}$
$0.683689 \pm (2.878)\sqrt{0.00796\left(\frac{1}{20} + \frac{(85 - 73.9)^2}{33991.6}\right)}$
$0.683689 \pm 0.0594607$
$0.6242283 \le \hat{\mu}_{Y|x_0} \le 0.7431497$
$\hat{y}_0 = 0.7044949$
$\hat{y}_0 \pm t_{0.005,18}\sqrt{\hat{\sigma}^2\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}$
$0.7044949 \pm 2.878\sqrt{0.00796\left(1 + \frac{1}{20} + \frac{(90 - 73.9)^2}{33991.6}\right)}$
$0.7044949 \pm 0.263567$
$0.420122 \le y_0 \le 0.947256$
c) 99% confidence interval for the mean rating when the average yards per attempt is 8.0:
$\hat{\mu} = 14.195 + 10.092(8.0) = 94.931$
$\hat{\mu} \pm t_{0.005,30}\sqrt{\hat{\sigma}^2\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}$
$94.931 \pm 2.75\sqrt{27.2\left(\frac{1}{32} + \frac{(8 - 7)^2}{16.422}\right)}$
$90.577 \le \mu \le 99.285$
$\hat{y} \pm t_{0.005,30}\sqrt{\hat{\sigma}^2\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}$
$94.931 \pm 2.75\sqrt{27.2\left(1 + \frac{1}{32} + \frac{(8 - 7)^2}{16.422}\right)}$
$79.943 \le y_0 \le 109.919$
11-42
Regression Analysis: Price versus Taxes
Analysis of Variance
Source DF SS MS F P
Regression 1 636.16 636.16 72.56 0.000
Residual Error 22 192.89 8.77
Total 23 829.05
c) $38.253 \pm (2.074)\sqrt{8.76775\left(\frac{1}{24} + \frac{(7.5 - 6.40492)^2}{57.563139}\right)}$
$38.253 \pm 1.5353$
$36.7177 \le \hat{\mu}_{Y|x_0} \le 39.7883$
d) $38.253 \pm (2.074)\sqrt{8.76775\left(1 + \frac{1}{24} + \frac{(7.5 - 6.40492)^2}{57.563139}\right)}$
$38.253 \pm 6.3302$
$31.9228 \le y_0 \le 44.5832$
11-43
Regression Analysis: Usage versus Temperature
Analysis of Variance
Source DF SS MS F P
Regression 1 57701 57701 17148.85 0.000
Residual Error 10 34 3
Total 11 57734
c) $228.67 \pm (2.228)\sqrt{3\left(\frac{1}{12} + \frac{(13 - 8.08)^2}{1000.917}\right)}$
$228.67 \pm 1.26536$
$227.4046 \le \hat{\mu}_{Y|x_0} \le 229.9354$
d) $228.67 \pm (2.228)\sqrt{3\left(1 + \frac{1}{12} + \frac{(13 - 8.08)^2}{1000.917}\right)}$
$228.67 \pm 4.061644$
$224.6084 \le y_0 \le 232.73164$
It is wider because the prediction interval includes errors for both the fitted model and for a future
observation.
Variable    n    Mean    Sum      Sum of Squares
x           21   238.9   5017.0   1436737.0
$33.15 \pm 2.093\sqrt{14.01\left[1 + \frac{1}{21} + \frac{(150 - 238.9)^2}{1436737.0}\right]}$
$33.15 \pm 8.0394$
$25.11 \le Y_0 \le 41.19$
b) $-47.0877 \le \beta_0 \le 14.0691$
c) $46.6041 \pm (3.106)\sqrt{7.324951\left(\frac{1}{13} + \frac{(910 - 939)^2}{67045.97}\right)}$
$46.6041 \pm 2.514401$
$44.0897 \le \mu_{y|x_0} \le 49.1185$
d) $46.6041 \pm (3.106)\sqrt{7.324951\left(1 + \frac{1}{13} + \frac{(910 - 939)^2}{67045.97}\right)}$
$46.6041 \pm 8.779266$
$37.8298 \le y_0 \le 55.3784$
b) $-14.3002 \le \beta_0 \le -5.32598$
c) $4.76301 \pm (2.101)\sqrt{1.982231\left(\frac{1}{20} + \frac{(85 - 82.3)^2}{3010.2111}\right)}$
$4.76301 \pm 0.6772655$
$4.0857 \le \mu_{y|x_0} \le 5.4403$
d) $4.76301 \pm (2.101)\sqrt{1.982231\left(1 + \frac{1}{20} + \frac{(85 - 82.3)^2}{3010.2111}\right)}$
$4.76301 \pm 3.0345765$
$1.7284 \le y_0 \le 7.7976$
b) $-4.67015 \le \beta_0 \le -2.34696$
c) $128.814 \pm (2.365)\sqrt{398.2804\left(\frac{1}{9} + \frac{(30 - 24.5)^2}{1651.4214}\right)}$
$128.814 \pm 16.980124$
$111.8339 \le \mu_{y|x_0} \le 145.7941$
b) $-5.18501 \le \beta_0 \le 6.12594$
c) $21.038 \pm (2.921)\sqrt{13.8092\left(\frac{1}{18} + \frac{(1 - 0.806111)^2}{3.01062}\right)}$
$21.038 \pm 2.8314277$
$18.2066 \le \mu_{y|x_0} \le 23.8694$
d) $21.038 \pm (2.921)\sqrt{13.8092\left(1 + \frac{1}{18} + \frac{(1 - 0.806111)^2}{3.01062}\right)}$
$21.038 \pm 11.217861$
$9.8201 \le y_0 \le 32.2559$
b) $17195.7176 \le \beta_0 \le 18984.6824$
c) $12999.2 \pm 2.878\sqrt{460992\left(\frac{1}{20} + \frac{(20 - 13.3375)^2}{1114.6618}\right)}$
$12999.2 \pm 585.64$
$12413.56 \le \mu_{y|x_0} \le 13584.84$
d) $12999.2 \pm 2.878\sqrt{460992\left(1 + \frac{1}{20} + \frac{(20 - 13.3375)^2}{1114.6618}\right)}$
$12999.2 \pm 2039.93$
$10959.27 \le y_0 \le 15039.14$
$\hat{\mu} \pm t_{0.005,5}\sqrt{\hat{\sigma}^2\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}$
$4.63 \pm 4.032\sqrt{77.33\left(\frac{1}{7} + \frac{(1500 - 1242.86)^2}{117142.8}\right)}$
$4.63 \pm 4.032(7.396)$
$-25.19 \le \mu \le 34.45$
$\hat{y} \pm t_{0.005,5}\sqrt{\hat{\sigma}^2\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}$
$4.63 \pm 4.032\sqrt{77.33\left(1 + \frac{1}{7} + \frac{(1500 - 1242.86)^2}{117142.8}\right)}$
$4.63 \pm 4.032(11.49)$
$-41.7 \le y_0 \le 50.96$
It is wider because it depends on the error associated with the fitted model as well as that of the future observation.
Section 11-7
11-52 $R^2 = \hat{\beta}_1^2\,\frac{S_{XX}}{S_{YY}} = (-2.330)^2\,\frac{25.35}{159.71} = 0.8617$
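The same R² can be obtained two equivalent ways, as the sketch below illustrates with the quantities quoted above:

```python
# R^2 for Exercise 11-52 computed two equivalent ways.
b1, Sxx, Syy = -2.330, 25.35, 159.71
ss_r = b1 ** 2 * Sxx                     # regression sum of squares
r_squared = ss_r / Syy                   # about 0.862
r_squared_alt = 1 - (Syy - ss_r) / Syy   # 1 - SS_E / S_yy
print(r_squared, r_squared_alt)
```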
a) R 2 = 0.672
The model accounts for 67.2% of the variability in the data.
b) There is no major departure from the normality assumption in the following graph.
11-54 Use the results from Exercise 11-5 to answer the following questions.
a) SalePrice Taxes Predicted Residuals
25.9 4.9176 29.6681073 -3.76810726
29.5 5.0208 30.0111824 -0.51118237
27.9 4.5429 28.4224654 -0.52246536
25.9 4.5573 28.4703363 -2.57033630
29.9 5.0597 30.1405004 -0.24050041
29.9 3.8910 26.2553078 3.64469225
30.9 5.8980 32.9273208 -2.02732082
28.9 5.6039 31.9496232 -3.04962324
35.9 5.8282 32.6952797 3.20472030
31.5 5.3003 30.9403441 0.55965587
31.0 6.2712 34.1679762 -3.16797616
30.9 5.9592 33.1307723 -2.23077234
30.0 5.0500 30.1082540 -0.10825401
36.9 8.2464 40.7342742 -3.83427422
41.9 6.6969 35.5831610 6.31683901
40.5 7.7841 39.1974174 1.30258260
43.9 9.0384 43.3671762 0.53282376
37.5 5.9894 33.2311683 4.26883165
37.9 7.5422 38.3932520 -0.49325200
44.5 8.7951 42.5583567 1.94164328
37.9 6.0831 33.5426619 4.35733807
38.9 8.3607 41.1142499 -2.21424985
36.9 8.1400 40.3805611 -3.48056112
45.8 9.1416 43.7102513 2.08974865
b) Assumption of normality does not seem to be violated since the data appear to fall along a straight
line.
[Normal probability plot of the residuals]
c) There are no serious departures from the assumption of constant variance. This is evident by the
random pattern of the residuals.
[Plots of residuals versus predicted values and residuals versus taxes]
d) R² = 76.73%
11-55 Use the results of Exercise 11-5 to answer the following questions
[Normal probability plot of the residuals]
c) There might be lower variance at the middle settings of x. However, this data does not indicate a
serious departure from the assumptions.
[Residual plots]
11-56 a) R 2 = 20.1121%
b) These plots might indicate the presence of outliers, but no real problem with assumptions.
[Normal probability plot of the residuals]
11-57 a) R 2 = 0.879397
b) No departures from constant variance are noted.
[Plots of residuals versus x and residuals versus the fitted values (response is y)]
[Normal probability plot of the residuals]
11-58 a) R 2 = 71.27%
b) No major departure from normality assumptions.
[Normal probability plot of the residuals (response is y)]
[Plots of residuals versus x and residuals versus the fitted values]
11-59 a) R² = 85.22%
b) Assumptions appear reasonable, but there is a suggestion that variability increases slightly with ŷ .
[Plots of residuals versus x and residuals versus the fitted values (response is y)]
c) Normality assumption may be questionable. There is some “bending” away from a line in the tails
of the normal probability plot.
[Normal probability plot of the residuals]
11-60 a)
The regression equation is
Compressive Strength = - 2150 + 185 Density
Analysis of Variance
Source DF SS MS F P
Regression 1 28209679 28209679 245.15 0.000
Residual Error 40 4602769 115069
Total 41 32812448
c) $\hat{\sigma}^2 = 115069$
d) $R^2 = \frac{SS_R}{SS_T} = 1 - \frac{SS_E}{SS_T} = \frac{28209679}{32812448} = 0.8597 = 85.97\%$
c) $R^2_{\text{new model}} = 0.8799$
Smaller, because the older model is better able to account for the variability in the data with these two outlying data points removed.
d) $\hat{\sigma}^2_{\text{old model}} = 460992$ and $\hat{\sigma}^2_{\text{new model}} = 188474$
Yes, reduced by more than 50%, because the two removed points accounted for a large amount of the error.
11-62 a)
$\hat{y} = 0.55 + 0.05237x$
b) H0 : β1 = 0 H1 : β1 ≠ 0 α = 0.05
f0 = 7.41
f.05,1,12 = 4.75
f0 > fα ,1,12
Reject H0.
c) $\hat{\sigma}^2 = 26.97$
d) $\hat{\sigma}^2_{\text{orig}} = 7.502$
The new estimate is larger because the new point added additional variance that was not accounted
for by the model.
e) $\hat{y} = 0.55 + 0.05237(860) = 45.5882$
$e = y - \hat{y} = 60 - 45.5882 = 14.4118$
g) Constant variance assumption appears valid except for the added point.
11-63 Yes, when the residuals are standardized the unusual residuals are easier to identify.
1.11907 −0.75653 −0.13113 0.68314 −2.49705 −2.26424 0.51810
0.48210 0.11676 0.40780 0.22274 −0.93513 0.88167 0.76461
−0.49995 0.99241 0.12989 0.39831 1.15898 −0.82134
Then,
$V(e_i) = \sigma^2 + \sigma^2\left[\frac{1}{n} + \frac{(x_i - \bar{x})^2}{S_{xx}}\right] - 2\sigma^2\left[\frac{1}{n} + \frac{(x_i - \bar{x})^2}{S_{xx}}\right] = \sigma^2\left[1 - \left(\frac{1}{n} + \frac{(x_i - \bar{x})^2}{S_{xx}}\right)\right]$
a) Because ei is divided by an estimate of its standard error (when σ 2 is estimated by σˆ 2 ), ri has
approximately unit variance.
b) No, the term in brackets in the denominator is necessary.
c) If xi is near x and n is reasonably large, ri is approximately equal to the standardized residual.
d) If xi is far from x , the standard error of ei is small. Consequently, extreme points are better fit by
least squares regression than points near the middle range of x. Because the studentized residual at
any point has variance of approximately one, the studentized residuals can be used to compare the fit
of points to the regression line over the range of x.
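A NumPy sketch of the two residual scalings discussed here, on made-up data; only the formulas mirror the discussion above.

```python
import numpy as np

# Standardized vs. studentized residuals for a simple linear fit (illustrative data).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 4.3, 5.8, 8.2, 9.7, 12.4])

n = len(x)
xbar = x.mean()
Sxx = np.sum((x - xbar) ** 2)
b1 = np.sum((x - xbar) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * xbar

e = y - (b0 + b1 * x)                        # ordinary residuals
sigma2 = np.sum(e ** 2) / (n - 2)            # MS_E
h = 1 / n + (x - xbar) ** 2 / Sxx            # the bracketed leverage term
standardized = e / np.sqrt(sigma2)           # residuals scaled by sigma-hat only
studentized = e / np.sqrt(sigma2 * (1 - h))  # r_i, variance approximately one
print(standardized, studentized)
```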
11-65 Using $R^2 = 1 - \frac{SS_E}{S_{yy}}$,
$F_0 = \frac{(n-2)R^2}{1 - R^2} = \frac{(n-2)\left(1 - \frac{SS_E}{S_{yy}}\right)}{\frac{SS_E}{S_{yy}}} = \frac{S_{yy} - SS_E}{\frac{SS_E}{n-2}} = \frac{S_{yy} - SS_E}{\hat{\sigma}^2}$
Also,
$SS_E = \sum (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2 = \sum (y_i - \bar{y} - \hat{\beta}_1(x_i - \bar{x}))^2$
$= \sum (y_i - \bar{y})^2 + \hat{\beta}_1^2\sum (x_i - \bar{x})^2 - 2\hat{\beta}_1\sum (y_i - \bar{y})(x_i - \bar{x})$
$= \sum (y_i - \bar{y})^2 - \hat{\beta}_1^2\sum (x_i - \bar{x})^2$
so that $S_{yy} - SS_E = \hat{\beta}_1^2\sum (x_i - \bar{x})^2$.
Therefore, $F_0 = \frac{\hat{\beta}_1^2}{\hat{\sigma}^2 / S_{xx}} = t_0^2$
Because the square of a t random variable with n − 2 degrees of freedom is an F random variable with 1 and n − 2 degrees of freedom, the usual t-test that compares $|t_0|$ to $t_{\alpha/2,n-2}$ is equivalent to the F-test that compares $f_0$ to $f_{\alpha,1,n-2}$.
a) $f_0 = \frac{0.9(23)}{1 - 0.9} = 207$. Reject $H_0: \beta_1 = 0$.
b) Because $f_{0.01,1,23} = 7.88$, H0 is rejected if $\frac{23R^2}{1 - R^2} > 7.88$.
That is, H0 is rejected if
$23R^2 > 7.88(1 - R^2)$
$30.88R^2 > 7.88$
$R^2 > 0.2552$
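Assuming SciPy is available, the cutoff for R² can be obtained directly from the inequality derived above:

```python
from scipy import stats

# Smallest R^2 that leads to rejection at alpha = 0.01 with 1 and 23 df,
# from (n - 2) R^2 / (1 - R^2) > f_crit.
df_error = 23
f_crit = stats.f.ppf(0.99, 1, df_error)        # about 7.88
r2_threshold = f_crit / (df_error + f_crit)    # about 0.26
print(f_crit, r2_threshold)
```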
Section 11-8
11-66 a) $H_0: \rho = 0$
$H_1: \rho \ne 0$, $\alpha = 0.05$
$t_0 = \frac{0.8\sqrt{20 - 2}}{\sqrt{1 - 0.64}} = 5.657$
$t_{0.025,18} = 2.101$
$|t_0| > t_{0.025,18}$
$H_1: \rho \ne 0.5$, $\alpha = 0.05$
11-67 a) $H_0: \rho = 0$
$H_1: \rho > 0$, $\alpha = 0.05$
$t_0 = \frac{0.75\sqrt{20 - 2}}{\sqrt{1 - 0.75^2}} = 4.81$
$t_{0.05,18} = 1.734$
$t_0 > t_{0.05,18}$
ρ ≥ 2.26
Because ρ = 0 and ρ = 0.5 are not in the interval, reject the null hypotheses from parts (a) and (b).
11-68 n = 30 r = 0.83
a) $H_0: \rho = 0$
$H_1: \rho \ne 0$, $\alpha = 0.05$
$t_0 = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} = \frac{0.83\sqrt{28}}{\sqrt{1 - (0.83)^2}} = 7.874$
$t_{0.025,28} = 2.048$
$t_0 > t_{\alpha/2,28}$
Reject H0. P-value ≈ 0.
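Assuming SciPy is available, the test statistic, critical value, and P-value for this correlation test can be reproduced as follows:

```python
from math import sqrt
from scipy import stats

# t test of H0: rho = 0 for Exercise 11-68 (n = 30, r = 0.83).
n, r = 30, 0.83
t0 = r * sqrt(n - 2) / sqrt(1 - r ** 2)   # about 7.87
t_crit = stats.t.ppf(0.975, n - 2)        # t_{0.025,28}, about 2.05
p_value = 2 * stats.t.sf(abs(t0), n - 2)  # essentially 0
print(t0, t_crit, p_value)
```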
a) H 0 : ρ = 0.8
H1 : ρ ≠ 0.8 α = 0.05
11-69 n = 50 r = 0.62
a) $H_0: \rho = 0$
$H_1: \rho \ne 0$, $\alpha = 0.01$
$t_0 = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} = \frac{0.62\sqrt{48}}{\sqrt{1 - (0.62)^2}} = 5.475$
$t_{0.005,48} = 2.682$
$t_0 > t_{0.005,48}$
c) Yes.
11-70 a) r = 0.933203
b) $H_0: \rho = 0$
$H_1: \rho \ne 0$, $\alpha = 0.05$
$t_0 = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} = \frac{0.933203\sqrt{15}}{\sqrt{1 - 0.8709}} = 10.06$
$t_{0.025,15} = 2.131$
$t_0 > t_{\alpha/2,15}$
Reject H0.
c) $\hat{y} = 0.72538 + 0.498081x$
$H_0: \beta_1 = 0$
$H_1: \beta_1 \ne 0$, $\alpha = 0.05$
$f_0 = 101.16$
$f_{0.05,1,15} = 4.543$
$f_0 \gg f_{\alpha,1,15}$
Reject H0. Conclude that the model is significant at $\alpha = 0.05$. This test and the one in part b) are identical.
d) No problems with model assumptions are noted.
[Plots of residuals versus x, residuals versus the fitted values, and a normal probability plot of the residuals]
b) $H_0: \beta_1 = 0$
$H_1: \beta_1 \ne 0$, $\alpha = 0.05$
$f_0 = 79.838$
$f_{0.05,1,18} = 4.41$
$f_0 \gg f_{\alpha,1,18}$
Reject H0.
c) $r = \sqrt{0.816} = 0.903$
d) $H_0: \rho = 0$
$H_1: \rho \ne 0$, $\alpha = 0.05$
$t_0 = \frac{R\sqrt{n-2}}{\sqrt{1 - R^2}} = \frac{0.90334\sqrt{18}}{\sqrt{1 - 0.816}} = 8.9345$
$t_{0.025,18} = 2.101$
$t_0 > t_{\alpha/2,18}$
Reject H0.
e) H 0 : ρ = 0.5
H1 : ρ ≠ 0.5 α = 0.05
z0 = 3.879
z.025 = 1.96
z0 > zα / 2
Reject H0.
0.7677 ≤ ρ ≤ 0.9615 .
b) H 0 : β1 = 0
H1 : β1 ≠ 0 α = 0.05
$f_0 = 35.744$
$f_{0.05,1,24} = 4.260$
$f_0 > f_{\alpha,1,24}$
Reject H0.
c) r = 0.77349
d) $H_0: \rho = 0$
$H_1: \rho \ne 0$, $\alpha = 0.05$
$t_0 = \frac{0.77349\sqrt{24}}{\sqrt{1 - 0.5983}} = 5.9787$
$t_{0.025,24} = 2.064$
$t_0 > t_{\alpha/2,24}$
Reject H0.
e) H 0 : ρ = 0.6
H1 : ρ ≠ 0.6 α = 0.05
0.5513 ≤ ρ ≤ 0.8932 .
11-73 a)
Analysis of Variance
Source DF SS MS F P
Regression 1 1493.7 1493.7 70.88 0.000
Residual Error 8 168.6 21.1
Total 9 1662.3
$\hat{y} = 5.50 + 6.73x$
b) $r = \sqrt{0.899} = 0.948$
c) H 0 : ρ = 0
$H_1: \rho \ne 0$
$t_0 = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} = \frac{0.948\sqrt{10 - 2}}{\sqrt{1 - 0.948^2}} = 8.425$
$t_{0.025,8} = 2.306$
$t_0 = 8.425 > t_{0.025,8} = 2.306$
Reject H0.
d) $\tanh\left(\operatorname{arctanh} r - \frac{z_{\alpha/2}}{\sqrt{n-3}}\right) \le \rho \le \tanh\left(\operatorname{arctanh} r + \frac{z_{\alpha/2}}{\sqrt{n-3}}\right)$
$\tanh\left(\operatorname{arctanh} 0.948 - \frac{1.96}{\sqrt{10 - 3}}\right) \le \rho \le \tanh\left(\operatorname{arctanh} 0.948 + \frac{1.96}{\sqrt{10 - 3}}\right)$
$0.7898 \le \rho \le 0.9879$
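The Fisher z-transformation interval above is easy to reproduce in Python; the sketch below uses the same r, n, and z quantile.

```python
from math import atanh, tanh, sqrt

# 95% confidence interval for rho in Exercise 11-73 (r = 0.948, n = 10).
r, n, z = 0.948, 10, 1.96
lower = tanh(atanh(r) - z / sqrt(n - 3))
upper = tanh(atanh(r) + z / sqrt(n - 3))
print(lower, upper)   # about (0.79, 0.99)
```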
11-74 a)
Analysis of Variance
Source DF SS MS F P
Regression 1 23.117 23.117 1897.63 0.000
Residual Error 10 0.122 0.012
Total 11 23.239
yˆ = −0.014 + 1.011x
b) $r = \sqrt{0.995} = 0.9975$
c) $H_0: \rho = 0.9$
$H_1: \rho \ne 0.9$
$z_0 = (\operatorname{arctanh} R - \operatorname{arctanh} \rho_0)(n-3)^{1/2}$
$z_0 = 5.6084$
$z_{\alpha/2} = z_{0.025} = 1.96$
$|z_0| > z_{0.025}$, so reject H0.
d) $\tanh\left(\operatorname{arctanh} r - \frac{z_{\alpha/2}}{\sqrt{n-3}}\right) \le \rho \le \tanh\left(\operatorname{arctanh} r + \frac{z_{\alpha/2}}{\sqrt{n-3}}\right)$
$\tanh\left(\operatorname{arctanh} 0.9975 - \frac{1.96}{\sqrt{12 - 3}}\right) \le \rho \le \tanh\left(\operatorname{arctanh} 0.9975 + \frac{1.96}{\sqrt{12 - 3}}\right)$
$0.9908 \le \rho \le 0.9993$
a) $r = \sqrt{0.672} = 0.820$
b) $H_0: \rho = 0$
$H_1: \rho \ne 0$
$t_0 = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} = \frac{0.82\sqrt{32 - 2}}{\sqrt{1 - 0.82^2}} = 7.847$
$t_{0.025,30} = 2.042$
$t_0 > t_{0.025,30}$, so reject H0.
d) $H_0: \rho = 0.7$
$H_1: \rho \ne 0.7$
$z_0 = (\operatorname{arctanh} R - \operatorname{arctanh} \rho_0)(n-3)^{1/2}$
$z_0 = 1.56$
$z_{\alpha/2} = z_{0.025} = 1.96$
$|z_0| < z_{0.025}$, so do not reject H0.
11-76
Here r = 0. The correlation coefficient does not detect the relationship between x and y because the
relationship is not linear. See the graph above.
Section 11-9
11-77 a) Yes, ln y = ln β 0 + β1 ln x + ln ε
b) No
c) Yes, ln y = ln β 0 + x ln β1 + ln ε
d) Yes, $\frac{1}{y} = \beta_0 + \beta_1\frac{1}{x} + \varepsilon$
b) y = −1955.8 + 6.684x
c)
Source DF SS MS F P
Regression 1 491448 491448 35.54 0.000
Residual Error 9 124444 13827
Total 10 615892
d)
ln y = 20.6 − 5185(1/x)
Analysis of Variance
Source DF SS MS F P
Regression 1 28.334 28.334 103488.96 0.000
Residual Error 9 0.002 0.000
Total 10 28.336
There is still curvature in the data, but now the plot is convex instead of concave.
11-79 a)
[Scatter plot of y versus x]
b) $\hat{y} = -0.8819 + 0.00385x$
c) H 0 : β1 = 0
H1 : β1 ≠ 0 α = 0.05
f 0 = 122.03
f 0 > f 0.05,1,48
[Plot of residuals versus fitted values]
Section 11-10
11-80 a) The fitted logistic regression model is $\hat{y} = \frac{1}{1 + \exp[-(-8.84679 + 0.000202x)]}$
Log-Likelihood = -11.163
Test that all slopes are zero: G = 5.200, DF = 1, P-Value = 0.023
b) The P-value for the test of the coefficient of income is 0.044 < α = 0.05. Therefore, income has a
significant effect on home ownership status.
c) The odds ratio is changed by the factor exp(β1) = exp(0.0002027) = 1.000202 for every unit increase
in income. More realistically, if income changes by $1000, the odds ratio is changed by the factor
exp(1000β1) = exp(0.2027) = 1.225.
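A small Python sketch of the fitted model and the odds-ratio interpretation above; the income value passed to the function is illustrative only, and the coefficients are the rounded ones quoted in the solution.

```python
from math import exp

# Fitted logistic regression model for Exercise 11-80 (rounded coefficients from above).
b0, b1 = -8.84679, 0.000202

def estimated_probability(income):
    """Estimated probability of home ownership at a given income."""
    return 1.0 / (1.0 + exp(-(b0 + b1 * income)))

print(estimated_probability(45000))  # illustrative income value
print(exp(1000 * b1))                # odds-ratio factor for a $1000 increase, about 1.22
```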
11-81 a) The fitted logistic regression model is $\hat{y} = \frac{1}{1 + \exp[-(5.33968 - 0.000224x)]}$
Binary Logistic Regression: Number Failing, Sample Size, versus Load (kN/m2)
Link Function: Logit
Response Information
Variable Value Count
Number Failing Failure 337
Success 353
Sample Size Total 690
Log-Likelihood = -421.856
Test that all slopes are zero: G = 112.459, DF = 1, P-Value = 0.000
b) The P-value for the test of the coefficient of load is near zero. Therefore, load has a significant
effect on failing performance.
11-82 a) The fitted logistic regression model is $\hat{y} = \frac{1}{1 + \exp[-(-2.12756 + 0.113925x)]}$
Log-Likelihood = -4091.801
Test that all slopes are zero: G = 741.361, DF = 1, P-Value = 0.000
b) The P-value for the test of the coefficient of discount is near zero. Therefore, discount has a
significant effect on redemption.
c)
d) The P-value of the quadratic term is 0.95 > 0.05, so we fail to reject the null hypothesis of the
quadratic coefficient at the 0.05 level of significance. There is no evidence that the quadratic term is
required in the model. The Minitab results are shown below
Predictor Upper
Constant
Discount, x 1.22
Discount, x*Discount, x 1.00
Log-Likelihood = -4090.796
Test that all slopes are zero: G = 743.372, DF = 2, P-Value = 0.000
e) The expanded model does not visually provide a better fit to the data than the original model.
Log-Likelihood = -10.423
Test that all slopes are zero: G = 6.880, DF = 2, P-Value = 0.032
b) Because the P-value = 0.032 < α = 0.05 we can conclude that at least one of the coefficients (of
income and age) is not equal to zero at the 0.05 level of significance. The individual z-tests do not
generate P-values less than 0.05, but this might be due to correlation between the independent
variables. The z-test for a coefficient assumes it is the last variable to enter the model. A model
might use either income or age, but after one variable is in the model, the coefficient z-test for the
other variable may not be significant because of their correlation.
c) The odds ratio is changed by the factor exp(β1) = exp(0.0000833) = 1.00008 for every unit increase
in income with age held constant. Similarly, odds ratio is changed by the factor exp(β1) =
exp(1.06263) = 2.894 for every unit increase in age with income held constant. More realistically, if
income changes by $1000, the odds ratio is changed by the factor exp(1000β1) = exp(0.0833) =
1.087 with age held constant.
d) At $x_1 = 45000$ and $x_2 = 5$, from part (a)
$\hat{y} = \frac{1}{1 + \exp[-(-7.79891 + 0.0000833x_1 + 1.06263x_2)]} = 0.78$
Log-Likelihood = -8.112
Test that all slopes are zero: G = 11.503, DF = 3, P-Value = 0.009
Because the P-value = 0.104 there is no evidence that an interaction term is required in the model.
Supplemental Exercises
11-84 a) $\sum_{i=1}^n (y_i - \hat{y}_i) = \sum_{i=1}^n y_i - \sum_{i=1}^n \hat{y}_i$ and $\sum y_i = n\hat{\beta}_0 + \hat{\beta}_1\sum x_i$ from the normal equations.
Then,
$\left(n\hat{\beta}_0 + \hat{\beta}_1\sum_{i=1}^n x_i\right) - \sum_{i=1}^n \hat{y}_i = n\hat{\beta}_0 + \hat{\beta}_1\sum_{i=1}^n x_i - \sum_{i=1}^n (\hat{\beta}_0 + \hat{\beta}_1 x_i) = n\hat{\beta}_0 + \hat{\beta}_1\sum_{i=1}^n x_i - n\hat{\beta}_0 - \hat{\beta}_1\sum_{i=1}^n x_i = 0$
b) $\sum_{i=1}^n (y_i - \hat{y}_i)x_i = \sum_{i=1}^n y_i x_i - \sum_{i=1}^n \hat{y}_i x_i$ and $\sum_{i=1}^n y_i x_i = \hat{\beta}_0\sum_{i=1}^n x_i + \hat{\beta}_1\sum_{i=1}^n x_i^2$ from the normal equations. Then,
$\hat{\beta}_0\sum_{i=1}^n x_i + \hat{\beta}_1\sum_{i=1}^n x_i^2 - \sum_{i=1}^n (\hat{\beta}_0 + \hat{\beta}_1 x_i)x_i = \hat{\beta}_0\sum_{i=1}^n x_i + \hat{\beta}_1\sum_{i=1}^n x_i^2 - \hat{\beta}_0\sum_{i=1}^n x_i - \hat{\beta}_1\sum_{i=1}^n x_i^2 = 0$
c) $\frac{1}{n}\sum_{i=1}^n \hat{y}_i = \bar{y}$:
$\frac{1}{n}\sum_{i=1}^n \hat{y}_i = \frac{1}{n}\sum_{i=1}^n (\hat{\beta}_0 + \hat{\beta}_1 x_i) = \frac{1}{n}\left(n\hat{\beta}_0 + \hat{\beta}_1\sum x_i\right) = \frac{1}{n}\left(n(\bar{y} - \hat{\beta}_1\bar{x}) + \hat{\beta}_1\sum x_i\right) = \frac{1}{n}\left(n\bar{y} - n\hat{\beta}_1\bar{x} + \hat{\beta}_1\sum x_i\right) = \bar{y} - \hat{\beta}_1\bar{x} + \hat{\beta}_1\bar{x} = \bar{y}$
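The three identities can also be verified numerically; the sketch below uses made-up data and NumPy only as a check of the algebra above.

```python
import numpy as np

# Numeric check of the identities in Exercise 11-84 on illustrative data.
x = np.array([1.0, 2.0, 4.0, 5.0, 7.0])
y = np.array([2.3, 2.9, 5.1, 5.8, 8.4])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
e = y - y_hat

print(np.isclose(e.sum(), 0.0))            # part (a): residuals sum to zero
print(np.isclose(np.sum(e * x), 0.0))      # part (b): residuals orthogonal to x
print(np.isclose(y_hat.mean(), y.mean()))  # part (c): mean of fitted values equals ybar
```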
11-85 a)
[Plot of y versus x]
$\hat{y} = -0.966824 + 1.54376x$
2) H 0 : β1 = 0
3) H1 : β1 ≠ 0
4) α = 0.05
5) The test statistic is $f_0 = \frac{SS_R/k}{SS_E/(n-p)}$
$f_0 = \frac{1.96613/1}{0.0000623515/8} = 252263.9$
8) Because 252264 > 11.26 reject H0 and conclude that the regression model is significant at α =
0.05.
P-value ≈ 0
e) 2) H 0 : β 0 = 0
3) H1 : β 0 ≠ 0
4) α = 0.01
5) The test statistic is $t_0 = \frac{\hat{\beta}_0}{se(\hat{\beta}_0)}$
b) H 0 : β1 = 0
H1 : β1 ≠ 0
α = 0.05
f 0 = 12.872
f.05,1,14 = 4.60
f 0 > f 0.05,1,14
c) (9.689 ≤ β1 ≤ 21.445)
d) (79.333 ≤ β 0 ≤ 107.767)
$132.475 \pm 6.49$
$125.99 \le \hat{\mu}_{Y|x_0 = 2.5} \le 138.97$
11-87 $\hat{y}^* = 1.2232 + 0.5075x$ where $y^* = 1/y$. No, the model does not seem reasonable.
The model appears to be an excellent fit. The R2 is large and both regression coefficients are
significant. No, the existence of a strong correlation does not imply a cause and effect relationship.
11-89 yˆ = 0.7916 x
Even though y should be zero when x is zero, because the regressor variable does not usually assume
values near zero, a model with an intercept fits this data better. Without an intercept, the residuals plots
are not satisfactory.
11-90 a)
[Scatter plot of days versus index]
Analysis of Variance
Source DF SS MS F P
Regression 1 1492.6 1492.6 2.64 0.127
Residual Error 14 7926.8 566.2
Total 15 9419.4
Do not reject H0. We do not have evidence of a relationship. Therefore, there is not sufficient evidence
to conclude that the seasonal meteorological index (x) is a reliable predictor of the number of days that
the ozone level exceeds 0.20 ppm (y).
c) 95% CI on β1:
$\hat{\beta}_1 \pm t_{\alpha/2,n-2}\,se(\hat{\beta}_1)$
$15.296 \pm t_{0.025,14}(9.421)$
$15.296 \pm 2.145(9.421)$
$-4.912 \le \beta_1 \le 35.504$
d) The normality plot of the residuals is satisfactory. However, the plot of residuals versus run order
exhibits a strong downward trend. This could indicate that there is another variable should be
included in the model and it is one that changes with time.
[Plot of residuals versus observation order and normal probability plot of the residuals]
11-91 a)
[Scatter plot of y versus x]
b) $\hat{y} = 0.6714 - 2964x$
c) Analysis of Variance
Source DF SS MS F P
Regression 1 0.03691 0.03691 1.64 0.248
Residual Error 6 0.13498 0.02250
Total 7 0.17189
R2 = 21.47%
Because the P-value = 0.248 > 0.05, fail to reject the null hypothesis and conclude that the model is not significant.
d) There appears to be curvature in the data. There is a dip in the middle of the normal probability plot
and the plot of the residuals versus the fitted values shows curvature.
[Normal probability plot of the residuals and plot of residuals versus fitted values]
11-92 a)
[Scatter plot of y versus x]
b) $\hat{y} = 33.3 + 0.9636x$
Analysis of Variance
Source DF SS MS F P
Regression 1 584.62 584.62 19.79 0.002
Reject the null hypothesis and conclude that the model is significant. Here 77.3% of the variability is
explained by the model.
d) $H_0: \beta_1 = 1$
$H_1: \beta_1 \ne 1$
$\alpha = 0.05$
$t_0 = \frac{\hat{\beta}_1 - 1}{se(\hat{\beta}_1)} = \frac{0.9299 - 1}{0.2090} = -0.3354$
$t_{\alpha/2,n-2} = t_{0.025,8} = 2.306$
Because $|t_0| < t_{\alpha/2,n-2}$, we cannot reject H0; there is not enough evidence to conclude that the devices produce different temperature measurements. Therefore, we assume the devices produce equivalent measurements.
[Normal probability plot of the residuals]
[Plot of residuals versus fitted values]
11-93 a)
b) $\hat{y} = -0.12 + 1.17x$
c)
Source DF SS MS F P
Regression 1 28.044 28.044 22.75 0.001
Residual Error 8 9.860 1.233
Total 9 37.904
Reject the null hypothesis and conclude that the model is significant.
d) $x_0 = 4.25$, $\hat{\mu}_{y|x_0} = 4.853$
$4.853 \pm 2.306\sqrt{1.2324\left(\frac{1}{10} + \frac{(4.25 - 4.25)^2}{20.625}\right)}$
$4.853 \pm 2.306(0.35106)$
$4.0435 \le \mu_{y|x_0} \le 5.6625$
e) The normal probability plot of the residuals appears linear, but there are some large residuals in the
lower fitted values. There may be some problems with the model.
11-94 a)
The regression equation is
No. Of Atoms (x 10E9) = - 0.300 + 0.0221 power(mW)
Analysis of Variance
Source DF SS MS F P
Regression 1 1.2870 1.2870 1130.43 0.000
Residual Error 13 0.0148 0.0011
Total 14 1.3018
c) $r = \sqrt{0.989} = 0.994$
d) $H_0: \rho = 0$
$H_1: \rho \ne 0$
$t_0 = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} = \frac{0.994\sqrt{15 - 2}}{\sqrt{1 - 0.994^2}} = 32.766$
$t_{0.025,13} = 2.160$
$t_0 = 32.766 > t_{0.025,13} = 2.160$, so reject H0.
11-95 a)
The relationship between carat and price is not linear. Yes, there is one outlier, observation number 33.
b) The person obtained a very good price—high carat diamond at low price.
c) All the data
Analysis of Variance
Source DF SS MS F P
Regression 1 15270545 15270545 138.61 0.000
Residual Error 38 4186512 110171
Total 39 19457057
Analysis of Variance
Source DF SS MS F P
Regression 1 16173949 16173949 184.33 0.000
Residual Error 37 3246568 87745
Total 38 19420517
The width for the outlier removed is narrower than for the first case.
11-96
The regression equation is
Population = 3549143 + 651828 Count
Analysis of Variance
Source DF SS MS F P
Regression 1 2.07763E+11 2.07763E+11 6.15 0.029
Residual Error 12 4.05398E+11 33783126799
Total 13 6.13161E+11
yˆ = 3549143 + 651828 x
Yes, the regression is significant at α = 0.05 (P-value = 0.029), although not at α = 0.01. Care needs to be taken in making cause-and-effect statements based on a regression analysis. In this case, it is surely not the case that an increase in the stork count is causing the population to increase; in fact, the opposite is more likely. However, unless a designed experiment is performed, cause-and-effect statements should not be made on the basis of a regression analysis alone. The existence of a strong correlation does not imply a cause-and-effect relationship.
Mind-Expanding Exercises
11-97 The correlation coefficient for the n pairs of data (xi, zi) will not be near unity; it will be near zero. The data for the pairs (xi, zi), where zi = yi², will not fall along the line yi = xi, which has a slope near unity and gives a correlation coefficient near unity. These data will fall on a line yi = xi that has a slope
11-98 a) $\hat{\beta}_1 = \frac{S_{xY}}{S_{xx}}$, $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{x}$
11-99 a) $MS_E = \frac{\sum (Y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2}{n-2} = \frac{\sum e_i^2}{n-2}$
$E(MS_E) = \frac{\sum E(e_i^2)}{n-2} = \frac{\sum V(e_i)}{n-2} = \frac{\sum \sigma^2\left[1 - \left(\frac{1}{n} + \frac{(x_i - \bar{x})^2}{S_{xx}}\right)\right]}{n-2} = \frac{\sigma^2[n - 1 - 1]}{n-2} = \sigma^2$
$E(MS_R) = E(\hat{\beta}_1^2 S_{xx}) = S_{xx}\left\{V(\hat{\beta}_1) + [E(\hat{\beta}_1)]^2\right\} = S_{xx}\left(\frac{\sigma^2}{S_{xx}} + \beta_1^2\right) = \sigma^2 + \beta_1^2 S_{xx}$
11-100 $\hat{\beta}_1 = \frac{S_{x_1 Y}}{S_{x_1 x_1}}$
$E(\hat{\beta}_1) = \frac{E\left[\sum_{i=1}^n Y_i(x_{1i} - \bar{x}_1)\right]}{S_{x_1 x_1}} = \frac{\sum_{i=1}^n (\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i})(x_{1i} - \bar{x}_1)}{S_{x_1 x_1}} = \frac{\beta_1 S_{x_1 x_1} + \beta_2\sum_{i=1}^n x_{2i}(x_{1i} - \bar{x}_1)}{S_{x_1 x_1}} = \beta_1 + \frac{\beta_2 S_{x_1 x_2}}{S_{x_1 x_1}}$
11-101 $V(\hat{\beta}_1) = \frac{\sigma^2}{S_{xx}}$. To minimize $V(\hat{\beta}_1)$, $S_{xx}$ should be maximized. Because $S_{xx} = \sum_{i=1}^n (x_i - \bar{x})^2$, $S_{xx}$ is maximized by choosing approximately half of the observations at each end of the range of x.
From a practical perspective, this allocation assumes the linear model between Y and x holds
throughout the range of x and observing Y at only two x values prohibits verifying the linearity
assumption. It is often preferable to obtain some observations at intermediate values of x.
11-102 One might minimize a weighted sum of squares $\sum_{i=1}^n w_i(y_i - \beta_0 - \beta_1 x_i)^2$ in which a $Y_i$ with small variance receives a larger weight than a $Y_i$ with large variance. Setting the partial derivatives equal to zero,
$\frac{\partial}{\partial\beta_0}\sum_{i=1}^n w_i(y_i - \beta_0 - \beta_1 x_i)^2 = -2\sum_{i=1}^n w_i(y_i - \beta_0 - \beta_1 x_i) = 0$
$\frac{\partial}{\partial\beta_1}\sum_{i=1}^n w_i(y_i - \beta_0 - \beta_1 x_i)^2 = -2\sum_{i=1}^n w_i(y_i - \beta_0 - \beta_1 x_i)x_i = 0$
The normal equations are
$\hat{\beta}_0\sum w_i + \hat{\beta}_1\sum w_i x_i = \sum w_i y_i$
$\hat{\beta}_0\sum w_i x_i + \hat{\beta}_1\sum w_i x_i^2 = \sum w_i x_i y_i$
with solution
$\hat{\beta}_1 = \frac{\left(\sum w_i\right)\left(\sum w_i x_i y_i\right) - \left(\sum w_i x_i\right)\left(\sum w_i y_i\right)}{\left(\sum w_i\right)\left(\sum w_i x_i^2\right) - \left(\sum w_i x_i\right)^2}$
$\hat{\beta}_0 = \frac{\sum w_i y_i}{\sum w_i} - \frac{\sum w_i x_i}{\sum w_i}\hat{\beta}_1$
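A NumPy sketch of the weighted least-squares solution derived above; the data and weights are made up for illustration (in practice the weights would be taken proportional to 1/V(Y_i)).

```python
import numpy as np

# Weighted least squares via the closed-form solution of the normal equations above.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.2, 3.9, 6.1, 8.3, 9.8])
w = np.array([1.0, 1.0, 0.5, 0.5, 0.25])   # illustrative weights

sw, swx, swy = w.sum(), np.sum(w * x), np.sum(w * y)
swxx, swxy = np.sum(w * x * x), np.sum(w * x * y)

b1 = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
b0 = swy / sw - (swx / sw) * b1
print(b0, b1)
```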
11-103 $\hat{y} = \bar{y} + r\frac{s_y}{s_x}(x - \bar{x})$
$= \bar{y} + \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}\sqrt{\frac{\sum (y_i - \bar{y})^2}{\sum (x_i - \bar{x})^2}}\,(x - \bar{x}) = \bar{y} + \frac{S_{xy}}{S_{xx}}(x - \bar{x}) = \bar{y} + \hat{\beta}_1 x - \hat{\beta}_1\bar{x} = \hat{\beta}_0 + \hat{\beta}_1 x$
11-104 a) $\frac{\partial}{\partial\beta_1}\sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2 = -2\sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)x_i = 0$, so $\beta_0\sum x_i + \beta_1\sum x_i^2 = \sum x_i y_i$.
Therefore,
$\hat{\beta}_1 = \frac{\sum x_i y_i - \beta_0\sum x_i}{\sum x_i^2} = \frac{\sum x_i(y_i - \beta_0)}{\sum x_i^2}$
b) $V(\hat{\beta}_1) = V\left(\frac{\sum x_i(Y_i - \beta_0)}{\sum x_i^2}\right) = \frac{\sum x_i^2\,\sigma^2}{\left[\sum x_i^2\right]^2} = \frac{\sigma^2}{\sum x_i^2}$
c) $\hat{\beta}_1 \pm t_{\alpha/2,n-1}\sqrt{\frac{\hat{\sigma}^2}{\sum x_i^2}}$
This confidence interval is shorter because $\sum x_i^2 \ge \sum (x_i - \bar{x})^2$. Also, the t value based on n − 1 degrees of freedom is slightly smaller than the corresponding t value based on n − 2 degrees of freedom.