
LINEAR REGRESSION MODELS

Regression analysis is a statistical technique for evaluating the relationship between two
or more variables so that one variable can be explained or predicted using information
from the other variables. The relationship, if it exists, is described by a mathematical
function known as the regression model; the regression model may be linear or non-linear.
The function predicts or explains the behavior of one of the variables, referred
to as the response (dependent, outcome) variable, in terms of the other variables,
referred to as the predictor (explanatory, independent) variables.
Generally linear models are of the form

Data = pattern + residual

where

• data is the response variable

• pattern is the set of predictor variable(s). This part of the model gives the explained
variability in the data.

• residual is the error or noise. It is the stochastic part and represents the unexplained
variability in the data.

For linear models, the response variable is continuous and its distribution is assumed to
follow the normal distribution. The models are distinguished by the nature of their
predictors, as summarized below:

Linear Model                        Predictor variable(s)

Simple linear                       One continuous variable
Multiple linear                     More than one continuous variable
Multiple linear with                Both continuous and
  indicator variables               categorical variables
One-way ANOVA                       One categorical variable
Two-way ANOVA                       Two categorical variables
One-way ANCOVA                      One categorical variable and
                                    one continuous (covariate) variable

SIMPLE LINEAR REGRESSION

For linear regression models with quantitative predictors, the simplest form is the simple
linear regression for which the response variable is explained using only one predictor
variable. The model is of the form:

Y = α + βX + ϵ

where X and Y are the predictor and response variables respectively. The parameters α
and β are known as regression coefficients; α is the intercept and β is the slope of
the regression line. The variable ϵ is the random or error term associated with fitting
the regression line.
Assumptions made for the model:

(i) the error terms are independent.

(ii) the error terms are normally distributed with mean 0 and variance σ².

(iii) the response variables are statistically independent of each other.

(iv) the response variable is normally distributed with mean α + βX and variance σ².

(v) the error and predictor are independent of each other.

(vi) the sum of the error terms is zero.
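
These assumptions can be checked empirically once a line has been fitted, by examining the residuals. Below is a minimal sketch (not part of the original notes) using made-up data; numpy and scipy are assumed to be available.

```python
import numpy as np
from scipy import stats

# Made-up illustrative data (not from the notes)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1])

# Least squares fit (derived in the next section)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)

print(resid.mean())          # assumption (vi): residuals sum to (about) zero
print(stats.shapiro(resid))  # rough check of normality, assumption (ii)
```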

Estimating the regression coefficients:
Let (xᵢ, yᵢ), i = 1, 2, …, n be n observed pairs of the predictor variable Xᵢ and response
variable Yᵢ. Fitting the regression line to the observed data, the model is of the form

yᵢ = β₀ + β₁xᵢ + ϵᵢ ,  i = 1, 2, …, n.

One criterion that can be used to find the "best" fit is the least squares criterion.
This method requires that we choose the estimates of β₀ and β₁ that minimize

S = ∑ᵢ₌₁ⁿ (yᵢ − β₀ − β₁xᵢ)² = ∑ᵢ₌₁ⁿ ϵᵢ²

i.e. the values of the regression coefficients that minimize the sum of squared residuals.
The estimates β̂₀ and β̂₁ are obtained by solving

∂S/∂β₀ = 0  and  ∂S/∂β₁ = 0
which reduces to the simultaneous equations

∑ᵢ₌₁ⁿ yᵢ = nβ₀ + β₁ ∑ᵢ₌₁ⁿ xᵢ

∑ᵢ₌₁ⁿ xᵢyᵢ = β₀ ∑ᵢ₌₁ⁿ xᵢ + β₁ ∑ᵢ₌₁ⁿ xᵢ²

called the normal equations.
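
The normal equations are a 2×2 linear system, so they can also be solved directly in code; a minimal sketch with hypothetical data (numpy assumed):

```python
import numpy as np

# Hypothetical data for illustration
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([3.1, 5.2, 6.8, 9.1])
n = len(x)

# Coefficient matrix and right-hand side of the normal equations
A = np.array([[n,       x.sum()],
              [x.sum(), (x ** 2).sum()]])
b = np.array([y.sum(), (x * y).sum()])

beta0, beta1 = np.linalg.solve(A, b)  # the estimates β̂0 and β̂1
```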


Solving these equations gives:

β̂₀ = ȳ − β̂₁x̄

β̂₁ = ∑ᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / ∑ᵢ₌₁ⁿ (xᵢ − x̄)²

   = ∑ᵢ₌₁ⁿ [(xᵢ − x̄)/Sₓ²] yᵢ   (since ∑ᵢ₌₁ⁿ (xᵢ − x̄)ȳ = 0)

where

Sₓ² = ∑ᵢ₌₁ⁿ (xᵢ − x̄)²

Interpretation of the regression coefficient β₁:
The regression coefficient β₁ gives the estimated change in the average value of the
response variable for every one-unit increase in the predictor.

(i) If β₁ < 0, then the average value of the response decreases by |β₁| units for every
unit increase in the predictor.

(ii) If β₁ > 0, then the average value of the response increases by β₁ units for every
unit increase in the predictor.

Example: Suppose we fit a model with blood pressure as the response and age as the
predictor and obtain the fit ŷ = 29.65 + 2.64·age. The interpretation of the slope is: for
every additional year in age, the average blood pressure increases by 2.64 mmHg.

Statistical distributions for β̂₁ and σ̂²:

1. The random variable

σ̂² = (1/(n − 2)) ∑ᵢ₌₁ⁿ (yᵢ − β̂₀ − β̂₁xᵢ)²

is an estimator for σ², and (n − 2)σ̂²/σ² has a χ²(n − 2) distribution.

2. The estimate β̂₁ is a linear function of the normal random variables yᵢ; thus it is also
a normal random variable, with mean and variance:

E[β̂₁] = E[ ∑ᵢ₌₁ⁿ ((xᵢ − x̄)/Sₓ²) yᵢ ] = ∑ᵢ₌₁ⁿ ((xᵢ − x̄)/Sₓ²) E[yᵢ]
       = (1/Sₓ²) ∑ᵢ₌₁ⁿ (xᵢ − x̄)(β₀ + β₁xᵢ) = (β₁/Sₓ²) ( ∑ᵢ₌₁ⁿ xᵢ² − nx̄² )
       = β₁

and

Var[β̂₁] = Var[ ∑ᵢ₌₁ⁿ ((xᵢ − x̄)/Sₓ²) yᵢ ] = ∑ᵢ₌₁ⁿ ((xᵢ − x̄)/Sₓ²)² Var[yᵢ]
         = (1/(Sₓ²)²) ∑ᵢ₌₁ⁿ (xᵢ − x̄)² σ² = σ²/Sₓ²

Hence

β̂₁ ∼ N(β₁, σ²/Sₓ²)
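
This sampling distribution can be illustrated by simulation; a minimal sketch, with "true" parameter values chosen arbitrarily for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 20)        # fixed design points
beta0, beta1, sigma = 2.0, 0.5, 1.0   # arbitrary "true" parameters
Sx2 = np.sum((x - x.mean()) ** 2)

est = np.empty(10_000)
for k in range(est.size):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, x.size)
    est[k] = np.sum((x - x.mean()) * (y - y.mean())) / Sx2

print(est.mean())                   # ≈ beta1 (unbiasedness)
print(est.var(), sigma ** 2 / Sx2)  # ≈ σ²/Sx², the derived variance
```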

Statistical inference for β₁:

(i) Given β̂₁ ∼ N(β₁, σ²/Sₓ²),

Z = (β̂₁ − β₁) / √(σ²/Sₓ²) ∼ N(0, 1)

Since σ² is unknown, its point estimate σ̂² is used in its place, and we make use
of the Student t distribution:

T = (β̂₁ − β₁) / √(σ̂²/Sₓ²) ∼ t(n − 2)

The 100(1 − α)% confidence interval for β₁ is

( β̂₁ − t(n − 2, α/2) √(σ̂²/Sₓ²) ;  β̂₁ + t(n − 2, α/2) √(σ̂²/Sₓ²) )

(ii) To test H0 : β₁ = 0 vs H1 : β₁ ≠ 0, the test statistic is

T = β̂₁ / √(σ̂²/Sₓ²) ∼ t(n − 2)
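
Both the interval and the test translate into a few lines of code; a minimal sketch with hypothetical data, using scipy for the t quantile:

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.2, 6.8, 9.1, 10.9])
n = len(x)

Sx2 = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sx2
b0 = y.mean() - b1 * x.mean()
sigma2_hat = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)

se_b1 = np.sqrt(sigma2_hat / Sx2)
T = b1 / se_b1                                 # test statistic for H0: β1 = 0
tcrit = stats.t.ppf(1 - 0.05 / 2, df=n - 2)    # t(n−2, α/2) with α = 0.05
ci = (b1 - tcrit * se_b1, b1 + tcrit * se_b1)  # 95% confidence interval
```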

Test for significance of model fit:


Once we have obtained the "best" fit, we wish to determine whether the relationship is
statistically significant. To do so we test the hypothesis:

H0 : Regression fit is not significant

vs

H1 : Regression fit is significant
The test statistic is derived as follows.
The total variation of the observed response variable can be partitioned into two parts:

∑ᵢ₌₁ⁿ (yᵢ − ȳ)² = ∑ᵢ₌₁ⁿ (yᵢ − ŷᵢ + ŷᵢ − ȳ)²
               = ∑ᵢ₌₁ⁿ (yᵢ − ŷᵢ)² + ∑ᵢ₌₁ⁿ (ŷᵢ − ȳ)²

SST = SSE + SSR

where ŷᵢ = β̂₀ + β̂₁xᵢ (the cross-product term ∑(yᵢ − ŷᵢ)(ŷᵢ − ȳ) vanishes for the
least squares fit).

The error (residual) sum of squares SSE is the amount of variation of the response variable
that is not explained after estimating the linear relationship of response to predictor, and
the regression sum of squares SSR measures the reduction in variation attributed to the
predictor in the estimated regression function. SSR is known as the explained variation
while SSE is the unexplained variation. Finally the total sum of squares SST is the
total variability in the observed values of response variable.
The distributions of the three sum of squares are:

1. Since the two parameters in the expression for SSE are estimated using sample data,

SSE/σ² ∼ χ²(n − 2)

2. Using the sampling distribution of the sample variance,

SST/σ² ∼ χ²(n − 1)

3. Using one of the properties of chi-square random variables, since

SST/σ² − SSE/σ² = SSR/σ²

then

SSR/σ² ∼ χ²(1)

We compare the explained and unexplained variations in the form of a ratio

Fc = (SSR/1) / (SSE/(n − 2)) ∼ F(1, n − 2)

which gives us the test statistic.


If Fc is significantly large then it implies that the explained variation is much larger than
the unexplained variation, and hence the predictor significantly explains the response
variable. We use F-tables to determine the critical value, which gives the criterion for
rejecting or failing to reject H0. The results of the test can be presented in an ANOVA
table.

The ANOVA table is of the form:

Source of     Sum of        Degrees of   Mean square          F-ratio
variation     squares (SS)  freedom      (MS)
Regression    SSR           1            MSR = SSR/1          MSR/MSE
Error         SSE           n − 2        MSE = SSE/(n − 2)
Total         SST           n − 1
For computations,

SST = ∑ᵢ₌₁ⁿ yᵢ² − nȳ²

SSR = β̂₁ ( ∑ᵢ₌₁ⁿ xᵢyᵢ − nx̄ȳ )

σ̂² = MSE
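
These computational formulas can be checked numerically; a minimal sketch with hypothetical data (numpy and scipy assumed):

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.2, 6.8, 9.1, 10.9])
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

SST = np.sum(y ** 2) - n * y.mean() ** 2
SSR = b1 * (np.sum(x * y) - n * x.mean() * y.mean())
SSE = SST - SSR

F = (SSR / 1) / (SSE / (n - 2))  # F-ratio on (1, n−2) degrees of freedom
p = stats.f.sf(F, 1, n - 2)      # reject H0 when p < α
```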

We reject H0 if F-ratio > Fα(1, n − 2). If H0 is rejected then we conclude that the fitted
model is statistically significant. On the other hand, if we fail to reject H0, then we
conclude that the fitted model is not statistically significant.
Coefficient of determination:
This quantity describes how much of the variability in the observed values of the
response variable y in the regression model is due to variation in the predictor variable x.
Earlier on we saw that SSE, the error sum of squares, measures how much variability in
the yᵢ's is not explained by the regression relationship. Given the total variability SST,
the proportion of variability in the yᵢ's which is unexplained by the linear regression
of y on x is SSE/SST, and the proportion explained is 1 − SSE/SST, which leads to the
definition:
Definition: The proportion of variability in the observed values of the response variable
which is explained by the linear regression relationship with the predictor variable is
referred to as the coefficient of determination and is equal to

r² = (SST − SSE)/SST = SSR/SST

The coefficient of determination is interpreted as the proportion of explained variation,
i.e. 100r²% of the variation is accounted for by the predictor. It is used as a measure of
the goodness of fit of the model; values close to 1 indicate a good fit.
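
Continuing the F-test sketch above, the coefficient of determination is one more line:

```python
r2 = SSR / SST  # proportion of variation explained by the predictor
print(f"{100 * r2:.1f}% of the variation is accounted for by the predictor")
```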

Prediction of the response variable
Using a specific value of the predictor (within the range of values of the predictor used
to fit the line), we can predict the response variable: ŷ₀ = β̂₀ + β̂₁X₀, where X₀ is the
specific value of the predictor X. We can go a step further and predict a range of possible
values for the response variable by constructing a 100(1 − α)% prediction interval:

ŷ₀ ± t(n − 2, α/2) × σ̂ √( 1 + 1/n + (X₀ − x̄)²/Sₓ² )

Prediction of the conditional mean of the response variable

The conditional mean of the response is

µY|X = β₀ + β₁X

The point estimate of this parameter is

µ̂Y|X = β̂₀ + β̂₁X

Statistical inference for the conditional mean:

(i) Test H0 : µY|X = µ⁽⁰⁾ vs H1 : µY|X ≠ µ⁽⁰⁾, where µ⁽⁰⁾ is some hypothesized value
of interest. The test statistic is

T = (µ̂Y|X − µ⁽⁰⁾) / ( σ̂ √( 1/n + (X − x̄)²/Sₓ² ) ) ∼ t(n − 2)

(ii) The corresponding 100(1 − α)% confidence interval for µY|X is:

µ̂Y|X ± t(n − 2, α/2) × σ̂ √( 1/n + (X − x̄)²/Sₓ² )
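
The analogous sketch for the conditional mean; note that the only difference from the prediction interval is the missing "1 +" under the square root:

```python
import numpy as np
from scipy import stats

def mean_response_ci(x0, b0, b1, sigma2_hat, Sx2, n, xbar, alpha=0.05):
    """100(1−alpha)% confidence interval for the mean response at X = x0."""
    mu_hat = b0 + b1 * x0
    se = np.sqrt(sigma2_hat * (1 / n + (x0 - xbar) ** 2 / Sx2))
    t = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return mu_hat - t * se, mu_hat + t * se
```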

Example: The following are the scores that 12 students obtained on the midterm
and final examination in a course in Statistics:

Midterm  Final    Midterm  Final    Midterm  Final    Midterm  Final

  71      83        82      78        64      76        32      51
  49      62        80      89        93      89        85      74
  80      76        87      73        73      77        58      48

(a) Find the best fit model that would predict a student’s final exam score based on
his/her midterm exam score.

(b) Test for the significance of model fit.

(c) Test for the significance of the slope at 0.05 level of significance.

(d) How much of the variation is accounted for by the predictor? Comment on the result.

(e) Predict the final exam score for a student who scored 84 marks in the midterm
exam.

(f) Test the hypothesis H0 : µY|X=80 = 75 vs H1 : µY|X=80 ≠ 75.

(g) Find the 95% confidence interval for the conditional mean of the final exam score
obtained by students who scored 80 marks in the midterm exam.

Solution
∑12 ∑12 2 ∑12 2
(a) i=1 xi yi = 64346, i=1 yi = 65850, i=1 xi = 64222, y = 73, x = 71.67

64346 − 12(73)(71.67)
β̂1 = = 0.605
64222 − 12(71.67)2

and β̂0 = 73 − (0.605 ∗ 71.67) = 29.64


the model is ŷ = 29.64 + 0.605x

(b) SST = 65850 − 12(73)² = 1902; SSR = 0.5816(64346 − 62342) = 1165.5;
SSE = 1902 − 1165.5 = 736.5
The ANOVA table is of the form:

Source of     Sum of        Degrees of   Mean square   F-ratio
variation     squares (SS)  freedom      (MS)
Regression    1165.5        1            1165.5        15.82
Error         736.5         10           73.65
Total         1902          11

From F-tables, F(1, 10, 0.05) = 4.96; the computed F-ratio is greater than the
critical value, hence the fitted model is a significant fit.

(c) The test statistic is

T = β̂₁ / √(σ̂²/Sₓ²) = 0.5816 / √(73.65/3445.67) = 3.98

From t-tables, t(10, 0.025) = 2.228; T > 2.228, so we reject H0 and conclude that the
predictor is significant in explaining the response.

(d) r² = 1165.5/1902 = 0.613, so 61.3% of the variation is accounted for by the
predictor. This indicates a moderate relationship between response and predictor.

(e) ŷ = 31.61 + 0.5816(84) = 80.46 marks




(f) µ̂Y|X=80 = 31.61 + 0.5816(80) = 78.14; σ̂² = MSE = 73.65, so σ̂ = 8.58. The value
of the test statistic is

T = (78.14 − 75) / ( 8.58 √( 1/12 + (80 − 71.17)²/3445.67 ) ) = 3.14/2.79 = 1.12

From t-tables, t(10, 0.025) = 2.228; T < 2.228, hence we fail to reject H0. We do
not have sufficient evidence to indicate that the mean final exam score of students
who scored 80 in the midterm differs from 75 marks. However, this does not show
that such students on average got exactly 75 in the final exam.

(g) The point estimate of the conditional mean is µ̂Y|X=80 = 31.61 + 0.5816(80) = 78.14
The standard error of the estimate is

8.58 √( 1/12 + (80 − 71.17)²/3445.67 ) = 2.79

The 95% confidence interval for the conditional mean is:

78.14 ± 2.228 × 2.79 = [71.92, 84.36]
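
The whole solution can be reproduced numerically; a minimal sketch using only numpy (values in the comments are approximate):

```python
import numpy as np

midterm = np.array([71, 82, 64, 32, 49, 80, 93, 85, 80, 87, 73, 58], float)
final   = np.array([83, 78, 76, 51, 62, 89, 89, 74, 76, 73, 77, 48], float)
n = len(midterm)

Sx2 = np.sum((midterm - midterm.mean()) ** 2)  # ≈ 3445.67
b1 = np.sum((midterm - midterm.mean()) * (final - final.mean())) / Sx2
b0 = final.mean() - b1 * midterm.mean()        # ŷ ≈ 31.61 + 0.5816x

SST = np.sum(final ** 2) - n * final.mean() ** 2                 # 1902
SSR = b1 * (np.sum(midterm * final) - n * midterm.mean() * final.mean())
SSE = SST - SSR
MSE = SSE / (n - 2)

F = SSR / MSE                # ≈ 15.8 > F(1, 10, 0.05) = 4.96
T = b1 / np.sqrt(MSE / Sx2)  # ≈ 3.98 > t(10, 0.025) = 2.228
r2 = SSR / SST               # ≈ 0.613
y84 = b0 + b1 * 84           # ≈ 80.46 marks
```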

