
ISI 2021 PSB Problem 9

Cheenta School of Statistics and Data Science

Every problem lets you learn something new and offers an opportunity to expand your creativity. At Cheenta Academy, we solve every problem with an intense passion for learning and problem-solving.

Contents

1 Problem

2 Solution

3 Verifying Estimators of a Regression Model with Python

4 Real-Life Application of Regression Model: Predicting House Prices

1 Problem
Consider a linear regression model:

$$y_i = \alpha + \beta x_i + e_i, \quad i = 1, 2, \ldots, n$$

where the $x_i$'s are fixed and the $e_i$'s are i.i.d. random errors with mean $0$ and variance $\sigma^2$.
Define two estimators of $\beta$ as follows:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i} \quad \text{and} \quad \hat{\beta}_2 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}.$$

(a) Obtain an unbiased estimator of $\beta$ as a linear combination of $\hat{\beta}_1$ and $\hat{\beta}_2$.

(b) Find the mean squared errors of $\hat{\beta}_1$ and $\hat{\beta}_2$. Which of $\hat{\beta}_1$ and $\hat{\beta}_2$ has the lower mean squared error?

2 Solution
(a) We want to find values of $c_1$ and $c_2$ so that the estimator

$$\hat{\beta} = c_1 \hat{\beta}_1 + c_2 \hat{\beta}_2$$

is an unbiased estimator of $\beta$.
To ensure unbiasedness, we need:
$$E[\hat{\beta}] = \beta.$$
Expanding $E[\hat{\beta}]$, we get:
$$E[c_1 \hat{\beta}_1 + c_2 \hat{\beta}_2] = c_1 E[\hat{\beta}_1] + c_2 E[\hat{\beta}_2].$$
Using the expected values of $\hat{\beta}_1$ and $\hat{\beta}_2$ (since $E[y_i] = \alpha + \beta x_i$):

$$E[\hat{\beta}_1] = \frac{\sum_{i=1}^{n} E[y_i]}{\sum_{i=1}^{n} x_i} = \beta + \frac{n\alpha}{\sum_{i=1}^{n} x_i}, \qquad E[\hat{\beta}_2] = \frac{\sum_{i=1}^{n} x_i E[y_i]}{\sum_{i=1}^{n} x_i^2} = \beta + \frac{\alpha \sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2}.$$

Substituting these into the unbiasedness condition:

$$c_1\left(\beta + \frac{n\alpha}{\sum_{i=1}^{n} x_i}\right) + c_2\left(\beta + \frac{\alpha \sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2}\right) = \beta.$$

Separating the coefficients of $\beta$ and $\alpha$, we get two equations:

1. Coefficient of $\beta$:
$$c_1 + c_2 = 1.$$

2. Coefficient of $\alpha$:
$$c_1 \frac{n}{\sum_{i=1}^{n} x_i} + c_2 \frac{\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2} = 0.$$
Solving these equations: from the first equation, we have
$$c_2 = 1 - c_1.$$
Substituting $c_2 = 1 - c_1$ into the second equation:
$$c_1 \frac{n}{\sum_{i=1}^{n} x_i} + (1 - c_1) \frac{\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2} = 0.$$

Expanding and simplifying:

$$c_1\left(\frac{n}{\sum_{i=1}^{n} x_i} - \frac{\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2}\right) = -\frac{\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2}.$$

Multiplying both sides by $\sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i^2$ and solving for $c_1$:

$$c_1 = \frac{-\left(\sum_{i=1}^{n} x_i\right)^2}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}.$$

Therefore,

$$c_2 = 1 - c_1 = \frac{n\sum_{i=1}^{n} x_i^2}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}.$$

With these values,

$$\hat{\beta} = c_1 \hat{\beta}_1 + c_2 \hat{\beta}_2 = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2},$$

which is exactly the ordinary least squares estimator of the slope.
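As an optional sanity check, the weights can be verified symbolically with a short SymPy sketch (assuming a small sample size, here n = 3, purely for illustration); the expression E[c1 β̂1 + c2 β̂2] − β should simplify to 0:

import sympy as sp

n = 3  # small illustrative sample size (assumed for the check)
x = sp.symbols('x1:4', positive=True)   # x1, x2, x3
alpha, beta = sp.symbols('alpha beta')

sum_x = sum(x)
sum_x2 = sum(xi**2 for xi in x)
D = n * sum_x2 - sum_x**2

# Weights derived above
c1 = -sum_x**2 / D
c2 = n * sum_x2 / D

# Expectations of the two estimators (errors have mean zero)
E_b1 = beta + n * alpha / sum_x
E_b2 = beta + alpha * sum_x / sum_x2

# Should print 0, confirming E[c1*beta1_hat + c2*beta2_hat] = beta
print(sp.simplify(c1 * E_b1 + c2 * E_b2 - beta))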

(b) To compute the mean squared error (MSE) of an estimator $\hat{\beta}_i$, we use the formula:

$$\text{MSE}(\hat{\beta}_i) = \text{Var}(\hat{\beta}_i) + \left(\text{Bias}(\hat{\beta}_i)\right)^2.$$

MSE of $\hat{\beta}_1$:
Since $\hat{\beta}_1 = \sum_{i=1}^{n} y_i / \sum_{i=1}^{n} x_i$ and the $x_i$'s are fixed, its variance is:

$$\text{Var}(\hat{\beta}_1) = \frac{\text{Var}\left(\sum_{i=1}^{n} y_i\right)}{\left(\sum_{i=1}^{n} x_i\right)^2} = \frac{n\sigma^2}{\left(\sum_{i=1}^{n} x_i\right)^2}.$$

From part (a), the bias of $\hat{\beta}_1$ is:

$$\text{Bias}(\hat{\beta}_1) = E[\hat{\beta}_1] - \beta = \frac{n\alpha}{\sum_{i=1}^{n} x_i}.$$

Thus, the MSE of $\hat{\beta}_1$ is:

$$\text{MSE}(\hat{\beta}_1) = \frac{n\sigma^2}{\left(\sum_{i=1}^{n} x_i\right)^2} + \left(\frac{n\alpha}{\sum_{i=1}^{n} x_i}\right)^2 = \frac{n\sigma^2 + n^2\alpha^2}{\left(\sum_{i=1}^{n} x_i\right)^2}.$$


MSE of $\hat{\beta}_2$:
Since $\hat{\beta}_2 = \sum_{i=1}^{n} x_i y_i / \sum_{i=1}^{n} x_i^2$, its variance is:

$$\text{Var}(\hat{\beta}_2) = \frac{\sigma^2 \sum_{i=1}^{n} x_i^2}{\left(\sum_{i=1}^{n} x_i^2\right)^2} = \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2}.$$

From part (a), the bias of $\hat{\beta}_2$ is:

$$\text{Bias}(\hat{\beta}_2) = E[\hat{\beta}_2] - \beta = \frac{\alpha \sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2}.$$

Thus, the MSE of $\hat{\beta}_2$ is:

$$\text{MSE}(\hat{\beta}_2) = \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2} + \left(\frac{\alpha \sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2}\right)^2.$$

Comparison of MSEs:
Which of $\hat{\beta}_1$ and $\hat{\beta}_2$ has the lower MSE depends on the values of $\sigma^2$, $\alpha$, the $x_i$'s, and $n$. In the special case $\alpha = 0$, both estimators are unbiased, and the Cauchy–Schwarz inequality $\left(\sum_{i=1}^{n} x_i\right)^2 \le n\sum_{i=1}^{n} x_i^2$ gives $\text{Var}(\hat{\beta}_2) \le \text{Var}(\hat{\beta}_1)$, so $\hat{\beta}_2$ has the lower MSE. More generally, $\hat{\beta}_2$ tends to have the lower MSE when the $x_i$'s have high variability, because $\sum_{i=1}^{n} x_i^2$ appears in the denominator of $\text{Var}(\hat{\beta}_2)$.
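As a quick numerical check of these MSE formulas, a minimal simulation sketch (with illustrative, assumed values α = 2, β = 3, σ = 1 and x = 1, ..., 5, not taken from the problem) can compare simulated MSEs against the expressions above:

import numpy as np

rng = np.random.default_rng(1)

# Illustrative (assumed) parameter values and design points
alpha, beta, sigma = 2.0, 3.0, 1.0
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = len(x)
sum_x, sum_x2 = np.sum(x), np.sum(x**2)

# Theoretical MSEs from the formulas above
mse1_theory = n * sigma**2 / sum_x**2 + (n * alpha / sum_x) ** 2
mse2_theory = sigma**2 / sum_x2 + (alpha * sum_x / sum_x2) ** 2

# Empirical MSEs by Monte Carlo simulation
b1, b2 = [], []
for _ in range(200_000):
    y = alpha + beta * x + rng.normal(0, sigma, size=n)
    b1.append(np.sum(y) / sum_x)
    b2.append(np.sum(x * y) / sum_x2)
mse1_sim = np.mean((np.array(b1) - beta) ** 2)
mse2_sim = np.mean((np.array(b2) - beta) ** 2)

print("MSE(beta1_hat): theory", mse1_theory, "simulated", mse1_sim)
print("MSE(beta2_hat): theory", mse2_theory, "simulated", mse2_sim)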

3 Verifying Estimators of a Regression Model with Python


The code below defines a function compute_c1_c2 that takes an array of x-values and computes c1 and c2 from the formulas derived above. Replace x_values with your actual data to compute the weights for a specific dataset.

Python Code to check regression model


import numpy as np

def compute_c1_c2(x):
    """Weights c1, c2 that make c1 * beta1_hat + c2 * beta2_hat unbiased for beta."""
    n = len(x)
    sum_x = np.sum(x)
    sum_x_squared = np.sum(x**2)

    # Common denominator: n * sum(x^2) - (sum(x))^2
    denominator = n * sum_x_squared - sum_x**2

    c1 = -sum_x**2 / denominator
    c2 = 1 + (sum_x**2 / denominator)

    return c1, c2

# Example usage
x_values = np.array([1, 2, 3, 4, 5])  # Example x values
c1, c2 = compute_c1_c2(x_values)
print("c1:", c1)
print("c2:", c2)

Code Output

c1: -4.5
c2: 5.5
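To verify the unbiasedness claim itself, a minimal Monte Carlo sketch (assuming illustrative values α = 2, β = 3, σ = 1, chosen only for this check) can confirm empirically that the average of c1 β̂1 + c2 β̂2 over many simulated samples is close to β:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true parameters, chosen only for illustration
alpha, beta, sigma = 2.0, 3.0, 1.0
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = len(x)

# Weights from the formulas derived in the solution
denom = n * np.sum(x**2) - np.sum(x)**2
c1 = -np.sum(x)**2 / denom
c2 = 1 + np.sum(x)**2 / denom

estimates = []
for _ in range(100_000):
    y = alpha + beta * x + rng.normal(0, sigma, size=n)
    beta1_hat = np.sum(y) / np.sum(x)
    beta2_hat = np.sum(x * y) / np.sum(x**2)
    estimates.append(c1 * beta1_hat + c2 * beta2_hat)

# The mean of the combined estimator should be close to beta = 3
print("Mean of c1*beta1_hat + c2*beta2_hat:", np.mean(estimates))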


4 Real-Life Application of Regression Model: Predicting House Prices
A real-life application of a regression model is in predicting housing prices. In the real estate industry,
regression models are widely used to predict the price of a house based on various factors (features) such
as:

• Size of the house (square footage)

• Number of bedrooms and bathrooms


• Age of the house
• Location (neighborhood)

• Proximity to schools, parks, and other amenities

Steps in the Process


1. Data Collection: Collect data on past house sales, including both the sale price and the various
features of each house.

2. Model Training: Use a linear regression model (or multiple regression if there are many variables)
to analyze the relationship between the house features and the price.
3. Prediction: Once the model is trained, you can input the features of a new house (e.g., size,
number of bedrooms) to predict its expected price.

This helps sellers and buyers make informed decisions, and assists real estate agents in pricing properties
effectively.
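As a rough illustration of these steps, the following sketch uses scikit-learn with a tiny, entirely hypothetical hard-coded dataset (the feature values and prices below are made up for demonstration) to train a multiple linear regression and predict the price of a new house:

import numpy as np
from sklearn.linear_model import LinearRegression

# Data collection (hypothetical): [square footage, bedrooms, age in years]
X_train = np.array([
    [1400, 3, 20],
    [1600, 3, 15],
    [1700, 4, 10],
    [1875, 4, 5],
    [2350, 5, 2],
])
# Hypothetical sale prices in dollars
y_train = np.array([245000, 312000, 329000, 360000, 450000])

# Model training: fit a multiple linear regression on the features
model = LinearRegression()
model.fit(X_train, y_train)

# Prediction: estimate the price of a new 2000 sq ft, 4-bedroom, 8-year-old house
new_house = np.array([[2000, 4, 8]])
predicted_price = model.predict(new_house)
print("Predicted price:", round(predicted_price[0], 2))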
