Exp 2 (Multiple Linear Regression)
To perform multiple linear regression on multiple datasets, examine the results, and determine which dataset gives the better output.
Theory:
Multiple linear regression (MLR) is a statistical technique used to model the relationship between a
single dependent variable (what you want to predict) and multiple independent variables (features
that influence the dependent variable). It assumes a linear relationship between these variables and
builds a linear equation to capture this relationship.
Key Concepts:
Equation:
y = β₀ + β₁x₁ + β₂x₂ + … + βₙxₙ + ε
β₀ is the intercept term (the constant value when all independent variables are zero).
β₁ … βₙ are the coefficients, each giving the expected change in y for a one-unit change in the corresponding independent variable, holding the others fixed.
ε is the error term, capturing variation not explained by the model.
Limitations of MLR:
It assumes a strictly linear relationship between the predictors and the target, is sensitive to multicollinearity among the independent variables and to outliers, and cannot capture non-linear effects unless the features are transformed (for example with polynomial terms, as done in the code below).
Applications of MLR:
Predicting house prices based on features like size, location, and amenities.
Understanding how factors like age, income, and education affect job satisfaction.
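As a quick illustration of how the equation above maps to code (the house-feature numbers here are made up purely for illustration and are not from either dataset used in this experiment), a bare MLR fit with scikit-learn looks like this:
import numpy as np
from sklearn.linear_model import LinearRegression
# Hypothetical data: house size (sq ft), number of rooms, age (years) -> price
X = np.array([[1200, 3, 10],
              [1500, 4, 5],
              [900, 2, 30],
              [2000, 4, 2],
              [1700, 3, 8]])
y = np.array([200000, 260000, 140000, 340000, 290000])
model = LinearRegression()
model.fit(X, y)
print("Intercept (beta_0):", model.intercept_)        # constant term
print("Coefficients (beta_1..beta_n):", model.coef_)  # one coefficient per feature
print("Predicted price for [1600, 3, 6]:", model.predict([[1600, 3, 6]]))
The full experiment code used for this lab follows below.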
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_squared_error
df = pd.read_csv('boston.csv')
# Remove outliers using the IQR (interquartile range) rule on the numeric columns
numeric_cols = df.select_dtypes(include=[np.number]).columns
Q1 = df[numeric_cols].quantile(0.25)
Q3 = df[numeric_cols].quantile(0.75)
IQR = Q3 - Q1
df = df[~((df[numeric_cols] < (Q1 - 1.5 * IQR)) | (df[numeric_cols] > (Q3 + 1.5 * IQR))).any(axis=1)]
# Extract features and target variable (using the provided column names)
X = df.drop('MEDV', axis=1)
y = df['MEDV']
# Feature Scaling
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=0)
# Feature selection: keep only the features a random forest ranks as important
rf_model = RandomForestRegressor(random_state=0)
rf_model.fit(X_train, y_train)
sfm = SelectFromModel(rf_model, prefit=True)
X_train = sfm.transform(X_train)
X_test = sfm.transform(X_test)
# Polynomial feature expansion to capture non-linear terms
poly = PolynomialFeatures(degree=2)
X_train_poly = poly.fit_transform(X_train)
X_test_poly = poly.transform(X_test)
# Model fitting
regressor = LinearRegression()
regressor.fit(X_train_poly, y_train)
# Evaluation
y_pred = regressor.predict(X_test_poly)
r2 = r2_score(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
# Visualization (optional)
plt.scatter(y_test, y_pred)
plt.xlabel("Actaul Medv")
plt.ylabel("Predicted Medv")
plt.show()
Performance Metrics:
Multiple Regression Dataset Output:
Boston Housing Dataset Output:
(The key figures from both outputs are summarized in the comparison below.)
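A sketch of how these metrics can be obtained from the Boston Housing pipeline above (the variable names follow the code listed earlier; the figures quoted in the comparison come from the recorded runs):
from sklearn.metrics import mean_squared_error, r2_score
train_score = regressor.score(X_train_poly, y_train)  # R^2 on the training split
test_score = regressor.score(X_test_poly, y_test)     # R^2 on the held-out split
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Train Score:", train_score)
print("Test Score:", test_score)
print("MSE:", mse)
print("R2:", r2)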
Comparison:
Comparing the performance of models trained on a multiple regression dataset and the Boston
Housing dataset:
Train Score:
The multiple regression model achieves a very high train score (0.983), indicating an excellent fit to
the training data.
The Boston Housing model also demonstrates a reasonably high train score (0.822), suggesting a
good fit to its training data.
Test Score:
Both models exhibit high test scores, with the multiple regression model at 0.887 and the Boston
Housing model at 0.877, indicating strong generalization performance.
Mean Squared Error (MSE):
The multiple regression model has a relatively high MSE of 2,611,228, suggesting higher prediction
errors on average.
In contrast, the Boston Housing model shows a much lower MSE of 5.379, indicating superior
prediction accuracy.
R-squared (R2):
The multiple regression model and the Boston Housing model both achieve high R-squared values
(0.887 and 0.877 respectively), indicating good explanatory power over the variance in their
respective dependent variables.
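For a side-by-side view, the figures discussed above can be collected into a small table (values copied from the two runs; this is only a reading convenience, not a new computation):
import pandas as pd
comparison = pd.DataFrame(
    {"Multiple Regression Dataset": [0.983, 0.887, 2611228, 0.887],
     "Boston Housing Dataset": [0.822, 0.877, 5.379, 0.877]},
    index=["Train Score", "Test Score", "MSE", "R2"])
print(comparison)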
Conclusion:
While both models exhibit strong performance in terms of train and test scores, the Boston Housing
model outperforms in terms of MSE, suggesting superior prediction accuracy.
Despite the multiple regression model's higher R-squared value, indicating a better fit to the data, its
higher MSE implies potential issues with prediction accuracy on unseen data.
Therefore, for accurate prediction of housing prices, the Boston Housing model is preferred.
However, if the goal is to explain variance in the dependent variable, the multiple regression model
may be more suitable.