
SINDHUDURG SHIKSHAN PRASARAK MANDAL’S

A.P-Harkul Budruk, Nardave Road, TAL-Kankavli, DIST-Sindhudurg PIN-416602


Department of Computer Science and Engineering (AIML)

Lab Course: CSL601 Sub: DAVLAB

EXPERIMENT NO.3
Aim: Multiple linear regression in Python.
Regression models are used to describe relationships between variables by fitting a line to
the observed data. Regression allows you to estimate how a dependent variable changes as
the independent variable(s) change.

Multiple linear regression is used to estimate the relationship between two or more
independent variables and one dependent variable. You can use multiple linear regression
when you want to know:

1. How strong the relationship is between two or more independent variables and one
dependent variable (e.g. how rainfall, temperature, and amount of fertilizer added
affect crop growth).
2. The value of the dependent variable at a certain value of the independent variables
(e.g. the expected yield of a crop at certain levels of rainfall, temperature, and
fertilizer addition).
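In general form, the multiple linear regression model is

y = b0 + b1*x1 + b2*x2 + ... + bk*xk + e

where b0 is the intercept, b1 through bk are the coefficients of the k independent
variables, and e is the error term. The experiment below fits a model of this form
with k = 3 predictors.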

Assumptions of multiple linear regression:

Multiple linear regression makes all of the same assumptions as simple linear
regression:

1. Homogeneity of variance (homoscedasticity): the size of the error in our
prediction doesn't change significantly across the values of the independent
variables.
2. Independence of observations: the observations in the dataset were collected
using statistically valid sampling methods, and there are no hidden relationships
among variables. In multiple linear regression, some of the independent variables
may be correlated with one another, so it is important to check this before
developing the regression model; if two independent variables are too highly
correlated (r2 > ~0.6), only one of them should be used in the model (see the
sketch after this list).
3. Normality: the data follow a normal distribution.
4. Linearity: the line of best fit through the data points is a straight line,
rather than a curve or some sort of grouping factor.
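As a quick illustration of that correlation check, here is a minimal sketch using
the same three randomly generated predictors as the experiment code below. Since
the predictors are drawn independently, no pair should be flagged here:

# Minimal multicollinearity check, mirroring the predictors generated below
import numpy as np

np.random.seed(42)
X1 = 2 * np.random.rand(100, 1)
X2 = 3 * np.random.rand(100, 1)
X3 = 5 * np.random.rand(100, 1)
X = np.hstack((X1, X2, X3))

# Pairwise correlation matrix; rowvar=False treats each column as a variable
corr = np.corrcoef(X, rowvar=False)
print(corr)

# Flag any pair whose squared correlation exceeds the ~0.6 rule of thumb
for i in range(corr.shape[1]):
    for j in range(i + 1, corr.shape[1]):
        if corr[i, j] ** 2 > 0.6:
            print(f"X{i+1} and X{j+1} are highly correlated "
                  f"(r2 = {corr[i, j] ** 2:.2f})")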

Input code:
# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Set random seed for reproducibility
np.random.seed(42)

# Generate independent variables
X1 = 2 * np.random.rand(100, 1)
X2 = 3 * np.random.rand(100, 1)
X3 = 5 * np.random.rand(100, 1)
# Combine independent variables into a single matrix
X = np.hstack((X1, X2, X3))

# Generate dependent variable with noise
y = 5 + 2 * X1 + 3 * X2 + 1.5 * X3 + np.random.randn(100, 1)

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Initialize and train the multiple linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions on test data
y_pred = model.predict(X_test)

# Calculate evaluation metrics
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# Extract model coefficients and intercept; y is a column vector, so
# intercept_ has shape (1,) and coef_ has shape (1, 3)
intercept = model.intercept_[0]
coefficients = model.coef_[0]
# Print model parameters and performance metrics
print("\nModel Parameters:")
print(f"Intercept: {intercept:.2f}")
print(f"Coefficients: X1={coefficients[0]:.2f}, X2={coefficients[1]:.2f},
X3={coefficients[2]:.2f}") # Corrected line
print(f"Mean Squared Error: {mse:.2f}")
print(f"R2 Score: {r2:.2f}")

# Plot Actual vs Predicted values
plt.scatter(y_test, y_pred, color='blue', label="Predicted vs Actual")
plt.plot(y_test, y_test, color='red', linewidth=2, label="Perfect Fit Line")
plt.xlabel("Actual Values")
plt.ylabel("Predicted Values")
plt.title("Actual vs Predicted Values in Multiple Linear Regression")
plt.legend()
plt.show()
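For reference on the two metrics printed by the script: mean squared error is the
average squared difference between actual and predicted values,
MSE = (1/n) * sum((y_i - y_hat_i)^2), and the R2 score is the proportion of the
variance in the dependent variable that the model explains,
R2 = 1 - sum((y_i - y_hat_i)^2) / sum((y_i - y_bar)^2). An R2 close to 1 indicates
a good fit.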

Output:
Model Parameters:
Intercept: 4.51
Coefficients: X1=2.33, X2=3.14, X3=1.59
Mean Squared Error: 2.10
R2 Score: 0.88
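These estimates can be compared with the parameters used to generate the data
(intercept 5 and coefficients 2, 3, and 1.5): the model recovers them
approximately, with the remaining gap due to the added noise and the finite sample
of 100 points.

Once trained, the model can also answer the second question listed earlier, the
expected value of y at chosen predictor values. A minimal sketch, assuming the
trained model from the code above (the input values 1.0, 1.5, and 2.5 are
arbitrary illustrative choices):

# Predict y for a hypothetical new observation (X1=1.0, X2=1.5, X3=2.5)
new_point = np.array([[1.0, 1.5, 2.5]])
print(model.predict(new_point))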
