
wvcg0mt7pkASSI 3 ML 16

The document outlines an assignment by student Aniruddha Vharkate on creating a multi-variable regression model using the California housing dataset. It includes source code for data loading, exploration, model training, evaluation, and visualization of results. Key metrics such as Mean Squared Error and R-squared are calculated to assess model performance.

Uploaded by

202201667

Student Name Aniruddha Vharkate

SRN No 202101542

Roll No 16

Program Computer Engg

Year Fourth Year

Division B

Subject Machine Learning

Assignment 3
Create a multivariable regression model of your choice using a suitable dataset.

Source Code:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt
import seaborn as sns

# Load the California housing dataset
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()

# Convert to DataFrame
data = pd.DataFrame(housing.data, columns=housing.feature_names)
data['MedHouseVal'] = housing.target

# Explore the data
print("Data Head:\n", data.head())
print("\nData Description:\n", data.describe())
print("\nData Information:\n")
data.info()

# Check for missing values
print("\nMissing Values:\n", data.isnull().sum())

# Correlation heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(data.corr(), annot=True, cmap='coolwarm')
plt.title("Correlation Heatmap")
plt.show()

# Prepare the data for training
X = data.drop('MedHouseVal', axis=1)
y = data['MedHouseVal']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on test data
y_pred = model.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"\nMean Squared Error: {mse}")
print(f"R-squared: {r2}")

# Plot the Actual vs Predicted Prices
plt.scatter(y_test, y_pred)
plt.xlabel("Actual Prices")
plt.ylabel("Predicted Prices")
plt.title("Actual vs Predicted Prices")
plt.show()

# Residual plot (recent seaborn versions require x and y as keyword arguments)
sns.residplot(x=y_test, y=y_pred)
plt.title("Residual Plot")
plt.show()
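
The evaluation step above reports Mean Squared Error and R-squared. As a rough illustration of what those two numbers measure, the following dependency-free sketch computes both by hand. The function names `mse_by_hand` and `r_squared_by_hand` and the four sample values are hypothetical and are not part of the assignment or its output; they only mirror the standard definitions (mean of squared errors, and one minus the ratio of residual to total sum of squares).

```python
def mse_by_hand(y_true, y_pred):
    """Mean of the squared differences between actual and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


def r_squared_by_hand(y_true, y_pred):
    """1 - SS_res / SS_tot: share of variance in y_true explained by y_pred."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after prediction
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot


# Made-up sample values, for illustration only
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print("MSE:", mse_by_hand(actual, predicted))
print("R-squared:", r_squared_by_hand(actual, predicted))
```

A lower MSE means predictions sit closer to the actual values, and an R-squared near 1 means the model explains most of the variation in the target; a model that always predicts the mean would score 0.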

Output:
