DS P6 Yash

The document outlines a practical assignment for a Data Science course focusing on regression analysis, including simple and multiple linear regression. It provides code examples for implementing both types of regression using datasets, visualizing the results, and interpreting model coefficients. The assignment aims to enhance understanding of regression models and their applications in predicting outcomes based on independent variables.

Uploaded by

Ayesha Bagwan

SHREE PANCHAM KHEMRAJ MAHAVIDYALAY, SAWANTWADI

DEPARTMENT OF COMPUTER SCIENCE


CLASS : TYCS ROLL NO. : 03
COURSE : Data Science DATE :
PRACTICAL NO. : 06 SEAT NO :
AIM: Regression and Its Types

TEACHER SIGNATURE

Regression and Its Types


• Implement simple linear regression using a dataset.
• Explore and interpret the regression model coefficients and goodness-of-fit measures.
• Extend the analysis to multiple linear regression and assess the impact of additional predictors.
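For reference, the models behind these objectives can be written out as below. The notation is an assumption chosen for illustration ($\beta_0$ for the intercept, $\beta_j$ for the coefficient of predictor $x_j$, $\hat{y}$ for the prediction, $\bar{y}$ for the mean of the observed values); the assignment itself does not fix any symbols.

```latex
% Simple linear regression: one predictor x
\hat{y} = \beta_0 + \beta_1 x

% Multiple linear regression: p predictors x_1, ..., x_p
\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p

% Coefficient of determination, a common goodness-of-fit measure
R^2 = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2}
```

In scikit-learn's LinearRegression, the fitted $\beta_1, \dots, \beta_p$ are exposed as coef_ and $\beta_0$ as intercept_.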

CODE
1. Simple Linear Regression

# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
# Get dataset
# Raw string so the backslashes in the Windows path are not treated as escape sequences
df_sal = pd.read_csv(r'D:\Backup\TYCS Sem 6 (Yash)\Data science\Practical 6\Salary_Data.csv')
df_sal.head()
print(df_sal)
# Describe data
print(df_sal.describe())
# Data distribution
# sns.displot creates its own figure, so the title is set after the call
sns.displot(df_sal['Salary'])
plt.title('Salary Distribution Plot')

Yash Arjun Dhaske


plt.show()
# Relationship between Salary and Experience
plt.scatter(df_sal['YearsExperience'], df_sal['Salary'], color = 'lightcoral')
plt.title('Salary vs Experience')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.box(False)
plt.show()
# Splitting variables
X = df_sal.iloc[:, :1] # independent
y = df_sal.iloc[:, 1:] # dependent
# Splitting dataset into test/train
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
# Regressor model
regressor = LinearRegression()
regressor.fit(X_train, y_train)
# Prediction result
y_pred_test = regressor.predict(X_test) # predicted value of y_test
y_pred_train = regressor.predict(X_train) # predicted value of y_train
# Prediction on training set
# Label each artist explicitly so the legend does not depend on drawing order
plt.scatter(X_train, y_train, color = 'lightcoral', label = 'X_train/y_train')
plt.plot(X_train, y_pred_train, color = 'firebrick', label = 'X_train/Pred(y_train)')
plt.title('Salary vs Experience (Training Set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.legend(title = 'Sal/Exp', loc='best', facecolor='white')
plt.box(False)
plt.show()
# Prediction on test set
# Test points plotted against the regression line fitted on the training set
plt.scatter(X_test, y_test, color = 'lightcoral', label = 'X_test/y_test')
plt.plot(X_train, y_pred_train, color = 'firebrick', label = 'X_train/Pred(y_train)')
plt.title('Salary vs Experience (Test Set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.legend(title = 'Sal/Exp', loc='best', facecolor='white')
plt.box(False)
plt.show()
# Regressor coefficients and intercept
print(f'Coefficient: {regressor.coef_}')
print(f'Intercept: {regressor.intercept_}')
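The listing above prints the coefficient and intercept but stops short of the goodness-of-fit measures named in the objectives. A minimal sketch of how they could be computed is given below. It does not read Salary_Data.csv; the data is synthetic and the numbers 25000, 9500 and 5000 are made-up assumptions, so the printed values say nothing about the real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the salary data (all values are invented for illustration)
rng = np.random.default_rng(0)
years = rng.uniform(1, 10, size=30).reshape(-1, 1)   # predictor: years of experience
salary = 25000 + 9500 * years.ravel() + rng.normal(0, 5000, size=30)

X_train, X_test, y_train, y_test = train_test_split(
    years, salary, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# R^2 on the training data and on the held-out data
r2_train = model.score(X_train, y_train)
r2_test = r2_score(y_test, y_pred)
# Root mean squared error, in the same units as the target
rmse_test = np.sqrt(mean_squared_error(y_test, y_pred))

print(f'Slope (salary change per extra year): {model.coef_[0]:.2f}')
print(f'Intercept (predicted salary at 0 years): {model.intercept_:.2f}')
print(f'R^2 train: {r2_train:.3f}, R^2 test: {r2_test:.3f}')
print(f'RMSE test: {rmse_test:.2f}')
```

With the assignment's own variables, the same calls would be regressor.score(X_test, y_test) and r2_score(y_test, y_pred_test). An R^2 close to 1 means the line explains most of the variation in salary, and a large gap between the training and test R^2 suggests the fit does not generalise well.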

OUTPUT

2. Multiple Linear Regression
CODE
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
from sklearn.linear_model import LinearRegression
# Get dataset
df_start = pd.read_csv("50_Startups.csv")
df_start.head()
print(df_start)
# Describe data
print(df_start.describe())
# Data distribution
# sns.displot creates its own figure, so the title is set after the call
sns.displot(df_start['Profit'])
plt.title('Profit Distribution Plot')
plt.show()
# Relationship between Profit and R&D Spend
plt.scatter(df_start['R&D Spend'], df_start['Profit'], color = 'lightcoral')
plt.title('Profit vs R&D Spend')
plt.xlabel('R&D Spend')
plt.ylabel('Profit')
plt.box(False)
plt.show()
# Split dataset in dependent/independent variables
X = df_start.iloc[:, :-1].values
y = df_start.iloc[:, -1].values
# One-hot encoding of categorical data
ct = ColumnTransformer(transformers = [('encoder', OneHotEncoder(), [3])], remainder = 'passthrough')
X = np.array(ct.fit_transform(X))
# Split dataset into test/train
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
# Train multiple regression model
regressor = LinearRegression()
regressor.fit(X_train, y_train)
# Predict result
y_pred = regressor.predict(X_test)
# Compare predicted result with actual value
np.set_printoptions(precision = 2)
result = np.concatenate((y_pred.reshape(len(y_pred), 1), y_test.reshape(len(y_test), 1)), 1)
print(result)
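To assess the impact of additional predictors, one common approach is to compare R^2 and adjusted R^2 for a model with one predictor against a model with several. The sketch below illustrates this on synthetic data rather than 50_Startups.csv. The variable names rd and marketing, the coefficients, and the helper adjusted_r2 are all hypothetical and introduced only for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for a startup dataset (all values are invented for illustration)
rng = np.random.default_rng(1)
n = 50
rd = rng.uniform(0, 150000, n)          # hypothetical R&D spend
marketing = rng.uniform(0, 400000, n)   # hypothetical marketing spend
profit = 50000 + 0.8 * rd + 0.03 * marketing + rng.normal(0, 8000, n)


def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2: penalises R^2 for the number of predictors used."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


X_one = rd.reshape(-1, 1)                   # model 1: R&D spend only
X_two = np.column_stack([rd, marketing])    # model 2: R&D spend + marketing spend

model_one = LinearRegression().fit(X_one, profit)
model_two = LinearRegression().fit(X_two, profit)

# In-sample R^2 for each model
r2_one = model_one.score(X_one, profit)
r2_two = model_two.score(X_two, profit)
adj_one = adjusted_r2(r2_one, n, 1)
adj_two = adjusted_r2(r2_two, n, 2)

print(f'One predictor:  R^2 = {r2_one:.4f}, adjusted R^2 = {adj_one:.4f}')
print(f'Two predictors: R^2 = {r2_two:.4f}, adjusted R^2 = {adj_two:.4f}')
print(f'Coefficients of the two-predictor model: {model_two.coef_}')
```

In-sample R^2 never decreases when a predictor is added to an ordinary least squares model, so adjusted R^2 (or R^2 on a held-out test set) is the more informative figure when deciding whether an extra predictor is worth keeping.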
OUTPUT
