MGV's Loknete Vyankatrao Hiray Arts, Science and Commerce College, Nashik

Department of Mathematics

M. Sc. 1 Data Science

Practical 9

Regression Analysis

Model 1 Linear Regression

Step 1 Import Libraries


In [1]: import pandas as pd

Step 2 Import Data


In [2]: df=pd.read_csv('Admission_Predict.csv')

In [3]: df.head(5)

Out[3]:
Serial No. GRE Score TOEFL Score University Rating SOP LOR CGPA Research Chance of Admit

0 1 337 118 4 4.5 4.5 9.65 1 0.92

1 2 324 107 4 4.0 4.5 8.87 1 0.76

2 3 316 104 3 3.0 3.5 8.00 1 0.72

3 4 322 110 3 3.5 2.5 8.67 1 0.80

4 5 314 103 2 2.0 3.0 8.21 0 0.65

In [4]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 400 entries, 0 to 399
Data columns (total 9 columns):
Serial No. 400 non-null int64
GRE Score 400 non-null int64
TOEFL Score 400 non-null int64
University Rating 400 non-null int64
SOP 400 non-null float64
LOR 400 non-null float64
CGPA 400 non-null float64
Research 400 non-null int64
Chance of Admit 400 non-null float64
dtypes: float64(4), int64(5)
memory usage: 28.2 KB


In [5]: df.describe()

Out[5]:
Serial No.  GRE Score  TOEFL Score  University Rating  SOP  LOR  CGPA  Research  Chance of Admit

count 400.000000 400.000000 400.000000 400.000000 400.000000 400.000000 400.000000 400.000000 400.000000

mean 200.500000 316.807500 107.410000 3.087500 3.400000 3.452500 8.598925 0.547500 0.724350

std 115.614301 11.473646 6.069514 1.143728 1.006869 0.898478 0.596317 0.498362 0.142609

min 1.000000 290.000000 92.000000 1.000000 1.000000 1.000000 6.800000 0.000000 0.340000

25% 100.750000 308.000000 103.000000 2.000000 2.500000 3.000000 8.170000 0.000000 0.640000

50% 200.500000 317.000000 107.000000 3.000000 3.500000 3.500000 8.610000 1.000000 0.730000

75% 300.250000 325.000000 112.000000 4.000000 4.000000 4.000000 9.062500 1.000000 0.830000

max 400.000000 340.000000 120.000000 5.000000 5.000000 5.000000 9.920000 1.000000 0.970000

Step 3 Define X (features) and y (target)


In [6]: df.columns

Out[6]: Index(['Serial No.', 'GRE Score', 'TOEFL Score', 'University Rating', 'SOP',
'LOR ', 'CGPA', 'Research', 'Chance of Admit '],
dtype='object')

In [7]: y=df['Chance of Admit ']

In [8]: x=df[['GRE Score', 'TOEFL Score', 'University Rating', 'SOP','LOR ', 'CGPA', 'Research']]

In [10]: df.shape

Out[10]: (400, 9)

In [9]: x.shape, y.shape

Out[9]: ((400, 7), (400,))
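
Note that two column names in this dataset carry trailing spaces ('LOR ' and 'Chance of Admit '), which is why the selections above include them exactly as printed by df.columns. A minimal clean-up sketch, assuming you would rather work with stripped names (the rest of this practical keeps the original names):

In [ ]: # optional: strip stray whitespace from every column name
df.columns = df.columns.str.strip()
df.columns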

Step 4 Train test split


In [13]: from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.3,random_state = 55)

In [14]: x_train.shape ,x_test.shape ,y_train.shape ,y_test.shape

Out[14]: ((280, 7), (120, 7), (280,), (120,))

Step 5 Identify the type of problem and select the model

It is a regression problem: the target, Chance of Admit, is a continuous value.

In [15]: from sklearn.linear_model import LinearRegression

In [16]: model = LinearRegression()


Step 6 Train (fit) the model


In [17]: model.fit(x_train, y_train)

Out[17]: LinearRegression()
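
Once fitted, the model's learned parameters can be inspected. A minimal sketch using the standard scikit-learn attributes coef_ and intercept_ (the exact values depend on the train/test split):

In [ ]: # one coefficient per feature, plus the intercept of the fitted line
list(zip(x.columns, model.coef_)), model.intercept_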

Step 7 Make predictions with the model


In [18]: y_pred=model.predict(x_test)

In [19]: y_pred

Out[19]: array([0.75866592, 0.78553434, 0.81020053, 0.83353243, 0.71078396,
0.6220308 , 0.68928774, 0.61533352, 0.49752396, 0.78937908,
0.80971744, 0.68669098, 0.71072851, 0.49599277, 0.64188502,
0.64331251, 0.60392824, 0.72006312, 0.82288075, 0.5335372 ,
0.49170938, 0.63086558, 0.92606661, 0.70219312, 0.67624218,
0.67728438, 0.64657225, 0.88842604, 0.85589808, 0.97692798,
0.71804212, 0.62475994, 0.89304046, 0.8070998 , 0.75676071,
0.93063188, 0.47306972, 0.57335455, 0.64100024, 0.90681499,
0.5560433 , 0.77667231, 0.84721554, 0.703978 , 0.82609734,
0.64793602, 0.95869966, 0.64362894, 0.94938087, 0.51215621,
0.71684422, 0.73650605, 0.50653959, 0.94680144, 0.78440737,
0.58259502, 0.7472152 , 0.66647161, 0.73325462, 0.65183653,
0.78199281, 0.7363312 , 0.93340242, 0.80431308, 0.65030079,
0.72866373, 0.62051463, 0.85586952, 0.91761967, 0.53698851,
0.64214865, 0.69427608, 0.93141977, 0.52796443, 0.92569401,
0.63051092, 0.78642105, 0.65298105, 0.65223581, 0.95534194,
0.84411466, 0.66917474, 0.633211 , 0.80002273, 0.58795512,
0.82524634, 0.70622573, 0.59750855, 0.60740384, 0.89431252,
0.58325964, 0.70598593, 0.44442279, 0.78474776, 0.9629417 ,
0.54614605, 0.68559667, 0.74970029, 0.69920648, 0.62470747,
0.70492483, 0.64095754, 0.89239269, 0.72241105, 0.62175909,
0.71785906, 0.60130164, 0.73459027, 0.89664139, 0.77055757,
0.77533315, 0.64565156, 0.64175951, 0.70767949, 0.64423264,
0.54918264, 0.68626558, 0.47172742, 0.78001753, 0.74481964])

In [20]: y_test

Out[20]: ...
342 0.58
83 0.92
131 0.77
101 0.64
320 0.75
323 0.62
333 0.71
72 0.93
380 0.78
319 0.80
62 0.54
304 0.62
200 0.73
13 0.62
291 0.56
262 0.70
376 0.34
283 0.80
264 0.75
Name: Chance of Admit , Length: 120, dtype: float64

Step 8 Calculate the accuracy of the model


In [21]: from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error, mean_squared_error


In [22]: mean_absolute_error(y_test,y_pred)

Out[22]: 0.04666971599519059

In [23]: per_e = mean_absolute_percentage_error(y_test,y_pred)

In [24]: per_e

Out[24]: 0.07354722873305462

In [25]: accuracy = (1-per_e)*100

In [26]: accuracy

Out[26]: 92.64527712669454

In [27]: mean_squared_error(y_test, y_pred)

Out[27]: 0.003699173479770781
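
The "accuracy" above is the informal convention (1 - MAPE) x 100 rather than a built-in scikit-learn metric; it simply turns the mean percentage error into a single comparable score. RMSE is another common way to report error on the target's own 0-1 scale; a minimal sketch, assuming NumPy is available:

In [ ]: import numpy as np
# RMSE: the square root of the MSE computed above
np.sqrt(mean_squared_error(y_test, y_pred))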

Model 2 KNeighborsRegressor


In [28]: from sklearn.neighbors import KNeighborsRegressor

In [29]: model2 = KNeighborsRegressor()

In [30]: model2.fit(x_train,y_train)

Out[30]: KNeighborsRegressor()

In [31]: y_pred2 = model2.predict(x_test)

In [32]: y_pred2

Out[32]: array([0.804, 0.708, 0.796, 0.8 , 0.736, 0.64 , 0.644, 0.702, 0.57 ,
0.848, 0.786, 0.746, 0.684, 0.662, 0.734, 0.674, 0.526, 0.746,
0.716, 0.574, 0.542, 0.578, 0.91 , 0.702, 0.594, 0.642, 0.656,
0.942, 0.892, 0.92 , 0.756, 0.626, 0.91 , 0.83 , 0.72 , 0.95 ,
0.458, 0.528, 0.626, 0.92 , 0.542, 0.778, 0.826, 0.626, 0.768,
0.71 , 0.92 , 0.648, 0.94 , 0.514, 0.708, 0.714, 0.596, 0.92 ,
0.806, 0.532, 0.7 , 0.618, 0.686, 0.646, 0.812, 0.74 , 0.946,
0.796, 0.688, 0.682, 0.624, 0.86 , 0.916, 0.534, 0.708, 0.72 ,
0.94 , 0.552, 0.94 , 0.608, 0.74 , 0.65 , 0.652, 0.924, 0.83 ,
0.666, 0.68 , 0.694, 0.598, 0.746, 0.694, 0.592, 0.64 , 0.884,
0.516, 0.642, 0.462, 0.796, 0.928, 0.532, 0.63 , 0.744, 0.658,
0.66 , 0.658, 0.652, 0.86 , 0.646, 0.618, 0.674, 0.594, 0.738,
0.882, 0.73 , 0.776, 0.684, 0.628, 0.672, 0.592, 0.606, 0.676,
0.484, 0.784, 0.762])


In [33]: y_test

Out[33]: 26 0.76
258 0.77
128 0.84
126 0.85
6 0.75
293 0.64
110 0.61
20 0.64
57 0.46
133 0.79
48 0.82
53 0.72
85 0.76
38 0.52
261 0.71
181 0.71
30 0.65
398 0.67
281 0.80
159 0.52
...
Name: Chance of Admit , Length: 120, dtype: float64
In [34]: error2 = mean_absolute_percentage_error(y_test,y_pred2)

In [35]: error2

Out[35]: 0.09038456485463613

In [36]: accuracy2 = (1-error2)*100

In [37]: accuracy2

Out[37]: 90.96154351453639
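
KNeighborsRegressor was used here with its default of n_neighbors=5. A minimal tuning sketch (a hypothetical loop, not part of the original practical) to see how the choice of k affects the score:

In [ ]: # compare MAPE-based accuracy for a few values of k
for k in (3, 5, 7, 9):
    knn = KNeighborsRegressor(n_neighbors=k).fit(x_train, y_train)
    e = mean_absolute_percentage_error(y_test, knn.predict(x_test))
    print(k, round((1 - e) * 100, 2))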

Model 3 DecisionTreeRegressor

In [38]: from sklearn.tree import DecisionTreeRegressor

In [39]: model3 = DecisionTreeRegressor()

In [40]: model3.fit(x_train, y_train)

Out[40]: DecisionTreeRegressor()

In [41]: y_pred3 = model3.predict(x_test)

In [42]: y_pred3

Out[42]: array([0.78, 0.76, 0.79, 0.91, 0.79, 0.71, 0.64, 0.66, 0.44, 0.76, 0.69,
0.7 , 0.64, 0.57, 0.48, 0.64, 0.44, 0.74, 0.86, 0.59, 0.57, 0.63,
0.93, 0.71, 0.68, 0.64, 0.57, 0.9 , 0.86, 0.97, 0.72, 0.61, 0.93,
0.78, 0.81, 0.89, 0.47, 0.63, 0.72, 0.93, 0.54, 0.79, 0.87, 0.61,
0.86, 0.42, 0.89, 0.65, 0.94, 0.54, 0.7 , 0.73, 0.54, 0.89, 0.79,
0.62, 0.74, 0.65, 0.73, 0.65, 0.8 , 0.72, 0.95, 0.82, 0.72, 0.73,
0.65, 0.88, 0.91, 0.57, 0.65, 0.64, 0.89, 0.44, 0.91, 0.65, 0.82,
0.73, 0.61, 0.94, 0.86, 0.64, 0.67, 0.66, 0.61, 0.92, 0.69, 0.63,
0.62, 0.89, 0.49, 0.47, 0.47, 0.69, 0.95, 0.58, 0.71, 0.68, 0.66,
0.54, 0.47, 0.61, 0.93, 0.64, 0.72, 0.66, 0.59, 0.71, 0.86, 0.77,
0.8 , 0.72, 0.53, 0.71, 0.46, 0.54, 0.69, 0.44, 0.77, 0.79])


In [43]: y_test

Out[43]: 26 0.76
258 0.77
128 0.84
126 0.85
6 0.75
293 0.64
110 0.61
20 0.64
57 0.46
133 0.79
48 0.82
53 0.72
85 0.76
38 0.52
261 0.71
181 0.71
30 0.65
398 0.67
281 0.80
159 0.52
118 0.47
179 0.73
399 0.95
340 0.75
256 0.76
86 0.72
137 0.71
286 0.92
69 0.78
143 0.97
...
168 0.64
109 0.68
79 0.46
360 0.85
212 0.95
8 0.50
226 0.63
334 0.73
393 0.77
387 0.53
269 0.77
342 0.58
83 0.92
131 0.77
101 0.64
320 0.75
323 0.62
333 0.71
72 0.93
380 0.78
319 0.80
62 0.54
304 0.62
200 0.73
13 0.62
291 0.56
262 0.70
376 0.34
283 0.80
264 0.75
Name: Chance of Admit , Length: 120, dtype: float64

In [44]: error3 = mean_absolute_percentage_error(y_test,y_pred3)

In [45]: error3

Out[45]: 0.1074076957985312


In [46]: accuracy3 = (1-error3)*100

In [47]: accuracy3

Out[47]: 89.25923042014688
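
Note that scikit-learn decision trees randomly permute the features at each split, so this accuracy can shift slightly between runs. A minimal sketch for reproducible results, assuming you want to pin the seed:

In [ ]: # fixing random_state makes the fitted tree reproducible
model3 = DecisionTreeRegressor(random_state=55)
model3.fit(x_train, y_train)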

Model 4 RandomForestRegressor

In [48]: from sklearn.ensemble import RandomForestRegressor

In [49]: model4 = RandomForestRegressor()

In [50]: model4.fit(x_train, y_train)

Out[50]: RandomForestRegressor()

In [51]: y_pred4=model4.predict(x_test)

In [52]: y_pred4

Out[52]: array([0.7522, 0.756 , 0.801 , 0.8981, 0.7353, 0.6397, 0.6769, 0.5919,
0.473 , 0.753 , 0.7641, 0.678 , 0.6942, 0.5411, 0.6472, 0.6731,
0.5488, 0.7615, 0.823 , 0.5544, 0.5472, 0.6508, 0.9321, 0.6922,
0.6494, 0.6788, 0.6393, 0.8773, 0.8575, 0.9667, 0.7118, 0.6258,
0.9239, 0.8071, 0.6867, 0.9175, 0.4602, 0.5184, 0.6884, 0.9152,
0.5432, 0.7734, 0.8564, 0.6624, 0.8376, 0.5788, 0.9189, 0.6584,
0.9367, 0.4897, 0.7294, 0.6476, 0.514 , 0.9203, 0.7845, 0.5882,
0.7516, 0.6583, 0.6967, 0.6889, 0.738 , 0.7177, 0.9338, 0.7872,
0.6897, 0.6827, 0.6619, 0.8714, 0.9166, 0.4575, 0.6633, 0.7127,
0.9148, 0.4789, 0.914 , 0.6403, 0.7625, 0.6884, 0.6491, 0.9375,
0.8269, 0.6539, 0.686 , 0.7186, 0.6108, 0.9014, 0.7007, 0.6008,
0.6206, 0.9025, 0.5268, 0.6659, 0.4645, 0.7376, 0.9257, 0.5708,
0.6962, 0.6977, 0.7021, 0.5888, 0.6911, 0.683 , 0.9159, 0.6761,
0.6592, 0.6734, 0.6203, 0.7317, 0.9052, 0.7659, 0.7711, 0.6686,
0.6031, 0.7021, 0.5202, 0.5608, 0.6859, 0.4874, 0.7654, 0.741 ])

In [53]: y_test

Out[53]: 26 0.76
258 0.77
128 0.84
126 0.85
6 0.75
293 0.64
110 0.61
20 0.64
57 0.46
133 0.79
48 0.82
53 0.72
85 0.76
38 0.52
261 0.71
181 0.71
30 0.65
398 0.67
281 0.80
159 0.52
...
Name: Chance of Admit , Length: 120, dtype: float64
In [54]: error4 = mean_absolute_percentage_error(y_test,y_pred4)

In [55]: accuracy4 = (1-error4)*100


In [56]: accuracy4

Out[56]: 91.9966051693334
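
A random forest also reports how much each feature contributed to its predictions. A minimal sketch using the standard feature_importances_ attribute (values vary with the fitted forest):

In [ ]: # per-feature importance scores; they sum to 1
pd.Series(model4.feature_importances_, index=x.columns).sort_values(ascending=False)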

Let's compare the accuracy of all 4 models

Linear Regression
In [57]: accuracy

Out[57]: 92.64527712669454

KNeighborsRegressor

In [58]: accuracy2

Out[58]: 90.96154351453639

DecisionTreeRegressor

In [59]: accuracy3

Out[59]: 89.25923042014688


RandomForestRegressor

In [60]: accuracy4

Out[60]: 91.9966051693334
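
On this split, Linear Regression scores highest (about 92.6%), followed by Random Forest, KNN, and the Decision Tree. A minimal sketch that collects the four scores into one table for easier reading:

In [ ]: # side-by-side comparison of the four MAPE-based accuracies
pd.DataFrame({'model': ['LinearRegression', 'KNeighborsRegressor',
                        'DecisionTreeRegressor', 'RandomForestRegressor'],
              'accuracy (%)': [accuracy, accuracy2, accuracy3, accuracy4]})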
