0% found this document useful (0 votes)
11 views9 pages

Logistic Regression Using Python

LOGISTIC REGRESSION USING PYTHON -

Uploaded by

Ishant
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
11 views9 pages

Logistic Regression Using Python

LOGISTIC REGRESSION USING PYTHON -

Uploaded by

Ishant
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 9

12/22/23, 3:45 AM practice

In [83]: import pandas as pd


import numpy as np
data = pd.read_csv("mbasalary.csv")
data.head()
#data.iloc[:,0:1]

Out[83]: S. No. Percentage in Grade 10 Salary

0 1 62.00 270000

1 2 76.33 200000

2 3 72.00 240000

3 4 60.00 250000

4 5 61.00 180000

In [84]: data = data[['Percentage in Grade 10','Salary']]


x = data[['Percentage in Grade 10']]
y= data[['Salary']]
data.describe()

Out[84]: Percentage in Grade 10 Salary

count 50.000000 50.000000

mean 63.922400 258192.000000

std 9.859937 76715.790993

min 37.330000 120000.000000

25% 57.685000 204500.000000

50% 64.700000 250000.000000

75% 70.000000 300000.000000

max 83.000000 450000.000000

In [85]: import matplotlib.pyplot as plt


plt.scatter(data.iloc[:,0:1],data.iloc[:,-1])

<matplotlib.collections.PathCollection at 0x26fcab243d0>
Out[85]:

localhost:8888/nbconvert/html/practice .ipynb?download=false 1/9


12/22/23, 3:45 AM practice

In [ ]:

In [ ]:

In [86]: data.mean()

Percentage in Grade 10 63.9224


Out[86]:
Salary 258192.0000
dtype: float64

In [87]: data.mode()

Out[87]: Percentage in Grade 10 Salary

0 68.0 300000

In [88]: data.median()

Percentage in Grade 10 64.7


Out[88]:
Salary 250000.0
dtype: float64

In [89]: data.std()

Percentage in Grade 10 9.859937


Out[89]:
Salary 76715.790993
dtype: float64

In [90]: data.quantile() #by default q=0.5

Percentage in Grade 10 64.7


Out[90]:
Salary 250000.0
Name: 0.5, dtype: float64

In [91]: data.quantile(q=0.25)

localhost:8888/nbconvert/html/practice .ipynb?download=false 2/9


12/22/23, 3:45 AM practice
Percentage in Grade 10 57.685
Out[91]:
Salary 204500.000
Name: 0.25, dtype: float64

In [92]: data.quantile(q=[0.25,0.5])

Out[92]: Percentage in Grade 10 Salary

0.25 57.685 204500.0

0.50 64.700 250000.0

In [93]: data.var()

Percentage in Grade 10 9.721836e+01


Out[93]:
Salary 5.885313e+09
dtype: float64

LOGISTIC REGRESSION
In [151… # Import necessary libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, f1_score

# Create a simple dataset


data = pd.DataFrame({
'Hours_Studied': [2, 3, 4, 5, 6, 7, 8, 9, 10],
'Hours_Slept': [5, 6, 5, 7, 8, 7, 8, 9, 10],
'Pass': [0, 0, 0, 0, 1, 1, 1, 1, 1] # 1 indicates pass, 0 indicates fail
})

# Display the dataset


print("Dataset:")
print(data)

# Split the data into training and testing sets


X = data[['Hours_Studied', 'Hours_Slept']]
y = data['Pass']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_sta

# Create a logistic regression model


model = LogisticRegression()

# Train the model on the training data


model.fit(X_train, y_train)

# Make predictions on the testing data


y_pred = model.predict(X_test)

# Evaluate the model


accuracy = accuracy_score(y_test, y_pred)
print(f"\nAccuracy: {accuracy:.2f}")
print("f1 score: ", f1_score(y_test,y_pred))

localhost:8888/nbconvert/html/practice .ipynb?download=false 3/9


12/22/23, 3:45 AM practice
Dataset:
Hours_Studied Hours_Slept Pass
0 2 5 0
1 3 6 0
2 4 5 0
3 5 7 0
4 6 8 1
5 7 7 1
6 8 8 1
7 9 9 1
8 10 10 1

Accuracy: 1.00
f1 score: 1.0

In [95]: # Display classification report


print("Classification Report:")
print(classification_report(y_test, y_pred))

Classification Report:
precision recall f1-score support

0 1.00 1.00 1.00 1


1 1.00 1.00 1.00 1

accuracy 1.00 2
macro avg 1.00 1.00 1.00 2
weighted avg 1.00 1.00 1.00 2

In [ ]:

In [ ]:

In [ ]:

In [ ]:

MULTINOMIAL REGRESSION
In [115… # Import necessary libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

# Create a simple dataset


data = pd.DataFrame({
'Hours_Studied': [2, 3, 4, 5, 6, 7, 8, 9, 10],
'Hours_Slept': [5, 6, 5, 7, 8, 7, 8, 9, 10],
'Grade': ['F', 'F', 'F', 'C', 'B', 'C', 'B', 'A', 'A'] # Three classes: F, C,
})

# Display the dataset


print("Dataset:")
print(data)

# Split the data into training and testing sets

localhost:8888/nbconvert/html/practice .ipynb?download=false 4/9


12/22/23, 3:45 AM practice
X = data[['Hours_Studied', 'Hours_Slept']]
y = data['Grade']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_sta

# Create a multinomial logistic regression model


model = LogisticRegression(multi_class='multinomial', solver='lbfgs')

# Train the model on the training data


model.fit(X_train, y_train)

# Make predictions on the testing data


y_pred = model.predict(X_test)

# Evaluate the model


accuracy = accuracy_score(y_test, y_pred)
print(f"\nAccuracy: {accuracy:.2f}")

Dataset:
Hours_Studied Hours_Slept Grade
0 2 5 F
1 3 6 F
2 4 5 F
3 5 7 C
4 6 8 B
5 7 7 C
6 8 8 B
7 9 9 A
8 10 10 A

Accuracy: 1.00

In [119… # Display classification report


print("Classification Report:")
print(classification_report(y_test, y_pred))

Classification Report:
precision recall f1-score support

A 1.00 1.00 1.00 1


F 1.00 1.00 1.00 1

accuracy 1.00 2
macro avg 1.00 1.00 1.00 2
weighted avg 1.00 1.00 1.00 2

Multiple Linear Regression(using sklearn)


In [113… # Import necessary libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Create a simple dataset


data = pd.DataFrame({
'Hours_Studied': [2, 3, 4, 5, 6, 7, 8, 9, 10],
'Hours_Slept': [5, 6, 5, 7, 8, 7, 8, 9, 10],
'Score': [55, 65, 50, 80, 90, 75, 85, 95, 100] # Dependent variable
})

localhost:8888/nbconvert/html/practice .ipynb?download=false 5/9


12/22/23, 3:45 AM practice

# Display the dataset


print("Dataset:")
display(data)

# Split the data into training and testing sets


X = data[['Hours_Studied', 'Hours_Slept']]
y = data['Score']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_sta

# Create a multiple linear regression model


model = LinearRegression()

# Train the model on the training data


model.fit(X_train, y_train)

# Make predictions on the testing data


y_pred = model.predict(X_test)

# Evaluate the model


mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"\nMean Squared Error: {mse:.2f}")


print(f"R-squared: {r2:.2f}")

# Display the coefficients and intercept


print("\nCoefficients:")
print(model.coef_)
print("Intercept:", model.intercept_)

Dataset:
Hours_Studied Hours_Slept Score

0 2 5 55

1 3 6 65

2 4 5 50

3 5 7 80

4 6 8 90

5 7 7 75

6 8 8 85

7 9 9 95

8 10 10 100

Mean Squared Error: 6.60


R-squared: 0.97

Coefficients:
[-2.43842365 13.36206897]
Intercept: -4.384236453201979

MLR Using statsmodel


In [140… import statsmodels.api as sm
dataa = pd.DataFrame({

localhost:8888/nbconvert/html/practice .ipynb?download=false 6/9


12/22/23, 3:45 AM practice
'Hours_Studied': [2, 3, 4, 5, 6, 7, 8, 9, 10],
'Hours_Slept': [5, 6, 5, 7, 8, 7, 8, 9, 10],
'Score': [55, 65, 50, 80, 90, 75, 85, 95, 100] # Dependent variable
})
print(dataa)
m = dataa[['Hours_Studied','Hours_Slept']]
n = dataa['Score']

mlr = sm.OLS(n,m).fit()
print("Params:")
print(mlr.params)
y_pred = mlr.predict(m)
print('Y Pred: ')
print(y_pred)

Hours_Studied Hours_Slept Score


0 2 5 55
1 3 6 65
2 4 5 50
3 5 7 80
4 6 8 90
5 7 7 75
6 8 8 85
7 9 9 95
8 10 10 100
Params:
Hours_Studied -1.428571
Hours_Slept 11.890756
dtype: float64
Y Pred:
0 56.596639
1 67.058824
2 53.739496
3 76.092437
4 86.554622
5 73.235294
6 83.697479
7 94.159664
8 104.621849
dtype: float64

In [142… rsquare = r2_score(y_pred,n)


print(r2)

0.9706465820573179

In [143… mlr.summary2()

C:\Users\Ishant\anaconda3\Lib\site-packages\scipy\stats\_stats_py.py:1736: UserWar
ning: kurtosistest only valid for n>=20 ... continuing anyway, n=9
warnings.warn("kurtosistest only valid for n>=20 ... continuing "

localhost:8888/nbconvert/html/practice .ipynb?download=false 7/9


12/22/23, 3:45 AM practice

Out[143]: Model: OLS Adj. R-squared (uncentered): 0.998

Dependent Variable: Score AIC: 48.5980

Date: 2023-12-21 18:23 BIC: 48.9925

No. Observations: 9 Log-Likelihood: -22.299

Df Model: 2 F-statistic: 2623.

Df Residuals: 7 Prob (F-statistic): 8.64e-11

R-squared (uncentered): 0.999 Scale: 10.684

Coef. Std.Err. t P>|t| [0.025 0.975]

Hours_Studied -1.4286 0.7787 -1.8346 0.1092 -3.2699 0.4127

Hours_Slept 11.8908 0.6872 17.3024 0.0000 10.2657 13.5158

Omnibus: 1.260 Durbin-Watson: 1.268

Prob(Omnibus): 0.533 Jarque-Bera (JB): 0.686

Skew: -0.168 Prob(JB): 0.710

Kurtosis: 1.689 Condition No.: 9

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a
constant.
[2] Standard Errors assume that the covariance matrix of the errors is correctly specified.

In [ ]:

In [107… #Reshaping the data


X_train.values.reshape(-1,1)

array([[ 7],
Out[107]:
[ 7],
[ 2],
[ 5],
[10],
[10],
[ 4],
[ 5],
[ 6],
[ 8],
[ 5],
[ 7],
[ 8],
[ 8]], dtype=int64)

DataFrame from Dictionary


In [106… dataas = pd.DataFrame(data)
dataas

localhost:8888/nbconvert/html/practice .ipynb?download=false 8/9


12/22/23, 3:45 AM practice

Out[106]: Hours_Studied Hours_Slept Pass

0 2 5 0

1 3 6 0

2 4 5 0

3 5 7 0

4 6 8 1

5 7 7 1

6 8 8 1

7 9 9 1

8 10 10 1

In [165…

Cell In[165], line 1


jupyter nbconvert --to FORMAT notebook.ipynb
^
SyntaxError: invalid syntax

In [ ]:

In [ ]:

In [ ]:

In [ ]:

localhost:8888/nbconvert/html/practice .ipynb?download=false 9/9

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy