# Linear Regression

The document provides a comprehensive overview of various regression techniques in Python, including linear regression using both SciPy and Statsmodels libraries, polynomial regression, and least squares fitting. It also demonstrates data handling and model training using a dataset related to salinity and temperature, showcasing data cleaning, scatter plotting, and model evaluation. The examples utilize libraries such as NumPy, Pandas, Matplotlib, and Seaborn for data manipulation and visualization.


# Linear Regression using the SciPy Library

import matplotlib.pyplot as plt


from scipy import stats
x = [0,1,2,3,4]
y = [3,5,5,6,7]

slope, intercept, r, p, std_err = stats.linregress(x, y)


print("slope: ", slope)
print("intercept: ", intercept)

# myfunc() defines the fitted line: it returns slope * x + intercept
def myfunc(x):
    return slope * x + intercept

mymodel = list(map(myfunc, x))
plt.scatter(x, y)
plt.plot(x, mymodel)
plt.show()
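
The r value returned by linregress is the correlation coefficient; squaring it gives the coefficient of determination, and the fitted myfunc() can also be used to predict new points. A minimal sketch continuing from the code above (the value x = 10 is just an illustrative input):

# Coefficient of determination (r squared) of the fit
print("r squared: ", r**2)

# Predict y for a new x value using the fitted line
print("prediction at x = 10: ", myfunc(10))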

# Linear Regression using the Statsmodels Library


import matplotlib.pyplot as plt
import numpy as np
import statsmodels.api as sm
# Create the x and y data values
x = [0,1,2,3,4]
y = [3,5,5,6,7]
x1 = sm.add_constant(x)
model = sm.OLS(y,x1)
results = model.fit()
print (results.params)
print (results.summary())
y_pred = results.predict(x1)
plt.scatter(x,y)
plt.xlabel("X")
plt.ylabel("Y")
plt.plot(x,y_pred, "r")
plt.show()
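
The fitted results object exposes the same quantities programmatically, which is handy when the full summary() output is not needed. A minimal sketch continuing from the code above (the new x values 5 and 6 are just illustrative inputs):

# R-squared of the fit
print("R squared: ", results.rsquared)

# Predict y at new x values (a constant column must be added, as for training)
x_new = sm.add_constant([5, 6])
print("predictions: ", results.predict(x_new))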

# Polynomial Regression
import matplotlib.pyplot as plt
from scipy import stats
import numpy as np
x = [0,1,2,3,4,5]
y = [3,8,6,6,7,3]
# Fit a 3rd-degree polynomial to the data
mymodel = np.poly1d(np.polyfit(x, y, 3))
print(mymodel)

# Evenly spaced x values for drawing the fitted curve
myline = np.linspace(0, 5, 100)
plt.scatter(x, y)
plt.plot(myline, mymodel(myline))
plt.show()
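
To judge how well the cubic polynomial fits, the observed y values can be compared against the model's predictions with scikit-learn's r2_score (the same library used later in this document). A minimal sketch continuing from the code above:

from sklearn.metrics import r2_score

# R-squared between the observed y values and the polynomial's predictions
print(r2_score(y, mymodel(x)))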

# Python Least Squares Fitting


import numpy as np
import scipy.optimize as optimization
x = np.array([0,1,2,3,4,5])
y = np.array([100,90,60,30,10,1])
# Model: exponential decay a * exp(-b * x) + c
def func(x, a, b, c):
    return a * np.exp(-b * x) + c
popt, pcov = optimization.curve_fit(func, x, y)
print ("Best fit a b c: ",popt)
print ("Best fit covariance: ",pcov)

# Example using the CalCOFI dataset

Step 1: Importing the libraries


import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import preprocessing, svm
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

Step 2: Reading the dataset


df = pd.read_csv('bottle.csv')

# Taking only the selected two attributes from the dataset
df_binary = df[['Salnty', 'T_degC']]

# Renaming the columns for easier access
df_binary.columns = ['Sal', 'Temp']

# Display the first 5 rows
df_binary.head()
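
Before plotting or cleaning, it can help to check how many values are missing in the two selected columns; Step 4 below fills these gaps. A minimal sketch on the same df_binary frame:

# Number of rows and count of missing values per column
print(df_binary.shape)
print(df_binary.isna().sum())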

Step 3: Exploring the data scatter


#plotting the Scatter plot to check relationship between Sal and Temp
sns.lmplot(x ="Sal", y ="Temp", data = df_binary, order = 2, ci = None)
plt.show()

Step 4: Data cleaning


# Eliminating NaN or missing input numbers
df_binary.fillna(method ='ffill', inplace = True)

Step 5: Training our model


# Dropping any rows that still contain NaN values
df_binary.dropna(inplace = True)

# Separating the data into independent and dependent variables,
# converting each column into a numpy array
# since each dataframe column holds a single variable
X = np.array(df_binary['Sal']).reshape(-1, 1)
y = np.array(df_binary['Temp']).reshape(-1, 1)

# Splitting the data into training and testing data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25)

regr = LinearRegression()
regr.fit(X_train, y_train)
print(regr.score(X_test, y_test))
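
Besides the score, the fitted estimator exposes the slope and intercept of the regression line directly. A minimal sketch reading them from regr:

# Fitted slope and intercept of Temp as a linear function of Sal
print("coefficient: ", regr.coef_)
print("intercept: ", regr.intercept_)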

Step 6: Exploring our results


y_pred = regr.predict(X_test)

# Data scatter of the test values with the predicted regression line
plt.scatter(X_test, y_test, color ='b')
plt.plot(X_test, y_pred, color ='k')
plt.show()
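
The visual check can be complemented with numerical metrics on the test set, for example mean squared error and R-squared from scikit-learn. A minimal sketch continuing from the code above:

from sklearn.metrics import mean_squared_error, r2_score

# Quantitative evaluation of the test-set predictions
print("MSE: ", mean_squared_error(y_test, y_pred))
print("R squared: ", r2_score(y_test, y_pred))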

Working with a smaller dataset


# Taking the first 500 rows of the same two columns
df_binary500 = df[['Salnty', 'T_degC']][:500]
df_binary500.columns = ['Sal', 'Temp']

# Eliminating NaN or missing input numbers
df_binary500.fillna(method ='ffill', inplace = True)
df_binary500.dropna(inplace = True)

X = np.array(df_binary500['Sal']).reshape(-1, 1)
y = np.array(df_binary500['Temp']).reshape(-1, 1)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25)
regr = LinearRegression()
regr.fit(X_train, y_train)
print(regr.score(X_test, y_test))

Exploring the results for the smaller dataset


y_pred = regr.predict(X_test)
plt.scatter(X_test, y_test, color ='b')
plt.plot(X_test, y_pred, color ='k')
plt.show()
