Emllab
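# load the California housing dataset bundled with scikit-learn
from sklearn.datasets import fetch_california_housing
data=fetch_california_housing()
print(data.DESCR)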
.. _california_housing_dataset:
:Attribute Information:
    - MedInc        median income in block group
    - HouseAge      median house age in block group
    - AveRooms      average number of rooms per household
    - AveBedrms     average number of bedrooms per household
    - Population    block group population
    - AveOccup      average number of household members
    - Latitude      block group latitude
    - Longitude     block group longitude
The target variable is the median house value for California districts,
expressed in hundreds of thousands of dollars ($100,000).
This dataset was derived from the 1990 U.S. census, using one row per census
block group. A block group is the smallest geographical unit for which the U.S.
Census Bureau publishes sample data (a block group typically has a population
of 600 to 3,000 people).
.. topic:: References
data.feature_names
['MedInc',
'HouseAge',
'AveRooms',
'AveBedrms',
'Population',
'AveOccup',
'Latitude',
'Longitude']
https://colab.research.google.com/drive/1ihyBCBb0Gx3Ajpj_XyPsrzNoDSjiskhN#scrollTo=OEGyszt0Z-j3&printMode=true 1/6
7/19/24, 10:57 AM Untitled3.ipynb - Colab
import pandas as pd
df=pd.DataFrame(data.data,columns=data.feature_names)
df.head()
df['Price']=data.target
df.head()
df.describe()
df.isnull().sum()
MedInc 0
HouseAge 0
AveRooms 0
AveBedrms 0
Population 0
AveOccup 0
Latitude 0
Longitude 0
Price 0
dtype: int64
(5160, 9)
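The cell that created `df_copy` does not appear in this printout. The full California housing data has 20,640 rows, and 5160 is exactly one quarter of that, so one plausible reconstruction is a 25% random sample of `df`. The sketch below shows that idea on a stand-in frame of the same shape; `df_demo`, `frac=0.25` and `random_state=42` are assumptions, not values recovered from the notebook.

```python
import numpy as np
import pandas as pd

# Stand-in for df: same shape as the full data plus the Price column (values are random).
columns = ['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms', 'Population',
           'AveOccup', 'Latitude', 'Longitude', 'Price']
rng = np.random.default_rng(0)
df_demo = pd.DataFrame(rng.normal(size=(20640, 9)), columns=columns)

# Assumed step: keep a random 25% of the rows.
df_copy = df_demo.sample(frac=0.25, random_state=42)
print(df_copy.shape)
```

Sampling keeps `sns.pairplot` fast, since the plot draws one panel for every pair of the nine columns.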
import seaborn as sns
sns.pairplot(df_copy)
<seaborn.axisgrid.PairGrid at 0x7ebddafac040>
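`x_train`, `x_test`, `y_train` and `y_test` are used in the following cells, but the cell that defines them is missing from the printout. A minimal sketch of a typical split, assuming the features are every column except `Price`; the stand-in frame `df_demo`, `test_size=0.25` and `random_state=42` are illustrative assumptions, not the notebook's values.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in frame with the notebook's column names (values are random).
columns = ['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms', 'Population',
           'AveOccup', 'Latitude', 'Longitude', 'Price']
rng = np.random.default_rng(0)
df_demo = pd.DataFrame(rng.normal(size=(400, 9)), columns=columns)

# Features are all columns except the target; the target is Price.
x = df_demo.drop(columns='Price')
y = df_demo['Price']

# Hold out a quarter of the rows for testing (assumed proportion).
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42
)
print(x_train.shape, x_test.shape)
```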
#feature scaling
from sklearn.preprocessing import StandardScaler
sc=StandardScaler()
x_train=sc.fit_transform(x_train)
x_train
x_test=sc.transform(x_test)
x_test
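The scaler is fitted on the training split only, and the test split is transformed with the training mean and standard deviation, which keeps test-set statistics out of the model. The sketch below, on made-up arrays, checks that `StandardScaler` matches the manual formula `(x - mean_train) / std_train`; the `_demo` names are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up train and test arrays with three features.
rng = np.random.default_rng(0)
x_train_demo = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
x_test_demo = rng.normal(loc=5.0, scale=2.0, size=(20, 3))

sc_demo = StandardScaler()
train_scaled = sc_demo.fit_transform(x_train_demo)  # learns mean and std from train
test_scaled = sc_demo.transform(x_test_demo)        # reuses the train statistics

# Manual equivalent: StandardScaler uses the population std (ddof=0).
mean_train = x_train_demo.mean(axis=0)
std_train = x_train_demo.std(axis=0)
manual_test = (x_test_demo - mean_train) / std_train

print(np.allclose(test_scaled, manual_test))
```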
#model training
from sklearn.linear_model import LinearRegression
lr=LinearRegression()
lr.fit(x_train,y_train)
LinearRegression()
lr.coef_
lr.intercept_
2.0708259184263813
#prediction
y_pred=lr.predict(x_test)
0.5335029155157139
0.540898948179417
0.730412839095613
0.5875394343499214
-0.0012124958000889752
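The code that printed the numbers above is not in the printout. The first three are consistent with the MSE, MAE and RMSE pattern used in the Ridge cell further down (0.7304 is the square root of 0.5335), and the fourth may be an R² score, though that reading is an assumption. A sketch of such a metrics cell on toy values; `y_test_demo` and `y_pred_demo` are made up.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Toy targets and predictions, only to show the calls.
y_test_demo = np.array([3.0, 5.0, 2.5, 7.0])
y_pred_demo = np.array([2.5, 5.0, 4.0, 8.0])

mse = mean_squared_error(y_test_demo, y_pred_demo)   # mean of squared errors
mae = mean_absolute_error(y_test_demo, y_pred_demo)  # mean of absolute errors
rmse = np.sqrt(mse)                                  # same units as the target
r2 = r2_score(y_test_demo, y_pred_demo)              # 1 - SS_res / SS_tot

print(mse, mae, rmse, r2)
```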
Ridge(alpha=20.0)
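The cell that built and fitted `ridge` is not shown; only its repr, `Ridge(alpha=20.0)`, survives. Below is a minimal sketch on synthetic data of fitting that estimator next to plain linear regression. It also illustrates one reason the two can score almost the same: with many rows, an L2 penalty of 20 shrinks the coefficients only slightly. The arrays, `true_coef` and the `_demo` names are made up.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic regression problem with eight roughly standardized features.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(2000, 8))
true_coef = np.array([0.8, 0.1, -0.3, 0.3, 0.0, -0.05, -0.9, -0.85])
y_demo = x_demo @ true_coef + 2.0 + rng.normal(scale=0.7, size=2000)

lr_demo = LinearRegression().fit(x_demo, y_demo)
ridge_demo = Ridge(alpha=20.0).fit(x_demo, y_demo)

# Largest coefficient difference between the two fits.
print(np.abs(lr_demo.coef_ - ridge_demo.coef_).max())
```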
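# metrics used below (imported here in case an earlier cell did not)
import numpy as np
from sklearn.metrics import mean_squared_error,mean_absolute_error
y_pred=ridge.predict(x_test)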
y_pred
mse=mean_squared_error(y_test,y_pred)
mae=mean_absolute_error(y_test,y_pred)
print(mse)
print(mae)
print(np.sqrt(mse))
0.5335706984910803
0.5408065397594831
0.7304592380763489
Lasso(alpha=20.0)
y_pred=lasso.predict(x_test)
mse=mean_squared_error(y_test,y_pred)
mae=mean_absolute_error(y_test,y_pred)
print(mse)
print(mae)
print(np.sqrt(mse))
1.2935112672240048
0.9000629779192262
1.1373263679454568
ElasticNet(alpha=20.0)
y_pred=elastic.predict(x_test)
mse=mean_squared_error(y_test,y_pred)
mae=mean_absolute_error(y_test,y_pred)
print(mse)
print(mae)
print(np.sqrt(mse))
1.2935112672240048
0.9000629779192262
1.1373263679454568
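Lasso and ElasticNet report identical errors above. A likely explanation, assuming both were fitted on the standardized features with `alpha=20.0` and otherwise default settings, is that the L1 part of the penalty is large enough to shrink every coefficient to zero, so both models predict the training mean of `y` for every row. The sketch below reproduces that behaviour on synthetic data; the arrays, `true_coef` and the `_demo` names are made up.

```python
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

# Synthetic regression problem with eight roughly standardized features.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(1000, 8))
true_coef = np.array([0.8, 0.1, -0.3, 0.3, 0.0, -0.05, -0.9, -0.85])
y_demo = x_demo @ true_coef + 2.0 + rng.normal(scale=0.7, size=1000)

lasso_demo = Lasso(alpha=20.0).fit(x_demo, y_demo)
elastic_demo = ElasticNet(alpha=20.0).fit(x_demo, y_demo)

# With a penalty this strong, every coefficient is driven to zero.
print(lasso_demo.coef_)
print(elastic_demo.coef_)

# Only the intercept remains, and it equals the mean of the training target.
print(lasso_demo.intercept_, y_demo.mean())
```

If that is what happened in the notebook, a much smaller `alpha` would be needed before either penalized model keeps any features.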
df_copy.corr()