1684918425867

5/24/23, 1:48 AM    Delhi House Price Prediction - Jupyter Notebook
localhost:8888/notebooks/Delhi House Price Prediction.ipynb

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
sns.set_theme(color_codes=True)

In [2]:
df = pd.read_csv("MagicBricks.csv")
df.head()

Out[2]:
[Table: first five rows of the raw data; cell values are not legible in the source]

Data Preprocessing Part 1

In [3]:
# check the number of unique values in all object-datatype columns
[code and output not legible in the source]

In [4]:
# drop the Locality column because it has a large number of unique values
[code not legible in the source]
df.head()

Out[4]:
[Table: first five rows after the drop; cell values are not legible in the source]

Exploratory Data Analysis

In [10]:
# list of categorical variables to plot
cat_vars = ['Furnishing', 'Status', 'Transaction', 'Type', 'BHK', 'Bathroom', 'Parking']

# create a 3x3 grid of subplots (seven are used, two are removed below)
fig, axs = plt.subplots(nrows=3, ncols=3, figsize=(20, 20))
axs = axs.ravel()

# create a bar plot of mean Price for each categorical variable
for i, var in enumerate(cat_vars):
    sns.barplot(x=var, y='Price', data=df, ax=axs[i], estimator=np.mean)
    # the rotation value is not legible in the source; 90 is assumed
    axs[i].set_xticklabels(axs[i].get_xticklabels(), rotation=90)

# adjust spacing between subplots
fig.tight_layout()

# remove the eighth subplot
fig.delaxes(axs[7])

# remove the ninth subplot
fig.delaxes(axs[8])

# show plot
plt.show()

[Figure: bar plots of mean Price for each categorical variable]

In [12]:
cat_vars = ['Furnishing', 'Status', 'Transaction', 'Type']

# create a figure and axes
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(20, 20))

# create a pie chart for each categorical variable
for i, var in enumerate(cat_vars):
    if i < len(axs.flat):
        # count the number of occurrences for each category
        cat_counts = df[var].value_counts()

        # create a pie chart
        axs.flat[i].pie(cat_counts, labels=cat_counts.index, autopct='%1.1f%%', startangle=90)

        # set a title for each subplot
        axs.flat[i].set_title(f'{var} Distribution')

# adjust spacing between subplots
fig.tight_layout()

# show the plot
plt.show()

[Figure: pie charts of the Furnishing, Status, Transaction and Type distributions]

In [13]:
num_vars = ['Area', 'Per_Sqft']

fig, axs = plt.subplots(nrows=1, ncols=2, figsize=(20, 10))
axs = axs.flatten()

for i, var in enumerate(num_vars):
    sns.boxplot(x=var, data=df, ax=axs[i])

fig.tight_layout()
plt.show()

[Figure: box plots of Area and Per_Sqft]

In [14]:
num_vars = ['Area', 'Per_Sqft']

fig, axs = plt.subplots(nrows=1, ncols=2, figsize=(20, 10))
axs = axs.flatten()

for i, var in enumerate(num_vars):
    sns.violinplot(x=var, data=df, ax=axs[i])

fig.tight_layout()
plt.show()

[Figure: violin plots of Area and Per_Sqft]

In [15]:
num_vars = ['Area', 'Per_Sqft']

fig, axs = plt.subplots(nrows=1, ncols=2, figsize=(20, 10))
axs = axs.flatten()

for i, var in enumerate(num_vars):
    sns.scatterplot(x=var, y='Price', hue='Furnishing', data=df, ax=axs[i])

fig.tight_layout()
plt.show()

[Figure: scatter plots of Price against Area and Per_Sqft, colored by Furnishing]

In [16]:
num_vars = ['Area', 'Per_Sqft']

fig, axs = plt.subplots(nrows=1, ncols=2, figsize=(20, 10))
axs = axs.flatten()

for i, var in enumerate(num_vars):
    sns.scatterplot(x=var, y='Price', hue='Status', data=df, ax=axs[i])

fig.tight_layout()
plt.show()

[Figure: scatter plots of Price against Area and Per_Sqft, colored by Status]

In [17]:
num_vars = ['Area', 'Per_Sqft']

fig, axs = plt.subplots(nrows=1, ncols=2, figsize=(20, 10))
axs = axs.flatten()

for i, var in enumerate(num_vars):
    sns.scatterplot(x=var, y='Price', hue='Transaction', data=df, ax=axs[i])

fig.tight_layout()
plt.show()

[Figure: scatter plots of Price against Area and Per_Sqft, colored by Transaction]

In [18]:
num_vars = ['Area', 'Per_Sqft']

fig, axs = plt.subplots(nrows=1, ncols=2, figsize=(20, 10))
axs = axs.flatten()

for i, var in enumerate(num_vars):
    sns.scatterplot(x=var, y='Price', hue='Type', data=df, ax=axs[i])

fig.tight_layout()
plt.show()

[Figure: scatter plots of Price against Area and Per_Sqft, colored by Type]

Data Preprocessing Part 2

In [19]:
df.shape

Out[19]:
(1259, 10)

In [20]:
# check the percentage of missing values per column
check_missing = df.isnull().sum() * 100 / df.shape[0]
check_missing[check_missing > 0].sort_values(ascending=False)

Out[20]:
Per_Sqft      19.142176
Parking        2.621128
Furnishing     0.397141
Type          [illegible]
Bathroom       0.158856
dtype: float64

In [21]:
# fill null values with the column median
df['Per_Sqft'].fillna(df['Per_Sqft'].median(), inplace=True)
df['Parking'].fillna(df['Parking'].median(), inplace=True)

In [22]:
# drop rows with null values in these 3 columns because their null share is below 1%
df.dropna(subset=['Furnishing', 'Type', 'Bathroom'], inplace=True)
df.head()

Out[22]:
[Table: first five rows with columns Area, BHK, Bathroom, Furnishing, Parking, Price, Status, Transaction, Type, Per_Sqft; cell values are not legible in the source]

In [23]:
df.shape

Out[23]:
(1252, 10)

Label Encoding for Object datatype

In [24]:
# loop over each column in the DataFrame where dtype is 'object'
for col in df.select_dtypes(include=['object']).columns:
    # print the column name and the unique values
    print(f"{col}: {df[col].unique()}")

Furnishing: ['Semi-Furnished' 'Furnished' 'Unfurnished']
Status: ['Ready_to_move' 'Almost_ready']
Transaction: ['New_Property' 'Resale']
Type: ['Builder_Floor' 'Apartment']

In [25]:
from sklearn import preprocessing

# loop over each column in the DataFrame where dtype is 'object'
for col in df.select_dtypes(include=['object']).columns:
    # initialize a LabelEncoder object
    label_encoder = preprocessing.LabelEncoder()

    # fit the encoder to the unique values in the column
    label_encoder.fit(df[col].unique())

    # transform the column using the encoder
    df[col] = label_encoder.transform(df[col])

    # print the column name and the unique encoded values
    print(f"{col}: {df[col].unique()}")

Furnishing: [1 0 2]
Status: [1 0]
Transaction: [0 1]
Type: [1 0]

Correlation Heatmap

In [26]:
# correlation heatmap
plt.figure(figsize=(20, 16))
sns.heatmap(df.corr(), fmt='.2g', annot=True)

Out[26]:
<AxesSubplot:>

[Figure: annotated correlation heatmap of all columns]

Train Test Split

In [28]:
from sklearn.model_selection import train_test_split

# perform train-test split; the random_state value is cut off in the source
X_train, X_test, y_train, y_test = train_test_split(df.drop('Price', axis=1), df['Price'], test_size=0.2, random_state=...)

Remove Outlier using IQR

In [29]:
# concatenate X_train and y_train for outlier removal
train_df = pd.concat([X_train, y_train], axis=1)

# calculate the IQR values for each column
Q1 = train_df.quantile(0.25)
Q3 = train_df.quantile(0.75)
IQR = Q3 - Q1

# remove every training row that has an outlier in any column
train_df = train_df[~((train_df < (Q1 - 1.5 * IQR)) | (train_df > (Q3 + 1.5 * IQR))).any(axis=1)]

# separate X_train and y_train after outlier removal
X_train = train_df.drop('Price', axis=1)
y_train = train_df['Price']

Decision Tree Regressor

In [30]:
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import GridSearchCV

# create a DecisionTreeRegressor object
dtree = DecisionTreeRegressor()

# define the hyperparameters to tune and their values
# note: max_features='auto' was removed in scikit-learn 1.3
param_grid = {
    'max_depth': [2, 4, 6, 8],
    'min_samples_split': [2, 4, 6, 8],
    'min_samples_leaf': [1, 2, 3, 4],
    'max_features': ['auto', 'sqrt', 'log2'],
    'random_state': [0, 42]
}

# create a GridSearchCV object
grid_search = GridSearchCV(dtree, param_grid, cv=5, scoring='neg_mean_squared_error')

# fit the GridSearchCV object to the data
grid_search.fit(X_train, y_train)

# print the best hyperparameters
print(grid_search.best_params_)

{'max_depth': 6, 'max_features': 'auto', 'min_samples_leaf': [illegible], 'min_samples_split': 2, 'random_state': 42}

In [31]:
from sklearn.tree import DecisionTreeRegressor
dtree = DecisionTreeRegressor(random_state=42, max_depth=6, max_features='auto', min_samples_leaf=1, min_samples_split=2)
dtree.fit(X_train, y_train)

Out[31]:
DecisionTreeRegressor(max_depth=6, max_features='auto', min_samples_leaf=[illegible], random_state=42)

In [32]:
from sklearn import metrics
from sklearn.metrics import mean_absolute_percentage_error
import math

y_pred = dtree.predict(X_test)
mae = metrics.mean_absolute_error(y_test, y_pred)
mape = mean_absolute_percentage_error(y_test, y_pred)
mse = metrics.mean_squared_error(y_test, y_pred)
r2 = metrics.r2_score(y_test, y_pred)
rmse = math.sqrt(mse)

print('MAE is {}'.format(mae))
print('MAPE is {}'.format(mape))
print('MSE is {}'.format(mse))
print('R2 score is {}'.format(r2))
print('RMSE score is {}'.format(rmse))

MAE is 9490108.759762005
MAPE is 0.42[illegible]23957735847867
MSE is [illegible]
R2 score is 0.4980663112331838
RMSE score is 22743687.609319054

In [33]:
imp_df = pd.DataFrame({
    "Feature Name": X_train.columns,
    "Importance": dtree.feature_importances_
})
fi = imp_df.sort_values(by="Importance", ascending=False)

fi2 = fi.head(10)
plt.figure(figsize=(10, 8))
sns.barplot(data=fi2, x='Importance', y='Feature Name')
plt.title('Feature Importance Each Attributes (Decision Tree Regressor)', fontsize=18)
plt.xlabel('Importance', fontsize=16)
plt.ylabel('Feature Name', fontsize=16)
plt.show()

[Figure: Feature Importance Each Attributes (Decision Tree Regressor); x-axis: Importance, y-axis: Feature Name]

In [34]:
import shap
explainer = shap.TreeExplainer(dtree)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)

[Figure: SHAP summary plot for the decision tree; features from top to bottom: Area, Per_Sqft, Type, Bathroom, Transaction, Parking, Furnishing, BHK, Status; x-axis: SHAP value (impact on model output)]

In [36]:
explainer = shap.Explainer(dtree, X_test)
shap_values = explainer(X_test)
shap.plots.waterfall(shap_values[0])

[Figure: SHAP waterfall plot for the first test row (decision tree), showing each feature's contribution to f(x)]

Random Forest Regressor

In [37]:
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# create a Random Forest Regressor object
rf = RandomForestRegressor()

# define the hyperparameter grid
param_grid = {
    'max_depth': [3, 5, 7, 9],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
    'max_features': ['auto', 'sqrt'],
    'random_state': [0, 42]
}

# create a GridSearchCV object
grid_search = GridSearchCV(rf, param_grid, cv=5, scoring='r2')

# fit the GridSearchCV object to the training data
grid_search.fit(X_train, y_train)

# print the best hyperparameters
print("Best hyperparameters: ", grid_search.best_params_)

Best hyperparameters:  {'max_depth': 9, 'max_features': 'sqrt', 'min_samples_leaf': 1, 'min_samples_split': 2, 'random_state': [illegible]}

In [38]:
from sklearn.ensemble import RandomForestRegressor
rf = RandomForestRegressor(random_state=42, max_depth=9, min_samples_split=2, min_samples_leaf=1, max_features='sqrt')
rf.fit(X_train, y_train)

Out[38]:
RandomForestRegressor(max_depth=9, max_features='sqrt', random_state=42)

In [39]:
from sklearn import metrics
from sklearn.metrics import mean_absolute_percentage_error
import math

y_pred = rf.predict(X_test)
mae = metrics.mean_absolute_error(y_test, y_pred)
mape = mean_absolute_percentage_error(y_test, y_pred)
mse = metrics.mean_squared_error(y_test, y_pred)
r2 = metrics.r2_score(y_test, y_pred)
rmse = math.sqrt(mse)

print('MAE is {}'.format(mae))
print('MAPE is {}'.format(mape))
print('MSE is {}'.format(mse))
print('R2 score is {}'.format(r2))
print('RMSE score is {}'.format(rmse))

MAE is 9136272.326071279
MAPE is 0.4378542709594508
MSE is [illegible]
R2 score is 0.4250026174[illegible]999723
RMSE score is 22413985.90799013

In [40]:
imp_df = pd.DataFrame({
    "Feature Name": X_train.columns,
    # take the importances from rf, the model named in the plot title
    # (the source reads dtree.feature_importances_ here)
    "Importance": rf.feature_importances_
})
fi = imp_df.sort_values(by="Importance", ascending=False)

fi2 = fi.head(10)
plt.figure(figsize=(10, 8))
sns.barplot(data=fi2, x='Importance', y='Feature Name')
plt.title('Feature Importance Each Attributes (Random Forest Regressor)', fontsize=18)
plt.xlabel('Importance', fontsize=16)
plt.ylabel('Feature Name', fontsize=16)
plt.show()

[Figure: Feature Importance Each Attributes (Random Forest Regressor); x-axis: Importance, y-axis: Feature Name]

In [41]:
import shap
explainer = shap.TreeExplainer(rf)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)

[Figure: SHAP summary plot for the random forest; features from top to bottom: Area, Bathroom, Per_Sqft, BHK, Parking, Type, Transaction, Furnishing, Status; x-axis: SHAP value (impact on model output)]

In [42]:
explainer = shap.Explainer(rf, X_test, check_additivity=False)
shap_values = explainer(X_test, check_additivity=False)
shap.plots.waterfall(shap_values[0])

[Figure: SHAP waterfall plot for the first test row (random forest), showing each feature's contribution to f(x)]
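Both models are scored with the same five metrics (MAE, MAPE, MSE, R2 and RMSE). As a rough illustration of what those numbers measure, the sketch below computes them by hand in plain Python. The helper name regression_report and the four toy prices are assumptions made up for this example and do not come from the notebook. The MAPE here also assumes no true value is zero, which is a simplification compared with scikit-learn's implementation.

```python
import math


def regression_report(y_true, y_pred):
    """Return MAE, MAPE, MSE, RMSE and R2 for two equal-length sequences.

    Hypothetical helper for illustration; assumes every y_true value is nonzero.
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # mean absolute error: average size of the miss, in price units
    mae = sum(abs(e) for e in errors) / n

    # mean absolute percentage error: average miss relative to the true value
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n

    # mean squared error and its square root
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # R2: 1 minus residual sum of squares over total sum of squares
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "MAPE": mape, "MSE": mse, "RMSE": rmse, "R2": r2}


# toy values, made up for this example
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 330.0, 380.0]

report = regression_report(y_true, y_pred)
for name, value in report.items():
    print(f"{name} is {value}")
```

Because RMSE is the square root of MSE and R2 is computed from the same squared residuals, a model with a lower RMSE on a given test set should also show a higher R2 on that set. That relationship is a quick way to sanity-check a metrics printout.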
