Internship Report Anthony and Joshil PDF
ON
SUBMITTED BY
Anthony Lasrado
(4SO18CS067)
Joshil Fernandes
(4SO18CS058)
AT
ZEPHYR TECHNOLOGIES
This dissertation would not have been possible without the guidance and help of several
individuals and organizations who, in one way or another, contributed and extended their
valuable assistance during this internship project.
We would like to express our gratitude to the employees of Zephyr Technologies & Solutions Pvt.
Ltd, under whom we executed this project, for providing this internship opportunity. Their
constant guidance and willingness to share their vast knowledge helped us understand this
project in depth and complete the assigned tasks.
We would like to extend our special thanks to Mr. Vedanth Shenoy, Faculty at Zephyr
Technologies, for his constant guidance throughout this internship.
Finally, we would like to thank our families and friends for their blessings, for supporting us in
all aspects, for encouraging us to devote our time to the work during the internship period,
and for lending a helping hand in successfully completing the project.
Anthony Lasrado
(4SO18CS067)
Joshil Fernandes
(4SO18CS058)
ABOUT THE COMPANY
Their tools are professionalism, skills and expertise, which translate into delivering quality work
at every step of any project they undertake. They work towards getting better than the best out
of every team member at ZEPHYR TECHNOLOGIES, which means that when you hire them,
all-round quality is assured. Their Advantage Quality includes protection of
intellectual property for the source code developed specifically for your business. They do not sell the
source code to third parties, and all elements that they create for your web solution belong
to you. ZEPHYR TECHNOLOGIES project managers and business analysts place great
value on building a clear communication link with you, as they consider it the key ingredient
for the success of any project at hand.
ABSTRACT
Real estate involves the purchase, sale, and development of land and of residential and
non-residential buildings. The main players in the real estate market are
landlords, developers, builders, real estate agents, tenants, buyers, etc. The
activities of the real estate sector also encompass the housing and construction
sectors.
The real estate sector in India has assumed growing importance with the
liberalization of the economy. The consequent increase in business
opportunities and migration of the labour force has, in turn, increased the
demand for commercial and housing space, especially rental housing.
Developments in the real estate sector are influenced by developments
in the retail, hospitality and entertainment (e.g., hotels, resorts, cinema theatres)
industries, economic services (e.g., hospitals, schools) and information
technology (IT)-enabled services (e.g., call centres), and vice versa.
The real estate sector is a major employment driver, being the second largest
employer after agriculture. This is because of the chain of backward and
forward linkages that the sector has with the other sectors of the economy,
especially the housing and construction sector. About 250 ancillary
industries, such as cement, steel, brick, timber and building materials, are dependent
on the real estate industry.
The Indian real estate market, compared to the more developed Asian
and Western markets, is characterized by smaller size, lower availability of good
quality space and higher prices.
PRICE PREDICTION OF REAL ESTATE
TABLE OF CONTENTS
1. Introduction
2. System Design
3. Implementation Details
4. Methodology
5. Results
6. Conclusion
CHAPTER 1
INTRODUCTION
The real estate sector is the second largest employer after agriculture, and
experts have stated that the sector is poised to grow around 20 percent over the
next decade. The real estate sector comprises four sub-sectors: housing, retail,
hospitality, and commercial. Over the past decades, the high growth of the sector
has been matched by the growth of the corporate environment, since there is demand
for office space as well as urban and semi-urban accommodation. The
construction industry in India ranks third among the 14 primary sectors in terms
of direct, indirect and induced effects on the economy. It is expected that the real
estate sector will attract more non-resident Indian (NRI) investments.
We will understand the problem of a real estate company from its CEO and then
apply machine learning (ML) to solve it. In this project, we walk through the analysis of the
problem: collecting data, importing it into a Jupyter notebook, looking for
promising attributes, finding correlations, plotting graphs, creating a pipeline,
and dealing with missing values.
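The correlation, pipeline and missing-value steps listed above can be sketched as follows. This is only an illustration under stated assumptions: the small table, its column names (rooms, age, price) and its values are hypothetical and are not the data used in the project. It uses the median to fill missing values, which is one common choice among several.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical housing-style data; column names and values are illustrative only.
housing = pd.DataFrame({
    "rooms": [6.5, 6.4, np.nan, 7.0, 7.1, 6.0],
    "age": [65.2, 78.9, 61.1, 45.8, 54.2, 58.7],
    "price": [24.0, 21.6, 34.7, 33.4, 36.2, 28.7],
})

# Correlation of every attribute with the target column, strongest first.
correlations = housing.corr()["price"].sort_values(ascending=False)
print(correlations)

# Pipeline: fill missing values with the column median, then standardize.
features = housing.drop(columns="price")
pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="median")),
    ("scaler", StandardScaler()),
])
prepared = pipeline.fit_transform(features)
print(prepared.shape)
```

Keeping the imputer and the scaler in one pipeline means the same preparation can later be applied to new input values with pipeline.transform.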
CHAPTER 2
SYSTEM DESIGN
• The user should be able to enter the input values for prediction.
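A minimal sketch of how user-entered values might be turned into model input, assuming the values arrive as one comma-separated string. The function name parse_features and the sample values are hypothetical; the resulting list could then be passed to a trained model, for example as model.predict([features]).

```python
def parse_features(text, expected_count):
    """Convert a comma-separated string such as "6.5, 65.2" into a list of floats."""
    values = [float(part) for part in text.split(",") if part.strip()]
    if len(values) != expected_count:
        raise ValueError(f"expected {expected_count} values, got {len(values)}")
    return values


# Hypothetical input with three attribute values.
features = parse_features("6.5, 65.2, 4.09", expected_count=3)
print(features)  # [6.5, 65.2, 4.09]
```

Rejecting input with the wrong number of values early gives the user a clear message instead of a shape error from the model.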
CHAPTER 3
IMPLEMENTATION DETAILS
Algorithms
Machine learning is the ability of a system to automatically learn and improve from experience
without being explicitly programmed; it focuses on the development of
computer programs that can access data and use it to learn for themselves. A classifier is
an algorithm that implements classification, especially in a concrete
implementation; the term also refers to the mathematical function, implemented by the algorithm, that
maps input data to a category. Classification is an instance of supervised learning, i.e., learning
where a training set of correctly identified observations is available.
The Decision Tree algorithm has a major disadvantage in that it tends to over-fit
the training data. This problem can be limited by using Random Forest
Regression in place of Decision Tree Regression. Additionally, the
Random Forest algorithm is reasonably fast and often more robust than other
regression models.
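The over-fitting behaviour described above can be observed by comparing training and test scores. The sketch below is an illustration on synthetic noisy data (an assumption, not the data of this report); the sample size, noise level and n_estimators=100 are arbitrary choices. A fully grown decision tree typically scores almost perfectly on the data it was trained on and noticeably worse on unseen data, while a random forest usually narrows that gap.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic noisy data (an assumption for illustration, not the report's data).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# R^2 on the training and test sets; a large gap suggests over-fitting.
tree_train_r2 = tree.score(X_train, y_train)
tree_test_r2 = tree.score(X_test, y_test)
forest_train_r2 = forest.score(X_train, y_train)
forest_test_r2 = forest.score(X_test, y_test)

print(f"Decision tree: train R^2={tree_train_r2:.3f}, test R^2={tree_test_r2:.3f}")
print(f"Random forest: train R^2={forest_train_r2:.3f}, test R^2={forest_test_r2:.3f}")
```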
Problem Analysis
The same data set that was used for the Decision Tree Regression is utilized
here, with one independent variable, Temperature, and one
dependent variable, Revenue, which we have to predict. In this problem, we
build a Random Forest Regression model that learns the
relationship between the Temperature and the Revenue of the ice cream shop and
predicts the revenue of the shop based on the temperature on a
particular day.
As usual, the NumPy, matplotlib and the Pandas libraries are imported.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
The data set is imported using the function ‘pd.read_csv’ from a GitHub
repository. We assign the ‘Temperature’ column to the independent
variable (X) and the ‘Revenue’ column to the dependent variable (y).
dataset = pd.read_csv('https://raw.githubusercontent.com/mk-gurucharan/Regression/master/IceCreamData.csv')
X = dataset['Temperature'].values
y = dataset['Revenue'].values
dataset.head(5)
>>
Temperature Revenue
24.566884 534.799028
26.005191 625.190122
27.790554 660.632289
20.595335 487.706960
11.503498 316.240194
Step 3: Splitting the dataset into the Training set and Test set
As with the Decision Tree Regression model, we split the data set. We
use test_size=0.05, which means that only 5% of the 500 data rows (25 rows) are
used as the test set and the remaining 475 rows are used as the training set for
building the Random Forest Regression model.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.05)
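The split sizes stated above can be checked with a short, self-contained sketch. The 500-row array is a synthetic stand-in for the data set, and random_state=0 is an assumption added here so that the same rows are selected on every run; the call in the text does not fix a seed, so its split changes between runs.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in with 500 rows (illustrative values only).
X = np.arange(500, dtype=float)
y = 2.0 * X

# random_state is an assumption added for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.05, random_state=0
)
print(len(X_train), len(X_test))  # 475 25
```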
Step 4: Training the Random Forest Regression model on the training set
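A minimal sketch of this training step, assuming scikit-learn's RandomForestRegressor. The values n_estimators=10 and random_state=0 are assumptions, and the synthetic Temperature and Revenue arrays are stand-ins so that the example runs without downloading the data set.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Temperature/Revenue columns (illustrative values only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 45, size=500)
y = 45 + 21.5 * X + rng.normal(scale=25, size=500)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.05, random_state=0
)

# scikit-learn expects a 2-D feature array, hence reshape(-1, 1).
regressor = RandomForestRegressor(n_estimators=10, random_state=0)
regressor.fit(X_train.reshape(-1, 1), y_train)
```

The fitted regressor object is what the prediction call on the test set relies on.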
Once the model has been trained on the training set values, we predict the
results of the test set using the regressor.predict function and assign them to
‘y_pred’.
y_pred = regressor.predict(X_test.reshape(-1,1))
In this step, we shall compare and display the values of y_test as ‘Real
Values’ and y_pred as ‘Predicted Values’ in a Pandas dataframe.
df = pd.DataFrame({'Real Values':y_test.reshape(-1), 'Predicted Values':y_pred.reshape(-1)})
df
>>
Real Values Predicted Values
534.622865 510.602018
542.608070 558.764770
618.457277 653.356430
460.402500 449.302331
759.377432 728.037404
631.318237 649.712332
572.672047 583.685756
494.627437 503.075097
250.131728 239.372956
594.651009 635.653662
383.956240 384.531416
491.230603 503.075097
875.019348 933.984685
273.073342 224.659296
500.925064 498.355934
191.623312 193.223331
691.516541 726.817925
421.621505 420.997198
636.298374 653.945550
321.848273 276.772845
283.679657 275.805778
608.936345 589.542982
212.591740 239.372956
594.804871 541.164031
500.065779 524.649546
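The agreement between the two columns above can be summarized with standard error measures. The sketch below copies the 25 listed pairs and computes the mean absolute error (MAE), the root mean squared error (RMSE) and the coefficient of determination (R^2) with NumPy; the choice of these three measures is an assumption made here for illustration.

```python
import numpy as np

# The 25 real and predicted values listed in the table above.
real = np.array([
    534.622865, 542.608070, 618.457277, 460.402500, 759.377432,
    631.318237, 572.672047, 494.627437, 250.131728, 594.651009,
    383.956240, 491.230603, 875.019348, 273.073342, 500.925064,
    191.623312, 691.516541, 421.621505, 636.298374, 321.848273,
    283.679657, 608.936345, 212.591740, 594.804871, 500.065779,
])
predicted = np.array([
    510.602018, 558.764770, 653.356430, 449.302331, 728.037404,
    649.712332, 583.685756, 503.075097, 239.372956, 635.653662,
    384.531416, 503.075097, 933.984685, 224.659296, 498.355934,
    193.223331, 726.817925, 420.997198, 653.945550, 276.772845,
    275.805778, 589.542982, 239.372956, 541.164031, 524.649546,
])

errors = real - predicted
mae = np.mean(np.abs(errors))             # average size of the error
rmse = np.sqrt(np.mean(errors ** 2))      # penalizes large errors more strongly
r2 = 1.0 - np.sum(errors ** 2) / np.sum((real - real.mean()) ** 2)

print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  R^2={r2:.3f}")
```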
From the above values, we infer that the model predicts the y_test values
with good accuracy, though this could likely be improved by tuning
hyper-parameters such as n_estimators and max_depth. Readers are encouraged to
vary these parameters to improve the accuracy of the Random Forest
Regression model.
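One possible way to tune n_estimators and max_depth is a cross-validated grid search. The sketch below assumes scikit-learn's GridSearchCV; the candidate values in param_grid, the 5-fold setting and the synthetic stand-in data are all assumptions chosen for illustration, not settings taken from this project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the Temperature/Revenue data (illustrative values only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 45, size=(200, 1))
y = 45 + 21.5 * X[:, 0] + rng.normal(scale=25, size=200)

# Candidate hyper-parameter values; these are assumptions, not tuned results.
param_grid = {"n_estimators": [10, 50], "max_depth": [3, 5, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Cross-validated RMSE:", -search.best_score_)
```

The score is negated because scikit-learn maximizes scores, so the error is reported with a minus sign.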
In this graph, the real values are plotted in “Red” and the predicted
values are plotted in “Green”. The plot of the Random Forest
Regression model is also drawn in “Black”.
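A graph of this kind could be produced along the following lines. This is a sketch under assumptions: the data is a synthetic stand-in, the black curve is taken to be the fitted Random Forest model evaluated on a dense grid, the Agg backend is selected only so that the script runs without a display, and the output file name is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Temperature/Revenue data (illustrative values only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 45, size=500)
y = 45 + 21.5 * X + rng.normal(scale=25, size=500)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.05, random_state=0
)

regressor = RandomForestRegressor(n_estimators=10, random_state=0)
regressor.fit(X_train.reshape(-1, 1), y_train)
y_pred = regressor.predict(X_test.reshape(-1, 1))

# Dense grid used to draw the model's step-like prediction curve.
X_grid = np.arange(X.min(), X.max(), 0.01).reshape(-1, 1)

fig, ax = plt.subplots()
ax.scatter(X_test, y_test, color="red", label="Real values")
ax.scatter(X_test, y_pred, color="green", label="Predicted values")
ax.plot(X_grid, regressor.predict(X_grid), color="black", label="Random Forest model")
ax.set_title("Random Forest Regression")
ax.set_xlabel("Temperature")
ax.set_ylabel("Revenue")
ax.legend()
fig.savefig("random_forest_regression.png")  # hypothetical file name
```

In a Jupyter notebook the backend line and savefig call can be dropped, since the figure is displayed inline.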
CHAPTER 4
METHODOLOGY
Pandas
NumPy
Scikit-learn
Matplotlib.pyplot
Step 4: Experimenting with different models to get the best accuracy. (In our project, we used the
RandomForestRegressor method, which gave the best accuracy.)
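The kind of experiment described in this step can be sketched as a cross-validated comparison of several regressors. The three candidate models, the 5-fold setting and the synthetic data from make_friedman1 are assumptions for illustration; the sketch does not reproduce the project's data or its results, and which model scores best depends on the data.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic non-linear regression data (an assumption, not the project's data).
X, y = make_friedman1(n_samples=300, noise=1.0, random_state=0)

models = {
    "LinearRegression": LinearRegression(),
    "DecisionTreeRegressor": DecisionTreeRegressor(random_state=0),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Mean cross-validated RMSE per model; lower is better.
rmse_by_model = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, scoring="neg_mean_squared_error", cv=5)
    rmse_by_model[name] = float(np.sqrt(-scores).mean())
    print(f"{name}: RMSE = {rmse_by_model[name]:.3f}")

best = min(rmse_by_model, key=rmse_by_model.get)
print("Lowest cross-validated RMSE:", best)
```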
CHAPTER 5
RESULTS
Algorithm evaluation:
CONCLUSION
In this project, we tried predicting real estate prices using the various parameters that
were provided in the data about the real estate. On implementation, the prediction results show
the correlation among the different attributes considered. Multiple instances, parameters and
various other factors can be used to make this price prediction more innovative and successful.
Accuracy, which plays a key role in prediction-based systems, may be increased
as the number of relevant parameters used is increased. The project can be further integrated into a
web-based application, or into any device supported with in-built intelligence by virtue of the
Internet of Things (IoT), to make it more feasible for use. Further experiments are needed to
properly measure both accuracy and resource efficiency, so that the system can be assessed and
optimized correctly.