Project Synopsis Shaiba
on topic
House Price Prediction
USN 221VMBR02724
House Price Prediction
Problem Statement
This project will use machine learning (ML) algorithms to predict the price of a
house accurately, so that buyers and sellers can judge whether an asking price is
justified.
Project Summary
Owning a house is one of the basic needs of every person. The price of a house,
however, is one of the most important factors in a buyer's or seller's decision.
Estimating the price of a house therefore plays a crucial role in that decision.
Despite this, we do not have reliable and precise price estimation techniques that
draw on the large amount of available data.
In this project, the linear regression algorithm will be used to predict house
price trends. Linear regression is a supervised learning algorithm that
establishes a linear relationship between a dependent variable (target) and one or more
independent variables (features).
In this project, the dependent variable is the house price, and the independent
variables are factors such as the size of the house, the number of bedrooms, and the
location.
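The relationship described above can be sketched for a single feature. The example below is a minimal illustration, not the project's implementation: the function name and the size/price figures are hypothetical, and only one independent variable (house size) is used so that the least-squares solution can be written in closed form.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: house size (square feet) and price (in thousands).
sizes = [800, 1000, 1200, 1500, 1800]
prices = [110, 130, 150, 180, 210]

slope, intercept = fit_simple_linear_regression(sizes, prices)

# Predict the price of a hypothetical 1300 square foot house.
predicted = slope * 1300 + intercept
print(f"price = {slope:.3f} * size + {intercept:.1f}; prediction: {predicted:.1f}")
```

With several features (bedrooms, location, and so on), the same idea extends to multiple linear regression, which is normally fitted with a numerical library rather than by hand.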
A house price prediction model using linear regression involves the following steps:
setting competitive prices for their properties. Real estate agents may also
benefit from better market insights and improved negotiation
strategies.
[Figure: project workflow. Domain Problem to ML Problem; Data Collection and Preparation; Training; Deploying]
o Domain Problem:
In this project, we will predict the price of a house for buyers and sellers
based on their preferences, such as the number of bedrooms, the area of the
house, and the locality.
o ML problem:
This problem can be solved with supervised learning. It is treated as a
regression problem because the target is continuous-valued.
o Data Collection:
To proceed with the project, we must first collect the data. Once we have a
dataset, we pre-process it and then perform exploratory data analysis on
it.
o Data Pre-Processing:
Missing values in the dataset can be handled with approaches such as
dropping the affected data, mean/median imputation, random sample
imputation, and multiple imputation.
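Two of the approaches listed above, dropping and mean/median imputation, can be sketched in a few lines. This is an illustrative sketch only: the helper names (impute, drop_missing) and the sample column are hypothetical, and missing values are assumed to be represented as None.

```python
from statistics import mean, median


def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


def drop_missing(rows):
    """Keep only the rows (dicts) that contain no None value."""
    return [row for row in rows if all(v is not None for v in row.values())]


# Hypothetical "bedrooms" column with two missing entries and one extreme value.
bedrooms = [2, 3, None, 4, 3, None, 10]
mean_filled = impute(bedrooms, "mean")      # gaps filled with the mean of observed values
median_filled = impute(bedrooms, "median")  # gaps filled with the median of observed values

rows = [{"area": 900, "bedrooms": 2}, {"area": None, "bedrooms": 3}]
complete_rows = drop_missing(rows)
```

The median is less sensitive than the mean to extreme values such as the 10-bedroom entry, which is one reason to compare both strategies before choosing one.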
Exploratory Data Analysis (EDA) is an important step in a machine
learning workflow. EDA is the process of analysing the dataset to identify
patterns, relationships, and outliers.
Its purpose is to use summary statistics and visualizations to better
understand the data, find clues about its tendencies and quality, and
formulate the assumptions and hypotheses of our analysis.
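The summary-statistics side of EDA can be sketched as follows. The functions and the sample figures are hypothetical, and the 1.5 x IQR rule used here is only one common convention for flagging outliers, assumed for illustration.

```python
from math import sqrt
from statistics import mean, median, quantiles, stdev


def summarize(values):
    """Basic summary statistics for one numeric column."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),
        "min": min(values),
        "max": max(values),
    }


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def pearson(x, y):
    """Pearson correlation coefficient between two equally long columns."""
    x_bar, y_bar = mean(x), mean(y)
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    var_x = sum((a - x_bar) ** 2 for a in x)
    var_y = sum((b - y_bar) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: the last price is deliberately extreme.
sizes = [800, 1000, 1200, 1500, 1800, 1600]
prices = [110, 130, 150, 180, 210, 900]

summary = summarize(prices)
outliers = iqr_outliers(prices)
size_price_corr = pearson(sizes, prices)
print(summary, outliers, round(size_price_corr, 3))
```

In practice these numbers would be read alongside plots such as histograms and scatter plots, which this sketch leaves out.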
Model Development
Once the data have been cleaned and visualized, the next step
will be to build a model to predict the sale price of a house. Different
prediction models will be used, including linear
regression and KNN regression. We will use a series of models and
pipelines to find the best model by evaluating each model's prediction
error with regression metrics such as root mean squared error (RMSE) and
R-squared. We will also use cross-validation to check that the
model generalizes well.
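Comparing models with cross-validation can be sketched as below. This is a simplified illustration under stated assumptions: a single feature (size), hypothetical data, hand-written linear and KNN regressors, interleaved folds, and RMSE as the error measure. None of these choices is prescribed by the project.

```python
from math import sqrt
from statistics import mean


def fit_linear(x, y):
    """Fit y = slope * x + intercept by least squares; return a predict function."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    return lambda q: slope * q + intercept


def fit_knn(x, y, k=2):
    """KNN regression: predict the mean target of the k nearest training points."""
    pairs = list(zip(x, y))

    def predict(q):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - q))[:k]
        return mean([p[1] for p in nearest])

    return predict


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return sqrt(mean([(a - p) ** 2 for a, p in zip(actual, predicted)]))


def k_fold_rmse(fit, x, y, n_folds=3):
    """Average RMSE over n_folds; fold i holds out samples i, i + n_folds, ..."""
    scores = []
    for fold in range(n_folds):
        test_idx = set(range(fold, len(x), n_folds))
        train_idx = [i for i in range(len(x)) if i not in test_idx]
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        held_out = sorted(test_idx)
        predictions = [model(x[i]) for i in held_out]
        scores.append(rmse([y[i] for i in held_out], predictions))
    return mean(scores)


# Hypothetical data: size (square feet) and price (in thousands).
sizes = [800, 950, 1000, 1200, 1350, 1500, 1650, 1800, 2000]
prices = [112, 123, 131, 149, 166, 178, 197, 209, 232]

linear_rmse = k_fold_rmse(fit_linear, sizes, prices)
knn_rmse = k_fold_rmse(lambda x, y: fit_knn(x, y, k=2), sizes, prices)
print(f"linear: {linear_rmse:.2f}, knn: {knn_rmse:.2f}")
```

The model with the lower cross-validated RMSE would be preferred, since every score is computed on data that the model did not see during fitting.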
Thereafter, the models will be trained on the training data, and their
hyperparameters will be tuned using techniques such as grid search or random search.
Because the models will be trained and tested on limited data (i.e., the provided
dataset), they may produce residual errors.
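Hyperparameter tuning by grid search, and the residual errors that remain afterwards, can be sketched as follows. The example is a minimal illustration with assumptions throughout: the data are hypothetical, the only hyperparameter is the number of neighbours k in a one-feature KNN regressor, and a single validation split stands in for full cross-validation.

```python
from math import sqrt
from statistics import mean


def knn_predict(train, query, k):
    """Average the prices of the k training houses closest in size to query."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    return mean([price for _, price in nearest])


def rmse(train, held_out, k):
    """RMSE of KNN predictions (fitted on train) over the held_out pairs."""
    errors = [(price - knn_predict(train, size, k)) ** 2 for size, price in held_out]
    return sqrt(mean(errors))


def grid_search_k(train, validation, candidates):
    """Try every candidate k and return the one with the lowest validation RMSE."""
    scores = {k: rmse(train, validation, k) for k in candidates}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Hypothetical (size, price) pairs split into training, validation and test sets.
train = [(800, 112), (1000, 131), (1200, 149), (1500, 178), (1800, 209), (2000, 232)]
validation = [(950, 123), (1350, 166), (1650, 197)]
test = [(1100, 140), (1700, 200)]

best_k, scores = grid_search_k(train, validation, candidates=[1, 2, 3, 4])

# Residual = actual price minus predicted price on data unseen during tuning.
residuals = [price - knn_predict(train, size, best_k) for size, price in test]
print(f"best k: {best_k}; test residuals: {residuals}")
```

Random search follows the same pattern but samples candidate values instead of enumerating them, which can be cheaper when there are many hyperparameters.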