Project Synopsis Shaiba

Uploaded by

Arshad Zakaria

Project Synopsis

on topic
House Price Prediction

Name: Shaiba Sami

USN: 221VMBR02724

Elective: DATA SCIENCE

Date of Submission: 06/04/2024

House Price Prediction

➢ Problem Statement

Steadily increasing demand for affordable housing dominates the housing market in
India. Even though demand for housing in the country is high, we do not have
accurate measures of housing prices based on the vast amount of data available.
Therefore, the goal of this project is to use Machine Learning to predict the
selling prices of houses based on different factors.

Machine Learning (hereafter ML) algorithms will be used in this project to predict
the price of a house as precisely as possible, so that the buyer or seller of the
house can evaluate a justified price.

➢ Project Summary

Owning a house is one of the basic needs of every person. However, the price of a
house is one of the most important factors affecting the decision of a buyer or
seller. Estimating the price of a house therefore plays a crucial role in that
decision. Despite this, we do not have reliable and precise price estimation
techniques based on the large amount of available data.

In this project, the linear regression algorithm will be used to predict house
price trends. Linear regression is a supervised learning algorithm which
establishes a linear relationship between a dependent variable (target) and one or
more independent variables (features).

In this project, the dependent variable will be the house price, and the independent
variables can be factors like the size of the house, number of bedrooms, location, etc.
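
As a minimal sketch of this linear relationship, assuming a single feature (area) and hypothetical toy figures that are not taken from the project dataset, a least-squares line could be fitted as follows. The function name and the data are illustrative only.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data: area (independent variable) and price (dependent variable).
areas = [600, 800, 1000, 1200, 1400]
prices = [30.0, 40.0, 50.0, 60.0, 70.0]

intercept, slope = fit_simple_linear_regression(areas, prices)

# Predict the price of a house with an area of 1100 units.
predicted = intercept + slope * 1100
```

With several features (bedrooms, location and so on) the same idea extends to multiple linear regression, where each feature receives its own coefficient.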

House price prediction using linear regression involves the following steps:

1. Dataset Collection: Gather historical house price data and the corresponding
features from different platforms. Here, the dataset has been provided by the
department.

2. Data Pre-processing: Clean the data, handle missing values and outliers,
remove duplicate values, and perform feature engineering, such as
converting categorical variables to numerical representations.

3. Splitting the Dataset: Divide the dataset into training and testing sets for
model building and evaluation.

4. Building the Model: Build a linear regression model to learn the
relationships between the features and house prices.

5. Model Evaluation: Assess the model's performance on the testing set
using metrics like MSE (Mean Squared Error) or RMSE (Root Mean Squared Error).

6. Deployment and Prediction: Deploy the model into a real-world
application to predict the price of a house for sale based on user inputs.
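
The splitting and evaluation steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the helper names, the 80/20 split, the seed and the toy rows are all hypothetical, and a simple mean-price baseline stands in for the trained model.

```python
import math
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def mse(actual, predicted):
    """Mean Squared Error: average of the squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the MSE, in the units of the target."""
    return math.sqrt(mse(actual, predicted))


# Hypothetical (area, price) rows, not taken from the project dataset.
rows = [(600 + 100 * i, 30.0 + 5.0 * i) for i in range(10)]
train_rows, test_rows = split_train_test(rows, test_fraction=0.2, seed=42)

# Baseline "model": always predict the mean price seen in the training set.
baseline_price = sum(price for _, price in train_rows) / len(train_rows)
actual_prices = [price for _, price in test_rows]
baseline_rmse = rmse(actual_prices, [baseline_price] * len(actual_prices))
```

A trained regression model is only useful if its RMSE on the testing set is clearly lower than that of such a baseline.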

➢ Objectives of the Project

• Create a Predictive Model: One of the main objectives of this project is to
create a predictive model using Machine Learning that can efficiently predict the
price of a house for sale, in line with buyers' budgets and priorities, by
analysing various parameters like total/living area, number of bedrooms, location, etc.
• Ease the tasks of stakeholders in the Real Estate Market: The model is intended
to predict house prices accurately, which will aid buyers in making informed
decisions about their investments. For sellers, the model will assist in
setting competitive prices for their properties. Real estate agents
may also benefit from better market insights and improved negotiation
strategies.

➢ Methodology of the Project

The methodology for this project involves the following workflow:

Domain problem to ML problem → Data collection and preparation → Training → Deploying

• Domain problem to ML problem:

o Domain Problem:
In this project we are going to predict the price of a house for buyers/sellers
based on their preferences, such as number of bedrooms, area of the house,
locality, etc.

o ML problem:
This problem can be solved using supervised learning.
It is treated as a regression problem because the target is continuous-valued.

• Data collection and preparation:

This phase consists of three parts:

o Data Collection:

To proceed with this project, we first need to collect the data. Once we have
a dataset, we pre-process it and then perform exploratory data analysis
on it.

o Data Pre-Processing:

This phase consists of handling missing values, duplicate values and
outliers within the dataset.

Missing values can be handled using approaches like Data Dropping,
Mean/Median Imputation, Random Sample Imputation, Multiple Imputation, etc.
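
As an illustration of one of these approaches, median imputation could look roughly like the sketch below. It assumes missing entries are represented as `None`; the function name and the sample values are hypothetical.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical "number of bedrooms" column with two missing entries.
bedrooms = [2, 3, None, 4, 3, None, 5]
imputed = impute_median(bedrooms)
```

The median is often preferred over the mean for skewed columns such as price, because it is less affected by extreme values.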

Duplicate values in a dataset can be handled by deleting the duplicate records.
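
A minimal sketch of duplicate removal, assuming each record is a tuple of (area, bedrooms, locality) and that only exact repeats count as duplicates. The records shown are hypothetical.

```python
def drop_duplicates(rows):
    """Keep the first occurrence of each record and preserve the original order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


# Hypothetical listings; the third record repeats the first one exactly.
listings = [
    (1200, 3, "Locality A"),
    (800, 2, "Locality B"),
    (1200, 3, "Locality A"),
]
deduplicated = drop_duplicates(listings)
```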

Handling outliers consists of identifying the outliers using methods like
Boxplot and Z-Score. Subsequently, the identified outliers may be treated
using methods like Trimming, Capping, Imputation, Discretization, etc.
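
The sketch below illustrates Z-Score detection and capping at the boxplot whiskers (1.5 times the interquartile range). The thresholds, function names and price values are assumptions chosen for illustration.

```python
from statistics import mean, pstdev, quantiles


def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


def cap_iqr(values, k=1.5):
    """Cap values outside [Q1 - k*IQR, Q3 + k*IQR], the usual boxplot whiskers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Hypothetical prices with one extreme value.
prices = [40, 42, 45, 47, 50, 52, 55, 58, 60, 400]

# With only ten points and the population standard deviation, a z-score
# cannot exceed 3, so a lower threshold is used for this small sample.
outliers = zscore_outliers(prices, threshold=2.5)
capped = cap_iqr(prices)
```

Capping keeps the record but limits its influence, whereas trimming would drop the record altogether.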

o Exploratory Data Analysis (EDA):

Exploratory Data Analysis (EDA) is one of the important steps in building a
Machine Learning model. EDA is the process of analysing the dataset to identify
patterns, relationships, and outliers.
The purpose of EDA is to use summary statistics and visualizations to better
understand the data, to find clues about its tendencies and quality,
and to formulate the assumptions and hypotheses of our analysis.
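
As a small example of the summary statistics mentioned above, the sketch below computes basic descriptive measures for one column and the Pearson correlation between two columns. The helper names and the figures are hypothetical.

```python
import math
from statistics import mean, median, stdev


def summarize(values):
    """Return basic summary statistics for one numeric column."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long columns."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical columns: area and price of five houses.
areas = [600, 800, 1000, 1200, 1400]
prices = [32.0, 39.0, 52.0, 58.0, 71.0]

price_summary = summarize(prices)
area_price_corr = pearson(areas, prices)
```

A correlation close to 1 between area and price would suggest that area is a useful feature for a linear model.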

➢ Model Development

Once the data has been cleaned and visualized, the next step
will be to build a model to predict the sale price of a house. Different
prediction models will be used for this, including linear
regression, KNN regression, etc. We will use a series of models and
pipelines and select the best one by evaluating each model's prediction
error with regression metrics such as MSE or RMSE. We will also use
cross-validation to check that the model generalizes well.
The models will be trained on the training data, and their hyperparameters
will be tuned using techniques like grid search or random search.
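
A rough sketch of how cross-validation and grid search fit together, using KNN regression on a single feature. The function names, the candidate values of k, the number of folds and the data are all assumptions for illustration.

```python
import math


def knn_predict(train, x, k):
    """Predict the target for x as the mean target of the k nearest training rows."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def k_fold_rmse(rows, k_neighbors, n_folds=5):
    """Average RMSE over n_folds contiguous folds (len(rows) divisible by n_folds)."""
    fold_size = len(rows) // n_folds
    scores = []
    for i in range(n_folds):
        start, stop = i * fold_size, (i + 1) * fold_size
        test = rows[start:stop]              # held-out fold
        train = rows[:start] + rows[stop:]   # remaining folds
        squared = [(y - knn_predict(train, x, k_neighbors)) ** 2 for x, y in test]
        scores.append(math.sqrt(sum(squared) / len(squared)))
    return sum(scores) / len(scores)


def grid_search(rows, candidate_ks, n_folds=5):
    """Return the k with the lowest cross-validated RMSE and all the scores."""
    results = {k: k_fold_rmse(rows, k, n_folds) for k in candidate_ks}
    best_k = min(results, key=results.get)
    return best_k, results


# Hypothetical (area, price) rows, ordered so each fold spans the area range.
rows = [(600 + 50 * ((i * 7) % 20), 30.0 + 2.5 * ((i * 7) % 20)) for i in range(20)]
best_k, results = grid_search(rows, candidate_ks=[1, 3, 5])
```

Random search follows the same pattern but samples candidate hyperparameter values instead of trying every combination.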

➢ Limitations of the Project

The model will be trained and tested on limited data (i.e. the provided dataset),
so its predictions may carry residual errors.
