Seminar Ppt4
CONTENTS
• Data Collection
• Data Visualization
• Data Pre-Processing
• EDA
• Performance Metrics
• Model Fitting
• SWOT
• Application
• Conclusion
ABSTRACT
• House price prediction is an important topic in real estate. Prior work attempts to
derive useful knowledge from historical property-market data. Here, machine learning
techniques are applied to historical property transactions in India to build models
that are useful to house buyers and sellers.
• Experiments show that Linear Regression, Random Forest, and XGBoost, evaluated by
mean squared error, are competitive approaches.
INTRODUCTION
• Machine learning is an area of artificial intelligence (AI) built on the idea that a computer program can
learn from data and adapt to new data without human intervention.
• Supervised learning is a type of machine learning in which models are trained on well-"labelled"
training data and, on the basis of that data, predict the output for new inputs. Labelled data means
that each input is already tagged with the correct output.
• Linear Regression
• Decision Tree
• A house is one of the most essential human needs, along with other fundamental needs such as food and water.
House prices can be predicted using several machine learning models, such as Linear Regression and
Random Forest.
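As a minimal sketch of the supervised-learning idea above, the following fits a one-feature linear regression to a handful of labelled examples and predicts the price of an unseen house. The areas and prices are made-up illustrative values, not the seminar's dataset, and `fit_line` is a hypothetical helper name.

```python
import numpy as np

# Made-up labelled data: input feature (area in sq. ft.) and known output (price).
area = np.array([600.0, 800.0, 1000.0, 1200.0, 1500.0])
price = np.array([30.0, 40.0, 50.0, 60.0, 75.0])  # illustrative units


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    design = np.column_stack([x, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs[0], coeffs[1]


slope, intercept = fit_line(area, price)

# Predict the price of an unseen house from the learned relationship.
predicted = slope * 1100.0 + intercept
print(f"Predicted price for 1100 sq. ft.: {predicted:.2f}")
```

The "training" step is the call to `fit_line`; the prediction step only uses the learned slope and intercept, which is what lets the model handle inputs it has not seen.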
TOOLS USED
These are the tools used to create the house price prediction
model using machine learning
DATA COLLECTION
To create a machine learning model, the first thing required is a dataset, because a machine
learning model depends entirely on data. Data collected for a particular problem and stored in a
proper format is known as the dataset.
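A small sketch of what "data in a proper format" can look like in practice: a CSV table read into one record per house. The column names and values are assumptions made for illustration, and an in-memory string stands in for a real file.

```python
import csv
import io

# Stand-in for a CSV file on disk; columns and values are illustrative only.
raw_csv = io.StringIO(
    "area,bedrooms,location,price\n"
    "1000,2,Urban,50\n"
    "1500,3,Suburban,65\n"
    "800,,Urban,38\n"
)

# Each row becomes a dict keyed by column name; every value is read as a string.
rows = list(csv.DictReader(raw_csv))

print(len(rows), "rows with columns", list(rows[0].keys()))
print("Rows with a missing bedrooms value:", sum(r["bedrooms"] == "" for r in rows))
```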
DATA VISUALISATION
Data visualization is a crucial aspect of machine learning that enables analysts to understand and
make sense of data patterns, relationships, and trends.
STATISTICAL ANALYSIS OF FEATURES
Descriptive Statistics
Analyzing key statistical measures such as mean, median, standard deviation, and skewness to
understand the central tendencies and distributions of the features.
Feature Importance
Utilizing statistical techniques to determine the significance of each feature in influencing
house prices.
Correlation Analysis
Exploring the correlation matrix to identify the relationships between features and their
potential impact on predicting house prices.
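The three analyses above can be sketched with NumPy as follows. The feature values are made up, `skewness` is a hypothetical helper, and absolute correlation with the target is used only as a crude stand-in for feature importance, not as the technique the seminar necessarily used.

```python
import numpy as np

# Made-up feature values; "price" is the target.
data = {
    "area": np.array([600.0, 800.0, 1000.0, 1200.0, 3000.0]),
    "age": np.array([30.0, 5.0, 20.0, 12.0, 8.0]),
    "price": np.array([32.0, 45.0, 52.0, 61.0, 150.0]),
}


def skewness(values):
    """Population (biased) skewness: the mean of the cubed z-scores."""
    z = (values - values.mean()) / values.std()
    return float(np.mean(z ** 3))


# Descriptive statistics per feature.
for name, values in data.items():
    print(
        f"{name}: mean={values.mean():.1f} median={np.median(values):.1f} "
        f"std={values.std():.1f} skew={skewness(values):.2f}"
    )

# Correlation matrix; rows and columns follow the order of `names`.
names = list(data)
corr = np.corrcoef(np.vstack([data[n] for n in names]))

# Absolute correlation with the target as a rough first look at feature relevance.
target = names.index("price")
relevance = {n: abs(corr[i, target]) for i, n in enumerate(names) if n != "price"}
print(relevance)
```

A mean well above the median, as for `area` here, is a quick sign of right skew caused by a few very large values.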
DATA PRE-PROCESSING
Getting the dataset
Importing libraries
Importing datasets
Finding Missing Data
Encoding Categorical Data
Splitting dataset into training and test set
Feature scaling
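The pre-processing steps listed above can be sketched end to end on a tiny made-up dataset. Mean imputation, one-hot encoding, a 75/25 split, and standardization are assumptions chosen for illustration; the seminar does not state which variants were used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up raw data: one numeric feature with a missing value, one categorical feature.
area = np.array([600.0, 800.0, np.nan, 1200.0, 1500.0, 900.0, 1100.0, 700.0])
location = np.array(
    ["Urban", "Rural", "Urban", "Suburban", "Rural", "Urban", "Suburban", "Rural"]
)
price = np.array([30.0, 28.0, 50.0, 58.0, 52.0, 45.0, 54.0, 25.0])

# 1. Missing data: replace NaN with the mean of the observed values.
area = np.where(np.isnan(area), np.nanmean(area), area)

# 2. Categorical data: one-hot encode (one 0/1 column per category).
categories = np.unique(location)
one_hot = (location[:, None] == categories[None, :]).astype(float)

X = np.column_stack([area, one_hot])

# 3. Split: shuffle the row indices and hold out 25% of the rows as the test set.
order = rng.permutation(len(X))
n_test = len(X) // 4
test_idx, train_idx = order[:n_test], order[n_test:]
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = price[train_idx], price[test_idx]

# 4. Feature scaling: standardize the numeric column using training statistics only.
mean, std = X_train[:, 0].mean(), X_train[:, 0].std()
X_train[:, 0] = (X_train[:, 0] - mean) / std
X_test[:, 0] = (X_test[:, 0] - mean) / std

print(X_train.shape, X_test.shape)
```

Scaling with statistics taken from the training rows alone keeps information about the test set from leaking into the model.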
EDA
Import libraries and dataset
Correlation between attributes
Missing Values
i. Finding missing values
ii. Imputing missing values
Feature Engineering
Preparing Data for Modelling
i. Dropping highly correlated variables
ii. Removing outliers
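The two data-preparation steps above might look like the sketch below. The 0.9 correlation threshold and the 1.5 x IQR outlier rule are common conventions assumed here, not values taken from the seminar; the feature names, data, and both helper functions are hypothetical.

```python
import numpy as np

# Made-up features: area_sqm is area_sqft in different units, so the two are redundant.
names = ["area_sqft", "area_sqm", "age"]
area_sqft = np.array([600.0, 800.0, 1000.0, 1200.0, 1500.0, 900.0, 9000.0])
age = np.array([30.0, 5.0, 20.0, 12.0, 8.0, 15.0, 10.0])
X = np.column_stack([area_sqft, area_sqft * 0.0929, age])


def drop_correlated(X, names, threshold=0.9):
    """Keep a column only if its |correlation| with every kept column is <= threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return X[:, keep], [names[j] for j in keep]


def remove_outliers_iqr(X, column, factor=1.5):
    """Drop rows whose value in `column` lies outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, q3 = np.percentile(X[:, column], [25, 75])
    iqr = q3 - q1
    values = X[:, column]
    mask = (values >= q1 - factor * iqr) & (values <= q3 + factor * iqr)
    return X[mask]


X_reduced, kept = drop_correlated(X, names)
X_clean = remove_outliers_iqr(X_reduced, column=0)
print(kept, X_clean.shape)
```

Here `area_sqm` is dropped because it duplicates `area_sqft`, and the 9000 sq. ft. row is removed as an outlier.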
R Squared (R2)
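R squared is the proportion of variance in the target that the model explains, R2 = 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the actual values from their mean. RMSE, the other metric reported in the conclusion, is the square root of the mean squared error. A minimal sketch with made-up values:

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


actual = [30.0, 40.0, 50.0, 60.0]      # made-up prices
predicted = [32.0, 38.0, 52.0, 58.0]   # made-up model outputs

print(f"R2 = {r_squared(actual, predicted):.3f}, RMSE = {rmse(actual, predicted):.1f}")
# prints "R2 = 0.968, RMSE = 2.0"
```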
Flask is classified as a microframework because it does not require particular tools or
libraries. It has no database abstraction layer or any other components where pre-existing
third-party libraries provide common functions.
SWOT ANALYSIS
Strengths
1. Accuracy
2. Efficiency
3. Scalability
4. Data-driven Insights
Weaknesses
1. Data Quality
2. Model Complexity
3. Overfitting
4. External Factors
Opportunities
1. Feature Engineering
2. Integration with Other Systems
3. Personalization
4. Education and Outreach
Threats
1. Competition
2. Cybersecurity Risks
3. Model Decay
4. Ethical Concerns
APPLICATION OF MACHINE LEARNING MODELS
Real Estate Investment
Property Valuation
Property Tax Assessment
Personal Finance
Insurance
Market Analysis
CONCLUSION
• The RMSE of the random forest model is very low, which indicates that its
predictions tend to lie closer to the actual values than those of the
linear regression model.
• However, while the R squared on the training set is very good, the R
squared on the test set is relatively low, which suggests that the random
forest model is slightly overfitting.
• The RMSE and R squared of XGBoost are both good, which suggests that the
model is reasonable and not overfitting; hence XGBoost is chosen as the
final model.
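The overfitting check described above, comparing R squared on the training set with R squared on the test set, can be sketched as follows. This is not the seminar's experiment: the data is synthetic, and a nearest-neighbour "memorizer" is used as a stand-in for an overly flexible model, since it reproduces its training data exactly.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: price depends linearly on area, plus noise.
area = rng.uniform(500, 2000, size=200)
price = 0.05 * area + rng.normal(0, 10, size=200)
x_train, x_test = area[:150], area[150:]
y_train, y_test = price[:150], price[150:]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def predict_nearest(x_query):
    """Memorizing model: return the price of the closest training house."""
    nearest = np.abs(x_query[:, None] - x_train[None, :]).argmin(axis=1)
    return y_train[nearest]


# Simple model: least-squares line fitted on the training set.
slope, intercept = np.polyfit(x_train, y_train, deg=1)


def predict_line(x_query):
    return slope * x_query + intercept


for label, predict in [("memorizer", predict_nearest), ("line", predict_line)]:
    train_r2 = r_squared(y_train, predict(x_train))
    test_r2 = r_squared(y_test, predict(x_test))
    print(f"{label}: train R2 = {train_r2:.3f}, test R2 = {test_r2:.3f}")
```

A large drop from training R squared to test R squared is the signal the conclusion relies on; a model whose two scores stay close is less likely to be overfitting.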
Thank you