ML Practical 04

This document provides steps for building a simple linear regression model in Python to predict house prices from house area. The steps are: 1) importing the necessary libraries and loading the dataset; 2) exploring and visualizing the data; 3) splitting the data into training and testing sets; 4) creating and training a linear regression model on the training set; 5) making predictions on the test set and evaluating the model's performance; 6) visualizing the regression line and predicted prices; 7) letting users input an area to predict the corresponding house price.

Uploaded by

chatgptlogin2001

ITC 2252 - Introduction to Machine Learning
Practical Session - 04
Steps of the process
01 Import Data
02 Clean the Data
03 Split the data into training and testing sets
04 Design the model
05 Train the Model
06 Make Predictions
07 Evaluate and Improve
Today’s session
Simple linear regression model building
01 Split the data into training and testing sets
02 Design the model
03 Train the Model
04 Make Predictions
01 Linear Regression
What is Linear Regression?
Linear regression is a statistical method used for modeling the
relationship between a dependent variable and one or more independent
variables by fitting a linear equation to the observed data.
How can we predict the price of a house from its area?
House area (m²)    Price ($)
10000              4000
20000              5000
30000              6000
40000              7000
50000              8000

We can use a linear regression model for the price prediction.


y = a + bx
y = Dependent variable (Price)
x = Independent variable (House area)
a = y-intercept (value of the dependent variable when x = 0)
b = coefficient of the independent variable (the slope of the line)
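As a worked example, a and b can be estimated from the five rows of the example table above. The table is exactly linear (each extra 10000 of area adds 1000 to the price), so by hand b = 1000 / 10000 = 0.1 and a = 4000 - 0.1 × 10000 = 3000. The sketch below checks this with NumPy's least-squares polynomial fit; using NumPy here is an assumption made for illustration, not necessarily the tool used later in the session.

```python
import numpy as np

# The five rows of the example table: house area and price.
area = np.array([10000, 20000, 30000, 40000, 50000], dtype=float)
price = np.array([4000, 5000, 6000, 7000, 8000], dtype=float)

# Least-squares fit of a straight line; polyfit returns the slope first.
b, a = np.polyfit(area, price, deg=1)
print(f"a (intercept) = {a:.2f}")
print(f"b (slope)     = {b:.4f}")

# Use the fitted line to predict the price for an area not in the table.
x_new = 35000
y_new = a + b * x_new
print(f"Predicted price for area {x_new}: {y_new:.2f}")
```

The fitted values agree with the hand calculation up to floating-point rounding.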
02 Model building
Step 1: Import necessary libraries

This step imports the required Python libraries:


➔ pandas for data frame creation and manipulation.
➔ matplotlib.pyplot for data visualization.
➔ train_test_split from sklearn.model_selection to split the dataset into training and
testing sets.
➔ LinearRegression from sklearn.linear_model for building a linear regression model.
➔ mean_squared_error from sklearn.metrics to evaluate the model's performance.
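The imports listed above would look roughly like this. The aliases pd and plt are the usual conventions and are assumed here.

```python
import pandas as pd                                   # data frame creation and manipulation
import matplotlib.pyplot as plt                       # data visualization
from sklearn.model_selection import train_test_split  # train/test splitting
from sklearn.linear_model import LinearRegression     # linear regression model
from sklearn.metrics import mean_squared_error        # model evaluation
```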
Step 2: Load the dataset

This step reads the Housing dataset from a CSV file into a pandas DataFrame named
data.
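A minimal sketch of this step. The name and contents of the Housing CSV file are not given here, so the example reads an in-memory stand-in built from the five rows of the example table; the column names 'Area' and 'Price' are the ones used in the later steps.

```python
import io
import pandas as pd

# Stand-in for the Housing CSV file (assumption: two columns, Area and Price).
csv_text = """Area,Price
10000,4000
20000,5000
30000,6000
40000,7000
50000,8000
"""

# pd.read_csv accepts a file path or a file-like object such as StringIO.
data = pd.read_csv(io.StringIO(csv_text))
print(data.shape)
```

With a real file, the path to the CSV is passed to pd.read_csv in place of the StringIO object.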

Step 3: Explore the data

This prints the first few rows of the dataset using head(), giving you an idea of its structure.
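A short sketch of the exploration step, again assuming a stand-in DataFrame built from the example table rather than the real Housing file.

```python
import pandas as pd

# Stand-in for the Housing dataset (assumption: the example table rows).
data = pd.DataFrame({
    "Area": [10000, 20000, 30000, 40000, 50000],
    "Price": [4000, 5000, 6000, 7000, 8000],
})

# head() returns the first 5 rows by default; head(n) returns the first n.
first_rows = data.head()
print(first_rows)
```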
Step 4: Visualize the data

This step creates a scatter plot to visually represent the relationship between the 'Area'
and 'Price' columns.
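The scatter plot could be produced along these lines. The data is a stand-in built from the example table, the axis labels and title are illustrative, and the Agg backend is selected only so that the sketch also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the Housing dataset (assumption: the example table rows).
data = pd.DataFrame({
    "Area": [10000, 20000, 30000, 40000, 50000],
    "Price": [4000, 5000, 6000, 7000, 8000],
})

# One point per house: area on the x-axis, price on the y-axis.
fig, ax = plt.subplots()
ax.scatter(data["Area"], data["Price"])
ax.set_xlabel("Area")
ax.set_ylabel("Price")
ax.set_title("House price vs. area")
```

In an interactive session, drop the matplotlib.use line and call plt.show() to display the figure.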
Step 5: Prepare the data for training

This separates the independent variable (X - 'Area') and the dependent variable (y - 'Price').
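A sketch of the separation, assuming the stand-in data from the example table. The double brackets matter: scikit-learn expects the features as a two-dimensional table, even when there is only one column.

```python
import pandas as pd

# Stand-in for the Housing dataset (assumption: the example table rows).
data = pd.DataFrame({
    "Area": [10000, 20000, 30000, 40000, 50000],
    "Price": [4000, 5000, 6000, 7000, 8000],
})

X = data[["Area"]]  # double brackets keep X as a 2-D DataFrame (n rows, 1 column)
y = data["Price"]   # single brackets give a 1-D Series of target values
print(X.shape, y.shape)
```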
Step 6: Split the data into training and testing sets

This splits the data into training and testing sets.


➔ The test_size parameter determines the proportion of the data used for testing (in
this case, 20%).
➔ The random_state parameter fixes the random seed, so the same split is
produced every time you run the code.
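The split might be written as follows. The 20% test size comes from the text above; the seed value 42 and the stand-in data are assumptions, since the actual values are not shown.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in for the Housing dataset (assumption: the example table rows).
data = pd.DataFrame({
    "Area": [10000, 20000, 30000, 40000, 50000],
    "Price": [4000, 5000, 6000, 7000, 8000],
})
X = data[["Area"]]
y = data["Price"]

# 20% of the rows go to the test set; random_state=42 is an arbitrary fixed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(len(X_train), "training rows,", len(X_test), "test rows")
```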
Step 6: Split the data into training and testing sets cont.
The purpose of splitting the data into training and testing sets is to evaluate how well the machine
learning model generalizes to new, unseen data.
Training Set:
❖ Purpose: The model learns the patterns and relationships within the training data.
❖ Benefit: The model adjusts its parameters based on this data to make accurate predictions.
Testing Set:
❖ Purpose: The model's performance is assessed on data it has never seen before.
❖ Benefit: This evaluation provides an estimate of how well the model is likely to perform on new,
real-world data.
Test Size Parameter:
❖ Purpose: It determines the proportion of the data allocated to the testing set.
❖ Benefit: A larger test set gives a more reliable evaluation, while a smaller test set leaves
more data for training.
Random State Parameter:
❖ Purpose: It ensures reproducibility by fixing the random seed for the data split.
❖ Benefit: With the same random state, the data split remains consistent across runs, making
experiments reproducible.
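The effect of both parameters can be checked directly. This sketch assumes the stand-in data from the example table and an arbitrary seed of 42: it splits twice with the same seed to show the result is identical, then raises test_size to show the test set grow.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in for the Housing dataset (assumption: the example table rows).
data = pd.DataFrame({
    "Area": [10000, 20000, 30000, 40000, 50000],
    "Price": [4000, 5000, 6000, 7000, 8000],
})
X = data[["Area"]]
y = data["Price"]

# Same random_state twice: the two test sets contain exactly the same rows.
split_a = train_test_split(X, y, test_size=0.2, random_state=42)
split_b = train_test_split(X, y, test_size=0.2, random_state=42)
same_split = split_a[1].index.equals(split_b[1].index)  # element 1 is X_test
print("Same test rows with the same seed:", same_split)

# Larger test_size: more rows for testing, fewer for training.
X_train_big, X_test_big, _, _ = train_test_split(
    X, y, test_size=0.4, random_state=42
)
print(len(X_train_big), "training rows,", len(X_test_big), "test rows")
```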
Step 7: Create and train the linear regression model

This step creates a Linear Regression Model:


model = LinearRegression(): This line creates an instance of the LinearRegression class from the scikit-learn
library. This instance (model) will be used to represent the linear regression model.
Train the Model: model.fit(X_train, y_train): This line trains the linear regression model using the training data.
The fit method takes two main parameters:
● X_train: The input features (independent variable) from the training set. In the context of house price
prediction, it represents the 'Area' of the house.
● y_train: The target variable (dependent variable) from the training set. In this case, it represents the
corresponding house prices.
The fit method estimates the model's parameters (slope and intercept) to find the best-fit line, the one that
minimizes the sum of squared differences between the predicted values and the actual values in the training data.
After this line is executed, the model object is trained and can be used to make predictions on new,
unseen data.
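A sketch of creating and training the model, assuming the stand-in data from the example table and an arbitrary seed of 42. After fitting, the learned intercept and slope are available as model.intercept_ and model.coef_.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Stand-in for the Housing dataset (assumption: the example table rows).
data = pd.DataFrame({
    "Area": [10000, 20000, 30000, 40000, 50000],
    "Price": [4000, 5000, 6000, 7000, 8000],
})
X_train, X_test, y_train, y_test = train_test_split(
    data[["Area"]], data["Price"], test_size=0.2, random_state=42
)

model = LinearRegression()   # create an untrained model
model.fit(X_train, y_train)  # estimate intercept and slope from the training rows

intercept = model.intercept_  # a in y = a + bx
slope = model.coef_[0]        # b in y = a + bx (one coefficient per feature)
print(f"intercept = {intercept:.2f}, slope = {slope:.4f}")
```

Because the stand-in table is exactly linear, the fitted line matches it regardless of which rows land in the training set; real data would not fit this cleanly.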
Step 8: Make predictions on the test set

This step uses the trained model to make predictions on the test set.
● model.predict(X_test): This line uses the trained model to predict the dependent
variable (y) based on the independent variable (X_test), which represents the 'Area'
of the houses in the test set.
● y_pred : The predicted values are stored in the variable y_pred.
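The prediction step might look like this, under the same assumptions as before (stand-in data from the example table, seed 42).

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Stand-in for the Housing dataset (assumption: the example table rows).
data = pd.DataFrame({
    "Area": [10000, 20000, 30000, 40000, 50000],
    "Price": [4000, 5000, 6000, 7000, 8000],
})
X_train, X_test, y_train, y_test = train_test_split(
    data[["Area"]], data["Price"], test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

# predict returns one predicted price per row of X_test, as a NumPy array.
y_pred = model.predict(X_test)

# Show predictions next to the actual prices for comparison.
comparison = X_test.assign(Actual=y_test, Predicted=y_pred)
print(comparison)
```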
Step 9: Evaluate the model

This calculates the Mean Squared Error (MSE), the average of the squared differences between the
predicted and actual prices in the test data. A lower MSE means the predictions are closer to the actual values.
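A sketch of the evaluation, assuming the stand-in data from the example table and seed 42. It also computes the same quantity by hand to show what the metric measures.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Stand-in for the Housing dataset (assumption: the example table rows).
data = pd.DataFrame({
    "Area": [10000, 20000, 30000, 40000, 50000],
    "Price": [4000, 5000, 6000, 7000, 8000],
})
X_train, X_test, y_train, y_test = train_test_split(
    data[["Area"]], data["Price"], test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Library version: mean of squared (actual - predicted) over the test rows.
mse = mean_squared_error(y_test, y_pred)

# The same quantity computed by hand.
mse_manual = ((y_test.to_numpy() - y_pred) ** 2).mean()
print("Mean Squared Error:", mse)
```

With this exactly linear stand-in table the error is essentially zero; on real housing data it would be larger.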
Step 10: Visualize the regression line

This step visualizes the regression line along with the test set to understand how well
the model fits the data.
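One way to draw this plot is sketched below, with the stand-in data from the example table and seed 42. The colors and labels are illustrative, the line is drawn over the full range of areas so it stays visible even when the test set is tiny, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Stand-in for the Housing dataset (assumption: the example table rows).
data = pd.DataFrame({
    "Area": [10000, 20000, 30000, 40000, 50000],
    "Price": [4000, 5000, 6000, 7000, 8000],
})
X = data[["Area"]]
y = data["Price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

fig, ax = plt.subplots()
ax.scatter(X_test["Area"], y_test, color="blue", label="Test data")
ax.plot(X["Area"], model.predict(X), color="red", label="Regression line")
ax.set_xlabel("Area")
ax.set_ylabel("Price")
ax.legend()
```

In an interactive session, drop the matplotlib.use line and call plt.show() to display the figure.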
Step 11: Predict house price for user input

1. This step takes user input for the area of the house and converts it to a DataFrame with
the column name 'Area', matching the column the model was trained on.
2. It then uses the trained model to predict the house price based on the user's input.
3. The predicted price is then displayed.
4. This allows users to get a predicted house price for a specific area without having
to look at the entire dataset.
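A sketch of this step, assuming the stand-in data from the example table and seed 42. To keep the example runnable without a keyboard, a fixed string stands in for the value that input() would return; the prompt text in the comment is hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Stand-in for the Housing dataset (assumption: the example table rows).
data = pd.DataFrame({
    "Area": [10000, 20000, 30000, 40000, 50000],
    "Price": [4000, 5000, 6000, 7000, 8000],
})
X_train, X_test, y_train, y_test = train_test_split(
    data[["Area"]], data["Price"], test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

# Stands in for: area_text = input("Enter the house area: ")
area_text = "35000"
area = float(area_text)  # input() returns text, so convert it to a number

# Wrap the value in a DataFrame with the same column name used in training.
new_house = pd.DataFrame({"Area": [area]})
predicted_price = model.predict(new_house)[0]
print(f"Predicted price for area {area:.0f}: {predicted_price:.2f}")
```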
Thanks
Do you have any questions?
