Task 7

The document outlines the development of a Linear Regression Model in Python to predict car stopping distances based on speed using a provided dataset. It details the procedure, including data visualization, model training, and evaluation with RMSE and R² score. The reported results are an R² of 1.00 (rounded to two decimal places) and an RMSE of 1.59 ft on the held-out test rows, indicating a close but not exact fit.

Uploaded by

John Mesia Dhas

Task 7: Build a linear regression model to predict the stopping distance of cars based on
their speed.
Tools: RStudio, Python

Problem Statement

Develop a Linear Regression Model to predict the stopping distance of a car based on its
speed using Python. The model should analyze the relationship between speed and stopping
distance and evaluate performance using RMSE and R² score.

Aim

To implement and evaluate a Simple Linear Regression Model in Python that predicts the
stopping distance of a car based on its speed using the cars dataset.

Procedure

1. Import Required Libraries
2. Load the Dataset (Use cars dataset)
3. Visualize the Relationship (Scatter Plot)
4. Split the Data (Train-Test Split)
5. Train the Linear Regression Model
6. Evaluate the Model (R² Score & RMSE)
7. Make Predictions & Plot Regression Line

Sample Dataset (cars dataset)

Speed (mph)    Stopping Distance (ft)
4              2
7              10
8              4
9              22
10             16
15             26
20             34
25             48
30             60
35             76

Python Program

# Import Required Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Load the dataset (sample of the cars data, entered inline)
cars = pd.DataFrame({'speed': [4, 7, 8, 9, 10, 15, 20, 25, 30, 35],
                     'dist': [2, 10, 4, 22, 16, 26, 34, 48, 60, 76]})

# Data Visualization
plt.scatter(cars['speed'], cars['dist'], color='blue')
plt.xlabel('Speed (mph)')
plt.ylabel('Stopping Distance (ft)')
plt.title('Speed vs Stopping Distance')
plt.show()

# Split dataset into training (80%) and testing (20%);
# with 10 rows, this holds out 2 rows for testing
X = cars[['speed']]
y = cars['dist']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the Linear Regression Model
model = LinearRegression()
model.fit(X_train, y_train)

# Model Evaluation
y_pred = model.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)

print(f"RMSE: {rmse:.2f}")
print(f"R-squared: {r2:.2f}")

# Plot Regression Line over the training data
plt.scatter(X_train, y_train, color='blue', label='Training Data')
plt.plot(X_train, model.predict(X_train), color='red', linewidth=2, label='Regression Line')
plt.xlabel('Speed (mph)')
plt.ylabel('Stopping Distance (ft)')
plt.title('Linear Regression Model')
plt.legend()
plt.show()
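
As a hedged follow-up to the program above, the fitted line can be inspected directly and used for a new prediction. The sketch below repeats the same data and split settings so that it runs on its own; the 22 mph input and the names slope, intercept, new_speed and predicted are illustrative assumptions, not part of the original task.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Same sample data and split settings as the program above
cars = pd.DataFrame({'speed': [4, 7, 8, 9, 10, 15, 20, 25, 30, 35],
                     'dist': [2, 10, 4, 22, 16, 26, 34, 48, 60, 76]})
X = cars[['speed']]
y = cars['dist']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)

# Fitted line: dist = intercept + slope * speed
slope = model.coef_[0]
intercept = model.intercept_
print(f"dist = {intercept:.2f} + {slope:.2f} * speed")

# Predict for a speed that is not in the table (22 mph is an illustrative value)
new_speed = pd.DataFrame({'speed': [22]})
predicted = model.predict(new_speed)[0]
print(f"Predicted stopping distance at 22 mph: {predicted:.2f} ft")
```

Passing the new speed as a DataFrame with the same column name as the training data keeps the input format consistent with how the model was fitted.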

Output

Model Summary

RMSE: 1.59
R-squared: 1.00

Regression Line Plot


Interpretation of Linear Regression Results

The model's evaluation metrics on the held-out test data are:

• Root Mean Squared Error (RMSE) = 1.59
• R-squared (R²) = 1.00

Let’s interpret these results in detail:

1. Interpretation of R-squared (R² = 1.00)

• Definition: R² measures the proportion of the variability in the dependent variable
(stopping distance) that is explained by the independent variable (speed).
• Value of 1.00: After rounding to two decimal places, speed explains essentially all
of the variation in stopping distance for the test rows.
• Implication:
o An R² this close to 1 is highly unusual in real-world scenarios and should be
read with caution.
o The fit is not exactly perfect, since the RMSE is nonzero. With only 10 rows, the
20% test split contains just 2 observations, so the score rests on very little
data and may not hold for a larger dataset.
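
To make the definition concrete, R² can be computed by hand as 1 minus the ratio of the residual sum of squares to the total sum of squares on the test rows. The following is a minimal sketch that reuses the data and split settings from the program; the names ss_res, ss_tot and r2_manual are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Same sample data and split settings as the program
cars = pd.DataFrame({'speed': [4, 7, 8, 9, 10, 15, 20, 25, 30, 35],
                     'dist': [2, 10, 4, 22, 16, 26, 34, 48, 60, 76]})
X_train, X_test, y_train, y_test = train_test_split(
    cars[['speed']], cars['dist'], test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)
y_true = y_test.to_numpy()
y_pred = model.predict(X_test)

# R² = 1 - SS_res / SS_tot, computed on the test rows
ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
r2_manual = 1 - ss_res / ss_tot

# Print more decimals than the program does, to show the unrounded value
print(f"R-squared (manual):  {r2_manual:.4f}")
print(f"R-squared (sklearn): {r2_score(y_true, y_pred):.4f}")
```

Printing four decimal places instead of two shows whether the score is exactly 1 or only rounds to it.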

2. Interpretation of RMSE (1.59)

• Definition: RMSE is the square root of the mean squared prediction error, expressed
in the same unit as the dependent variable (stopping distance in feet).
• Value of 1.59: The model's predictions on the test rows typically deviate from the
actual stopping distances by about 1.59 feet.
• Implication:
o The RMSE is small relative to the observed stopping distances, which range
from 2 to 76 feet, so the predictions are close to the actual values.
o Together with the high R², this points to a strong linear relationship in this
sample, though not an error-free model.
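
The RMSE can likewise be reproduced from the individual residuals, which also shows how far off each test prediction is. This is a sketch under the same data and split settings as the program; the names residuals and rmse_manual are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Same sample data and split settings as the program
cars = pd.DataFrame({'speed': [4, 7, 8, 9, 10, 15, 20, 25, 30, 35],
                     'dist': [2, 10, 4, 22, 16, 26, 34, 48, 60, 76]})
X_train, X_test, y_train, y_test = train_test_split(
    cars[['speed']], cars['dist'], test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Residual = actual - predicted, one value per test row
residuals = y_test.to_numpy() - y_pred
for speed, actual, pred in zip(X_test['speed'], y_test, y_pred):
    print(f"speed={speed} mph  actual={actual} ft  predicted={pred:.2f} ft")

# RMSE = square root of the mean of the squared residuals
rmse_manual = np.sqrt(np.mean(residuals ** 2))
print(f"RMSE (manual):  {rmse_manual:.2f}")
print(f"RMSE (sklearn): {np.sqrt(mean_squared_error(y_test, y_pred)):.2f}")
```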

Result

• The Linear Regression Model successfully predicts the stopping distance based on
speed.
