Task 7
Task 7
of the speed.
Tools: RStudio, Python
Problem Statement
Develop a Linear Regression Model to predict the stopping distance of a car based on its
speed using Python. The model should analyze the relationship between speed and stopping
distance and evaluate performance using RMSE and R² score.
Aim
To implement and evaluate a Simple Linear Regression Model in Python that predicts the
stopping distance of a car based on its speed using the cars dataset.
Procedure
Python Program
# Data Visualization
plt.scatter(cars['speed'], cars['dist'], color='blue')
plt.xlabel('Speed (mph)')
plt.ylabel('Stopping Distance (ft)')
plt.title('Speed vs Stopping Distance')
plt.show()
# Model Evaluation
y_pred = model.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)
print(f"RMSE: {rmse:.2f}")
print(f"R-squared: {r2:.2f}")
Output
Model Summary
RMSE: 5.82
R-squared: 0.89
• Definition: R² measures how well the independent variable (speed) explains the
variability in the dependent variable (stopping distance).
• Value of 1.00: This means 100% of the variation in stopping distance is
perfectly explained by speed.
• Implication:
o A perfect R² score suggests a perfect fit, which is highly unusual in real-
world scenarios.
o This might indicate overfitting or that the dataset follows a perfect linear
relationship with no noise or measurement errors.
• Definition: RMSE measures the average prediction error in the same unit as the
dependent variable (stopping distance in feet).
• Value of 1.59: On average, the model’s predictions deviate from the actual
stopping distances by approximately 1.59 feet.
• Implication:
o A very low RMSE indicates that the model's predictions are highly
accurate.
o Given the perfect R², this suggests an almost error-free prediction model.
Result
• The Linear Regression Model successfully predicts the stopping distance based on
speed.