0% found this document useful (0 votes)
24 views17 pages

Dimpas Bscpe 2-7 Assignment No.9

Linear regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables, represented by the equation Y=β0+β1X+ϵ. It includes simple and multiple linear regression, relies on key assumptions for valid results, and is widely applicable in fields like business, healthcare, and education. Despite its effectiveness, linear regression has limitations such as sensitivity to outliers and inability to model complex relationships.

Uploaded by

clarecoletted
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
24 views17 pages

Dimpas Bscpe 2-7 Assignment No.9

Linear regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables, represented by the equation Y=β0+β1X+ϵ. It includes simple and multiple linear regression, relies on key assumptions for valid results, and is widely applicable in fields like business, healthcare, and education. Despite its effectiveness, linear regression has limitations such as sensitivity to outliers and inability to model complex relationships.

Uploaded by

clarecoletted
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 17

LINEAR

REGRESSION
Submitted by: Clare Colette S. Dimpas
BSCpE 2-7
INTRODUCTION TO
LINEAR REGRESSION
Linear regression is a statistical technique
used to model the relationship between a
dependent variable (also known as the
target) and one or more independent
variables (also known as predictors).

Widely used in data analysis, predictive


modeling, and machine learning.

Some real-life examples of it are predicting


house prices, stock trends, or customer
spending.
KEY CONCEPTS
The relationship is expressed
Linear regression aims to find the with the equation:
best-fitting straight line (or
hyperplane in higher dimensions)
that describes how the dependent
variable changes as the independent
variables change. The simplest form,
known as simple linear regression,
involves just one independent
variable and one dependent
variable.
EXAMPLE:
Suppose you want to predict a student’s test score
based on hours studied. The equation could be:

Score = 50 + 5 × Hours Studied + ϵ

Here, 50 is the intercept, 5 is the slope, and ϵ is


the error term.
ASSUMPTIONS OF LINEAR REGRESSION
1. . Linear Relationship: 3. Independence of Observations:
The relationship between the independent Observations should be independent
variable(s) and the dependent variable is of each other
linear.
Example: Test scores of students
Example: Hours studied and test scores should not influence each other.
follow a straight-line relationship.

2. Homoscedasticity: 4. Normal Distribution of Residuals:


The variance of errors (ϵ) is Residuals (differences between
constant across all levels of X. observed and predicted Y) should
follow a normal distribution.
Violations lead to patterns in
residual plots, indicating a poor This is important for hypothesis testing.
fit.
TYPES OF LINEAR REGRESSION
Simple Linear Regression: Multiple Linear Regression:

Involves two or more independent


Involves one independent
variables.
variable.
Example: Predicting house prices
Example: Predicting a person’s using variables like square
weight based on their height. footage, number of rooms, and
location.
Formula: Y = β0 +β1X + ϵ
Formula: Y = β0 + β1X1 + β2X2 + ... +
βnXn + ϵ
SAMPLE PROBLEM 1
Problem: A teacher wants to predict a student’s final exam score based on the
number of hours the student studied.
SAMPLE PROBLEM 1
Solution:
1. Calculate the slope (β1) and intercept (β0​) using formulas:
β0 = Y - β1X and β1 = Cov (X, Y) / Var(X)

2. Substitute these values into the regression equation:


Y = β0 +β1X + ϵ

3. Use the equation to predict Y when X=8X = 8X=8.

Expected Output:
Regression Equation: Y=45+4XY = 45 + 4XY=45+4X.
Prediction for 8 hours: Y=45+4(8)=77Y = 45 + 4(8) = 77Y=45+4(8)=77
SAMPLE PROBLEM 2
Problem: A real estate agent wants to predict the price of a house based on its
size (in square feet) and the number of bedrooms.
SAMPLE PROBLEM 2
Solution:

1. Calculate the coefficients (β0,β1,β2​) using statistical software or matrix


operations (this involves solving the normal equations).

2. Substitute the coefficients into the regression equation: Y = β0 + β1X1 + β2X2

3. Use the equation to predict Y for X1=1800 and X2=3.

Expected Output:
Regression Equation: Y=50,000+100X1+25,000X2
Prediction for 1800 sq ft and 3 bedrooms:
Y=50,000+100(1800)+25,000(3)=325,000.
MODEL EVALUATION
METRICS
1. R^2 (Coefficient of Determination):

Measures how well the independent


variables explain the variability of the
dependent variable.

Ranges from 0 to 1: Higher values indicate a


better fit.

Example: An R^2 of 0.8 means 80% of the


variability in Y is explained by the model.
2. Mean Squared Error (MSE):

Measures the average squared difference


between observed and predicted values.
3. P-Value:
Formula:

Tests the significance of


predictors.

A low p-value (< 0.05)


Smaller values indicate a better fit. suggests the predictor
significantly influences
Y.

4. Residual Analysis:

Inspect residual plots to ensure


assumptions are not violated.
LIMITATIONS OF LINEAR
REGRESSION
1. Linearity Assumption: 2. Outlier Sensitivity:

Linear regression assumes the Outliers can skew results,


relationship between X and Y is making the model less
linear. accurate.

Non-linear data requires


Example: An extremely high
alternative models (e.g.,
house price in a dataset may
polynomial regression).
distort predictions.
3. Multicollinearity
4. Limited by Assumptions:
In multiple regression, high correlation
among independent variables can lead Violating assumptions (e.g., non-
to unreliable estimates of coefficients. constant error variance or
dependent observations) can
Example: Including both "years of invalidate results.
education" and "years of experience"
might create multicollinearity.

5. Cannot Determine Complex Relationships:

Linear regression fails to model interactions or


non-linear patterns in data effectively.
Business: Sales Education: Modeling
forecasting, market trend student performance
analysis. based on study habits.

APPLICATIONS
Healthcare:
Predicting patient
outcomes.
SUMMARY AND CONCLUSION
Linear regression is a fundamental statistical technique used to model the
relationship between a dependent variable and one or more independent
variables. It is grounded in the concept of the linear equation Y=β0+β1X+ϵ, where
β0​ represents the intercept, β1 the slope, and ϵ the error term accounting for
deviations not captured by the model.

The method comes in two main forms: simple linear regression, which involves
one predictor, and multiple linear regression, which handles multiple predictors
for more complex analyses. To produce valid results, linear regression relies on
key assumptions, including a linear relationship between variables, equal variance
of residuals (homoscedasticity), independent observations, and normally
distributed errors.
This technique finds extensive applications across fields such as business
(e.g., sales forecasting), healthcare (e.g., predicting patient outcomes),
and education (e.g., modeling student performance). However, evaluating
the model requires attention to metrics like R2R^2R2, which measures the
proportion of variance explained, and the mean squared error (MSE),
which quantifies prediction accuracy.

Despite its advantages, linear regression has limitations. It is sensitive to


outliers, struggles with multicollinearity in multiple regression, and
cannot effectively handle non-linear or complex data relationships.
Nonetheless, linear regression’s simplicity and effectiveness make it an
essential tool for data analysis, providing a solid foundation for
understanding data trends and relationships while paving the way for
exploring more advanced analytical methods.

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy