
Name: Akkshada Anil Jagtap
Roll No.: 4301002

Assignment No. 2

Aim: Generate a proper 2-D data set of N points. Split the data set
into a Training Data set and a Test Data set.
i) Perform linear regression analysis with the Least Squares Method.
ii) Plot the graphs for Training MSE and Test MSE and comment on Curve Fitting and
Generalization Error.
iii) Verify the Effect of Data Set Size and the Bias-Variance Tradeoff.
iv) Apply Cross Validation and plot the graphs for errors.
v) Apply the Subset Selection Method and plot the graphs for errors.
vi) Describe your findings in each case.

Theory :

Mean Squared Error: General steps to calculate the MSE from a set of
X and Y values (a short code sketch follows the list):
1. Find the regression line.
2. Insert your X values into the linear regression equation to find the new Y values (Y').
3. Subtract each new Y value from the original to get the error.
4. Square the errors.
5. Add up the squared errors.
6. Divide by the number of points to find the mean.
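
A minimal sketch of these steps in Python (the helper name is illustrative, and it assumes the line's intercept b0 and slope b1 are already known and that X and Y are NumPy arrays):

import numpy as np

def mean_squared_error(X, Y, b0, b1):
    # Step 2: predicted Y values (Y') from the regression line
    Y_pred = b0 + b1 * X
    # Steps 3-4: errors, then squared errors
    squared_errors = (Y - Y_pred) ** 2
    # Steps 5-6: add up the squared errors and take the mean
    return np.sum(squared_errors) / len(X)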

Simple Linear Regression :


When we have a single input attribute (x) and we want to use linear regression, this is
called simple linear regression. If we had multiple input attributes (e.g. x1, x2, x3), this
would be called multiple linear regression. The procedure for simple linear regression is
simpler than that for multiple linear regression, so it is a good place to start.
In this section we are going to create a simple linear regression model from our training
data, then make predictions for our training data to get an idea of how well the model
learned the relationship in the data. With simple linear regression we want to model our
data as follows:
y = B0 + B1 * x
This is a line where y is the output variable we want to predict, x is the input variable we
know, and B0 and B1 are coefficients that we need to estimate that move the line around.
Technically, B0 is called the intercept because it determines where the line intercepts the y-
axis. In machine learning we can call this the bias, because it is added to offset all
predictions that we make. The B1 term is called the slope because it defines the slope of the
line or how x translates into a y value before we add our bias. The goal is to find the best
estimates for the coefficients to minimize the errors in predicting y from x. Simple
regression is great, because rather than having to search for values by trial and error or
calculate them analytically using more advanced linear algebra, we can estimate them
directly from our data.
We can start off by estimating the value for B1 as:
B1 = Σ (Xi − mean(X)) * (Yi − mean(Y)) / Σ (Xi − mean(X))^2
Where mean() is the average value for the variable in our dataset. The Xi and Yi refer to the
fact that we need to repeat these calculations across all values in our dataset, and i refers to
the i'th value of X or Y. We can calculate B0 using B1 and some statistics from our dataset,
as follows:
B0 = mean(Y) − B1 * mean(X)
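
As a minimal sketch of these two formulas (assuming x and y are NumPy arrays of equal length; the function name is illustrative), the estimates can also be computed in vectorized form. The loop-based implementation further below does the same thing element by element:

import numpy as np

def estimate_coefficients(x, y):
    # B1 = Σ(xi - mean(x)) * (yi - mean(y)) / Σ(xi - mean(x))^2
    x_mean, y_mean = np.mean(x), np.mean(y)
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # B0 = mean(y) - B1 * mean(x)
    b0 = y_mean - b1 * x_mean
    return b0, b1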

Implementation:
# Making imports
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

data=pd.read_csv("headbrain.csv")

data.head()

Gender Age Range Head Size(cm^3) Brain Weight(grams)


0 1 1 4512 1530
1 1 1 3738 1297
2 1 1 4261 1335
3 1 1 3777 1282
4 1 1 4177 1590
Let us divide the data set into training and testing data sets:
X_train =data[:200]
X_test =data[200:]

X = X_train['Head Size(cm^3)'].values
Y = X_train['Brain Weight(grams)'].values
mean_x = np.mean(X)
mean_y = np.mean(Y)

print("Printing mean x")


print(mean_x)
print("Printing mean y")
print(mean_y)
m = len(X)
print("Number of samples in training set")
print(m)
numer = 0
denom = 0
for i in range(m):
    numer += (X[i] - mean_x) * (Y[i] - mean_y)
    denom += (X[i] - mean_x) ** 2
b1 = numer / denom
b0 = mean_y - (b1 * mean_x)
print("Coefficient and bias is as follows")
print(b1, b0)

Printing mean x
3679.225
Printing mean y
1299.01
Number of samples in training set
200
Coefficient and bias are as follows
0.24984027563731726 379.79141186829145
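
In other words, the fitted line is approximately Brain Weight ≈ 379.79 + 0.2498 * Head Size, i.e. each additional cm^3 of head size adds about 0.25 g to the predicted brain weight.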

Plotting Values and Regression Line


max_x = np.max(X)
min_x = np.min(X)

Calculating line values x and y


x = np.linspace(min_x, max_x)
y = b0 + b1 * x

Plotting Line & Scatter Points


# Plotting Line
plt.plot(x, y, color='yellow', label='Regression Line')
# Plotting Scatter Points
plt.scatter(X, Y, c='green', label='Scatter Plot: head size vs brain weight')
plt.xlabel('Head Size in cm^3')
plt.ylabel('Brain Weight in grams')
plt.legend()
plt.show()

Calculating Root Mean Squared Error


sse = 0
for i in range(m):
    y_pred = b0 + b1 * X[i]  # predicted brain weight = b0 + b1 * head size
    sse += (Y[i] - y_pred) ** 2  # Y[i] is the actual brain weight
print("Sum of squared errors of brain weight (training data):", sse)
mse = sse / m
rmse = np.sqrt(mse)
print("Root Mean Squared Error is", rmse)

Sum of squared errors of brain weight (training data): 1069588.9925686093


Root Mean Squared Error is 73.12964489755879
The coefficient of determination (denoted by R2) is a key output of regression analysis. It is
the square of the correlation (r) between predicted y scores and actual y scores; thus, it
ranges from 0 to 1.
ss_t = 0
ss_r = 0
for i in range(m):
    y_pred = b0 + b1 * X[i]
    ss_t += (Y[i] - mean_y) ** 2  # total sum of squares
    ss_r += (Y[i] - y_pred) ** 2  # residual sum of squares
scorer2 = 1 - (ss_r / ss_t)
print("R^2 score for training data is", scorer2)

R^2 score for training data is 0.5949551398852462
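
The aim also asks for the Test MSE (point ii). A minimal sketch, reusing the fitted b0 and b1 on the held-out X_test split defined earlier (assuming the same column names as the training split):

Xt = X_test['Head Size(cm^3)'].values
Yt = X_test['Brain Weight(grams)'].values
# Predict on unseen data with the coefficients learned from the training set
Yt_pred = b0 + b1 * Xt
test_mse = np.mean((Yt - Yt_pred) ** 2)
print("Test MSE is", test_mse)
print("Test RMSE is", np.sqrt(test_mse))

Comparing the test MSE with the training MSE indicates how well the model generalizes: a test error close to the training error suggests the fitted line is not overfitting.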

Conclusion: Hence, we successfully studied how to generate a proper 2-D data set of N
points, split it into a Training Data set and a Test Data set, and perform linear regression
analysis with the Least Squares Method.
