Chapter 4 Regression (2) - Unlocked

Chapter 04 of BMED-3083 covers regression analysis, including essential statistical terminologies and concepts such as mean, variance, and standard deviation. It explains the purpose of regression analysis in understanding relationships between dependent and independent variables, and highlights the importance of data quality for accurate conclusions. Additionally, it discusses the uses, potential abuses, and the least squares method for estimating regression parameters.


BMED-3083: Computational Methods

Chapter 04: Regression

1
Statistics Background of
Regression Analysis
After covering this topic, you should be able to:
1. Review the statistics background needed for
learning regression.

2
Review of Statistical Terminologies
- It is important that we review some of the statistical
terminologies that we may encounter in studying the topic
of regression.
- Some key terms we need to review are:
- sample,
- arithmetic mean (average),
- error or deviation,
- standard deviation,
- variance, and
- coefficient of variation.

3
Elementary Statistics
- A statistical sample is a fraction or a portion of the whole
(population) that is studied.
- Consider an engineer who is interested in understanding the
relationship between the rate of a reaction and temperature.
- It is impractical for the engineer to test all possible and
measurable temperatures.
- Apart from the fact that instruments for temperature
measurement have limited temperature ranges over which they
can function, the sheer number of hours required to measure
every possible temperature makes it impractical.

4
Elementary Statistics
- What the engineer does is choose a temperature range
(based on his/her knowledge of the system) in which to
study.
- Within the chosen temperature range, the engineer further
chooses specific temperatures that span the range within
which to conduct the experiments.
- These chosen temperatures for study constitute the sample while all
possible temperatures are the population.
- In statistics, the sample is the fraction of the population
chosen for study.

5
Elementary Statistics
- The location of the center of a distribution - the mean or
average - is an item of interest.
- The arithmetic mean of a sample is a measure of its central
tendency and is evaluated by dividing the sum of individual
data points by the number of points.
- A measure of central tendency (also referred to as a measure of center
or central location) is a summary measure that attempts to describe a
whole set of data with a single value that represents the middle or
center of its distribution. There are three measures of central tendency:
- mode,
- median, and
- mean.

6
Elementary Statistics
- The arithmetic mean \bar{y} is mathematically defined as

\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}

- which is the sum of the individual data points y_i divided by
the number of data points n.

7
Elementary Statistics
- One of the measures of the spread of the data is the range
of the data.
- The range R is defined as the difference between the
maximum and minimum values of the data:

R = y_{max} - y_{min}

- where y_{max} and y_{min} are the maximum and minimum of the
values y_i, i = 1, 2, ..., n.

8
Elementary Statistics

- However, the range may not give a good idea of the spread of
the data, as some data points may lie far away from most
other data points (such data points are called outliers).
- That is why the deviation from the average or arithmetic
mean is regarded as a better way to measure the spread.
- The residual between a data point and the mean is
defined as:

e_i = y_i - \bar{y}

9
Elementary Statistics
- The difference of each data point from the mean can be
negative or positive depending on which side of the mean the
data point lies (recall the mean is centrally located).
- If one calculates the sum of such differences to find the
overall spread, the differences may simply cancel each other.
- That is why the sum of the square of the differences is
considered a better measure.
- The sum of the squares of the differences, also called the
summed squared error (SSE), S_t, is given by:

S_t = \sum_{i=1}^{n} (y_i - \bar{y})^2

10
Elementary Statistics
- Since the magnitude of the summed squared error depends
on the number of data points, an average value of the summed
squared error is defined as the variance, \sigma^2:

\sigma^2 = \frac{S_t}{n-1} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}

- The variance \sigma^2 is sometimes written in two other
convenient forms:

\sigma^2 = \frac{\sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2 / n}{n-1} \quad \text{or} \quad \sigma^2 = \frac{\sum_{i=1}^{n} y_i^2 - n\bar{y}^2}{n-1}

11
Elementary Statistics
- However, why is the variance divided by (n - 1) and not n, given
that we have n data points? Because the n deviations e_i sum to
zero, only n - 1 of them are independent; dividing by n - 1 rather
than n makes the sample variance an unbiased estimate of the
population variance.
- To bring the variation back to the same units as the
original data, a new term, the standard deviation \sigma, is
defined as:

\sigma = \sqrt{\frac{S_t}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}}

- Furthermore, the ratio of the standard deviation to the mean,
known as the coefficient of variation c.v., is also used to normalize
the spread of a sample:

c.v. = \frac{\sigma}{\bar{y}} \times 100
12
Example 1
Use the data in Table below to calculate the
a) mean concentration,
b) range of data,
c) residual of each data point,
d) sum of the square of the residuals.
e) sample standard deviation,
f) variance, and
g) coefficient of variation.
Table 1 Ion concentration

12.0 15.0 14.1 15.9 11.5 14.8 11.2 13.7 15.9 12.6 14.3 12.6 12.1 14.8

13
Solution
Set up a table (see Table 2) containing the data, the residual
for each data point and the square of the residuals.
Table 2 Data and data summations for statistical calculations.
i     y_i     y_i^2      y_i - \bar{y}    (y_i - \bar{y})^2
1     12.0    144.00     -1.6071          2.5829
2     15.0    225.00      1.3929          1.9401
3     14.1    198.81      0.4929          0.24291
4     15.9    252.81      2.2929          5.2572
5     11.5    132.25     -2.1071          4.4401
6     14.8    219.04      1.1929          1.4229
7     11.2    125.44     -2.4071          5.7943
8     13.7    187.69      0.0929          0.0086224
9     15.9    252.81      2.2929          5.2572
10    12.6    158.76     -1.0071          1.0143
11    14.3    204.49      0.6929          0.48005
12    12.6    158.76     -1.0071          1.0143
13    12.1    146.41     -1.5071          2.2715
14    14.8    219.04      1.1929          1.4229
Σ     190.50  2625.3      0.0000          33.149
Solution
(a) Mean concentration, from the equation:

\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} = \frac{190.5}{14} = 13.607

(b) The range of the data, per the equation, is:

R = y_{max} - y_{min} = 15.9 - 11.2 = 4.7

(c) The residual at each point is shown in Table 2. For example,
at the first data point, per the equation:

e_1 = y_1 - \bar{y} = 12.0 - 13.607 = -1.6071
Solution
(d) The sum of the squares of the residuals, from the
equation, is:

S_t = \sum_{i=1}^{n} (y_i - \bar{y})^2 = 33.149 (see Table 2)

(e) The standard deviation, per the equation, is:

\sigma = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}} = \sqrt{\frac{33.149}{14-1}} = 1.5969
16
Solution
(f) The variance is calculated from the equation:

\sigma^2 = (1.5969)^2 = 2.5499

The variance can also be calculated using either of the alternative equations:

\sigma^2 = \frac{\sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2/n}{n-1} = \frac{2625.31 - (190.5)^2/14}{14-1} = 2.5499

or

\sigma^2 = \frac{\sum_{i=1}^{n} y_i^2 - n\bar{y}^2}{n-1} = \frac{2625.3 - 14 \times 13.607^2}{14-1} = 2.5499

17
Solution
(g) The coefficient of variation, c.v., from the equation, is:

c.v. = \frac{\sigma}{\bar{y}} \times 100 = \frac{1.5969}{13.607} \times 100 = 11.735\%
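The calculations in this example can be reproduced with a short script. A minimal sketch using only the Python standard library (the function name is mine, not from the slides):

```python
import math

def sample_stats(y):
    """Return the summary statistics defined in this section for a sample y."""
    n = len(y)
    mean = sum(y) / n                       # arithmetic mean, ybar
    rng = max(y) - min(y)                   # range R
    st = sum((yi - mean) ** 2 for yi in y)  # summed squared error S_t
    var = st / (n - 1)                      # sample variance
    std = math.sqrt(var)                    # sample standard deviation
    cv = std / mean * 100                   # coefficient of variation, %
    return mean, rng, st, var, std, cv

# Ion concentration data from Table 1
data = [12.0, 15.0, 14.1, 15.9, 11.5, 14.8, 11.2,
        13.7, 15.9, 12.6, 14.3, 12.6, 12.1, 14.8]
mean, rng, st, var, std, cv = sample_stats(data)
print(mean, rng, st, var, std, cv)
```

Running this reproduces the values of parts (a) through (g) above to within rounding.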
Figure 1. Ion (chlorate) concentration data points vs. data point index,
with horizontal lines at the mean \bar{y} and at \bar{y} \pm \sigma and \bar{y} \pm 2\sigma.


Introduction of Regression
Analysis
After covering this topic, you should be able to:
1. know what regression analysis is,
2. know the effective use of regression, and
3. enumerate uses and abuses of regression.

19
What is regression analysis?

- Regression analysis gives information on the relationship
between a response (dependent) variable and one or more
predictor (independent) variables, to the extent that such
information is contained in the data.
- The goal of regression analysis is to express the response
variable as a function of the predictor variables.

20
What is regression analysis?

- The fit and the accuracy of conclusions depend on the data used.


- The goal of regression analysis is to express the response
variable as a function of the predictor variables.
- Hence non-representative or improperly compiled data result
in poor fits and conclusions. Thus, for effective use of
regression analysis one must:
1. investigate the data collection process,
2. discover any limitations in data collected, and
3. restrict conclusions accordingly.

21
What is regression analysis?
- An example of a regression model is the linear regression
model, which is a linear relationship between the response
variable y and the predictor variables x_i, i = 1, 2, ..., n, of the
form

y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \varepsilon

where

\beta_0, \beta_1, \dots, \beta_n are regression coefficients (unknown model parameters), and

\varepsilon is the error due to variability in the observed responses.

22
What is regression analysis?
- Linear, as used in linear regression, refers to the form in which
the unknown parameters \beta_i occur: as simple linear multipliers
of fixed functions of the predictor variable. For example,

y = \beta_0 + \beta_1 t + \beta_2 t^2 + \varepsilon

is linear in the parameters even though it is quadratic in t.

23
Comparison of Regression
and Correlation
- Unlike regression, correlation analysis assesses the
simultaneous variability of a collection of variables.
- Correlation quantifies the strength of the linear relationship
between a pair of variables, whereas regression expresses the
relationship in the form of an equation.
- For example: in patients attending an accident and emergency
unit (A&E), we could use correlation to determine whether
there is a relationship between age and urea level, and use
regression to determine whether the level of urea can be
predicted for a given age.

24
Uses of Regression Analysis
- Three uses for regression analysis are for:
1. prediction
2. model specification and
3. parameter estimation.
- Regression analysis equations are designed to make predictions.
- Good predictions will not be possible if the model is not correctly
specified and the accuracy of the parameters is not ensured.
- Parameter estimation is the most difficult to perform well: not only
must the model be correctly specified, the predictions must also be
accurate, and the data should allow for good estimation.
- Thus, limitations of the data and the inability to measure all predictor
variables relevant in a study restrict the use of prediction equations.
25
Abuses of Regression Analysis
- There are three common abuses of regression analysis:
1. Extrapolation
2. Generalization
3. Causation
- We extrapolate when we use a regression equation to produce a y
value from an x value that is outside the range of the observed x
values.
- Generalization could arise when unsupported or over-exaggerated
claims are made.
- It is not often possible to measure all predictor variables relevant in a study.
Hence, limitations imposed by the data restrict the use of prediction equations.
- Regression analysis cannot prove causality, rather it can only aid in
the confirmation or refutation of a causal model – the model must
however have a theoretical basis.
26
Least Squares Methods
- This is the most popular method of parameter estimation for coefficients of
regression models. It has well-known probability distributions and gives unbiased
estimators of regression parameters with the smallest variance.
- We wish to predict the response for n data points (x_1, y_1), (x_2, y_2), ..., (x_n, y_n)
by a regression model given by

y = f(x)

where the function f(x) has regression constants that need to be estimated.
For example:

f(x) = a_0 + a_1 x is a straight-line regression model with constants a_0 and a_1,

f(x) = a_0 e^{a_1 x} is an exponential model with constants a_0 and a_1, and

f(x) = a_0 + a_1 x + a_2 x^2 is a quadratic model with constants a_0, a_1 and a_2.

27
Least Squares Methods
- A measure of goodness of fit, that is, how well the regression model f(x)
predicts the response variable y, is the magnitude of the residual E_i at each of the
n data points:

E_i = y_i - f(x_i), i = 1, 2, ..., n

- Ideally, if all the residuals E_i are zero, one has found an equation in which
all the points lie on the model.

- Thus, minimization of the residuals is an objective of obtaining regression coefficients.

- In the least squares method, estimates of the constants of the model are chosen
such that the sum of the squared residuals is minimized, that is, minimize:

\sum_{i=1}^{n} E_i^2

28
Linear Regression
After covering this topic, you should be able to:
1. define linear regression,
2. evaluate several residual-minimization criteria and
choose the right one,
3. derive the constants of a linear regression
model based on least squares method criterion,
4. use in examples, the derived formulas for the
constants of a linear regression model, and
5. prove that the constants of the linear regression
model are unique and correspond to a minimum.
29
What is Regression?
What is regression? Given n data points (x_1, y_1), (x_2, y_2), ..., (x_n, y_n),
best fit y = f(x) to the data.

The residual at each point is E_i = y_i - f(x_i).

Figure. Basic model for regression: data points, the curve y = f(x),
and the residual E_i = y_i - f(x_i) at a typical point.

30
Linear Regression-Criterion#1
Given n data points (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), best fit y = a_0 + a_1 x to the data.

Does minimizing \sum_{i=1}^{n} E_i work as a criterion?

Figure. Linear regression of y vs. x data showing the residual
E_i = y_i - a_0 - a_1 x_i at a typical point x_i, for the model y = a_0 + a_1 x.

31
Example for Criterion#1
Example: Given the data points (2,4), (3,6), (2,6) and (3,8), best fit
the data to a straight line using Criterion#1
n
Minimize  Ei Using y=4x − 4 as the regression curve?
i =1
Table. Data Points 10

8
x y
6
2.0 4.0
y

3.0 6.0 4

2.0 6.0 2

3.0 8.0 0
0 1 2 3 4

Figure. Data points for y vs x data.


32
Linear Regression-Criterion#1

Using y = 4x - 4 as the regression curve:

Table. Residuals at each point for regression model y = 4x - 4
x     y     y_predicted    E = y - y_predicted
2.0   4.0   4.0             0.0
3.0   6.0   8.0            -2.0
2.0   6.0   4.0             2.0
3.0   8.0   8.0             0.0

\sum_{i=1}^{4} E_i = 0

Figure. Regression curve y = 4x - 4 and y vs. x data.

33
Linear Regression-Criterion#1
Using y = 6 as the regression curve:

Table. Residuals at each point for regression model y = 6
x     y     y_predicted    E = y - y_predicted
2.0   4.0   6.0            -2.0
3.0   6.0   6.0             0.0
2.0   6.0   6.0             0.0
3.0   8.0   6.0             2.0

\sum_{i=1}^{4} E_i = 0

Figure. Regression curve y = 6 and y vs. x data.

34
Linear Regression – Criterion #1
\sum_{i=1}^{4} E_i = 0 for both regression models, y = 4x - 4 and y = 6.

- The sum of the residuals is minimized (in this case it is zero),
but the regression model is not unique.
- Hence, the criterion of minimizing the sum of the residuals is
a bad criterion.

35
Linear Regression-Criterion#2
Will minimizing \sum_{i=1}^{n} |E_i| work any better?

Figure. Linear regression of y vs. x data showing the residual
E_i = y_i - a_0 - a_1 x_i at a typical point x_i, for the model y = a_0 + a_1 x.

36
Example for Criterion#2
Example: Given the data points (2,4), (3,6), (2,6) and (3,8), best fit
the data to a straight line using Criterion#2: minimize \sum_{i=1}^{n} |E_i|.

Table. Data points
x     y
2.0   4.0
3.0   6.0
2.0   6.0
3.0   8.0

Figure. Data points for y vs. x data.


37
Linear Regression-Criterion#2

Using y = 4x - 4 as the regression curve:

Table. Residuals at each point for regression model y = 4x - 4
x     y     y_predicted    E = y - y_predicted
2.0   4.0   4.0             0.0
3.0   6.0   8.0            -2.0
2.0   6.0   4.0             2.0
3.0   8.0   8.0             0.0

\sum_{i=1}^{4} |E_i| = 4

Figure. Regression curve y = 4x - 4 and y vs. x data.

38
Linear Regression-Criterion#2
Using y = 6 as the regression curve:

Table. Residuals at each point for regression model y = 6
x     y     y_predicted    E = y - y_predicted
2.0   4.0   6.0            -2.0
3.0   6.0   6.0             0.0
2.0   6.0   6.0             0.0
3.0   8.0   6.0             2.0

\sum_{i=1}^{4} |E_i| = 4

Figure. Regression curve y = 6 and y vs. x data.

39
Linear Regression-Criterion#2

\sum_{i=1}^{4} |E_i| = 4 for both regression models, y = 4x - 4 and y = 6.

- The sum of the absolute residuals has been made as small as
possible (4), but the regression model is not unique.
- Hence, the criterion of minimizing the sum of the absolute values
of the residuals is also a bad criterion.

40
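The non-uniqueness argued in the last few slides is easy to verify numerically. A small sketch checking both criteria on the example's four data points:

```python
# Data points from the example: (2,4), (3,6), (2,6), (3,8)
xs = [2.0, 3.0, 2.0, 3.0]
ys = [4.0, 6.0, 6.0, 8.0]

def sum_residuals(f):
    """Criterion 1: sum of residuals E_i = y_i - f(x_i)."""
    return sum(y - f(x) for x, y in zip(xs, ys))

def sum_abs_residuals(f):
    """Criterion 2: sum of absolute residuals |E_i|."""
    return sum(abs(y - f(x)) for x, y in zip(xs, ys))

def line1(x):
    return 4 * x - 4   # y = 4x - 4

def line2(x):
    return 6           # y = 6

# Both criteria fail to distinguish the two very different models:
print(sum_residuals(line1), sum_residuals(line2))          # 0.0 0.0
print(sum_abs_residuals(line1), sum_abs_residuals(line2))  # 4.0 4.0
```

Both candidate lines score identically under each criterion, which is exactly why the least squares criterion is introduced next.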
Least Squares Criterion
The least squares criterion minimizes the sum of the squares of the
residuals and also produces a unique line:

S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2

Figure. Linear regression of y vs. x data showing the residual
E_i = y_i - a_0 - a_1 x_i at a typical point x_i, for the model y = a_0 + a_1 x.

41
Finding Constants of Linear Model
Minimize the sum of the squares of the residuals:

S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2

To find a_0 and a_1, we minimize S_r with respect to a_0 and a_1:

\frac{\partial S_r}{\partial a_0} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i) = 0

\frac{\partial S_r}{\partial a_1} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i) x_i = 0

giving

n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i

a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i
42
Finding Constants of Linear Model
Solving for a_1 and a_0 directly yields

a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}

and

a_0 = \frac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2} = \bar{y} - a_1 \bar{x}
43
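The closed-form formulas for a_1 and a_0 translate directly into code. A minimal sketch (the function name is mine), applied here to the mousetrap torsion-spring data that appears in Example 1 below:

```python
def linear_fit(x, y):
    """Least-squares straight line y = a0 + a1*x via the closed-form solution."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a0 = sy / n - a1 * sx / n   # a0 = ybar - a1*xbar
    return a0, a1

# Torsion-spring data from Example 1: T = k1 + k2*theta
theta = [0.698132, 0.959931, 1.134464, 1.570796, 1.919862]
torque = [0.188224, 0.209138, 0.230052, 0.250965, 0.313707]
k1, k2 = linear_fit(theta, torque)
print(k1, k2)  # about 0.11767 and 0.096091
```

The printed constants agree with the hand calculation in Example 1 to within rounding of the tabulated sums.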
Example 1
The torque T needed to turn the torsion spring of a mousetrap through
an angle θ is given below. Find the constants of the model

T = k_1 + k_2 \theta

Table. Torque vs. angle for a torsional spring
Angle, θ (radians)    Torque, T (N·m)
0.698132              0.188224
0.959931              0.209138
1.134464              0.230052
1.570796              0.250965
1.919862              0.313707

Figure. Data points for torque vs. angle data.

44
Example 1 cont.
The following table shows the summations needed for the calculation of
the constants in the regression model.

Table. Tabulation of data for calculation of important summations
θ (radians)    T (N·m)      θ² (radians²)    θT (N·m·radians)
0.698132       0.188224     0.487388         0.131405
0.959931       0.209138     0.921468         0.200758
1.134464       0.230052     1.2870           0.260986
1.570796       0.250965     2.4674           0.394215
1.919862       0.313707     3.6859           0.602274
Σ: 6.2831      1.1921       8.8491           1.5896

Using the equations described for a_0 and a_1 with n = 5:

k_2 = \frac{n \sum_{i=1}^{5} \theta_i T_i - \sum_{i=1}^{5} \theta_i \sum_{i=1}^{5} T_i}{n \sum_{i=1}^{5} \theta_i^2 - \left( \sum_{i=1}^{5} \theta_i \right)^2} = \frac{5(1.5896) - (6.2831)(1.1921)}{5(8.8491) - (6.2831)^2} = 9.6091 \times 10^{-2} \text{ N·m/rad}
45
Example 1 cont.
Use the average torque and average angle to calculate k_1:

\bar{T} = \frac{\sum_{i=1}^{5} T_i}{n} = \frac{1.1921}{5} = 2.3842 \times 10^{-1}

\bar{\theta} = \frac{\sum_{i=1}^{5} \theta_i}{n} = \frac{6.2831}{5} = 1.2566

Using

k_1 = \bar{T} - k_2 \bar{\theta} = 2.3842 \times 10^{-1} - (9.6091 \times 10^{-2})(1.2566) = 1.1767 \times 10^{-1} \text{ N·m}

46
Example 1 Results
Using linear regression, a trend line is found from the data

Figure. Linear regression of Torque versus Angle data

Can you find the energy in the spring if it is twisted from 0 to 180 degrees?
47
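The closing question can be answered by integrating the fitted model: the energy stored twisting from 0 to θ_f is ∫₀^{θ_f} T dθ = k_1 θ_f + k_2 θ_f²/2. A sketch, assuming the constants found above (the integration step is my own, not from the slides):

```python
import math

# Constants from Example 1 (T = k1 + k2*theta)
k1 = 1.1767e-1   # N·m
k2 = 9.6091e-2   # N·m/rad

theta_f = math.pi  # 180 degrees expressed in radians

# Energy = integral of T dtheta from 0 to theta_f
energy = k1 * theta_f + k2 * theta_f ** 2 / 2
print(energy)  # about 0.84 J
```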
Linear Regression (special case)

Given

(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)

best fit

y = a_1 x

to the data.

48
Linear Regression (special case cont.)

Figure. Linear regression of y = a_1 x through the origin, with residual
\epsilon_i = y_i - a_1 x_i between each data point (x_i, y_i) and the model point (x_i, a_1 x_i).

49
Linear Regression (special case cont.)

Residual at each data point:

\epsilon_i = y_i - a_1 x_i

Sum of squares of residuals:

S_r = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (y_i - a_1 x_i)^2

50
Linear Regression (special case cont.)
Differentiate with respect to a_1:

\frac{dS_r}{da_1} = \sum_{i=1}^{n} 2 (y_i - a_1 x_i)(-x_i) = \sum_{i=1}^{n} \left( -2 y_i x_i + 2 a_1 x_i^2 \right)

Setting

\frac{dS_r}{da_1} = 0

gives

a_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}
51
Linear Regression (special case cont.)
Does this value of a_1 correspond to a local minimum or a local
maximum?

a_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}

\frac{dS_r}{da_1} = \sum_{i=1}^{n} \left( -2 y_i x_i + 2 a_1 x_i^2 \right)

\frac{d^2 S_r}{da_1^2} = \sum_{i=1}^{n} 2 x_i^2 > 0

Yes, it corresponds to a local minimum; since S_r is quadratic in a_1,
it is in fact the unique global minimum.

52
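A minimal sketch of this special case, including a numerical check of the second-derivative argument (the data here are hypothetical, chosen to lie near y = 2x):

```python
def fit_through_origin(x, y):
    """Least-squares slope for the no-intercept model y = a1*x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def sr(a1, x, y):
    """Sum of squared residuals S_r for a given slope a1."""
    return sum((yi - a1 * xi) ** 2 for xi, yi in zip(x, y))

# Hypothetical data, chosen to lie near y = 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
a1 = fit_through_origin(x, y)

# S_r grows on either side of a1, consistent with the second-derivative test
assert sr(a1 - 0.01, x, y) > sr(a1, x, y) < sr(a1 + 0.01, x, y)
print(a1)  # about 1.99
```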
Example 2
To find the longitudinal modulus of a composite, the following data are
collected. Find the longitudinal modulus E using the regression model

\sigma = E \epsilon

and the sum of the squares of the residuals.

Table. Stress vs. strain data
Strain, ε (%)    Stress, σ (MPa)
0                0
0.183            306
0.36             612
0.5324           917
0.702            1223
0.867            1529
1.0244           1835
1.1774           2140
1.329            2446
1.479            2752
1.5              2767
1.56             2896

Figure. Data points for stress vs. strain data.
53
Example 2 cont.
Table. Summation data for the regression model
i     ε (m/m)        σ (Pa)         ε²             εσ
1     0.0000         0.0000         0.0000         0.0000
2     1.8300×10⁻³    3.0600×10⁸     3.3489×10⁻⁶    5.5998×10⁵
3     3.6000×10⁻³    6.1200×10⁸     1.2960×10⁻⁵    2.2032×10⁶
4     5.3240×10⁻³    9.1700×10⁸     2.8345×10⁻⁵    4.8821×10⁶
5     7.0200×10⁻³    1.2230×10⁹     4.9280×10⁻⁵    8.5855×10⁶
6     8.6700×10⁻³    1.5290×10⁹     7.5169×10⁻⁵    1.3256×10⁷
7     1.0244×10⁻²    1.8350×10⁹     1.0494×10⁻⁴    1.8798×10⁷
8     1.1774×10⁻²    2.1400×10⁹     1.3863×10⁻⁴    2.5196×10⁷
9     1.3290×10⁻²    2.4460×10⁹     1.7662×10⁻⁴    3.2507×10⁷
10    1.4790×10⁻²    2.7520×10⁹     2.1874×10⁻⁴    4.0702×10⁷
11    1.5000×10⁻²    2.7670×10⁹     2.2500×10⁻⁴    4.1505×10⁷
12    1.5600×10⁻²    2.8960×10⁹     2.4336×10⁻⁴    4.5178×10⁷
Σ                                   1.2764×10⁻³    2.3337×10⁸

E = \frac{\sum_{i=1}^{12} \epsilon_i \sigma_i}{\sum_{i=1}^{12} \epsilon_i^2} = \frac{2.3337 \times 10^8}{1.2764 \times 10^{-3}} = 182.84 \text{ GPa}
54
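As a quick check of Example 2, the through-origin slope formula E = Σε_iσ_i / Σε_i² can be evaluated directly. A sketch (the unit conversions are mine: % to m/m, MPa to Pa):

```python
strain_pct = [0, 0.183, 0.36, 0.5324, 0.702, 0.867,
              1.0244, 1.1774, 1.329, 1.479, 1.5, 1.56]
stress_mpa = [0, 306, 612, 917, 1223, 1529,
              1835, 2140, 2446, 2752, 2767, 2896]

eps = [s / 100 for s in strain_pct]   # strain in m/m
sig = [s * 1e6 for s in stress_mpa]   # stress in Pa

# E = sum(eps*sig) / sum(eps^2), the least-squares slope through the origin
E = sum(e * s for e, s in zip(eps, sig)) / sum(e * e for e in eps)
print(E / 1e9)  # modulus in GPa, about 182.84
```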
Example 2 Results
The equation \sigma = 182.84 \times 10^9 \, \epsilon describes the data.

Figure. Linear regression for stress vs. strain data

55
Nonlinear Regression
After covering this topic, you should be able to:
1. derive constants of nonlinear regression
models,
2. use in examples, the derived formula for the
constants of the nonlinear regression model, and
3. linearize (transform) data to find constants of
some nonlinear regression models.

56
Nonlinear Regression
Some popular nonlinear regression models:

1. Exponential model: y = a e^{bx}
2. Power model: y = a x^b
3. Saturation growth model: y = \frac{a x}{b + x}
4. Polynomial model: y = a_0 + a_1 x + \dots + a_m x^m

57
Nonlinear Regression
Given n data points (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), best fit y = f(x)
to the data, where f(x) is a nonlinear function of x.

Figure. Nonlinear regression model for discrete y vs. x data, with
residual y_i - f(x_i) at each point.

58
Regression
Exponential Model

59
Exponential Model
Given (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), best fit y = a e^{bx} to the data.

Figure. Exponential model of nonlinear regression for y vs. x data,
with residual y_i - a e^{b x_i} at each point.

60
Finding Constants of Exponential Model
The sum of the squares of the residuals is defined as

S_r = \sum_{i=1}^{n} \left( y_i - a e^{b x_i} \right)^2

Differentiate with respect to a and b:

\frac{\partial S_r}{\partial a} = \sum_{i=1}^{n} 2 \left( y_i - a e^{b x_i} \right) \left( -e^{b x_i} \right) = 0

\frac{\partial S_r}{\partial b} = \sum_{i=1}^{n} 2 \left( y_i - a e^{b x_i} \right) \left( -a x_i e^{b x_i} \right) = 0

61
Finding Constants of Exponential Model
Rewriting the equations, we obtain

-\sum_{i=1}^{n} y_i e^{b x_i} + a \sum_{i=1}^{n} e^{2 b x_i} = 0

\sum_{i=1}^{n} y_i x_i e^{b x_i} - a \sum_{i=1}^{n} x_i e^{2 b x_i} = 0

62
Finding constants of Exponential Model
Solving the first equation for a yields

a = \frac{\sum_{i=1}^{n} y_i e^{b x_i}}{\sum_{i=1}^{n} e^{2 b x_i}}

Substituting a back into the second equation gives

\sum_{i=1}^{n} y_i x_i e^{b x_i} - \frac{\sum_{i=1}^{n} y_i e^{b x_i}}{\sum_{i=1}^{n} e^{2 b x_i}} \sum_{i=1}^{n} x_i e^{2 b x_i} = 0

The constant b can be found through numerical methods such as the
bisection method.
63
Example 1-Exponential Model
Many patients get concerned when a test involves injection of a
radioactive material. For example, for scanning a gallbladder, a
few drops of Technetium-99m isotope are used. Half of the
Technetium-99m would be gone in about 6 hours. It, however,
takes about 24 hours for the radiation levels to reach what we
are exposed to in day-to-day activities. Below is given the
relative intensity of radiation as a function of time.

Table. Relative intensity of radiation as a function of time.

t (hrs)   0      1      3      5      7      9
γ         1.000  0.891  0.708  0.562  0.447  0.355

64
Example 1-Exponential Model cont.
The relative intensity is related to time by the equation

\gamma = A e^{\lambda t}

Find:
a) the values of the regression constants A and \lambda,
b) the half-life of Technetium-99m, and
c) the radiation intensity after 24 hours.

65
Plot of data

66
Constants of the Model

\gamma = A e^{\lambda t}

The value of \lambda is found by solving the nonlinear equation

f(\lambda) = \sum_{i=1}^{n} \gamma_i t_i e^{\lambda t_i} - \frac{\sum_{i=1}^{n} \gamma_i e^{\lambda t_i}}{\sum_{i=1}^{n} e^{2 \lambda t_i}} \sum_{i=1}^{n} t_i e^{2 \lambda t_i} = 0

after which A follows from

A = \frac{\sum_{i=1}^{n} \gamma_i e^{\lambda t_i}}{\sum_{i=1}^{n} e^{2 \lambda t_i}}
67
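The root-finding step can be sketched in Python with the bisection method mentioned in the slides. A minimal sketch on the radiation data (the bracket [-0.2, -0.05] is my own choice from inspecting the data; it is not part of the slides):

```python
import math

t = [0, 1, 3, 5, 7, 9]
g = [1.000, 0.891, 0.708, 0.562, 0.447, 0.355]

def f(lam):
    """Residual of the normal equation for the model gamma = A*exp(lam*t)."""
    s1 = sum(gi * ti * math.exp(lam * ti) for gi, ti in zip(g, t))
    s2 = sum(gi * math.exp(lam * ti) for gi, ti in zip(g, t))
    s3 = sum(math.exp(2 * lam * ti) for ti in t)
    s4 = sum(ti * math.exp(2 * lam * ti) for ti in t)
    return s1 - s2 / s3 * s4

# Bisection on a bracket where f changes sign
lo, hi = -0.2, -0.05
for _ in range(60):
    mid = (lo + hi) / 2
    if f(lo) * f(mid) <= 0:
        hi = mid
    else:
        lo = mid
lam = (lo + hi) / 2

A = (sum(gi * math.exp(lam * ti) for gi, ti in zip(g, t))
     / sum(math.exp(2 * lam * ti) for ti in t))
print(lam, A)  # about -0.1151 and 0.9998
```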
Constants of the Model

f(\lambda) = \sum_{i=1}^{n} \gamma_i t_i e^{\lambda t_i} - \frac{\sum_{i=1}^{n} \gamma_i e^{\lambda t_i}}{\sum_{i=1}^{n} e^{2 \lambda t_i}} \sum_{i=1}^{n} t_i e^{2 \lambda t_i} = 0

t (hrs)   0      1      3      5      7      9
γ         1.000  0.891  0.708  0.562  0.447  0.355
68
Constants of the Model

Solving the nonlinear equation f(\lambda) = 0 numerically gives

\lambda = -0.1151

69
Calculating the Other Constant
The value of A can now be calculated:

A = \frac{\sum_{i=1}^{6} \gamma_i e^{\lambda t_i}}{\sum_{i=1}^{6} e^{2 \lambda t_i}} = 0.9998

The exponential regression model then is

\gamma = 0.9998 \, e^{-0.1151 t}
70
Plot of data and regression curve

 = 0.9998 e −0.1151t

71
Relative Intensity After 24 hrs
The relative intensity of radiation after 24 hours is

\gamma = 0.9998 \, e^{-0.1151 (24)} = 6.3160 \times 10^{-2}

This result implies that only

\frac{6.3160 \times 10^{-2}}{0.9998} \times 100 = 6.317\%

of the radioactive intensity is left after 24 hours.
72
Polynomial Model

Given (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), best fit
y = a_0 + a_1 x + \dots + a_m x^m (m \le n - 1) to the data.

Figure. Polynomial model for nonlinear regression of y vs. x data,
with residual y_i - f(x_i) at each point.

73
Polynomial Model cont.
The residual at each data point is given by

E_i = y_i - a_0 - a_1 x_i - \dots - a_m x_i^m

The sum of the squares of the residuals then is

S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - \dots - a_m x_i^m \right)^2

74
Polynomial Model cont.
To find the constants of the polynomial model, we set the derivatives
with respect to each a_i, i = 0, 1, ..., m, equal to zero:

\frac{\partial S_r}{\partial a_0} = \sum_{i=1}^{n} 2 \left( y_i - a_0 - a_1 x_i - \dots - a_m x_i^m \right)(-1) = 0

\frac{\partial S_r}{\partial a_1} = \sum_{i=1}^{n} 2 \left( y_i - a_0 - a_1 x_i - \dots - a_m x_i^m \right)(-x_i) = 0

\vdots

\frac{\partial S_r}{\partial a_m} = \sum_{i=1}^{n} 2 \left( y_i - a_0 - a_1 x_i - \dots - a_m x_i^m \right)(-x_i^m) = 0

75
Polynomial Model cont.
These equations in matrix form are given by

\begin{bmatrix}
n & \sum x_i & \cdots & \sum x_i^m \\
\sum x_i & \sum x_i^2 & \cdots & \sum x_i^{m+1} \\
\vdots & \vdots & \ddots & \vdots \\
\sum x_i^m & \sum x_i^{m+1} & \cdots & \sum x_i^{2m}
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \vdots \\ \sum x_i^m y_i \end{bmatrix}

(all sums taken over i = 1, ..., n). The above equations are then solved
for a_0, a_1, ..., a_m.

76
Example 2-Polynomial Model
Regress the thermal expansion coefficient vs. temperature data to
a second-order polynomial.

Table. Data points for temperature vs. α
Temperature, T (°F)    Coefficient of thermal expansion, α (in/in/°F)
80                     6.47×10⁻⁶
40                     6.24×10⁻⁶
−40                    5.72×10⁻⁶
−120                   5.09×10⁻⁶
−200                   4.30×10⁻⁶
−280                   3.33×10⁻⁶
−340                   2.45×10⁻⁶

Figure. Data points for thermal expansion coefficient vs. temperature.
77
Example 2-Polynomial Model cont.
We are to fit the data to the polynomial regression model

\alpha = a_0 + a_1 T + a_2 T^2

The coefficients a_0, a_1, a_2 are found by differentiating the sum of the
squares of the residuals with respect to each variable and setting the
results equal to zero to obtain

\begin{bmatrix}
n & \sum T_i & \sum T_i^2 \\
\sum T_i & \sum T_i^2 & \sum T_i^3 \\
\sum T_i^2 & \sum T_i^3 & \sum T_i^4
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} \sum \alpha_i \\ \sum T_i \alpha_i \\ \sum T_i^2 \alpha_i \end{bmatrix}

(all sums taken over i = 1, ..., 7).

78
Example 2-Polynomial Model cont.
The necessary summations are as follows:

\sum_{i=1}^{7} T_i = -8.6000 \times 10^2

\sum_{i=1}^{7} T_i^2 = 2.5800 \times 10^5

\sum_{i=1}^{7} T_i^3 = -7.0472 \times 10^7

\sum_{i=1}^{7} T_i^4 = 2.1363 \times 10^{10}

\sum_{i=1}^{7} \alpha_i = 3.3600 \times 10^{-5}

\sum_{i=1}^{7} T_i \alpha_i = -2.6978 \times 10^{-3}

\sum_{i=1}^{7} T_i^2 \alpha_i = 8.5013 \times 10^{-1}

79
Example 2-Polynomial Model cont.
Using these summations, we can now calculate a_0, a_1, a_2 from

\begin{bmatrix}
7.0000 & -8.6000 \times 10^2 & 2.5800 \times 10^5 \\
-8.6000 \times 10^2 & 2.5800 \times 10^5 & -7.0472 \times 10^7 \\
2.5800 \times 10^5 & -7.0472 \times 10^7 & 2.1363 \times 10^{10}
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} 3.3600 \times 10^{-5} \\ -2.6978 \times 10^{-3} \\ 8.5013 \times 10^{-1} \end{bmatrix}

Solving the above system of simultaneous linear equations gives

\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} 6.0217 \times 10^{-6} \\ 6.2782 \times 10^{-9} \\ -1.2218 \times 10^{-11} \end{bmatrix}

The polynomial regression model is then

\alpha = a_0 + a_1 T + a_2 T^2 = 6.0217 \times 10^{-6} + 6.2782 \times 10^{-9} \, T - 1.2218 \times 10^{-11} \, T^2

80
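The normal-equations system in Example 2 can be assembled and solved programmatically. A minimal sketch using plain Gaussian elimination (no external libraries; the helper names are mine, not from the slides):

```python
def solve3(M, b):
    """Solve a small linear system M a = b by Gaussian elimination with pivoting."""
    n = len(b)
    A = [row[:] + [bi] for row, bi in zip(M, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]            # partial pivoting
        for r in range(col + 1, n):
            fac = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= fac * A[col][c]
    a = [0.0] * n
    for r in range(n - 1, -1, -1):                 # back substitution
        a[r] = (A[r][n] - sum(A[r][c] * a[c] for c in range(r + 1, n))) / A[r][r]
    return a

# Thermal-expansion data from Example 2
T = [80, 40, -40, -120, -200, -280, -340]
alpha = [6.47e-6, 6.24e-6, 5.72e-6, 5.09e-6, 4.30e-6, 3.33e-6, 2.45e-6]

# Build the normal equations for alpha = a0 + a1*T + a2*T^2
S = [sum(t ** k for t in T) for k in range(5)]      # S[k] = sum of T^k
M = [[S[0], S[1], S[2]], [S[1], S[2], S[3]], [S[2], S[3], S[4]]]
b = [sum(a * t ** k for a, t in zip(alpha, T)) for k in range(3)]
a0, a1, a2 = solve3(M, b)
print(a0, a1, a2)
```

The computed coefficients match the hand solution above to within rounding.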
Transformation of Data
Finding the constants of many nonlinear models requires solving
simultaneous nonlinear equations. For mathematical convenience,
the data for some of these models can be transformed. As shown in
the previous example, many chemical and physical processes are
governed by the equation

y = a e^{bx}

Taking the natural log of both sides yields

\ln y = \ln a + b x

Let z = \ln y and a_0 = \ln a (implying a = e^{a_0}), with a_1 = b.
We now have a linear regression model:

z = a_0 + a_1 x

81
Transformation of data cont.
Using linear model regression methods,

a_1 = \frac{n \sum_{i=1}^{n} x_i z_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} z_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}

a_0 = \bar{z} - a_1 \bar{x}

Once a_0 and a_1 are found, the original constants of the model are
recovered as

b = a_1, \quad a = e^{a_0}

82
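This transformation recipe amounts to a straight-line fit on (x, ln y). A minimal sketch (the function name is mine), applied to the radiation data that also appears in Example 3:

```python
import math

def exp_fit_via_log(x, y):
    """Fit y = a*exp(b*x) by linear regression on z = ln(y)."""
    n = len(x)
    z = [math.log(yi) for yi in y]
    sx, sz = sum(x), sum(z)
    sxx = sum(xi * xi for xi in x)
    sxz = sum(xi * zi for xi, zi in zip(x, z))
    a1 = (n * sxz - sx * sz) / (n * sxx - sx ** 2)
    a0 = sz / n - a1 * sx / n
    return math.exp(a0), a1      # a = e^{a0}, b = a1

t = [0, 1, 3, 5, 7, 9]
g = [1.000, 0.891, 0.708, 0.562, 0.447, 0.355]
A, lam = exp_fit_via_log(t, g)
print(A, lam)  # about 0.99974 and -0.11505
```

Note that this requires all y_i > 0; a zero or negative reading would make the log transform undefined.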
Example 3-Transformation of Data
Many patients get concerned when a test involves injection of a radioactive
material. For example, for scanning a gallbladder, a few drops of Technetium-
99m isotope are used. Half of the Technetium-99m would be gone in about 6
hours. It, however, takes about 24 hours for the radiation levels to reach what
we are exposed to in day-to-day activities. Below is given the relative intensity
of radiation as a function of time.

Table. Relative intensity of radiation as a function of time
t (hrs)   0      1      3      5      7      9
γ         1.000  0.891  0.708  0.562  0.447  0.355

Figure. Data points of relative radiation intensity vs. time.
83
Example 3-Transformation of Data cont.
Find:
a) the values of the regression constants A and \lambda,
b) the half-life of Technetium-99m, and
c) the radiation intensity after 24 hours.

The relative intensity is related to time by the equation

\gamma = A e^{\lambda t}

84
Example 3-Transformation of Data cont.
The exponential model is given as

\gamma = A e^{\lambda t}

\ln(\gamma) = \ln(A) + \lambda t

Assuming z = \ln \gamma, a_0 = \ln(A), and a_1 = \lambda, we obtain

z = a_0 + a_1 t

This is a linear relationship between z and t.

85
Example 3-Transformation of data cont.
Using this linear relationship, we can calculate a_0 and a_1, where

a_1 = \frac{n \sum_{i=1}^{n} t_i z_i - \sum_{i=1}^{n} t_i \sum_{i=1}^{n} z_i}{n \sum_{i=1}^{n} t_i^2 - \left( \sum_{i=1}^{n} t_i \right)^2}

and

a_0 = \bar{z} - a_1 \bar{t}

with

\lambda = a_1, \quad A = e^{a_0}
86
Example 3 - Transformation of data cont.
Summations for the transformed data, with n = 6, are as follows.

Table. Summation data for the transformation-of-data model
i    ti    γi       zi = ln γi    ti·zi        ti²
1    0     1.000     0.00000       0.0000       0.0000
2    1     0.891    −0.11541      −0.11541      1.0000
3    3     0.708    −0.34531      −1.0359       9.0000
4    5     0.562    −0.57625      −2.8813      25.000
5    7     0.447    −0.80520      −5.6364      49.000
6    9     0.355    −1.0356       −9.3207      81.000

Σ ti = 25.000,  Σ zi = −2.8778,  Σ ti·zi = −18.990,  Σ ti² = 165.00
87
Example 3 - Transformation of data cont.
Calculating a0 and a1:

a1 = [6(−18.990) − (25)(−2.8778)] / [6(165.00) − (25)²] = −0.11505

a0 = (−2.8778)/6 − (−0.11505)(25/6) = −2.6150×10⁻⁴

Since a0 = ln(A),
A = e^(a0) = e^(−2.6150×10⁻⁴) = 0.99974
Also,
λ = a1 = −0.11505
88
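The hand calculation above can be reproduced with a short Python check (a sketch using only the standard library; the variable names are choices made here):

```python
import math

# Data from the table in Example 3
t = [0.0, 1.0, 3.0, 5.0, 7.0, 9.0]
gamma = [1.000, 0.891, 0.708, 0.562, 0.447, 0.355]
n = len(t)

# Transform: z = ln(gamma), then ordinary linear least squares
z = [math.log(g) for g in gamma]
st, sz = sum(t), sum(z)
stz = sum(ti * zi for ti, zi in zip(t, z))
stt = sum(ti * ti for ti in t)

a1 = (n * stz - st * sz) / (n * stt - st * st)  # lambda
a0 = sz / n - a1 * st / n                       # ln(A)
A = math.exp(a0)
print(round(a1, 5), round(A, 5))                # ≈ -0.11505 0.99974
```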
Example 3 - Transformation of data cont.
The resulting model is
γ = 0.99974·e^(−0.11505t)

[Figure. Relative intensity of radiation γ as a function of time t (hours),
using the transformation-of-data model.]
89
Example 3 - Transformation of data cont.
The regression formula is then
γ = 0.99974·e^(−0.11505t)

b) The half-life of Technetium-99m is the time at which γ = (1/2)·γ|t=0 :
0.99974·e^(−0.11505t) = (1/2)·(0.99974)·e^(−0.11505(0))
e^(−0.11505t) = 0.5
−0.11505·t = ln(0.5)
t = 6.0248 hours
90
Example 3 - Transformation of data cont.
c) The relative intensity of radiation after 24 hours is then
γ = 0.99974·e^(−0.11505(24)) = 0.063200
This implies that only (6.3200×10⁻²/0.99974) × 100 = 6.3216% of the
radioactive material is left after 24 hours.
91
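Parts (b) and (c) follow directly from the fitted constants; here is a quick Python check (using the rounded constants from the slides, so the last digits can differ slightly from the slide values):

```python
import math

A, lam = 0.99974, -0.11505   # regression constants from Example 3

# b) Half-life: solve A*exp(lam*t) = 0.5*A  =>  t = ln(0.5)/lam
t_half = math.log(0.5) / lam
print(round(t_half, 3))      # ≈ 6.025 hours

# c) Fraction of radioactive material left after 24 hours, in percent
pct_left = 100.0 * math.exp(lam * 24.0)
print(round(pct_left, 3))    # ≈ 6.322 percent
```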
Comparison
Comparison of the exponential model with and without data transformation:

Table. Comparison for the exponential model with and without data transformation.
                                   With data          Without data
                                   transformation     transformation
                                   (Example 3)        (Example 1)
A                                   0.99974            0.99983
λ                                  −0.11505           −0.11508
Half-life (hrs)                     6.0248             6.0232
Relative intensity after 24 hrs     6.3200×10⁻²        6.3160×10⁻²
92
Power Function
The power function equation describes many scientific and engineering
phenomena:
y = a·x^b

The linearization of the data is as follows:
ln(y) = ln(a) + b·ln(x)

The resulting equation shows a linear relation between ln(y) and ln(x).

Let
z = ln(y)
w = ln(x)
a0 = ln(a), implying a = e^(a0)
a1 = b
93
Power Function
We get
z = a0 + a1·w

a1 = (n·Σ wi zi − Σ wi · Σ zi) / (n·Σ wi² − (Σ wi)²),  sums over i = 1, …, n

a0 = (Σ zi)/n − a1·(Σ wi)/n

Since a0 and a1 can be found, the original constants of the model are
b = a1
a = e^(a0)
94
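A power-function fit can be sketched the same way (a minimal sketch; the helper name `fit_power` and its list-based interface are assumptions made here):

```python
import math

def fit_power(x, y):
    """Fit y = a*x**b by linear least squares on z = ln(y) vs w = ln(x).

    Requires all x and y values to be positive.
    Returns the model constants (a, b).
    """
    n = len(x)
    w = [math.log(xi) for xi in x]
    z = [math.log(yi) for yi in y]
    sw, sz = sum(w), sum(z)
    swz = sum(wi * zi for wi, zi in zip(w, z))
    sww = sum(wi * wi for wi in w)
    a1 = (n * swz - sw * sz) / (n * sww - sw * sw)  # exponent b
    a0 = sz / n - a1 * sw / n                       # ln(a)
    return math.exp(a0), a1
```

For data generated exactly from y = 3·x², the helper returns a = 3 and b = 2 up to floating-point error.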
Growth Model
❑ Growth models are used to describe how something grows with changes in a
regressor variable (often time). Examples in this category include the
growth of thin films or of a population with time.

❑ The logistic growth model is an example of a growth model in which a
measurable quantity y varies with some quantity x as:

y = a·x/(b + x)

For x = 0, y = 0, while as x → ∞, y → a.

To linearize the data for this model,

1/y = (b + x)/(a·x) = (b/a)·(1/x) + 1/a
95
Growth Model
Let
z = 1/y
w = 1/x
a0 = 1/a, implying that a = 1/a0
a1 = b/a, implying b = a1·a = a1/a0

Then
z = a0 + a1·w
96
Growth Model
The relationship between z and w is linear, with the coefficients a0 and a1
found as follows:

a1 = (n·Σ wi zi − Σ wi · Σ zi) / (n·Σ wi² − (Σ wi)²),  sums over i = 1, …, n

a0 = (Σ zi)/n − a1·(Σ wi)/n

Finding a0 and a1 then gives the constants of the original growth model as
a = 1/a0
b = a1/a0
97
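The growth-model linearization can also be sketched in Python (a minimal sketch; the helper name `fit_growth` is an assumption, and all x and y values are assumed nonzero):

```python
def fit_growth(x, y):
    """Fit the growth model y = a*x/(b + x) by linear least squares
    on z = 1/y versus w = 1/x, where z = a0 + a1*w with a0 = 1/a
    and a1 = b/a. Returns the model constants (a, b).
    """
    n = len(x)
    w = [1.0 / xi for xi in x]
    z = [1.0 / yi for yi in y]
    sw, sz = sum(w), sum(z)
    swz = sum(wi * zi for wi, zi in zip(w, z))
    sww = sum(wi * wi for wi in w)
    a1 = (n * swz - sw * sz) / (n * sww - sw * sw)
    a0 = sz / n - a1 * sw / n
    return 1.0 / a0, a1 / a0     # a = 1/a0, b = a1/a0
```

For data generated exactly from y = 5x/(2 + x), the helper recovers a = 5 and b = 2.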