Numerical Analysis Guide
CHAPTER 1
CHAPTER 2
NON-LINEAR FUNCTIONS
APPROXIMATION TO FUNCTIONS
CHAPTER 3
FINITE DIFFERENCE # 2
CHAPTER 4
NUMERICAL SYSTEM TO LINEAR SYSTEM
LU decomposition method
Gauss-Seidel method
Curve fitting
CHAPTER 5
MATHEMATICAL MODELLING
First-Order Differential Equations and Applications
Difference Equations and Applications
REFERENCES
SECTION A
CHAPTER 1
INTRODUCTION
This is a scientific area that refers to the study and development of algorithms and software
for manipulating mathematical expressions and other mathematical objects.
The study of algorithms has an ancient pedigree. The study of computers as we know them
may date back a hundred years or so, but the study of algorithms dates back a millennium or more.
In fact the word algorithm is derived from the name of the great Islamic Mathematician,
Astronomer, Geographer and all-round polymath, Muhammad ibn Musa al-Khwarizmi, who
was a member of Dar Al-Hikmah (the House of Knowledge) in Baghdad in the 800s.
Algorithms are just precise ways of achieving some task. Follow the algorithm and the job is
done. It may be a computer following the algorithm, but it doesn't have to be. For most of
history it has been people doing so. Al-Khwarizmi was interested in algorithms for solving
algebraic equations and for calculating with our "modern" Hindu-Arabic positional number
system, which he introduced to the western world. In the 9th-century Islamic world, having ways
to do things such as calculating shares in an inheritance was an important requirement of the
Qur'an. It was vital to be sure you had a way of calculating such things that was guaranteed to
get the right answer. That is what the study of algorithms is about, though algorithms can be
devised to do much more than simple algebra. Every computer gadget you have ever used is
following algorithms to do whatever it does.
Computer Scientists both invent algorithms and study their properties. Algorithms have been
devised to beat humans at games, fly planes, recognize faces, process DNA, send money
around the world, crack codes, navigate you home, control your washing machine, detect
your movements, write down the words you speak, paint works of art, write jokes, control
nuclear power plants ... You name it. Any individual program, in fact, will involve a whole
range of algorithms, some simple, some complex.
Software is the programs and other operating information used by a computer. Here it relates to
the use of machines, such as computers, to manipulate mathematical equations.
How computers communicate
Most computers use ASCII code to represent text (words, sentences and paragraphs), which
makes it possible to transfer data from one computer to another. ASCII is an
acronym for American Standard Code for Information Interchange (it is a character set, not
a language). It represents English characters as numbers, with each character being assigned a
number from 0 to 127. For example, the ASCII code for upper case M is 77.
Computers can only understand numbers, so an ASCII code is the numerical representation of
a character such as 'a' or '@' or an action of some sort. ASCII was developed a long time ago
and now the non-printing characters are rarely used for their original purpose. Below is the
ASCII character table and this includes descriptions of the first 32 non-printing characters.
ASCII was actually designed for use with teletypes and so the descriptions are somewhat
obscure. If, however, someone says they want your CV in ASCII format, all this means is they
want 'plain' text with no formatting such as tabs, bold or underscoring - the raw format that
any computer can understand. This is usually so they can easily import the file into their own
applications without issues. Notepad.exe creates ASCII text, or in MS Word you can save a
file as 'text only'.
ASCII codes allow the symbols on the keyboard to be printed on
the screen. All letters, digits, punctuation symbols and many other things are given
codes. As an example, the ASCII code for "a" is 97 and for "A" is 65. If you open Notepad you
can demonstrate this. First make sure that Num Lock is on. Hold down the Alt key and,
keeping it held down, type 65 on the numeric keypad (the one on the right of your
keyboard). Let go of the Alt key. An "A" should appear in Notepad. Try this for other
numbers between 0 and 255; you will be able to get characters and symbols that are not on
your keyboard.
The software in your computer is able to recognise what each key is defined to do. If you
change this software, you can have your keyboard give different
characters from the ones printed on its keys.
ASCII has a limited set of characters and cannot support the Chinese and Japanese languages,
as they have thousands of different characters. To use Chinese and Japanese characters you
need a character set called Unicode, which is available on modern computers. Another code is EBCDIC,
an acronym for Extended Binary Coded Decimal Interchange Code. EBCDIC does not
store the letters in one contiguous sequence, and this can create problems when alphabetising "words". EBCDIC
is an IBM-designed code for representing characters as numbers. IBM is an acronym for
International Business Machines, a leading U.S. computer manufacturer.
The disadvantage of ASCII is that it is biased towards the English language, so text in many other
languages cannot be written in ASCII.
Storage of characters
When you write 123 using a text editor, the file does not store the number 123; instead it stores the ASCII
codes for the characters "1", "2" and "3", which are 31, 32, 33 in hexadecimal or 0011 0001, 0011
0010, 0011 0011 in binary.
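The stored codes can be inspected directly. The following is a minimal Python sketch (the variable names are illustrative and not part of the text) that lists the decimal, hexadecimal and binary ASCII codes of the characters in "123", using only the built-in `ord` and `format` functions.

```python
# Illustrative sketch: show how the text "123" is stored as ASCII codes.
text = "123"

# For each character keep: the character, its decimal code,
# its two-digit hexadecimal code and its 8-bit binary code.
codes = [(ch, ord(ch), format(ord(ch), "02X"), format(ord(ch), "08b")) for ch in text]

for ch, dec, hx, bn in codes:
    # Split the 8 bits into two nibbles for readability.
    print(f"{ch!r}: decimal {dec}, hex {hx}, binary {bn[:4]} {bn[4:]}")
```

The same two functions can be used to check any other entry, for example the code of upper case M.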
ASCII TABLE
Decimal  Binary  Octal  Hex  Character
1        0001    1      1    SOH
2        0010    2      2    STX
3        0011    3      3    ETX
4        0100    4      4    EOT
5        0101    5      5    ENQ
6        0110    6      6    ACK
7        0111    7      7    BEL
8        1000    10     8    BS
9        1001    11     9    TAB
10       1010    12     A    LF
11       1011    13     B    VT
12       1100    14     C    FF
13       1101    15     D    CR
14       1110    16     E    SO
15       1111    17     F    SI
Each binary value in the above table is a block of 4 bits called a nibble (half a byte), which can hold a
maximum value of 1111 = 15 in decimal.
The following table uses a block of 8 bits called a byte, which can hold a maximum value of
11111111 = 255 in decimal.
Extended ASCII Codes
Most computers manipulate data in 8-bit bytes, one for each character. The characters from 0 to 127
used to be stored as 7 bits. A bit is a single numeric value, either "1" or "0", and a byte is a
sequence of bits, usually 8 bits = 1 byte (e.g. 11000000 = 1 byte). Since Extended ASCII
was introduced, adding 128 extra characters numbered from 128 to 255, each character has been
stored as 8 bits. The first 32 characters are control codes, which Microsoft Word does not
display on screen and which are non-printable.
The word "bit" is a contraction of "binary digit". The easiest way to understand
bits is to compare them to something you know, namely digits.
A digit is a single place that can hold a numerical value between 0 and 9. Computers
use the base 2 number system, also known as the binary system, because it is easy to
implement with current electronic technology and is relatively cheap compared
to base 10.
Digits are normally combined to create larger numbers. For example, 5,357 has
four digits. It is understood that 7 fills the units place, 5 the tens
place, 3 the hundreds place and 5 the thousands place:
\[ 5357 = 5 \times 10^3 + 3 \times 10^2 + 5 \times 10^1 + 7 \times 10^0 \]
The above equation is a polynomial of degree 3 in the base 10, where "5", "3",
"5" and "7" are the coefficients and "10" is the base of the polynomial.
Bits have only two possible values, 0 and 1; therefore a binary number is composed of only
0s and 1s, like 1011. How do you figure out what the value of the binary number is?
We do it in a similar way as we did for 5,357, but this time we use base 2 instead of base 10. Now
we have:
\[ (1011)_2 = 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0 = 11 \]
I was recently asked this question by someone who knows a good deal about computers. But
this question is also often asked by people who aren't so tech-savvy. Either way, the answer is
quite simple.
WHAT IS "DIGITAL"?
A modern-day "digital" computer, as opposed to an older "analog" computer, operates on the
principle of two possible states of something – "on" and "off". This directly corresponds to
there either being an electrical current present, or said electrical current being absent. The
"on" state is assigned the value "1", while the "off" state is assigned the value "0".
The term "binary" implies "two". Thus, the binary number system is a system of numbers
based on two possible digits – 0 and 1. This is where the strings of binary digits come in.
Each binary digit, or "bit", is a single 0 or 1, which directly corresponds to a single "switch"
in a circuit. Add enough of these "switches" together, and you can represent more numbers.
So instead of 1 digit, you end up with 8 to make a byte. (A byte, the basic unit of storage, is
simply defined as 8 bits; the well-known kilobytes, megabytes, and gigabytes are derived
from the byte, and each is 1,024 times as big as the one before it. There is a 1024-fold difference as
opposed to a 1000-fold difference because 1024 is a power of 2 but 1000 is not.)
Increasing the base decreases the number of digits required to represent any given
number but, following directly from the previous point, it is not currently practical to build a digital circuit
that operates in any base other than 2, since there is no reliable state between "on" and "off" (unless
you get into quantum computers... more on this later).
NON-BINARY COMPUTERS
Imagine a computer based on base-10 numbers. Then, each "switch" would have 10 possible
states. These can be represented by the digits (known as "bans" or "dits", meaning "decimal
digits") 0 through 9. In this system, numbers would be represented in base 10. This is not
possible with regular electronic components of today, but it is theoretically possible on a
quantum level.
Is this system more efficient? Assuming the "switches" of a standard binary computer take up
the same amount of physical space (nanometers) as these base-10 switches, the base-10
computer would be able to fit considerably more processing power into the same physical
space. So the question of binary being "inefficient" does have some validity in
theory, but not in practical use today.
Full answer: We only use binary because we currently do not have the technology to create
"switches" that can reliably hold more than two possible states. (Quantum computers aren't
exactly on sale at the moment.) The binary system was chosen only because it is quite easy to
distinguish the presence of an electric current from an absence of electric current, especially
when working with trillions of such connections. And using any other number base in this
system would be ridiculous, because the system would need to constantly convert between them. That's
all there is to it.
DECIMAL NUMBERS
The numbers commonly used in scientific work are the so-called real numbers. The basic
arithmetic operations performed by a computer are addition, subtraction, division
and multiplication. These real numbers are first converted into machine language,
which consists of 0s and 1s (the binary system).
\[ N = a_3 \beta^3 + a_2 \beta^2 + a_1 \beta^1 + a_0 \beta^0 \]
where \( a_3 = 9 \) is called the leading coefficient and \( \beta \) is called the base of the
polynomial.
BINARY SYSTEM
\[ N = (a_n a_{n-1} a_{n-2} \dots a_1 a_0)_2 \]
\[ (11100)_2 = 1 \times 2^4 + 1 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 0 \times 2^0 = 28 \]
HORNER’S ALGORITHM
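\[ \begin{aligned}
b_n &= a_n \\
b_{n-1} &= a_{n-1} + b_n z \\
b_{n-2} &= a_{n-2} + b_{n-1} z \\
b_{n-3} &= a_{n-3} + b_{n-2} z \\
&\;\;\vdots \\
b_0 &= a_0 + b_1 z
\end{aligned} \]
For the coefficients 1, 1, 0, 1, i.e. \( (1101)_2 \), with \( z = 2 \):
\[ \begin{aligned}
b_3 &= 1 \\
b_2 &= 1 + 2 \times 1 = 3 \\
b_1 &= 0 + 2 \times 3 = 6 \\
b_0 &= 1 + 2 \times 6 = 13
\end{aligned} \]
For the coefficients 1, 0, 0, 0, 0, i.e. \( (10000)_2 \), with \( z = 2 \):
\[ \begin{aligned}
b_4 &= 1 \\
b_3 &= 0 + 2 \times 1 = 2 \\
b_2 &= 0 + 2 \times 2 = 4 \\
b_1 &= 0 + 2 \times 4 = 8 \\
b_0 &= 0 + 2 \times 8 = 16
\end{aligned} \]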
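The recurrence above can be sketched in a few lines of Python. This is a minimal illustration, not the text's own program; the function name `horner` and the convention of passing the coefficients from \( a_n \) down to \( a_0 \) are assumptions made here.

```python
def horner(coeffs, z):
    """Evaluate a_n z^n + ... + a_1 z + a_0 by Horner's algorithm.

    coeffs lists the coefficients from a_n down to a_0.
    """
    b = 0
    for a in coeffs:
        # One step of the recurrence b_k = a_k + b_{k+1} * z.
        # Starting from b = 0 makes the first step give b_n = a_n.
        b = a + b * z
    return b


# Binary digits are the coefficients and z = 2 is the base.
print(horner([1, 1, 0, 1], 2))      # value of (1101)_2
print(horner([1, 0, 0, 0, 0], 2))   # value of (10000)_2
print(horner([1, 1, 1, 0, 0], 2))   # value of (11100)_2
```

Each pass through the loop costs one multiplication and one addition, so a number with n + 1 digits needs only n multiplications.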
Converting a decimal integer N into its binary equivalent can also be accomplished
by Horner’s algorithm.
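Example: Convert \( (156)_{10} \) into the binary system using Horner's algorithm.
\[ (1)_{10} = (1)_2, \quad (5)_{10} = (101)_2, \quad (6)_{10} = (110)_2, \quad (10)_{10} = (1010)_2 \]
With \( \beta = (1010)_2 \),
\[ \begin{aligned}
b_2 &= a_2 = (1)_2 \\
b_1 &= a_1 + b_2 \beta = (101)_2 + (1)_2 \times (1010)_2 = (1111)_2 \\
b_0 &= a_0 + b_1 \beta = (110)_2 + (1111)_2 \times (1010)_2 = (10011100)_2
\end{aligned} \]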
OCTAL SYSTEM
\[ N = (a_n a_{n-1} a_{n-2} \dots a_1 a_0)_8 = a_n 8^n + a_{n-1} 8^{n-1} + a_{n-2} 8^{n-2} + \dots + a_1 8^1 + a_0 8^0 \]
a) \( (9978)_{10} \)   b) \( (998)_{10} \)
Solution:
\[ (9)_{10} = (11)_8, \quad (8)_{10} = (10)_8, \quad (7)_{10} = (7)_8, \quad (10)_{10} = (12)_8 \]
\[ (9978)_{10} = (11)_8 \times (12)_8^3 + (11)_8 \times (12)_8^2 + (7)_8 \times (12)_8^1 + (10)_8 \times (12)_8^0 \]
where \( \beta = (12)_8 \).
\[ \begin{aligned}
b_3 &= a_3 = (11)_8 \\
b_2 &= a_2 + b_3 \beta = (11)_8 + (11)_8 \times (12)_8 = (143)_8 \\
b_1 &= a_1 + b_2 \beta = (7)_8 + (143)_8 \times (12)_8 = (1745)_8 \\
b_0 &= a_0 + b_1 \beta = (10)_8 + (1745)_8 \times (12)_8 = (23372)_8
\end{aligned} \]
\[ (23372)_8 = 8192 + 1536 + 192 + 56 + 2 = (9978)_{10} \]
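A result such as \( (23372)_8 \) can be checked independently. The sketch below uses repeated division by the target base, which is a different technique from the Horner scheme used in the solution above and is offered only as a cross-check; the function name `to_base` and the digit alphabet are assumptions made for this illustration.

```python
def to_base(n, base):
    """Return the digits of the non-negative integer n in the given base (2..16).

    Uses repeated division: each remainder is the next digit,
    starting from the least significant one.
    """
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, r = divmod(n, base)
        digits.append("0123456789ABCDEF"[r])
    return "".join(reversed(digits))


print(to_base(9978, 8))   # octal form of (9978)_10
print(to_base(156, 2))    # binary form of (156)_10
```

Converting back with Python's built-in `int("23372", 8)` gives a second check in the opposite direction.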
EXERCISES:
g) ( 111011110100011 )2
algorithm:
algorithm:
1. COMPUTER ARITHMETIC AND ERRORS
Since we are usually interested in the magnitude or absolute value of the error we
can also define
\[ \text{ABSOLUTE ERROR} = |\tilde{A} - A| \]
\[ |\tilde{A} - A| < E \]
In practice we are often more interested in so-called 'relative error' than absolute
error and we define,
\[ \text{RELATIVE ERROR} = \frac{|\tilde{A} - A|}{|A|} \]
answer = 1000,       error = 10^{-5}:  very good
answer = 1,          error = 10^{-5}:  good
answer = 10^{-5},    error = 10^{-5}:  very bad
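The three cases above can be reproduced numerically. The following is a small Python sketch under the definitions just given; the function names `absolute_error` and `relative_error` are chosen here for illustration.

```python
def absolute_error(approx, exact):
    """|A~ - A|: the magnitude of the error."""
    return abs(approx - exact)


def relative_error(approx, exact):
    """|A~ - A| / |A|: the error measured against the size of the answer."""
    return abs(approx - exact) / abs(exact)


# The same absolute error of 1e-5 applied to answers of very different size.
for exact in (1000.0, 1.0, 1e-5):
    approx = exact + 1e-5
    print(exact, absolute_error(approx, exact), relative_error(approx, exact))
```

The absolute error is the same in every row, while the relative error grows from about one part in 10^8 to about 100%, which is why the last case is rated "very bad".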
1. Human error
2. Truncation error
3. Rounding error
Arithmetic error
Programming error
These errors can be very hard to detect unless they give obviously incorrect
solutions. In discussing errors, we shall assume that human errors are not present.
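\[ e^x = 1 + x + \frac{x^2}{2!} + \dots + \frac{x^n}{n!} + \dots \]
\[ f = 1 + 0.1 + \frac{(0.1)^2}{2!} + \frac{(0.1)^3}{3!} + \dots \]
For this calculation, the truncation error TE (i.e. the sum of the terms that
have been chopped off) is,
\[ TE = \tilde{f} - f = -\frac{(0.1)^5}{5!} - \frac{(0.1)^6}{6!} - \dots \]
The numerical analyst might try and estimate the size of the truncation error,
i.e. |TE|. In this example, we can easily get a rough estimate.
\[ \begin{aligned}
|TE| &= \frac{(0.1)^5}{5!} \left( 1 + \frac{0.1}{6} + \frac{(0.1)^2}{6 \times 7} + \frac{(0.1)^3}{6 \times 7 \times 8} + \dots \right) \\
&\le \frac{(0.1)^5}{5!} \left( 1 + 0.1 + \frac{(0.1)^2}{1 \times 2} + \frac{(0.1)^3}{1 \times 2 \times 3} + \dots \right) \\
&\le \frac{(0.1)^5}{5!} e^{0.1} \cong \frac{0.00001}{120} \times 1.105 \approx 10^{-7}
\end{aligned} \]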
∴ the error in truncating to five terms is approximately \( 10^{-7} \) at x = 0.1.
\[ \frac{(0.1)}{1!} = 0.1 \]
\[ \frac{(0.1)^2}{2!} = 0.005 \]
\[ \frac{(0.1)^3}{3!} = 0.000166\dot{6} \]
\[ \frac{(0.1)^4}{4!} = 0.00000416\dot{6} \]
\[ \text{summing above} = 1.10517083\dot{3} = \tilde{f} \]
The exact answer to the truncated problem, \( \tilde{f} \), is an infinite string of digits and,
as such, is not very useful. Since we know that it is in error in the seventh
decimal place we could round it to six or seven decimal places. For example,
rounding to six decimal places gives,
\[ \tilde{f} \cong 1.105171 = \hat{f} \]
where \( \hat{f} \) denotes the rounded value and the usual rounding process has been adopted; namely, if the next figure
is 0, 1, 2, 3 or 4 round down; 5, 6, 7, 8 or 9 round up. The difference between \( \hat{f} \) and \( \tilde{f} \),
\[ \hat{f} - \tilde{f} = 0.00000016\dot{6} = RE \]
is the rounding error RE. Using the usual rounding process (and rounding to six
decimal places) the rounding error is always bounded by \( \tfrac{1}{2} \times 10^{-6} \). Thus, in
computing the answer,
\[ e^{0.1} \approx \hat{f} = 1.105171 \]
two errors are present and we have,
\[ ERROR = \hat{f} - f = (\hat{f} - \tilde{f}) + (\tilde{f} - f) = RE + TE \]
\[ |ERROR| \le |RE| + |TE| \approx \tfrac{1}{2} \times 10^{-6} + 10^{-7} \approx \tfrac{1}{2} \times 10^{-6} \]
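The two error contributions in this example can be computed directly. The sketch below is illustrative only: it takes Python's `math.exp` as the "exact" value of \( e^{0.1} \), which is itself a floating point approximation, and the variable names are assumptions made here.

```python
import math

x = 0.1

# First five terms of the series for e^x: 1, x, x^2/2!, x^3/3!, x^4/4!.
terms = [x**k / math.factorial(k) for k in range(5)]
f_trunc = sum(terms)              # truncated value (f with a tilde in the text)
f_exact = math.exp(x)             # taken as the exact value f

te = f_trunc - f_exact            # truncation error TE
f_round = round(f_trunc, 6)       # truncated value rounded to six decimals
re = f_round - f_trunc            # rounding error RE
total = f_round - f_exact         # total error, equal to RE + TE

print(f"truncated sum    = {f_trunc:.10f}")
print(f"truncation error = {te:.3e}")
print(f"rounded value    = {f_round}")
print(f"rounding error   = {re:.3e}")
print(f"total error      = {total:.3e}")
```

The truncation error is negative (terms were dropped) while the rounding error here is positive, so the two partly cancel; the bound on |ERROR| ignores this and adds the magnitudes.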
Computers allocate a fixed amount of storage to every number they use. Each
number is stored as a string of digits. In practice, so-called floating point
numbers are used. A computer using four digit decimal arithmetic with floating
point numbers would store
\[ 14.211 \text{ as } (0.1421, 2) = 0.1421 \times 10^2 \]
The number pair (p, q) is called a floating point number. p is called the
MANTISSA (or REAL PART) and q is the CHARACTERISTIC (or INDEX or
EXPONENT). The mantissa always has a fixed number of digits and the index
must lie in some range, typically \( -256 \le q \le 256 \).
If the INDEX goes outside that range then we get underflow (less than -256) or
overflow (greater than 256). Some computers/systems automatically replace
underflow by the special number 0 (zero). Overflow always gives some sort of
error.
We note that the mantissa is always of the form 0. … and the digit after the
decimal point is always non-zero. Thus, in the third example above, 0.03 is
stored as (0.3000, -1) and not as (0.0300, 0). We also note that there is no
representation of zero. A computer normally has some special representation
for this number. We further note, as in the fourth example, that the
representation may not be exact.
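The (mantissa, exponent) representation described above can be imitated with Python's standard `decimal` module. This is a hedged sketch of a hypothetical four digit decimal machine, not a description of how real hardware stores numbers; the function name `to_floating` and the choice of round-half-up are assumptions made here.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_floating(x, digits=4):
    """Return (mantissa, exponent) with x ~ mantissa * 10**exponent,
    where 0.1 <= |mantissa| < 1 and the mantissa has `digits` decimals."""
    d = Decimal(str(x))
    if d == 0:
        raise ValueError("zero has no normalised (mantissa, exponent) form")
    quantum = Decimal(1).scaleb(-digits)        # e.g. 0.0001 for four digits
    exponent = d.adjusted() + 1                 # places the point before the first digit
    mantissa = d.scaleb(-exponent).quantize(quantum, rounding=ROUND_HALF_UP)
    if abs(mantissa) == 1:
        # Rounding carried over (e.g. 0.99996 -> 1.0000): renormalise.
        exponent += 1
        mantissa = mantissa.scaleb(-1).quantize(quantum, rounding=ROUND_HALF_UP)
    return mantissa, exponent


print(to_floating(14.211))
print(to_floating(0.03))
```

Zero is rejected explicitly, mirroring the remark that the normalised form has no representation of zero and a special one is needed.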
Rounding errors are therefore always present since we can never be certain that
a computation has been done exactly. For example, a computer working with
four digit, decimal, floating point arithmetic with,
would compute
\[ A + B \to (+0.4958, 3) \ne A + B \]
\[ A \times B \to (+0.1542, 5) \ne A \times B \]
\[ a = 0.642136 \quad \left( \text{accurate to } \tfrac{1}{2} \times 10^{-6} \right) \]
\[ b = 0.642125 \quad \left( \text{accurate to } \tfrac{1}{2} \times 10^{-6} \right) \]
Then,
\[ a - b = 0.000011 \]
The relative error in a - b is about 10% and is therefore unacceptably large when
the relative errors in a and b are only about 0.0001%. Moreover, if the errors in the data,
a and b, were \( \tfrac{1}{2} \times 10^{-4} \) then the answer would be meaningless!
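The loss of accuracy in this subtraction can be quantified with a few lines of Python. The sketch assumes the worst case in which the two data errors of \( \tfrac{1}{2} \times 10^{-6} \) add; the variable names are illustrative.

```python
a, b = 0.642136, 0.642125
data_error = 0.5e-6                 # each value is accurate to half a unit in the 6th decimal

diff = a - b                        # about 0.000011
worst_abs_error = 2 * data_error    # worst case: the two errors reinforce each other

rel_error_a = data_error / abs(a)
rel_error_diff = worst_abs_error / abs(diff)

print(f"relative error in a     : {rel_error_a:.1e}")
print(f"relative error in a - b : {rel_error_diff:.1e}")
```

The absolute error has barely changed, but it is now being compared with a result five orders of magnitude smaller than the data, which is what makes the relative error so large.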
1.5 Examples
Example 1
Using 3 digit floating point arithmetic find the answer of the calculation,
\[ M = \frac{a + b * c}{b + c} \]
Solution
The representations of a, b and c as three digit floating point numbers are:
\[ M := (+0.982, +1), \quad \text{rounding error} = +0.00018 \]
Thus the computed answer is 9.82. The exact answer is 9.8812. Hence, the total
effect of rounding error, i.e. the computed value minus the exact value, is -0.06.
EXERCISES
1. If the exact answer is \( A \) and the computed answer is \( \tilde{A} \), find the absolute
and relative error when
a) \( A = 10.147, \ \tilde{A} = 10.159 \)
b) \( A = 0.0047, \ \tilde{A} = 0.0045 \)
c) \( A = 0.671 \times 10^{12}, \ \tilde{A} = 0.669 \times 10^{12} \)
2. Let \( a = 0.471 \times 10^{-2} \) and \( b = -0.185 \times 10^{-4} \). Use 3 digit floating point arithmetic to
compute \( a + b \), \( a - b \), \( a * b \) and \( a / b \). Find the rounding error in each case.
CHAPTER 2
and all students will be familiar with the formula for its roots:
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]
The formula for the roots of a general cubic is somewhat more complicated and
that for a general quartic usually takes several pages to describe! We are spared
further effort by a theorem which states that there is no such formula for general
polynomials of degree higher than four. Accordingly, except in special cases (for
example, when factorization is easy), we prefer in practice to use a numerical
method to solve polynomial equations of degree higher than two.
Useful analytic solutions of such equations are rare so that we are usually forced to
use numerical methods.
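1. A transcendental equation
\[ 2\theta - 2\sin\theta\cos\theta = \frac{\pi}{2} \quad \text{or} \quad x + \cos x = 0, \quad \text{where } x = \frac{\pi}{2} - 2\theta \]
FIGURE 2.
Cylindrical tank (cross-section).
\[ f(x) = x + \cos x = 0 \]
we obtain h from \( h = OB - OD = r - r\cos\theta = r\left[ 1 - \cos\left( \frac{\pi}{4} - \frac{x}{2} \right) \right] \)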
2. Locating roots
Let it be required to find some or all of the roots of the nonlinear equation f(x) = 0.
Before we use a numerical method (the bisection method, the method of false
position, the Newton-Raphson method or the method of simple iteration), we should
have some idea about the number, nature and approximate location of the
roots. The usual approach involves the construction of graphs and perhaps a
table of values of the function f, in order to confirm the information
obtained from the graph.
a) \( \sin x - x + 0.5 = 0 \)
we can separate f into two parts, sketch two curves on a single set of axes,
and find out whether they intersect. Thus we sketch
\( y = \sin x \) and \( y = x - 0.5 \). Since \( |\sin x| \le 1 \), we are only interested in the interval
\( -0.5 \le x \le 1.5 \) (outside which |x - 0.5| > 1). Thus we deduce from Fig. 3 that the
equation has only one real root, near x = 1.5, and tabulate as follows:
We now know that the root lies between 1.49 and 1.50, and we can use a
Steps.
In order to sketch the second curve, we use the three obvious zeros at x = 0,
2, and 3, as well as the knowledge that x(x - 2)(x - 3) is negative for x < 0
and 2 < x < 3, but positive and increasing steadily for x > 3. We deduce from
the graph (Fig. 4) that there are three real roots, near x = 0.2, 1.8, and 3.1,
and tabulate as follows (with \( f(x) = e^{-0.2x} - x(x - 2)(x - 3) \)):
We conclude that the roots lie between 0.15 and 0.2, 1.6 and 1.8, and 3.1
and 3.2, respectively. Note that the values in the table were calculated to an
accuracy of at least 5S. For example, working to 5S accuracy, we have
f(0.15) = 0.97045 - 0.79088 = 0.17957, which is then rounded to 0.1796. Thus
the entry in the table for f(0.15) is 0.1796 and not 0.1795 as one might
expect from calculating 0.9704 - 0.7909.
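The tabulation step can be automated by scanning a grid for sign changes of f. The following is a rough Python sketch for the function \( f(x) = e^{-0.2x} - x(x-2)(x-3) \); the function name `find_brackets`, the scan interval [-1, 4] and the grid of 100 steps are all assumptions made for this illustration, and a coarse grid can miss roots that lie close together.

```python
import math


def find_brackets(f, a, b, steps):
    """Scan [a, b] in equal steps and return the sub-intervals
    on which f changes sign (each contains at least one root
    if f is continuous)."""
    h = (b - a) / steps
    brackets = []
    x_prev, f_prev = a, f(a)
    for i in range(1, steps + 1):
        x = a + i * h
        fx = f(x)
        if f_prev * fx < 0:
            brackets.append((x_prev, x))
        x_prev, f_prev = x, fx
    return brackets


f = lambda x: math.exp(-0.2 * x) - x * (x - 2) * (x - 3)
brackets = find_brackets(f, -1.0, 4.0, 100)
for lo, hi in brackets:
    print(f"root between {lo:.2f} and {hi:.2f}")
```

A graph is still worth drawing first, since a sign-change scan says nothing about roots of even multiplicity or about pairs of roots inside one grid cell.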
EXERCISES
a) x + 2 cos x = 0.
b) \( x + e^x = 0 \).
c) \( x(x - 1) - e^x = 0 \).
d) \( x(x - 1) - \sin x = 0 \).
Theorem: If f is continuous for x between a and b and if f (a) and f(b) have
opposite signs, then there exists at least one real root of f (x) = 0 between a and b.
a rule, a and b may be found from a graph of f.) If we calculate \( f\left( \frac{a+b}{2} \right) \),
which is the function value at the point of bisection of the interval
otherwise evaluated (for example, see Steps 9 and 10). Of course, such a
close examination also avoids another nearby root being overlooked.
Finally, note that bisection is rather slow; after n iterations the interval
containing the root is of length \( (b - a)/2^n \). However, provided values of f can be
generated readily, as when a computer is used, the rather large number of
iterations which can be involved in the application of bisection is of
relatively little consequence.
3. Example
x      3x     e^{-x}   f(x)
0.25   0.75   0.7788   -0.0288
0.27   0.81   0.7634   0.0466
Denote the lower and upper endpoints of the interval bracketing the root at
the n-th iteration by \( a_n \) and \( b_n \), respectively (with \( a_1 = 0.25 \) and \( b_1 = 0.27 \)).
Then the approximation to the root at the n-th iteration is given by
\( x_n = (a_n + b_n)/2 \). Since the root is either in \( [a_n, x_n] \) or \( [x_n, b_n] \) and both intervals are
of length \( (b_n - a_n)/2 \), we see that \( x_n \) will be accurate to three decimal places
when \( (b_n - a_n)/2 < 5 \times 10^{-4} \). Proceeding to bisection:
n   a_n      b_n      x_n = (a_n + b_n)/2   3x_n     e^{-x_n}   f(x_n)
1   0.25     0.27     0.26                  0.78     0.7711     0.0089
2   0.25     0.26     0.255                 0.765    0.7749     -0.0099
3   0.255    0.26     0.2575                0.7725   0.7730     -0.0005
4   0.2575   0.26     0.2588                0.7763   0.7720     0.0042
5   0.2575   0.2588   0.2581                0.7744   0.7725     0.0019
6   0.2575   0.2581   0.2578
(Note that the values in the table are displayed to only 4D.) Hence the root
accurate to three decimal places is 0.258.
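The bisection steps tabulated above can be sketched in Python as follows. The code assumes, from the column headings of the table, that the function is \( f(x) = 3x - e^{-x} \); the function name `bisect` and its return of the iteration count are choices made for this illustration.

```python
import math


def bisect(f, a, b, tol):
    """Bisection on [a, b], where f(a) and f(b) have opposite signs.

    Stops when the half-width (b - a)/2 is below tol and returns the
    midpoint together with the number of iterations used."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    n = 0
    while True:
        n += 1
        x = (a + b) / 2
        if (b - a) / 2 < tol:
            return x, n
        fx = f(x)
        if fx == 0:
            return x, n
        if fa * fx < 0:
            b = x              # root lies in [a, x]
        else:
            a, fa = x, fx      # root lies in [x, b]


f = lambda x: 3 * x - math.exp(-x)
root, iterations = bisect(f, 0.25, 0.27, 5e-4)
print(root, iterations)
```

Only the sign of f(x) is used at each step, which is why the method is so robust and also why it cannot converge faster than halving the interval.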
b) Use the bisection method to solve \( f(x) = x^2 - 3 = 0 \). Let \( \varepsilon_{step} = 0.01 \) and \( \varepsilon_{abs} = 0.01 \), and start with the interval [1, 2].
Thus, with the seventh iteration, we note that the final interval
[1.7266, 1.7344] has a width less than 0.01 and |f(1.7344)| < 0.01, and
therefore we choose b = 1.7344 to be our approximation of the root.
EXERCISES
a. Use the bisection method to find the root of the equation x+cosx = 0.
b. Use the bisection method to find to 3D the positive root of the equation
x - 0.2sinx - 0.5=0.
c. Each equation in Exercises 2(a)-2(c) above has only one root. For each
equation use the bisection method to find the root correct to 2 D.
d. Use the bisection method to solve \( f(x) = x + \frac{1}{x} - 3\sin x \) with the
The Newton-Raphson iterative method
The Newton-Raphson method is suitable for implementation on a computer . It is
a process for the determination of a real root of an equation f (x) = 0, given just one
point close to the desired root. It can be viewed as a limiting case of the secant
method or as a special case of simple iteration .
1. Procedure
Let x0 denote the known approximate value of the root of f(x) = 0 and h the
difference between the true value α and the approximate value, i.e.,
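\[ \alpha = x_0 + h \]
Since \( f(\alpha) = f(x_0 + h) = 0 \), expanding in a Taylor series about \( x_0 \) and keeping only the linear term gives
\[ f(x_0) + h f'(x_0) \approx 0, \]
whence \( h \approx -f(x_0)/f'(x_0) \) and consequently,
\[ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} \]
should be a better estimate of the root than \( x_0 \). Even better approximations may be
obtained by repetition (iteration) of the process, which then becomes
\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \]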
The geometrical interpretation is that each iteration provides the point at which the
tangent at the original point cuts the x-axis (Figure 9). Thus the equation of the
tangent at \( (x_n, f(x_n)) \) is
\[ y - f(x_n) = f'(x_n)(x - x_n), \]
so that, setting y = 0 with n = 0, \( x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} \).
2. Example
We will use the Newton-Raphson method to find the positive root of the equation
\( \sin x = x^2 \), correct to 3D.
x      f(x) = sin x - x^2
0      0
0.25   0.1849
0.5    0.2294
0.75   0.1191
1      -0.1585
With numbers displayed to 4D, we see that there is a root in the interval
0.75 < x < 1 at approximately
\[ x_0 = \frac{1}{-0.1585 - 0.1191} \begin{vmatrix} 0.75 & 0.1191 \\ 1 & -0.1585 \end{vmatrix} = -\frac{1}{0.2777} (-0.1189 - 0.1191) = \frac{0.2380}{0.2777} = 0.8573 \]
\[ f(0.8573) = \sin(0.8573) - (0.8573)^2 = 0.7561 - 0.7349 = 0.0211 \]
and
\[ f'(x) = \cos x - 2x \]
yielding
\[ x_1 = 0.8573 + \frac{0.0211}{1.0600} = 0.8573 + 0.0200 = 0.8772 \]
and \( f'(x_1) = f'(0.8772) = -1.1151 \)
so that \( x_2 = 0.8772 - \frac{0.0005}{1.1151} = 0.8772 - 0.0005 = 0.8767 \)
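The iteration in this example can be sketched in Python as below. The function name `newton`, the stopping rule on successive iterates and the iteration cap are assumptions made for the illustration; the starting value 0.8573 is the one obtained by linear interpolation above.

```python
import math


def newton(f, df, x0, tol=5e-5, max_iter=20):
    """Newton-Raphson iteration x_{n+1} = x_n - f(x_n)/f'(x_n).

    Stops when two successive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        dfx = df(x)
        if dfx == 0:
            raise ZeroDivisionError("f'(x) vanished; Newton step undefined")
        x_new = x - f(x) / dfx
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")


f = lambda x: math.sin(x) - x * x
df = lambda x: math.cos(x) - 2 * x

root = newton(f, df, 0.8573)
print(round(root, 4))
```

Because the method needs f'(x) and gives no guarantee from a poor starting value, a bracketing step such as the interpolation above is a sensible way to choose \( x_0 \).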
3. Convergence
If we write \( \phi(x) = x - \frac{f(x)}{f'(x)} \), the Newton-Raphson iteration expression
\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \]
may be rewritten
\[ x_{n+1} = \phi(x_n) \]
so that the criterion for convergence is
\[ |f(x) f''(x)| < [f'(x)]^2 \]
i.e., convergence is not as assured as, say, for the bisection method.
4. Rate of convergence
\[ 0 = \frac{f(x_n)}{f'(x_n)} + (\alpha - x_n) + \frac{e_n^2}{2} \frac{f''(\xi_n)}{f'(x_n)} \]
\[ e_{n+1} = \alpha - x_{n+1} = -\frac{e_n^2}{2} \frac{f''(\xi_n)}{f'(x_n)} \approx -\frac{e_n^2}{2} \frac{f''(\alpha)}{f'(\alpha)} \]
\[ f''(\alpha) \approx 4 f'(\alpha) \]
This result states that the error at the (n + 1)-th iteration is proportional to the
square of the error at the n-th iteration; hence, if \( f''(\alpha) \approx 4 f'(\alpha) \), an answer correct to
one decimal place at one iteration should be accurate to two places at the next
iteration, four at the next, eight at the next, etc. This quadratic (second-order)
convergence outstrips the rate of convergence of the methods of bisection and
false position!
In relatively little used computer programs, it may be wise to prefer the methods of
bisection or false position, since convergence is virtually assured. However, for
hand calculations or for computer routines in constant use, the Newton-Raphson
method is usually preferred.
\[ f(x) = x^2 - a = 0. \]
\[ x_{n+1} = x_n - \frac{x_n^2 - a}{2 x_n} = \frac{1}{2} \left( x_n + \frac{a}{x_n} \right), \]
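The square-root iteration \( x_{n+1} = \tfrac{1}{2}(x_n + a/x_n) \) is short enough to try directly. The sketch below assumes a > 0 and a positive starting value; the function name `newton_sqrt`, the default start of 1.0 and the relative stopping tolerance are choices made for this illustration.

```python
def newton_sqrt(a, x0=1.0, tol=1e-12, max_iter=50):
    """Approximate sqrt(a) for a > 0 by Newton-Raphson applied to x^2 - a = 0."""
    if a <= 0 or x0 <= 0:
        raise ValueError("this sketch assumes a > 0 and x0 > 0")
    x = x0
    for _ in range(max_iter):
        x_new = 0.5 * (x + a / x)          # average of x and a/x
        if abs(x_new - x) <= tol * abs(x_new):
            return x_new
        x = x_new
    return x


print(newton_sqrt(2.0))
print(newton_sqrt(3.0))
```

Printing the successive iterates shows the number of correct digits roughly doubling at each step, in line with the quadratic convergence described above.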
EXERCISES
\[ x_{n+1} = x_n - \frac{x_n^k - a}{k x_n^{k-1}} \]
x cos x = 0.
Method of false position
As mentioned in the Prologue, the method of false position dates back to the
ancient Egyptians. It remains an effective alternative to the bisection method for
solving the equation f(x) = 0 for a real root between a and b, given that f (x) is
continuous and f (a) and f(b) have opposite signs. The algorithm is suitable for
automatic computation .
1. PROCEDURE
The curve y = f(x) is not generally a straight line. However, one may join the
points (a,f(a)) and (b,f(b)) by the straight line
Suppose that f(a) is negative and f(b) is positive. As in the bisection method, there
are the three possibilities :
Again, in Case 1, the process is terminated, in either Case 2 or Case 3, the process
can be repeated until the root is obtained to the desired accuracy. In Fig. 6, the
successive points where the straight lines cut the axis are denoted by x1, x2, x3.
2. EFFECTIVENESS AND THE SECANT METHOD
Like the bisection method, the method of false position has almost assured
convergence, and it may converge to a root faster. However, it may happen that
most or all the calculated values of x are on the same side of the root, in which
case convergence may be slow (Fig. 7). This is avoided in the secant method,
which resembles the method of false position except that no attempt is made to
ensure that the root is enclosed; starting with two approximations to the root
(\( x_0, x_1 \)), further approximations \( x_2, x_3, \dots \) are computed from
\[ x_{n+1} = x_n - f(x_n) \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})} \]
Convergence is no longer assured, but the process is simpler (the sign of \( f(x_{n+1}) \) is not
tested) and often converges faster.
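A minimal Python sketch of the secant iteration follows. The test function \( f(x) = 3x - e^{-x} \) and the starting values 0.25 and 0.27 are borrowed from the bisection example as an assumption; the function name `secant`, the tolerance and the iteration cap are likewise illustrative.

```python
import math


def secant(f, x0, x1, tol=1e-10, max_iter=50):
    """Secant iteration: like false position, but without keeping
    the root bracketed between the two current points."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            raise ZeroDivisionError("f(x_n) = f(x_{n-1}); secant step undefined")
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < tol:
            return x2
        # Discard the oldest point regardless of the sign of f(x2).
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    raise RuntimeError("no convergence within max_iter iterations")


f = lambda x: 3 * x - math.exp(-x)
root = secant(f, 0.25, 0.27)
print(round(root, 6))
```

Each step costs one new function evaluation, the same as false position, but the sign test and the bookkeeping of a bracket are dropped.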
With respect to speed of convergence of the secant method, one has at the (n+1)th
step:
Hence, expanding in terms of the Taylor series,
\[ e_{n+1} = \frac{ e_{n-1} \left[ f(\alpha) - e_n f'(\alpha) + \frac{e_n^2}{2!} f''(\alpha) - \dots \right] - e_n \left[ f(\alpha) - e_{n-1} f'(\alpha) + \frac{e_{n-1}^2}{2!} f''(\alpha) - \dots \right] }{ \left[ f(\alpha) - e_n f'(\alpha) + \dots \right] - \left[ f(\alpha) - e_{n-1} f'(\alpha) + \dots \right] } \approx -\frac{f''(\alpha)}{2 f'(\alpha)} e_{n-1} e_n \]
where we have used the fact that \( f(\alpha) = 0 \). Thus we see that \( e_{n+1} \) is proportional to
\( e_n e_{n-1} \), which may be expressed in mathematical notation as
\[ e_{n+1} \approx e_{n-1} e_n \]
\[ e_{n+1} \approx e_n^k \approx e_{n-1}^{k^2}, \quad e_{n-1} e_n \approx e_{n-1}^{k+1}, \implies k^2 \approx k + 1, \implies k \approx \frac{1 + \sqrt{5}}{2} \approx 1.618. \]
Hence the speed of convergence is faster than linear (k =1 ), but slower than
quadratic (k=2). This rate of convergence is sometimes referred to as superlinear
convergence.
3. EXAMPLE
Then
The student may verify that doing one more iteration of the method of false
position yields an estimate \( x_2 = 0.257628 \) for which the function value is less than
\( 5 \times 10^{-6} \). Since \( x_1 \) and \( x_2 \) agree to 4D, we conclude that the root is 0.2576, correct to
4D.
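The method of false position can be sketched in Python as below. The statement of this example did not survive in the text, so the code assumes the function \( f(x) = 3x - e^{-x} \) on [0.25, 0.27] from the bisection example, which reproduces the quoted estimate; the function name `false_position` and the stopping rule on |f(x)| are illustrative choices.

```python
import math


def false_position(f, a, b, tol=5e-6, max_iter=50):
    """Method of false position on [a, b], where f(a) and f(b) differ in sign.

    Each new point is where the chord through (a, f(a)) and (b, f(b))
    cuts the x-axis; the bracket is then shrunk to keep the sign change."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        x = a - fa * (b - a) / (fb - fa)
        fx = f(x)
        if abs(fx) < tol:
            return x
        if fa * fx < 0:
            b, fb = x, fx      # root lies in [a, x]
        else:
            a, fa = x, fx      # root lies in [x, b]
    raise RuntimeError("no convergence within max_iter iterations")


f = lambda x: 3 * x - math.exp(-x)
root = false_position(f, 0.25, 0.27)
print(round(root, 6))
```

Unlike the secant method, the sign test keeps the root enclosed at every step, at the cost of possibly slow one-sided convergence.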
EXERCISES
a. Use the method of false position to find the smallest root of the equation
f (x) = 2 sin x + x - 2 = 0, stopping when
\( |f(x_n)| < 5 \times 10^{-5} \).
b. Compare the results obtained when you use
i. the bisection method,
ii. the method of false position, and
iii. the secant method
3sin x = x + 1/x.
iv. Use the method of false position to find the root of the equation
The method of simple iteration
The method of simple iteration involves writing the equation f(x) = 0 in a form
in a repetitive fashion.
1. Procedure
2. Example
\( 3x e^x = 1 \)
Assuming x0 = 1, successive iterations yield
x1 = 0.12263, x2 = 0.29486,
x3 = 0.24821, x4 = 0.26007,
x5 = 0.25700, x6 = 0.25779,
x7 = 0.25759, x8 = 0.25764.
Thus, we see that after eight iterations the root is 0.2576 to 4D. A graphical
interpretation of the first three iterations is shown in Fig. 8.
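The iterates listed above can be regenerated with a short Python sketch. It assumes the rearrangement \( x = \phi(x) = e^{-x}/3 \) of the equation \( 3x e^x = 1 \), which is consistent with the quoted value \( x_1 = 0.12263 \) from \( x_0 = 1 \); the function name `iterate` is chosen here for illustration.

```python
import math


def iterate(phi, x0, n):
    """Return [x0, x1, ..., xn] for the simple iteration x_{k+1} = phi(x_k)."""
    xs = [x0]
    for _ in range(n):
        xs.append(phi(xs[-1]))
    return xs


phi = lambda x: math.exp(-x) / 3
xs = iterate(phi, 1.0, 8)
for k, x in enumerate(xs):
    print(f"x{k} = {x:.5f}")
```

The iterates alternate above and below the root because the derivative of this \( \phi \) is negative there, and they close in steadily because its magnitude is well below 1.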
3. Convergence
\( x = 3/x = \phi(x) \)
where \( \zeta_k \) is a point between the root \( \alpha \) and the approximation \( x_k \). We have
\[ \alpha - x_1 = \phi(\alpha) - \phi(x_0) = (\alpha - x_0) \phi'(\zeta_0) \]
\[ \vdots \]
\[ \alpha - x_{n+1} = \phi(\alpha) - \phi(x_n) = (\alpha - x_n) \phi'(\zeta_n) \]
Multiplying the n + 1 rows together and cancelling the common factors \( \alpha - x_1, \alpha - x_2, \dots, \alpha - x_n \) leaves
\[ |\alpha - x_{n+1}| = |\alpha - x_0| \, |\phi'(\zeta_0)| \, |\phi'(\zeta_1)| \dots |\phi'(\zeta_n)| , \]
whence
so that the absolute error \( |\alpha - x_{n+1}| \) can be made as small as we please by sufficient
iteration if \( |\phi'| < 1 \) in the neighbourhood of the root.
Note that \( \phi(x) = 3/x \) has derivative \( |\phi'(x)| = |-3/x^2| > 1 \) for \( |x| < 3^{1/2} \).
2x - 1 -2sinx = 0 is 1.4973.
2. Use simple iteration to find (to 4D) the root of the equation x + cos x = 0.
CHAPTER 3
FINITE DIFFERENCES 1
Tables
Historically speaking, numerical analysts have always been concerned with tables
of numbers, and many techniques have been developed for dealing with
mathematical functions, represented in this way.
For example, the value of the function at an untabulated point may be required, so
that interpolation is necessary. It is also possible to estimate the derivative or
the definite integral of a tabulated function, using some finite processes to
approximate the corresponding (infinitesimal) limiting procedures of calculus. In
each case, it has been traditional to use finite differences. Another application of
finite differences, which is outside the scope of this book, is the numerical
solution of partial differential equations.
1. Tables of values
Many books contain tables of mathematical functions. One of the most
comprehensive is the Handbook of Mathematical Functions, edited by
Abramowitz and Stegun (see the Bibliography for publication details),
which also contains useful information about numerical methods.
x      f(x) = e^x
0.10   1.10517
0.11   1.11628
0.12   1.12750
0.13   1.13883
0.14   1.15027
2. Finite differences
standard layout, with decimal points and leading zeros omitted from the
differences):
x      f(x)      Differences
                 1st     2nd   3rd
0.10   1.10517
                 1111
0.11   1.11628           11
                 1122          0
0.12   1.12750           11
                 1133          0
0.13   1.13883           11
                 1144          1
0.14   1.15027           12
                 1156          0
0.15   1.16183           12
                 1168          -1
0.16   1.17351           11
                 1179          2
0.17   1.18530           13
                 1192
0.18   1.19722
(In this case, the differences must be multiplied by \( 10^{-5} \) for comparison with
the function values.)
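A difference table of this kind is easy to generate by machine. The sketch below works in integer units of \( 10^{-5} \) (so 1.10517 is entered as 110517) to avoid floating point noise in the differences; the function name `difference_table` and the list-of-columns layout are assumptions made for this illustration.

```python
def difference_table(values):
    """Return [f, first differences, second differences, ...] as lists.

    Column k+1 is formed by subtracting neighbouring entries of column k."""
    table = [list(values)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return table


# e^x for x = 0.10(0.01)0.18, in units of 1e-5.
f = [110517, 111628, 112750, 113883, 115027, 116183, 117351, 118530, 119722]
table = difference_table(f)

for order, column in enumerate(table[:4]):
    print(f"order {order}: {column}")
```

Printing only the first few columns mirrors the usual practice of stopping once the differences are down at the level of round-off noise.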
Consider the difference table given below for f ( x )=e x : 0.1 ( 0.05 ) 0.5 to 6S,
constructed as in Section 2. As before, differences of increasing order
decrease rapidly in magnitude, but the third differences are irregular. This is
largely a consequence of round-off errors, as tabulation of the function to
7S and differencing to fourth order illustrates (compare Exercise 3 ).
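                       Differences
  x      f(x)      1st    2nd    3rd
 0.10   1.10517
                   5666
 0.15   1.16183           291
                   5957            15
 0.20   1.22140           306
                   6263            14
 0.25   1.28403           320
                   6583            18
 0.30   1.34986           338
                   6921            16
 0.35   1.41907           354
                   7275            20
 0.40   1.49182           374
                   7649            18
 0.45   1.56831           392
                   8041
 0.50   1.64872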
Although the round-off errors in f should be less than 1/2 in the last
significant place, they may accumulate; the greatest error that can be
obtained corresponds to:
 Error in f                  Differences
               1st    2nd    3rd    4th    5th    6th
   +1/2
                -1
   -1/2                +2
                +1            -4
   +1/2                -2            +8
                -1            +4           -16
   -1/2                +2            -8           +32
                +1            -4           +16
   +1/2                -2            +8
                -1            +4
   -1/2                +2
                +1
   +1/2
A rough working criterion for the expected fluctuations (noise level) due to
round-off error is shown in the table:
Order of differences 1 2 3 4 5 6
Expected error limits ±1 ±2 ±3 ±6 ± 12 ± 22
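The worst-case pattern (round-off errors of alternating sign, each of size 1/2 in the last place) can be checked with a short sketch. The names `differences` and `worst` are illustrative only; exact fractions are used so that no further rounding enters.

```python
from fractions import Fraction

def differences(values):
    """Differences of neighbouring entries."""
    return [b - a for a, b in zip(values, values[1:])]

# Seven tabular errors: +1/2, -1/2, +1/2, ...
errors = [Fraction((-1) ** i, 2) for i in range(7)]

worst = []          # largest error magnitude in each difference column
column = errors
for order in range(1, 7):
    column = differences(column)
    worst.append(max(abs(e) for e in column))

print(worst)        # magnitudes double with every order of differencing
```

The doubling at each order is the extreme case; the much smaller limits in the working criterion reflect that real round-off errors rarely alternate so perfectly.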
EXERCISES
1. Construct the difference table for the function $f(x) = x^3$ for $x = 0(1)6$.
a) 2 x−1for x=0 ( 1 ) 3.
x = 0.1(0.05) 0.5
  x      f(x)         x      f(x)         x      f(x)
 0.10   1.105171     0.25   1.284025     0.40   1.491825
 0.15   1.161834     0.30   1.349859     0.45   1.568312
 0.20   1.221403     0.35   1.419068     0.50   1.648721
FINITE DIFFERENCES 2
There are several different notations for the single set of finite differences,
described in the preceding Step. We introduce each of these three notations in
terms of the so-called shift operator, which we will define first.
\[ E f_j \equiv f_{j+1}. \]
Consequently,
\[ E^2 f_j = E(E f_j) = E f_{j+1} = f_{j+2}, \]
and, in general,
\[ E^k f_j = f_{j+k}, \]
where k is any positive integer. Moreover, the last formula can be extended
to negative integers, and indeed to all real values of j and k, so that, for
example,
and
\[ E^{1/2} f_j = f_{j+1/2} = f\left(x_j + \tfrac{1}{2}h\right) = f\left(x_0 + \left(j + \tfrac{1}{2}\right)h\right). \]
\[ \Delta \equiv E - 1, \]
then
\[ \Delta^2 f_j = \Delta(\Delta f_j) = \Delta f_{j+1} - \Delta f_j = f_{j+2} - 2f_{j+1} + f_j \]
\[ \Delta^k f_j = \Delta^{k-1}(\Delta f_j) = \Delta^{k-1}(f_{j+1} - f_j) = \Delta^{k-1} f_{j+1} - \Delta^{k-1} f_j \]
\[ \nabla \equiv 1 - E^{-1}, \]
then
then
\[ \delta f_j = \left(E^{1/2} - E^{-1/2}\right) f_j = E^{1/2} f_j - E^{-1/2} f_j = f_{j+1/2} - f_{j-1/2} \]
which is the first-order central difference at xj. Similarly,
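\[ \delta^2 f_j = \delta(\delta f_j) = \delta\left(f_{j+1/2} - f_{j-1/2}\right) = f_{j+1} - 2f_j + f_{j-1} \]
and, in general,
\[ \delta^k f_j = \delta^{k-1}(\delta f_j) = \delta^{k-1}\left(f_{j+1/2} - f_{j-1/2}\right) = \delta^{k-1} f_{j+1/2} - \delta^{k-1} f_{j-1/2} \]
where k is any positive integer. Note that $\delta f_{j+1/2} = \Delta f_j = \nabla f_{j+1}$.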
5. Differences display
Forward differences:

  x      f              Differences
 x_0    f_0
               ∆f_0
 x_1    f_1            ∆^2 f_0
               ∆f_1              ∆^3 f_0
 x_2    f_2            ∆^2 f_1             ∆^4 f_0
               ∆f_2              ∆^3 f_1
 x_3    f_3            ∆^2 f_2
               ∆f_3
 x_4    f_4
  ⋮      ⋮

Central differences:

  ⋮        ⋮
 x_{j-2}  f_{j-2}
                   δf_{j-3/2}
 x_{j-1}  f_{j-1}               δ^2 f_{j-1}
                   δf_{j-1/2}                 δ^3 f_{j-1/2}
 x_j      f_j                   δ^2 f_j                       δ^4 f_j
                   δf_{j+1/2}                 δ^3 f_{j+1/2}
 x_{j+1}  f_{j+1}               δ^2 f_{j+1}
                   δf_{j+3/2}
 x_{j+2}  f_{j+2}
  ⋮        ⋮

Backward differences:

  ⋮        ⋮
 x_{n-4}  f_{n-4}
                   ∇f_{n-3}
 x_{n-3}  f_{n-3}               ∇^2 f_{n-2}
                   ∇f_{n-2}                   ∇^3 f_{n-1}
 x_{n-2}  f_{n-2}               ∇^2 f_{n-1}                   ∇^4 f_n
                   ∇f_{n-1}                   ∇^3 f_n
 x_{n-1}  f_{n-1}               ∇^2 f_n
                   ∇f_n
 x_n      f_n
1. Forward differences are useful near the start of a table, since they
only involve tabulated function values below xj ;
2. Central differences are useful away from the ends of a table, where
there are available tabulated function values above and below xj;
3. Backward differences are useful near the end of a table, since they
only involve tabulated function values above xj.
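The three notations label the same entries of a single difference table: $\Delta^k f_j = \nabla^k f_{j+k} = \delta^k f_{j+k/2}$. A minimal sketch of this indexing follows, with hypothetical helper names and an arbitrary illustrative sequence $f_j = 2^j$ (not taken from the text). Central differences receive the index as $2j$ so that half-integers stay integral.

```python
def difference_table(values):
    """Return [values, 1st differences, 2nd differences, ...]."""
    columns = [list(values)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        columns.append([b - a for a, b in zip(prev, prev[1:])])
    return columns

def forward(table, k, j):
    """Delta^k f_j is entry j of column k."""
    return table[k][j]

def backward(table, k, j):
    """Nabla^k f_j equals Delta^k f_{j-k}."""
    return table[k][j - k]

def central(table, k, twice_j):
    """delta^k f_j, with j supplied as 2j; requires 2j - k to be even."""
    return table[k][(twice_j - k) // 2]

f = [2**j for j in range(6)]        # illustrative values f_0 .. f_5
t = difference_table(f)

# Delta f_2, Nabla f_3 and delta f_{5/2} are one and the same entry
print(forward(t, 1, 2), backward(t, 1, 3), central(t, 1, 5))
# Delta^2 f_1, Nabla^2 f_3 and delta^2 f_2 likewise
print(forward(t, 2, 1), backward(t, 2, 3), central(t, 2, 4))
```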
EXERCISES
$f(x) = 3x^3 - 2x^2 + x + 5$;
a) $\Delta f_1,\ \Delta^2 f_1,\ \Delta^3 f_1,\ \Delta^3 f_0,\ \Delta^2 f_2$;
b) $\nabla f_1,\ \nabla f_2,\ \nabla^2 f_2,\ \nabla^2 f_3,\ \nabla^3 f_4$;
c) $\delta f_{1/2},\ \delta^2 f_1,\ \delta^3 f_{3/2},\ \delta^3 f_{5/2},\ \delta^2 f_2$.
six significant digits the quantities (taking x0 = 0.1 ):
a) $\Delta f_2,\ \Delta^2 f_2,\ \Delta^3 f_2,\ \Delta^4 f_2$; b) $\nabla f_6,\ \nabla^2 f_6,\ \nabla^3 f_6,\ \nabla^4 f_6$;
c) $\delta^2 f_4,\ \delta^4 f_4$;
d) $\Delta^2 f_1,\ \delta^2 f_2,\ \nabla^2 f_3$; e) $\Delta^3 f_3,\ \nabla^3 f_6,\ \delta^3 f_{9/2}$.
a) $E x_j = x_{j+1}$;
d) $\delta^3 f_j = f_{j+3/2} - 3f_{j+1/2} + 3f_{j-1/2} - f_{j-3/2}$.
1. Procedure
\[ P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0 \]
\[ L_k(x) = \frac{(x - x_0)(x - x_1) \cdots (x - x_{k-1})(x - x_{k+1}) \cdots (x - x_n)}{(x_k - x_0)(x_k - x_1) \cdots (x_k - x_{k-1})(x_k - x_{k+1}) \cdots (x_k - x_n)} \]
then:
\[ P_n(x) = \sum_{k=0}^{n} L_k(x) f_k \]
Hence,
\[ P_n(x) = \sum_{k=0}^{n} L_k(x) f_k \]
satisfies
\[ P_n(x_j) = f_j, \quad j = 0, 1, 2, \dots, n, \]
i.e., the (unique) interpolating polynomial. Note that for x = xj all terms in
the sum vanish except the j-th, which is fj; Lk(x) is called the k-th Lagrange
interpolation coefficient, and the identity
\[ \sum_{k=0}^{n} L_k(x) = 1 \]
(established by setting $f(x) \equiv 1$) may be used as a check. Note also that with
n = 1 we recover the linear interpolation formula:
2. Example
The Lagrange coefficients are:
\[ P_3(x) = -\frac{3}{8}(x^3 - 7x^2 + 14x - 8) + \frac{2}{3}(x^3 - 6x^2 + 8x) - \frac{7}{4}(x^3 - 5x^2 + 4x) + \frac{59}{24}(x^3 - 3x^2 + 2x) \]
\[ = \frac{1}{24}\left(-9x^3 + 63x^2 - 126x + 72 + 16x^3 - 96x^2 + 128x - 42x^3 + 210x^2 - 168x + 59x^3 - 177x^2 + 118x\right) \]
\[ = \frac{1}{24}\left(24x^3 + 0x^2 - 48x + 72\right) \]
\[ = x^3 - 2x + 3 \]
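The Lagrange formula can be sketched directly in code. The data points used here, (0, 3), (1, 2), (2, 7) and (4, 59), are an assumption: they are inferred from the factors and coefficients of the expansion above, since the statement of the example is not reproduced. The function name `lagrange_interpolate` is illustrative, and exact rational arithmetic is used so that the result can be compared with $x^3 - 2x + 3$ without rounding.

```python
from fractions import Fraction

def lagrange_interpolate(xs, fs, x):
    """Evaluate P_n(x) = sum_k L_k(x) f_k for integer or rational inputs."""
    total = Fraction(0)
    for k, (xk, fk) in enumerate(zip(xs, fs)):
        coeff = Fraction(1)              # builds the coefficient L_k(x)
        for i, xi in enumerate(xs):
            if i != k:
                coeff *= Fraction(x - xi, xk - xi)
        total += coeff * fk
    return total

xs = [0, 1, 2, 4]        # inferred nodes (assumption)
fs = [3, 2, 7, 59]       # inferred function values (assumption)

for x in (0, 1, 2, 3, 4):
    print(x, lagrange_interpolate(xs, fs, x))
```

At the nodes the polynomial returns the tabulated values, and at the untabulated point x = 3 it agrees with $3^3 - 2\cdot 3 + 3$.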
EXERCISE
Given that f (-2) = 46, f (-1 ) = 4, f ( 1 ) = 4, f (3) = 156, and f (4) = 484, use
Lagrange's interpolation formula to estimate the value of f(0).
CHAPTER 4
Use of LU decomposition
Another general approach to solving Ax = b is known as the method of LU
decomposition, which provides new insights into matrix algebra and has many
theoretical and practical uses. It yields efficient computer algorithms for handling
practical problems.
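The symbols L and U denote lower triangular and upper triangular
matrices, respectively. Examples of lower triangular matrices are
\[ L_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & -0.5 & 1 \end{bmatrix} \quad \text{and} \quad L_2 = \begin{bmatrix} 2 & 0 & 0 \\ 1 & -1 & 0 \\ 2 & 3 & 1 \end{bmatrix} \]
Note that in such a matrix all elements above the leading diagonal are zero.
Examples of upper triangular matrices are:
\[ U_1 = \begin{bmatrix} -1 & 2 & 1 \\ 0 & 8 & 6 \\ 0 & 0 & 6 \end{bmatrix} \quad \text{and} \quad U_2 = \begin{bmatrix} -1 & 2 & 0 \\ 0 & 1 & 2 \\ 0 & 0 & -1 \end{bmatrix} \]
where all elements below the leading diagonal are zero. The product of $L_1$ and $U_1$
is
\[ A = L_1 U_1 = \begin{bmatrix} -1 & 2 & 1 \\ 0 & 8 & 6 \\ -2 & 0 & 5 \end{bmatrix} \]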
1. Procedure
Suppose we have to solve a linear system Ax = b and that we can express the
coefficient matrix A in the form of the so-called LU decomposition A = LU. Then
we may solve the linear system as follows:
Stage l:
Write Ax = LUx = b.
Stage 2:
Ly = b is:
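\[ \left[ \begin{array}{ccccc|c}
l_{11} & 0 & \cdots & 0 & 0 & b_1 \\
l_{21} & l_{22} & \cdots & 0 & 0 & b_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
l_{n-1,1} & l_{n-1,2} & \cdots & l_{n-1,n-1} & 0 & b_{n-1} \\
l_{n1} & l_{n2} & \cdots & l_{n,n-1} & l_{nn} & b_n
\end{array} \right] \]
Then forward substitution yields $y_1 = \dfrac{b_1}{l_{11}}$, and, subsequently,
\[ y_i = \frac{1}{l_{ii}} \left[ b_i - \sum_{j=1}^{i-1} l_{ij} y_j \right], \quad i = 2, 3, \dots, n. \]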
Note that the value of yi depends on the values y1, y2, . . , yi-1, which have already
been calculated.
Stage 3:
Example
\[ \begin{aligned}
-x_1 + 2x_2 + x_3 &= 0 \\
8x_2 + 6x_3 &= 10 \\
-2x_1 + 5x_3 &= -11
\end{aligned} \]
Stage l:
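\[ AX = L_1 U_1 X = b \]
\[ AX = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & -0.5 & 1 \end{bmatrix} \begin{bmatrix} -1 & 2 & 1 \\ 0 & 8 & 6 \\ 0 & 0 & 6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 10 \\ -11 \end{bmatrix} \]
Stage 2:
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & -0.5 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 10 \\ -11 \end{bmatrix} \]
\[ y_1 = 0 \]
\[ y_2 = 10 \]
\[ y_3 = -11 - 2y_1 + 0.5y_2 = -6 \]
\[ Y = \begin{bmatrix} 0 \\ 10 \\ -6 \end{bmatrix} \]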
Stage 3:
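Solve $U_1 X = Y$:
\[ \begin{bmatrix} -1 & 2 & 1 \\ 0 & 8 & 6 \\ 0 & 0 & 6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 10 \\ -6 \end{bmatrix} \]
Back-substitution yields:
\[ 6x_3 = -6 \implies x_3 = -1 \]
\[ 8x_2 + 6x_3 = 10 \implies x_2 = 2 \]
\[ -x_1 + 2x_2 + x_3 = 0 \implies x_1 = 3 \]
\[ X = \begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix} \]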
which you may check, using the original equations. We turn now to the problem of
finding an LU decomposition of a given square matrix A.
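Stages 2 and 3 of the procedure (forward substitution for Ly = b, then back-substitution for Ux = y) can be sketched as follows. The function names are illustrative; the matrices are those of the worked example, with the last row of $U_1$ taken as (0, 0, 6), which is what the back-substitution step $6x_3 = -6$ uses.

```python
def forward_substitution(L, b):
    """Solve L y = b for lower triangular L."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        s = sum(L[i][j] * y[j] for j in range(i))    # uses y_1 .. y_{i-1}
        y[i] = (b[i] - s) / L[i][i]
    return y

def back_substitution(U, y):
    """Solve U x = y for upper triangular U."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x

L1 = [[1, 0, 0], [0, 1, 0], [2, -0.5, 1]]
U1 = [[-1, 2, 1], [0, 8, 6], [0, 0, 6]]
b = [0, 10, -11]

y = forward_substitution(L1, b)
x = back_substitution(U1, y)
print(y, x)
```

Both loops assume nonzero diagonal entries; no pivoting or error handling is attempted in this sketch.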
Realizing an LU decomposition
\[ \begin{aligned}
x + y - z &= 2 \\
x + 2y + z &= 6 \\
2x - y + z &= 1
\end{aligned} \]
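\[ U = \begin{bmatrix} 1 & 1 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 9 \end{bmatrix} \]
Also, we saw that in the first stage we calculated the multipliers $m_{21} = \dfrac{a_{21}}{a_{11}} = 1$ and
$m_{31} = \dfrac{a_{31}}{a_{11}} = 2$, while, in the second stage, we calculated the multiplier $m_{32} = \dfrac{a_{32}}{a_{22}} = -3$.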
Thus
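\[ L = \begin{bmatrix} 1 & 0 & 0 \\ m_{21} & 1 & 0 \\ m_{31} & m_{32} & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 2 & -3 & 1 \end{bmatrix} \]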
It is readily verified that LU equals the coefficient matrix of the original system:
\[ LU = \begin{bmatrix} 1 & 1 & -1 \\ 1 & 2 & 1 \\ 2 & -1 & 1 \end{bmatrix} \]
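Recording the elimination multipliers in L while reducing A to U can be sketched as below. This is a minimal illustration with a hypothetical function name `doolittle`; it performs no row interchanges, so it assumes every pivot encountered is nonzero.

```python
def doolittle(A):
    """Return (L, U) with unit diagonal in L, by elimination without pivoting."""
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[float(v) for v in row] for row in A]
    for col in range(n - 1):
        for row in range(col + 1, n):
            m = U[row][col] / U[col][col]     # multiplier m_{row,col}
            L[row][col] = m
            for j in range(col, n):
                U[row][j] -= m * U[col][j]    # eliminate below the pivot
    return L, U

A = [[1, 1, -1], [1, 2, 1], [2, -1, 1]]
L, U = doolittle(A)
print(L)
print(U)
```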
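\[ L = \begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix}, \quad U = \begin{bmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{bmatrix} \]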
Note that the total number of unknowns in L and U is 12, whereas there are only 9
elements in the 3 × 3 coefficient matrix A. To ensure that L and U are unique, we
need to impose 12 - 9 = 3 extra conditions on the elements of these two triangular
matrices. (In the general n × n case, n extra conditions are required.) One common
choice is to require all the diagonal elements of L to be 1; the resulting method is
known as Doolittle's method. Another choice is to make the diagonal elements of U
equal to 1; this is Crout's method. Since Doolittle's method gives the same
LU decomposition of A as that found above, we shall use Crout's method to
illustrate the direct decomposition procedure.
We then require that
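\[ \begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix} \begin{bmatrix} 1 & u_{12} & u_{13} \\ 0 & 1 & u_{23} \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 & -1 \\ 1 & 2 & 1 \\ 2 & -1 & 1 \end{bmatrix} \]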
It is clear that this construction by Crout's method yields triangular matrices L and
U for which A=LU.
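Equating entries column by column for L and row by row for U gives Crout's construction. The sketch below uses a hypothetical function name `crout` and, like the Doolittle sketch, assumes that no diagonal entry of L turns out to be zero (no pivoting).

```python
def crout(A):
    """Return (L, U) with unit diagonal in U, so that A = L U."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n):
        for i in range(j, n):            # column j of L
            L[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(j))
        for i in range(j + 1, n):        # row j of U
            s = sum(L[j][k] * U[k][i] for k in range(j))
            U[j][i] = (A[j][i] - s) / L[j][j]
    return L, U

A = [[1, 1, -1], [1, 2, 1], [2, -1, 1]]
L, U = crout(A)
print(L)
print(U)
```

For this matrix the only difference from the Doolittle factors is where the pivots 1, 1 and 9 are stored: on the diagonal of L rather than of U.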
EXERCISES
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]
where
$a, b, c, d \neq 0$.
\[ \begin{aligned}
x_1 + x_2 - x_3 &= 0 \\
2x_1 - x_2 + x_3 &= 6 \\
3x_1 + 2x_2 - 4x_3 &= -4
\end{aligned} \]
b.
\[ \begin{aligned}
2x + 6y + 4z &= 5 \\
6x + 19y + 12z &= 6 \\
2x + 8y + 14z &= 7
\end{aligned} \]
Systems of over 100 000 variables have been successfully solved on computers by
iterative methods, whereas systems of 10 000 or more variables are difficult or
impossible to solve by direct methods.
This text will only present one iterative method for linear equations, due to Gauss
and improved by Seidel. We shall use this method to solve the system
\[ \begin{aligned}
10x_1 + 2x_2 + x_3 &= 13 \\
2x_1 + 10x_2 + x_3 &= 13 \\
2x_1 + x_2 + 10x_3 &= 13
\end{aligned} \]
The first step is to solve the first equation for x1, the second for x2, and the third for
x3 when the system becomes:
Beginning with this second approximation, we repeat the process to obtain a third
approximation, etc. Under certain conditions relating to the coefficients of the
system, this sequence will converge to the exact solution.
We can set up recurrence relations which show clearly how the iterative process
proceeds. Denoting the k-th and (k+1)-th approximations by $(x_1^{(k)}, x_2^{(k)}, x_3^{(k)})$ and
$(x_1^{(k+1)}, x_2^{(k+1)}, x_3^{(k+1)})$, respectively, we find
\[ x_1^{(k+1)} = 1.3 - 0.2x_2^{(k)} - 0.1x_3^{(k)} \qquad (1)' \]
\[ x_2^{(k+1)} = 1.3 - 0.2x_1^{(k+1)} - 0.1x_3^{(k)} \qquad (2)' \]
\[ x_3^{(k+1)} = 1.3 - 0.2x_1^{(k+1)} - 0.1x_2^{(k+1)} \qquad (3)' \]
We begin with the starting vector $x^{(0)} = (x_1^{(0)}, x_2^{(0)}, x_3^{(0)})$, all components of which
are 0, and then apply these relations repeatedly in the order (1)', (2)' and (3)'. Note
that, when we insert values for $x_1$, $x_2$ and $x_3$ into the right-hand sides, we always use
the most recent estimates found for each unknown.
3. Convergence
The sequence of solutions produced by the iterative process for the above
numerical example is shown in the table:
 Iteration     Approximate solution (Gauss-Seidel)
     k         x_1^(k)     x_2^(k)     x_3^(k)
     0            0           0           0
The student should check that the exact solution for this system is (1,1,1). It is seen
that the Gauss-Seidel solutions are rapidly approaching these values; in other
words, the method is converging.
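The iteration can be sketched as follows. The function name `gauss_seidel` and the choice of six sweeps are illustrative assumptions; the coefficients are those of the system above, and each component is overwritten as soon as it is computed, so later equations in the same sweep use the most recent estimates.

```python
def gauss_seidel(A, b, x0, iterations):
    """Return the list of successive approximations, starting with x0."""
    n = len(b)
    x = list(x0)
    history = [tuple(x)]
    for _ in range(iterations):
        for i in range(n):
            # x already holds updated values for components 0 .. i-1
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
        history.append(tuple(x))
    return history

A = [[10, 2, 1], [2, 10, 1], [2, 1, 10]]
b = [13, 13, 13]

history = gauss_seidel(A, b, [0.0, 0.0, 0.0], 6)
for k, xk in enumerate(history):
    print(k, [round(v, 5) for v in xk])
```

A fixed number of sweeps is used here for simplicity; a practical version would stop when successive approximations agree to the required accuracy.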
becomes less than a prescribed small number (usually chosen according to the
accuracy of the machine on which the calculations are carried out).
In order to improve the chance (and rate) of convergence, the system of equations
should be rearranged before applying the iterative method, so that, as far as
possible, each leading-diagonal coefficient is larger (in absolute value) than any
other in its row.
EXERCISES
1. For the example treated above, compute the value of S3, the quantity used in
the suggested stopping rule after the third iteration.
2. Use the Gauss-Seidel method to solve the following systems to 5D
accuracy (remembering to rearrange the equations if appropriate).
Compute the value of Sk (to 6D) after each iteration.
a)
x - y + z = -7,
20x + 3y - 2z = 51,
2x + 8y + 4z = 25.
b)
10x - y = 1
-x + 10y - z = 1
- y + 10z - w = 1
- z + 10w = 1
CURVE FITTING
1. Least squares
A rather different, but often quite suitable approach is a least squares fit, in which,
instead of trying to fit points exactly, a polynomial of low degree (often linear or
quadratic) is obtained which fits the points closely (after all, the points themselves
may not, in general, be exact, but subject to experimental error).
times, we obtain six pairs of values (xj, yj), which can be plotted on a diagram such
as Figure 11(a).
We may believe that the relationship between the variables can be described
satisfactorily by a function y = f (x), but that the y-values, obtained experimentally,
are subject to errors (or noise). Therefore one arrives at the mathematical model:
\[ f(x_i) = y_i + \epsilon_i, \quad i = 1, 2, \dots, n \]
with n data, where $f(x_i)$ are the values of y corresponding to the values $x_i$ used
in the experiment, and $\epsilon_i$ is the experimental error involved in the measurement of
the variable y at the point. Thus, the error in y at the observed point is
\[ \epsilon_i = f(x_i) - y_i. \]
In the problem of curve fitting, we use the information of the sample data points to
determine a suitable curve (i.e., find a suitable function f ) so that the equation
y = f (x) gives a description of the (x, y) relationship, in other words, it is hoped
that predictions made by means of this equation will not be too much in error.
Let us, first of all, answer the question regarding the choice of function. Given a set
of values $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$, we shall pick a function which we can specify
completely except for the values of a set of k parameters $c_1, c_2, \dots, c_k$; we shall
denote this function by $y = f(x; c_1, c_2, \dots, c_k)$. We then choose values for the
parameters which will make the errors at the observation points (xi, yi) as small as
possible. Next, we shall suggest three ways by which the phrase as small as
possible can be given specific meaning.
In 1., the set of functions is $\{1, x, x^2, \dots, x^{k-1}\}$; in 2., $\{\sin \omega x, \sin 2\omega x, \dots, \sin k\omega x\}$ with $\omega$
a constant chosen to coincide with a periodicity in the data, while in 3., the set is
$\{\cos \omega x, \cos 2\omega x, \dots, \cos k\omega x\}$. Other functions commonly used in curve fitting are
exponential functions, Bessel functions, Legendre polynomials, and
Chebyshev polynomials (cf., for example, Burden and Faires (1993)).
We now present criteria which make precise the concept of choosing a function to
make measurement errors as small as possible. We suppose that the curve to be
fitted can be expressed in a general linear form, with a known set of functions
$\{\phi_1, \phi_2, \dots, \phi_k\}$.
\[ \epsilon_1 = c_1\phi_1(x_1) + c_2\phi_2(x_1) + \dots + c_k\phi_k(x_1) - y_1 \]
\[ \epsilon_2 = c_1\phi_1(x_2) + c_2\phi_2(x_2) + \dots + c_k\phi_k(x_2) - y_2 \]
\[ \vdots \]
\[ \epsilon_n = c_1\phi_1(x_n) + c_2\phi_2(x_n) + \dots + c_k\phi_k(x_n) - y_n \]
If the number of data points is less than or equal to the number of parameters, i.e.,
$n \le k$, it is possible to find values for $\{c_1, c_2, \dots, c_k\}$ which make all the errors $\epsilon_i$
zero. If $n < k$, there is an infinite number of solutions for $\{c_i\}$ which make all the errors zero,
so that an infinite number of curves of the given form pass through all the
experimental points; in this case, the problem is not fully determined, i.e., more
information is needed to choose an appropriate curve.
If n > k, which, in practice, is mostly the case, then it is not normally possible to
make all the errors zero by a choice of the {ci}. There are three possible choices:
1. a set $\{c_i\}$ which minimizes the total absolute error, i.e., minimizes the sum
\[ \sum_{i=1}^{n} |\epsilon_i|; \]
2. a set $\{c_i\}$ which minimizes the maximum absolute error, i.e., minimizes
\[ \max_{i=1,2,\dots,n} |\epsilon_i|; \]
3. a set $\{c_i\}$ which minimizes the sum of the squares of the errors, i.e.,
minimizes
\[ S = \sum_{i=1}^{n} \epsilon_i^2. \]
In order to apply the principle of least squares, use has to be made of partial
differentiation. We now describe the method here and give examples, in order to
show how it is used.
\[ S = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} \left[ c_1\phi_1(x_i) + c_2\phi_2(x_i) + \dots + c_k\phi_k(x_i) - y_i \right]^2 \]
The n values of (xi, yi) are the known measurements taken from n experiments.
When they are inserted on the right-hand side, S becomes an expression involving
only the k unknowns c1, c2, . . , ck. In other words, S may be regarded as a function
of the $c_i$, i.e., $S \equiv S(c_1, c_2, \dots, c_k)$. The problem is now to choose that set of values $\{c_i\}$
which makes S a minimum.
A theorem in calculus tells us that, under certain conditions which are usually
satisfied in practice, the minimum of S occurs when all the partial derivatives
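\[ \frac{\partial S}{\partial c_1}, \frac{\partial S}{\partial c_2}, \dots, \frac{\partial S}{\partial c_k} \]
vanish. The partial derivative $\dfrac{\partial S}{\partial c_1}$ is here the ordinary derivative $\dfrac{dS}{dc_1}$
taken while all the other $c_i$ are held constant; for instance, if $S = 3c_1 + 5c_2$, then
\[ \frac{\partial S}{\partial c_1} = 3 \quad \text{and} \quad \frac{\partial S}{\partial c_2} = 5. \]
The minimum therefore requires
\[ \frac{\partial S}{\partial c_1} = 0, \quad \frac{\partial S}{\partial c_2} = 0, \quad \dots, \quad \frac{\partial S}{\partial c_k} = 0. \]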
This system is a set of equations which is linear in the variables $c_1, c_2, \dots, c_k$ and is
referred to as the normal equations for the least squares approximation.
6. Example
x 1 2 3 4 5 6
y 1 3 4 3 4 2
We shall plot the points on a diagram and use the method of least squares to fit
through them
The plotted points are shown in Figure 12(a). In order to fit a straight line, we have
to find a function $y = c_1 + c_2 x$, i.e., a first degree polynomial which minimizes
\[ S = \sum_{i=1}^{6} \epsilon_i^2 = \sum_{i=1}^{6} \left[ y_i - c_1 - c_2 x_i \right]^2 \]
Differentiating first with respect to $c_1$ (keeping $c_2$ constant) and then with respect to
$c_2$ (keeping $c_1$ constant), and setting the results equal to zero, yields the normal
equations:
\[ \frac{\partial S}{\partial c_1} \equiv -2 \sum_{i=1}^{6} (y_i - c_1 - c_2 x_i) = 0 \]
\[ \frac{\partial S}{\partial c_2} \equiv -2 \sum_{i=1}^{6} x_i (y_i - c_1 - c_2 x_i) = 0 \]
We may divide both equations by -2, take the summation operations through the
brackets, and rearrange, in order to obtain:
\[ \sum_{i=1}^{6} y_i = 6c_1 + \left( \sum_{i=1}^{6} x_i \right) c_2 \]
\[ \sum_{i=1}^{6} x_i y_i = \left( \sum_{i=1}^{6} x_i \right) c_1 + \left( \sum_{i=1}^{6} x_i^2 \right) c_2 \]
We see that, in order to obtain a solution, we have to evaluate the four sums
$\sum x_i$, $\sum y_i$, $\sum x_i^2$, $\sum x_i y_i$ and insert them into these equations. We can arrange the
work in a table as follows (the last three columns are devoted to fitting of the
parabola and the required sums are in the last row):
  i     x_i    y_i    x_i^2    x_i y_i    x_i^2 y_i    x_i^3    x_i^4
  1      1      1       1         1           1           1        1
  2      2      3       4         6          12           8       16
  3      3      4       9        12          36          27       81
  4      4      3      16        12          48          64      256
  5      5      4      25        20         100         125      625
  6      6      2      36        12          72         216     1296
 Sum    21     17      91        63         269         441     2275
\[ 17 = 6c_1 + 21c_2 \]
\[ 63 = 21c_1 + 91c_2 \]
The solutions to 2D are $c_1 = 2.13$ and $c_2 = 0.20$, whence the required line is
(Figure 12(b)):
\[ y = 2.13 + 0.20x \]
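The two normal equations for the straight line can be solved in closed form. The sketch below (with an illustrative function name `fit_line`) forms the four sums from the data of the example and applies Cramer's rule to the 2 × 2 system.

```python
def fit_line(xs, ys):
    """Least squares line y = c1 + c2*x from the two normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx            # determinant of the 2x2 system
    c1 = (sy * sxx - sx * sxy) / det
    c2 = (n * sxy - sx * sy) / det
    return c1, c2

xs = [1, 2, 3, 4, 5, 6]
ys = [1, 3, 4, 3, 4, 2]

c1, c2 = fit_line(xs, ys)
print(round(c1, 2), round(c2, 2))
```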
which minimizes
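\[ S = \sum_{i=1}^{6} \epsilon_i^2 = \sum_{i=1}^{6} \left[ y_i - c_1 - c_2 x_i - c_3 x_i^2 \right]^2 \]
The corresponding normal equations are:
\[ \sum_{i=1}^{6} y_i = 6c_1 + \left( \sum_{i=1}^{6} x_i \right) c_2 + \left( \sum_{i=1}^{6} x_i^2 \right) c_3 \]
\[ \sum_{i=1}^{6} x_i y_i = \left( \sum_{i=1}^{6} x_i \right) c_1 + \left( \sum_{i=1}^{6} x_i^2 \right) c_2 + \left( \sum_{i=1}^{6} x_i^3 \right) c_3 \]
\[ \sum_{i=1}^{6} x_i^2 y_i = \left( \sum_{i=1}^{6} x_i^2 \right) c_1 + \left( \sum_{i=1}^{6} x_i^3 \right) c_2 + \left( \sum_{i=1}^{6} x_i^4 \right) c_3 \]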
Inserting the values for the sums (see the table above), we obtain the system of
linear equations:
it is also plotted in Figure 13(b). Obviously, the parabola is a better fit than the
straight line!
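For polynomials of any degree the normal equations have the same pattern of power sums, so they can be assembled and solved mechanically. The following is a sketch under stated assumptions: the helper names `solve`, `fit_polynomial` and `sum_squares` are hypothetical, and the linear solver is a plain Gaussian elimination with partial pivoting rather than a method described in this chapter.

```python
def solve(M, rhs):
    """Solve M c = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            m = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= m * aug[col][j]
    c = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * c[j] for j in range(i + 1, n))
        c[i] = (aug[i][n] - s) / aug[i][i]
    return c

def fit_polynomial(xs, ys, degree):
    """Coefficients [c1, c2, ...] of c1 + c2*x + c3*x^2 + ... by least squares."""
    k = degree + 1
    M = [[sum(x ** (r + c) for x in xs) for c in range(k)] for r in range(k)]
    rhs = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(k)]
    return solve(M, rhs)

def sum_squares(xs, ys, coeffs):
    """S = sum of squared errors of the fitted polynomial at the data points."""
    return sum((y - sum(c * x ** p for p, c in enumerate(coeffs))) ** 2
               for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5, 6]
ys = [1, 3, 4, 3, 4, 2]

line = fit_polynomial(xs, ys, 1)
parabola = fit_polynomial(xs, ys, 2)
print([round(c, 2) for c in parabola])
print(sum_squares(xs, ys, parabola) < sum_squares(xs, ys, line))
```

Comparing the two values of S gives a numerical meaning to the statement that the parabola fits better than the line.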
EXERCISES
1. For the example above (the data points are shown in Figure 12(a)) compute
the value of S, the sum of the squares of the errors at the points, from 1. the
fitted line, and 2. the fitted parabola. Plot the points on graph paper, and fit
a straight line by eye (i.e., use a ruler to draw a line, guessing its best
position). Determine the value of S for this line and compare it with the
value for the least squares line.
Fit a straight line by the least squares method to each of the following sets
of data:
toughness x 36 41 42 43 44 45 47 50
% nickel y 2.5 2.7 2.8 2.9 3.0 3.2 3.3 3.5
b) Aptitude test marks x, given to six trainee sales people, and their first-
year sales y in thousands of dollars.
Aptitude test x 25 29 33 36 42 54
First-year sales y 42 45 50 48 73 90
For both sets of data, plot the points and draw the least squares line. Use the
lines to predict the % - nickel of a specimen of steel the toughness of which
is 38, and the likely first-year sales of a trainee sales person who obtains a
mark of 48 in the aptitude test.
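\[ \begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \\ \sum x_i^3 y_i \end{bmatrix} = \begin{bmatrix} n & \sum x_i & \sum x_i^2 & \sum x_i^3 \\ \sum x_i & \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \\ \sum x_i^2 & \sum x_i^3 & \sum x_i^4 & \sum x_i^5 \\ \sum x_i^3 & \sum x_i^4 & \sum x_i^5 & \sum x_i^6 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{bmatrix} \]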
Deduce the matrix form of the normal equations for fitting a fourth-
degree polynomial.
3. Use the least squares method to fit a parabola to the points (0,0), (1,1),
(2,3), (3,3), and (4,2). Find the value of S for this fit.
4. Find the normal equations which arise while fitting by the least squares
method an equation of the form $y = c_1 + c_2 \sin x$ to the set of points
$(0, 0)$, $(\pi/6, 1)$, $(\pi/2, 3)$ and $(5\pi/6, 2)$. Solve them for $c_1$ and $c_2$.
SECTION B
Definition:
The ORDER of a differential equation is the order of the highest derivative that it
contains.
Examples
b) $\dfrac{d^2y}{dx^2} - 6\dfrac{dy}{dx} + 8y = 0$   (order 2)
c) $\dfrac{d^3y}{dt^3} - t\dfrac{dy}{dt} + ty = e^t$   (order 3)
d) $y' - y = e^{2x}$   (order 1)
Example :
$y = e^{2x}$ is a solution of the differential equation
\[ \frac{dy}{dx} - y = e^{2x} \qquad (1) \]
on the interval $I = (-\infty, \infty)$, since substituting y and its derivative into the left side of
this equation yields
\[ \frac{dy}{dx} - y = \frac{d}{dx}\left(e^{2x}\right) - e^{2x} = 2e^{2x} - e^{2x} = e^{2x} \]
for all real values of x. However, this is not the only solution on I; for example,
\[ y = Ce^x + e^{2x} \qquad (2) \]
is also a solution for every real value of the constant C, since
\[ \frac{dy}{dx} - y = \frac{d}{dx}\left(Ce^x + e^{2x}\right) - \left(Ce^x + e^{2x}\right) = \left(Ce^x + 2e^{2x}\right) - \left(Ce^x + e^{2x}\right) = e^{2x}. \]
After developing some techniques for solving equations such as (1) we will be able
to show that all solutions of (1) on $(-\infty, \infty)$ can be obtained by substituting values for
the constant C in (2).
INITIAL-VALUE PROBLEM(IVP)
For a first order equation, the single arbitrary constant can be determined by
specifying the value of the unknown function $y(x)$ at an arbitrary x-value $x_0$, say
$y(x_0) = y_0$. This is called an initial condition, and the problem of solving a first-order
equation subject to an initial condition is called a first-order initial-value
problem.
can be obtained by substituting the initial condition $x = 0$, $y = 3$ in the general
solution (2) to find C.
\[ \implies 3 = Ce^0 + e^0 = C + 1 \]
\[ \implies C = 2 \]
\[ \implies y(x) = 2e^x + e^{2x} \]
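A solution obtained in this way can be checked numerically. The sketch below is an illustration, not a proof: it approximates dy/dx by a central difference (the step size `h` is an arbitrary choice) and confirms that $y = 2e^x + e^{2x}$ satisfies both the equation $dy/dx - y = e^{2x}$ and the initial condition $y(0) = 3$ to within rounding.

```python
import math

def y(x, C=2.0):
    """Candidate solution y = C e^x + e^(2x)."""
    return C * math.exp(x) + math.exp(2 * x)

def residual(x, C=2.0, h=1e-5):
    """dy/dx - y - e^(2x), with dy/dx estimated by a central difference."""
    dydx = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return dydx - y(x, C) - math.exp(2 * x)

print(y(0.0))                                   # initial condition
print([residual(x) for x in (-1.0, 0.0, 0.5, 1.0)])
```

The residuals are not exactly zero because the derivative is only approximated; they shrink as `h` is reduced, until rounding error takes over.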
(e.g.)
\[ \frac{dy}{dx} = x^3 \qquad (4) \]
\[ \implies y = \frac{x^4}{4} + C. \]
(e.g.)
    Equation                                        P(x)           Q(x)
a) $\dfrac{dy}{dx} + x^2 y = e^x$                   $x^2$          $e^x$
b) $\dfrac{dy}{dx} + (\sin x)\,y + x^3 = 0$         $\sin x$       $-x^3$
c) $\dfrac{dy}{dx} + 5y = 2$                        $5$            $2$
d) $\dfrac{dy}{dx} - y = x$                         $-1$           $x$
Example #1: (Solve the differential equation)
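\[ \frac{dy}{dx} = -4xy^2 \]
and then solve the initial-value problem (I.V.P.) $\dfrac{dy}{dx} = -4xy^2$, $y(0) = 1$.
\[ \implies \frac{1}{y^2}\frac{dy}{dx} = -4x \quad \text{where } y \neq 0 \]
\[ \implies \int y^{-2}\,dy = -4\int x\,dx \]
\[ \implies -\frac{1}{y} = -2x^2 + C_1 \]
Solving for y as a function of x, and writing $C = -C_1$, we obtain $y = \dfrac{1}{2x^2 + C}$.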
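Substituting the initial condition $y(0) = 1$ into $y = 1/(2x^2 + C)$ gives $C = 1$. As a rough independent check, the initial-value problem can also be integrated numerically. The classical fourth-order Runge-Kutta method used below is not part of this text; it is brought in only as a convenient integrator, and the step count of 100 is an arbitrary choice.

```python
def rk4(f, x0, y0, x_end, steps):
    """Integrate dy/dx = f(x, y) from x0 to x_end with fixed steps (classical RK4)."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def f(x, y):
    """Right-hand side of dy/dx = -4 x y^2."""
    return -4 * x * y * y

numeric = rk4(f, 0.0, 1.0, 1.0, 100)     # numerical y(1)
exact = 1 / (2 * 1.0**2 + 1)             # y(1) from y = 1/(2x^2 + 1)
print(numeric, exact)
```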
INTEGRATING FACTOR.
We will assume that the functions P ( x ) and Q ( x ) in (5) are continuous on a common
interval I, and we will look for a general solution that is valid on I.
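\[ \implies \frac{d\mu}{dx} = e^{\int P(x)\,dx} \cdot \frac{d}{dx}\int P(x)\,dx \]
\[ \implies \frac{d\mu}{dx} = \mu P(x) \]
Thus,
\[ \frac{d}{dx}(\mu y) = \mu\frac{dy}{dx} + y\frac{d\mu}{dx} = \mu\frac{dy}{dx} + \mu P(x)\,y \qquad (7) \]
Multiplying (5) by $\mu$ gives
\[ \mu\frac{dy}{dx} + \mu P(x)\,y = \mu Q(x) \]
\[ \implies \frac{d}{dx}(\mu y) = \mu Q(x) \qquad (8) \]
\[ \implies y = \frac{1}{\mu}\int \mu Q(x)\,dx \qquad (9) \]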
The function μ is called an integrating factor for (5) and this method for finding a
general solution of (5) is called the method of integrating factors.
STEP # 1: Calculate the integrating factor $\mu = e^{\int P(x)\,dx}$ since any $\mu$ will
step.
this step.
\[ \implies \frac{d}{dx}\left(e^{-x} y\right) = e^x \]
\[ \implies \int d\left(e^{-x} y\right) = \int e^x\,dx \]
\[ \implies e^{-x} y = e^x + C \]
\[ \implies y = e^{2x} + Ce^x \]
\[ \implies \frac{dy}{dx} + \frac{Q(x)}{P(x)}\,y = \frac{R(x)}{P(x)} \qquad (10) \]
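\[ \implies \frac{dy}{dx} - \frac{1}{x}\,y = 1, \quad P(x) = -\frac{1}{x} \text{ and } Q(x) = 1 \]
\[ \mu = e^{\int P(x)\,dx} = e^{\int -\frac{1}{x}\,dx} = e^{-\ln x} = \frac{1}{x} \]
\[ \mu\frac{dy}{dx} - \mu\left(\frac{1}{x}\right) y = \mu \cdot 1 \]
\[ \implies \frac{1}{x}\frac{dy}{dx} - \frac{1}{x^2}\,y = \frac{1}{x} \]
\[ \implies \frac{d}{dx}\left(\frac{1}{x}\,y\right) = \frac{1}{x} \]
\[ \implies \int d\left(\frac{1}{x}\,y\right) = \int \frac{1}{x}\,dx \]
\[ \implies \frac{1}{x}\,y = \ln|x| + C \]
\[ \implies y = x\ln|x| + Cx \]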
Although there is no general method for solving non-linear first-order ODEs, we will
now consider a method of solution that can often be applied to first-order equations
that are expressible in the form
\[ h(y)\frac{dy}{dx} = g(x) \qquad (11) \]
\[ \implies h(y)\,dy = g(x)\,dx \qquad (12) \]
STEP #1: Separate the variables in (11) by rewriting the equation in the
STEP #2: Integrate both sides of the equation in STEP #1 (the left side with
\[ \int h(y)\,dy = \int g(x)\,dx \]
STEP #3: If H ( y ) is any antiderivative of h ( y ) and G(x ) is any
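\[ \implies \int \frac{1}{y}\,dy = \int dx \]
\[ \implies \ln|y| = x + C \]
\[ \implies e^{\ln|y|} = e^{x + C} \]
\[ \implies y = Ae^x, \text{ where } A = \pm e^C \text{ (constant)} \]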
EXERCISES
1) Solve the equations using both the method of integrating factors and
the method of separation of variables and determine whether the
c) $\dfrac{dy}{dx} - 4xy = 0$   d) $\dfrac{dy}{dx} + y = 0$
c) $y' + y = \cos(e^x)$   d) $2\dfrac{dy}{dx} + 4y = 1$
e) $(x^2 + 1)\dfrac{dy}{dx} + xy = 0$
b) $\dfrac{dy}{dx} + y + \dfrac{1}{1 - e^x} = 0$
c) $\left(\dfrac{\sqrt{1 + x^2}}{1 + y}\right)\dfrac{dy}{dx} = -x$   d) $y' = -xy$
4) In each part, find the solution of the differential equation $x\dfrac{dy}{dx} + y = x$
a) $y(1) = 2$   b) $y(-1) = 2$
5) In each part, find the solution of the differential equation $\dfrac{dy}{dx} = xy$ that
satisfies the initial condition   a) $y(0) = 1$   b) $y(0) = \dfrac{1}{2}$
c) $y' = \dfrac{3x^2}{2y + \cos(y)}$, $y(0) = \pi$   d) $y' - xe^y = 2e^y$, $y(0) = 0$
e) $\dfrac{dy}{dt} = \dfrac{2t + 1}{2y - 2}$, $y(0) = -1$   f) $\dfrac{dx}{dt} = \dfrac{t^2 + 1}{x + 2}$, $x(0) = -2$
g) $t(t - 1)\dfrac{dx}{dt} = x(x + 1)$, $x(2) = 2$   h) $\dfrac{dx}{dt} = (x^2 - 1)\cos t$, $x(0) = 2$
i) $\dfrac{dx}{dt} = e^{x + t}$, $x(0) = a$   j) $\dfrac{dx}{dt} = \dfrac{4\ln t}{x^2}$, $x(1) = 0$
enter the tank at a rate of 2gal/min and the mixed solution is drained
ultimate value of the concentration ?
Difference Equations
At this point almost all of our sequences have had explicit formulas for their terms.
That is, we have looked mainly at sequences for which we could write the nth term
as $a_n = f(n)$ for some known function f. For example, if
\[ a_n = \frac{n + 1}{n^2 + 3} \]
then it is an easy matter to compute explicitly, say, $a_{10} = \dfrac{11}{103}$ or $a_{100} = \dfrac{101}{10003}$. In such
cases we are able to compute any given term in the sequence without reference to
any other terms in the sequence. However, it is often the case in applications that
we do not begin with an explicit formula for the terms of a sequence; rather, we
may know only some relationship between the various terms. An equation which
expresses a value of a sequence as a function of the other terms in the sequence is
called a difference equation. In particular, an equation which expresses the value
$a_n$ of a sequence $\{a_n\}$ as a function of the term $a_{n-1}$ is called a first-order difference
equation. If we can find a function f such that $a_n = f(n)$, $n = 1, 2, 3, \dots$, then we will have
solved the difference equation. In this section we will consider a class of difference
equations that are solvable in this sense; in the next section we will discuss an
example where an explicit solution is not possible.
year. If we let $x_0$ represent the size of the initial population of owls and
$x_n$ the number of owls n years later, then
\[ x_{n+1} = x_n + 0.02x_n = 1.02x_n \qquad (1) \]
for $n = 0, 1, 2, \dots$. That is, the number of owls in any given year is equal to the number
of owls in the previous year plus 2% of the number of owls in the previous year.
Equation (1) is an example of a first-order difference equation; it relates the
number of owls in a given year with the number of owls in the previous year.
Hence we know the value of a specific x n once we know the value of x n−1.
To get the sequence started we have to know the value of x 0. For example, if
initially we have a population of x 0=100 owls and we want to know what the
population will be after 4 years, we may compute
\[ \begin{aligned}
x_4 &= 1.02x_3 \\
&= (1.02)(1.02)x_2 \\
&= (1.02)(1.02)(1.02)x_1 \\
&= (1.02)(1.02)(1.02)(1.02)x_0 \\
&= (1.02)^4 x_0.
\end{aligned} \]
For a geometric feeling of how the population is changing with time, Figure 1.1
shows a plot of the points ( n , x n ) , n=0,1,2, . . . 100. Of course, whether or not our
model will provide an accurate prediction of the owl population 100 or 200 years
into the future is an entirely different question. Frequently, a simple population
model like this will be valid only for a short span of time during which the rate of
growth of population remains stable.
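The step-by-step computation and the closed form $(1.02)^4 x_0$ can be compared directly. This is a small sketch with an illustrative function name `iterate`, using the initial population of 100 owls from the text.

```python
def iterate(alpha, x0, n):
    """Apply x_{k+1} = alpha * x_k a total of n times, starting from x0."""
    x = x0
    for _ in range(n):
        x = alpha * x
    return x

x4 = iterate(1.02, 100, 4)       # year-by-year computation
closed = 1.02**4 * 100           # closed form (1.02)^4 * x_0
print(x4, closed)
```

Since a population is a whole number, the result (a little over 108) would be read as about 108 owls.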
\[ x_{n+1} = \alpha x_n \qquad (3) \]
$n = 0, 1, 2, \dots$, is given by
\[ x_n = \alpha^n x_0 \qquad (4) \]
$n = 0, 1, 2, \dots$. Note that this difference equation, and its solution, are useful
whenever we are interested in a sequence of numbers where the (n+1)st term is a
constant proportion of the nth term. Our first example, where a population was
assumed to grow at a constant rate, is a common example of this type of behavior.
Another common example is when a quantity decreases at a constant rate over
time. This behavior is discussed in the next example in the context of radioactive
decay.
Example
Radium is a radioactive element which decays at a rate of 1% every 25 years. This
means that the amount left at the beginning of any given 25 year period is equal to
the amount at the beginning of the previous 25 year period minus 1% of that
amount. That is, if $x_0$ is the initial amount of radium and $x_n$ is the amount of
radium still remaining after 25n years, then
\[ x_{n+1} = x_n - 0.01x_n = 0.99x_n \]
for $n = 0, 1, 2, \dots$. Since this is a difference equation of the form of (3) with $\alpha = 0.99$ we
know that the solution is of the form (4). Namely,
\[ x_n = (0.99)^n x_0 \]
for $n = 0, 1, 2, \dots$. For example, the amount left after 100 years is given by
\[ x_4 = (0.99)^4 x_0 = 0.9606x_0 \]
where we have rounded the answer to four decimal places. That is, approximately
96% of the initial amount of radium will be left after 100 years. A plot of the
amount of radium left versus number of years, assuming an initial amount of 500
grams, is given in Figure1.2.
The half-life of a radioactive element is the number of years required for one-half
of an initial amount to decay. Suppose that, for this example, N is the smallest
integer for which x N is less than one-half of the initial amount of radium. This
would mean that
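\[ \frac{1}{2}x_0 \ge (0.99)^N x_0, \]
which implies that
\[ \frac{1}{2} \ge (0.99)^N. \]
Taking logarithms, and noting that dividing by $\log_{10}(0.99) < 0$ reverses the inequality, we obtain
\[ N \ge \frac{\log_{10}\left(\frac{1}{2}\right)}{\log_{10}(0.99)} = 68.97, \]
rounding to two decimal places. Hence, since N must be an integer, we have N = 69.
Figure 1.2 Plot of amount of radium versus number of years.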
Recalling that we are working with 25 year units of time, this shows that the half-
life of radium is approximately (25)(69) = 1725 years. For example, this means
that if we started with an initial amount of 100 grams of radium, after 1725 years
we would still have 50 grams left. It would then take an additional 1725 years until
the remaining amount would be reduced to 25 grams.
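The value of N can be confirmed both from the logarithm formula and by direct search. The variable names below are illustrative only.

```python
import math

ratio = math.log10(0.5) / math.log10(0.99)   # about 68.97
N = math.ceil(ratio)                         # smallest integer not below the ratio

# Direct search: first n for which (0.99)^n drops below one half
n = 0
while 0.99**n >= 0.5:
    n += 1

print(round(ratio, 2), N, n, 25 * N)
```

Both routes give the same N, and multiplying by the 25-year unit of time gives the half-life quoted in the text.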
Although we have stated the results of the preceding example in discrete time
units, namely, units of 25 years each, later we will see that the results hold for
continuous time as well. In other words, although the difference equation (5) has
been set up for nonnegative integer values of n, the solution (6) is valid for
arbitrary nonnegative values of n .
It is interesting to compare the plots in Figures 1.1 and 1.2. The first is an example
of exponential growth, whereas the second is an example of exponential decay.
In the first, the steepness of the graph increases with time; in the second, the graph
flattens out over time. The difference equation (3) will always lead to the first
behaviour when α >1 and to the second when 0< α <1 .
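\[ x_{n+1} = \alpha x_n + \beta \qquad (6) \]
\[ \begin{aligned}
x_n &= \alpha x_{n-1} + \beta \\
&= \alpha(\alpha x_{n-2} + \beta) + \beta \\
&= \alpha^2 x_{n-2} + \beta(\alpha + 1) \\
&= \alpha^2(\alpha x_{n-3} + \beta) + \beta(\alpha + 1) \\
&= \alpha^3 x_{n-3} + \beta(\alpha^2 + \alpha + 1) \\
&\;\;\vdots \\
&= \alpha^n x_0 + \beta\left(\alpha^{n-1} + \alpha^{n-2} + \dots + \alpha^2 + \alpha + 1\right)
\end{aligned} \]
\[ \alpha^{n-1} + \alpha^{n-2} + \dots + \alpha^2 + \alpha + 1 = \frac{1 - \alpha^n}{1 - \alpha} \quad (\alpha \neq 1) \]
Hence
\[ x_n = \alpha^n x_0 + \beta\left(\frac{1 - \alpha^n}{1 - \alpha}\right) \qquad (8) \]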
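The closed form can be checked against direct iteration of $x_{n+1} = \alpha x_n + \beta$. In the sketch below the function names and the sample values $\alpha = 0.5$, $\beta = 3$, $x_0 = 10$ are arbitrary illustrations, not taken from the text; the formula requires $\alpha \neq 1$.

```python
def iterate_linear(alpha, beta, x0, n):
    """Apply x_{k+1} = alpha * x_k + beta a total of n times."""
    x = x0
    for _ in range(n):
        x = alpha * x + beta
    return x

def closed_form(alpha, beta, x0, n):
    """x_n = alpha^n x0 + beta (1 - alpha^n) / (1 - alpha), valid for alpha != 1."""
    return alpha**n * x0 + beta * (1 - alpha**n) / (1 - alpha)

alpha, beta, x0 = 0.5, 3.0, 10.0             # illustrative values
for n in (0, 1, 5, 8):
    print(n, iterate_linear(alpha, beta, x0, n), closed_form(alpha, beta, x0, n))
```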
We have seen examples of first-order linear equations in the population growth and
radioactive decay examples above. Another interesting example arises in modeling
the change in temperature of an object placed in an environment held at some
constant temperature, such as a cup of tea cooling to room temperature or a glass of
lemonade warming to room temperature. If T 0 represents the initial temperature of
the object, S the constant temperature of the surrounding environment, and T n the
temperature of the object after n units of time, then the change in temperature over
one unit of time is given by
\[ T_{n+1} - T_n = k(T_n - S), \]
$n = 0, 1, 2, \dots$, where k is a constant which depends upon the object. This difference
equation is known as Newton’s law of cooling. The equation says that the change
in temperature over a fixed unit of time is proportional to the difference between
the temperature of the object and the temperature of the surrounding environment.
That is, large temperature differences result in a faster rate of cooling (or warming)
than do small temperature differences. If S is known and enough information is
given to determine k , then this equation may be rewritten in the form of a first
order-linear difference equation and, hence, solved explicitly. The next example
shows how this may be done.
Example
Solution
If we let T n be the temperature of the tea after n minutes and we let S be the
temperature of the room, then we have T 0=180 , T 1=175 , and S=80.
Newton's law of cooling states that
\[ T_1 - T_0 = k(T_0 - S). \]
That is,
\[ 175 - 180 = k(180 - 80). \]
Hence
\[ -5 = 100k, \]
and so
\[ k = \frac{-5}{100} = -0.05. \]
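Thus $T_{n+1} = T_n - 0.05(T_n - 80) = 0.95T_n + 4$, a first-order linear difference equation
with $\alpha = 0.95$ and $\beta = 4$, whose solution is
\[ \begin{aligned}
T_n &= (0.95)^n(180) + 4\left(\frac{1 - (0.95)^n}{1 - 0.95}\right) \\
&= 180(0.95)^n + 80\left(1 - (0.95)^n\right) \\
&= 80 + 100(0.95)^n
\end{aligned} \]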
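for $n = 0, 1, 2, \dots$. In particular,
\[ T_{20} = 80 + 100(0.95)^{20} = 115.85 \]
where we have rounded the answer to two decimal places. Hence after 20 minutes
the tea has cooled to just under 116°F. Also, since
\[ \lim_{n \to \infty} (0.95)^n = 0, \]
we see that
\[ \lim_{n \to \infty} T_n = \lim_{n \to \infty} \left(80 + 100(0.95)^n\right) = 80. \]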
That is, as we would expect, the temperature of the tea will approach an
equilibrium temperature of 80° F, the room temperature. In Figure 1.3 we have
plotted temperature $T_n$ versus time $n$ for $n = 0, 1, 2, \ldots, 60$, along with the horizontal
line $T = 80$. As indicated by (12), we can see that $T_n$ decreases asymptotically
toward 80° F as $n$ increases.
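The tea example can also be reproduced numerically. The sketch below assumes Newton's law of cooling in the form T_{n+1} − T_n = k(T_n − S) and uses the data of the example (T_0 = 180, T_1 = 175, S = 80). The function names `cooling_constant` and `temperature` are hypothetical helpers introduced for illustration.

```python
def cooling_constant(t0, t1, s):
    """Solve T_1 - T_0 = k * (T_0 - S) for k."""
    return (t1 - t0) / (t0 - s)


def temperature(n, t0, s, k):
    """Iterate T_{j+1} = T_j + k * (T_j - S) for n steps, starting from T_0 = t0."""
    t = t0
    for _ in range(n):
        t = t + k * (t - s)
    return t


# Data from the worked example: tea at 180 F, 175 F after one minute, room at 80 F.
k = cooling_constant(180.0, 175.0, 80.0)
t20 = temperature(20, 180.0, 80.0, k)

print(round(k, 2))    # cooling constant
print(round(t20, 2))  # temperature after 20 minutes, to two decimal places

# Compare the iteration with the closed form 80 + 100 * (0.95)**n.
for n in (0, 20, 60):
    print(n, temperature(n, 180.0, 80.0, k), 80 + 100 * 0.95**n)
```

Step-by-step iteration and the closed form agree up to rounding, and both values settle toward 80 as n grows.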
EXERCISES
1. Compute the next five terms of each of the following sequences from the given
information.
(a) $x_0 = 10$, $x_{n+1} = x_n + 4$   (b) $y_0 = -1$, $y_{n+1} = \frac{1}{y_n}$
(c) $x_0 = 40$, $x_{n+1} = 2x_n - 20$   (d) $z_0 = 2$, $z_{n+1} = z_n^2 - z_n$
(e) $x_0 = 2$, $x_1 = 3$, $x_{n+2} = x_{n+1} + x_n$   (f) $x_0 = 15$, $x_n = \frac{1}{3} x_{n-1} + 2$
2. Solve the following difference equations with the given initial condition.
Use your solution to find $x_{10}$.
(a) $x_{n+1} = 2x_n$, $x_0 = 5$   (b) $x_{n+1} = \frac{3}{4} x_n$, $x_0 = 100$
3. Suppose a population of weasels is growing at a rate of 3% per year. Let $x_n$ be the number
of weasels $n$ years from now and suppose that there are currently 350 weasels.
(a) Write a difference equation which describes how the population changes from
year to year.
(b) Solve the difference equation of part (a). If the population growth continues at
the rate of 3%, how many weasels will there be 15 years from now?
(d) How many years will it take for the population to double?
3%, how many years would it take for the population to double?
year if left to itself, but poachers kill 6 weasels every year for their fur.
(a) Write a difference equation which describes how the population changes from
year to year.
(b) Solve the difference equation of part (a). How many weasels will there be in 15
years?
(d) Will the population eventually double? If so, how long will this take?
6. A cup of coffee has an initial temperature of 180° F , but cools to 180° F in one
(a) Write a difference equation, in standard first order linear form, which describes
9. A glass of ginger ale is left in a room. Initially, the ginger ale has a temperature
of 45° F, but after one minute the temperature has increased to 50° F and after