Approximations and Errors in Numerical Computing
Objective:
Approximations and errors are an unavoidable part of numerical computing; they appear everywhere and cannot be ignored. Errors come in a variety of forms and sizes; some are avoidable, some are not. For instance, data conversion and round off errors cannot be avoided, but human error can be eliminated. Although certain errors cannot be removed completely, we must try to minimize them in our final solutions. It is therefore essential to know how errors arise, how they grow during a numerical procedure, and how they affect the accuracy of a solution. The main objective of this lecture is to understand the major sources of error in numerical methods.
Exact Numbers:
2, 1/3, 100, etc. are exact numbers because there is no approximation or uncertainty associated with them. π, √2, etc. are also exact numbers when written in this form.
Approximate Numbers:
An approximate number is a number that is used as an approximation to an exact number and differs only slightly from the exact number it stands for. For example, an approximate value of π is 3.14 or, if we desire a better approximation, 3.14159265. But we cannot write down the exact value of π.
Significant Digits:
The digits used to express a number are called significant digits (or figures). A significant digit is one of the digits 1, 2, ..., 9; the digit 0 is also significant except when it is used to fix the decimal point or to fill the places of unknown or discarded digits.
Rounding Off:
In numerical computation we come across numbers that have a large number of digits, and it is often necessary to cut them down to a manageable number of figures. This process is referred to as rounding off. The error caused by cutting a number down to a usable number of figures is called a round-off error.
For example, 3.14159265... rounded to the nearest thousandth is 3.142. The third digit after the decimal point is the thousandths place, and 3.14159265... is closer to 3.142 than to 3.141.
Examples (rounded to four significant digits):
1.6583 → 1.658
30.0567 → 30.06
0.859458 → 0.8595
3.14159 → 3.142
3.14358 → 3.144
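The rounding rule above can be sketched in Python. The helper name `round_sig` is hypothetical, and note that Python's built-in `round` uses banker's rounding for exact ties:

```python
from math import floor, log10

def round_sig(x, n=4):
    """Round x to n significant digits (a minimal sketch)."""
    # the position of the leading digit decides how many decimals to keep
    return round(x, n - 1 - floor(log10(abs(x))))

print(round_sig(1.6583))    # 1.658
print(round_sig(30.0567))   # 30.06
print(round_sig(3.14159))   # 3.142
```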
Sources of Errors
A number of different types of errors arise during the process of numerical computing. All these
errors contribute to the total error in the final result.
1. Inherent Errors: Inherent errors are those present in the data supplied to the model (they are also known as input errors). They have two components: data errors and conversion errors.
Data Errors:
Data errors (also known as statistical errors) occur when the data for a problem are obtained by experimental means and are therefore of limited precision and reliability. This may be due to limitations in instrumentation and reading, and may therefore be unavoidable.
Conversion Error:
Conversion errors (also known as representation errors) occur due to the computer's limited capacity to store data accurately. We know that the floating point representation retains only a specified number of digits; the unretained digits constitute the round-off error.
Example: Represent the decimal numbers 0.1 and 0.4 in binary form with an accuracy of 8 binary digits. Add them, and then convert the result back to decimal form.
Solution:
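The conversion can be sketched numerically (assuming chopping to 8 binary fraction digits; `to_fixed_binary` is a hypothetical helper):

```python
def to_fixed_binary(x, bits=8):
    """Chop 0 <= x < 1 to `bits` binary fraction digits (greedy digit extraction)."""
    value = 0.0
    for i in range(1, bits + 1):
        bit = 2.0 ** -i
        if x >= bit:          # this binary digit is 1
            x -= bit
            value += bit
    return value

a = to_fixed_binary(0.1)   # 0.00011001 in binary = 0.09765625
b = to_fixed_binary(0.4)   # 0.01100110 in binary = 0.3984375
print(a + b)               # 0.49609375, not 0.5 -- a conversion error
```

Neither 0.1 nor 0.4 has a finite binary expansion, so the 8-digit representations are already inexact and their sum misses 0.5 by about 0.004.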
2. Numerical Errors:
Numerical errors are introduced during the process of implementing a numerical procedure (they are also known as procedural errors). They come in two forms: round off errors and truncation errors.
a) Round off Errors: Round off errors occur because only a fixed number of mantissa digits can be retained. The extra digits can be handled in two ways: chopping and symmetric round off.
i) Chopping: In chopping, the extra digits are simply dropped; this is called truncating the number. For example, suppose we are using a computer with a fixed word length of four digits. Then a number like 42.7893 will be stored as 42.78 and the digits 93 will be dropped.
x = 42.7893
  = 0.427893 × 10^2
  = (0.4278 + 0.000093) × 10^2
  = (0.4278 + 0.93 × 10^-4) × 10^2
This can be expressed in general form as:
True x = (f_x + g_x × 10^-d) × 10^E
       = f_x × 10^E + g_x × 10^(E-d)
       = approximate x + error
In chopping, Error = g_x × 10^(E-d), 0 ≤ g_x < 1, where d is the length of the mantissa, E is the exponent, and g_x is the truncated part of the number in normalized form.
In chopping, the absolute error < 10^(E-d).
ii) Symmetric Round off: In the symmetric round off method, the last retained significant digit is rounded up by 1 if the first discarded digit is greater than or equal to 5; otherwise, the last retained digit is left unchanged. For example, the number 42.7893 would become 42.79 and the number 76.5432 would become 76.54.
In symmetric round off, when g_x < 0.5:
True x = f_x × 10^E + g_x × 10^(E-d)
Approximate x = f_x × 10^E
Error = g_x × 10^(E-d)
In symmetric round off, when g_x ≥ 0.5:
Approximate x = (f_x + 10^-d) × 10^E = f_x × 10^E + 10^-d × 10^E
Error = [f_x × 10^E + g_x × 10^(E-d)] − [f_x × 10^E + 10^-d × 10^E]
      = (g_x − 1) × 10^(E-d)
In symmetric round off, the absolute error ≤ 0.5 × 10^(E-d).
Sometimes the banker's rounding rule is used for symmetric round off.
Example: Find the round off error in storing the number 752.6835 using a four-digit mantissa.
Solution:
True x = 752.6835
       = 0.7526835 × 10^3
       = (0.7526 + 0.0000835) × 10^3
       = (0.7526 + 0.835 × 10^-4) × 10^3
       = 0.7526 × 10^3 + 0.835 × 10^-1
Chopping method:
Approximate x = 0.7526 × 10^3
Error = 0.835 × 10^-1 = 0.0835
Symmetric round off:
Approximate x = 0.7527 × 10^3
Error = (g_x − 1) × 10^(E-d) = (0.835 − 1) × 10^-1 = −0.0165
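Both storage schemes can be sketched in Python. The helper names `chop` and `sym_round` are hypothetical, and a decimal mantissa is simulated rather than the machine's binary one:

```python
import math

def mantissa_exponent(x):
    """Normalize x as f * 10**E with 0.1 <= f < 1."""
    e = math.floor(math.log10(abs(x))) + 1
    return x / 10**e, e

def chop(x, d=4):
    """Keep d mantissa digits by dropping the rest."""
    f, e = mantissa_exponent(x)
    return math.trunc(f * 10**d) / 10**d * 10**e

def sym_round(x, d=4):
    """Keep d mantissa digits with symmetric rounding."""
    f, e = mantissa_exponent(x)
    return round(f * 10**d) / 10**d * 10**e

x = 752.6835
print(chop(x), x - chop(x))             # ~752.6, error ~ 0.0835
print(sym_round(x), x - sym_round(x))   # ~752.7, error ~ -0.0165
```

The printed errors match the worked example: chopping loses the full tail 0.0835, while symmetric rounding overshoots by only 0.0165.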
b) Truncation Errors: Truncation errors arise from using an approximation in place of an exact mathematical procedure. Typically, they occur when an infinite process is cut short: to approximate the sum of an infinite series, we often use only a finite number of terms. For example, consider the following infinite series:
sin(x) = x − x^3/3! + x^5/5! − x^7/7! + ...
When we calculate the sine of an angle using this series, we cannot use all the terms in the series for computation; we usually terminate the process after a certain term is calculated. The terms that are truncated introduce an error, which is called the truncation error.
Example: Find the truncation error in the result of the following function for x = 1/5 when we use the first three terms:
e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + x^5/5! + x^6/6!
Solution:
Truncation error = x^3/3! + x^4/4! + x^5/5! + x^6/6! = 0.1402755 × 10^-2
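The same computation can be checked numerically (a minimal sketch of the series above):

```python
import math

x = 1 / 5
# truncation error = terms dropped after the first three, up to x^6/6!
trunc_error = sum(x**n / math.factorial(n) for n in range(3, 7))
print(trunc_error)   # ~ 0.0014028 = 0.1402755 x 10^-2

# cross-check: difference between e^x and the three-term approximation
print(math.exp(x) - (1 + x + x**2 / 2))   # ~ 0.0014028 as well
```

The two values differ only in the eighth decimal place, because the terms beyond x^6/6! are negligibly small for x = 0.2.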
3. Modeling Errors:
Modeling errors occur in the formulation of mathematical models due to simplifying assumptions. For example, when designing a model to measure the force acting on a falling body, we may not be able to properly estimate the coefficient of air resistance or determine the direction and magnitude of the wind force acting on the body. To simplify the model, we may assume that the force due to air resistance is linearly proportional to the velocity of the falling body, or we may assume that there is no wind force acting on the body. All such simplifications result in errors in the output of the model; these are called modeling errors.
We can reduce modeling errors by refining or extending the models, but the improvement may make the model more difficult to solve or may take more time in the solution process. It is also not always true that an enhanced model will provide better results. On the other hand, an oversimplified model may produce a result that is unacceptable. It is, therefore, necessary to strike a balance between the level of accuracy and the complexity of the model.
4. Blunders: Blunders are errors that are caused by human imperfection. Since these errors are due to human mistakes, it should be possible to avoid them to a large extent by acquiring a sound knowledge of all aspects of the problem as well as of the numerical process.
Absolute Error:
The absolute error is the absolute difference between the true value of a quantity and its approximate value as given or obtained by measurement or calculation. Thus, if X_t is the true value of a quantity and X_a is its approximate value, then the absolute error E_a is given by:
E_a = |X_t − X_a|
Relative Error:
The relative error is the "normalized" absolute error. The relative error E_r is defined as follows:
E_r = |X_t − X_a| / |X_t| = E_a / |X_t|
More often, the quantity that is known to us is X_a and, therefore, we can modify the above relation as follows:
E_r = |X_t − X_a| / |X_a|
The percent relative error is 100 times the relative error. It is denoted by E_p and defined by:
E_p = E_r × 100
Relative and percent relative errors are independent of the unit of measurement, whereas absolute errors are expressed in terms of the unit used.
Example:
Suppose that you have the task of measuring the lengths of a bridge and a rivet and come up with 9999 cm and 9 cm, respectively. If the true values are 10,000 cm and 10 cm, respectively, compute:
(a) the true absolute error and
(b) the true percent relative error for each case.
Solution:
(a) The error in measuring the bridge is: E_a = 10,000 − 9999 = 1 cm.
The error in measuring the rivet is: E_a = 10 − 9 = 1 cm.
(b) The percent relative error for the bridge is: E_p = 1/10,000 × 100% = 0.01%
For the rivet it is: E_p = 1/10 × 100% = 10%
Thus, although both measurements have an error of 1 cm, the relative error for the rivet is much greater. We would conclude that we have done an adequate job of measuring the bridge, whereas our estimate for the rivet leaves something to be desired.
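The two measurements can be checked with a small helper (the name `errors` is hypothetical):

```python
def errors(true_value, approx):
    """Return (absolute error, percent relative error)."""
    ea = abs(true_value - approx)
    ep = ea / abs(true_value) * 100
    return ea, ep

print(errors(10_000, 9_999))  # bridge: absolute error 1 cm, relative error 0.01 %
print(errors(10, 9))          # rivet:  absolute error 1 cm, relative error 10 %
```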
Machine Epsilon:
The round off error introduced in a number when it is represented in floating point form is given by:
Chopping error = g × 10^(E-d), 0 ≤ g < 1
where
g: truncated part of the number in normalized form,
d: number of digits permitted in the mantissa, and
E: exponent.
The absolute relative error due to chopping is then given by:
E_r = |(g × 10^(E-d)) / (f × 10^E)|
The relative error is maximum when g is maximum and f is minimum. The maximum possible value of g is just under 1.0 and the minimum possible value of f is 0.1. The absolute value of the relative error therefore satisfies:
E_r ≤ |(1.0 × 10^(E-d)) / (0.1 × 10^E)| = 10^(1-d)
The maximum relative error given above is known as the machine epsilon. The name "machine" indicates that this value is machine dependent, because the length of the mantissa d is machine dependent.
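The derivation above assumes a decimal mantissa of d digits; real hardware uses a binary mantissa, and its machine epsilon can be probed directly with a standard halving loop (shown as a sketch):

```python
import sys

eps = 1.0
# halve until adding eps/2 to 1.0 no longer changes the result
while 1.0 + eps / 2 > 1.0:
    eps /= 2

print(eps)                            # 2**-52 ~ 2.22e-16 for IEEE 754 doubles
print(eps == sys.float_info.epsilon)  # True
```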
Error Propagation:
Numerical computing involves a series of computations consisting of basic arithmetic operations. Therefore, it is not the individual round off errors that are important but the final error in the result. Our major concern is how an error at one point in the process propagates and how it affects the final total error.
Addition and Subtraction:
Consider the addition of two numbers, say x and y, writing x_t = x_a + e_x and y_t = y_a + e_y:
x_t + y_t = x_a + e_x + y_a + e_y = (x_a + y_a) + (e_x + e_y)
Therefore,
Total error = e_(x+y) = e_x + e_y
Similarly, for subtraction:
Total error = e_(x−y) = e_x − e_y
The sum e_x + e_y does not mean that the error will increase in all cases; that depends on the signs of the individual errors. The same holds for subtraction. Since we do not normally know the signs of the errors, we can only estimate error bounds. That is, we can say that
|e_(x±y)| ≤ |e_x| + |e_y|
Therefore, the rule for addition and subtraction is: the magnitude of the absolute error of a sum (or difference) is bounded by the sum of the magnitudes of the absolute errors of the operands. This inequality is called the triangle inequality.
Multiplication:
Here, we have
x_t × y_t = (x_a + e_x) × (y_a + e_y) = x_a y_a + y_a e_x + x_a e_y + e_x e_y
Errors are normally small and their products are much smaller. Therefore, neglecting the product of the errors, we get
x_t × y_t ≈ x_a y_a + x_a e_y + y_a e_x = x_a y_a + x_a y_a (e_x/x_a + e_y/y_a)
Then,
Total error = e_(xy) = x_a y_a (e_x/x_a + e_y/y_a)
Division:
We have
x_t / y_t = (x_a + e_x) / (y_a + e_y)
Multiplying both numerator and denominator by y_a − e_y and rearranging the terms, we get
x_t / y_t = (x_a y_a + y_a e_x − x_a e_y − e_x e_y) / (y_a^2 − e_y^2)
Dropping all terms that involve only products of errors, we have
x_t / y_t ≈ (x_a y_a + y_a e_x − x_a e_y) / y_a^2 = x_a/y_a + (x_a/y_a)(e_x/x_a − e_y/y_a)
Thus,
Total error = e_(x/y) = (x_a/y_a)(e_x/x_a − e_y/y_a)
Applying the triangle inequality, we have
|e_(x/y)| ≤ |x_a/y_a| (|e_x/x_a| + |e_y/y_a|)
Example: Let ΔX = 0.005 and ΔY = 0.001 be the absolute errors in X = 2.11 and Y = 4.15. Find the relative error in the computation of X + Y.
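A worked sketch of this example, using the triangle-inequality bound for addition:

```python
x, dx = 2.11, 0.005   # approximate value and its absolute error
y, dy = 4.15, 0.001

abs_err = dx + dy              # worst-case absolute error of x + y
rel_err = abs_err / (x + y)    # corresponding relative error
print(abs_err, rel_err)        # 0.006 and ~ 0.000958
```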
References:
1. Steven C. Chapra, Raymond P. Canale. Numerical Methods for Engineers. New Delhi: Tata McGraw-Hill, 2003. ISBN 0-07-047437-0.