
An algorithm for solving non-linear equations based on the secant method

By J. G. P. Barnes*

This paper describes a variant of the generalized secant method for solving simultaneous non-linear equations. The method is of particular value in cases where the evaluation of the residuals for imputed values of the unknowns is tedious, or a good approximation to the solution and the Jacobian at the solution are available.
Experiments are described comparing the method with the Newton-Raphson process. It is shown that for suitable problems the method is considerably superior.

1. Introduction

With the increased use of digital computers, sets of non-linear algebraic equations are now being solved in which the functions may be defined by a lengthy process. There seems to be a need to develop methods which demand as few function evaluations as possible by making the utmost use of the information obtained from each evaluation. In this paper, therefore, the number of function evaluations required will be taken as the criterion for the comparison of methods.

The Newton-Raphson method for solving non-linear equations is well known and has the advantage of being rapidly convergent in the neighbourhood of a solution. It does, however, demand the determination of the matrix of first derivatives (the Jacobian) of the system at each iteration. For lengthy sets of equations, explicit derivatives are rarely available, and even if they were their evaluation would usually be more tedious than that of the parent functions. Hence, if the Newton-Raphson method is to be employed, the derivatives must be evaluated numerically, and this means at least n + 1 function evaluations per iteration (n is the dimension of the system).

For equations whose Jacobian does not change much from iteration to iteration (i.e. for near-linear equations), the complete re-evaluation of the Jacobian at each iteration is unnecessary, and it has been suggested that the Jacobian only be re-evaluated every k iterations, where k is a small integer. In practice this procedure is not to be recommended.

The method to be described has the great advantage of not requiring the explicit evaluation of derivatives, but uses instead an approximate value of the Jacobian and corrects this after each function evaluation. Like the Newton-Raphson method it is (given suitable initial conditions) capable of determining both real and complex roots. It will be shown to be equivalent to the generalized secant method described by Bittner (1959) and Wolfe (1959), but has the additional advantage of being able to make use of an initial approximation to the Jacobian. The benefit of this last fact should not be underestimated, since the situation often arises in practice where the same set of equations are to be solved several times with slightly different values of certain parameters. The final solution point and Jacobian of one problem often provide excellent initial conditions for the next, and under these circumstances the method may prove to be many times faster than Newton-Raphson.

Both theoretical and experimental results will show that this method is in general about twice as good as Newton-Raphson in the neighbourhood of a solution.

2. Notation

Vectors, matrices and tensors will be denoted by bold-face type.

Subscripts will in general refer to components of tensors and superscripts to iterates. Thus x_i^(j) is the ith component of the jth iterate of the vector x.

The summation convention is applied where relevant.

The transpose of a matrix will be denoted by the superfix T.

The general non-linear equations to be solved (in n dimensions) are

    f_i(x_j) = 0    (1)

where subscripts run from 1 to n.

3. The basic method

Let J^(1) be the initial (guessed) value of the Jacobian and let x^(1) be the initial point at which the function value is f^(1).

Then the first step δx^(1) is defined by

    J^(1) δx^(1) = −f^(1).    (2)

Note that if J^(1) were correct this would be the Newton-Raphson step.

This gives rise to the point

    x^(2) = x^(1) + δx^(1)    (3)

at which the function value is f^(2).

The correction to be applied to the Jacobian J^(1) is determined by considering the behaviour of a linear system which has values f^(1), f^(2) at x^(1), x^(2), respectively.
* Imperial Chemical Industries Limited, Bozedown House, Whitchurch Hill, Reading, Berks.
Suppose J were the Jacobian of such a system; then we would have

    f^(2) − f^(1) = J δx^(1).    (4)

The corrected Jacobian J^(2) is chosen to satisfy equation (4). Suppose the correction is D^(1), so that

    J^(2) = J^(1) + D^(1);    (5)

then D^(1) satisfies f^(2) = f^(1) + (J^(1) + D^(1)) δx^(1), and using (2) this gives

    f^(2) = D^(1) δx^(1).    (6)

A solution of equation (6) is

    D^(1) = f^(2) (z^(1))^T / ((z^(1))^T δx^(1))

where z^(1) is an arbitrary vector.

A general iteration is thus

    J^(j) δx^(j) = −f^(j)    (7)
    x^(j+1) = x^(j) + δx^(j)    (8)
    D^(j) = f^(j+1) (z^(j))^T / ((z^(j))^T δx^(j))    (9)
    J^(j+1) = J^(j) + D^(j)    (10)

where the vectors z^(j) are as yet undefined.

Note that

    D^(j) δx^(j) = f^(j+1)    (11)

and

    J^(j+1) δx^(j) = f^(j+1) − f^(j).    (12)

It remains to choose the vectors z^(j).

4. The vectors z^(i)

A desirable feature of any method of solving non-linear equations is that it should rapidly solve a linear set of equations. In fact, n + 1 function evaluations should suffice to determine the Jacobian of the system exactly, and hence the solution ought to be found on the (n + 2)nd function evaluation. In our case this means that x^(n+2) ought to be the solution. It will be shown in the next section that with the choice of vectors z^(i) as follows this desirable feature is in fact obtained.

If i > n then z^(i) is chosen orthogonal to the previous n − 1 steps, δx^(i−n+1) . . . δx^(i−1).

If i ≤ n then we can only demand that z^(i) is orthogonal to the available i − 1 steps, δx^(1) . . . δx^(i−1). This naturally leaves some freedom of choice, and for simplicity in this case z^(i) is taken to be that linear combination of δx^(1) . . . δx^(i) which is orthogonal to δx^(1) . . . δx^(i−1).

It will be noticed from equation (9) that the magnitude of z^(i) is irrelevant. For convenience it is taken to be a unit vector, and the above computation is then readily performed in either case by the usual Gram-Schmidt orthogonalization process (see Section 13).

The important consequence of this choice of z^(i) is that

    (z^(i))^T δx^(j) = 0,    1 ≤ i − j < n.    (13)

Hence

    J^(i+1) δx^(j) = [J^(j+1) + D^(j+1) + . . . + D^(i)] δx^(j) = J^(j+1) δx^(j),    0 ≤ i − j < n.    (14)

5. The linear case

Now consider the behaviour of the method on a general set of linear equations

    f = Gx − b.    (15)

Suppose that the step δx^(m+1) is the first step linearly dependent on the previous steps. Then certainly m ≤ n since n + 1 vectors cannot be linearly independent. Let

    δx^(m+1) = λ_j δx^(j)    (summation over j = 1 . . . m).    (16)

Now

    J^(m+1) δx^(j) = J^(j+1) δx^(j)    from (14), 1 ≤ j ≤ m
                   = f^(j+1) − f^(j)    from (12)
                   = G(x^(j+1) − x^(j))    from (15)
                   = G δx^(j).    (17)

Hence from (16)

    J^(m+1) δx^(m+1) = λ_j J^(m+1) δx^(j) = λ_j G δx^(j) = G δx^(m+1).    (18)

So

    f^(m+2) = G x^(m+2) − b    from (15)
            = f^(m+1) + G δx^(m+1)
            = f^(m+1) + J^(m+1) δx^(m+1)    from (18)
            = 0.    from (7)

Thus the step δx^(m+1) leads to the solution, which is hence found after at most n + 2 function evaluations.

An alternative interpretation is afforded by considering the approach of the successive approximations J^(i) to the correct value G. To be explicit, consider the eigenvectors of zero eigenvalue (nullity vectors) of the matrices (J^(i) − G). Equation (17) shows that all previous steps are such eigenvectors, and are independent. Each step reduces by one the maximum possible value of the rank of J^(i) − G, until after at most n steps its rank is zero, implying that J^(i) = G. The following step, of course, gives the solution.

If the rank of J^(1) − G is not n then it is possible for the solution to be found in less than n + 1 steps, but not necessary. It depends upon the null eigenvectors of J^(1) − G. If the orthogonalization procedure could be started with such null eigenvectors accounted for, then earlier convergence would be assured.
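The iteration (7)-(10), with the vectors z chosen as in Section 4, can be sketched in a few lines. The following is an illustrative reconstruction, not the author's Mercury program; the function name `secant_solve` and the NumPy formulation are assumptions, and the stability tests of Section 13 are omitted.

```python
import numpy as np

def secant_solve(f, x, J, max_iter=50, tol=1e-10):
    """Secant iteration of equations (7)-(10): solve J dx = -f, step,
    then apply the rank-one correction D = f_new z^T / (z^T dx),
    with z the component of dx orthogonal to the previous (at most
    n - 1) steps, normalized to a unit vector (Section 4)."""
    n = len(x)
    steps = []                                   # previous steps dx^(j)
    fx = f(x)
    for _ in range(max_iter):
        dx = np.linalg.solve(J, -fx)             # (7)
        x = x + dx                               # (8)
        fx = f(x)                                # one evaluation per iteration
        if np.linalg.norm(fx) < tol:
            return x
        z = dx.copy()                            # Gram-Schmidt against old steps
        for s in (steps[-(n - 1):] if n > 1 else []):
            e = s / np.linalg.norm(s)
            z = z - (z @ e) * e
        z = z / np.linalg.norm(z)
        J = J + np.outer(fx, z) / (z @ dx)       # (9) and (10)
        steps.append(dx)
    return x
```

On a linear system f = Gx − b this iteration should, by Section 5, reach the solution after at most n + 2 function evaluations (up to round-off), whatever initial J is supplied; for n = 1 it reduces to the classical scalar secant rule.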

6. The general case

Now consider the behaviour with the general set of non-linear equations (1).

We have

    J^(k) δx^(k−i) = J^(k−i+1) δx^(k−i)    from (14)
                   = f^(k−i+1) − f^(k−i)    from (12)
                   = δf^(k−i), say,    i < k and 0 < i ≤ n.    (19)

In particular, if k > n,

    J^(k) [δx^(k−1), δx^(k−2), . . . , δx^(k−n)] = [δf^(k−1), δf^(k−2), . . . , δf^(k−n)].    (20)

The value of the Jacobian J^(k) is therefore that of the linear system defined by the n + 1 pairs of points and function values, x^(k−n), f^(k−n) . . . x^(k), f^(k).

The method is therefore identified with the generalized secant method, and as such has been previously described by Bittner (1959) and Wolfe (1959). The present representation of the secant method, however, has the advantage of being able to use an initial value of J, and in practice has been found to be more reliable.

7. An explicit expression for x^(k+1)

In this section suppose k > n and for clarity put J^(k) = J.

Then the equations

    J δx^(k−i−1) = δf^(k−i−1),    i = 0 . . . n − 1

may be written in the form

    f^(k−i) = J x^(k−i) + L,    i = 0 . . . n    (21)

from which

    x^(k+1) = x^(k) − J^(−1) f^(k) = −J^(−1) L.    (22)

Rewrite (21) in the form

    F = [J  L] [X; I]    (23)

where I is a 1 × (n + 1) matrix with each element unity and

    F = [f^(k−n), f^(k−n+1), . . . , f^(k)],    X = [x^(k−n), x^(k−n+1), . . . , x^(k)].

Equation (23) represents n sets of linear equations in n + 1 variables for the n² + n unknown elements of J and L.

Taking the ith row of equation (23),

    F_i = [J_i  L_i] [X; I]    (24)

where F_i is the ith row of matrix F.

From (24), using Cramer's rule,

    L_i = det [x^(k−n) . . . x^(k); f_i^(k−n) . . . f_i^(k)] / det [x^(k−n) . . . x^(k); 1 . . . 1]    (25)

where each determinant is of order n + 1: the first n rows contain the components of the points x^(j) as columns, and the bottom row contains the values f_i^(j) (numerator) or units (denominator).

Equations (22) and (25) provide an explicit expression for x^(k+1) in terms of the previous n + 1 pairs of points and function values.

8. Convergence rate

Consider a general set of non-linear equations, and expand using Taylor's theorem about the solution point ξ:

    f_i(ξ + δx) = f_i,j δx_j + ½ f_i,jk δx_j δx_k + . . .

where the derivatives are evaluated at the solution point ξ.

No loss of generality is incurred by taking ξ = 0, and since the algorithm is seen from equations (7) and (20) to be invariant under a linear transformation, we will also assume f_i,j = δ_ij.

So a general set of non-linear equations may be considered to be

    f_i = x_i + B_ijk x_j x_k + O(x³).

Near the solution point, terms O(x³) may be ignored. The convergence rate of the method will therefore be considered with respect to the system of equations

    f_i = x_i + B_ijk x_j x_k = 0    (26)

where B_ijk is symmetric in j and k.

To enable comparisons to be made, the Newton-Raphson algorithm will be considered first.

9. Newton-Raphson convergence

We consider equation (26). The derivatives are

    J_ij = δ_ij + 2 B_ijk x_k

and J^(j) δx^(j) = −f^(j) defines the step δx^(j).

Now

    x^(j+1) = x^(j) + δx^(j) = J^(−1) [J x^(j) − f^(j)]

and

    J_ik x_k^(j) − f_i^(j) = B_ijk x_j^(j) x_k^(j).

Now J^(−1) = 1 + O(x), so

    x_i^(j+1) = B_ijk x_j^(j) x_k^(j) + O(x³).    (27)

This is a well-known result indicating second-order convergence.

If x is some linear measure of the vector x, then heuristically

    x^(j+1) = B (x^(j))².    (28)
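The determinant form of (25) is readily checked numerically: for an affine system f = Gx + L, the ratio of the two bordered determinants recovers the constant vector L exactly, and (22) then yields the root of Gx + L = 0. A small sketch of this check, with NumPy assumed and variable names illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
G = rng.standard_normal((n, n))
L = rng.standard_normal(n)

X = rng.standard_normal((n, n + 1))      # columns: the n + 1 points x^(k-n) ... x^(k)
F = G @ X + L[:, None]                   # affine function values at those points

# Equation (25): L_i = det[x-columns; f_i-row] / det[x-columns; row of units]
den = np.linalg.det(np.vstack([X, np.ones((1, n + 1))]))
L_est = np.array([np.linalg.det(np.vstack([X, F[i:i + 1]])) / den
                  for i in range(n)])
assert np.allclose(L_est, L)

# Equation (22): x^(k+1) = -J^(-1) L is then the exact root of Gx + L = 0.
x_next = np.linalg.solve(G, -L_est)
assert np.allclose(G @ x_next + L, 0.0)
```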
10. Convergence of secant method

To simplify the notation in this section we will consider x^(n+2). The general result follows at once.

Now

    x^(n+2) = −J^(−1) L    (22)

where L is given by (25):

    L_i = det [x^(1) . . . x^(n+1); f_i^(1) . . . f_i^(n+1)] / det [x^(1) . . . x^(n+1); 1 . . . 1].    (25)

Applying these formulae to equations (26) gives

    f_i = x_i + B_ijk x_j x_k.

Suppose that successive iterations give much closer approximations to the root. Then

    |x^(1)| ≫ |x^(2)| ≫ . . . ≫ |x^(n+1)|.

Expanding the denominator of (25) by the bottom row, the dominant term is |x^(1), x^(2), . . . , x^(n)|. Expanding the numerator by the bottom row, the dominant term is B_ijk x_j^(1) x_k^(1) |x^(2), x^(3), . . . , x^(n+1)|.

So

    L_i = B_ijk x_j^(1) x_k^(1) |x^(2), x^(3), . . . , x^(n+1)| / |x^(1), x^(2), . . . , x^(n)| + higher terms.    (29)

As before J^(−1) = 1 + O(x), and if x is again some linear measure of the error we obtain

    x^(n+2) ≈ B x^(1) x^(n+1).    (30)

11. Comparison of convergence rates

The convergence rates have been shown to be approximately given by

    x^(i+1) = B (x^(i))²    Newton-Raphson    (31)
    x^(i+n+1) = B x^(i) x^(i+n)    Secant method    (32)

where x^(i) is some linear measure of the error of the ith iterate.

Put v_i = log (B x^(i)); then

    v_(i+1) = 2 v_i    Newton-Raphson    (33)
    v_(i+n+1) = v_i + v_(i+n)    Secant method    (34)

Consider first the Newton-Raphson case. Each iteration requires the evaluation of the Jacobian and so involves at least n + 1 function evaluations. For the purpose of comparison we will suppose only n + 1 are in fact required. From equation (33) the reduction factor is seen to be 2^(1/(n+1)) per function evaluation.

Consider now the difference equation (34) describing the secant method. Its characteristic equation is

    t^(n+1) − t^n − 1 = 0.    (35)

We wish to find the dominant root of this equation. Equation (35) has one real root > 1 and, if n is odd, one other real root between −1 and 0. It has no other real roots.

The roots of the equation are the same as the eigenvalues of the companion matrix

    | 0 0 . . . 0 1 |
    | 1 0 . . . 0 0 |
    | . 1 . . . . . |
    | 0 0 . . . 1 1 |

with units on the subdiagonal and, in the last column, units in the first and last places only. This is a non-negative reducible matrix. Hence by the weak form of a theorem of Frobenius (Gantmacher (1959), Varga (1962)), it has a non-negative real eigenvalue equal to its spectral radius. It follows that the only positive real root of equation (35) is equal to this eigenvalue, and hence no other root of equation (35) has larger modulus. It remains to prove that no other root of the equation has the same modulus.

It is obvious that the positive real root is a simple root. Let it be r. Then

    r^n = 1 / (r − 1).    (36)

Suppose r e^(iθ) is also a root. Then

    r^n e^(inθ) (r e^(iθ) − 1) = 1    from (35),

so

    e^(inθ) (r e^(iθ) − 1) / (r − 1) = 1    from (36).

Taking moduli,

    |r e^(iθ) − 1| / (r − 1) = 1,

which is impossible unless e^(iθ) = 1. Hence there are no roots of modulus r other than the real positive root. The positive real root of (35) is therefore dominant. We will henceforth denote this root by t.

The solution of the difference equation (34)

    v_(i+n+1) = v_i + v_(i+n)

is hence dominated by v_i = v_0 t^i, and the reduction factor per iteration is t. A similar result has previously been obtained by Tornheim (1964) as an example of a multipoint iterative method.

Hence the ratio of the number of function evaluations required by Newton-Raphson to the number for the secant method for a given error reduction is

    R_n = (n + 1) log t / log 2.    (37)
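The dominant root t of (35) and the ratio R_n of (37) are easy to evaluate numerically; the following sketch (NumPy assumed, function names illustrative) reproduces the entries of Table 1.

```python
import numpy as np

def dominant_root(n):
    """Largest-modulus root of t^(n+1) - t^n - 1 = 0, equation (35);
    by the Frobenius argument this is the positive real root t."""
    coeffs = np.zeros(n + 2)            # coefficients, highest degree first
    coeffs[0], coeffs[1], coeffs[-1] = 1.0, -1.0, -1.0
    return max(abs(r) for r in np.roots(coeffs))

def R(n):
    """Equation (37): ratio of Newton-Raphson to secant function
    evaluations for a given error reduction."""
    return (n + 1) * np.log(dominant_root(n)) / np.log(2.0)
```

For n = 1 this gives t = 1.618 (the golden ratio, as for the scalar secant method) and R_1 = 1.388, in agreement with Table 1.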
The secant method may be said to be better by a factor R_n. Some values of R_n and t are shown in Table 1.

Table 1
Rates of convergence

    n      t       R_n
    1      1.618   1.388
    2      1.466   1.654
    3      1.380   1.860
    4      1.325   2.028
    5      1.285   2.172
    6      1.255   2.297
    7      1.232   2.409
    8      1.213   2.509
    9      1.197   2.600
    10     1.184   2.684
    20     1.114   3.283
    50     1.058   4.179
    100    1.034   4.914

12. Experimental results

To test the above theoretical prediction, the equations

    f_i = x_i + B_ijk x_j x_k = 0

were solved by both Newton-Raphson (the derivatives being available analytically) and the secant method, using a Mercury digital computer.

The coefficients B_ijk were generated as random variables from a rectangular distribution (|B_ijk| ≤ B_0, say). In each run the starting point was taken at random on the unit sphere. By varying B_0 the effective degree of non-linearity is altered.

Two minor difficulties arise. First, the length of the mantissa of the floating-point representation used (on Mercury) was 29 bits, and so cancellation errors precluded an improvement of more than about 8 decimal digits per iteration. With the rapid convergence of the two methods here under comparison, many successive valid iterations could not therefore be obtained. Secondly, it was difficult to know what initial value of Jacobian to give the modified secant method, and how to penalize it for having such knowledge.

To obtain results of reasonable variance in the face of the first difficulty, several runs were carried out for each n and B_0. Each run was terminated when the ratio of successive values of f² = f_i f_i (the accuracy measure) exceeded 10^14 using Newton-Raphson. (In each case at the corresponding level with the secant method the ratio of successive values of f² was less than 10^14.) The iteration number of the last allowed iteration on Newton-Raphson was recorded, and the corresponding number of iterations to obtain the same degree of accuracy with the secant method was found by interpolation between the two iterations around that accuracy. The interpolation was carried out on the logarithms of the successive values of f². Linear interpolation was discarded since it would have given rise to consistently low values for the iteration. Interpolation was actually carried out on the assumption that the successive values of these logarithms fitted a curve of the form log(f²) = α t^i + β, where t is given by Table 1. This is, of course, a result predicted by the above theory.

Two series of runs were carried out with the secant method. One was with the initial Jacobian exact (at the starting point), so that the first iteration is the same as Newton-Raphson. The second was with the Jacobian equal to the unit matrix (which is, of course, the correct value at the solution).

In comparing the number of function evaluations required, each Newton-Raphson iteration is scored as n + 1 evaluations.

The following scores were evaluated for each run.

(1) Initial J exact: Since the first step is the same in each case, the count was from the end of that step. This score would be expected to be in favour of the secant method since the initial J is good.

(2) Initial J exact: As an alternative to the above, the iterations were counted from the beginning, but a penalty of n function evaluations was added to the score of the secant method to compensate for the exact J. This penalty is obviously too heavy, and the score is therefore in favour of Newton-Raphson.

(3) Initial J unit: The initial J is equal to the correct final value; this sort of situation may well arise in practice. Iterations were counted from the start for both methods. This score is in favour of the secant method, especially for large n, since J is not altered substantially before the solution is nearly reached.

The equations were solved with n = 2(1)7 and B_0 = 0.01 and 0.1. Five runs were carried out for each pair of values of n and B_0. The relative behaviour of the two methods was found to be essentially independent of B_0, although obviously more iterations were required for the higher value. The runs are therefore grouped only according to the value of n. The total scores over all runs for each value of n were accumulated, and the corresponding estimates of R_n were evaluated and are shown in Tables 2 and 3. Comparison with the predicted value of R_n shows good agreement in every case in view of the comments about the bias of the scores. Scores 1 and 2 straddle the predicted value in every case. Score 3 shows how rapid the secant method is under favourable conditions.

13. Stability

Bittner (1959) has shown that the secant method as defined by equations (7) and (20) has the following property.
Table 2
Mean number of iterations required

    DIMENSION   NEWTON-RAPHSON   SECANT, INITIAL JACOBIAN
                                 (a) EXACT   (b) UNIT
    2           2.5              3.72        3.47
    3           2.5              4.10        3.68
    4           2.7              4.82        4.14
    5           3.0              6.12        5.40
    6           2.8              5.95        4.61
    7           3.4              8.36        6.07

Table 3
Comparison of scores with predicted value

    DIMENSION   PREDICTED   SCORE 1   SCORE 2   SCORE 3
    2           1.654       1.656     1.312     2.163
    3           1.860       1.938     1.409     2.718
    4           2.028       2.222     1.530     3.264
    5           2.172       2.343     1.618     3.336
    6           2.297       2.544     1.640     4.250
    7           2.409       2.608     1.771     4.479

Denote the determinant

    | δx^(k−n+1)/|δx^(k−n+1)|, δx^(k−n+2)/|δx^(k−n+2)|, . . . , δx^(k)/|δx^(k)| |

by Δ_k. Then given w such that 0 < w < 1, and provided that

    |Δ_k| > w for all k > n,    (38)

there exists a neighbourhood of the solution within which convergence is assured.

The condition |Δ_k| > w is the sort of expedient that might be thought necessary for the reliable computation of J^(k+1) from equation (20).

This sort of condition is not obviously required for the algorithm as expressed by equation (7) et seq. It might, however, be thought necessary to impose a condition of the form

    |(z^(k))^T δx^(k)| / (|z^(k)| |δx^(k)|) > ρ    (39)

to ensure the reliable computation of (9).

It will now be shown that conditions (38) and (39) are related, by considering the orthogonalization process used to determine the z^(k).

Suppose that ξ_1, . . . , ξ_m are a set of m ≤ n unit vectors. Define the vectors e_1, . . . , e_m and the scalars C_1, . . . , C_m by the following equations:

    e_1 = ξ_1,    C_1 = 1,
    C_i e_i = ξ_i − (ξ_i^T e_j) e_j,    |e_i| = 1,    i = 2, . . . , m

(summation over j < i). Then the vectors e_1, . . . , e_m are an orthonormal basis of the space spanned by the set ξ_1, . . . , ξ_m.

If we now put m = min(n, k) and ξ_j = δx^(k−m+j)/|δx^(k−m+j)| in the above, we obtain e_m = z^(k) and, writing C_k for C_m (to distinguish iterations),

    C_k = (z^(k))^T δx^(k) / (|z^(k)| |δx^(k)|),

so that condition (39) may be written

    |C_k| > ρ.    (40)

In particular, if k > n, so that m = n, the same orthogonalization applied to the n vectors making up Δ_k gives

    |Δ_k| = C'_2 C'_3 . . . C'_n    (41)

where C'_i denotes the scalar obtained for the ith vector of Δ_k; in particular C'_n = C_k. See Todd (1962).

Note that

    C'_i ≤ 1 for all i    (42)

and

    C'_i ≥ |C_(k−n+i)|,    (43)

the latter being easily seen by considering the geometric properties of the system (the component of a unit vector orthogonal to fewer of the preceding steps cannot be the smaller).

Then for k > n we have:

(i) If |Δ_k| > w, then C'_2 . . . C'_n > w and so, by (42), |C_k| = C'_n > w.

(ii) If |C_k| > ρ for all k ≤ k_0, say, then

    |Δ_k| = C'_2 . . . C'_n ≥ |C_(k−n+2) . . . C_k|    by (43),

so |Δ_k| > ρ^(n−1).

It has thus been shown that conditions (38) and (39), although not identical, nevertheless are related. Consequently either of the two tests may be used to ensure convergence under suitable conditions.

As a practical consequence of the above, one or both of the tests are applied at each iteration and the proposed step δx^(k) rejected if the test fails. A satisfactory alternative is to set δx^(k) parallel to z^(k) (in which case the tests will evidently be satisfied). The magnitude of the alternative step is still arbitrary; a suitable value might be that of the rejected vector.
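The identity underlying this argument, that the determinant of a set of unit vectors equals the product of the Gram-Schmidt scalars C (a QR factorization, in modern terms), can be verified numerically. A sketch, with NumPy assumed and names illustrative:

```python
import numpy as np

def gs_scalars(vectors):
    """Gram-Schmidt over unit vectors: returns the orthonormal basis e
    and the scalars C of Section 13 (C_1 = 1 for unit input)."""
    e, C = [], []
    for v in vectors:
        u = v.copy()
        for ej in e:
            u -= (v @ ej) * ej          # subtract components along basis
        c = float(np.linalg.norm(u))
        C.append(c)
        e.append(u / c)
    return e, C

rng = np.random.default_rng(1)
n = 4
steps = rng.standard_normal((n, n))
units = [s / np.linalg.norm(s) for s in steps]   # normalized steps of Delta_k

e, C = gs_scalars(units)
delta = np.linalg.det(np.column_stack(units))    # the determinant Delta_k
assert np.isclose(abs(delta), np.prod(C))        # |Delta_k| = C_1 C_2 ... C_n
```

Since the inputs are unit vectors, the first scalar is 1 and each subsequent scalar is at most 1, which is the content of (42).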
A generalization of the algorithm as defined by equations (7) et seq. is now needed for the case in which δx is not prescribed by equation (7). An argument similar to that of Section 3 leads to the replacement of (9) by

    D^(i) = (f^(i+1) − f^(i) − J^(i) δx^(i)) (z^(i))^T / ((z^(i))^T δx^(i))    (44)

which reduces to (9) in the usual case. In practice the use of (44) for the calculation of D^(i) in every case is recommended.

The values of C_k and Δ_k were monitored for all the experiments of Section 12. From these it would seem that the simplest procedure likely to give consistent results is to test C_k only, and reject the step if |C_k| < ρ_0. ρ_0 might be 10^−4; larger values of ρ_0 may delay convergence considerably.

A more general method, which has a much larger domain of convergence, may be formed by imposing a success criterion. The usual criterion employed is the minimization of f². It is ensured that each step gives rise to an improvement (i.e. reduces f²) by multiplying the step by a suitable scalar in those cases where the direct application of the algorithm does not give rise to an improvement. The imposition of such a criterion ensures convergence over a large domain but does not impair the final convergence rate.

Acknowledgements

The author wishes to express his thanks to Imperial Chemical Industries Limited for permission to publish this paper, to his colleagues Mr. I. Gray and Dr. H. H. Robertson for their constant advice and encouragement, and to the referee for his constructive criticisms.
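The success criterion described above, scaling the proposed step by a scalar until f² is reduced, can be sketched as a simple halving loop. The halving schedule and the function name `damped_step` are assumptions for illustration; the paper does not specify how the scalar is chosen.

```python
import numpy as np

def damped_step(f, x, dx, max_halvings=30):
    """Impose the success criterion: accept lam * dx only if it reduces
    f^2 = f_i f_i; otherwise halve lam.  Returns the new point and the
    scalar actually used (0 if no reduction was found)."""
    f2 = lambda v: float(np.dot(f(v), f(v)))
    f2_old = f2(x)
    lam = 1.0
    for _ in range(max_halvings):
        if f2(x + lam * dx) < f2_old:
            return x + lam * dx, lam
        lam *= 0.5
    return x, 0.0
```

A step that already reduces f² is accepted unchanged (lam = 1), so the final convergence rate near the solution is not impaired.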

References
BITTNER, L. (1959). "Eine Verallgemeinerung des Sekantenverfahrens (regula falsi) zur näherungsweisen Berechnung der Nullstellen eines nichtlinearen Gleichungssystems," Wissen. Zeit. der Technischen Hochschule Dresden, Vol. 9, p. 325.
GANTMACHER, F. R. (1959). Applications of the Theory of Matrices, New York: Interscience Publishers Inc.
TODD, J. (1962) (Ed.). A Survey of Numerical Analysis, New York: McGraw-Hill Book Co.
TORNHEIM, L. (1964). "Convergence of Multipoint Iterative Methods," J. Assoc. Comp. Mach., Vol. 11, p. 210.
VARGA, R. S. (1962). Matrix Iterative Analysis, London: Prentice-Hall International.
WOLFE, P. (1959). "The Secant Method for Simultaneous Non-linear Equations," Comm. Assoc. Comp. Mach., Vol. 2, p. 12.

To the Editor,
The Computer Journal.

"An impossible program"

Dear Sir,

I do not know whose leg Mr. Strachey is pulling (this Journal, January 1965, p. 313); but if each letter in refutation of his proof adds to some private tally for his amusement, then I am happy to amuse him. May I offer three independent refutations?

(i) He defines a function T[R]. Any subsequent "proof" that T cannot exist is then idle; the function exists by definition.

(ii) If T does not exist, then P does not exist, since T is essentially involved in the statement of P. So P is not a program. So P is not an acceptable argument for T.

(iii) If one accepts Mr. Strachey's reasoning up to the point "In each case T[P] has exactly the wrong value", the appropriate deduction is not "this contradiction shows that the function T cannot exist" but "this contradiction shows that either the function T does not exist or that P is not a program". Since the non-existence of T itself implies that P is not a program, the most that can be concluded is that in any event P is not a program.

I am, of course, being careful not to claim that Mr. Strachey's initial assertion (that it is impossible to write a program which can examine any other program and tell, in every case, if it will terminate or get into a closed loop when it is run) is false. But what is manifest is that his proof of the far stronger assertion (that T[R] does not exist) is invalid: both in its final step (see (iii) above) and in its assumption that a set of statements in CPL, or any other language, necessarily constitutes a program. (If anybody doubts my counter assertion that P is not a program, let him try compiling P in any machine language!)

Yours faithfully,
H. G. APSIMON.
22 Stafford Court,
London, W.8.
18 February 1965.

