An Algorithm For Solving Non-Linear Equations Based On The Secant Method
This paper describes a variant of the generalized secant method for solving simultaneous non-
linear equations. The method is of particular value in cases where the evaluation of the residuals
for imputed values of the unknowns is tedious, or a good approximation to the solution and
the Jacobian at the solution are available.
Experiments are described comparing the method with the Newton-Raphson process. It is
shown that for suitable problems the method is considerably superior.
$\cdots$  (20)

The value of the Jacobian $J^{(k)}$ is therefore that of the linear system defined by the $n + 1$ pairs of points and function values, $x^{(k-n)}, f^{(k-n)}, \ldots, x^{(k)}, f^{(k)}$. The method is therefore identified with the generalized secant method and as such has been previously described by Bittner (1959) and Wolfe (1959). The present representation of the secant method, however, has the advantage of being able to use an initial value of $J$, and in practice has been found to be more reliable.

7. An explicit expression for $x^{(k+1)}$

In this section suppose $k > n$ and for clarity put $J^{(k)} = J$. Then the equations

  $\cdots$,  $i = 0, \ldots, n - 1$

may be written in the form

  $f^{(k-i)} = J x^{(k-i)} + L$,  $i = 0, \ldots, n$  (21)

from which

  $x^{(k+1)} = x^{(k)} - J^{-1} f^{(k)} = -J^{-1} L$.  (22)

Rewrite (21) in the form

  $F = JX + LI$  (23)

where $I$ is a $1 \times (n + 1)$ matrix with each element unity and

  $F = [f^{(k-n)}, f^{(k-n+1)}, \ldots, f^{(k)}]$,  $X = [x^{(k-n)}, x^{(k-n+1)}, \ldots, x^{(k)}]$.

Equation (23) represents $n$ sets of linear equations in $n + 1$ variables for the $n^2 + n$ unknown elements of $J$ and $L$. Taking the $i$th row of equation (23),

  $F_i = J_i X + L_i I$  (24)

where $F_i$ is the $i$th row of matrix $F$.

8. Convergence rate

Consider a general set of non-linear equations, and expand using Taylor's theorem about the solution point $\bar{x}$:

  $f_i(x) = f_{i,j}\,\delta x_j + \tfrac{1}{2} f_{i,jk}\,\delta x_j\,\delta x_k + \cdots$

where the derivatives are evaluated at the solution point $\bar{x}$.

No loss of generality is incurred by taking $\bar{x} = 0$, and since the algorithm is seen from equations (7) and (20) to be invariant under a linear transformation, we will also assume $f_{i,j} = \delta_{ij}$. So a general set of non-linear equations may be considered to be

  $f_i = x_i + B_{ijk} x_j x_k + O(x^3)$.

Near the solution point terms $O(x^3)$ may be ignored. The convergence rate of the method will therefore be considered with respect to the system of equations

  $f_i = x_i + B_{ijk} x_j x_k = 0$  (26)

where $B_{ijk}$ is symmetric in $j$ and $k$. To enable comparisons to be made, the Newton-Raphson algorithm will be considered first.

9. Newton-Raphson convergence

We consider equation (26). The derivatives are

  $J_{ij} = \delta_{ij} + 2 B_{ijk} x_k$

and $J_{ij}\,\delta x_j = -f_i$ defines the step $\delta x$. Now

  $x^{(2)} = x^{(1)} + \delta x^{(1)} = J^{-1}[J x^{(1)} - f^{(1)}]$

and

  $J_{ij} x_j^{(1)} - f_i^{(1)} = B_{ijk} x_j^{(1)} x_k^{(1)}$.

Now $J^{-1} = 1 + O(x)$, so

  $x_i^{(2)} = B_{ijk} x_j^{(1)} x_k^{(1)} + O(x^3)$.  (27)

This is a well-known result indicating second-order convergence. If $x$ is some linear measure of the vector $x$, then heuristically

  $x^{(2)} \sim B\,(x^{(1)})^2$.  (28)
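As an illustrative check of the second-order behaviour in (27) and (28), the following sketch applies Newton-Raphson to the model system (26). It is not taken from the paper: the use of Python with NumPy, the randomly generated symmetric tensor `B`, the starting point, and the function names are all assumptions made for the illustration.

```python
import numpy as np

def residual(x, B):
    # f_i = x_i + B_ijk x_j x_k, the model system (26)
    return x + np.einsum("ijk,j,k->i", B, x, x)

def jacobian(x, B):
    # J_ij = delta_ij + 2 B_ijk x_k; valid because B is symmetric in j and k
    return np.eye(len(x)) + 2.0 * np.einsum("ijk,k->ij", B, x)

def newton_errors(x0, B, steps=4):
    """Run Newton-Raphson and return the error norm after each step.

    The solution of (26) is x = 0, so the error is simply the norm of x.
    """
    x = np.array(x0, dtype=float)
    errs = [np.linalg.norm(x)]
    for _ in range(steps):
        dx = np.linalg.solve(jacobian(x, B), -residual(x, B))  # J dx = -f
        x = x + dx
        errs.append(np.linalg.norm(x))
    return errs

# Hypothetical test problem: n = 3 with a random B, symmetrized in (j, k).
rng = np.random.default_rng(0)
n = 3
B = rng.normal(size=(n, n, n))
B = 0.5 * (B + B.transpose(0, 2, 1))

errors = newton_errors(0.01 * np.ones(n), B)
for k in range(1, 3):
    print(k, errors[k], errors[k] / errors[k - 1] ** 2)
```

If the iteration is converging quadratically, the printed ratio of each error norm to the square of the previous one should stay bounded, as (28) suggests. Only the first few iterates are informative, since later ones reach rounding level.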
Secant method for non-linear equations
10. Convergence of secant method

To simplify the notation in this section we will consider $x^{(n+2)}$. The general result follows at once.

Now

  $x^{(n+2)} = -J^{-1} L$  (22)

where $L$ is given by

  $L_i = \begin{vmatrix} X \\ F_i \end{vmatrix} \bigg/ \begin{vmatrix} X \\ I \end{vmatrix}$.  (25)

Applying these formulae to equations (26) gives

  $f_i = x_i + B_{ijk} x_j x_k$.

Suppose that successive iterations give much closer approximations to the root. Then

  $|x^{(1)}| \gg |x^{(2)}| \gg \cdots \gg |x^{(n+1)}|$.

Expanding $\begin{vmatrix} X \\ I \end{vmatrix}$ by the bottom row, the dominant term is $|x^{(1)}, x^{(2)}, \ldots, x^{(n)}|$. Expanding $\begin{vmatrix} X \\ F_i \end{vmatrix}$ by the bottom row, the dominant term is $(-1)^n B_{ijk} x_j^{(1)} x_k^{(1)}\,|x^{(2)}, x^{(3)}, \ldots, x^{(n+1)}|$. So

  $L_i = (-1)^n B_{ijk} x_j^{(1)} x_k^{(1)}\,\dfrac{|x^{(2)}, x^{(3)}, \ldots, x^{(n+1)}|}{|x^{(1)}, x^{(2)}, \ldots, x^{(n)}|}$ + higher terms.  (29)

As before $J^{-1} = 1 + O(x)$, and if $x$ is again some linear measure of the error we obtain

  $x^{(n+2)} \sim B\,x^{(1)} x^{(n+1)}$.  (30)

11. Comparison of convergence rates

The convergence rates have been shown to be approximately given by

  $x^{(i+1)} = B\,(x^{(i)})^2$  Newton-Raphson  (31)
  $x^{(i+n+1)} = B\,x^{(i)} x^{(i+n)}$  Secant method  (32)

where $x^{(i)}$ is some linear measure of the error of the $i$th iterate.

Put $v_i = \log(B x^{(i)})$; then

  $v_{i+1} = 2 v_i$  Newton-Raphson  (33)
  $v_{i+n+1} = v_i + v_{i+n}$  Secant method  (34)

Consider first the Newton-Raphson case. Each iteration requires the evaluation of the Jacobian and so involves at least $n + 1$ function evaluations. For the purpose of comparison we will suppose only $n + 1$ are in fact required. From equation (33) the reduction factor is seen to be $2^{1/(n+1)}$ per function evaluation.

Consider now the difference equation (34) describing the secant method. Its characteristic equation is

  $t^{n+1} - t^n - 1 = 0$.  (35)

We wish to find the dominant root of this equation. Equation (35) has one real root $> 1$ and, if $n$ is odd, one other real root between $-1$ and $0$. It has no other real roots.

The roots of the equation are the same as the eigenvalues of the companion matrix:

  $\begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 1 \end{pmatrix}$

This is a non-negative reducible matrix. Hence by the weak form of a theorem of Frobenius (Gantmacher (1959), Varga (1962)), it has a non-negative real eigenvalue equal to its spectral radius. It follows that the only positive real root of equation (35) is equal to this eigenvalue, and hence no other root of equation (35) has larger modulus. It remains to prove that no other root of the equation has the same modulus.

It is obvious that the positive real root is a simple root. Let it be $r$. Then

  $r^n = \dfrac{1}{r - 1}$.  (36)

Suppose $r e^{i\theta}$ is also a root. Then

  $r^n e^{in\theta} (r e^{i\theta} - 1) = 1$  from (35),

so

  $e^{in\theta}\,\dfrac{r e^{i\theta} - 1}{r - 1} = 1$  from (36).

Taking moduli,

  $\left| \dfrac{r e^{i\theta} - 1}{r - 1} \right| = 1$

which is impossible unless $e^{i\theta} = 1$. Hence there are no roots of modulus $r$ other than the real positive root. The positive real root of (35) is therefore dominant. We will henceforth denote this root by $t$.

The solution of the difference equation (34),

  $v_{i+n+1} = v_i + v_{i+n}$,

is hence dominated by $v_i = v_0 t^i$ and the reduction factor per iteration is $t$. A similar result has previously been obtained by Tornheim (1964) as an example of a multipoint iterative method.

Hence the ratio of the number of function evaluations required by Newton-Raphson to the number for the secant method for a given error reduction is

  $R_n = \dfrac{(n + 1) \log t}{\log 2}$.  (37)
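The dominance of the positive real root of (35) can be checked numerically. The sketch below builds one standard companion matrix for $t^{n+1} - t^n - 1 = 0$ (an assumed layout, which may differ from the arrangement printed in the paper), takes its eigenvalue of largest modulus, and compares it with the growth ratio of the difference equation (34). The use of Python with NumPy, the function names, and the starting values of the recurrence are assumptions made for illustration.

```python
import numpy as np

def companion(n):
    """Companion matrix of t^(n+1) - t^n - 1 = 0 (assumed layout)."""
    m = n + 1
    A = np.zeros((m, m))
    A[np.arange(1, m), np.arange(0, m - 1)] = 1.0  # ones on the sub-diagonal
    A[0, m - 1] = 1.0                              # constant coefficient
    A[m - 1, m - 1] = 1.0                          # coefficient of t^n
    return A

def dominant_root(n):
    """Eigenvalue of largest modulus of the companion matrix."""
    eig = np.linalg.eigvals(companion(n))
    return eig[np.argmax(np.abs(eig))]

def recurrence_ratio(n, terms=200):
    """Ratio of successive terms of v_{i+n+1} = v_i + v_{i+n}."""
    v = [1.0] * (n + 1)              # arbitrary positive starting values
    for _ in range(terms):
        v.append(v[-(n + 1)] + v[-1])
    return v[-1] / v[-2]

for n in (1, 2, 3):
    t = dominant_root(n)
    print(n, t.real, abs(t.imag), recurrence_ratio(n))
```

For $n = 1$ the equation reduces to $t^2 - t - 1 = 0$, so the dominant root is the golden ratio, the familiar order of the one-dimensional secant method.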
The secant method may be said to be better by a factor $R_n$. Some values of $R_n$ and $t$ are shown in Table 1.

Table 1. Rates of convergence

  n    t    R_n
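Values of $t$ and $R_n$ can be regenerated directly from (35) and (37). The following sketch does so by bisection; the values it prints are computed here and are not copied from the paper's Table 1. The choice of Python, the bisection bracket $[1, 2]$, the selection of $n$, and the function names are assumptions for illustration.

```python
import math

def dominant_root(n, tol=1e-12):
    """Positive real root of t^(n+1) - t^n - 1 = 0, found by bisection.

    The equation is equivalent to t^n (t - 1) = 1, whose left-hand side
    increases from 0 at t = 1 to at least 2 at t = 2, so the root lies in (1, 2).
    """
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid ** n * (mid - 1.0) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def advantage_ratio(n):
    """R_n = (n + 1) log t / log 2, equation (37)."""
    return (n + 1) * math.log(dominant_root(n)) / math.log(2.0)

print(f"{'n':>3} {'t':>10} {'R_n':>10}")
for n in (1, 2, 3, 5, 10, 20):
    print(f"{n:>3} {dominant_root(n):>10.4f} {advantage_ratio(n):>10.4f}")
```

Bisection is used here only because the bracketing interval is known in advance; any scalar root finder would serve equally well.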
  $|(z^{(k)})^T \delta x^{(k)}| \cdots$  (39)

to ensure the reliable computation of (9).

It will now be shown that conditions (38) and (39) are related by considering the orthogonalization process used to determine the $z^{(k)}$.

Suppose that $\xi_1, \ldots, \xi_m$ are a set of $m < n$ unit vectors. Define the vectors $e_1, \ldots, e_m$ and the scalars $C_1, \ldots, C_m$ by the following equations.

  $\cdots$

It has thus been shown that conditions (38) and (39), although not identical, nevertheless are related. Consequently either of the two tests may be used to ensure convergence under suitable conditions.

As a practical consequence of the above, one or both of the tests are applied at each iteration and the proposed step $\delta x^{(k)}$ rejected if the test fails. A satisfactory alternative is to set $\delta x^{(k)}$ parallel to $z^{(k)}$ (in which case the tests will evidently be satisfied). The magnitude of the alternative step is still arbitrary; a suitable value might be that of the rejected vector.
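The accept-or-replace logic described above can be sketched as follows. The exact form of the tests is not recoverable from the text here, so the sketch assumes a hypothetical stand-in: the step is accepted when the normalized projection $|z^T \delta x| / (\|z\|\,\|\delta x\|)$ is at least a threshold `rho`. The threshold value, the sign convention for the replacement step, the function name, and the use of Python with NumPy are likewise assumptions.

```python
import numpy as np

def accept_or_replace_step(dx, z, rho=1e-4):
    """Return (step, accepted).

    Assumed test: accept dx when |z . dx| >= rho * |z| * |dx|.
    Otherwise replace dx by a vector parallel to z that has the
    magnitude of the rejected step.
    """
    dx = np.asarray(dx, dtype=float)
    z = np.asarray(z, dtype=float)
    norm_dx = np.linalg.norm(dx)
    norm_z = np.linalg.norm(z)
    if norm_dx == 0.0 or norm_z == 0.0:
        raise ValueError("dx and z must both be non-zero")

    cosine = float(z @ dx) / (norm_z * norm_dx)
    if abs(cosine) >= rho:
        return dx, True

    # Rejected: step along z, keeping the length of the rejected vector.
    # The sign is arbitrary; here it follows the sign of z . dx.
    sign = 1.0 if cosine >= 0.0 else -1.0
    return sign * norm_dx * z / norm_z, False

# A step orthogonal to z fails the assumed test and is replaced.
step, accepted = accept_or_replace_step([1.0, 0.0], [0.0, 2.0])
print(step, accepted)
```

With the replacement step parallel to $z$, the normalized projection is exactly one in magnitude, which is why any test of this kind is then satisfied.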
A generalization of the algorithm as defined by equations (7) et seq. is now needed for the case in which $\delta x$ is not prescribed by equation (7). An argument similar to that of Section 3 leads to the replacement of (9) by

  $\dfrac{\cdots}{(z^{(k)})^T \delta x^{(k)}}$  (44)

which reduces to (9) in the usual case. In practice the use of (44) in every case is recommended.

The values of $C_k$ and $A_k$ were monitored for all the experiments of Section 12. From these it would seem that the simplest procedure likely to give consistent results is to test $C_k$ only, and reject the step if $|C_k| < p_0$. A suitable value of $p_0$ might be $10^{-4}$; larger values may delay convergence considerably.

A more general method, which has a much larger domain of convergence, may be formed by imposing a success criterion. The usual criterion employed is the minimization of $f^2$. It is ensured that each step gives rise to an improvement (i.e. reduces $f^2$) by multiplying the step by a suitable scalar in those cases where the direct application of the algorithm does not give rise to an improvement. The imposition of such a criterion ensures convergence over a large domain but does not impair the final convergence rate.

Acknowledgements

The author wishes to express his thanks to Imperial Chemical Industries Limited for permission to publish this paper, to his colleagues Mr. I. Gray and Dr. H. H. Robertson for their constant advice and encouragement, and to the referee for his constructive criticisms.
References
BITTNER, L. (1959). "Eine Verallgemeinerung des Sekantenverfahrens (regula falsi) zur näherungsweisen Berechnung der Nullstellen eines nichtlinearen Gleichungssystems," Wissen. Zeit. der Technischen Hochschule Dresden, Vol. 9, p. 325.
GANTMACHER, F. R. (1959). Applications of the Theory of Matrices, New York: Interscience Publishers Inc.
TODD, J. (1962) (Ed.). A Survey of Numerical Analysis, New York: McGraw-Hill Book Co.
TORNHEIM, L. (1964). "Convergence of Multipoint Iterative Methods," J. Assoc. Comp. Mach., Vol. 11, p. 210.
VARGA, R. S. (1962). Matrix Iterative Analysis, London: Prentice-Hall International.
WOLFE, P. (1959). "The Secant Method for Simultaneous Non-linear Equations," Comm. Assoc. Comp. Mach., Vol. 2, p. 12.