
J. Inst. Maths Applics (1975) 16, 321-328

An Algorithm for Minimax Solution of Overdetermined
Systems of Non-linear Equations

K. MADSEN†

Institute for Numerical Analysis, Technical University of Denmark,
Building 301, DK-2800 Lyngby, Denmark

[Received 29 November 1973 and in revised form 6 November 1974]

Downloaded from http://imamat.oxfordjournals.org/ at Aston University on January 26, 2014


The problem of minimizing the maximum residual of a system of non-linear equations
is studied in the case where the number of equations is larger than the number of un-
knowns. It is supposed that the functions defining the problem have continuous first
derivatives, and the algorithm is based on successive linear approximations to these
functions. The resulting linear systems are solved in the minimax sense, subject to bounds
on the solutions, the bounds being adjusted automatically, depending on the goodness
of the linear approximations. It is proved that the method always has sure convergence
properties. Some numerical examples are given.

1. Introduction
THE PROBLEM of finding a solution to an overdetermined set of non-linear algebraic
equations
    f_j(x) ≡ f_j(x_1, ..., x_n) = 0,  j = 1, 2, ..., m,  m > n, (1)
in the minimax sense, i.e. to find a vector x that minimizes the maximum residual
    F(x) ≡ max_j |f_j(x)|, (2)

can be treated by a method similar to the classical Newton-Raphson iteration with


line searches (Osborne & Watson, 1969). The estimate x_k of the solution is replaced by

    x_{k+1} = x_k + α_k h_k, (3)

where h_k is a minimax solution to the linear system

    f_j(x_k) + Σ_{i=1}^{n} (∂f_j/∂x_i)(x_k) h_i = 0,  j = 1, 2, ..., m, (4)

and α_k > 0 is chosen to minimize F(x_k + α_k h_k). In the case where every linear system
consisting of n of the equations (4) is bounded away from singularity this method will
generate a sequence converging to a local minimum of F. However, if we do not have
this sort of condition the method may fail. It might happen, for example, that the
sequence generated converges rapidly to a point x which is not a stationary point of F.
This is shown by an example given by Powell (1970) (though for the case n = m).
In the least squares solution of the problem (1), the Levenberg (1944)/Marquardt
(1963) method, in which the approximating linear systems are solved subject to a
restriction on the length of the solution, has been used very successfully (Powell, 1972).
The convergence properties of this method suggest that it might be worthwhile to use
a similar technique for the minimax solution of the problem.
† This work has been carried out during the author's stay at AERE Harwell, Didcot, England.

Therefore, in the method we present in this paper, the minimax solution to (4) is
found subject to a restriction

    ||h|| ≡ max_i |h_i| ≤ λ_k, (5)

which is similar to the bounds used by Griffith & Stewart (1961). The value of λ_k
is calculated to try to provide the inequality

    F(x_k + h_k) < F(x_k). (6)

Thus the value of λ_k should depend on the goodness of the linear approximation to (1)
at x = x_k, and should be chosen as large as possible subject to a certain measure of
agreement between each f_j and its linearization being maintained.
Throughout this paper we suppose that the problem under consideration is well
scaled in the variables. It would have been desirable for the method to be independent
of scaling, but we have not been able to find a satisfactory solution to this problem.
In Section 2 we describe our algorithm in detail, and in Section 3 we prove two
convergence theorems. It is proved that the vectors x_k generated by the algorithm
converge to the set of stationary points of F. In Section 4 we give two examples.

2. The Algorithm
We will use the notation

    F(x, h) = max_j |f_j(x) + (∇f_j(x), h)|, (8)

where (∇f_j(x), h) is an inner product.
Now suppose that we have found an estimate x_k of the solution to our problem, and
a bound λ_k on the step length. Then we calculate a displacement h_k to add to x_k as a
solution to the linear problem

    F(x_k, h_k) = min_{||h|| ≤ λ_k} F(x_k, h) (9)

(h_k may be found as the solution to a linear programming problem with (2m + 2n)
constraints, or more efficiently by a method which is an extension of the exchange
algorithm, Powell (1974)).
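The subproblem (9) can be posed directly as a linear program in the unknowns (h, t): minimize t subject to |f_j(x_k) + (∇f_j(x_k), h)| ≤ t and |h_i| ≤ λ_k, which gives exactly the (2m + 2n) constraints mentioned above. The sketch below uses SciPy's general-purpose `linprog` rather than the exchange algorithm of Powell (1974); the function name `minimax_step` is ours, not the paper's.

```python
import numpy as np
from scipy.optimize import linprog

def minimax_step(f, J, lam):
    """Solve min_{||h||_inf <= lam} max_j |f_j + (J h)_j| as an LP.

    Variables are (h_1, ..., h_n, t); we minimize t subject to
        f_j + (J h)_j <= t   and   -(f_j + (J h)_j) <= t,
    i.e. the 2m residual constraints, plus the 2n bound constraints of (5).
    Returns the step h_k and the attained value F(x_k, h_k).
    """
    m, n = J.shape
    c = np.r_[np.zeros(n), 1.0]                 # objective: minimize t
    # J h - t <= -f   and   -J h - t <= f
    A_ub = np.block([[J, -np.ones((m, 1))],
                     [-J, -np.ones((m, 1))]])
    b_ub = np.r_[-f, f]
    bounds = [(-lam, lam)] * n + [(0, None)]    # |h_i| <= lam, t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n], res.x[n]
```

For instance, with f = (1, -1) and both gradients equal to 1, the best step is h = 0 with maximum residual 1, while a single equation f = 2, gradient 1, under the bound λ = 1 gives the restricted step h = -1.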
In order to accept (x_k + h_k) as the next point in the iteration it is desirable to ensure
that the function value will decrease. However, this criterion is not always sufficient to
guarantee convergence, as shown by an example of Fletcher (1972). Therefore we use
the stronger condition

    F(x_k) − F(x_k + h_k) ≥ ρ_1 (F(x_k) − F(x_k, h_k)), (10)

where ρ_1 is a small positive number, i.e. we test whether the decrease in F(x) exceeds a
small multiple of the decrease predicted by the linear approximation. If (10) is
satisfied we choose x_{k+1} = x_k + h_k; otherwise we let x_{k+1} = x_k.
The value of λ_{k+1} is defined in the following way: if the decrease in the function
value is small compared to the predicted decrease, we wish to use a smaller bound.
Therefore, if

    F(x_k) − F(x_k + h_k) < ρ_2 (F(x_k) − F(x_k, h_k)), (11)

where ρ_1 < ρ_2 < 1, we choose λ_{k+1} = σ_1 ||h_k||, 0 < σ_1 < 1. As a consequence of this
we will have λ_{k+1} ≤ σ_1 λ_k in the case where x_{k+1} = x_k. If (11) is not fulfilled, and the
agreement between each f_j and its linearization is very good, we will allow the value
of the bound to increase. So we use the following test: if

    max_j |f_j(x_k + h_k) − {f_j(x_k) + (∇f_j(x_k), h_k)}| ≤ ρ_3 (F(x_k) − F(x_k + h_k)), (12)

0 < ρ_3 < 1, we choose λ_{k+1} = σ_2 ||h_k||, where σ_2 > 1. If neither (11) nor (12) is
satisfied we let λ_{k+1} = ||h_k||.
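In outline, the rules above combine into the following iteration. This is a sketch of our reading of the method, not the author's program: the subproblem (9) is solved here as a linear program (the paper recommends Powell's exchange algorithm), and the function names are ours. The constants are those used for the numerical results in Section 4.

```python
import numpy as np
from scipy.optimize import linprog

# Constants (rho1, rho2, rho3, sigma1, sigma2) as in Section 4:
RHO1, RHO2, RHO3, SIG1, SIG2 = 0.01, 0.25, 0.25, 0.25, 2.0

def step(f, J, lam):
    # Subproblem (9): min over ||h||_inf <= lam of max_j |f_j + (J h)_j|,
    # posed as an LP in the variables (h, t).
    m, n = J.shape
    c = np.r_[np.zeros(n), 1.0]
    A = np.block([[J, -np.ones((m, 1))], [-J, -np.ones((m, 1))]])
    res = linprog(c, A_ub=A, b_ub=np.r_[-f, f],
                  bounds=[(-lam, lam)] * n + [(0, None)], method="highs")
    return res.x[:n], res.x[n]

def minimax(fun, jac, x, lam, iters=100):
    for _ in range(iters):
        f, J = fun(x), jac(x)
        h, pred = step(f, J, lam)          # pred = F(x_k, h_k)
        hnorm = np.max(np.abs(h))
        if hnorm < 1e-12:                  # (numerically) stationary
            break
        f_new = fun(x + h)
        dec = np.max(np.abs(f)) - np.max(np.abs(f_new))   # actual decrease
        model = np.max(np.abs(f)) - pred                  # predicted decrease
        if dec >= RHO1 * model:            # test (10): accept the step
            x = x + h
        if dec < RHO2 * model:             # test (11): shrink the bound
            lam = SIG1 * hnorm
        elif np.max(np.abs(f_new - (f + J @ h))) <= RHO3 * dec:
            lam = SIG2 * hnorm             # test (12): expand the bound
        else:
            lam = hnorm
    return x
```

Note that (12) is only examined when (11) fails, and that a rejected step under (10) always triggers (11), since ρ_1 < ρ_2, so the bound then shrinks.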
Our numerical experience with the algorithm indicates that the choice of the
constants is not critical. For 8 test examples we tried different values (0.20 and 0.33)
of ρ_2, ρ_3 and σ_1, while ρ_1 = 0.01 and σ_2 = 2.0, but we did not find any significant
difference in the results. For the numerical results given in Section 4 we have used the
values (ρ_1, ρ_2, ρ_3, σ_1, σ_2) = (0.01, 0.25, 0.25, 0.25, 2.0).
3. Convergence Theorems
To prove the convergence theorems we need a smoothness assumption which
permits local linearization. We will assume that
    f_j(x + h) = f_j(x) + (∇f_j(x), h) + O(||h||²),  j = 1, ..., m. (13)

We define a "stationary point" as a point at which the linear approximation (4)
predicts that no further progress can be made.
Definition. The vector x is a stationary point of F if

    F(x) = min_h {F(x, h)}. (14)

This is a generalization of the normal definition of the term for differentiable
objective functions. If F is differentiable at a stationary point then the partial deriva-
tives of F are zero. It is easy to prove, Osborne & Watson (1969), that if the linear
systems (4) are bounded away from singularity there is no downhill direction from a
stationary point. We cannot prove anything stronger than convergence towards
stationary points for our algorithm, since an iteration starting at a stationary point
would give h_k = 0, k = 0, 1, 2, ....
LEMMA. Let {x_k} and {h_k} be the sequences of vectors defined in Section 2. If x* is
any point that is not a stationary point of F, then there exist δ > 0 and ε > 0 such that

    F(x_k) − F(x_k, h_k) ≥ δ min{||h_k||, ε}  if  ||x_k − x*|| ≤ ε. (15)

Proof. Let the maximum function value at x* be attained by the first t functions, i.e.

    F(x*) = |f_1(x*)| = ··· = |f_t(x*)| > |f_i(x*)|,  i > t. (16)

From the continuity of the functions and their derivatives it then follows that there
exists ε_1 > 0 such that for ||x − x*|| ≤ ε_1 and ||h|| ≤ ε_1 we have

    max_j |f_j(x) + (∇f_j(x), h)| attained only for 1 ≤ j ≤ t, (17)

and

    |(∇f_j(x), h)| ≤ |f_j(x)|,  j = 1, ..., t. (18)

Since x* is not a stationary point, there exists a vector h* such that

    F(x*, h*) < F(x*). (19)

Therefore we must have

    |f_j(x*) + (∇f_j(x*), h*)| < |f_j(x*)|,  j = 1, ..., t, (20)

and consequently

    f_j(x*)·(∇f_j(x*), h*) < 0,  j = 1, ..., t. (21)

It follows from the continuity of f_j and ∇f_j that there exists ε_2 > 0 such that for
||x − x*|| ≤ ε_2 we have

    f_j(x)·(∇f_j(x), h*) < 0,  j = 1, ..., t, (22)

and

    |(∇f_j(x), h*)| ≥ ε_2,  j = 1, ..., t. (23)

Now suppose that ||x_k − x*|| ≤ ε ≡ min{ε_1, ε_2}. Let y_k be the vector in the direction
of h* with the length min{||h_k||, ε},

    y_k = c_k h*. (24)

Then it follows from the definition of h_k that

    F(x_k, h_k) ≤ F(x_k, y_k). (25)

Expression (17) states that there exists p, 1 ≤ p ≤ t, such that

    F(x_k, y_k) = |f_p(x_k) + (∇f_p(x_k), y_k)| = |f_p(x_k) + (∇f_p(x_k), c_k h*)|, (26)

and, since c_k > 0, (18), (22) and (23) give

    F(x_k, y_k) = |f_p(x_k)| − |(∇f_p(x_k), c_k h*)| ≤ |f_p(x_k)| − ε c_k. (27)

The result is now a consequence of (25), (27) and the definition of c_k:

    F(x_k) − F(x_k, h_k) ≥ F(x_k) − |f_p(x_k)| + ε c_k ≥ ε c_k = (ε/||h*||) min{||h_k||, ε}, (28)

which is (15) with δ = ε/||h*||.

THEOREM 1. If the sequence of vectors x_k, k = 1, 2, ..., generated by the algorithm
converges to a limit x*, then x* is a stationary point of F.

Proof. Suppose x* is not a stationary point. Let ε be defined as in the lemma, and let
k_0 be such that ||x_k − x*|| ≤ ε for all k ≥ k_0. Since (13) gives
F(x_k + h_k) = max_j |f_j(x_k) + (∇f_j(x_k), h_k) + O(||h_k||²)| = F(x_k, h_k) + O(||h_k||²),
it follows from the lemma that for k ≥ k_0 we have the equation

    (F(x_k) − F(x_k + h_k)) / (F(x_k) − F(x_k, h_k)) = 1 + O(||h_k||). (29)

Now the rules of the algorithm for adjusting and using λ_k, and the fact that the
sequence {x_k} converges, imply that ||h_k|| tends to zero. It follows from equation (29)
that for all sufficiently large k inequality (11) is not satisfied, in which case we have
the bound

    λ_{k+1} ≥ ||h_k||. (30)

Further, since the function values F(x_k) must decrease monotonically to F(x*), it also
follows that

    lim_{k→∞} {F(x_k) − F(x_k, h_k)} = 0. (31)

As in the proof of the lemma we let h* be a vector satisfying inequality (19). By
continuity we deduce that there exist η > 0 and a neighbourhood of x* such that for
all x in this neighbourhood

    F(x) − F(x, h*) ≥ η. (32)

Therefore equation (31) can be satisfied only because ||h_k|| is restricted for all
sufficiently large k, in which case the condition

    ||h_k|| = λ_k (33)

is satisfied. Now from (30) and (33) we deduce that ||h_{k+1}|| ≥ ||h_k|| for all sufficiently
large k, contradicting the fact that ||h_k|| tends to zero. Therefore x* must be a stationary
point of F, which proves the theorem.
THEOREM 2. Let x_k, k = 1, 2, ..., be the sequence generated by the algorithm. Let L be
the set of stationary points of F and let

    d(x) = inf_{y ∈ L} ||y − x||. (34)

If the sequence {x_k} stays in a finite region, then

    d(x_k) → 0 for k → ∞. (35)

Proof. Suppose d(x_k) ↛ 0. Then the sequence {x_k} will have infinitely many elements
bounded away from L, and therefore, since the sequence is bounded, it must have a
cluster point x* that is not a stationary point. From Theorem 1 we know then that
x_k ↛ x*. Define

    K_ε* = {x | ||x − x*|| ≤ ε}. (36)

Then, using the lemma, we know that there exist δ > 0 and ε > 0 such that

    F(x_j) − F(x_j, h_j) ≥ δ min{||h_j||, ε}  for x_j ∈ K_ε*, (37)

and

    x_j ∉ K_ε* for an infinite number of values j. (38)

If η = ε/2 we can find an infinite sequence of indices l_i, i = 1, 2, ..., such that
x_{l_i} ∈ K_η*, and there exists a number u_i > l_i such that x_{u_i} ∉ K_ε*. Let {l_i} be chosen so
that l_{i+1} > u_i for all values of i. Then, using (10) and (37), we have for i = 1, 2, ...

    F(x_{l_i}) − F(x_{u_i}) = Σ_{j=l_i}^{u_i−1} {F(x_j) − F(x_{j+1})} ≥ ρ_1 Σ_{j ∈ D_i} δ min{||h_j||, ε} ≥ ρ_1 δ η, (39)

where C_i = {j ∈ N | (l_i ≤ j ≤ u_i − 1) ∧ (x_j ≠ x_{j+1})} and D_i = {j ∈ C_i | x_j ∈ K_ε*}.
Hence F(x_k) → −∞ for k → ∞, which is a contradiction.
4. Examples
As one would expect the final rate of convergence depends on whether the linear
systems (4) are bounded away from singularity. We illustrate the two situations by

two examples. The calculations were performed in double precision on the IBM 370/165
computer.
First we consider an example given by Barrodale, Powell & Roberts (1972). Find
the parameters x_i, i = 1, 2, ..., 5, such that the rational expression

    (x_1 + x_2 y) / (1 + x_3 y + x_4 y² + x_5 y³) (40)

is the best fit to e^y at the points y = y_j = −1(0.1)1. Thus the functions are

    f_j(x_1, ..., x_5) = (x_1 + x_2 y_j) / (1 + x_3 y_j + x_4 y_j² + x_5 y_j³) − e^{y_j},  j = 1, ..., 21. (41)



Starting at x_0 = (0.5, 0, 0, 0, 0) and with λ_0 = 0.125 we find the solution with an
accuracy of 6 decimals after only 8 iterations. The solution found is

    x* = (0.999878, 0.253588, 0.746608, 0.245202, 0.037490), (42)

which agrees with the solution found by Barrodale, Powell & Roberts. In Table 1 we
give some of the constants in the iteration. It is seen that the final rate of convergence
is very rapid. Note that the bound on ||h_k|| is active until x_k is close to the solution.
TABLE 1

k   ||x_k − x_9||   ||h_k||      λ_k          F(x_k)
0   0.747           0.125        0.125        0.222×10¹
1   0.622           0.125        0.125        0.152×10¹
2   0.497           0.125        0.125        0.128×10¹
3   0.373           0.125        0.125        0.607
4   0.308           0.125        0.125        0.146
5   0.190           0.172        0.250        0.509×10⁻²
6   0.183×10⁻¹      0.181×10⁻¹   0.172        0.869×10⁻³
7   0.243×10⁻³      0.243×10⁻³   0.362×10⁻¹   0.135×10⁻³
8   0.457×10⁻⁷      0.457×10⁻⁷   0.486×10⁻³   0.122×10⁻³
9   0                                         0.122×10⁻³

As the second example we consider the 3 functions in 2 variables:

    f_1(x_1, x_2) = x_1² + x_2² + x_1 x_2,
    f_2(x_1, x_2) = sin x_1,
    f_3(x_1, x_2) = cos x_2. (43)
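As a quick numerical check of (43): evaluating the residuals at the starting point x_0 = (3, 1) used below reproduces the first entry of Table 2, F(x_0) = 0.130×10². The code is an illustrative sketch; the function name is ours.

```python
import numpy as np

# The three residuals of (43).
def residuals(x):
    x1, x2 = x
    return np.array([x1 ** 2 + x2 ** 2 + x1 * x2, np.sin(x1), np.cos(x2)])

# At x0 = (3, 1): f1 = 9 + 1 + 3 = 13, which dominates |sin 3| and |cos 1|,
# so F(x0) = 13 = 0.130e2, matching the k = 0 row of Table 2.
F0 = np.max(np.abs(residuals(np.array([3.0, 1.0]))))
```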
Starting at x_0 = (3, 1) and with λ_0 = 1.2 we find the results presented in Table 2.
After 20 iterations the estimate of the solution is correct to only 4 decimals. This
slow rate of convergence is due to the fact that the linear approximation (4) near the
solution has the following form:
ALGORITHM FOR MIN1MAX SOLUTION 327

    ε_k h_{k1} − 1.4 h_{k2} = −0.62
    0.90 h_{k1} = −0.44
    −0.79 h_{k2} = −0.62 (44)

where ε_k → 0 as k → ∞. This implies that the unrestricted solution to the system (4)
does not tend to zero, so ||h_k|| = λ_k when x_k is near the solution. Therefore the rate
of convergence is no better than linear.
TABLE 2

k    x_{k1}    x_{k2}     ||h_k||     λ_k         F(x_k)
0    3.0000    1.0000     0.11×10¹    0.12×10¹    0.130×10²
1    1.9021    0.1827     0.55        0.11×10¹    0.400×10¹
2    1.4811    −0.3633    0.55        0.55        0.179×10¹
3    0.9352    −0.9092    0.55        0.55        0.851
4    0.9352    −0.9092    0.14        0.14        0.851
5    0.7987    −0.9136    0.14        0.14        0.743
6    0.6622    −0.8924    0.27        0.27        0.644
7    0.6622    −0.8924    0.68×10⁻¹   0.68×10⁻¹   0.644
8    0.5939    −0.8993    0.68×10⁻¹   0.68×10⁻¹   0.627
9    0.5257    −0.9066    0.68×10⁻¹   0.68×10⁻¹   0.622
10   0.4574    −0.9088    0.68×10⁻¹   0.68×10⁻¹   0.619
20   0.4533    −0.9066    0.67×10⁻⁴   0.67×10⁻⁴   0.616

Finally, in Table 3 we give the number of iterations used to find the solution (with
4 decimals accuracy) of two types of test examples, well known from work on least
squares. These are the Rosenbrock (1960) function,

    f_1(x) = 10(x_2 − x_1²),
    f_2(x) = 1 − x_1, (45)

and the "Chebyquad" functions as defined by Fletcher (1965). For comparison we
quote the number of iterations used to find the least squares solution of the problem
(1) by the method of Fletcher (1971), which is similar to our method. In two cases
different solutions were found because the objective functions were different from 0 at
the minimum. For the functions (41) the solutions found agreed in the first two
decimals.
TABLE 3

                  n    m    Least squares    Minimax
Functions (41)    5    21   38               8
          (43)    2    3    19†              20†
          (45)    2    2    17               20
Chebyquad         2    2    4                4
                  4    4    6                5
                  6    6    8                5
                  8    8    22†              14†

† Different solutions.
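The Rosenbrock residuals (45) can likewise be checked directly. The starting point (−1.2, 1) below is Rosenbrock's customary one; this is an assumption on our part, since the paper does not state the starting point it used.

```python
import numpy as np

# Residuals of (45); the minimax (and least-squares) solution is x = (1, 1),
# where both residuals vanish.
def rosen(x):
    return np.array([10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]])

# At (-1.2, 1): f1 = 10*(1 - 1.44) = -4.4 and f2 = 2.2, so F = 4.4.
F_start = np.max(np.abs(rosen(np.array([-1.2, 1.0]))))
F_sol = np.max(np.abs(rosen(np.array([1.0, 1.0]))))   # 0 at the solution
```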

Five other types of examples, with the number of unknowns ranging from 2 to 20,
have been used for test purposes. The computational experience in these cases has
been very similar to that shown in the examples above.

The author is greatly indebted to M. J. D. Powell for many helpful comments and
suggestions.

REFERENCES
BARRODALE, I., POWELL, M. J. D. & ROBERTS, F. D. K. 1972 S.I.A.M. J. Num. Anal. 9, 493.
FLETCHER, R. 1965 Comput. J. 8, 33-41.
FLETCHER, R. 1971 Report R. 6799, AERE Harwell, Didcot, Berks, England.
FLETCHER, R. 1972 Math. Prog. 2, 133.
GRIFFITH, R. E. & STEWART, R. A. 1961 Man. Sci. 7, 379.
LEVENBERG, K. 1944 Q. Appl. Math. 2, 164.
MARQUARDT, D. W. 1963 S.I.A.M. J. 11, 431.
OSBORNE, M. R. & WATSON, G. A. 1969 Comput. J. 12, 63.
POWELL, M. J. D. 1970 In Numerical Methods for Non-linear Algebraic Equations (ed. P. Rabinowitz). London and New York: Gordon and Breach, p. 87.
POWELL, M. J. D. 1972 In Numerical Methods for Unconstrained Optimization (ed. W. Murray). London and New York: Academic Press, p. 29.
POWELL, M. J. D. 1974 Report No. CSS 11, AERE, Harwell, Didcot, Berks, England.
ROSENBROCK, H. H. 1960 Comput. J. 3, 175-184.
