58 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 6, NO. 1, FEBRUARY 2002

The Particle Swarm—Explosion, Stability, and Convergence in a Multidimensional Complex Space
Maurice Clerc and James Kennedy

Abstract—The particle swarm is an algorithm for finding optimal regions of complex search spaces through the interaction of individuals in a population of particles. Even though the algorithm, which is based on a metaphor of social interaction, has been shown to perform well, researchers have not adequately explained how it works. Further, traditional versions of the algorithm have had some undesirable dynamical properties, notably the particles' velocities needed to be limited in order to control their trajectories. The present paper analyzes a particle's trajectory as it moves in discrete time (the algebraic view), then progresses to the view of it in continuous time (the analytical view). A five-dimensional depiction is developed, which describes the system completely. These analyses lead to a generalized model of the algorithm, containing a set of coefficients to control the system's convergence tendencies. Some results of the particle swarm optimizer, implementing modifications derived from the analysis, suggest methods for altering the original algorithm in ways that eliminate problems and increase the ability of the particle swarm to find optima of some well-studied test functions.

Index Terms—Convergence, evolutionary computation, optimization, particle swarm, stability.

Manuscript received January 24, 2000; revised October 30, 2000 and April 30, 2001.

I. INTRODUCTION

PARTICLE swarm adaptation has been shown to successfully optimize a wide range of continuous functions [1]–[5]. The algorithm, which is based on a metaphor of social interaction, searches a space by adjusting the trajectories of individual vectors, called "particles" as they are conceptualized as moving points in multidimensional space. The individual particles are drawn stochastically toward the positions of their own previous best performance and the best previous performance of their neighbors.

While empirical evidence has accumulated that the algorithm "works," i.e., it is a useful tool for optimization, there has thus far been little insight into how it works. The present analysis begins with a highly simplified deterministic version of the particle swarm in order to provide an understanding about how it searches the problem space [4], then continues on to analyze the full stochastic system. A generalized model is proposed, including methods for controlling the convergence properties of the particle system. Finally, some empirical results are given, showing the performance of various implementations of the algorithm on a suite of test functions.

A. The Particle Swarm

A population of particles is initialized with random positions $\vec{x}_i$ and velocities $\vec{v}_i$ and a function $f$ is evaluated, using the particle's positional coordinates as input values. Positions and velocities are adjusted and the function evaluated with the new coordinates at each time step. When a particle discovers a pattern that is better than any it has found previously, it stores the coordinates in a vector $\vec{p}_i$. The difference between $\vec{p}_i$ (the best point found by $i$ so far) and the individual's current position is stochastically added to the current velocity, causing the trajectory to oscillate around that point. Further, each particle is defined within the context of a topological neighborhood comprising itself and some other particles in the population. The stochastically weighted difference between the neighborhood's best position $\vec{p}_g$ and the individual's current position is also added to its velocity, adjusting it for the next time step. These adjustments to the particle's movement through the space cause it to search around the two best positions.

The algorithm in pseudocode follows.

    Initialize population
    Do
        For i = 1 to Population Size
            if f(x_i) < f(p_i) then p_i = x_i
            p_g = min(p_neighbors)
            For d = 1 to Dimension
                v_id = v_id + φ1 (p_id − x_id) + φ2 (p_gd − x_id)
                v_id = sign(v_id) · min(abs(v_id), Vmax)
                x_id = x_id + v_id
            Next d
        Next i
    Until termination criterion is met

The variables $\varphi_1$ and $\varphi_2$ are random positive numbers, drawn from a uniform distribution and defined by an upper limit $\varphi_{\max}$, which is a parameter of the system. In this version, the variable $v_{id}$ is limited to the range $\pm V_{\max}$ for reasons that will be explained below. The values of the elements in $\vec{p}_g$ are determined by comparing the best performances of all the members of $i$'s topological neighborhood, defined by indexes of some other population members and assigning the best performer's index to the variable $g$. Thus, $\vec{p}_g$ represents the best position found by any member of the neighborhood.
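As a rough illustration, the pseudocode above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the neighborhood is taken to be the whole population, $\varphi_1$ and $\varphi_2$ are redrawn for every dimension, and the names `pso_vmax`, `sphere`, `phi_max`, `v_max`, and `bounds` are hypothetical choices made for the example.

```python
import random


def sphere(x):
    """Illustrative objective (an assumption, not from the paper): sum of squares."""
    return sum(xi * xi for xi in x)


def pso_vmax(f, dim, n_particles=20, phi_max=2.0, v_max=1.0,
             bounds=(-5.0, 5.0), iterations=200, seed=0):
    """Particle swarm with velocity clamping; the neighborhood is the whole swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_particles)]
    p = [xi[:] for xi in x]        # best position found so far by each particle
    p_val = [f(xi) for xi in x]    # objective value at each personal best
    for _ in range(iterations):
        for i in range(n_particles):
            fx = f(x[i])
            if fx < p_val[i]:      # if f(x_i) < f(p_i) then p_i = x_i
                p[i], p_val[i] = x[i][:], fx
            # index g of the best performer in the (global) neighborhood
            g = min(range(n_particles), key=p_val.__getitem__)
            for d in range(dim):
                phi1 = rng.uniform(0.0, phi_max)
                phi2 = rng.uniform(0.0, phi_max)
                v[i][d] += phi1 * (p[i][d] - x[i][d]) + phi2 * (p[g][d] - x[i][d])
                # same effect as sign(v) * min(abs(v), Vmax)
                v[i][d] = max(-v_max, min(v_max, v[i][d]))
                x[i][d] += v[i][d]
    g = min(range(n_particles), key=p_val.__getitem__)
    return p[g], p_val[g]
```

Calling `pso_vmax(sphere, dim=2)` returns the best position found and its objective value. The clamp is written with `max`/`min` rather than `sign`, which gives the same result and avoids a separate sign function.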
M. Clerc is with France Télécom, 74988 Annecy, France (e-mail: Maurice.Clerc@WriteMe.com).
J. Kennedy is with the Bureau of Labor Statistics, Washington, DC 20212 USA (e-mail: Kennedy_jim@bls.gov).
Publisher Item Identifier S 1089-778X(02)02209-9.
1089–778X/02$17.00 © 2002 IEEE

The random weighting of the control parameters in the algorithm results in a kind of explosion or a "drunkard's walk" as particles' velocities and positional coordinates careen toward infinity. The explosion has traditionally been contained through
implementation of a $V_{\max}$ parameter, which limits step size or velocity. The current paper, however, demonstrates that the implementation of properly defined constriction coefficients can prevent explosion; further, these coefficients can induce particles to converge on local optima.

An important source of the swarm's search capability is the interactions among particles as they react to one another's findings. Analysis of interparticle effects is beyond the scope of this paper, which focuses on the trajectories of single particles.

B. Simplification of the System

We begin the analysis by stripping the algorithm down to its simplest form; we will add things back in later. The particle swarm formula adjusts the velocity $v_{id}$ by adding two terms to it. The two terms are of the same form, i.e., $\varphi(p - x)$, where $p$ is the best position found so far, by the individual particle in the first term, or by any neighbor in the second term. The formula can be shortened by redefining $p_{id}$ as follows:

$$p_{id} \leftarrow \frac{\varphi_1 p_{id} + \varphi_2 p_{gd}}{\varphi_1 + \varphi_2}$$

Thus, we can simplify our initial investigation by looking at the behavior of a particle whose velocity is adjusted by only one term

$$v_{id}(t+1) = v_{id}(t) + \varphi \left( p_{id} - x_{id}(t) \right)$$

where $\varphi = \varphi_1 + \varphi_2$. This is algebraically identical to the standard two-term form.

When the particle swarm operates on an optimization problem, the value of $p$ is constantly updated, as the system evolves toward an optimum. In order to further simplify the system and make it understandable, we set $p$ to a constant value in the following analysis. The system will also be more understandable if we make $\varphi$ a constant as well; where normally it is defined as a random number between zero and a constant upper limit, we will remove the stochastic component initially and reintroduce it in later sections. The effect of $\varphi$ on the system is very important and much of the present paper is involved in analyzing its effect on the trajectory of a particle.

The system can be simplified even further by considering a one-dimensional (1-D) problem space and again further by reducing the population to one particle. Thus, we will begin by looking at a stripped-down particle by itself, i.e., a population of one 1-D deterministic particle, with a constant $p$.

Thus, we begin by considering the reduced system

$$v(t+1) = v(t) + \varphi \left( p - x(t) \right), \qquad x(t+1) = x(t) + v(t+1) \tag{1.1}$$

where $p$ and $\varphi$ are constants. No vector notation is necessary and there is no randomness.

In [4], Kennedy found that the simplified particle's trajectory is dependent on the value of the control parameter $\varphi$ and recognized that randomness was responsible for the explosion of the system, although the mechanism that caused the explosion was not understood. Ozcan and Mohan [6], [7] further analyzed the system and concluded that the particle as seen in discrete time "surfs" on an underlying continuous foundation of sine waves.

The present paper analyzes the particle swarm as it moves in discrete time (the algebraic view), then progresses to the view of it in continuous time (the analytical view). A five-dimensional (5-D) depiction is developed, which completely describes the system. These analyses lead to a generalized model of the algorithm, containing a set of coefficients to control the system's convergence tendencies. When randomness is reintroduced to the full model with constriction coefficients, the deleterious effects of randomness are seen to be controlled. Some results of the particle swarm optimizer, using modifications derived from the analysis, are presented; these results suggest methods for altering the original algorithm in ways that eliminate some problems and increase the optimization power of the particle swarm.

II. ALGEBRAIC POINT OF VIEW

The basic simplified dynamic system is defined by

$$v_{t+1} = v_t + \varphi y_t, \qquad y_{t+1} = -v_t + (1 - \varphi)\, y_t \tag{2.1}$$

where $y_t = p - x_t$.

Let

$$P_t = \begin{bmatrix} v_t \\ y_t \end{bmatrix}$$

be the current point in $\mathbb{R}^2$ and

$$M = \begin{bmatrix} 1 & \varphi \\ -1 & 1 - \varphi \end{bmatrix}$$

the matrix of the system. In this case, we have $P_{t+1} = M P_t$ and, more generally, $P_t = M^t P_0$. Thus, the system is defined completely by $M$.

The eigenvalues of $M$ are

$$e_1 = 1 - \frac{\varphi}{2} + \frac{\sqrt{\varphi^2 - 4\varphi}}{2}, \qquad e_2 = 1 - \frac{\varphi}{2} - \frac{\sqrt{\varphi^2 - 4\varphi}}{2} \tag{2.2}$$

We can immediately see that the value $\varphi = 4$ is special. Below, we will see what this implies.

For $\varphi \neq 4$, we can define a matrix $A$ so that

$$A M A^{-1} = L = \begin{bmatrix} e_1 & 0 \\ 0 & e_2 \end{bmatrix} \tag{2.3}$$

(note that $A^{-1}$ does not exist when $\varphi = 4$).

For example, from the canonical form , we find

(2.4)

In order to simplify the formulas, we multiply by to produce a matrix

(2.5)
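The relationship between the reduced system and its matrix form can be checked numerically. The sketch below is illustrative only and assumes the system matrix $M = \begin{bmatrix} 1 & \varphi \\ -1 & 1-\varphi \end{bmatrix}$ acting on the point $(v_t, y_t)$ with $y_t = p - x_t$; the function names `step`, `eigenvalues`, and `trajectory_from_11` are hypothetical.

```python
import cmath


def step(v, y, phi):
    """One iteration of the matrix form: returns (v_{t+1}, y_{t+1})."""
    return v + phi * y, -v + (1.0 - phi) * y


def eigenvalues(phi):
    """Roots of lambda^2 - (2 - phi) * lambda + 1 = 0, the characteristic equation of M."""
    root = cmath.sqrt(phi * phi - 4.0 * phi)  # complex when 0 < phi < 4
    return 1.0 - phi / 2.0 + root / 2.0, 1.0 - phi / 2.0 - root / 2.0


def trajectory_from_11(v0, x0, p, phi, steps):
    """Iterate the reduced system (1.1) and report (v_t, y_t) with y_t = p - x_t."""
    v, x = v0, x0
    points = [(v, p - x)]
    for _ in range(steps):
        v = v + phi * (p - x)
        x = x + v
        points.append((v, p - x))
    return points
```

For any $\varphi$, the two eigenvalues multiply to the determinant of $M$ (which is 1) and sum to its trace ($2 - \varphi$), and iterating `step` from the same starting point reproduces the points returned by `trajectory_from_11` up to rounding.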
So, if we define $Q_t = A P_t$, we can now write

$$Q_{t+1} = L Q_t \tag{2.6}$$

and, finally, $Q_t = L^t Q_0$.

However, $L$ is a diagonal matrix, so we have simply

$$L^t = \begin{bmatrix} e_1^t & 0 \\ 0 & e_2^t \end{bmatrix} \tag{2.7}$$

In particular, there is cyclic behavior in the system if and only if $Q_t = Q_0$ (or, more generally, if $Q_{t+k} = Q_t$). This just means that we have a system of two equations

$$e_1^t = 1, \qquad e_2^t = 1 \tag{2.8}$$

A. Case $0 < \varphi < 4$

For $0 < \varphi < 4$, the eigenvalues are complex and there is always at least one (real) solution for $\varphi$. More precisely, we can write

$$e_1 = \cos\theta + i \sin\theta, \qquad e_2 = \cos\theta - i \sin\theta \tag{2.9}$$

with $\cos\theta = 1 - \varphi/2$ and $\sin\theta = \sqrt{4\varphi - \varphi^2}/2$. Then

$$e_1^t = \cos(t\theta) + i \sin(t\theta), \qquad e_2^t = \cos(t\theta) - i \sin(t\theta) \tag{2.10}$$

and cycles are given by any $\theta$ such that $\theta = 2k\pi/t$. So for each $t$, the solutions for $\varphi$ are given by

$$\varphi = 2\left(1 - \cos\frac{2k\pi}{t}\right) \quad \text{for } k \in \{1, 2, \ldots, t-1\} \tag{2.11}$$

Table I gives some nontrivial values of $\varphi$ for which the system is cyclic.

TABLE I
SOME φ VALUES FOR WHICH THE SYSTEM IS CYCLIC

Fig. 1(a)–(d) show the trajectories of a particle in phase space, for various values of $\varphi$. When $\varphi$ takes on one of the values from Table I, the trajectory is cyclical; for any other value, the system is just quasi-cyclic, as in Fig. 1(d).

Fig. 1. (a) Cyclic trajectory of a nonrandom particle when φ = 3. (b) Cyclic trajectory of a nonrandom particle when φ = (5 + √5)/2. (c) Cyclic trajectory of a nonrandom particle when φ = (5 − √5)/2. (d) Particle's more typical quasi-cyclic behavior when φ does not satisfy (2.11). Here, φ = 2.1.

We can be a little bit more precise. Below, $\|\cdot\|$ is the 2-norm (the Euclidean one for a vector)

(2.12)

For example, for and , we have

(2.13)

B. Case $\varphi > 4$

If $\varphi > 4$, then $e_1$ and $e_2$ are real numbers (and ), so we have either:
1) $e_1 = e_2 = -1$ (for $t$ even) which implies $\varphi = 4$, not consistent with the hypothesis $\varphi > 4$;
2) $e_1 = -e_2 = 1$ (or $-1$), which is impossible;
3) $e_1 = e_2 = 1$, that is to say $\varphi = 0$, not consistent with the hypothesis $\varphi > 4$.

So, and this is the point: there is no cyclic behavior for $\varphi > 4$ and, in fact, the distance from the point to the center (0,0) is strictly monotonic increasing with $t$, which means that

(2.14)

So

(2.15)

One can also write

(2.16)

So, finally, increases like .
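The two regimes discussed above can be explored with a short numerical experiment. This is only a sketch: it assumes the cycle condition $\varphi = 2(1 - \cos(2k\pi/t))$, which reproduces the $\varphi$ values quoted in the caption of Fig. 1, and the helper names `cyclic_phi`, `iterate`, and `norm_history` are hypothetical.

```python
import math


def cyclic_phi(t, k):
    """phi that yields a cycle of period t, for k = 1, ..., t - 1."""
    return 2.0 * (1.0 - math.cos(2.0 * k * math.pi / t))


def iterate(v, y, phi, steps):
    """Apply v' = v + phi*y, y' = -v + (1 - phi)*y `steps` times."""
    for _ in range(steps):
        v, y = v + phi * y, -v + (1.0 - phi) * y
    return v, y


def norm_history(v, y, phi, steps):
    """Euclidean distance of (v_t, y_t) from (0, 0) at t = 0, ..., steps."""
    history = [math.hypot(v, y)]
    for _ in range(steps):
        v, y = v + phi * y, -v + (1.0 - phi) * y
        history.append(math.hypot(v, y))
    return history
```

With `cyclic_phi(5, 1)`, which equals (5 − √5)/2, `iterate(1.0, 0.5, phi, 5)` returns to the starting point up to rounding, while `norm_history(1.0, 0.5, 5.0, 10)` shows the distance from the origin growing by several orders of magnitude for that starting point when $\varphi = 5$.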

In Section IV, this result is used to prevent the explosion of as long as , which means that de-
the system, which can occur when particle velocities increase creases as long as
without control.

C. Case Integer part (2.22)

In this situation
After that, increases.
The same analysis can be performed for . In this case,
, as well, so the formula is the same. In fact, to be even
In this particular case, the eigenvalues are both equal to more precise, if
and there is just one family of eigenvectors, generated by

So, we have . then we have


Thus, if is an eigenvector, proportional to (that is to say,
if ), there are just two symmetrical points, for
(2.23)
(2.17)
Thus, it can be concluded that decreases/increases al-
In the case where is not an eigenvector, we can directly
most linearly when is big enough. In particular, even if it
compute how decreases and/or increases.
begins to decrease, after that it tends to increase almost like
Let us define . By recurrence, the
.
following form is derived:
(2.18) III. ANALYTIC POINT OF VIEW
where , , are integers so that for . A. Basic Explicit Representation
The integers can be negative, zero, or positive. From the basic iterative (implicit) representation, the fol-
Supposing for a particular we have , one can easily lowing is derived:
compute . This quantity is pos-
itive if and only if is not between (or equal to) the roots
.
Now, if is computed, then we have
(3.1)
and the roots are .
As , this result means that is also positive. Assuming a continuous process, this becomes a classical
So, as soon as begins to increase, it does so infinitely, but second-order differential equation
it can decrease, at the beginning. The question to be answered
next is, how long can it decrease before it begins increasing?
Now take the case of . This means that is between (3.2)
and . For instance, in the case where 1
where and are the roots of
with (2.19)
(3.3)
By recurrence, the following is derived:
As a result

(3.4)

with
(2.20) The general solution is

Finally (3.5)

(2.21) A similar kind of expression for is now produced, where


(3.6)

1 Note that the present paper uses the Bourbaki convention of representing open intervals with reversed brackets. Thus, ]a,b[ is equivalent to parenthetical notation (a,b).

The coefficients and depend on and . If


, we have

(3.7)
(3.13)

In the case where , (3.5) and (3.6) give


C. General Implicit and Explicit Representations
(3.8) A more general implicit representation (IR) is produced by
adding five coefficients , which will allow us to
identify how the coefficients can be chosen in order to ensure
so we must have convergence. With these coefficients, the system becomes
(3.9)

in order to prevent a discontinuity.


Regarding the expressions and , eigenvalues of the ma-
trix , as in Section II above, the same discussion about the (3.14)
sign of ( ) can be made, particularly about the (non) ex-
istence of cycles. The matrix of the system is now
The above results provide a guideline for preventing the ex-
plosion of the system, for we can immediately see that it depends
on whether we have

(3.10) Let and be its eigenvalues.


The (analytic) explicit representation (ER) becomes

B. A Posteriori Proof
One can directly verify that and are, indeed, solu-
tions of the initial system.
On one hand, from their expressions
(3.15)

(3.11) with

and on the other hand (3.16)

Now the constriction coefficients (see Section IV for details)


and are defined by
(3.12)
(3.17)
and also
with

(3.18)

which are the eigenvalues of the basic system. By computing


the eigenvalues directly and using (3.17), and are

(3.19)

The final complete ER can then be written from (3.15) and D. From ER to IR
(3.16) by replacing and , respectively, by and The ER will be useful to find convergence conditions. Nev-
and then , , , by their expressions, as seen in (3.18) ertheless, in practice, the iterative form obtained from (3.19) is
and (3.19). very useful, as shown in (3.24) at the bottom of the page.
It is immediately worth noting an important difference be- Although there are an infinity of solutions in terms of the five
tween IR and ER. In the IR, is always an integer and and parameters , it is interesting to identify some par-
are real numbers. In the ER, real numbers are obtained if ticular classes of solutions. This will be done in the next section.
and only if is an integer; nothing, however, prevents the assign-
ment of any real positive value to , in which case and E. Particular Classes of Solutions
become true complex numbers. This fact will provide an elegant
1) Class 1 Model: The first model implementing the five-
way of explaining the system’s behavior, by conceptualizing it
parameter generalization is defined by the following relations:
in a 5-D space, as discussed in Section IV.
Note 3.1: If and are to be real numbers for a given
value, there must be some relations among the five real coef- (3.25)
ficients . If the imaginary parts of and are
set equal to zero, (3.20) is obtained, as shown at the bottom of In this particular case, and are
the page, with
sign

sign
(3.26)
sign
An easy way to ensure real coefficients is to have
. Under this additional condition, a class of solution is
sign simply given by

(3.21) (3.27)
The two equalities of (3.20) can be combined and simplified
2) Class Model: A related class of model is defined by
as follows:
the following relation:
sign
sign (3.28)
(3.22)
The solutions are usually not completely independent of . In The expressions in (3.29), shown at the bottom of the next
order to satisfy these equations, a set of possible conditions is page, for and are derived from (3.24).
If the condition is added, then
(3.23)
or (3.30)
However, these conditions are not necessary. For example,
an interesting particular situation (studied below) exists where Without this condition, one can choose a value for , for ex-
. In this case, for ample, and a corresponding value ( ), which give
any value and (3.20) is always satisfied. a convergent system.

sign sign
(3.20)
sign sign

(3.24)

3) Class Model: A second model related to the Class 1 2) For the Class models, with the conditions
formula is defined by and

(3.31)

(3.32) (3.38)
For historical reasons and for its simplicity, the case has 3) For the the Class 2 models, see (3.39) at the bottom of the
been well studied. See Section IV-C for further discussion. page, with .
4) Class 2 Model: A second class of models is defined by This means that we will just have to choose ,
the relations , and , class , respectively, to have a
convergent system. This will be discussed further in Section IV.
(3.33)
F. Removing the Discontinuity
Under these constraints, it is clear that Depending on the parameters the system
may have a discontinuity in due to the presence of the term
in the eigen-
values.
Thus, in order to have a completely continuous system, the
(3.34) values for must be chosen such that

which gives us and , respectively.


Again, an easy way to obtain real coefficients for every
value is to have . In this case (3.40)
By computing the discriminant, the last condition is found to
(3.35) be equivalent to
(3.41)
In the case where , the following is obtained:
In order to be “physically plausible,” the parameters
must be positive. So, the condition becomes
(3.36) (3.42)
The set of conditions taken together specify a volume in
From the standpoint of convergence, it is interesting to note for the admissible values of the parameters.
that we have the following.
G. Removing the Imaginary Part
1) For the Class 1 models, with the condition
When the condition specified in (3.42) is met, the trajectory
is usually still partly in a complex space whenever one of the
(3.37)
eigenvalues is negative, due to the fact that is a complex

(3.29)

(3.39)

number when is not an integer. In order to prevent this, we


must find some stronger conditions in order to maintain positive
eigenvalues.
Since

(3.43)

the following conditions can be used to ensure positive eigen-


values:

(3.44)

Note 3.2: From an algebraic point of view, the conditions


described in (3.43) can be written as

(3.45)
trace
Now, these conditions depend on . Nevertheless, if the max-
imum value is known, they can be rewritten as

(3.46)

Under these conditions, all system variables are real numbers


in conjunction with the conditions in (3.42) and (3.44), the pa-
rameters can be selected so that the system is completely con-
tinuous and real. (b)
= =
Fig. 2. (a) Convergent trajectory in phase space of a particle when 1
H. Example and  =  , where ' = 4. Both velocity v and y , the difference between the
previous best p, and the current position x converge to 0.0. (b) y increases over
As an example, suppose that and . Now the time, even when the parameters are real and not complex.
conditions become
The answer is no. It can be demonstrated that convergence is
(3.47) not always guaranteed for real-valued variables. For example,
given the following parameterization:

For example, when


(3.50)

(3.48)
the relations are

the system converges quite quickly after about 25 time steps (3.51)
and at each time step the values of and are almost the same
over a large range of values. Fig. 2(a) shows an example of
convergence ( and ) for a continuous real-valued which will produce system divergence when (for in-
system with . stance), since . This is seen in Fig. 2(b)

I. Reality and Convergence IV. CONVERGENCE AND SPACE OF STATES


The quick convergence seen in the above example suggests From the general ER, we find the criterion of convergence
an interesting question. Does reality—using real-valued vari-
ables—imply convergence? In other words, does the following (4.1)
hold for real-valued system parameters:
where and are usually true complex numbers.
(3.49) Thus, the whole system can be represented in a 5-D space
Re Im Re Im .

In this section, we study some examples of the most simple


class of constricted cases: the ones with just one constriction
coefficient. These will allow us to devise methods for control-
ling the behavior of the swarm in ways that are desirable for
optimization.

A. Constriction for Model Type 1


Model Type 1 is described as follows:
(4.2)

We have seen that the convergence criterion is satisfied when


. Since , the constriction
coefficient below is produced

(4.3)

B. Constriction for Model Type


Just as a constriction coefficient was found for the Type 1
model, the following IR (with instead of ) is used for Type :

(4.4) (b)
Fig. 3. (a) Type 1 constriction coefficient  as a function of ' and . It drops
The coefficient becomes below  only when ' > 4:0. (b) Type 1 coefficient is less than 1.0 when ' <
4:0. These coefficients identify the conditions for convergence of the particle
system.
for (4.5)

However, as seen above, this formula is a priori valid only mean and minimally acceptable values for sure convergence.
when , so it is interesting to find another constriction For example, for , the constraint must hold,
coefficient that has desirable convergence properties. We have but there is no such restriction on if .
here Note 4.1: The above analysis is for constant. If is
random, it is nevertheless possible to have convergence, even
with a small constriction coefficient, when at least one value
(4.6) is strictly inside the interval of variation.

The expression under the square root is negative for C. Constriction Type
. In this case, the eigenvalue is a Referring to the Class model, in the particular case where
true complex number and . Thus, if , , we use the following IR (with instead of )
that is to say, if , a needs to be selected such that
in order to satisfy the convergence cri- (4.8)
terion. So, for example, define as
In fact, this system is hardly different from the classical par-
for (4.7) ticle swarm as described in the Section I
Now, can another formula for greater values be found? The (4.9)
answer is no. For in this case, is a real number and its absolute
value is: so it may be interesting to detail how, in practice, the constriction
1) strictly decreasing on and the coefficient is found and its convergence properties proven.
minimal value is (greater than 1); Step 1) Matrix of the System
2) strictly decreasing on , with a We have immediately
limit of 1.
For simplicity, the formula can be the same as for Type 1, (4.10)
not only for , but also for . This is, indeed, also
possible, but then cannot be too small, depending on . More Step 2) Eigenvalues
precisely, the constraint must be sat- They are the two solutions for the equation
isfied. However, as for , we have , which means
that the curves in Fig. 3(a) and (b) can then be interpreted as the trace determinant (4.11)

or

(4.12)

Thus

(4.13)

with

trace determinant

(4.14)

Fig. 4. Discriminant remains negative within some bounds of φ, depending on the value of κ, ensuring that the particle system will eventually converge.

TABLE II
VALUES OF φ BETWEEN WHICH THE DISCRIMINANT IS NEGATIVE, FOR TWO SELECTED VALUES OF κ

Step 3) Complex and Real Areas on
The discriminant is negative for the values in . In this area, the eigenvalues are true complex numbers and their absolute value (i.e., module) is simply .
Step 4) Extension of the Complex Region and Constriction
Coefficient
In the complex region, according to the conver-
gence criterion, in order to get convergence.
So the idea is to find a constriction coefficient de- . This relation is valid as soon as .
pending on so that the eigenvalues are true com- Fig. 4 shows how the discriminant depends on , for two
plex numbers for a large field of values. In this values. It is negative between the values given in Table II.
case, the common absolute value of the eigenvalues
is D. Moderate Constriction
for While it is desirable for the particle’s trajectory to converge,
(4.15) by relaxing the constriction the particle is allowed to oscillate
else
through the problem space initially, searching for improvement.
which is smaller than one for all values as soon as Therefore, it is desirable to constrict the system moderately,
is itself smaller than one. preventing explosion while still allowing for exploration.
This is generally the most difficult step and sometimes needs To demonstrate how to produce moderate constriction, the
some intuition. Three pieces of information help us here: following ER is used:
1) the determinant of the matrix is equal to ;
2) this is the same as in Constriction Type 1;
3) we know from the algebraic point of view the system is
(eventually) convergent like . (4.18)
So it appears very probable that the same constriction coeffi-
cient used for Type 1 will work. First, we try that is to say

(4.16)

that is to say
From the relations between ER and IR, (4.19) is obtained, as
for shown at the bottom of the next page.
(4.17) There is still an infinity of possibilities for selecting the pa-
else rameters . In other words, there are many different IRs
that produce the same explicit one. For example
It is easy to see that is negative only between and ,
depending on . The general algebraic form of is quite
complicated (polynomial in with some coefficients being
roots of an equation in ) so it is much easier to compute
it indirectly for some values. If is smaller than four,
then and by solving we find that (4.20)

Fig. 5. Real parts of y and v, varying φ over 50 units of time, for a range of φ values.

or

(4.21)

From a mathematical point of view, this case is richer than
the previous ones. There is no more explosion, but there is not
always convergence either. This system is “stabilized” in the
sense that the representative point in the state space tends to
move along an attractor which is not always reduced to a single
point as in classical convergence.

E. Attractors and Convergence


Fig. 5 shows a three-dimensional representation of the
real restriction Re Re of a particle moving in
the 5-D space. Fig. 6(a)–(c) show the “real” restrictions
(Re Re ) of the particles that are typically studied. We
can clearly see the three cases:
1) “spiral” easy convergence toward a nontrivial attractor for
[see Fig. 6(a)];
2) difficult convergence for [see Fig. 6(b)];
3) quick almost linear convergence for [see Fig. 6(c)].
Fig. 6. Trajectories of a particle in phase space with three different values of φ. (a) (c) and (e) Real parts of the velocity v and position relative to the previous best y. (b) (d) and (f) Real and imaginary parts of v. (a) and (d) show the attractor for a particle with φ = 2.5. Particle tends to orbit, rather than converging to 0.0. (b) and (e) show the same views with φ = 3.99. (c) and (f) depict the "easy" convergence toward 0.0 of a constricted particle with φ = 6.0. Particle oscillates with quickly decaying amplitude toward a point in the phase space (and the search space).

Nevertheless, it is interesting to have a look at the true system, including the complex dimensions. Fig. 6(d)–(f) shows some other sections of the whole surface in .
Note 4.2: There is a discontinuity, for the radius is equal to zero for (see Fig. 7).
Thus, what seems to be an "oscillation" in the real space is in fact a continuous spiralic movement in a complex space. More
importantly, the attractor is very easy to define: it is the “circle”
[center (0,0) and radius ]. When , and
when , then ( with ), for the part of tends to zero. This provides an intu-
the constriction coefficient has been precisely chosen so that itive way to transform this stabilization into a true convergence.

(4.19)

Upon computing the constriction coefficient, the following


form is obtained:


if
else
(5.3)

Coming back to the ( ) system, and are


Fig. 7. “Trumpet” global attractor when ' < 4. Axis (Re(v ); Im(v ); ');  =
' of the real and imaginary parts of v . (b) Effects of the real and
8. (a) Effect on
imaginary parts of y . (5.4)
The use of the constriction coefficient can be viewed as a rec-
ommendation to the particle to “take smaller steps.” The conver-
We just have to use a second coefficient in order to reduce the gence is toward the point (
attractor, in the case , so that ). Remember is in fact the velocity of the particle, so it will
indeed be equal to zero in a convergence point.2 Example
(4.22)

The models studied here have only one constriction coeffi-


cient. If one sets , the Type 1 constriction is produced,
and are uniform random variables between 0 and
but now, we understand better why it works.
and respectively. This example is shown in Fig. 8.

V. GENERALIZATION OF THE PARTICLE-SWARM SYSTEM VI. RUNNING THE PARTICLE SWARM WITH CONSTRICTION
COEFFICIENTS
Thus far, the focus has been on a special version of the particle
As a result of the above analysis, the particle swarm algorithm
swarm system, a system reduced to scalars, collapsed terms and
can be conceived of in such a way that the system’s explosion
nonprobabilistic behavior. The analytic findings can easily be
can be controlled, without resorting to the definition of any ar-
generalized to the more usual case where is random and two
bitrary or problem-specific parameters. Not only can explosion
vector terms are added to the velocity. In this section the results
be prevented, but the model can be parameterized in such a way
are generalized back to the original system as defined by
that the particle system consistently converges on local optima.
(Except for a special class of functions, convergence on global
(5.1) optima cannot be proven.)
The particle swarm algorithm can now be extended to include
many types of constriction coefficients. The most general mod-
Now , , and are defined to be ification of the algorithm for minimization is presented in the
following pseudocode.

Assign ; '
Calculate ; ; ; ; ; 
(5.2) Initialize population: random x ;v
Do
For i=1 to population size
to obtain exactly the original nonrandom system described in
2Convergence implies velocity = 0, but the convergent point is not neces-
Section I.
sarily the one we want, particularly if the system is too constricted. We hope to
For instance, if there is a cycle for , then there is an show in a later paper how to cope with this problem, by defining the optimal
infinity of cycles for the values so that . parameters.
        if f(x_i) < f(p_i) then p_i = x_i
        For d = 1 to dimension
          φ1 = rand() × (φmax/2)
          φ2 = rand() × (φmax/2)
          φ = φ1 + φ2
          p = ((φ1 p_id) + (φ2 p_gd))/φ
          x = x_id
          v = v_id
          v_id = αv + βφ(p − x)
          x_id = p + γv − (δ − (ηφ))(p − x)
        Next d
      Next i
    Until termination criterion is met.

In this generalized version of the algorithm, the user selects the version and chooses values for $\kappa$ and $\varphi_{\max}$ that are consistent with it. Then the two eigenvalues are computed and the greater one is taken. This operation can be performed as follows.

    discrim = ((ηφ)² − 4βγφ + (α − δ)² + 2ηφ(α − δ))/4
    a = (α + δ − ηφ)/2
    if (discrim > 0) then
      neprim1 = abs(a + √discrim)
      neprim2 = abs(a − √discrim)
    else
      neprim1 = √(a² + abs(discrim))
      neprim2 = neprim1
    max(eig.) = max(neprim1, neprim2)

These steps are taken only once in each program and, thus, do not slow it down. For the versions tested in this paper, the constriction coefficient is calculated simply as $\chi = \kappa / \max(\text{eig.})$. For instance, the Type 1 version is defined by the rules $\alpha = \beta = \gamma = \delta = \eta = \chi$.

The generalized description allows the user to control the degree of convergence by setting $\kappa$ to various values. For instance, $\kappa = 1.0$ in the Type 1″ version results in slow convergence, meaning that the space is thoroughly searched before the population collapses into a point.

In fact, the Type 1″ constriction particle swarm can be programmed as a very simple modification to the standard version presented in Section I. The constriction coefficient $\chi$ is calculated as shown in (4.15)

$\chi = \dfrac{2\kappa}{\left|2 - \varphi - \sqrt{\varphi^{2} - 4\varphi}\right|}$, for $\varphi > 4$

else

The coefficient is then applied to the right side of the velocity adjustment.

    Calculate χ
    Initialize population
    Do
      For i = 1 to Population Size
        if f(x_i) < f(p_i) then p_i = x_i
        p_g = min(p_neighbors)
        For d = 1 to Dimension
          v_id = χ(v_id + φ1(p_id − x_id) + φ2(p_gd − x_id))
          x_id = x_id + v_id
        Next d
      Next i
    Until termination criterion is met.

Note that the algorithm now requires no explicit limit $V_{\max}$. The constriction coefficient makes it unnecessary. In [8], Eberhart and Shi recommended, based on their experiments, that a liberal $V_{\max}$, for instance, one that is equal to the dynamic range of the variable, be used in conjunction with the Type 1″ constriction coefficient. Though this extra parameter may enhance performance, the algorithm will still run to convergence even if it is omitted.

Fig. 8. Example of the trajectory of a particle with the “original” formula containing two $\varphi(p - x)$ terms, where $\varphi$ is the upper limit of a uniform random variable. As can be seen, velocity $v$ converges to 0.0 and the particle’s position $x$ converges on the previous best point $p$.

VII. EMPIRICAL RESULTS

Several types of particle swarms were used to optimize a set of unconstrained real-valued benchmark functions, namely, several of De Jong’s functions [9], Schaffer’s f6, and the Griewank, Rosenbrock, and Rastrigin functions. A population of 20 particles was run for 20 trials per function, with the best performance evaluation recorded after 2000 iterations. Some results from Angeline’s [1] runs using an evolutionary algorithm are shown for comparison.

Though these functions are commonly used as benchmark functions for comparing algorithms, different versions have appeared in the literature. The formulas used here for De Jong’s f1, f2, f4 (without noise), f5, and Rastrigin functions are taken from [10]. Schaffer’s f6 function is taken from [11]. Note that earlier editions give a somewhat different formula. The Griewank function given here is the one used in the First International Contest on Evolutionary Optimization held at ICEC 96 and the 30-dimensional generalized Rosenbrock function is taken from [1]. Functions are given in Table III.
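As a rough illustration of how the Type 1″ procedure of Section VI might be run on a benchmark of this kind, the following Python sketch computes a constriction coefficient and applies it to the velocity update of a small swarm minimizing a sum-of-squares function. It is a minimal sketch under stated assumptions, not the authors' program: the function names (`constriction_coefficient`, `constricted_pso`, `sphere`), the use of the whole population as the neighborhood, the initialization ranges, the way $\varphi_1$ and $\varphi_2$ are drawn, and the fixed random seed are all choices made for the example. Only the $\varphi > 4$ case of the coefficient, in its commonly cited form, is covered.

```python
import math
import random


def constriction_coefficient(phi_max=4.1, kappa=1.0):
    """Commonly cited constriction coefficient; only phi_max > 4 is handled here."""
    if phi_max <= 4.0:
        raise ValueError("this sketch only handles phi_max > 4")
    root = math.sqrt(phi_max * phi_max - 4.0 * phi_max)
    return 2.0 * kappa / abs(2.0 - phi_max - root)


def sphere(x):
    """Sum of squares, used here as a stand-in benchmark; minimum 0 at the origin."""
    return sum(xi * xi for xi in x)


def constricted_pso(f, dim, lo, hi, n_particles=20, iterations=2000,
                    phi_max=4.1, kappa=1.0, seed=0):
    """Minimize f with a constricted swarm; returns (best_value, best_position)."""
    rng = random.Random(seed)
    chi = constriction_coefficient(phi_max, kappa)
    # Assumption: positions and velocities both start uniformly in [lo, hi].
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    p = [xi[:] for xi in x]                 # previous best positions
    p_val = [float("inf")] * n_particles    # previous best values

    for _ in range(iterations):
        for i in range(n_particles):
            fx = f(x[i])
            if fx < p_val[i]:
                p_val[i] = fx
                p[i] = x[i][:]
            # Assumption: the neighborhood is the whole population (global best).
            g = min(range(n_particles), key=p_val.__getitem__)
            for d in range(dim):
                # Assumption: phi1 and phi2 are each uniform on [0, phi_max / 2].
                phi1 = rng.random() * phi_max / 2.0
                phi2 = rng.random() * phi_max / 2.0
                # The coefficient multiplies the whole right side of the velocity update.
                v[i][d] = chi * (v[i][d]
                                 + phi1 * (p[i][d] - x[i][d])
                                 + phi2 * (p[g][d] - x[i][d]))
                x[i][d] += v[i][d]

    g = min(range(n_particles), key=p_val.__getitem__)
    return p_val[g], p[g]


if __name__ == "__main__":
    value, position = constricted_pso(sphere, dim=5, lo=-5.12, hi=5.12, iterations=300)
    print(constriction_coefficient(), value)
```

No velocity limit appears anywhere in the loop; the single multiplication by `chi` is the only change relative to an unconstricted update, which is the point the text makes about the simplicity of the modification.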
TABLE III
FUNCTIONS USED TO TEST THE EFFECTS OF THE CONSTRICTION COEFFICIENTS

TABLE IV
FUNCTION PARAMETERS FOR THE TEST PROBLEMS

A. Algorithm Variations Used

Three variations of the generalized particle swarm were used on the problem suite.

Type 1: The first version applied the constriction coefficient to all terms of the formula

using .

Type 1″: The second version tested was a simple constriction, which was not designed to converge, but not to explode, either, as $\kappa$ was assigned a value of 1.0. The model was defined as

Experimental Version: The third version tested was more experimental in nature. The constriction coefficient was initially defined as . If , then it was multiplied by 0.9 iteratively. Once a satisfactory value was found, the following model was implemented:

As in the first version, a “generic” value of was used. Table IV displays the problem-specific parameters implemented in the experimental trials.

B. Results

Table V compares various constricted particle swarms’ performance to that of the traditional particle swarm and evolutionary optimization (EO) results reported by [1]. All particle swarm populations comprised 20 individuals.

Functions were implemented in 30 dimensions except for f2, f5, and f6, which are given for two dimensions. In all cases except f5, the globally optimal function result is 0.0. For f5, the best known result is 0.998004. The limit of the control parameter $\varphi$ was set to 4.1 for the constricted versions and 4.0 for the $V_{\max}$ versions of the particle swarm. The column labeled “E&S” was programmed according to the recommendations of [8]. This condition included both Type 1″ constriction and $V_{\max}$, with $V_{\max}$ set to the range of the initial domain for the function. Function results were saved with six decimal places of precision.

As can be seen, the Type 1″ and Type 1 constricted versions outperformed the $V_{\max}$ versions in almost every case; the experimental version was sometimes better, sometimes not. Further, the Type 1″ and Type 1 constricted particle swarms performed better than the comparison evolutionary method on three of the four functions. With some caution, we can at least consider the performances to be comparable.

Eberhart and Shi’s suggestion to hedge the search by retaining $V_{\max}$ with Type 1″ constriction does seem to result in good performance on all functions. It is the best on all except the Rosenbrock function, where performance was still respectable. An analysis of variance was performed comparing the “E&S” version with Type 1″, standardizing data within functions. It was found that the algorithm had a significant main effect, but that there was a significant interaction of algorithm with function, suggesting that the gain may not be robust across all problems. These results support those of [8].

Any comparison with Angeline’s evolutionary method should be considered cautiously. The comparison is offered only as a prima facie standard by which to assess performances on these functions after this number of iterations. There are numerous versions of the functions reported in the literature
and it is extremely likely that features of the implementation are responsible for some variance in the observed results. The comparison though does allow the reader to confirm that constricted particle swarms are comparable in performance to at least one evolutionary algorithm on these test functions.

TABLE V
EMPIRICAL RESULTS

Mean best evaluations at the end of 2000 iterations for various versions of particle swarm and Angeline’s evolutionary algorithm [1].

As has long been noted, the particle swarm succeeds at finding optimal regions of the search space, but has no feature that enables it to converge on optima (e.g., [1]). The constriction techniques reported in this paper solve this problem; they do force convergence. The data clearly indicate an increase in the ability of the algorithm to find optimal points in the search space for these problems as a result.

No algorithmic parameters were adjusted for any of the particle swarm trials. Parameters such as , , population size, etc., were held constant across functions. Further, it should be emphasized that the population size of 20 is considerably smaller than what is usually seen in evolutionary methods, resulting in fewer function evaluations and consequently faster clock time in order to achieve a similar result. For instance, Angeline’s results cited for comparison are based on populations of 250.

VIII. CONCLUSION

This paper explores how the particle swarm algorithm works from the inside, i.e., from the individual particle’s point of view. How a particle searches a complex problem space is analyzed and improvements to the original algorithm based on this analysis are proposed and tested. Specifically, the application of constriction coefficients allows control over the dynamical characteristics of the particle swarm, including its exploration versus exploitation propensities.

Though the pseudocode in Section VI may look different from previous particle swarm programs, it is essentially the same algorithm rearranged to enable the judicious application of analytically chosen coefficients. The actual implementation may be as simple as computing one constant coefficient and using it to weight one term in the formula. The Type 1″ method, in fact, requires only the addition of a single coefficient, calculated once at the start of the program, with almost no increase in time or memory resources.

In the current analysis, the sine waves identified by Ozcan and Mohan [6], [7] turn out to be the real parts of the 5-D attractor. In complex number space, e.g., in continuous time, the particle is seen to spiral toward an attractor, which turns out to be quite simple in form: a circle. The real-number section by which this is observed when time is treated discretely is a sine wave.

The 5-D perspective summarizes the behavior of a particle completely and permits the development of methods for controlling the explosion that results from randomness in the system. Coefficients can be applied to various parts of the formula in order to guarantee convergence, while encouraging exploration. Several kinds of coefficient adjustments are suggested in the present paper, but we have barely scratched the surface and plenty of experiments should be prompted by these findings. Simple modifications based on the present analysis resulted in an optimizer which appears, from these preliminary results, to be able to find the minima of some extremely complex benchmark functions. These modifications can guarantee convergence, which the traditional particle swarm does not. In fact, the present analysis suggests that no problem-specific parameters may need to be specified.

We remind the reader that the real strength of the particle swarm derives from the interactions among particles as they search the space collaboratively. The second term added to the velocity is derived from the successes of others; it is considered a “social influence” term. When this effect is removed from the algorithm, performance is abysmal [3]. Effectively, the variable $p_g$ keeps moving, as neighbors find better and better points in the search space, and its weighting relative to $p_i$ varies randomly with each iteration. As a particle swarm population searches over time, individuals are drawn toward one another’s successes, with the usual result being clustering of individuals in optimal regions of the space. The analysis of the social-influence aspect of the algorithm is a topic for a future paper.

REFERENCES

[1] P. Angeline, “Evolutionary optimization versus particle swarm optimization: Philosophy and performance differences,” in Evolutionary Programming VII, V. W. Porto, N. Saravanan, D. Waagen, and A. E. Eiben, Eds. Berlin, Germany: Springer-Verlag, 1998, pp. 601–610.
[2] J. Kennedy and R. C. Eberhart, “Particle swarm optimization,” in Proc. IEEE Int. Conf. Neural Networks, Perth, Australia, Nov. 1995, pp. 1942–1948.
[3] J. Kennedy, “The particle swarm: Social adaptation of knowledge,” in Proc. 1997 Int. Conf. Evolutionary Computation, Indianapolis, IN, Apr. 1997, pp. 303–308.
[4] J. Kennedy, “Methods of agreement: Inference among the eleMentals,” in Proc. 1998 IEEE Int. Symp. Intelligent Control, Sept. 1998, pp. 883–887.
[5] Y. Shi and R. C. Eberhart, “Parameter selection in particle swarm optimization,” in Evolutionary Programming VII, V. W. Porto, N. Saravanan, D. Waagen, and A. E. Eiben, Eds. Berlin, Germany: Springer-Verlag, 1998, pp. 591–600.
[6] E. Ozcan and C. K. Mohan et al., “Analysis of a simple particle swarm optimization problem,” in Proc. Conf. Artificial Neural Networks in Engineering, C. Dagli et al., Eds., St. Louis, MO, Nov. 1998, pp. 253–258.
[7] E. Ozcan and C. K. Mohan, “Particle swarm optimization: Surfing the waves,” in Proc. 1999 Congr. Evolutionary Computation, Washington, DC, July 1999, pp. 1939–1944.
[8] R. C. Eberhart and Y. Shi, “Comparing inertia weights and constriction factors in particle swarm optimization,” in Proc. 2000 Congr. Evolutionary Computation, San Diego, CA, July 2000, pp. 84–88.
[9] K. De Jong, “An analysis of the behavior of a class of genetic adaptive systems,” Ph.D. dissertation, Dept. Comput. Sci., Univ. Michigan, Ann Arbor, MI, 1975.
[10] R. G. Reynolds and C.-J. Chung, “Knowledge-based self-adaptation in evolutionary programming using cultural algorithms,” in Proc. IEEE Int. Conf. Evolutionary Computation, Indianapolis, IN, Apr. 1997, pp. 71–76.
[11] L. Davis, Ed., Handbook of Genetic Algorithms. New York: Van Nostrand Reinhold, 1991.

Maurice Clerc received the M.S. degree in mathematics (algebra and complex functions) from the Université de Villeneuve, France, and the Eng. degree in computer science from the Institut industriel du Nord, Villeneuve d’Ascq, France, in 1972. He is currently with Research and Design, France Télécom, Annecy, France. His current research interests include cognitive science, nonclassical logics, and artificial intelligence. Mr. Clerc is a Member of the French Association for Artificial Intelligence and the Internet Society.

James Kennedy received the Master’s degree in psychology from the California State University, Fresno, in 1990 and the Doctorate from the University of North Carolina, Chapel Hill, in 1992. He is currently a Social Psychologist with the Bureau of Labor Statistics, Washington, DC, working in data collection research. He has been working with particle swarms since 1994.
