0% found this document useful (0 votes)
43 views8 pages

Parameter Control Mechanisms in Differential Evolution: A Tutorial Review and Taxonomy

DE algorithm
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
43 views8 pages

Parameter Control Mechanisms in Differential Evolution: A Tutorial Review and Taxonomy

DE algorithm
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 8

Parameter Control Mechanisms in Differential

Evolution: A Tutorial Review and Taxonomy


Tsung-Che Chiang Cheng-Nan Chen Yu-Chieh Lin
Department of Computer Science and Department of Computer Science and Department of Computer Science and
Information Engineering, Information Engineering, Information Engineering,
National Taiwan Normal University National Taiwan Normal University National Taiwan Normal University
Taipei, Taiwan, R.O.C. Taipei, Taiwan, R.O.C. Taipei, Taiwan, R.O.C.
tcchiang@ieee.org 698470219@ntnu.edu.tw bbeennlin@hotmail.com

Abstract—Differential evolution (DE) is a promising algorithm Table I gives the pseudo code of a typical DE algorithm. In
for continuous optimization. Its two parameters, CR and F, have each generation G, every individual serves as the target vector.
great effect on the algorithm performance. In recent years many In the mutation step, several non-identical individuals are
DE algorithms with parameter control mechanisms were chosen randomly. One individual is the base vector and then
proposed. In this paper we propose a taxonomy to classify these
adds an amplified difference vector to be a donor vector. Next,
algorithms according to the number of candidate parameter
values, the number of parameter values used in a single a trial vector is produced by taking the gene values from the
generation, and the source of considered information. We classify target vector or the donor vector probabilistically. Finally, the
twenty-three recent studies into nine categories and review their trial vector replaces the target vector if the former is not worse
design features. Two types of relationships between these than the latter. The DE in Table I is denoted by rnd/1/bin.
algorithms and several research directions are also summarized. Common variants include best/1/bin, rnd/2/bin, and so on [2].
DE has three parameters, NP, CR, and F. Experimental
Index Terms—differential evolution, parameter control, results have shown that their values have great effect on the
adaptive, self-adaptive, classification, taxonomy
convergence speed and solution quality. Setting parameter
values for evolutionary algorithms is carried out in two ways:
I. INTRODUCTION the parameter tuning method tests different values and runs the
Differential evolution (DE) [1][2] has been recognized as a algorithm with the best (and fixed) value; the parameter
promising algorithm for continuous optimization in the last control method adjusts the parameter values during the
decade. It is featured by using the difference between execution of algorithm. Although many advices were given for
individuals in the mutation operator and the local selection by parameter tuning, it is still a time-consuming process to find
comparing one parent and its offspring to determine the proper parameter values. Moreover, we may need different
survivor. parameter values for different stages in the evolution process,
different individuals (search regions), and even different
TABLE I. PSEUDO CODE OF DIFFERENTIAL EVOLUTION objectives (in the case of multiobjective optimization).
NP: population size G: generation number D: problem dimension
Eiben et al. [3] classified parameter control methods into
CR: crossover rate F: scaling factor three groups: deterministic, adaptive, and self-adaptive. The
difference between the first two methods is in that the adaptive
Initialize the population. G = 1. method considers feedback information during the evolution
while the stopping criterion is not met process. The self-adaptive method is featured by encoding
for i = 1 to NP // for each target vector Xi,G = {x1,i,G, x2,i,G, … xD,i,G}
parameters on the chromosomes and evolving the parameters in
// mutation: generate a donor vector Vi, G = {v1,i,G, v2,i,G, … vD,i,G} the same way of evolving the decision variables. After
Vi,G = Xri1,G + F⋅(Xri2,G – Xri3,G) reviewing recent DE algorithms with parameter control, we
// crossover: generate a trial vector Ui,G = {u1,i,G, u2,i,G, … uD,i,G} found that most algorithms fall into the same group (the
for j = 1 to D adaptive group) according to the classification scheme in [3].
°­v j ,i ,G , if U j (0,1) ≤ CR ∨ j = j rnd This motivates us to propose a new taxonomy and notation to
u j ,i ,G = ®
°̄ x j ,i ,G , otherwise identify the features of these parameter control methods and to
end for know the similarity and difference between them.
The rest of this paper is organized as follows. In Section II
// selection: accept the trial vector if not worse than the target vector
we describe the proposed taxonomy and classification criteria.
°­U i , G , if f (U i ,G ) ≤ f ( X i , G ) Section III reviews nine categories of parameter control
X i , G +1 = ®
°̄ X i , G , otherwise
mechanisms in twenty-three DE studies. Section IV
end for summarizes relationships of algorithm design and performance
G=G+1
comparison among these algorithms. Conclusions and research
end while
directions are given in Section V.

978-1-4673-5873-6/13/$31.00 2013
c IEEE 1
II. PROPOSED TAXONOMY In the literature of DE, a standard three-field notation has
Although some studies addressed the dynamic control of been commonly adopted to describe the mutation strategy. For
the population size, most studies fixed the population size and example, the rnd/1/bin strategy refers to that (1) the base vector
focused on the control of the other two parameters, CR and F. is selected randomly, (2) one difference vector is used in
In this paper we only consider the studies of DE that control the generating a donor vector, and (3) the binomial crossover is
values of CR and F. We propose to distinguish the parameter used to produce the trial vector. Similarly, we propose a three-
control mechanisms by three aspects: (1) the number of field notation to give a simple and pertinent tag to the
candidate parameter values, (2) the number of parameter values parameter control mechanisms. For example, the con/mul/pop
used in a single generation, and (3) the source of considered strategy refers to that (1) parameter values are from a
information. The different designs and corresponding notations continuous range, (2) multiple values are used in a single
in the proposed taxonomy are detailed in the following. generation, and (3) parameter values are adjusted based on the
1) The number of candidate parameter values: Almost all statistics collected from the entire population.
existing parameter control mechanisms allow any value in a III. LITERATURE REVIEW
predefined range, e.g. [0, 1], for CR. In our survey, only one
We classify the literature on the parameter control of DE
study selected from a finite set of values for parameters. We
into nine groups. For each group, we will give some examples
denote these two kinds of strategies by con (continuous) and
and describe their core design ideas.
dis (discrete).
2) The number of parameter values used in a single A. con/1/pop
generation: In this aspect, we identify four kinds of strategies Ali and TĘrn [4] proposed the DEPD, in which the value of
in the literature. CR is fixed and the value of F is adjusted by (1). Fmin denotes
a) 1: This is the simplest strategy. All offspring are the minimum value of F, and fmax/fmin denotes the
produced by the same parameter value in a generation. maximum/minimum fitness value in the population. When the
b) mul (multiple): This kind of strategy draws a random difference between the fitness values of the best and the worst
value from a specified distribution every time an offspring is individuals decreases, the value of F increases. It follows the
produced. For example, it may draw a value for the parameter common idea that a larger perturbation is made when the
F from a normal distribution to generate one offspring and population diversity gets lower.
draw another value for another offspring. ­°max{Fmin ,1 − f max / f min }, if f max / f min < 1,
F =® (1)
c) idv (individual): This could be the most popular °̄max{Fmin ,1 − f min / f max }, otherwise.
strategy. It associates with each individual one parameter
value. When an individual i serves as the target vector, its Fi
Liu and Lampinen proposed to use two fuzzy logic
and CRi will be used to generate the donor and trial vectors.
controllers (FLCs) to adjust CR and F in their FADE [5]. The
d) var (variable): It is like the idv strategy, but the inputs of the FLC are the change of values of decision variables
parameter values are associated with the decision variables, (d1) and the change of objective values (d2) between two
not with individuals. We found one study taking this kind of generations. When d1 is small, CR and F increase as d2
strategy. increases. When d2 is medium or large, CR and F increase as d1
3) The source of considered information: When the increases.
parameter value is adjusted, information can be collected from The ADEA [6] was proposed by Qian and Li to deal with
different sources. We classify them into four groups. multiobjective optimization problems. It follows the well-
a) rnd (random): This kind of strategy selects parameter known NSGA-II [7] to separate the individuals into different
values from random distributions such as the uniform fronts and calculates a similar crowding measure. The value of
distribution, normal distribution, and Cauchy distribution. F is adjusted according to how well the individuals are evenly
distributed on the fronts and how many individuals are non-
b) pop (population): It considers the statistics collected dominated solutions. The detailed equation is expressed in (2).
from the entire population. Common statistics include the Assume that there are k fronts and there are mj individuals in
population diversity and the successful rate of generating the jth front. The symbol dij is the crowding measure of an
better offspring.
individual i in the jth front, d j is the average crowding measure
c) par (parent): This strategy is used together with the
idv strategy in the second aspect. It adjusts the parameter in the jth front, d is the average crowding measure of all
values according to the parameter values of the selected individuals, and df is the Euclidean distance between the two
parents for generating the donor and trial vectors. boundary solutions. |P| and |Q| denote the number of non-
dominated solutions and the population size, respectively.
d) idv (individual): This strategy is also used together
with the idv strategy in the second aspect. It adjusts the Generally speaking, the value of F increases when the
parameter values based on the records of historical values of individuals are not evenly distributed and when the number of
the target vector. non-dominated solutions is small.

2 2013 IEEE Symposium on Differential Evolution (SDE)


§ k mj · for CR, it is like SaDE and considers more detailed information
¨
F = max¨
¦ ¦ d −d j =1 i =1 ij j + df
,1 −
2P ¸
, Fmin ¸ (2) when calculating CRm. In addition to recording the successful
¨
©
¦ Q ⋅ d + df Q ¸
¹
CR values in the set CRrec, it also records the improvement on
fitness of these successful CR values in Δfrec. The value of CRm
B. con/mul/rnd is the weighted average of all successful CR values, where the
weight wj is the portion of the total improvement contributed
Yang et al. [8] focused on the control of the parameter F.
by the jth CR value.
They noticed that the normal distribution N(0, 1) is likely to
produce small values and the Cauchy distribution has a greater CRrec

probability of producing larger values. Thus, their NSDE CRm = ¦w j ⋅ CRrec ( j ) (7)
adjusts the value of F by the two distributions in equal j =1

probability, as shown in (3). The mean and standard deviation § Δf rec ·


w j = Δf rec ( j ) /¨¨ Δf rec (c) ¸¸ (8)
of the normal distribution were taken after determining an ¦
empirical value for F. In (3), δ is a Cauchy random variable © c =1 ¹
with scale parameter t = 1. The value of F changes every time a To control the value of F, SaNSDE takes the dual-
donor vector is produced. distribution design from NSDE. It is enhanced by adapting the
­ N (0.5,0.5), if U i (0,1) < 0.5, probability fp of selecting between the normal and Cauchy
Fi = ® (3) distributions based on the successful rate of these two
¯ δ, otherwise.
distributions. Another small difference is that the standard
deviation of the normal distribution was changed from 0.5 to
C. con/mul/pop 0.3.
The con/mul/pop strategy is extended from the con/mul/rnd ­ N (0.5,0.3), if U i (0,1) < fp,
Fi = ® (9)
strategy. It also draws the values of CR and F from specified δ, otherwise.
¯
random distributions. However, it does not use fixed values for
the distribution parameters (e.g. the mean of the normal The JADE proposed by Zhang and Sanderson [12] is
distribution). Instead, it adjusts the values of the distribution another good representative of the con/mul/pop strategy. It
parameters based on the population statistics. adjusts the values of CR and F in the way similar to the method
The SaDE [9][10] has adaptive control of the mutation of controlling CR in SaDE. One difference is that JADE
strategies and the value of CR. Given K mutation strategies, the updates the distribution parameters progressively at each
algorithm calculates the probability pk,g of choosing a strategy k generation. As in (11), the new mean μCR of the normal
at generation g following (4) and (5). Simply speaking, it distribution in (10) is the weighted sum of the current mean and
counts the number of successful trials nsk,j and the number of the arithmetic average of the successful CR values in CRrec.
failed trials nfk,j of producing better offspring for each strategy The method of adjusting F is the same except that the normal
k every LP generations. Then, the successful rate Sk,g is distribution and arithmetic mean are replaced by the Cauchy
calculated, and each strategy is selected in a probability distribution and Lehmer mean, as described in (12)-(14). The
proportional to its successful rate. The additional parameter LP use of the Cauchy distribution and the Lehmer mean is to place
determines the duration of collecting statistics. Another more chances on larger F values. The JADE2 [13] followed
additional parameter ε was introduced to give small chances to JADE and used the same parameter control methods. It dealt
the strategies with zero successful rates. with multiobjective optimization problems.
g −1 CRi = N ( μCR ,0.1) (10)
Sk,g =
¦ j = g − LP
ns k , j
+ε (4) μ CR = (1 − c) ⋅ μ CR + c ⋅ mean A (CRrec ) (11)
g −1 g −1
¦ j = g − LP
nsk , j + ¦ j = g − LP
nf k , j

S k,g Fi = C ( μ F ,0.1) (12)


pk ,g = (5)
K μ F = (1 − c) ⋅ μ F + c ⋅ meanL ( Frec ) (13)
¦ j =1
S j,g
2 (14)
meanL ( Frec ) = ¦F / ¦F
Control of the parameter CR is done in a similar way. The F∈Frec F∈Frec
values of CR that lead to better offspring are recorded for each
mutation strategy k. Then, the median CRmk of these successful Gong et al. [14] followed JADE and proposed the SaJADE.
CR values is taken as the mean of the normal distribution when It utilizes the idea in JADE to control the selection among K
the mutation strategy k is chosen. mutation strategies. In (16) μs is set by 0.5 and the standard
CRk ,i = N (CRmk ,0.1) (6) deviation is set by 1/6 at the first generation (g = 1) to ensure
that ηi is generated in the range [0, 1). After starting the
Yang et al. [11] proposed the SaNSDE, which was derived collection of successful mutation strategies in Srec, the standard
from SaDE and NSDE. It uses the same mechanism of deviation is set smaller (0.1) to emphasize the adaptation effect.
selecting among multiple mutation strategies as SaDE does. As

2013 IEEE Symposium on Differential Evolution (SDE) 3


S i = ¬η i ⋅ K ¼ + 1 (15) calculates the accumulated fitness improvement Δfi for each
­ N ( μ s ,1 / 6), if g = 1 individual i every α generations. The individuals whose Δfi is
ηi = ® (16) among the top 1/β% keep their F values in the next α
¯ N ( μ s ,0.1), otherwise
generations. The remaining individuals re-sample a random
μ s = (1 − c) ⋅ μ s + c ⋅ mean A ( S rec ) (17) value within the feasible range [Fmin, Fmax]. The values of α and
β were set by 5 and 2, respectively. The value of CR was a
D. con/idv/rnd constant depending on the problem dimension.
Similar to the con/mul/rnd strategy, the con/idv/rnd strategy
adjusts the parameter values based on specified probability ­ Fi , g , if Δf i is among the top 1 / β %,
Fi , g +α = ® (22)
distributions such as uniform distribution. The difference is in U ( F , F ), otherwise.
¯ min max
that the con/idv/rnd strategy records the parameter values on
the individuals so that the selection procedure of the evolution
Jia et al. [20] proposed the ISADE as an extension of the
process can help to identify good parameter values.
jDE. When the parameter value on an individual Xi is to be
The jDE proposed by Brest et al. [15] leads the studies in
changed, ISADE has two options. If the fitness f(Xi) is smaller
this category. Before an offspring is produced for a target
than the average fitness favg over the whole population, the
vector i at generation (g+1), there is a probability (τ1 and τ2 in
value changes toward Fmin. The better the individual is, the
(18) and (19)) of changing the values of F and CR to a random
smaller the value of F is. The rationale behind is to search
value within the predetermined range. In their experiments, the
locally for good individuals. If the fitness is equal to or greater
authors proposed to give a small probability of change and set
than the average fitness, a random value within the specified
τ1 and τ2 both by 0.1. range is chosen. The value of CR is controlled in exactly the
­U 1 ( Fmin , Fmax ), if U 2 (0,1) < τ 1 , same way, and thus the equation is omitted here.
Fi , g +1 = ® (18)
¯ F i, g , otherwise.
­­ f ( X i ) − f min
°°Fmin + ( Fi − Fmin ) ⋅ , if f ( X i ) < f avg ,
°® f avg − f min if U1(0,1) < τ1
­U 3 (0,1), if U 4 (0,1) < τ 2 , Fi = ®
CRi , g +1 = ® (19) °¯
° U 2 ( Fmin , Fmax ), if f ( X i ) ≥ f avg
¯ CR i , g , otherwise.
¯° F i, otherwise

Later, Brest et al. proposed jDE-2 [16] by integrating the idea (23)
of multiple mutation strategies in SaDE into jDE. It records the F. con/idv/par
values of F and CR for each of the three adopted mutation
strategies. In addition, jDE-2 replaces the k worst individuals at The con/idv/par strategy evolves the parameter on the
every l generation with parameter values randomly selected individuals in the same way as it evolves the decision variables.
from the feasible range. This may speed up the adaptation of In other words, the parameter values are adjusted based on the
parameter values. information on the parents. This category matches the “self-
Soliman and Bui [17] proposed a control strategy like that adaptive” category in Eiben et al. [3]. (In fact, most of the so-
in jDE but added more randomness in the control of F. Instead called self-adaptive DE algorithms in the literature fall into the
of controlling the value of F directly, their strategy samples the “adaptive” category according to their taxonomy. Our
F value based on the Cauchy distribution and adjusts the scale taxonomy helps to further identify their features.)
The SPDE by Abbass [21] adjusts the value of CR of a
parameter of the Cauchy distribution in a small probability (τ1).
target vector i according to the values of CR of the three
The value of CR is controlled in the same way as in jDE.
randomly selected parents r1, r2, and r3, as (24) defines.
Besides τ1 and τ2, three more parameters, μ, δl, and δu are
required. The authors did not name their strategy, and in the CRi = CRr1 + N (0,1) ⋅ (CRr 2 − CRr 3 ) (24)
following we call it CSDE.
Omran et al. [22] proposed the SDE, which adjusted the
­C ( μ , δ i , g +1 ), if U 1 (0,1) < τ 1 , value of F by (25). Note that the individuals used to adjust the
Fi , g +1 =® (20)
¯ C ( μ , δ i , g ), otherwise. F value are different from the individuals used to adjust the
δ i , g +1 = δ l + δ u ⋅U 2 (0,1) (21) decision variables.
Fi = Fr 4 + N (0,0.5) ⋅ ( Fr 5 − Fr 6 ) (25)
The strategy in MOSADE [18] can be viewed as a special
case of that in jDE with τ1 and τ2 set by 1. In other words, the Instead of using the normal distribution in the SPDE and
values of F and CR are re-sampled at every generation. SDE, the DESAP [23] uses the value of scaling factor F in the
adjustment of CR in (28). In the experiments, the value of F
E. con/idv/pop
was fixed as one.
The con/idv/pop strategy records parameter values on the
individuals and adjusts the values using the information on the CRi = CRr1 + F ⋅ (CRr 2 − CRr 3 ) (26)
target vector as well as the whole population. The RADE [19]

4 2013 IEEE Symposium on Differential Evolution (SDE)


Zamuda et al. [24] proposed the DEMOwSA. The new the population in the previous (g) and current (g+1) generations.
value of CR is the product of a random variable eτ⋅N(0, 1) and the The APDE deals with multiobjective optimization, and
average CR over the target vector and three parents. The value Var(f(g)) is the average of Var(fj(g)) over M objectives. In (31),
of the extra parameter τ was set by 1/(8 2 D ), D being the γ is an additional parameter and was set by 1.25 in the APDE.
problem dimension. The value of F is adjusted in the same way,
and the equation is omitted. ­° K / 2 N ⋅ CRi , if K ≥ 0,
Fi = ®
°̄ Fmin , otherwise. (29)
CRi + CR r1 + CR r 2 + CR r 3 τ ⋅ N ( 0,1)
CRi = ⋅e (27) K = N (ci − 1) + CRi (2 − CRi )
4
G. con/idv/idv
As the name indicates, the con/idv/idv strategy adjusts the ­°− ( F 2 N − 1) + ( F 2 N − 1) 2 − N (1 − c ) , if ci ≥ 1
parameter values recorded on the individual based on the CRi = ® i i i (30)
°̄ CRmin otherwise
individual’s own information. The SFLSDE [25] is a
descendant of the jDE. It is different from the jDE in that two
local searches are carried out probabilistically to search for the
proper F value for the best individual in the population. (The Var ( x i ( g ))Var ( f ( g ))
ci ( g + 1) = γ
control mechanism for F of the remaining individuals and the Var ( x i ( g + 1))Var ( f ( g + 1)) (31)
control mechanism for CR of all individuals are identical to 1 M
those of the jDE.) Given the best individual and its current F Var ( f ( g )) =
M ¦ j =1
Var ( f j ( g ))
value, the local search procedure repeats generating new F
values, using these F values to produce new offspring I. dis/mul/pop
individuals, and accepting the F value that leads to the best Different from the above strategies, the dis/mul/pop is
individual. In (28), the golden section search and hill climbing featured by a finite set of candidate parameter values. DEBR
search use different neighborhood functions to generate the [29] pre-specified nine combinations of parameter values by
new F values. Details are referred to [25]. CR = {0, 0.5, 1} and F = {0.5, 0.8, 1}. Let nh denote the
Fi , g +1 = number of successful trials of generating better offspring by a
combination h in previous generations, the probability qh of
­Golden section search, if U 1 (0,1) < τ 1 , selecting h is defined by (32). H is the total number of
° if τ 1 ≤ U 1 (0,1) < τ 2 , (28)
° Hill climbing search, combinations. In general, the probability is proportional to the
® ­U 3 ( Fmin , Fmax )⋅, if U 2 (0,1) < τ 3 ,
° if τ 2 < U 1 (0,1)
number of successes. The additional parameter n0 is introduced
®
°¯ ¯ F i, g otherwise to avoid dramatic variation of the probability. If any probability
qh decreases below a given limit δ, it will be reset to the
Pan et al. [26] proposed the SspDE, in which each starting value 1/H.
individual i has its own lists CRLi, FLi, and SLi, recording CR n h + n0
qh = (32)
values, F values, and mutation strategies, respectively. Each of H
these lists consists of LP elements and is updated every LP ¦ j =1
n j + n0
generations. In each generation each individual serves as the
target vector and uses its own CR, F, and mutation strategy to
produce an offspring. If the offspring is not worse, record the IV. RELATIONSHIP
CR, F, and mutation strategy into another three lists wCRLi (w In last two sections we propose a taxonomy to identify the
for winning), wFLi, and wSLi, respectively. Every LP features of parameter control mechanisms of DE and review
generations, the CRLi list is refilled by the values in wCRLi in state-of-the-art studies in each category. The categorization
probability RP and by random values in probability (1 – RP). helps to see the similarities and differences between the
The FLi and SLi lists are updated in the same way. LP and RP existing control mechanisms. In this section we summarize two
are two additional parameters. relationships between these algorithms.
H. con/var/pop The first is the design relationship, depicted in Fig. 1. In Fig.
1, A ← B means that algorithm B is derived from algorithm A.
The con/var/pop strategy is like the con/idv/pop strategy,
For example, SaNSDE combines the idea of different statistical
but it associates parameter values with decision variables
distributions in NSDE and the idea of adaptive probability in
instead of individuals. The APDE [27] is an example. Based on
SaDE. As more new algorithms will be developed based on the
a theoretical result [28] of the relation between the variance of
existing ones, the design relationship will keep growing. This
values of decision variables Var(xi) and the values of CRi and
relationship could help researchers to track how an algorithm
Fi, APDE adjusts Fi values in even generations by (29) and
evolved in the history.
adjusts CRi values in odd generations by (30). In (29) and (30),
Another relationship is constructed by a summary of the
N denotes the population size, and ci is calculated based on
performance comparison results in the reviewed studies. When
Var(xi) and the variance of the objective values Var(fj) among

2013 IEEE Symposium on Differential Evolution (SDE) 5


an algorithm A claimed that it was superior to algorithm B, we another mainly in the number of parameter values used in a
put an arrow as A → B in Fig. 2. Perhaps due to the simplicity, single generation (1, mul, idv, and var) and the source of
jDE is the most popular benchmark algorithm in the literature. considered information (rnd, pop, par, and idv). Our three-field
Note that the testing functions and performance measures in notation gives a quick and informative tag. Table II presents
these papers were not necessary identical (although many of nine categories of parameter control mechanisms and
them did use the same functions and measures). Another note is representative algorithms. It also gives the additional
that performance difference between these algorithms is not parameters, the number of objectives in the optimization
totally determined by the parameter control mechanism. For problems, and brief descriptions. There is still much to
example, the good performance of JADE is partially investigate. Here we give four directions for future research:
contributed by an external archive and the modified mutation 1) Making the algorithm simpler: The motivation to do
strategy. On one hand, this relationship helps us to find out parameter control is twofold. One is to adapt parameter values
competitive benchmark algorithms for performance to different problems, search stages, search regions, etc. to
comparison. On the other hand, it reveals that more numerical improve algorithm performance. The other is to reduce users’
studies are required to complete this relationship diagram. Neri burden on testing and finding the proper parameter setting.
and Tirronen [30] conducted an experimental analysis on eight The new algorithms achieve better and better solution quality,
advanced DE including jDE, SaDE, and SFLSDE. They but most of them replace the parameters of DE (i.e. CR and F)
reported the good performance of jDE and SFLSDE. The with another set of parameters of the proposed control
analysis pointed out that the original DE employs too much a mechanisms. For example, the jDE-like algorithms need to
deterministic search logic and may suffer from the stagnation define the probability (τ) of random changes and the SaDE-
condition. The restricted randomization in the jDE enriches the
like algorithms need to set the learning period (LP). Although
set of moves, and the local search in the SFLSDE increases the
the algorithm performance was shown to be not sensitive to
exploitative pressure within the explorative DE structure.
the values of these newly introduced parameter (e.g. [10][26]),
Researches like this experimental study are very helpful for the
design of parameter control mechanisms. it would be better if we can save this task for the users.
2) Considering problem-oriented information: Various
information has been considered for adjusting parameter
SaDE (2005) SaNSDE (2008)
values. In general, the information is either quality-based or
NSDE (2008) diversity-based. A very common quality-based information is
jDE2 (2006)
the (weighted) successful times/rate of producing better
SFLSDE (2009) offspring. The relative quality of individuals to the population
ISADE (2009) is sometimes used. Diversity-based information includes
jDE (2006)
fitness difference between the best and worst individuals,
MOSADE (2010)
variance of values of decision variables, and distribution along
JADE2 (2008) the Pareto front. All are search process-oriented. They
JADE (2005) measure how well the search process goes. We should
SaJADE (2011)
consider also problem-oriented information. Many useful
Fig. 1. Design relationship experiences have been found in the literature on parameter
tuning. If we can identify the problem characterisitcs such as
unimodal/multimodal and separable/non-separable, we can do
SspDE SaJADE parameter control more strategically.
(2011) (2011)
3) Adapting with respect to multiple objectives: In our
survey, the number of researches on adaptive DE for
SaDE CSDE JADE ISADE SFLSDE multiobjective optimization is still much fewer than that of the
(2009) (2008) (2009) (2009) (2009)
single-objective ones. We have seen algorithms that adapt
parameter values in the individual-wise (⋅/idv/⋅) and variable-
jDE SaNSDE APDE wise (⋅/var/⋅) manners. When multiple objectives are to be
(2006) (2008) (2004) minimized, it is likely that different values of CR and F must
be used for different objectives. We are looking forward to
FADE SDE SaDE NSDE SPDE seeing future researches investigating this issue.
(2005) (2005) (2005) (2008) (2002) 4) Doing parameter control through distributed DE: The
⋅/1/⋅ and ⋅/idv/⋅ strategies are the two extremes in terms of the
Fig. 2. Performance comparison relationship
control granularity. The former takes the whole population to
find the good parameter values, and the latter lets every
V. CONCLUSIONS AND RESEARCH DIRECTIONS
individual do by itself. Doing parameter control through
In this study we reviewed literature on DE algorithms with several (structured) sub-populations may be a good balance.
parameter control and proposed a taxonomy and notation to Some promising initial results can be found in [31] and [32].
classify these methods. These algorithms are different from one

6 2013 IEEE Symposium on Differential Evolution (SDE)


ACKNOWLEDGMENT comparative study on numerical benchmark problems,” IEEE
Transactions on Evolutionary Computation, vol. 10, no. 6, pp.
The authors are very grateful to the reviewers for their 646–657, 2006. [jDE]
valuable and instructive comments. This research is supported
[16] J. Brest, V. Žumer, and M. S. Mauþec, “Self-adaptive
by the National Science Council of Republic of China under differential evolution algorithm in constrained real-parameter
research grant No. NSC101-2221-E-003-007. optimization,” IEEE Congress on Evolutionary Computation, pp.
215 –222, 2006. [jDE-2]
REFERENCES [17] O. S. Soliman and L. T. Bui, “A self-adaptive strategy for
[1] R. Storn and K. Price, “Differential evolution – a simple and controlling parameters in differential evolution,” IEEE Congress
efficient heuristic for global optimization over continuous on Evolutionary Computation, pp. 2837–2842, 2008. [CSDE]
spaces,” Journal of Global Optimization, vol. 11, no. 4, pp. 341– [18] Y. N. Wang, L. H. Wu, and X. F. Yuan, “Multi-objective self-
359, 1997. adaptive differential evolution with elitist archive and crowding
[2] S. Das and P. N. Suganthan, “Differential evolution: a survey of entropy-based diversity measure,” Soft Computing, vol. 14, no.
the state-of-the-art,” IEEE Transactions on Evolutionary 3, pp. 193–209, 2010. [MOSADE]
Computation, vol. 15, no. 1, pp. 4–31, 2011. [19] A. Nobakhti and H. Wang, “A simple self-adaptive differential
[3] A. E. Eiben, R. Hinterding, and Z. Michalewicz, “Parameter evolution algorithm with application on the ALSTOM gasifier,”
control in evolutionary algorithms,” IEEE Transactions on Applied Soft Computing, vol. 8, no. 1, pp. 350–370, 2008.
Evolutionary Computation, vol. 3, no. 2, pp. 124–141, 1999. [RADE]
[4] M. M. Ali and A. TĘrn, “Population set-based global [20] L. Jia, W. Gong, and H. Wu, “An improved self-adaptive control
optimization algorithms: some modifications and numerical parameter of differential evolution for global optimization,” In Z.
results,” Computers & Operations Research, vol. 31, no. 10, pp. Cai et al. (Eds.) Computational Intelligence and Intelligent
1703–1725, 2004. [DEPD] Systems, 2009, pp. 215–224. [ISADE]
[5] J. Liu and J. Lampinen, “A fuzzy adaptive differential evolution [21] H. A. Abbass, “The self-adaptive Pareto differential evolution
algorithm,” Soft Computing, vol. 9, no. 6, pp. 448–462, 2005. algorithm,” IEEE Congress on Evolutionary Computation, pp.
[FADE] 831–836, 2002. [SPDE]
[6] W. Qian and A. Li, “Adaptive differential evolution algorithm [22] M. G. H. Omran, A. Salman, and A. P. Engelbrecht, “Self-
for multiobjective optimization problems,” Applied Mathematics adaptive differential evolution,” Lecture Notes in Artificial
and Computation, vol. 201, no. 1-2, pp. 431–440, 2008. [ADEA] Intelligence, vol. 3801, pp. 192–199, 2005. [SDE]
[7] K. Deb, A. Pratap, S. Agarwal and T. Meyarivan, “A fast and [23] J. Teo, “Exploring dynamic self-adaptive populations in
elitist multiobjective genetic algorithm 㧦 NSGA-II, ” IEEE differential evolution,” Soft Computing, vol. 10, pp. 673–686,
Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 2006. [DESAP]
182–197, 2002. [24] A. Zamuda, J. Brest, B. Boškoviü, and V. Žumer, “Differential
[8] Z. Yang, X. Yao, J. He, “Making a difference to differential evolution for multiobjective optimization with self adaptation,”
evolution,” In P. Siarry and Z. Michalewicz (Eds.) Advance in IEEE Congress on Evolutionary Computation, pp. 3617–3624,
Metaheuristics for Hard Optimization, 2008, pp. 397–414. 2007. [DEMOwSA]
[NSDE] [25] F. Neri and V. Tirronen, “Scale factor local search in differential
[9] A. K. Qin and P. N. Suganthan, “Self-adaptive differential evolution,” Memetic Computing, vol. 1, no. 2, pp. 153–171,
evolution algorithm for numerical optimization,” IEEE Congress 2009. [SFLSDE]
on Evolutionary Computation, pp. 1785–1791. [SaDE] [26] Q. K. Pan, P. N. Suganthan, L. Wang, L. Gao, and R. Mallipeddi,
[10] A. K. Qin, V. L. Huang, and P. N. Suganthan, “Differential “A differential evolution algorithm with self-adapting strategy
evolution algorithm with strategy adaptation for global and control parameters,” Computers & Operations Research, vol.
numerical optimization,” IEEE Transactions on Evolutionary 38, no. 1, pp. 394–408, 2011. [SspDE]
Computation, vol. 13, no. 2, pp. 398–417, 2009. [SaDE] [27] D. Zaharie and D. Petcu, “Adaptive Pareto differential evolution
[11] Z. Yang, K. Tang, and X. Yao, “Self-adaptive differential and its parallelization,” Lecture Notes in Computer Science, vol.
evolution with neighborhood search,” IEEE Congress on 3019, pp. 261–268, 2004. [APDE]
Evolutionary Computation, pp. 1110–1116, 2008. [SaNSDE] [28] D. Zaharie, “Control of population diversity and adaptation in
[12] J. Zhang and A. C. Sanderson, “JADE: adaptive differential differential evolution algorithms,” International Conference on
evolution with optional external archive,” IEEE Transactions on Soft Computing, pp. 41–46, 2002.
Evolutionary Computation, vol. 13, no. 5, pp. 945–958, 2009. [29] J. Tvrdik, “Adaptation in differential evolution: a numerical
[JADE] comparison,” Applied Soft Computing, vol. 9, no. 3, pp. 1149–
[13] J. Zhang and A. C. Sanderson, “Self-adaptive multi-objective 1155, 2009. [DEBR]
differential evolution with direction information provided by [30] F. Neri and V. Tirronen, Recent advances in differential
archived inferior solutions,” IEEE Congress on Evolutionary evolution: a survey and experimental analysis, Artificial
Computation, pp. 2801–2810, 2008. [JADE2] Intelligence Review, vol. 33, pp. 61–106, 2010.
[14] W. Gong, Z. Cai, C. X. Ling, and H. Li, “Enhanced differential [31] M. Weber, V. Tirronen, and F. Neri, Scale factor inheritance
evolution with adaptive strategies for numerical optimization,” mechanism in distributed differential evolution, Soft Computing,
IEEE Transactions on Systems, Man, and Cybernetics – Part B, vol. 14, pp. 1187–1207, 2010.
vol. 41, no. 2, pp. 397–413, 2011. [SaJADE] [32] M. Weber, F. Neri, and V. Tirronen, A study on scale factor in
[15] J. Brest, S. Greiner, B. Boškoviü, M. Mernik, and V. Žumer, distributed differential evolution, Information Sciences, vol. 181,
“Self-adapting control parameters in differential evolution: a pp. 2488–2511, 2011.

2013 IEEE Symposium on Differential Evolution (SDE) 7


TABLE II. PROPOSED TAXONOMY AND EXAMPLES

Adjusted Additional
Category* Algorithm parameters parameters** #Obj Brief descriptions
con/1/pop DEPD 2004 F SO (1) Fixed CR;
(2) Increase F when abs(fmax/fmin) decreases.
FADE 2005 CR, F membership SO (1) Two fuzzy logic controllers for CR and F, respectively;
functions (2) Inputs of the controllers are based on the average genotypic and
phenotypic distances between two generations.
ADEA 2008 F MO (1) Fixed CR;
(2) Increase F when the individuals are not evenly distributed and when the
number of non-dominated solutions is small.
con/mul/rnd NSDE 2008 F SO (1) CR~U(0, 1);
(2) F~N(0.5, 0.5)/Cauchy in equal probability.
con/mul/pop SaDE 2005/2009 CR, s LP, ε SO (1) CR~N(CRm, 0.1), CRm as the median of successful CR values;
(2) F~N(0.5, 0.3);
(3) Selection probability of mutation strategies is proportional to the
successful rate.
SaNSDE 2008 CR, F, s LP SO (1) Descendant of NSDE and SaDE;
(2) CR~N(CRm, 0.1), CRm as the weighted average of successful CR values,
where weights are the portion of improvement on fitness;
(3) F~N(0.5, 0.5)/Cauchy with probability depending on the successful rate.
JADE 2009 CR, F c SO (1) CR~N(μ, 0.1), μ as the weighted sum of current μ and arithmetic mean of
successful CR values;
(2) F~C(μ, 0.1), μ as the weighted sum of current μ and Lehmer mean of
successful CR values.
JADE2 2008 CR, F c MO (1) Descendant of JADE, using the same parameter control mechanism.
SaJADE 2011 CR, F, s c SO (1) Descendant of JADE;
(2) Selection of mutation strategies by the same mechanism in JADE.
con/idv/rnd jDE 2006 CR, F τ1, τ2 SO (1) Change CR by U(0, 1) with probability τ1;
(2) Change F by U(Fmin, Fmax) with probability τ2.
jDE-2 2006 CR, F, s τ1, τ2, k, l SO (1) Descendant of jDE, adding multiple mutation strategies;
(2) Re-initialize the parameter values of the worst k individuals every l
generations.
CSDE 2008 CR, F τ1, τ2, μ, δl, δu SO (1) Change CR by U(0, 1) with probability τ1;
(2) Change F by C(μ, δ) with probability τ2, δ ~U(δl, δu).
MOSADE 2010 CR, F MO (1) Descendant of jDE;
(2) CR~U(0.0, 0.5);
(3) F~U(0.1, 0.9).
con/idv/pop RADE 2008 F α, β SO (1) For the individuals whose accumulated fitness improvement is among the
top 1/β% in the population, keep their F values; for the remaining
individuals, set random values
(2) Update F values every α generations.
ISADE 2009 CR, F τ1, τ2 SO (1) Descendant of jDE;
(2) Change the values of CR and F toward the lower bound if the individual’s
fitness is better than the average fitness over the population.
con/idv/par SPDE 2002 CR MO (1) CRi = CRr1 + N(0, 1)⋅(CRr2 – CRr3);
(2) F~N(0, 1).
SDE 2005 F SO (1) CR~N(0.5, 0.15);
(2) Fi = Fr4 + N(0, 0.5)⋅(Fr5 – Fr6).
DESAP 2006 CR SO (1) CRi = CRr1 + F⋅(CRr2 – CRr3);
(2) Fixed F value.
DEMOwSA 2007 CR, F τ MO (1) CRi = (1/4)⋅(CRi + CRr1 + CRr2 + CRr3)⋅ eτ⋅N(0, 1);
(2) Fi = (1/4)⋅(Fi + Fr1 + Fr2 + Fr3)⋅ eτ⋅N(0, 1).
con/idv/idv SFLSDE 2009 CR, F τ1, τ2, τ3, τ4 SO (1) Descendant of jDE;
(2) Use the golden section search and hill climbing search probabilistically to
adjust the F value of the best individual.
SspDE 2011 CR, F, s LP, RP SO (1) Each individual has its own lists, CRLi, FLi, and SLi;
(2) Successful CR, F, and strategy are recorded in wCRLi, wFLi, and wSLi;
(3) Every LP generations, CRLi, FLi, and SLi are refilled by wCRLi, wFLi, and
wSLi, respectively;
(4) Refilled values are taken from the winning lists in probability RP.
con/var/pop APDE 2004 CR, F γ MO (1) Adjust the parameter values to maintain the variances of decision variables
and average variance of objective values between two generations.
dis/mul/pop DEBR 2009 CR, F, s n0, δ SO (1) Use a prespecified combinations of CR, F, and mutation strategies;
(2) Select one combination in a probability proportional to the successful rate.
*
Comparing with the classification scheme in [3], the ⋅/⋅/rnd category matches the deterministic parameter control category, the con/idv/par category matches the
self-adaptive parameter control category, and the remaining categories match the adaptive parameter control category.
**
The upper bound and lower bound of CR and F values are not listed as additional parameters here.

8 2013 IEEE Symposium on Differential Evolution (SDE)

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy