Optimization With Q Learning

Article history:
Received 23 October 2020
Received in revised form 15 March 2021
Accepted 22 April 2021
Available online 30 April 2021

Keywords: Differential evolution; Adaptive control parameter; Q-learning algorithm; Truss structures; Optimization

Abstract

The operations of metaheuristic optimization algorithms depend heavily on the setting of control parameters. Therefore, the addition of adaptive control parameters has been widely studied and shown to enhance the problem flexibility and overall performance of the algorithms. This paper proposes the Q-learning Differential Evolution (qlDE) algorithm, an adaptive control parameter Differential Evolution (DE) algorithm, for structural optimization. The reinforcement learning Q-learning model is integrated into DE as an adaptive parameter controller, adaptively adjusting the control parameters of the algorithm at each search iteration in order to optimally regulate its behavior for different search domains. Moreover, by automatically controlling the balance of exploration and exploitation at different stages of the process, the performance of the optimizer is also enhanced. To verify the effectiveness and robustness of the proposed qlDE algorithm in comparison with the classical DE and several other algorithms in the literature, five benchmark examples of truss structural weight minimization are performed in this study.

© 2021 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.asoc.2021.107464
T.N. Huynh, D.T.T. Do and J. Lee Applied Soft Computing 107 (2021) 107464
et al. [3], Liu and Lampinen [18], Hasançebi and Azad [19], self-adapting control parameters by Qin and Suganthan [20], Brest et al. [21], Yu et al. [22], Meng et al. [23], Isiet and Gadala [24]. From the mentioned research, adaptive parameter control is shown to be a promising approach for improving the performance of metaheuristic algorithms.

In the literature, the Differential Evolution (DE) algorithm has been a staple global optimizer since its introduction in 1997 by Storn and Price [25]. The performance of the algorithm and its variants has been demonstrated in much research [26–29]; for engineering problems in particular, they have proven highly effective and robust in searching for the global optimal solution.

Moreover, there have been many studies on adaptive parameter control for DE. Fuzzy adaptive DE (FADE) by Liu and Lampinen [18] was one of the first applications of parameter control for DE; the algorithm utilized fuzzy logic controllers. After that, the JADE algorithm [30] was introduced; it uses a parameter adaptation strategy based on updating the parameters of the probability distributions from which the values of F and Cr are sampled. Elsayed et al. [31] presented a variant of adaptive parameter control for DE called DE-APC, in which the parameters F and Cr are chosen randomly for each individual, and the parameter combinations of well-performing individuals are kept for the next generation. Sarker et al. [32] later proposed an improvement of DE-APC [31]; the improved version uses a historical memory of successful control parameter settings to guide the selection of future control parameter values. For further reading on DE development, Das et al. [33] give much useful information on the topic.

From the above merits, DE is chosen for investigation in this study. The operation of DE is governed by the number of individuals NP along with two control parameters, the scale factor F and the crossover probability Cr, which greatly influence the behavior of the algorithm [34,35].

Q-learning, first introduced by Watkins and Dayan [36], is a model of reinforcement learning (RL), an area of machine learning concerned with how an artificial agent interacts with an environment in particular states. Based on the reactions of the environment to previous interactions, the agent chooses which action to perform in order to optimally gain positive rewards. The Q-learning model is chosen for investigation due to its simple application, relatively fast convergence rate and low computation cost; many other RL models, such as Bayesian Q-learning [37], Double Q-learning [38] and Deep Q-learning [39], can also be implemented for more complex parameter control.

In the literature, there are several studies on the implementation of Q-learning in the DE algorithm. Rakshit et al. [40] introduced AMA, an algorithm coupling DE for global search and Q-learning for local refinement. Li et al. [41] presented DE-RLFR for multi-objective optimization; the algorithm treats each individual in the population as a Q-learning agent and dynamically adjusts the evolution direction of the population. Kim and Lee [42] introduced an algorithm combining the mutation and crossover strategies of PSO and DE, in which Q-learning is utilized to enhance the search capabilities. A recent study applying Q-learning to Simulated Annealing (SA) was published in [43] for engineering benchmark problems, which shows the potential of the approach. That study investigated simple engineering problems with small numbers of design variables, and SA is recognized as an effective combinatorial local search optimizer [44–46]. The present study aims to enrich the topic and fill the gap for its applications with the continuous optimization algorithm DE and structural optimization problems.

In this study, an adaptive control parameter DE algorithm is introduced. The Q-learning DE (qlDE) algorithm implements the Q-learning model in DE as the parameter controller, which adjusts the control parameters of DE adaptively at runtime. The parameters of DE are flexibly chosen at each iteration by the controller according to the gathered information. With the control parameters adaptively adjusted, the exploration and exploitation abilities of the algorithm are regulated to the search space, reinforcing its problem flexibility. Moreover, by adjusting the balance of exploration and exploitation priorities at different stages of the process, the Q-learning model also enhances the performance of the optimizer. The effectiveness and robustness of the present method are tested with five truss size and shape optimization problems with multiple frequency constraints. The addition of the Q-learning model to the DE algorithm is fairly simple, and the overall architecture of the algorithm is kept intact; yet, through the results obtained for the truss problems, the performance of the proposed qlDE algorithm, especially its convergence rate, is shown to considerably surpass the classical version and the other investigated algorithms.

The remainder of this article is organized as follows. A brief introduction of DE is provided in Section 2. A short explanation of Q-learning and an overview of the proposed qlDE model are given in Section 3. The effectiveness of the proposed algorithm compared to its origin is discussed in Section 4 through several truss structural weight minimization problems. Finally, Section 5 gives the conclusions of the study.

2. Brief Differential Evolution (DE) algorithm introduction

The Differential Evolution (DE) algorithm, after the first initialization step, can be interpreted as iterations of the following three steps: mutation, crossover and selection. The three steps are repeated until one of the stopping criteria, which are either predetermined by the optimization problem or the user, is satisfied. In this study, the relative error of the best and mean fitness values δ = |f_mean/f_best − 1| is used as a stopping condition, i.e. the algorithm stops if δ < tol, where the tolerance tol is a value predetermined by the user.

A pseudo-code of DE is presented in Algorithm 1. For a global optimization problem of D dimensions, with a constant population size of NP, each of the steps mentioned above can be explained as follows:

Initialization

This step initializes the starting population of the algorithm by randomly choosing NP individuals in the search space. The random initialization of the ith individual x_i^{t=0}, which is a vector of D design variables, is given as follows:

x_i^{t=0} = {x_{i,1}^0, x_{i,2}^0, ..., x_{i,D}^0}
with x_{i,j}^0 = x_{min,j} + rand_{i,j}[0, 1] (x_{max,j} − x_{min,j})      (1)

where x_{min,j} and x_{max,j} are, respectively, the predetermined lower and upper bounds of the search space of the jth design variable; for each x_{i,j}, rand_{i,j}[0, 1] is a random number generated in [0, 1].

Mutation

At the beginning of this step, mutant vectors v_i^t are created from the current individual population x^t = {x_1^t, x_2^t, ..., x_NP^t}. In this study, one of the most frequently used mutation schemes, DE/rand/1 [47], is used, which can be expressed as:

DE/rand/1:  v_i^t = x_{R1}^t + F (x_{R2}^t − x_{R3}^t)      (2)

where x_{R1}^t, x_{R2}^t, x_{R3}^t are three randomly chosen target individuals from the population with i ≠ R1 ≠ R2 ≠ R3; F is the scale factor determining the deviation of the two target vectors; F is chosen in (0, 1] and frequently set to 0.8 [35].

The mutant vectors are then checked for violation of the boundary constraints; the violated individuals are returned to the search
space as follows:

v_{i,j}^t =
  2 x_{min,j} − v_{i,j}^t,   if v_{i,j}^t < x_{min,j}
  2 x_{max,j} − v_{i,j}^t,   if v_{i,j}^t > x_{max,j}      (3)
  v_{i,j}^t,                 otherwise

where v_{i,j}^t is the jth component of v_i^t; x_{min,j} and x_{max,j} are, respectively, the predetermined lower and upper bounds of the search space of the jth design variable.

Crossover

Next, the crossover step is introduced to diversify the individuals in the current population. The binary crossover strategy is utilized; as stated in [33], it is the most commonly used and often the most promising strategy. The strategy produces trial vectors u_i^t = {u_{i,1}^t, u_{i,2}^t, ..., u_{i,D}^t} by mixing design variables from the individual vectors x_i^t and the mutant vectors v_i^t as follows:

u_{i,j}^t =
  v_{i,j}^t,   if j = K_i or rand_{i,j}[0, 1] ≤ Cr      (4)
  x_{i,j}^t,   otherwise

where K_i is randomly chosen in [1, D]; rand_{i,j}[0, 1] is randomly generated in [0, 1]; and the crossover rate parameter Cr is chosen in [0, 1] and commonly set to 0.9 [35].

Selection

At the end of each iteration, this step selects the better individuals for the next iteration between the current individual set x^t and the trial vector set u^t. The comparison is based on the fitness of the individuals, i.e. the values of the objective function f(·), which can be expressed as:

x_i^{t+1} =
  u_i^t,   if f(u_i^t) ≤ f(x_i^t)      (5)
  x_i^t,   otherwise

Algorithm 1 Differential Evolution
1:  procedure DE(NP, MaxEval, tol)
2:      t = 0
3:      (I) Initialization
4:      Initialize NP random candidates x_i^{t=0}
5:      Compute the objective function of the NP initial candidates x_i^{t=0}
6:      eval = NP
7:      δ = |f_mean/f_best − 1|
8:      while (eval < MaxEval) and (δ > tol) do
9:          t = t + 1
10:         for i = 1 → NP do
11:             (II) Mutation: DE/rand/1 scheme
12:             F = 0.8
13:             Choose 3 random candidates (x_{R1}^t, x_{R2}^t, x_{R3}^t) from the population
14:             v_i^t = x_{R1}^t + F (x_{R2}^t − x_{R3}^t)
15:             (III) Crossover: binary strategy
16:             Cr = 0.9
17:             K_i = rand({1, 2, ..., D})
18:             for j = 1 → D do
19:                 k = rand([0, 1])
20:                 if k ≤ Cr or j == K_i then u_{i,j}^t = v_{i,j}^t
21:                 else u_{i,j}^t = x_{i,j}^t
22:         Compute the objective function of the NP new candidates u_i^t
23:         eval = eval + NP
24:         (IV) Selection
25:         if f(u_i) ≤ f(x_i) then x_i^{t+1} = u_i
26:         else x_i^{t+1} = x_i
27:         δ = |f_mean/f_best − 1|

Fig. 1. Illustration of the Reinforcement learning process.

3. Q-learning and Q-learning differential evolution algorithm

3.1. Q-learning, a reinforcement learning model

Reinforcement learning is a branch of Artificial Intelligence concerned with how an artificial agent interacts with its environment in different states. As illustrated in Fig. 1, the Reinforcement learning model consists of five main components: the learning agent, environment, action, state and reward; its operation can be described as cycles of state, action and reward.

Q-learning is a model-free reinforcement learning algorithm. "Q" in Q-learning stands for Quality: the agent learns the quality of each given action in gaining future reward. Let S = {s_1, s_2, ..., s_m} be the set of states of the environment and A = {a_1, a_2, ..., a_n} be the set of actions that can be executed. At each iteration, the agent is given a state and asked to choose an action to be executed; the agent can perform the most optimal action based on an accumulated table of information, the Q-table. After the action has transpired, a reward r_{t+1} and a new state s_{t+1} are given by the environment, and the obtained information is learned by the agent in order to form evaluations of the expected values Q(s_t, a_t) gained by each given action.

Q-learning is a value-based learning algorithm, meaning it updates its value function using an expectation equation; one popular update equation is the Bellman equation. The update can be expressed as follows:

Q^{t+1}(s_t, a_t) = Q(s_t, a_t) + α [ r_t + γ max_a Q(s_{t+1}, a) − Q(s_t, a_t) ]      (6)

where r_t is the reward/penalty value returned by the environment, in this case determined by the fitness improvement of the population; the learning rate α and the discount factor γ are chosen within [0, 1].

The learning rate α determines how aggressively the agent reacts to the received reward: the higher the value is, the more the Q-values of the Q-table will fluctuate in each learning phase. Thus the value of α is set high in the early stage of the optimization process and linearly decreased in later iterations, as the policy for optimal actions becomes more and more converged:

α(t) = 1 − 0.9 (eval / MaxEval)      (7)

where eval and MaxEval are the number of utilized function evaluations and the maximum number of function evaluations, respectively.

The discount factor γ determines how much the future reward expectation affects the agent's decisions: the higher γ is, the more the agent will prioritize the future reward over the immediate one, and vice versa. The parameter is commonly set as γ = 0.8.

3.2. Q-learning differential evolution (qlDE) algorithm

The proposed qlDE algorithm utilizes the Q-learning model for adaptive parameter control, specifically the scale factor F and
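The DE iteration of Algorithm 1 can be condensed into a short, self-contained Python sketch. This is a minimal illustration, not the paper's implementation: the function names, the random seed and the sphere test objective are assumptions made for the demo, and the fixed parameters F = 0.8 and Cr = 0.9 correspond to the classical (non-adaptive) DE described above.

```python
import numpy as np

def de(f, x_min, x_max, NP=50, F=0.8, Cr=0.9, max_eval=150_000, tol=1e-6):
    """Classical DE: DE/rand/1 mutation, binary crossover, greedy selection."""
    rng = np.random.default_rng(0)
    D = len(x_min)
    x = x_min + rng.random((NP, D)) * (x_max - x_min)      # Eq. (1)
    fit = np.array([f(xi) for xi in x])
    n_eval = NP
    while n_eval < max_eval:
        if abs(fit.mean() / fit.min() - 1) < tol:          # delta = |f_mean/f_best - 1|
            break
        for i in range(NP):
            # (II) Mutation, Eq. (2): v = x_R1 + F (x_R2 - x_R3), R's distinct from i
            R1, R2, R3 = rng.choice([k for k in range(NP) if k != i], 3, replace=False)
            v = x[R1] + F * (x[R2] - x[R3])
            # Boundary handling by reflection, Eq. (3)
            v = np.where(v < x_min, 2 * x_min - v, v)
            v = np.where(v > x_max, 2 * x_max - v, v)
            # (III) Binary crossover, Eq. (4): forced index K guarantees one mutant gene
            K = rng.integers(D)
            mask = rng.random(D) <= Cr
            mask[K] = True
            u = np.where(mask, v, x[i])
            # (IV) Greedy selection, Eq. (5)
            fu = f(u)
            n_eval += 1
            if fu <= fit[i]:
                x[i], fit[i] = u, fu
    return x[fit.argmin()], fit.min()

# Demo on a 5-D sphere function (an illustrative objective, not a truss problem)
best_x, best_f = de(lambda z: np.sum(z**2), np.full(5, -5.0), np.full(5, 5.0),
                    max_eval=10_000)
```

With a real structural objective, `f` would wrap a finite-element analysis returning the penalized truss weight.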
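The Q-table update of Eq. (6) and the linearly decaying learning rate of Eq. (7) can be sketched as follows. The state and action discretization here is an illustrative assumption (which discrete (F, Cr) pairs form the action set, and how many states are used, is part of the qlDE design not reproduced here), as is the ε-greedy action choice.

```python
import numpy as np

# Hypothetical discretization: a few progress states, actions = candidate (F, Cr) pairs.
ACTIONS = [(0.5, 0.7), (0.8, 0.9), (1.0, 0.5)]   # illustrative choices only
N_STATES = 4

def alpha(n_eval, max_eval):
    """Linearly decaying learning rate, Eq. (7): alpha = 1 - 0.9 * eval/MaxEval."""
    return 1.0 - 0.9 * n_eval / max_eval

def choose_action(Q, s, rng, eps=0.1):
    """Pick the next (F, Cr) index: greedy on the Q-table, random with prob. eps."""
    if rng.random() < eps:
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(Q[s]))

def update_q(Q, s, a, r, s_next, n_eval, max_eval, gamma=0.8):
    """Bellman update, Eq. (6): Q(s,a) += alpha [ r + gamma max_a' Q(s',a') - Q(s,a) ]."""
    Q[s, a] += alpha(n_eval, max_eval) * (r + gamma * Q[s_next].max() - Q[s, a])
    return Q

rng = np.random.default_rng(1)
Q = np.zeros((N_STATES, len(ACTIONS)))
a = choose_action(Q, 0, rng)
F, Cr = ACTIONS[a]                   # parameters handed to the next DE generation
# reward = fitness improvement of the population after that generation (paper's choice)
Q = update_q(Q, 0, a, r=1.0, s_next=1, n_eval=50, max_eval=150_000)
```

Each DE generation would then run with the selected (F, Cr) pair, measure the resulting fitness improvement as the reward, and feed it back through `update_q`.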
Fig. 4. Convergence histories of the 10-bar truss problem using various algorithms.

and one size-shape optimization example with the 37-bar truss structure, are implemented to verify the performance of the proposed qlDE algorithm. All investigated examples are performed with natural frequency constraints, and the results are compared to those of the classical DE and several other metaheuristic optimization algorithms.

Regarding the general settings of the studied examples, the population size NP is set as 50, and the iterative process of the algorithms is terminated when either the convergence condition δ < 1e−6 or the maximum number of function evaluations condition MaxEval = 150 000 is met. In this study, the optimization algorithms employed to investigate the examples are the Firefly Algorithm (FA) [48], JAYA [49], Grey Wolf Optimizer (GWO) [50] and Whale Optimization Algorithm (WOA) [51]. Due to the stochastic nature of metaheuristic algorithms, each example is investigated with each method in 30 independent trials. The obtained results are presented to show the optimal accuracy, convergence rate and consistency of the methods in comparison with one another. The parameter settings for qlDE and the other algorithms in this study are tabulated in Table 1.

4.1. Problem statement

The truss structural optimization problems aim to seek the optimal design of the member sizes and the shape of the structure such that its total weight is minimized. The optimization problems are performed under multiple frequency constraints. In the calculation process, truss member sizes and/or structural shape elements are considered as continuous variables; the connectivity information of the structure is predetermined and assumed to remain unchanged during the optimization process. The optimization problem, with m cross-sectional areas and n element nodal coordinates as design variables, can be expressed as:

Minimize:    f(A, x) = Σ_{i=1}^{m} ρ_i A_i L_i(x_j)

Subject to:  ω_k ≤ ω̄_k
             ω_l ≥ ω̄_l                                (10)
             A_{i,min} ≤ A_i ≤ A_{i,max}
             x_{j,min} ≤ x_j ≤ x_{j,max}

where A = {A_1, ..., A_i, ..., A_m} and x = {x_1, ..., x_j, ..., x_n} are the vectors of the m cross-sectional areas and the n element nodal coordinates, respectively; ρ_i and L_i are the material density and the length of the ith member of the structure, respectively; ω̄_k and ω̄_l denote the upper and lower constraints on the natural frequencies ω_k and ω_l, respectively; A_{i,max} and A_{i,min} indicate the upper and lower bounds of the ith cross-sectional area A_i, respectively; x_{j,max} and x_{j,min} are the upper and lower bounds of the nodal coordinate x_j, respectively.

Information on the values of the variables, cross-sectional area bounds and constraints for each problem is tabulated in Table 2. In order to investigate the constrained optimization problem stated in Eq. (10), it is first expressed as an unconstrained optimization problem. The penalty function, which is a commonly used constraint-handling approach [3], is adopted, and with it,
Table 2
Material properties, cross-sectional area bounds and frequency constraints for the truss weight optimization problems.
Problem Young’s modulus Material density Cross-sectional Frequency constraints
E(N/m2 ) ρ (kg/m3 ) area bounds A(cm2 ) ω(Hz)
10-bar planar truss 6.98 × 1010 2770 0.645 ≤ A ≤ 50 ω1 ≥ 7, ω2 ≥ 15, ω3 ≥ 20
72-bar space truss 6.98 × 1010 2770 0.645 ≤ A ≤ 50 ω1 = 4, ω2 ≥6
120-bar dome truss 2.1 × 1011 7971.81 1 ≤ A ≤ 129.3 ω1 ≥ 9, ω2 ≥ 11
200-bar planar truss 2.1 × 1011 7860 0.1 ≤ A ≤ 25 ω1 ≥ 5, ω2 ≥ 10, ω3 ≥ 15
37-bar planar truss 2.1 × 1011 7800 1 ≤ A ≤ 10 ω1 ≥ 20, ω2 ≥ 40, ω3 ≥ 60
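The constrained problem of Eq. (10) is handled with a penalty approach; the exact penalty expression used in the paper is not included in this excerpt, so the following is only a generic sketch of how a penalized truss-weight objective might be formed. The frequency values, areas and lengths below are made-up numbers for illustration, and in practice the frequencies would come from a finite-element eigenvalue analysis.

```python
import numpy as np

def penalized_weight(A, L, rho, freqs, lower, upper, penalty=1e6):
    """Generic penalty-function objective for Eq. (10):
    truss weight f(A, x) = sum(rho_i A_i L_i) plus a quadratic
    penalty on violated natural-frequency bounds."""
    weight = np.sum(rho * A * L)
    viol = 0.0
    for k, w_lo in lower.items():                 # omega_l >= bar(omega)_l
        viol += max(0.0, w_lo - freqs[k]) ** 2
    for k, w_hi in upper.items():                 # omega_k <= bar(omega)_k
        viol += max(0.0, freqs[k] - w_hi) ** 2
    return weight + penalty * viol

# Illustrative numbers only (not a real truss analysis); units kept simple.
A = np.array([10.0, 12.0])            # member areas
L = np.array([1.0, 1.5])              # member lengths
rho = 2770.0                          # material density (10-bar truss, Table 2)
freqs = {1: 7.2, 2: 16.1, 3: 20.5}    # pretend first three natural frequencies (Hz)
f = penalized_weight(A, L, rho, freqs,
                     lower={1: 7, 2: 15, 3: 20}, upper={})
```

A feasible design returns the plain weight; any frequency violation inflates the objective so the optimizer is driven back into the feasible region.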
Table 3
Comparison of optimal results of the 10-bar planar truss structure obtained using different algorithms.
Variables Ai (cm²)   FA   JAYA   GWO   WOA   ReDE^a [52]   MS-DE^a [53]   DE   qlDE   qlDE^a
1 34.7502 35.0610 34.8174 35.1003 35.1565 35.1147 35.1007 35.1146 35.3252
2 14.7435 14.6886 14.8731 20.6891 14.7605 14.5260 14.7106 14.7259 14.6879
3 35.7114 35.3155 35.4656 37.0407 35.1187 35.0208 35.1584 35.1233 35.0360
4 14.8260 14.6572 14.8511 9.6049 14.7275 14.9330 14.7098 14.7008 14.6747
5 0.6450 0.6532 0.6457 0.6484 0.6450 0.6515 0.6450 0.6450 0.6451
6 4.5491 4.5662 4.5477 5.1898 4.5558 4.5695 4.5595 4.5595 4.5610
7 23.5403 23.7396 23.5669 30.9065 23.7199 23.2904 23.7074 23.7286 23.7472
8 23.7560 23.7566 23.5743 17.0298 23.6304 24.0829 23.7012 23.6718 23.7275
9 12.3508 12.4370 12.5440 7.2446 12.3827 12.6026 12.4019 12.4159 12.4184
10 12.3708 12.2889 12.3532 17.9017 12.4580 12.3531 12.4363 12.4414 12.3233
Best 524.4950 524.5263 524.5153 536.0290 524.4542 524.58 524.4513 524.4516 524.4579
nFEAs 31 350 150 000 150 000 150 000 8300 4800 99 800 33 450 9980
Worst 532.3784 530.7592 530.7445 662.1084 530.6431 – 524.4531 524.5971 534.6095
Mean 525.9205 525.2136 525.2319 577.3127 524.7635 526.1 524.4519 524.4747 527.1701
SD 1.5143 1.8740 1.2398 29.3229 1.1129 2.42 0.0004 0.0335 3.1562
Avg. nFEAs 32 656 150 000 150 000 150 000 – – 91 768 24 813 9428
^a Investigated with NP = 20.
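The Best, Worst, Mean, SD and Avg. nFEAs rows of Table 3 summarize 30 independent trials per algorithm. Given raw trial results, those statistics can be reproduced as follows (the trial data here are hypothetical, for illustration only):

```python
import numpy as np

def trial_statistics(weights, n_feas):
    """Summary statistics over independent optimization trials, as
    reported in the result tables (Best, Worst, Mean, SD, Avg. nFEAs)."""
    weights = np.asarray(weights, dtype=float)
    return {
        "Best": weights.min(),
        "Worst": weights.max(),
        "Mean": weights.mean(),
        "SD": weights.std(ddof=1),          # sample standard deviation over trials
        "Avg. nFEAs": float(np.mean(n_feas)),
    }

# Hypothetical results of 5 trials (optimal weight in kg, analyses used per trial):
stats = trial_statistics([524.45, 524.46, 524.47, 524.45, 524.50],
                         [24000, 25000, 26000, 24500, 24600])
```

A low SD over the trials indicates the consistency discussed in the text, while Avg. nFEAs reflects the convergence cost.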
Table 4
Natural frequencies of the optimal design for the 10-bar truss structure obtained by various algorithms.
Freq. No. FA JAYA GWO WOA ReDE MS-DE DE qlDE qlDE*
1 7.0000 7.0000 7.0000 7.0001 7.0000 7.0002 7.0000 7.0000 7.0000
2 16.2013 16.2062 16.1960 15.3938 16.1924 16.1927 16.1909 16.1900 16.1954
3 20.0000 19.9999 20.0002 20.0000 20.0000 20.0001 20.0000 20.0000 20.0000
4 20.0000 20.0071 20.0055 20.4233 20.0002 20.0144 20.0000 20.0001 20.0001
5 28.5205 28.5419 28.5370 24.6726 28.5517 28.5646 28.5584 28.5620 28.5422
6 28.9387 28.9216 29.0280 32.2562 – 29.0352 28.9650 28.9715 28.9244
7 48.6303 48.5780 48.6421 48.3824 – 48.5624 48.5748 48.5720 48.5782
8 51.1354 51.0538 51.1663 53.1250 – 51.0797 51.0707 51.0704 51.0591
Table 5
Comparison of optimal results of the 72-bar space truss structure obtained using different algorithms.
Variables Ai (cm²)   FA   JAYA   GWO   WOA   ReDE^a [52]   DE   qlDE   qlDE^a
1 3.6307 3.2966 3.5443 6.4358 3.5327 3.4723 3.4685 3.2318
2 8.0298 7.8475 7.8490 6.5300 7.8303 7.8611 7.8459 7.7388
3 0.6450 0.6491 0.6450 0.6450 0.6453 0.6450 0.6450 0.6451
4 0.6450 0.6576 0.6811 0.6450 0.6450 0.6451 0.6451 0.6450
5 8.8816 7.6283 7.6855 7.6128 8.0029 7.9675 7.9829 7.8710
6 7.9477 7.8998 8.0136 8.5628 7.9135 7.9220 7.9317 7.9837
7 0.6450 0.6456 0.6525 0.6450 0.6451 0.6450 0.6450 0.6451
8 0.6450 0.6460 0.6490 0.6450 0.6451 0.6450 0.6450 0.6450
9 13.0781 12.8277 12.9276 24.7355 12.7626 12.7030 12.7040 13.0343
10 7.9346 7.9792 7.9086 8.2413 7.9657 7.9691 7.9802 7.8947
11 0.6450 0.6564 0.6453 0.6450 0.6452 0.6450 0.6450 0.6452
12 0.6450 0.6500 0.6822 0.6450 0.6450 0.6451 0.6450 0.6458
13 15.8196 17.4866 17.0599 11.7928 16.9041 17.0555 17.0426 17.0906
14 7.8485 8.0245 7.9816 8.9409 8.0434 7.9993 7.9936 8.1391
15 0.6450 0.6479 0.6631 0.6450 0.6451 0.6450 0.6450 0.6460
16 0.6450 0.6532 0.6660 0.6450 0.6473 0.6450 0.6450 0.6450
Table 6
First five natural frequencies of the optimal design for the 72-bar truss structure
obtained by various algorithms.
Freq. No. FA JAYA GWO WOA ReDE DE qlDE qlDE*
1 4.0000 4.0000 4.0000 4.0000 4.0000 4.0000 4.0000 4.0000
2 4.0000 4.0000 4.0000 4.0000 4.0000 4.0000 4.0000 4.0000
3 6.0000 6.0001 6.0000 6.0063 6.0001 6.0000 6.0000 6.0000
4 6.3061 6.2807 6.3628 6.3777 6.2762 6.2692 6.2690 6.2499
5 9.1397 9.0884 9.1192 9.1431 9.1073 9.1017 9.1001 9.0561
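Tables 4 and 6 verify that the optimized designs respect the frequency limits of Table 2. That check is a simple comparison of the computed natural frequencies against the equality, lower and upper limits; a sketch follows (the tolerance value is an assumption):

```python
def constraints_satisfied(freqs, eq=None, lower=None, upper=None, tol=1e-3):
    """Check computed natural frequencies (Hz, keyed by mode number)
    against the limits of Table 2."""
    eq, lower, upper = eq or {}, lower or {}, upper or {}
    ok = all(abs(freqs[k] - v) <= tol for k, v in eq.items())    # omega_k = bar(omega)_k
    ok &= all(freqs[k] >= v - tol for k, v in lower.items())     # omega_l >= bar(omega)_l
    ok &= all(freqs[k] <= v + tol for k, v in upper.items())     # omega_k <= bar(omega)_k
    return ok

# 72-bar truss optimum (qlDE column of Table 6): omega1 = omega2 = 4, omega3 = 6
ok = constraints_satisfied({1: 4.0000, 2: 4.0000, 3: 6.0000, 4: 6.2690},
                           eq={1: 4}, lower={3: 6})
```

Note that for the 72-bar truss the active constraints sit exactly on their limits at the optimum, which is typical of frequency-constrained weight minimization.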
The following subsections contain a description of four examples of truss element size optimization problems: the 10-bar planar truss, 72-bar space truss, 120-bar dome truss and 200-bar planar truss, and the results of implementing the various optimization methods to solve them. The member cross-sectional areas are considered as continuous design variables within corresponding bounds.

4.2.1. 10-bar planar truss

A 10-bar planar truss structure, illustrated in Fig. 3, is adopted as the first optimization example. Four vertical loads of 454 kg are applied at the four corresponding free nodes, i.e. nodes 1–4 as indicated in the illustration. The ten cross-sectional areas of the elements are considered as continuous design variables Ai with i ∈ [1, 10].

The results gained from the investigated approaches are shown in Table 3. From the results, we can see that DE and qlDE yield the most optimal designs with good consistency and convergence rate. Compared with DE, qlDE converges significantly faster. On the other hand, with NP = 20, compared with the other DE variants, ReDE and MS-DE, qlDE yields a similar optimal design but converges slower. The convergence histories of the problem obtained using the different approaches are shown in Fig. 4. The first eight natural frequencies of the optimal structural design are evaluated in Table 4; it can be seen that all the constraints are satisfied.

4.2.2. 72-bar space truss

The element connectivity, lengths and boundary conditions of a 72-bar space truss structure can be found in Fig. 5, which also indicates the four added non-structural masses of 2270 kg at the four marked upper nodes 1–4 of the structure. The cross-sectional areas of the elements are divided into 16 groups, which are considered as design variables Ai with i ∈ [1, 16]. The element groups are listed in detail in Table 13.

Table 5 compares the optimization results of the 72-bar space truss structure problem obtained using different algorithms. From the data and the convergence histories displayed in Fig. 6, a conclusion similar to that of the first example can be drawn: the qlDE algorithm achieves the most optimal design while utilizing significantly fewer function evaluations with acceptable consistency. However, in this test, compared with ReDE with NP = 20, qlDE gains a roughly similar optimal result but with worse consistency and convergence rate. The natural frequencies of the optimal designs shown in Table 6 indicate that the constraints are not violated.

4.2.3. 120-bar dome truss

Fig. 7 displays the structure of a 120-bar dome truss and the three marked node groups: the group of node 1, the group of nodes 2–13 and the group of the remaining nodes. On every member of the three groups a constant concentrated mass of m1 = 3000 kg,
Table 7
Comparison of optimal results of the 120-bar dome truss structure obtained using different algorithms.
Variables Ai (cm²)   FA   JAYA   GWO   WOA   ReDE^a [52]   DE   qlDE   qlDE^a
1 19.5425 0.0020 19.5134 20.5526 19.5131 19.4929 19.5076 19.5171
2 40.5250 0.0040 40.2129 39.7014 40.3914 40.408 40.3619 40.3702
3 10.4466 0.0011 10.6775 9.9319 10.6066 10.6082 10.6075 10.6082
4 21.0746 0.0021 21.1208 21.5904 21.1415 21.1117 21.1111 21.1055
5 9.7259 0.0010 9.9041 8.5878 9.8057 9.8442 9.8413 9.8119
6 11.9494 0.0012 11.6838 12.8019 11.7781 11.7726 11.7673 11.7425
7 14.8849 0.0015 14.8474 15.0515 14.8163 14.8355 14.8453 14.8625
m2 = 500 kg or m3 = 100 kg is applied, respectively. Also, from the figure, the truss members are numbered into seven groups of the same cross-sectional area, which are considered as continuous design variables Ai with i ∈ [1, 7].

The convergence histories are shown in Fig. 8; we can recognize that the qlDE approach acquired the optimal design with a faster convergence speed toward the optimum than the other algorithms. Table 7 presents the optimal designs of the 120-bar truss problem obtained using the investigated algorithms. From the statistics, it can be seen that ReDE, DE and qlDE result in consistent optimal designs with the fewest analyses needed, in which qlDE obtains a slightly less optimal result while requiring less computational cost than DE. In contrast to the earlier examples, with NP = 20, compared to ReDE, qlDE obtained a similar optimal design and convergence rate. The natural frequencies of the optimal designs are shown in Table 8, which shows that none of the constraints is notably violated.

4.2.4. 200-bar planar truss

With its structure illustrated in Fig. 9, the 200-bar planar truss structure is supplemented with constant non-structural masses of 100 kg at the top nodes, i.e. nodes 1–5. The truss members of the structure are divided into 29 groups, each with a uniform member cross-sectional area; members of the groups are listed in Table 13. Members of the cross-sectional area vector Ai with i ∈ [1, 29] are considered as continuous design variables.

Fig. 10 displays the convergence histories of the example obtained using different approaches. The optimal designs of the 200-bar truss problem obtained by the investigated algorithms are tabulated in Table 9, which shows that qlDE consistently obtains a more optimal result than the other algorithms. However, with NP = 20, for this problem, compared with MS-DE, qlDE achieves a relatively optimal result over 30 trials, but it lacks consistency. The first five natural frequencies of the optimal designs are calculated in Table 10; none of the frequency constraints is shown to be violated.
The example problem for element size and nodal coordinate optimization is the weight minimization of a simply-supported 37-bar planar truss structure, as described in Fig. 11. The structure is supported at node 1 and node 20; the marked nodes [2, 4, ..., 18] are each subjected to a constant lumped mass of 10 kg. The truss members of the lower chord, members [28–37], have fixed cross-sectional areas of 4 × 10^-3 m², while those of the other members vary from 10^-4 m² to 10^-3 m². The y-axis coordinates of the upper nodes, nodes [3, 5, ..., 19], are considered for optimization; they are bounded between −0.5 m and 2.5 m relative to the y-axis coordinate of the lower chord.

The objective of the optimization is to find the optimal design of nodal coordinates and member cross-sectional areas. Both are grouped symmetrically into a 19-element continuous design variable vector P = {Y1, Y2, ..., Y5, A1, A2, ..., A14}, where Yi denotes the ith y-axis coordinate and Aj denotes the jth cross-sectional area. The design variables in the same group share the same value; the member lists of the groups are presented in Table 13.
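Because the design variables are grouped symmetrically, the 19-element vector P must be expanded to per-node coordinates and per-member areas before each structural analysis. A sketch of that mapping follows; the group membership lists used here are invented placeholders (the real grouping is given in Table 13).

```python
import numpy as np

def expand_grouped(P, y_groups, a_groups, n_members):
    """Map the grouped design vector P = {Y1..Y5, A1..A14} to
    per-node y-coordinates and per-member cross-sectional areas."""
    Y, A_grp = P[:5], P[5:]
    y = {}
    for gi, nodes in enumerate(y_groups):        # symmetric node pairs share one Y
        for node in nodes:
            y[node] = Y[gi]
    A = np.empty(n_members)
    for gi, members in enumerate(a_groups):      # members in a group share one area
        for m in members:
            A[m] = A_grp[gi]
    return y, A

# Placeholder group lists (NOT the paper's Table 13 grouping):
y_groups = [[3, 19], [5, 17], [7, 15], [9, 13], [11]]
a_groups = [[i] for i in range(14)]              # trivial one-member groups for the demo
P = np.concatenate([np.full(5, 1.0), np.full(14, 2.0)])
y, A = expand_grouped(P, y_groups, a_groups, n_members=14)
```

Grouping by symmetry reduces the search dimension from 9 coordinates plus 27 free areas to 19 variables, which is what the optimizer actually sees.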
Fig. 9. Illustration of a 200-bar planar structure.

The optimal shape design of the problem is presented in Fig. 12, and the optimal solutions of the problem gained by the various approaches are shown in Table 11. From the results, we can observe that qlDE generates a more optimal design than most of the other algorithms while having a superior convergence rate, as can be seen in the convergence histories graph of Fig. 13. Similar to Section 4.2.4, the optimal standard deviation of qlDE is also shown to be better than those of the remaining algorithms. For the investigation with NP = 20, similar to Section 4.2.3, qlDE obtains the optimal design with decent consistency and only slightly falls behind ReDE and MS-DE in terms of convergence rate, while compared with the others it converges much faster and/or yields more consistent results. As can be seen from Table 12, the obtained optimal designs do not violate any natural frequency constraints.
Table 8
First five natural frequencies of the optimal design for the 120-bar truss structure obtained by
various algorithms.
Freq. No. FA JAYA GWO WOA ReDE DE qlDE qlDE*
1 9.0000 9.0000 9.0002 9.0000 9.0000 9.0000 9.0000 9.0000
2 11.0000 10.9999 11.0000 11.0000 11.0000 11.0000 11.0000 11.0000
3 11.0000 10.9999 11.0001 11.0000 11.0000 11.0000 11.0000 11.0000
4 11.0000 11.0001 11.0001 11.0193 11.0000 11.0000 11.0000 11.0000
5 11.0667 11.0668 11.0671 11.0814 14.0667 11.0669 11.0669 11.0668
Table 9
Comparison of optimal results of the 200-bar planar truss structure obtained using different algorithms.
Variables Ai (cm²)   FA   JAYA   GWO   WOA   MS-DE^a [53]   DE   qlDE   qlDE^a
1 0.6559 0.2999 0.3035 0.4196 0.3027 0.3353 0.3026 0.3021
2 0.5179 0.4494 0.4571 0.8553 0.4533 0.4353 0.4515 0.4527
3 0.1000 0.1003 0.1000 0.1000 0.1002 0.1019 0.1000 0.1001
4 0.1000 0.1001 0.1000 0.1000 0.1003 0.1023 0.1000 0.1002
5 0.6412 0.5086 0.5093 0.7125 0.5112 0.5042 0.5118 0.5122
6 0.7968 0.8208 0.8177 0.9071 0.8169 0.8127 0.8204 0.8194
7 0.1374 0.1006 0.1006 0.3368 0.1005 0.1019 0.1000 0.1006
8 1.3527 1.4224 1.4399 0.6203 1.4093 1.4466 1.4240 1.4294
9 0.1000 0.1004 0.1000 0.1000 0.1001 0.1010 0.1000 0.1000
10 1.6713 1.5906 1.5630 1.1939 1.5887 1.5482 1.5839 1.5777
11 1.1022 1.1441 1.1491 1.5047 1.1619 1.1490 1.1592 1.1580
12 0.1033 0.1476 0.1310 0.7326 0.1364 0.1540 0.1300 0.1105
13 3.7031 2.9861 2.9438 2.4000 2.9712 2.9990 2.9771 3.0105
14 0.1456 0.1015 0.1064 0.1000 0.1009 0.1073 0.1000 0.1007
15 17.2392 3.2497 3.2284 2.7215 3.2506 3.1567 3.2485 3.3230
16 1.5543 1.5832 1.5877 1.7982 1.5839 1.5866 1.5906 1.5857
17 0.1965 0.2513 0.3085 0.5052 0.2615 0.2421 0.2599 0.2313
18 5.5155 5.1007 5.0692 6.0774 5.0886 5.2240 5.0783 5.3886
19 0.2314 0.1003 0.1024 0.1000 0.1033 0.1341 0.1001 0.1002
20 6.0425 5.4270 5.4102 3.1989 5.4265 5.4683 5.4388 5.4713
21 2.1151 2.1084 2.1019 2.1094 2.1086 2.1276 2.0989 2.0838
22 0.8373 0.6925 0.7446 1.8207 0.7051 0.6004 0.6917 0.6508
23 7.3159 7.6729 7.6632 10.7626 7.6910 7.8168 7.6891 7.7175
24 0.8573 0.1046 0.1424 0.5586 0.1079 0.2377 0.1041 0.1438
25 10.3292 7.9138 7.9685 7.7495 7.9069 7.9513 7.9674 8.0090
26 3.1411 2.8069 2.8052 3.5550 2.8211 2.7568 2.7998 2.8120
27 8.5254 10.4194 10.4882 9.2361 10.4602 10.6517 10.5237 10.2987
28 20.1061 21.3809 21.2919 18.8379 21.1925 21.2701 21.2905 21.2032
29 20.6098 10.7155 10.7499 20.8991 10.7934 10.5659 10.7254 10.8699
Table 10
First five natural frequencies of the optimal design for the 200-bar truss structure obtained by
various algorithms.
Freq. No. FA JAYA GWO WOA MS-DE DE qlDE qlDE*
1 5.0000 5.0000 5.0000 5.0000 5.0000 5.0001 5.0000 5.0000
2 14.6500 12.2144 12.1803 13.5881 12.2130 12.4582 12.2083 12.2106
3 15.0000 15.0327 15.0171 15.0108 15.0290 15.0434 15.0215 15.0139
4 17.9327 16.7094 16.6951 16.8668 16.6870 16.7285 16.6897 16.6937
5 20.9668 21.4028 21.3513 22.7464 21.4170 21.3604 21.3978 21.3920
T.N. Huynh, D.T.T. Do and J. Lee Applied Soft Computing 107 (2021) 107464
Table 11
Comparison of optimal results of the 37-bar planar truss structure obtained using different algorithms.
Variables FA JAYA GWO WOA ReDE^a MS-DE^a DE qlDE qlDE^a
Pi (Yi in m, Ai in cm²) [52,53]
1 0.8981 0.9521 0.9805 0.8574 0.9533 0.9522 0.9648 0.9523 0.9333
2 1.2430 1.3411 1.3464 1.1747 1.3414 1.3435 1.3452 1.3387 1.3135
3 1.3832 1.5301 1.5325 1.3799 1.5319 1.5387 1.5258 1.5224 1.5110
4 1.4957 1.6658 1.6666 1.5139 1.6528 1.6750 1.6609 1.6627 1.6432
5 1.5672 1.7321 1.7414 1.6312 1.7280 1.7501 1.7324 1.7358 1.7203
6 2.7832 2.9256 2.8251 2.7184 2.9608 2.8279 2.8976 2.9140 2.9823
7 1.0000 1.0159 1.0055 1.7427 1.0052 1.0280 1.0262 1.0001 1.0012
8 1.0000 1.0152 1.0004 2.0415 1.0014 1.0104 1.0004 1.0000 1.0042
9 3.2574 2.5679 2.6016 2.0891 2.5994 2.3892 2.5793 2.5671 2.4223
10 1.0882 1.2216 1.1348 1.4186 1.1949 1.2008 1.1704 1.1892 1.1581
11 1.2357 1.2522 1.2126 1.6341 1.2165 1.2553 1.2091 1.2320 1.2404
12 3.6252 2.6427 2.5518 2.8862 2.4303 2.4195 2.6128 2.5314 2.5708
13 1.2990 1.3289 1.4419 2.2927 1.3644 1.3621 1.3522 1.3855 1.4851
14 1.4235 1.5192 1.4897 2.0474 1.5548 1.5691 1.5388 1.4932 1.4449
15 3.8489 2.3939 2.5094 3.9495 2.5247 2.4970 2.4741 2.5043 2.5903
16 1.1036 1.2368 1.2240 2.4154 1.1946 1.2206 1.2011 1.2408 1.3040
17 1.2326 1.3317 1.3182 1.5651 1.3163 1.4111 1.3280 1.3334 1.3039
18 2.5335 2.3538 2.4382 4.0124 2.4465 2.5220 2.4154 2.4160 2.5335
19 1.0000 1.0022 1.0000 2.2341 1.0003 1.0013 1.0052 1.0000 1.0026
Best 361.4595 359.8715 359.8237 374.5330 359.8066 359.9300 359.8157 359.7860 359.8877
nFEAs 33 950 150 000 150 000 150 000 13 740 9 520 150 000 63 450 15 700
Worst 384.8995 360.1972 360.0602 423.3269 360.5492 – 359.9786 359.8750 363.4417
Mean 367.3682 359.9659 359.9064 396.2282 359.9944 360.2100 359.8878 359.8160 360.5791
SD 5.5230 0.0662 0.0571 14.6547 0.1493 0.2000 0.0381 0.0237 0.9790
Avg. nFEAs 41 380 150 000 150 000 150 000 – – 150 000 52 197 16 683
^a Investigated with NP = 20.
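The Best/Worst/Mean/SD rows in Table 11 summarize independent optimization runs. A short sketch of how such statistics are typically tabulated (the run weights below are hypothetical placeholders, not results from the paper):

```python
import statistics

# Hypothetical best weights (kg) from five independent runs of an
# optimizer; each entry is the final objective value of one run.
run_weights = [359.79, 359.81, 359.80, 359.87, 359.83]

best = min(run_weights)
worst = max(run_weights)
mean = statistics.mean(run_weights)
sd = statistics.stdev(run_weights)  # sample standard deviation

print(f"Best {best:.4f}  Worst {worst:.4f}  Mean {mean:.4f}  SD {sd:.4f}")
```

Whether SD is the sample or population value is rarely stated in such tables; `statistics.pstdev` gives the population variant instead.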
Table 12
First five natural frequencies of the optimal design for the 37-bar truss structure obtained by various algorithms.
Freq. No. FA JAYA GWO WOA ReDE MS-DE DE qlDE qlDE*
1 20.0000 20.0104 20.0003 20.0000 20.0005 20.0086 20.0003 20.0001 20.0002
2 40.0000 40.0003 40.0002 42.9742 40.0004 40.0156 40.0051 40.0003 40.0010
3 60.0000 60.0061 60.0050 67.8645 60.0022 60.0015 60.0048 60.0004 60.0000
4 76.0818 77.0035 76.1201 89.0106 76.4734 77.0491 76.3244 76.5730 76.8795
5 95.7964 96.6422 96.0315 108.6484 96.3820 97.0918 96.3619 96.4065 96.8683
Table 13
Member/variable groups for the 72-bar, 200-bar and 37-bar truss structure problems.
Problem / Design variable group / Corresponding members | Problem / Design variable group / Corresponding design variables
A1 1–4 | P1 Y3, Y19
A2 4–12 | P2 Y5, Y17
A3 13–16 | P3 Y7, Y15
A4 17–18 | P4 Y9, Y13
A5 19–20 | P5 Y11
A6 23–30 | P6 A1, A27
72-bar truss problem
A12 77,78,79,80
A13 81,84,87,90,93
A14 95,96,97,98,99,100
A15 102,105,108,111,114
A16 82,83,85,86,88,89,91,92,103,104,106,107,109,110,112,113
A17 115,116,117,118
A18 119,122,125,128,131
A19 133,134,135,136,137,138
A20 140,143,146,149,152
A21 120,121,123,124,126,127,129,130,141,142,144,145,147,148,150,151
A22 153,154,155,156
A23 157,160,163,166,169
A24 171,172,173,174,175,176
A25 178,181,184,187,190
A26 158,159,161,162,164,165,167,168,179,180,182,183,185,186,188,189
A27 191,192,193,194
A28 195,197,198,200
A29 196,199
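The core mechanism these tables evaluate is a Q-learning agent that picks the DE control parameters (F, CR) at each iteration. A minimal epsilon-greedy sketch of that idea follows; the discrete action set, the ±1 reward, and the single-state loop are simplifying assumptions for illustration, not the authors' exact scheme:

```python
import random

# Minimal Q-learning controller that selects a (F, CR) pair for each
# DE generation -- a simplified illustration of the idea behind qlDE.

ACTIONS = [(0.4, 0.7), (0.6, 0.8), (0.9, 0.9)]  # candidate (F, CR) pairs
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1               # learning rate, discount, exploration

q_table = {}  # (state, action_index) -> estimated value

def choose_action(state):
    """Epsilon-greedy choice of an action index for this iteration."""
    if random.random() < EPS:
        return random.randrange(len(ACTIONS))
    values = [q_table.get((state, a), 0.0) for a in range(len(ACTIONS))]
    return values.index(max(values))

def update(state, action, reward, next_state):
    """Standard one-step Q-learning update of the table entry."""
    best_next = max(q_table.get((next_state, a), 0.0)
                    for a in range(len(ACTIONS)))
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)

# Toy loop: reward +1 when the (hypothetical) best objective improves.
random.seed(0)
state = "search"
for _ in range(50):
    a = choose_action(state)
    F, CR = ACTIONS[a]                 # parameters a DE generation would use
    improved = random.random() < 0.5   # stand-in for running one generation
    update(state, a, 1.0 if improved else -1.0, state)
```

In the full algorithm the state would encode search progress (e.g. stagnation or diversity measures), so the controller can favor exploratory parameters early and exploitative ones late.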
etc., especially for engineering problems. Moreover, the potential of the additional Q-learning parameter control procedure for other optimization algorithms such as FA, PSO, ABC [14,54], etc., awaits further attention.

CRediT authorship contribution statement

Thanh N. Huynh: Formal analysis, Visualization, Methodology, Writing - original draft. Dieu T.T. Do: Writing - review & editing, Validation. Jaehong Lee: Conceptualization, Writing - review & editing, Supervision, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research was supported by Grant (NRF-2021R1A2B5B03002410) from the NRF (National Research Foundation of Korea) funded by the MEST (Ministry of Education and Science Technology) of the Korean government.

References

[1] L. Bellagamba, T.Y. Yang, Minimum-mass truss structures with constraints on fundamental natural frequency, AIAA J. 19 (11) (1981) 1452–1458.
[2] D.T. Do, J. Lee, A modified symbiotic organisms search (mSOS) algorithm for optimization of pin-jointed structures, Appl. Soft Comput. 61 (2017) 683–699.
[3] Q.X. Lieu, D.T. Do, J. Lee, An adaptive hybrid evolutionary firefly algorithm for shape and size optimization of truss structures with frequency constraints, Comput. Struct. 195 (2018) 99–112.
[4] H.M. Gomes, Truss optimization with dynamic constraints using a particle swarm algorithm, Expert Syst. Appl. 38 (1) (2011) 957–968.
[5] S. Degertekin, Improved harmony search algorithms for sizing optimization of truss structures, Comput. Struct. 92 (2012) 229–241.
[6] L.F.F. Miguel, R.H. Lopez, L.F.F. Miguel, Multimodal size, shape, and topology optimisation of truss structures using the firefly algorithm, Adv. Eng. Softw. 56 (2013) 23–37.
[7] G.I. Rozvany, Structural Design Via Optimality Criteria: The Prager Approach to Structural Optimization, Vol. 8, Springer Science & Business Media, 2012.
[8] N. Khot, L. Berke, V. Venkayya, Comparison of optimality criteria algorithms for minimum weight design of structures, AIAA J. 17 (2) (1979) 182–190.
[9] F.T. Ko, B.P. Wang, An improved method of optimality criteria for structural optimization, Comput. Struct. 41 (4) (1991) 629–636.
[10] L. Lamberti, C. Pappalettere, Move limits definition in structural optimization with sequential linear programming. Part I: Optimization algorithm, Comput. Struct. 81 (4) (2003) 197–213.
[11] L. Lamberti, C. Pappalettere, Improved sequential linear programming formulation for structural weight minimization, Comput. Methods Appl. Mech. Engrg. 193 (33–35) (2004) 3493–3521.
[12] L. Qian, W. Zhong, K. Cheng, Y. Sui, An approach to structural optimization—sequential quadratic programming, SQP, Eng. Optim. 8 (1) (1984) 83–100.
[13] L. Lamberti, C. Pappalettere, Comparison of the numerical efficiency of different sequential linear programming based algorithms for structural optimisation problems, Comput. Struct. 76 (6) (2000) 713–728.
[14] H. Gao, Z. Fu, C.M. Pun, J. Zhang, S. Kwong, An efficient artificial bee colony algorithm with an improved linkage identification method, IEEE Trans. Cybern. (2020).
[15] W. Tang, L. Tong, Y. Gu, Improved genetic algorithm for design optimization of truss structures with sizing, shape and topology variables, Internat. J. Numer. Methods Engrg. 62 (13) (2005) 1737–1762.
[16] A. Kaveh, M.I. Ghazaan, Enhanced whale optimization algorithm for sizing optimization of skeletal structures, Mech. Based Des. Struct. Mach. 45 (3) (2017) 345–362.
[17] A. Kaveh, P. Zakian, Improved GWO algorithm for optimal design of truss structures, Eng. Comput. 34 (4) (2018) 685–707.
[18] J. Liu, J. Lampinen, A fuzzy adaptive differential evolution algorithm, Soft Comput. 9 (6) (2005) 448–462.
[19] O. Hasançebi, S.K. Azad, Adaptive dimensional search: A new metaheuristic algorithm for discrete truss sizing optimization, Comput. Struct. 154 (2015) 1–16.
[20] A.K. Qin, P.N. Suganthan, Self-adaptive differential evolution algorithm for numerical optimization, Vol. 2, IEEE, 2005, pp. 1785–1791.
[21] J. Brest, S. Greiner, B. Boskovic, M. Mernik, V. Zumer, Self-adapting control parameters in differential evolution: A comparative study on numerical benchmark problems, IEEE Trans. Evol. Comput. 10 (6) (2006) 646–657.
[22] W.J. Yu, M. Shen, W.N. Chen, Z.H. Zhan, Y.J. Gong, Y. Lin, O. Liu, J. Zhang, Differential evolution with two-level parameter adaptation, IEEE Trans. Cybern. 44 (7) (2013) 1080–1099.
[23] Z. Meng, J.S. Pan, K.K. Tseng, PaDE: An enhanced differential evolution algorithm with novel control parameter adaptation schemes for numerical optimization, Knowl.-Based Syst. 168 (2019) 80–99.
[24] M. Isiet, M. Gadala, Self-adapting control parameters in particle swarm optimization, Appl. Soft Comput. 83 (2019) 105653.
[25] R. Storn, K. Price, Differential evolution – A simple and efficient heuristic for global optimization over continuous spaces, J. Global Optim. 11 (4) (1997) 341–359.
[26] B. Babu, M.M.L. Jehan, Differential Evolution for Multi-Objective Optimization, Vol. 4, IEEE, 2003, pp. 2696–2703.
[27] S. Das, A. Konar, U.K. Chakraborty, Two Improved Differential Evolution Schemes for Faster Global Search, 2005, pp. 991–998.
[28] E. Mezura-Montes, J. Velázquez-Reyes, C.C. Coello, Modified Differential Evolution for Constrained Optimization, IEEE, 2006, pp. 25–32.
[29] G. Onwubolu, D. Davendra, Scheduling flow shops using differential evolution algorithm, European J. Oper. Res. 171 (2) (2006) 674–692.
[30] J. Zhang, A. Sanderson, JADE: Adaptive differential evolution with optional external archive, IEEE Trans. Evol. Comput. 13 (5) (2009) 945–958.
[31] S.M. Elsayed, R.A. Sarker, T. Ray, Differential evolution with automatic parameter configuration for solving the CEC2013 competition on real-parameter optimization, in: 2013 IEEE Congress on Evolutionary Computation, IEEE, Cancun, Mexico, 2013, pp. 1932–1937.
[32] R.A. Sarker, S.M. Elsayed, T. Ray, Differential evolution with dynamic parameters selection for optimization problems, IEEE Trans. Evol. Comput. 18 (5) (2014) 689–707.
[33] S. Das, S.S. Mullick, P.N. Suganthan, Recent advances in differential evolution – An updated survey, Swarm Evol. Comput. 27 (2016) 1–30.
[34] A.E. Eiben, R. Hinterding, Z. Michalewicz, Parameter control in evolutionary algorithms, IEEE Trans. Evol. Comput. 3 (2) (1999) 124–141.
[35] R. Gämperle, S.D. Müller, P. Koumoutsakos, A parameter study for differential evolution, Int. J. Fuzzy Log. Intell. Syst. 10 (10) (2002) 293–298.
[36] C.J. Watkins, P. Dayan, Q-learning, Mach. Learn. 8 (3–4) (1992) 279–292.
[37] R. Dearden, N. Friedman, S. Russell, Bayesian Q-Learning, 1998, pp. 761–768.
[38] H.V. Hasselt, Double Q-Learning, 2010, pp. 2613–2621.
[39] T. Hester, M. Vecerik, O. Pietquin, M. Lanctot, T. Schaul, B. Piot, D. Horgan, J. Quan, A. Sendonaris, G. Dulac-Arnold, Deep Q-learning from demonstrations, 2017, arXiv preprint arXiv:1704.03732.
[40] P. Rakshit, A. Konar, P. Bhowmik, I. Goswami, S. Das, L.C. Jain, A.K. Nagar, Realization of an adaptive memetic algorithm using differential evolution and Q-learning: A case study in multirobot path planning, IEEE Trans. Syst. Man Cybern.: Syst. 43 (4) (2013) 814–831.
[41] Z. Li, L. Shi, C. Yue, Z. Shang, B. Qu, Differential evolution based on reinforcement learning with fitness ranking for solving multimodal multiobjective problems, Swarm Evol. Comput. 49 (2019) 234–244.
[42] P. Kim, J. Lee, An integrated method of particle swarm optimization and differential evolution, J. Mech. Sci. Technol. 23 (2) (2009) 426–434.
[43] H. Samma, J. Mohamad-Saleh, S.A. Suandi, B. Lahasan, Q-learning-based simulated annealing algorithm for constrained engineering design problems, Neural Comput. Appl. 32 (9) (2020) 5147–5161.
[44] E.H. Aarts, P.J. van Laarhoven, Simulated annealing: A pedestrian review of the theory and some applications, in: Pattern Recognition Theory and Applications, Springer, 1987, pp. 179–192.
[45] D. Bertsimas, J. Tsitsiklis, Simulated annealing, Statist. Sci. 8 (1) (1993) 10–15.
[46] D. Fouskakis, D. Draper, Stochastic optimization: A review, ISR 70 (3) (2002) 315–349.
[47] M. Pant, H. Zaheer, L. Garcia-Hernandez, A. Abraham, Differential evolution: A review of more than two decades of research, Eng. Appl. Artif. Intell. 90 (2020) 103479.
[48] X.S. Yang, Firefly algorithm, stochastic test functions and design optimisation, Int. J. Bio-Inspired Comput. 2 (2) (2010) 78–84.
[49] R. Rao, Jaya: A simple and new optimization algorithm for solving constrained and unconstrained optimization problems, Int. J. Ind. Eng. Comput. 7 (1) (2016) 19–34.
[50] S. Mirjalili, S.M. Mirjalili, A. Lewis, Grey wolf optimizer, Adv. Eng. Softw. 69 (2014) 46–61.
[51] S. Mirjalili, A. Lewis, The whale optimization algorithm, Adv. Eng. Softw. 95 (2016) 51–67.
[52] V. Ho-Huu, T. Nguyen-Thoi, T. Truong-Khac, L. Le-Anh, T. Vo-Duy, An improved differential evolution based on roulette wheel selection for shape and size optimization of truss structures with frequency constraints, Neural Comput. Appl. 29 (1) (2018) 167–185.
[53] S. Jalili, Y. Hosseinzadeh, Combining migration and differential evolution strategies for optimum design of truss structures with dynamic constraints, Iran. J. Sci. Technol. Trans. Civ. Eng. 43 (1) (2019) 289–312.
[54] M. Sonmez, Artificial bee colony algorithm for optimization of truss structures, Appl. Soft Comput. 11 (2) (2011) 2406–2418.