A Neural Network Approach For Solving Optimal Control Problems With Inequality Constraints and Some Applications
DOI 10.1007/s11063-016-9562-6
Abstract In this paper, a class of nonlinear optimal control problems with inequality constraints is considered. Based on the Karush–Kuhn–Tucker optimality conditions of nonlinear optimization problems and by constructing an error function, we define an unconstrained minimization problem. In this minimization problem, we use trial solutions for the state, Lagrange multiplier, and control functions, where the trial solutions are constructed using a two-layer perceptron. We then minimize the error function using a dynamic optimization method in which the weights and biases associated with all neurons are the unknowns. The stability and convergence analysis of the dynamic optimization scheme is also studied. Substituting the optimal values of the weights and biases into the trial solutions, we obtain the optimal solution of the original problem. Several examples are given to show the efficiency of the method. We also provide two applicable examples in robotic engineering.
1 Introduction
Control problems for systems governed by ordinary (or partial) differential equations arise in many applications, e.g., in astronautics, aeronautics, robotics, and economics [1,2]. Experimental studies of such problems date back many years, and computational approaches have been applied since the advent of the computer age. The solution of practical control systems usually presents special difficulties. Moreover, in the classical theory of control, only input–output signals are considered, and the basic deficiency of this theory is that it is only applicable for
B Alireza Nazemi
nazemi20042003@gmail.com
Rezvan Karami
rezvankarami91@gmail.com
1 Department of Mathematics, School of Mathematical Sciences, Shahrood University of Technology,
P.O. Box 3619995161-316, Shahrood, Iran
From the last paragraph, one can conclude that direct collocation methods based on a feedforward neural network scheme are suitable for solving the TPBVPs which arise from optimal control problems. On the other hand, a triple of trial functions for solving the optimality system, which can be used for the training input data at arbitrary collocation nodes, is constructed. Each trial function is based on two facts: first, it satisfies the boundary/initial conditions of the differential equation; second, it is constructed as a sum of two terms involving the perceptron parameters. Using the trial solutions based on the neural network and collocation points, the numerical problem of solving the optimality system is converted into an unconstrained minimization problem. Finally, using a proposed dynamic optimization technique, the optimal control–state pair for the problem is obtained.
It should be mentioned that the structure of this research work differs from [58] in two important aspects. The problem under investigation in this paper is an optimal control problem with inequality constraints, whereas in [58] a typical optimal control problem without inequality constraints is considered. Moreover, in [58] the authors solve the final unconstrained optimization problem via optimization algorithms in the MATLAB Optimization Toolbox, whereas in the present study we propose a dynamic optimization scheme, with proven stability and convergence properties, for solving the unconstrained minimization problem.
Consider the problem of finding the control u(τ) ∈ R^m that minimizes the functional

J(x, u) = Φ(τ, x)|_{τ_f} + ∫_{τ_0}^{τ_f} L(τ, x, u) dτ, (1)

subject to

ẋ_i = f_i(τ, x, u), i = 1, 2, ..., n, (2)
c_l(τ, x, u) ≤ 0, l = 1, 2, ..., p, (3)
x(τ_0) = x_0, (4)
where x(τ ) ∈ Rn , and τ0 , τ f are both fixed. Adjoining (2) and (3) to the functional (1) with
Lagrange multipliers λ ∈ Rn and μ ∈ R p gives
J̄ = Φ|_{τ_f} + ∫_{τ_0}^{τ_f} [ L + Σ_{i=1}^{n} λ_i (f_i − ẋ_i) + Σ_{l=1}^{p} μ_l c_l ] dτ.
0 = ∂L/∂u_s + Σ_{i=1}^{n} λ_i ∂f_i/∂u_s + Σ_{l=1}^{p} μ_l ∂c_l/∂u_s, s = 1, 2, ..., m, (7)
0 = μ_l c_l, l = 1, 2, ..., p, (8)
0 ≤ μ_l, l = 1, 2, ..., p, (9)
0 ≥ c_l, l = 1, 2, ..., p, (10)
Using the perturbed FB function, we can thus convert the NCP (8)–(10) into equality con-
straints as
φ_FB^ε(μ_l, −c_l) = 0, ε → 0⁺, l = 1, ..., p. (13)
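The excerpt does not reproduce the definition of the perturbed Fischer–Burmeister function; a common smoothed variant (assumed here purely for illustration, following Kanzow) is φ_ε(a, b) = a + b − √(a² + b² + 2ε), whose zeros satisfy a > 0, b > 0 and ab = ε, so that an equation of the form (13) enforces the complementarity conditions (8)–(10) as ε → 0⁺:

```python
import math

def phi_fb(a, b, eps=1e-9):
    """Smoothed Fischer-Burmeister function (Kanzow's variant; assumed form).
    phi_eps(a, b) = 0  <=>  a > 0, b > 0, a*b = eps."""
    return a + b - math.sqrt(a * a + b * b + 2.0 * eps)

# If a*b = eps exactly, then a^2 + b^2 + 2*eps = (a + b)^2, so phi_fb = 0.
eps = 1e-9
a = 2.0
b = eps / a                      # chosen so that a*b = eps
residual = phi_fb(a, b, eps)     # ~0 up to floating-point error

# With eps = 0 and a = b = 1 complementarity is violated, and
# phi_fb(1, 1, 0) = 2 - sqrt(2) > 0 signals the violation.
violation = phi_fb(1.0, 1.0, 0.0)
```

As ε shrinks, a root of φ_ε approaches a point satisfying μ_l ≥ 0, −c_l ≥ 0, μ_l c_l = 0.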
It can be observed that we can calculate the output value of a two-layer feed–forward network
in Fig. 2 by the following formulation:
output = Σ_{i=1}^{I} v_i σ(z_i),   z_i = w_i x + b_i, (14)
where I is the number of sigmoid units, w_i is the weight parameter from the input to the i-th hidden unit, v_i is the weight parameter from hidden unit i to the output layer, b_i is the bias of the i-th hidden unit, z_i is the input of the i-th hidden unit, and σ is an arbitrary sigmoid function. The transfer function σ is generally a one-dimensional nonlinear monotonic function (activation function) that can be easily evaluated. In the numerical examples presented in this paper, the sigmoidal function given below was used for this purpose:
σ(x) = 1 / (1 + e^{−x}). (15)
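Equations (14) and (15) describe a single-hidden-layer perceptron with a linear output; a minimal sketch for a scalar input (the array shapes and parameter values are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    # Activation function (15)
    return 1.0 / (1.0 + np.exp(-z))

def net_output(x, w, b, v):
    """Two-layer perceptron output (14): sum_i v_i * sigma(w_i * x + b_i),
    for a scalar input x and I hidden units."""
    z = w * x + b                  # hidden-unit inputs z_i
    return np.dot(v, sigmoid(z))   # linear combination of hidden outputs

# With w = b = 0 every hidden unit outputs sigma(0) = 0.5,
# so the network output is 0.5 * sum(v) regardless of x.
w = np.zeros(3)
b = np.zeros(3)
v = np.array([1.0, 2.0, 3.0])
y = net_output(0.7, w, b, v)   # 0.5 * 6 = 3.0
```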
x_T = x_0 + (τ − τ_0) n_x,
λ_T = ∂Φ/∂x |_{τ_f} + (τ − τ_f) n_λ,
μ_T = n_μ,
u_T = n_u. (17)
The trial solutions (17) are universal approximators and must satisfy the optimality conditions (5)–(12). Thus we have
ẋ_{iT} − f_{iT} = 0, i = 1, 2, ..., n, (18)

λ̇_{iT} + ∂L_T/∂x_{iT} + Σ_{i=1}^{n} λ_{iT} ∂f_{iT}/∂x_{iT} + Σ_{l=1}^{p} μ_{lT} ∂c_{lT}/∂x_{iT} = 0, i = 1, 2, ..., n, (19)

∂L_T/∂u_{sT} + Σ_{i=1}^{n} λ_{iT} ∂f_{iT}/∂u_{sT} + Σ_{l=1}^{p} μ_{lT} ∂c_{lT}/∂u_{sT} = 0, s = 1, 2, ..., m, (20)

φ_FB^ε(μ_{lT}, −c_{lT}) = 0, l = 1, 2, ..., p. (21)
In order to reformulate (18)–(21) as an unconstrained minimization problem, we first collocate the optimality system (18)–(21) at the r points τ_k, k = 1, ..., r, of the interval [τ_0, τ_f] and then define the optimization problem

minimize_y E(y) = (1/2) Σ_{k=1}^{r} { E_1(τ_k, y) + E_2(τ_k, y) + E_3(τ_k, y) + E_4(τ_k, y) }, (22)
Lemma 3.1 If y* = (w_x*, w_λ*, w_μ*, w_u*, b_x*, b_λ*, b_μ*, b_u*, v_x*, v_λ*, v_μ*, v_u*)^T satisfies η(y*) = 0, where

η(y) = ( E_1(τ_1, y), ..., E_1(τ_r, y), E_2(τ_1, y), ..., E_2(τ_r, y), E_3(τ_1, y), ..., E_3(τ_r, y), E_4(τ_1, y), ..., E_4(τ_r, y) )^T, (24)

then y* is an optimal solution of (22).
Proof Let η(y*) = 0. Then E_i(τ_k, y*) = 0 for i = 1, 2, 3, 4 and k = 1, ..., r. Since E(y) ≥ 0 in (22), y* is an optimal solution of (22).
Now by Lemma 3.1, we can easily verify that the minimization problem (22) is equivalent
to the following problem:
minimize_y E(y) = (1/2) ‖η(y)‖². (25)
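Problem (25) is an ordinary nonlinear least-squares objective in the network parameters: stack all collocated residuals into η(y) and halve its squared norm. A sketch with placeholder residual functions standing in for the collocated conditions (18)–(21) (everything below except the structure of (24)–(25) is an illustrative assumption):

```python
import numpy as np

# Collocation nodes on [tau0, tau_f] (values here are illustrative)
taus = np.linspace(0.0, 1.0, 11)

def eta(y, taus):
    """Stack the residuals E_1..E_4 at every collocation node into one
    vector, as in (24). The residual functions are placeholders."""
    residuals = []
    for E_j in (E1, E2, E3, E4):
        residuals.extend(E_j(t, y) for t in taus)
    return np.array(residuals)

def E(y, taus):
    """Error function (25): E(y) = 0.5 * ||eta(y)||^2."""
    r = eta(y, taus)
    return 0.5 * float(np.dot(r, r))

# Toy placeholder residuals so the sketch runs end to end
E1 = lambda t, y: y[0] - t       # stands in for the state equation (18)
E2 = lambda t, y: y[1]           # stands in for the costate equation (19)
E3 = lambda t, y: y[2]           # stands in for stationarity (20)
E4 = lambda t, y: 0.0            # stands in for complementarity (21)

value = E(np.zeros(3), taus)     # 0.5 * sum(t^2) over the nodes = 1.925
```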
In order to solve (25), which is an unconstrained optimization problem, one can use any optimization algorithm, such as steepest descent, Newton, quasi-Newton, or conjugate gradient methods [74–76], as well as heuristic algorithms such as genetic algorithms, particle swarm optimization, ant colony search, and tabu search [77]. However, in the next section, we propose a new approach based on a dynamic optimization method for solving (25). The convergence properties of this technique are also rigorously established.
Let w_x(·), w_λ(·), w_μ(·), w_u(·), b_x(·), b_λ(·), b_μ(·), b_u(·), v_x(·), v_λ(·), v_μ(·) and v_u(·) be time-dependent variables. Our aim is to design a dynamic model that settles down to an equilibrium point, which is also a stationary point of the energy function E(y). We propose the following dynamic model for solving (22):
dy(t)/dt = −κ ∇E(y(t)), κ > 0, (26)
y(0) = y_0, (27)
where κ is a scaling factor and indicates the convergence rate of (26) and (27).
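Following the flow (26)–(27) numerically amounts to integrating a gradient system; a forward-Euler sketch on a toy quadratic energy (the energy, step size, and iteration count are illustrative assumptions, not the paper's E):

```python
import numpy as np

def gradient_flow(grad_E, y0, kappa=1.0, dt=0.01, steps=2000):
    """Forward-Euler integration of dy/dt = -kappa * grad_E(y), (26)-(27).
    Returns the final state, which approximates an equilibrium point."""
    y = np.asarray(y0, dtype=float)
    for _ in range(steps):
        y = y - dt * kappa * grad_E(y)
    return y

# Toy energy E(y) = 0.5 * ||y||^2, so grad E(y) = y and the unique
# equilibrium is y* = 0; the iterates decay like (1 - kappa*dt)^k.
y_final = gradient_flow(lambda y: y, y0=[3.0, -4.0])
```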
We now recall some preliminary results and background material on ordinary differential equations that will play an important role in the subsequent analysis. In what follows, ‖·‖ denotes the l₂-norm on R^n, ᵀ denotes the transpose, and x = (x_1, x_2, ..., x_n)^T. If F : R^n → R is differentiable, then ∇F ∈ R^n stands for its gradient. For any differentiable mapping F = (F_1, ..., F_m)^T : R^n → R^m, ∇F(x) = [∇F_1(x), ..., ∇F_m(x)] ∈ R^{n×m} denotes the transposed Jacobian of F at x.
(a) An isolated equilibrium point x* of a system ẋ = F(x) is Lyapunov stable if there exists a Lyapunov function over some neighborhood Ω* of x*.
(b) An isolated equilibrium point x* of a system ẋ = F(x) is asymptotically stable if there is a Lyapunov function ζ over some neighborhood Ω* of x* such that dζ(x(t))/dt < 0 for all x ∈ Ω* \ {x*}.
Definition 4.3 [75] Let x(t) be a solution trajectory of a system ẋ = F(x), and let X* denote the set of equilibrium points of this equation. The solution trajectory of the system is said to be globally convergent to the set X* if x(t) satisfies

lim_{t→∞} dist(x(t), X*) = 0,

where dist(x(t), X*) = inf_{y∈X*} ‖x − y‖. In particular, if the set X* has only one point x*, then lim_{t→∞} x(t) = x*, and the system is said to be globally asymptotically stable at x* in the sense of Lyapunov.

Lemma 4.4 [75] If A is an n × n nonsingular matrix, then the homogeneous system AX = 0 has only the trivial solution X = 0.
Consider the unconstrained problem

minimize f(x) subject to x ∈ R^n.

In this section, the stability and convergence properties of the dynamic model (26) and (27) are rigorously studied.
Theorem 5.1 Let y* be an equilibrium point of the model (26) and (27), and suppose the Jacobian matrix of η(y) in (24) is nonsingular. Then y* is an optimal solution of (22). Conversely, if y* is an optimal solution of (22), then y* is an equilibrium point of the framework (26) and (27).

Proof Suppose that y* is an equilibrium point of the dynamic model (26) and (27). It is clear that ∇E(y*) = 0. A simple calculation shows that

∇E(y*) = ∇η(y*) η(y*) = 0,

where ∇η(y) is the Jacobian matrix of η(y). Using Lemma 4.4 and the nonsingularity of ∇η(y*), we get η(y*) = 0, i.e., y* is an optimal solution of (22). By Theorem 4.5, the converse is straightforward.
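The argument rests on the chain-rule identity ∇E(y) = ∇η(y) η(y) for E(y) = (1/2)‖η(y)‖², with ∇η the transposed Jacobian. The identity can be checked numerically on any smooth residual map; a finite-difference sketch with an arbitrary toy η (not the paper's residual system):

```python
import numpy as np

def eta(y):
    # Arbitrary smooth residual map R^2 -> R^2 (purely illustrative)
    return np.array([y[0] ** 2 - y[1], np.sin(y[0]) + y[1]])

def jac_eta(y):
    # Jacobian of eta; row j is the gradient of component j
    return np.array([[2.0 * y[0], -1.0],
                     [np.cos(y[0]), 1.0]])

def E(y):
    r = eta(y)
    return 0.5 * float(np.dot(r, r))

def grad_E(y):
    # Chain rule: grad E(y) = J_eta(y)^T eta(y)
    return jac_eta(y).T @ eta(y)

# Central finite differences confirm the analytic gradient
y = np.array([0.3, -0.7])
h = 1e-6
fd = np.array([(E(y + h * e) - E(y - h * e)) / (2 * h) for e in np.eye(2)])
```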
Theorem 5.2 Let y* be an isolated equilibrium point of (26) and (27), and suppose the Jacobian matrix of η(y) in (24) is nonsingular. Then y* is asymptotically stable for (26) and (27).

Proof First, notice that E(y) ≥ 0 and E(y*) = 0. In addition, since y* is an isolated equilibrium point of (26) and (27), there exists a neighborhood Ω* ⊆ R^{4I(2n+m+p)} of y* such that
We claim that E(y) > 0 for any y ∈ Ω* \ {y*}. Otherwise, if there is a ŷ ∈ Ω* \ {y*} satisfying E(ŷ) = 0, then ∇E(ŷ) = 0, i.e., ŷ is also an equilibrium point of (26), which clearly contradicts the assumption that y* is an isolated equilibrium point in Ω*. Moreover,

dE(y(t))/dt = [∇E(y(t))]^T dy(t)/dt = −κ ‖∇E(y(t))‖² ≤ 0, (28)

and

dE(y(t))/dt < 0, ∀ y(t) ∈ Ω* with y(t) ≠ y*.

By Lemma 4.2(b), this implies that y* is asymptotically stable.
Proposition 5.3 (a) For any initial state y_0 = y(t_0), there exists exactly one maximal solution y(t, y_0), t ∈ [t_0, β(y_0)), of the model (26) and (27).
(b) If the level set L(y_0) = {y ∈ R^{4I(2n+p+r)} : E(y) ≤ E(y_0)} is bounded, then β(y_0) = +∞.

Proof Suppose, on the contrary, that β(y_0) < +∞; then

lim_{t→β(y_0)} ‖y(t)‖ = ∞.
Let
and
We know that y(β_0) lies on the boundary of L(y_0) and L^C(y_0). Moreover, L(y_0) is compact, since it is bounded by assumption and closed by the continuity of E(y). Therefore, we have y(β_0) ∈ L(y_0) and β_0 < β(y_0), implying that

E(y(s)) > E(y_0) > E(y(β_0)) for some s ∈ (β_0, β(y_0)). (29)

However, Theorem 5.2 says that E is nonincreasing on [t_0, β(y_0)), which contradicts (29). This completes the proof of (b).
Theorem 5.4 Suppose that y = y(t, y_0) is a trajectory of (26) and (27) with initial point y_0 = y(0, y_0), and that the level set L(y_0) = {y ∈ R^{3I(2n+p+r)} : E(y) ≤ E(y_0)} is bounded. Then
(a) γ⁺(y_0) = {y(t, y_0) | t ≥ 0} is bounded;
(b) there exists ȳ such that lim_{t→∞} y(t, y_0) = ȳ;
(c) if the Jacobian matrix of η(y) in (24) is nonsingular, then for every y_0 ∈ R^{3I(2n+p+r)} the corresponding trajectory of (26) and (27) converges to an optimal solution of (22).
Proof (a) Calculating the derivative of E(y) along the trajectory y(t, y_0), t ≥ 0, one has

dE(y(t))/dt = [∇E(y(t))]^T dy(t)/dt = −κ ‖∇E(y(t))‖² ≤ 0. (30)

Thus, along the trajectory y = y(t, y_0), t ≥ 0, E(y) is monotone nonincreasing. Therefore γ⁺(y_0) ⊆ L(y_0), that is, γ⁺(y_0) = {y(t, y_0) | t ≥ 0} is bounded.

(b) By (a), γ⁺(y_0) = {y(t, y_0) | t ≥ 0} is a bounded set of points. Take a strictly monotone increasing sequence {t̄_n}, 0 < t̄_1 < ··· < t̄_n, t̄_n → +∞; then {y(t̄_n, y_0)} is a bounded sequence composed of infinitely many points. Thus there exists a limit point ȳ, that is, there exists a subsequence {t_n} ⊆ {t̄_n}, t_n → +∞, such that lim_{n→+∞} y(t_n, y_0) = ȳ, which indicates that ȳ is an ω-limit point of γ⁺(y_0). Using the LaSalle invariance theorem (see [80]), one has that y(t, y_0) → ȳ ∈ M as t → ∞, where M is the largest invariant set in K = {y(t, y_0) | dE(y(t, y_0))/dt = 0}.

(c) By (a) and (b), ȳ is an equilibrium point of (26) and (27). Since ∇η is nonsingular, it follows from (30) that

dw_x/dt = dw_λ/dt = ··· = dv_u/dt = 0 ⟺ dE(y(t))/dt = 0,

so ȳ ∈ D* by M ⊆ K ⊆ D*, where D* denotes the optimal solution set of (22). Therefore, from any initial state y_0, the trajectory y(t, y_0) of (26) and (27) converges to an optimal solution of (22). The proof is complete.
Theorems 5.2 and 5.4 show that the proposed model (26) and (27) is asymptotically stable and converges to an exact solution of (22) in the large. In particular, if (22) has a unique minimum point [i.e., there exists a unique equilibrium point of (26) and (27)], then the model is globally, uniformly, and asymptotically stable.
6 Numerical Examples
In this section, we implement the proposed algorithm to solve several problems. For each problem, we use a suitable number of parameters for the input, output, and bias weights, and we discretize each interval into a suitable number of equal parts.
Fig. 3 state and control function for Example 6.1. a Exact and approximated state function. b Exact and
approximated control function
Because x(1) is free, we have p(1) = 0. Considering this condition and the initial condition
x(0) = 1, we can choose the trial solutions as:
x_T = 1 + t n_x,
λ_T = (t − 1) n_λ,
u_T = n_u,
μ_{1T} = n_{μ_1},
μ_{2T} = n_{μ_2}.
We can see the approximate solutions for x(t) and u(t) in Fig. 3, respectively.
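By construction, the trial forms above satisfy x_T(0) = 1 and λ_T(1) = 0 for any values of the network parameters, which is the point of the ansatz (17); a quick sketch (the network term n_x and the random parameters are arbitrary stand-ins):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def n_net(t, w, b, v):
    # Arbitrary two-layer perceptron standing in for a network term
    return float(np.dot(v, sigmoid(w * t + b)))

def x_trial(t, w, b, v):
    # Trial state of Example 6.1: x_T(t) = 1 + t * n_x(t)
    return 1.0 + t * n_net(t, w, b, v)

def lam_trial(t, w, b, v):
    # Trial costate: lambda_T(t) = (t - 1) * n_lambda(t)
    return (t - 1.0) * n_net(t, w, b, v)

# Regardless of the (random) weights, the boundary conditions hold exactly
rng = np.random.default_rng(0)
w, b, v = rng.normal(size=3), rng.normal(size=3), rng.normal(size=3)
```

Training therefore only has to reduce the residuals of the optimality system; the boundary and transversality conditions never need to be penalized.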
Because x(1) is free, we have p(1) = 0. Considering this condition and the initial condition
x(0) = 0, we can choose the trial solutions as:
x_T = t n_x,
λ_T = (t − 1) n_λ,
u_T = n_u,
μ_{1T} = n_{μ_1},
μ_{2T} = n_{μ_2},
μ_{3T} = n_{μ_3}.
We can see the approximate solutions for x(t) and u(t) in Fig. 4, respectively.
Because x1(3) and x2(3) are free, we have p1(3) = p2(3) = 0. Considering this condition and the initial conditions x1(0) = 2, x2(0) = 0, we can choose the trial solutions as:
Fig. 4 State and control functions for Example 6.2. a Exact and approximated state function. b Exact and approximated control function
x_{1T} = 2 + t n_{x_1},
x_{2T} = t n_{x_2},
λ_{1T} = (t − 3) n_{λ_1},
λ_{2T} = (t − 3) n_{λ_2},
u_T = n_u,
μ_{1T} = n_{μ_1},
μ_{2T} = n_{μ_2},
μ_{3T} = n_{μ_3}.
We can see the approximate solutions for x1 (t), x2 (t) and u(t) in Fig. 5, respectively.
Fig. 5 State and control functions for Example 6.3. a, b Exact and approximated state functions. c Exact and approximated control function
subject to
ẋ(t) = −ax(t) + u(t),
x(0) = 5, x(T) is free.
Assuming H = 5, a = −0.2 and T = 15, the exact state and control functions are as follows:

x(t) = B exp(−t/5) − 5A exp(t/5),
u(t) = −2A exp(t/5),

where

A = 25 / (26 exp(6) − 25),   B = 26A exp(6) / 5.
Because x(15) is free, we have p(15) = 0. Considering this condition and the initial condition
x(0) = 5, we can choose the trial solutions as:
x_T = 5 + t n_x,
λ_T = (t − 15) n_λ + 5 x(15),
u_T = n_u.
We can see the approximate solutions for x(t) and u(t) in Fig. 6, respectively.
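The closed-form pair (x, u) above can be sanity-checked numerically; with these constants it satisfies the dynamics in the form ẋ(t) = −0.2 x(t) + u(t) together with x(0) = 5 (this sign convention is an assumption inferred from the printed solution):

```python
import math

A = 25.0 / (26.0 * math.exp(6) - 25.0)
B = 26.0 * A * math.exp(6) / 5.0

def x(t):
    # Exact state of Example 6.4
    return B * math.exp(-t / 5.0) - 5.0 * A * math.exp(t / 5.0)

def u(t):
    # Exact control of Example 6.4
    return -2.0 * A * math.exp(t / 5.0)

def x_dot(t):
    # Analytic derivative of x(t)
    return -B / 5.0 * math.exp(-t / 5.0) - A * math.exp(t / 5.0)

# Residual of the state equation x' = -0.2*x + u over a grid on [0, 15]
residual = max(abs(x_dot(t) - (-0.2 * x(t) + u(t)))
               for t in [0.0, 3.75, 7.5, 11.25, 15.0])
```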
Fig. 6 state and control function for Example 6.4. a Exact and approximated state function. b Exact and
approximated control function
Fig. 7 state and control function for Example 6.5. a Exact and approximated state function. b Exact and
approximated control function
Because x(1) is free, we have p(1) = 0. Considering this condition and the initial condition
x(0) = 1, we can choose the trial solutions as:
x_T = 1 + t n_x,
λ_T = (t − 1) n_λ,
u_T = n_u.
We can see the approximate solutions for x(t) and u(t) in Fig. 8, respectively.
Fig. 8 state and control function for Example 6.6. a Exact and approximated state function. b Exact and
approximated control function
subject to
ẋ(t) = 0.5x²(t) sin(x(t)) + u(t),
x(0) = 0, x(1) = 0.5.
This system is solved by the Euler method (see [78]), and the results are displayed and compared with the results obtained by the neural networks in Fig. 9. We choose the trial solutions as:

x_T = t(1 − t) n_x + 0.5t,
λ_T = n_λ,
u_T = n_u.
We can see the approximate solutions for x(t) and u(t) in Fig. 9, respectively.
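For comparison with the Euler-based reference solution mentioned above, the two-point boundary-value problem of this example can also be attacked by simple shooting; the sketch below bisects over a constant control c so that the Euler trajectory of ẋ = 0.5x² sin(x) + c reaches x(1) = 0.5 (the constant-control restriction is purely illustrative, not the paper's method):

```python
import math

def simulate(c, n=2000):
    """Forward-Euler integration of x' = 0.5*x^2*sin(x) + c on [0, 1]
    from x(0) = 0; returns the terminal value x(1)."""
    x, dt = 0.0, 1.0 / n
    for _ in range(n):
        x += dt * (0.5 * x * x * math.sin(x) + c)
    return x

# Bisection on the constant control c so that x(1) = 0.5:
# simulate(0) = 0 < 0.5 and simulate(1) > 0.5, and the terminal
# state is increasing in c, so the root is bracketed.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if simulate(mid) < 0.5:
        lo = mid
    else:
        hi = mid
c_star = 0.5 * (lo + hi)
```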
Fig. 9 state and control function for Example 6.7. a Exact and approximated state function. b Exact and
approximated control function
Example 6.8 A general form of the optimal path planning problem for a single rigid, free-flying object A, regarded as a point, can be stated as the following optimization problem:
minimize I(q(t), q̇(t)) = ∫_0^T F_0(t, q(t), q̇(t)) dt, (31)

subject to

φ_i(t, q(t), q̇(t)) > 0, i = 1, 2, ..., s, t ∈ [0, T], (32)
q(0) = q_init, q(T) = q_goal, (33)
as the length of the path, and the i-th inequality in (32) is equivalent to the image complement of the i-th obstacle, B_i(t), in C, that is, the complement of

CB_i(t) = {q : A(q) ∩ B_i(t) ≠ ∅}, ∀t ∈ [0, T],

(see [81]), where A(q) is the subset of the workspace occupied by the object A at configuration q.
We emphasize that in our examples all stationary or moving obstacles are considered as r_i-radius circles or spheres, i = 1, 2, ..., s, in the plane or in space, respectively. If, for instance, the i-th obstacle in the plane at the moment τ is a non-circular geometrical shape ϒ_i = B_i(τ) with compact boundary ∂ϒ_i, then one can cover ∂ϒ_i by a finite number of circles C_l, l = 1, 2, ..., L, such that C_l ⊆ CB_i(τ), l = 1, 2, ..., L (see Fig. 10). Thus we can replace the obstacle B_i(τ) in our optimization problem by the corresponding L conditions. Now define the artificial control function u(·) by

q̇(t) = u(t).
The following mixed-constrained optimal control problem represents the optimization problem (31)–(33):

minimize I(q(t), u(t)) = ∫_0^T F_0(t, q(t), u(t)) dt, (34)

subject to

q̇(t) = u(t), (35)
φ_i(t, q(t), u(t)) > 0, i = 1, 2, ..., s, t ∈ [0, T], (36)
q(0) = q_init, q(T) = q_goal. (37)
As an example, we consider an optimal path planning problem with moving obstacles. The converted optimal control problem corresponding to the original problem is as follows:

minimize ∫_0^1 [(u_1(t))² + (u_2(t))²] dt,

subject to

q̇_1(t) = u_1(t),
q̇_2(t) = u_2(t),
φ_i(t, q(t), u(t)) > 0, i = 1, 2, ..., s,
q_1(0) = q_2(0) = 0,
Fig. 11 Approximate optimal trajectory with one moving obstacle for Example 6.8
where
We consider the problem in two cases: First only one moving obstacle is considered as:
where

α(t) = t,   β(t) = 0.5,   r = 1/8.
Considering the initial and boundary conditions, we can choose the trial solutions as:

q_{1T} = t + t(t − 1) n_{q_1},
q_{2T} = t + t(t − 1) n_{q_2},
λ_{1T} = n_{λ_1},
λ_{2T} = n_{λ_2},
μ_{1T} = n_{μ_1},
u_{1T} = n_{u_1},
u_{2T} = n_{u_2}.
The position of approximate optimal trajectory and moving obstacle are shown in Fig. 11.
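Since each obstacle is a moving disc, the corresponding path constraint in (36) can be written as the clearance condition φ(t, q) = (q_1 − α(t))² + (q_2 − β(t))² − r² > 0; this explicit form is inferred from the disc-obstacle description and is not printed in the excerpt. A sketch:

```python
def clearance(t, q1, q2, alpha, beta, r):
    """Signed clearance of the point (q1, q2) from a moving disc obstacle
    of radius r centred at (alpha(t), beta(t)); positive means outside."""
    dx = q1 - alpha(t)
    dy = q2 - beta(t)
    return dx * dx + dy * dy - r * r

# First obstacle of this example: centre (t, 0.5), radius 1/8
alpha = lambda t: t
beta = lambda t: 0.5
r = 1.0 / 8.0

outside = clearance(0.5, 0.0, 0.0, alpha, beta, r)   # point far away: > 0
inside = clearance(0.5, 0.5, 0.5, alpha, beta, r)    # disc centre: -r^2 < 0
```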
In the second case we consider two moving obstacles, where

α_1(t) = t,   β_1(t) = 0.5,   r_1 = 1/8,
α_2(t) = t,   β_2(t) = −1.2t² + 1.2t + 0.7,   r_2 = 1/8.
Fig. 12 Approximate optimal trajectory with two moving obstacles for Example 6.8
Similar to the previous case, we can choose the trial solutions as:
q_{1T} = t + t(t − 1) n_{q_1},
q_{2T} = t + t(t − 1) n_{q_2},
λ_{1T} = n_{λ_1},
λ_{2T} = n_{λ_2},
μ_{1T} = n_{μ_1},
μ_{2T} = n_{μ_2},
u_{1T} = n_{u_1},
u_{2T} = n_{u_2}.
The position of moving obstacles and approximate optimal trajectories are shown in Fig. 12.
Example 6.9 Consider a planar floating robot, modeled as a rigid body floating freely on a flat table, shown in Fig. 13. This robot can be maneuvered using gas thrusters (see [82]). The floating robot is described by three coordinates: (x, y) is the Cartesian position of its center of mass relative to an inertially fixed coordinate frame, and θ is the angular orientation of a coordinate frame fixed on the robot relative to the X axis of the fixed frame. Two quadridirectional gas thrusters that can fire jets of air along the positive and negative coordinate axes X_B and Y_B are mounted on the floating robot at coordinates (−l, −d) and (l, −d), as shown in the figure. The six state variables of the robot are x_1 = x, x_2 = ẋ, x_3 = y, x_4 = ẏ, x_5 = θ, x_6 = θ̇. Its first-order differential equations of motion are:
ẋ_1(t) = x_2(t),
ẋ_2(t) = [(u_1(t) + u_3(t))/10] cos(x_5(t)) − [(u_2(t) + u_4(t))/10] sin(x_5(t)),
ẋ_3(t) = x_4(t),
ẋ_4(t) = [(u_1(t) + u_3(t))/10] sin(x_5(t)) − [(u_2(t) + u_4(t))/10] cos(x_5(t)),
ẋ_5(t) = x_6(t),
ẋ_6(t) = (5/12)(u_1(t) + u_3(t)) + (5/12)(u_4(t) − u_2(t)). (39)
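The equations of motion can be simulated directly; a forward-Euler sketch (the reconstructed right-hand sides and the thruster firing pattern are illustrative assumptions):

```python
import math

def step(x, u, dt):
    """One forward-Euler step of the floating-robot dynamics (39).
    x = (x1..x6) is the state, u = (u1..u4) the thruster inputs."""
    x1, x2, x3, x4, x5, x6 = x
    u1, u2, u3, u4 = u
    fa = (u1 + u3) / 10.0      # longitudinal thrust / mass
    fb = (u2 + u4) / 10.0      # lateral thrust / mass
    dx = (
        x2,
        fa * math.cos(x5) - fb * math.sin(x5),
        x4,
        fa * math.sin(x5) - fb * math.cos(x5),
        x6,
        (5.0 / 12.0) * (u1 + u3) + (5.0 / 12.0) * (u4 - u2),
    )
    return tuple(xi + dt * di for xi, di in zip(x, dx))

# Fire thruster 1 only: the angular acceleration is constant (5/12),
# so the angular rate grows linearly and x6(5) = 25/12.
x = (0.0,) * 6
dt, n = 0.001, 5000
for _ in range(n):
    x = step(x, (1.0, 0.0, 0.0, 0.0), dt)
```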
The floating robot is used to capture floating bodies on the table. For this task, it is required to find the control inputs u_1(t), ..., u_4(t) that steer the floating robot from x_1(0) = x_2(0) = ··· = x_6(0) = 0 to x_1(t_f) = x_{1f}, x_3(t_f) = x_{3f}, x_5(t_f) = x_{5f}, with zero approach velocity, i.e., x_2(t_f) = x_4(t_f) = x_6(t_f) = 0. The force input in this maneuver is supplied by on-board compressed air, and the cost associated with a maneuver is written as

∫_0^{t_f} [(u_1(t))² + (u_2(t))² + (u_3(t))² + (u_4(t))²] dt.
As an example, we use d = 5, l = 5, t_f = 5, and the terminal states are required to satisfy the constraints

x_1(5) = 4, x_2(5) = 0, x_3(5) = 4, x_4(5) = 0, x_5(5) = π/4, x_6(5) = 0.
Considering these initial and boundary conditions, we can choose the trial solutions as:

x_{1T} = 0.8t + t(t − 5) n_{x_1},
x_{2T} = t(t − 5) n_{x_2},
x_{3T} = 0.8t + t(t − 5) n_{x_3},
x_{4T} = t(t − 5) n_{x_4},
x_{5T} = 0.2(π/4) t + t(t − 5) n_{x_5},
x_{6T} = t(t − 5) n_{x_6},
λ_{1T} = n_{λ_1},
λ_{2T} = n_{λ_2},
λ_{3T} = n_{λ_3},
λ_{4T} = n_{λ_4},
λ_{5T} = n_{λ_5},
λ_{6T} = n_{λ_6},
u_{1T} = n_{u_1},
u_{2T} = n_{u_2},
u_{3T} = n_{u_3},
u_{4T} = n_{u_4}.
The numerical solution obtained for this problem is shown in Figs. 14, 15, 16, 17, 18 and 19. It can be seen that the desired terminal constraints have been achieved.
To end this section, we answer a natural question: what are the practical and computational advantages of the proposed method compared with existing, generally available algorithms? To answer this, we summarize below what we have observed from the numerical experiments and theoretical results.
• Feedforward neural network frameworks can be applied to problems that have no analytical solutions or whose analytical solutions cannot be determined directly.
• Unlike other methods, there is no ill-conditioning of the resulting linear systems, as in expansion methods, and no need for a special relation among the step sizes along different axes, as in finite difference methods.
• The solution of the problem is continuous over the whole domain. In contrast, standard numerical methods provide solutions only at discrete points, and the solution between these points must be interpolated.
• The computational complexity does not increase considerably with the number of sampling points or with the number of dimensions involved in the problem.
• The round-off error propagation of standard numerical methods does not affect the neural network solution.
• Some other advantages of the proposed method include: the feasibility of solving time-dependent systems of ODEs and PDEs [41]; the extension of the method to higher-order and nonlinear problems [39,44]; determining the approximate solution of the problem in closed analytical form [42]; the evaluation of a fast solution with a small number of parameters in real applications [43,46]; and the use of parallel processing and the possibility of hardware implementation of the method in the form of neural chips [39].
7 Conclusion
In this paper, a hybrid method based on artificial neural networks, optimization techniques and collocation methods is proposed to determine the approximate solution of optimal control problems with inequality constraints in a closed analytical form. The method requires a minimal amount of effort to implement and produces a good-quality solution within a reasonable amount of time. The results of the numerical simulations presented indicate the feasibility and efficiency of the proposed technique for solving optimal control problems.
Conflicts of interest The authors declare that they have no conflict of interest.
References
1. Betts JT (2001) Practical methods for optimal control using nonlinear programming. SIAM, Philadelphia
2. Betts JT, Erb SO (2003) Optimal low thrust trajectories to the moon. SIAM J Appl Dyn Syst 2:144–170
3. Ascher UM, Mattheij RMM, Russel RD (1995) Numerical solution of boundary value problems for
ordinary differential equations. SIAM, Philadelphia
4. Saberi Nik H, Effati S, Yildirim A (2012) Solution of linear optimal control systems by differential
transform method. Neural Comput Appl. doi:10.1007/s00521-012-1073-4
5. Shamsyeh Zahedi M, Saberi Nik H (2013) On homotopy analysis method applied to linear optimal control
problems. Appl Math Model 37:9617–9629
6. Tohidi E, Saberi Nik H (2015) A Bessel collocation method for solving fractional optimal control problems. Appl Math Model 39:455–465
7. Jajarmi A, Pariz N, Vahidian A, Effati S (2011) A novel modal series representation approach to solve a
class of nonlinear optimal control problems. Int J Innov Comput Inf Control 7:1413–1425
8. Tang GY (2005) Suboptimal control for nonlinear systems: a successive approximation approach. Syst
Control Lett 54:429–434
9. Yousefi SA, Dehghan M, Lotfi A (2010) Finding the optimal control of linear systems via He's variational iteration method. Int J Comput Math 87:1042–1050
10. ElKady M, Elbarbary EME (2002) A Chebyshev expansion method for solving optimal control problems.
Appl Math Comput 129:171–182
11. Elnagar GN (1997) State-control spectral Chebyshev parameterization for linearly constrained quadratic
optimal control problems. J Comput Appl Math 79:19–40
12. Hsu NS, Chang B (1989) Analysis and optimal control of time-varying linear systems via block-pulse
functions. Int J Control 33:1107–1122
13. Kekkeris GT, Paraskevopoulos PN (1988) Hermite series approach to optimal control. Int J Control
47:557–567
14. Hosseinpour S, Nazemi AR (2016) Solving fractional optimal control problems with fixed or free final
states by Haar wavelet collocation method. IMA J Math Control Inf 33:543–561
15. Mansoori M, Nazemi AR (2016) Solving infinite-horizon optimal control problems of the time-delayed
systems by Haar wavelet collocation method. Comput Appl Math 35:97–117
16. Nazemi AR, Mahmoudy N (2014) Solving infinite-horizon optimal control problems using Haar wavelet
collocation method. ANZIAM J (formerly: J Aust Math Soc Ser B ) 56:179–191
17. Nazemi AR, Mansoori M (2016) Optimal control problems of the time-delayed systems by Haar wavelet.
J Vib Control 22:2657–2670
18. Nazemi AR, Shabani MM (2015) Numerical solution of the time-delayed optimal control problems with hybrid functions. IMA J Math Control Inf 32:623–638
19. Pontryagin LS, Boltyanskii V, Gamkrelidze R, Mischenko E (1962) The mathematical theory of optimal
processes. Wiley Interscience, New York
20. Schalkoff RJ (1997) Artificial neural networks. McGraw-Hill, New York
21. Picton P (2000) Neural networks, 2nd edn. Palgrave, Basingstoke
22. Minsky M, Papert S (1969) Perceptrons. MIT Press, Cambridge
23. Haykin S (1999) Neural networks: a comprehensive foundation. Prentice Hall, Upper Saddle River
24. Khanna T (1990) Foundations of neural networks. Addison-Wesley, Reading
25. Stanley J (1990) Introduction to neural networks, 3rd edn. California Scientific Software, Sierra Mardre
26. Lippmann RP (1987) An introduction to computing with neural nets. IEEE ASSP Mag 4(2):4–22
27. Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2(5):359–366
28. Lapedes A, Farber R (1988) How neural nets work. In: Anderson DZ (ed) Neural information processing
systems. AIP, New York
29. Mckeown JJ, Stella F, Hall G (1997) Some numerical aspects of the training problem for feed-forward
neural nets. Neural Netw 10(8):1455–1463
30. Haykin S (2007) Neural networks : a comprehensive foundation, 3rd edn. Prentice-Hall, Upper Saddle
River
31. Graupe D (2007) Principles of artificial neural networks, 2nd edn. World Scientific, Singapore
32. Tang H, Tan KC, Yi Z (2007) Neural networks: computational models and applications. Springer, Berlin
33. Muller B, Reinhardt J, Strickland MT (2002) Neural networks: an introduction, 2nd edn. Springer-Verlag,
Berlin
34. Picton P (2000) Neural networks, 2nd edn. Palgrave, Basingstoke
35. Fine TL (1999) Feed forward neural network methodology. Springer-Verlag, New York
36. Schalkoff RJ (1997) Artificial neural networks. McGraw-Hill, New York
37. Ellacott SW (1997) Mathematics of neural networks: models, algorithms and applications. Kluwer Aca-
demic Publishers, Boston
38. Lagaris IE, Likas A, Fotiadis DI (1998) Artificial neural networks for solving ordinary and partial differ-
ential equations. IEEE Trans Neural Netw 9:987–1000
39. Malek A, Shekari Beidokhti R (2006) Numerical solution for high order differential equations using a hybrid neural network–optimization method. Appl Math Comput 183:260–271
40. Shekari Beidokhti R, Malek A (2009) Solving initial-boundary value problems for systems of partial
differential equations using neural networks and optimization techniques. J Franklin Inst 346:898–913
41. Tsoulos IG, Gavrilis D, Glavas E (2009) Solving differential equations with constructed neural networks.
Neurocomputing 72:2385–2391
42. Kumar M, Yadav N (2011) Multilayer perceptrons and radial basis function neural network methods for
the solution of differential equations: A survey. Comput Math Appl 62:3796–3811
43. Dua V (2011) An artificial neural network approximation based decomposition approach for parameter
estimation of system of ordinary differential equations. Comput Chem Eng 35:545–553
44. Shirvany Y, Hayati M, Moradian R (2008) Numerical solution of the nonlinear Schrödinger equation by
feedforward neural networks. Commun Nonlinear Sci Numer Simul 13:2132–2145
45. Shirvany Y, Hayati M, Moradian R (2009) Multilayer perceptron neural networks with novel unsupervised
training method for numerical solution of the partial differential equations. Appl Soft Comput 9:20–29
46. Balasubramaniam P, Kumaresan N (2008) Solution of generalized matrix Riccati differential equation
for indefinite stochastic linear quadratic singular system using neural networks. Appl Math Comput
204:671–679
47. Becerikli Y, Konar AF, Samad T (2003) Intelligent optimal control with dynamic neural networks.
Neural Netw 16:251–259
48. Sarangapani J (2006) Neural network control of nonlinear discrete-time systems. Taylor & Francis, Boca
Raton
49. Lewis FL, Ge SS (2006) Neural networks in feedback control systems. In: Kutz M (ed) Mechanical
engineers’ handbook: instrumentation, systems, controls, and MEMS, chap. 19, vol 2, 3rd edn. Wiley,
New York
50. Gnecco G, Sanguineti M (2010) Suboptimal solutions to dynamic optimization problems via approxima-
tions of the policy functions. J Optim Theory Appl 146:764–794
51. Gaggero M, Gnecco G, Sanguineti M (2014) Approximate dynamic programming for stochastic N-stage
optimization with application to optimal consumption under uncertainty. Comput Optim Appl 58:31–85
52. Liu L, Wang ZS, Zhang HG (2016) Adaptive fault-tolerant tracking control for MIMO discrete-time
systems via reinforcement learning algorithm with less learning parameters. IEEE Trans Autom Sci Eng.
doi:10.1109/TASE.2016.2517155
53. Wang Z, Liu L, Shan Q, Zhang H (2015) Stability criteria for recurrent neural networks with time-varying
delay based on secondary delay partitioning method. IEEE Trans Neural Netw Learn Syst 26:2589–2595
54. Wang ZS, Liu L, Zhang HG, Xiao GY (2016) Fault-tolerant controller design for a class of nonlinear
MIMO discrete-time systems via online reinforcement learning algorithm. IEEE Trans Syst Man Cybern
46:611–622
55. Vrabie D, Lewis F (2009) Neural network approach to continuous-time direct adaptive optimal control
for partially unknown nonlinear systems. Neural Netw 22:237–246
56. Vrabie D, Lewis F, Levine D (2008) Neural network-based adaptive optimal controller: a continuous-time
formulation. Commun Comput Inf Sci 15:276–285
57. Cheng T, Lewis FL, Abu-Khalaf M (2007) Fixed-final-time-constrained optimal control of nonlinear
systems using neural network HJB approach. IEEE Trans Neural Netw 18:1725–1737
58. Effati S, Pakdaman M (2013) Optimal control problem via neural networks. Neural Comput Appl 23:2093–
2100
59. Nazemi AR (2011) A dynamical model for solving degenerate quadratic minimax problems with con-
straints. J Comput Appl Math 236:1282–1295
60. Nazemi AR (2012) A dynamic system model for solving convex nonlinear optimization problems. Com-
mun Nonlinear Sci Numer Simul 17:1696–1705
61. Nazemi AR (2013) Solving general convex nonlinear optimization problems by an efficient neurodynamic
model. Eng Appl Artif Intell 26:685–696
62. Nazemi AR, Omidi F (2012) A capable neural network model for solving the maximum flow problem. J
Comput Appl Math 236:3498–3513
63. Nazemi AR (2014) A neural network model for solving convex quadratic programming problems with
some applications. Eng Appl Artif Intell 32:54–62
64. Nazemi AR, Omidi F (2013) An efficient dynamic model for solving the shortest path problem. Transp
Res Part C 26:1–19
65. Nazemi AR, Sharifi E (2013) Solving a class of geometric programming problems by an efficient dynamic
model. Commun Nonlinear Sci Numer Simul 18:692–709
66. Nazemi AR, Effati S (2013) An application of a merit function for solving convex programming problems.
Comput Ind Eng 66:212–221
67. Nazemi AR, Nazemi M (2014) A gradient-based neural network method for solving strictly convex
quadratic programming problems. Cognit Comput 6:484–495
68. Pan S-H, Chen J-S (2010) A semismooth Newton method for the SOCCP based on a one-parametric class
of SOC complementarity functions. Comput Optim Appl 45:59–88
69. Sun D, Sun J (2005) Strong semismoothness of the Fischer–Burmeister SDC and SOC complementarity
functions. Math Program 103(3):575–581
70. Facchinei F, Jiang H, Qi L (1999) A smoothing method for mathematical programs with equilibrium
constraints. Math Program 85:107–134
71. Cybenko G (1989) Approximation by superpositions of a sigmoidal function. Math Control Signals Syst
2:303–314
72. Funahashi K (1989) On the approximate realization of continuous mappings by neural networks. Neural
Netw 2:183–192
73. Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators.
Neural Netw 2:359–366
74. Bazaraa MS, Sherali HD, Shetty CM (2006) Nonlinear programming—theory and algorithms, 3rd edn.
Wiley, Hoboken
75. Zhang X-S (2000) Neural networks in optimization. Kluwer Academic Publishers, Dordrecht
76. Nocedal J, Wright S (2006) Numerical optimization, 2nd edn. Springer-Verlag, Berlin
77. Lee KY, El-Sharkawi MA (2008) Modern heuristic optimization techniques: theory and applications to
power systems. Wiley-Interscience, Piscataway
78. Miller RK, Michel AN (1982) Ordinary differential equations. Academic Press, New York
79. Sun J, Chen J-S, Ko C-H (2012) Neural networks for solving second-order cone constrained variational
inequality problem. Comput Optim Appl 51:623–648
80. Hale JK (1969) Ordinary differential equations. Wiley-Interscience, New York
81. Latombe JC (1991) Robot motion planning. Kluwer Academic Publishers, Boston
82. Annapragada M, Agrawal SK (1999) Design and experiments on a free-floating planar robot for optimal
chase and capture operations in space. Robot Auton Syst 26:281–297