
2013 IEEE International Conference on Systems, Man, and Cybernetics

Online Optimal DLQR-DFIG Control System


Design via Recursive Least-Square Approach and
State Heuristic Dynamic Programming for
Approximate Solution of the HJB Equation
João Viana da Fonseca Neto∗, Ernesto F. M. Ferreira∗ and Patricia H. Moraes Rego†
∗ Federal University of Maranhão (UFMA)
Electrical Engineering Department
65080-040 São Luís, MA, Brazil
Email: jviana@dee.ufma.br
† State University of Maranhão (UEMA)
65055-310 São Luís, MA, Brazil
Email: phmrego@yahoo.com.br

Abstract—Our aim in this paper is to present a novel method for online optimal control system design via state heuristic dynamic programming (HDP) that approximates the solution of the Hamilton-Jacobi-Bellman (HJB) equation by means of a recursive least-squares (RLS) approach. Because of the random nature of primary energy sources, the control of wind and solar energy systems demands methods and techniques suited to a high degree of environmental uncertainty. Reinforcement learning (RL) and approximate dynamic programming (ADP) furnish the key ideas and the mathematical formulations for developing optimal control methods and strategies for alternative energy systems. We propose an online design method to establish control strategies for the main unit of a wind energy system, the doubly fed induction generator (DFIG). The performance of the proposed method is evaluated via computational experiments with discrete-time HDP algorithms that map eigenstructure assignments into the stable Z-plane.

Index Terms—Heuristic Dynamic Programming, Multivariable Control, Dynamic Programming, Optimal Control Tuning, Convergence, Discrete Linear Quadratic Regulator, Digital Control, DFIG Wind Turbines, FACTS Devices, Doubly Fed Induction Generator.

I. INTRODUCTION

Considerable effort is being devoted today to the development of alternative energy systems, such as solar and wind plants, that transform primary energy sources into electrical energy. These natural resources are subject to uncertainties provoked by environmental changes of temperature and pressure, as well as by human activity. Because of these uncertainties, control systems must be robust enough to deal with random situations during normal operation. To minimize, and where possible avoid, the unwanted effects of such uncertainties, we present the first insights of an online optimal control design method based on reinforcement learning and the approximate solution of the HJB equation, oriented toward random and nonlinear processes.

Thorough studies in ADP have been conducted by [1], [2], which propose new ideas and comment on the trends of ADP for this decade. The state of the art in the approximate solution of the Hamilton-Jacobi-Bellman (HJB) equation, which is associated with the discrete algebraic Riccati equation [3], can be found in [4], [5]. References [6] and [7] discuss efficient reinforcement learning, temporal-difference learning and function approximation, respectively, in the context of least-squares solution of the HJB equation.

Reports on applications of DLQR control to DFIGs show the importance of this device and the technological improvements promoted by optimal control strategies. Reference [8], on decentralized nonlinear control of a wind turbine with a doubly fed induction generator (DFIG), presents a linear quadratic regulator (LQR) design method to improve the transient stability of power systems and enhance system damping. An optimal control strategy based on LQR for the DFIG is presented in [9]; the strategies are designed to solve transient stability problems, and the gain adjustment of the linear quadratic controllers is performed via deviations of the weighting-matrix values. In reference [10], the authors present an optimal control strategy for reactive power, where the DFIG is a reactive power source of the wind farm and a genetic algorithm is developed to optimize the control strategy. An optimal tracking scheme for the secondary voltage control of the DFIG has also been reported; this control is based on intelligent selection of the regulation margin of the grid buses according to the voltage violation condition.

In this paper, we present a novel method and an online algorithm to design control strategies for DFIGs in wind plants. The adaptive dynamic programming and discrete linear quadratic regulator concepts that support the development of the proposed online optimal control system design method are presented in Section II. The online DFIG-HDP-DLQR design method of Section III is based on the adaptive critic approach and HDP algorithms that are associated with the turbine-generator linearized

978-1-4799-0652-9/13 $31.00 © 2013 IEEE. DOI 10.1109/SMC.2013.541
model, the DLQR and a QR-tuning procedure. The convergence properties of the proposed algorithm are evaluated and discussed in Section V in the sense of the RLS estimator, and its adaptation to random input variations is observed during the tests. Final remarks and trade-offs on the proposed control strategies are presented in Section VI.

II. ADP-DLQR FRAMEWORK

The following topics assemble the framework of online optimal control design, which is based on the discrete LQR policy via solution of the Bellman equation.

Classical dynamic programming follows the mathematical model of states, which represents the system in its fullness by the function f(x_k, u_k) (the environment), together with the adopted DLQR control policy g(x_k). Specifically, the dynamic behavior of the system is represented by the linear models

x_{k+1} = f(x_k, u_k) = A x_k + B u_k,   (1)

and

u_k = g(x_k) = K x_k,   (2)

where A ∈ R^{n×n}, n is the order of the system, x_k ∈ R^n is the state, B ∈ R^{n×n_e}, n_e is the number of control inputs, u_k ∈ R^{n_e} is the control input, and K ∈ R^{n_e×n} is the feedback gain matrix.

In the DLQR control design one must determine a control policy g(x_k) that minimizes the performance index V_g(x_k) = Σ_{i=k}^{∞} γ^{i−k} r(x_i, g(x_i)), where the instantaneous cost is r(x_k, u_k) = x_k^T Q x_k + u_k^T R u_k. This index, also called the value function, is represented in the Bellman equation, which is given by

V_g(x_k) = r(x_k, g(x_k)) + γ V_g(f(x_k, g(x_k))).   (3)

The goal is to establish a control or decision policy g that is optimal in the sense that it promotes the smallest possible set of discounted instantaneous costs incurred over time, satisfying the inequality V_{g*}(x) ≤ V_g(x), ∀{g, x}. According to Bellman's optimality principle, the optimal value V* must satisfy the discrete-time Hamilton-Jacobi-Bellman (HJB) equation, or Bellman's optimality equation, i.e.,

V*(x_k) = min_{g(·)} { r(x_k, g(x_k)) + γ V*(f(x_k, g(x_k))) }.   (4)

It can be shown that the DLQR optimal value V* admits the quadratic form

V(x_k) = x_k^T P x_k,   (5)

where P^T = P > 0 satisfies the discrete-time algebraic Riccati equation (DARE), given by

γ(A^T P A) − P + Q − γ[A^T P B (R/γ + B^T P B)^{−1} B^T P A] = 0.   (6)

The estimation of the value function V for a given policy only requires samples of the instantaneous cost function r, while models of the environment and of the instantaneous cost are needed to determine the value function V corresponding to the optimal policy.

Supervised learning can be introduced into Eq.(3) through an iterative scheme in which V_g is approximated by parameterized models. The parameterizations of the environment f(A, B) of Eq.(1), of the instantaneous cost r(x_k, u_k) = r(Q, R), and of the long-term cost V(P) of Eq.(5) establish the parameterized function approximation for the DLQR. This approximation is based on minimizing the expected squared error between the estimated value V(x, P) and the desired value d(·), which is given by

P_{j+1} = arg min_P E_x{ |x^T P x − d(x, r, f, P_j)|^2 },   (7)

where E(·) denotes the expected value, P_j is the current estimate and P_{j+1} is the updated estimate of the parameter P of the approximation. The desired value d(·) consists of the current cost plus the cost function evaluated at the next state f(x, u), which is given by

d(x, r, f, P) = x^T Q x + u^T R u + γ (A x + B u)^T P (A x + B u).   (8)

The parameterization of the policy g, by iteratively determining its parameters according to Bellman's optimality equation, together with the cost estimate V, allows the determination of the optimal policy, which is given by

u* = arg min_u { r(x, g(x, u)) + V(f(x, g(x, u)), P) }.   (9)

The optimal parameters of Eq.(9) are obtained by solving the gradient equation with respect to u, ∂V/∂u = 0. Clearly, the derivatives of the parameterized models f, r, V and g are required to determine the gradient ∂V/∂u. Thus, the minimum of Eq.(9) must satisfy

(∂r/∂g)(∂g/∂u) + γ (∂V/∂f)(∂f/∂g)(∂g/∂u) = 0.   (10)

By substituting the derivatives ∂r/∂g = 2u^T R, ∂V/∂f = 2f^T P and ∂f/∂g = B into Eq.(10), one has that the optimal policy u* is given by

u* = K(P) x,   (11)

where K = −γ(R + B^T P B)^{−1} B^T P A is the optimal gain.

III. ONLINE OPTIMAL CONTROL SYSTEM DESIGN

The method of online optimal control system design presented in this section is based on the adaptive critic approach and the HDP algorithm. Specifically, the approximate algorithm is developed for realizing online optimal control of the DFIG in the context of the DLQR design, taking policy iteration schemes into account. The HDP policy provides the Riccati solution P and the Riccati optimal gain K.

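The DLQR machinery of Section II can be sketched numerically. The following minimal Python example (illustrative only; the matrices here are a hypothetical 2nd-order system, not the paper's 6th-order DFIG model) iterates the DARE of Eq.(6) to a fixed point and then forms the optimal gain of Eq.(11):

```python
import numpy as np

# Hypothetical 2nd-order system used only for illustration
# (the paper's DFIG model is 6th order; see Section V).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)     # state weighting
R = np.eye(1)     # control weighting
gamma = 1.0       # discount factor, 0 < gamma <= 1

# Fixed-point iteration on the DARE of Eq.(6):
#   P = Q + gamma*A'PA - gamma*A'PB (R/gamma + B'PB)^-1 B'PA
P = np.zeros((2, 2))
for _ in range(500):
    M = np.linalg.inv(R / gamma + B.T @ P @ B)
    P = Q + gamma * (A.T @ P @ A) - gamma * (A.T @ P @ B @ M @ B.T @ P @ A)

# Optimal feedback gain of Eq.(11): u = K x
K = -gamma * np.linalg.inv(R + B.T @ P @ B) @ B.T @ P @ A
```

With gamma = 1 this reduces to the standard DARE, and the resulting closed-loop matrix A + BK has all eigenvalues inside the unit circle.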
A. Adaptive Critic Approach

The adaptive critic approach is used to determine the parameters of the value function approximation. Eq.(5), which minimizes the expected squared error, is parameterized according to Eq.(7). This can be written in vectorized form as

θ_{j+1} = arg min_θ E_x{ |x̄^T θ − d(x, r, f, P_j)|^2 },   (12)

where x̄ ∈ R^{n(n+1)/2} is defined according to the Kronecker product as x̄^T = [x_1^2 … x_1 x_n  x_2^2 … x_{n−1} x_n  x_n^2]. The vector θ = vec(P) ∈ R^{n(n+1)/2} of the square matrix P contains the n diagonal entries of P and the n(n+1)/2 − n distinct sums p_{ik} + p_{ki}. Assuming an ordering between the vector x̄ and the vectorization vec(P) such that the quadratic form satisfies x^T P x = x̄^T vec(P), the least-squares parametric estimate of Eq.(12) is given by

θ_{j+1} = (E_x{ x̄ x̄^T })^{−1} E_x{ x̄ d(x, r, f, P_j) }.   (13)

Matrix vectorization and Kronecker product theory [11] contribute to an approximate solution of the DARE, obtained from an iterative scheme of systems of linear equations for the coefficients of the matrix P, given by

x̄_i^T vec(P_{j+1}) = d(x_i, r, f, P_j),   (14)

where i_{j−1} ≤ i ≤ i_j, with i_j − i_{j−1} = n(n+1)/2 linearly independent samples. The regression vector x̄_i and the target value d(·) are given by

x̄_i = [x_{1i}^2  x_{1i} x_{2i} … x_{ni}^2]^T,   (15)

d(·) = x_i^T Q x_i + u_i^T R u_i + γ x_{i+1}^T P_j x_{i+1}.   (16)

The least-squares solution of the system (14) is given by [12]

vec(P_{j+1}) = Θ_j^{−1} ψ_j,   (17)

where Θ_j = Σ_i x̄_i x̄_i^T and ψ_j = Σ_i x̄_i d(x_i, r, f, P_j).

B. HDP Algorithm

The main core of the HDP algorithm is developed according to Eqs.(15)-(17). The steps for determining the optimal control policy are presented in Algorithm 1. In this algorithm, a recursive least-squares implementation is used to find vec(P_{j+1}). The algorithm has a persistence resetting (state revitalization condition), because null states lead to a rank-deficient regression matrix; in such cases, the revitalization is applied at the end of each interval n_revit. The HDP algorithm is made up of two segments: the initial conditions established in steps 1-12 and the iterative process established in steps 14-42. The control strategies are established according to this process.

Algorithm 1 - HDP algorithm - recursive least-squares and recurrence step.
PDINAMIC-DLQR-HDP(N)
 1  ------------------------------------------------------------
 2  ▷ Setup - initial conditions
 3  ▷ Weighting matrices and dynamic system
 4  [Q, R, A_d, B_d, N] ← [ ]
 5  ▷ Select discount factor 0 < γ ≤ 1
 6  ▷ RLS parameters: θ_0, Γ_0
 7  ▷ Select forgetting factor 0 < τ ≤ 1
 8  ▷ Select initial state x_0
 9  ▷ Iterative process parameters
10  [N, n_revit] ← [ ]
11  ▷ P and K initial values
12  [P_0, K_0] ← [ ]
13  ------------------------------------------------------------
14  ▷ Iterative process
15  for i ← 0 : N
16  do
17      ▷ Optimal policy
18      u_i ← K_i x_i
19      ▷ States
20      x_{i+1} ← A_d x_i + B_d u_i
21      ▷ Basis set - Kronecker product
22      x̄_i ← [x_{1i}^2; x_{1i} x_{2i}; …; x_{6i}^2]
23      ▷ Target vector assembly
24      d(x, r, f, P) ← x_i^T Q x_i + u_i^T R u_i + x_{i+1}^T P x_{i+1}
25      ▷ Recursive least-squares
26      L_{i+1} ← Γ_i x̄_i / (τ + x̄_i^T Γ_i x̄_i)
27      θ_{i+1} ← θ_i + L_{i+1} (d(·) − x̄_i^T θ_i)
28      Γ_{i+1} ← τ^{−1} [Γ_i − Γ_i x̄_i x̄_i^T Γ_i / (τ + x̄_i^T Γ_i x̄_i)]
29      ▷ P matrix recovery from vector θ
30      P ← [θ_1, θ_2/2, θ_3/2, θ_4/2, θ_5/2, θ_6/2;
31           θ_2/2, θ_7, θ_8/2, θ_9/2, θ_10/2, θ_11/2;
32           θ_3/2, θ_8/2, θ_12, θ_13/2, θ_14/2, θ_15/2;
33           θ_4/2, θ_9/2, θ_13/2, θ_16, θ_17/2, θ_18/2;
34           θ_5/2, θ_10/2, θ_14/2, θ_17/2, θ_19, θ_20/2;
35           θ_6/2, θ_11/2, θ_15/2, θ_18/2, θ_20/2, θ_21]
36      ▷ Feedback optimal gain K
37      K ← −(R + B_d^T P B_d)^{−1} B_d^T P A_d
38      if i mod n_revit = 0
39      then
40          x_{i+1} ← x_revit
41  ------------------------------------------------------------
42  End - iterative process

IV. ONLINE DFIG-HDP-DLQR ALGORITHM

The approximate algorithms are developed for the realization of digital optimal control. Approximate dynamic programming, also called the adaptive critic, is presented in the context of the DLQR design, taking into account heuristic dynamic programming (HDP), which is established by a set of methodologies for the development of methods and procedures that estimate the value function V. The HDP policy provides the Riccati solution P and the Riccati optimal gain K.

V. ONLINE DFIG-HDP-DLQR ALGORITHM EVALUATION

The computational experiments are designed to investigate the performance of the online algorithm for the doubly fed induction generator, as shown in the schematic of Fig. 1. The DFIG is modeled as a 6th-order system of linear differential equations; for more details on the modeling of DFIG generators, see references [13], [14]. The convergence of the HDP algorithm of Section III is investigated in terms of the convergence of the QR tuning to reach regions of the stable Z-plane. The proposed control strategy for tuning the DLQR controllers is in the context of

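To make the recursive least-squares core of Algorithm 1 (steps 26-28) concrete, the sketch below runs the HDP-RLS loop on a small hypothetical 2nd-order system, so that θ has n(n+1)/2 = 3 entries; drawing a fresh random state at every step plays the role of the revitalization condition. All numerical values are illustrative assumptions, not the paper's DFIG data.

```python
import numpy as np

n = 2
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(n), np.eye(1)
gamma, tau = 1.0, 0.98           # discount and forgetting factors

def basis(x):
    """Kronecker-product basis of Eq.(15): upper-triangular terms of x x^T."""
    return np.array([x[0] ** 2, x[0] * x[1], x[1] ** 2])

def recover_P(theta):
    """Symmetric P from theta; off-diagonal entries halved (cf. step 30)."""
    return np.array([[theta[0], theta[1] / 2],
                     [theta[1] / 2, theta[2]]])

theta = np.zeros(3)
Gamma = 1e3 * np.eye(3)          # RLS covariance, large initial value
K = np.zeros((1, n))
rng = np.random.default_rng(0)

for i in range(2000):
    x = rng.standard_normal(n)   # persistently exciting state ("revitalization")
    u = K @ x
    x_next = A @ x + B @ u
    P = recover_P(theta)
    d = x @ Q @ x + u @ R @ u + gamma * x_next @ P @ x_next  # target, Eq.(16)
    xb = basis(x)
    # RLS update, steps 26-28 of Algorithm 1
    L = Gamma @ xb / (tau + xb @ Gamma @ xb)
    theta = theta + L * (d - xb @ theta)
    Gamma = (Gamma - np.outer(L, xb @ Gamma)) / tau
    # policy improvement, step 37
    P = recover_P(theta)
    K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```

Because the target d is exactly quadratic in x, the RLS estimate converges to the DARE solution as the bootstrapped targets stabilize, which mirrors the convergence behavior reported for the DFIG experiments in Section V.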
the approximation of the HJB-DARE solution and reinforcement learning theory.

Figure 1. Control system of the DFIG.

A. Setup of the Iterative Process

The setup of the iterative process is classified into three categories, related to the iterative process (algorithm), the dynamic system, and the approximation of the HJB-DARE solution.

1) Iterative Process Parameters: The adjustable parameters to control the convergence process are: the system order n (n = 6), the sampling time T_amost (T_amost = 0.01 s), the iteration number N (N = 4000), the forgetting factor τ (τ = 0.89), and the revitalization interval n_revit (n_revit = 21).

2) State Space Description: The matrices of the dynamic system (DFIG) for the continuous state-space description of the linearized model are given by

A_c = [ −101.48   193.74    0.00      0.00      0.00     0.00
        −193.74  −101.48    0.00      0.00      0.00     0.00
           0.00     0.00   −0.15      0.00      0.00     0.00
           0.00     0.00    0.00   −250.00    377.00     0.00
           0.00     0.00    0.00   −377.00   −250.00     0.00
           0.00     0.00    0.00   −767.72      0.00    22.72 ]   (18)

and

B_c = [ 33.97    0.00     0.00      0.00      0.00      0.00
         0.00   33.97     0.00      0.00      0.00      0.00
         0.00    0.00   100.00      0.00      0.00      0.00
         0.00    0.00     0.00    −83.33      0.00      0.00
         0.00    0.00     0.00      0.00    −83.33      0.00
         0.00    0.00     0.00      0.00      0.00    454.54 ].   (19)

The matrices A_d and B_d of the discrete model for a sampling interval T_amost = 0.01 s are obtained by the zero-order-hold method.

3) Recursive Least-Squares Parameters: The RLS adjustable parameter is the forgetting factor 0 < τ ≤ 1. The initial conditions are given by the initial HJB-DARE solution and the RLS covariance matrix.

B. Approximate Solution of the HJB-DARE Equation

The approximate solution of the HJB-DARE equation is presented in the context of policy iteration. The equation is solved in terms of heuristic dynamic programming (HDP), and the approximate solution is computed by the recursive least-squares (RLS) estimator. The evolution of the iterative process for the Q and R cost diagonal matrices is shown in Figures 2 and 3. The Q matrix has its diagonal elements given by q_{1,1} = 0.1, q_{2,2} = 100, q_{3,3} = 100, q_{4,4} = 1000, q_{5,5} = 1000 and q_{6,6} = 1. The R matrix is the identity matrix. The iterative-process behaviour is presented for a horizon of 420 seconds.

Figure 2. Approximate HJB-DARE solution - P11-P44 (panels a-d: estimated θ versus true θ_0 for p11, p22, p33, p44 over t_amost in seconds).

Figure 3. Approximate HJB-DARE solution - P55 and P66 (panels e-f: estimated θ versus true θ_0 for p55, p66 over t_amost in seconds).

The expectation, standard deviation, median, maximum and minimum of the diagonal parameters of the HJB-DARE solution are presented in Table I. It is observed that the RLS estimator provides a good approximation, because the expectations of the parameters θ_i converge to their respective true values. The standard deviations of the parameters θ_12, θ_16 and θ_21 are higher than the others; this happens due to singularities of the regressor vectors. The median shows that the estimates reached the steady true values of the parameters. The maximum and minimum values show a high degree of variation for p_33, p_44, p_55 and p_66, while the p_11 and p_22 maximum and minimum peaks are much smaller than the others.

Table II presents statistics of the off-diagonal parameters. As can be seen in this table, the estimated values converge to the true values. The standard deviation, maximum and minimum present high values, but the iterative process has the ability to recover from these oscillations. The median values show that most of the parameters are at their true values.

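The zero-order-hold discretization described in the setup above can be reproduced with standard tools. The sketch below (assuming SciPy is available) discretizes exactly the matrices of Eqs.(18)-(19) with T_amost = 0.01 s:

```python
import numpy as np
from scipy.linalg import expm
from scipy.signal import cont2discrete

# Continuous-time DFIG matrices of Eqs.(18)-(19).
Ac = np.array([
    [-101.48,  193.74,  0.00,    0.00,    0.00,  0.00],
    [-193.74, -101.48,  0.00,    0.00,    0.00,  0.00],
    [   0.00,    0.00, -0.15,    0.00,    0.00,  0.00],
    [   0.00,    0.00,  0.00, -250.00,  377.00,  0.00],
    [   0.00,    0.00,  0.00, -377.00, -250.00,  0.00],
    [   0.00,    0.00,  0.00, -767.72,    0.00, 22.72],
])
Bc = np.diag([33.97, 33.97, 100.00, -83.33, -83.33, 454.54])
T = 0.01  # Tamost, sampling time in seconds

# Zero-order-hold discretization; returns (Ad, Bd, Cd, Dd, dt).
Ad, Bd, *_ = cont2discrete((Ac, Bc, np.eye(6), np.zeros((6, 6))), T,
                           method='zoh')
```

For the zero-order hold, A_d equals the matrix exponential expm(A_c T), which gives an independent check of the discretization.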
Table I. Statistics of diagonal parameters - solution of the HJB-DARE equation.

Parameter            θ_1 (p11)   θ_7 (p22)   θ_12 (p33)   θ_16 (p44)
True                  103.04      103.04       100.99       100.24
Estimated             103.04      103.04       100.99       100.24
Expectation           102.70      103.10        -3.70       111.64
Standard deviation      3.43       10.41      1603.51       218.21
Median                103.04      103.04       100.99       100.24
Minimum                 0.00        0.00    -23304.65     -1863.10
Maximum               107.97      223.99     14353.35      3276.06

Table II. Statistics of off-diagonal parameters - solution of the HJB-DARE equation.

Parameter        θ_2 (p12)   θ_3 (p13)   θ_4 (p14)   θ_5 (p15)   θ_6 (p16)
True                 0.00        0.00        0.00        0.00        0.00
Estimated           -0.00       -0.00       -0.00        0.00       -0.00
Expectation         -0.89       -5.97        1.47        2.90        3.43
Std. deviation      17.76       95.03       14.94       40.82       38.05
Median               0.00       -0.00        0.00        0.00        0.00
Minimum           -255.83    -1382.45     -125.48     -351.08     -306.07
Maximum            158.51      836.30      214.10      590.98      550.43

The comparison of the steady-state values of the matrix P_∞ with the elements of the HJB solution obtained by the Schur method shows that the elements of the matrices agree to at least the first decimal place. Comparison tests at steady state with the Riccati recurrence showed that the least-squares parameters θ converge to the true values.

C. QR Convergence

The online approximate QR-tuning method is the proposed procedure to tune optimal controllers based on the relation of the cost matrices. Online QR tuning is a heuristic procedure to establish systematic variations of the Q and R weighting matrices [15]. In this manner, the approximate solutions are guided by deviations of the cost matrices Q and R. The proposed methodology is oriented toward online solutions of optimal controllers, i.e., toward imposing the stability margins of the DLQR design of the proposed method.

The proposed tuning method maps the traces of the weighting matrices Q and R to regions of the stable Z-plane, according to the specifications of the designer, through linear combinations of the state vector, the gain K being responsible for the transformation. The heuristics are assembled according to the directives of [16]. The uniform variations of the elements of the diagonal matrix Q are evaluated with respect to the number of iterations of the HDP convergence process and the eigenvalue assignment. The numbers in the Trace column of Table III correspond to the matrices Q(q_i) = 10^{q_i} I, with I the 6×6 identity; for this situation, q_i = {0, 2, 3}. Analogous reasoning follows for the matrices R(r_i) = 10^{r_i} I; in this case, r_i = 0 for all i. The number of samples is given by N.

Table III shows the Q matrix traces and the dynamic-system eigenvalues for uniform deviations of the Q matrix. The RLS convergence, in the sense of algorithm iterations, for situations 1, 2 and 3 is reached at 600, 4000 and 1000 iterations, respectively.

Table III. Uniform deviations - Q matrix traces and dynamic-system eigenvalues.

No   Trace     Eigenvalues (10^-2)
1    6         44.536 + j0.000    -12.705 + j33.142   -12.705 - j33.142
               2.596 + j0.000     -6.260 + j4.664     -6.260 - j4.664
2    600       -2.914 + j7.590    -2.914 - j7.590     -1.467 + j1.208
               -1.467 - j1.208    0.981 + j0.000      0.037 + j0.000
3    6000      -0.375 + j0.975    -0.375 - j0.975     -0.182 + j0.149
               -0.182 - j0.149    0.100 + j0.000      0.004 + j0.000

For situations 1 and 2 of Table III, the forgetting-factor value associated with the uniform deviations of the Q matrix is 0.89; for situation 3, the forgetting-factor value is 0.85. The traces for non-uniform variations of the Q matrix and the eigenvalues of the dynamic system are presented in Table IV. The forgetting-factor values for situations 1, 2 and 3 of Table IV are 0.89, 0.85 and 0.92, respectively. The RLS convergence is reached around 600, 900 and 1000 iterations for situations 1, 2 and 3, respectively.

Table IV. Non-uniform deviations - Q matrix traces and dynamic-system eigenvalues.

No   Trace       Eigenvalues (10^-2)
1    102200.1    -2.914 + j7.589    -2.914 - j7.589    -5.223 + j0.000
                 -0.370 + j0.000    0.004 + j0.000     0.001 + j0.000
2    2201.10     -7.424 + j14.807   -7.424 - j14.807   4.386 + j0.000
                 0.976 + j0.000     -0.162 + j0.119    -0.162 - j0.119
3    1000410.0   -2.896 + j0.000    -3.006 + j1.909    -3.006 - j1.909
                 0.982 + j0.000     -0.003 + j0.000    0.038 + j0.000

Table V presents the elements of the Q matrix that are associated with the eigenvalues of the dynamic system in Table IV.

Table V. Non-uniform variations of the Q matrix.

No   q(11)        q(22)    q(33)       q(44)    q(55)    q(66)
1    100.00       100.00   100000.00   1000.0   0.1      1000.0
2    0.10         100.00   100.00      1000.0   1000.0   1.0
3    1000000.00   100.00   100.00      100.0    10.0     100.0

The eigenvalues of the exact solution via the Schur method are given in Table VI. These values are used for comparison with the results presented in Table IV.

Table VI. Schur solution for Q matrix variations.

No   Trace        Eigenvalues (10^-2)
1    102200.10    -2.914 + j7.590    -2.914 - j7.590    -5.766 + j0.000
                  0.004 + j0.000     -0.374 + j0.000    0.001 + j0.000
2    2201.10      -7.424 + j14.807   -7.424 - j14.807   4.358 + j0.000
                  -0.167 + j0.123    -0.167 - j0.123    0.980 + j0.000
3    1000410.00   -0.003 + j0.000    -2.895 + j0.000    0.037 + j0.000
                  -3.004 + j1.888    -3.004 - j1.888    0.980 + j0.000

It is observed that the variations in the matrix Q led to the mapping of real and complex-conjugate poles in the Z-plane. The poles are limited to the real axis of Z and to the right half-plane.

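The QR-tuning sweep of this subsection can be sketched as follows. This is an illustrative reconstruction, not the authors' code: it discretizes Eqs.(18)-(19) by zero-order hold, solves the DARE for the uniform sweep Q(q_i) = 10^{q_i} I with R = I, and collects the closed-loop eigenvalues in the Z-plane (assuming SciPy's solve_discrete_are is available).

```python
import numpy as np
from scipy.linalg import expm, solve_discrete_are

# DFIG matrices of Eqs.(18)-(19), discretized by ZOH with Tamost = 0.01 s.
Ac = np.array([
    [-101.48,  193.74,  0.00,    0.00,    0.00,  0.00],
    [-193.74, -101.48,  0.00,    0.00,    0.00,  0.00],
    [   0.00,    0.00, -0.15,    0.00,    0.00,  0.00],
    [   0.00,    0.00,  0.00, -250.00,  377.00,  0.00],
    [   0.00,    0.00,  0.00, -377.00, -250.00,  0.00],
    [   0.00,    0.00,  0.00, -767.72,    0.00, 22.72],
])
Bc = np.diag([33.97, 33.97, 100.00, -83.33, -83.33, 454.54])
T = 0.01
Ad = expm(Ac * T)
# Ac is nonsingular here, so the ZOH input matrix is Bd = Ac^-1 (Ad - I) Bc.
Bd = np.linalg.solve(Ac, Ad - np.eye(6)) @ Bc

closed_loop_eigs = {}
for qi in (0, 2, 3):                     # uniform sweep Q(qi) = 10^qi * I
    Q = 10.0 ** qi * np.eye(6)
    R = np.eye(6)                        # ri = 0 for all i
    P = solve_discrete_are(Ad, Bd, Q, R)
    K = -np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)  # DLQR gain
    closed_loop_eigs[qi] = np.linalg.eigvals(Ad + Bd @ K)
```

Every eigenvalue returned lies inside the unit circle, which is the stabilization property that the QR-tuning heuristics exploit when mapping trace variations to pole regions.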
An investigation of mapping into other regions of the Z-plane involves the development of algebraic relationships that can support the application of polarized heuristics for the selection of the matrices Q and R.

D. Tests with Plant Parameter Variations

In order to show the robustness of the system, tests with plant parameter variations have been carried out on the rotor speed, slip frequency, DC-link capacitance, DC-link voltage and other quantities related to the operation of the DFIG. Due to the unpredictable behavior of the wind, changes in the operating parameters are not impossible. To evaluate the impact of such variations on the system, a disturbance variable Var is defined that represents the plant changes, as can be seen in Figure 4. The plant response to a pulse variation is shown in Figure 4; the parameter changes are held in the interval from iteration 1500 to 2500 (15-25 seconds). The disturbance leads the plant to another operating point; when the disturbance ceases, the system returns to scheduled operation. This test shows the ability of the system to recover from parametric changes.

Figure 4. System response to parametric variations.

VI. CONCLUSION

In this article, we highlighted some insights on online control system design. The problem was characterized by the Hamilton-Jacobi-Bellman equation and parameterized for the DLQR in the so-called discrete algebraic Riccati equation. The optimal control law gains were given by approximations of the DARE solution via HDP. The theory and the development of a dedicated policy iteration algorithm were presented to evaluate the feasibility of the proposed method in wind energy plants. The proposed method has shown itself to be an alternative for assigning the eigenvalues of the dynamic system inside the unit circle by estimating the DARE solution via recursive least-squares.

ACKNOWLEDGMENT

The authors are indebted to the Federal University of Maranhão (UFMA), the Graduate Program in Electrical Engineering (PPGEE) and the State University of Maranhão (UEMA) for the development infrastructure and financial support, as well as to FAPEMA, CNPq and CAPES.

REFERENCES

[1] G. Lendaris, "A retrospective on adaptive dynamic programming for control," in Proc. International Joint Conference on Neural Networks (IJCNN 2009), 2009, pp. 1750-1757.
[2] P. J. Werbos, "Foreword - ADP: The key direction for future research in intelligent control and understanding brain intelligence," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 38, no. 4, pp. 898-900, Aug. 2008.
[3] A. J. Laub, "A Schur method for solving algebraic Riccati equations," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 913-921, 1979.
[4] T. Landelius, "Reinforcement learning and distributed local model synthesis," Ph.D. dissertation, Linköping University, SE-581 83 Linköping, Sweden, 1997, dissertation no. 469, ISBN 91-7871-892-9.
[5] J. Murray, C. Cox, G. Lendaris, and R. Saeks, "Adaptive dynamic programming," IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 32, no. 2, pp. 140-153, May 2002.
[6] X. Xu, H. He, and D. Hu, "Efficient reinforcement learning using recursive least-squares methods," Department of Automatic Control, National University of Defense Technology, ChangSha, Hunan, 410073, P.R. China, 2002.
[7] J. Boyan, "Technical update: least-squares temporal difference learning," Machine Learning, Special Issue on Reinforcement Learning, 2002.
[8] F. Wu, X.-P. Zhang, P. Ju, and M. Sterling, "Decentralized nonlinear control of wind turbine with doubly fed induction generator," IEEE Transactions on Power Systems, vol. 23, no. 2, pp. 613-621, May 2008.
[9] L. Barros, W. Mota, J. da Silva, and C. Barros, "An optimal control strategy for DFIG," in Proc. 2010 IEEE International Conference on Industrial Technology (ICIT), Mar. 2010, pp. 1727-1732.
[10] J. Fang, G. Li, X. Liang, and M. Zhou, "An optimal control strategy for reactive power in wind farms consisting of VSCF DFIG wind turbine generator systems," in Proc. 2011 4th International Conference on Electric Utility Deregulation and Restructuring and Power Technologies (DRPT), Jul. 2011, pp. 1709-1715.
[11] J. Brewer, "Kronecker products and matrix calculus in system theory," IEEE Transactions on Circuits and Systems, vol. 25, no. 9, pp. 772-781, Sep. 1978.
[12] K. J. Åström and B. Wittenmark, Adaptive Control, 2nd ed. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 1994.
[13] V. P. Pinto, J. C. T. Campos, L. L. N. Dos Reis, C. B. Jacobina, and N. Rocha, "Robustness and performance analysis for the linear quadratic Gaussian/loop transfer recovery with integral action controller applied to doubly fed induction generators in wind energy conversion systems," Electric Power Components and Systems, vol. 40, no. 2, pp. 131-146, 2011.
[14] A. Rosas, P. A. C., and Estanqueiro, "Guia de projeto elétrico de centrais eólicas, volume 1: Projeto elétrico e impacto de centrais eólicas na rede elétrica" [Guide to the electrical design of wind power plants, vol. 1: Electrical design and grid impact of wind power plants].
[15] J. V. da Fonseca Neto and L. R. Lopes, "On the convergence of DLQR control system design and recurrences of Riccati and Lyapunov in dynamic programming strategies," in Proc. 13th UKSim-AMSS International Conference on Computer Modelling and Simulation, Cambridge, UK, 30 Mar.-1 Apr. 2011, pp. 26-31.
[16] J. Fonseca Neto and L. R. Lopes, "On the convergence of DLQR control and recurrences of Riccati and Lyapunov in dynamic programming," in UKSim 13th International Conference on Computer Modelling and Simulation (UKSim2011), Cambridge, United Kingdom.
