Deep Reinforcement Learning Energy Management
Abstract—In recent years, energy management systems have become an emerging research topic. This concept allows the distribution of energy-intensive loads among various energy sources. An appropriate resource allocation scheme is necessary for the controller to efficiently allocate its energy resources in different operating conditions. Recent advances in artificial intelligence are instrumental in solving complex energy management problems by learning large repertoires of behavioral skills, combining hand-engineered policies with human-like expertise representations. In this paper, a deep reinforcement learning based resource allocation scheme is proposed for electric vehicles, avoiding the need to work at the level of complex vehicle dynamics. Using multiple energy storage devices, such as batteries, in parallel increases their maintenance requirements due to their different behavior in various operating conditions. Thus, the proposed strategy aims to learn optimal policies that equilibrate the state of charge (SOC) of all batteries, extending their lifespan and reducing the frequency of their maintenance.

I. INTRODUCTION

With the aim to reduce carbon emissions, global demand for clean technologies and more sustainable modes of transportation is rising. Eliminating carbon emissions remains one of the world's major challenges, and social pressure for a more sustainable future has never been higher. In an effort to reduce human-induced climate change, transitioning to sustainable technologies is considered a cure for our addiction to fossil fuels. The transportation sector has been dominated for almost a century by combustion engines. In the last decade, electric transportation has been experiencing rapid growth across the globe, and its broad-scale adoption is bringing significant societal changes. For many years, effective transportation solutions have been provided for a wide range of applications, from golf carts and forklifts to utility vehicles, and interest is now spreading further.

In vehicular technology, electric/electronic systems are taking over pneumatic, hydraulic, and mechanical systems. This is because clean energies have received increasing interest, having been considered over the last decade as a way of reversing climate change. Consequently, the development of electric vehicles is booming, and researchers have an opportunity to contribute solutions that improve their energy efficiency. An energy management system (EMS) is considered the backbone of electric transportation systems that use multiple energy sources, such as batteries, supercapacitors, and fuel cells, which yield great flexibility in achieving higher performance. Hence, energy management of such systems is a serious issue, as it significantly influences the performance of electric vehicles. However, the maintenance and energy management of these devices are becoming more burdensome and costly. Additionally, the amount of energy they can store is limited and must therefore be used efficiently. Energy management systems have only recently started to receive thorough attention from the research community. Hence, optimal energy usage of energy storage devices is among the numerous challenges to be addressed, which raises the urgency of finding alternative, efficient energy deployment techniques to keep up with the growing energy demand.

Various energy management methods have been proposed throughout the years, such as dynamic programming (DP) [1], adaptive DP [2], optimal control [3], and soft-computing methods [4]–[7]. In [1], a hybrid trip model is presented to obtain the vehicle-speed trajectory for the trip path without GPS data, and a DP-based EMS with a prediction horizon is then proposed. Since DP is known for its heavy computational requirements, a search range optimization algorithm is used. To alleviate the computational burden of DP, a stochastic DP is proposed in [8]. In [9], better performance is achieved with a multiagent fuzzy logic strategy, as no a priori knowledge of the load profile is required. Multiagent systems and particle swarm optimization [10], [11] have also been used for optimal energy management and hybrid distributed energy management systems. In [12], [13], a hybrid approach combines the fast dynamics of supercapacitors with the high energy density of batteries to integrate multiple energy sources. Analytical optimization methods, such as Pontryagin's minimum principle, find an analytical solution using a mathematical problem formulation [14], which makes the obtained solution faster than purely numerical methods. However, optimal solutions are generated offline and require the future driving conditions to be known a priori. In [15], an energy management technique using fuzzy logic is proposed for an embedded fuel-cell system. On the other hand, neural networks are suggested in [16] as an efficient energy management system for hybrid electric vehicles. Using multiple energy sources, the energy requirement of hybrid vehicles can be easily managed. Recent advances in soft-computing methodologies have led to the widespread use of intelligent systems [17]–[24]. However, neural networks remain incapable of incorporating any human-like expertise, and fuzzy logic is unable to learn.
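As a rough illustration of the kind of deep reinforcement learning based resource allocation scheme described above, the following Python sketch trains a small Q-network to choose a discrete power split among three parallel battery units so that their states of charge remain balanced. The toy environment, discrete action set, network size, reward, and hyperparameters are illustrative assumptions only and are not taken from this paper.

# Minimal sketch (not the authors' implementation): a small DQN-style agent
# picks which discrete split of the demanded power to assign to three parallel
# battery units and is rewarded for keeping their SOCs balanced.
import random
import numpy as np
import torch
import torch.nn as nn

N_UNITS = 3
# Candidate allocation vectors k = (k1, k2, k3), each summing to 1 (assumed set).
ACTIONS = np.array([
    [1/3, 1/3, 1/3],
    [1/2, 1/4, 1/4],
    [1/4, 1/2, 1/4],
    [1/4, 1/4, 1/2],
    [1/2, 1/2, 0.0],
    [1/2, 0.0, 1/2],
    [0.0, 1/2, 1/2],
])

class ParallelBatteryEnv:
    """Toy plant: each unit's SOC drops in proportion to its power share."""
    def __init__(self, capacity=np.array([1.0, 0.9, 1.1])):
        self.capacity = capacity  # units may have different capacities
        self.reset()

    def reset(self):
        self.soc = np.random.uniform(0.6, 0.9, size=N_UNITS)
        return self._state()

    def demand(self):
        return 0.02  # constant normalized power demand for this sketch

    def _state(self):
        return np.concatenate([self.soc, [self.demand()]]).astype(np.float32)

    def step(self, action_idx):
        k = ACTIONS[action_idx]
        self.soc -= k * self.demand() / self.capacity  # simple discharge model
        self.soc = np.clip(self.soc, 0.0, 1.0)
        reward = -float(self.soc.max() - self.soc.min())  # penalize SOC spread
        done = bool(self.soc.min() <= 0.05)
        return self._state(), reward, done

q_net = nn.Sequential(nn.Linear(N_UNITS + 1, 32), nn.ReLU(),
                      nn.Linear(32, len(ACTIONS)))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma, epsilon = 0.95, 0.2
env = ParallelBatteryEnv()

for episode in range(200):
    state, done = env.reset(), False
    while not done:
        # Epsilon-greedy selection over the discrete allocation set.
        if random.random() < epsilon:
            action = random.randrange(len(ACTIONS))
        else:
            with torch.no_grad():
                action = int(q_net(torch.from_numpy(state)).argmax())
        next_state, reward, done = env.step(action)
        # One-step temporal-difference update; replay buffer and target
        # network are omitted to keep the sketch short.
        with torch.no_grad():
            target = reward + (0.0 if done else
                               gamma * q_net(torch.from_numpy(next_state)).max().item())
        q_pred = q_net(torch.from_numpy(state))[action]
        loss = (q_pred - target) ** 2
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        state = next_state

A full implementation would add an experience replay buffer and a target network; they are left out here for brevity.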
battery units. The difference between all battery units' SOC gradually decreases over time and finally settles at zero, as can be observed in Fig. 5(d). For that, the deep reinforcement learning based resource allocation scheme assigns an operation rate parameter of 1/3, making all battery units operate at the same rate. When a battery unit reaches its end of life (usually due to accelerated aging), all batteries within that system are replaced to preserve its overall integrity. The proposed strategy protects the battery units from premature aging and allows the use of units with different capacities and from different manufacturers.
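The equal operation rate reported above can be mimicked by a simple hand-coded proportional sharing rule. The snippet below is not the learned policy; it is only a toy rule with assumed numbers showing how biasing the power share toward higher-SOC units drives the SOC spread to zero, after which each of the three units settles at a share of roughly 1/3.

# Illustrative only: a hand-coded proportional sharing rule (NOT the learned
# DRL policy) reproducing the qualitative behavior described above.
import numpy as np

soc = np.array([0.90, 0.80, 0.70])  # assumed initial states of charge
demand = 0.01                       # assumed normalized power drawn per step
for _ in range(60):
    raw = 1 / 3 + 10.0 * (soc - soc.mean())  # bias shares toward high-SOC units
    k = np.clip(raw, 0.0, None)
    k /= k.sum()                             # operation-rate parameters, sum to 1
    soc -= k * demand                        # discharge in proportion to share
print("k   =", np.round(k, 3))   # tends toward equal sharing, k_i ≈ 1/3
print("SOC =", np.round(soc, 3)) # spread between units is now near zero

In contrast, the scheme proposed in the paper learns this equilibrating behavior rather than relying on a hand-tuned sharing gain.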
Fig. 6. System response under 2 min time interval power demand: (a) DC bus voltage VDC; (b) battery currents isi; (c) duty cycle ρi; (d) state of charge SOCi; and (e) parameters ki and k̄i.

REFERENCES
[1] J. Liu, Y. Chen, W. Li, F. Shang, and J. Zhan, "Hybrid-Trip-Model-Based Energy Management of a PHEV With Computation-Optimized Dynamic Programming," IEEE Transactions on Vehicular Technology, vol. 67, no. 1, pp. 338–353, Jan. 2018.
[2] Q. Wei, F. L. Lewis, G. Shi, and R. Song, "Error-Tolerant Iterative Adaptive Dynamic Programming for Optimal Renewable Home Energy Scheduling and Battery Management," IEEE Transactions on Industrial Electronics, vol. 64, no. 12, pp. 9527–9537, Dec. 2017.
[3] S. Delprat, T. Hofman, and S. Paganelli, "Hybrid Vehicle Energy Management: Singular Optimal Control," IEEE Transactions on Vehicular Technology, vol. 66, no. 11, pp. 9654–9666, Nov. 2017.
[4] Z. Chen, C. C. Mi, J. Xu, X. Gong, and C. You, "Energy Management for a Power-Split Plug-in Hybrid Electric Vehicle Based on Dynamic Programming and Neural Networks," IEEE Transactions on Vehicular Technology, vol. 63, no. 4, pp. 1567–1580, Apr. 2014.
[5] A. Arabali, M. Ghofrani, M. Etezadi-Amoli, M. S. Fadali, and Y. Baghzouz, "Genetic-Algorithm-Based Optimization Approach for Energy Management," IEEE Transactions on Power Delivery, vol. 28, no. 1, pp. 162–170, Jan. 2013.
[6] E. Kamal and L. Adouane, "Intelligent Energy Management Strategy Based on Artificial Neural Fuzzy for Hybrid Vehicle," IEEE Transactions on Intelligent Vehicles, vol. 3, no. 1, pp. 112–125, Mar. 2018.
[7] R.-J. Wai, S.-J. Jhung, J.-J. Liaw, and Y.-R. Chang, "Intelligent Optimal Energy Management System for Hybrid Power Sources Including Fuel Cell and Battery," IEEE Transactions on Power Electronics, vol. 28, no. 7, pp. 3231–3244, July 2013.
[8] S. J. Moura, H. K. Fathy, D. S. Callaway, and J. L. Stein, "A stochastic optimal control approach for power management in plug-in hybrid electric vehicles," IEEE Transactions on Control Systems Technology, vol. 19, no. 3, pp. 545–555, May 2011.
[9] K. Manickavasagam, "Intelligent Energy Control Center for Distributed Generators Using Multi-Agent System," IEEE Transactions on Power Systems, vol. 30, no. 5, pp. 2442–2449, Sep. 2015.
[10] M. Mao, P. Jin, N. D. Hatziargyriou, and L. Chang, "Multiagent-Based Hybrid Energy Management System for Microgrids," IEEE Transactions on Sustainable Energy, vol. 5, no. 3, pp. 938–946, July 2014.
[11] V.-H. Bui, A. Hussain, and H.-M. Kim, "A Multiagent-Based Hierarchical Energy Management Strategy for Multi-Microgrids Considering Adjustable Power and Demand Response," IEEE Transactions on Smart Grid, vol. 9, no. 2, pp. 1323–1333, Mar. 2018.
[12] N. Mendis, K. M. Muttaqi, and S. Perera, "Management of Battery-Supercapacitor Hybrid Energy Storage and Synchronous Condenser for Isolated Operation of PMSG Based Variable-Speed Wind Turbine Generating Systems," IEEE Transactions on Smart Grid, vol. 5, no. 2, pp. 944–953, Mar. 2014.
[13] U. Akram, M. Khalid, and S. Shafiq, "An Innovative Hybrid Wind-Solar and Battery-Supercapacitor Microgrid System–Development and Optimization," IEEE Access, vol. 5, pp. 25897–25912, 2017.
[14] S. Teleke, M. Baran, S. Bhattacharya, and A. Huang, "Optimal Control of Battery Energy Storage for Wind Farm Dispatching," IEEE Transactions on Energy Conversion, vol. 25, no. 3, pp. 787–794, Sep. 2010.
[15] M. Tekin, D. Hissel, M.-C. Pera, and J. Kauffmann, "Energy-Management Strategy for Embedded Fuel-Cell Systems Using Fuzzy Logic," IEEE Transactions on Industrial Electronics, vol. 54, no. 1, pp. 595–603, Feb. 2007.
[16] J. Moreno, M. Ortuzar, and J. Dixon, "Energy-management system for a hybrid electric vehicle, using ultracapacitors and neural networks," IEEE Transactions on Industrial Electronics, vol. 53, no. 2, pp. 614–623, Mar. 2006.
[17] H. Chaoui, M. Khayamy, and O. Okoye, "Adaptive RBF Network Based Speed Control for Interior PMSM Drives without Current Sensing," IEEE Transactions on Vehicular Technology, in press, 2018.
[18] H. Chaoui and C. Ibe-Ekeocha, "State of Charge and State of Health Estimation for Lithium Batteries using Recurrent Neural Networks," IEEE Transactions on Vehicular Technology, vol. 66, no. 10, pp. 8773–8783, Oct. 2017.
[19] H. Chaoui, M. Khayamy, and A. A. Aljarboua, "Adaptive Interval Type-2 Fuzzy Logic Control for PMSM Drives with a Modified Reference Frame," IEEE Transactions on Industrial Electronics, vol. 64, no. 5, pp. 3786–3797, May 2017.
[20] H. Chaoui, B. Hamane, and M. L. Doumbia, "Adaptive Control of Venturini Modulation Based Matrix Converters Using Interval Type-2 Fuzzy Sets," Journal of Control, Automation and Electrical Systems, Springer, vol. 7, no. 2, pp. 132–143, Apr. 2016.
[21] H. Chaoui and P. Sicard, "Adaptive Fuzzy Logic Control of Permanent Magnet Synchronous Machines with Nonlinear Friction," IEEE Transactions on Industrial Electronics, vol. 59, no. 2, pp. 1123–1133, Feb. 2012.
[22] F. Belletti, D. Haziza, G. Gomes, and A. M. Bayen, "Expert Level Control of Ramp Metering Based on Multi-Task Deep Reinforcement Learning," IEEE Transactions on Intelligent Transportation Systems, vol. 19, no. 4, pp. 1198–1207, Apr. 2018.
[23] H. Chaoui and P. Sicard, "Fuzzy Logic Based Supervisory Energy Management for Multisource Electric Vehicles," in IEEE Vehicle Power and Propulsion Conference, 2011.
[24] H. Chaoui, S. Miah, and P. Sicard, "Adaptive Fuzzy Logic Control of a DC-DC Boost Converter with Large Parametric and Load Uncertainties," in IEEE/ASME Advanced Intelligent Mechatronics International Conference, 2010.
[25] N. D. Nguyen, T. Nguyen, and S. Nahavandi, "System Design Perspective for Human-Level Agents Using Deep Reinforcement Learning: A Survey," IEEE Access, vol. 5, pp. 27091–27102, 2017.
[26] L. Li, Y. Lv, and F.-Y. Wang, "Traffic signal timing via deep reinforcement learning," IEEE/CAA Journal of Automatica Sinica, vol. 3, no. 3, pp. 247–254, July 2016.
[27] S. S. Mousavi, M. Schukat, and E. Howley, "Traffic light control using deep policy-gradient and value-function-based reinforcement learning," IET Intelligent Transport Systems, vol. 11, no. 7, pp. 417–423, 2017.
[28] T. de Bruin, J. Kober, K. Tuyls, and R. Babuška, "Integrating State Representation Learning Into Deep Reinforcement Learning," IEEE Robotics and Automation Letters, vol. 3, no. 3, pp. 1394–1401, July 2018.
[29] D. Zhao, Y. Chen, and L. Lv, "Deep Reinforcement Learning With Visual Attention for Vehicle Classification," IEEE Transactions on Cognitive and Developmental Systems, vol. 9, no. 4, pp. 356–367, Dec. 2017.
[30] H. Chaoui, C. Ibe-Ekeocha, and H. Gualous, "Aging Prediction and State of Charge Estimation of a LiFePO4 Battery using Input Time-Delayed Neural Networks," Electric Power Systems Research, Elsevier, vol. 146, pp. 189–197, May 2017.
[31] H. Chaoui, N. Golbon, I. Hmouz, R. Souissi, and S. Tahar, "Lyapunov-Based Adaptive State of Charge and State of Health Estimation for Lithium-Ion Batteries," IEEE Transactions on Industrial Electronics, vol. 62, no. 3, pp. 1610–1618, Mar. 2015.
[32] H. Chaoui and S. Mandalapu, "Comparative Study of Online Open Circuit Voltage Estimation Techniques for State of Charge Estimation of Lithium-Ion Batteries," Batteries, vol. 3, pp. 1–13, Apr. 2017.