
http://communications.science.ankara.edu.tr

Commun. Fac. Sci. Univ. Ank. Ser. A1 Math. Stat.
Volume 74, Number 1, Pages 150–161 (2025)
https://doi.org/10.31801/cfsuasmas.1512961
ISSN 1303-5991, E-ISSN 2618-6470

Research Article; Received: July 9, 2024; Accepted: January 2, 2025

An application of stochastic maximum principle for a constrained system with memory

Emel SAVKU
Department of Computer Engineering, Atilim University, Ankara, TÜRKİYE
esavku@gmail.com; ORCID: 0000-0001-8731-2928

Abstract. In this research article, we study a stochastic control problem in a theoretical frame to solve a constrained task under memory impact. The memory is modeled by Stochastic Differential Delay Equations, and our state process evolves according to a jump-diffusion process with time delay. We work on two specific types of constraints, which enter the stochastic control problem as running gain components. We develop two theorems for the corresponding deterministic and stochastic Lagrange multipliers. Furthermore, these theorems are applicable to a wide range of continuous-time stochastic optimal control problems in diverse scientific areas such as Operations Research, Biology, Computer Science, Engineering, and Finance. In this work, we apply our results to a financial problem to investigate the optimal consumption process of a company via its wealth process with historical performance. We utilize the stochastic maximum principle, which is one of the main methods of continuous-time Stochastic Optimal Control theory. Moreover, we compute a real-valued Lagrange multiplier and clarify the relation between this value and the specified constraint.

2020 Mathematics Subject Classification. 93E20, 93E03, 49N90, 60G07, 60J76, 91B16, 91G80.
Keywords. Stochastic optimal control, stochastic maximum principle, stochastic differential delay equa-
tions, Lagrange multiplier, anticipated backward stochastic differential equations

1. Introduction and Unconstrained Control Problem


Stochastic Optimal Control theory is one of the main fields of sequential decision-making under uncer-
tainty. Its fundamental goal is to determine the optimal control processes and the optimal value function
for a specified control task, see [21,22,35]. The state process of a control problem is generally represented
by a diffusion process, a jump-diffusion process or by a larger model such as a regime-switching process,
see [2, 11, 15, 25, 28, 30, 33]. These processes meet the specific mathematical requirements of each problem in a wide range of scientific disciplines such as finance, insurance, biology, computer science, engineering, etc. Whenever the uncertainty in an application can be expressed as a continuous-time process, diffusion processes can be used effectively. On the other hand, real-life applications usually require discontinuous formulations, and in those cases jump-diffusion processes and regime-switching models describe well the sudden changes in the process as well as in the environment.
In financial applications especially, the state processes may represent the price process of a risky asset, the wealth process of a company, the surplus process of an insurance policy, etc. Furthermore, since stochastic control theory provides quite strong tools to handle uncertainty and to develop optimal feedback controls, it is widely utilized in quantitative finance, see [6, 9, 12, 16, 26, 27, 34]. In this work, we use a jump-diffusion model to represent the wealth process of a company; it is well known that such models efficiently describe the abrupt changes in the dynamics of a risky asset (for a broad literature, see [3]). The probabilistic literature on jump processes has been extensively developed and applied in financial mathematics, see also [1].
Moreover, in our article, we study a stochastic control problem with memory and constraints. The memory component is represented by a time-delay term, δ > 0, in the dynamics of a Stochastic Differential Delay Equation (SDDE) (for a comprehensive theory of such equations, see [17]). SDDEs express real-life financial phenomena more realistically, capturing the historical performance of risky assets, economic inertia, and time lags in financial operations. Hence, such systems have received significant

attention from researchers in the Stochastic Optimal Control field, see [7, 8, 14, 20, 23, 25, 28] and the references therein.
Let us introduce the technical details and mathematical structure of our work:
As we stated, we use a jump-diffusion process with delay as the state process of our control task (for
a detailed theory of continuous-time stochastic processes, see [1, 13, 19] and references therein).
Let R₀ := R \ {0}, and let B₀ denote the Borel σ-field generated by the open subsets O of R₀ whose closures do not contain the point 0.
Let (N(dt, dz) : t ∈ [0, T], z ∈ R₀) be a Poisson random measure on ([0, T] × R₀, B([0, T]) ⊗ B₀). The Lévy measure of N(·,·) is denoted by ν, and Ñ(dt, dz) := N(dt, dz) − ν(dz)dt is the compensated Poisson random measure.
Let (W (t) : t ∈ [0, T ]) be a Brownian motion. (Ω, F, Ft , P) represents a complete filtered probability
space generated by the Brownian motion W (·) and the Poisson random measure N (·, ·). We define
F = (Ft : t ∈ [0, T ]) as a right-continuous, P-completed filtration and assume that the Brownian motion
and the Poisson random measure are independent of each other and adapted to F.
We follow a controlled jump-diffusion model with a constant delay term δ > 0, which is one of the most general representations of such systems and is introduced in [20] as follows:

$$\begin{aligned}
dX(t) &= b(t, X(t), Y(t), A(t), u(t))\,dt + \sigma(t, X(t), Y(t), A(t), u(t))\,dW(t)\\
&\quad + \int_{R_0} \eta(t, X(t), Y(t), A(t), u(t), z)\,\tilde{N}(dt, dz), \qquad (1)\\
X(t) &= \theta(t), \quad t \in [-\delta, 0],
\end{aligned}$$

where for t ∈ [0, T],

$$Y(t) = X(t-\delta), \qquad A(t) = \int_{t-\delta}^{t} e^{-\rho(t-r)} X(r)\,dr.$$

The coefficient functions of the model are defined as:

b : [0, T] × R × R × R × U → R,
σ : [0, T] × R × R × R × U → R,
η : [0, T] × R × R × R × U × R₀ → R,

and generally, in financial applications, b, σ, and η represent the appreciation rate, the volatility, and the jump size of a risky asset, respectively.
Moreover, while the Brownian motion W(·) captures small shocks in the price process of an asset, the Poisson random measure N(·,·) captures the jumps of that process, which occur as a consequence of abrupt changes, sudden news, or big sell/buy orders in the financial markets.
In this model, the memory component appears in the dynamics of the system through the Y(·) and A(·) terms. Note that for systems described by SDDEs, rather than an initial value, we need an initial path. θ(·) represents the initial path and is a continuous, deterministic function. Here, ρ ≥ 0 is a constant averaging parameter.
We assume that U is a non-empty subset of R and represents the set of admissible control values u(t), t ∈ [0, T]. We define an admissible control process u(·) as a U-valued, F_t-measurable and càdlàg process such that Equation (1) has a unique solution X(·) ∈ L²(ξ × P), where ξ represents the Lebesgue measure on [0, T]. Let A denote the family of admissible control processes (for more detail, see [20]). Moreover, we assume that

$$E\left[\int_0^T |u(t)|^2\,dt\right] < \infty.$$
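For illustration, the delayed dynamics (1) can be explored numerically with an Euler scheme that keeps a buffer of past states for the delayed terms Y(t) and A(t). The following sketch is purely illustrative: the linear drift, the proportional volatility, the constant control, and the finite Lévy measure with mean-zero Gaussian jump sizes (so that the compensator term of η vanishes) are assumptions, not specifications of this paper.

```python
import numpy as np

# A minimal Euler scheme for the delayed jump-diffusion (1).
# Illustrative coefficient assumptions (not the paper's):
#   b = mu0*Y + a0*A + u0,  sigma = sig0*Y,  eta = Y*z,
# with the finite Levy measure nu = lam_J * N(0, jump_sd^2); jump sizes
# have mean zero, so the compensator integral of eta vanishes.
rng = np.random.default_rng(0)

T, delta, rho = 1.0, 0.1, 0.05
dt = 1e-3
n, lag = int(T / dt), int(delta / dt)
mu0, a0, u0, sig0 = 0.3, 0.05, 0.1, 0.2
lam_J, jump_sd = 2.0, 0.05

theta = lambda s: 1.0                  # initial path on [-delta, 0]

X = np.empty(n + lag + 1)              # X on the time grid of [-delta, T]
X[: lag + 1] = [theta(-delta + j * dt) for j in range(lag + 1)]

for i in range(lag, lag + n):
    Y = X[i - lag]                                     # Y(t) = X(t - delta)
    w = np.exp(-rho * (delta - dt * np.arange(lag)))   # e^{-rho (t - r)} weights
    A = np.dot(w, X[i - lag : i]) * dt                 # A(t) as a Riemann sum
    dW = rng.normal(0.0, np.sqrt(dt))
    k = rng.poisson(lam_J * dt)                        # jumps arriving in [t, t + dt)
    dNtilde = rng.normal(0.0, jump_sd, k).sum()        # compensated jump increment
    X[i + 1] = X[i] + (mu0 * Y + a0 * A + u0) * dt + sig0 * Y * dW + Y * dNtilde

print("simulated X(T) =", X[-1])
```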

For all u ∈ A, let us define the objective criterion in the classical sense (for a broad survey of the Stochastic Optimal Control theory, see [21, 22, 32] and references therein) as follows:

$$J(u) = J(x, y, a, u) = E\left[\int_0^T f(t, X(t), Y(t), A(t), u(t))\,dt + g(X(T))\right], \qquad (2)$$

where f : [0, T] × R × R × R × U → R represents the running gain and g : R → R corresponds to the terminal gain of the control task. Here, we assume that f and g are C¹-functions with respect to x, y, a, u such that for all x_i ∈ {x, y, a, u},

$$E\left[\int_0^T \Big( |f(t, X(t), Y(t), A(t), u(t))| + \Big|\frac{\partial f}{\partial x_i}(t, X(t), Y(t), A(t), u(t))\Big|^2 \Big)\,dt + |g(X(T))| + |g_x(X(T))|^2 \right] < \infty.$$
Hence, in a classical unconstrained stochastic control problem, our goal is to find the optimal control u* ∈ A such that

$$J(u^*) = \sup_{u \in A} J(u). \qquad (3)$$
On the other hand, in this work, we formulate the constraints inspired by Theorem 11.3.1 of [19], but with completely different constraint types. In that theorem, the author presents an approach for stochastic control tasks with a condition at the terminal time T > 0 for a diffusion process. Later, [4] gave an application of this theorem, and [28] extended it to stochastic differential games with regimes.
Furthermore, [5] stated a version of Theorem 11.3.1 of [19] with the constraint types (5) and (6) for a jump-diffusion process. These constraints lead to deterministic and stochastic Lagrange multipliers, respectively, and differ from the terminal conditions given in Theorem 11.3.1 of [19]. However, the authors of [5] do not investigate the Lagrange multipliers themselves, although they note that their existence is a crucial condition for applying the proved theorems, see Theorems 5.2 and 5.4 of [5]. In our work, we study a stochastic control problem for a jump-diffusion process with memory and the constraints defined by (5) and (6). Hence, our work extends the theorems of [5] to a delayed model. Moreover, we develop an application for which the corresponding Lagrange multiplier exists. In that sense, we should underline that our work is the first to complete the desired task with the constraints (5)-(6); also, by inserting a delay term, we study a larger model.
We prefer not to impose many technical conditions on b, σ, η in this section. In Section 2, we will develop two fundamental theorems for approaching stochastic control problems with the constraints (5) and (6). Such problems can be solved by both the Stochastic Maximum Principle (SMP) and the Dynamic Programming Principle (DPP). Thus, the technical assumptions have to be determined specifically depending on the preferred method. We will highlight them in Section 3, while we study an optimal consumption problem.
This article is organized as follows: In Section 2, we introduce the mathematical formulation of our
constrained stochastic control problem and demonstrate the corresponding theorems in a Lagrangian
environment. Section 3 is devoted to developing a financial application, which formulates the optimal
consumption process of a company with memory. The final section gives a conclusion.

2. Reformulation of the Control Task within the Context of Constraints


In this section, we develop two theorems which describe the optimal control process and investigate
the corresponding Lagrange multipliers for a time-delayed stochastic control system.
Firstly, let us state the value function of the constrained control problem:

$$\varphi(x, y, a) = J(u^*) = \sup_{u \in \Theta} J(x, y, a, u). \qquad (4)$$

Here, J(·) is defined by Equation (2) and the supremum is taken over the set Θ of all admissible controls u : [0, T] → U ⊂ R such that

$$E\left[\int_0^T M(t, X(t), Y(t), A(t), u(t))\,dt\right] = 0, \qquad (5)$$

or

$$\int_0^T M(t, X(t), Y(t), A(t), u(t))\,dt = 0 \quad a.s. \qquad (6)$$

M : [0, T] × R × R × R × U → R is a C¹ function with respect to x, y, and a such that for x_i ∈ {x, y, a, u}:

$$E\left[\int_0^T \Big( |M(t, X(t), Y(t), A(t), u(t))| + \Big|\frac{\partial M}{\partial x_i}(t, X(t), Y(t), A(t), u(t))\Big|^2 \Big)\,dt \right] < \infty.$$

Here, we study two types of constraints: the constraint type (5) leads to a real-valued Lagrange multiplier, while the type (6) leads to a stochastic one.
Thus, we should specify the set of stochastic Lagrange multipliers as in [27]:

$$\Delta = \big\{ \lambda : \Omega \to R \mid \lambda \text{ is } F_T\text{-measurable and } E[|\lambda|] < \infty \big\}.$$

Now, by observing Equation (4) and the constraints (5) and (6), let us present the unconstrained stochastic control problem in the following way:

$$\begin{aligned}
\varphi^{\lambda}(x, y, a) &= \sup_{u \in \Theta} J(x, y, a, u)\\
&= \sup_{u \in \Theta} E^{x,y,a}\left[ \int_0^T f(t, X(t), Y(t), A(t), u(t))\,dt + g(X^u(T)) + \lambda \int_0^T M(t, X(t), Y(t), A(t), u(t))\,dt \right], \qquad (7)
\end{aligned}$$

subject to the system (1).


First, we will prove the following theorem corresponding to the type (6):

Theorem 1. Assume that for all λ ∈ ∆₁ ⊂ ∆, we can develop φ^λ(x, y, a) and the optimal control process u^{*,λ}, which solves the unconstrained stochastic control problem (7) subject to the system (1). Moreover, assume that there exists λ₀ ∈ ∆₁ such that

$$\int_0^T M\big(t, X^{u^{*,\lambda_0}}_t, Y^{u^{*,\lambda_0}}_t, A^{u^{*,\lambda_0}}_t, u^{*,\lambda_0}_t\big)\,dt = 0 \quad a.s. \qquad (8)$$

Then, φ(x, y, a) = φ^{λ₀}(x, y, a) is obtained and u* = u^{*,λ₀} solves the constrained stochastic control problem (3) subject to (1) and (6).
Proof. The first inequality follows from the definition of the optimal value function:

$$\begin{aligned}
\varphi^{\lambda}(x, y, a) &= J(x, y, a, u^{*,\lambda})\\
&= E^{x,y,a}\bigg[ \int_0^T f\big(t, X^{u^{*,\lambda}}_t, Y^{u^{*,\lambda}}_t, A^{u^{*,\lambda}}_t, u^{*,\lambda}_t\big)\,dt + \lambda \int_0^T M\big(t, X^{u^{*,\lambda}}_t, Y^{u^{*,\lambda}}_t, A^{u^{*,\lambda}}_t, u^{*,\lambda}_t\big)\,dt + g\big(X^{u^{*,\lambda}}_T\big) \bigg]\\
&\geq J(x, y, a, u^{\lambda})\\
&= E^{x,y,a}\bigg[ \int_0^T f\big(t, X^{u^{\lambda}}_t, Y^{u^{\lambda}}_t, A^{u^{\lambda}}_t, u^{\lambda}_t\big)\,dt + \lambda \int_0^T M\big(t, X^{u^{\lambda}}_t, Y^{u^{\lambda}}_t, A^{u^{\lambda}}_t, u^{\lambda}_t\big)\,dt + g\big(X^{u^{\lambda}}_T\big) \bigg]. \qquad (9)
\end{aligned}$$

In particular, if λ = λ₀ exists, then since u^{λ₀} ∈ Θ is feasible in the constrained control problem (3), we obtain by (8):

$$\int_0^T M\big(t, X^{u^{*,\lambda_0}}_t, Y^{u^{*,\lambda_0}}_t, A^{u^{*,\lambda_0}}_t, u^{*,\lambda_0}_t\big)\,dt = \int_0^T M\big(t, X^{u^{\lambda_0}}_t, Y^{u^{\lambda_0}}_t, A^{u^{\lambda_0}}_t, u^{\lambda_0}_t\big)\,dt = 0. \qquad (10)$$

Therefore, by (9) and (10):

$$\varphi^{\lambda_0}(x, y, a) = J(u^{*,\lambda_0}) = J(x, y, a, u^{*,\lambda_0}) \geq J(x, y, a, u) = J(u),$$

for all u ∈ Θ. Note that u^{*,λ₀} ∈ Θ, and this completes the proof. □

The following theorem can be proved similarly for the constraint type (5).

Theorem 2. Assume that for all λ ∈ K ⊂ R, we can determine φ^λ(x, y, a) and the optimal control process u^{*,λ} solving the unconstrained stochastic control problem (7) subject to (1). Furthermore, assume that there exists λ₀ ∈ K such that

$$E\left[\int_0^T M\big(t, X^{u^{*,\lambda_0}}_t, Y^{u^{*,\lambda_0}}_t, A^{u^{*,\lambda_0}}_t, u^{*,\lambda_0}_t\big)\,dt\right] = 0.$$

Then, φ(x, y, a) = φ^{λ₀}(x, y, a) and u* = u^{*,λ₀} solves the constrained stochastic control problem (3) subject to the model (1) and the constraint (5).
Remark 1. Theorems 1 and 2 can be applied to a wide range of stochastic control problems by both SMP and DPP, as long as it is possible to determine the corresponding Lagrange multipliers. If we prefer to apply DPP, we should be careful about the Markov property. SDDEs provide a more realistic modeling environment, but we lose the Markov property. Moreover, since we have an initial path instead of an initial value for the system (1), our problem leads to the corresponding partial differential equations, the so-called Hamilton-Jacobi-Bellman equations, in an infinite-dimensional space. Hence, a direct application of DPP is not mathematically possible (for more details on handling such problems by DPP, see [7, 8, 14] and the references therein).
Remark 2. To utilize the SMP, unlike the DPP, we do not need any Markovian assumption. Hence, in this work, we will combine the method described in Theorem 2 of our paper with Theorem 3.1 and Theorem 4.1 of [20] to find the optimal consumption process by the SMP.
Remark 3. Our work is inspired by Theorem 11.3.1 of [19], but we should highlight that the constraint of Theorem 11.3.1 of [19] is defined at the terminal time T as:

$$E[M(X^u_T)] = 0, \qquad (11)$$

which is completely different from our constraints (5)-(6). We impose a condition on the running gain component rather than on the terminal gain. Moreover, similar constraints appear in [5], but neither [19] nor [5] includes a memory impact.
Remark 4. In [30], we studied the memory impact within the framework of Lagrange multipliers similar to Equation (11), which is a different type of constraint, as we stated in Remark 3. Furthermore, in [30], we focused on a dividend policy application in a regime-switching environment with a different control formulation. Our present work and [30] share a similar philosophy with completely different constraints and financial formulations.
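We also note how λ₀ in Theorem 2 can be located in practice: once the unconstrained problem (7) is solvable for each fixed λ, the multiplier is a root of the scalar map λ ↦ E[∫₀ᵀ M(t, X, Y, A, u^{*,λ}) dt]. The following sketch illustrates this with a simple bisection; the routine constraint_gap is hypothetical (e.g., a Monte Carlo estimator built upon the solution of (7)) and is not provided in this paper.

```python
# A sketch of locating the real-valued multiplier lambda_0 of Theorem 2 by
# bisection. `constraint_gap` is a hypothetical user-supplied routine: for a
# fixed lam it solves the unconstrained problem (7) and returns a (Monte
# Carlo) estimate of E[ int_0^T M(t, X, Y, A, u^{*,lam}) dt ].
def find_multiplier(constraint_gap, lo, hi, tol=1e-8, max_iter=100):
    g_lo, g_hi = constraint_gap(lo), constraint_gap(hi)
    if g_lo * g_hi > 0:
        raise ValueError("bracket [lo, hi] does not straddle a root")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        g_mid = constraint_gap(mid)
        if abs(g_mid) < tol:
            return mid
        if g_lo * g_mid <= 0:     # root lies in [lo, mid]
            hi = mid
        else:                     # root lies in [mid, hi]
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)
```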
Now, let us present an application of Theorem 2 in finance.

3. Application to Finance
In this section, we will develop the formulation of an optimal consumption process that corresponds
to the wealth process of a company with memory. This process evolves according to a time-delayed
jump-diffusion model. The dynamics of the model carry past values of the wealth process in the form
of Y (t) = X(t − δ), t ∈ [0, T ], where δ > 0 is a constant. Our purpose is to develop a more realistic
consumption policy, which depends on the information about the historical performance of the company
as well.
µ(·) is a deterministic function and represents the appreciation rate of the company. Furthermore, we suppose that σ(t) and η(t, z), t ∈ [0, T], are given bounded, square-integrable and adapted processes. U is a non-empty, closed and convex subset of R. In this section, our problem formulation satisfies the technical assumptions provided in [20]; thus, we are allowed to apply Theorem 3.1 and Theorem 4.1 of that article.
The consumption process is a càdlàg, F_t-adapted control process, which satisfies:

$$E\left[\int_0^T |c(t)|^2\,dt\right] < \infty.$$
Let us state the wealth process X(t) = X^c(t), which is a special form of Equation (1), as follows:

$$\begin{aligned}
dX(t) &= \big(X(t-\delta)\mu(t) - c(t)\big)\,dt + X(t-\delta)\Big(\sigma(t)\,dW(t) + \int_{R_0} \eta(t, z)\,\tilde{N}(dt, dz)\Big), \quad t \in [0, T], \qquad (12)\\
X(t) &= \theta(t), \quad t \in [-\delta, 0],
\end{aligned}$$
where θ(·) is a given nonnegative, deterministic and continuous function.
We assume that the company wants to maximize its wealth despite a quadratic running loss, balancing it against a constraint of linear running gain described in terms of the control process. Moreover, the company aims to reach a level equal to a constant K times the terminal time T > 0, so we assume that the company takes time restrictions into account as well. We will develop and highlight the conditions on K at the end of our computations. Hence, our goal is to find the optimal consumption process c*(·) by solving:

$$J(c^*) = \sup_{c \in \Theta} J(c) = \sup_{c \in \Theta} E\left[\int_0^T \alpha(t)\,c^2(t)\,dt + \beta X(T)\right],$$

subject to the system (12) and to the constraint:

$$E\left[\int_0^T \gamma(t)\,c(t)\,dt\right] = TK, \quad K \in R, \qquad (13)$$

where α(·) < 0 and γ(·) are deterministic functions and β ∈ R.
Now we can develop the Lagrangian form of this stochastic control problem as follows:

$$J(c^*) = \sup_{c \in \Theta} J(c) = \sup_{c \in \Theta} E\left[\int_0^T \alpha(t)\,c^2(t)\,dt + \lambda \int_0^T \big(\gamma(t)\,c(t) - K\big)\,dt + \beta X(T)\right], \qquad (14)$$

for which we aim to find c* = c^{λ,*} and the real-valued Lagrange multiplier λ = λ₀ described in Theorem 2.
Since we apply the SMP to solve the problem (14), we first define the Hamiltonian corresponding to the wealth process (12):

$$H(t, x, y, a, c, p, q, r(\cdot)) = \alpha(t)\,c^2 + \lambda(\gamma(t)\,c - K) + (\mu(t)\,y - c)\,p + y\,\sigma(t)\,q + y \int_{R_0} \eta(t, z)\,r(t, z)\,\nu(dz). \qquad (15)$$

Note that the Hamiltonian H is clearly concave in x, y, a, and c; hence the concavity condition on H required by Theorem 3.1 of [20] is satisfied.
Furthermore, we should present the corresponding Anticipated Backward Stochastic Differential Equation (Anticipated BSDE) and solve it for the unknowns p(t), q(t), and r(t, z). For t ∈ [0, T], let us introduce:

$$\begin{aligned}
dp(t) ={}& -E\Big[\Big(\mu(t+\delta)\,p(t+\delta) + \sigma(t+\delta)\,q(t+\delta) + \int_{R_0} \eta(t+\delta, z)\,r(t+\delta, z)\,\nu(dz)\Big)\,1_{[0, T-\delta]}(t) \,\Big|\, F_t\Big]\,dt\\
&+ q(t)\,dW(t) + \int_{R_0} r(t, z)\,\tilde{N}(dt, dz), \qquad (16)\\
p(T) ={}& \beta. \qquad (17)
\end{aligned}$$
We call this type of BSDE Anticipated since, as seen in µ, σ, η, p(·), q(·), and r(·,·), the terms involve time-advanced values of the form t + δ for t ∈ [0, T]. This type of BSDE was first introduced and developed by Peng and Yang, see [23]. For the technical definitions of the Hamiltonian (15) and the system (16)-(17), please see the Appendix or Section 2 in [20]. Furthermore, see [25, 28] for the formulation of Anticipated BSDEs and their relation with SDDEs via different models.
We follow the technique described in [20] to find the solution for p(·), q(·), and r(·,·), which will be computed inductively in the following way:

Step 1: For t ∈ [T − δ, T], the corresponding adjoint equation becomes:

$$dp(t) = q(t)\,dW(t) + \int_{R_0} r(t, z)\,\tilde{N}(dt, dz), \qquad p(T) = \beta,$$

for which we have the solution:

$$p(t) = E[p(T) \mid F_t] = \beta, \quad t \in [T-\delta, T].$$

By the martingale representation theorem, since the Lagrange multiplier is a real value, we choose q = r = 0. Hence, the Anticipated BSDE takes the form:

$$dp(t) = -\mu(t+\delta)\,p(t+\delta)\,1_{[0, T-\delta]}(t)\,dt, \quad t \leq T, \qquad p(t) = \beta, \quad t \in [T-\delta, T].$$
Step 2: We define:

$$h(t) = p(T - t), \quad t \in [0, T]. \qquad (18)$$

In this way, we get a deterministic delay equation:

$$dh(t) = -dp(T-t) = \mu(T-t+\delta)\,p(T-t+\delta)\,dt = \mu(T-t+\delta)\,h(t-\delta)\,dt, \quad t \in [\delta, T],$$
$$h(t) = p(T-t) = \beta, \quad t \in [0, \delta].$$
For such equations, again, we have an inductive solution approach. Since we can compute h(t) on [(j−1)δ, jδ], we obtain:

$$h(t) = h(j\delta) + \int_{j\delta}^t h'(s)\,ds = h(j\delta) + \int_{j\delta}^t \mu(T-s+\delta)\,h(s-\delta)\,ds \qquad (19)$$

for t ∈ [jδ, (j+1)δ], j = 1, 2, ....
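The recursion (19) is straightforward to evaluate numerically by marching across the intervals [jδ, (j + 1)δ]. A minimal sketch, with a constant appreciation rate µ and terminal value β as illustrative assumptions:

```python
import numpy as np

# Evaluates h from (19) on a uniform grid of [0, T], stepping through the
# intervals [j*delta, (j+1)*delta]; mu and beta are illustrative choices.
def solve_h(mu, beta, delta, T, m=1001):
    t = np.linspace(0.0, T, m)
    dt = t[1] - t[0]
    lag = int(round(delta / dt))
    h = np.full(m, beta)                 # h = beta on [0, delta], cf. Step 1
    for i in range(1, m):
        if t[i] > delta:
            # h'(t) = mu(T - t + delta) * h(t - delta)
            h[i] = h[i - 1] + mu(T - t[i] + delta) * h[i - lag] * dt
    return t, h

t, h = solve_h(mu=lambda s: 0.3, beta=2.0, delta=0.5, T=1.0)
# for constant mu: h(T) = beta * (1 + mu * (T - delta)) = 2.3
print("h(T) =", h[-1])
```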


Now, we should maximize the Hamiltonian (15) with respect to c to get:

$$c^*(t) = \frac{1}{2}\,\alpha(t)\big(p(t) - \lambda\gamma(t)\big), \quad t \in [0, T]. \qquad (20)$$
As a consequence of the nature of constrained stochastic control problems, we should compute the value of the Lagrange multiplier λ₀ in order to use Theorem 2 properly. Solving stochastic delay equations requires special approaches, different from those for usual stochastic differential equations. By Equation (20), the wealth process becomes:

$$\begin{aligned}
dX(t) &= \Big(X(t-\delta)\mu(t) - \frac{1}{2}\alpha(t)\big(p(t) - \lambda\gamma(t)\big)\Big)\,dt + X(t-\delta)\Big(\sigma(t)\,dW(t) + \int_{R_0} \eta(t, z)\,\tilde{N}(dt, dz)\Big), \quad t \in [0, T], \qquad (21)\\
X(t) &= \theta(t), \quad t \in [-\delta, 0].
\end{aligned}$$
We know that the SDDE (21) can be solved by successive Itô integrations over steps of length δ (see Section 1, page 7 in [18]). Specifically, we assume that the terminal time is T = 2δ. This assumption is made purely for the sake of simplicity and does not restrict the general methodology of the technique. Thus, the total duration that we study is the interval [−δ, 2δ]. A numerical illustration of this δ-step procedure is sketched below.
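The δ-step procedure can be mirrored pathwise in a short simulation: on [0, δ] the delayed factor X(t − δ) is the known initial path θ, while on [δ, 2δ] it is the segment computed in the previous pass. In the sketch below, the constant coefficients and the value of λ are illustrative assumptions, p(t) = h(T − t) is taken from Steps 1 and 2, and the jump term of (21) is omitted for brevity.

```python
import numpy as np

# Pathwise stepping of the SDDE (21) over [0, 2*delta]: on [0, delta] the
# delayed factor X(t - delta) is the known initial path theta; on
# [delta, 2*delta] it is the segment computed in the previous pass.
# Constant coefficients and lam are illustrative assumptions; the jump
# term of (21) is omitted here for brevity.
rng = np.random.default_rng(1)
delta, m = 0.5, 500
dt = delta / m
beta, lam = 2.0, 0.7
mu, sig, alpha, gamma = 0.3, 0.2, -1.0, 1.0
theta = lambda s: 1.0

def p_of(t):   # p(t) = h(2*delta - t), cf. (18)-(19), for constant mu
    return beta * (1.0 + mu * (delta - t)) if t < delta else beta

X = np.empty(2 * m + 1)
X[0] = theta(0.0)
for i in range(2 * m):
    t = i * dt
    Xdel = theta(t - delta) if t < delta else X[i - m]   # X(t - delta)
    drift = Xdel * mu - 0.5 * alpha * (p_of(t) - lam * gamma)
    X[i + 1] = X[i] + drift * dt + Xdel * sig * rng.normal(0.0, np.sqrt(dt))

print("one path: X(2*delta) =", X[-1])
```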
First, for t ∈ [0, T], let us define:

$$dL(t) = \sigma(t)\,dW(t) + \int_{R_0} \eta(t, z)\,\tilde{N}(dt, dz).$$

By also observing (18) and (19), we obtain the following open form of the solution process:

$$X(t) = \theta(t), \quad -\delta \le t \le 0,$$

$$X(t) = \theta(0) + \int_0^t \Big(\theta(s-\delta)\mu(s) - \frac{1}{2}\alpha(s)\big(h(T-s) - \lambda\gamma(s)\big)\Big)\,ds + \int_0^t \theta(s-\delta)\,dL(s), \quad 0 \le t \le \delta,$$

$$\begin{aligned}
X(t) ={}& X(\delta) + \int_\delta^t \Bigg[\Big(\theta(0) + \int_0^{v-\delta} \Big(\theta(s-\delta)\mu(s) - \frac{1}{2}\alpha(s)\big(h(T-s) - \lambda\gamma(s)\big)\Big)\,ds\\
&+ \int_0^{v-\delta} \theta(s-\delta)\,dL(s)\Big)\mu(v) - \frac{1}{2}\alpha(v)\big(h(T-v) - \lambda\gamma(v)\big)\Bigg]\,dv\\
&+ \int_\delta^t \Big(\theta(0) + \int_0^{v-\delta} \Big(\theta(s-\delta)\mu(s) - \frac{1}{2}\alpha(s)\big(h(T-s) - \lambda\gamma(s)\big)\Big)\,ds\\
&+ \int_0^{v-\delta} \theta(s-\delta)\,dL(s)\Big)\,dL(v), \quad \delta \le t \le 2\delta = T.
\end{aligned}$$

Now, the values of h(T − t), t ∈ [0, T], in the above integrals can be determined by following the boundary values of the integrals and their relation with t. Remember that T = 2δ. Then, by (19):

If 0 ≤ s ≤ t ≤ δ, then δ ≤ T − s ≤ 2δ, and

$$h(2\delta - s) = h(\delta) + \int_\delta^{2\delta-s} \mu(3\delta - u)\,h(u-\delta)\,du = \beta\Big(1 + \int_\delta^{2\delta-s} \mu(3\delta - u)\,du\Big).$$

Moreover, if 0 ≤ s ≤ v − δ and δ ≤ v ≤ t ≤ 2δ, then 0 ≤ v − δ ≤ t − δ ≤ δ, so 0 ≤ s ≤ δ and δ ≤ 2δ − s ≤ 2δ; hence, by (19),

$$h(2\delta - s) = \beta\Big(1 + \int_\delta^{2\delta-s} \mu(3\delta - u)\,du\Big).$$

Finally, if δ ≤ v ≤ t ≤ 2δ, then 0 ≤ 2δ − v ≤ δ; hence, by (18), h(2δ − v) = β.
Firstly, we substitute the value of h(·) according to the relevant intervals in the above solution process and integrate Equation (21) from 0 to 2δ, following the above δ-length description of X(·). Then, we take expectations on both sides of Equation (21).
Now, let us introduce the following terms:

$$A = \int_0^{2\delta} \alpha(s)\gamma(s)\,ds + E\left[\int_\delta^{2\delta} \Big(\int_0^{v-\delta} \alpha(s)\gamma(s)\,ds\Big)\big(\mu(v)\,dv + dL(v)\big)\right]$$

and

$$\begin{aligned}
B ={}& \theta(0) - E\big[X(2\delta)\big] + \int_0^{\delta} \theta(s-\delta)\mu(s)\,ds - \frac{\beta}{2}\int_0^{\delta} \alpha(s)\Big(1 + \int_\delta^{2\delta-s} \mu(3\delta - u)\,du\Big)\,ds + E\left[\int_0^{\delta} \theta(s-\delta)\,dL(s)\right]\\
&+ \int_\delta^{2\delta} \theta(0)\mu(v)\,dv + \int_\delta^{2\delta} \Big(\int_0^{v-\delta} \theta(s-\delta)\mu(s)\,ds\Big)\mu(v)\,dv\\
&- \frac{\beta}{2}\int_\delta^{2\delta} \Big(\int_0^{v-\delta} \alpha(s)\Big(1 + \int_\delta^{2\delta-s} \mu(3\delta - u)\,du\Big)\,ds\Big)\mu(v)\,dv + E\left[\int_\delta^{2\delta} \Big(\int_0^{v-\delta} \theta(s-\delta)\,dL(s)\Big)\mu(v)\,dv\right]\\
&- \frac{\beta}{2}\int_\delta^{2\delta} \alpha(v)\,dv + E\left[\int_\delta^{2\delta} \Big(\theta(0) + \int_0^{v-\delta} \theta(s-\delta)\mu(s)\,ds - \frac{\beta}{2}\int_0^{v-\delta} \alpha(s)\Big(1 + \int_\delta^{2\delta-s} \mu(3\delta - u)\,du\Big)\,ds + \int_0^{v-\delta} \theta(s-\delta)\,dL(s)\Big)\,dL(v)\right].
\end{aligned}$$
Then, we get:

$$\lambda = \frac{2B}{A}, \qquad (22)$$

on condition that A ≠ 0.
Now, by (18) and (19), let us make some observations about the constraint (13):

$$\begin{aligned}
E\left[\int_0^T \gamma(t)\,c(t)\,dt\right] ={}& \frac{1}{2}\int_0^{\delta} \gamma(t)\alpha(t)\Big(\beta\Big(1 + \int_\delta^{2\delta-t} \mu(3\delta - u)\,du\Big) - \lambda\gamma(t)\Big)\,dt\\
&+ \frac{1}{2}\int_\delta^{2\delta} \gamma(t)\alpha(t)\big(\beta - \lambda\gamma(t)\big)\,dt = 2\delta K.
\end{aligned}$$
Then, let us utilize the above equality to clarify λ and define the following terms:

$$D = \frac{\beta}{2}\left(\int_0^{2\delta} \gamma(t)\alpha(t)\,dt + \int_0^{\delta} \gamma(t)\alpha(t)\Big(\int_\delta^{2\delta-t} \mu(3\delta - u)\,du\Big)\,dt\right) - 2\delta K,$$

and

$$C = \int_0^{2\delta} \gamma^2(t)\,\alpha(t)\,dt.$$

Then, we obtain:

$$\lambda = \frac{2D}{C}, \qquad (23)$$

on condition that C ≠ 0.
Finally, by the observations (22)-(23), we conclude that in order to use Theorem 2, we have to specify the value of K in the constraint (13) carefully, such that

$$\frac{D}{C} = \frac{B}{A}.$$
By this final result, we have determined explicitly the control process c*(·), the Lagrange multiplier λ₀, and, consequently, the solution p(·) of the Anticipated BSDE (16)-(17), together with all the required technical assumptions.
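For concreteness, once α, γ, µ, β, δ, and K are fixed, the quantities C and D and the multiplier λ = 2D/C of (23) can be evaluated by direct quadrature. The following sketch uses constant coefficients as illustrative assumptions rather than a calibration from this paper:

```python
import numpy as np
from scipy.integrate import quad

# Quadrature for C, D and lambda = 2D/C of (23), with constant illustrative
# coefficients (assumptions, not the paper's calibration).
alpha = lambda t: -1.0
gamma = lambda t: 1.0
mu    = lambda t: 0.3
beta, delta, K = 2.0, 0.5, -1.0

# inner(t) = int_delta^{2*delta - t} mu(3*delta - u) du
inner = lambda t: quad(lambda u: mu(3 * delta - u), delta, 2 * delta - t)[0]

C = quad(lambda t: gamma(t) ** 2 * alpha(t), 0.0, 2 * delta)[0]
D = (beta / 2.0) * (
        quad(lambda t: gamma(t) * alpha(t), 0.0, 2 * delta)[0]
        + quad(lambda t: gamma(t) * alpha(t) * inner(t), 0.0, delta)[0]
    ) - 2 * delta * K

lam = 2 * D / C              # requires C != 0
print(f"C = {C:.4f}, D = {D:.4f}, lambda = {lam:.4f}")
```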

4. Conclusion and Future Work


In this work, we studied a constrained stochastic control problem and investigated the impact of the delay term on Lagrange multipliers. We proved two theorems for two different types of constraints and gave an application in finance for the case of a real-valued Lagrange multiplier. We focused on the wealth process of a company, which evolves according to a jump-diffusion model with historical values in its dynamics. We observed that although Theorems 1 and 2 are applicable to a wide range of control tasks by both SMP and DPP, determining the Lagrange multipliers remains a challenge. It is not always easy to compute these parameters. Furthermore, the step of formulating these multipliers cannot be ignored, because the provided theorems are applicable only on the condition that there exists a Lagrange multiplier for which the constraint is satisfied. Despite this challenge, to the best of our knowledge, our article presents the first results for a delayed system with constraints in the running gain of the control task and computes the corresponding Lagrange multiplier exactly. In our financial application, we clearly present the technical differences between solving a delayed SDE and a usual one by applying Itô's formula recursively.
Furthermore, since stochastic control theory is a discipline of sequential decision-making, we may encounter challenges on the side of model selection. The decision maker may believe that her model is perfect, but in reality, this is generally not the case. Especially in finance, model misidentification can cause high financial losses. At this point, robust control designs control or decision rules that perform well across alternative models [11, 34]. Especially in stochastic games, we handle model uncertainty in a relative entropy context as a penalty term [2, 6, 9]. It is known that Hansen and Sargent [10] used a Lagrange multiplier theorem to convert the entropy constraint into a penalty on perturbations from the model. Therefore, we would like to underline the potential of our work towards robust stochastic control and stochastic games.
Risk minimization and worst-case scenarios have significant value in quantitative finance and insurance, since each action under uncertainty carries a potential for loss that cannot be underestimated. Therefore, as a further study, we aim to focus on the relation between Lagrange multipliers and robust control. These structures can be approached from the side of relative entropy as well as from the sides of the VaR and CVaR concepts, see [9, 15, 16]. Furthermore, within the wide scope of risk management, Lagrange multipliers can be handled via computational methods such as deep learning and deep reinforcement learning, see [24, 31].
On the other hand, we strongly believe that, although delay systems are demanding and challenging, they will be highlighted within the context of other active fields such as Deep Learning. While the aim of the present research article is to provide theoretical and technical approaches, in [29] we present a collection of novel aspects at the intersection of computer science and stochastic optimal control under the memory component.

Declaration of Competing Interests The author declares no conflict of interest.

References
[1] Bichteler, K., Stochastic Integration with Jumps, Cambridge University Press, 2002. https://doi.org/10.1017/CBO9780511549878
[2] Cartea, A., Donnelly, R., Jaimungal, S., Algorithmic trading with model uncertainty, SIAM Journal on Financial
Mathematics, 8(1) (2017), 635-671. https://doi.org/10.1137/16M106282X
[3] Cont, R., Tankov, P., Financial Modelling with Jump Processes, Chapman and Hall/CRC, 2003. https://doi.org/
10.1201/9780203485217
[4] Rognlien Dahl, K., Stokkereit, E., Stochastic maximum principle with Lagrange multipliers and optimal consumption with Lévy wage, Afrika Matematika, 27(3) (2016), 555-572. https://doi.org/10.1007/s13370-015-0360-5
[5] Dordevic, J., Rognlien Dahl, K., Stochastic optimal control of pre-exposure prophylaxis for HIV infection, Mathematical Medicine and Biology: A Journal of the IMA, 39(3) (2022), 197-225.
[6] Elliott, R. J., Siu, T. K., Robust optimal portfolio choice under Markovian regime-switching model, Methodology and
Computing in Applied Probability, 11 (2009), 145-157. https://doi.org/10.1007/s11009-008-9085-3
[7] Federico, S., A stochastic control problem with delay arising in a pension fund model, Finance and Stochastics, 15(3)
(2011), 421-459. https://doi.org/10.1007/s00780-010-0146-4
[8] Gozzi, F., Masiero, F., Stochastic optimal control with delay in the control I: Solving the HJB equation through
partial smoothing, SIAM Journal on Control and Optimization, 55(5) (2017), 2981-3012. https://doi.org/10.1137/
16M1070128
[9] Gueant, O., Lehalle, C. A., Fernandez-Tapia, J., Dealing with the inventory risk: a solution to the market making
problem, Mathematics and Financial Economics, 7(4) (2013), 477-507. https://doi.org/10.1007/s11579-012-0087-0
[10] Hansen L. P., Sargent T. J., Robustness, Princeton University Press, 2008. https://doi.org/10.1515/9781400829385
[11] Korn, R., Menkens, O., Steffensen, M., Worst-case-optimal dynamic reinsurance for large claims, European Actuarial Journal, 2 (2012), 21-48. https://doi.org/10.1007/s13385-012-0050-8
[12] Korn, R., Melnyk, Y., Seifried, F. T., Stochastic impulse control with regime-switching dynamics, European Journal
of Operational Research, 260(3) (2017), 1024-1042. https://doi.org/10.1016/j.ejor.2016.12.029
[13] Lamberton, D., Lapeyre, B., Introduction to Stochastic Calculus Applied to Finance, Chapman and Hall/CRC, 2011.
https://doi.org/10.1201/9781420009941
[14] Larssen, B., Risebro, N. H., When are HJB-equations in stochastic control of delay systems finite dimensional?,
Stochastic Analysis and Applications, 21(3) (2003), 643-71. https://doi.org/10.1081/SAP-120020430
[15] Mataramvura, S., Øksendal, B., Risk minimizing portfolios and HJBI equations for stochastic differential games, Stochastics: An International Journal of Probability and Stochastic Processes, 80(4) (2008), 317-337. https://doi.org/10.1080/17442500701655408
[16] Miller, C. W., Yang, I., Optimal control of conditional value-at-risk in continuous time, SIAM Journal on Control and
Optimization, 55(2) (2017), 856-884. https://doi.org/10.1137/16M1058492
[17] Mohammed, S.E.A., Stochastic Functional Differential Equations, Pitman, London, 1984.
[18] Mohammed, S. E. A., Stochastic differential systems with memory: theory, examples and applications, In Stochastic Analysis and Related Topics VI: Proceedings of the Sixth Oslo-Silivri Workshop, Geilo 1996 (pp. 1-77), Boston, MA: Birkhäuser, 1998. https://opensiuc.lib.siu.edu/cgi/viewcontent.cgi?article=1064&context=math_articles
[19] Øksendal, B., Stochastic Differential Equations: an Introduction with Applications, Springer Science & Business Media,
2013. https://doi.org/10.1007/978-3-642-14394-6

[20] Øksendal, B., Sulem, A., Zhang, T., Optimal control of stochastic delay equations and time-advanced backward stochastic differential equations, Advances in Applied Probability, 43(2) (2011), 572-596. https://doi.org/10.1239/aap/1308662493
[21] Øksendal, B., Sulem, A., Stochastic Control of Jump Diffusions. In: Applied Stochastic Control of Jump Diffusions,
Springer, 2019. https://doi.org/10.1007/978-3-030-02781-0_5
[22] Pham, H., Continuous-Time Stochastic Control and Optimization with Financial Applications, Vol. 61, Springer Science
& Business Media, 2009. https://doi.org/10.1007/978-3-540-89500-8
[23] Peng, S., Yang, Z., Anticipated backward stochastic differential equations, The Annals of Probability, 37(3) (2009), 877-902. https://doi.org/10.1214/08-AOP423
[24] Peters, J., Mulling, K., Altun, Y., Relative entropy policy search, In Proceedings of the AAAI Conference on Artificial
Intelligence, 24 (2010), 1607-1612. https://doi.org/10.1609/aaai.v24i1.7727
[25] Savku, E., Weber, G. W., A stochastic maximum principle for a Markov regime-switching jump-diffusion model with
delay and an application to finance, Journal of Optimization Theory and Applications, 179(2) (2018), 696-721. https:
//doi.org/10.1007/s10957-017-1159-3
[26] Savku, E., Fundamentals of Market Making Via Stochastic Optimal Control, Operations Research: New Paradigms
and Emerging Applications, (pp.136-154), Purutçuoğlu, V., Weber, G.W., & Farnoudkia, H. (Eds.), (1st ed.), CRC
Press, 2022. https://doi.org/10.1201/9781003324508-10
[27] Savku, E., A stochastic control approach for constrained stochastic differential games with jumps and regimes, Mathematics, 11(14) (2023), 3043. https://doi.org/10.3390/math11143043
[28] Savku, E., Deep-Control of Memory via Stochastic Optimal Control and Deep Learning, In International Conference
on Mathematics and its Applications in Science and Engineering (pp. 219-240), Cham: Springer Nature Switzerland,
2023. https://doi.org/10.1007/978-3-031-49218-1_16
[29] Savku, E., Memory and anticipation: two main theorems for Markov regime-switching stochastic processes, Stochastics,
(2024), 1-18. https://doi.org/10.1080/17442508.2024.2427733
[30] Savku, E., An approach for regime-switching stochastic control problems with memory and terminal conditions, Optimization, (2024), 1-18.
[31] Tamar, A., Glassner, Y., Mannor, S., Optimizing the CVaR via sampling, In Proceedings of the AAAI Conference on Artificial Intelligence, 29 (2015). https://doi.org/10.1609/aaai.v29i1.9561
[32] Touzi, N., Optimal Stochastic Control, Stochastic Target Problems, and Backward SDE, Vol. 29, Springer Science & Business Media, 2012. https://doi.org/10.1007/978-1-4614-4286-8
[33] Uğurlu, K., Tomasz, B., Distorted probability operator for dynamic portfolio optimization in times of socio-
economic crisis, Central European Journal of Operations Research, 31(4) (2023), 1043-1060. https://doi.org/10.
1007/s10100-022-00834-0
[34] Uğurlu, K., Robust utility maximization of terminal wealth with drift and volatility uncertainty, Optimization, 70(10)
(2021), 2081-2102. https://doi.org/10.1080/02331934.2020.1774586
[35] Fleming, W. H., Soner, H. M., Controlled Markov Processes and Viscosity Solutions, Vol. 25, Springer Science & Business Media, 2006. https://doi.org/10.1007/0-387-31071-1

Appendix

In order to apply the SMP, we have to define the corresponding Hamiltonian for a delayed system as follows:

H : [0, T] × R × R × R × U × R × R × ℛ → R,

$$H(t, x, y, a, u, p, q, r) = f(t, x, y, a, u) + b(t, x, y, a, u)\,p + \sigma(t, x, y, a, u)\,q + \int_{R_0} \eta(t, x, y, a, u, z)\,r(t, z)\,\nu(dz), \qquad (24)$$

where ℛ denotes the set of all functions r : [0, T] × R₀ → R for which the integral in (24) converges.

Associated with H, the adjoint, unknown and adapted processes (p(t) ∈ R : t ∈ [0, T]), (q(t) ∈ R : t ∈ [0, T]), and (r(t, z) ∈ R : t ∈ [0, T], z ∈ R₀) are described by the following Anticipated BSDE with jumps:

$$dp(t) = E[\mu(t) \mid F_t]\,dt + q(t)\,dW(t) + \int_{R_0} r(t, z)\,\tilde{N}(dt, dz), \qquad p(T) = g_x(X(T)),$$

where

$$\begin{aligned}
\mu(t) :={}& -\frac{\partial H}{\partial x}\big(t, X(t), Y(t), A(t), u(t), p(t), q(t), r(t, \cdot)\big)\\
&- \frac{\partial H}{\partial y}\big(t+\delta, X(t+\delta), Y(t+\delta), A(t+\delta), u(t+\delta), p(t+\delta), q(t+\delta), r(t+\delta, \cdot)\big)\,1_{[0, T-\delta]}(t)\\
&- e^{\rho t} \int_t^{t+\delta} \frac{\partial H}{\partial a}\big(s, X(s), Y(s), A(s), u(s), p(s), q(s), r(s, \cdot)\big)\,e^{-\rho s}\,1_{[0, T]}(s)\,ds. \qquad (25)
\end{aligned}$$

As seen in µ(t), Equation (25) involves the future values of X(s), u(s), p(s), q(s), and r(s, ·) for s ≤ t + δ; hence, we call this type of BSDE Anticipated.
