2. Mathematical Framework
We consider a stationary
d-dimensional Gaussian time series
(for
), with:
such that
and
for all
and
. Furthermore, we require the cross-correlation function to fulfil
for
and
, where the component-wise cross-correlation functions
are given by
for each
and
. For each random vector
, we denote the covariance matrix by
, since it is independent of
j due to stationarity. Therefore, we have
.
We specify the dependence structure of
and turn to long-range dependence: we assume that for the cross-correlation function
for each
, it holds that:
with
for finite constants
with
, where the matrix
has full rank, is symmetric and positive definite. Furthermore, the parameters
are called long-range dependence parameters. Therefore,
is multivariate long-range dependent in the sense of [
13], Definition 2.1.
The processes we want to consider have a particular structure, namely for
, we obtain for fixed
:
The following relation holds between the
extended process and the originally considered process
. For all
,
we have:
where
. Note that the process
is still a centered Gaussian process since all finite-dimensional marginals of
follow a normal distribution. Stationarity is also preserved since for all
,
and
, the cross-correlation function
of the process
is given by
and the last line does not depend on
j. The covariance matrix
of
has the following structure:
Hence, we arrive at:
where
,
. Note that
and
,
since we are studying cross-correlation functions.
Therefore, we finally have to show that based on the assumptions on , the extended process is still long-range dependent.
Hence, we have to consider the cross-correlations again:
since
and
, with
,
and
.
Let us remark that .
Therefore, we are still dealing with a multivariate long-range dependent Gaussian process. We see in the proofs of the following limit theorems that the crucial parameters that determine the asymptotic distribution are the long-range dependence parameters , of the original process and therefore, we omit the detailed description of the parameters herein.
It is important to remark that the extended process
is also long-range dependent in the sense of [
14], p. 2259, since:
with:
and
can be chosen as any constant
that is not equal to zero, so for simplicity, we assume without loss of generality
, and therefore,
, since the condition in [
14] only requires convergence to a finite constant
. Hence, we may apply the results of [14] in what follows.
We define the following set, which is needed in the proofs of the theorems of this section.
and denote the corresponding long-range dependence parameter to each
by
We briefly recall the concept of Hermite polynomials as they play a crucial role in determining the limit distribution of functionals of multivariate Gaussian processes.
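Definition 1. (Hermite polynomial, [15], Definition 3.1) The j-th Hermite polynomial $H_j(x)$, $j = 0, 1, \ldots$, is defined as
$$H_j(x) = (-1)^j \exp\left(\frac{x^2}{2}\right) \frac{d^j}{dx^j} \exp\left(-\frac{x^2}{2}\right).$$
Their multivariate extension is given by the subsequent definition.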
Definition 2. (Multivariate Hermite polynomial, [15], p. 122) Let . We define as d-dimensional Hermite polynomial:with . Let us remark that the case is excluded here due to the assumption .
Analogously to the univariate case, the family of multivariate Hermite polynomials
forms an orthogonal basis of
, which is defined as
The parameter denotes the density of the d-dimensional standard normal distribution, which is already divided into the product of the univariate densities in the formula above.
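The orthogonality of the Hermite polynomials under the standard normal density can be checked numerically. The following is a minimal sketch, assuming the probabilists' normalization (the one implemented in `numpy.polynomial.hermite_e`) and assuming that the multivariate polynomial is the product of univariate ones; the helper names `hermite`, `gaussian_inner_product` and `multivariate_hermite` are illustrative and not taken from the article.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as He


def hermite(j, x):
    """Evaluate the j-th probabilists' Hermite polynomial He_j at x."""
    coeffs = np.zeros(j + 1)
    coeffs[j] = 1.0  # select the single basis polynomial He_j
    return He.hermeval(x, coeffs)


def gaussian_inner_product(j, k, n_nodes=40):
    """Approximate E[He_j(Z) He_k(Z)] for Z ~ N(0, 1) by Gauss-Hermite quadrature."""
    nodes, weights = He.hermegauss(n_nodes)  # weight function exp(-x^2 / 2)
    weights = weights / math.sqrt(2.0 * math.pi)  # rescale to the N(0, 1) density
    return float(np.sum(weights * hermite(j, nodes) * hermite(k, nodes)))


def multivariate_hermite(ks, xs):
    """Product form He_{k_1}(x_1) * ... * He_{k_d}(x_d) (assumed definition)."""
    return float(np.prod([hermite(k, x) for k, x in zip(ks, xs)]))
```

Under this normalization, `gaussian_inner_product(j, k)` vanishes for `j != k` and equals `j!` for `j == k`, which is the orthogonality used for the Hermite expansion.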
We denote the Hermite coefficients by
The Hermite rank
of
f with respect to the distribution
is defined as the largest integer
m, such that:
Having these preparatory results in mind, we derive the multivariate Hermite expansion given by
We focus on the limit theorems for functionals with Hermite rank 2. First, we introduce the matrix-valued Rosenblatt process. This plays a crucial role in the asymptotics of functionals with Hermite rank 2 applied to multivariate long-range dependent Gaussian processes. We begin with the definition of a multivariate Hermitian–Gaussian random measure
with independent entries given by
where
is a univariate Hermitian–Gaussian random measure as defined in [
16], Definition B.1.3. The multivariate Hermitian–Gaussian random measure
satisfies:
and:
where
denotes the Hermitian transpose of
. Thus, following [
14], Theorem 6, we can state the spectral representation of the matrix-valued Rosenblatt process
,
as
where each entry of the matrix is given by
The double prime in
excludes the diagonals
,
in the integration. For details on multiple Wiener–Itô integrals, see [17].
The following results are taken from [18], Section 3.2. The corresponding proofs are deferred to Appendix A.
Theorem 1. Let be a stationary Gaussian process as defined in (1) that fulfils (2) for , . For we fix:
with and as described in (6). Let be a function with Hermite rank 2 such that the set of discontinuity points is a null set with respect to the -dimensional Lebesgue measure. Furthermore, we assume f fulfils . Then:
where:
The matrix is a normalizing constant, see [18], Corollary 3.6. Moreover, is a multivariate Hermitian–Gaussian random measure with and L as defined in (2). Furthermore, is a normalizing constant and:
where for each and and:
where C denotes the matrix of second order Hermite coefficients, given by
It is possible to relax the assumptions in Theorem 1 to allow for mixed cases of short- and long-range dependence.
Corollary 1. Instead of demanding in the assumptions of Theorem 1 that (2) holds for with the addition that for all we have , we may use the following condition. We assume that:
with as given in (2), but we no longer assume for all but relax the assumption to and for , we allow for . Then, the statement of Theorem 1 remains valid.
However, under a mild technical assumption on the covariances of the one-dimensional marginal Gaussian processes that is often fulfilled in applications, there is another way of normalizing the partial sum on the right-hand side in Theorem 1, this time explicitly for the case and , such that the limit can be expressed in terms of two standard Rosenblatt random variables. This makes it possible to study the dependence structure between these two random variables further. In the following theorem, we assume for the reader’s convenience.
Theorem 2. Under the same assumptions as in Theorem 1 with and and the additional condition that , for , and , it holds that:
with being the same normalizing factor as in Theorem 1, and . Note that and are both standard Rosenblatt random variables whose covariance is given by
3. Ordinal Pattern Dependence
Ordinal pattern dependence is a multivariate dependence measure that compares the co-movement of two time series based on the ordinal information. First introduced in [
10] to analyze financial time series, a mathematical framework including structural breaks and limit theorems for functionals of absolutely regular processes has been built in [
11]. In [
19], the authors use the so-called symbolic correlation integral to detect dependence between the components of a multivariate time series. Their considerations, which focus on testing independence between two time series, are also based on ordinal patterns. They provide limit theorems in the i.i.d. case and otherwise use bootstrap methods. In contrast, the present article focuses on asymptotic distributions of an estimator of ordinal pattern dependence for an underlying bivariate Gaussian time series, allowing for several dependence structures. As it turns out, this yields central as well as non-central limit theorems.
We start with the definition of an ordinal pattern and the basic mathematical framework that we need to build up the ordinal model.
Let
denote the set of permutations in
,
that we express as
-dimensional tuples, assuring that each tuple contains each of the numbers above exactly once. In mathematical terms, this yields:
see [11], Section 2.1.
The number of permutations of $h+1$ elements is given by $(h+1)!$. To build intuition for the concept of ordinal patterns, we take a closer look at the following example before turning to the formal definition.
Example 1. Figure 1 illustrates the extraction of an ordinal pattern from a data set. The data points of interest are colored in red and we consider a pattern of length , which means that we have to take data points into consideration. We fix the points in time , , and and extract the data points from the time series. Then, we search for the point in time which exhibits the largest value in the resulting data and write down the corresponding time index. In this example, it was given by . We order the data points by writing the time position of the largest value as the first entry, the time position of the second largest as the second entry, etc. Hence, the values are ordered from largest to smallest and the ordinal pattern is obtained for the considered data points.
Formally, the aforementioned procedure can be defined as follows, see [11], Section 2.1.
Definition 3. As the ordinal pattern of a vector , we define the unique permutation :such that:with if , . The last condition assures the uniqueness of if there are ties in the data sets. In particular, this condition is necessary if real-world data are to be considered.
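The extraction procedure of Example 1 and Definition 3 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the tie-breaking rule used here (among equal values, the later time index comes first) is an assumption, since the exact condition is not legible in the text above, and the function name `ordinal_pattern` is hypothetical.

```python
def ordinal_pattern(x):
    """Return the time indices of x ordered from the largest to the smallest value.

    Assumption: among tied values, the larger time index is listed first.
    """
    return tuple(sorted(range(len(x)), key=lambda i: (-x[i], -i)))


# The largest value sits at time index 1, the second largest at 0, the smallest at 2.
pattern = ordinal_pattern([0.3, 1.2, -0.5])
```

Because ties are resolved deterministically, every vector is mapped to exactly one permutation, which is the uniqueness that the last condition of the definition is meant to guarantee.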
In
Figure 2, all ordinal patterns of length
are shown. As already mentioned in the introduction, from a practical point of view, a highly desirable property of ordinal patterns is that they are not affected by strictly increasing transformations, see [5], p. 1783.
Mathematically, this means that if
is strictly monotone increasing, then:
In particular, this includes linear transformations , with and .
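The invariance property is easy to verify on a small data vector. The sketch below reuses an illustrative `ordinal_pattern` helper (hypothetical name, tie rule assumed as "later index first") and compares the pattern of the raw data with the patterns after an increasing affine map, after the exponential function, and after a sign flip, which is decreasing and therefore changes the pattern.

```python
import math


def ordinal_pattern(x):
    """Time indices of x ordered by decreasing value (ties: later index first, assumed)."""
    return tuple(sorted(range(len(x)), key=lambda i: (-x[i], -i)))


x = [0.3, 1.2, -0.5, 0.9]

affine = [2.0 * v + 3.0 for v in x]   # strictly increasing affine map
exponential = [math.exp(v) for v in x]  # strictly increasing nonlinear map
negated = [-v for v in x]             # strictly decreasing map

same_under_affine = ordinal_pattern(affine) == ordinal_pattern(x)
same_under_exp = ordinal_pattern(exponential) == ordinal_pattern(x)
same_under_negation = ordinal_pattern(negated) == ordinal_pattern(x)
```

Only the first two comparisons hold, which is why the transformation has to be increasing and not merely monotone.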
Following [
11], Section 1, the minimal requirement of the data sets we use for ordinal analysis in the time series context, i.e., for ordinal pattern probabilities as well as for ordinal pattern dependence later on, is
ordinal pattern stationarity (of order h). This property implies that the probability of observing a certain ordinal pattern of length
h remains the same when shifting the moving window of length
h through the entire time series and does not depend on the specific points in time. Throughout this work, the time series in which the ordinal patterns occur either have stationary increments or are themselves stationary. Note that both properties imply ordinal pattern stationarity. The reason why stationary increments suffice is explained in the following.
One fundamental property of ordinal patterns is that they are uniquely determined by the increments of the considered time series. As Example 1 suggests, knowing the increments between the data points is sufficient to obtain the corresponding ordinal pattern. In mathematical terms, we can define another mapping
, which assigns the corresponding ordinal pattern to each vector of increments, see [5], p. 1783.
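That the increments carry all the ordinal information can be illustrated as follows. The sketch rebuilds a path that starts at zero from the increments and takes its ordinal pattern; since this path differs from the original data only by a constant shift, both patterns agree. The names `ordinal_pattern` and `pattern_from_increments` are illustrative, and the tie rule is an assumption.

```python
from itertools import accumulate


def ordinal_pattern(x):
    """Time indices of x ordered by decreasing value (ties: later index first, assumed)."""
    return tuple(sorted(range(len(x)), key=lambda i: (-x[i], -i)))


def pattern_from_increments(increments):
    """Ordinal pattern of the path 0, y_1, y_1 + y_2, ... built from the increments."""
    path = [0.0] + list(accumulate(increments))
    return ordinal_pattern(path)


x = [2.0, 3.5, 1.0, 4.0]
increments = [b - a for a, b in zip(x, x[1:])]  # h increments from h + 1 data points
```

Here `pattern_from_increments(increments)` returns the same tuple as `ordinal_pattern(x)`.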
Definition 4. We define for the mapping :
such that for , , we obtain:
We define the two mappings, following [5], p. 1784:
An illustrative understanding of these mappings is given as follows. The mapping
, which is the spatial reversion of the pattern
, is the reflection of
on a horizontal line, while
, the time reversal of
, is its reflection on a vertical line, as one can observe in
Figure 3.
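The geometric description of the two mappings can be checked on a tie-free example. In the sketch below, the explicit formulas are assumptions derived from that description, because the defining display is not legible above: the spatial reversal is taken to reverse the tuple, and the time reversal is taken to map each entry `p` to `h - p`. Reflecting the data on a horizontal line corresponds to negating the values, and reflecting on a vertical line corresponds to reversing the time order.

```python
def ordinal_pattern(x):
    """Time indices of x ordered by decreasing value (ties: later index first, assumed)."""
    return tuple(sorted(range(len(x)), key=lambda i: (-x[i], -i)))


def spatial_reversal(pi):
    """Assumed form: reverse the order of the entries of the pattern."""
    return tuple(reversed(pi))


def time_reversal(pi):
    """Assumed form: replace each time index p by h - p, where h = len(pi) - 1."""
    h = len(pi) - 1
    return tuple(h - p for p in pi)


x = [0.3, 1.2, -0.5]
pi = ordinal_pattern(x)

negated = [-v for v in x]            # reflection on a horizontal line
reversed_in_time = list(reversed(x))  # reflection on a vertical line
```

For data without ties, `ordinal_pattern(negated)` equals `spatial_reversal(pi)` and `ordinal_pattern(reversed_in_time)` equals `time_reversal(pi)`.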
Based on the spatial reversion, we define a way to divide into two disjoint sets.
Definition 5. We define as a subset of with the property that for each , either π or are contained in the set, but not both of them.
Note that this definition does not yield the uniqueness of .
Example 2. We consider the case again and we want to divide into a possible choice of and the corresponding spatial reversal. We choose , and therefore, . Note that is also a possible choice. The only condition that has to be satisfied is that if one permutation is chosen for , then its spatial reverse must not be an element of this set.
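One admissible split can be constructed mechanically by walking through all permutations and keeping a permutation only if neither it nor its spatial reverse has been seen before. This is a sketch under the assumption that the spatial reversal reverses the tuple; the function names are hypothetical, and the particular representatives it selects are just one of many valid choices.

```python
from itertools import permutations


def spatial_reversal(pi):
    """Assumed form of the spatial reversal: reverse the entries of the pattern."""
    return tuple(reversed(pi))


def split_by_spatial_reversal(h):
    """Pick one representative from each pair {pi, S(pi)} of permutations of 0..h."""
    chosen = []
    seen = set()
    for pi in permutations(range(h + 1)):
        if pi not in seen:
            chosen.append(pi)
            seen.update({pi, spatial_reversal(pi)})
    return chosen


representatives = split_by_spatial_reversal(2)
```

For patterns built from three data points this yields three representatives out of six permutations; choosing a different iteration order would give a different but equally valid set.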
We stick to the formal definition of ordinal pattern dependence, as it is proposed in [
11], Section 2.1. The considered moving window consists of
data points, and hence,
h increments. We define:
and:
Then, we define ordinal pattern dependence
as
The parameter q represents the hypothetical case of independence between the two time series. In this case, p and q would obtain equal values and therefore, would equal zero. Regarding the other extreme, the case in which both processes coincide or one is a strictly monotone increasing transform of the other one, we obtain the value 1. However, in the following, we assume and .
Note that the definition of ordinal pattern dependence in (
17) only measures positive dependence. This is no restriction in practice, because negative dependence can be investigated in an analogous way, by considering
. If one is interested in both types of dependence simultaneously, in [
11], the authors propose to use
. To keep the notation simple, we focus on
as it is defined in (
17).
We compare whether the ordinal patterns in
coincide with the ones in
. Recall that it is an essential property of ordinal patterns that they are uniquely determined by the increment process. Therefore, we have to consider the increment processes
as defined in (
1) for
, where
,
. Hence, we can also express
p and
q (and consequently
) as a probability that only depends on the increments of the considered vectors of the time series. Recall the definition of
for
, given by
such that
with
as given in (
6).
In the course of this article, we focus on the estimation of
p. For a detailed investigation of the limit theorems for estimators of
, we refer to [
18]. We define the estimator of
p, the probability of coincident patterns in both time series in a moving window of fixed length, by
where:
Figure 4 illustrates the way ordinal pattern dependence is estimated by
. The patterns of interest that are compared in each moving window are colored in red.
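The moving-window estimation can be sketched as follows. This is a minimal illustration under several assumptions: the estimator is taken to be the fraction of windows of h + 1 consecutive points in which both series show the same pattern, q is estimated by plugging in the marginal pattern frequencies, and the dependence measure is computed as (p - q) / (1 - q), a form consistent with the properties stated above (zero when p equals q, one when p equals 1). All function names are hypothetical.

```python
from collections import Counter


def ordinal_pattern(x):
    """Time indices of x ordered by decreasing value (ties: later index first, assumed)."""
    return tuple(sorted(range(len(x)), key=lambda i: (-x[i], -i)))


def window_patterns(series, h):
    """Ordinal patterns of all moving windows containing h + 1 consecutive points."""
    return [ordinal_pattern(series[j:j + h + 1]) for j in range(len(series) - h)]


def estimate_p(x, y, h):
    """Fraction of windows in which both series show the same ordinal pattern."""
    px, py = window_patterns(x, h), window_patterns(y, h)
    return sum(a == b for a, b in zip(px, py)) / len(px)


def estimate_opd(x, y, h):
    """Plug-in estimate of (p - q) / (1 - q); undefined if the estimate of q equals 1."""
    px, py = window_patterns(x, h), window_patterns(y, h)
    n = len(px)
    p_hat = sum(a == b for a, b in zip(px, py)) / n
    fx, fy = Counter(px), Counter(py)
    q_hat = sum(fx[pi] * fy[pi] for pi in fx) / n ** 2
    return (p_hat - q_hat) / (1.0 - q_hat)


x = [1.0, 3.0, 2.0, 5.0, 4.0]
y = [10.0 * v for v in x]  # strictly increasing transform of x
```

With these inputs every window coincides, so both `estimate_p(x, y, 1)` and `estimate_opd(x, y, 1)` equal one, matching the extreme case described for a strictly increasing transform.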
Having emphasized the crucial importance of the increments, we define the following conditions on the increment process : let be a bivariate, stationary Gaussian process with , :
- (L)
We assume that
fulfills (
2) with
in
. We allow for
to be in the range
.
- (S)
We assume
such that the cross-correlation function of
fulfills for
:
with
and
holds.
Furthermore, in both cases, it holds that for and to exclude ties.
We begin with the investigation of the asymptotics of . First, we calculate the Hermite rank of , since the Hermite rank determines for which ranges of the estimator is still long-range dependent. Depending on this range, different limit theorems may hold.
Lemma 1. The Hermite rank of with respect to is equal to 2.
Proof. Following [
20], Lemma 5.4 it is sufficient to show the following two properties:
- (i)
,
- (ii)
.
Note that the conclusion is not trivial, because
in general, as can be seen in [
15], Lemma 3.7. Lemma 5.4 in [
20] can be applied due to the following reasoning. Ordinal patterns are not affected by scaling, therefore, the technical condition that
is positive semidefinite is fulfilled in our case. We can scale the standard deviation of the random vector
by any positive real number
since for all
we have:
To show property
, we need to consider a multivariate random vector:
with covariance matrix
. We fix
. We divide the set
into disjoint sets, namely into
, as defined in Definition 5 and the complementary set
. Note that:
holds. This implies:
for
. Hence, we arrive at:
for
.
Consequently, .
In order to prove
, we consider:
to be a random vector with independent
distributed entries. For
and
such that
, we obtain:
since
for all
. This was shown in the proof of Lemma 3.4 in [
20].
All in all, we derive and hence, have proven the lemma. □
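The symmetry that makes the first order Hermite coefficients vanish can be illustrated numerically. Flipping the sign of all increments spatially reverses both patterns at once, so the coincidence indicator is an even function, while the first order terms are odd. The sketch below is only an illustration of this argument as read from the proof, not a substitute for it; it assumes independent standard normal increments for convenience and uses hypothetical helper names.

```python
import numpy as np


def ordinal_pattern(x):
    """Time indices of x ordered by decreasing value (ties: later index first, assumed)."""
    return tuple(sorted(range(len(x)), key=lambda i: (-x[i], -i)))


def pattern_from_increments(increments):
    """Ordinal pattern of the path that starts at 0 and has the given increments."""
    path = np.concatenate(([0.0], np.cumsum(increments)))
    return ordinal_pattern(path)


def coincide(u, v):
    """Indicator that the two increment vectors produce the same ordinal pattern."""
    return float(pattern_from_increments(u) == pattern_from_increments(v))


rng = np.random.default_rng(0)
h = 2
samples = rng.standard_normal((2000, 2 * h))  # rows: (u_1..u_h, v_1..v_h)

# The indicator is even: f(u, v) = f(-u, -v) for every sampled point.
symmetric = all(coincide(z[:h], z[h:]) == coincide(-z[:h], -z[h:]) for z in samples)

# Antithetic estimate of the first order terms E[f(Z) Z_j]; each pair cancels.
first_order = np.mean(
    [0.5 * (coincide(z[:h], z[h:]) * z + coincide(-z[:h], -z[h:]) * (-z)) for z in samples],
    axis=0,
)
```

The cancellation in `first_order` only reflects the evenness of the indicator; showing that some second order coefficient is nonzero still requires the argument given in the proof.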
The case
exhibits the property that the standard range of the long-range dependence parameter
has to be divided into two different sets. If
, the transformed process
is still long-range dependent, as can be seen in [
16], Table 5.1. If
, the transformed process is short-range dependent, which means by definition that the autocorrelations of the transformed process are summable, as can be seen in [
13], Remark 2.3. Therefore, we have two different asymptotic distributions that have to be considered for the estimator
of coincident patterns.
3.1. Limit Theorem for the Estimator of p in Case of Long-Range Dependence
First, we restrict ourselves to the case that at least one of the two parameters and is in . This ensures . We explicitly include mixed cases where the process corresponding to is allowed to be long-range as well as short-range dependent.
Note that this setting includes the pure long-range dependence case, which means that for , we have , or even . However, in general, the assumptions are weaker: we only require for either or , and the other parameter is also allowed to be in or .
We can, therefore, apply the results of Corollary 1 and obtain the following asymptotic distribution for :
Theorem 3. Under the assumption in (L), we obtain:
with as given in Theorem 1 for and being a normalizing constant. We have:
for each and and , where the variable:
denotes the matrix of second order Hermite coefficients.
Proof. The proof of this theorem is an immediate application of Corollary 1 and Lemma 1. Note that for
it holds that it is square integrable with respect to
and that the set of discontinuity points is a null set with respect to the
-dimensional Lebesgue measure. This is shown in [
18], Equation (4.5). □
Following Theorem 2, we are also able to express the limit distribution above in terms of two standard Rosenblatt random variables by modifying the weighting factors in the limit distribution. Note that this requires slightly stronger assumptions than in Theorem 1.
Theorem 4. Let (L) hold with . Additionally, we assume that , for , and . Then, we obtain:
with and as given in Theorem 3. Note that and are both standard Rosenblatt random variables, whose covariance is given by
Remark 1. Following [18], Corollary 3.14, if additionally and is fulfilled for all , then the two limit random variables following a standard Rosenblatt distribution in Theorem 4 are independent. Note that due to the considerations in [21], Equation (10), we know that the distribution of the sum of two independent standard Rosenblatt random variables is not standard Rosenblatt. However, this yields a computational benefit, as it is possible to efficiently simulate the standard Rosenblatt distribution; for details, see [21].
We turn to an example that deals with the asymptotic variance of the estimator of p in Theorem 3 in the case .
Example 3. We focus on the case and consider the underlying process . It is possible to determine the asymptotic variance depending on the correlation between these two increment variables.
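As a sanity check on the role of the correlation, consider the smallest pattern length, assumed here to be h = 1 (the value is not legible in the text above). Two patterns built from one increment each coincide exactly when both increments have the same sign. For standardized jointly Gaussian increments with correlation rho, the classical orthant formula gives the coincidence probability 1/2 + arcsin(rho)/pi. The sketch below compares this closed form with a Monte Carlo estimate; it concerns p itself, not the Hermite coefficients or the limit variance, and all names are illustrative.

```python
import math

import numpy as np


def coincidence_probability_h1(rho):
    """P(both increments share a sign) for standard bivariate normal with correlation rho."""
    return 0.5 + math.asin(rho) / math.pi


def monte_carlo_p_h1(rho, n, rng):
    """Simulate correlated standard normal increments and count sign agreements."""
    u = rng.standard_normal(n)
    v = rho * u + math.sqrt(1.0 - rho ** 2) * rng.standard_normal(n)
    return float(np.mean((u >= 0) == (v >= 0)))


rng = np.random.default_rng(1)
rho = 0.6  # illustrative value, not taken from the article
closed_form = coincidence_probability_h1(rho)
simulated = monte_carlo_p_h1(rho, 200_000, rng)
```

The two values should agree up to Monte Carlo error, and the closed form reduces to 1/2 for uncorrelated increments.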
We start with the calculation of the second order Hermite coefficients in the case . This corresponds to the event , which yields:
and:
Due to , we have and therefore, . We identify the second order Hermite coefficients as the ones already calculated in [20], Example 3.13, although we are considering two consecutive increments of a univariate Gaussian process there. However, since the corresponding values are only determined by the correlation between the Gaussian variables, we can simply replace the autocorrelation at lag 1 by the cross-correlation at lag 0. Hence, we obtain:
Recall that the inverse of the correlation matrix of is given by
By using the formula for obtained in [18], Equation (4.23), we derive:
Plugging the second order Hermite coefficients and the entries of the inverse of the covariance matrix depending on into the formulas, we arrive at:
and:
Therefore, in the case , we obtain the following factors in the limit variance in Theorem 3:
Remark 2. It is not possible to analytically determine the limit variance for , as this includes orthant probabilities of a four-dimensional Gaussian distribution. Following [22], no closed formulas are available for these probabilities. However, there are fast algorithms at hand that calculate the limit variance efficiently. It is possible to take advantage of the symmetry properties of the multivariate Gaussian distribution to keep the computational cost of these algorithms low. For details, see [18], Section 4.3.1.
3.2. Limit Theorem for the Estimator of p in Case of Short-Range Dependence
In this section, we focus on the case of
. If
, we are still dealing with a long-range dependent multivariate Gaussian process
. However, the transformed process
is no longer long-range dependent, since we are considering a function with Hermite rank 2, see also [
16], Table 5.1. Otherwise, if
, the process
itself is already short-range dependent, since the cross-correlations are summable. Therefore, we obtain the following central limit theorem by applying Theorem 4 in [
14].
Theorem 5. Under the assumptions in (S), we obtain:
with:
We close this section with a brief summary of the results obtained. We established limit theorems for the estimator of p, the probability of coincident patterns in both time series, and hence for the most important parameter in the context of ordinal pattern dependence. The long-range dependent case as well as the mixed case of short- and long-range dependence was considered. Finally, we provided a central limit theorem for a multivariate Gaussian time series that is short-range dependent if transformed by . In the subsequent section, we provide a simulation study that illustrates our theoretical findings. In doing so, we shed light on the Rosenblatt distribution and the distribution of the sum of Rosenblatt distributed random variables.
4. Simulation Study
We begin with the generation of a bivariate long-range dependent fractional Gaussian noise series .
First, we simulate two independent fractional Gaussian noise processes
and
generated with the R package “longmemo”, for a fixed parameter
in both time series. For the reader’s convenience, we denote the long-range dependence parameter
d by
as is common when dealing with fractional Gaussian noise and fractional Brownian motion. We refer to
H as the
Hurst parameter, tracing back to the work of [
23]. For
and
we generate
samples, for
, we choose
. We denote the correlation function of univariate fractional Gaussian noise by
,
. Then, we obtain
for
:
for
.
Note that this yields the following properties for the cross-correlations of the two processes for
:
We use and to obtain unit variance in the second process.
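The construction can be sketched in Python as follows. This is not the R code used for the study: instead of the "longmemo" package, the sketch draws fractional Gaussian noise by a Cholesky factorization of its well-known autocovariance function, which is simple but only practical for moderate sample sizes. The mixing step assumes that the second component is a linear combination of the two independent noises with weights `psi` and `phi` satisfying psi^2 + phi^2 = 1, which is one way to obtain unit variance; the function names and the values chosen for `psi`, the Hurst parameter and the sample size are illustrative.

```python
import numpy as np


def fgn_autocovariance(k, hurst):
    """Autocovariance of unit-variance fractional Gaussian noise at lag k."""
    k = np.abs(np.asarray(k, dtype=float))
    return 0.5 * ((k + 1) ** (2 * hurst) - 2 * k ** (2 * hurst) + np.abs(k - 1) ** (2 * hurst))


def simulate_fgn(n, hurst, rng):
    """Draw n observations of fractional Gaussian noise via a Cholesky factor."""
    lags = np.arange(n)
    cov = fgn_autocovariance(lags[:, None] - lags[None, :], hurst)
    return np.linalg.cholesky(cov) @ rng.standard_normal(n)


def simulate_bivariate_fgn(n, hurst, psi, rng):
    """Two correlated unit-variance noises built from two independent ones (assumed form)."""
    phi = np.sqrt(1.0 - psi ** 2)  # ensures psi^2 + phi^2 = 1
    u1 = simulate_fgn(n, hurst, rng)
    u2 = simulate_fgn(n, hurst, rng)
    return u1, psi * u1 + phi * u2


rng = np.random.default_rng(42)
y1, y2 = simulate_bivariate_fgn(n=200, hurst=0.8, psi=0.6, rng=rng)
```

Under this assumed construction the cross-correlation at lag k is `psi` times the autocorrelation of the underlying noise, so both components share the same long-range dependence parameter.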
Note that we chose the same Hurst parameter in both processes to get a better simulation result. The simulations of the processes
and
are visualized in
Figure 5. On the left-hand side, the different fractional Gaussian noises depending on the Hurst parameter
H are displayed. They represent the stationary long-range dependent Gaussian
increment processes we need in the view of the limit theorems we derived in
Section 3. The processes in which we are comparing the coincident ordinal patterns, namely
and
, are shown on the right-hand side in
Figure 5. The long-range dependent behavior of the increment processes is clearly visible in these processes: roughly speaking, they become smoother as the Hurst parameter increases.
We turn to the simulation results for the asymptotic distribution of the estimator
. The first limit theorem is given in Theorem 3 for
and
. In the case of
, a different limit theorem holds, see Theorem 5. Therefore, we turn to the simulation results of the asymptotic distribution of the estimator
of
p, as shown in
Figure 6 for pattern length
. The asymptotic normality in case
can be clearly observed. We turn to the interpretation of the simulation results of the distribution of
for
and
as the weighted sum of the sample (cross-)correlations: we observe in the Q–Q plot for
that the samples in the upper and lower tail deviate from the reference line. For
, a similar behavior in the Q–Q plot is observed.
We want to verify the result in Theorem 4 that it is possible, by a different weighting, to express the limit distribution of
as the distribution of the sum of two independent standard Rosenblatt random variables. The simulated convergence result is provided in
Figure 7. We observed the standard Rosenblatt distribution.