1. Introduction
The approximation of nonlinear systems can be performed using a finite sum of the Volterra series expansion that relates the system's inputs and outputs. This method has been studied since the 1960s [1,2,3] and used in different applications, for example, References [4,5,6], among others. In this context, bilinear forms have been used to approximate a large class of nonlinear systems, where the bilinear component is interpreted in terms of an input-output relation (meaning that it is defined with respect to the data). Consequently, the bilinear system may be regarded as one of the simplest recursive nonlinear systems.
More recently, a new approach was introduced in Reference [7], where the bilinear term is considered within the framework of a multiple-input/single-output (MISO) system, and it is defined with respect to the spatiotemporal model's impulse responses. A particular case of this type of system is the Hammerstein model [8]; in this scenario, a single-input signal passes through a nonlinear block and a linear system, which are cascaded. Recently, adaptive solutions were proposed for the study of such bilinear forms in the context of system identification [9,10,11]. The bilinear approach is suitable for a particular form of the decomposition, which involves only two terms (i.e., two systems). In some cases, it would be useful to exploit a higher-order decomposition, which could improve the overall performance in terms of both complexity and efficiency.
Motivated by the good performance of these approaches in the study of bilinear forms, we aim to further extend this approach to higher-order multilinear-in-parameters systems. Applications such as multichannel equalization [12], nonlinear acoustic devices for echo cancellation [13], multiple-input/multiple-output (MIMO) communication systems [14,15], and others can be addressed within the framework of multilinear systems. Because many of these applications can be formulated in terms of system identification problems [16], it is of interest to estimate a model based on the available observed data, which are usually the input and the output of the system. One challenge that usually occurs in such system identification tasks is the large length of the finite impulse response filter [17,18], which may involve hundreds or even thousands of coefficients. Another possible issue is the large parameter space [19,20]. Nowadays, such scenarios are related to very important topics, for example, big data [21], machine learning [22], and source separation [23]. Usually, they are addressed based on tensor decompositions and modelling [24,25,26,27]. These techniques mainly exploit the Kronecker product decomposition [28], which further allows the reformulation of a high-dimension problem in terms of low-dimension models, which can be tensorized together.
In this paper, we propose an iterative Wiener filter tailored for third-order systems (i.e., trilinear forms), together with adaptive solutions for such problems, namely the least-mean-square (LMS) and normalized LMS (NLMS) algorithms. Following this development, the solution can then be extended to higher-order multilinear systems. The proposed approach presents an advantage from the perspective of exploiting the decomposition of the global impulse response. One potential limitation of such a method is related to the particular form of the global impulse response to be identified, which is a result of separable systems. In perspective, it would be useful to extend this approach to identify more general forms of impulse responses. Recently, we developed such ideas in References [29,30], based on the Wiener filter and the recursive least-squares (RLS) algorithm, by exploiting the nearest Kronecker product decomposition and the related low-rank approximation.
The rest of this paper is organized as follows. In Section 2, a brief background on tensors is provided, which is necessary for the following developments. Next, in Section 3, the iterative Wiener filter tailored for the identification of trilinear forms is proposed. Section 4 contains the derivation of the LMS and NLMS adaptive algorithms for trilinear forms. Simulation results are provided in Section 5, while Section 6 concludes the paper.
2. Background on Tensors
A tensor is a multidimensional array of data whose entries are referred to by using multiple indices [31,32]. A tensor, a matrix, a vector, and a scalar can be denoted by $\mathcal{A}$, $\mathbf{A}$, $\mathbf{a}$, and $a$, respectively. In this paper, we are only interested in the third-order tensor $\mathcal{A} \in \mathbb{R}^{L_1 \times L_2 \times L_3}$, meaning that its elements are real-valued and its dimension is $L_1 \times L_2 \times L_3$. For a third-order tensor, the first and second indices, $l_1$ and $l_2$, correspond to the row and column, respectively (as in a matrix), while the third index, $l_3$, corresponds to the tube and describes its depth. These three indices describe the three different modes. The entries of the tensor, the matrix, and the vector are denoted by $a_{l_1 l_2 l_3}$, $a_{l_1 l_2}$, and $a_{l_1}$, for $l_1 = 1, 2, \ldots, L_1$, $l_2 = 1, 2, \ldots, L_2$, and $l_3 = 1, 2, \ldots, L_3$.
The notion of vectorization, which consists of transforming a matrix into a vector, is very well known. Matricization does somewhat the same thing, but it transforms a third-order tensor into a large matrix. Depending on which index's elements are considered first, we have matricization along three different modes [24,25], where $\mathbf{A}_{l_3} \in \mathbb{R}^{L_1 \times L_2}$, $l_3 = 1, 2, \ldots, L_3$, are the frontal slices. Hence, the vectorization of a tensor is obtained by stacking the vectorized frontal slices on top of each other, that is, $\mathrm{vec}(\mathcal{A}) = \begin{bmatrix} \mathrm{vec}^T(\mathbf{A}_1) & \mathrm{vec}^T(\mathbf{A}_2) & \cdots & \mathrm{vec}^T(\mathbf{A}_{L_3}) \end{bmatrix}^T$.
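As an illustration (not part of the original development), a small NumPy sketch of matricization along the first mode and of tensor vectorization is given below, assuming the frontal slices are simply concatenated; variable names and dimensions are chosen for the example only.

```python
import numpy as np

# Minimal sketch: mode-1 matricization and vectorization of a third-order
# tensor A of dimension L1 x L2 x L3, assuming the frontal slices A[:, :, l3]
# are concatenated side by side (and stacked column-wise for vectorization).
L1, L2, L3 = 2, 3, 4
A = np.random.randn(L1, L2, L3)

# Mode-1 matricization: [A_1  A_2  ...  A_L3], of size L1 x (L2 * L3)
A_mode1 = np.concatenate([A[:, :, l3] for l3 in range(L3)], axis=1)

# Vectorization: stack the (column-major) vectorized frontal slices
vec_A = np.concatenate([A[:, :, l3].flatten(order="F") for l3 in range(L3)])

print(A_mode1.shape, vec_A.shape)  # (2, 12) (24,)
```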
Let $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ be vectors of lengths $L_1$, $L_2$, and $L_3$, respectively, whose elements are $a_{l_1}$, $b_{l_2}$, and $c_{l_3}$. A rank-1 tensor (of dimension $L_1 \times L_2 \times L_3$) is defined as $\mathcal{B} = \mathbf{a} \circ \mathbf{b} \circ \mathbf{c}$, where ∘ is the vector outer product and the elements of $\mathcal{B}$ are given by $b_{l_1 l_2 l_3} = a_{l_1} b_{l_2} c_{l_3}$. The frontal slices of $\mathcal{B}$ in Equation (1) are $\mathbf{B}_{l_3} = c_{l_3} \mathbf{a} \mathbf{b}^T$, $l_3 = 1, 2, \ldots, L_3$. In particular, we have $\mathbf{a} \circ \mathbf{b} = \mathbf{a} \mathbf{b}^T$, where the superscript $T$ denotes the transpose operator. Therefore, the rank of a tensor $\mathcal{A}$, denoted $\mathrm{rank}(\mathcal{A})$, is defined as the minimum number of rank-1 tensors that generate $\mathcal{A}$ as their sum.
The inner product between two tensors $\mathcal{A}$ and $\mathcal{B}$ of the same dimension is the sum of the products of their entries, that is, $\langle \mathcal{A}, \mathcal{B} \rangle = \sum_{l_1=1}^{L_1} \sum_{l_2=1}^{L_2} \sum_{l_3=1}^{L_3} a_{l_1 l_2 l_3} b_{l_1 l_2 l_3}$.
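The following short sketch (illustrative only) builds a rank-1 tensor from three vectors via outer products, checks the frontal-slice property, and computes the inner product of two tensors of the same dimension.

```python
import numpy as np

# Minimal sketch: rank-1 tensor built from three vectors via outer products,
# and the inner product between two tensors (sum of element-wise products).
L1, L2, L3 = 2, 3, 4
a, b, c = np.random.randn(L1), np.random.randn(L2), np.random.randn(L3)

# Rank-1 tensor: B[l1, l2, l3] = a[l1] * b[l2] * c[l3]
B = np.einsum("i,j,k->ijk", a, b, c)

# Frontal slices are scaled outer products: B[:, :, l3] = c[l3] * a b^T
assert np.allclose(B[:, :, 0], c[0] * np.outer(a, b))

# Inner product between two tensors of the same dimension
A = np.random.randn(L1, L2, L3)
inner = np.sum(A * B)
```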
It is important to be able to multiply a tensor with a matrix [26,27]. Let the tensor be $\mathcal{A} \in \mathbb{R}^{L_1 \times L_2 \times L_3}$ and the matrix be $\mathbf{U} \in \mathbb{R}^{P_1 \times L_1}$. The mode-1 product between the tensor $\mathcal{A}$ and the matrix $\mathbf{U}$ gives the tensor $\mathcal{A} \times_1 \mathbf{U}$, of dimension $P_1 \times L_2 \times L_3$, whose entries are $\sum_{l_1=1}^{L_1} a_{l_1 l_2 l_3} u_{p_1 l_1}$, for $p_1 = 1, 2, \ldots, P_1$, $l_2 = 1, 2, \ldots, L_2$, and $l_3 = 1, 2, \ldots, L_3$. In the same way, with the matrix $\mathbf{V} \in \mathbb{R}^{P_2 \times L_2}$, the mode-2 product between the tensor $\mathcal{A}$ and the matrix $\mathbf{V}$ gives the tensor $\mathcal{A} \times_2 \mathbf{V}$, of dimension $L_1 \times P_2 \times L_3$, whose entries are $\sum_{l_2=1}^{L_2} a_{l_1 l_2 l_3} v_{p_2 l_2}$, for $l_1 = 1, 2, \ldots, L_1$, $p_2 = 1, 2, \ldots, P_2$, and $l_3 = 1, 2, \ldots, L_3$. Finally, with the matrix $\mathbf{W} \in \mathbb{R}^{P_3 \times L_3}$, the mode-3 product between the tensor $\mathcal{A}$ and the matrix $\mathbf{W}$ gives the tensor $\mathcal{A} \times_3 \mathbf{W}$, of dimension $L_1 \times L_2 \times P_3$, whose entries are $\sum_{l_3=1}^{L_3} a_{l_1 l_2 l_3} w_{p_3 l_3}$, for $l_1 = 1, 2, \ldots, L_1$, $l_2 = 1, 2, \ldots, L_2$, and $p_3 = 1, 2, \ldots, P_3$.
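A compact sketch of the three mode products, using the same (assumed) matrix names and dimensions as above, may look as follows; `einsum` is simply a convenient way to express the summation over each mode.

```python
import numpy as np

# Minimal sketch: mode-1, mode-2, and mode-3 products between a tensor A
# (L1 x L2 x L3) and matrices U, V, W acting along each mode (names illustrative).
L1, L2, L3 = 2, 3, 4
P1, P2, P3 = 5, 6, 7
A = np.random.randn(L1, L2, L3)
U = np.random.randn(P1, L1)
V = np.random.randn(P2, L2)
W = np.random.randn(P3, L3)

A_x1_U = np.einsum("pi,ijk->pjk", U, A)  # mode-1 product, size P1 x L2 x L3
A_x2_V = np.einsum("pj,ijk->ipk", V, A)  # mode-2 product, size L1 x P2 x L3
A_x3_W = np.einsum("pk,ijk->ijp", W, A)  # mode-3 product, size L1 x L2 x P3
```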
The multiplication of $\mathcal{A}$ with the row vectors $\mathbf{a}^T$, $\mathbf{b}^T$, and $\mathbf{c}^T$ (see the components of $\mathcal{B}$ in Equation (1)) gives the scalar $\mathcal{A} \times_1 \mathbf{a}^T \times_2 \mathbf{b}^T \times_3 \mathbf{c}^T = \sum_{l_1=1}^{L_1} \sum_{l_2=1}^{L_2} \sum_{l_3=1}^{L_3} a_{l_1 l_2 l_3}\, a_{l_1} b_{l_2} c_{l_3}$. In particular, when $\mathbf{c}$ is kept fixed, a bilinear form in $\mathbf{a}$ and $\mathbf{b}$ results and, when both $\mathbf{b}$ and $\mathbf{c}$ are kept fixed, a linear form in $\mathbf{a}$ results. It is easy to check that Equations (5)–(7) are trilinear (with respect to $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$), bilinear (with respect to $\mathbf{a}$ and $\mathbf{b}$), and linear (with respect to $\mathbf{a}$) forms, respectively.
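To make the distinction concrete, the following sketch evaluates the same quantity as a trilinear, a bilinear, and a linear form, by fixing none, one, or two of the vectors.

```python
import numpy as np

# Minimal sketch: contracting a third-order tensor with three, two, or one
# vectors gives a trilinear, bilinear, or linear form, respectively.
L1, L2, L3 = 2, 3, 4
A = np.random.randn(L1, L2, L3)
a, b, c = np.random.randn(L1), np.random.randn(L2), np.random.randn(L3)

trilinear = np.einsum("ijk,i,j,k->", A, a, b, c)   # scalar, linear in each of a, b, c
bilinear  = np.einsum("ijk,i,j->k", A, a, b) @ c   # same value, bilinear in a, b for fixed c
linear    = a @ np.einsum("ijk,j,k->i", A, b, c)   # same value, linear in a for fixed b, c

assert np.allclose(trilinear, bilinear) and np.allclose(trilinear, linear)
```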
Equation (6) can also be expressed in terms of the trace of a square matrix, denoted $\mathrm{tr}(\cdot)$, and the Kronecker product, denoted ⊗. Expression (5) can also be written in a more convenient way. Indeed, we have $\mathcal{A} \times_1 \mathbf{a}^T \times_2 \mathbf{b}^T \times_3 \mathbf{c}^T = \langle \mathcal{A}, \mathcal{B} \rangle$, where $\mathcal{B}$ is defined in Equation (1).
3. Trilinear Wiener Filter
Let us consider the following signal model, which is used in system identification tasks: $d(t) = y(t) + w(t)$, where $d(t)$ is the desired (also known as reference) signal at time index $t$, $y(t)$ is the output signal of a MISO system, and $w(t)$ is a zero-mean additive noise, uncorrelated with the input signals. The zero-mean input signals can be described in a tensorial form $\mathcal{X}(t)$, of dimension $L_1 \times L_2 \times L_3$, and the three impulse responses $\mathbf{h}_1$, $\mathbf{h}_2$, and $\mathbf{h}_3$, of lengths $L_1$, $L_2$, and $L_3$, respectively, can be written as column vectors, so that the output signal results as $y(t) = \mathcal{X}(t) \times_1 \mathbf{h}_1^T \times_2 \mathbf{h}_2^T \times_3 \mathbf{h}_3^T$. It can be seen that $y(t)$ represents a trilinear form because it is a linear function of each of the vectors $\mathbf{h}_1$, $\mathbf{h}_2$, and $\mathbf{h}_3$, if the other two are kept fixed. The trilinear form can be regarded as an extension of the bilinear form [7].
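A minimal sketch of this signal model is given below, assuming (as in the development above) that the output results from contracting the input tensor with the three impulse responses along the three modes; the symbol names and signal statistics are illustrative.

```python
import numpy as np

# Minimal sketch of the trilinear signal model: y(t) is obtained by contracting
# the input tensor X(t) with the impulse responses h1, h2, h3 along each mode;
# d(t) adds a zero-mean noise sample w(t). Values are illustrative.
L1, L2, L3 = 4, 3, 2
h1, h2, h3 = np.random.randn(L1), np.random.randn(L2), np.random.randn(L3)

def output_sample(X, h1, h2, h3):
    """Trilinear output: X x_1 h1^T x_2 h2^T x_3 h3^T (a scalar)."""
    return np.einsum("ijk,i,j,k->", X, h1, h2, h3)

X_t = np.random.randn(L1, L2, L3)           # input tensor at time t
w_t = 0.01 * np.random.randn()              # additive noise sample
d_t = output_sample(X_t, h1, h2, h3) + w_t  # reference (desired) signal
```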
Starting from the three impulse responses of the MISO system, a rank-1 tensor of dimension $L_1 \times L_2 \times L_3$ can be constructed in the following way: $\mathcal{H} = \mathbf{h}_1 \circ \mathbf{h}_2 \circ \mathbf{h}_3$. Consequently, the output signal can be written as a sum of terms involving the frontal slices, where $\mathbf{H}_{l_3}$ and $\mathbf{X}_{l_3}(t)$ (with $l_3 = 1, 2, \ldots, L_3$) are the frontal slices of $\mathcal{H}$ and $\mathcal{X}(t)$, respectively, while $\mathbf{g} = \mathrm{vec}(\mathcal{H})$ and $\mathbf{x}(t) = \mathrm{vec}[\mathcal{X}(t)]$ denote two long vectors, each of them having $L_1 L_2 L_3$ elements. Hence, the output signal can be expressed as $y(t) = \mathbf{g}^T \mathbf{x}(t)$.
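The following sketch illustrates this equivalence, under the assumption of column-major vectorization, for which the global impulse response equals the Kronecker product of the three filters taken in reversed order.

```python
import numpy as np

# Minimal sketch: the global impulse response g is the (column-major)
# vectorization of the rank-1 tensor H = h1 o h2 o h3, which equals the
# Kronecker product h3 (x) h2 (x) h1; the output is then y(t) = g^T x(t).
L1, L2, L3 = 4, 3, 2
h1, h2, h3 = np.random.randn(L1), np.random.randn(L2), np.random.randn(L3)
X_t = np.random.randn(L1, L2, L3)

H = np.einsum("i,j,k->ijk", h1, h2, h3)            # rank-1 tensor
g = H.flatten(order="F")                            # global impulse response (L1*L2*L3 coefficients)
assert np.allclose(g, np.kron(h3, np.kron(h2, h1)))

x_t = X_t.flatten(order="F")                        # vectorized input tensor
y_t = g @ x_t                                       # same value as the mode-product contraction
assert np.allclose(y_t, np.einsum("ijk,i,j,k->", X_t, h1, h2, h3))
```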
In this framework, the aim is to estimate the global impulse response $\mathbf{g}$. For that, first we define the error signal $e(t) = d(t) - \hat{y}(t)$, where $\hat{y}(t) = \hat{\mathbf{g}}^T \mathbf{x}(t)$ represents the estimated signal, obtained using the estimated impulse response $\hat{\mathbf{g}}$ of length $L_1 L_2 L_3$.
Based on Equation (16), let us consider the mean-squared error (MSE) optimization criterion, that is, the minimization of the cost function $J(\hat{\mathbf{g}}) = E[e^2(t)]$, where $E[\cdot]$ denotes mathematical expectation. Using Equation (16) in Equation (17), together with the notation $\sigma_d^2 = E[d^2(t)]$ (the variance of the reference signal), $\mathbf{p} = E[\mathbf{x}(t) d(t)]$ (the cross-correlation vector between the input and the reference signals), and $\mathbf{R} = E[\mathbf{x}(t) \mathbf{x}^T(t)]$ (the covariance matrix of the input signal), the cost function can be developed as a quadratic function of $\hat{\mathbf{g}}$. By minimizing Equation (18), we obtain the well-known Wiener filter. As we can see, the dimension of the covariance matrix is $L_1 L_2 L_3 \times L_1 L_2 L_3$, thus requiring a large amount of data (much more than $L_1 L_2 L_3$ samples) to obtain a good estimate of it. Furthermore, $\mathbf{R}$ could be very ill-conditioned because of its huge size. As a result, the solution will be very inaccurate, to say the least, in practice.
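For completeness, a compact sketch of this standard development, using only the notation defined above, is:

```latex
J\left(\widehat{\mathbf{g}}\right) = E\left[e^{2}(t)\right]
  = \sigma_{d}^{2} - 2\,\widehat{\mathbf{g}}^{T}\mathbf{p}
    + \widehat{\mathbf{g}}^{T}\mathbf{R}\,\widehat{\mathbf{g}},
\qquad
\frac{\partial J}{\partial \widehat{\mathbf{g}}} = \mathbf{0}
\;\;\Rightarrow\;\;
\widehat{\mathbf{g}}_{\mathrm{W}} = \mathbf{R}^{-1}\mathbf{p}.
```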
On the other hand, as we notice from Equation (13), the global impulse response $\mathbf{g}$ (with $L_1 L_2 L_3$ coefficients) results based on a combination of the shorter impulse responses $\mathbf{h}_1$, $\mathbf{h}_2$, and $\mathbf{h}_3$, with $L_1$, $L_2$, and $L_3$ coefficients, respectively. In fact, we only need $L_1 + L_2 + L_3$ different elements to form $\mathbf{g}$, even though this global impulse response is of length $L_1 L_2 L_3$. This represents the motivation behind an alternative approach to the conventional Wiener solution. Similar to Equation (13), the estimate of the global system can be decomposed as in Equation (20), where $\hat{\mathbf{h}}_1$, $\hat{\mathbf{h}}_2$, and $\hat{\mathbf{h}}_3$ are three impulse responses of lengths $L_1$, $L_2$, and $L_3$, respectively, which represent the estimates of the individual impulse responses $\mathbf{h}_1$, $\mathbf{h}_2$, and $\mathbf{h}_3$. Nevertheless, we should note that there is no unique solution related to the decomposition in Equation (20), since for any constants $c_1$, $c_2$, and $c_3$, with $c_1 c_2 c_3 = 1$, the scaled filters $c_1 \hat{\mathbf{h}}_1$, $c_2 \hat{\mathbf{h}}_2$, and $c_3 \hat{\mathbf{h}}_3$ lead to the same global impulse response. Consequently, any such scaled triplet also represents a set of solutions for our problem. However, the global system impulse response, $\mathbf{g}$, can be identified with no scaling ambiguity.
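A quick numerical check of this scaling ambiguity (assuming, for the sake of the example, that the global impulse response is formed as the Kronecker product of the three filters) is:

```python
import numpy as np

# Minimal sketch: scaling the three component filters by constants c1, c2, c3
# with c1 * c2 * c3 = 1 leaves the global impulse response unchanged, so only
# the global filter is identifiable without scaling ambiguity.
L1, L2, L3 = 4, 3, 2
h1, h2, h3 = np.random.randn(L1), np.random.randn(L2), np.random.randn(L3)
g = np.kron(h3, np.kron(h2, h1))

c1, c2 = 2.0, -0.5
c3 = 1.0 / (c1 * c2)
g_scaled = np.kron(c3 * h3, np.kron(c2 * h2, c1 * h1))

assert np.allclose(g, g_scaled)
```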
Next, we propose an iterative alternative to the conventional Wiener filter, following the decomposition from Equation (20). First, we can easily verify that the global filter can be expressed as a linear function of each individual filter (with the other two kept fixed), where $\mathbf{I}_{L_i}$ denotes the identity matrix of size $L_i \times L_i$, for $i = 1, 2, 3$. Based on the previous relations, the cost function from Equation (18) can be expressed in three different ways. For example, using Equation (21), we obtain the form in Equation (24). When $\hat{\mathbf{h}}_2$ and $\hat{\mathbf{h}}_3$ are fixed, we can rewrite Equation (24) as the partial cost function in Equation (25), where the corresponding reduced-size statistics are defined in Equations (26) and (27). In this case, the partial cost function from Equation (25) is a convex one and can be minimized with respect to $\hat{\mathbf{h}}_1$. Similarly, using Equations (22) and (23), the cost function from Equation (18) can be written in the forms of Equations (28) and (29), respectively. Also, when $\hat{\mathbf{h}}_1$ and $\hat{\mathbf{h}}_3$ are fixed, Equation (28) becomes the partial cost function in Equation (30), with the corresponding statistics defined in Equations (31) and (32), while when $\hat{\mathbf{h}}_1$ and $\hat{\mathbf{h}}_2$ are fixed, the cost function from Equation (29) results in the partial cost function of Equation (33), with the corresponding statistics defined in Equations (34) and (35). In both cases, the partial cost functions from Equations (30) and (33) can be minimized with respect to $\hat{\mathbf{h}}_2$ and $\hat{\mathbf{h}}_3$, respectively.
The previous procedure suggests an iterative approach. To start the algorithm, a set of initial values should be provided for two of the estimated impulse responses. For example, we can choose the initial estimates $\hat{\mathbf{h}}_2(0)$ and $\hat{\mathbf{h}}_3(0)$. Hence, based on Equations (26) and (27), one may compute the quantities in Equations (36) and (37). In the first iteration, the first cost function to be minimized results from Equation (25) (using Equations (36) and (37)), that is, Equation (38), which leads to the solution $\hat{\mathbf{h}}_1(1)$. Also, since $\hat{\mathbf{h}}_1(1)$ and $\hat{\mathbf{h}}_3(0)$ are now available, we can evaluate (based on Equations (31) and (32)) the quantities in Equations (39) and (40), so that the cost function from Equation (30) becomes Equation (41), while its minimization leads to $\hat{\mathbf{h}}_2(1)$. Finally, using the solutions $\hat{\mathbf{h}}_1(1)$ and $\hat{\mathbf{h}}_2(1)$, we can find $\hat{\mathbf{h}}_3(1)$ in a similar manner. First, we evaluate (based on Equations (34) and (35)) the quantities in Equations (42) and (43). Then, we minimize the cost function that results from Equation (33), that is, Equation (44), which leads to the solution $\hat{\mathbf{h}}_3(1)$. Continuing the iterative procedure, at iteration number $n$, we obtain the estimates $\hat{\mathbf{h}}_1(n)$, $\hat{\mathbf{h}}_2(n)$, and $\hat{\mathbf{h}}_3(n)$ of the impulse responses by repeating the same sequence of minimization steps. Thus, the global impulse response at iteration $n$ is obtained by combining these three estimates, as in Equation (20).
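To make the procedure concrete, the following numerical sketch (not taken from the paper) performs such an alternating minimization, assuming that the global filter is formed as the Kronecker product of the three short filters in reversed order and, for simplicity, that the input is white so that the statistics R and p are known exactly; all names and values are illustrative.

```python
import numpy as np

# Numerical sketch of an iterative (alternating) trilinear Wiener filter,
# assuming g = h3 (x) h2 (x) h1 and a white input (R = I, p = g_true).
# Each step solves a small quadratic problem for one filter while the
# other two are kept fixed.
rng = np.random.default_rng(0)
L1, L2, L3 = 4, 3, 2
h1, h2, h3 = rng.standard_normal(L1), rng.standard_normal(L2), rng.standard_normal(L3)
g_true = np.kron(h3, np.kron(h2, h1))

R = np.eye(L1 * L2 * L3)   # input covariance (white input, unit variance)
p = g_true.copy()          # cross-correlation vector

# Initialization of two of the estimated filters (arbitrary nonzero values)
h2_hat, h3_hat = np.ones(L2), np.ones(L3)

def partial_wiener(Q, R, p):
    """Minimize the quadratic cost -2 (Q h)^T p + (Q h)^T R (Q h) over h."""
    return np.linalg.solve(Q.T @ R @ Q, Q.T @ p)

for n in range(5):
    Q1 = np.kron(h3_hat.reshape(-1, 1), np.kron(h2_hat.reshape(-1, 1), np.eye(L1)))
    h1_hat = partial_wiener(Q1, R, p)
    Q2 = np.kron(h3_hat.reshape(-1, 1), np.kron(np.eye(L2), h1_hat.reshape(-1, 1)))
    h2_hat = partial_wiener(Q2, R, p)
    Q3 = np.kron(np.eye(L3), np.kron(h2_hat.reshape(-1, 1), h1_hat.reshape(-1, 1)))
    h3_hat = partial_wiener(Q3, R, p)

g_hat = np.kron(h3_hat, np.kron(h2_hat, h1_hat))
print(np.linalg.norm(g_hat - g_true) / np.linalg.norm(g_true))  # relative misalignment
```

In this separable synthetic scenario, a few sweeps are typically sufficient for the global estimate to match the true response, even though the individual factors are only identified up to scaling.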
The proposed iterative Wiener filter for trilinear forms represents an extension of the solution presented in Reference [7] (in the context of bilinear forms). However, when the MISO system identification problem is formulated based on Equation (10), it is more advantageous to use the algorithm tailored for trilinear forms instead of reformulating the problem in terms of multiple bilinear forms. The trilinear approach has some similarities (to some extent) with the one introduced in Reference [33]. However, the batch Trilinear Wiener-Hopf (TriWH) algorithm from Reference [33] is closer to an adaptive approach, since the statistics are estimated within the algorithm. On the other hand, in the case of our iterative Wiener filter, the estimates of the statistics are considered to be a priori available (see also the related discussion in the next section), which is basically in the spirit of the Wiener filter.
4. LMS and NLMS Algorithms for Trilinear Forms
It is well-known that the Wiener filter presents several limitations that may make it unsuitable to be used in practice (e.g., the matrix inversion operation, the correlation matrix estimation, etc.). For this reason, a more convenient manner of treating the system identification problem is through adaptive filtering. One of the simplest types of adaptive algorithms is the LMS, which will be presented in the following, tailored for the new trilinear form approach.
First, let us consider the three estimated impulse responses $\hat{\mathbf{h}}_1(t)$, $\hat{\mathbf{h}}_2(t)$, and $\hat{\mathbf{h}}_3(t)$, together with the corresponding a priori error signals, each of them defined with respect to one of the three filters while using the most recent estimates of the other two. It can be checked that these three a priori error signals are equal. In this context, the LMS updates for the three filters are the following: where $\mu_1$, $\mu_2$, and $\mu_3$ represent the step-size parameters. Relations (49)–(51) define the LMS algorithm for trilinear forms, namely LMS-TF.
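As an illustration, the sketch below runs LMS-type updates in the spirit of LMS-TF on synthetic data; the way the error and the contracted input vectors are computed, the initialization, and the step-size values are assumptions made for this example, not the exact relations of the algorithm.

```python
import numpy as np

# Minimal sketch of LMS-type updates for the three short filters, assuming the
# same a priori error is used for all three and each filter sees the input
# tensor contracted with the other two estimates. Values are illustrative.
rng = np.random.default_rng(1)
L1, L2, L3 = 4, 3, 2
h1, h2, h3 = rng.standard_normal(L1), rng.standard_normal(L2), rng.standard_normal(L3)

h1_hat, h2_hat, h3_hat = np.zeros(L1), np.ones(L2), np.ones(L3)
mu1 = mu2 = mu3 = 0.005

for t in range(20000):
    X_t = rng.standard_normal((L1, L2, L3))
    d_t = np.einsum("ijk,i,j,k->", X_t, h1, h2, h3) + 1e-3 * rng.standard_normal()

    # a priori error (same value for all three filters)
    e_t = d_t - np.einsum("ijk,i,j,k->", X_t, h1_hat, h2_hat, h3_hat)

    # contracted input vectors seen by each short filter
    x1 = np.einsum("ijk,j,k->i", X_t, h2_hat, h3_hat)
    x2 = np.einsum("ijk,i,k->j", X_t, h1_hat, h3_hat)
    x3 = np.einsum("ijk,i,j->k", X_t, h1_hat, h2_hat)

    h1_hat = h1_hat + mu1 * x1 * e_t
    h2_hat = h2_hat + mu2 * x2 * e_t
    h3_hat = h3_hat + mu3 * x3 * e_t

g_hat = np.kron(h3_hat, np.kron(h2_hat, h1_hat))
g_true = np.kron(h3, np.kron(h2, h1))
# relative misalignment of the global estimate (decreases as the filters adapt)
print(np.linalg.norm(g_hat - g_true) / np.linalg.norm(g_true))
```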
For the initialization of the estimated impulse responses, we use the values given in Equations (52)–(54). In the end, we can obtain the global filter using Relation (20). Alternatively, this global impulse response may be identified directly with the regular LMS algorithm by using the following update: where $\mathbf{x}(t)$ is defined in Equation (56) and $\mu$ is the global step-size parameter.
However, an observation needs to be made regarding the update in Relation (55): it involves the presence of an adaptive filter of length $L_1 L_2 L_3$, whereas the LMS-TF algorithm, which is defined by the update Relations (49)–(51), uses three shorter filters of lengths $L_1$, $L_2$, and $L_3$, respectively. Basically, a system identification problem of size $L_1 L_2 L_3$ (as in the regular approach) was reformulated in terms of three shorter filters of lengths $L_1$, $L_2$, and $L_3$. Taking into account that we usually have $L_1 + L_2 + L_3 \ll L_1 L_2 L_3$, the advantage of the trilinear approach (in terms of reducing the complexity) could be important. Therefore, the complexity of this new approach is lower and the convergence rate is expected to be faster.
The step-size parameters in Relations (49)–(51) take constant values, chosen such that they ensure the convergence of the algorithm and a good compromise between convergence speed and steady-state misadjustment. Nevertheless, when dealing with non-stationary signals, it may be more appropriate to use time-dependent step-sizes, which lead to the following update relations: For deriving the expressions of the step-size parameters, we take the stability conditions into consideration and we aim to cancel the following expressions, which represent the a posteriori error signals [34]: By replacing Relation (49) in (60), Relation (50) in (61), and Relation (51) in (62), respectively, and by imposing the condition that each of these a posteriori error signals becomes zero, we obtain the corresponding relations for the time-dependent step-sizes. Consequently, assuming that the a priori error signals are different from zero, the following expressions for the step-size parameters result:
In order to obtain a good balance between convergence rate and misadjustment, three positive constants, $\alpha_1$, $\alpha_2$, and $\alpha_3$, are employed [35]. In addition, three regularization constants, $\delta_1$, $\delta_2$, and $\delta_3$, usually chosen to be proportional to the variance of the input signal [36], are added to the denominators of the step-size parameters. Finally, the updates of the NLMS algorithm for trilinear forms (NLMS-TF) become
The initializations of the estimated filters may be the same as in Equations (52)–(54). In a similar way as for the LMS algorithm, the global impulse response can be identified using the regular NLMS: where $\mathbf{x}(t)$ is given in Equation (56). The parameters $\alpha$ and $\delta$ represent the normalized step-size parameter and the regularization constant for the global filter, respectively. As previously shown in Reference [9] for bilinear forms, the global misalignment can be controlled by using a constraint on the sum of the normalized step-sizes, and this sum should be smaller than 1. In this way, for different values of $\alpha_1$, $\alpha_2$, and $\alpha_3$ fulfilling this condition, the misalignment of the global filter is the same. On the other hand, in the case when $\alpha_1 = \alpha_2 = \alpha_3$, the three filters achieve the same level of the misalignment.
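For illustration, the following sketch applies normalized (NLMS-type) updates to the three short filters, using illustrative normalized step-sizes whose sum is smaller than 1 and small regularization constants; it is a sketch in the spirit of NLMS-TF, not a reproduction of the exact update relations.

```python
import numpy as np

# Minimal sketch of NLMS-type updates for the three short filters: each step
# size is normalized by the energy of the corresponding contracted input
# vector, plus a regularization constant. Values are illustrative (sum of the
# normalized step sizes is kept below 1).
rng = np.random.default_rng(2)
L1, L2, L3 = 4, 3, 2
h1, h2, h3 = rng.standard_normal(L1), rng.standard_normal(L2), rng.standard_normal(L3)

h1_hat, h2_hat, h3_hat = np.zeros(L1), np.ones(L2), np.ones(L3)
alpha1 = alpha2 = alpha3 = 0.3          # normalized step sizes (sum < 1)
delta1 = delta2 = delta3 = 1e-6         # regularization constants

for t in range(5000):
    X_t = rng.standard_normal((L1, L2, L3))
    d_t = np.einsum("ijk,i,j,k->", X_t, h1, h2, h3) + 1e-3 * rng.standard_normal()
    e_t = d_t - np.einsum("ijk,i,j,k->", X_t, h1_hat, h2_hat, h3_hat)

    x1 = np.einsum("ijk,j,k->i", X_t, h2_hat, h3_hat)
    x2 = np.einsum("ijk,i,k->j", X_t, h1_hat, h3_hat)
    x3 = np.einsum("ijk,i,j->k", X_t, h1_hat, h2_hat)

    h1_hat = h1_hat + alpha1 * x1 * e_t / (delta1 + x1 @ x1)
    h2_hat = h2_hat + alpha2 * x2 * e_t / (delta2 + x2 @ x2)
    h3_hat = h3_hat + alpha3 * x3 * e_t / (delta3 + x3 @ x3)

g_hat = np.kron(h3_hat, np.kron(h2_hat, h1_hat))
g_true = np.kron(h3, np.kron(h2, h1))
print(np.linalg.norm(g_hat - g_true) / np.linalg.norm(g_true))  # relative misalignment
```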
Again, we notice that the identification of the global impulse response involves the use of a filter of length $L_1 L_2 L_3$. Because the trilinear approach uses three much shorter impulse responses of lengths $L_1$, $L_2$, and $L_3$, respectively, it is expected that this new solution will yield a faster convergence. This will be shown through simulations.
The NLMS-TF algorithm proposed here is similar to the one presented in Reference [19]. However, our choice of the system impulse responses used in simulations is different from that in Reference [19]; here, we aim to show the performance of the algorithm in a scenario that includes a real echo path. In addition, we also study the tracking capability of the algorithm.