Autoencoder-Based Iterative Modeling and Multivariate Time-Series Subsequence Clustering Algorithm
ABSTRACT This paper introduces an algorithm for the detection of change-points and the identification of the corresponding subsequences in transient multivariate time-series data (MTSD). The analysis of such data has become increasingly important due to its growing availability in many industrial fields. Labeling, sorting or filtering highly transient measurement data for training Condition-based Maintenance (CbM) models is cumbersome and error-prone. For some applications it can be sufficient to filter measurements by simple thresholds or to find change-points based on changes in mean value and variation. But for a robust diagnosis of, for example, a component within a component group that exhibits complex non-linear correlations between multiple sensor values, such a simple approach would not be feasible: no meaningful and coherent measurement data that could be used for training a CbM model would emerge. Therefore, we introduce an algorithm that uses a recurrent neural network (RNN) based Autoencoder (AE) which is iteratively trained on incoming data. The scoring function uses the reconstruction error and latent space information. A model of the identified subsequence is saved and used for recognition of repeating subsequences as well as fast offline clustering. For evaluation, we propose a new similarity measure based on the curvature for a more intuitive time-series subsequence clustering metric. A comparison with seven other state-of-the-art algorithms and eight datasets shows the capability and the increased performance of our algorithm for clustering MTSD online and offline in conjunction with mechatronic systems.
INDEX TERMS Condition-based maintenance, multivariate time-series data, change point detection, unsupervised clustering, autoencoder, segmentation, subsequence clustering.
in idle mode during waiting or preparation time for a longer trip or other measurements. Also, very transient episodes exist (e.g., Real Driving Emissions (RDE)). All of these measurements do not necessarily share the same calibration of the underlying mechatronic system. To train a robust model of the mechatronic system, component group or a single component, a big effort has to be put into the design of the experiments alone, not to mention the experiments themselves. Therefore, a method of automatically labeling existing measurements is advantageous. Afterwards, an automatic sorting of the labeled time sequences by statistical methods is possible, enabling a data-driven mechatronic diagnosis approach.

Using advanced unsupervised approaches for CbM allows the data to be unlabeled (otherwise supervised methods could be used). In this case the labeling refers to the label of the condition (mechanical degradation) of the monitored system. When trying to diagnose mechatronic systems that have many operating points and are free to transfer in between those, or are capable of totally transient operation modes, a robust diagnosis of the actual condition of the mechatronic system is extremely challenging. An early and reliable (robust) diagnosis of a mechatronic system prevents accidents, enables optimal maintenance and increases uptime of machinery. Without knowledge of the current condition of the system, fault prevention can only be done by predetermined maintenance intervals. The motivation is therefore to monitor the health condition of the mechatronic system as closely as possible, resulting in the task of separating discrete sensory data into uniquely identifiable and recognizable segments or subsequences. This is beneficial to the performance of anomaly detection, because once all normally occurring subsequences are identified, the detection of abnormal or faulty subsequences is straightforward.

When monitoring the health condition of a mechatronic system, it is state of the art to manually calibrate specific release conditions during which the condition monitoring is enabled. This is done to exclude operating points which are very rare, too transient or just not feasible for drawing conclusions about the condition of the mechatronic system. But even restricting the conditions under which to diagnose the machine (which already reduces the probability of diagnosing the machine at all, due to an operating state which is by chance outside the release conditions) cannot always improve the fault detection, identification and quantification of its magnitude. For example, in a mechatronic system with a complex nonlinear dependency of its subcomponents and its time dependency, ''going in'' or ''going out'' of the release conditions can result in very different system behavior. Comparing these two states does not lead to reliable conclusions about the mechatronic system's health condition.

Therefore, the kind of data sequences used for training/calibration and validation is crucial for any monitoring strategy. Manually screening, labeling and sorting data into comparable sequences is time-consuming, error-prone and cumbersome. Additionally, this is a decision process which requires expert and domain knowledge.

The new algorithm which we introduce in this work is capable of generating subsequence models from online streaming data which is processed sequentially. Any coherent subsequence that is identified can be recognized (clustered) if it occurs again. Depending on multiple sensitivity calibration parameters, time-varying data points are associated and identified as a subsequence. The parameters determine the volatility or the strength of the affiliation required to be recognized as one time-varying subsequence. These subsequence models can also be applied efficiently offline onto large existing datasets. During this prediction phase, the algorithm provides a vector of subsequence labels which were recognized as one from the training data. Depending on the calibration, it can also provide a label for unknown data which represents a phase where no pattern could be recognized. Otherwise, it finds the best fitting subsequence and labels it as that. The approach published in this work is currently only based on MTS input but could be adapted for univariate input. It is a multivariate time-series subsequence discovery and identification method.

Our contribution is a new algorithm for online subsequence clustering of MTSD called ''Autoencoder-based Iterative Modeling and Subsequence Clustering Algorithm (ABIMCA)'' and a new metric to evaluate cluster algorithms focused on this task, the ''Multivariate Time-Series Sub-Sequence Clustering Metric (MT3SCM)''. We compare our algorithm with
• seven other state-of-the-art algorithms
• eight datasets, of which six are publicly available and two are provided with our codebase
• three widely used unsupervised clustering metrics
• our own metric (MT3SCM) and its four components
while varying the use of default algorithm parameters with optimized parameters on each algorithm and dataset via random grid search.

II. RELATED WORK
In this section we define the terminology used and its semantics to categorize our work within the large bibliography existing in this field, and provide a selected list of related works and their ascendancy to this paper.

A. TERMINOLOGY AND SEMANTICS
Numerous possibilities have been described for achieving our main goal of segmenting discrete time-series sensory data. Most approaches can be sorted into the following partially overlapping categories: time-series analysis [3], pattern recognition [4], temporal knowledge discovery [5], motif discovery [6], change-point detection [7], data clustering [8] or anomaly detection [9]. All those terms refer to methods or algorithms which could be used directly or indirectly to achieve our goal. An explicit description of each term or category can be found in the stated references.

To limit the scope of this work, we focus on data clustering, which can be separated into six subcategories by the
perceptron (MLP), convolutional neural network (CNN), deep belief network (DBN), generative adversarial network (GAN) and variational autoencoder (VAE), among others [29].

The algorithms we use for comparison are listed in Table 1. All of these are online clustering algorithms that can be used for time-series clustering. Implementations are publicly available in the Python programming language (see library column in Table 1) and are well established and tested.

The Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) [16] algorithm is based on a clustering features (CF) tree, with the CF as a triple of the number of data points, their linear sum and their squared sum. This CF tree is built dynamically. It was also one of the earliest algorithms capable of online clustering. The Bayesian Online Change-point Detection (BOCPD) [17] algorithm, as the name suggests, uses Bayesian methods to detect change-points (CPs) online. Since this algorithm only detects CPs, we manipulated the result to be able to interpret every CP as the beginning of a new cluster. This algorithm therefore starts in our comparison with the limitation of not being able to recognize a previously seen cluster. The Stream Clustering Framework (CluStream) [19] algorithm is based on extended CF from BIRCH, followed by a k-means algorithm. The Density-based Stream Clustering (DBSTREAM) [20] algorithm is based on the Self Organizing density-based clustering over data Stream (SOStream) [30] and uses a shared density graph to capture the density between micro-clusters. The Density-Based Clustering over an Evolving Data Stream with Noise (DenStream) [21] algorithm is an extension of Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [31] which uses a damped window model of CF to create core-micro-clusters and outlier-micro-clusters. The Mini-Batch K-Means (MiniBatchKMeans) [22] algorithm proposes ''the use of mini-batch optimization for k-means clustering'' ([22]) to improve the k-means optimization problem. The STREAMKMeans [18] algorithm uses an adaptation of the original STREAM algorithm from [32], replacing the k-median subroutine LSEARCH by an incremental k-means algorithm. More information on and a comparison of most of the used algorithms can be found in [26] and [33].

As described in section I, the focus of this publication is on online multivariate time-dependent subsequence clustering using RNN-based AE. The number of algorithms within this scope is limited compared to the number of clustering algorithms in general. The following approaches use at least some of those prerequisites. Reference [34] emphasizes the term segmentation for an offline sliding window and bottom-up algorithm. Others convert the time-series into a Markov chain (MC) and then use a Bayesian method to cluster the MCs [35], referring to them as episodes. Here the data needs to be discretized into bins of equal length. Reference [36] uses manually selected characteristics (e.g., kurtosis, skewness and frequency) for clustering univariate TSD. Others use the Augmented Dickey-Fuller test to evaluate time-series stationarity and perform a segmentation based on it [37]. In [38] dynamic latent variables from a vector autoregression (VAR) model in combination with a principal component analysis (PCA) are used for segmenting industrial TSD. Reference [39] shows the advantage of an embedding approach as well, by introducing a PCA and a Vanilla-AE CP detection method with the restriction of focusing on multivariate power grid data.

Focused on transfer learning, [40] introduces an adversarial approach for domain adaptation using a stacked AE. An offline convolutional sparse AE used for supervised sequence classification was presented by [41] and adapted by [42] for unsupervised motif mining. Other AE-based papers are, for example, by [43], who use a mixture of AEs for image and text clustering. Stacked AE and k-means for offline clustering is done by [44] without considering time dependency. The combination of GRU-based AE and MTS for anomaly detection is shown by [45]. Reference [46] applies the sliding window approach on a CNN-based AE for anomaly detection of industrial robots. A similar approach using AE for MTS segmentation is published in [47]. The focus there is on change point detection using latent space variables only, and no clustering or identification of the subsequences is done.

Clustering strongly depends on the data and task provided: ''. . . ; each new clustering algorithm performs slightly better than the existing ones on a specific distribution of patterns.'' ([8, p. 268]). Therefore, we try to apply the algorithms on multiple different MTS datasets and compute different metrics for comparison. Large efforts are made by universities and governmental institutions to make datasets available to the scientific community and the public, to improve comparability and reproducibility [48], [49]. For this paper we focus on data with multivariate quantitative features with continuous values. For a list of the datasets see Table 3; a brief description is given in section IV.

Evaluating the performance of a clustering algorithm can be done with two different approaches. If external knowledge about the ground truth of each data point and its cluster is known, then so-called external measures can be applied. If no ground truth is available, internal measures need to suffice. Many external measures exist, like the well-known F1-score (based on the effectiveness measure by [50]). With the large amount of data available, and working in the context of transient machine behavior with the focus on finding internal states of the system, acquiring or providing the ground truth is time-consuming, error-prone and cumbersome (as described in section I). ''The definition of clusters depends on the user, the domain, and it is subjective.'' ([25, p. 30]). We therefore use internal measures for comparing our approach. Those internal measures commonly rely on a similarity measure of the actual data which is being clustered. Thorough work on metric comparison and similarity measures has been done in [25], [51], [52]. Most of those measures are based on simple distances and densities computed for each data point but do not take time dependency into consideration. Because of
this, we found that, for the use case described in this paper, the commonly used clustering evaluation measures are not well suited as time-series clustering evaluation measures. In section V we introduce an approach for similarity measures which considers time dependency in combination with well-established clustering metrics (see Table 2).

TABLE 2. Metrics used for time-series clustering comparison. Implementations used from [15].

III. PREREQUISITES
A single data point at time t with d features is denoted as

xt = (x0, x1, . . . , xd)T with x ∈ R and d ∈ N (1)

whereas the natural numbers include zero, {0, 1, 2, . . .} = N. A complete measurement sequence with n time steps using xt from Equation (1) is denoted as

X = (x0, x1, . . . , xn) with n ∈ N (2)

so X ∈ Rd×n. With time dependency consideration, it is reasonable to denote a sliding window of the streaming data, considering Equation (1) and n the number of samples already collected, as:

Wt = (xt+0, xt+1, . . . , xt+ζ) with (ζ ∈ N) ∧ (ζ < n) (3)

For the output of a clustering algorithm at time t we denote the scalar value yt as our label or designated subsequence identification. For evaluation purposes a clustering for a time-series produces a label array y for all time steps:

y = (y0, y1, . . . , yn) with y ∈ J and n ∈ N (11)

Furthermore, it is a requirement that the streaming data provided can be applied to a numerical differentiation algorithm. Therefore, a constant sample rate is necessary, and in case of strong noise, filtering or smoothing of the data should be applied in a preprocessing step. Also, there must not be missing values, and extreme outliers need to be removed.

4 https://sites.cc.gatech.edu/~borg/ijcv_psslds/
5 https://data.nasa.gov/dataset/C-MAPSS-Aircraft-Engine-Simulator-Data/xaut-bemq
6 http://www.timeseriesclassification.com/description.php?Dataset=EigenWorms
7 https://archive.ics.uci.edu/ml/datasets/Condition%20monitoring%20of%20hydraulic%20systems
8 http://mocap.cs.cmu.edu/
9 https://github.com/LuisM78/Occupancy-detection-data
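To make this notation concrete, the following minimal NumPy sketch (our illustration only; the array shapes mirror Equations (1)-(3) and (11)) shows how X, Wt and y can be represented:

import numpy as np

d, n = 3, 1000               # number of features and number of time steps
zeta = 20                    # sliding window length parameter, zeta < n

X = np.random.randn(d, n)    # complete measurement sequence, Eq. (2): X in R^(d x n)

def sliding_window(X: np.ndarray, t: int, zeta: int) -> np.ndarray:
    """Return Wt = (x_t, ..., x_(t+zeta)) as in Eq. (3)."""
    return X[:, t : t + zeta + 1]

Wt = sliding_window(X, t=0, zeta=zeta)   # shape (d, zeta + 1)

# A clustering algorithm assigns one integer label y_t per time step,
# producing the label array y of Eq. (11).
y = np.zeros(n, dtype=int)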
IV. DATASETS
All datasets used for comparison in this work are described briefly in this section and listed in Table 3. They all contain quantitative features with continuous values. For further use of the datasets, no missing values exist, the data is continuous and was standardized for the algorithms but not for the metric computations. No other preprocessing like smoothing or filtering was performed.

The bee-waggle dataset [56] contains movement of bees in a hive captured with a vision-based tracker. The first two features are the x and y coordinates of the bee, complemented by the sine and the cosine of the heading angle.

The cmapss dataset is a ''dataset of run-to-failure trajectories for a small fleet of aircraft engines under realistic flight conditions'' [57] with 18 features.

The eigen-worms dataset [58] contains measurements of worm motion. Preprocessing extracted six features, which represent the amplitudes along six previously identified base shapes of the worms.

The hydraulic dataset [59] is obtained from a hydraulic test rig measuring 17 process values such as pressures, volume flows and temperatures.

Lorenz-attractor refers to a synthetic dataset which is calculated using a system of three coupled ordinary differential equations which represent a hydrodynamic system: Ẋ = s(Y − X); Ẏ = rX − Y − XZ; Ż = XY − bZ with parameters s = 10, r = 28 and b = 2.667 (see Figure 2). ''In these equations X is proportional to the intensity of the convective motion, while Y is proportional to the temperature difference between the ascending and descending currents, similar signs of X and Y denoting that warm fluid is rising, and cold fluid is descending.'' [60]

FIGURE 2. Lorenz-attractor dataset. Computed with Ẋ = s(Y − X); Ẏ = rX − Y − XZ; Ż = XY − bZ and parameters s = 10, r = 28 and b = 2.667. Color and marker size indicate the amount of curvature on a logarithmic scale for better visibility.
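For illustration, such a synthetic attractor dataset can be generated by numerically integrating the stated equations, for example with SciPy (a sketch of ours, not the published preprocessing code; time span and initial conditions are chosen arbitrarily):

import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, state, s=10.0, r=28.0, b=2.667):
    # Right-hand side of the Lorenz system with the parameters stated above.
    x, y, z = state
    return [s * (y - x), r * x - y - x * z, x * y - b * z]

# Constant sample rate, as required by the prerequisites in section III.
t_eval = np.linspace(0.0, 40.0, 4000)
sol = solve_ivp(lorenz, (0.0, 40.0), [0.0, 1.0, 1.05], t_eval=t_eval)
X = sol.y   # shape (3, 4000): one MTS with the features X, Y and Z

The thomas-attractor dataset described below can be generated analogously by swapping in its right-hand side.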
The mocap or Motion Capture Database (MOCAP) dataset [61] contains 93 features from human motion captured with markers.

The occupancy dataset [62] is a measurement of sensory data in an office with the following sensors: temperature, humidity, the derived humidity ratio, light and CO2.

The thomas-attractor dataset is, like the lorenz-attractor dataset, a synthetic dataset, computed with the three coupled differential equations Ẋ = sin(Y) − bX; Ẏ = sin(Z) − bY; Ż = sin(X) − bZ, originally proposed by [63], with the parameter b = 0.1615.
V. MULTIVARIATE TIME-SERIES SUB-SEQUENCE CLUSTERING METRIC (MT3SCM)
Our metric targets subsequences in MTSD in general (in regard to the restrictions in section III). Our MT3SCM score consists of three main components:

mt3scm = (ccw + sL + sP)/3 (12)

the weighted curvature consistency (ccw), the location-based silhouette (sL) and the curve-parameter-based silhouette (sP). Any attempt at clustering TSD is subjective and domain specific. Nevertheless, we try to take the intuitive approach of treating MTSD as space curves and use their parameterization as a similarity measure. This is done in two different ways. First, we create new features by computing the curve parameters sample by sample (e.g., curvature, torsion, acceleration) and determine their standard deviation for each cluster. Our hypothesis is that, with a low standard deviation of the curve parameters inside a cluster, the actions of a mechatronic system in this cluster are similar. We call this the curvature consistency (cc) (see Equation (24), used in line 14 of algorithm 1). The second procedure is to apply these newly computed features, reduced to scalar values per subsequence, to a well-established internal clustering metric, the silhouette score [53] (see Table 2).

The computation of the cc comprises the calculation of the curvature κ and the torsion τ at every time step t with xt. A good clustering additionally separates each subsequence in the original feature space (accomplished by sL). The proposed algorithm for this new metric's computation is described in algorithm 1.

Algorithm 1 MT3SCM
1: procedure MT3SCM(X, y) ▷ Data X ∈ Rd×n and labels y ∈ Nn
2:   L ← empty() ▷ Array initialization for all subsequence median coordinates or Locations
3:   P ← empty() ▷ Array initialization for all subsequence curve parameter mean values
4:   K ← GetCurveParametersForAllData(X)
5:   yunique ← FindUniqueClusterIDs(y)
6:   for i in yunique do
7:     Xi ← GetClusterData(X, i)
8:     s ← FindSubsequences(y, i)
9:     for j in s do
10:      Xi,j ← GetSubsequenceData(Xi, j)
11:      L[i, j] ← GetMedianLocations(Xi,j)
12:      P[i, j] ← GetCurveParameterValues(K, i, j)
13:    end for
14:    cci ← ClusterCurvatureConsistency(P) ▷ Compute the cluster curvature consistency (cci) with the empirical standard deviation of each curve parameter over time. If the cluster consists of only one time step, set the cci to zero.
15:    C[i] ← cci ▷ Collect cci data for all clusters
16:  end for
17:  ccw ← WeightedAverage(C, npc) ▷ Compute the weighted average curvature consistency (ccw) from cci with the number of points per cluster
18:  sL ← SilhouetteComputation(L, yunique) ▷ Compute the silhouette coefficient using the center positions of each identified subsequence
19:  sP ← SilhouetteComputation(P, yunique) ▷ Compute the silhouette coefficient with the curve parameters
20:  score ← (ccw + sL + sP)/3
21:  return score ▷ The final score
22: end procedure
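To make the curve-parameter features concrete, the following simplified sketch (ours; the exact definitions are given by Equation (24) and the implementation at [64]) approximates the curvature κ of a three-dimensional MTS treated as a space curve and derives a per-cluster consistency value from its spread:

import numpy as np

def curvature(X: np.ndarray) -> np.ndarray:
    """Curvature per time step of a space curve X with shape (n, 3):
    kappa = ||x' x x''|| / ||x'||^3, derivatives via finite differences."""
    d1 = np.gradient(X, axis=0)                      # first derivative
    d2 = np.gradient(d1, axis=0)                     # second derivative
    num = np.linalg.norm(np.cross(d1, d2), axis=-1)
    den = np.linalg.norm(d1, axis=-1) ** 3 + 1e-12   # avoid division by zero
    return num / den

def curvature_consistency(kappa: np.ndarray, y: np.ndarray) -> dict:
    # Simplified stand-in for the cluster curvature consistency: the
    # empirical standard deviation of the curve parameter per cluster
    # (the exact mapping to a consistency score is given by Eq. (24)).
    return {c: float(np.std(kappa[y == c])) for c in np.unique(y)}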
A. EVALUATION
For computational tests, we manually created a ''perfect'' synthetic dataset with respect to our metric (see Figure 4). Figure 4 (a) shows the original synthetic dataset, where the subsequences in cluster 1 are a helix along the increasing x axis. For cluster 2 the subsequences are a straight movement with quadratically decreasing distances along the y axis. Cluster 3 represents a helix along the decreasing x axis but with a different resolution than cluster 1. Cluster 4 is, along with cluster 2, a straight movement with quadratically increasing distances along the y axis. This cycle is repeated six times. Figure 4 (b) shows the new feature space for the sL component. The feature space for the sP component is shown in Figure 4 (c). Applying the new features per subsequence to the standard metrics results in best scores for all metrics. This shows that the new feature space allows a good separation in contrast to the original space, as proven by the metrics scores for silhouette, calinski-harabasz and davies-bouldin on the original and the two new feature spaces. To show the benefit of the new feature space, we applied the agglomerative clustering10 not on the original lorenz-attractor dataset but on the newly computed feature space based on curvature, torsion and acceleration (see Figure 5). The metric values for Figure 5 (b) show a high ccw and a decent sP value for the low number of 10 clusters specified.

To further evaluate our metric, we used the lorenz-attractor and the thomas-attractor dataset (see Table 3) and applied an agglomerative clustering, a time-series k-means clustering as well as a random subsequence clustering, varying the number of clusters and some algorithm-specific parameters. Afterwards the calinski-harabasz, davies-bouldin and silhouette scores were computed and compared to our new metric MT3SCM. From these results we derived a correlation matrix (see Figure 6). The cc and the ccw are clearly related due to their direct combination. The positive correlation between the internal components and the overall MT3SCM score is obvious. We see a clear positive correlation to the silhouette score, which is evident due to the internal use of this metric. Interestingly, the correlation between the ccw and the sP is negative. This is due to the types of datasets and algorithms we used: with a higher number of clusters we theoretically expect a better cc because of the lower standard deviation by chance. On the other hand, the more clusters exist, the more likely a similar curve parameter between the clusters exists, which creates a new feature space with overlapping clusters and results in a low sP score. This can be retraced within the subfigures of Figure 7. The low correlation between the calinski-harabasz and the davies-bouldin scores supports our point that the available clustering metrics are not well suited to be used as time-series clustering evaluation measures. Figure 7 shows examples where the agglomerative clustering was applied on the lorenz-attractor dataset (part of the data used for the correlation matrix in Figure 6). It can be seen that the agglomerative clustering on the original dataset is not an optimal cluster algorithm when comparing the metrics to Figure 5 (b). Comparing Figure 5 (b) and Figure 7 (d), we can see a similar MT3SCM score but very different standard metrics scores. The similar MT3SCM score is based on the much higher number of clusters and equally distributed subsequence length in Figure 7 (d), which results in a high ccw value as well as a good spatial separation (sL), which compensates the low sP value due to the similar curve parameters of the clusters. Figure 5 (b), however, also has a very high ccw value with a good sP value, reaching a similar MT3SCM score but with a fifth of the number of clusters. How our metric handles random clustering in critical scenarios is shown in Figure 8. The Python code and a more detailed evaluation are publicly available at [64].

10 https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html
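The standard scores of Table 2 can be computed directly with scikit-learn [15]; a minimal sketch (ours) for a random clustering baseline, as used in Figure 8, looks as follows:

import numpy as np
from sklearn.metrics import (silhouette_score, calinski_harabasz_score,
                             davies_bouldin_score)

X = np.random.randn(500, 3)               # stand-in MTS, shape (n, d)
y = np.random.randint(0, 4, size=500)     # random clustering baseline

print(silhouette_score(X, y))             # in [-1, 1], higher is better
print(calinski_harabasz_score(X, y))      # higher is better
print(davies_bouldin_score(X, y))         # lower is better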
FIGURE 4. Synthetic dataset with four clusters and a perfect own-metric score of mt3scm = 1 due to each cluster's unique and constant curve parameters. (a) Synthetic dataset with best own result of mt3scm = 1. Standard metrics scores computed with original data: davies-bouldin: 1.4, calinski-harabasz: 6.9e+02, silhouette: 0.087. (b) New feature space from the centers (median value) of each subsequence. Standard metrics scores computed with new feature space: davies-bouldin: 6.3e−07, calinski-harabasz: 3.8e+14, silhouette: 1. (c) New feature space from the curve parameters extracted from each subsequence. Standard metrics scores computed with new feature space: davies-bouldin: 6.8e−07, calinski-harabasz: 1.2e+13, silhouette: 1.
B. CONCLUSION
We have described a more suitable similarity measure for time-dependent TSD, shown how to compute our metric, and evaluated its use case and effectiveness on different datasets. Further, we will use this metric in addition to the standard metrics to evaluate our proposed online time-series clustering algorithm, which is described in section VI.

VI. CLUSTERING ALGORITHM (ABIMCA)
In this section we describe the concept of our time-series clustering approach in detail. Afterwards, we apply our algorithm to the datasets described in section IV and present the results.

A. METHOD
As described in [23], a key component in a time-series clustering algorithm is the similarity function to quantify the clustering criteria. Common similarity functions used are distance measures like the euclidean distance or correlation coefficients like Pearson's correlation coefficient. Those are also used for static data clustering algorithms. More suitable for time-series clustering are similarity functions like the Dynamic Time Warping (DTW) distance, the short time-series (STS) distance [65], or space-curve-based measures like the one we introduced in section V.

In this work we analyzed an approach which is data driven, based on unsupervised machine learning algorithms and has online capabilities (see Figure 11). Our approach uses an RNN-based AE to generate scores which are used as similarity measures. Specifically, the experiments in this work were performed using a pytorch [66] implementation of a bidirectional one-layer gated recurrent unit (GRU) RNN with a hidden size of the input dimensions minus one, h = d − 1. Other prerequisites regarding the dataset and preprocessing are described in section IV and section III.
FIGURE 6. Own metric (MT3SCM) correlation analysis. Own metric and its four subcomponents (curvature consistency (cc), weighted curvature consistency (ccw), silhouette location based (sL), silhouette curve-parameter based (sP)): correlation to calinski-harabasz, davies-bouldin and silhouette score for random, agglomerative and k-means clustering on the lorenz- and thomas-attractor datasets.

FIGURE 8. Own metric evaluation using a random clusterer on the thomas-attractor and lorenz-attractor datasets. See Table 4 for a metric comparison of the following subplots. (a) Own metric and all of its subcomponents are around zero, as desired. The calinski-harabasz value is low and davies-bouldin is high, which also indicates a ''bad'' clustering. (b) Longer random subsequences also generate an MT3SCM result around zero. Calinski-harabasz and davies-bouldin scores are more strongly influenced by the subsequence length. (c) Own metric and all of its subcomponents are around zero, as desired. The calinski-harabasz value is low and davies-bouldin is high, which also indicates a ''bad'' clustering. (d) As seen for the lorenz-attractor data in (b), longer subsequences have a high impact on calinski-harabasz and davies-bouldin scores.

The main procedure of the approach is as follows: the incoming data is taken as a sliding window Wt at the current time t with length ζ of past time steps and number of features d. This matrix Wt ∈ Rd×ζ is used as the input of what we call the Base Autoencoder (BAE). The key element of our algorithm is that this BAE's parameters are not constant but are adapted iteratively with a stochastic gradient descent (SGD) optimization method for each new incoming sliding window. For this training of the BAE, we use a slight adaptation of the sparse AE loss function L from [67] with a basic regularization term or sparsity penalty Ω(h):

loss = l = L(Wt, W̃t, h) = MSE(Wt, W̃t) + Ω(h) (30)

where W̃t denotes the BAE's reconstruction of Wt and h the latent representation.
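A minimal PyTorch sketch of this iterative update (our illustration: the hidden size follows the stated h = d − 1, while the L1 sparsity penalty and the learning rate are assumptions; the published implementation is available at [68]):

import torch
import torch.nn as nn

class BaseAE(nn.Module):
    """GRU-based autoencoder: encode a sliding window, reconstruct it."""
    def __init__(self, d: int):
        super().__init__()
        self.encoder = nn.GRU(d, d - 1, num_layers=1,
                              bidirectional=True, batch_first=True)
        self.decoder = nn.GRU(2 * (d - 1), d, num_layers=1, batch_first=True)

    def forward(self, w):                  # w: (1, zeta, d)
        h, _ = self.encoder(w)             # latent sequence h
        w_rec, _ = self.decoder(h)         # reconstruction of Wt
        return w_rec, h

d, lam = 3, 1e-4
bae = BaseAE(d)
opt = torch.optim.SGD(bae.parameters(), lr=0.01)
mse = nn.MSELoss()

def training_step(w_t: torch.Tensor) -> float:
    # One iterative BAE update per incoming sliding window, Eq. (30):
    # loss = MSE(Wt, reconstruction) + Omega(h); Omega is an assumed L1 penalty.
    w_rec, h = bae(w_t)
    loss = mse(w_t, w_rec) + lam * h.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()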
FIGURE 11. Concept of the ABIMCA approach. A sliding window Wt of the MTS is iteratively trained in the base AE. If the score of the base AE (gray dotted line) is below the threshold (dashed red line), a new subsequence AE is created from the base AE. Incoming data is also compared to the existing subsequence AEs so that a subsequence can be recognized.
TABLE 7. Best metric ''metrics.mt3scm'' value for each dataset and algorithm from hyperparameter search results.

TABLE 8. Best metric ''metrics.calinski-harabasz'' value for each dataset and algorithm from hyperparameter search results.

The bottom row of Figure 9 shows that a subsequence is present when the BAE's score (blue line) is below the horizontal black line (sb <= η). If a subsequence is present, a copy of the BAE is made and its parameters are frozen and associated with this specific pattern of a subsequence. These copies of the BAE, which we call Subsequence Autoencoders (SAE), are used to recognize previously seen subsequences using the same scoring function. A concept drawing of the approach is shown in Figure 11. The algorithm is described in pseudocode in algorithm 2.
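Continuing the PyTorch sketch above, freezing a copy of the BAE as a new SAE could look like this (again an illustration, not the exact implementation):

import copy

def spawn_subsequence_ae(bae: BaseAE) -> BaseAE:
    # When the BAE score stays below the threshold eta, snapshot the BAE;
    # the frozen copy becomes the SAE associated with the current subsequence.
    sae = copy.deepcopy(bae)
    for p in sae.parameters():
        p.requires_grad_(False)
    return sae.eval()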
TABLE 9. Best metric ''metrics.davies-bouldin'' value for each dataset and algorithm from hyperparameter search results.

TABLE 10. Best metric ''metrics.silhouette'' value for each dataset and algorithm from hyperparameter search results.

TABLE 11. Metric ''metrics.mt3scm'' value for each dataset and algorithm from default calibration results.

TABLE 12. Metric ''metrics.calinski-harabasz'' value for each dataset and algorithm from default calibration results.

TABLE 13. Metric ''metrics.davies-bouldin'' value for each dataset and algorithm from default calibration results.

TABLE 14. Metric ''metrics.silhouette'' value for each dataset and algorithm from default calibration results.

The functionality can be retraced considering Figures 9 and 10. This example shows the algorithm applied on a three-dimensional synthetic data set. The input data consists of four different operation points with small white noise. The sequence of the four subsequences is repeated once. The other rows are described in the caption of Figure 9. It is evident that the algorithm needs a few time steps to adapt to the current subsequence until it is recognized as such. Recognizing a previously identified subsequence, however, is almost instantaneous. The calibration of the thresholds η (horizontal black line) and ζ (horizontal gray line) is apparently crucial. The necessary time steps to adapt to a current subsequence can be altered by the calibration of the learning rate α and the number of BAE training cycles
per time step ω. A faster recognition of a subsequence has the drawback of the algorithm being very sensitive and therefore identifying even small changes of the input as a new subsequence. A strategy could be to calibrate the algorithm first to be rather insensitive and cluster the time-series into major subsequences. These can then be further clustered with a more sensitive calibration. This procedure can be repeated until the required degree of granularity is achieved.

B. EVALUATION
For the evaluation study of our algorithm, we chose eight different MTS datasets (see Table 3), of which six are publicly available and two are provided with our codebase [68], seven other state-of-the-art algorithms (see Table 1) and three widely used unsupervised clustering metrics (see Table 2). Each algorithm has been applied to each dataset with default parameters. Additionally, we performed a hyperparameter search for each algorithm based on a random grid search of 300 samples. The parameter boundaries for this hyperparameter search are listed in Table 15. Overall, 19 264 experiments were run.

For a better overview of the results, we chose to compare every algorithm to the ''MiniBatchKMeans'' algorithm and counted the number of times they performed better. Table 5 shows the results for the hyperparameter search and the number of outperformances of each algorithm compared to the ''MiniBatchKMeans'' algorithm. Table 6 shows the same results with default parameter settings for each algorithm. We can see that, in sum and in two of the metrics, our algorithm beats the state-of-the-art algorithms. The full list of results is attached in Tables 7 to 14. Additionally, Table 16 shows the best results from the hyperparameter search when sorted by MT3SCM, with the total time spent.11

FIGURE 12. Empirical complexity estimation with variation of the total number of datapoints and subsequences identified by our algorithm over the duration.

11 Experiments were performed on a Linux machine with an AMD Ryzen Threadripper 2950X 16-Core Processor using a GeForce RTX 2080 GPU.
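Such a random grid search can be set up, for example, with scikit-learn's ParameterSampler; the parameter names and ranges below are placeholders, the actual bounds are listed in Table 15:

from scipy.stats import loguniform, randint
from sklearn.model_selection import ParameterSampler

param_dist = {
    "learning_rate": loguniform(1e-4, 1e-1),   # alpha
    "training_cycles": randint(1, 20),         # omega
}
for params in ParameterSampler(param_dist, n_iter=300, random_state=0):
    ...  # run the algorithm with `params` on each dataset and log all metrics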
Algorithm 2 ABIMCA. Using the following parameters: α: learning rate, ω: number of base model training cycles.

TABLE 15. Hyperparameter random grid search upper and lower bound for each algorithm and their specific parameter options.

TABLE 16. Metrics from hyperparameter search when sorted for best value of mt3scm metric.
For every subsequence identified, the algorithm checks if the new incoming data is already known by comparing it to the previously identified subsequences. A linear correlation of the duration for every datapoint with the number of subsequences identified is therefore present. The complexity of our algorithm is therefore O(NS), where N is the number of datapoints and S is the number of subsequences identified. Considering the worst-case scenario of identifying every new datapoint as a new subsequence, the duration for the algorithm increases quadratically with the number of datapoints, which results in a complexity of O(n²).

As cited before, every algorithm performs differently on a specific distribution of patterns, and the hyperparameter search was a simple random grid search of ''only'' 300 samples, so these results are unlikely to represent the optimal solution for each algorithm on each dataset. Nevertheless, we demonstrate that the algorithm we present in this work is highly effective at detecting subsequences online in a MTS.

The only data retained after training is the parameters of the subsequence-specific AEs. It is a completely unsupervised method which can cluster online data. In the context of CbM, the once-identified subsequence AEs can be used for deviation quantification of the underlying system. This can be used for deterioration analysis and maintenance strategies. Further investigations for improving the ABIMCA method would be to explore different kinds of AEs like feedforward neural networks (FNN), CNNs or a combination of such. Also, a VAE could be reasonable depending on the underlying process. Future work should analyze the effect of reducing the latent space dimension by multiple factors of the input dimension (when the input dimension is very high). This could reduce computation costs and improve representation learning without performance loss. A detailed analysis of the optimal default parameters, or a generic automatic calibration depending on some statistics of the expected input, could increase performance and decrease calibration efforts.
REFERENCES
[4] C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics), 1st ed. New York, NY, USA: Springer, 2016.
[5] J. F. Roddick and M. Spiliopoulou, ''A survey of temporal knowledge discovery paradigms and methods,'' IEEE Trans. Knowl. Data Eng., vol. 14, no. 4, pp. 750–767, Jul. 2002.
[6] J. Lin, E. Keogh, S. Lonardi, and P. Patel, ''Finding motifs in time series,'' in Proc. 2nd Workshop Temporal Data Mining, 2002, pp. 1–11.
[7] G. J. J. van den Burg and C. K. I. Williams, ''An evaluation of change point detection algorithms,'' 2020, arXiv:2003.06222.
[8] A. K. Jain, M. N. Murty, and P. J. Flynn, ''Data clustering: A review,'' ACM Comput. Surv., vol. 31, no. 3, pp. 264–323, Sep. 1999.
[9] V. Chandola, A. Banerjee, and V. Kumar, ''Anomaly detection: A survey,'' ACM Comput. Surv., vol. 41, no. 3, pp. 1–58, Jul. 2009.
[10] R. Agrawal and R. Srikant, ''Mining sequential patterns,'' in Proc. 11th Int. Conf. Data Eng., Mar. 1995, pp. 3–14.
[11] D. Xu and Y. Tian, ''A comprehensive survey of clustering algorithms,'' Ann. Data Sci., vol. 2, no. 2, pp. 165–193, 2015.
[12] M. Lovrić, M. Milanović, and M. Stamenković, ''Algorithmic methods for segmentation of time series: An overview,'' J. Contemp. Econ. Bus. Issues, vol. 1, no. 1, pp. 31–53, 2014. [Online]. Available: http://hdl.handle.net/10419/147468
[13] S. Torkamani and V. Lohweg, ''Survey on time series motif discovery,'' Wiley Interdiscipl. Rev., Data Mining Knowl. Discovery, vol. 7, no. 2, p. e1199, Mar. 2017.
[14] S. Aminikhanghahi and D. J. Cook, ''A survey of methods for time series change point detection,'' Knowl. Inf. Syst., vol. 51, no. 2, pp. 339–367, 2017.
[15] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, A. Müller, J. Nothman, G. Louppe, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, ''Scikit-learn: Machine learning in Python,'' J. Mach. Learn. Res., vol. 12, pp. 2825–2830, Oct. 2011.
[16] T. Zhang, R. Ramakrishnan, and M. Livny, ''BIRCH: An efficient data clustering method for very large databases,'' in Proc. ACM SIGMOD Int. Conf. Manage. Data, vol. 25, no. 2, 1996, pp. 103–114.
[17] R. Prescott Adams and D. J. C. MacKay, ''Bayesian online changepoint detection,'' 2007, arXiv:0710.3742.
[18] J. Montiel, M. Halford, S. Martiello Mastelini, G. Bolmier, R. Sourty, R. Vaysse, A. Zouitine, H. Murilo Gomes, J. Read, T. Abdessalem, and A. Bifet, ''River: Machine learning for streaming data in Python,'' 2020, arXiv:2012.04740.
[19] C. C. Aggarwal, P. S. Yu, J. Han, and J. Wang, ''A framework for clustering evolving data streams,'' in Proc. VLDB Conf. Elsevier, 2003, pp. 81–92.
[20] M. Hahsler and M. Bolaños, ''Clustering data streams based on shared density between micro-clusters,'' IEEE Trans. Knowl. Data Eng., vol. 28, no. 6, pp. 1449–1461, Jun. 2016.
[21] F. Cao, M. Estert, W. Qian, and A. Zhou, ''Density-based clustering over an evolving data stream with noise,'' in Proc. SIAM Int. Conf. Data Mining, J. Ghosh, D. Lambert, D. Skillicorn, and J. Srivastava, Eds. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 2006, pp. 328–339.
[22] D. Sculley, ''Web-scale K-means clustering,'' in Proc. 19th Int. Conf. World Wide Web (WWW), New York, NY, USA, M. Rappa, P. Jones, J. Freire, and S. Chakrabarti, Eds. 2010, p. 1177.
[23] T. W. Liao, ''Clustering of time series data—A survey,'' Pattern Recognit., vol. 38, no. 11, pp. 1857–1874, 2005.
[24] S. Zolhavarieh, S. Aghabozorgi, and Y. W. Teh, ''A review of subsequence time series clustering,'' Sci. World J., vol. 2014, Jul. 2014, Art. no. 312521.
[25] S. Aghabozorgi, A. Seyed Shirkhorshidi, and T. Ying Wah, ''Time-series clustering—A decade review,'' Inf. Syst., vol. 53, pp. 16–38, Oct. 2015.
[26] M. Carnein and H. Trautmann, ''Optimizing data stream representation: An extensive survey on stream clustering algorithms,'' Bus. Inf. Syst. Eng., vol. 61, no. 3, pp. 277–297, Jun. 2019.
[27] H.-P. Kriegel, P. Kröger, and A. Zimek, ''Clustering high-dimensional data,'' ACM Trans. Knowl. Discovery Data, vol. 3, no. 1, pp. 1–58, Mar. 2009.
[28] C. Truong, L. Oudre, and N. Vayatis, ''Selective review of offline change point detection methods,'' Signal Process., vol. 167, Feb. 2020, Art. no. 107299.
[29] E. Aljalbout, V. Golkov, Y. Siddiqui, M. Strobel, and D. Cremers, ''Clustering with deep learning: Taxonomy and new methods,'' 2018, arXiv:1801.07648.
[30] C. Isaksson, M. H. Dunham, and M. Hahsler, ''SOStream: Self organizing density-based clustering over data stream,'' in Machine Learning and Data Mining in Pattern Recognition (Lecture Notes in Computer Science), vol. 7376, P. Perner, Ed. Berlin, Germany: Springer, 2012, pp. 264–278.
[31] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, ''A density-based algorithm for discovering clusters in large spatial databases with noise,'' in Proc. 2nd Int. Conf. Knowl. Discovery Data Mining, 1996, pp. 226–231. [Online]. Available: http://dl.acm.org/citation.cfm?id=3001460.3001507
[32] L. O'Callaghan, N. Mishra, A. Meyerson, S. Guha, and R. Motwani, ''Streaming-data algorithms for high-quality clustering,'' in Proc. 18th Int. Conf. Data Eng., 2002, pp. 685–694.
[33] M. Ghesmoune, M. Lebbah, and H. Azzag, ''State-of-the-art on clustering data streams,'' Big Data Analytics, vol. 1, no. 1, Dec. 2016.
[34] E. Keogh, S. Chu, D. Hart, and M. Pazzani, ''Segmenting time series: A survey and novel approach,'' in Data Mining in Time Series Databases, vol. 57, M. Last, A. Kandel, and H. Bunke, Eds. Singapore: World Scientific, 2003, pp. 1–21.
[35] M. Ramoni, P. Sebastiani, and P. Cohen, ''Bayesian clustering by dynamics,'' Mach. Learn., vol. 47, no. 1, pp. 91–121, 2002.
[36] X. Wang, K. Smith, and R. Hyndman, ''Characteristic-based clustering for time series data,'' Data Mining Knowl. Discovery, vol. 13, no. 3, pp. 335–364, 2006.
[37] R. P. Silva, B. B. Zarpelão, A. Cano, and S. B. Junior, ''Time series segmentation based on stationarity analysis to improve new samples prediction,'' Sensors, vol. 21, no. 21, p. 7333, Nov. 2021.
[38] S. Lu and S. Huang, ''Segmentation of multivariate industrial time series data based on dynamic latent variable predictability,'' IEEE Access, vol. 8, pp. 112092–112103, 2020.
[39] M. Ceci, R. Corizzo, N. Japkowicz, P. Mignone, and G. Pio, ''ECHAD: Embedding-based change detection from multivariate time series in smart grids,'' IEEE Access, vol. 8, pp. 156053–156066, 2020.
[40] R. Li, S. Li, K. Xu, X. Li, J. Lu, and M. Zeng, ''A novel symmetric stacked autoencoder for adversarial domain adaptation under variable speed,'' IEEE Access, vol. 10, pp. 24678–24689, 2022.
[41] M. Baccouche, F. Mamalet, C. Wolf, C. Garcia, and A. Baskurt, ''Spatio-temporal convolutional sparse auto-encoder for sequence classification,'' in Proc. Brit. Mach. Vis. Conf., R. Bowden, J. Collomosse, and K. Mikolajczyk, Eds. 2012, p. 124.
[42] K. Bascol, R. Emonet, E. Fromont, and J.-M. Odobez, ''Unsupervised interpretable pattern discovery in time series using autoencoders,'' in Structural, Syntactic, and Statistical Pattern Recognition (Lecture Notes in Computer Science), vol. 10029, A. Robles-Kelly, M. Loog, B. Biggio, F. Escolano, and R. Wilson, Eds. Cham, Switzerland: Springer, 2016, pp. 427–438.
[43] S. E. Chazan, S. Gannot, and J. Goldberger, ''Deep clustering based on a mixture of autoencoders,'' in Proc. IEEE 29th Int. Workshop Mach. Learn. Signal Process. (MLSP), Oct. 2019, pp. 1–6.
[44] B. Yang, X. Fu, N. D. Sidiropoulos, and M. Hong, ''Towards K-means-friendly spaces: Simultaneous deep learning and clustering,'' in Proc. 34th Int. Conf. Mach. Learn., D. Precup and Y. W. Teh, Eds., vol. 70, 2017, pp. 3861–3870. [Online]. Available: https://proceedings.mlr.press/v70/yang17b.html
[45] Y. Guo, W. Liao, Q. Wang, L. Yu, T. Ji, and P. Li, ''Multidimensional time series anomaly detection: A GRU-based Gaussian mixture variational autoencoder approach,'' in Proc. 10th Asian Conf. Mach. Learn., J. Zhu and I. Takeuchi, Eds., vol. 95, 2018, pp. 97–112. [Online]. Available: http://proceedings.mlr.press/v95/guo18a.html
[46] T. Chen, X. Liu, B. Xia, W. Wang, and Y. Lai, ''Unsupervised anomaly detection of industrial robots using sliding-window convolutional variational autoencoder,'' IEEE Access, vol. 8, pp. 47072–47081, 2020.
[47] W.-H. Lee, J. Ortiz, B. Ko, and R. Lee, ''Time series segmentation through automatic feature learning,'' 2018, arXiv:1801.05394.
[48] D. Dua and C. Graff. (2017). UCI Machine Learning Repository. [Online]. Available: http://archive.ics.uci.edu/ml
[49] (Sep. 14, 2020). Data.gov. [Online]. Available: https://data.gov/
[50] C. J. van Rijsbergen, Information Retrieval. London, U.K.: Butterworths, 1979.
[51] H. Kremer, P. Kranen, T. Jansen, T. Seidl, A. Bifet, G. Holmes, and B. Pfahringer, ''An effective evaluation measure for clustering on evolving data streams,'' in Proc. 17th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining (KDD), New York, NY, USA, C. Apte, J. Ghosh, and P. Smyth, Eds. 2011, p. 868.
[52] J. Serrá and J. L. Arcos, ''An empirical evaluation of similarity measures for time series classification,'' Knowl.-Based Syst., vol. 67, pp. 305–314, Sep. 2014.
[53] P. J. Rousseeuw, ''Silhouettes: A graphical aid to the interpretation and validation of cluster analysis,'' J. Comput. Appl. Math., vol. 20, no. 1, pp. 53–65, 1987.
[54] T. Caliński and J. Harabasz, ''A dendrite method for cluster analysis,'' Commun. Stat., Theory Methods, vol. 3, no. 1, pp. 1–27, 1974.
[55] D. L. Davies and D. W. Bouldin, ''A cluster separation measure,'' IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-1, no. 2, pp. 224–227, Apr. 1979.
[56] S. M. Oh, J. M. Rehg, T. Balch, and F. Dellaert, ''Learning and inferring motion patterns using parametric segmental switching linear dynamic systems,'' Int. J. Comput. Vis., vol. 77, nos. 1–3, pp. 103–124, May 2008.
[57] M. Arias Chao, C. Kulkarni, K. Goebel, and O. Fink, ''Aircraft engine run-to-failure dataset under real flight conditions for prognostics and diagnostics,'' Data, vol. 6, no. 1, p. 5, Jan. 2021.
[58] E. Yemini, T. Jucikas, L. J. Grundy, A. E. X. Brown, and W. R. Schafer, ''A database of Caenorhabditis elegans behavioral phenotypes,'' Nature Methods, vol. 10, no. 9, pp. 877–879, Sep. 2013.
[59] T. Schneider, N. Helwig, and A. Schütze, ''Automatic feature extraction and selection for classification of cyclical time series data,'' Technisches Messen, vol. 84, no. 3, pp. 198–206, Mar. 2017.
[60] E. N. Lorenz, ''Deterministic nonperiodic flow,'' J. Atmos. Sci., vol. 20, no. 2, pp. 130–141, 1963.
[61] (Apr. 21, 2022). Carnegie Mellon University—CMU Graphics Lab—Motion Capture Library. [Online]. Available: http://mocap.cs.cmu.edu/
[62] L. M. I. Candanedo and V. Feldheim, ''Accurate occupancy detection of an office room from light, temperature, humidity and CO2 measurements using statistical learning models,'' Energy Buildings, vol. 112, pp. 28–39, Jan. 2016.
[63] R. Thomas, ''Deterministic chaos seen in terms of feedback circuits: Analysis, synthesis, 'labyrinth chaos','' Int. J. Bifurcation Chaos, vol. 9, no. 10, pp. 1889–1905, 1999.
[64] J. Köhne. (2022). MT3SCM: Multivariate Time Series Sub-Sequence Clustering Metric. Python. [Online]. Available: https://github.com/Jokonu/mt3scm
[65] C. S. Möller-Levet, F. Klawonn, K.-H. Cho, and O. Wolkenhauer, ''Fuzzy clustering of short time-series and unevenly distributed sampling points,'' in Advances in Intelligent Data Analysis V (Lecture Notes in Computer Science), vol. 2810, M. R. Berthold, H.-J. Lenz, E. Bradley, R. Kruse, and C. Borgelt, Eds. Berlin, Germany: Springer, 2003, pp. 330–340.
[66] A. Paszke et al., ''PyTorch: An imperative style, high-performance deep learning library,'' in Proc. Adv. Neural Inf. Process. Syst., vol. 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, Eds. Red Hook, NY, USA: Curran Associates, 2019, pp. 8024–8035. [Online]. Available: http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf
[67] M. Ranzato, C. S. Poultney, S. Chopra, and Y. LeCun, ''Efficient learning of sparse representations with an energy-based model,'' in Proc. NIPS, 2006, pp. 1–8.
[68] J. Köhne. (2022). ABIMCA: Autoencoder Based Iterative Modeling and Subsequence Clustering Algorithm. Python. [Online]. Available: https://github.com/Jokonu/abimca

JONAS KÖHNE was born in Berlin, Germany, in 1983. He received the Dipl.-Ing. degree in energy and process engineering from the Technische Universität Berlin, Germany, in 2014, where he is currently pursuing the Ph.D. degree with the Department of Energy and Automation Technology. From 2014 to 2015, he worked as a Function Development Engineer and an Embedded Software Test Engineer in the automotive field of electrical power steering with Bertrandt AG. From 2015 to 2019, he worked as a Function Developer of powertrain and power engineering with the Department of Commercial Vehicle Electronics, IAV GmbH. Since 2019, he has been a Research Assistant with the Department of Energy and Automation Technology, Technische Universität Berlin. He is also working as a Data Scientist with the Department of Commercial Vehicle Electronics, IAV GmbH. His research interests include anomaly and novelty detection, condition/predictive maintenance strategies of mechatronic systems, analysis of multivariate time-series data using autoencoder, and fault discovery and identification.

LARS HENNING was born in Nauen, Germany, in 1975. He received the Dipl.-Ing. degree in energy and process engineering from the Technische Universität Berlin, Germany, in 2002, and the Ph.D. degree in control engineering, in 2008. Since 2008, he has been working with the Department of Commercial Vehicle Electronics, IAV GmbH, in the area of powertrain and power engineering. Currently, he is a Team Manager of the Software Development Team, with a focus on condition/predictive maintenance strategies of mechatronic systems and machine-learned algorithms for powertrain control.

CLEMENS GÜHMANN was born in Berlin, Germany, in 1962. He received the Dipl.-Ing. degree in electrical engineering from the Technische Universität Berlin, Germany, in 1989, and the Ph.D. degree in pattern recognition and technical diagnosis, in 1995. From 1989 to 1994, he was a Research Assistant with the Institute for General Electrical Engineering. From 1994 to 1995, he worked as a Function Development Engineer with Whirlpool Corporation (Bauknecht). From 1996 to 2003, he was employed as a Function Development Engineer with IAV GmbH, where he was lastly the Head of the Department of Transmission Systems. In 2001, he received a teaching assignment with the Technische Universität Berlin, where he was appointed to a professorship at the Chair of Electronic Measurement and Diagnostic Technology, in 2003. His research interests include measurement technology and data processing, and the diagnosis, predictive maintenance, modeling, simulation, and automatic control of mechatronic systems.