This article has been accepted for publication in IEEE Access.

This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3274478

Received April XX, 2023. Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2022.DOI

Multi-Class Stress Detection through Heart Rate Variability: A Deep Neural Network based Study
JON ANDREAS MORTENSEN¹, MARTIN EFREMOV MOLLOV¹, AYAN CHATTERJEE¹,³, DEBASISH GHOSE¹,² (SENIOR MEMBER, IEEE), AND FRANK Y. LI¹
¹ Department of Information and Communication Technology, University of Agder (UiA), N-4898 Grimstad, Norway
² School of Economics, Innovation, and Technology, Kristiania University College, N-5022, Bergen, Norway
³ Department of Holistic Systems, Simula Metropolitan Center for Digital Engineering, N-0167, Oslo, Norway
Corresponding author: Debasish Ghose (e-mail: debasish.ghose@uia.no)
This work was supported in part by the Research Council of Norway under Grant 309257, “Orchestrating Internet of Things and Machine
Learning for Early Risk Detection to Ensure Inpatients Safety (StaySafe)."

ABSTRACT Stress is a natural human reaction to demands or pressure, usually when these are perceived as harmful and/or toxic. When stress becomes constant, overwhelming, and prolonged, it increases the risk of mental health problems and physiological uneasiness. Furthermore, chronic stress raises the likelihood of mental health conditions such as anxiety, depression, and sleep disorders. Although measuring stress using physiological parameters such as heart rate variability (HRV) is a common approach, achieving ultra-high accuracy based on HRV measurements remains a challenging task. HRV is not equivalent to heart rate. While heart rate is the average number of heartbeats per minute, HRV represents the variation of the time interval between successive heartbeats. The HRV measurements are related to the variance of RR intervals, which stand for the time between successive R peaks. In this study, we investigate the role of HRV features as stress detection bio-markers and develop a machine learning-based model for multi-class stress detection. More specifically, a convolutional neural network (CNN) based model is developed to detect multi-class stress, namely, no stress, interruption stress, and time pressure stress, based on both time- and frequency-domain features of HRV. Validated through a publicly available dataset, SWELL−KW, our model achieves an accuracy score of 99.9% (Precision=1, Recall=1, F1-score=1, and MCC=0.99), thus outperforming the existing methods in the literature. In addition, this study demonstrates the effectiveness of essential HRV features for stress detection using a feature extraction technique, i.e., analysis of variance.

INDEX TERMS Stress detection, heart rate variability, convolution neural network, feature extraction.

I. INTRODUCTION
Physical or mental imbalances caused by noxious stimuli trigger stress to maintain homeostasis. Under chronic stress, the sympathetic nervous system becomes overactive, leading to physical, psychological, and behavioral abnormalities [1]. Stress levels are often measured using subjective methods to extract perceptions of stress. Stress level measurement based on collected heart rate variability (HRV) data can help to remove the presence of stress by observing its effects on the autonomic nervous system (ANS) [2].
Typically, people with anxiety disorders have chronically lower resting HRV compared with healthy people. As revealed in [2] [3], HRV increases with relaxation and decreases with stress. Indeed, HRV is usually higher when a heart is beating slowly and vice versa. Therefore, heart rate and HRV generally have an inverse relationship [2] [3]. HRV varies over time based on activity levels and the amount of work-related stress.
Furthermore, stress is usually associated with a negative notion of a person and is considered to be a subjective feeling of human beings that might affect emotional and physical well-being. It is described as a psychological and biological reaction to internal or external stressors [4], including a biological or chemical agent and environmental stimulation that induce stress in an organism [5]. On a molecular scale, stress impacts the ANS [6], which uses sympathetic and parasympathetic components to regulate the cardiovascular system. The sympathetic component in a human body [7]


This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4
works analogously to a car's gas pedal. It activates the fight-or-flight response, giving the body a boost of energy to respond to negative influences. In contrast, the parasympathetic component is the brake for a body. It stimulates the body's rest-and-digest reaction by relaxing the body when a threat has passed. Given the fact that the ANS regulates the mental stress level of a human being, physiological measurements such as electrocardiogram (ECG), electromyogram (EMG), galvanic skin response (GSR), HRV, heart rate, blood pressure, breath frequency, and respiration rate can be used to assess mental stress [8].
ECG signals are commonly adopted to extract HRV [9]. HRV is defined as the variation across intervals between consecutive regular RR intervals¹, and it is measured by determining the length between two successive heartbeat peaks from an ECG reading. Conventionally, HRV has been accepted as a term to describe variations of both instantaneous heart rate and RR intervals [12].
¹ An RR interval represents the time from an R-peak to the next R-peak [10]. It defines the time elapsed between two successive R-waves of the Q-wave, R-wave and S-wave (QRS) signal on the electrocardiogram [11].
Obtaining HRV from ECG readings requires clinical settings and specialized technical knowledge for data interpretation. Thanks to the recent technological advances on the Internet of medical things (IoMT) [17], it is possible to deploy commercially available wearable or non-wearable IoMT devices to monitor and record heart rate measurements. Based on ECG data analysis (or HRV features), various machine learning (ML) and deep learning (DL) algorithms have been developed in recent years for stress prediction [20]-[27] (see more details in Sec. II). Among the publicly available datasets for stress detection, SWELL−KW developed in [13] [14] is one of the two most popular ones. However, none of the existing ML and DL studies based on the SWELL−KW dataset has achieved ultra-high accuracy, especially for multi-class stress level classification [15] [16]. Therefore, there exists a research gap in developing novel ML models which are able to achieve ultra-high prediction accuracy.
Motivated by various existing applied ML and DL based studies on HRV feature processing for stress level classifications, we have designed and developed a one-dimensional convolutional neural network (1D CNN) model for multi-class stress classification and demonstrate its superiority over the state-of-the-art models based on the SWELL−KW dataset in terms of prediction accuracy. More specifically, we have performed studies on stress detection using both traditional machine learning algorithms and/or multi-layer perceptron (MLP) algorithms which are inspired from the fully connected neural network (FCNN) architecture. In our work, we have developed a 1D CNN model which is based on the convolution operation. A CNN reduces the number of training parameters: whereas an MLP takes a vector as input, a CNN takes a tensor as input, so that it can capture spatial relations. While the accuracy achieved with full features is nearly 100%, we have also introduced a feature reduction algorithm based on analysis of variance (ANOVA) F-test and demonstrate that it is possible to achieve an accuracy score of 96.5% with less than half of the features that are available in the SWELL−KW dataset. Such a feature extraction reduces the computational load during the model training phase.
In a nutshell, the novelty and the main contributions of this study are summarized as follows:
• We have developed a novel 1D CNN model to detect multi-class stress status with outstanding performance, achieving 99.9% accuracy, with Precision, F1-score, and Recall scores of 1.0 each and a Matthews correlation coefficient (MCC) score of 99.9%. We believe this is the first study that achieves such a high score of accuracy for multi-class stress classification.
• Furthermore, we reveal that not all 34 HRV features are necessary to accurately classify multi-class stress. We have performed feature optimization to select an optimized feature set to train a 1D CNN classifier, achieving a performance score that beats the existing classification models based on the SWELL−KW dataset.
• Our model with selected top-ranked HRV features does not require resource-intensive computation, and it also achieves excellent accuracy without sacrificing critical information.
The remainder of the paper is organized as follows. After summarizing related work and pointing out the distinction between our work and the existing work in Sec. II, we briefly introduce the framework for stress status classification, dataset, and data preprocessing in Sec. III. Then the developed CNN model is presented in Sec. IV. Afterwards, Sec. V defines the performance metrics to evaluate the proposed classifier and Sec. VI presents the numerical results. Further discussions are provided in Sec. VII. Finally, the paper is concluded in Sec. VIII.

II. RELATED WORK
The related work considered in this study covers HRV data quality and various state-of-the-art ML/DL algorithms developed for stress detection.
For HRV data quality, a detailed review on data received from ECG and IoMT devices such as Elite HRV, H7, Polar, and Motorola Droid can be found in [18]. 23 studies indicated minor errors when comparing the HRV values obtained from commercially available IoMT devices with ECG instrument-based measurements. In practice, such a small-scale error in HRV measurements is reasonable, as getting HRVs using portable IoMT devices is more practical and cost-effective, and no laboratory/clinical equipment is required [18] [19].
On the other hand, there have been a lot of recent research efforts on ECG data analysis to classify stress through ML and DL algorithms [20]-[23]. Existing algorithms have focused mainly on binary (stress versus non-stress) and multi-class stress classifications. For instance, the authors in [4] classified HRV data into stressed and normal physiological states. The authors compared different ML approaches for classifying stress, such as naive Bayes, k-nearest neighbour
[Figure 1: block diagram covering a training phase and a testing phase. Blocks: Dataset (HRV features), Stress Level Annotations, Data Preprocessing, Feature Ranking (ANOVA), Feature Selection, Classification (1D-CNN), Stress Level Detection.]
FIGURE 1. Framework of the proposed stress status classification model: From data collection to stress level classification.

(KNN), support vector machine (SVM), MLP, random forest, and gradient boosting. The best recall score they achieved was 80%. A similar comparison study was performed in [27], where the authors showed that SVM with radial basis function (RBF) provided an accuracy score of 83.33% and 66.66% respectively, using the time-domain and frequency-domain features of HRV. Moreover, dimension reduction techniques have been applied to select the best temporal and frequency domain features in HRV [24]. Binary classification, i.e., stressed versus not stressed, was performed using CNN in [25] through which the authors achieved an accuracy score of 98.4%. Another study, StressClick [26], employed a random forest algorithm to classify stressed versus not stressed based on mouse-click events, i.e., the gaze-click pattern collected from the commercial computer webcam and mouse.
In [14], tasks for multi-class stress classification (e.g., no stress, interruption stress, and time pressure stress) were performed using SVM based on the SWELL−KW dataset. The highest accuracy they achieved was 90%. Furthermore, another publicly available dataset, WESAD, was used in [27] for multi-class (amusement versus baseline versus stress) and binary (stress versus non-stress) classifications. In their investigations, ML algorithms achieved accuracy scores up to 81.65% for three-class categorization. The authors also checked the performance of deep learning algorithms, where they achieved an accuracy level of 84.32% for three-class stress classification. Furthermore, it is worth mentioning that novel deep learning techniques, such as genetic deep learning convolutional neural networks (GDCNNs) [38] [39], have appeared as a powerful tool for two-dimensional data classification tasks. To apply GDCNN to 1D data, however, comprehensive modifications or adaptations are required and such a topic is beyond the scope of this paper.
As summarized in Tab. 5 of [15], a recent study published online in August 2022, the best results for stress detection based on the SWELL−KW dataset for the single-dataset models developed therein are 88.64% (Accuracy), 93.01% (Precision), 92.68% (Recall), and 82.75% (F1-score), respectively. Compared with these state-of-the-art models, the model developed in this study has achieved much better performance (see more details in Subsec. VI-F, especially Tab. 3 of this paper).

III. FRAMEWORK OVERVIEW AND DATA PREPROCESSING
In this section, we give an overview of the framework for multi-class stress classification. While the overview and model preparation (including data collection, dataset, and data preprocessing) are outlined in this section, the CNN model itself is presented in the next section.

A. FRAMEWORK OVERVIEW
Fig. 1 illustrates the schematic diagram of the proposed stress level classification framework. Briefly, the framework comprises the following procedures.
• Data collection and datasets. HRV signals are collected and separated into a training dataset and a testing dataset. They are used to define the model's architecture and to assess the proposed model's effectiveness.
• Data preprocessing and feature extraction. Data are preprocessed to fit into the feature ranking algorithm. In this study, ANOVA F-tests [28] and forward sequential feature selection are employed for feature ranking and selection, respectively.
• Classification and validation. The designed DL-based multi-class classifier is trained, tested, and validated with significant features and annotations (e.g., no stress, interruption condition, and time pressure) labeled by medical professionals.
• Testing. In the testing phase, distinctive features are considered from the new test samples, and the class label is resolved using all classification parameters estimated in training. Different numbers of features are extracted and tested.
• Performance assessment. The performance of the classifier is measured against discrimination analysis metrics, such as Accuracy, Precision, Recall, F1-score, and MCC.

B. DATA COLLECTION AND DATASET
We adopt the SWELL−KW dataset, which was collected in a study reported in [13] [14]. Various types of data have been recorded, including computer logging, facial expression
from camera recordings, body postures from a Kinect 3-dimensional (3D) sensor, heart rate (variability), and skin conductance from body sensors.
FIGURE 2. Distribution of data in SWELL−KW [13].
In the experiments, 25 volunteers performed typical knowledge tasks (writing reports, making presentations, reading emails, searching for information) during which their psychological and biological status data were recorded. The working conditions of the participants were manipulated with two types of stressors: email interruptions and time pressure. The SWELL−KW dataset comprises HRV computed for stress and user modeling. The subjective experiences of participants with task load, mental effort, mood, and perceived stress were also recorded. Each participant was exposed to three different working environments and the data were then labeled by medical professionals as follows.
• No stress: The participants were permitted to work on the activities for as long as they needed, up to 45 minutes. However, they were unaware of the maximum duration of the task.
• Time pressure: Under time pressure, the time to complete the same job was decreased to 2/3 of its time in the normal condition.
• Interruption: The participants were interrupted when they received 8 emails in the middle of a given activity. Some emails were pertinent to their tasks, and the participants were asked to take particular actions, whereas others were totally irrelevant to the ongoing tasks.
FIGURE 3. Time-domain features of HRV.
The distribution of the collected data with three different stress classes is presented in Fig. 2. The HRV indices were computed by extracting an inter-beat interval (IBI) signal from each participant's peaks of the ECG signals. For each participant, the experiment lasted for approximately 3 hours.

TABLE 1. Explanation of the HRV features in the SWELL−KW dataset [4] [35].
No.   | Feature              | Meaning
1     | MEAN_RR              | Mean of RR intervals
2     | MEDIAN_RR            | Median of RR intervals
3     | SDRR                 | SD of RR intervals
4     | RMSSD                | Root mean square of successive RR interval differences
5     | SDSD                 | SD of successive RR interval differences
6     | SDRR_RMSSD           | Ratio of SDRR over RMSSD
7     | HR                   | Heart rate
8     | pNN25                | Percentage of successive RR intervals that differ by more than 25 ms
9     | pNN50                | Percentage of successive RR intervals that differ by more than 50 ms
10    | SD1                  | Measures short-term HRV in ms and correlates with baroreflex sensitivity (BRS)
11    | SD2                  | Measures long-term HRV in ms and correlates with BRS
12    | KURT                 | Kurtosis of RR intervals
13    | SKEW                 | Skewness of RR intervals
14    | MEAN_REL_RR          | Mean of relative RR intervals
15    | MEDIAN_REL_RR        | Median of relative RR intervals
16    | SDRR_REL_RR          | SD of relative RR intervals
17    | RMSSD_REL_RR         | Square root of the mean of the sum of the squares of the difference between adjacent relative RR intervals
18    | SDSD_REL_RR          | SD of the differences between adjacent relative RR intervals
19    | SDRR_RMSSD_REL_RR    | Ratio of SDRR_REL_RR over RMSSD_REL_RR
20    | KURT_REL_RR          | Kurtosis of relative RR intervals
21    | SKEW_REL_RR          | Skewness of relative RR intervals
22-23 | VLF; VLF_PCT         | Very low (0.003 Hz - 0.04 Hz) frequency activity of the HRV spectrum
24-26 | LF; LF_PCT; LF_NU    | Low frequency activity in the 0.04 - 0.15 Hz range
27-29 | HF; HF_PCT; HF_NU    | High-frequency activity in the 0.15 - 0.40 Hz range
30    | TP                   | Total HRV power spectrum
31    | LF_HF                | Ratio of low to high frequency
32    | HF_LF                | Ratio of high to low frequency
33    | sampen               | Sample entropy of the RR signal
34    | higuci               | Higuchi Fractal Dimension
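To make the time-domain entries of Tab. 1 concrete, the following minimal sketch computes a few of them from a list of RR intervals given in milliseconds. It is an illustration only and not the pipeline used to build SWELL−KW: the function name `hrv_time_domain`, the use of the sample standard deviation (n − 1), and the approximation of HR as 60000 divided by the mean RR interval are assumptions made here.

```python
import math
import statistics


def hrv_time_domain(rr_ms):
    """Compute a few time-domain HRV features from RR intervals (milliseconds).

    Hypothetical helper for illustration; feature names follow Tab. 1.
    """
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")
    # Differences between successive RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    mean_rr = statistics.fmean(rr_ms)
    return {
        "MEAN_RR": mean_rr,
        "MEDIAN_RR": statistics.median(rr_ms),
        "SDRR": statistics.stdev(rr_ms),    # assumption: sample SD (n - 1)
        "RMSSD": math.sqrt(statistics.fmean([d * d for d in diffs])),
        "SDSD": statistics.stdev(diffs),    # assumption: sample SD (n - 1)
        "HR": 60000.0 / mean_rr,            # assumption: mean rate in beats per minute
        "pNN25": 100.0 * sum(abs(d) > 25 for d in diffs) / len(diffs),
        "pNN50": 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs),
    }


if __name__ == "__main__":
    features = hrv_time_domain([800, 810, 790, 850, 760, 800])
    for name, value in features.items():
        print(f"{name}: {value:.2f}")
```

A short RR series is enough to see how the features react: larger beat-to-beat differences raise RMSSD, pNN25, and pNN50, while a shorter mean RR interval raises HR.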
From the HRV data, various time-domain and frequency-domain features are extracted, as presented in Tab. 1. Furthermore, we illustrate in Fig. 3 the time-domain features, e.g., time intervals between consecutive heart beats (RR interval) and heart rate of HRV signals. Correspondingly, the frequency-domain features, i.e., the signal power levels with respect to low frequency (LF) and high frequency (HF), are illustrated in Fig. 4. These plots are generated using the first 1000 samples from the SWELL−KW dataset.
FIGURE 4. Frequency-domain features of HRV.

C. DATA PREPROCESSING
The collected HRV data in the SWELL−KW dataset are time-variant. For classification, we re-construct the HRV data, which was a discrete time series with timestamps, to a series indexed with sequence numbers without timestamps. Moreover, we convert all data into the numerical format. We also remove participants' noisy, incomplete, or missing data. These processing steps result in data from 25 participants with 410322 records and 34 features for stress level classification.
Moreover, we perform normality tests using methods such as Shapiro–Wilk [29] on each feature of the datasets, and the results reveal that the data samples do not look Gaussian. The normality tests are performed following the standard hypothesis testing method with a significance level α = 0.05 (i.e., a P-value ≥ 0.05 indicates that the sample looks Gaussian). Further data preprocessing steps are performed as follows.
• Splitting data for training and testing as 80|20 for train|test datasets, respectively;
• Normalization with a standard scaler method to confine the feature values within the range of [0, 1], as some of the selected features were in different magnitudes; and
• Reshaping of each row of the training features into a 1D vector so that it becomes an input to the input layer of the deep learning model.

IV. A CNN MODEL FOR STRESS STATUS CLASSIFICATION
In this section, we present the developed deep learning model for stress status classification. As shown on the right-hand side of Fig. 1, the model consists of feature ranking, feature extraction, and stress level classification.

A. FEATURE RANKING AND EXTRACTION
Firstly, we rank the essential features based on their relevance to the classification task. To do so, the ANOVA [31] F-test is adopted to select the significant features from the SWELL−KW dataset for feature ranking and extraction. ANOVA is a popular tool to perform a parametric statistical hypothesis test that assesses whether two or more data samples (typically three or more) come from distributions with the same mean or not. An F-statistic or F-test is a statistical test method that adopts ANOVA to calculate the ratio between variance values, such as variance from two different samples, or explained and unexplained variance. Furthermore, ANOVA can be used when one variable is numeric and the other one is categorical, such as when numerical input data and a classification outcome variable are compared in a classification task.
In this study, we first employ all features for stress classification and then drop the less significant features based on the importance of features (i.e., feature ranking) before performing the classification task. In the latter case, the training time is shortened while keeping the accuracy of the model.

B. A CNN DL MODEL FOR STRESS CLASSIFICATION
The designed DL model for stress level classification is developed based on the conventional, well-known CNN architectures [32]. CNN is a powerful tool for automatic feature extraction and learning from 1D data sequences. The HRV features used as inputs to our model are listed in Tab. 1. For our model design, we retain a reasonable number of neurons in each layer based on the common heuristics (e.g., validation loss, hidden units are a fraction of the input). The CNN kernels slide over the components of the 1D input pattern during convolution.
More specifically, our 1D CNN model consists of an input layer, multiple hidden layers, a max-pooling layer, a flattening layer, and an output layer, as depicted in Fig. 5. The input layer is a 1D convolutional layer, and it consists of 64 filters, a kernel of size 2, and a rectified linear unit (ReLU) activation function. The ReLU activation function helps to avoid the vanishing gradient so that a faster convergence can be obtained. The 1D max-pooling layer has been introduced to reduce the dimensions of the feature maps. The flattening layer has been adopted to convert the down-sampled data into a 1D vector that acts as an input to the output layer. A softmax activation function has been adopted in the output layer for multi-class classification, i.e., no stress, time pressure, and interruption, based on probability distribution.
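The three preprocessing steps of Subsec. III-C (80|20 split, scaling, and reshaping) can be sketched in plain Python as follows. This is a sketch under stated assumptions rather than the authors' code: the helper names are hypothetical, rows are shuffled before splitting (the text does not say whether they were), and min-max scaling is used because it yields values in [0, 1], whereas a z-score "standard scaler" would instead give zero mean and unit variance.

```python
import random


def split_train_test(rows, labels, train_fraction=0.8, seed=0):
    """Shuffle indices and split rows/labels into train and test parts."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)   # assumption: random shuffling before the split
    n_train = int(len(rows) * train_fraction)
    train_idx, test_idx = idx[:n_train], idx[n_train:]
    return ([rows[i] for i in train_idx], [rows[i] for i in test_idx],
            [labels[i] for i in train_idx], [labels[i] for i in test_idx])


def fit_min_max(rows):
    """Per-feature minimum and maximum, estimated on the training rows only."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, lo, hi):
    """Scale each feature with the training minimum and maximum."""
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)]
            for row in rows]


def reshape_rows(rows):
    """Turn shape (N, F) into (N, 1, F): each row becomes a single 1D vector."""
    return [[row] for row in rows]


if __name__ == "__main__":
    data = [[float(i), 10.0 * i] for i in range(10)]
    y = [i % 3 for i in range(10)]
    x_train, x_test, y_train, y_test = split_train_test(data, y)
    lo, hi = fit_min_max(x_train)
    x_train = reshape_rows(apply_min_max(x_train, lo, hi))
    x_test = reshape_rows(apply_min_max(x_test, lo, hi))
    print(len(x_train), len(x_test), x_train[0])
```

Fitting the scaling parameters on the training part only avoids leaking test information into training; as a consequence, scaled test values may fall slightly outside [0, 1].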
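The ANOVA F-test ranking described in Subsec. IV-A can be illustrated with a one-way F-statistic computed per feature: the between-class mean square divided by the within-class mean square. The sketch below is a generic textbook formulation with hypothetical function names (`anova_f`, `rank_features`) and toy data; it is not taken from the paper's implementation.

```python
def anova_f(values, labels):
    """One-way ANOVA F-statistic of a numeric feature against class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Variation of the class means around the grand mean (explained variance).
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    # Variation of the samples around their own class mean (unexplained variance).
    ss_within = sum((v - sum(g) / len(g)) ** 2
                    for g in groups.values() for v in g)
    if ss_within == 0.0:
        return float("inf")   # classes perfectly separated by this feature
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(rows, labels, names):
    """Return (feature name, F-value) pairs sorted from most to least relevant."""
    scores = [(name, anova_f([row[j] for row in rows], labels))
              for j, name in enumerate(names)]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy data: feature "A" separates the three classes, feature "B" does not.
    rows = [[1, 1], [2, 5], [3, 9], [4, 2], [5, 5], [6, 8], [7, 3], [8, 5], [9, 7]]
    labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
    print(rank_features(rows, labels, ["A", "B"]))
```

Features whose class means differ strongly relative to their within-class spread receive high F-values and therefore appear first in the ranking.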
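The layer sequence of Subsec. IV-B (1D convolution with ReLU, max-pooling, flattening, and a softmax output) can be traced with a small forward pass in plain Python. This is a sketch with untrained random weights and several assumptions that are not taken from the paper: each sample is treated as a length-34 sequence with a single channel, the convolution uses no padding, the pooling window is 2, and all function names are hypothetical.

```python
import math
import random


def conv1d_relu(x, kernels, biases):
    """Unpadded 1D convolution over a single-channel input, followed by ReLU."""
    k = len(kernels[0])
    return [[max(0.0, sum(w * v for w, v in zip(kernel, x[i:i + k])) + b)
             for i in range(len(x) - k + 1)]
            for kernel, b in zip(kernels, biases)]


def max_pool1d(feature_map, size=2):
    """Non-overlapping max-pooling; a trailing partial window is dropped."""
    return [max(feature_map[i:i + size])
            for i in range(0, len(feature_map) - size + 1, size)]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, kernels, biases, dense_w, dense_b):
    """Conv1D + ReLU -> max-pool -> flatten -> dense softmax over the classes."""
    pooled = [max_pool1d(fm) for fm in conv1d_relu(x, kernels, biases)]
    flat = [v for fm in pooled for v in fm]
    logits = [sum(w * v for w, v in zip(row, flat)) + b
              for row, b in zip(dense_w, dense_b)]
    return softmax(logits)


def categorical_cross_entropy(probs, true_class):
    """Loss for one sample whose correct class index is true_class."""
    return -math.log(max(probs[true_class], 1e-12))


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.random() for _ in range(34)]                  # one scaled sample
    kernels = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(64)]
    biases = [0.0] * 64
    # 34 inputs -> 33 convolution outputs -> 16 pooled values per filter.
    dense_w = [[rng.uniform(-0.1, 0.1) for _ in range(64 * 16)] for _ in range(3)]
    probs = forward(x, kernels, biases, dense_w, [0.0] * 3)
    print(probs, categorical_cross_entropy(probs, 0))
```

With 64 filters of kernel size 2, the flattened vector under these assumptions has 64 × 16 = 1024 entries, which the output layer maps to three class probabilities that sum to one.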

FIGURE 5. The structure of the developed 1D CNN model for stress classification.
For loss calculation, we introduce the categorical cross-entropy loss function to compile our 1D CNN model. For model training, we adopt the adaptive moment estimation (ADAM) optimizer, as it is computationally efficient and claims less memory. To reduce the learning rate and improve the performance of our model, a validation split of 0.05 is configured.
As the platform to train and validate the developed model, we rely on Google Colab. Specifically, the model is trained with the default configuration of Google Colab, e.g., Intel(R) Xeon(R) central processing unit (CPU)@2.20 GHz and 12 GB random access memory (RAM). The initial input data shape is (328257, 34). Then the input data is reshaped to (328257, 1, 34), where each row of the input data is formed into a one-dimensional vector. The fit() function turns the training data into many batches, each with a size of 64, for training.

V. PERFORMANCE METRICS
The performance of the developed 1D CNN model for multi-class stress classification has been evaluated through discrimination analysis based on the SWELL−KW dataset. The discrimination analysis metrics are Precision (eq. (1)), Recall (eq. (2)), Accuracy (eq. (3)), F1-score (eq. (4)), MCC (eq. (5)), classification report, and confusion matrix [29] [30]. A confusion matrix is a 2-dimensional table (actual versus predicted) whose cells fall into four categories, namely, true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).
The cells, or a collection of cells, considered by the ratios for a particular class in multi-class classification are explained as follows [33]. TP is an outcome where the model estimates the positive class accurately; TN is an outcome in which the model correctly predicts the negative class; FP is an outcome where the model estimates the positive class inaccurately; and FN is an outcome in which the model forecasts the negative class incorrectly. Accordingly, the performance metrics for a given class are expressed respectively as follows [29].
\[ \text{Precision} = \frac{TP}{TP + FP} \tag{1} \]
\[ \text{Recall} = \frac{TP}{TP + FN} \tag{2} \]
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3} \]
\[ \text{F1-score} = \frac{2 \times \text{Recall} \times \text{Precision}}{\text{Recall} + \text{Precision}} \tag{4} \]
A higher value from the above expressions represents better performance of a model, and this applies to all performance metrics. On the other hand, bias is an error due to erroneous assumptions in the learning algorithm, and variance is an error from sensitivity to small fluctuations in the training set. While high bias leads to under-fitting, high variance results in overfitting. Accuracy and F1-scores can be misleading because they do not fully account for the sizes of the four categories of the confusion matrix in the final score calculation. In comparison, the MCC is more informative than the F1-score and Accuracy because it considers the balanced ratios of the four confusion matrix categories (i.e., TP, TN, FP, and FN). The F1-score depends on which class is defined as a positive class. However, MCC does not depend on which class is the positive class, and it has an advantage over the F1-score as it avoids incorrectly defining the positive class [34]. The MCC is expressed as follows [30].
\[ \text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{5} \]

VI. CLASSIFICATION RESULTS AND DISCUSSIONS
In this section, we present the experimental results and reveal the importance of ANOVA-based feature selection.

A. FEATURE RANKING AND SELECTION FOR SWELL−KW
In this study, we have considered all 34 features provided by the SWELL−KW dataset. However, some of the features are irrelevant and act as outliers. In this regard, the ANOVA method proves very useful. Initially, it ranks the 34 features based on their F-values. Fig. 6 presents the ranking of the HRV features that are available in the SWELL−KW dataset. Typically, features with higher F-values are more important for final stress level categorization. The most relevant and important subset of the rated features is further
identified via a forward sequential feature selection method. The forward sequential feature selection forms the optimal subset of features from the 34 features in their ranked order by sequentially selecting the features.
FIGURE 6. Feature ranking of the 34 features using ANOVA.
In Fig. 7, we demonstrate the accuracy scores by sequentially selecting the ANOVA-sorted features. It can be observed that accuracy increases with the number of features adopted for model training. More specifically, the developed model achieves an accuracy above 95% with fewer than half of the ANOVA-sorted features, i.e., fewer than 17 features. In the following two subsections, we first evaluate the performance of our model in terms of Precision, Recall, F1-score, and MCC when all available features are applied to the classifier and then demonstrate the efficacy of the feature reduction algorithm for stress level detection when the top 15 features are selected.
FIGURE 7. Accuracies with ANOVA-sorted features.

B. PERFORMANCE WHEN ALL FEATURES ARE APPLIED
TABLE 2. Performance of the proposed 1D CNN model for three-level classification with all features.
Classification Level | Precision | Recall | F1-score
No stress            | 1.0       | 1.0    | 1.0
Time pressure        | 1.0       | 1.0    | 1.0
Interruption         | 1.0       | 1.0    | 1.0
The developed CNN model has classified the SWELL−KW dataset into the following three stress categories based on
emotional states, i.e., no stress, time pressure, and interruption, and it has obtained an extremely high level of accuracy.

More specifically, Tab. 2 demonstrates the performance of the developed 1D CNN model on stress level classifications. Clearly, we have achieved the highest accuracy score of 0.99 with Precision = 1, Recall = 1, F1-score = 1, and MCC = 0.99. Overall, the developed 1D CNN model reaches an accuracy level of 99.9% for all three classification levels.

Fig. 8 presents the confusion matrix obtained from the developed 1D CNN model based on the SWELL−KW dataset. It is evident from the figure that the proposed classifier correctly predicts the true label with less than 0.01% error for all three classes.

FIGURE 8. Confusion matrix obtained based on stress class classification.

Furthermore, we have verified whether the proposed model is overfitted or not. Fig. 9 illustrates the training versus validation accuracy obtained through our experiments. From this figure, it is clear that the validation accuracy and training accuracy are nearly identical, with the validation loss being slightly higher than the training loss. In other words, the model is not overfitted, and it meets the criteria for a good-fit model.

FIGURE 9. Training versus validation accuracy.

C. PERFORMANCE WITH TOP FIFTEEN FEATURES
We further investigate the performance of the model by employing only the top 15 ANOVA-sorted features, and the obtained results are listed in Tab. 4. Through the values shown in the table, we demonstrate that the average scores for Precision, Recall, F1-score, and MCC achieved by the proposed model are still excellent, reaching a score of 96.5%, 94.6%, 97.0% and 92.9%, respectively. Overall, we have achieved an accuracy of 96.5% on average. Furthermore, the performance of the model using a 70/30 train-test split resulted in an accuracy of 0.961, precision of 0.960, recall of 0.956, F1 score of 0.957, and MCC of 0.935.

TABLE 4. Performance of the proposed 1D CNN model for three-level classification with top 15 ANOVA-sorted features

Classification Level   Precision   Recall   F1-score
No stress              0.96        0.97     0.97
Time pressure          0.99        0.95     0.97
Interruption           0.89        1.0      0.94

On the other hand, it is worth reiterating that the performance of our 1D CNN model with all features is extraordinary, outperforming the case with the top 15 features. However, such a benefit comes at the cost of a longer training time, especially when the size of a dataset is massive. In general, there is always a trade-off between performance and resource consumption. Therefore, whether to select all features or not depends on the key performance requirements of a system or service. In our experiments, the model training time with 15 features is 1733 seconds, which is 8 seconds less than the model training time with all features.

D. K-FOLD CROSS-VALIDATION
To validate the obtained results with the top 15 features, a k-fold cross-validation procedure has been performed and the results are compared with the ones obtained from the developed 1D CNN model. K-fold cross-validation divides the dataset into k equal-sized folds, training and evaluating the model k times, with each fold serving as the test set once and the remaining k-1 folds serving as the training set. The evaluation scores are then averaged across the k folds to obtain a more robust estimate of the model’s performance.

For our validation, the default value, i.e., 5 splits, is configured. In each split, the model is trained on the training folds and evaluated on the held-out test fold, and performance metrics in terms of Precision, Recall, Accuracy, F1 score, and MCC are calculated. The evaluation results based on these five splits show that the model achieves an average score of Precision = 0.944, Accuracy = 0.945, Recall = 0.933, F1 = 0.908, and MCC = 0.908, obtained based on the same test dataset. As such, it is evident that the developed model is capable of classifying the samples into their respective classes with high accuracy.

E. HYPERPARAMETER OPTIMIZATION
Initially, the model parameters are selected based on experience (as explained in Sec. IV-B). In what follows, we further investigate the impact of hyperparameter optimization on the performance of the developed model, using the Hyperband tuning technique.

Using the top 15 features of the SWELL−KW dataset, Hyperband [40] tuning is employed to optimize the hyperparameters of our model. The purpose of the tuning process is to maximize the model’s validation accuracy. Through the validation procedure illustrated in Appendix A, the best set of hyperparameters is found by the algorithm to be filters=160, kernel size=5, and dense units=48, resulting in a validation accuracy of 0.99.

On the other hand, it is worth noting that, although hyperparameter tuning can be effective in improving the performance of ML models, it can be challenging to apply in real-life applications. This is due to its demand for a significant amount of computational resources, which may not always be available, especially for large-volume datasets and complex models. Additionally, the optimal set of hyperparameters may be specific to the dataset, model, and the problem at hand, making it difficult to develop a generalizable approach to hyperparameter tuning [41] [42]. Thus, default hyperparameters or a small set of manually tuned hyperparameters may suffice in many cases, including this study, to achieve satisfactory performance.

F. QUANTITATIVE COMPARISON WITH EXISTING STUDIES
Finally, we make a quantitative comparison of our model versus other related studies that have appeared in the literature. In Tab. 3, the performance indicators from a few recent studies for automatic classification of stress levels are compared with our 1D CNN model.

TABLE 3. Quantitative comparison of the results with other state-of-the-art models

Reference    Dataset                 Binary/Multilevel   No. of features   Model    Accuracy        Precision   Recall   F1-score
[24]         SWELL−KW                Binary              17                SVM      92.75%          N.A.        N.A.     N.A.
[25]         SWELL−KW, AMIGOS [5]    3 class             N.A.              CNN      98.30%          96.00%      96.30%   95.80%
[13]         SWELL−KW                3 class             N.A.              SVM      90.00%          N.A.        N.A.     N.A.
[27]         WESAD [36]              Binary/3 class      7                 ML/ANN   84.32%/95.21%   N.A.        N.A.     78.71%/94.24%
[37]         From experiment         Binary/3 class      7                 MLP      92.85%/64.28%   N.A.        N.A.     N.A.
[15]         SWELL−KW                Binary              34                MLP      88.64%          93.01%      92.68%   82.75%
This study   SWELL−KW                3 class             34                1D-CNN   99.99%          100%        100%     100%
This study   SWELL−KW                3 class             15                1D-CNN   96.50%          94.60%      97.00%   96.00%

Existing studies that are based on publicly accessible datasets such as SWELL−KW, WESAD, and AMIGOS concentrated on binary and multi-class stress detection when assessing the effectiveness of their ML/DL models. It is worth mentioning that we used the SWELL−KW dataset for multi-class stress detection. Regarding performance evaluation, prior studies, e.g., [24] and [13], considered merely the accuracy score as the key performance metric. Although accuracy is a popular indicator, it is sufficient only if the false positive and false negative rates are essentially similar, and the dataset is symmetric.

Furthermore, Tab. 3 reveals that, when all features are considered during model training, none of the existing ML/DL models reported in the literature outperform the one developed in this study in terms of Accuracy, Precision, Recall, F1-score, and MCC for categorizing stress levels.

When a subset of features is selected for model training, the model presented in [25] shows higher performance than the proposed model in this study with the top 15 ANOVA-sorted features. The reason is that the authors in [25] considered all available features in the datasets, and they did not apply any dimension reduction technique for performance evaluation of their model.

VII. FURTHER DISCUSSIONS
Execution time of full features versus top-15 features: The execution time difference between the all-feature-based model and the top-15-feature-based model reported in Subsec. VI-C seems small. There are two reasons for this result. 1) The SWELL−KW dataset which serves as the basis for this study has a moderate amount of data (410322 records and 34 features as mentioned in Subsec. III-C) and 2) our training and validation procedures are performed based on Google Colab which has powerful CPUs and graphics
processing units (GPUs) as well as a large amount of RAM. When the volume of a dataset becomes huge, which is typical for big data processing, and/or the data processing machine is less powerful, e.g., a personal computer or a server located at a clinic, the benefit of our model with feature reduction will be more significant, especially for validation. This is because, after the data collection phase, model training can still be performed offline based on powerful CPUs/GPUs.

Model Applicability: The model developed in this study is built based on the SWELL−KW dataset. Nevertheless, we believe that, with proper parameter tuning or enhancement, the model may be applicable to other datasets that target similar mental health status analysis. Within the framework of an ongoing research project acknowledged below, we are collecting real-life data including HR and RR for mental health inpatients in a Norwegian hospital based on non-wearable Internet of things (IoT) devices. We plan to assess the performance of the developed model based on our own datasets. However, including the validation results based on these inpatient datasets is beyond the scope of this paper.

VIII. CONCLUDING REMARKS
In this study, we have developed a novel 1D CNN model for stress level classification using HRV signals and validated the proposed model based on a publicly available dataset, SWELL−KW. In our model, we also applied an ANOVA feature selection technique for dimension reduction. Through extensive training and validation, we demonstrate that our model outperforms the state-of-the-art models in terms of major performance metrics, i.e., Accuracy, Precision, Recall, F1-score, and MCC when all features are employed. Furthermore, our approach with ANOVA feature reduction also achieves excellent performance. For future work, we plan to further investigate the feasibility of optimizing the model to fit it into edge devices so that real-time stress detection can become a reality.

APPENDIX A
The hyperparameter tuning procedure is illustrated in a step-by-step manner as follows:
1) The necessary libraries are imported, including Keras Tuner, Pandas, NumPy, Scikit-learn, and TensorFlow.
2) The ‘train’ and ‘test’ datasets are read from CSV files using Pandas and concatenated into a single dataframe called ‘swell’.
3) ANOVA F-value feature selection is performed on ‘swell’, and the top 15 features are selected based on their scores. The target variable ‘condition’ is also extracted and encoded using Scikit-learn’s LabelEncoder.
4) The data is split into training and testing sets using Scikit-learn’s train_test_split function, with a test size of 20 percent and a random state of 101.
5) The data is normalized using Scikit-learn’s MinMaxScaler and reshaped to fit the input shape of a 1D CNN.
6) The code defines a function called ‘build_model’ that creates a CNN with tunable hyperparameters using a HyperParameters object. A Hyperband tuner is instantiated with a maximum number of epochs of 50, a factor of 3, and a directory and project name to store the tuner’s logs and checkpoints.
7) The best hyperparameters are obtained using the tuner’s ‘get_best_hyperparameters’ method and the first set of hyperparameters is selected.
8) After obtaining the best hyperparameters from the tuner, a new CNN model is built using these hyperparameters. The model is then trained using the training data for 150 epochs, and the model’s history is stored in a ‘history’ object.
The best hyperparameters using Hyperband are as follows: conv1_filters: 160, conv1_kernel: 5, dense_units: 48, and learning_rate: 0.001. With these hyperparameters, the model achieved a test loss of 0.011 and a test accuracy of 0.995.

REFERENCES
[1] H. G. Kim, E. J. Cheon, D. S. Bai, Y. H. Lee, and B. H. Koo, “Stress and heart rate variability: A meta-analysis and review of the literature,” Psychiatry Investigation, vol. 15, no. 3, pp. 235–245, Feb. 2018.
[2] D. Muhajir, F. Mahananto, and N. A. Sani, “Stress level measurements using heart rate variability analysis on Android based application,” Procedia Comput. Sci., vol. 197, no. 8, pp. 189–197, Jan. 2022.
[3] J. Held, A. Vîslă, C. Wolfer, N. Messerli-Bürgy, and C. Flückiger, “Heart rate variability change during a stressful cognitive task in individuals with anxiety and control participants,” BMC Psychology, vol. 9, no. 1, pp. 1–8, Mar. 2021.
[4] K. M. Dalmeida and G. L. Masala, “HRV features as viable physiological markers for stress detection using wearable devices,” Sensors, vol. 21, no. 8, art. no. 2873, pp. 1–18, Mar. 2021.
[5] M. C. J. Abdon, A. M. Khomami, S. Nicu, and P. Ioannis, “Amigos: A dataset for affect, personality and mood research on individuals and groups,” IEEE J. Biomed. Health Informat., vol. 12, no. 2, pp. 479–493, Apr.–Jun. 2021.
[6] E. Won and Y.-K. Kim, “Stress, the autonomic nervous system, and the immune-kynurenine pathway in the etiology of depression,” Current Neuropharmacology, vol. 14, no. 7, pp. 665–673, Oct. 2016.
[7] B. Olshansky, H. Sabbah, N. Hani, J. P. Hauptman, and W. S. Colucci, “Parasympathetic nervous system and heart failure: Pathophysiology and potential implications for therapy,” Circulation, vol. 118, no. 8, pp. 863–871, Aug. 2008.
[8] S. Goel, P. Tomar, and G. Kaur, “ECG feature extraction for stress recognition in automobile drivers,” Electron. J. Biol., vol. 12, no. 2, pp. 156–165, Mar. 2016.
[9] V. N. Hegde, R. Deekshit, and P. S. Satyanarayana, “A review on ECG signal processing and HRV analysis,” J. Medical Imaging Health Informat., vol. 3, no. 2, pp. 270–279, Jun. 2013.
[10] M. Vollmer, “A Robust, simple and reliable measure of heart rate variability using relative RR intervals,” in Proc. Comput. Cardiology Conf., 2015.
[11] P. A. Lanfranchi and V. K. Somers, Principles and Practice of Sleep Medicine, 5th Ed., Elsevier Inc., 2011.
[12] M. Malik, J. T. Bigger, A. J. Camm, R. E. Kleiger, A. Malliani, A. J. Moss, and P. J. Schwartz, “Heart rate variability,” European Heart J., vol. 17, no. 3, pp. 354–381, Mar. 1996.
[13] S. Koldijk, M. Sappelli, S. Verberne, M. A. Neerincx, and W. Kraaij, “The SWELL knowledge work dataset for stress and user modeling research,” in Proc. ACM Int. Conf. Multimodal Interaction (ICMI), 2014, pp. 291–298.
[14] S. Koldijk, M. A. Neerincx, and W. Kraaij, “Detecting work stress in offices by combining unobtrusive sensors,” IEEE Trans. Affective Comput., vol. 9, no. 2, pp. 227–239, Apr.–Jun. 2018.

[15] M. Albaladejo-González, J. A. Ruipérez-Valiente, and F. Gómez Mármol, “Evaluating different configurations of machine learning models and their transfer learning capabilities for stress detection using heart rate,” J. Ambient Intell. Humanized Comput., Aug. 2022. [Online]. Available: https://doi.org/10.1007/s12652-022-04365-z.
[16] R. Walambe, P. Nayak, A. Bhardwaj, and K. Kotecha, “Employing multimodal machine learning for stress detection,” J. Healthcare Eng., vol. 2021, art. no. 9356452, Oct. 2021.
[17] A. Ibaida, A. Abuadbba, and N. Chilamkurti, “Privacy-preserving compression model for efficient IoMT ECG sharing,” Comput. Commun., vol. 166, pp. 1–8, Jan. 2021.
[18] C. W. Dobbs, M. V. Fedewa, H. V. MacDonald, C. J. Holmes, Z. S. Cicone, D. J. Plews, and M. R. Esco, “The accuracy of acquiring heart rate variability from portable devices: A systematic review and meta-analysis,” Sports Med., vol. 49, no. 3, pp. 417–435, Mar. 2019.
[19] C.-M. Chen, S. Anastasova, K. Zhang, B. G. Rosa, P. L. B. Lo, H. E. Assender, and G.-Z. Yang, “Towards wearable and flexible sensors and circuits integration for stress monitoring,” IEEE J. Biomed. Health Informat., vol. 24, no. 8, pp. 2208–2215, Aug. 2020.
[20] R. A. Rahman, K. Omar, S. A. Mohd Noah, M. S. N. M. Danuri, and M. A. Al-Garadi, “Application of machine learning methods in mental health detection: A systematic review,” IEEE Access, vol. 8, pp. 183952–183964, Oct. 2020.
[21] S. H. Jambukia, V. K. Dabhi, and H. B. Prajapati, “Application of machine learning methods in mental health detection: A systematic review,” in Proc. Int. Conf. Advances Comput. Eng. Appl., pp. 714–721, 2015.
[22] S. Celin and K. Vasanth, “ECG signal classification using various machine learning techniques,” J. Medical Syst., vol. 42, no. 12, pp. 1–11, Oct. 2018.
[23] A. Padha and A. Sahoo, “A parametrized quantum LSTM model for continuous stress monitoring,” in Proc. Int. Conf. Comput. Sustainable Global Develop., May 2022, pp. 1–6.
[24] S. Sriramprakash, V. D. Prasanna, and O. V. R. Murthy, “Stress detection in working people,” Procedia Computer Sci., vol. 115, pp. 359–366, Oct. 2017.
[25] P. Sarkar and A. Etemad, “Self-supervised learning for ECG-based emotion recognition,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2020, pp. 3217–3221.
[26] M. X. Huang, J. Li, G. Ngai, and H. V. Leong, “Stressclick: Sensing stress from gaze-click patterns,” in Proc. ACM Int. Conf. Multimedia (MM), 2016, pp. 1395–1404.
[27] P. Bobade and M. Vani, “Stress detection with machine learning and deep learning using multimodal physiological data,” in Proc. IEEE Int. Conf. Inventive Res. Comput. Appl. (ICIRCA), 2020, pp. 51–57.
[28] B. J. Feir-Walsh and L. E. Toothaker, “An empirical comparison of the ANOVA F-test, normal scores test and Kruskal-Wallis test under violation of assumptions,” Educ. Psychological Meas., vol. 34, no. 4, pp. 789–799, Dec. 1974.
[29] A. Chatterjee, M. W. Gerdes, and S. G. Martinez, “Identification of risk factors associated with obesity and overweight – A machine learning overview,” Sensors, vol. 20, no. 9, art. no. 2734, pp. 1–30, May 2020.
[30] A. Chatterjee, N. Pahari, A. Prinz, and M. Reigler, “Machine learning and ontology in eCoaching for personalized activity level monitoring and recommendation generation,” Scientific Reports, vol. 12, no. 1, pp. 1–26, Nov. 2022.
[31] S. Lars and S. Wold, “Analysis of variance (ANOVA),” Chemometrics Intell. Lab. Syst., vol. 6, no. 4, pp. 259–272, Nov. 1989.
[32] S. Kiranyaz, O. Avci, O. Abdeljaber, T. Ince, M. Gabbouj, and D. J. Inman, “1D convolutional neural networks and applications: A survey,” Mechanical Syst. Signal Process., vol. 151, art. no. 107398, Apr. 2021.
[33] F. Mattioli, C. Porcaro, and G. Baldassarre, “A 1D CNN for high accuracy classification and transfer learning in motor imagery EEG-based brain-computer interface,” J. Neural Eng., vol. 18, no. 6, art. no. 066053, pp. 1–16, Jan. 2022.
[34] D. Chicco and G. Jurman, “The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation,” BMC Genomics, vol. 21, no. 1, pp. 1–13, Jan. 2020.
[35] K. Nkurikiyeyezu, K. Shoji, A. Yokokubo, and G. Lopez, “Thermal comfort and stress recognition in office environment,” in Proc. Int. Conf. Health Informat. (HEALTHINF), 2019, pp. 256–263.
[36] P. Schmidt, A. Reiss, R. Duerichen, C. Marberger, and K. V. Laerhoven, “Introducing WESAD, a multimodal dataset for wearable stress and affect detection,” in Proc. ACM Int. Conf. Multimodal Interaction (ICMI), 2018, pp. 400–408.
[37] A. Aamir, M. Majid, A. R. Butt, and S. M. Anwar, “Classification of perceived mental stress using a commercially available EEG headband,” IEEE J. Biomed. Health Informat., vol. 23, no. 6, pp. 2257–2264, Jul. 2019.
[38] R. G. Babukarthik, V. A. K. Adiga, G. Sambasivam, D. Chandramohan, and J. Amudhavel, “Prediction of COVID-19 using genetic deep learning convolutional neural network (GDCNN),” IEEE Access, vol. 8, pp. 177647–177666, Sep. 2020.
[39] R. G. Babukarthik, D. Chandramohan, D. Tripathi, M. Kumar, and G. Sambasivam, “COVID-19 identification in chest X-ray images using intelligent multi-level classification scenario,” Comput. Electr. Eng., vol. 104, Part A, art. no. 108405, Dec. 2022.
[40] L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar, “Hyperband: A novel bandit-based approach to hyperparameter optimization,” J. Mach. Learning Research, vol. 18, no. 1, pp. 6765–6816, Jan. 2017.
[41] M. Feurer and F. Hutter, “Hyperparameter optimization,” Book Chap. in Automated Machine Learning: Methods, Systems, Challenges. Springer Nature, 2019.
[42] J. Bergstra and Y. Bengio, “Random search for hyper-parameter optimization,” J. Mach. Learning Research, vol. 13, no. 2, pp. 281–305, Feb. 2012.

JON ANDREAS MORTENSEN is a Bachelor student in Computer Science at the Department of Information and Communication Technology in the University of Agder, Norway. He is mainly interested in data analytics and machine learning.

MARTIN EFREMOV MOLLOV is a Bachelor student in Computer Science at the Department of Information and Communication Technology in the University of Agder, Norway. He is mainly interested in data analytics and machine learning.

AYAN CHATTERJEE received the B.Eng. degree in Computer Science and Engineering (CSE) from the West Bengal University of Technology, India in 2009, and the master’s degree in Information Technology from Jadavpur University, India in 2016. He worked as an Associate Consultant in Tata Consultancy Services ltd., India from 2009 to 2019 and was deputed to Denmark and the Netherlands for 3.4 years as a Java Solution Designer and Data Analyst. He has strong aptitude in Object-Oriented programming concepts. He submitted his Ph.D. thesis from the University of Agder, Norway (Feb 2019 - Sep 2022) with specialization in ICT-eHealth. His research interests are AI, eHealth, Recommendation Technology, Semantics, Human Centered Design, Software Engineering, and Bioinformatics. He is now working as a Senior Researcher (AI and Semantics) in the Simula Research Laboratory (SimulaMet), Oslo, Norway, and as an Adjunct Associate Professor (Object-Oriented-Programming) in University of Agder, Kristiansand, Norway.

DEBASISH GHOSE received the Ph.D. degree
in Information and Communication Technology
from the University of Agder, Grimstad, Norway
in 2019. He did his Post-Doc. at the same uni-
versity from 2021 to 2022. Prior to his post-doc,
he worked as a system developer at Confirmit,
Grimstad, Norway from 2020 to 2021. Currently,
he is with the School of Economics, Innovation,
and Technology, Kristiania University College,
Bergen, Norway as an Associate Professor. His
research interests include protocol design, modeling, and performance eval-
uation of the Internet of things. His other research interests include edge and
fog computing, data analytics, cyber security, and machine learning.

FRANK Y. LI received the Ph.D. degree from
the Department of Telematics (now the Department
of Information Security and Communication
Technology), Norwegian University of Science
and Technology (NTNU), Trondheim, Norway, in
2003. He was a Senior Researcher with the UniK-
University Graduate Center (now the Department
of Technology Systems), University of Oslo, Nor-
way, before joining the Department of Informa-
tion and Communication Technology, University
of Agder, Norway, in August 2007, as an Associate Professor and then a Full
Professor. From August 2017 to July 2018, he was a Visiting Professor with
the Department of Electrical and Computer Engineering, Rice University,
Houston, TX, USA. During the past few years, he has been an active
participant in multiple Norwegian and EU research projects. His research
interests include MAC mechanisms and routing protocols in 5G and beyond
mobile systems and wireless networks, the Internet of Things, mesh and ad-
hoc networks, wireless sensor networks, D2D communications, cooperative
communications, cognitive radio networks, green wireless communications,
dependability and reliability in wireless networks, QoS, resource manage-
ment, traffic engineering in wired and wireless IP-based networks, and
the analysis, simulation, and performance evaluation of communication
protocols and networks. He was listed as a Lead Scientist by the European
Commission DG RTD Unit A.03—Evaluation and Monitoring of Program
in November 2007.
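
As a supplementary illustration of the ANOVA-based feature ranking used in this study, the following minimal sketch computes a one-way ANOVA F-value per feature and sorts the features from the highest to the lowest F-value. It is not the authors' code: the function names `anova_f_value` and `rank_features`, the feature names, and the toy values are all hypothetical, and the sketch covers only the ranking step, not the forward sequential selection or the 1D CNN.

```python
from statistics import mean


def anova_f_value(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n_total = len(values)
    n_groups = len(groups)  # assumes at least two classes
    grand_mean = mean(values)

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    if ms_within == 0:
        # No within-class spread: avoid dividing by zero.
        return float("inf") if ms_between > 0 else 0.0
    return ms_between / ms_within


def rank_features(table, labels):
    """Return (name, F-value) pairs sorted from highest to lowest F-value."""
    scores = {name: anova_f_value(col, labels) for name, col in table.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): three classes with three samples each.
labels = ["no stress"] * 3 + ["time pressure"] * 3 + ["interruption"] * 3
table = {
    "feature_a": [1, 2, 3, 4, 5, 6, 7, 8, 9],  # class means differ
    "feature_b": [1, 5, 9, 2, 5, 8, 3, 5, 7],  # class means are identical
}

ranking = rank_features(table, labels)
top_k = [name for name, _ in ranking[:1]]  # the study keeps the top 15 of 34
print(ranking)
print(top_k)
```

In this toy case the feature whose class means differ receives the larger F-value and is ranked first, which mirrors the statement that features with higher F-values are more important for stress level categorization.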
