SVM_PSO
Abstract
People experience different emotions in different situations and at different stages of their daily lives: boredom, calm, horror and, of course, amusement. In this case, an efficient method for recognizing emotion from EEG data is crucial. Identifying a person's psychophysiological state has been a major concern in many fields, including the adaptation of social life for disabled people. In recent years, different physiological cues have been incorporated into emotion recognition research because the success rate is not very high in trials that rely on basic modalities such as postures and facial gestures. This study presents a PSO-based machine learning method that identifies human emotions by analyzing electroencephalogram (EEG) signals. The EEG signals underwent preprocessing with a band-pass filter and a wavelet transform, which is more sensitive to time-frequency changes. Features were then extracted using the Fast Fourier Transform, from which feature vectors were formed. Finally, the extracted time-frequency features were classified using a Support Vector Machine (SVM), a frequently used machine learning technique, tuned with particle swarm optimization (PSO). The proposed method was evaluated on the GAMEEMO dataset. Two emotion classes, positive and negative, are considered for binary classification, and the two emotion dimensions, arousal and valence, are used for classification into LANV, LAPV, HANV, and HAPV. The experimental results demonstrate the significance and efficiency of the proposed combination of SVM with PSO, which achieves a detection accuracy of 99.37%.
Keywords: EEG, GAMEEMO dataset, Particle Swarm Optimization, Support Vector Machine, Feature Extraction, Data Augmentation, Emotion Recognition
1. Introduction

Emotions are fundamental to changes in behavior and performance that signify learning, reflecting a sophisticated level of intelligence. To fully define emotion, it is essential to consider three key aspects: the implicit or cognitive sense of emotion, the neurological and physiological mechanisms that underlie it, and the observable expressive patterns, especially those manifested in facial expressions [1]. Emotions are classified according to three main theories: Plutchik's, Ekman's, and the James-Lange theory. Plutchik's theory divides emotions into basic (e.g., joy, fear, anger) and secondary (e.g., aggression, submission) categories. Ekman's discrete model identifies six primary emotions: sadness, happiness, anger, disgust, fear, and surprise. The James-Lange theory categorizes emotions using a two-dimensional arousal-valence paradigm, in which valence varies from negative (unpleasant) to positive (pleasant), and arousal runs from low (passive) to high (active) [2]. Humans express these emotions through gestures, speech, body language, and physiological signals, but emotions can be interpreted accurately from physiological signals. Several physiological signals, such as GSR, EMG, RT, and EEG, have been employed to detect a limited range of emotions. However, EEG data provide high temporal resolution and meaning-rich information that can be accessed using inexpensive, portable EEG equipment. Thus, a communication channel between the human mind and an external instrument may be created through an EEG-based system, which enables the reading of physiological signals and the interpretation of specific elements of the individual's mental state. These emotions can be captured during synaptic excitation of neuronal dendrites: BCI technology places electrodes on top of the head to measure how the nervous system's electricity reacts to current flow. EEG wearables with emotion identification might lead to breakthroughs in e-learning, gaming, medical care, human-computer interfaces, and automobiles.

The field of Brain-Computer Interaction (BCI) originated in the 1970s, with early research focusing on the cortices of monkeys and rats at the University of Washington School of Medicine. Initially, BCI research focused on medical diagnosis and brain function examination. Technological advancements have expanded this scope to include the use of artificial intelligence in BCI for facial expressions and eye movement, emotion recognition, and limb/hand movement detection. It seems that [...]
Figure 1: Emotion Model
| Ref. | Method | Emotions | Accuracy |
|---|---|---|---|
| [4] | Multivariate multiscale modified-distribution entropy (MM-mDistEn), combined with ANN model | Valence and Arousal | 95.73% for valence and 96.78% for arousal |
| [5] | GoogLeNet-based deep learning method using EEG signals, through Continuous Wavelet Transform | 'Positive' and 'Negative' emotions | 98.78% with SVM, 98.53% with k-NN, and 98.41% with ELM |
| [6] | PrimePatNet87, a hand-crafted network using prime pattern and TQWT techniques | Arousal, valence | 99% |
| [7] | Spectral entropy, classifying with bidirectional LSTM | Valence and Arousal | 76.91% |
| [8] | Projection dictionary learning with the fast PDPL technique using a genetic algorithm | Positive, neutral, and negative | 69.89% |
| [9] | General non-subject-based model that applies Capsule Networks for binary and multi-class recognition | LAPV, HAPV, LANV, HANV | 98.38% |
| [10] | LEDPatNet19, an advanced emotion recognition model with TQWT multilevel feature generation network | Arousal, dominance, and valence | 99.29% |
| Proposed Method | Bandpass filtering using wavelet transform with swarm optimization using SVM | LAPV, HAPV, LANV, HANV | 99.37% |
2. Literature review
Recent studies have focused on extracting meaningful properties from EEG signals for emotion recognition. These studies have proposed various algorithms that leverage different characteristics of EEG signals to enhance emotion recognition. The authors of [4] propose an emotion recognition system using multichannel EEG and a novel entropy measure, multivariate multiscale modified-distribution entropy (MM-mDistEn), combined with an ANN model. Their approach achieved 95.73% accuracy for valence and 96.78% for arousal on the GAMEEMO dataset, and 92.57% and 80.23% on the DEAP dataset, respectively [4].
[...] emotion recognition, the authors employed projection dictionary learning with the fast PDPL technique and optimized the parameters using a genetic algorithm. Their method achieved 69.89% accuracy on SEED, 24.11% on MPED, and 64.34% for two-class GAMEEMO [8].

Emotion recognition using EEG signals, influenced by aural and visual stimuli, is challenging due to internal psychological emotions. The study in [9] introduces a general non-subject-based model and applies Capsule Networks for binary and multi-class recognition. The method achieved a 10% improvement in accuracy over existing studies using the GAMEEMO dataset [9]. The work in [10] presents LEDPatNet19, an advanced emotion recognition model using EEG signals. It integrates a multilevel feature generation network with TQWT, statistical, and nonlinear feature extraction. Evaluated on the GAMEEMO and DREAMER datasets, it achieved accuracies of 99.29% and 94.58% for different emotion categories, demonstrating its effectiveness [10].

3. Preprocessing

Preprocessing of EEG signals is essential for noise removal and can be achieved through methods like band-pass filtering, wavelet transform and Fourier transform. These techniques enhance the signal-to-noise ratio (SNR), reduce computational cost, and improve the accuracy of emotion recognition. In this study, the GAMEEMO dataset, consisting of EEG signals collected at a 128 Hz sampling rate, was filtered and denoised to remove noise and artifacts. Four frequency bands were targeted: 8-8.57 Hz, 9.5-10.5 Hz, 11.5-12.5 Hz, and 14.5-15.5 Hz, corresponding to different emotional states.

In the GAMEEMO dataset analysis for both binary and multi-class classification, EEG signals from the LAPV and HAPV zones (representing positive emotions) and from the LANV and HANV zones (representing negative emotions) were examined separately. The process involved averaging the EEG signals within the positive and negative emotion categories to create a unified EEG signal for each category. A band-pass filter is applied to the EEG signals to isolate the desired frequency bands, followed by wavelet denoising using the 'db4' wavelet and an 11-level decomposition. The wavelet transform is widely used for representing signals in the time-frequency domain. It decomposes a time-domain signal into wavelet coefficients using a mother wavelet function, achieved through shifting and dilation of the mother wavelet. When only a noisy version of a signal is available, the challenge is to restore the original information. Traditional wavelet denoising modifies the wavelet coefficients based on their local properties and then inverts the transformation to obtain a clean signal. It involves three key steps. First, the input signal is decomposed into wavelet coefficients. Next, these coefficients are adjusted using a thresholding technique. Finally, the modified coefficients are used in the inverse transform to reconstruct the signal, free from noise. In our research, we demonstrated effective wavelet-based EEG denoising using universal and statistical threshold functions.

In our study, EEG signals were filtered using a 5th-order band-pass filter with cutoff frequencies of 0.05 Hz and 45 Hz. The filtering process can be represented as:

$$ y(t) = \sum_{n=0}^{N-1} h[n] \cdot x(t - n) \tag{1} $$

where y(t) is the filtered signal, x(t) is the original signal, h[n] is the impulse response of the band-pass filter, and N is the number of filter coefficients.

Wavelet denoising is used to remove artifacts from the EEG
signals. The denoising process typically involves thresholding the wavelet coefficients, which can be represented as:

$$ \hat{c}_j = \begin{cases} c_j & \text{if } |c_j| > \lambda \\ 0 & \text{if } |c_j| \le \lambda \end{cases} \tag{2} $$

where c_j are the wavelet coefficients, λ is the threshold level, and ĉ_j are the denoised wavelet coefficients.

Figure 5: All Features of EEG signal

An EEG epoch was defined by segmenting the raw EEG signal into 4-second intervals for each channel, with the subsequent epoch obtained by shifting the window by 1 second. After segmentation, features were computed independently for each epoch and channel, focusing on commonly used features. Time-domain features (TDFs) were calculated directly from the raw EEG signal, while frequency-domain features were extracted from the power spectral density (PSD) and time-frequency domain features (TFDFs) were computed from discrete wavelet transform (DWT) coefficients [11].

The challenge of EEG channel selection remains an active area of research, as using multiple channels can sometimes be redundant. In our study, we focus on channel selection through spatial averaging of features from the left and right hemispheres. Epochs are created by dividing the EEG signal into fixed-length segments. Each segment (epoch) can be represented as:

$$ \mathrm{Epoch}_k = \{x_{t_0}, x_{t_1}, \ldots, x_{t_n}\} \tag{3} $$

where k denotes the epoch number and t_0, t_1, ..., t_n are the time indices within the segment.

4. Feature Extraction

EEG signals are inherently complex and rich in information. In feature extraction (FE), the intention is to achieve dimensionality reduction of the EEG data while preserving the essential information contained in the EEG signals. FE was conducted on the cleaned EEG signals by segmenting them into epochs of 16 seconds each. Features in both the time and frequency domains were extracted. For time-domain features, our study used mean, variance, skewness, and kurtosis; for frequency-domain features, Power Spectral Density (PSD) and Fast Fourier Transform (FFT) were applied. These features capture the essential characteristics of the EEG signals for subsequent classification.

The Fast Fourier Transform (FFT) is used to extract frequency-domain features:

$$ X(f) = \sum_{n=0}^{N-1} x[n] \cdot e^{-j \frac{2\pi f n}{N}} \tag{4} $$

where X(f) represents the frequency characteristics of the signal, x[n] is the time-domain representation of the signal, and N is the number of points in the EEG signal.

Welch's PSD estimation method:

$$ P_{xx}(f) = \frac{1}{K} \sum_{k=1}^{K} |X_k(f)|^2 \tag{5} $$

where P_xx(f) is the power spectral density, X_k(f) is the Fourier transform of the k-th segment of the signal, and K is the number of segments.

Time-Domain Features

$$ \mu = \frac{1}{N} \sum_{i=1}^{N} x_i \tag{6} $$

$$ \sigma^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \mu)^2 \tag{7} $$

$$ \gamma_1 = \frac{1}{N} \sum_{i=1}^{N} \frac{(x_i - \mu)^3}{\sigma^3} \tag{8} $$

$$ \gamma_2 = \frac{1}{N} \sum_{i=1}^{N} \frac{(x_i - \mu)^4}{\sigma^4} - 3 \tag{9} $$

With respect to the segmented data, data augmentation (DA) refers to the process of creating new samples to supplement an existing dataset by modifying the existing samples. This approach can enhance classification accuracy and stability, especially when dealing with EEG data. The main goal of DA is to decrease the classifier's sensitivity to such modifications, making it less biased and more invariant to these transformations, and therefore allowing the model to perform better on unseen data. To improve the quality, stability, and heterogeneity of the training dataset, data augmentation strategies were used. Gaussian noise was added to the EEG signals to simulate variability, and time-shifting and amplitude scaling were applied to introduce temporal and amplitude variations. This augmentation process quadrupled the dataset size, providing a richer training set for the classifier.

Adding Gaussian noise to the signal:

$$ x_{aug}(t) = x(t) + \mathcal{N}(0, \sigma^2) \tag{10} $$
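The augmentation step can be illustrated with a minimal NumPy sketch. It assumes epochs are stored as a 2-D array of shape (n_epochs, n_samples), and it keeps the original epochs alongside one noisy, one time-shifted, and one amplitude-scaled copy, which is one way to arrive at a fourfold dataset. The function name `augment_epochs`, the noise level, the shift range, the gain range, and the use of a circular shift are illustrative assumptions, not values reported in this study.

```python
import numpy as np

def augment_epochs(epochs, labels, noise_std=0.05, max_shift=16,
                   scale_range=(0.9, 1.1), seed=0):
    """Return the original epochs plus three augmented copies (4x the data).

    epochs: array of shape (n_epochs, n_samples); labels: shape (n_epochs,).
    All default parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    epochs = np.asarray(epochs, dtype=float)
    labels = np.asarray(labels)

    # Eq. (10): x_aug(t) = x(t) + N(0, sigma^2)
    noisy = epochs + rng.normal(0.0, noise_std, size=epochs.shape)

    # Time shifting: circular shift of each epoch by a random number of samples
    # (circular shifting is an assumption; the paper does not specify the scheme).
    shifts = rng.integers(-max_shift, max_shift + 1, size=len(epochs))
    shifted = np.stack([np.roll(e, s) for e, s in zip(epochs, shifts)])

    # Amplitude scaling: one random gain per epoch.
    gains = rng.uniform(scale_range[0], scale_range[1], size=(len(epochs), 1))
    scaled = epochs * gains

    x_aug = np.concatenate([epochs, noisy, shifted, scaled], axis=0)
    y_aug = np.tile(labels, 4)  # each augmented copy keeps its source label
    return x_aug, y_aug
```

Augmentation of this kind is normally applied to the training split only, so that validation and test accuracy are measured on unmodified epochs.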
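The time- and frequency-domain features of Eqs. (5)-(9) can be sketched per epoch as follows. This is a NumPy-only illustration, not the authors' implementation: the function names, the Hann window, the 50% segment overlap, and the density scaling in `welch_psd` are assumptions added to make Eq. (5) concrete.

```python
import numpy as np

def time_domain_features(x):
    """Mean, variance, skewness and excess kurtosis, following Eqs. (6)-(9)."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()                                      # Eq. (6)
    var = x.var(ddof=1)                                # Eq. (7), divides by N - 1
    sigma = np.sqrt(var)
    skew = np.mean((x - mu) ** 3) / sigma ** 3         # Eq. (8)
    kurt = np.mean((x - mu) ** 4) / sigma ** 4 - 3.0   # Eq. (9)
    return np.array([mu, var, skew, kurt])

def welch_psd(x, fs, seg_len, overlap=0.5):
    """Welch estimate, Eq. (5): average of the per-segment periodograms.

    The Hann window, the overlap and the scaling are assumptions; the result
    is reported for non-negative frequencies only.
    """
    x = np.asarray(x, dtype=float)
    step = max(1, int(seg_len * (1.0 - overlap)))
    window = np.hanning(seg_len)
    scale = fs * np.sum(window ** 2)
    periodograms = [
        np.abs(np.fft.rfft(window * x[s:s + seg_len])) ** 2 / scale
        for s in range(0, len(x) - seg_len + 1, step)
    ]
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    return freqs, np.mean(periodograms, axis=0)

def epoch_features(x, fs=128, seg_len=128):
    """Concatenate time-domain features and the Welch PSD of one epoch."""
    _, psd = welch_psd(x, fs, seg_len)
    return np.concatenate([time_domain_features(x), psd])
```

With a 128 Hz sampling rate and 128-sample segments, the PSD has a 1 Hz resolution, so each epoch yields 4 time-domain values followed by 65 spectral values.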
Figure 6: Caption for Figure 1
Algorithm 1: Pseudo-Code of SVM-PSO Algorithm
1. Initialize the swarm of S particles (positions) and the corresponding velocity vectors.
2. For each iteration t = 1 to max iterations do:
3.    For each particle i = 1 to S (total particles) do:
4.       Evaluate the fitness of particle i as the SVM classification accuracy on a validation set.
5.       If the fitness of particle i is better than its pbest, update pbest.
6.    End For (particle loop)
7.    If any particle's fitness is better than the global best (gbest), update gbest.
8.    For each particle i and each dimension d = 1 to the number of parameters (2, for C and γ) do:
9.       Update the velocity of particle i using:
            The particle's own best position (pbest).
            The global best position (gbest) in the swarm.
10.      Update the position of particle i using the updated velocity.
11.   End For (dimension update loop)
12.   Terminate the loop early if the stopping criterion is met.
13. End For (iteration loop)
14. Train the final SVM model using the best hyperparameters C and γ obtained from gbest.
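The steps of Algorithm 1 can be sketched in Python as below. This is a hedged illustration under several assumptions that the paper does not state: an RBF kernel (implied only by the presence of γ), a search over log10(C) and log10(γ), and the inertia and acceleration coefficients `w`, `c1`, `c2`. The function names `pso_maximize` and `svm_fitness` are hypothetical, and scikit-learn's `SVC` is assumed as the SVM implementation.

```python
import numpy as np

def pso_maximize(fitness, bounds, n_particles=10, n_iters=20,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize fitness(position) with a basic global-best PSO."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    pos = rng.uniform(low, high, size=(n_particles, len(low)))
    vel = np.zeros_like(pos)
    pbest_pos = pos.copy()
    pbest_fit = np.full(n_particles, -np.inf)
    gbest_pos, gbest_fit = pos[0].copy(), -np.inf

    for _ in range(n_iters):
        # Evaluate every particle, then update personal and global bests.
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest_pos[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        best = int(np.argmax(pbest_fit))
        if pbest_fit[best] > gbest_fit:
            gbest_pos, gbest_fit = pbest_pos[best].copy(), pbest_fit[best]

        # Velocity update pulls each particle toward pbest and gbest.
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest_pos - pos) + c2 * r2 * (gbest_pos - pos)
        pos = np.clip(pos + vel, low, high)

    return gbest_pos, gbest_fit

def svm_fitness(X_train, y_train, X_val, y_val):
    """Build a fitness function: validation accuracy of an RBF SVM."""
    # Imported here so that the PSO routine above depends only on NumPy.
    from sklearn.svm import SVC

    def fitness(params):
        log_c, log_gamma = params  # assumed log10 search space
        model = SVC(kernel="rbf", C=10.0 ** log_c, gamma=10.0 ** log_gamma)
        model.fit(X_train, y_train)
        return model.score(X_val, y_val)

    return fitness
```

A possible usage, with placeholder arrays, is `best, acc = pso_maximize(svm_fitness(X_train, y_train, X_val, y_val), bounds=[(-2, 3), (-4, 1)])`, after which the final model is trained with `C = 10 ** best[0]` and `gamma = 10 ** best[1]`. Searching in log space is a common choice because useful values of C and γ span several orders of magnitude.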
Figure 12: CNN with 62.31% Accuracy
$$ P_e = P_{YES} + P_{NO} \tag{21} $$
Biomedical Signal Processing and Control 60 (2020) 101951.