Incep-EEGNet: A ConvNet for Motor Imagery Decoding

Riyad, Mouad; Khalil, Mohammed; Adib, Abdellah

doi:10.1007/978-3-030-51935-3_11

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12119)

Included in the following conference series:

International Conference on Image and Signal Processing

Abstract

A brain-computer interface connects the brain with machines, using brainwaves as a means of communication for applications that help improve human life. Unfortunately, electroencephalography, the main technique used to measure brain activity, produces noisy, non-linear, and non-stationary signals that weaken the performance of Common Spatial Pattern (CSP) techniques. Deep learning overcomes the drawbacks of the traditional techniques, but it is still not used properly. In this paper, we propose a new approach based on Convolutional Neural Networks (ConvNets) that decodes the raw signal and achieves state-of-the-art performance using an architecture based on Inception. The results show that our method outperforms the state-of-the-art filter bank common spatial patterns (FBCSP) and ShallowConvNet on dataset IIa of the BCI Competition IV.

1 Introduction

Brain-computer interfaces (BCI) link machines and human brains, with brainwaves as the means of communication, for several purposes [1]. Such a link is crucial for automating tasks such as the prediction of epileptic seizures or the detection of neurological pathologies. Brain signals are also commonly used as control signals for devices such as keyboards or joysticks, which can improve the quality of life of severely disabled patients, and for many non-medical applications such as video games, robot control, or authentication [13]. The most widely used sensor is electroencephalography (EEG), which relies on electrodes placed on the scalp to detect variations in electrical activity. The collected data are processed with signal processing techniques to keep the important features. Then, a machine learning model makes a decision depending on the use case.

The most well-known applications are related to Motor Imagery (MI) [15]. MI is a neural response produced when a person performs a movement or merely imagines it. Unfortunately, the signals are intrinsically non-stationary, non-linear, and noisy [13]. Overcoming those problems requires sophisticated algorithms that need human intervention (e.g., eye-blink elimination) and computational power, which can be constraining. Deep learning offers a solution to all the previously cited obstacles [9]. It extracts features automatically, without human engineering, and classifies them in the same process, which enables end-to-end approaches. Several other advances in activation functions, regularization, training strategies, and data augmentation have led to state-of-the-art performance in several fields [3, 7, 10]. It is also possible to explain the decisions of deep classifiers with advanced visualization methods, such as weight visualization, to discover the learned features.

In this paper, we propose a new convolutional neural network (ConvNet) architecture based on Inception for motor imagery classification. It allows the data to be processed in parallel. In our approach, we use the multivariate raw signal as input, with a bandpass filter as preprocessing. We use the same first block as [12] but with higher complexity, which increases the capacity of the network. Then, an Inception block extracts temporal features more efficiently, which improves the performance, speeds up learning despite the depth, and reduces the degradation problem [18]. To test our approach, we use dataset IIa from the BCI Competition IV [19]. As baselines, we compare with FBCSP and ShallowConvNet, which are the state-of-the-art techniques [2]. We also investigate visualization techniques to examine the ability of our network to extract relevant features.

The rest of the paper is organized as follows: We present some related works in Sect. 2. We introduce our method in Sect. 3. In Sect. 4, we evaluate the performance and visualize the learned features. Section 5 discusses the results and concludes the paper.

2 Related Works

The first notable approach was a ConvNet that uses raw EEG data for a P300 speller application [6]. It uses convolutional layers that extract temporal and spatial features, and it is inspired by Filter Bank Common Spatial Pattern (FBCSP) [2]. A convolution is performed with a kernel of size \((1,n_t)\), then another convolution with a kernel of size (C, 1), where C is the number of channels. Then, a softmax layer classifies the extracted features. [17] introduced similar architectures for MI. ShallowConvNet is a shallow ConvNet composed of the two convolutional layers followed by the classification layers. DeepConvNet is a deep architecture that includes more aggregation layers after the convolutional layers. ShallowConvNet outperforms the state-of-the-art FBCSP. [12] proposed EEGNet as a compact version of the existing methods. It relies on depthwise and separable convolutions, which reduce the number of parameters to only 796 for EEGNet-4,2. EEGNet performs worse than ShallowConvNet since it was not trained with the data augmentation (cropped training) suggested by [17]. However, cropped training takes a long time to train for a single subject compared with EEGNet, which can be problematic.

3 Method

3.1 EEG Properties and Data Representation

MI gives rise to fluctuations in the amplitude of the neural signals generated in the primary sensorimotor cortex [14]. They appear as increases and decreases of amplitude in specific frequency bands related to motor activity, called Event-Related Synchronization (ERS) and Event-Related Desynchronization (ERD), respectively. The targeted patterns lie in the \(\mu \) band, [8, 13] Hz, and the \(\beta \) band, [13, 30] Hz. As input, each trial is turned into a matrix in \( \mathbb {R}^{C \times T}\), where C is the number of electrodes and T is the number of time samples. We sample our data at 128 Hz and use the segment [0.5, 2.5] s after the cue.
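As a short worked check of the input size, the number of time samples follows from the segment length and the sampling rate given above. The symbols \(f_s\), \(t_{\text{start}}\), and \(t_{\text{stop}}\) are introduced here only for illustration:

\[ T = (t_{\text{stop}} - t_{\text{start}})\, f_s = (2.5 - 0.5)\,\text{s} \times 128\,\text{Hz} = 256 \]

With the 22 electrodes of dataset IIa, each trial is therefore a \(22 \times 256\) matrix.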

3.2 Incep-EEGNet

We propose Incep-EEGNet as it is illustrated in Fig. 1. It is a multistage ConvNet that is based on Inception [18]. It is composed as follows:

The first part is the same as EEGNet [12]. It is based on two convolutional layers that act as temporal and spatial filters, similarly to FBCSP, which is a widely used approach. We use a temporal convolutional layer with F kernels of size (1, tx) with padding. This layer learns to extract relevant temporal features, as it acts as a FIR filter. We choose a size of 32, which corresponds to a duration of 0.25 s for a signal sampled at 128 Hz. A second convolution extracts the spatial features. It relies on depthwise convolution, which produces a fixed number of feature maps per input feature map and considerably reduces the computational cost. Its kernel has a size of (C, 1), where C is the number of channels. We also use batch normalization after each convolution and an activation after the second one. This layer allows only the important electrodes to contribute to the decision and learns frequency-specific spatial filters; the depth parameter D of the depthwise convolution controls the number of connections.
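The data flow of this first block can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names and the toy sizes are assumptions, batch normalization and the activation are omitted, and the convolution is written as the cross-correlation that deep learning libraries usually compute.

```python
def temporal_conv_same(signal, kernel):
    """Slide a FIR kernel over a 1-D signal with zero padding, keeping the length."""
    k = len(kernel)
    left = (k - 1) // 2
    padded = [0.0] * left + list(signal) + [0.0] * (k - 1 - left)
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(signal))]


def spatial_filter(channels, weights):
    """Apply a (C, 1) kernel: one weight per electrode, summed at each time step."""
    n = len(channels[0])
    return [sum(w * ch[t] for w, ch in zip(weights, channels)) for t in range(n)]


def first_block(trial, temporal_kernels, spatial_weights):
    """trial is C x T; returns F*D feature maps of length T.

    spatial_weights[f] holds the D spatial filters attached to temporal
    kernel f, which is what makes the spatial convolution depthwise.
    """
    maps = []
    for f, kernel in enumerate(temporal_kernels):
        filtered = [temporal_conv_same(ch, kernel) for ch in trial]  # C x T
        for weights in spatial_weights[f]:
            maps.append(spatial_filter(filtered, weights))
    return maps


# Toy sizes for illustration only (the paper uses F = 64, D = 4, kernel size 32).
C, T, F, D = 3, 16, 2, 2
trial = [[float(c + t) for t in range(T)] for c in range(C)]
temporal_kernels = [[0.25, 0.25, 0.25, 0.25], [0.0, 1.0, 0.0, 0.0]]
spatial_weights = [[[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]] for _ in range(F)]
feature_maps = first_block(trial, temporal_kernels, spatial_weights)

# A 32-tap kernel at 128 Hz spans 32 / 128 = 0.25 s.
kernel_duration_s = 32 / 128
```

Because each temporal feature map gets its own D spatial filters, the block outputs F*D maps instead of mixing all temporal maps in every spatial filter.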

In the second part, we introduce the novelty of this architecture, which is an Inception-based block. This block addresses the drawback of EEGNet, which is too shallow and too compact; this restricts the capacity of the network and leads to overfitting in most cases. Even with a deeper network such as DeepConvNet, the performance is still low because of the degradation problem. Hence, we propose an Inception-based stage that learns features from several branches:

A convolutional branch with a convolution with a kernel size of (1, 7).
A convolutional branch with a convolution with a kernel size of (1, 9).
A branch with a pointwise convolution with a kernel size of (1, 1) and a stride of (1, 2).
A branch with an average pooling with a kernel size of

We merge the outputs of the different branches by stacking them along the feature map dimension. We then apply batch normalization and an activation. Dropout is applied only after the final activation, because we observed no improvement when using it elsewhere. Each convolutional branch includes a pointwise convolution that reduces the number of feature maps to 64 and an average pooling layer with a size of (1, 2).
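The merging step can be illustrated with a small shape-bookkeeping sketch. It rests on several assumptions: the helper names are hypothetical, the branch convolutions are taken to preserve the temporal length and are therefore skipped, the pointwise branch is given two output maps purely for illustration, and the pooling branch is left out because its kernel size is not given in the text.

```python
def avg_pool(feature_map, size):
    """Non-overlapping average pooling along time (window = stride = size)."""
    n = len(feature_map) // size
    return [sum(feature_map[i * size:(i + 1) * size]) / size for i in range(n)]


def subsample(feature_map, stride):
    """Temporal effect of a (1, 1) kernel with stride (1, stride); weights omitted."""
    return feature_map[::stride]


def merge_branches(branches):
    """Stack branch outputs along the feature-map dimension.

    Each branch is a list of feature maps; stacking only makes sense
    when every map has the same temporal length.
    """
    lengths = {len(m) for branch in branches for m in branch}
    if len(lengths) != 1:
        raise ValueError(f"temporal lengths differ: {sorted(lengths)}")
    return [m for branch in branches for m in branch]


T = 8  # toy temporal length
toy_map = [float(t) for t in range(T)]

# Two convolutional branches: 64 maps each, then average pooling of size (1, 2).
conv7_branch = [avg_pool(toy_map, 2) for _ in range(64)]
conv9_branch = [avg_pool(toy_map, 2) for _ in range(64)]
# Pointwise branch with stride (1, 2); the number of maps (2) is hypothetical.
pointwise_branch = [subsample(toy_map, 2) for _ in range(2)]

merged = merge_branches([conv7_branch, conv9_branch, pointwise_branch])
```

The pooling of size (1, 2) and the stride of (1, 2) both halve the temporal length, which is what allows the branches to be stacked.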

In the final part, we use an additional convolutional layer with \(F*D\) kernels of size (1, 5), along with batch normalization, activation, and dropout. We use a global average pooling layer to reduce the number of parameters to \(2*F\). Then, we use a softmax classification layer with 4 units that represent the 4 classes of the dataset.
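The classification head described above can be sketched as follows. The function names, weights, and toy feature maps are illustrative assumptions, not values from the trained network.

```python
import math


def global_average_pool(feature_maps):
    """Reduce each feature map to its mean over time (one value per map)."""
    return [sum(m) / len(m) for m in feature_maps]


def dense(features, weights, biases):
    """Fully connected layer: one row of weights per output unit."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Numerically stable softmax (subtract the maximum before exponentiating)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: 2 feature maps pooled to 2 features, then 4 class probabilities.
feature_maps = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
pooled = global_average_pool(feature_maps)
weights = [[0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [-0.1, 0.1]]  # hypothetical
biases = [0.0, 0.0, 0.0, 0.0]
probabilities = softmax(dense(pooled, weights, biases))
predicted_class = probabilities.index(max(probabilities))
```

Global average pooling keeps the dense layer small: its weight count depends on the number of feature maps, not on the temporal length.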

3.3 Hyperparameters and Training

Our implementation uses publicly available preprocessing code based on braindecode [17]. We trained the deep learning methods on an NVIDIA P100 1.12.0. We train our method by optimizing the categorical cross-entropy using the Adam optimizer [11] with Nesterov momentum. The dropout probability is 0.5, as advised by [3]. We use a batch size of 64, as for EEGNet [12]. We fix the network parameters to \(F=64\) and \(D=4\). The Exponential Linear Unit (ELU) is chosen as the activation [7]. We train our ConvNets as follows. We first train for 100 epochs with a learning rate (Lr) of \(5\times 10^{-4}\). At the end of this stage, we retrain the network for 50 epochs with the Lr set to \(1\times 10^{-4}\) on the merged training and validation sets. We then repeat the same operation for 30 epochs with the Lr set to \(2\times 10^{-5}\). ShallowConvNet [17] was trained in a similar way.
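The three-stage schedule can be written down as a small configuration. This is a sketch under assumptions: the names `Stage`, `SCHEDULE`, and `run_schedule` are hypothetical, the `fit` callback stands in for whatever training routine is used, and the third stage is read as also using the merged training and validation sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    epochs: int
    learning_rate: float
    merge_validation: bool  # train on training + validation data if True


SCHEDULE = (
    Stage(epochs=100, learning_rate=5e-4, merge_validation=False),
    Stage(epochs=50, learning_rate=1e-4, merge_validation=True),
    Stage(epochs=30, learning_rate=2e-5, merge_validation=True),  # assumed merged
)


def run_schedule(fit, train_set, val_set, schedule=SCHEDULE):
    """Call fit(data, epochs, learning_rate) once per stage, in order."""
    for stage in schedule:
        data = train_set + val_set if stage.merge_validation else train_set
        fit(data, stage.epochs, stage.learning_rate)


# Usage with a recording callback instead of a real training loop.
calls = []
run_schedule(lambda data, epochs, lr: calls.append((len(data), epochs, lr)),
             train_set=[0] * 8, val_set=[0] * 2)
total_epochs = sum(stage.epochs for stage in SCHEDULE)
```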

4 Experiment

4.1 Dataset

We use dataset IIa from the BCI Competition IV [19]. It contains EEG data of four MI tasks (right hand, left hand, foot, and tongue imagined movements) from nine subjects, recorded with a set of 22 electrodes placed on the scalp. The recording took place in two different sessions, where the first was defined as the training set and the second as the testing set. The subjects were asked to perform 288 MI tasks per session (72 trials for each class) after a cue. The original data is sampled at 250 Hz and filtered with a bandpass filter between 0.5 Hz and 100 Hz. We apply additional preprocessing to the data as described in [17]. We resample the signals at 128 Hz and filter them with a bandpass filter between 1 Hz and 32 Hz. We use \(20\%\) of the training set as a validation set. We use cropping as data augmentation, extracting the segments [0.3, 2.3] s, [0.4, 2.4] s, [0.5, 2.5] s, [0.6, 2.6] s, [0.7, 2.7] s post cue, only on the training set (1152 trials). The validation and testing sets contain only the [0.5, 2.5] s segment; for the validation set, this prevents leakage that can compromise the training. Therefore, the input has a shape of \(22 \times 256\).
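The cropping augmentation can be sketched as below. The helper names are hypothetical, the trial is placeholder data rather than EEG, and rounding each start time to the nearest sample is an assumption, since most of the start times do not fall exactly on the 128 Hz sample grid.

```python
FS_HZ = 128
CROP_STARTS_S = (0.3, 0.4, 0.5, 0.6, 0.7)  # seconds after the cue
CROP_LENGTH_S = 2.0


def crop_windows(fs_hz=FS_HZ, starts_s=CROP_STARTS_S, length_s=CROP_LENGTH_S):
    """Return (first, last_exclusive) sample indices of each crop, relative to the cue."""
    n = round(length_s * fs_hz)
    windows = []
    for start in starts_s:
        first = round(start * fs_hz)  # assumption: nearest sample
        windows.append((first, first + n))
    return windows


def augment(trial, windows):
    """trial is C x T with sample 0 at the cue; returns one C x n crop per window."""
    return [[channel[a:b] for channel in trial] for a, b in windows]


windows = crop_windows()
trial = [list(range(512)) for _ in range(22)]  # placeholder: 22 channels, 4 s
crops = augment(trial, windows)

# Validation and test trials keep only the [0.5, 2.5] s window.
evaluation_window = windows[CROP_STARTS_S.index(0.5)]
```

Every crop has the same 256-sample length, so augmented and non-augmented trials share the \(22 \times 256\) input shape.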

4.2 Results

To assess the performance of our method, we compare it with FBCSP, Riemannian geometry (RG) [4], Bayesian optimization (BO) [5], and ShallowNet [17]. Table 1 shows the classification results of our method and the baselines in terms of accuracy. The proposed method outperforms the baselines for several subjects (S2, S3, S5, S6, S7, S9). However, BO obtains better results for S1 and S8, while ShallowNet performs better for S4. On the other hand, FBCSP2 and RG do not achieve the best result for any subject. For a more thorough evaluation, we conduct a Wilcoxon test to evaluate the significance of the differences in mean value. It shows that our method differs significantly from BO, with \(p < 0.05\). Compared with FBCSP2 and RG, the difference is highly significant, with \(p < 0.01\).

Table 1. Classification accuracy (%) comparison of our method and the baselines

Table 2 shows the classification results of our method and the baselines in terms of kappa. Our method outperforms the baselines for most of the subjects. It only fails to outperform FBCSP1 for S2 and ShallowNet for S4. Once again, FBCSP2 and RG obtain poor results. Statistical testing shows that the increase in mean kappa is statistically significant, with \(p < 0.05\), for FBCSP1, MDRM, and ShallowNet. For the other methods, the difference is highly significant, at \(p < 0.01\).
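For readers unfamiliar with the metric, the sketch below computes kappa from a confusion matrix, assuming that the kappa reported here is Cohen's kappa. The matrix is a synthetic toy example with 72 trials per class, not a result from this paper.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true, columns: predicted)."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Synthetic 4-class matrix: 54 correct and 6 in each wrong class, per true class.
toy_confusion = [
    [54, 6, 6, 6],
    [6, 54, 6, 6],
    [6, 6, 54, 6],
    [6, 6, 6, 54],
]
toy_kappa = cohen_kappa(toy_confusion)
```

With four balanced classes the chance agreement is 0.25, so an accuracy of 0.75 corresponds to a kappa of (0.75 - 0.25) / (1 - 0.25), about 0.67.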

Table 3 and Table 4 show the confusion matrices of Incep-EEGNet and FBCSP2, respectively. Both methods have difficulty classifying the foot class. They also confuse the right-hand and left-hand classes. Overall, our method performs better than the reference.

Table 2. Kappa value comparison of our method and the baselines

Table 3. Confusion matrix of Incep-EEGNet

Table 4. Confusion matrix of FBCSP

Figure 2a shows the Fourier transform of a temporal filter learned in the first convolution, which was designed to extract the temporal features of the EEG signals. As expected, Incep-EEGNet learned the frequencies that are involved in the MI neural response. We also observe a peak at 55 Hz, which may indicate that MI is also characterized by this band, as reported by [8]. Figure 2b shows a spatial filter reconstructed by interpolation of the weights. The scale on the right ranges from 1 to \(-1\). It shows that Incep-EEGNet extracts the signals from electrodes C3, Cz, and C4. These electrodes cover the part of the brain that is responsible for the movement of the hands and the feet.
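The frequency analysis of a temporal filter can be reproduced with a plain discrete Fourier transform. The sketch below uses a hypothetical 32-tap kernel (a 12 Hz cosine) in place of learned weights, which are not available in the text; the function name is also an assumption.

```python
import cmath
import math


def magnitude_spectrum(kernel, fs_hz):
    """Return (frequency_hz, magnitude) pairs from 0 Hz up to the Nyquist frequency."""
    n = len(kernel)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(kernel[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        spectrum.append((k * fs_hz / n, abs(coeff)))
    return spectrum


FS_HZ = 128
KERNEL_LENGTH = 32

# Hypothetical kernel tuned to 12 Hz, standing in for a learned temporal filter.
toy_kernel = [math.cos(2 * math.pi * 12 * t / FS_HZ) for t in range(KERNEL_LENGTH)]
spectrum = magnitude_spectrum(toy_kernel, FS_HZ)
peak_hz = max(spectrum, key=lambda pair: pair[1])[0]
```

Note that a 32-tap kernel at 128 Hz resolves frequencies only in steps of 128 / 32 = 4 Hz, so peaks in such a plot are accurate to within a few hertz at best unless the kernel is zero-padded before the transform.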

5 Discussion and Conclusion

Designing ConvNets for BCI applications can be problematic. The existing approaches need intensive data augmentation and must be shallow, while deep ConvNets perform poorly. Therefore, we built Incep-EEGNet, a modified EEGNet with a greater number of feature maps, which increases the complexity of the model and allows it to outperform state-of-the-art methods. To limit the degradation problem, we use an Inception block whose several branches offer an efficient feature extraction layer. The pointwise convolution works as a residual connection that prevents vanishing gradient problems. Incep-EEGNet outperforms FBCSP, RG, and several ConvNets. Indeed, CSP techniques are considered state-of-the-art for their efficiency, but they have drawbacks: they are sensitive to noise and artifacts, and they need larger datasets [16]. RG relies on a representation of the data that does not take frequency features into account, as its authors point out, which lowers its performance compared with FBCSP and ConvNets. ConvNet methods perform better and faster under the same conditions if they are used wisely. The overall performance is still low for several subjects, highlighting a strong incompatibility between some subjects.

References

1. Abdulkader, S.N., Atia, A., Mostafa, M.S.M.: Brain computer interfacing: applications and challenges. Egypt. Inform. J. 16(2), 213–230 (2015). https://doi.org/10.1016/j.eij.2015.06.002
2. Ang, K.K., Chin, Z.Y., Wang, C., Guan, C., Zhang, H.: Filter bank common spatial pattern algorithm on BCI competition IV datasets 2a and 2b. Front. Neurosci. 6, 39 (2012)
3. Baldi, P., Sadowski, P.J.: Understanding dropout, p. 9
4. Barachant, A., Bonnet, S., Congedo, M., Jutten, C.: Multiclass brain-computer interface classification by Riemannian geometry. IEEE Trans. Biomed. Eng. 59(4), 920–928 (2012)
5. Bashashati, H., Ward, R.K., Bashashati, A.: User-customized brain computer interfaces using Bayesian optimization. J. Neural Eng. 13(2), 026001 (2016). https://doi.org/10.1088/1741-2560/13/2/026001
6. Cecotti, H., Graser, A.: Convolutional neural networks for P300 detection with application to brain-computer interfaces. IEEE Trans. Pattern Anal. Mach. Intell. 33(3), 433–445 (2011). https://doi.org/10.1109/TPAMI.2010.125
7. Clevert, D.A., Unterthiner, T., Hochreiter, S.: Fast and accurate deep network learning by exponential linear units (ELUs). In: International Conference on Learning Representations (ICLR) (2016)
8. Dose, H., Møller, J.S., Iversen, H.K., Puthusserypady, S.: An end-to-end deep learning approach to MI-EEG signal classification for BCIs. Expert Syst. Appl. 114, 532–542 (2018). https://doi.org/10.1016/j.eswa.2018.08.031
9. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning: Adaptive Computation and Machine Learning. The MIT Press, Cambridge (2016)
10. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: Bach, F., Blei, D. (eds.) Proceedings of the 32nd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 37, pp. 448–456. PMLR, Lille, July 2015
11. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference on Learning Representations (ICLR), December 2014
12. Lawhern, V.J., Solon, A.J., Waytowich, N.R., Gordon, S.M., Hung, C.P., Lance, B.J.: EEGNet: a compact convolutional neural network for EEG-based brain-computer interfaces. J. Neural Eng. 15(5), 056013 (2018). https://doi.org/10.1088/1741-2552/aace8c
13. Ortiz-Rosario, A., Adeli, H.: Brain-computer interface technologies: from signal to action. Rev. Neurosci. 24(5) (2013). https://doi.org/10.1515/revneuro-2013-0032
14. Pfurtscheller, G., Neuper, C.: Motor imagery and direct brain-computer communication. Proc. IEEE 89(7), 1123–1134 (2001). https://doi.org/10.1109/5.939829
15. Pfurtscheller, G., Neuper, C.: Movement and ERD/ERS. In: Jahanshahi, M., Hallett, M. (eds.) The Bereitschaftspotential: Movement-Related Cortical Potentials, pp. 191–206. Springer, Boston (2003). https://doi.org/10.1007/978-1-4615-0189-3_12
16. Reuderink, B., Poel, M.: Robustness of the common spatial patterns algorithm in the BCI-pipeline. Technical report, University of Twente (2008)
17. Schirrmeister, R.T., et al.: Deep learning with convolutional neural networks for EEG decoding and visualization. Hum. Brain Mapp. 38(11), 5391–5420 (2017). https://doi.org/10.1002/hbm.23730
18. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2818–2826. IEEE (2016)
19. Tangermann, M., et al.: Review of the BCI competition IV. Front. Neurosci. 6 (2012). https://doi.org/10.3389/fnins.2012.00055

Author information

Authors and Affiliations

Networks, Telecoms and Multimedia Team, LIM@II-FSTM, B.P. 146, 20650, Mohammedia, Morocco
Mouad Riyad, Mohammed Khalil & Abdellah Adib

Corresponding author

Correspondence to Mouad Riyad.

Editor information

Editors and Affiliations

GREYC, University of Caen Normandie, Caen, France
Abderrahim El Moataz
IRF-SIC, Faculty of Sciences, Ibn Zohr University, Agadir, Morocco
Driss Mammass
ImViA, University of Burgundy, Dijon, France
Alamin Mansouri
Math - Info, University of Quebec, Trois-Rivières, QC, Canada
Fathallah Nouboud

About this paper

Cite this paper

Riyad, M., Khalil, M., Adib, A. (2020). Incep-EEGNet: A ConvNet for Motor Imagery Decoding. In: El Moataz, A., Mammass, D., Mansouri, A., Nouboud, F. (eds) Image and Signal Processing. ICISP 2020. Lecture Notes in Computer Science, vol 12119. Springer, Cham. https://doi.org/10.1007/978-3-030-51935-3_11

DOI: https://doi.org/10.1007/978-3-030-51935-3_11
Published: 08 July 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-51934-6
Online ISBN: 978-3-030-51935-3
eBook Packages: Computer Science (R0)
