Improving multi-class EEG-motor imagery classification using two-stage detection on one-versus-one approach

Multi-class motor imagery classification based on electroencephalogram (EEG) signals in Brain-Computer Interface (BCI) systems still faces challenges, such as inconsistent accuracy and low classification performance caused by inter-subject dependency. Therefore, this study aims to improve multi-class EEG-motor imagery classification using two-stage detection and a voting scheme on a one-versus-one approach. Features were extracted from the EEG signals by computing statistical measures over narrow sliding windows. Furthermore, inter- and cross-subject schemes were investigated on BCI Competition IV-Dataset 2a to evaluate the effectiveness of the proposed method. The experimental results showed that the proposed method produced enhanced inter- and cross-subject kappa coefficients of 0.78 and 0.68, respectively, with a low standard deviation of 0.1 for both schemes. These results further indicate that the proposed method is able to address inter-subject dependency, which supports promising and reliable BCI systems.


Introduction
A Brain-Computer Interface (BCI) is a computerized system that translates brain signals, recorded with an electroencephalogram (EEG) [3], into commands that are relayed to an external device to carry out specific actions [1,2]. The motor imagery (MI) task is one of the most studied and promising types of EEG signals in BCI systems [4,5]. An MI-based BCI is designed to detect and translate the user's imagination of a motor task, such as the hand movement needed to roll a wheelchair forward or to move a cursor left or right [6,7]. Several studies have shown that MI provides a promising means of control and communication for people with motor disabilities [3,8] and plays an essential role in learning and rehabilitating motor skills, as well as in the control of prostheses [9].
Two of the most significant challenges in the detection of MI tasks are the efficient extraction and correct classification of EEG features [2,10,11]. The extraction process remains a challenge because the signal is naturally noisy [12] and non-stationary [13,14], with intra- and inter-subject dependencies [6,15,16] that affect the classification results [12]. Furthermore, existing studies show several remaining problems, such as subject-dependent or inconsistent accuracy among subjects, and low performance in multi-class EEG-MI classification [17][18][19][20][21].
A single classifier has a common drawback in handling the nonlinear, noisy and outlier-laden nature of EEG signal data [22]. For instance, most single classifiers rely on the feature extraction and selection technique and therefore perform poorly when the features overlap with one another [23]. Despite this major disadvantage, several studies have employed single classifiers, such as the least squares classifier (LSC) [17] with novel feature extraction and linear discriminant analysis (LDA) [18] with a fusion approach. Rodrigues et al. [17] proposed the space-time recurrent (STR) approach and LSC as feature extraction and classifier, respectively, for tackling multi-class EEG-MI classification using BCI Competition IV-Dataset 2a. Their study gained high performance in 2-class classification but a low kappa coefficient in 4-class classification. Razi et al. [18] employed fusion theory, also known as the Dempster-Shafer theory (DST). In their study, LDA was employed as a single classifier, and the one-versus-one, one-versus-rest and two-versus-two processes were fused with DST to gain a promising kappa coefficient of 0.75. However, their proposed method showed a less consistent performance because several subjects gained a low kappa coefficient.
Several studies attempted to employ deep learning, with a convolutional neural network as the main scheme, to overcome the limitations of single classifier performance [19][20][21]. Despite the promise of deep learning in many areas, multi-class EEG-MI classification has still not reached excellent performance. Deep learning has several drawbacks, such as the need for large amounts of training data and for hyper-parameters or configurations that must be properly tuned. Furthermore, many previous studies [19][20][21] used different hyper-parameters and configurations of the deep learning model for every subject. Therefore, based on the performance of previous studies, multi-class EEG-MI classification is still open to improvement, since its accuracy and kappa coefficient remain far from excellent [19].
This study aims to address the remaining gaps in multi-class EEG-MI classification, which, according to numerous studies, still relies on the feature extraction and selection method and yields lower performance than two-class EEG-MI classification. Meanwhile, the enhancement of multi-class EEG-MI classification via ensemble learning has not been properly explored. Therefore, this study proposed the use of a hybrid classifier and a one-versus-one (OvO) approach for multi-class EEG-motor imagery (MI) classification. In the hybrid classifier, a two-stage detection was used, with Linear Discriminant Analysis (LDA) as the first stage detector and k-nearest neighbors (kNN) and Gradient Boosted Tree (GBT) as the second stage detectors. Furthermore, seven statistical features were used for feature extraction, together with a data transformation approach also known as channel instantiation [8].
This study is organized as follows. Section 2 explains the materials and methods used in this research, which comprise a dataset description and the proposed method. Section 3 presents the quantitative results in comparison to previous studies, followed by further discussion. Finally, Section 4 concludes the research.

Dataset
This research used BCI Competition IV-Dataset 2a, which consists of EEG data from 9 subjects [24]. The cue-based BCI paradigm consisted of four different motor imagery tasks: the imagination of movement of the left hand (class 1), right hand (class 2), feet (class 3), and tongue (class 4). Two sessions were recorded on different days for each subject. In each session, six runs were carried out, separated by short breaks. One run consisted of 48 trials (12 for each of the four possible classes), giving 288 trials per session.
In addition, twenty-two electrodes were used to record the EEG signals monopolarly, with the left mastoid serving as reference and the right mastoid as ground. The signals were sampled at 250 Hz and band-pass filtered between 0.5 and 100 Hz. Furthermore, the sensitivity of the amplifier was set to 100 µV, and an additional 50 Hz notch filter was used to suppress line noise. This study used 2 out of 3 seconds of the recorded EEG-MI signal, comprising 500 data points.

The Proposed Method
The proposed method, NWFE+OvO-TSD, combines narrow window feature extraction (NWFE) with a hybrid classifier on a one-versus-one (OvO) approach to enhance the performance of multi-class EEG-MI classification. Furthermore, an ensemble technique was applied through the hybrid classifier. This study utilized two-stage detection (TSD), with Linear Discriminant Analysis (LDA) as the first stage detector and kNN and Gradient Boosted Tree (GBT) as the second stage detectors. All of these combinations were employed in the one-versus-one (OvO) technique, which is commonly used to tackle multi-class classification. Figs. 1-3 show the block diagram and modeling scheme of the proposed method.
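The one-versus-one decomposition can be illustrated with a short sketch. It only shows how four classes are split into pairwise problems and how the pairwise winners are combined by voting. It does not reproduce the two-stage detectors. The class abbreviations follow the L/R/F/T notation used in the results, while the function names, the example winners, and the tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

CLASSES = ["L", "R", "F", "T"]  # left hand, right hand, feet, tongue


def ovo_pairs(classes):
    """Return every unordered class pair; 4 classes give 6 binary problems."""
    return list(combinations(classes, 2))


def ovo_vote(pairwise_winners):
    """Combine the winners of the pairwise detectors by majority vote.

    Assumption: ties go to the class that reached the top count first.
    """
    return Counter(pairwise_winners).most_common(1)[0][0]


pairs = ovo_pairs(CLASSES)
# Hypothetical winner of each pairwise detector for one trial,
# listed in the same order as `pairs`.
winners = ["L", "F", "L", "F", "R", "F"]
decision = ovo_vote(winners)
print(pairs)
print(decision)
```

In this sketch each binary problem would be handled by its own detector, so a 4-class task requires six detectors in total.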
The following are the detailed steps of the proposed method:

1) Data pre-processing
The data pre-processing consisted of two stages. Firstly, a time window from second 3 to second 5 (2 seconds) was selected, which consisted of 500 data points at a sampling frequency of 250 Hz. Secondly, this time window was split into 10 narrow windows, each consisting of 50 data points.

2) Feature extraction
After filtering and windowing, seven statistical measures were extracted from each window: root mean square, mean absolute value, standard deviation, skewness, kurtosis, coefficient of variation, and variance-to-mean ratio.
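The windowing and the seven statistical measures can be sketched as follows for a single channel. This is a minimal illustration rather than the exact implementation used in this study. The population standard deviation, the non-excess definition of kurtosis, and the function names are assumptions, and the input is a synthetic signal with a non-zero mean so that the two ratio-based measures stay well defined.

```python
import numpy as np


def split_windows(segment, n_windows=10):
    """Split a 1-D segment (e.g. 500 samples) into equal narrow windows."""
    segment = np.asarray(segment, dtype=float)
    return segment.reshape(n_windows, -1)


def window_features(w):
    """Seven statistical measures of one window, in the order listed above."""
    mean = w.mean()
    std = w.std()  # population standard deviation (ddof=0), an assumption
    centered = w - mean
    rms = np.sqrt(np.mean(w ** 2))            # root mean square
    mav = np.mean(np.abs(w))                  # mean absolute value
    skew = np.mean(centered ** 3) / std ** 3  # skewness
    kurt = np.mean(centered ** 4) / std ** 4  # non-excess kurtosis, an assumption
    cv = std / mean                           # coefficient of variation
    vmr = std ** 2 / mean                     # variance-to-mean ratio
    return np.array([rms, mav, std, skew, kurt, cv, vmr])


def channel_features(segment, n_windows=10):
    """Concatenate the measures of all windows: 10 windows x 7 = 70 values."""
    windows = split_windows(segment, n_windows)
    return np.concatenate([window_features(w) for w in windows])


rng = np.random.default_rng(0)
segment = rng.normal(loc=5.0, scale=1.0, size=500)  # synthetic stand-in signal
features = channel_features(segment)
print(features.shape)
```

The coefficient of variation and the variance-to-mean ratio divide by the window mean, so they become unstable when that mean is close to zero. How this case was handled is not stated here, and the sketch does not guard against it.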

3) Data Transformation
In the proposed method, the data were transformed into a channel-trial dataset based on the channel-trial instantiation approach. In this transformation, the EEG channels were converted into rows and the statistical measures of each window were converted into columns, as shown in Fig. 3.

4) Modeling
The model was built with the hybrid classifier over the one-versus-one approach, as shown in Fig. 4.

5) Conducting a voting scheme and calculating its accuracy and standard deviation
The voting scheme was carried out because each trial had many instances as a result of the channel-trial instantiation scheme, so the instance-level results had to be combined into one decision for each trial. Fig. 4 shows the two datasets, namely training and testing, used to build and evaluate the model, respectively. The evaluation was based on the results for the testing dataset, from which the kappa coefficient and accuracy were determined.
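The channel-trial instantiation and the per-trial voting described in the steps above can be sketched as follows. The sketch assumes that each channel of a trial becomes one instance (row) whose columns hold the window statistics, and that the final decision of a trial is the majority label over its instances. The array shapes, function names, tie-breaking rule, and synthetic predictions are illustrative assumptions.

```python
import numpy as np


def instantiate_channels(trial_features, trial_labels):
    """Turn (n_trials, n_channels, n_features) into one row per channel.

    Returns the instance matrix, the label of each instance, and the index
    of the trial each instance came from.
    """
    n_trials, n_channels, n_features = trial_features.shape
    X = trial_features.reshape(n_trials * n_channels, n_features)
    y = np.repeat(trial_labels, n_channels)
    trial_ids = np.repeat(np.arange(n_trials), n_channels)
    return X, y, trial_ids


def vote_per_trial(instance_preds, trial_ids):
    """Majority vote over the instances of each trial.

    Assumption: ties go to the smallest label.
    """
    decisions = []
    for t in np.unique(trial_ids):
        labels, counts = np.unique(instance_preds[trial_ids == t],
                                   return_counts=True)
        decisions.append(labels[np.argmax(counts)])
    return np.array(decisions)


# Synthetic stand-in: 3 trials, 22 channels, 70 features per channel.
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 22, 70))
labels = np.array([1, 2, 3])
X, y, trial_ids = instantiate_channels(feats, labels)

# Hypothetical instance-level predictions: 5 of the 22 instances of the
# first trial are misclassified, the rest are correct.
preds = y.copy()
preds[:5] = 4
trial_decisions = vote_per_trial(preds, trial_ids)
print(X.shape, trial_decisions)
```

Because every trial contributes 22 instances, a minority of misclassified channels does not change the trial-level decision, which is the motivation for the voting step.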

Results and Discussion
As shown in Table 1, three experiments were conducted in this research to evaluate how the proposed method handles the subject-dependent problem. Two experiments addressed the inter-subject scheme, for 2-class and 4-class classification. Meanwhile, one experiment was dedicated to the cross-subject scheme, which is more challenging in EEG-MI classification.
Training data schemes in Table 1: (1) common training data + 30% own testing data; (2) common training data + 30% common testing data.
In the inter-subject scheme, the training datasets from BCI Competition IV-Dataset 2a were combined into common training data, together with 30% of the testing data of the respective subject. Meanwhile, in the cross-subject scheme, all training data and 30% of the testing data of all subjects were combined into common training data. Therefore, in the cross-subject scheme, the training dataset was identical for all subjects, and inter-session detection was implemented because of the difference between the training and testing data.
Fig. 4. Modeling scheme employing a hybrid classifier over a one-versus-one approach, followed by a voting scheme as the final detection.
In Experiment #1, 2-class classification was carried out with the inter-subject scheme. Table 2 shows that the classifications involving the left hand (L) gained accuracies below 90%, while those not involving it, namely R/F, R/T, and F/T, had excellent accuracies of 90.28%, 91.90%, and 91.28%, respectively, as indicated in bold font. In Experiment #2, 4-class classification was carried out with the inter-subject scheme. Table 3 presents the classification results in terms of both accuracy and kappa coefficient. Table 3 shows that S2 gained the highest results, followed by S8, with accuracy >= 90% or kappa coefficient >= 0.9; both are marked in bold font. Meanwhile, S7 and S6 gained the lowest accuracies, with 72.22% and 72.92%, respectively. Overall, however, the proposed method yielded a good accuracy of 82.68% (kappa coefficient = 0.78).
This second finding indicated that the proposed method is promising, since many previous studies achieved accuracies even below 80%.
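For reference, the kappa coefficient reported alongside accuracy can be computed from a confusion matrix with the standard Cohen's kappa formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement (accuracy) and p_e is the agreement expected by chance. The sketch below is a generic illustration. The confusion matrix is hypothetical and is not taken from the tables of this study.

```python
import numpy as np


def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true, cols: predicted)."""
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    p_o = np.trace(confusion) / total  # observed agreement (accuracy)
    # Chance agreement: sum over classes of (row share) * (column share).
    p_e = np.sum(confusion.sum(axis=0) * confusion.sum(axis=1)) / total ** 2
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical 4-class confusion matrix with 24 trials per true class.
cm = np.array([[20, 2, 1, 1],
               [3, 18, 2, 1],
               [1, 1, 21, 1],
               [0, 2, 2, 20]])
accuracy = np.trace(cm) / cm.sum()
kappa = cohen_kappa(cm)
print(round(accuracy, 4), round(kappa, 4))
```

When the four true classes are perfectly balanced, as in this hypothetical matrix, the chance agreement is 0.25 and kappa reduces to (accuracy - 0.25) / 0.75. With unbalanced test sets the two quantities are no longer linked by this simple relation.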
Experiment #3 was a more challenging task, with 4-class classification and the cross-subject scheme. Table 4 presents the classification results, which show a promising overall performance, with an overall average accuracy above 70% (73.75%) and a kappa coefficient of nearly 0.7 (0.68). This third finding indicated that the proposed method is promising for multi-class EEG-MI classification. Furthermore, to evaluate the competitiveness of the proposed method, a comparison with prior research was carried out, as shown in Table 5. In this comparison between Experiment #2 and previous studies using the same scheme (inter-subject scheme), bold font indicates the best evaluation score for each subject. The proposed method achieved the second-best overall average accuracy and outperformed all previous studies on 2 out of 9 subjects. This finding indicated that the proposed method is promising in multi-class EEG-MI classification and can still be improved. The dot plots in Fig. 5 show that the proposed method (Ex#2) was still competitive and detected consistently in comparison with the 8 other previous studies. To corroborate the effectiveness and competitiveness of the proposed method, the kappa coefficient was also compared, as shown in Table 6. Table 6 shows that the proposed method outperformed prior research on 3 out of 9 subjects (as marked in bold font). Overall, the average kappa coefficient of the proposed method was higher, and its standard deviation lower, than those of prior research. These findings corroborated the effectiveness and competitiveness of the proposed method in multi-class EEG-MI classification. The dot plot in Fig. 6 shows that the proposed method performs better than prior research.
Table 7. Comparison to previous research on accuracy for BCI Competition IV-Dataset 2a (4-class) in Experiment #3 with cross-subject and inter-session scheme
Table 7 shows that the performance of Experiment #3 (cross-subject scheme) was better than that of previous related research; the cross-subject scheme has so far received less attention in EEG-MI classification. According to [20], the cross-subject scheme is promising in EEG-MI classification because the signals are subject-dependent, which leads to the need for reliable BCI systems [21]. Prior research [21] performed a cross-subject scheme by employing deep learning in the form of a convolutional neural network (CNN). Table 7 shows that the proposed method outperformed previous studies on 8 out of 9 subjects (marked in bold font). This finding corroborated that the proposed method handled the subject-dependent problem under various schemes with acceptable accuracies.
The dot plot in Fig. 7 indicates that the detection spread of the proposed method was narrower than that of prior research. The simple approach of the proposed method outperformed the more complex CNN-based schemes (MCNN and CCNN). A significance test, the Bonferroni-Dunn test, was performed to compare the proposed method with related previous studies [25,26].
To carry out the Bonferroni-Dunn test, the methods needed to be ranked on six statistical measures of their performance. The six statistical measures consisted of the range of the kappa coefficient (max-min), first quartile (Q1), third quartile (Q3), mean absolute deviation (MAD), coefficient of variation (CV), and coefficient of quartile variation (CQV). Table 8 shows the six statistical measures and their ranks for all methods.
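The six measures can be computed from the per-subject kappa coefficients of one method as sketched below. This is only an illustration under stated assumptions: the quartiles use linear interpolation, the CV uses the sample standard deviation, MAD is taken around the mean, and the per-subject values in the example are hypothetical rather than taken from Table 8. The ranking helper further assumes that smaller is better for range, MAD, CV and CQV, and that larger is better for Q1 and Q3.

```python
import numpy as np


def consistency_measures(kappas):
    """Six spread/location measures of one method's per-subject kappa values."""
    k = np.asarray(kappas, dtype=float)
    q1, q3 = np.percentile(k, [25, 75])  # linear interpolation, an assumption
    mean = k.mean()
    return {
        "range": k.max() - k.min(),
        "Q1": q1,
        "Q3": q3,
        "MAD": np.mean(np.abs(k - mean)),  # mean absolute deviation around the mean
        "CV": k.std(ddof=1) / mean,        # sample standard deviation, an assumption
        "CQV": (q3 - q1) / (q3 + q1),
    }


def rank_methods(values, higher_is_better):
    """Rank methods on one measure; rank 1 is best. Ties are not averaged."""
    order = np.argsort(values)
    if higher_is_better:
        order = order[::-1]
    ranks = np.empty(len(values), dtype=int)
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks


# Hypothetical per-subject kappa values of a single method.
m = consistency_measures([0.5, 0.6, 0.7, 0.8, 0.9])
# Hypothetical ranges of three methods: a smaller range ranks better.
range_ranks = rank_methods(np.array([0.4, 0.1, 0.3]), higher_is_better=False)
print(m)
print(range_ranks)
```

Averaging the ranks of each method over the six measures then yields the average ranks that enter the critical difference comparison.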
The proposed method (Experiment #2) obtained the lowest, that is, the best rank compared to prior research. This initial finding indicated that the proposed method is more consistent than prior research. These ranks were used in the Bonferroni-Dunn test with the critical difference (CD) plot, as shown in Fig. 8.
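The critical difference used in such a plot is commonly computed as CD = q_alpha * sqrt(k(k+1) / (6N)), where k is the number of compared methods and N is the number of items over which the ranks are averaged, which here would presumably be the six statistical measures. For the Bonferroni-Dunn test, q_alpha is the two-sided normal quantile with the significance level divided by k-1. The sketch below follows this textbook formulation. The function names and the average ranks in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def bonferroni_dunn_q(k, alpha=0.05):
    """Two-sided normal critical value with Bonferroni correction for k-1 comparisons."""
    return NormalDist().inv_cdf(1.0 - alpha / (2 * (k - 1)))


def critical_difference(k, n, alpha=0.05):
    """CD for k methods whose ranks are averaged over n items."""
    return bonferroni_dunn_q(k, alpha) * sqrt(k * (k + 1) / (6.0 * n))


def differs_from_control(avg_ranks, control, n, alpha=0.05):
    """Flag the methods whose average rank differs from the control by more than CD."""
    cd = critical_difference(len(avg_ranks), n, alpha)
    return {name: abs(rank - avg_ranks[control]) > cd
            for name, rank in avg_ranks.items() if name != control}


# Hypothetical average ranks of three methods over six measures.
avg_ranks = {"proposed": 1.2, "method_a": 1.9, "method_b": 2.9}
cd = critical_difference(k=3, n=6)
flags = differs_from_control(avg_ranks, control="proposed", n=6)
print(round(cd, 3), flags)
```

A method is considered significantly different from the control only when the gap between their average ranks exceeds CD, which is what the connecting line in a CD plot visualizes.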
The method with STR+LSC [17] was outside the thick line, meaning that it is the only study with a statistically significant difference. This finding indicated that the proposed method is competitive with previous research. Another significance test was performed to evaluate the cross-subject results using the six statistical measures, as shown in Table 9.
The result showed that the proposed method gained the lowest rank, meaning that it ranked better than the previous studies. However, the CD plot produced by the Bonferroni-Dunn test showed no significant difference among the methods, as shown in Fig. 9. All findings from Experiments #2 and #3 indicated that the proposed method is effective, consistent and able to handle subject dependency. Furthermore, the cross-subject scheme is a promising and important task in multi-class EEG-MI classification [21]. However, the cross-subject scheme is still open to improvement, since its accuracy was found to be below 80%.
The effectiveness of the proposed method stems from a strategy that combines feature extraction with a classification scheme. Firstly, the narrow windows and their combination tackle the nature of the EEG signal, which is non-stationary and subject-dependent [27]. Secondly, higher-order statistics (skewness and kurtosis) [3] are used as statistical features, together with the mean absolute value and root mean square [8,28]. Finally, the third component is a combination of several classifiers through an ensemble technique, which helps to derive consistent and better classification results [29].

Conclusion
In conclusion, this study proposed the one-versus-one (OvO) approach for multi-class EEG-motor imagery (MI) classification using two-stage detection as a classifier. This two-stage detection process employed Linear Discriminant Analysis (LDA) as the first stage detector, while kNN and Gradient Boosted Tree (GBT) were used in the second stage. The experimental results showed that, in the inter-subject scheme, the proposed method achieved an accuracy above 80% (82.68%) and a kappa coefficient of 0.78, with a low standard deviation of 0.1. In addition, the proposed method showed consistent detection among all subjects, as indicated by the low standard deviation and by ranking first on the six statistical measures used to evaluate consistency. Further analysis with the dot plots and the Bonferroni-Dunn test corroborated the effectiveness and competitiveness of the proposed method compared to related works. Therefore, it can be concluded that the proposed method is able to address the inter-subject dependency problem in multi-class EEG-MI classification and to enhance the detection performance. In future works, a reduction of EEG channels is needed to reduce the amount of processed data and the time consumption, and the model should be applied to two-class EEG-MI classification to test the robustness of the proposed method.