Feature Selection Techniques for the Analysis of Discriminative Features in Temporal and Frontal Lobe Epilepsy: A Comparative Study


The Open Biomedical Engineering Journal 07 Jun 2021 RESEARCH ARTICLE DOI: 10.2174/1874120702115010001



Because about 30% of epileptic patients suffer from refractory epilepsy, an efficient automatic seizure prediction tool is in great demand to improve their life quality.


In this work, time-domain discriminating preictal and interictal features were efficiently extracted from the intracranial electroencephalogram of twelve patients, i.e., six with temporal and six with frontal lobe epilepsy. The performance of three types of feature selection methods was compared using Matthews’s correlation coefficient (MCC).


Kruskal Wallis, a non-parametric approach, was found to perform better than the other approaches owing to its simple and less resource-consuming strategy, while maintaining the highest MCC score. The impact of dividing the electroencephalogram signals into various sub-bands was investigated as well. The highest performance of Kruskal Wallis may suggest considering the importance of univariate features like complexity and interquartile range (IQR), along with autoregressive (AR) model parameters and the maximum (MAX) cross-correlation to efficiently predict epileptic seizures.


The proposed approach has the potential to be implemented on a low power device by considering a few simple time domain characteristics for a specific sub-band. It should be noted that, as there is not a great deal of literature on frontal lobe epilepsy, the results of this work can be considered promising.

Keywords: Temporal lobe epilepsy, Frontal lobe epilepsy, Time domain features, Intracranial EEG, Feature selection, Matthews’s correlation coefficient.


Epilepsy is the second most common and devastating neurologic disease, which affects over 70 million people around the world [15]. For some patients, it can be managed with antiepileptic medications or surgery. However, 20 to 30% of them would likely get worse after the initial diagnosis, and some may even become refractory to the current medicine [4, 5]. Anticipating seizures enough in advance could allow patients and/or caretakers to take appropriate actions and therefore, reduce the risk of injury [6].

Seizure is an irregular neural activity in the form of a sudden uncontrolled electrical discharge in the cortical brain regions. As a result, a collection of nerve cells start firing excessively and synchronously. People with frequent and unprovoked seizures are usually diagnosed as epileptics [7], [8].

Intracranial EEG (iEEG) is an electroencephalography recording that uses intracranial electrodes implanted in the brain and requires a surgical procedure [4]. Compared to scalp recordings, intracranial EEG is less noisy and seizures can typically be identified earlier [6, 9].

The International League Against Epilepsy (ILAE) divided epileptic seizures into partial (or focal) and generalized seizures. Focal seizures originate in a limited region of the brain and may spread to other regions. On the other hand, generalized seizures are initiated in bilateral hemispheric areas and quickly propagate to all cortical areas [10]. Though there are different forms of seizures, we focused on those that are focal, mainly in the temporal and frontal lobes, referred to as Temporal Lobe Epilepsy (TLE) and Frontal Lobe Epilepsy (FLE), respectively.

TLE is one of the most prevalent forms of focal epilepsy. It has received a significant amount of attention from neurologists due to its high likelihood of clinical occurrence [11–14]. This kind of epilepsy may be treated medically at the onset of the disease with different antiepileptic drugs [14].

FLE is the second most common form of focal epilepsy after TLE and accounts for 25% of epilepsy [15–18]. FLE seizures, compared to TLE seizures, tend to be brief, drug-resistant, more problematic, and to occur during sleep. Furthermore, surgery for FLE has poorer outcomes than for TLE; as a result, the surgical workup of FLE is even more demanding [15, 19], [20]. Diagnosis of FLE is rather hard because its symptoms resemble those of sleep disorders, night terrors, and psychiatric diseases [19]. Regarding the detection of FLE, some works have been published that analyze various signals in the body such as EEG, ECG (Electrocardiography), EMG (Electromyography), and EOG (Electrooculography) [19], [21–25]. One report about the prediction of frontal lobe epilepsy in WAG/Rij rats was published [15] and, to the best of the authors’ knowledge, no previous studies have been reported on the prediction of frontal lobe epilepsy in humans. Therefore, implementing a prediction system for FLE is crucial given the lack of supervision at night.


Designing and implementing a reliable forecasting and early warning tool that can help epileptic individuals to take appropriate drugs during an early warning period is, therefore, vital [27–29]; it will significantly improve their life quality. Furthermore, since portable devices are so widely available in daily life, targeting tools that can be easily implemented in such devices is the main objective of this study.

To this aim, the tool should enhance the daily life of the patients and increase their autonomy. However, knowing that:

  • for new users, the application will have to be frequently updated during its first uses to integrate the new patient data efficiently,
  • some patients may not have regular access to wireless connections and/or computers/tablets,

the chosen strategy was a dual-mode operation: the training/update is performed on the portable device itself while the application keeps working in prediction mode. Therefore, to ensure efficient online training/prediction, it is crucial to shorten the training CPU (Central Processing Unit) time while keeping the tool operation as simple as possible. The tool has to integrate these constraints.

One limitation of published works in this field is the small amount of data used. Some authors studied just one minute of data [30] or used a limited amount of data: 5 min preictal and 10 min interictal [31]. Another notable limitation of existing works is filtering the EEG signal with a band-pass filter that removes the high-frequency sub-bands, which are very important in seizure prediction [32, 33]. The plan here is to use a wide range of frequencies (up to 120 Hz) and to consider all the available data for the selected patients.

Several feature extraction techniques have been introduced in the last few years; among them, the time-domain ones are the earliest recommended methods. Time-domain features are employed to obtain discriminative information at a low computational cost. The obtained features are then fed to feature selection methods [26]. High-quality features can be defined as those that offer maximum class separability, robustness, and low computational complexity, a key parameter when targeting portable devices. In this research, the general impact of iEEG signal variations on 16 commonly used features was investigated. However, little evidence has been reported on the effectiveness of various feature selection methods on EEG data of epilepsy patients. Therefore, it is necessary to explore their differences and find which one works best, with a view to deploying such an algorithm in an implantable medical device that uses linear features, which allow rapid, low-complexity calculation for seizure prediction.


The steps of the proposed method are illustrated in Fig. (1). In the first phase, as detailed in the appendix, six EEG signals are preprocessed and 16 features are extracted (these features being adopted from previous studies). Then, the data are divided into train and test sets and three kinds of feature selection methods are employed to reduce the data dimension, making the approach computationally efficient. Next, the obtained results are tested by a well-known judging classifier, namely Random Forest. The top 30 features ranked by each method are fed into the Random Forest classifier, while Matthews’s correlation coefficient (MCC) is used to analyse the performance. Finally, the relevant features among the top 30 are retained for the winning feature selection method.

Fig. (1). An overview of the proposed work. D is considered as dimension of the data.
Fig. (2). An overview of an EEG signal containing seizure for a patient suffering from the frontal (on the top) and another patient with the temporal lobe epilepsy (at the bottom). The seizure period is highlighted in red [3].

2.1. EEG Database

The EEG dataset employed in this research is from the University Hospital of Freiburg, Germany. The EEG-database consists of two sets of files: “preictal (pre-seizure) data,” i.e. epileptic seizures with at least 50 min preictal data, and “interictal data,” which contains about 24 hours seizure-free EEG-recordings. The EEG-database comprises six iEEG electrodes from 21 patients with a sampling rate of 256 Hz. In this work, the participants are divided into two groups based on the origin of their seizures. In this study, 12 epilepsy patients (six TLE with the hippocampal and origins (134 hours) and six FLE with neocortical origin (137 hours)), i.e., the set of data available in the Freiburg database (Appendix B) were retained [34].

In Fig. (2), one hour of single-channel iEEG data is depicted for a patient with frontal lobe epilepsy and for a patient with temporal lobe epilepsy. As one can see, the frontal lobe seizure (lasting about 7 seconds) is shorter than the temporal one (lasting about 91 seconds). In addition, the morphology of the signal over a given period of time is completely different for each type of epilepsy. In fact, one can observe some hints in the preictal stage of the temporal recording that are absent from the frontal one.

2.2. Feature Extraction

The preliminary stage in EEG signal analysis is preprocessing. To decrease the effect of factors that cause baseline differences among the different recordings within the dataset and to remove the signal DC component, iEEG signals were normalized using the Z-score (expressed in terms of standard deviations from their means). The only potential artifact that could be addressed was the harmonic power line interference at 50 Hz. The 50 Hz interference was indirectly eliminated by the sub-band filtering, in which the Gamma band was divided into two sub-bands.

To deal with the imbalanced dataset, each hour of EEG signal was divided into 2-second non-overlapping windows for the interictal section, while the preictal section was split into windows of the same size with 50% overlap. Various univariate linear measures were extracted at each two-second epoch along with a bivariate linear measure, as reported in Table 1 [35–39]. These time-domain features, well known in seizure forecasting, are explained in detail in Appendix A.
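The windowing scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `zscore` and `sliding_windows` are hypothetical, the signal is random data, and applying the Z-score to the whole recording before windowing is an assumption, since the text does not say at which stage the normalization is applied.

```python
import numpy as np

FS = 256       # sampling rate in Hz, as stated for the database
WIN_SEC = 2    # window length in seconds

def zscore(signal):
    """Normalize a 1-D signal to zero mean and unit standard deviation."""
    signal = np.asarray(signal, dtype=float)
    return (signal - signal.mean()) / signal.std()

def sliding_windows(signal, fs=FS, win_sec=WIN_SEC, overlap=0.0):
    """Split a 1-D signal into fixed-length windows with a fractional overlap."""
    win = int(fs * win_sec)
    step = int(win * (1.0 - overlap))
    n_windows = (len(signal) - win) // step + 1
    return np.stack([signal[i * step: i * step + win] for i in range(n_windows)])

# Hypothetical 60-second single-channel recording (random data, illustration only).
rng = np.random.default_rng(0)
x = zscore(rng.normal(loc=5.0, scale=3.0, size=60 * FS))

interictal_windows = sliding_windows(x, overlap=0.0)  # non-overlapping windows
preictal_windows = sliding_windows(x, overlap=0.5)    # 50% overlapping windows
print(interictal_windows.shape, preictal_windows.shape)
```

With 50% overlap the same stretch of signal yields roughly twice as many windows, which is how the overlap helps to compensate for the shorter preictal recordings.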

Table 1.
Features extracted from each 2-second sliding window.
S. No. | Feature | Comments | Computation time per channel (s)
1 | Energy | One feature (36D)¹ | 0.001593
2 | Mean | One feature (36D) | 0.018089
3 | Variance (VAR) | One feature (36D) | 0.011741
4 | Skewness | One feature (36D) | 0.181114
5 | Kurtosis | One feature (36D) | 0.014463
6 | Interquartile range (IQR) | One feature (36D) | 0.036152
7 | Zero Crossing Rate (ZCR) | One feature (36D) | 0.007410
8 | Mean Absolute Deviation (MAD) | One feature (36D) | 0.010459
9 | Entropy | One feature (36D) | 1.168083
10 | Hjorth mobility | One feature (36D) | 0.007360
11 | Hjorth complexity | One feature (36D) | 0.003935
12 | Coefficient of Variation (CoV) | One feature (36D) | 0.010528
13 | Root Mean Square (RMS) | One feature (36D) | 0.171703
14 | MAX² cross correlation | 15 values for 6 channels, but one feature was considered (90D) | 0.087968 (between two channels)
15 | AR³ model | Two features⁴ (72D) | 0.026372
¹ 36D: 36-dimensional. ² MAX: Maximum. ³ AR: Autoregressive. ⁴ One coefficient and an error term.
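To make a few of the univariate measures in Table 1 concrete, the sketch below computes IQR, MAD, ZCR, and the Hjorth parameters for one window. The definitions used are the common textbook ones and are an assumption here; the exact definitions adopted in the study are those of Appendix A. All function names are hypothetical.

```python
import numpy as np

def iqr(x):
    """Interquartile range: 75th percentile minus 25th percentile."""
    q75, q25 = np.percentile(x, [75, 25])
    return q75 - q25

def mean_abs_deviation(x):
    """Mean absolute deviation around the mean."""
    return np.mean(np.abs(x - np.mean(x)))

def zero_crossing_rate(x):
    """Fraction of consecutive sample pairs whose signs differ."""
    signs = np.sign(x)
    return np.mean(signs[:-1] != signs[1:])

def hjorth_mobility(x):
    """Square root of the variance ratio between the first difference and the signal."""
    return np.sqrt(np.var(np.diff(x)) / np.var(x))

def hjorth_complexity(x):
    """Mobility of the first difference divided by the mobility of the signal."""
    return hjorth_mobility(np.diff(x)) / hjorth_mobility(x)

# Hypothetical 2-second window at 256 Hz: a pure 10 Hz sine wave.
t = np.arange(512) / 256.0
window = np.sin(2 * np.pi * 10 * t)

features = {
    "IQR": iqr(window),
    "MAD": mean_abs_deviation(window),
    "ZCR": zero_crossing_rate(window),
    "Hjorth mobility": hjorth_mobility(window),
    "Hjorth complexity": hjorth_complexity(window),
}
print(features)
```

For a pure sine wave the Hjorth complexity is close to 1, its minimum; richer waveforms give larger values.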

Prior to feature extraction, six band-pass FIR (Finite Impulse Response) filters were utilized to divide the iEEG signals into different frequency bands: Delta (0.5-4 Hz), Theta (4-8 Hz), Alpha (8-12 Hz), Beta (12-30 Hz), as well as two Gamma bands, namely low-Gamma (30-47 Hz) and high-Gamma (53-120 Hz) [40–42]. This led to an input space of 630 features (dimensions) for each window.
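The figure of 630 dimensions follows from the entries of Table 1 combined with six channels and six sub-bands. The short calculation below retraces that arithmetic; the variable names are illustrative only.

```python
N_CHANNELS = 6
N_BANDS = 6
N_UNIVARIATE = 13   # features 1 to 13 in Table 1, one value per channel and band
N_AR_PARAMS = 2     # one coefficient and one error term per channel and band

# Number of distinct channel pairs for the bivariate MAX cross-correlation.
channel_pairs = N_CHANNELS * (N_CHANNELS - 1) // 2

univariate_dims = N_UNIVARIATE * N_CHANNELS * N_BANDS   # 13 features x 36D
cross_corr_dims = channel_pairs * N_BANDS               # 15 pairs x 6 bands
ar_dims = N_AR_PARAMS * N_CHANNELS * N_BANDS            # 2 parameters x 36D

total_dims = univariate_dims + cross_corr_dims + ar_dims
print(univariate_dims, cross_corr_dims, ar_dims, total_dims)
```

The three partial counts match the 36D, 90D, and 72D annotations of Table 1.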

2.3. Feature Selection

Feature selection is a vital stage in analyzing the data to improve model performance and reduce computational complexity by projecting the existing features onto a lower dimensional space. This technique reduces the input dimensionality by removing irrelevant or redundant features from the entire feature set [32, 43–47]. In the machine learning literature, the approaches to feature subset selection are often categorized as filter, wrapper, and embedded strategies [44, 46–48].

Filter approaches are based on the statistical properties of explanatory variables (predictor variables) and their relationship to the outcome variable (response); they are generally not computationally expensive. There are many filter methods, such as PCA (Principal Component Analysis), LDA (Linear Discriminant Analysis), and PLS (Partial Least Squares), which all find a linear combination of features to characterize two or more classes. However, even if they are linear, simple, and relatively low-cost ways to reduce the dimensionality of the data, they offer no clear interpretation of the feature ranking. Moreover, PCA, a well-known feature reduction method, is unsupervised and does not consider the dependent variables [49–51].

We employed the Kruskal Wallis (KW) test, a nonparametric test that, unlike one-way ANOVA, makes no prior assumptions about the data distribution [52], [53].

The Kruskal-Wallis ranking score, in its standard form, is calculated as:

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{c} n_i \left(\bar{R}_i - \frac{N+1}{2}\right)^2$$

where $N$ is the total number of observations across all classes, $n_i$ is the number of observations in group $i$, $\bar{R}_i$ is the mean rank of group $i$, and $c$ is the number of output groups [53].
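A minimal sketch of Kruskal-Wallis feature ranking is given below, assuming the standard H statistic without tie correction. The function names and the toy data are hypothetical; this is not the code used in the study.

```python
import numpy as np

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Average the ranks inside each group of tied values.
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return (rank_sums / counts)[inverse]

def kruskal_wallis_h(feature, labels):
    """H statistic of one feature with respect to the class labels (no tie correction)."""
    labels = np.asarray(labels)
    n_total = len(labels)
    ranks = average_ranks(feature)
    h = 0.0
    for cls in np.unique(labels):
        group = ranks[labels == cls]
        h += len(group) * (group.mean() - (n_total + 1) / 2.0) ** 2
    return 12.0 / (n_total * (n_total + 1)) * h

def rank_features(X, y, top_k):
    """Return the indices of the top_k features with the highest H, and all scores."""
    scores = np.array([kruskal_wallis_h(X[:, j], y) for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:top_k], scores

# Toy data: column 0 separates the two classes, column 1 does not.
X = np.array([[1, 5], [2, 1], [3, 6], [10, 2], [11, 4], [12, 3]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])
top, scores = rank_features(X, y, top_k=1)
print(top, scores)
```

Each feature is scored independently of the others, which is why this filter approach scales linearly with the number of features.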

Wrapper approaches try to find a predictive model by using various combinations of features, then select the set of features that offers the highest evaluation performance. These techniques can be time consuming and tend to be slow. Therefore, they are not appropriate for selecting feature subsets in large-scale problems. One of the most popular wrapper techniques, Support Vector Machine-Recursive Feature Elimination (SVM-RFE), which eliminates features backward, was used [54], [55]. The backward elimination technique builds a model on the entire set of features and computes an importance score for each one. Then it removes the least significant features at each iteration, which enhances the performance of the model. In other words, the top-ranked variables are eliminated last [54–56].
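The backward elimination loop can be sketched as follows. To keep the example dependency-free, a ridge-regularized least-squares linear model stands in for the linear SVM, so this illustrates the recursive elimination idea rather than SVM-RFE itself; the squared-weight ranking criterion, the function names, and the synthetic data are assumptions.

```python
import numpy as np

def linear_weights(X, y, ridge=1e-3):
    """Weights of a ridge-regularized least-squares linear model.
    Stand-in for the linear SVM of SVM-RFE (assumption for illustration)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    gram = Xc.T @ Xc + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, Xc.T @ yc)

def recursive_feature_elimination(X, y, n_keep):
    """Refit the model and drop the feature with the smallest squared weight each round."""
    remaining = list(range(X.shape[1]))
    eliminated = []  # least important feature first
    while len(remaining) > n_keep:
        w = linear_weights(X[:, remaining], y)
        worst = int(np.argmin(w ** 2))
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated

# Synthetic data: only features 1 and 3 carry information about the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 1] + X[:, 3] > 0).astype(float)

kept, eliminated = recursive_feature_elimination(X, y, n_keep=2)
print(kept, eliminated)
```

The model is refitted once per eliminated feature, which explains why wrapper methods become expensive when hundreds of features must be removed.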

Embedded techniques have built-in feature selection, allowing a classifier to build a model that automatically performs attribute selection as part of model training (feature selection and model fitting are performed simultaneously) [44, 57]. In this work, XGBoost (Extreme Gradient Boosting) is used, which has been broadly employed in many areas due to its parallel processing, high scalability, and flexibility [58–63]. This embedded technique is an optimized implementation of the Gradient Boosting framework. Boosting builds a strong learner with higher precision from a combination of weaker classifiers; it is known as Gradient Boosting when the weak classifiers in each phase are built using gradient descent to optimize the loss function [59–63]. To further improve it, the XGBoost classifier has two built-in regularization terms (L1 and L2) to penalize the complexity of the model and avoid overfitting [59, 60, 63].
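For reference, the regularized objective commonly given for XGBoost can be written as shown here. This is the usual textbook form, not a formula taken from the study, and the symbols ($l$, $f_k$, $T$, $w$, $\gamma$, $\lambda$, $\alpha$) are notation assumed for this illustration.

```latex
\mathcal{L} = \sum_{i} l\left(y_i, \hat{y}_i\right) + \sum_{k} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert_2^2 + \alpha \lVert w \rVert_1
```

Here $l$ is the training loss, $f_k$ is the $k$-th tree, $T$ is its number of leaves, and $w$ is its vector of leaf weights; the $\lambda$ and $\alpha$ terms correspond to the L2 and L1 penalties, respectively.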

2.4. Evaluation and Performance Analysis

After extracting a number of features to discriminate between the preictal and interictal periods, the 30 attributes that held the most discriminative information were retained by each of the three approaches. Each resulting group of features was then applied to one of the most powerful ensemble methods, Random Forest. This approach belongs to the bagging family, which differs from the boosting mechanism, and is used here to judge the performance of the attribute selectors [64–66]. It should be noted that a bagging classifier was selected rather than a boosting one to avoid systematic bias in the comparison results.

A classifier must generalize, i.e., it should perform well when applied to data outside the training set. Owing to the issue of class imbalance, accuracy could be an inadequate metric to evaluate the performance of the classifier [67–70]. Although accuracy remains the most intuitive performance measure, it is simply the ratio of correctly predicted observations to the total number of observations, and so is reliable only when a dataset is balanced. However, this measure has been used exclusively by some researchers in analyzing seizures [71–73].

Numerous metrics have been developed to analyze the effectiveness and efficiency of a model in handling imbalanced datasets, such as the F1 score, Cohen’s kappa, and MCC [68–70]. Among these popular metrics, MCC has been shown to be a robust and reliable evaluation metric in binary classification tasks; in addition, it was claimed that measures like the F1 score and Cohen’s kappa should be avoided because they can give over-optimistic results, especially on unbalanced data [68, 69, 74, 75].

To visualize and evaluate the performance of a classifier, the confusion matrix was used (see Table (2), which represents the confusion matrix of a binary classification). After computation of the confusion matrices, it should be noted that MCC has been retained to compare the classification performance and effectiveness of the feature selection methods.

Table 2.
The confusion matrix for a binary classification task.
Actual \ Predicted | Positive | Negative
Positive | TP (True Positive) | FN (False Negative)
Negative | FP (False Positive) | TN (True Negative)

2.4.1. Matthews’s Correlation Coefficient (MCC)

MCC takes into account all four quadrants of the confusion matrix, which gives a better summary of the performance of classification algorithms. MCC can be considered as a discretization of Pearson’s correlation coefficient for two random variables, taking values in the interval between -1 and 1 [75–77]. A score of 1 indicates complete agreement, −1 a perfect misclassification, and 0 that the prediction is no better than random guessing (i.e., the expected value obtained by flipping a fair coin).
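In terms of the entries of Table (2), the standard definition is MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below implements it and contrasts it with accuracy on a hypothetical imbalanced test set; the counts are invented for illustration, and returning 0 when the denominator vanishes is a common convention assumed here.

```python
import math

def mcc(tp, fn, fp, tn):
    """Matthews correlation coefficient from the four confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention (assumption): return 0 when a row or column of the matrix is empty.
    return numerator / denominator if denominator else 0.0

def accuracy(tp, fn, fp, tn):
    """Fraction of correctly classified observations."""
    return (tp + tn) / (tp + fn + fp + tn)

# Hypothetical test set: 10 preictal and 90 interictal windows.
# This classifier labels every window as interictal.
always_negative = dict(tp=0, fn=10, fp=0, tn=90)
# This classifier labels every window correctly.
perfect = dict(tp=10, fn=0, fp=0, tn=90)

print(accuracy(**always_negative), mcc(**always_negative))
print(accuracy(**perfect), mcc(**perfect))
```

The first classifier reaches an accuracy of 0.9 while never detecting a preictal window, and its MCC of 0 exposes this; that is the behaviour that motivates the choice of MCC for imbalanced data.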



3.1. Dividing Signals Into Frequency Sub-bands

The aim of this study is to analyze and rank the time-domain features introduced by other researchers for epileptic seizure forecasting with EEG signals, and to compare the performance of three feature selection approaches on interictal and preictal segments. The goal is to classify iEEG data into two classes: “1”, denoting the ictal stage and the period preceding a seizure, and “0”, denoting seizure-free (interictal) periods and postictal periods (the time following a seizure).

Before ranking and comparing the features, one needs to investigate how important it is to divide the EEG signal into various sub-bands. Therefore, a comparison of the accumulated energy was performed for two cases, without and with dividing the signal into 6 sub-bands. The feature selection scores for both lobes are shown in Fig. (3). Both graphs show that dividing the EEG signal into various sub-bands can improve the performance of seizure forecasting because the divided signal contains much more discriminative information than the undivided one. Although this increases the dimensionality of the data at this stage, the focus later shifts to specific sub-bands and to reducing the feature space so that less memory is consumed at runtime.

Fig. (3). Investigation of dividing the EEG signal into various sub-bands: top for temporal lobe bottom for frontal lobe.

3.2. Feature Selection Methods Comparison

It should be noted that three methods for feature ranking were employed and afterwards independent train and test sets were defined to compare their performance using a Random Forest classifier. To have a better estimation of the generalization performance of the work, we evaluated the top 30 selected attributes on the testing dataset, which has not been used during the training process.

The calculations were made on an Intel(R) Core(TM) i7 CPU at 3.3 GHz with 16 GB of RAM. The preprocessing stage was carried out in MATLAB; the MAT files were then converted to NumPy arrays and the rest of the work was developed in Python (3.7.6). The computation time for each feature selection method is listed in Table (3) and the performance of the various feature selection methods is listed in Table (4).

Table 3.
Comparison of the computational cost of the three feature selection methods applied on the train set.
Brain lobe / Selection method | Kruskal-Wallis | SVM-RFE | XGBOOST
Temporal lobe | 12.4 min | 3411 min | 18.7 min
Frontal lobe | 10.9 min | 2375 min | 12 min

Comparing Tables (3) and (4), it can be concluded that the filter-based method, Kruskal Wallis, has the highest MCC score and the shortest computation time, while SVM-RFE has by far the longest computation time and shows the poorest performance.

It is worth mentioning that, in several published works, researchers commonly selected nearly 5% of the feature set and investigated the related sensitivity. In this work, the 10, 20, and 50 dimension cases were also examined and compared with the 30D case based on the MCC score in Table (5), using the KW approach as the winner. The results for both lobes show that the MCC score increases with the number of dimensions, at the price of a higher computational cost and a longer computation time.

Table 4.
Performance (MCC score with 95% confidence interval) of the various feature selection methods for both lobes, applied on the test set (2-second window for each lobe).
Selection method | MCC score, temporal lobe | MCC score, frontal lobe
Kruskal-Wallis | 0.55±0.003 | 0.24±0.003
SVM-RFE | 0.40±0.003 | 0.11±0.002
XGBOOST | 0.496±0.003 | 0.12±0.002

Table 5.
Comparison of the performance of the KW method for various numbers of dimensions, based on the MCC score for both lobes, applied on the test set.
Number of dimensions | MCC score, temporal lobe | MCC score, frontal lobe
10D | 0.44 | 0.07
20D | 0.50 | 0.12
30D | 0.55 | 0.24
50D | 0.57 | 0.26

The top 30 ranked features are listed in Tables (6 and 7) for TLE and FLE, respectively, for the three feature selection approaches. The feature most often ranked by the three feature selection methods is AR. The second most important feature is the MAX cross-correlation, and complexity is the next one. Interestingly, features like Mean, Skewness, Zero Crossing Rate, and Entropy do not appear among the top 30 ranked features in Tables (6 and 7). Also, the following remarks can be made:

Table 6.
Top 30 feature-subset ranked by three types of approaches for temporal lobe epilepsy.
Top Features KW     SVM-RFE       XGBOOST
1 AR High-Gamma error MAX cross Beta AR error Beta
2 AR Alpha error AR Delta coefficient AR Beta error
3 AR Beta coefficient MAX cross High-Gamma AR High-Gamma error
4 AR Beta error MAX cross Beta AR Low-Gamma error
5 MAX cross Low-Gamma MAX cross Beta AR Theta Error
6 MAX cross High-Gamma MAX cross High-Gamma AR Delta error
7 AR High-Gamma Error RMS High-Gamma AR Delta error
8 IQR High-Gamma AR Theta coefficient AR Alpha coefficient
9 AR High-Gamma coefficient AR Theta Error Complexity High-Gamma
10 MAX cross High-Gamma RMS Beta AR Delta coefficient
11 IQR High-Gamma AR Beta coefficient AR High-Gamma error
12 MAD High-Gamma MAX cross Beta AR Delta error
13 AR Beta error MAX cross Delta Complexity Beta
14 Complexity High-Gamma MAX cross Delta AR Alpha coefficient
15 MAX cross Low-Gamma CoV Low-Gamma MAD High-Gamma
16 VAR High-Gamma MAX cross Alpha AR Alpha error
17 Energy High-Gamma RMS Delta MAX cross High-Gamma
18 RMS High-Gamma MAX cross Low-Gamma AR High-Gamma error
19 MAX cross High-Gamma RMS Theta AR coefficient Alpha
20 MAD High-Gamma MAX cross Beta AR High-Gamma coefficient
21 AR Low-Gamma error RMS Delta AR Beta coefficient
22 Complexity Low-Gamma RMS Low-Gamma AR Beta coefficient
23 MAX cross High-Gamma Complexity High-Gamma AR Beta error
24 VAR Low-Gamma Mobility Delta AR Alpha coefficient
25 Energy Low-Gamma Mobility Low-Gamma IQR Beta
26 RMS Low-Gamma MAX cross Delta AR Delta coefficient
27 IQR Low-Gamma Complexity Alpha AR Delta error
28 MAX cross Low-Gamma Complexity High-Gamma AR Alpha error
29 AR coefficient Alpha Complexity Low-Gamma AR Low-Gamma
30 AR High-Gamma error Complexity Delta Mobility Delta
Table 7.
Top 30 feature-subset ranked by three types of approaches for frontal lobe epilepsy.
Top Features KW       SVM-RFE XGBOOST
1 AR Low-Gamma error AR High-Gamma error AR delta coefficient
2 AR Alpha coefficient AR Alpha error AR Theta error
3 AR Theta error MAX cross High-Gamma AR Delta coefficient
4 AR Low-Gamma error AR Theta coefficient AR Theta error
5 AR Low-Gamma error MAX cross Beta AR Theta error
6 AR Theta error AR Delta error AR Delta error
7 Complexity Alpha MAX cross Alpha AR Low-Gamma error
8 IQR Alpha MAX cross Beta AR Alpha coefficient
9 Mobility Beta AR Delta error AR Alpha error
10 IQR Alpha RMS Low-Gamma AR Alpha error
11 IQR Theta MAX cross Delta AR Theta coefficient
12 IQR Beta RMS High-Gamma AR Beta coefficient
13 IQR Low-Gamma MAX cross Alpha AR Beta error
14 IQR Alpha Complexity Alpha Kurtosis High-Gamma
15 Kurtosis Beta MAX cross High-Gamma AR Beta error
16 AR Beta coefficient MAX cross Beta AR Low-Gamma coefficient
17 MAD Alpha MAX cross Theta AR Beta coefficient
18 MAD Beta MAX cross Low-Gamma AR High-Gamma error
19 Complexity Alpha MAX cross Beta AR Beta coefficient
20 IQR Theta RMS Delta AR Beta error
21 AR Beta Error MAX cross Alpha AR High-Gamma error
22 IQR High-Gamma MAX cross Theta AR Delta coefficient
23 AR Theta coefficient Complexity Delta AR High-Gamma error
24 Energy Alpha RMS Low-Gamma coefficient AR Delta error
25 RMS Theta Mobility Alpha AR High-Gamma coefficient
26 VAR Alpha Complexity High-Gamma AR Delta error
27 MAD Alpha Mobility Alpha AR High-Gamma error
28 Complexity Theta Complexity Low-Gamma AR Theta coefficient
29 Complexity Theta Complexity Delta MAD Beta
30 Energy Beta Mobility High-Gamma AR Delta coefficient
  • The AR model is an interesting feature, along with the MAX cross-correlation, for all three feature selection methods and both lobes.
  • The Delta sub-band is considered important by XGBOOST and SVM-RFE, but not by Kruskal Wallis.
  • As an AR parameter, the error is more important than the coefficient in discriminating between seizure and non-seizure for all three feature selection methods.
Fig. (4). An overview of the features ranked by Kruskal-Wallis (KW) as the winning method: (a) for temporal lobe epilepsy (TLE); (b) for frontal lobe epilepsy (FLE).

Some of the attributes were selected multiple times by the feature ranking approaches in both Tables, like AR High-Gamma Error which has been chosen by Kruskal Wallis as feature 1, 7, and 30 in Table (6). The reason behind that is such features have been selected and presented without considering the order of the electrodes.

Figs. (4a and 4b) illustrate an overview of the features selected with KW for the temporal and frontal lobes, respectively. Fig. (4a) confirms that, amongst the time-domain parameters that play an important role in the prediction of the seizure, AR is the top feature, followed by the cross-correlation and the IQR. On the other hand, Fig. (4b) shows that the AR model and complexity, a measure obtained with Hjorth’s analysis, are important features besides IQR for patients with frontal lobe epilepsy.

In the last step, another filtering method is applied: the product-moment correlation coefficient, or Pearson correlation coefficient, is computed in order to identify the linear relationship between the 30 top ranked features and, therefore, to eliminate any redundant information. This coefficient can be expressed as

$$\rho(X_1, X_2) = \frac{\mathrm{Cov}(X_1, X_2)}{\sigma_{X_1}\,\sigma_{X_2}}$$

where $\mathrm{Cov}(X_1, X_2)$ is the covariance of two features and $\sigma_{X_i}$ is the standard deviation of each variable. $\rho$ takes values in the range [-1, +1], with +1 corresponding to a perfect positive linear relationship between the random variables and -1 to a perfect negative linear relationship between the two features, that is to say, the larger the $X_1$ values, the smaller the $X_2$ values, and vice versa. $\rho = 0$ implies no linear relationship between the variables. In other words, the higher the absolute value of the correlation coefficient, the more similar the features are [78–80].
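A minimal sketch of this redundancy check is shown below, assuming the selected features are stored column-wise in a matrix. The threshold of 0.9, the function name, and the synthetic features are assumptions made for illustration; the text does not state a cut-off value.

```python
import numpy as np

def redundant_pairs(X, names, threshold=0.9):
    """List feature pairs whose absolute Pearson correlation reaches the threshold."""
    corr = np.corrcoef(X, rowvar=False)  # columns are treated as variables
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Synthetic features: "Energy" is almost a scaled copy of "VAR",
# while "Complexity" is generated independently.
rng = np.random.default_rng(0)
var_feat = rng.normal(size=200)
energy_feat = 2.0 * var_feat + 0.01 * rng.normal(size=200)
complexity_feat = rng.normal(size=200)

X = np.column_stack([var_feat, energy_feat, complexity_feat])
names = ["VAR", "Energy", "Complexity"]
pairs = redundant_pairs(X, names)
print(pairs)
```

One feature of each reported pair can then be dropped, since it carries almost the same linear information as the other.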

Table 8.
Features with the strongest linear relationship among the top 30 features with the highest Kruskal-Wallis scores.
Lobe | Attributes with a strong linear relationship
Temporal lobe | AR High-Gamma error / AR High-Gamma coefficient
Temporal lobe | IQR High-Gamma / MAD High-Gamma
Temporal lobe | Energy High-Gamma / VAR High-Gamma
Temporal lobe | Complexity High-Gamma / Complexity Low-Gamma
Temporal lobe | VAR Low-Gamma / Energy Low-Gamma
Frontal lobe | Complexity Alpha / Mobility Beta / Energy Beta
Frontal lobe | MAD Alpha / IQR Alpha / RMS Theta
Frontal lobe | IQR Beta / MAD Beta
Frontal lobe | Energy Alpha / VAR Alpha

The product-moment correlation matrix was then calculated for the top 30 features with the highest Kruskal-Wallis scores. The features with the strongest linear relationship are reported in Table (8). Interestingly, IQR and MAD show a strong linear relationship in both lobes: in the Alpha and Beta sub-bands for the frontal lobe and in the low- and high-Gamma sub-bands for the temporal lobe. Also, there is a close relationship between variance and energy for both lobes.


For various reasons, some researchers have divided the EEG signal into various sub-bands [81, 82] while others have not [30, 39]. The aim was, therefore, to consider both cases and evaluate the impact of dividing the EEG signal into various sub-bands. In fact, as shown in Fig. (3), dividing the EEG signal into 6 sub-bands will carry more predictive information than not splitting it.

In this study, three feature selection methods were compared and the results showed that the filter method has the highest performance with the highest MCC. Also, KW can rank the features a lot faster, with the shortest computational time.

A large panel of wrapper approaches has been proposed for feature selection, but most of them are computationally expensive and complex in nature [53–55]. The results obtained in the study confirmed this fact: SVM-RFE has the lowest prediction performance and is the most computationally intensive. Although most existing works claim that embedded methods, which combine filters and wrappers, take advantage of both, the obtained results did not support that claim: the non-parametric filter-based method, Kruskal Wallis, outperforms the other approaches.

The AR method estimates the power spectral density (PSD) of a given signal. This parametric approach does not suffer from spectral leakage, so one can expect a better frequency resolution than with nonparametric methods. The PSD can be estimated from the model coefficients even when the order is low [39, 83, 84]. Furthermore, the prediction error term extracted from an AR model of the brain signals is claimed to decrease during the preictal stage [85].
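As an illustration of the two AR parameters (one coefficient and an error term), the sketch below fits a first-order AR model by least squares. The model order, the least-squares estimator, and the use of the residual variance as the error term are assumptions for this example; the study's exact estimation procedure is the one described in its Appendix A.

```python
import numpy as np

def ar1_fit(x):
    """Least-squares fit of x[t] = a * x[t-1] + e[t].
    Returns the coefficient a and the variance of the prediction error e."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    past, present = x[:-1], x[1:]
    a = float(past @ present / (past @ past))
    residual = present - a * past
    return a, float(residual.var())

# Synthetic AR(1) signal with a known coefficient of 0.8 and unit-variance noise.
rng = np.random.default_rng(0)
noise = rng.normal(size=5000)
signal = np.zeros(5000)
for t in range(1, 5000):
    signal[t] = 0.8 * signal[t - 1] + noise[t]

coefficient, error_variance = ar1_fit(signal)
print(coefficient, error_variance)
```

A smaller error variance means the signal is more predictable from its own past, which is the sense in which the prediction error is reported to decrease before a seizure.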

The maximum of cross-correlation, a bivariate feature, can be considered a measure of lag synchronization because it estimates the phase difference between two spatially separated sensors even at a low signal-to-noise ratio (SNR) [86], [87]. Kruskal Wallis may have come out ahead because it did not rely only on the parameters of the AR model and the MAX cross-correlation. It also selected other important univariate features such as complexity, which provides an estimate of a statistical moment of the power spectrum.

The temporal lobe is responsible for processing information and plays a vital role in long-term memory. Gamma rhythms are involved in higher processing tasks and cognitive functioning. These waves are fundamental for learning, memory, and information processing. The frontal lobe is responsible for emotional control, planning, judgment, and short-term memory. Theta rhythms are associated with creativity, relaxation, and emotional connection. Alpha waves are associated with deep relaxation, and Beta waves with consciousness and problem solving [88].

In this study, the low- and high-Gamma sub-bands were found to be the most discriminating between preictal and interictal states for TLE patients, while the frequency range from Theta to low-Gamma was the most discriminating in the six patients with FLE. The obtained results confirmed that Gamma sub-bands are a promising biomarker for seizure prediction in TLE [89-91]. For FLE, however, one should consider a wider range of frequencies in the preictal stage, including lower frequencies than for TLE. Note that some existing works on the detection of FLE proposed that frequencies below 50 Hz can play a dominant role among the different brain waves [22], [23], [92].

The results also demonstrated the complexity of seizure prediction, since the frontal lobes of the brain control a wide variety of complex structural and functional roles [15]-[17], [93]. These findings can help establish specific relationships between each lobe, the function it supports, and the waveforms generated by that function.

Furthermore, the performance results of Kruskal Wallis for both lobes in Table (4) show that MCC is not close to 1, the perfect prediction case. The modest MCC may be related to the limited capacity of this version of the Freiburg database, which provides at most 90 minutes of preictal data, or to the fact that some seizures last only a few minutes. According to Ramachandran et al. [71], at least 3 to 16 hours of data before seizure onset might be required to predict seizures efficiently, which can be considered a limitation of this study and a weakness of this database. Another possible improvement is to employ non-linear measures, such as phase synchronization, to improve the model performance in forecasting seizures.

Moreover, a non-stationary signal like EEG can be considered stationary within a short epoch, such as a two-second window [94], [95]. Based on the results in Figs. (4a and 4b), the mean feature is nearly constant for both lobes. This result partially supports the stationarity assumption, but further investigation is required.

The effectiveness of Kruskal Wallis as a nonparametric method rests on the fact that it does not need to assume a data distribution model, which makes the results promising for feature selection on EEG data of TLE and FLE. Therefore, handling a larger number of epilepsy patients should not be an issue, and this approach would be applicable to a larger data set. Furthermore, the data were divided into training and test sets, and the random forest model was trained and validated with 10-fold cross-validation. This helps guard against overfitting, so the model can be expected to generalize to unseen data.

Furthermore, using MCC to measure model performance is a significant improvement over existing works that relied on accuracy as the classifier's performance measure while dealing with imbalanced data [31, 96-98]. Specifically, the authors in [71] proposed an effective feature extraction method for improving classification accuracy, but the imbalance ratio was not reported and accuracy was the only performance measure used for the classifier.
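The difference between accuracy and MCC on imbalanced data can be seen in a short Python sketch. The class counts below (90 interictal and 10 preictal windows) and the two hypothetical classifiers are illustrative assumptions, not results from this study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Classifier A labels every window as interictal (the majority class).
acc_a = accuracy(tp=0, tn=90, fp=0, fn=10)
mcc_a = mcc(tp=0, tn=90, fp=0, fn=10)

# Classifier B finds 8 of the 10 preictal windows at the cost of 5 false alarms.
acc_b = accuracy(tp=8, tn=85, fp=5, fn=2)
mcc_b = mcc(tp=8, tn=85, fp=5, fn=2)
```

Classifier A reaches 90% accuracy without detecting a single preictal window, while its MCC is 0. Classifier B is only slightly more accurate, yet its MCC is clearly positive, which is the behavior that motivates reporting MCC on imbalanced data.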

It is believed that the findings of this work can be implemented on low-power hardware by efficiently considering less complex features for a specific sub-band, using the information from only one patient, instead of building and deploying a model for all the patients in the database.


Epileptic seizures are the temporary occurrence of symptoms due to the synchronization of abnormally excessive activity of brain nerve cells. However, reviewing and monitoring continuous electroencephalograms is a time-consuming task for neurologists. Therefore, even though it is quite challenging, a high-performance automated analysis of EEG signals is in high demand.

The Kruskal-Wallis feature selection strategy is simple and less time-consuming than the other approaches. Among the time-domain features investigated, the parameters of the AR model are ranked as the top features for both lobes. The second most important features are the maximum of cross-correlation and IQR for the temporal and frontal lobes, respectively. Moreover, the high-frequency low- and high-Gamma sub-bands emerged as the sub-bands of interest for temporal lobe epilepsy, while the middle range of frequencies from Theta to Beta appears important for frontal lobe epilepsy.

Future efforts should focus on reliably improving patient-specific prediction performance on the test set by considering a combination of various features that provide an estimate of the phase, frequency, and amplitude of the EEG signal.


software, B.A.; visualization, B.A.; writing—original draft preparation, B.A.; writing—review and editing, B.A., C.T., and M.C.E.Y.; supervision, C.T. and M.C.E.Y. All authors have read and agreed to the published version of the manuscript.




We used the following database; we did not directly perform the tests ourselves: https://pubmed.ncbi.nlm.nih.gov/17201704/
Predicting Epileptic Seizures in Advance (plos.org)




The authors declare no conflict of interest, financial or otherwise.



Appendix A

A feature is a variable that can represent the signal variation. In this work, features commonly used in EEG analysis were selected to discriminate between the preictal and interictal phases of the seizures.

Within a 2 s window, an EEG signal is first passed through six FIR band-pass filters. With six channels and six sub-bands, each univariate feature yields a 36-dimensional (36D) vector, so the 13 univariate features contribute 468 dimensions. The MAX cross-correlation among the 6 channels (15 channel pairs) over the six sub-bands contributes 90 dimensions. Finally, extracting two features from the AR model for each channel and sub-band adds 72 dimensions, giving a final feature matrix of 630 dimensions.
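The dimension count can be checked with a few lines of Python. The breakdown below is an interpretation that reproduces the stated totals (36, 90, 72, and 630); it assumes that every univariate and AR feature is computed per channel and per sub-band, and the constant names are hypothetical.

```python
from itertools import combinations

N_CHANNELS = 6      # iEEG channels per recording
N_SUBBANDS = 6      # FIR band-pass filters
N_UNIVARIATE = 13   # univariate feature types
N_AR_FEATURES = 2   # AR coefficient and error term

# One value per channel and sub-band for a single univariate feature.
per_feature_dims = N_CHANNELS * N_SUBBANDS                   # 36

# All univariate features together.
univariate_dims = N_UNIVARIATE * per_feature_dims            # 468

# One MAX cross-correlation value per channel pair and sub-band.
channel_pairs = list(combinations(range(N_CHANNELS), 2))     # 15 pairs
bivariate_dims = len(channel_pairs) * N_SUBBANDS             # 90

# Two AR features per channel and sub-band.
ar_dims = N_AR_FEATURES * per_feature_dims                   # 72

total_dims = univariate_dims + bivariate_dims + ar_dims      # 630
```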

In this study, 13 types of univariate features, one bivariate feature, and two features extracted from the AR model were investigated.

1. Energy:

This feature can be considered a measure of the signal strength. The accumulated energy at a given time-point t is a commonly used feature for finding abnormal behavior in the brain. For a given discrete signal, x(n), the area under the squared signal is called energy and is expressed as [36], [39]:
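A standard discrete-time form of this definition, given here as an assumed reconstruction with N denoting the number of samples in the window, is:

```latex
E = \sum_{n=1}^{N} \left| x(n) \right|^{2}
```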

2. Mean:

The mean of a discrete signal, x(n), can be expressed as [37]:
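In the usual notation, assumed here to match the symbols N and x(n), the sample mean reads:

```latex
\mu = \frac{1}{N} \sum_{n=1}^{N} x(n)
```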

where N is the number of samples and x(n) is the discrete signal.

3. Variance:

The second moment of a signal is called the variance. The higher its value, the more frequency components the signal contains [87].

4. Skewness:

The third statistical moment measures the asymmetry of the probability distribution about its mean.

5. Kurtosis:

The fourth statistical moment describes the flatness of the distribution of a real-valued random variable.

6. Interquartile range:

The Interquartile Range (IQR) feature is a measure of spread and variability based on dividing the data into four equal parts. The values that separate these parts, Q1, Q2, and Q3, are called the first, second, and third quartiles, respectively. IQR is computed as the difference between the 75th and the 25th percentiles, that is, between Q3 and Q1, as follows [38], [39]:
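In symbols, following the definition just given:

```latex
\mathrm{IQR} = Q_{3} - Q_{1}
```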

7. Zero crossing rate:

The Zero Crossing Rate (ZCR) is the rate at which the signal changes sign, or the sum of all positive zero crossings within the EEG segment.

8. Mean Absolute Deviation:

The Mean Absolute Deviation (MAD) is a robust measure of the spread of the collected quantitative data. In other words, it is the average distance between each data point and the mean. For a given dataset, x = x1, x2, …, xn, MAD can be calculated as [38]:
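Written out from the verbal definition above, with the sample mean denoted by an overbar (a notational assumption):

```latex
\mathrm{MAD} = \frac{1}{n} \sum_{i=1}^{n} \left| x_{i} - \bar{x} \right|,
\qquad
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_{i}
```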

9. Entropy:

This feature is employed to quantify the degree of uncertainty and irregularity of a signal as well as the complexity of human brain dynamics. The uncertainty of the signal can be computed in terms of the repeatability of its amplitude [38]:
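Assuming the Shannon entropy with a base-2 logarithm is intended (the source names neither the entropy variant nor the logarithm base), the expression would be:

```latex
H(X) = - \sum_{x} P(x) \, \log_{2} P(x)
```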

where P(x) is the probability mass function.

10. Hjorth mobility:

The Hjorth mobility parameter is the square root of the ratio of the variance of the first derivative of the signal to the variance of the signal itself, and it is proportional to the standard deviation of the power spectrum of a time series.
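In its standard form, stated here as a reconstruction from Hjorth's usual definition rather than copied from the source:

```latex
\mathrm{Mobility}\left(x(t)\right) = \sqrt{\frac{\mathrm{var}\left(x'(t)\right)}{\mathrm{var}\left(x(t)\right)}}
```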

In the above equation, x(t) is a signal and x'(t) is its derivative; var(·) is the variance of a signal over a period of time.

11. Hjorth complexity:

The Hjorth complexity defines how the shape of a signal is analogous to an ideal sine curve. This parameter gives an estimation of the bandwidth of the signal.

Complexity = mobility (x'(t)) / mobility (x(t))

Mobility and complexity provide estimates of the second and fourth statistical moments of the power spectrum in the frequency domain, respectively. Although the Hjorth parameters are defined in the time domain, they can be useful for both time and frequency analysis. Because their computation relies only on variances, its cost is very low [87, 99, 100].
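A minimal Python sketch of both Hjorth parameters follows. It assumes that the derivative is approximated by the first difference of the samples (so mobility is expressed per sample rather than per second); the function names and the 256 Hz test signals are illustrative choices, not part of the original method description.

```python
import math
import statistics


def first_difference(x):
    """Discrete approximation of the derivative: x[n+1] - x[n]."""
    return [b - a for a, b in zip(x, x[1:])]


def hjorth_mobility(x):
    """sqrt(var(x') / var(x)) with x' taken as the first difference."""
    return math.sqrt(statistics.pvariance(first_difference(x)) / statistics.pvariance(x))


def hjorth_complexity(x):
    """mobility(x') / mobility(x); close to 1 for a pure sine wave."""
    return hjorth_mobility(first_difference(x)) / hjorth_mobility(x)


FS = 256  # sampling frequency in Hz, matching the database description
sine = [math.sin(2 * math.pi * 10 * n / FS) for n in range(2 * FS)]
two_tone = [math.sin(2 * math.pi * 10 * n / FS) + math.sin(2 * math.pi * 40 * n / FS)
            for n in range(2 * FS)]

sine_mobility = hjorth_mobility(sine)
sine_complexity = hjorth_complexity(sine)
two_tone_complexity = hjorth_complexity(two_tone)
```

The pure 10 Hz sine gives a complexity close to 1, while adding a 40 Hz component widens the bandwidth and pushes the complexity above 1. Only variances and differences are needed, which reflects the low computational cost noted above.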

12. Coefficient of Variation:

The coefficient of variation (CoV) is the ratio of the standard deviation to the mean of a signal [99]:
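With σ denoting the standard deviation and μ the mean of the signal (symbols assumed here for illustration):

```latex
\mathrm{CoV} = \frac{\sigma}{\mu}
```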

13. Root Mean Square:

The Root Mean Square (RMS) of a signal can be calculated as [30]:
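For a window of N samples of the discrete signal x(n), the standard definition, assumed to be the one intended, is:

```latex
\mathrm{RMS} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} x(n)^{2}}
```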

14. Maximum linear cross-correlation:

As a simple bivariate measure, MAX cross-correlation calculates the linear association between two signals, which also yields fixed delays between two spatially distant EEG signals to accommodate potential signal propagation. This measure can also give us a similarity between two different channels.

Given an EEG signal containing N channels, one can compute the cross-correlation on pairs of channels (e.g. 15 pairs for N=6 for the employed iEEG database). Calculating the MAX cross-correlation for these 15 channel pairs in each of the six sub-bands results in a 90-D vector.
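The following Python sketch shows one common way to compute this feature: the cross-correlation is normalized by the zero-lag energies of both signals, and the largest absolute value over a range of lags is kept. The normalization, the lag range, and the function names are assumptions made for illustration; the source does not specify them.

```python
import math
import statistics
from itertools import combinations


def max_cross_correlation(x, y, max_lag):
    """Return (peak, lag): the largest absolute normalized cross-correlation
    over lags in [-max_lag, max_lag] and the lag where it occurs.
    A positive lag means x trails y by that many samples."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    xc = [v - mean_x for v in x]
    yc = [v - mean_y for v in y]
    norm = math.sqrt(sum(v * v for v in xc) * sum(v * v for v in yc))
    best, best_lag = 0.0, 0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            s = sum(a * b for a, b in zip(xc[lag:], yc))
        else:
            s = sum(a * b for a, b in zip(xc, yc[-lag:]))
        if abs(s) / norm > best:
            best, best_lag = abs(s) / norm, lag
    return best, best_lag


def pairwise_max_cross_correlation(channels, max_lag):
    """MAX cross-correlation for every unordered pair of channels."""
    return {(i, j): max_cross_correlation(channels[i], channels[j], max_lag)[0]
            for i, j in combinations(range(len(channels)), 2)}


# Synthetic example: y leads x by DELAY samples.
FS, DELAY = 256, 5
base = [math.sin(2 * math.pi * 10 * n / FS) + 0.5 * math.sin(2 * math.pi * 23 * n / FS)
        for n in range(2 * FS + DELAY)]
x, y = base[:2 * FS], base[DELAY:]
peak, lag = max_cross_correlation(x, y, max_lag=10)

# Six shifted copies stand in for six channels, giving 15 pairs.
channels = [base[k:k + 2 * FS] for k in range(6)]
pair_features = pairwise_max_cross_correlation(channels, max_lag=10)
```

The recovered lag equals the imposed delay, and six channels produce 15 pairwise values, one set per sub-band.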

15. Autoregressive (AR) model:

A sequence of observations ordered in time or space is called a time series and, in the electrical engineering context, is referred to as a signal. An AR model describes the current value of a variable as a weighted sum of its own preceding values. Similarly, one can employ this concept to forecast the future based on past behavior [101-103].

An AR model with order p can be described as the following formula:
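In the notation of the symbols defined next, and assuming a zero-mean form without an intercept term, the model reads:

```latex
x_{t} = \sum_{i=1}^{p} \beta_{i} \, x_{t-i} + \varepsilon_{t}
```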

where εt is the error term, usually specified as white noise, and β = (β1, β2, …, βp) are the AR coefficients.

For first order, an AR(1) model can be expressed as [102-104]:
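Setting p = 1 in the general model, again assuming no intercept term:

```latex
x_{t} = \beta_{1} \, x_{t-1} + \varepsilon_{t}
```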

This process can be considered stationary when |β1| < 1. The coefficient and the error term have been considered as features in seizure prediction [35, 87, 105].
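As a sketch of how these two AR features could be obtained, the Python code below fits an AR(1) model by ordinary least squares and summarizes the error term by its variance. The estimation method, the use of the residual variance as the error feature, and the simulated signal are all assumptions; the source does not state how the parameters were estimated.

```python
import random


def fit_ar1(x):
    """Least-squares AR(1) fit without intercept: returns (beta1, residuals)."""
    past, present = x[:-1], x[1:]
    beta1 = sum(p * c for p, c in zip(past, present)) / sum(p * p for p in past)
    residuals = [c - beta1 * p for p, c in zip(past, present)]
    return beta1, residuals


def simulate_ar1(beta1, n, seed=0):
    """Generate n samples of x_t = beta1 * x_{t-1} + eps_t with unit-variance noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(beta1 * x[-1] + rng.gauss(0.0, 1.0))
    return x


signal = simulate_ar1(0.8, 5000)
beta_hat, eps = fit_ar1(signal)

# Two candidate features: the coefficient and the variance of the error term.
error_variance = sum(e * e for e in eps) / len(eps)
is_stationary = abs(beta_hat) < 1.0
```

On the simulated signal, the estimate lands near the true coefficient of 0.8 and satisfies the stationarity condition |β1| < 1.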

Appendix B

The Freiburg EEG Database is one of the most cited resources used in seizure prediction and detection experiments. The interictal and preictal intracranial electroencephalogram (iEEG) recordings of the Freiburg database (FSPEEG) were made available in the early 2000s [34]. The database consists of intracerebral (strip, grid, and depth electrode) EEG recordings from 21 epileptic patients. Each recording comprises six iEEG electrode channels sampled at 256 Hz with a 16-bit A/D converter.

The database contains 24 hours of continuous interictal recordings for 13 patients and discontinuous interictal recordings for eight patients. Each participant had 2 to 5 preictal recordings of about 50 minutes each. In total, the database contains 582 hours of EEG data, including preictal recordings of 88 seizures.

An overview of the recruited patients is given in Table (9). Note that 50 seizures from 12 patients were used in this study (mean age: 31; age range: 14-50; both genders).

Table 9.
The information of patients in the dataset. SP = simple partial, CP = complex partial, GTC = generalized tonic-clonic; H = hippocampal origin, NC = neocortical origin; d = depth electrode, g = grid electrode, s = strip electrode.
Patient # | Sex | Age | Seizure type | H/NC | Origin | Electrodes | Seizures analyzed
1 | F | 15 | SP, CP | NC | Frontal | g, s | 4
2 | M | 38 | SP, CP, GTC | H | Temporal | d | 3
3 | M | 14 | SP, CP | NC | Frontal | g, s | 5
4 | F | 26 | SP, CP, GTC | H | Temporal | d, g, s | 5
5 | F | 16 | SP, CP, GTC | NC | Frontal | g, s | 5
7 | F | 42 | SP, CP, GTC | H | Temporal | d | 3
8 | F | 32 | SP, CP | NC | Frontal | g, s | 2
10 | M | 47 | SP, CP, GTC | H | Temporal | d | 5
12 | F | 42 | SP, CP, GTC | H | Temporal | d, g, s | 4
16 | F | 50 | SP, CP, GTC | H | Temporal | d, s | 5
18 | F | 25 | SP, CP | NC | Frontal | s | 5
19 | F | 28 | SP, CP, GTC | NC | Frontal | s | 4


D. Ehrens, F. Assaf, N.J. Cowan, S.V. Sarma, and Y. Schiller, "Ultra Broad Band Neural Activity Portends Seizure Onset in a Rat Model of Epilepsy", Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, pp. 2276-2279.
The Lancet, "From wonder and fear: make epilepsy a global health priority", The Lancet, vol. 393, no. 10172, p. 612.
B. Abbaszadeh, R.S. Fard, and M.C.E. Yagoub, "Application of Global Coherence Measure to Characterize Coordinated Neural Activity during Frontal and Temporal Lobe Epilepsy", Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, pp. 3699-3702.
R. Hussein, M.O. Ahmed, R. Ward, Z.J. Wang, L. Kuhlmann, and Y. Guo, "Human Intracranial EEG Quantitative Analysis and Automatic Feature Learning for Epileptic Seizure Prediction", http://arxiv.org/abs/1904.03603
S. Yang, B. Li, Y. Zhang, M. Duan, S. Liu, Y. Zhang, X. Feng, R. Tan, L. Huang, and F. Zhou, "Selection of features for patient-independent detection of seizure events using scalp EEG signals", Comput. Biol. Med., vol. 119, p. 103671.
A. Singh, and S. Trevick, "The Epidemiology of Global Epilepsy", Neurologic Clinics, vol. 34, no. 4, pp. 837-847.
J.S. Meyer, and L.F. Quenzer, Psychopharmacology: Drugs, the Brain, and Behavior, 3rd ed., Oxford University Press.
S. Farahmand, T. Sobayo, and D.J. Mogul, "Noise-Assisted Multivariate EMD-Based Mean-Phase Coherence Analysis to Evaluate Phase-Synchrony Dynamics in Epilepsy Patients", IEEE Trans. Neural Syst. Rehabil. Eng., vol. 26, no. 12, pp. 2270-2279.
J. Parvizi, and S. Kastner, "Promises and limitations of human intracranial electroencephalography", Nature Neuroscience, vol. 21, no. 4, pp. 474-483.
R.S. Fisher, W. van Emde Boas, W. Blume, C. Elger, P. Genton, P. Lee, and J. Engel Jr, "Epileptic seizures and epilepsy: definitions proposed by the International League Against Epilepsy (ILAE) and the International Bureau for Epilepsy (IBE)", Epilepsia, vol. 46, no. 4, pp. 470-472.
A. Korotkov, J. D. Mills, J. A. Gorter, E. A. Van Vliet, and E. Aronica, "Systematic review and meta-analysis of differentially expressed miRNAs in experimental and human temporal lobe epilepsy", Scientific Reports, vol. 7, no. 1, pp. 1-13.
Y. Fu, Z. Wu, Z. Guo, L. Chen, Y. Ma, Z. Wang, W. Xiao, and Y. Wang, "Systems-level analysis identifies key regulators driving epileptogenesis in temporal lobe epilepsy", Genomics, vol. 112, no. 2, pp. 1768-1780.
M.T. Venø, C.R. Reschke, G. Morris, N.M.C. Connolly, J. Su, Y. Yan, T. Engel, E.M. Jimenez-Mateos, L.M. Harder, D. Pultz, S.J. Haunsberger, A. Pal, J.P. Heller, A. Campbell, E. Langa, G.P. Brennan, K. Conboy, A. Richardson, B.A. Norwood, L.S. Costard, V. Neubert, F. Del Gallo, B. Salvetti, V.R. Vangoor, A. Sanz-Rodriguez, J. Muilu, P.F. Fabene, R.J. Pasterkamp, J.H.M. Prehn, S. Schorge, J.S. Andersen, F. Rosenow, S. Bauer, J. Kjems, and D.C. Henshall, "A systems approach delivers a functional microRNA catalog and expanded targets for seizure suppression in temporal lobe epilepsy", Proc. Natl. Acad. Sci. USA, vol. 117, no. 27, pp. 15977-15988.
M. Alizadeh, L. Kozlowski, J. Muller, N. Ashraf, S. Shahrampour, F.B. Mohamed, C. Wu, and A. Sharan, "Hemispheric Regional Based Analysis of Diffusion Tensor Imaging and Diffusion Tensor Tractography in Patients with Temporal Lobe Epilepsy and Correlation with Patient outcomes", Sci. Rep., vol. 9, no. 1, p. 215.
A. De, A. Konar, A. Samanta, S. Biswas, and P. Basak, "Seizure prediction using low frequency EEG waves from WAG/Rij rats", 2nd International Conference for Convergence in Technology, vol. 2017, no. 244, p. 249.
E. Verche, C. San Luis, and S. Hernández, "Neuropsychology of frontal lobe epilepsy in children and adults: Systematic review and meta-analysis", Epilepsy and Behavior, vol. 88, no. 15, p. 20.
B. Klugah-Brown, C. Luo, R. Peng, H. He, J. Li, L. Dong, and D. Yao, "Altered structural and causal connectivity in frontal lobe epilepsy", BMC Neurol., vol. 19, no. 1, p. 70.
M.O. Baud, S. Vulliemoz, and M. Seeck, "Recurrent secondary generalization in frontal lobe epilepsy: Predictors and a potential link to surgical outcome?", Epilepsia, vol. 56, no. 9, pp. 1454-1462.
M.M. Siddiqui, G. Srivastava, and H. Saeed, "Diagnosis of Nocturnal Frontal Lobe Epilepsy (NFLE) sleep disorder using short time frequency analysis of PSD approach applied on EEG signal", Biomed. Pharmacol. J., vol. 9, no. 1, pp. 393-403.
V. Patel, D. Chisholm, T. Dua, R. Laxminarayan, and M.E. Medina-Mora, Disease Control Priorities, Third Edition (Volume 4): Mental, Neurological, and Substance Use Disorders., Washington, DC: World Bank, .
F. Pisano, "Convolutional neural network for seizure detection of nocturnal frontal lobe epilepsy", Complexity, vol. 2020, .
G. Busonera, M. Cogoni, M. Puligheddu, R. Ferri, G. Milioli, L. Parrino, F. Marrosu, and G. Zanetti, "EEG Spectral Coherence Analysis in Nocturnal Epilepsy", IEEE Trans. Biomed. Eng., vol. 65, no. 12, pp. 2713-2719.
B. Pisano, "Autosomal dominant nocturnal frontal lobe epilepsy seizure characterization through wavelet transform of EEG records and self organizing maps", IEEE International Workshop on Machine Learning for Signal Processing, MLSP, .
C. Opherk, J. Coromilas, and L.J. Hirsch, "Heart rate and EKG changes in 102 seizures: Analysis of influencing factors", Epilepsy Res., vol. 52, no. 2, pp. 117-127.
F. Leutmezer, C. Schernthaner, S. Lurger, K. Pötzelberger, and C. Baumgartner, "Electrocardiographic changes at the onset of epileptic seizures", Epilepsia, vol. 44, no. 3, pp. 348-354.
S. Selim, M. Tantawi, H. Shedeed, and A. Badr, "Reducing execution time for real-time motor imagery based BCI systems", Advances in Intelligent Systems and Computing, vol. 533, pp. 555-565.
S.M. Usman, M. Usman, and S. Fong, "Epileptic seizures prediction using machine learning methods", Comput. Math. Methods Med., vol. 2017, .9074759
X. Wei, L. Zhou, Z. Zhang, Z. Chen, and Y. Zhou, "Early prediction of epileptic seizures using a long-term recurrent convolutional network", J. Neurosci. Methods, vol. 327, .108395
Y. Yang, M. Zhou, Y. Niu, C. Li, R. Cao, B. Wang, P. Yan, Y. Ma, and J. Xiang, "Epileptic seizure prediction based on permutation entropy", Front. Comput. Neurosci., vol. 12, p. 55.
T. Das, A. Ghosh, S. Guha, and P. Basak, Classification of EEG Signals for Prediction of Seizure using Multi-Feature Extraction, .
M.Z. Parvez, and M. Paul, "Epileptic seizure prediction by exploiting spatiotemporal relationship of EEG signals using phase correlation", IEEE Trans. Neural Syst. Rehabil. Eng., vol. 24, no. 1, pp. 158-168.
S. Priyanka, D. Dema, and T. Jayanthi, "Feature selection and classification of Epilepsy from EEG signal", 2017 International Conference on Energy, Communication, Data Analytics and Soft Computing, ICECDS 2017, pp. 2404-2406.
B. Karlik, and Ş. Hayta, Comparison Machine Learning Algorithms for Recognition of Epileptic Seizures in EEG..
EEG Database — Seizure Prediction Project Freiburg, .http://epilepsy.uni-freiburg.de/freiburg-seizure-prediction-project/eeg-database
N. Dhulekar, S. Nambirajan, B. Oztan, and B.Ü.L. Yener, "Seizure prediction by graph mining, transfer learning, and transformation learning", Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 9166, pp. 32-52.
N. Moghim, and D.W. Corne, "Predicting epileptic seizures in advance", PLoS One, vol. 9, no. 6, .e99334
M. Bedeeuzzaman, O. Farooq, and Y.U. Khan, "Automatic Seizure Detection using Inter Quartile Range", Int. J. Comput. Appl., vol. 44, no. 11, pp. 1-5.
P. Balakrishnan, S. Hemalatha, and D.N.S. Keshav, "Detection of Startle-Type Epileptic Seizures using Machine Learning Technique", Int. J. Epilepsy, vol. 5, no. 2, pp. 92-98.
N. Mohan, P.P. Muhammed Shanir, N. Sulthan, K.A. Khan, and S. Sofiya, "Automatic epileptic seizure prediction in scalp EEG", Proceedings - 2nd International Conference on Intelligent Circuits and Systems, ICICS, pp. 281-285.
T. Netoff, Y. Park, and K. Parhi, "Seizure prediction using cost-sensitive support vector machine", Proceedings of the 31st Annual International Conference of the IEEE Engineering in Medicine and Biology Society: Engineering the Future of Biomedicine, EMBC 2009, pp. 3322-3325.
E. Urrestarazu, J.D. Jirsch, P. LeVan, J. Hall, M. Avoli, F. Dubeau, and J. Gotman, "High-frequency intracerebral EEG activity (100-500 Hz) following interictal spikes", Epilepsia, vol. 47, no. 9, pp. 1465-1476.
Y. Park, L. Luo, K.K. Parhi, and T. Netoff, "Seizure prediction with spectral power of EEG using cost-sensitive support vector machines", Epilepsia, vol. 52, no. 10, pp. 1761-1770.
R. J. Urbanowicz, M. Meeker, W. La Cava, R. S. Olson, and J. H. Moore, "Relief-based feature selection: Introduction and review", Journal of Biomedical Informatics, vol. 85, pp. 189-203.
M. Cherrington, F. Thabtah, J. Lu, and Q. Xu, Feature selection: Filter methods performance challenges, .
G. Niu, Data-driven technology for engineering systems health management., Springer Singapore, .
G. Chandrashekar, and F. Sahin, "A survey on feature selection methods", Comput. Electr. Eng., vol. 40, no. 1, pp. 16-28.
C. Sammut, and G. Webb, Encyclopedia of Machine Learning., Springer, .
Subspace, Latent Structure and Feature Selection - Statistical and Optimization Perspectives Workshop, SLSFS 2005 Bohinj, Slovenia, February 23-25, 2005, Revised Selected Papers | Craig Saunders | Springer..https://www.springer.com/gp/book/9783540341376
I. Guyon, S. Gunn, M. Nikravesh, L.A. Zadeh, Eds., Feature Extraction - Foundations and Applications., .
S. Vora, and H. Yang, "A comprehensive study of eleven feature selection algorithms and their impact on text classification", Proceedings of Computing Conference 2017, pp. 440-449.
B. Mwangi, T. S. Tian, and J. C. Soares, "A review of feature reduction techniques in Neuroimaging", Neuroinformatics, vol. 12, no. 2, pp. 229-244.
S. Liang, "Relationship between dynamical characteristics of sit-to-walk motion and physical functions of elderly humans", Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, pp. 871-875.
Z. Wang, and Z. Xie, "Infrared face recognition based on local binary patterns and Kruskal-Wallis test", IEEE/ACIS 13th International Conference on Computer and Information Science, ICIS 2014 - Proceedings, pp. 185-188.
H. Sanz, C. Valim, E. Vegas, J.M. Oller, and F. Reverter, "SVM-RFE: selection and visualization of the most relevant features through non-linear kernels", BMC Bioinformatics, vol. 19, no. 1, p. 432.
W. Du, Z. Cao, T. Song, Y. Li, and Y. Liang, "A feature selection method based on multiple kernel learning with expression profiles of different types", BioData Min., vol. 10, no. 1, p. 4.
A. Smolinska, J. Engel, E. Szymanska, L. Buydens, and L. Blanchet, "General Framing of Low-, Mid-, and High-Level Data Fusion With Examples in the Life Sciences", Data Handling in Science and Technology, vol. 31, pp. 51-79.
S.H. Huang, "Supervised feature selection: A tutorial", Artif. Intell. Res., vol. 4, no. 2, p. 22.
A.B. Bastiaan Sjardin, and Massaron Luca, "Large Scale Machine Learning with Python | Packt", Packt Publishing, .https://www.packtpub.com/product/large-scale-machine-learning-with-python/9781785887215
S. Bhattacharya, "A novel PCA-firefly based XGBoost classification model for intrusion detection in networks using GPU", Electron., vol. 9, no. 2, .
M. Chen, Q. Liu, S. Chen, Y. Liu, C.H. Zhang, and R. Liu, "XGBoost-Based Algorithm Interpretation and Application on Post-Fault Transient Stability Status Prediction of Power System", IEEE Access, vol. 7, pp. 13149-13158.
C.P. Hsieh, Y.T. Chen, W.K. Beh, and A.Y.A. Wu, "Feature Selection Framework for XGBoost Based on Electrodermal Activity in Stress Detection", IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation, pp. 330-335.
P. Mehta, "A high-bias, low-variance introduction to Machine Learning for physicists", Physics Reports, vol. 810, pp. 1-124.
D. Zhang, L. Qian, B. Mao, C. Huang, B. Huang, and Y. Si, "A Data-Driven Design for Fault Detection of Wind Turbines Using Random Forests and XGboost", IEEE Access, vol. 6, pp. 21020-21031.
T.T. Nguyen, J.Z. Huang, and T.T. Nguyen, "Unbiased feature selection in learning random forests for high-dimensional data", Scientific World Journal, vol. 2015, .471371
M.A. Elnaggar, M.A.E.L. Azeem, and F.A. Maghraby, "Machine Learning Model for Predicting Non-performing Agricultural Loans", Advances in Intelligent Systems and Computing, vol. 1153, pp. 395-404.
H. Zhang, J. Zhou, D. Jahed Armaghani, M.M. Tahir, B.T. Pham, and V. Van Huynh, "A Combination of Feature Selection and Random Forest Techniques to Solve a Problem Related to Blast-Induced Ground Vibration", Appl. Sci. (Basel), vol. 10, no. 3, p. 869.
I. Martin-Diaz, D. Morinigo-Sotelo, O. Duque-Perez, and R.D.J. Romero-Troncoso, "Advances in Classifier Evaluation: Novel Insights for an Electric Data-Driven Motor Diagnosis", IEEE Access, vol. 4, pp. 7028-7038.
R. Delgado, and X.A. Tibau, "Why Cohen’s Kappa should be avoided as performance measure in classification", PLoS One, vol. 14, no. 9, .e0222916
D. Chicco, and G. Jurman, "The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation", BMC Genomics, vol. 21, no. 1, p. 6.
M.A.U.H. Tahir, S. Asghar, A. Manzoor, and M.A. Noor, "A Classification Model for Class Imbalance Dataset Using Genetic Programming", IEEE Access, vol. 7, pp. 71013-71037.
V.R.K. Ramachandran, H.J. Alblas, D.V. Le, and N. Meratnia, "Towards an online seizure advisory system—An adaptive seizure prediction framework using active learning heuristics", Sensors (Switzerland), vol. 18, no. 6, .
A.K. Tafreshi, A.M. Nasrabadi, and A.H. Omidvarnia, "Empirical mode decomposition in epileptic seizure prediction", Proceedings of the 8th IEEE International Symposium on Signal Processing and Information Technology, ISSPIT 2008, pp. 275-280.
R. Henckaerts, Modeling, Predicting and Controlling Epileptic Seizures., KU Leuven: Leuven, Belgium, .
G. Rivera, R. Florencia, V. García, A. Ruiz, and J.P. Sánchez-Solís, "News classification for identifying traffic incident points in a Spanish-speaking country: A real-world case study of class imbalance learning", Appl. Sci. (Basel), vol. 10, no. 18, .
N. Li, M. Shepperd, and Y. Guo, "A systematic review of unsupervised learning techniques for software defect prediction", Information and Software Technology, vol. 122, no. 01, .
G. Schneider, Adaptive Systems in Drug Design., CRC Press, .
S. Boughorbel, F. Jarray, and M. El-Anbari, "Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric", PLoS One, vol. 12, no. 6, .e0177678
G.J. McLachlan, S. Rathnayake, and S.X. Lee, Comprehensive Chemometrics: Chemical and Biochemical Data Analysis, 2nd ed., Elsevier.
Physics and Engineering of Radiation Detection., Elsevier, .
V.A. Profillidis, and G.N. Botzoris, "Statistical Methods for Transport Demand Modeling", in Modeling of Transport Demand, Elsevier, pp. 163-224.
P. Fergus, A. Hussain, D. Hignett, D. Al-Jumeily, K. Abdel-Aziz, and H. Hamdan, "A machine learning system for automated whole-brain seizure detection", Appl. Comput. Informatics, vol. 12, no. 1, pp. 70-89.
A. Sharma, J.K. Rai, and R.P. Tewari, "Epileptic seizure anticipation and localisation of epileptogenic region using EEG signals", J. Med. Eng. Technol., vol. 42, no. 3, pp. 203-216.
A. Mulye, Power Spectrum Density Estimation Methods for Michelson Interferometer Wavemeters., University of Ottawa, .
A.S. Al-Fahoum, and A.A. Al-Fraihat, "Methods of EEG signal features extraction using linear analysis in frequency and time-frequency domains", ISRN Neurosci., vol. 2014, .730218
J.M. Girault, F. Ossant, A. Ouahabi, D. Kouamé, and F. Patat, "Time-varying autoregressive spectral estimation for ultrasound attenuation in tissue characterization", IEEE Trans. Ultrason. Ferroelectr. Freq. Control, vol. 45, no. 3, pp. 650-659.
C. Teixeira, "Brainatic: A system for real-time epileptic seizure prediction", Biosystems and Biorobotics, vol. 6, pp. 7-17.
F. Mormann, T. Kreuz, C. Rieke, R.G. Andrzejak, A. Kraskov, P. David, C.E. Elger, and K. Lehnertz, "On the predictability of epileptic seizures", Clin. Neurophysiol., vol. 116, no. 3, pp. 569-587.
X. Sun, "New Phase Difference Measurement Method for Non-Integer Number of Signal Periods Based on Multiple Cross-Correlations".
P.A. Abhang, B.W. Gawali, and S.C. Mehrotra, Introduction to EEG- and Speech-Based Emotion Recognition., Elsevier Inc., .
G. Buzsáki, and F.L. Silva, "High frequency oscillations in the intact brain", Prog. Neurobiol., vol. 98, no. 3, pp. 241-249.
B. Frauscher, N. von Ellenrieder, R. Zelmann, C. Rogers, D.K. Nguyen, P. Kahane, F. Dubeau, and J. Gotman, "High-Frequency Oscillations in the Normal Human Brain", Ann. Neurol., vol. 84, no. 3, pp. 374-385.
E.H. Smith, "Dual mechanisms of ictal high frequency oscillations in rhythmic onset seizures", medRxiv.
T. Uchida, K. Fujiwara, T. Inoue, Y. Maruta, M. Kano, and M. Suzuki, "Analysis of VNS Effect on EEG Connectivity with Granger Causality and Graph Theory", Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings, pp. 861-864.
X. Wu, W. Liu, W. Wang, H. Gao, N. Hao, Q. Yue, Q. Gong, and D. Zhou, "Altered intrinsic brain activity associated with outcome in frontal lobe epilepsy", Sci. Rep., vol. 9, no. 1, p. 8989.
M.K. Islam, Artifact Characterization, Detection and Removal from Neural Signals., National University of Singapore, .
D.A. Singh, and O. Aktas, The Window Size for Classification of Epileptic Seizures based on Analysis of EEG Patterns, .http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-186439
C. Xiao, S. Wang, L. Iasemidis, S. Wong, and W.A. Chaovalitwongse, "An Adaptive Pattern Learning Framework to Personalize Online Seizure Prediction", IEEE Trans. Big Data, pp. 1-1.
T. Wen, and Z. Zhang, "Effective and extensible feature extraction method using genetic algorithm-based frequency-domain feature search for epileptic EEG multiclassification", Medicine (Baltimore), vol. 96, no. 19, .e6879
S. Ertekin, Learning in Extreme Conditions: Online and Active Learning with Massive, Imbalanced and Noisy Data., The Pennsylvania State University, .
U.R. Acharya, S. Vinitha Sree, G. Swapna, R.J. Martis, and J.S. Suri, "Automated EEG analysis of epilepsy: A review", Knowl. Base. Syst., vol. 45, pp. 147-165.
S. Aydin, "Determination of autoregressive model orders for seizure detection", Comput. Sci., vol. 18, no. 1, .
P. S. Nagpaul, Time Series Analysis in WinIDAMS, .
E.E. Holmes, M.D. Scheuerell, and E.J. Ward, "Applied Time Series Analysis for Fisheries and Environmental Sciences", NOAA Fisheries, Northwest Fisheries Science Center. Available at: https://nwfsc-timeseries.github.io/atsa-labs/sec-dlm-forecasting-with-a-univariate-dlm.html
R. Sharma, P. Sircar, and R.B. Pachori, "Computer-aided diagnosis of epilepsy using bispectrum of EEG signals", in Application of Biomedical Engineering in Neuroscience, Springer Singapore, pp. 197-220.