Essentials of Predicting Epileptic Seizures Based on EEG Using Machine Learning: A Review

Predicting seizures with modern machine learning algorithms and computing resources would be a boon to persons with epilepsy and their caregivers. Researchers have shown great interest in the task of epileptic seizure prediction for a few decades. However, the results obtained have no clinical applicability yet because of the high false-positive rate. The lack of standard practices in the field of epileptic seizure prediction makes it challenging for novices to follow the research. The chances of reproducing results are negligible because details of the implementation environment, the use of standard datasets, and the evaluation parameters are often not reported.


INTRODUCTION
Epilepsy is a chronic non-communicable disease of the brain that affects people of all ages. According to the World Health Organization (WHO), around 50 million people suffer from epilepsy worldwide. It is estimated that there are more than 10 million people with epilepsy in India. Globally, an estimated five million people are diagnosed with epilepsy each year. Epilepsy leads to behavioral, health, and economic consequences. Efficient diagnosis and treatment are therefore very much needed.
The organization of this paper is as follows: Section 2 describes the fundamentals of epilepsy and treatment. Sections 3 and 4 describe the use of electroencephalography (EEG) for epileptic seizure prediction and the need for algorithms. Section 5 gives a comparative analysis of various approaches for epileptic seizure detection and prediction, followed by Section 6 with the research gaps. Section 7 gives a detailed understanding of the datasets used in prior research on epileptic seizure prediction. Section 8 gives a comparative analysis of the tools and libraries used for the implementation of epileptic seizure prediction systems, which is followed by the conclusion.

EPILEPSY AND TREATMENT
Seizures are due to excessively synchronous and/or excessively intense activity of neuronal circuits in the brain, particularly in the cerebral cortex. In epilepsy, seizures occur spontaneously, repeatedly, and usually suddenly. The unpredictability of seizures represents one of the main disabling features of epilepsy [1]. To date, the causes of epilepsy have not been identified. However, conditions like severe head injury, stroke and blood vessel diseases, tumors, changes in brain structure, and brain infections could provoke seizures [2]. Epileptic seizures can be classified into three groups [3]:
Generalized Onset Seizures: These seizures affect both sides of the brain, or groups of cells on both sides of the brain, at the same time. This group includes seizure types such as tonic-clonic, absence, and atonic seizures.
Focal Onset Seizures: These seizures start in one area or group of cells on one side of the brain.
Unknown Onset Seizures: When the beginning of a seizure is not known, it is called an unknown onset seizure.
Epileptic seizures have four different states: the preictal state, which appears before the seizure begins; the ictal state, which begins with the onset of the seizure and ends when the attack ends; the postictal state, which starts after the ictal state; and the interictal state, which starts after the postictal state of one seizure and ends before the start of the preictal state of the next seizure [4]. The effect of epilepsy is different in each individual, so recognizing and diagnosing the type of seizure or epilepsy affecting a person can sometimes be challenging. However, there are a few common ways of testing for and determining an epilepsy diagnosis [3]. Various methods, like electroencephalography (EEG), Computerized Tomography (CT) scans, Magnetic Resonance Imaging (MRI), and functional imaging studies, are used to evaluate a person with epilepsy [5,6]. EEG is the most common of these methods for the diagnosis and treatment of epilepsy. Persons with epilepsy are generally treated with Anti-Epileptic Drugs (AEDs). A high dosage of AEDs optimally controls epileptic seizures. However, regular consumption of AEDs generally causes side effects like tiredness, headache, dizziness, or blurred vision. It may also lead to behavioral changes in the person with epilepsy [7]. About 20-40% of persons with epilepsy are drug-resistant, i.e., they do not respond to AEDs even though a variety of drugs have been available for decades [8,9], and only a small minority of them can be helped by epilepsy surgery [10].
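As an illustration of how the four states partition a recording, the following minimal sketch labels a time point given annotated seizure onsets and offsets. The Seizure class, the label_state function, and the preictal and postictal durations (30 and 10 minutes) are assumptions made for this example and are not taken from the reviewed studies. For simplicity, the time before the preictal window of the first seizure is also labelled interictal.

```python
from dataclasses import dataclass


@dataclass
class Seizure:
    onset: float   # seizure start, in seconds from the start of the recording
    offset: float  # seizure end, in seconds from the start of the recording


def label_state(t, seizures, preictal_len=1800.0, postictal_len=600.0):
    """Return the state name for time t (seconds).

    preictal_len and postictal_len are assumed, illustrative durations.
    When windows overlap, the priority is ictal > preictal > postictal.
    """
    if any(s.onset <= t < s.offset for s in seizures):
        return "ictal"
    if any(s.onset - preictal_len <= t < s.onset for s in seizures):
        return "preictal"
    if any(s.offset <= t < s.offset + postictal_len for s in seizures):
        return "postictal"
    return "interictal"


# Hypothetical recording with one seizure lasting from 3600 s to 3660 s.
seizures = [Seizure(onset=3600.0, offset=3660.0)]
labels = [label_state(t, seizures) for t in (0.0, 2000.0, 3630.0, 3700.0, 5000.0)]
print(labels)
```

In practice the lengths of the preictal and postictal windows are design choices that differ between studies, which is one reason why reported results are hard to compare.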

ELECTROENCEPHALOGRAPHY (EEG)
An electroencephalogram (EEG) is a recording of the flow of neuronal ionic currents made using a pair of electrodes placed either inside or outside the scalp [11]. The applications of EEG signal processing include brain-computer interfaces, seizure detection, seizure prediction, schizophrenia detection and classification, and diagnosis of Parkinson's disease. EEG is widely used in the diagnosis, classification, and treatment of epileptic seizures [8]. If an invasive technique is used to record the EEG signal from inside the skull, it is called intracranial EEG (iEEG). In a non-invasive technique, EEG signals are recorded from the scalp, which is called scalp EEG (sEEG). EEG waveforms are generally classified into normal and abnormal signals using frequency parameters [12]. Based on frequency, EEG signals can be categorised into the following bands: Delta (0.1-4 Hz), Theta (4-8 Hz), Alpha (8-13 Hz), Beta, and Gamma. Different EEG frequencies correspond to different behaviors and mental states of the brain [12]. Due to the high complexity of EEG signals, a single prediction feature can only quantify some of their properties [7].
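The frequency categories above can be turned into a simple band-power computation. The sketch below is only an illustration under stated assumptions: the delta and theta limits follow the text, the alpha band is taken as 8-13 Hz, and the beta and gamma bands are left out because their limits are not given here. The function band_powers and the synthetic 10 Hz test signal are hypothetical. A direct discrete Fourier transform is used to keep the example free of dependencies; an FFT routine from a numerical library would normally be used instead.

```python
import cmath
import math

# Band limits in Hz; the alpha range is an assumption of this example.
BANDS = {"delta": (0.1, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0)}


def band_powers(signal, fs, bands=BANDS):
    """Sum normalized squared DFT magnitudes of `signal` (sampled at fs Hz) per band."""
    n = len(signal)
    powers = {name: 0.0 for name in bands}
    for k in range(1, n // 2 + 1):  # positive-frequency bins, DC excluded
        freq = k * fs / n
        name = next((b for b, (lo, hi) in bands.items() if lo <= freq < hi), None)
        if name is None:
            continue  # bin lies outside every listed band
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        powers[name] += abs(coeff) ** 2 / n ** 2
    return powers


# Synthetic two-second signal: a 10 Hz sine sampled at 128 Hz.
fs = 128.0
signal = [math.sin(2 * math.pi * 10.0 * i / fs) for i in range(256)]
powers = band_powers(signal, fs)
dominant = max(powers, key=powers.get)
print(dominant)  # alpha
```

Such band powers are one example of the frequency-domain features that the later sections refer to.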
The popularity of EEG is due to the following advantages: scalp EEG (sEEG) is a non-invasive technique that records the waveforms without much effort or active response from the subject/patient; the EEG recording kit is portable and financially affordable; EEG recording devices do not make any noise; and no special environment is needed to set them up [11]. Though EEG techniques have evolved considerably, there are still a few limitations that need to be addressed. For example, EEG is prone to low spatial resolution and a low signal-to-noise ratio (SNR) [11]. Preprocessing of EEG is also very challenging. To obtain an artefact-free EEG from which control signals can be extracted, the EEG has to be cleaned of artefacts such as eye blinks, electrocardiogram (ECG) interference, and any other internal or external disturbances [13].

THE NEED FOR EPILEPTIC SEIZURE PREDICTION ALGORITHMS
With evolving EEG technology and advances in resources, there has been huge interest in EEG waveform-based research for brain-computer interfaces (BCI), disease detection, and treatment. Characterization of EEG waveforms plays a vital role in the field of epileptic seizure detection and classification. With the use of sEEG or iEEG signals, certain patterns can be found to detect the preictal state of a seizure. The detection of a preictal state would trigger an alarm so that the patient or the patient's caregivers can take precautions or medicine beforehand to avoid the ill effects of the seizure [7]. Seizure detection methods can be used for the offline analysis of EEG waveforms or in seizure-abortion devices, whereas a seizure prediction system signals an upcoming seizure in advance of a certain period, called the occurrence period, in which the seizure is expected [7].
With rapidly increasing computing power and storage, EEG data have become more readily available. Researchers have started using the power of modern machine learning algorithms to improve the results of seizure prediction algorithms [14-27]. Though ample research has already been done, there is no clinical applicability yet [28,29]. This is because of the sensitivity of the seizure prediction algorithms. One of the challenges in the study of epileptic seizure detection and prediction is to achieve results that are good enough for clinical application [7]. Statistical justification is also desirable for a complete understanding of existing approaches that achieve significant results [7]. For clinical applicability of seizure prediction approaches, the alarm must be triggered a considerable time before the seizure. So, the seizure prediction horizon needs to be one of the important evaluation parameters of such systems [7]. Researchers have incorporated various approaches for epileptic seizure prediction. The following sections review the datasets incorporated, the tools and libraries used, and the different algorithms considered for the same. This would help novices explore and improve machine learning-based epileptic seizure prediction algorithms.
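The role of the prediction horizon in evaluation can be made concrete with a small sketch. It assumes one common convention, which is not stated in this paper: an alarm raised at time a is counted as correct if a seizure onset falls inside the occurrence period, which starts one seizure prediction horizon (sph) after the alarm and lasts sop seconds. The function evaluate_alarms and all numbers below are hypothetical.

```python
def evaluate_alarms(alarm_times, onset_times, sph, sop):
    """Return (sensitivity, false_alarms) for a list of alarms.

    Assumed convention: an alarm at time a is correct when some seizure
    onset o satisfies a + sph <= o <= a + sph + sop (all times in seconds).
    """
    predicted = set()   # onsets preceded by at least one correct alarm
    false_alarms = 0
    for a in alarm_times:
        hits = [o for o in onset_times if a + sph <= o <= a + sph + sop]
        if hits:
            predicted.update(hits)
        else:
            false_alarms += 1
    sensitivity = len(predicted) / len(onset_times) if onset_times else 0.0
    return sensitivity, false_alarms


# Two alarms and two seizures; only the first alarm precedes a seizure in time.
sensitivity, false_alarms = evaluate_alarms(
    alarm_times=[1000.0, 5000.0],
    onset_times=[2000.0, 9000.0],
    sph=300.0,    # 5-minute prediction horizon (illustrative)
    sop=1800.0,   # 30-minute occurrence period (illustrative)
)
print(sensitivity, false_alarms)  # 0.5 1
```

Dividing the number of false alarms by the recording duration in hours gives the false alarms per hour figure that is often reported alongside sensitivity.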

MACHINE LEARNING ALGORITHMS AND EVALUATION PARAMETERS
The work on epileptic seizure prediction has evolved drastically since its inception. Based on the methodologies followed and the algorithms used, epileptic seizure prediction can be categorized into four approaches, all of which are shown in Fig. (1): (1) the traditional machine learning approach, (2) the deep learning approach, (3) the signal processing approach, and (4) the hybrid approach.
Fig. (1). Diagrammatic representation of commonly used approaches for epileptic seizure prediction.

The Traditional Machine Learning Approach
The traditional machine learning approach includes stages such as preprocessing, feature extraction, feature selection, classification, and validation. Handcrafted features are used for the detection of the preictal state. A brief description of the stages is as follows:
Data Cleaning: EEG data are nonstationary and prone to artefacts, so they require extensive preprocessing. This is the phase that takes the most effort, as the data must be read, interpreted, and cleaned of noise.
Feature Extraction: Using different signal processing techniques, features are extracted in the time domain, the frequency domain, or the joint time-frequency domain.
Feature Selection: The feature extraction phase generates a large number of features, so it is important to find the features that matter for a specific task. Some of the features considered by researchers are entropy, approximate entropy, Hjorth parameters, spectral moments, mobility, energy, correlation coefficients, Fast Fourier Transform (FFT), variance, skewness, kurtosis, mean, fractal dimension, frequency band power, peak amplitude, zero crossing, average spectral power, line length, and maximal and minimal values.
Classification: The problem of epileptic seizure prediction and detection is essentially to differentiate the preictal state of the EEG signal from the interictal and ictal states. Classification algorithms such as Support Vector Machine, k-nearest neighbor, Gaussian Naive Bayes, random forest, and multilayer perceptron are used for this purpose.
Validation: Hyperparameters are not learned by the classification model during training; they are predefined. Model performance depends highly on the selection of hyperparameters. The best model among the candidate hyperparameter values is chosen in the validation stage.
Prediction: This is the last phase, in which predictions are made with the best-performing model.
Table 1 gives the detailed comparative analysis of various studies conducted using the traditional machine learning approach for epileptic seizure state detection and prediction.
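The feature extraction and classification stages described above can be sketched in a few lines. This is a toy illustration rather than the method of any reviewed study: the windows are synthetic sine waves, the four features (variance, line length, zero crossings, Hjorth mobility) are a small subset of those listed, and the helper names extract_features, knn_predict, and make_window are hypothetical. Preprocessing, feature selection, feature scaling, and validation are omitted.

```python
import math
import statistics


def extract_features(window):
    """Return [variance, line length, zero crossings, Hjorth mobility] of one window."""
    diffs = [b - a for a, b in zip(window, window[1:])]
    variance = statistics.pvariance(window)
    line_length = sum(abs(d) for d in diffs)
    # Count strict sign changes between consecutive samples.
    zero_crossings = sum(1 for a, b in zip(window, window[1:]) if a * b < 0)
    mobility = math.sqrt(statistics.pvariance(diffs) / variance) if variance > 0 else 0.0
    return [variance, line_length, float(zero_crossings), mobility]


def knn_predict(train_x, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training vectors (Euclidean)."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def make_window(amplitude, freq, n=128, fs=128.0):
    """Synthetic stand-in for a one-second EEG window (not real data)."""
    return [amplitude * math.sin(2 * math.pi * freq * i / fs) for i in range(n)]


# Toy training set: low-amplitude slow windows versus high-amplitude fast windows.
train_params = [(1.0, 4), (1.2, 5), (1.1, 6), (3.0, 12), (3.2, 14), (2.8, 13)]
train_x = [extract_features(make_window(a, f)) for a, f in train_params]
train_y = ["interictal"] * 3 + ["preictal"] * 3

prediction = knn_predict(train_x, train_y, extract_features(make_window(3.1, 12)))
print(prediction)  # preictal
```

With real EEG the features would be scaled before distances are computed, and the value of k would be one of the hyperparameters chosen in the validation stage.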

Deep Learning Approach
Deep learning-based models are end-to-end models, i.e., once EEG signals are given as input, feature extraction and selection are performed automatically. Based on learning and validation, the model converges and gives the prediction results. Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and variations of these algorithms have been used by researchers for the task of seizure prediction. This approach is trending because of the availability of computing resources. Table 2 gives the detailed comparative analysis of various studies conducted using the deep learning approach for epileptic seizure state detection and prediction.

Signal Processing Approach
Another noticeable approach for epileptic seizure prediction is based on signal processing methods. In this approach, traditional signal processing methods are used to process the data extensively. The large amount of data recorded from even a single EEG electrode pair presents a difficult interpretation challenge, so signal processing methods are needed to automate signal analysis and interpret the signal phenomena [30]. The prediction stage of this approach uses a basic classification algorithm for signal state classification. Table 3 gives the detailed comparative analysis of various studies conducted using the signal processing approach for epileptic seizure state detection and prediction.

Hybrid Approach
The combination of all three methods, i.e., traditional machine learning, deep learning, and signal processing, leads to a hybrid approach that takes the best of each. The literature reports combinations of traditional machine learning and deep learning models, of deep learning and signal processing models, and of machine learning and signal processing methods. The references for some hybrid approaches provide a detailed comparative analysis of various studies conducted using the hybrid approach to epileptic seizure state detection and prediction.
To evaluate the performance of various models, researchers have considered evaluation parameters such as sensitivity, specificity, false-positive rate, accuracy, prediction time, the receiver operating characteristic (ROC) curve and the area under it (AUC), and the F1 score. However, there is no clear consensus regarding which parameter best signifies the performance of epileptic seizure detection or prediction algorithms. Yannick Roy et al. [31] and Alexander Craik et al. [32] have presented an extensive comparison of the work done in the field of epileptic seizure prediction. Along with the comparison, they also mention various aspects for future research.
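Several of the evaluation parameters named above follow directly from the confusion matrix of segment-level predictions. The sketch below uses their standard definitions; the function names safe_div and evaluate, the label coding (1 for preictal, 0 for interictal), and the example labels are assumptions made for illustration. ROC, AUC, and prediction time are not covered because they require classifier scores and alarm timing.

```python
def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def evaluate(y_true, y_pred):
    """Compute confusion-matrix metrics; label 1 is the positive (preictal) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = safe_div(tp, tp + fn)   # true-positive rate (recall)
    precision = safe_div(tp, tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": safe_div(tn, tn + fp),
        "false_positive_rate": safe_div(fp, fp + tn),
        "accuracy": safe_div(tp + tn, len(pairs)),
        "f1": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical labels for ten EEG segments.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
results = evaluate(y_true, y_pred)
print(results)
```

When interictal segments vastly outnumber preictal ones, accuracy can stay high even if most preictal segments are missed, which is why sensitivity and the false-positive rate are usually reported together.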

RESEARCH GAPS
To increase the reliability of seizure prediction results and their clinical applicability, more evolved models of EEG signal analysis are needed [11]. This section describes some areas of improvement that could aid the performance of seizure detection and prediction models and may lead to clinical applicability. No method has exhibited both high sensitivity and zero false alarms per hour, which reliable results would require [28]. Also, existing machine learning algorithms unnecessarily reduce the number of parameters in feature selection for simplistic classification [28]. Advanced techniques can be used for preprocessing EEG data to increase the sensitivity of the results [4]. Identifying proper metrics for the evaluation of seizure prediction models is also a big challenge because the data are highly imbalanced. Most of the work done in the field of epileptic seizure detection and prediction focuses on patient-specific approaches. The concepts of domain adaptation and transfer learning can be used for cross-patient research, i.e., generalized models [28]. Better generalization performance between subjects will be necessary to truly make BCIs useful [31]. All prediction methods have been developed and tested on different EEG data pools, making it difficult to compare their performance [101,119].

Table 5 shows the comparative analysis of datasets used by various researchers for epileptic seizure detection and prediction. The datasets considered are highly unlike each other. The differences are in terms of the EEG recording mechanism used (sEEG or iEEG), the number of subjects, the number of channels, the duration of the recordings, and the number of recordings with a true positive case. The '*' in Table 5 indicates that the parameter has not been specified by the researchers. The research also divides into intra-subject (subject-specific) and inter-subject (adaptive) approaches.
The selection of the dataset is highly dependent on these parameters. As derived from Table 5, the most widely used datasets in the field of epileptic seizure detection/prediction are the CHB-MIT dataset [81,82], the Kaggle competition dataset [83], the Bonn University dataset [84], and the BCI competition dataset [85]. Apart from these, other significant available datasets are the TUH dataset [86], the European epilepsy dataset [87], and the EEG epilepsy dataset [88]. The dataset is selected according to whether a subject-specific or an adaptive approach to epileptic seizure prediction is followed [89]. For reproducibility of the results, how the dataset was used must be clearly stated.

TOOLS AND LIBRARIES USED FOR IMPLEMENTATION
Though an ample amount of work has already been done in the field of epileptic seizure detection and prediction, the available published work does not contain much information related to the implementation environment used by the researchers. Table 6 summarizes the details available in past work on epileptic seizure detection and prediction.
It can be derived from Table 6 [111]. Whereas R provides packages like eegkit [112], eegUtils [113], eegR [114], erpR [115], and ERP [116].
Table 6. Frameworks used by various researchers for epileptic seizure prediction.
While undertaking any research project that includes EEG signal analysis, researchers spend an ample amount of time finalizing the platform to adopt for implementation. The following details compare Matlab, Python, and R with respect to various parameters useful for implementation:
Open Source: The major advantage of using Python and R over Matlab is that they are open source, making them an attractive solution for many applications. Applications developed in Python and R can be widely distributed, which makes collaboration between scientists at various locations easier.
Maturity of Tools: The tools supported by Matlab are more mature than those of Python and R. Matlab is best suited for detailed analysis of EEG signals.
Flexibility of Manipulation: This maturity makes it less flexible to manipulate and extend features in Matlab.
Community Support: Community support is an important factor in selecting a platform for research. Matlab and Python both have huge community support when it comes to EEG signal processing. This area is still evolving in R; however, the statistical analysis offered by R is incomparable.
One of the best solutions could be to combine the best features of each platform. Python has the Oct2py library [117], which converts Python data structures to Matlab or Octave data structures and vice versa. It offers a simple and stable way to run Matlab-compatible functions from Python through GNU Octave, and most EEGLAB functions may be called from within Python using this method. R provides the reticulate package [118], which allows running Python code directly within R.

CONCLUSION
EEG-based epilepsy detection and prediction using machine learning approaches have received a boost from the evolution of technology. The essential aspects of EEG waveform-based research for epileptic seizure detection and prediction have been discussed in detail. A comparative analysis of the datasets and implementation platforms considered in past approaches is given, along with details of traditional machine learning, deep learning, and the combination of both, the hybrid approach, which would help novices get started. The selection of the dataset and implementation environment is highly dependent on the requirements of the research. Various criteria play an important role: for datasets, whether subject-specific or generalized research is to be carried out, the number of channels to consider, and the number of patients to consider; for the implementation environment, budget, programming expertise, and the size of the dataset. Shortcomings of epileptic seizure prediction approaches are also discussed to shed light on future enhancements.

CONSENT FOR PUBLICATION
Not applicable.