
Scientific Reports, volume 12, Article number: 13010 (2022)

Electroencephalogram (EEG) is one of the main diagnostic tests for epilepsy. The detection of epileptic activity is usually performed by a human expert and is based on finding specific patterns in the multi-channel electroencephalogram. This is a difficult and time-consuming task, therefore various attempts are made to automate it using both conventional and Deep Learning (DL) techniques. Unfortunately, authors often do not provide sufficiently detailed and complete information to allow their results to be reproduced. Our work is intended to fill this gap. Using a carefully selected set of 79 neonatal EEG recordings, we developed a complete framework for seizure detection using a DL approach. We share ready-to-use R and Python code which allows one to: (a) read raw European Data Format (EDF) files, (b) read data files containing the seizure annotations made by human experts, (c) extract train, validation and test data, (d) create an appropriate Convolutional Neural Network (CNN) model, (e) train the model, (f) check the quality of the neural classifier, and (g) save all learning results.

Epilepsy is a neurological disorder of the brain that affects people of all ages. Around 50 million people worldwide have epilepsy, making it one of the most common neurological diseases globally. It is estimated that up to 70% of people living with epilepsy could live seizure-free if properly diagnosed and treated with the help of anti-epileptic drugs1. A relatively small number of patients require surgical intervention (mainly those who are resistant to drug therapy) and/or electrical stimulation2,3.

The source of this disease is still not well understood. Despite this, many patients can be medically treated if seizures are diagnosed in time. As a gold standard, the electroencephalogram (EEG) signal is very important in the diagnosis of epilepsy. EEG recordings are collected by placing electrodes on the scalp of the patient and then recording the electrical signals produced by the brain. Typically, diagnosis using EEG signals relies on the knowledge and experience of experts, based on visual inspection of the seizure signals recorded during EEG sessions. However, this process is error-prone, expensive and slow. It is not so rare that two independent experts evaluate the same electroencephalogram significantly differently4. This is not a desirable situation, as it may lead to, for example, improper treatment.

The aim of this paper was to develop a complete framework for EEG-based seizure detection using Deep Learning (DL) techniques. We have chosen the Convolutional Neural Network (CNN) approach as currently one of the most promising technologies in the area of data analysis. To present the developed framework we chose an EEG database with 79 carefully selected neonatal EEG recordings along with seizure annotations made by three human experts5. Let us mention that this dataset was also used by other researchers in their works6,7,8. In6 the authors developed a novel method for detecting the nonstationary periodic characteristics of EEG signals to detect periods of seizure and nonseizure activity. In7 the authors use a methodology similar to ours based on CNNs and also note the need for large amounts of training data to achieve satisfactory results. They use a concept called weak annotations9 to increase the amount of training data. In8 the authors assess how different deep learning models and data balancing methods influence learning in neonatal seizure detection. They also propose a model which assigns a level of importance to each of the EEG channels, which helps clinicians understand which channels contributed most to the detection of a seizure.

In recent times DL techniques have been shown to be very useful for solving many complex tasks, mainly related to the classification of images, video sequences and text data; see, as an example, two selected works10,11. A lot of papers have also been published in which the authors present the results of numerous automated seizure detection algorithms (SDA) for EEG signals. In three review works12,13,14, the authors have compiled most of these results. Different DL models have been used in SDA, such as classical sequential CNNs but also variants like Convolutional Autoencoders (CNN-AEs), Convolutional Recurrent Neural Networks (CNN-RNNs) and Long Short-Term Memory (LSTM) networks15. Our work can be considered another proposal in this area of research.

It should also be mentioned that there are quite a few non-EEG-based methods for epileptic seizure detection, such as near-infrared spectroscopy (NIRS), functional MRI (fMRI), positron emission tomography (PET), magnetoencephalography (MEG) or electrocorticography (ECoG)16.

The main contributions of this paper can be listed as follows:

We have proposed a DL framework based on CNNs for detecting seizure activity and tested its usability on a real neonatal EEG dataset.

We have proposed a sliding window design to generate fully balanced training data. The design can greatly increase the amount of data which is then fed to the neural network. This can be seen as a kind of data augmentation and this process is crucial for CNNs which typically require large amounts of data to operate effectively and produce useful results.

We have developed a solution for reading raw EDF and annotation files with seizure indications made by human experts. Based on these data, a training dataset for the CNN network is generated and saved in HDF5 format. This work was programmed in the R programming environment and shared with the user as ready-to-use R scripts.

We have developed a CNN model which can be successfully trained to detect seizure episodes. The obtained classification results (at the level of 96–97%) should be considered almost perfect. This work was programmed in the Python programming environment and shared with the user as a ready-to-use Python Jupyter notebook.

We have made it our priority to ensure that all the presented results are fully reproducible by other researchers. Therefore, all the source code as well as all the output results obtained by the authors have been included in the Supplementary Information files. Detailed instructions on how to reproduce the results have also been included.

We consider this point particularly important. To cite a very extensive review work14: "...the great majority of papers did not make their code available. Many papers reviewed are thus more difficult to reproduce: the data is not available, the code has not been shared, and the baseline models that were used to compare the performances of the models are either nonexistent or not available."

The study will also help readers to analyze their own EEG datasets with only minor modifications to our R and Python codes (adjusting them to possible differences in the EEG data used and in the way seizures are annotated).

Figure 1. The overall workflow of the proposed system.

The overall workflow of the proposed system, schematically depicted in Fig. 1, is decomposed into 4 main phases: (1) preprocessing of the raw EEG recordings and annotation files, (2) building the CNN model, (3) training the CNN model, (4) generating the final classification results. The preprocessing stage is designed to load the input data (raw EDF and annotation files) and convert it to a format that can be submitted to the CNN model. This step has been implemented in the R software, version 4.1.217. Building the CNN model, training it and finally generating all the results has been implemented in TensorFlow version 2.8.018 and delivered as a Python Jupyter notebook19.

The study was conducted on a dataset of 79 carefully selected neonatal EEG recordings. The neonates were admitted to the neonatal intensive care unit (NICU) at the Helsinki University Hospital between 2010 and 2014. The cohort is described in detail in5; please refer to the source text. Moreover, the relevant ethics approval is included therein. All experiments were performed in accordance with the relevant guidelines and regulations.

A neonatal seizure is a seizure in a baby younger than 4 weeks old. Such seizures differ from those of older children and adults, mainly due to brain immaturity20. The most frequent neonatal seizures are described as subtle because the clinical manifestations are frequently overlooked21.

The neonatal EEG dataset consists of (a) 79 raw EDF files and (b) 3 annotation files in CSV and Matlab MAT formats. The EDF files contain EEG referential signals recorded with 19 electrodes positioned as per the international 10-20 standard (Fp1, Fp2, F3, F4, F7, F8, Fz, C3, C4, Cz, P3, P4, Pz, T3, T4, T5, T6, O1, O2). The sampling frequency was set to 256 Hz and the signals were recorded in microvolts. The complete dataset is available at https://zenodo.org/record/4940267.
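
For orientation, a raw recording can also be inspected in Python. The sketch below uses the MNE library, which is an assumption on our part (the authors' preprocessing reads EDF files in R with the edf package), and the file name eeg1.edf is illustrative.

```python
# Minimal sketch, assuming MNE-Python is installed; not part of the authors' pipeline.
import mne

raw = mne.io.read_raw_edf("eeg1.edf", preload=True)  # load one neonatal recording
print(raw.info["sfreq"])   # expected: 256.0 Hz
print(raw.ch_names)        # the 19 referential channels (Fp1, Fp2, ..., O1, O2)
data = raw.get_data()      # NumPy array of shape (n_channels, n_samples), values in volts (MNE convention)
```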

Figure 2. Electrode locations of the International 10–20 system for EEG recording (figure taken from30).

The raw signals are not used directly. Instead, the so-called bipolar montage, colloquially known as the 'double banana', was generated (Fp2-F4, F4-C4, C4-P4, P4-O2, Fp1-F3, F3-C3, C3-P3, P3-O1, Fp2-F8, F8-T4, T4-T6, T6-O2, Fp1-F7, F7-T3, T3-T5, T5-O1, Fz-Cz, Cz-Pz), see Fig. 2. This bipolar EEG montage was used by 3 independent experts to annotate the presence of seizures.
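
As a minimal illustration of this step in Python (the authors implement it in R; the function and variable names below are ours), the 18 bipolar channels can be derived from the 19 referential signals by simple subtraction, which also enforces a fixed channel order:

```python
import numpy as np

# 18 bipolar pairs of the longitudinal 'double banana' montage, as listed above
BIPOLAR_PAIRS = [
    ("Fp2", "F4"), ("F4", "C4"), ("C4", "P4"), ("P4", "O2"),
    ("Fp1", "F3"), ("F3", "C3"), ("C3", "P3"), ("P3", "O1"),
    ("Fp2", "F8"), ("F8", "T4"), ("T4", "T6"), ("T6", "O2"),
    ("Fp1", "F7"), ("F7", "T3"), ("T3", "T5"), ("T5", "O1"),
    ("Fz", "Cz"), ("Cz", "Pz"),
]

def make_bipolar(referential, channel_names):
    """referential: array (19, n_samples); channel_names: list of the 19 electrode labels."""
    idx = {name: i for i, name in enumerate(channel_names)}
    rows = [referential[idx[a]] - referential[idx[b]] for a, b in BIPOLAR_PAIRS]
    return np.stack(rows)  # shape (18, n_samples), fixed channel order regardless of EDF order
```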

The annotation files are sampled with one-second resolution. The detailed structure of these files is described in5. Since reading this data directly from CSV or MAT files is quite inconvenient, we have collected basic quantitative data on seizures and included them in two tables. Table 6 shows how many seizures were annotated for each infant by each of the three experts. Note that we have 40 neonates with seizures annotated by 3 experts and 17 neonates with seizures annotated by 1 or 2 experts; 22 neonates were seizure free. The experts are marked as A, B or C. Table 7 shows a complete list of the lengths of seizures annotated by the 3 experts (in whole seconds). The total number of seizures is 1,379, which is obviously the same as shown in the last lines of Table 6. Tables 6 and 7 are very long, but the authors decided to include them in their entirety, as obtaining this data from the CSV files manually would be very time consuming. The use of appropriate software here is essential. An additional summary of the annotations is provided in Table 8.

Let us note here that in many cases there is a discrepancy between the annotations of individual experts. For example, for infant number 41, experts A and C indicated significantly more seizures than expert B. The lengths of individual seizures also very often vary between experts. Such variety in the end results (no consensus among experts) is quite natural in the field of EEG signal analysis4.

The raw EEG recordings were preprocessed (read and saved in the Hierarchical Data Format (HDF5)) using the R software, version 4.1.217. The HDF5 format was chosen because it is well suited to storing and organizing large amounts of data.

In our research the Keras DL library was used to develop the CNN model22. It is also worth noting that Keras is a wrapper around the TensorFlow framework18. Keras was adopted and integrated into TensorFlow in mid-2017. Users can access it via the tf.keras module. TensorFlow, on the other hand, is an open-source DL framework developed by Google and released in 2015. Typically, one defines a model with Keras' interface, which is easier to use, and then drops down into TensorFlow when a feature that Keras does not have, or some specific TensorFlow functionality, is needed.

Due to the considerable computing power required, our code was run in the Colaboratory cloud service hosted by Google23, where fast GPU cards are available (https://colab.research.google.com). The Google Colaboratory service allows users to write and run Python code directly in the web browser, which is an extremely convenient solution. Similar functionality is offered by the Kaggle service (https://www.kaggle.com/).

This section describes in detail how to prepare the datasets for further analysis. This is a very important issue which, if not handled properly, may affect the final results of EEG signal classification. Unfortunately, in many papers the authors omit a more detailed description of this stage. We would like to fill this gap here. The process involves several steps, described below.

Figure 3. Two exemplary EEG signals. At the top, the original signal sampled at 256 Hz is depicted; at the bottom one can see the signal after reducing the frequency to 64 Hz.

Step 1. Selection of EEG recordings: The data is analyzed separately for each expert (A, B or C). We are dealing here with a binary classification (seizure / non-seizure). Therefore, it is necessary to select from the available EEG signals those that were assessed by the experts as containing seizures and those assessed as seizure free.

40 neonates had a seizure annotated by all 3 experts (infants No. 1, 4, 5, 7, 9, 11, 13, 14, 15, 16, 17, 19, 20, 21, 22, 25, 31, 34, 36, 38, 39, 40, 41, 44, 47, 50, 51, 52, 62, 63, 66, 67, 69, 71, 73, 75, 76, 77, 78, 79). We mark this subset as EXP3. 22 neonates were seizure free (infants No. 3, 10, 18, 27, 28, 29, 30, 32, 35, 37, 42, 45, 48, 49, 53, 55, 57, 58, 59, 60, 70, 72). We mark this subset as EXP0. Finally, the remaining 17 neonates had a seizure annotated by only 1 or 2 experts (infants No. 2, 6, 8, 12, 23, 24, 26, 33, 43, 46, 54, 56, 61, 64, 65, 68, 74). We mark this subset as EXP12. Due to the ambiguity in the expert opinion this subset was excluded from the analysis. Table 6 summarises the three subsets.

Step 2. Bipolar montage: In the next step the bipolar montage was generated as described in the "Input data" section. At this point, it should be noted that the order of signals in individual EDF files differs, so they must always be arranged in the same order. It is a small but very important part of data preprocessing. For example, in the EDF file of infant No. 1 the order of the raw signals is Fp1, Fp2, F3, F4, C3, C4, P3, P4, O1, O2, F7, F8, T3, T4, T5, T6, Fz, Cz, Pz, while in the file of infant No. 2 the order is Fp1, Fp2, F3, F4, F7, F8, Fz, C3, C4, Cz, T3, T5, T4, T6, P3, P4, Pz, O1, O2. This step has been implemented in R.

Step 3. Down-sampling: The sampling frequency of the EEG recordings was set to 256 Hz. In the case of analyses using neural networks, this frequency is too high and unnecessarily increases the size of the input data (already quite large). Therefore, the data is down-sampled. After performing various experiments, the authors concluded that the optimal down-sampling coefficient is 4. This means that signals with a frequency of 64 Hz are fed to the input of the neural network. Reducing the frequency can, in a sense, be treated as a form of data smoothing. Figure 3 shows two fragments of EEG recordings, each 3 seconds long. In the upper figure the signal frequency is 256 Hz and in the lower figure it is down-sampled to 64 Hz. The aforementioned smoothing effect is clearly visible. Down-sampling has been implemented in R.
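
A minimal Python sketch of this step is shown below, assuming the signals are held in a NumPy array; scipy.signal.decimate applies an anti-aliasing filter before reducing the rate and is used here as a reasonable stand-in for the authors' R implementation, which may differ in detail.

```python
import numpy as np
from scipy.signal import decimate

def downsample(signals, factor=4):
    """signals: array (n_channels, n_samples) at 256 Hz -> down-sampled to 64 Hz when factor=4."""
    return decimate(signals, factor, axis=-1, zero_phase=True)

x = np.random.randn(18, 256 * 3)   # 3 seconds of 18-channel data at 256 Hz (synthetic)
y = downsample(x)                  # shape (18, 192), i.e. 3 seconds at 64 Hz
```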

Step 4. Sliding window design: From Table 6 one can calculate that an average of 460 seizures were annotated per expert in the EEG dataset. This number is definitely too small to effectively train neural networks (especially when training convolutional neural networks). Therefore, we used a sliding window technique to increase the amount of data which is then fed to the neural network. The second important task of the proposed sliding window design is to select a balanced number of seizure and non-seizure chunks. The process of preparing training data for CNN consists of two steps: a) selection of positive and b) selection of negative samples from all recorded EEG signals. A positive sample is a chunk/fragment with an annotated seizure, a negative sample is a seizure-free chunk/fragment. The design is illustrated in Figs. 4, 5 and 6.

In all three figures the F3-C3 channel of infant # 1 is depicted (arbitrarily selected by the authors). In the top panels there are two EEG signal fragments with seizures annotated by expert A. The first seizure begins at the 104th second and ends at the 121st second; the second begins at the 6847th second and ends at the 6863rd second (see Table 7). In the bottom panels there is the F3-C3 channel of infant # 10, which is seizure free, with the appropriate number of chunks selected randomly (5, 4 and 10, respectively).

We have two parameters at our disposal (window and chunks). Using them, we can define what the resulting data samples will look like. In Fig. 4 we set window=6 and chunks=3. This means that we want to choose 3 chunks from every annotated seizure, each 6 seconds long. Note that the second seizure is 17 seconds long, so it is actually possible to select only 2 and not 3 chunks (otherwise, we would fall into a non-seizure area). The first seizure is 18 seconds long, so it is possible to select 3 chunks. In Fig. 5 we set window=5 and chunks=2, and now the lengths of both seizures allow us to select 2 chunks. In Fig. 6 the window size is set to 2 and the desired number of chunks is 5.

Next, we need to select a relevant number of seizure-free chunks. The binary classification (seizure/non-seizure) requires that the dataset be well balanced. In the context of a classification task, this means that the numbers of seizure and non-seizure samples should be more or less the same. The non-seizure chunks are randomly selected from non-seizure EEG signals (bottom panels in Figs. 4, 5 and 6; patients in the group EXP0, see Table 6). As a result, there is no danger that the two subsets will be unbalanced. The total number of non-seizure chunks is 5, 4 and 10, respectively, in our examples. The window and chunks parameters can of course be set to any positive integer values, according to your needs.

The above-described method of selecting windows and chunks has been implemented in R.
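
The sketch below illustrates the chunk-selection logic in Python under stated assumptions (signals already down-sampled to 64 Hz, seizures given as inclusive (start, end) second pairs); the authors' actual implementation is the generate_samples() R function, so the helper names here are ours.

```python
import numpy as np

FS = 64  # sampling frequency after down-sampling (Hz)

def seizure_chunks(signal, seizures, window, chunks):
    """Select up to `chunks` consecutive, non-overlapping windows from inside each annotated seizure.

    signal: array (n_channels, n_samples) at FS Hz.
    seizures: list of (start_s, end_s) pairs in seconds; both endpoints are treated as
              seizure seconds (1-second-resolution annotations).
    """
    out = []
    for start_s, end_s in seizures:
        duration = end_s - start_s + 1                  # e.g. seconds 104..121 -> 18 s
        n_fit = duration // window                      # full windows that fit inside the seizure
        for k in range(min(chunks, n_fit)):             # never spill into the non-seizure area
            a = (start_s + k * window) * FS
            out.append(signal[:, a:a + window * FS])
    return np.stack(out)                                # (n_selected, n_channels, window * FS)

def nonseizure_chunks(signal, window, n_chunks, seed=0):
    """Randomly select `n_chunks` windows from a seizure-free recording (to balance the dataset)."""
    rng = np.random.default_rng(seed)
    starts = rng.integers(0, signal.shape[1] - window * FS, size=n_chunks)
    return np.stack([signal[:, s:s + window * FS] for s in starts])
```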

Note also that there are studies in which the authors propose methods that allow for the effective detection of epileptic seizures in imbalanced EEG recordings24,25. However, our solution based on the CNN approach requires that the data be fully balanced, hence we use the sliding window design described above. Our design, by definition, guarantees the generation of a fully balanced dataset. If unbalanced data were fed to the CNN network, the obtained results (binary classification: seizure / non-seizure) would be less reliable and accurate.

We also point out a subtle terminological difference. In the field of EEG signal analysis, the term epoch is used. EEG epoching is a procedure in which specific time windows are extracted from the continuous EEG signal. In our approach we use the term window rather than epoch to emphasize a slightly different meaning. We do not divide the entire EEG signal into epochs, but select only the fragments that interest us, which we call windows; please see Figs. 4, 5 and 6 for an explanation. Because CNN networks require large amounts of data to function properly (mainly in order to reduce the phenomenon known as overfitting), we also introduce the concept of chunks, which allows us to increase the amount of training data at our disposal. Let us also mention that the concept of chunks is somewhat similar to the commonly used data augmentation, a powerful technique for mitigating overfitting in computer vision. Note also that some authors propose an epoch reduction approach for better model accuracy26,27, but in our case this technique is not applicable.

Step 5. Saving data in HDF5 format: After completing all the above steps, we obtain the final matrix in which fragments with and without seizures are present. For the case shown in Fig. 4, the size of the matrix will be \(19 \times 10\), see the illustrative Fig. 8 (the last row is the class indicator, 1 means seizure, 0 means non-seizure). Note also that all 18 channels are analyzed simultaneously. The data in this form is then saved in HDF5 format, which is very convenient for storing large files of numeric data in an efficient binary format. The saved HDF5 files are passed as input to the appropriate Python routines that implement CNN learning.

Note that in reality the matrices generated from our real EDF files will be much bigger. After down-sampling, every second of our EEG dataset is represented by 64 datapoints (see Step 3 above). Therefore, the matrix for the data schematically depicted in Fig. 4 would be \(19 \times 3840\). Moreover, when working with real data, matrices will be many times larger still, since multiple seizures are marked in the EDF files and EEG recordings are often longer than 20 seconds (unlike those shown in Figs. 4, 5 and 6). Additionally, the annotated seizures last for many seconds (see Table 7) and the window and chunks parameters can take values greater than those in the toy example shown above. For example, from Table 6 we read that expert A annotated 385 seizures in the subset EXP3. If the parameters have the following values: window=2, chunks=3 and f=64 Hz, the matrix will have \(385 \times 64 \times 2 \times 3 = 147{,}840\) columns.
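
As an illustration of this step in Python, a matrix with the Fig. 8 layout (18 channel rows plus a class-indicator row) can be assembled and written to HDF5 with h5py as below; the data are synthetic, the dataset name "data" is an assumption, and the authors' own implementation uses the rhdf5 R package.

```python
import h5py
import numpy as np

window, fs = 6, 64
chunk_len = window * fs                                   # 384 datapoints per 6-second chunk
pos = np.random.randn(5, 18, chunk_len)                   # 5 seizure chunks (placeholder data)
neg = np.random.randn(5, 18, chunk_len)                   # 5 non-seizure chunks

channels = np.concatenate([pos, neg]).transpose(1, 0, 2).reshape(18, -1)   # (18, 3840)
labels = np.repeat(np.r_[np.ones(5), np.zeros(5)], chunk_len)              # class row: 1 = seizure, 0 = non-seizure
matrix = np.vstack([channels, labels])                                     # (19, 3840), as in the example above

with h5py.File("expert_A_6sec_3chunk_64Hz.hdf5", "w") as f:   # file name follows the paper's convention
    f.create_dataset("data", data=matrix, compression="gzip") # dataset name "data" is an assumption
```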

Figure 4. Sliding window design. (a) Channel F3-C3 of infant # 1. (b) Channel F3-C3 of infant # 10, which is seizure free. Red and blue signals are the real ones. The top signal has 2 seizures annotated by expert A. The first one starts at the 104th second, ends at the 121st second and is 18 seconds long. The second one starts at the 6847th second, ends at the 6863rd second and is 17 seconds long (see Table 7). By setting the appropriate values of the window and chunks variables, we can control the length of the samples (window variable) and their total number (chunks variable). The window length was set to 6 seconds and the number of chunks was set to 3. Note that the length of the second seizure fragment is 17 seconds. Consequently, it is possible to select only two chunks from the second seizure (although we assumed that we are selecting 3 chunks). From the first seizure one can safely get 3 chunks. The bottom EEG signal has no seizures annotated. We randomly select the same number of chunks (i.e. 5) as we have selected from the top EEG signal. Thanks to this method of selecting chunks, the numbers of seizure and non-seizure chunks are well balanced. The starting and ending seconds were chosen randomly (for example from 44 to 49, etc.).

Figure 5. An analogous example to the one shown in Fig. 4. The EEG signals are the same. The figures differ in that the window and chunks parameters have different values. Note that now the window length was set to 5 and 2 chunks from each of the two seizure fragments were selected, see top picture (a). Consequently, 4 chunks were selected from the non-seizure signal, see bottom picture (b).

Figure 6. An analogous example to the one shown in Fig. 4. The EEG signals are the same. The figures differ in that the window and chunks parameters have different values. Note that now the window length was set to 2 and 5 chunks from each of the two seizure fragments were selected, see top picture (a). Consequently, 10 chunks were selected from the non-seizure signal, see bottom picture (b).

Figure 7. The CNN sequential model used by the authors in all numerical experiments. The Python code in which the model is implemented is available for download in the Electronic Supplements.

The CNN model used in our research has the structure shown schematically in Fig. 7, produced using the summary() function implemented in Keras. Its structure is the result of many experiments and tests aimed at developing an optimal structure. A summary of the most important elements of the CNN architecture is given in Table 1.
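
The exact architecture is documented in Fig. 7 and Table 1; the sketch below is only an illustrative Keras Sequential model built from the same kinds of building blocks mentioned later (convolutions, batch normalization, max-pooling, dropout, L2 regularization) and is not a reproduction of the authors' model. The input shape (384, 18) corresponds to a 6-second window at 64 Hz with 18 bipolar channels.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def build_model(timesteps=384, n_channels=18):
    """Illustrative binary seizure/non-seizure classifier (not the authors' exact architecture)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(timesteps, n_channels)),
        layers.Conv1D(32, kernel_size=5, activation="relu", padding="same"),
        layers.BatchNormalization(),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(64, kernel_size=5, activation="relu", padding="same"),
        layers.BatchNormalization(),
        layers.MaxPooling1D(pool_size=2),
        layers.Flatten(),
        layers.Dense(64, activation="relu", kernel_regularizer=regularizers.l2(1e-3)),
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),          # probability of the seizure class
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

build_model().summary()   # prints a layer table analogous to the one shown in Fig. 7
```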

Figure 8. After selecting the desired number of chunks in Fig. 4, one must combine them in matrix form. In this example the matrix has 18+1 rows (the last row is the class indicator) and 10 columns. Every single cell represents a 6-second-long EEG signal fragment.

The data stored in the form of the two-dimensional matrix shown in Fig. 8 cannot directly become the input of the CNN network implemented in the Keras system. It must be transformed (in other words, rearranged) into so-called tensor form. A tensor is nothing but a generalization of the concept of a vector or a matrix. Only in this form can the data be used by the CNN. For those interested, we recommend a very clearly written book15. Details of the rearranged matrix are shown in Fig. 9. A tensor of size \(10 \times 384 \times 18\) and a vector of size \(10 \times 1\) are created. The rearranging has been implemented in Python. Looking at the tensor, it is easy to notice how the individual chunks are organized.

Figure 9. The two-dimensional matrix shown in Fig. 8 cannot be fed into the neural network in this form. In Keras a 3D tensor is required. The figure shows how the 2D matrix must be divided into a tensor and a vector with seizure indicators.
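
A minimal NumPy sketch of this rearrangement for the toy example (a \(19 \times 3840\) matrix turned into a \(10 \times 384 \times 18\) tensor and a \(10 \times 1\) label vector); the code actually used by the authors is in the Jupyter notebook, and it is assumed here that chunks are stored contiguously along the time axis.

```python
import numpy as np

def matrix_to_tensor(matrix, n_chunks, chunk_len):
    """matrix: (18 + 1, n_chunks * chunk_len); last row is the class indicator per datapoint."""
    signals, labels = matrix[:-1], matrix[-1]
    # split the time axis into chunks, then move channels last: (chunks, chunk_len, channels)
    x = signals.reshape(18, n_chunks, chunk_len).transpose(1, 2, 0)
    # one label per chunk (the class indicator is constant within a chunk)
    y = labels.reshape(n_chunks, chunk_len)[:, 0].reshape(-1, 1)
    return x, y

m = np.random.rand(19, 10 * 384)
x, y = matrix_to_tensor(m, n_chunks=10, chunk_len=384)
print(x.shape, y.shape)   # (10, 384, 18) (10, 1)
```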

In Fig. 10 four randomly chosen pairs of seizure/non-seizure chunks are depicted. The individual EEG signal values are represented as colormaps. It is easy to notice that the analysis of EEG signals, in the form of time series, de facto leads to the analysis of two-dimensional images. The upper plots show seizure chunks and the lower ones show non-seizure chunks. A certain pattern can be seen in Fig. 10a,b: the top drawings appear more blurry. In Fig. 10c,d, however, the human eye cannot see any clear differences. Nevertheless, the very good classification results obtained with the CNN approach prove (not for the first time anyway) that deep neural networks are able to successfully solve seemingly unsolvable tasks.

In order to make the CNN work correctly, it is necessary to split the data into three parts: (a) training, (b) validation and (c) test. The model is trained on the training data and its accuracy is constantly checked using the validation data. Once the model is trained, it is tested on the test data. The test set is not involved in the process of building and tuning the model. This is the basic principle that guarantees the objectivity of the obtained results. Splitting the data into training and validation sets is usually done randomly. The validation result will therefore depend on which elements of the dataset are used during validation and which during training, and a single random split may not be reliable. The best practice in such situations is to use K-fold cross-validation. It is based on splitting the available data into K folds (see Fig. 11), creating K identical models and training each of them on K-1 folds. Each model is evaluated on the remaining fold. The final validation score is the average of the K validation scores obtained. In our implementation the training and validation subset contains 80% of all data and the test subset contains 20%. The K parameter was set to 5.
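
A sketch of the 80/20 split followed by fivefold cross-validation, here written with scikit-learn for brevity (the authors' notebook implements the equivalent logic; the array shapes below are illustrative):

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

x = np.random.rand(3080, 128, 18)                 # e.g. 1540 seizure + 1540 non-seizure samples (2 s at 64 Hz)
y = np.r_[np.ones(1540), np.zeros(1540)]

# 80% for training/validation, 20% held out for the final test
x_trval, x_test, y_trval, y_test = train_test_split(x, y, test_size=0.2, random_state=0, stratify=y)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (tr_idx, val_idx) in enumerate(kf.split(x_trval)):
    x_tr, x_val = x_trval[tr_idx], x_trval[val_idx]
    y_tr, y_val = y_trval[tr_idx], y_trval[val_idx]
    # model = build_model(...); model.fit(x_tr, y_tr, validation_data=(x_val, y_val), ...)
    print(f"fold {fold}: train {len(tr_idx)}, validation {len(val_idx)}")
```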

Figure 10. Four exemplary seizure (top) and non-seizure (bottom) fragments of 2 seconds, where the sampling frequency is 64 Hz. This gives 128 individual datapoints. The EEG signals are represented as colormaps. It is easy to notice that the analysis of EEG signals, in the form of time series, de facto leads to the analysis of two-dimensional images.

The input data for the CNN are in the form of tensors and vectors, as shown in Fig. 9. Tensors contain both positive and negative samples and every sample is basically treated completely independently. This is in line with neural network principles: the neural network should be provided with as much training data as possible without making any assumptions about the relationship between the training samples. In other words, we do not make any additional assumptions about the acceptance or rejection of a given sample. Each of them is treated the same and it does not matter which patient it comes from.

As an example, please refer to Table 6. We can see that expert A annotated 385 seizures in the set named EXP3. Using our sliding window methodology, let us assume that window=2 and chunks=2. We thus obtain \(385 \times 2 \times 2 = 1540\) positive samples in total that will go to the input of the CNN network. Consequently, all available seizure signals are used and none are left out.

Because neural networks work best when the training data is balanced, in the next step we select the same number of negative samples. To make the data fully balanced, we select exactly 1540 negative chunks, each with a length of 2 seconds. The negative samples are taken from the patients marked as EXP0 (i.e. without any annotated seizures). Consequently, our CNN network receives 3080 samples. This set is then randomly split into a training-validation part and a test part (80% vs. 20%, i.e. 2464 vs. 616 samples). Finally, the training and validation subset is randomly split according to the cross-validation methodology, as visualized in Fig. 11. For \(K=5\), in every fold approximately \(2464 \times 4/5 \approx 1970\) samples are used for training and \(2464 \times 1/5 \approx 494\) samples for validation. The remaining 616 samples are used to evaluate the accuracy of the trained CNN models. The results are summarised in Tables 3, 4 and 5.

The seizure annotations presented in5 are shared in a specific, non-standard format. Therefore, in the first place, we have developed software that allows one to easily load this data and, on the basis of it, prepare batches that can be used as input data for the CNN. This part of the software was implemented in the R system. The data generated have the structure shown in Fig. 8. In our experiments we decided to choose the following values for the window and chunks variables: window=[1,2,5,10,20] and chunks=[1,2,5,10,20,10000]. The value 10000 means that the maximum possible set of contiguous chunks was selected; we can safely set chunks to 10000 because our dataset simply does not have seizures as long as 10,000 seconds. Using these values, 30 different datasets were generated for the annotations prepared by each of the experts A, B and C. This makes a total of 90 different datasets saved as HDF5 files; see the "Replication of the results" section for a detailed explanation of how to generate these files, how and where they are stored and what their naming convention is.
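
The 90 file names follow directly from the two parameter grids and the three experts; the short sketch below merely enumerates them according to the naming convention described in the "Replication of the results" section (the files themselves are produced by the R script):

```python
from itertools import product

experts = ["A", "B", "C"]
windows = [1, 2, 5, 10, 20]
chunks = [1, 2, 5, 10, 20, 10000]   # 10000 = take the maximum possible number of contiguous chunks

names = [f"expert_{e}_{w}sec_{c}chunk_64Hz.hdf5" for e, w, c in product(experts, windows, chunks)]
print(len(names))   # 90 datasets in total
print(names[0])     # expert_A_1sec_1chunk_64Hz.hdf5
```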

The CNN learning results are collected in Tables 3, 4 and 5. The best obtained test-set accuracy, the longest computation time and the biggest data size in chunks are printed in bold. We would like to point out here that the obtained classification results (at the level of 96–97%) should be considered very good, almost perfect. It should be emphasized, however, that in order to obtain such results, large amounts of training data are required. This is precisely why the sliding window design was developed and implemented.

It is worth noting that the learning process of CNNs is not deterministic. This means that, in principle, we are not able to obtain exactly the same results by performing the same calculations again and again. Each time the results will be slightly different. Nevertheless, the differences will not be too great. All calculations are performed 5 times (fivefold cross validation scheme) and then the average of all partial results is calculated. In Tables 3, 4 and 5 these average results are shown. Nevertheless, all the partial results are included in Electronic Supplements (in the results directory, see the directory structure in “Replication of the results” section).

In the tables we also show the average computation time (rounded to full minutes). These results should be treated with some caution. A lot depends on the type of GPU card and its temporary load. We worked in the Google Colaboratory and Kaggle cloud environments, where shared resources vary over time, sometimes quickly.

In this section, we provide various details that will help one to replicate all the results of our numerical experiments. We would also like to point out that in some places the source code hard-codes certain elements related to the specifics of the source data used. These are mainly: (a) EDF file names, (b) the number and names of channels stored in the EDF files, (c) the coding system of the seizure annotations. If the provided code is to be used in the future to analyze a different set of EEG signals, minor changes will have to be made. The authors are willing to provide the necessary help to interested researchers.

Figure 11. The fivefold cross-validation scheme used in our numerical experiments.

The overall workflow to reproduce the results obtained by the authors can be summarised in 7 steps which are shown schematically in Fig. 12.

Step 1. Download the dataset of neonatal EEG recordings: These are available at https://zenodo.org/record/4940267. There are 79 EDF files and 3 CSV annotation files. The EDF files are approximately 4 GB in size.

Step 2. Download the repository from the Electronic Supplements: (see the "Data and code availability" section). Upload the 79 EDF files and 3 CSV files which you downloaded in Step 1 to the edf and annotations directories. In the acc_loss, best_models, hists, logs, results, ROC and waveforms directories we have included all our output results. However, you can regenerate these results yourself by running the appropriate R and Python scripts, see the next steps below. The complete directory structure is given below and a short description of the content of the individual working subdirectories is given in Table 2.

Step 3. Install R and RStudio software: In R install two required packages: edf and rhdf5. Please note that the latter is installed from the Bioconductor28 and not from the primary R package repository.

Step 4. Set up a computing environment for Python: At the beginning, it will probably be most convenient to use a cloud-based environment. We can recommend Google Colaboratory (https://colab.research.google.com) or Kaggle (https://www.kaggle.com/). Both environments offer the possibility of using high-performance GPU cards for free. GPU cards are basically necessary to perform the required calculations using CNNs. Computing without GPU cards takes many times longer and, in fact, is unlikely to be completed within a reasonable time.

Step 5. Generate the required HDF5 files: To do this, run the EEG_neonatal.R script (this can take a few hours). Make sure that the current working directory is R. Also set the dir variable to one indicating the appropriate directory structure on your local computer. The parameters of the generate_samples() function can be changed depending on your actual needs. Those that are saved in the EEG_neonatal.R script will generate exactly the same HDF5 files that were included in the Electronic Supplements. After generating, copy all the HDF5 files to the inputs directory. The directory should contain 90 data files ready to be fed to the neural network and additionally 184 auxiliary files (these files contain some details about the generated HDF5 data but you do not need to use them). The data files have the same logical structure as in Fig. 9 and use a uniform naming convention. For example, the file expert_C_5sec_2chunk_64Hz.hdf5 means that the data was generated according to the annotations made by expert C, the window size was set to 5 seconds and the number of contiguous chunks was set to 2 (see Figs. 4 and 5). A similar naming convention was used for all other files in the working subdirectories.

Note: we do not put the HDF5 files in the regular Electronic Supplements, as their total size is about 16.6 GB. However, for your convenience, we have included them in separate zip archives, see the "Data and code availability" section.

Step 6. CNN processing: Open the EEG_neonatal.ipynb Jupyter notebook in your favourite Python 3 environment, local or cloud-based. Before the script is run, two global variables, namely WRK_DIR and INPUT_DIR, should be set, indicating the appropriate directories for your runtime environment. Leave the other global variables unchanged if you use the input data provided by the authors (i.e. the HDF5 files in the working/inputs directory).

The calculation results will be saved in the subdirectories of the working directory. The files share the same naming convention described above. For example, the file best_model_expert_A_1sec_1chunk_64Hz_fold_0.h5 stores the best model obtained during training of the neural network using the input file expert_A_1sec_1chunk_64Hz.hdf5 during the first fold (fold_0, see Fig. 11; we start counting folds from 0 according to the Python convention). To get the complete results presented in the paper, in the __Run__ block set the following values:

which_expert=np.array(["A","B","C"]) ,

At this point, we clearly point out that the calculations will take several days in total. It must be realized that calculations performed in the CNN environment unfortunately require enormous computing power. The computation time can be reduced fivefold, but at the cost of abandoning the K-fold scheme. To do so, in the Global variables block set COMPLETE_CALCULATIONS=False. However, the results obtained will be somewhat less objective.

Step 7. Inspecting final results: All final results are stored in the individual subdirectories of the working directory. These are: a) confusion matrices, b) accuracy, precision, recall and F-measure metrics, c) CNN processing accuracy and loss metrics as well as appropriate plots, d) ROC curves.

Figure 12. The overall workflow to reproduce the same results that the authors obtained.

Figure 13. An example of CNN training and validation accuracy (upper curves), as well as the training and validation loss (lower curves). These curves can be considered almost ideal: accuracy is almost 1, loss is almost 0 and there is no sign of the very disadvantageous phenomenon called overfitting.

Figure 14. An example of CNN training and validation accuracy (upper curves), as well as the training and validation loss (lower curves). Unlike the curves shown in Fig. 13, these are very bad: overfitting occurs very quickly, in this example around the 50th epoch.

A very thoroughly developed dataset was used for our research, although it is quite specific, as it concerns neonates5. This dataset has been annotated for neonatal seizures by three independent experts, each with over 10 years of experience in the visual interpretation of neonatal EEG. So it can be assumed that the annotations are very reliable. The dataset consists of 79 raw EDF files and 3 CSV files containing the annotations of the 3 experts for all 79 neonates. The times of seizure occurrence are marked by the experts with a resolution of one second, i.e. the experts indicated in which second a seizure activity started and ended.

A variety of approaches have been proposed to diagnose seizures using EEG recordings. In the days before DL, a variety of conventional machine learning algorithms were applied using features from the statistical, time, frequency or time-frequency domains. A comprehensive overview of such methods can be found, among others, in the book29. The results varied, but the complexity of EEG signals made it difficult to achieve truly spectacular results. It was only the development of DL techniques, mainly those based on convolutional neural networks, that brought really noticeable progress in the field of automatic seizure detection.

We used a sequential CNN model with regularization techniques such as dropout, max-pooling, batch normalization and L2 regularizers. It is important to note that we have developed a fairly flexible method of selecting the number of training samples (through the chunks parameter) and the length of individual samples (in seconds, through the window parameter). The user can thus very easily generate training data with the desired characteristics.

Our research basically confirms that deep neural networks, in order to perform their task well, must be provided with a sufficient amount of training data. The results presented in Tables 3, 4 and 5 show that the total number of training samples is not as important as the length of the individual samples (the window parameter). The value window=5 seems to be optimal; increasing it does not bring much improvement. As for the chunks parameter, basically the higher its value, the better the results. However, keep in mind that the training time of the neural network increases very quickly as well. The value chunks=20 gives quite good results.

In Fig. 13 we show an example of CNN training and validation accuracy (upper curves), as well as the training and validation loss (lower curves). The dataset was created on the basis of annotations made by expert B with the following parameters: window=5 and chunks=10000. In the context of training CNNs, these curves can be considered almost ideal: accuracy is almost 1, loss is almost 0 and there is no sign of the very disadvantageous phenomenon called overfitting. Note also that in this example the input dataset size is large enough (23,979, see Table 4) that this unfavorable phenomenon does not occur. If, on the other hand, the CNN receives too little training data (expert B, window=1 and chunks=1, see Fig. 14), overfitting occurs very quickly, in our example around the 50th epoch.

In our study, we used the neonatal EEG data set, which is basically quite specific. Nevertheless, the proposed framework can also be successfully applied to studies with EEG data from older patients (larger children, adults). In other words, our solution places no restrictions on what kind of patients the EEG data come from. We can consider two cases:

Building a new CNN model (or models) based on completely new data.

Classification of new data using CNN network already trained by us.

In the first case the main requirement is that the seizure annotations be in the same specific (in fact non-standard) format as our data. The annotations must be stored in a CSV file where each column corresponds to a subject (patient) and each row is the annotation of one second of the EEG recording (1 for seizure and 0 for non-seizure; please study the 3 files in the annotations directory to better understand their structure, and see the loading sketch after the requirement list below). As for raw EDF files, please note that they may have a slightly different structure (different number of channels, different channel names, etc.). So if someone would like to use our code to analyze their own EDF datasets, the files must meet the following requirements. See also the Electronic Supplements for more information.

EDF files must be readable by the read.edf() function (edf R package).

We assume that EDF file names consist of the phrase eeg followed by consecutive subject numbers, e.g. eeg1.edf, eeg2.edf, etc. Otherwise, some minor changes are required in the generate_samples() function.

The EEG channel names are hard-coded in the generate_montage() function. Depending on the current structure of your raw EDF files, this function must be appropriately adapted to that structure.
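
To illustrate the annotation format described above, such a CSV file can be loaded in Python as sketched below; the file name annotations_A.csv and the assumption that column labels identify subjects are ours (please check the files in the annotations directory for the exact layout).

```python
import pandas as pd

# Each column = one subject, each row = one second of recording; 1 = seizure, 0 = non-seizure.
ann = pd.read_csv("annotations_A.csv")        # file name is illustrative

col = ann.columns[0]                          # first subject in the file
mask = ann[col].fillna(0).astype(int).values  # per-second seizure indicator for that subject
print(mask.sum(), "seconds annotated as seizure for subject", col)
```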

In the second case one must be aware that our CNN network has been trained on a certain (quite specific) dataset and is ready to recognize a certain type of seizures (i.e. neonatal ones). Therefore, it should not be expected that when we provide completely different data to our pre-trained CNN network (e.g. based on elderly patients), the network will correctly classify the data. Some technical details of the EEG recordings must also be considered carefully. In our case, signals from 18 EEG channels connected according to the 'double banana' montage were fed to the CNN network. When the new data is not analogous, the classification results can be very questionable. Nevertheless, when the new data is compatible (in the sense stated above), there are no major contraindications to feeding it to our pre-trained CNN network. In the Electronic Supplements one can find some examples.

The Python code is quite universal and the only requirement is to set a few variables in the Global variables block in the included Jupyter notebook. We also assume that the input data filenames (given as HDF5 files) are in the format expert_XXX_YYYsec_ZZZchunk_VVVHz.hdf5, where: XXX - any string indicating, for example, the human expert who annotated the seizures, YYY - window size in seconds, ZZZ - number of contiguous chunks, VVV - base frequency of the data in the HDF5 file. Data stored in HDF5 files must conform to the format shown in Fig. 9.

All data generated and analysed during this study, as well as R and Python source codes, are included in Supplementary Information files.

World Health Organization. Epilepsy (2021). https://www.who.int/en/news-room/fact-sheets/detail/epilepsy, accessed 20-July-2021.

Jette, N., Reid, A. Y. & Wiebe, S. Surgical management of epilepsy. CMAJ 186, 997–1004. https://doi.org/10.1503/cmaj.121291 (2014).

Echauz, J. et al. Monitoring, signal analysis, and control of epileptic seizures: A paradigm in brain research. In 2007 Mediterranean Conference on Control Automation, 1–6, https://doi.org/10.1109/MED.2007.4433785 (2007).

Stevenson, N. et al. Interobserver agreement for neonatal seizure detection using multichannel EEG. Ann. Clin. Transl. Neurol. 2, 1002–1011. https://doi.org/10.1002/acn3.249 (2015).

Stevenson, N. J., Tapani, K., Lauronen, L. & Vanhatalo, S. A dataset of neonatal EEG recordings with seizure annotations. Sci. Data, https://doi.org/10.1038/sdata.2019.39 (2019).

Tapani, K., Vanhatalo, S. & Stevenson, N. J. Time-varying EEG correlations improve automated neonatal seizure detection. Int. J. Neural Syst. 29, 1850030, https://doi.org/10.1142/S0129065718500302 (2019).

O’Shea, A., Lightbody, G., Boylan, G. & Temko, A. Neonatal seizure detection from raw multi-channel EEG using a fully convolutional architecture. Neural Netw. 123, 12–25. https://doi.org/10.1016/j.neunet.2019.11.023 (2020).

Isaev, D. Y. et al. Attention-based network for weak labels in neonatal seizure detection. In Doshi-Velez, F. et al. (eds.) Proceedings of the 5th Machine Learning for Healthcare Conference, vol. 126 of Proceedings of Machine Learning Research, 479–507 (2020).

Saab, K., Dunnmon, J., Ré, C., Rubin, D. & Lee-Messer, C. Weak supervision as an efficient approach for automated seizure detection in electroencephalography. NPJ Digit. Med., https://doi.org/10.1038/s41746-020-0264-0 (2020).

Kong, W., Jiang, B., Fan, Q., Zhu, L. & Wei, X. Personal identification based on brain networks of EEG signals. Int. J. Appl. Math. Comput. Sci. 28, 745–757. https://doi.org/10.2478/amcs-2018-0057 (2018).

Ciecierski, K. Mathematical methods of signal analysis applied in medical diagnostic. Int. J. Appl. Math. Comput. Sci. 30, 449–462, https://doi.org/10.34768/amcs-2020-0033 (2020).

Shoeibi, A. et al. Epileptic seizures detection using deep learning techniques: A review. Int. J. Environ. Res. Public Health 18, 1–33. https://doi.org/10.3390/ijerph18115780 (2021).

Craik, A., He, Y. & Contreras-Vidal, J. L. Deep learning for electroencephalogram (EEG) classification tasks: a review. J. Neural Eng. 16, 031001. https://doi.org/10.1088/1741-2552/ab0ab5 (2019).

Roy, Y. et al. Deep learning-based electroencephalography analysis: A systematic review. J. Neural Eng. 16, 051001. https://doi.org/10.1088/1741-2552/ab260c (2019).

Chollet, F. Deep Learning with Python (Manning, 2017).

Pandarinathan, G., Mishra, S., Nedumaran, A. M., Padmanabhan, P. & Gulyás, B. The potential of cognitive neuroimaging: A way forward to the mind-machine interface. J. Imaging, https://doi.org/10.3390/jimaging4050070 (2018).

R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2020). https://www.R-project.org.

Abadi, M. et al. TensorFlow: Large-scale machine learning on heterogeneous systems (2015). Software available from https://www.tensorflow.org.

Kluyver, T. et al. Jupyter notebooks - a publishing format for reproducible computational workflows. In Loizides, F. & Schmidt, B. (eds.) Positioning and Power in Academic Publishing: Players, Agents and Agendas, 87–90 (IOS Press, 2016).

Pressler, R. et al. The ILAE classification of seizures and the epilepsies: Modification for seizures in the neonate. Position paper by the ILAE Task Force on Neonatal Seizures. Epilepsia 62, 615–628, https://doi.org/10.1111/epi.16815 (2021).

Panayiotopoulos, C. P. The Epilepsies: Seizures, Syndromes and Management, chap. 5, Neonatal Seizures and Neonatal Syndromes (Oxfordshire (UK): Bladon Medical Publishing, 2005). https://www.ncbi.nlm.nih.gov/books/NBK2599/.

Chollet, F. et al. Keras. https://keras.io (2015).

Bisong, E. Google Colaboratory, 59–64 (Apress, 2019). https://doi.org/10.1007/978-1-4842-4470-8_7, Colab system available at https://research.google.com/colaboratory/.

Siddiqui, M., Huang, X., Morales-Menendez, R., Hussain, N. & Khatoon, K. Machine learning based novel cost-sensitive seizure detection classifier for imbalanced EEG data sets. Int. J. Interact. Design Manuf. (IJIDeM) 14, 1491–1509. https://doi.org/10.1007/s12008-020-00715-3 (2020).

Sun, C. et al. Epileptic seizure detection with EEG textural features and imbalanced classification based on EasyEnsemble learning. Int. J. Neural Syst. 29, 1950021. https://doi.org/10.1142/S0129065719500217 (2019).

Siddiqui, M., Islam, M. & Kabir, M. A novel quick seizure detection and localization through brain data mining on ECoG dataset. Neural Comput. Appl. 31, 5595–5608. https://doi.org/10.1007/s00521-018-3381-9 (2019).

Siddiqui, M., Morales-Menendez, R., Huang, X. & Hussain, N. A review of epileptic seizure detection using machine learning classifiers. Brain Informatics 7, Article number 5. https://doi.org/10.1186/s40708-020-00105-1 (2020).

Bioconductor Team. Bioconductor: Open source software for bioinformatics. https://www.bioconductor.org/.

Sanei, S. & Chambers, J. EEG Signal Processing (Wiley, 2007).

Wikipedia. 10-20 system (EEG)—Wikipedia, The Free Encyclopedia (2021). https://en.wikipedia.org/wiki/10-20_system_(EEG), accessed 20-July-2021.

Institute of Control and Computation Engineering, University of Zielona Góra, Zielona Góra, Poland

Computer Center, University of Zielona Góra, Zielona Góra, Poland

A.G. proposed the concept of the paper, carried out the initial processing of the data and developed all Python code. He was also responsible for drafting the first version of the article. J.G. developed all R code. He was also responsible for critical revision of the article and approved the final version for submission for publication. All authors participated in planning the numerical experiments in the Keras package, and read and approved the final manuscript.

The authors declare no competing interests.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Gramacki, A., Gramacki, J. A deep learning framework for epileptic seizure detection based on neonatal EEG signals. Sci Rep 12, 13010 (2022). https://doi.org/10.1038/s41598-022-15830-2
