Deep learning for classification of sleep EEG data during the epidemic of Coronavirus Disease

Sleep is an important part of the body's recuperation and energy accumulation, and sleep quality has a significant impact on people's physical and mental state during the epidemic of Coronavirus Disease. How to improve sleep quality and reduce the impact of sleep-related diseases on health has attracted increasing attention. The electroencephalogram (EEG) signals collected during sleep are spontaneous EEG signals. They reflect changes in the body itself and are an important basis for the diagnosis and treatment of related diseases. An effective model for classifying sleep EEG signals is therefore an important auxiliary tool for evaluating sleep.


Introduction
The sleep process is a complex process of dynamic changes. According to R&K, the international standard for the interpretation of sleep stages, the sleep process is divided into distinct states: in addition to the awake period, the sleep cycle consists of two alternating sleep states, namely rapid eye movement (REM) and non-REM.
Non-REM sleep is further divided, following the gradual change from shallow to deep sleep, into sleep I, sleep II, sleep III and sleep IV. Sleep III and sleep IV can be combined into a single deep sleep stage. Figure 1 shows the time series of EEG signals corresponding to the different sleep stages, from top to bottom: wakefulness, sleep I, sleep II, deep sleep and REM. It can be observed from Figure 1 that the characteristics of EEG signals vary between sleep stages during the epidemic of Coronavirus Disease. Automatic staging based on EEG signals can reduce the manual burden on expert physicians, and it is a useful auxiliary tool for assessing sleep quality and for diagnosing and treating sleep-related diseases.
In this paper, Python is used to build a neural network, and a sleep staging prediction model is designed. Outliers in each class of the original data were detected and deleted using the 3-sigma principle and the k-means clustering + Euclidean distance detection method. A Softmax multi-classification BP neural network model was then constructed using the Adam algorithm with adaptive learning rate. With as few training samples as possible, relatively high prediction accuracy and AUC values were obtained.

Overview of BP neural network
Artificial neural networks (ANNs) are widely used in pattern recognition, function approximation, data compression, data classification, data prediction, etc. [1][2][3][4][5][6]. The BP neural network is an important algorithm in ANN. Its basic structure is shown in Figure 2.

Figure 2. Basic structure diagram of BP neural network
Introduction to activation function and algorithm: ReLU function: Its operating principle is shown in Figure 3.

Figure 3. Principle of Softmax function
Step 3: Calculate the output of the output layer. The predicted output is calculated from the hidden layer output, the connection weights and biases, and the Softmax activation function.
Step 4: Calculate the Softmax cross entropy between the predicted output and the real label as the cost function.
Step 5: Back-propagate the error and update the weights and biases with the adaptive learning rate Adam algorithm [7].
Step 6: Determine whether the cost has reached the error range or the number of iterations has been reached. If not, return to Step 2.
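The training procedure described in the steps above can be sketched with NumPy as follows. This is a minimal illustration and not the authors' implementation. Steps 1 and 2 do not appear in the text, so the sketch assumes they are parameter initialization and a ReLU hidden layer. The function names, the single hidden layer, its size, the learning rate, the stopping tolerance and the Adam constants (0.9, 0.999, 1e-8) are all assumptions. Labels are assumed to be integers starting at 0, so the stage codes 2 to 6 would first be shifted by 2.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_bp_adam(x, y, n_hidden=16, n_classes=5, lr=0.01,
                  max_iter=500, tol=1e-3, seed=0):
    """One-hidden-layer BP network trained with Adam (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Assumed Step 1: initialise weights and biases.
    params = {
        "W1": rng.normal(0.0, 0.1, (x.shape[1], n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, n_classes)),
        "b2": np.zeros(n_classes),
    }
    m = {k: np.zeros_like(p) for k, p in params.items()}  # first moments
    v = {k: np.zeros_like(p) for k, p in params.items()}  # second moments
    beta1, beta2, eps = 0.9, 0.999, 1e-8
    onehot = np.eye(n_classes)[y]
    cost = np.inf
    for t in range(1, max_iter + 1):
        # Assumed Step 2: hidden layer output with ReLU.
        h_pre = x @ params["W1"] + params["b1"]
        h = np.maximum(h_pre, 0.0)
        # Step 3: output layer with Softmax.
        p = softmax(h @ params["W2"] + params["b2"])
        # Step 4: Softmax cross entropy as the cost.
        cost = -np.mean(np.sum(onehot * np.log(p + 1e-12), axis=1))
        # Step 6: stop once the cost is inside the error range.
        if cost < tol:
            break
        # Step 5: back propagation.
        dz2 = (p - onehot) / len(x)
        dh = dz2 @ params["W2"].T
        dh[h_pre <= 0] = 0.0  # ReLU gradient
        grads = {"W1": x.T @ dh, "b1": dh.sum(axis=0),
                 "W2": h.T @ dz2, "b2": dz2.sum(axis=0)}
        # Step 5 (continued): Adam update with bias correction.
        for k in params:
            m[k] = beta1 * m[k] + (1 - beta1) * grads[k]
            v[k] = beta2 * v[k] + (1 - beta2) * grads[k] ** 2
            m_hat = m[k] / (1 - beta1 ** t)
            v_hat = v[k] / (1 - beta2 ** t)
            params[k] -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return params, cost

def predict(params, x):
    # Forward pass only; returns the most probable class index per row.
    h = np.maximum(x @ params["W1"] + params["b1"], 0.0)
    return softmax(h @ params["W2"] + params["b2"]).argmax(axis=1)
```

With a feature matrix `x` of shape (samples, 4) and integer labels `y`, a call such as `params, cost = train_bp_adam(x, y)` followed by `predict(params, x)` would return the predicted class indices.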

Data description and preprocessing
The data consist of 3000 sleep EEG samples and their labels, taken from different healthy adults during overnight sleep. The first column is a "known label", which represents the different sleep stages in digital form: wake [6], rapid eye movement [5], sleep I [4], sleep II [3], and deep sleep [2]. The second to fifth columns are characteristic parameters calculated from the original time series, namely "Alpha", "Beta", "Theta" and "Delta", which correspond to the energy proportion of the EEG signal in the frequency ranges "8-13Hz", "14-25Hz", "4-7Hz" and "0.5-4Hz" respectively. The characteristic parameters are expressed as percentages.
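The paper does not state how the energy proportions were computed from the raw time series. As a rough sketch under stated assumptions, they could be obtained from an FFT power spectrum as shown here. The function name, the sampling rate `fs`, the use of a plain FFT, and the choice of the total 0.5-25 Hz energy as the denominator are all assumptions made for illustration. The band edges are inclusive, so a bin at exactly 4 Hz counts toward both Theta and Delta.

```python
import numpy as np

# Band edges in Hz, as listed in the text.
BANDS = {
    "Alpha": (8.0, 13.0),
    "Beta": (14.0, 25.0),
    "Theta": (4.0, 7.0),
    "Delta": (0.5, 4.0),
}

def band_energy_percent(signal, fs):
    """Percentage of 0.5-25 Hz spectral energy in each band (hypothetical helper)."""
    signal = np.asarray(signal, dtype=float)
    # One-sided power spectrum of the mean-removed signal.
    spectrum = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    # Assumed denominator: all energy between 0.5 Hz and 25 Hz.
    total = spectrum[(freqs >= 0.5) & (freqs <= 25.0)].sum()
    return {
        name: 100.0 * spectrum[(freqs >= lo) & (freqs <= hi)].sum() / total
        for name, (lo, hi) in BANDS.items()
    }

# Usage: a pure 10 Hz sine sampled at an assumed 100 Hz falls in the Alpha band.
fs = 100.0
t = np.arange(400) / fs
demo = band_energy_percent(np.sin(2 * np.pi * 10.0 * t), fs)
```

In the usage lines, nearly all of the energy of the 10 Hz test signal is assigned to "Alpha", which is a quick sanity check on the band masks.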
The raw data give the energy proportions of the EEG signals for the five sleep stages: wake [6], REM [5], sleep I [4], sleep II [3] and deep sleep [2]. Raw data generally contain some outliers or missing values, so we drew a boxplot of each index for each of the five groups of data; the results are shown in Figure 4. Figure 4 shows that there are some outliers, which are abnormal points in the original data. This paper first uses the 3-sigma principle [8] to delete the abnormal data from each of the five tables, then merges the five processed tables, and finally applies the K-means clustering + Euclidean distance outlier test [9] to find and remove the remaining outliers, as shown in Figure 5. A total of 2883 samples remain after preprocessing.
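The two-stage cleaning described above can be sketched in NumPy as follows. This is an illustrative sketch, not the authors' code. The paper does not give the number of clusters or the distance cut-off, so `k=5` (one cluster per stage) and the rule "distance to the assigned centroid above the cluster mean plus 3 standard deviations" are assumptions. The small `kmeans` routine and all function names are hypothetical helpers.

```python
import numpy as np

def three_sigma_mask(x):
    """True for rows whose every feature lies within mean +/- 3 std."""
    x = np.asarray(x, dtype=float)
    return np.all(np.abs(x - x.mean(axis=0)) <= 3.0 * x.std(axis=0), axis=1)

def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm with Euclidean distance."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = x[labels == j].mean(axis=0)
    dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    return centers, dist.argmin(axis=1)

def distance_outlier_mask(x, k=5, threshold=3.0):
    """True for rows that are not far from their own centroid (assumed rule)."""
    centers, labels = kmeans(x, k)
    d = np.linalg.norm(x - centers[labels], axis=1)
    keep = np.ones(len(x), dtype=bool)
    for j in range(k):
        idx = labels == j
        if idx.any():
            keep[idx] = d[idx] <= d[idx].mean() + threshold * d[idx].std()
    return keep

def clean(features, stages, k=5):
    """3-sigma filter per stage, then k-means distance filter on the merged data."""
    features = np.asarray(features, dtype=float)
    stages = np.asarray(stages)
    keep = np.zeros(len(features), dtype=bool)
    for s in np.unique(stages):
        idx = np.flatnonzero(stages == s)
        keep[idx] = three_sigma_mask(features[idx])
    features, stages = features[keep], stages[keep]
    keep = distance_outlier_mask(features, k=k)
    return features[keep], stages[keep]
```

Here `clean(features, stages)` would take the four feature columns and the stage labels and return the filtered arrays. The number of samples it removes depends on the assumed thresholds, so it should not be expected to reproduce the 2883 figure exactly.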

Model training and prediction
We divided the data into a training set and a test set in a ratio of 2:8. For comparison, we also trained and tested the traditional decision tree model [10] (DT) and the support vector machine model (SVM), using the accuracy rate and AUC value as evaluation indexes. The results are as follows:

Classifier     Accuracy rate
DT             0.59
SVM            0.63
Adam-BPNNet    0.73

Table 1. Comparison of the accuracy rates of several classifiers

As can be seen from Table 1, the accuracy of Adam-BPNNet is higher than that of the traditional methods. The ROC curve of each classification method is shown in Figure 6. It can be seen from Table 2 that the Adam-BPNNet model still classifies well with a smaller training set. The reported result is the best classification effect obtained over many experiments. In the early stage of the experiment the classification accuracy was low; after repeated tuning of the number of hidden layers and nodes, the best AUC value of this experiment was 0.83.
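The evaluation protocol above (a 2:8 split, accuracy and AUC) can be sketched as follows. The paper does not say how the multi-class AUC was computed. The sketch assumes a macro average of one-vs-rest AUCs, each computed as the probability that a random positive sample scores above a random negative one. The function names and the random seed are hypothetical.

```python
import numpy as np

def split_indices(n, train_fraction=0.2, seed=0):
    """Random 2:8 split: 20% of the indices for training, 80% for testing."""
    order = np.random.default_rng(seed).permutation(n)
    n_train = int(round(train_fraction * n))
    return order[:n_train], order[n_train:]

def accuracy(y_true, y_pred):
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def binary_auc(is_positive, score):
    """AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    is_positive = np.asarray(is_positive, dtype=bool)
    score = np.asarray(score, dtype=float)
    diff = score[is_positive][:, None] - score[~is_positive][None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)

def macro_ovr_auc(y_true, prob):
    """Mean one-vs-rest AUC over classes (assumed multi-class definition).

    y_true holds class indices 0..C-1; prob has shape (samples, C).
    Classes absent from y_true, or covering all of it, are skipped.
    """
    y_true = np.asarray(y_true)
    prob = np.asarray(prob, dtype=float)
    aucs = [
        binary_auc(y_true == c, prob[:, c])
        for c in range(prob.shape[1])
        if 0 < np.sum(y_true == c) < len(y_true)
    ]
    return float(np.mean(aucs))

# Usage: split the 2883 preprocessed samples.
train_idx, test_idx = split_indices(2883)
```

The pairwise AUC is quadratic in the number of samples, which is acceptable at this data size. A rank-based formulation would be the usual choice for much larger sets.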

Conclusion
This study is mainly based on theoretical research and combines theory with practice. A BP neural network trained with the adaptive learning rate Adam algorithm is used for data classification. In addition, Softmax is selected as the activation function of the output layer, which gives the model good self-learning and self-adaptive ability. Most importantly, the network has good generalization ability. When designing a classifier, one should consider whether the network can correctly classify the objects it is meant to classify, and whether it can still correctly classify unseen or noise-polluted patterns after training. The classification AUC value of this study is 0.83, which is reasonable to a certain extent, and the model can be used as an effective auxiliary tool for the evaluation of sleep quality and the diagnosis and treatment of sleep-related diseases.

Conflict of interest
We have no conflicts of interest to disclose, and the manuscript has been read and approved by all named authors.