EEG Classification and Diffusion

Testing the Efficacy of Generating EEG Signals for Classification

Project Overview

In this project, three other UCLA students and I explored the efficacy of using diffusion to generate EEG signals for training classifiers. We would like to thank the Graz University of Technology, Austria for allowing us to use their EEG dataset for our project. The dataset can be found here.

Project Goals

The main goal of the project was to maximize the classification accuracy of a person's movements based on their EEG signals. We also wanted to explore the efficacy of using diffusion to generate EEG signals for classification, and to compare different variations of CNNs, specifically whether a simple CNN would outperform a CNN with an LSTM or Multi-Head Attention layer.

Project Results

We found that a simple CNN was the best model for classifying EEG signals, achieving a classification test accuracy of 70.4%. We also found that using diffusion to generate EEG signals was not as effective as we had hoped, which we attributed to the small amount of data and the amount of noise in the data.

Hyperparameter Tuning

Here is a table of the different model architectures that we tested and their respective accuracies, both in total over all subjects and per subject:

| Architecture | Test Acc (%) | Sub 1 | Sub 2 | Sub 3 | Sub 4 | Sub 5 | Sub 6 | Sub 7 | Sub 8 | Sub 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| CNN, No Aug | 54.2 | 46.0 | 42.0 | 52.0 | 64.0 | 61.7 | 55.1 | 64.0 | 48.0 | 55.3 |
| CNN, All subs | 69.5 | 62.3 | 55.4 | 72.3 | 71.1 | 80.5 | 63.6 | 74.3 | 62.6 | 76.9 |
| CNN Optimized, All subs | 70.4 | 65.0 | 56.0 | 72.5 | 70.5 | 79.3 | 63.8 | 71.5 | 65.5 | 78.8 |
| CNN+LSTM, All subs | 55.6 | 42.0 | 42.0 | 59.7 | 59.7 | 70.5 | 52.2 | 59.7 | 46.5 | 69.0 |
| CNN+MHA, All subs | 62.3 | 56.0 | 49.1 | 63.4 | 58.9 | 75.7 | 66.5 | 59.4 | 59.4 | 67.8 |
| CNN, Sub 1 | 29.7 | 36.3 | 24.6 | 23.7 | 35.1 | 23.4 | 24.5 | 32.3 | 30.3 | 35.0 |
| CNN+LSTM, Sub 1 | 21.7 | 22.4 | 18.0 | 28.0 | 28.0 | 17.0 | 20.4 | 16.0 | 22.0 | 25.5 |
| CNN+MHA, Sub 1 | 33.0 | 24.3 | 29.1 | 34.6 | 34.0 | 24.3 | 34.1 | 32.9 | 43.1 | 41.6 |

Per-subject columns give the test accuracy (%) on each subject's trials.

In the above table, CNN Optimized is a CNN with optimized kernel sizes and strides. CNN+LSTM is a CNN with an LSTM layer after the CNN. CNN+MHA is a CNN with a Multi-Head Attention layer after the CNN. All subs means that the model was trained on all subjects. Sub 1 means that the model was trained on subject 1. As we can see, the CNN Optimized model performed the best out of all the models that we tested when trained on all subjects. The CNN+MHA model performed the best when trained on subject 1.
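Since the "CNN Optimized" row comes down to tuning kernel sizes and strides, it can help to see how those two hyperparameters shape the temporal output of a Conv1d layer. The sketch below uses the standard output-length formula; the 384-sample window and the specific kernel/stride values are illustrative, not our exact configuration.

```python
# Sketch: how kernel size and stride shape a Conv1d output over an EEG window.
# The window length (384 samples) and kernel/stride values here are
# illustrative assumptions, not the exact configuration we trained.

def conv_out_len(length, kernel, stride, padding=0):
    """Standard Conv1d output-length formula."""
    return (length + 2 * padding - kernel) // stride + 1

window = 384  # samples per trial window
# A wider kernel with stride 1 shrinks the output only slightly:
print(conv_out_len(window, kernel=25, stride=1))  # 360
# A larger stride downsamples much more aggressively:
print(conv_out_len(window, kernel=10, stride=3))  # 125
```

Tuning these values trades temporal resolution against parameter count in the layers that follow, which is where most of the "Optimized" gains came from.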

We also found that having a time window of 384 ms was the best time window for classification, with the lowest accuracies happening at 32 ms and 500 ms.
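Applying the time window amounts to cropping each trial before it reaches the classifier. A minimal sketch, assuming a (trials, channels, samples) array layout; the trial count and channel count below are placeholders, not the dataset's exact dimensions.

```python
import numpy as np

# Sketch: cropping EEG trials to a fixed time window before classification.
# The array shape (trials x channels x samples) and the sizes used here are
# illustrative placeholders, not the dataset's exact dimensions.

rng = np.random.default_rng(0)
trials = rng.standard_normal((288, 22, 1000))  # hypothetical full-length trials

def crop_window(x, n_samples):
    """Keep only the first n_samples of each trial."""
    return x[:, :, :n_samples]

windowed = crop_window(trials, 384)
print(windowed.shape)  # (288, 22, 384)
```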

Diffusion

Here is an example of one of the generated EEG signals with an example of a real EEG signal:

Noise EEG Signal The real EEG signal is the red line and the generated EEG signal is the blue line. For this graph, the generated EEG signal received random noise as input for the diffusion model. As we can see, the generated EEG signal is very noisy and does not resemble the real EEG signal.

Average EEG Signal The real EEG signal is the red line and the generated EEG signal is the blue line. For this graph, the generated EEG signal received the per-channel average of the EEG signals as input for the diffusion model. As we can see, the generated EEG signal is noisy, but has some structure.
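The difference between the two graphs comes down to what the diffusion model is handed as its starting input. The sketch below only constructs those two seeds; the diffusion model and its reverse process are omitted, and all array shapes are placeholders.

```python
import numpy as np

# Sketch of the two starting inputs compared above: pure Gaussian noise
# versus the per-channel average EEG signal. The diffusion model itself is
# omitted; shapes and trial counts are illustrative placeholders.

rng = np.random.default_rng(1)
real = rng.standard_normal((100, 22, 384))  # placeholder real trials

noise_input = rng.standard_normal((22, 384))  # seed 1: random noise
avg_input = real.mean(axis=0)                 # seed 2: per-channel average

# Averaging cancels uncorrelated noise across trials, so the average seed
# has far less variance than the pure-noise seed:
print(avg_input.shape)  # (22, 384)
```

That variance reduction is a plausible reason the average-seeded samples show more structure: the model starts from something already closer to a clean signal.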

Visually, these signals vary in quality. The random-noise signal is very noisy and does not resemble the real EEG signal. The average signal is less noisy and has some structure, but still contains noise. We also validated this finding numerically by training a simple CNN on the generated EEG signals and found that the model was not able to classify the signals well. The best model trained on generated signals achieved a test accuracy of 66.9% with 60% synthetic data and 40% real data.
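Building that 60/40 training set is just concatenation plus a shuffle. A minimal sketch, assuming synthetic and real trials share the same (channels, samples) shape; all arrays below are placeholders.

```python
import numpy as np

# Sketch: assembling a 60% synthetic / 40% real training set, the mix that
# gave our best generated-data result. Arrays are random placeholders for
# the actual synthetic and real EEG trials.

rng = np.random.default_rng(2)
real_x = rng.standard_normal((400, 22, 384))   # 40% real trials
synth_x = rng.standard_normal((600, 22, 384))  # 60% synthetic trials

train_x = np.concatenate([synth_x, real_x])
idx = rng.permutation(len(train_x))  # shuffle so batches mix both sources
train_x = train_x[idx]
print(train_x.shape)  # (1000, 22, 384)
```

In practice the labels would be concatenated and shuffled with the same `idx` so each trial keeps its class.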

Project Code

The code for this project can be found on my GitHub. The code will be refactored and cleaned up in the future.

This post is licensed under CC BY 4.0 by the author.
