Unsupervised learning for segmentation of neural timeseries

Author: Ali Zaidi

Date: 09.01.22

Summary

Here I use an example iEEG recording to demonstrate how to use SVDs for timeseries segmentation on multiple timescales, and how to leverage the obtained segments to discover different kinds of neural activity and improve classification accuracy for events such as epileptic spikes. Using a seizure is convenient because most people have some idea of what a seizure is, both clinically and neurophysiologically. This enables a better understanding of the analysis, while also demonstrating its clinical application.

The Data

The data consists of an iEEG recording with a single seizure flagged by an expert. The seizure begins 3 minutes (180 s) after the start of the recording and ends 3 minutes before the end of the recording.

The Analysis

This tutorial demonstrates the merits of a data-driven approach to label discovery in unlabeled or noisily labeled datasets. I use a combination of an STFT and an SVD-GMM to discover multi-scale patterns in the data. This enables the segmentation of the timeseries into events and states, which occur on the scale of milliseconds and seconds respectively. The states can then be used to discover patterns on the scale of minutes, which can be used to re-discover the seizure period without explicitly being provided any labels.

The Results

Using a purely data-driven approach, we discover a two-minute seizure in the eight-minute recording, without explicitly being told what a seizure is, obtaining >95% overlap with the seizure period flagged by an expert. We also obtain events at multiple timescales. For unsupervised learning, this enables segmentation of the timeseries into a small number of events and states. For clustering, this enables much better identification of specific events, since it separates the target neural activity from baseline activity, with which it is often interspersed. This enables faster and more accurate classification.

Caveats

The analysis has heuristic elements, mostly in the choice of windowing parameters. A complete and rigorous statistical analysis of the effect of all parameters on the results is (sadly) beyond the scope of this tutorial.

This notebook is ideally meant to be read from start to finish; however, it has been divided into parts for easier reference and navigation.

The data is from the ETH iEEG Seizure dataset. I'm using the datafile ID1_1.mat here.

TL;DR: Jump to the conclusions at the very end.


Chapters:

  1. Preprocessing
  1. STFT decomposition
  1. SVD of STFT
  1. GMM based clustering
  1. Clustering PSDs into events
  1. Clustering events into states
  1. Clustering: robustness and speed
  1. Classification: accuracy and speed
  1. Conclusions

Preprocessing and visualization of data

STFT based analysis begins here

We're going to decompose the timeseries using STFT and attempt to cluster the resulting PSDs to obtain events.
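As a minimal sketch of this step, the STFT can be computed with `scipy.signal.stft`; one power spectral density (PSD) is obtained per time bin. The sampling rate, window length, and the synthetic signal below are illustrative assumptions, not the actual parameters of the recording:

```python
import numpy as np
from scipy.signal import stft

fs = 512                      # assumed sampling rate (Hz)
t = np.arange(0, 10, 1 / fs)  # 10 s synthetic signal
x = np.sin(2 * np.pi * 8 * t) + 0.5 * np.random.randn(t.size)

# Short-time Fourier transform: 0.5 s windows, 50% overlap
f, times, Z = stft(x, fs=fs, nperseg=256, noverlap=128)

# Power spectral density per time bin: one PSD column per window
psd = np.abs(Z) ** 2          # shape: (n_freqs, n_windows)
print(psd.shape)
```

Each column of `psd` is then a candidate "event" to be clustered.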

SVD of STFT

GMM fit to the data

We're going to cluster the 'ncomp' components of V'. Using the SVD allows us to specify a diagonal covariance matrix, since the components are mutually orthogonal and hence uncorrelated. This enables much faster convergence and more robust clustering compared to a full covariance matrix.
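A minimal sketch of this idea, with a random stand-in for the PSD matrix and an illustrative choice of four components (`ncomp` and the number of mixture components here are assumptions, not the notebook's actual values):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stand-in for the (n_windows, n_freqs) PSD matrix from the STFT
psd = np.abs(rng.standard_normal((200, 129))) ** 2

# SVD of the centered PSDs; project each window onto the top components
U, S, Vt = np.linalg.svd(psd - psd.mean(axis=0), full_matrices=False)
ncomp = 4
scores = U[:, :ncomp] * S[:ncomp]      # low-dimensional representation

# Diagonal covariance is justified because the SVD scores are uncorrelated
gmm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0)
labels = gmm.fit_predict(scores)       # one event label per STFT window
print(labels.shape)
```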

As we can see, there are two groups of events: events 0 and 1 form one pair, events 2 and 3 form another, and the two pairs are quite distinct from one another.

Obtain the mean PSD of each event

Visualizing the time of various events

Obtaining a transition matrix for the states
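One simple way to build such a transition matrix is to count transitions between consecutive labels and row-normalize the counts (the short label sequence below is purely illustrative):

```python
import numpy as np

# Illustrative event-label sequence (one label per STFT window)
events = np.array([0, 0, 1, 1, 0, 2, 2, 2, 3, 3, 0, 0])
n = events.max() + 1

# Count transitions between consecutive labels, then row-normalize
counts = np.zeros((n, n))
for a, b in zip(events[:-1], events[1:]):
    counts[a, b] += 1
trans = counts / counts.sum(axis=1, keepdims=True)
print(trans.round(2))
```

Row `i` of `trans` is the empirical probability of moving from event `i` to each other event.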

Rough estimate of states

We can also estimate states from these events, where a state is defined as a timewindow with a specific distribution over all the events.
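A sketch of this definition: slide a window over the event labels, compute the event distribution (histogram) in each window, and cluster those distributions. The window length, hop, and the synthetic event sequence are assumptions for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Synthetic event sequence: mostly events 0/1, with a burst of events 2/3
events = np.concatenate([rng.choice([0, 1], 300, p=[0.9, 0.1]),
                         rng.choice([2, 3], 100),
                         rng.choice([0, 1], 300, p=[0.9, 0.1])])

# Event histogram per sliding window -> one distribution per window
win = 50
hists = np.array([np.bincount(events[i:i + win], minlength=4) / win
                  for i in range(0, events.size - win, win // 2)])

# Cluster the distributions: each cluster is a "state"
states = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(hists)
print(states)
```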

Non-background activity

In our analysis, we find that state 0 is by far the most common. However, states 1 and 2 occur during a time window of roughly 100 s. So if we separate periods of background (state 0) from non-background activity, we obtain something close to the original seizure period. Let's see if we can obtain the start and end times of the non-background period from the states...
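A minimal sketch of extracting those boundaries, using a toy state sequence and an assumed window hop of 1.5 s (both are illustrative stand-ins for the notebook's actual values):

```python
import numpy as np

# One state label per window; assumed window hop of 1.5 s
hop_s = 1.5
states = np.array([0] * 121 + [1] * 40 + [2] * 29 + [0] * 80)

# Non-background = any window not in the dominant background state 0
nonbg = np.flatnonzero(states != 0)
start_s = nonbg[0] * hop_s
stop_s = (nonbg[-1] + 1) * hop_s
print(start_s, stop_s)  # -> 181.5 285.0
```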

This is pretty close to the original labels provided along with the data.

|           | Start (s) | Stop (s) |
|-----------|-----------|----------|
| Expert    | 180       | 305      |
| Algorithm | 181.5     | 285      |

We discovered a seizure in the data!

95% overlap with original labels

Using a data-driven approach, we obtain a 95% overlap with the original seizure period that was marked by an expert.

In the data-driven approach, the seizure start time is almost identical, but the stop time is 30 s early. Generally, it is harder to estimate the end times of neural events than their start times.

What's the benefit of this approach?

There are two main advantages: faster computation, and more robust clustering and classification.

Assessing improvement in time complexity for clustering

We'll compare clustering on the SVD components (which enables diagonal covariance matrices) versus clustering on the entire PSD obtained per bin of the STFT.
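A minimal timing sketch of that comparison, with a random stand-in for the PSD matrix (the matrix size and number of mixture components are illustrative assumptions):

```python
import time
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
psd = np.abs(rng.standard_normal((500, 129))) ** 2   # stand-in PSD matrix

# Low-dimensional SVD scores, fit with a diagonal-covariance GMM
U, S, Vt = np.linalg.svd(psd - psd.mean(axis=0), full_matrices=False)
scores = U[:, :4] * S[:4]

t0 = time.perf_counter()
GaussianMixture(4, covariance_type="diag", random_state=0).fit(scores)
t_svd = time.perf_counter() - t0

# Full 129-dimensional PSD, fit with a full covariance matrix
t0 = time.perf_counter()
GaussianMixture(4, covariance_type="full", random_state=0).fit(psd)
t_full = time.perf_counter() - t0

print(f"SVD+diag: {t_svd:.3f}s  full PSD: {t_full:.3f}s")
```

Exact timings depend on the machine and data, but fitting a diagonal-covariance GMM in 4 dimensions is typically far cheaper than a full-covariance GMM in 129 dimensions.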

Starting with the SVD and seeing how robust it is.

Compare that to the full PSD of the STFT and see how robust it is to prior specification of n_components

Classification of full STFT array vs the components from the SVD decomposition
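A sketch of one way to run this comparison, using logistic regression and cross-validated balanced accuracy on a synthetic stand-in dataset (the classifier, feature sizes, and the injected band-power difference between classes are all assumptions for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Synthetic stand-in: "seizure" windows get extra power in one frequency band
psd = np.abs(rng.standard_normal((400, 129))) ** 2
y = np.zeros(400, dtype=int)
y[150:250] = 1
psd[y == 1, 20:40] += 3.0

# Compact features: scores on the top SVD components
U, S, Vt = np.linalg.svd(psd - psd.mean(axis=0), full_matrices=False)
scores = U[:, :4] * S[:4]

clf = LogisticRegression(max_iter=1000)
acc_full = cross_val_score(clf, psd, y, cv=5,
                           scoring="balanced_accuracy").mean()
acc_svd = cross_val_score(clf, scores, y, cv=5,
                          scoring="balanced_accuracy").mean()
print(f"full STFT: {acc_full:.3f}  SVD scores: {acc_svd:.3f}")
```

The SVD features are ~30x smaller per sample, so training and prediction are correspondingly cheaper.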

Conclusions

To enable a better understanding of the neural timeseries, we attempted to identify different events at different timescales.

  1. Decomposing the STFT data using SVD enables much more robust event detection, and is markedly faster than using the PSD itself (takes only 10% of the time)!

  2. Using the segmented timeseries to obtain labels for the target is far more effective than using a larger temporal window. This is because the labels provided are composed of a mixture of multiple events, i.e. both seizure and non-seizure activity.

We re-discovered the seizure period in the data, which overlaps with the expert-provided labels by over 95%.

Data-driven labels enable a balanced accuracy score of ~99.7%, and a reduction of ~97% in the misclassification rate between seizure and non-seizure activity!

Bottom line:

For unsupervised segmentation: reduction in clustering time by ~86%; robust to model priors.

For classification: reduction in misclassification rate by ~97%