
Poster A61 in Poster Session A - Tuesday, August 6, 2024, 4:15 – 6:15 pm, Johnson Ice Rink

Subject-Agnostic Transformer-Based Neural Speech Decoding from Surface and Depth Electrode Signals

Xupeng Chen¹, Junbo Chen¹, Ran Wang¹, Chenqian Le¹, Amirhossein Khalilian-Gourtani², Adeen Flinker², Yao Wang¹; ¹Department of Electrical and Computer Engineering, New York University; ²Department of Neurology, New York University

A growing body of research aims to decode human speech from neural signals captured by intracranial electrodes. Most prior work that achieves high decoding quality is limited to electrodes on a 2D grid (i.e., an electrocorticographic, or ECoG, array) and to data from a single patient. Here we design a deep-learning model that accommodates surface (ECoG) and depth (stereotactic EEG, or sEEG) electrodes from multiple participants with large variability in electrode placement. The proposed transformer-based model, SwinTW, can work with arbitrarily positioned electrodes. We train both subject-specific and subject-agnostic models, the latter exploiting data from multiple participants. The subject-specific models using only low-density ECoG achieved high decoding performance, outperforming our previous ResNet model. Incorporating additional strip and depth electrodes led to further improvement. For participants with only sEEG electrodes, subject-specific models still achieved comparable performance. The subject-agnostic models generalized well to unseen participants in a cross-validation study. The proposed SwinTW decoder enables future speech neuroprostheses to use any electrode placement that is clinically optimal or feasible for a particular participant, including depth electrodes, which are more routinely implanted in chronic neurosurgical procedures. Importantly, the generalizability of the multi-patient models suggests the exciting possibility of developing speech neuroprostheses for people with speech disabilities without relying on their own neural data for training.

Keywords: Neural Speech Decoding; Electrocorticographic (ECoG); Stereotactic EEG; Brain-Computer Interface (BCI)
