
Poster A82 in Poster Session A - Tuesday, August 6, 2024, 4:15 – 6:15 pm, Johnson Ice Rink

Deep neural networks trained on invariant recognition tasks struggle to predict hierarchical invariance of speech representations in auditory cortex

Guoyang Liao1, Dana Boebinger1, Kirill Nourski2, Matthew Howard2, Christopher Garcia2, Thomas Wychowski1, Webster Pilcher1, Jenelle Feather3, Sam Norman-Haignere1; 1University of Rochester Medical Center, 2The University of Iowa, 3Flatiron Institute

The central computational challenge of speech recognition is that instances of the same class (e.g., word) vary enormously in their acoustics. Traditional auditory models cannot explain “invariant” speech recognition and have difficulty predicting human cortical responses to complex natural stimuli such as speech. Deep neural network (DNN) models trained on challenging invariance tasks such as speech recognition have shown promise as neural encoding models, but it remains unclear whether they can explain invariant representations of speech in the human auditory cortex. To answer this question, we measured cortical responses to speech with and without acoustic variation using spatiotemporally precise intracranial recordings from neurosurgical patients. We found that representations of speech become increasingly invariant to acoustic variation in non-primary regions, consistent with hierarchical theories of functional organization. We also found that these DNN models predicted cortical response timecourses to speech better than standard acoustic models, with later network layers better predicting responses in non-primary regions. Yet all of the tested DNN models had difficulty predicting the hierarchical organization of invariance in the auditory cortex. These results suggest that the representational invariances learned by current DNN models may not align with those in the auditory cortex.
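As a rough illustration of the layer-wise encoding analysis described above, the sketch below regresses activations from each DNN layer onto electrode response timecourses and compares layers by cross-validated prediction accuracy. This is a minimal sketch with random placeholder data, not the authors' pipeline: the array shapes, the ridge-regression choice, and the correlation-based scoring metric are all assumptions.

```python
"""Minimal sketch of a layer-wise DNN encoding analysis (assumed workflow):
regress DNN activations from each layer onto intracranial response
timecourses and compare layers by cross-validated prediction accuracy."""
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Hypothetical shapes: activations[layer] is (n_timepoints, n_units) for a
# stimulus set; `neural` is (n_timepoints, n_electrodes) of cortical responses.
n_t, n_elec = 500, 40
activations = {f"layer{i}": rng.standard_normal((n_t, 64 * (i + 1)))
               for i in range(3)}
neural = rng.standard_normal((n_t, n_elec))

def encoding_score(X, Y, n_splits=5):
    """Cross-validated correlation between predicted and measured
    timecourses, averaged over electrodes (one common metric)."""
    scores = []
    for train, test in KFold(n_splits).split(X):
        model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X[train], Y[train])
        pred = model.predict(X[test])
        r = [np.corrcoef(pred[:, e], Y[test][:, e])[0, 1]
             for e in range(Y.shape[1])]
        scores.append(np.nanmean(r))
    return float(np.mean(scores))

for name, X in activations.items():
    print(name, round(encoding_score(X, neural), 3))
```

Under this kind of analysis, the layer-wise result reported above would appear as deeper layers yielding higher cross-validated scores for electrodes in non-primary regions.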

Keywords: deep neural network; speech; invariant coding; intracranial EEG. 
