
Poster C14 in Poster Session C - Friday, August 9, 2024, 11:15 am – 1:15 pm, Johnson Ice Rink

Disentangling Intermediate Representations in Sound-to-Event DNNs using Invertible Flow Models

Tim Dick1, Enrique Hortal Quesada1, Alexia Briassouli2, Elia Formisano1; 1Maastricht University, Netherlands, 2University of Twente, Netherlands

Neural representations derived from fMRI responses to natural sounds in non-primary auditory cortical regions mirror those found in the intermediate layers of deep neural networks (DNNs) trained for sound recognition. However, the underlying characteristics of these representations remain elusive. In this study, we investigate the nature of these intermediate representations by employing a disentangling invertible flow model. We recorded a novel dataset of natural sounds, designed to probe the hypothesis that sound-to-event DNNs encode basic sound-generation mechanisms (human actions) and source properties (object materials) as distinct, independent factors within their intermediate layers. To simulate brain responses to these natural sounds, we used the layer-by-layer activations of a convolutional DNN (Yamnet), pre-trained to categorize sound spectrograms into semantic categories. Crucially, through systematic manipulations of the obtained latent representations with the disentangling invertible flow model, we demonstrate predictable effects on the DNN's output. This in silico demonstration offers a promising avenue for subsequent in vivo neuroscientific experimentation.
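The manipulations described above rest on the invertibility of the flow model: a latent representation can be edited in the disentangled space and mapped back without loss, so any change in the DNN's output is attributable to the edit. A minimal sketch of one affine coupling layer, the building block of many invertible flow models (RealNVP-style); the toy dimensions, weights, and conditioner network here are illustrative assumptions, not the poster's actual architecture:

```python
import numpy as np

# Toy setup: a 6-dimensional "latent" vector split into two halves.
# All weights below are random stand-ins for a small conditioner network.
rng = np.random.default_rng(0)
D = 6                                      # latent dimensionality (assumed)
H = 8                                      # conditioner hidden width (assumed)
W1 = rng.normal(size=(D // 2, H)) * 0.5
Ws = rng.normal(size=(H, D // 2)) * 0.5    # produces log-scales
Wt = rng.normal(size=(H, D // 2)) * 0.5    # produces shifts

def conditioner(x_a):
    # Tiny network mapping one half of the latent to (log-scale, shift).
    h = np.tanh(x_a @ W1)
    return h @ Ws, h @ Wt

def forward(z):
    # First half passes through unchanged; second half is affinely
    # transformed conditioned on the first half.
    z_a, z_b = z[:D // 2], z[D // 2:]
    log_s, t = conditioner(z_a)
    return np.concatenate([z_a, z_b * np.exp(log_s) + t])

def inverse(y):
    # Because y_a == z_a, the same (log_s, t) can be recomputed exactly,
    # making the layer analytically invertible.
    y_a, y_b = y[:D // 2], y[D // 2:]
    log_s, t = conditioner(y_a)
    return np.concatenate([y_a, (y_b - t) * np.exp(-log_s)])

z = rng.normal(size=D)
assert np.allclose(inverse(forward(z)), z)  # exact round trip
```

Stacking such layers (with the halves swapped between layers) yields a fully invertible map, which is what allows latent edits to be propagated back into the DNN's representation space and their downstream effects on the output observed.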

Keywords: Latent Space Disentanglement; Machine Learning; Normalizing Flow; Auditory Processing
