Poster B121 in Poster Session B - Thursday, August 8, 2024, 1:30 – 3:30 pm, Johnson Ice Rink
Evaluating the impact of multiscale temporal processing on sound-to-event recurrent neural networks
Michele Esposito1, Bruno L. Giordano2, Giancarlo Valente1, Elia Formisano1; 1Maastricht University, 2Université Aix-Marseille
This study investigates the impact of multiscale temporal processing on environmental sound recognition using deep neural networks (DNNs). Inspired by the brain's capability to process auditory information across various time scales, we developed a multiscale DNN architecture that integrates multiple parallel recurrent neural network (RNN) streams. Each stream processes the input spectrogram at a distinct temporal scale. The outputs of these streams are then combined and further processed to achieve sound categorization with a temporal resolution of 50 ms. This design aims to capture the diverse dynamics of natural sounds, ranging from transient, impulsive signals to repetitive and sustained sounds. We compared the performance of this multiscale RNN with that of networks trained on single-scale inputs, and found that the multiscale RNN recognizes sound events better than the single-scale networks. This performance advantage suggests that combining the distinct information carried by multiple temporal scales improves the classification of natural sound events.
Keywords: Multiscale temporal processing; Natural sound recognition; Deep neural networks; Time-resolved event classification
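The multiscale design described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the pooling factors, the plain tanh recurrence, the linear readout, the random (untrained) weights, and all function names are assumptions made for illustration. The sketch also assumes that one spectrogram frame corresponds to the 50 ms output resolution and that the input has at least as many frames as the largest pooling factor.

```python
import numpy as np

def pool_time(spec, factor):
    """Average-pool a (frames, bins) spectrogram along time by `factor`."""
    n = (spec.shape[0] // factor) * factor  # drop trailing frames that do not fill a window
    return spec[:n].reshape(-1, factor, spec.shape[1]).mean(axis=1)

def run_rnn(x, w_in, w_rec):
    """Plain tanh recurrence over time (an assumed cell type); returns all hidden states."""
    h = np.zeros(w_rec.shape[0])
    states = []
    for frame in x:
        h = np.tanh(frame @ w_in + h @ w_rec)
        states.append(h)
    return np.stack(states)

def multiscale_forward(spec, factors, params, w_out):
    """Run one RNN stream per temporal scale, then classify every base-rate frame."""
    n_frames = spec.shape[0]
    streams = []
    for factor, (w_in, w_rec) in zip(factors, params):
        states = run_rnn(pool_time(spec, factor), w_in, w_rec)
        # Repeat each coarse state so the stream returns to the base frame rate.
        up = np.repeat(states, factor, axis=0)[:n_frames]
        # If pooling dropped trailing frames, extend with the last available state.
        if up.shape[0] < n_frames:
            up = np.pad(up, ((0, n_frames - up.shape[0]), (0, 0)), mode="edge")
        streams.append(up)
    combined = np.concatenate(streams, axis=1)    # combine the streams per frame
    logits = combined @ w_out                     # assumed linear readout
    logits -= logits.max(axis=1, keepdims=True)   # numerically stable softmax
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

# Hypothetical sizes: 40 frames (2 s at an assumed 50 ms per frame), 64 frequency bins.
rng = np.random.default_rng(0)
n_bins, hidden, n_classes = 64, 16, 5
factors = (1, 2, 4)  # assumed temporal scales, in frames per pooled step
params = [
    (rng.standard_normal((n_bins, hidden)) * 0.1,
     rng.standard_normal((hidden, hidden)) * 0.1)
    for _ in factors
]
w_out = rng.standard_normal((hidden * len(factors), n_classes)) * 0.1

spec = rng.standard_normal((40, n_bins))
probs = multiscale_forward(spec, factors, params, w_out)
print(probs.shape)
```

Each row of `probs` holds the class probabilities for one frame, so the output keeps the frame-level time resolution while each stream sees the input at a different rate. A single-scale baseline in the spirit of the comparison above would correspond to calling the same function with a single entry in `factors`.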