Poster C106 in Poster Session C - Friday, August 9, 2024, 11:15 am – 1:15 pm, Johnson Ice Rink

Human-like feature attention emerges in task-optimized models of the cocktail party problem

Ian Griffith (1), R. Preston Hess (2), Josh H. McDermott (1,2,3,4); (1) Program in Speech and Hearing Bioscience and Technology, Harvard University; (2) Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology; (3) McGovern Institute for Brain Research, Massachusetts Institute of Technology; (4) Center for Brains, Minds and Machines, Massachusetts Institute of Technology

Attention enables communication in settings with multiple talkers by selecting sources of interest based on prior knowledge of their features. Decades of research have left two gaps in our understanding of feature-based attention. First, humans succeed at attentional selection in some conditions but fail in others, for reasons that remain unclear. Second, neurophysiology experiments implicate multiplicative gains in selective attention, but it remains unclear whether such gains are sufficient to account for real-world attention-driven behavior. To address these gaps, we optimized an artificial neural network with stimulus-computable feature-based gains for the task of recognizing a cued talker’s speech, using binaural audio input (a “cocktail party” setting). Despite not being fit to match humans, the model replicated human performance across a wide range of real-world conditions, showing signs of selection based on both the voice’s timbre and its spatial location. The results suggest that human-like attentional strategies emerge as an optimized solution to the cocktail party problem, providing a normative explanation for the limits of human performance in this domain.
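
As a rough illustration of what a multiplicative, stimulus-computable feature gain could look like, the NumPy sketch below derives per-channel gains from a cue signal's features and applies them to a mixture's features. It is not the authors' model: the function names `cue_gains` and `apply_gains`, the exponential mapping from cue activations to gains, and the `slope` parameter are all assumptions made for illustration only.

```python
import numpy as np

def cue_gains(cue_features, slope=1.0):
    """Hypothetical mapping from cue features to multiplicative gains.

    cue_features: array of shape (channels, time) for the cued talker.
    Returns positive gains of shape (channels,) whose product is 1,
    so channels that respond strongly to the cue are amplified and
    the others are attenuated.
    """
    mean_act = cue_features.mean(axis=-1)    # time-averaged activation per channel
    centered = mean_act - mean_act.mean()    # zero-mean across channels
    return np.exp(slope * centered)          # strictly positive gains

def apply_gains(mixture_features, gains):
    """Scale each feature channel of the mixture by its gain."""
    return mixture_features * gains[:, None]  # broadcast gains over time

# Toy example: 3 feature channels; the cue drives channel 1 most strongly.
cue = np.array([[1.0, 1.0],
                [3.0, 3.0],
                [2.0, 2.0]])
mixture = np.ones((3, 4))

gains = cue_gains(cue)
attended = apply_gains(mixture, gains)
```

In this toy setup the gains depend only on the cue, so they can be computed from any cue stimulus without per-condition fitting, which is the sense of "stimulus-computable" assumed here.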

Keywords: attention; computational modeling; deep neural networks; human behavior
