
Poster A114 in Poster Session A - Tuesday, August 6, 2024, 4:15 – 6:15 pm, Johnson Ice Rink

Affectless Visual Machines Explain a Majority of Variance in Human Visual Affect and Aesthetics for Natural Images

Daniel Graham1, Colin Conwell2, Chelsea Boccagno3,4, Edward Vessel5; 1Hobart and William Smith Colleges, Geneva, NY, USA, 2Johns Hopkins University, 3Harvard T.H. Chan School of Public Health, 4Massachusetts General Hospital, 5City College, City University of New York

Looking at the world involves not just seeing things, but feeling things. Feedforward machine vision systems that learn to perceive the world without physiology, thought, or feedback resembling human affective experience offer tools to demystify the relationship between seeing and feeling, and to assess how much of affective experience may be a function of representation learning over natural image statistics. We deploy 180 deep neural networks trained only on canonical computer vision tasks to predict human ratings of arousal, valence, and beauty for images from multiple categories (objects, faces, landscapes, art) across two datasets. We use features of these networks without additional learning, such that we linearly decode human affective responses from network activity just as one decodes information from neural recordings. We find that features of purely perceptual models predict average ratings of arousal, valence, and beauty with high accuracy: on average, models in our survey explain 53% of explainable variance in human responses; the most predictive model explains 72%. These results add to growing evidence for an information-processing account of visually evoked affect linked to representation learning over natural statistics, and hint at a locus of affective and aesthetic valuation proximate to perception.
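The decoding setup described above can be illustrated with a short sketch: extract activations from a frozen, task-trained backbone and fit a cross-validated linear readout to mean human ratings. This is not the paper's pipeline or model survey; ResNet-50 stands in as one illustrative backbone, and the arrays `images` (a preprocessed batch of image tensors) and `ratings` (mean human ratings for those images) are hypothetical placeholders.

```python
# Minimal sketch of linear decoding from a frozen vision model,
# assuming hypothetical arrays `images` (N x 3 x 224 x 224 tensor)
# and `ratings` (N mean human ratings, e.g. of beauty).
import torch
import torchvision.models as models
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

# A task-trained backbone, frozen: no additional affect training.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()  # expose penultimate-layer features
backbone.eval()

with torch.no_grad():
    feats = backbone(images).numpy()  # (N, 2048) feature activations

# Linearly decode ratings from frozen features, as one would decode
# information from neural recordings; cross-validation guards overfitting.
decoder = RidgeCV(alphas=[0.1, 1.0, 10.0, 100.0])
r2_scores = cross_val_score(decoder, feats, ratings, cv=5, scoring="r2")
print(f"cross-validated R^2: {r2_scores.mean():.2f}")
```

In this framing, comparing the cross-validated R^2 against a noise ceiling (the reliability of the human ratings themselves) yields the "explainable variance explained" figures reported in the abstract.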

Keywords: visual aesthetics; machine vision; deep convolutional neural networks; affect
