
Poster B84 in Poster Session B - Thursday, August 8, 2024, 1:30 – 3:30 pm, Johnson Ice Rink

Walk a mile in my shoes! 3D visual perspective taking in humans and machines.

Peisen Zhou1,2, Drew Linsley1,2, Alekh Ashok1,2, Gaurav Gaonkar1,2, Akash Nagaraj1,2, Francis Lewis1,2, Thomas Serre1,2; 1Carney Institute for Brain Science, Brown University, 2Department of Cognitive, Linguistic & Psychological Sciences, Brown University

Visual perspective taking (VPT), the ability to analyze a scene from a different viewpoint, is an essential feature of human intelligence. We systematically evaluated whether a large zoo of over 300 deep neural networks (DNNs) can solve this task as humans do. While DNNs rival human performance on 3D tasks like depth perception, they are significantly worse than humans at VPT. Our findings indicate that, despite the remarkable progress DNNs have made in recent years toward rivaling or exceeding human performance on many visual tasks, significant advances are still needed before they can perceive and function like humans in complex 3D environments.

Keywords: Visual Perspective Taking; Computer Vision; 3D Perception
