Poster B84 in Poster Session B - Thursday, August 8, 2024, 1:30 – 3:30 pm, Johnson Ice Rink
Walk a mile in my shoes! 3D visual perspective taking in humans and machines.
Peisen Zhou1,2, Drew Linsley1,2, Alekh Ashok1,2, Gaurav Gaonkar1,2, Akash Nagaraj1,2, Francis Lewis1,2, Thomas Serre1,2; 1Carney Institute for Brain Science, Brown University, 2Department of Cognitive Linguistic & Psychological Sciences, Brown University
Visual perspective taking (VPT), the ability to analyze scenes from a different viewpoint, is an essential feature of human intelligence. We systematically evaluated whether a large zoo of over 300 deep neural networks (DNNs) could solve this task as humans can. While DNNs rival human performance on 3D tasks like depth perception, they are significantly worse than humans at VPT. Our findings indicate that despite the remarkable progress DNNs have made in recent years toward rivaling or exceeding human performance on many visual tasks, significant progress is still needed for them to perceive and function like humans in complex 3D environments.
Keywords: Visual Perspective Taking; Computer Vision; 3D Perception