
Poster A137 in Poster Session A - Tuesday, August 6, 2024, 4:15 – 6:15 pm, Johnson Ice Rink

Euclidean coordinates are the wrong prior for models of primate vision

Garrison Cottrell1, Shubham Kulkarni1, Martha Gahl1; 1UCSD

The mapping from the visual field to V1 can be approximated by a log-polar transform. In this domain, a change in scale becomes a left-right shift and a rotation becomes an up-down shift. When the transformed image is fed into a standard shift-invariant convolutional neural network (CNN), this provides scale and rotation invariance. However, translation invariance is lost; this is compensated for by making multiple fixations on an object. Because cones are densely concentrated in the fovea and resolution drops off in the periphery, the central 10 degrees of visual angle occupy about half of V1, with the remaining 170 degrees (or so) taking up the other half. This layout provides the basis for the central and peripheral pathways. Simulations with this model closely match human performance in scene classification, and competition between the pathways leads to the peripheral pathway being used for this task. Remarkably, despite its rotation invariance, the model provides a novel explanation for the face inversion effect. We suggest that Euclidean image coordinates are the wrong prior for models of primate vision.
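
As a concrete illustration (not the poster's implementation), the short Python sketch below resamples an image onto a log-polar grid with log-radius on the horizontal axis and angle on the vertical axis, so that rescaling the input shifts the output left-right and rotating it shifts the output up-down. The function name, output size, center, and interpolation settings are illustrative assumptions.

import numpy as np
from scipy.ndimage import map_coordinates

def log_polar(img, out_h=128, out_w=128):
    """Resample a 2-D grayscale image onto a (angle, log-radius) grid.
    Columns index log-radius (left-right); rows index angle (up-down)."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = min(cy, cx)
    # Log-spaced radii along the horizontal axis, angles along the vertical axis.
    radii = np.exp(np.linspace(0.0, np.log(max_r), out_w))
    angles = np.linspace(0.0, 2.0 * np.pi, out_h, endpoint=False)
    # Cartesian sampling coordinates for every (angle, log-radius) cell.
    ys = cy + radii[None, :] * np.sin(angles[:, None])
    xs = cx + radii[None, :] * np.cos(angles[:, None])
    return map_coordinates(img.astype(float), [ys, xs], order=1, mode='constant')

Under this sketch, a rotated copy of an image yields a log-polar map that is approximately a circular shift of the original's map along the vertical (angle) axis, which a shift-invariant CNN handles the same way it handles translation in Euclidean coordinates.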

Keywords: primate vision models, log-polar transform, face inversion effect, scene perception
