Poster B69 in Poster Session B - Thursday, August 8, 2024, 1:30 – 3:30 pm, Johnson Ice Rink
Naturalistic dataset augmentations lead to more human-like recognition of occluded objects in convolutional neural networks.
David Coggan1, Frank Tong1; 1Vanderbilt University
Convolutional neural networks (CNNs) are currently the strongest overall predictors of human neural and behavioral responses to object stimuli. However, CNNs are typically much more susceptible than humans to image perturbations such as occlusion. Here, we investigated whether augmenting training datasets can produce more occlusion-robust CNNs that better predict human visual behavior. To address this question, we trained separate instances of the same CNN architecture (CORnet-S) to classify the ImageNet 1k dataset under one of four conditions: a) without augmentation, b) with occlusion by artificially generated shapes without texture, c) with occlusion by naturalistic shapes derived from photographs, without texture, and d) with occlusion by naturalistic shapes with their original textures preserved. After training, we used an occluded object stimulus set from a human behavioral study to measure, for each model, classification accuracy and how well the model predicted human responses. Compared to training on the standard dataset, both artificial and natural occlusion-training led to increased accuracy; however, only natural occlusion-training led to greater human-likeness, with separate benefits of naturalistic shape and texture. Overall, these findings indicate that human occlusion robustness may be shaped by the specific forms of occlusion that occur in nature.
Keywords: behavior, CNN, robustness, occlusion
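
The occlusion augmentation described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not say how occluders were generated or composited, so everything here is an assumption made for illustration: the function names `rectangle_mask` and `occlude` are hypothetical, the rectangle is only a stand-in for an "artificially generated shape", and the uniform gray fill is one possible reading of "without texture". Passing `texture=None` loosely corresponds to the untextured conditions (b, c), and passing a texture array to the textured condition (d).

```python
import numpy as np


def rectangle_mask(height, width, coverage, rng):
    """Return a boolean (H, W) mask with one random rectangle set to True.

    Hypothetical stand-in for an artificial occluder shape. `coverage` is the
    approximate fraction of the image area to hide (0 < coverage <= 1).
    """
    scale = np.sqrt(coverage)
    rect_h = max(1, int(round(height * scale)))
    rect_w = max(1, int(round(width * scale)))
    # Generator.integers excludes the upper bound, hence the +1.
    top = rng.integers(0, height - rect_h + 1)
    left = rng.integers(0, width - rect_w + 1)
    mask = np.zeros((height, width), dtype=bool)
    mask[top:top + rect_h, left:left + rect_w] = True
    return mask


def occlude(image, mask, texture=None, fill=0.5):
    """Composite an occluder onto an image.

    image:   float array (H, W, 3), values in [0, 1]
    mask:    bool array (H, W), True where the occluder covers the image
    texture: optional float array (H, W, 3); if None, the occluder is a
             uniform `fill` value (an untextured occluder, by assumption)
    """
    image = np.asarray(image, dtype=np.float32)
    mask3 = np.asarray(mask, dtype=bool)[..., None]  # broadcast over channels
    if texture is None:
        occluder = np.full_like(image, fill)
    else:
        occluder = np.asarray(texture, dtype=np.float32)
    return np.where(mask3, occluder, image)


# Usage: hide roughly a quarter of a dummy image with an untextured occluder.
rng = np.random.default_rng(0)
image = np.ones((8, 8, 3), dtype=np.float32)
mask = rectangle_mask(8, 8, coverage=0.25, rng=rng)
occluded = occlude(image, mask, fill=0.0)
```

In a training pipeline, a function like `occlude` would typically be applied per image before batching, with the mask drawn either from a shape generator or from silhouettes segmented out of photographs; which of these the study used in practice is not specified beyond the condition descriptions above.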