
Poster A110 in Poster Session A - Tuesday, August 6, 2024, 4:15 – 6:15 pm, Johnson Ice Rink

Brain-Inspired Embedding Model: Scaling and Perceptual Fine-tuning

Stephen Chong Zhao1, Jason Lee2, Andrew Bender3, Trisha Mazumdar2, Adaline Leong2, Prince Owusu Nkrumah2, Mark Wallace4, David A. Tovar4; 1Data Science Institute, Vanderbilt University, 2Computer Science Department, Vanderbilt University, 3Neurosciences Graduate Program, University of California San Diego, 4Psychology Department, Vanderbilt University

Human perception is complex and multifaceted, making it challenging to quantify the subtle nuances and variations in how individuals perceive and categorize objects. To address this, we propose a novel brain-inspired mental embedding model, CLIP-HBA (Human Behavioral/Brain Analysis), which leverages the multimodal capabilities of the CLIP (Contrastive Language-Image Pretraining) architecture to create generalizable embeddings from human behavioral outputs and neural data. By fine-tuning the CLIP model with a 66-dimensional behavioral embedding derived from the SPoSE (Sparse Positive Similarity Embedding) model and the THINGS dataset, CLIP-HBA achieves improved behavioral and brain alignment compared to the original CLIP-ViT (Vision Transformer) model. The model's generalizability is validated on external magnetoencephalography (MEG) datasets, where it consistently outperforms CLIP-ViT in brain alignment. This work opens new avenues for creating personalizable embeddings tailored to diverse populations.
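The abstract describes fine-tuning CLIP against a 66-dimensional SPoSE behavioral embedding. The sketch below illustrates one plausible way such a setup could look in PyTorch with the Hugging Face CLIP implementation: a small readout head maps CLIP image features to 66 dimensions and is trained against SPoSE targets. The class name CLIPBehavioralHead, the MSE objective, the learning rate, and the choice of ViT-B/32 are illustrative assumptions, not the authors' reported training configuration.

```python
# Illustrative sketch (assumptions noted above), not the paper's actual pipeline.
import torch
import torch.nn as nn
from transformers import CLIPModel


class CLIPBehavioralHead(nn.Module):
    """CLIP vision backbone plus a linear readout to a 66-dim SPoSE-style embedding."""

    def __init__(self, clip_name="openai/clip-vit-base-patch32", spose_dim=66):
        super().__init__()
        self.clip = CLIPModel.from_pretrained(clip_name)
        feat_dim = self.clip.config.projection_dim  # 512 for ViT-B/32
        self.to_spose = nn.Linear(feat_dim, spose_dim)

    def forward(self, pixel_values):
        feats = self.clip.get_image_features(pixel_values=pixel_values)
        feats = feats / feats.norm(dim=-1, keepdim=True)  # unit-normalize image features
        return self.to_spose(feats)


model = CLIPBehavioralHead()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
criterion = nn.MSELoss()  # assumed objective: regress onto SPoSE embeddings


def train_step(images, targets):
    """images: (batch, 3, 224, 224) preprocessed pixels; targets: (batch, 66) SPoSE embeddings."""
    optimizer.zero_grad()
    loss = criterion(model(images), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```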

Keywords: Object Perception, Embeddings, Transformers
