Poster A87 in Poster Session A - Tuesday, August 6, 2024, 4:15 – 6:15 pm, Johnson Ice Rink

Attention as a Driving Force in Large Language Model Advancements

Xiaoyan Li1; 1Tsinghua University

Large language models have developed rapidly in recent years and have exhibited extraordinary problem-solving abilities. However, there is limited research on how attention abilities influence the problem-solving capacity of large language models. This study explores the intersection of cognitive neuroscience and large language models, fine-tuning these models to analyze how human cognitive abilities and disabilities affect their problem-solving function. Two GPT-4-based models were developed through prompt fine-tuning and retrieval-augmented generation. Results showed that the fine-tuned GPT-4 model achieved the highest accuracy (81.2%), while the model lacking attention performed poorly on questions requiring long-term inference. GPT-4's own analysis identified the lack of attention in the modified model, highlighting the importance of this cognitive ability for solving problems that demand long-term inference. This study sheds light on the mechanisms of problem-solving in the brain and on the potential of AI to approximate human-like cognition.

Keywords: large language model; cognitive ability; attention; cognition