Poster A87 in Poster Session A - Tuesday, August 6, 2024, 4:15 – 6:15 pm, Johnson Ice Rink
Attention as a Driving Force in Large Language Model Advancements
Xiaoyan Li1; 1Tsinghua University
Large language models have developed rapidly in recent years and exhibit extraordinary problem-solving abilities. However, there is limited research on how attention abilities influence the problem-solving capacity of large language models. This study explores the intersection of cognitive neuroscience and large language models, focusing on the fine-tuning of these models to analyze how human cognitive abilities and disabilities affect their problem-solving function. Two GPT-4-based models were developed through prompt fine-tuning and retrieval-augmented generation. Results showed that the GPT-4-fine-tuned model achieved the highest accuracy (81.2%), while the model lacking attention performed poorly on questions requiring long-term inference. GPT-4's analysis recognized the lack of attention in the modified model, highlighting the importance of this cognitive ability in solving problems that demand long-term inference. This study sheds light on the mechanisms of problem-solving in the brain and the potential of AI to approximate human-like cognition.
Keywords: large language model, cognitive ability, attention, cognition