Poster C61 in Poster Session C - Friday, August 9, 2024, 11:15 am – 1:15 pm, Johnson Ice Rink

Language use is only sparsely compositional: The Case of English Adjective-Noun Phrases in Humans and LLMs

Aalok SATHE1, Evelina FEDORENKO1, Noga ZASLAVSKY2; 1MIT, 2UC Irvine

Compositionality is a hallmark of human language. However, most research focuses on item-level compositionality, i.e., the extent to which the meanings of phrases are composed from the meanings of their sub-parts, rather than on language-level compositionality, i.e., the degree to which possible meaningful combinations are actually used in practice. Here, we propose a novel way to quantify the degree of language-level compositionality and apply it to the case of English adjective-noun combinations. Using corpus analyses, large language models, and human acceptability ratings, we find that (1) English only sparsely utilizes the compositional potential of adjective-noun combinations; and (2) LLMs struggle to predict human acceptability judgments of rare combinations. Taken together, our findings shed new light on the role of compositionality in language and highlight a challenging area for further improving LLMs.

Keywords: language; compositionality; large language models; information theory
