Carrie Demmans Epp, University of Alberta and Gaisha Oralova, University of Pittsburgh

Recent advances in (large) language models, (L)LMs, have opened new avenues for psycholinguistic research, particularly in modeling language processing and prediction. This workshop introduces the concept of surprisal, a key measure from information theory that quantifies processing difficulty based on word predictability: the surprisal of a word is the negative log probability of that word given its preceding context. We will explore how surprisal is computed in (L)LMs, its relevance to human sentence processing, and its empirical correlation with behavioral and neural data such as reading times, eye-tracking measures, and ERPs. During the workshop, participants will gain hands-on experience using Python to extract surprisal values and analyze their implications for psycholinguistic research. We will discuss how surprisal can be applied to the study of linguistic complexity and how it can reveal differences between native and non-native reading. Additionally, we will analyze experimental data with surprisal predictors using linear mixed-effects modeling in R. This workshop is designed for researchers and students in psycholinguistics and cognitive science who are interested in leveraging (L)LMs for psycholinguistic experiments.
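
As a taste of the hands-on portion, below is a minimal sketch of surprisal extraction in Python. It assumes the Hugging Face transformers library and GPT-2 as the language model; the model choice, function name, and example sentence are illustrative, not the workshop's exact materials. Surprisal is computed in the standard way, S(w_i) = -log2 P(w_i | w_1 ... w_{i-1}), in bits.

```python
import math

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# GPT-2 is an assumed, commonly used choice; any causal LM would work similarly.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def surprisal(sentence):
    """Return (token, surprisal-in-bits) pairs for each token after the first."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        logits = model(ids).logits          # shape: (1, seq_len, vocab_size)
    log_probs = torch.log_softmax(logits, dim=-1)
    pairs = []
    for i in range(1, ids.size(1)):
        token_id = ids[0, i]
        # Surprisal of token i given its left context, converted from nats to bits
        s = -log_probs[0, i - 1, token_id].item() / math.log(2)
        pairs.append((tokenizer.decode(token_id), s))
    return pairs

for token, s in surprisal("The cat sat on the mat."):
    print(f"{token!r}\t{s:.2f} bits")
```

Note that model tokenizers split text into subword units, so word-level surprisal is usually obtained by summing the surprisals of a word's subword pieces.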

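For the analysis step, the abstract names linear mixed-effects modeling in R; a minimal sketch using the lme4 package (an assumed but standard choice) might look like the following, where the data file and its column names (rt, surprisal, subject, item) are hypothetical.

```r
library(lme4)

# Hypothetical data: one row per word per participant, with reading time (rt),
# the word's surprisal, and identifiers for participant (subject) and item
df <- read.csv("reading_data.csv")

# Reading time predicted by surprisal, with random intercepts for
# participants and items, a common baseline specification
m <- lmer(rt ~ surprisal + (1 | subject) + (1 | item), data = df)
summary(m)
```

A positive surprisal coefficient here would indicate that less predictable words take longer to read, the pattern surprisal theory predicts.
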
No prior experience with computational modeling is required, though familiarity with basic linguistic concepts and R/Python programming will be beneficial.