Aki-Juhani Kyröläinen (McMaster)

Free-form texts offer a unique perspective on how language use reflects aspects of cognition. At the same time, their quantitative analysis presents multiple challenges for psycholinguistic research. In this course, participants will familiarize themselves with tools for text preprocessing, including tokenization, lemmatization, and syntactic parsing based on Universal Dependencies. These preprocessing steps lay the groundwork for further quantitative analysis. We will then carry out quantitative semantic analysis of free-form texts with structural topic modelling. Finally, we will use topic modelling to generate features from the texts and combine these features with machine learning. This allows us to model individual-level variables commonly employed in psycholinguistic research, such as level of education, perceived loneliness, and the frequency of memory failures.
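
As a rough illustration of the first preprocessing step, the sketch below tokenizes two short texts and builds a document-term count matrix, which is the kind of input a topic model typically starts from. It is a minimal sketch using only the Python standard library, not the tooling used in the course: the example texts and the names `tokenize`, `count_vector`, and `dtm` are hypothetical, and lemmatization and Universal Dependencies parsing are left out because they require a trained language model.

```python
import re
from collections import Counter

# Hypothetical toy responses for illustration; not data from the course.
texts = [
    "I often forget where I left my keys.",
    "My keys were on the table, where I left them.",
]


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


# One token list per document.
tokenized = [tokenize(t) for t in texts]

# Sorted vocabulary shared by all documents.
vocabulary = sorted({token for doc in tokenized for token in doc})


def count_vector(tokens):
    """Return term counts for one document, ordered by the vocabulary."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


# Document-term matrix: one row per document, one column per vocabulary term.
dtm = [count_vector(doc) for doc in tokenized]
```

Each row of `dtm` describes one text by its word counts. In a fuller pipeline, the tokens would first be lemmatized, and the topic proportions estimated from such a matrix would serve as the features passed on to a machine learning model.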