Corpus-linguistic text processing with R

Stefan Th. Gries (UC Santa Barbara & JLU Giessen)

This course introduces basic corpus-linguistic analysis using the
open-source programming language R. We will cover basic data handling
and programming issues (general as well as corpus-linguistic ones) and
the fundamentals of regular expressions, and then work through a variety
of small applications involving different kinds of raw and annotated
corpus data.

Readings

Gries, Stefan Th. Managing synchronic corpus data with the British National Corpus (BNC). In Andrea L. Berez-Kroeker, Brad McDonnell, Eve Koller, & Lauren Collister (eds.), MIT Open Handbook of Linguistic Data Management. Cambridge, MA: The MIT Press.

Gries, Stefan Th. & John Newman. 2013. Creating and using corpora. In Robert J. Podesva & Devyani Sharma (eds.), Research methods in linguistics, 257-287. Cambridge: Cambridge University Press.