Syllabus for Analyzing Linguistic Data: LIN392

Course Information

Syllabus and Text

This page serves as the syllabus for this course.

Official course textbooks:

We will also make use of other readings, which will be made available on the course website.

Exams and Assignments

Assignments will be updated on the Assignments page. A tentative schedule for the entire semester is posted on the Schedule page. Readings and exercises may change up to a week in advance of their due dates. The course has an end-of-term project, for which students are expected to choose a dataset to analyze.

Philosophy and Goal

Many research topics in linguistics require or can benefit from sophisticated statistical analysis of language datasets. This course will introduce fundamental concepts that will enable students to formulate quantitatively oriented research questions and answer them with appropriate visualization, modeling, and testing. Students are expected to learn these techniques, apply them to datasets provided in languageR and by us, and generalize them to a dataset of their own choice.

We use the R programming language, which allows much more flexible and customizable exploration and analysis than statistical packages based on point-and-click interfaces (like SPSS). It also forms a strong basis for using more complex modeling techniques than are covered in this course, including writing one's own code to do so.

Content Overview

This course provides a hands-on introduction to statistics for language, using the R programming language. Using data from existing linguistic studies, we will study the following topics:

  • data exploration through visualization
  • probability distributions
  • mean and standard deviation of a single dataset
  • comparing pairs of datasets and hypotheses: testing for statistical significance
  • regression modeling
  • clustering for data exploration

Course Requirements

  • Homeworks (12% each, 60% total): 5 homeworks will be assigned during the course
  • Project proposal (5%)
  • Project progress report (5%)
  • Project presentation (5%)
  • Project final report (25%)

