14.8.2018 at 09:00 - 18.10.2018 at 23:59


Here is the course’s teaching schedule. Check the course description for any additional schedules.

Thu 6.9.2018, 10:15 - 11:45
Mon 10.9.2018, 14:15 - 15:45
Thu 13.9.2018, 10:15 - 11:45
Mon 17.9.2018, 14:15 - 15:45
Thu 20.9.2018, 10:15 - 11:45
Mon 24.9.2018, 14:15 - 15:45
Thu 27.9.2018, 10:15 - 11:45
Mon 1.10.2018, 14:15 - 15:45
Thu 4.10.2018, 10:15 - 11:45
Mon 8.10.2018, 14:15 - 15:45
Thu 11.10.2018, 10:15 - 11:45
Mon 15.10.2018, 14:15 - 15:45
Thu 18.10.2018, 10:15 - 11:45


The course belongs to the MA Programme Linguistic Diversity in the Digital Age.

  • study track: language technology
  • modules: Studies in Language Technology (LDA-T3100), Essentials in Language Technology (LDA-TA500), Comprehensive specialization in Language Technology (LDA-TB500)

This is an optional course.

The course is available to students from other study tracks and degree programmes.

  • Programming for linguists or equivalent (BA level)
  • Mathematics for linguists or equivalent (BA level)
  • Machine learning for linguists or equivalent (BA level)
  • Linguistics in the digital age
  • Computational syntax
  • Computational semantics
  • Computational morphology

After successfully completing the course, students will be able to

  • explain models and algorithms used in selected NLP applications
  • describe properties of local prediction models and structural prediction models and methods that can be used to train them
  • explain the differences between generative and discriminative models and between supervised and unsupervised learning
  • describe the main components of a selected NLP application, for example a part-of-speech tagger
  • train and evaluate practical NLP models in a sound and scientific manner.

Students are advised to take this course in year 2 (semester 3). The course is offered during the autumn term in period I.

Models and algorithms used in common NLP applications:

  • Local prediction models and different training algorithms (Naive Bayes, perceptron, log-linear models)
  • Structured prediction models and different training algorithms (HMM, structured perceptron, CRF)
  • Dynamic programming algorithms for alignment and decoding
  • Semi-supervised and unsupervised learning (EM algorithm).

Weekly lectures

Weekly assignments consisting of theoretical questions and programming exercises

The literature depends on the selected application; for example, Philipp Koehn: "Statistical Machine Translation" (Cambridge University Press) in the case of machine translation.

Other recommended literature: Manning and Schütze: "Foundations of Statistical Natural Language Processing" (MIT Press).

Additional web material and literature are distributed during the course.

  • Lectures and tutorials
  • Interactive sessions, for example flipped classroom activities
  • Problem-based collaborative project work
  • Seminars with peer review
  • Activities documented in Moodle

  • Contact teaching (lectures, tutorials, seminars)
  • Self-study and group work


  • one or more of the following: flipped classroom activities, overview paper, written exam (part I)
  • project report and presentation with peer review (part II)