Description
The course belongs to the MA Programme Linguistic Diversity in the Digital Age.
- study track: language technology
- modules: Studies in Language Technology (LDA-T3100), Essentials in Language Technology (LDA-TA500), Comprehensive specialization in Language Technology (LDA-TB500)
This is an optional course.
The course is available to students from other study tracks and degree programmes.
After successfully completing the course, students will be able to:
- describe the differences between various machine translation paradigms (rule-based, statistical, neural machine translation)
- explain the basics of neural machine translation and the most common model architectures
- describe the issues related to machine translation evaluation
- train, test and evaluate a neural machine translation system including the appropriate pre- and post-processing steps
- read, understand and present scientific papers on current challenges in machine translation.
Students can take this course in year 1 or 2. The course is offered during the spring term in period 4.
- History of machine translation and paradigms
- Parallel corpora, alignment, preprocessing
- Human and automatic evaluation of machine translation
- Introduction to neural networks and neural machine translation
- Common model architectures: sequence-to-sequence models, attention, self-attention
- Open vocabulary translation
- Current topics in machine translation: multilingual MT, unsupervised MT, multimodal MT, domain adaptation, etc.
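As an illustration of the automatic evaluation topic listed above, the following is a minimal sketch of clipped n-gram precision, the quantity that BLEU-style metrics build on. It is not taken from the course material: the function names `ngrams` and `clipped_precision` and the example sentences are hypothetical, and the sketch assumes whitespace tokenization and a single reference translation.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hypothesis, reference, n):
    """Clipped n-gram precision of a hypothesis against one reference.

    Each hypothesis n-gram is credited at most as many times as it
    occurs in the reference, so repeating a correct word is not rewarded.
    """
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(hyp_counts.values())
    if total == 0:
        # The hypothesis is too short to contain any n-gram of this order.
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return matched / total


# Hypothetical example sentences (assumption: simple whitespace tokenization).
reference = "the cat is on the mat".split()
hypothesis = "the the the cat mat".split()

unigram = clipped_precision(hypothesis, reference, 1)
bigram = clipped_precision(hypothesis, reference, 2)
print(f"unigram precision: {unigram:.2f}")
print(f"bigram precision:  {bigram:.2f}")
```

A full BLEU score additionally combines several n-gram orders with a geometric mean and applies a brevity penalty. The sketch leaves these out to keep the clipping step visible.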
- Weekly lectures and hands-on assignments
- Shared-task-based assignment with report
- Seminar presentation on a current topic
- Philipp Koehn (2012): Statistical machine translation. Cambridge University Press. [Including appendix chapter on neural machine translation]
- Mikel Forcada (2017): Making sense of neural machine translation. In: Translation Spaces 6/2.
- Additional material distributed in the course.
The course is graded on the standard scale of 0 to 5. The grade is based on the following components:
- Hands-on assignments (⅓)
- Shared-task-based assignment with report (⅓)
- Seminar presentation on a current topic (⅓)
Prerequisites:
- Command-line course or equivalent
Recommended optional studies:
- Programming for linguists or equivalent (BA level)
- Mathematics for linguists or equivalent (BA level)
- Machine learning for linguists or equivalent (BA level)