The Master's Programme in Data Science is responsible for the course
The course belongs to the Machine learning module
The course is available to students from other degree programmes
Prerequisites in terms of knowledge
Understanding of probability calculus and statistics (including multivariate statistics), linear algebra (matrix calculus) and differential calculus (differentiation and integration). Students need to be able to follow fluently the mathematical descriptions of methods and algorithms based on these concepts, and to perform simple derivations. Programming skills in a numerical language (typically Python or R) sufficient for implementing machine learning algorithms are also required.
Prerequisites for students in the Data Science programme, in terms of courses
DATA11002 Introduction to Machine Learning
Prerequisites for other students in terms of courses
DATA11002 Introduction to Machine Learning
Recommended preceding courses
MAT22005 Bayesian Inference, DATA20001 Deep Learning, DATA12002 Probabilistic Graphical Models
Courses in the Machine Learning and Statistical Data Science modules
Other courses that support the further development of the competence provided by this course
Advanced Statistical Inference, Advanced Course in Bayesian Statistics, Data Science Project
Obtains deeper knowledge of domain skills in machine learning: Can describe the basic formulation of machine learning as minimising the expected risk, and recognises alternative formulations for the risk. Can derive practical loss functions starting from the formal definition, and can describe the relationship between probabilistic models and loss minimisation. Can clearly describe the core tasks of unsupervised and supervised learning, and also recognises more advanced learning setups. Is able to derive and implement in a numerical programming language at least one algorithm suitable for each typical unsupervised learning task: clustering, factor analysis and dimensionality reduction. Can derive and implement in a numerical programming language sparse and regularised linear methods for classification and regression, and can implement some non-linear classification methods such as random forests and support vector machines. Recognises various forms of neural networks and can follow the derivation of the relevant learning algorithms and regularisation techniques. Is able to implement simple deep learning models using suitable software frameworks.
Recommended time/stage of studies for completion: first spring
Term/teaching period when the course will be offered: yearly in spring, fourth period
Formulation of machine learning as risk minimisation and as probabilistic modelling. Different kinds of machine learning tasks, also covering advanced setups such as transfer learning. Optimisation for machine learning: gradient-based methods, expectation maximisation, back-propagation. Unsupervised learning methods: clustering, factor analysis, matrix factorisation, non-linear dimensionality reduction. Supervised learning methods: linear and non-linear classifiers, kernel methods, decision trees and forests, boosting. Neural networks and deep learning: multi-layer perceptron, convolutional networks, autoencoders, Boltzmann machines.
Course book: Kevin P. Murphy "Machine Learning: A Probabilistic Perspective", MIT Press, 2012.
The course book is complemented with additional publicly available material, and the course book may change in the future.
The primary mode of instruction consists of lectures and exercise sessions with active guidance, supported by other teaching methods when applicable. Students are encouraged to attend the lectures. To reach the learning outcomes related to implementation skills, they need to solve exercise problems, some of which involve programming tasks. The exercise problems are formulated in an open manner to support the acquisition of problem-solving skills, and they require written presentation to facilitate the learning of scientific presentation skills.
The following notations are used in the lists of general exams: (U): The exam is the first general exam following the course and also serves as the retake of the course exam or exams. In a retake, exercise points and similar credit are taken into account. (HT): Only those who have completed the compulsory exercise work (or similar) belonging to the course may take the exam. (HT/U): As (U), but the right to participate is restricted in the same way as for an HT exam. (no special notation): The exam is a general exam; participation is not restricted, but the prerequisites required by the course should be taken into account.
The course is completed via a combination of exam and exercises, and both parts need to be passed to complete the course. Some of the exercises involve programming.
Completing the course by separate exam requires completing a small research project.