### Conduct of the course

The course exam will be arranged as a home examination on May 6th. See Moodle for details, and note that you can only take part in the exam if you have solved all the exercise problem sets.

### Description

The Master's Programme in Data Science is responsible for the course.

The course belongs to the Machine Learning module.

The course is available to students from other degree programmes.

**Prerequisites in terms of knowledge**

Understanding of probability calculus and statistics (including multivariate statistics), linear algebra (matrix calculus) and calculus (differentiation and integration). One needs to be able to fluently follow mathematical descriptions of methods and algorithms based on these concepts, and to perform simple derivations. Programming skills in a numerical language (typically Python or R) sufficient for implementing machine learning algorithms.

**Prerequisites for students in the Data Science programme, in terms of courses**

DATA11002 Introduction to Machine Learning

**Prerequisites for other students in terms of courses**

DATA11002 Introduction to Machine Learning

**Recommended preceding courses**

MAT22005 Bayesian Inference, DATA20001 Deep Learning, DATA12002 Probabilistic Graphical Models

Courses in the Machine Learning and Statistical Data Science modules

Other courses that support the further development of the competence provided by this course:

Advanced Statistical Inference, Advanced Course in Bayesian Statistics, Data Science Project

Obtains deeper knowledge of domain skills in machine learning: Can describe the basic formulation of machine learning as minimising the expected risk, and recognises alternative formulations for the risk. Can derive practical loss functions starting from the formal definition, and can describe the relationship between probabilistic models and loss minimisation. Can clearly describe the core tasks of unsupervised and supervised learning, and also recognises more advanced learning setups. Is able to derive and implement in a numerical programming language at least one algorithm suitable for each typical unsupervised learning task: clustering, factor analysis and dimensionality reduction. Can derive and implement in a numerical programming language sparse and regularised linear methods for classification and regression, and can implement some non-linear classification methods such as random forests and support vector machines. Recognises various forms of neural networks and can follow the derivation of the relevant learning algorithms and regularisation techniques. Is able to implement simple deep learning models using suitable software frameworks.

Recommended time/stage of studies for completion: first spring

Term/teaching period when the course will be offered: yearly in spring, fourth period

Formulation of machine learning as risk minimisation and as probabilistic modelling. Different kinds of machine learning tasks, also covering advanced setups such as transfer learning. Optimisation for machine learning: gradient-based methods, expectation maximisation, back-propagation. Unsupervised learning methods: clustering, factor analysis, matrix factorisation, non-linear dimensionality reduction. Supervised learning methods: linear and non-linear classifiers, kernel methods, decision trees and forests, boosting. Neural networks and deep learning: multi-layer perceptron, convolutional networks, autoencoders, Boltzmann machines.

Course book: Kevin P. Murphy, "Machine Learning: A Probabilistic Perspective", MIT Press, 2012.

The course book is complemented with additional publicly available material, and the course book may change in the future.

The primary mode of instruction consists of lectures and exercise sessions with active guidance, supported by other teaching methods when applicable. Students are encouraged to attend the lectures, and they need to solve exercise problems, including programming tasks, to reach the learning outcomes related to implementation skills. The exercise problems are formulated in an open manner to support the acquisition of problem-solving skills, and they require written presentation to facilitate learning of scientific presentation skills.

The course is completed via a combination of exam and exercises, and both parts need to be passed to complete the course. Some of the exercises involve programming.

Completing the course via a separate exam requires completing a small research project.