### Instruction

| Name | Cr | Method of study | Time | Location | Organiser |
|---|---|---|---|---|---|
| Introduction to Machine Learning (HT/U) | 5 Cr | General Examination | 30.1.2020 - 30.1.2020 | | |
| Introduction to Machine Learning | 5 Cr | Course exam | 17.12.2019 - 17.12.2019 | | |
| Introduction to Machine Learning | 5 Cr | Lecture Course | 30.10.2019 - 13.12.2019 | | |
| Introduction to Machine Learning (HT) | 5 Cr | General Examination | 11.9.2019 - 11.9.2019 | | |
| Introduction to Machine Learning (HT) | 5 Cr | General Examination | 7.6.2019 - 7.6.2019 | | |
| Introduction to Machine Learning (HT) | 5 Cr | General Examination | 17.4.2019 - 17.4.2019 | | |
| Introduction to Machine Learning (HT/U) | 5 Cr | General Examination | 31.1.2019 - 31.1.2019 | | |
| Introduction to Machine Learning | 5 Cr | Examination | 18.12.2018 - 18.12.2018 | | |
| Introduction to Machine Learning | 5 Cr | Lecture Course | 1.11.2018 - 14.12.2018 | | |
| Introduction to Machine Learning (HT) | 5 Cr | General Examination | 12.9.2018 - 12.9.2018 | | |
| Introduction to Machine Learning (HT) | 5 Cr | General Examination | 8.6.2018 - 8.6.2018 | | |
| Introduction to Machine Learning (HT) | 5 Cr | General Examination | 18.4.2018 - 18.4.2018 | | |
| Introduction to Machine Learning (HT/U) | 5 Cr | General Examination | 31.1.2018 - 31.1.2018 | | |
| Introduction to Machine Learning | 5 Cr | Lecture Course | 1.11.2017 - 15.12.2017 | | |

### Target group

Data Science Master's programme

Data Science Methods

The course is available to students from other degree programmes

### Prerequisites

**Prerequisites in terms of knowledge**

Basics of probability calculus and statistics (including multivariate probability, Bayes' formula, and maximum likelihood estimators) and intermediate-level linear algebra (including multivariate calculus). Good programming skills in some language and the ability to quickly acquire the basics of a new environment (R or Python with NumPy/SciPy). Some knowledge of data science and artificial intelligence is useful but not required.

**Prerequisites for students in the Data Science programme, in terms of courses**

None

**Prerequisites for other students in terms of courses**

Introduction to statistics (including multivariate probability, Bayes' formula, and maximum likelihood estimators). Linear algebra and matrices I-II (including multivariate calculus). TKT10002 Introduction to Programming and TKT10003 Advanced Course in Programming (i.e., good programming skills in some language and the ability to quickly acquire the basics of a new environment such as R or Python with NumPy/SciPy).

**Recommended preceding courses**

DATA11001 Introduction to Data Science and DATA15001 Introduction to Artificial Intelligence

### Learning outcomes

- Defines and is able to explain basic concepts in machine learning (e.g. training data, feature, model selection, loss function, training error, test error, overfitting)
- Recognises various machine learning problems and methods suitable for them: supervised vs unsupervised learning, discriminative vs generative learning paradigm, symbolic vs numeric data
- Knows the basics of a programming environment (such as R or Python with NumPy/SciPy) suitable for machine learning applications
- Is able to implement at least one distance-based, one linear, and one generative classification method, and apply these to solving simple classification problems
- Is able to implement and apply linear regression to solve simple regression problems
- Explains the assumptions behind the machine learning methods presented in the course
- Implements testing and cross-validation methods, and is able to apply them to evaluate the performance of machine learning methods and to perform model selection
- Comprehends the most important clustering formalisms (distance measures, k-means clustering, hierarchical clustering)
- Explains the idea of the k-means clustering algorithm and is able to implement it
- Is able to implement a method for hierarchical clustering and can interpret its results

### Timing

First semester (Autumn)

Typically 2nd period

### Contents

- statistical learning, models and data, evaluating performance, overfitting, bias-variance tradeoff
- linear regression
- classification: logistic regression, linear and quadratic discriminant analysis, naive Bayes, nearest neighbour classifier, decision trees, support vector machine
- clustering (flat and hierarchical); k-means, agglomerative clustering
- resampling methods (cross-validation, bootstrap), ensemble methods (bagging, random forests)

### Activities and teaching methods in support of learning

The course involves weekly exercises that include both programming tasks and pen-and-paper problems.

### Study materials

Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani: *An Introduction to Statistical Learning with Applications in R*, Springer, 2013.

The required parts of the textbook are specified on the course web page.

### Assessment practices and criteria

Assessment and grading are based on completed exercises and a course exam. Any other criteria will be specified on the course web page.

### Recommended optional studies

Courses in the *Machine Learning* module

### Completion methods

- Contact teaching
- Any attendance requirements are specified each year on the course web page
- Completion is based on exercises and one or more exams. Any other methods of completion will be announced on the course web page.