Variable selection using the glmnet R package.

How can statistical inference be done with the large numbers of variables encountered across modern data science applications?

Registration
23.10.2017 at 09:00 - 13.12.2017 at 23:59

Schedule

This section contains the course's teaching schedule. Check the description for any other schedules.

Date            Time            Location
Mon 30.10.2017  10:15 - 12:00
Tue 31.10.2017  12:15 - 14:00
Mon 6.11.2017   10:15 - 12:00
Tue 7.11.2017   12:15 - 14:00
Mon 13.11.2017  10:15 - 12:00
Tue 14.11.2017  12:15 - 14:00
Mon 20.11.2017  10:15 - 12:00
Tue 21.11.2017  12:15 - 14:00
Mon 27.11.2017  10:15 - 12:00
Tue 28.11.2017  12:15 - 14:00
Mon 4.12.2017   10:15 - 12:00
Tue 5.12.2017   12:15 - 14:00
Mon 11.12.2017  10:15 - 12:00
Tue 12.12.2017  12:15 - 14:00

Other teaching

01.11. - 29.11.2017 Wed 14.15-16.00
13.12.2017 Wed 14.15-16.00
Matti Pirinen
Language of instruction: English

Description

The Master's Programme in Mathematics and Statistics (MAST) is responsible for the course.

The course belongs to the Statistics study track of MAST.

The course is available to students from other degree programmes.

BSc-level statistics, R software

Computational statistics

Knowledge of methods for high-dimensional inference problems and practical experience of solving them with a computer.

Recommended time/stage of studies for completion: 1. or 2. year.

The course is lectured every second year.

Statistical inference when the number of data units is large and/or each unit has been measured on a large number of variables.

1. Large scale inference (P-values, false discovery rates, Bayesian posterior probabilities).

2. Variable selection (stepwise regression, AIC, BIC; penalized regression with glmnet package using Lasso, ridge regression and elastic net).

3. Dimension reduction (principal components analysis, singular value decomposition).
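
As an illustration of the penalized regression named in topic 2, here is a minimal sketch of the lasso fitted by cyclic coordinate descent, the type of algorithm glmnet is built on. It is written in plain Python rather than the R used on the course, and it is not glmnet's code: the function names, the toy data and the simplifications (no intercept, a single fixed penalty `lam`, a fixed number of sweeps) are assumptions made for the example. The objective minimized is (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

```python
def soft_threshold(z, gamma):
    """Shrink z towards zero by gamma; return exactly 0 when |z| <= gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Lasso coefficients for centred data (no intercept is fitted).

    X is a list of n rows with p values each, y is a list of n responses,
    lam is the L1 penalty weight (hypothetical name for this sketch).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]

    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero column carries no information
            # Covariance of column j with the residual that ignores beta[j].
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Toy data: two orthogonal, centred predictors; y = 2 * x1 + 0.1 * x2.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]

beta = lasso_coordinate_descent(X, y, lam=0.5)
print(beta)
```

With this penalty the weak second predictor is dropped (its coefficient is set exactly to zero) while the strong first one is kept but shrunk towards zero, which is how the lasso performs variable selection. In practice the penalty is chosen by cross-validation over a path of values rather than fixed in advance.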

Applications across modern data science.

Weekly exercises and exam.

Lectures, computer exercises, project work.

Graded 1-5 based on exercises and exam.

Exercises and project work.