
Enrol
30.12.2019 at 09:00 - 19.1.2020 at 23:59

Description

The Master's programme in Data Science is responsible for the course.
The course is available to students from other degree programmes.

Prerequisites in terms of knowledge

Basics of statistics and probability calculus. Students should know how
to calculate ordinary, conditional and marginal probabilities and how to
use the sum and product rules. Furthermore, students should know the
difference between discrete and continuous distributions and how to
calculate the expectation (mean), variance, standard deviation and
p-fractiles for both kinds of distributions. Familiarity with basic
distributions such as the Binomial, Poisson, Normal and Beta
distributions is beneficial. Basic programming skills are required.
Familiarity with the R software environment is especially useful.
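
As a rough self-check of the probability prerequisites, the sum and
product rules can be sketched on a small joint distribution. The sketch
below is illustrative only: it uses Python rather than the R environment
mentioned above so that it is self-contained, and the joint table, the
variable names and the helper functions are all hypothetical.

```python
from fractions import Fraction

# Hypothetical joint distribution P(X, Y) over two binary variables.
joint = {
    ("rain", "late"): Fraction(3, 20),
    ("rain", "on_time"): Fraction(1, 20),
    ("dry", "late"): Fraction(2, 20),
    ("dry", "on_time"): Fraction(14, 20),
}

def marginal_x(x):
    """Sum rule: P(X = x) is the sum over y of P(X = x, Y = y)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def marginal_y(y):
    """Sum rule: P(Y = y) is the sum over x of P(X = x, Y = y)."""
    return sum(p for (_, yi), p in joint.items() if yi == y)

def conditional_y_given_x(y, x):
    """Product rule rearranged: P(Y = y | X = x) = P(x, y) / P(x)."""
    return joint[(x, y)] / marginal_x(x)

p_rain = marginal_x("rain")
p_late = marginal_y("late")
p_late_given_rain = conditional_y_given_x("late", "rain")

# Bayes' theorem: P(rain | late) = P(late | rain) P(rain) / P(late).
p_rain_given_late = p_late_given_rain * p_rain / p_late

print(p_rain, p_late, p_late_given_rain, p_rain_given_late)
```

Exact fractions are used so that the conditional probability obtained
through Bayes' theorem can be compared directly with the value read off
the joint table.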

Prerequisites for students in the Data Science programme, in terms of courses

None

Prerequisites for other students in terms of courses

None

Recommended preceding courses

None

MAST32004 Advanced Bayesian inference, MAST32005 Spatial modeling and
Bayesian inference, Computational statistics, High dimensional
statistics, Advanced course in Machine learning, Probabilistic graphical
models

The course covers the basic theory behind Bayesian statistical inference
and its applications to common problems in data science. After the
course, students understand Bayes' theorem and the related concepts,
including the prior, posterior and predictive distributions and the
likelihood function. Students will also be familiar with graphical model
representation and the basics of model assessment and criticism.
Students are also able to apply Bayes' theorem to write down simple
hierarchical Bayesian models for common data analysis problems, such as
basic parametric models and (generalized) linear regression. Students
will be familiar with the basic concept of Markov chain Monte Carlo
(MCMC) and are able to apply MCMC methods to fit hierarchical Bayesian
models using the R and Stan software.

First year of Master's studies
Second period

The course starts with an introduction to Bayesian inference. We will
study Bayes' theorem and its components, the prior distribution and the
likelihood function, and how these define the posterior distribution. We
will apply Bayes' theorem to inference on population parameters using
the Binomial model. After this we turn to the technical necessities of
Bayesian inference. The main mathematical operation in Bayesian analysis
is integration, which is used when solving for the posterior
distribution, in marginalization over model parameters and in
prediction. In this course, we learn how to use Markov chain Monte Carlo
(MCMC) methods to approximate the required integrals. We will use the R
and Stan software to carry out the practical calculations in all
exercises. Next, we will study (generalized) linear models and a few
common hierarchical parametric models (Binomial, Gaussian, Poisson) and
develop practical experience of their use in some common applied
questions. Finally, we will introduce model assessment and criticism
with posterior predictive checks and sensitivity analyses, and take a
quick look at more fundamental topics in Bayesian statistics, including
exchangeability and conditional independence, graphical models and model
comparison. A more thorough treatment of these topics is left for other
courses, such as MAST32004 Advanced Bayesian Inference.
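
To give a flavour of the first two topics, the following is a minimal
sketch of Bayesian inference for a Binomial success probability, where a
posterior mean is approximated with a random-walk Metropolis sampler
(one simple MCMC method) and compared with the closed-form answer. It is
not course material: the exercises use R and Stan, whereas this sketch
uses plain Python to stay self-contained. The data (7 successes in 20
trials), the uniform Beta(1, 1) prior, the step size and all function
names are assumptions made for illustration.

```python
import math
import random

def log_posterior(theta, successes, trials, a=1.0, b=1.0):
    """Unnormalized log posterior of theta: Binomial likelihood times
    a Beta(a, b) prior (assumed default: uniform prior, a = b = 1)."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return ((successes + a - 1.0) * math.log(theta)
            + (trials - successes + b - 1.0) * math.log(1.0 - theta))

def metropolis(successes, trials, n_samples=20000, step=0.2, seed=1):
    """Random-walk Metropolis sampler for theta."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, successes, trials)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, trials)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, candidate - current))
        if rng.random() < accept_prob:
            theta, current = proposal, candidate
        samples.append(theta)
    return samples

successes, trials = 7, 20          # hypothetical data
samples = metropolis(successes, trials)
kept = samples[2000:]              # discard the first draws as burn-in

# The posterior mean is an integral; MCMC approximates it by an average.
mcmc_mean = sum(kept) / len(kept)

# With a Beta(1, 1) prior the posterior is Beta(successes + 1,
# trials - successes + 1), so the exact mean is available for comparison.
exact_mean = (successes + 1.0) / (trials + 2.0)

print(f"MCMC estimate: {mcmc_mean:.3f}, exact: {exact_mean:.3f}")
```

The Binomial model with a Beta prior is convenient here because the
posterior is known in closed form, which makes it easy to check that the
sampler behaves sensibly before moving on to models where no such
formula exists.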

* Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A.
and Rubin, D. B. (2013). Bayesian Data Analysis. Chapman & Hall/CRC.
Second or third edition.
* A number of selected articles
* The R programming environment and the Stan software

* Students are required to read the course material, complete the
required proportion of the exercises and pass the exam
* The teacher and course assistants will assist during the lectures and
exercises and via Moodle

The course grade will depend on the exercises and the exam. The course
cannot be passed by doing only the exam or only the exercises.

General exams last 3 hours and 30 minutes. A renewal exam (marked with "(U)") is the first general exam after the course and also serves as a renewal of the course exam(s). In a renewal exam, the points a student has earned during the course are taken into account. Exams marked with "(HT)" are open only to students who have completed the obligatory projects or other exercises included in those courses. Exams marked with "(HT/U)" are renewal exams for students who have completed the obligatory projects during the course. General exams might cover a different area than the lectured course. Check the course web page and contact the responsible teacher if in doubt.

The course consists of lectures, exercises and an exam. Completion
requires passing the exercises and passing the exam.

Jarno Vanhatalo