### Description

The Master's programme in Data Science is responsible for the course.

The course is available to students from other degree programmes.

**Prerequisites in terms of knowledge**

Basics of statistics and probability calculus. Students should know how
to calculate ordinary, conditional and marginal probabilities and how to
use the sum and product rules. Furthermore, students should know the
difference between discrete and continuous distributions and how to
calculate the expectation (mean), variance, standard deviation and
p-fractiles for both kinds of distributions. Familiarity with basic
distributions such as the Binomial, Poisson, Normal and Beta
distributions is beneficial.

Basic programming skills are required. Knowledge of the R software
environment is especially useful.

**Prerequisites for students in the Data Science programme, in terms of courses**

None

**Prerequisites for other students in terms of courses**

None

**Recommended preceding courses**

None

MAST32004 Advanced Bayesian inference, MAST32005 Spatial modeling and
Bayesian inference, Computational statistics, High dimensional
statistics, Advanced course in Machine learning, Probabilistic graphical
models

The course covers the basic theory behind Bayesian statistical inference
and its applications to common problems in data science. After the
course, students understand Bayes' theorem and the related concepts,
including the prior, posterior and predictive distributions and the
likelihood function. Students will also be familiar with graphical model
representation and the basics of model assessment and criticism.
Students are also able to apply Bayes' theorem to write down simple
hierarchical Bayesian models for common data analysis problems, such as
basic parametric models and (generalized) linear regression. Students
will be familiar with the basic concept of Markov chain Monte Carlo
(MCMC) and are able to apply MCMC methods to fit hierarchical Bayesian
models using the R and Stan software.

First year of Master's studies

Second period

The course starts with an introduction to Bayesian inference. We will
study Bayes' theorem and its components, the prior distribution and the
likelihood function, and how these define the posterior distribution. We
will apply Bayes' theorem to inference on population parameters using
the Binomial model. After this, we turn to the technical necessities of
Bayesian inference. The main mathematical operation in Bayesian analysis
is integration, which is used when solving for the posterior
distribution, in marginalization over model parameters and in
prediction. In this course, we learn how to use Markov chain Monte Carlo
(MCMC) methods to approximate the required integrals. We will use the R
and Stan software to conduct the practical calculations in all
exercises. Next, we will study (generalized) linear models and a few
common hierarchical parametric models (Binomial, Gaussian, Poisson) and
develop practical experience in their use in some common applied
questions. Finally, we will introduce model assessment and criticism
with posterior predictive checks and sensitivity analyses, and take a
quick look at more fundamental topics in Bayesian statistics, including
exchangeability and conditional independence, graphical models and model
comparison. However, a more thorough treatment of these topics will be
left for other courses such as MAST32004 Advanced Bayesian inference.
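
To give a flavour of the Binomial-model and MCMC steps described above, here is a minimal sketch. It is not course material: the course itself uses R and Stan, whereas this example uses only the Python standard library. The data (7 successes in 20 trials), the uniform Beta(1, 1) prior, and the function names `log_posterior` and `metropolis` are assumptions made for illustration. A simple random-walk Metropolis sampler approximates the posterior mean, which for this conjugate model can also be computed exactly for comparison.

```python
import math
import random


def log_posterior(theta, successes, trials, a=1.0, b=1.0):
    """Unnormalised log posterior: Binomial likelihood times Beta(a, b) prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return ((a + successes - 1.0) * math.log(theta)
            + (b + trials - successes - 1.0) * math.log(1.0 - theta))


def metropolis(successes, trials, n_samples=20000, step=0.1, seed=1):
    """Random-walk Metropolis sampler for theta; returns draws after burn-in."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, successes, trials)
    draws = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, trials)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        draws.append(theta)
    return draws[n_samples // 10:]  # discard the first 10% as burn-in


# Made-up data for illustration: 7 successes in 20 trials.
successes, trials = 7, 20
draws = metropolis(successes, trials)
mcmc_mean = sum(draws) / len(draws)

# With a Beta(1, 1) prior the exact posterior is Beta(1 + 7, 1 + 13),
# whose mean is (1 + successes) / (2 + trials).
exact_mean = (1.0 + successes) / (2.0 + trials)
print(f"MCMC estimate: {mcmc_mean:.3f}, exact: {exact_mean:.3f}")
```

In this toy case the integral for the posterior mean has a closed form, so the sampler is unnecessary; the point is only that the same sampling idea still applies when no closed form is available, which is the situation probabilistic programming tools such as Stan are designed for.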

* Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A. and
  Rubin, D. B. (2013). Bayesian Data Analysis. Chapman & Hall/CRC. Second
  or third edition.
* A number of selected articles
* R programming environment and Stan software

* Students are required to read the course material, complete the
  required proportion of the exercises and pass the exam
* The teacher and course assistants will assist during the lectures and
  exercises and via Moodle

The course grade will depend on the exercises and the exam. The course
cannot be passed by doing only the exam or only the exercises.

General exams last 3 hours and 30 minutes. A renewal exam (marked with "(U)") is the first general exam after the course and also serves as a renewal of the course exam(s). In a renewal exam, the points a student has earned during the course are taken into account. Exams marked with "(HT)" are open only to students who have completed the obligatory projects or other exercises included in those courses. Exams marked with "(HT/U)" are renewal exams for students who have completed the obligatory projects during the course. General exams may cover a different area than the lectured course. Check the course web page and contact the responsible teacher if in doubt.

The course consists of lectures, exercises and an exam. Completion
requires passing the exercises and passing the exam.

Jarno Vanhatalo