Regression estimates of particle concentrations in an urban environment.

Machine learning methods are an integral part of the methodological toolbox in the sciences. This course will provide you with the theoretical background needed to understand fundamental machine learning concepts and to use the basic methods of supervised and unsupervised learning properly. The course will prepare you for further studies in machine learning and introduce you to the methods and tools used to solve problems in practice. The course will focus especially on problems and applications related to the atmospheric and earth sciences.

After the course, the student is able to understand and explain basic concepts in machine learning (e.g., training data, generalisation error). The student is able to map practical problems onto machine learning tasks, knows the underlying assumptions, is able to take the correct steps to solve the problems, and knows how to interpret and evaluate the outcomes. The student knows the basics of a programming environment suitable for solving machine learning problems and is able to carry out basic data analysis tasks independently in such an environment.
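As a rough illustration of the terms "training data" and "generalisation error", the sketch below fits a 1-nearest-neighbour regressor to a small synthetic data set and compares its error on the training data with its error on fresh data from the same source. Everything in it is an assumption made for illustration and is not taken from the course material: the data-generating model y = 2x + 1 + noise, the sample sizes, and the helper names `make_data`, `predict_1nn`, and `mse`.

```python
import random


def make_data(n, rng):
    """Draw n noisy samples from the hypothetical model y = 2x + 1 + noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def predict_1nn(x, train_xs, train_ys):
    """Predict y for x by copying the target of the nearest training point."""
    nearest = min(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x))
    return train_ys[nearest]


def mse(xs, ys, train_xs, train_ys):
    """Mean squared error of the 1-NN predictor on the points (xs, ys)."""
    errors = [(predict_1nn(x, train_xs, train_ys) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


rng = random.Random(0)  # fixed seed so that the run is repeatable
train_xs, train_ys = make_data(50, rng)   # training data
test_xs, test_ys = make_data(1000, rng)   # fresh data, never seen during fitting

# Error on the data the model was fitted to.
train_error = mse(train_xs, train_ys, train_xs, train_ys)
# Error on fresh data: an estimate of the generalisation error.
test_error = mse(test_xs, test_ys, train_xs, train_ys)

print(f"training error: {train_error:.3f}")
print(f"estimated generalisation error: {test_error:.3f}")
```

The 1-nearest-neighbour predictor reproduces every training target, so its training error says little about how well it predicts new observations. The error on the fresh sample is the more honest estimate.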

Prerequisites: we assume that students have some background knowledge of statistics (e.g., they know the concepts of random variables, expectation, and probability, including conditional probability), linear algebra (e.g., basic matrix and vector operations, eigenvalues, and eigenvectors), the basics of optimisation (e.g., the student understands how a function can be minimised using differentiation at "high school level"), and programming (good programming skills in some language and the ability to quickly acquire the basics of a new environment such as R or Python). The course will include a brief introduction to the above-mentioned topics, and we will provide pointers to self-study material before the course, with limited support available during the course. No specific background in atmospheric and earth system sciences is required, with the disclaimer that the examples discussed during the course may seem esoteric to students from other disciplines. Most students with, e.g., a BSc degree in physics should have sufficient mathematics and programming background to participate in the course and/or to quickly acquire the prerequisite skills.
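To give a feel for the expected level in optimisation, the sketch below minimises a simple function in two ways: by setting its derivative to zero by hand, and by repeatedly stepping against the derivative (gradient descent). The function f(x) = (x - 3)^2 + 1, the step size, and the number of iterations are hypothetical choices made for this example only.

```python
def f(x):
    """Hypothetical example function with a single minimum."""
    return (x - 3.0) ** 2 + 1.0


def df(x):
    """Derivative of f, obtained by hand: f'(x) = 2 (x - 3)."""
    return 2.0 * (x - 3.0)


# "High school" approach: solve f'(x) = 0, which gives x = 3.
x_analytic = 3.0

# Numerical approach: start from a guess and step against the derivative.
x = 0.0
step = 0.1  # assumed step size, small enough for this function
for _ in range(200):
    x -= step * df(x)

print(f"analytic minimiser: {x_analytic}, f = {f(x_analytic)}")
print(f"gradient descent:   {x:.6f}, f = {f(x):.6f}")
```

Both approaches should agree on the location of the minimum. The numerical version is the one that carries over to models where the derivative cannot be set to zero by hand.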

The course is planned as a two-week intensive course consisting of lectures, tutorials, hands-on exercises, and project work. Attendance during the two-week period is mandatory. Detailed assessment practices and criteria, as well as completion methods, will be announced later.

This intensive course will cover roughly the same machine learning content as course DATA11002 Introduction to Machine Learning, but with emphasis on applications to atmospheric and earth system research. I will use the experience gained on this intensive course to design a new variant of DATA11002, to be lectured beginning Autumn 2020.

The course will be lectured by Associate Professor Kai Puolamäki. The course assistant is Dr Anna Shcherbacheva. The course will take place on the Kumpula campus on 11-22 May 2020.

The course is available to students from other degree programmes. The number of participants is limited. When registering, students are asked to describe briefly (in 1-2 sentences) why they would like to take the course. Registration closes tentatively on 27 April 2020. If there are more enrolments than expected, students will be selected based on their reason for taking the course, their study background, and the order of registration.

The course is still being planned, so updates will follow and changes to the plan are possible! Follow this course website for updates. For more information, please contact ml-inar2020@helsinki.fi.

Enrol
11.2.2020 at 09:00 - 27.4.2020 at 23:59

Timetable

Here is the course’s teaching schedule. Check the description for possible other schedules.

Date           Time           Location
Mon 11.5.2020  09:15 - 16:00
Tue 12.5.2020  09:15 - 16:00
Wed 13.5.2020  09:15 - 16:00
Thu 14.5.2020  09:15 - 16:00
Fri 15.5.2020  09:15 - 16:00
Mon 18.5.2020  09:15 - 16:00
Tue 19.5.2020  09:15 - 16:00
Wed 20.5.2020  09:15 - 16:00
Fri 22.5.2020  09:15 - 16:00

Material

Tasks

Exercise Set 0: Prerequisite Knowledge

Submit your answers to Exercise Set 0: Prerequisite Knowledge via Moodle by 1 May 2020 at the latest.

Description

Machine learning methods are an integral part of the methodological toolbox in the sciences, including the atmospheric and earth system sciences. This course will provide you with the theoretical background needed to understand fundamental machine learning concepts and to use the basic methods of supervised and unsupervised learning properly. The course will prepare you for further studies in machine learning and introduce you to the methods and tools used to solve problems in practice. The course will focus especially on problems in the atmospheric and earth sciences.

Note that this is a Master's level course, where we assume that students have some background in statistics (e.g., random variables and the concept of probability, including conditional probability), linear algebra (e.g., basic matrix and vector operations, eigenvalues, and eigenvectors), optimisation (e.g., the student understands how a function can be minimised using differentiation at "high school level"), and programming (good programming skills in some language and the ability to quickly acquire the basics of a new environment such as R or Python). The course will include a brief introduction to the above-mentioned topics, and we will provide pointers to self-study material before the course. Most students with, e.g., a BSc degree in physics should have sufficient mathematics background to participate in the course and/or to quickly acquire the required skills.

After the course, the student is able to understand and explain basic concepts in machine learning (e.g., training data, generalisation error). The student is able to map practical problems onto machine learning tasks, knows the assumptions made, is able to take the correct steps to solve the problems, and knows how to interpret and evaluate the outcomes. The student knows the basics of a programming environment suitable for solving machine learning problems and is able to carry out basic data analysis tasks independently in such an environment.

11-22 May 2020

The course is available to students from other degree programmes. The number of participants is limited. When registering, students are asked to describe briefly (in 1-2 sentences) why they would like to take the course. Registration closes on 27 April 2020. If there are more than 20 enrolments, students will be selected based on their reason for taking the course, their study background, and the order of registration.
