
In this course, students work in small groups of 4-5 people to produce a concrete solution to a real data science problem.

Each student works on a practical data science project as part of a group, taking responsibility for individual elements of a larger project while actively interacting with the group to reach a common goal.

The project is originally offered by a client, typically a company or research lab.
Through discussion with the client, students formalize the task for a data-driven service that addresses the client's need.

To complete the task, students choose suitable tools for solving the problem, plan a timeline for the contribution of each group member, and keep a diary recording the challenges they face in the process.

At the end, each group shares their results through a written report and a presentation in front of clients and the class.



Mon 10.09.2018: The class meets and discusses practicalities. Students start to form groups.
Mon 17.09.2018: Each group is assigned a topic.
Mon 24.09.2018 & Mon 01.10.2018: Student presentations about data science tools.
Mon 08.10.2018: Work plan is due and presented in class.
Mon 12.11.2018: Mid-review presentations: students present their progress to the client and instructor.
Mon 10.12.2018: Final presentation.
Fri 14.12.2018: Diary and report are due.

Mon 10.9.2018, 12:15 - 14:00
Mon 17.9.2018, 12:15 - 14:00
Mon 24.9.2018, 12:15 - 14:00
Mon 1.10.2018, 12:15 - 14:00
Mon 8.10.2018, 12:15 - 14:00
Mon 15.10.2018, 12:15 - 14:00
Mon 29.10.2018, 12:15 - 14:00
Mon 5.11.2018, 12:15 - 14:00
Mon 12.11.2018, 12:15 - 14:00
Mon 19.11.2018, 12:15 - 14:00
Mon 26.11.2018, 12:15 - 14:00
Mon 3.12.2018, 12:15 - 14:00
Mon 10.12.2018, 12:15 - 14:00

Other teaching


Slides from the class sessions.


The Master's Programme in Data Science is responsible for the course.

The course belongs to the Data Science Methods module.

The course is primarily intended for students of the Master's Programme in Data Science. Other students can enrol, but if the course fills up, preference is given to Data Science students.

Prerequisites in terms of knowledge

Software development skills at a level sufficient for working as part of a larger software development team (good programming skills, version control, etc.), for example as obtained during a Bachelor's degree in Computer Science. Some background in modeling data: no specific set of algorithms is required, but students should be familiar with the basic process of learning models from data and evaluating their accuracy, and should know some practical models or algorithms that can be applied to such tasks.

Prerequisites for students in the Data Science programme, in terms of courses

DATA11002 Introduction to Machine Learning (or DATA12002 Probabilistic Graphical Models)

Prerequisites for other students in terms of courses

Good programming skills; DATA11001 Introduction to Data Science; at least one of: DATA11002 Introduction to Machine Learning, DATA20001 Deep Learning, DATA12002 Probabilistic Graphical Models

Recommended preceding courses


The project is about applying theoretical knowledge to practical problems, and hence all other courses in the programme support the course.

Other courses that support the further development of the competence provided by this course: Data Science Project II

The student is able to solve a practical data science challenge as part of a group, taking responsibility for individual elements of a larger project while actively interacting with the group to reach a common goal. The student can identify and formalise a need or target for a data-driven service given a context (typically a data source or a device that produces data), can choose suitable tools for solving the problem, and is able to deliver a functioning service that meets the need. The student is aware of the challenges associated with working on real data, recognises the potential limitations of data science tools, and can find information for overcoming them. The student can analyse practical data science tools and draw presentable conclusions about their usability, and is able to apply theoretical knowledge learned in other courses in practice.

Recommended time/stage of studies for completion: first year spring or during second year

Term/teaching period when the course will be offered: offered during both spring and fall, covering periods I-II and III-IV

Application of data science skills in producing a practical data science product or service. The detailed content, such as algorithms and tools used for creating the solution, depends on the practical problem and domain chosen by the group.

The course material is provided as lecture notes, slides and links to external sources.

The course combines instruction by the lecturer, presentations by the students, and long-term group work. The details of the supervision of the group work are determined case by case. The students write a study diary analysing and reflecting on their learning during the course.

The grading scale is 1-5.

The grading is based on active participation in the group work, demonstrable individual contributions to the final result, the quality and complexity of the solution and its presentation, and the quality of the work each student carries out individually rather than as a group member, such as the tool presentation and the study diary.

The course is completed as a group project. The group is jointly responsible for delivering a practical data science solution to a problem they have identified together. The group also presents the solution to the rest of the class. In addition, the course typically involves elements that each student completes alone, such as analysing a particular tool and presenting it to the other course participants, and keeping a study diary. The groups receive supervision from the teacher and possibly other instructors.