Lecture 01 Introduction

The document outlines an introductory lecture on Machine Learning by Prof. Dr. Aleksandar Bojchevski, covering various applications across fields such as biology, medicine, and robotics. It defines machine learning as a subset of artificial intelligence that enables computers to learn from data without explicit programming and discusses different types of machine learning problems, including supervised, unsupervised, and reinforcement learning. Additionally, it provides details on the course structure, schedule, homework, and grading criteria.


Machine Learning

Lecture 1: Introduction

Prof. Dr. Aleksandar Bojchevski


09.04.25
Applications across many fields:

- Biology: protein folding
- Medicine: colon cancer detection
- Chemistry: drug discovery
- Physics: ATLAS experiment
- Earth system science: weather (AtmoRep)
- Self-driving cars: road segmentation
- Robotics: arm control
- Web, recommendation and search: content recommendation, e-commerce
- Game playing: AlphaGo, AlphaStar
- Image generation: Cologne cathedral, style transfer
- Natural language processing: conversation with GPT-4
- And many others: digital twin Earth, ocean mind, face recognition

What unites all of these?

They are all using Machine Learning!
What is machine learning?

"Field of study that gives computers the ability to learn without being explicitly programmed."¹

"A field of inquiry devoted to understanding and building methods that leverage data to improve performance on some set of tasks."²

"A subset of Artificial Intelligence that enables computer systems to learn from data and improve their performance without explicit programming."³

¹ Due to Arthur Samuel, who is credited with coining the term in 1959.
² Due to Tom Mitchell.
³ Answer by GPT-4 when asked "What is Machine Learning?".
Learned from data instead of explicitly programmed

Simple example: classify transactions into legitimate vs. fraudulent.

- Rule-based approaches: rules handcrafted by human experts
- Machine learning: learning from data

A concrete sketch of this contrast follows below.
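To make the contrast concrete, here is a minimal sketch, assuming synthetic transaction data and a random forest as the learned model; the feature names, thresholds, and labels are all illustrative, not from the lecture.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical transaction features: [amount, hour_of_day, num_prior_txns]
X = rng.random((1000, 3)) * np.array([5000.0, 24.0, 50.0])
# Synthetic labels: fraud loosely tied to large night-time transactions
y = ((X[:, 0] > 3000) & (X[:, 1] < 6)).astype(int)

# Rule-based approach: a threshold handcrafted by a human expert
def is_fraud_rule(x):
    return int(x[0] > 2500 and x[1] < 7)

# Machine learning approach: a similar decision boundary learned from data
clf = RandomForestClassifier(random_state=0).fit(X, y)

x_new = np.array([[4200.0, 3.0, 12.0]])
print("rule:", is_fraud_rule(x_new[0]), "learned:", clf.predict(x_new)[0])
```

The rule encodes the expert's threshold directly; the learned model recovers a comparable boundary from the labeled examples alone.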
When to use machine learning?

Problems amenable to ML techniques:

- Impossible or expensive to solve explicitly (e.g. high-dimensional problems)
- Human expertise does not exist
- Approximate solutions are fine
- Complex, non-linear relationships
- A (large) dataset of examples is available
- Limited reliability and interpretability is fine

Good examples: recommendation, spam detection, clinical support, …

Bad examples: parole decisions, computing taxes, clinical decisions, …
Types of machine learning problems

Three main paradigms:

- Supervised learning: classification, regression
- Unsupervised learning: clustering, dimensionality reduction, generative modelling
- Reinforcement learning
Supervised learning

Given a training dataset $\mathcal{D}_{\text{train}} = \{(x_1, y_1), \ldots, (x_N, y_N)\}$, where $x_i \in \mathcal{X}$ are the features of instance $i$ and $y_i \in \mathcal{Y}$ are the corresponding targets.

Find a function $f$ that generalizes this relationship, i.e. $f(x_i) \approx y_i$.

Using $f$, make predictions $y_{\text{test}} = f(x_{\text{test}})$ for unseen test data.
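A minimal sketch of this fit/predict workflow, assuming scikit-learn and a synthetic dataset; the model choice and all parameters are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# D_train = {(x_1, y_1), ..., (x_N, y_N)}: a synthetic labeled dataset
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Find a function f with f(x_i) ≈ y_i on the training data
f = LogisticRegression().fit(X_train, y_train)

# Use f to make predictions y_test = f(x_test) for unseen test data
y_pred = f.predict(X_test)
print("test accuracy:", f.score(X_test, y_test))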
Supervised learning: Classification

If the targets $y_i$ are categorical, i.e. $y_i \in \{1, \ldots, C\}$, we have classification.

Examples: handwritten digit recognition (MNIST digits), fraud detection, object classification, cancer detection, spam detection.

Usually we have single-label (binary or multi-class) classification. Sometimes we also have multi-class multi-label classification.
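As a hedged example, here is a multi-class digit classifier using scikit-learn's small built-in digits dataset as a stand-in for MNIST; the choice of k-nearest neighbours is illustrative.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# y_i ∈ {0, ..., 9}: a single-label, 10-class classification problem
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))
```

Swapping in a different model class only changes the classifier line; the fit/predict interface stays the same.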
Supervised learning: Regression

If the targets $y_i$ are continuous numbers, e.g. $y_i \in \mathbb{R}$, we have regression.

Examples: demand forecasting, molecule toxicity prediction, stock market "prediction".

(Figure: penguins dataset)

Sometimes the targets are ordinal (e.g. age), which needs special care.
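A minimal regression sketch, assuming synthetic one-dimensional data; the coefficients and noise level are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))              # a single input feature
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1.0, 200)  # continuous target y_i ∈ R

reg = LinearRegression().fit(X, y)
print("slope:", reg.coef_[0], "intercept:", reg.intercept_)
print("prediction at x = 5:", reg.predict([[5.0]])[0])
```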
Unsupervised learning

Unsupervised learning is concerned with finding structure in unlabeled data.

Typical tasks:

- Clustering: group similar objects together
- Dimensionality reduction: project down high-dimensional data
- Generative modelling: controllably generate new "realistic" data
- Anomaly detection: find outliers in the data
Unsupervised learning: Clustering and dimensionality reduction

(Figures: clustering of MNIST digits; gene expression data; a 2D embedding of emojis.)
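A hedged sketch of both tasks, loosely mirroring the MNIST figures above, using k-means for clustering and PCA for a 2D embedding; the dataset and parameter choices are illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)   # the labels are ignored: unsupervised

# Clustering: group the 64-dimensional digit images into 10 clusters
labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(X)

# Dimensionality reduction: project the images down to 2 dimensions
X_2d = PCA(n_components=2).fit_transform(X)

print("cluster sizes:", [int((labels == k).sum()) for k in range(10)])
print("embedding shape:", X_2d.shape)
```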
Reinforcement learning

Learning by interacting with a dynamic environment. The goal is to maximize the rewards obtained by performing "desirable" actions.

(Diagram: the agent performs an action $a_t$ in the environment; the environment returns a new state $s_{t+1}$ and a reward $r_{t+1}$.)
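A minimal sketch of this agent–environment loop, with a toy environment and a random agent; this illustrates the interaction protocol only, not an actual RL algorithm.

```python
import random

def env_step(state, action):
    """Toy environment: reward +1 for moving toward 0, -1 otherwise."""
    next_state = state + action
    reward = 1.0 if abs(next_state) < abs(state) else -1.0
    return next_state, reward

random.seed(0)
state, total_reward = 5, 0.0
for t in range(20):
    action = random.choice([-1, 1])          # the agent picks action a_t
    state, reward = env_step(state, action)  # env returns s_{t+1} and r_{t+1}
    total_reward += reward                   # the agent tries to maximize this
print("return:", total_reward)
```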
Other categories

- Semi-supervised learning: combine lots of unlabeled data + sparse labeled data
- Self-supervised learning: unsupervised via supervised learning; y is part of x
- Active learning: carefully select which instances to label and query an oracle
- Learning to learn (meta-learning): learn to construct better models or optimizers
- Domain adaptation: transfer knowledge from one domain to another domain
- Ranking: learn to order instances in terms of relevance to a query
From concrete problems to abstract tasks

Problems: recommend a movie, segment a tumor, generate a molecule, play chess, assess soil health, …

Abstract tasks: classification, regression, clustering, dimensionality reduction, …

Models/Algorithms: linear model, kNN, decision tree, SVM, neural network, …

How much domain knowledge can and should we insert?
The basic recipe

- Abstract (part of) the problem to a standard task: classification, clustering, …
- Gather a dataset: images, documents, audio, graphs, molecules, targets
- Choose a model class: linear models, neural networks, trees, …
- Find a good model: optimize a loss function, select hyper-parameters, …
- Choose the right metric

A minimal end-to-end sketch of the recipe follows this list.
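Here is that end-to-end sketch, with every choice (dataset, model class, hyper-parameter, metric) made purely for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# 1) Abstract task: binary classification.  2) Gather a dataset:
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 3) Choose a model class (a decision tree) and 4) find a good model:
#    .fit() optimizes an internal loss; max_depth is a hyper-parameter choice
model = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)

# 5) Choose the right metric: F1 rather than plain accuracy
print("test F1:", f1_score(y_test, model.predict(X_test)))
```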
Learning ≈ Optimization

Find the model $f^*$ from the set $\mathcal{F}$ that minimizes the loss $\ell$ on a dataset $\mathcal{D}$:

$$f^* = \operatorname*{arg\,min}_{f \in \mathcal{F}} \ell(f, \mathcal{D})$$
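A hedged sketch of this view, minimizing a squared loss over a one-parameter model family by gradient descent; the data and model family are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, 100)
y = 2.5 * X + rng.normal(0, 0.1, 100)    # data generated with true w = 2.5

# Model family F = {f_w(x) = w * x}; loss l(f_w, D) = mean squared error
w, lr = 0.0, 0.1
for step in range(200):
    grad = np.mean(2 * (w * X - y) * X)  # gradient of the loss w.r.t. w
    w -= lr * grad                       # gradient-descent update
print("learned w:", w)                   # ≈ 2.5, the arg min of the loss
```

This one-dimensional problem has a closed-form solution; gradient descent is shown because it generalizes to model families where no closed form exists.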
Machine Learning vs. Others

(Venn diagrams relating ML to neighboring fields:)

- Machine Learning is a subset of Artificial Intelligence
- Deep Learning is a subset of Machine Learning
- Machine Learning is part of Data Science
- Machine Learning overlaps with Data Mining, Statistics, and Information Retrieval
General information

Staff
- Lecturer: Prof. Dr. Aleksandar Bojchevski
- TAs: Soroush H. Zargarbashi, Simone Antonelli, Jimin Cao, Sadegh Akhondzadeh

Details
- 9 ECTS
- Language: English
- Course materials on ILIAS

Use the ILIAS forum to ask questions – emails will likely not be answered!
Schedule and logistics

Lectures
- Tue & Wed, 16:00 – 17:30
- 321 Lecture Hall (Hörsaal) II

Tutorials
- Tue/Wed (see next slide)
- Seminar room 1.421, Sibille-Hartmann-Str. 2-8 (Building 415)

Weekly rhythm (week n): Sheet n−1 is due on Monday; lecture/tutorial takes place on Tuesday and Wednesday; Sheet n is released on Tuesday and Solution n−1 is discussed on Wednesday.
Exercise groups

This week we have 2 tutorials, on Wednesday (09.04) at 12:00 and 14:00.

From next week onward we decide based on the poll. Vote by the end of the week.
Weekly schedule

Week  Dates         Lecture                                    Tutorial
1     08.04, 09.04  Introduction + Math Refresher              Python Refresher
2     15.04, 16.04  kNN + Trees                                Sheet 1
3     22.04, 23.04  Probabilistic Inference + Evaluation       Sheet 2
4     29.04, 30.04  Linear Regression + ML Libraries           Sheet 3
5     06.05, 07.05  Linear Classification + Ensembles          Sheet 4
6     13.05, 14.05  Optimization                               Sheet 5
7     20.05, 21.05  SVM + MLPs                                 Sheet 6
8     27.05, 28.05  Deep Learning                              Sheet 7
9     03.06, 04.06  Recap + PCA                                Sheet 8
10    10.06, 11.06  No Lecture: Whitsun Holiday
11    17.06, 18.06  Manifold Learning + Matrix Factorization   Sheet 9
12    24.06, 25.06  Clustering                                 Sheet 10
13    01.07, 02.07  Privacy + Fairness                         Sheet 11
14    08.07, 09.07  Recap + Mock Exam                          Sheet 12
Homework, exam, grading

Homework
- 12 sheets in total
- Theoretical questions
- Practical coding tasks in Python and Jupyter Notebook

Exam
- Date 1: 24.07.25, 08:00 – 11:00
- Date 2: 30.09.25, 13:00 – 16:00

Bonus
- One grade-step (0.3) improvement given sufficient effort: ≥ 50% of the points on ≥ 9 of the 12 sheets
- Only grades in the range 1.3 – 4.0 can be improved (e.g. 1.3 → 1.0)
- Homework points are tallied at the end of the semester; attend the tutorial to get feedback
Exercise sheets and group formation

Each sheet has both homework problems and in-class problems. You only need to submit the homework problems; the in-class problems will be discussed in the tutorial but will not be graded.

Submit homework in groups of up to 2 people. You can also work alone by forming a group of 1.

Late submissions are not allowed, since "buffers" are already built in (see bonus).
Homework policies

Any resources that you use to solve the homework must be cited. Examples:

- Stack Overflow, to look up (part of) a coding exercise
- ChatGPT, to formulate/edit (part of) your answer
- The solution of a similar exercise (previous year, other course) used as inspiration
- …

Doing any of the above without proper citation constitutes cheating, as does copying the solution of your fellow students.

- First time you get a warning; second time you fail the course.
Reading material

Books that we mostly follow:

- Kevin Murphy, Probabilistic Machine Learning: An Introduction (link)
- Christopher M. Bishop, Pattern Recognition and Machine Learning (link)
- Simon J. Prince, Understanding Deep Learning (link)

The specific chapters and other relevant books and resources will be provided on the last slide of each lecture.

Free online (PDF) versions are available.
Prerequisites and what's next?

Brush up on your linear algebra, calculus, and probability knowledge:

- Read "Probabilistic Machine Learning: An Introduction" by Murphy [ch. 2, 3.1, 3.2, 7.1 – 7.3, 7.8]
- Read "Pattern Recognition and Machine Learning" by Bishop [ch. 1.2.0 – 1.2.3, 2.1 – 2.3.0]
- Read "Linear Algebra Review and Reference" by Kolter
- Read "Review of Probability Theory" by Maleki and Do
- Watch "Essence of linear algebra" by 3Blue1Brown
- Watch "Essence of calculus" by 3Blue1Brown

Solve the math refresher (exercise sheet 1, instructions on ILIAS).

Brush up on your coding skills. Python refresher in the tutorial on Wednesday.

Some slides are based on an older version by S. Günnemann.
