
Foundations of Machine Learning

DSA 5102X • Lecture 5

Soufiane Hayou
Department of Mathematics
Homework 1
General comments
• Before submission, reset the kernel (Esc-0-0 + Enter) and run all cells in your notebook. Make sure the notebook runs without errors
• Normalize inputs for the Gaussian kernel (see the sketch below)

• Visualization plots
• Discover relationships
• Formulate hypothesis
• Select variables
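On the normalization point above: a minimal NumPy sketch (the data, bandwidth, and variable names are hypothetical, not from the homework) of standardizing features before applying a Gaussian kernel:

```python
import numpy as np

# Hypothetical data: 100 samples, two features on very different scales,
# so unnormalized squared distances would be dominated by the second feature
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(0, 1, 100), rng.normal(0, 100, 100)])

# Standardize each feature to zero mean and unit variance
Xn = (X - X.mean(axis=0)) / X.std(axis=0)

def gaussian_kernel(a, b, bandwidth=1.0):
    # k(a, b) = exp(-||a - b||^2 / (2 * bandwidth^2))
    return np.exp(-np.sum((a - b) ** 2) / (2 * bandwidth ** 2))

print(gaussian_kernel(Xn[0], Xn[1]))
```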
Last time
We introduced
• Neural networks
• Universal/nonlinear approximation
• Shallow and deep networks
• Optimization of deep neural networks
• Gradient Descent
• SGD, interaction with learning rates and batch sizes
• Backpropagation algorithm
Today, we will look at some modern architectures that are very useful for practical applications, and the key ideas behind them.
What’s Wrong with FCNNs?
Recall The Fully Connected NN
Architecture
Permutations
A permutation of $n$ objects is a one-to-one transformation of these objects. In other words, it is a bijection on $\{1, 2, \dots, n\}$.

Example: the permutation $p$ on $\{1,2,3,4,5,6,7,8,9\}$ with $p(1)=3$, $p(2)=9$, $\dots$ maps

$\{1,2,3,4,5,6,7,8,9\} \longmapsto \{3,9,1,5,4,6,8,7,2\}$
Permutation Invariance
Observe that the FCNN hypothesis space $\mathcal{H}$ has the following invariance property:

Suppose $f \in \mathcal{H}$. For any permutation $p$ on the indices, define $(Px)_i = x_{p(i)}$; then the function $f \circ P \in \mathcal{H}$ as well (the permutation can be absorbed into the first-layer weights).

In other words, we do not care about permuting the signal's components, since if we can fit one permutation, we can fit any permutation! Is this sensible?
[Figure: the same image shown under three random permutations of its pixels]
Limitation of FCNN
The permutation invariance property loses spatial/temporal structure in the data!
Convolutional Neural Networks
Convolutions
Convolution is a basic concept in signal processing and harmonic analysis.

Given two functions $f$ and $g$ on $\mathbb{R}$, we define their convolution as

$(f * g)(x) = \int_{-\infty}^{\infty} f(y)\, g(x - y)\, dy$

Basic properties
• Commutativity: $f * g = g * f$
• Linearity: $(\alpha f + \beta h) * g = \alpha (f * g) + \beta (h * g)$
What do convolutions do?
Convolution with a smooth bump smooths out the signal.

Is this the only thing convolutions can do?


Discrete Convolutions
Real signals are rarely truly continuous
• Inherently discrete
• Discrete samples from continuous data

In fact, discrete samples are often enough to reconstruct continuous signals, as long as the latter are band-limited! [Shannon, 1949]
Discrete Convolutions
Let $w, x : \mathbb{Z} \to \mathbb{R}$ be discrete signals.

We define the discrete convolution as

$(w * x)(k) = \sum_{i=-\infty}^{\infty} w(i)\, x(k - i)$
Convolution vs Cross-Correlation
Discrete convolution:

$(w * x)(k) = \sum_i w(i)\, x(k - i)$

Discrete cross-correlation (flip $x(k - i)$ to $x(k + i)$):

$(w \star x)(k) = \sum_i w(i)\, x(k + i)$

In deep learning, "convolution" layers typically compute cross-correlation; since the filter weights are learned, the distinction is immaterial in practice.
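A quick NumPy check of the two conventions (an asymmetric filter makes the flip visible; `np.convolve` and `np.correlate` are NumPy's built-ins, not lecture notation):

```python
import numpy as np

w = np.array([1, 2, 3])          # asymmetric filter, so the flip matters
x = np.array([3, 1, 4, 1, 5])

print(np.convolve(w, x, mode="valid"))   # true convolution (flips w): [15 12 19]
print(np.correlate(x, w, mode="valid"))  # cross-correlation (no flip): [17 12 21]
```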


Finite Convolutions and Boundary Conditions
Practical signals are finite, so we need to truncate.

Assume $w(i) = 0$ for $i \notin \{0, \dots, m\}$, with

$(w * x)(k) = \sum_{i=0}^{m} w(i)\, x(k + i)$
Finite Convolutions and Boundary Conditions
Practical signals are finite, so we need to truncate.
• Circular convolutions

w = [1 2 1]
x = [5 3 1 4 1 5 3]  (the signal [3 1 4 1 5], padded circularly)
w ∗ x = [12 9 10 11 14]
Finite Convolutions and Boundary Conditions
Practical signals are finite, so we need to truncate.
• Valid convolutions

w = [1 2 1]
x = [3 1 4 1 5]
w ∗ x = [9 10 11]
Finite Convolutions and Boundary Conditions
Practical signals are finite, so we need to truncate.
• Zero-padded convolutions with "same" padding

w = [1 2 1]
x = [0 3 1 4 1 5 0]  (the signal [3 1 4 1 5], zero-padded)
w ∗ x = [7 9 10 11 11]
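The three boundary conditions above can be checked with a few lines of NumPy; this is a sketch using the slides' convention $(w * x)(k) = \sum_i w(i)\, x(k + i)$:

```python
import numpy as np

w = np.array([1, 2, 1])
x = np.array([3, 1, 4, 1, 5])

def corr1d(w, x):
    # (w * x)(k) = sum_i w(i) x(k+i), over all k with full overlap
    return np.array([w @ x[k:k + len(w)] for k in range(len(x) - len(w) + 1)])

print(corr1d(w, x))                                   # valid:    [ 9 10 11]
print(corr1d(w, np.pad(x, 1)))                        # same:     [ 7  9 10 11 11]
print(corr1d(w, np.concatenate([x[-1:], x, x[:1]])))  # circular: [12  9 10 11 14]
```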
Convolution in 2D
When dealing with image data, we often use convolutions in 2D.

Now $w$ and $x$ are matrices, and $w * x$ is also a matrix:

$(w * x)(k, l) = \sum_i \sum_j w(i, j)\, x(k + i,\, l + j)$

We call
• $w$: convolution filter or kernel
• $x$: input signal
Example: Convolutions in 2D

https://towardsdatascience.com/intuitively-understanding-convolutions-for-deep-learning-1f6f42faee1
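As a complement to the linked animation, a minimal sketch of the 2D formula above (the filter and image values here are made up for illustration):

```python
import numpy as np

def corr2d(w, x):
    # (w * x)(k, l) = sum_{i,j} w(i, j) x(k+i, l+j), "valid" boundary
    m, n = w.shape
    K, L = x.shape[0] - m + 1, x.shape[1] - n + 1
    out = np.zeros((K, L))
    for k in range(K):
        for l in range(L):
            out[k, l] = np.sum(w * x[k:k + m, l:l + n])
    return out

x = np.array([[0, 0, 1, 1],      # a toy "image" with a vertical edge
              [0, 0, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
w = np.array([[-1, 0, 1],        # a filter that responds to vertical edges
              [-1, 0, 1],
              [-1, 0, 1]], dtype=float)
print(corr2d(w, x))              # [[3. 3.], [3. 3.]]: strong response at the edge
```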
Image Input Data
Most basic: grayscale
• $x \in \mathbb{R}^{d_1 \times d_2}$ (a matrix)

Colored (RGB) data:
• $x \in \mathbb{R}^{d_1 \times d_2 \times c}$ with $c = 3$
• $c$ is the number of image channels
Convolutional Layer
A convolutional layer is a basic building block of convolutional neural networks. It is of the form

$h = \sigma(w * x + b)$

Compare with the fully connected case

$h = \sigma(W x + b)$
Why Convolutions?
Weight sharing: a fully connected layer has an independent weight $W_{ij}$ for every input–output pair, whereas a convolutional layer reuses the same few filter weights $w(i)$ at every output position.

Still matrix multiplication, but with shared weights! (See the sketch below.)
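To make the weight-sharing point concrete, here is a sketch writing the valid convolution from before as multiplication by a banded matrix whose rows all contain the same three weights:

```python
import numpy as np

w = np.array([1, 2, 1])          # 3 shared weights
x = np.array([3, 1, 4, 1, 5])
n_out = len(x) - len(w) + 1      # "valid" output length

# Equivalent dense matrix: the filter shifted by one position per row
W = np.zeros((n_out, len(x)))
for k in range(n_out):
    W[k, k:k + len(w)] = w

print(W)       # [[1. 2. 1. 0. 0.]
               #  [0. 1. 2. 1. 0.]
               #  [0. 0. 1. 2. 1.]]
print(W @ x)   # [ 9. 10. 11.] -- same as the valid convolution above
```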


Effective Feature Extractors

[Figure: two filters $w_1$ and $w_2$ applied to the same image $x$; $w_1 * x$ and $w_2 * x$ extract different features]
Equivariance
Let $T$ be some transformation on the space of input signals. We say that a function $f$ is equivariant with $T$ if

$f(T(x)) = T(f(x))$

Observe: convolutions are equivariant with translations!

What is the significance of this?
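A small numerical check of translation equivariance, using a circular convolution so that translation is exact (filter and signal values are arbitrary):

```python
import numpy as np

w = np.array([1, 2, 1])
x = np.array([3, 1, 4, 1, 5])

def circ_corr(w, x):
    # circular (w * x)(k) = sum_i w(i) x((k+i) mod len(x))
    xp = np.concatenate([x, x[:len(w) - 1]])
    return np.array([w @ xp[k:k + len(w)] for k in range(len(x))])

T = lambda x: np.roll(x, 1)      # T = translation by one step

print(circ_corr(w, T(x)))        # f(T(x)): [12  9 10 11 14]
print(T(circ_corr(w, x)))        # T(f(x)): [12  9 10 11 14] -- equivariant
```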


Invariance
On the other hand, we say a function $f$ is invariant with respect to $T$ if

$f(T(x)) = f(x)$

Examples: global max or average pooling is invariant to translations.

Other examples?
Equivariance and Invariance
For some transformation $T$, if $g$ is equivariant and $f$ is invariant, then $f \circ g$ is invariant:

$f(g(T(x))) = f(T(g(x))) = f(g(x))$

Furthermore, if $g_1, \dots, g_L$ are equivariant, then

$f \circ g_L \circ \cdots \circ g_1$

is also invariant.

Convolution layers can help build complex (approximately) translation invariant models!
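The composition argument can also be checked numerically: a circular convolution (equivariant) followed by a global max (invariant) gives a translation invariant function (values again arbitrary):

```python
import numpy as np

w = np.array([1, 2, 1])
x = np.array([3, 1, 4, 1, 5])

def g(x):
    # circular convolution with w: equivariant with translations
    xp = np.concatenate([x, x[:len(w) - 1]])
    return np.array([w @ xp[k:k + len(w)] for k in range(len(x))])

f = np.max  # global max pooling: invariant to translations

print(f(g(x)), f(g(np.roll(x, 2))))  # 14 14 -- f(g(.)) is translation invariant
```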
Shrinking/Focusing the Hypothesis Space

[Figure: the hypothesis space $\mathcal{H}$ shrunk to the subset $\mathcal{H}' = \{f : f(T(x)) = f(x)\}$ of invariant functions containing the target $f$]
Pooling Layers
Max pooling in 1D with stride 2:

x = [3 1 4 1 5 9]  →  [3 4 9]

Max pooling in 2D with stride 2:

x =
[3 1 4 1]
[5 9 2 6]  →  [9 6]
[5 3 5 8]     [9 9]
[9 7 9 3]
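A minimal sketch of 1D max pooling matching the example above (window size and stride both 2):

```python
import numpy as np

def max_pool_1d(x, size=2, stride=2):
    # slide a window of `size` over x in jumps of `stride`, keeping the max
    return np.array([x[k:k + size].max()
                     for k in range(0, len(x) - size + 1, stride)])

print(max_pool_1d(np.array([3, 1, 4, 1, 5, 9])))  # [3 4 9]
```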
A Typical CNN Architecture

[Figure: input → Conv 1 (filters) → Max Pool → Conv 2 (filters) → Max Pool → Flatten → FCNN → output probabilities, e.g. (0.9, 0.1)]


Demo: CNNs for Image Classification
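The demo itself is not reproduced here; as a minimal sketch of the typical architecture above in PyTorch (all sizes hypothetical, e.g. 28×28 grayscale inputs and 10 classes):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # Conv 1: 16 filters
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # Conv 2: 32 filters
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                    # FCNN head -> class scores
)

x = torch.randn(8, 1, 28, 28)  # a dummy batch of 8 images
print(model(x).shape)          # torch.Size([8, 10])
```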
Recurrent Neural Networks
Time Series Data
Another type of data with structure: a sequence $x_1, x_2, \dots, x_\tau$.

The outputs may also be temporal: $y_1, y_2, \dots, y_\tau$.

We need to model the relationship between $\{x_t\}$ and $\{y_t\}$. Say $y_\tau = F_\tau(x_1, \dots, x_\tau)$, and we want to approximate/learn $F_\tau$.
The Recurrent Architecture
Recurrent Neural Networks (RNNs) model the relationship between $\{x_\tau\}$ and $\{y_\tau\}$ as

$h_{\tau+1} = g(h_\tau, x_{\tau+1}, \theta), \qquad o_\tau = u(h_\tau, \phi)$

• $\{h_\tau\}$ are the hidden states. Their purpose is to make the system memoryless: the next state depends only on the current state and the current input
• $\theta$ and $\phi$ are the trainable parameters
The Recurrent Architecture

$h_{\tau+1} = g(h_\tau, x_{\tau+1}, \theta)$
$o_\tau = u(h_\tau, \phi)$

[Figure: recurrent cell: input $x_{\tau+1}$ and state $h_\tau$ produce the output $o_\tau$, which is compared against the target $y_\tau$]

Minimize a loss over time, e.g. $\sum_\tau \ell(o_\tau, y_\tau)$: the parameters are trained so that the outputs $o_\tau$ are close to the targets $y_\tau$.


How to Optimize RNNs?

[Figure: the RNN unrolled in time: $h_0 \to h_1 \to \cdots \to h_\tau$ via $h_t = g(h_{t-1}, x_t, \theta)$, with outputs $o_t = u(h_t, \phi)$ compared against the targets $y_t$; gradients are backpropagated through the unrolled graph]
Important Observations
• Parameters are shared in time, as opposed to space as in CNNs

• Temporal/causal structure is preserved

• The hidden states can be made large to give the system some form of memory
• By changing the form of the hidden states and/or the functions $g$ and $u$, we can obtain many variants, including Gated Recurrent Units (GRU) and Long Short-Term Memory (LSTM); a minimal sketch of the plain recurrence follows below
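A minimal NumPy sketch of the plain recurrence $h_{\tau+1} = g(h_\tau, x_{\tau+1}, \theta)$, $o_\tau = u(h_\tau, \phi)$ (all sizes, initializations, and the tanh nonlinearity are illustrative choices, not fixed by the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
d_x, d_h, d_o = 5, 8, 3            # hypothetical input/hidden/output sizes

# theta = (W_h, W_x, b), phi = (W_o, c): the trainable parameters
W_h = rng.normal(scale=0.1, size=(d_h, d_h))
W_x = rng.normal(scale=0.1, size=(d_h, d_x))
b = np.zeros(d_h)
W_o = rng.normal(scale=0.1, size=(d_o, d_h))
c = np.zeros(d_o)

def g(h, x):                       # h_{tau+1} = g(h_tau, x_{tau+1}, theta)
    return np.tanh(W_h @ h + W_x @ x + b)

def u(h):                          # o_tau = u(h_tau, phi)
    return W_o @ h + c

h = np.zeros(d_h)                  # initial hidden state h_0
xs = rng.normal(size=(10, d_x))    # a dummy length-10 input sequence
for x in xs:
    h = g(h, x)                    # the same parameters at every time step
    o = u(h)                       # compare o against the target y_tau when training
print(o.shape)                     # (3,)
```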
Demo: RNNs for Text Generation
Summary
Convolutional Neural Networks
• Suitable for data with spatial structure (e.g. images)
• Convolution and pooling build equivariance and invariance

Recurrent Neural Networks
• Suitable for data with temporal structure (e.g. text)

They can build powerful models!


Project
• Use at least one of fully connected NN, CNN, or RNN models in your project.

• Due to limited computing resources, I am not looking for high accuracy or strong performance on large network structures. Rather, I am looking for correct usage, so try it on simple models if you like.
Test
• Style: similar to Homework 2

• Duration: 1 hour 30 mins

• Date and time:
  • Main slot: 2:00pm to 3:30pm on 2 Oct 2021

• Practice: test questions/answers from last year will be uploaded in the next few days.
