[Fall 2024] Deep Learning 3
Outline
1. Representation Learning
2. Transfer Learning
3. Unsupervised and Self-Supervised Learning
What are we trying to learn?
What is a “feature”
● Consider the classical machine learning problem:
○ You have an input x and some function 𝚽(x) that returns any relevant features from x
■ For example, if x is a house and y is its selling price, then 𝚽(x) could be a vector with
information like the house age, the number of rooms, whether it has a basement or not, etc.
○ You pass the features into some model fθ(𝚽(x)), parameterized by θ, to predict the label y
○ The machine learning problem is to “learn” θ from a training dataset of (x, y) pairs
[Table: example houses described by hand-crafted features (House Age, Number of Bathrooms, Number of Bedrooms, Size (sq. feet), Floors, Basement?, Garage?, Backyard?, Pool?) together with an Output column, the selling price of each house]
Deep learning is Representation learning
● Deep learning allows a model to learn “good” representations directly from data!
○ The main idea is to relinquish all control to the model and let it learn whatever it feels is important to
solve the task at hand
○ Features are synonymous with representations in ML
● The output of each layer in a neural network is a learned representation so deep
learning can be viewed as the process of learning stacked representations
○ We call these representations hierarchical, i.e., later representations depend on, and are more
abstract than, earlier ones — depth refines representations
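As an illustrative sketch (not part of the original slides), one way to see these stacked representations is to register forward hooks on a pretrained torchvision ResNet-18 and inspect the output of each residual stage:

```python
import torch
from torchvision import models

# Load an ImageNet-pretrained ResNet-18 and record the output (the learned
# representation) of each of its four residual stages.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

activations = {}
def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

for name in ["layer1", "layer2", "layer3", "layer4"]:
    getattr(model, name).register_forward_hook(save_activation(name))

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))   # a dummy "image"

for name, feat in activations.items():
    print(name, tuple(feat.shape))
# Depth trades spatial resolution for more abstract channels:
# layer1 (1, 64, 56, 56) ... layer4 (1, 512, 7, 7)
```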
Deep learning is Representation learning
Transfer Learning
Training a network from scratch
● Training a good model from scratch takes
○ Time
○ Compute
○ Training data — the more, the better.
■ Models benefit significantly from A LOT of data — especially in computer vision, a few
thousand examples usually doesn’t cut it.
● Money for all of the above
Transfer learning motivation
● When trained from scratch, a model’s parameters are initialized randomly and
updated by some optimization algorithm like stochastic gradient descent or Adam
● So, if there are two different tasks to be solved, no matter how similar, this process
will be repeated separately both times
● However, does it really need to be? Consider the case when we humans learn
something new: do we ALWAYS start from the ground up?
Transfer learning motivation
● Let’s apply this idea to deep neural networks — consider training separate models
for cat/dog classification and face recognition
○ They will all need to learn a suitable set of feature extractors for their inputs. Since representations
are hierarchical:
■ Low-level (earlier) layers might learn feature extractors for very concrete details like edges,
corners, shapes, patterns, colors, etc.
■ High-level (deeper) layers might learn feature extractors for abstract concepts like mental
models of cats/dogs or human faces
○ Note: these low-level features are general enough to be extracted from any kind of image for
any kind of task; representations/features only start to diverge significantly across tasks in
the later layers
○ Don’t worry about what exactly these feature extractors / neural network layers look like for now…
we will discuss that next time
Take some hypothetical model trained on
image inputs — what kind of information
and/or visual content does each layer look for?
Transfer learning motivation
● What about a neural network trained for
ImageNet classification?
○ The ImageNet dataset is a huge collection of ~1.3 million images divided across 1000
incredibly diverse classes
○ ResNet, AlexNet and DenseNet are just special kinds of neural network architectures…
we will cover some of them in detail soon enough, but just think of them as some black-box NNs for now
○ Takeaway: different models trained for image classification learn very similar lower layers
[Figure: first-layer filters of ResNet 18, ResNet 101, AlexNet and DenseNet 121. The first layers of completely different models, trained separately, are trying to extract the same kind of information from an input image!]
Transfer learning motivation
How can we transfer the “knowledge gained” by one network to another? What are we
really “sharing” between the two networks? Or rather, what can we possibly share
between them?
[Diagram: a network mapping input x through several layers to output y]
Freezing layers
● Take the pretrained network from task 1
● Freeze some of the initial layers (i.e. disable gradient updates to them) and treat them as a fixed feature
extractor — take the activations from these layers as some intermediate representation of your input
[Diagram: x → frozen layers → remaining layers → y]
Freezing layers
● Take the pretrained network from task 1
● Freeze some of the initial layers (i.e. disable gradient updates to them) and treat them as a fixed feature
extractor — take the activations from these layers as some intermediate representation of your input
● Discard the remaining unfrozen later layers (possibly none)
[Diagram: x → frozen layers → y, with the later layers discarded]
Freezing layers
● Take the pretrained network from task 1
● Freeze some of the initial layers (i.e. disable gradient updates to them) and treat them as a fixed feature
extractor — take the activations from these layers as some intermediate representation of your input
● Discard the remaining unfrozen later layers (possibly none)
● Optionally, attach a new, unfrozen custom network (often just a few randomly initialized linear layers)
on top of the frozen layers
[Diagram: x → frozen layers → new custom layers → y]
Aside: in computer vision literature, if this custom network is only a single linear layer, then this process is also called linear probing!
Freezing layers
● Take the pretrained network from task 1
● Freeze some of the initial layers (i.e. disable gradient updates to them) and treat them as a fixed feature
extractor — take the activations from these layers as some intermediate representation of your input
● Discard the remaining unfrozen later layers (possibly none)
● Optionally, attach a new, unfrozen custom network (often just a few randomly initialized linear layers)
on top of the frozen layers
● Train these new layers on unseen data from task 2
[Diagram: x → frozen layers → new custom layers → y]
Aside: in computer vision literature, if this custom network is only a single linear layer, then this process is also called linear probing!
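A minimal PyTorch sketch of this recipe, under the assumption that task 1 is ImageNet classification (so we can reuse a pretrained torchvision ResNet-18) and that task 2 is a hypothetical 10-class problem:

```python
import torch
import torch.nn as nn
from torchvision import models

# Task 1: ImageNet classification, so we can grab a pretrained ResNet-18.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained layers: gradient updates to them are disabled.
for param in backbone.parameters():
    param.requires_grad = False

# Replace the old classification head with a new, randomly initialized one for
# task 2 (a hypothetical 10-class problem). A single linear layer like this is
# exactly the "linear probing" setup mentioned above.
backbone.fc = nn.Linear(backbone.fc.in_features, 10)

# Only the new head is trainable; the frozen layers act as a fixed feature extractor.
trainable = [p for p in backbone.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)
```

During training on task 2, only the new head's parameters change; everything upstream simply produces intermediate representations of the input.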
Fine-tuning layers
● Instead of discarding the non-frozen layers of the pretrained network, fine-tune them, i.e., train them
further (starting with the same pre-trained weights from task 1) on the new data for task 2.
[Diagram: x → frozen layers → trainable (fine-tuned) layers → y]
Fine-tuning layers
● Instead of discarding the non-frozen layers of the pretrained network, fine-tune them, i.e., train them
further (starting with the same pre-trained weights from task 1) on the new data for task 2.
● Can also simultaneously train any (optional) additional output layers (again, these tend to just be
randomly initialized linear layers) on top of the original pre-trained network layers as necessary!
[Diagram: x → pretrained (fine-tuned) layers → new output layers → y]
Fine-tuning the whole network
In some cases, it might be favorable to fine-tune the entire pretrained network rather
than some subset of layers. This can equivalently be viewed as initializing a network’s
parameters to a pretrained network’s parameters, instead of the usual random
initialization. In other words, a good pretrained network gives you a “head-start” in the
training process.
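As a hedged sketch of this idea (again assuming an ImageNet-pretrained ResNet-18 and a hypothetical 10-class downstream task; the learning rates are arbitrary placeholders), every parameter stays trainable and the pretrained weights simply serve as the initialization:

```python
import torch
import torch.nn as nn
from torchvision import models

# Initialize from pretrained weights instead of random ones, swap in a new head,
# and keep every parameter trainable. The 10 classes and the learning rates
# below are placeholders; a smaller rate for the pretrained backbone is a common
# choice so its weights are only gently adjusted.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 10)

optimizer = torch.optim.SGD(
    [
        {"params": [p for n, p in model.named_parameters() if not n.startswith("fc")],
         "lr": 1e-4},                                   # pretrained layers: small updates
        {"params": model.fc.parameters(), "lr": 1e-2},  # fresh head: larger updates
    ],
    lr=1e-4, momentum=0.9,
)
```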
Fine-tuning vs freezing
Choosing the number of frozen, fine-tunable and custom layers greatly depends on the
problem at hand. Nonetheless, here is what CS231N (Stanford’s DL course) suggests
for four common scenarios in which transfer learning is generally applicable:
[Diagram: Dataset 1 → Learning system (task 1) → Knowledge → Learning system (task 2) ← Dataset 2]
Unsupervised Learning
Supervised vs unsupervised learning
● Supervised learning — the training process receives supervision from labels
○ Ex: classification, regression, object detection, etc.
● Unsupervised learning — the training process receives no supervision from labels
○ Ex: dimensionality reduction (PCA, t-SNE), clustering, generative modeling, etc.
Unsupervised representation learning
● Models trained on large labeled datasets learn pretty useful features in general,
but can we also learn them from completely unlabeled data?
● Imagine scraping a giant collection of images, text, audio samples, etc. from the
web without having to manually label them. Consider the implications of this:
○ ImageNet only has a million images but there are billions of images online
○ Curated text datasets (like the English Wikipedia) may have billions of tokens but there are trillions
of words on the internet
● Moreover, labeling data for good supervision can be extremely difficult, tedious,
expensive and time-consuming
○ Ex. medical data, legal data, etc.
Self-Supervised Learning
● Self-supervised learning — the model first generates labels from the raw data itself and
then trains in a supervised manner
● “The general technique of self-supervised learning is to predict any unobserved or
hidden part of the input from any observed or unhidden part of the input”
Some terminology
● Recall how we have been referring to a “task 1” and “task 2” during our discussion
of transfer learning so far. It turns out they each have a special name:
○ Pretext task — the task on which a self-supervised network is first (pre)trained
○ Downstream task — the task that then leverages the pretrained representations in some way.
Some examples in different fields of ML include:
■ Computer Vision – classification, object detection, segmentation, etc.
■ NLP – classification, translation, summarization, question-answering, etc.
■ RL – online model-free fine-tuning, etc.
Autoencoder (the most basic unsupervised model)
● Pretext task — learn a “compressed” representation of an input (does not have to
be an image, but it’s one in the example below for illustrative purposes)
● The bottleneck is a hidden layer with very few neurons — if the input can be
reconstructed from the output of the bottleneck layer, it must have captured
enough “important” information about the original input in a smaller vector
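Here is a minimal autoencoder sketch in PyTorch (not taken from the slides); the 28×28 input size, the 32-dimensional bottleneck, and the layer widths are arbitrary choices for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# A tiny fully connected autoencoder for flattened 28x28 inputs. The 32-dim
# bottleneck forces the encoder to keep only the "important" information needed
# to reconstruct the input.
class AutoEncoder(nn.Module):
    def __init__(self, input_dim=28 * 28, bottleneck_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, bottleneck_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim),
        )

    def forward(self, x):
        z = self.encoder(x)       # compressed representation (the bottleneck output)
        return self.decoder(z)    # reconstruction of the original input

model = AutoEncoder()
x = torch.rand(64, 28 * 28)       # a batch of flattened "images"
loss = F.mse_loss(model(x), x)    # pretext task: reconstruct the input
loss.backward()
```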
Rotnet
● The pretext task is to predict the
rotation angle of a rotated image
○ 4-way classification between
0, 90, 180 and 270 degrees
● Model learns about the
relationships between high level
features in an image (ex. location,
type and poses of objects), instead
of just low-level patterns
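A possible way to generate the pretext labels (a simplified sketch, not the original RotNet implementation):

```python
import torch

# Generate (rotated image, rotation label) pairs: rotate each image by a random
# multiple of 90 degrees and use that multiple (0-3) as the classification target.
def make_rotation_batch(images):
    """images: (N, C, H, W) -> (rotated images, labels in {0, 1, 2, 3})."""
    labels = torch.randint(0, 4, (images.shape[0],))
    rotated = torch.stack([
        torch.rot90(img, k=int(k), dims=(1, 2)) for img, k in zip(images, labels)
    ])
    return rotated, labels

images = torch.rand(8, 3, 32, 32)
rotated, labels = make_rotation_batch(images)
# Any image classifier can now be trained with cross-entropy on (rotated, labels).
```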
Rotnet
Jigsaw
● Pretext task – sample 9 patches from a 3-by-3 grid over the image, shuffle them, and then
predict their original order
○ Shuffling order sampled from a predefined set instead of all 9! permutations
● Model learns that images are made up of “parts” that are related to each other!
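A simplified sketch of the pretext data generation (the two permutations and the patch size below are placeholders; the actual method uses a larger, carefully chosen permutation set):

```python
import torch

# Cut an image into a 3x3 grid of patches, shuffle them with a permutation drawn
# from a small predefined set, and use the permutation's index as the label.
PERMUTATIONS = [
    (0, 1, 2, 3, 4, 5, 6, 7, 8),
    (8, 7, 6, 5, 4, 3, 2, 1, 0),
]

def make_jigsaw_example(image, patch=32):
    """image: (C, 3*patch, 3*patch) -> (shuffled patches of shape (9, C, patch, patch), label)."""
    patches = [image[:, r * patch:(r + 1) * patch, c * patch:(c + 1) * patch]
               for r in range(3) for c in range(3)]
    label = torch.randint(0, len(PERMUTATIONS), ()).item()
    shuffled = torch.stack([patches[i] for i in PERMUTATIONS[label]])
    return shuffled, label

patches, label = make_jigsaw_example(torch.rand(3, 96, 96))
```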
Word2Vec
● A family of methods for learning embeddings that map discrete words to dense vectors
● Multiple ways to learn word2vec embeddings, two of which include:
○ Continuous Bag of Words — predict a word from the surrounding (context) words
■ Ex: predict bit in “The dog bit the man” from dog and the.
○ Skip-Gram — predict the surrounding context from a word
■ Ex: predict dog and the in “The dog bit the man” from bit
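As a sketch of how such training pairs can be generated for the skip-gram objective (the window size and sentence are purely illustrative):

```python
# For each center word, every word inside a small window around it becomes a
# (center, context) training pair; the model learns to predict context from center.
sentence = "the dog bit the man".split()
window = 2

pairs = []
for i, center in enumerate(sentence):
    for j in range(max(0, i - window), min(len(sentence), i + window + 1)):
        if j != i:
            pairs.append((center, sentence[j]))

print(pairs)   # includes ('bit', 'the'), ('bit', 'dog'), ('bit', 'the'), ('bit', 'man'), ...
```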
Masked Language Models
● Common word2vec approaches involve predicting words from only their
immediately surrounding context. What if we take into consideration the entire
sentence a word belongs to?
● Further, what if you predict multiple words within that sentence simultaneously?
● We say that these words have been masked and the pretext task here is to unmask
them using the remaining words as context
BERT (2018)
● Input tokens (words) are
randomly masked with 15%
probability — the model
must predict these
○ Gives word level
representations
● Pairs of sentences are passed in together — the model must predict whether the second
sentence actually follows the first (next sentence prediction)
○ Gives sentence level
representations
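A highly simplified sketch of the masking step (the token IDs and mask ID are placeholders; BERT's full recipe also sometimes keeps the selected token unchanged or swaps in a random one instead of always using [MASK]):

```python
import torch

# Hide ~15% of the tokens and train the model to recover them.
MASK_ID = 0
token_ids = torch.randint(1, 30_000, (1, 12))      # a fake tokenized sentence

mask = torch.rand(token_ids.shape) < 0.15          # select ~15% of the positions
inputs = token_ids.clone()
inputs[mask] = MASK_ID                             # replace them with [MASK]

# The loss is computed only at the masked positions, e.g.:
# loss = F.cross_entropy(model(inputs)[mask], token_ids[mask])
```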
Lecture Attendance
http://tinyurl.com/fa24-dl4cv
Contributors
● Aryan Jain