Fundamentals of Deep Learning

Uploaded by

Debak Roy


FUNDAMENTALS OF DEEP LEARNING

Meticulously curated information on the base concepts of deep learning
presented in diagrams and visualizations.

[Diagram: nested fields, from outermost to innermost: Computer Science, Artificial Intelligence, Machine Learning, Deep Learning]

Mısra Turp
Fundamentals of Deep Learning

What is deep learning?

Deep learning is a discipline of Artificial Intelligence that is based on
Neural Networks. It has the power to learn complex patterns directly
from the data.

Deep Learning is a branch of Machine Learning.

Machine Learning is one of the approaches to Artificial Intelligence.

Artificial Intelligence is a research branch of Computer Science.

[Diagram: nested fields, from outermost to innermost: Computer Science, Artificial Intelligence, Machine Learning, Deep Learning]

Comparison of Traditional Machine Learning and Deep Learning

                         Traditional Machine Learning    Deep Learning
Feature extraction       Needed                          Not needed
Computational power      Laptop would work fine          High computational power needed
Amount of data needed    Small datasets are fine         Big datasets needed


Most Common Deep Learning Techniques

Deep Neural Networks
Not very complex problems, structured data

Convolutional NNs
Works well with image data, used in computer vision

Recurrent NNs
Works well with sequential data such as language, audio

Generative Adversarial Networks
Can generate real-looking image, video or audio

Reinforcement Learning
Playing games, robotics

Autoencoders
Learns representation of data, feature/anomaly detection

Transformers
Used a lot in NLP, machine translation

Deep Boltzmann Machine
Object/speech recognition

Deep Belief Networks
Recognize and generate images, video sequences


Types of layers

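[Figure: a network with an input layer, a hidden layer and an output layer]

A single neuron

[Figure: a single neuron with the 1st input feature $x_1$ (weight $w_1$), the 2nd input feature $x_2$ (weight $w_2$), the bias $b_1$ and the output of the neuron $z_1$]

Calculating the output of a single neuron

without the activation function: $z_1 = w_1 x_1 + w_2 x_2 + b_1$

with the activation function: output $= g(z_1)$, where $g$ is the activation function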
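
The single-neuron calculation can be sketched in a few lines of Python. This is only an illustration: the numeric values and the choice of sigmoid as the activation function are assumptions made for the example, not taken from the original figures.

```python
import math

def neuron_output(x1, x2, w1, w2, b1):
    """Weighted sum of the two input features plus the bias (no activation)."""
    return w1 * x1 + w2 * x2 + b1

def sigmoid(z):
    """One possible activation function; squashes z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative values only.
z1 = neuron_output(x1=1.0, x2=2.0, w1=0.5, w2=-0.25, b1=0.1)
a1 = sigmoid(z1)  # output of the neuron with the activation function applied

print(z1, a1)
```

Any other activation function could be swapped in for `sigmoid` without changing the first step.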


Vectorization for quick calculations

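[Figure: the weights of the network and the biases of the network written as matrices, together with the vectorized equation that calculates the layer outputs from the inputs]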
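
A rough sketch of the vectorized calculation, assuming the usual form z = W x + b with one row of W per neuron (the exact notation of the original figure did not survive, so the names `W`, `x`, `b` and the values are assumptions). Plain Python lists are used here to keep the example dependency-free; an array library such as NumPy would evaluate the same expression as a single matrix operation.

```python
def layer_pre_activations(W, x, b):
    """Compute z = W x + b for every neuron of a layer at once.

    W: list of rows, one row of weights per neuron in the layer
    x: list of input features
    b: list of biases, one per neuron
    """
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

# A layer with 3 neurons and 2 inputs (illustrative values).
W = [[0.5, -0.25],   # weights of neuron 1
     [1.0, 2.0],     # weights of neuron 2
     [0.0, 1.0]]     # weights of neuron 3
b = [0.1, 0.0, -1.0]
x = [1.0, 2.0]

z = layer_pre_activations(W, x, b)
print(z)
```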


Activation functions

What is an activation function?

An activation function is a transformation on the output of a neuron.

Each layer (except the input layer) has its own activation function.

There are many different types of activation functions.

Some examples are:

Linear
Sigmoid
Tanh
ReLU
Why do we need a non-linear activation function?

A non-linear activation function is what makes the network learn complex patterns that need something more than linear functions to be represented.

If linear functions are used in the hidden layers, the network can only do linear transformations of its inputs. And no matter how big the network gets, it will be no more expressive than a single linear layer because of this linearity. This will cause the network to not be able to fit complex problems.
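
The collapse of stacked linear layers can be checked numerically. The sketch below (with made-up weights) passes an input through two layers that use a linear activation, then through one layer whose weights are W2 W1 and whose bias is W2 b1 + b2, and compares the results.

```python
def matvec(W, x):
    """Matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def matmul(A, B):
    """Matrix-matrix product."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def add(u, v):
    """Element-wise vector addition."""
    return [a + b for a, b in zip(u, v)]

# Two layers with linear (identity) activations; values are illustrative.
W1, b1 = [[1.0, 2.0], [0.0, -1.0]], [0.5, 0.5]
W2, b2 = [[3.0, 1.0]], [-1.0]

x = [2.0, -1.0]
two_layers = add(matvec(W2, add(matvec(W1, x), b1)), b2)

# The same mapping expressed as ONE linear layer.
W = matmul(W2, W1)
b = add(matvec(W2, b1), b2)
one_layer = add(matvec(W, x), b)

print(two_layers, one_layer)
```

Both paths give the same output for any input, which is why depth adds nothing without a non-linearity between the layers.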

Activation function rules of thumb

For hidden layers:
Hidden layer activation should never be linear.

For output layer:
Sigmoid for binary classification and multi-label classification.
Softmax for multi-class classification.
ReLU if you need non-negative outputs.
Linear activation for regression.


Backpropagation
Backpropagation pseudo-algorithm
Initialize the network with randomly generated weights
Do the forward propagation step and calculate the output
Calculate the error/loss/cost
Find out how much each parameter contributes to the error
Calculate this ratio using all the data points in this batch

At the end, what you have is a list of ratios of how each parameter affects the error. It is called the gradient vector.

You can think of it as a many-dimensional graph that shows the relationship between each parameter and the cost. If we had only two parameters, the graph would look like this.

[Figure: the cost plotted as a surface over two parameters, with the gradient vector marked]

Gradient Descent
Now that we have this many-dimensional graph showing us how the relationship between all the parameters and the cost works, we can decide in what way to update these parameters to lower the cost.

Because the graph would have a great many dimensions, in real life we cannot actually look at it and decide where to go. That's why we use the gradient vector.

Gradient descent is the act of changing the parameters based on the gradient vector. Here is how it works.

new values for parameters = old values of parameters - learning rate x gradient vector
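
The update rule can be tried out on a toy problem. In this sketch the cost function (w1 - 3)^2 + (w2 + 1)^2, the starting point and the learning rate of 0.1 are all assumptions chosen so that the minimum is known to be at (3, -1); in a real network the gradient vector would come from backpropagation instead.

```python
def gradient(params):
    """Gradient vector of the toy cost (w1 - 3)^2 + (w2 + 1)^2."""
    w1, w2 = params
    return [2.0 * (w1 - 3.0), 2.0 * (w2 + 1.0)]

def gradient_descent_step(params, grad, learning_rate):
    """new values = old values - learning rate x gradient vector"""
    return [p - learning_rate * g for p, g in zip(params, grad)]

params = [0.0, 0.0]  # arbitrary starting point
for _ in range(100):
    params = gradient_descent_step(params, gradient(params), learning_rate=0.1)

print(params)  # should end up very close to the minimum at (3, -1)
```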

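[Figure: overview of training a network. The dataset is fed through the layers (weights w, biases b, layer outputs calculated from Wx), the predictions are compared with the actual values to get the loss/error, and the parameters are updated. The figure is annotated with the hyperparameters involved: # of neurons in the input layer, # of hidden layers, # of neurons in the hidden layers, # of neurons in the output layer, weight initialization technique, activation function, loss function, regularization, optimization algorithm, learning rate, batch size, # of iterations/epochs]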

Pre-determined hyperparameters

Number of neurons in the input layer
This will depend on the data you are training with. If you have n features in your dataset you will have n input neurons. If your input is a photo with n x n pixels, you will have n x n input neurons (for a single-channel image).

Number of neurons in the output layer
This will again depend on the data you are training with. If you are doing binary classification or regression, 1 output neuron will do. If you are doing multi-class or multi-label classification, you need as many output neurons as classes/labels.

Hyperparameters that need tuning

Number of hidden layers
The more hidden layers you have, the deeper your network is going to be. Generally, having more layers will help your network more than increasing the number of neurons in the hidden layers.

Number of neurons in the hidden layers
Each hidden layer can have a different number of neurons. This value needs to be adjusted based on how the network is performing.

Activation function
Each layer has its own activation function except the input layer. The
activation function of the hidden layers cannot be linear.

Optimization algorithm
The optimization algorithm determines how the network updates its weights.

Loss function
The loss (cost) function is what we're trying to minimize during training. The cost function you use will depend on the type of problem you have.

Batch size
Batch size is the number of data points you put in each batch while training your network.

If you use a batch size of 1, you will be doing Stochastic gradient descent (GD). If the batch size equals the number of data points in your whole dataset, that is Batch GD. If the batch size is anything in between, from 2 up to one less than the number of data points in your whole dataset, you are doing mini-batch gradient descent.
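
The three cases differ only in the batch size passed to the code that splits the dataset. A minimal sketch follows; the helper name `iterate_batches` and the shuffling step are assumptions for illustration (shuffling is common practice but is not mentioned in the text above).

```python
import random

def iterate_batches(data, batch_size, seed=0):
    """Yield shuffled batches of `batch_size` items; the last one may be smaller."""
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [data[i] for i in indices[start:start + batch_size]]

data = list(range(10))  # stand-in for a dataset of 10 data points

sgd_batches = list(iterate_batches(data, batch_size=1))          # stochastic GD
mini_batches = list(iterate_batches(data, batch_size=4))         # mini-batch GD
full_batch = list(iterate_batches(data, batch_size=len(data)))   # batch GD

print(len(sgd_batches), [len(b) for b in mini_batches], len(full_batch))
```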
Number of epochs
This is how many times you run the whole dataset through your network. The network learns the dataset a little bit better after each epoch.

Learning rate
Learning rate determines how big of a step to take towards the direction set by gradient descent. Too small of a learning rate might slow down learning and too big of a learning rate might cause the network to keep missing the optima.

Learning rate scheduling techniques are used to dynamically decide on the learning rate value.

Weight initialization technique
This is how the weights of the network are initialized. Bias values can be (and often are) initialized to zero. But weights cannot be initialized to zero because if they are, they will all be updated the same way and the model won't be able to learn.

At the same time, just using a normal distribution for the random initialization of the weights causes problems with the network. That's why sometimes we need a more advanced approach.

Regularization
Regularization is a way to deal with overfitting. There are many different ways to do regularization and often these techniques come with their own hyperparameters.

Some examples are:
L1
L2
Dropout

L1 and L2 regularization have a hyperparameter called alpha and dropout has a hyperparameter called the dropout rate.


What is overfitting?
Overfitting is when the model is
not able to generalize well. The
model performs well on the
training set but fails to capture
the same performance for the
validation set. We see this in
the comparison of training and
validation loss. As we train the
network more, the training loss
keeps getting lower whereas
after a point validation loss
starts increasing.

What is underfitting?

Underfitting is when the model is not able to fit the data at all. We see this when the performance of the network is already very low on the training set.

[Figure: three fits of the same dataset: the ideal way to fit it, what it would look like when overfitted, and what it would look like when underfitted]


Bias and Variance

Bias: The amount of assumptions a model makes about the data. The more
assumptions it has, the simpler the model will be.

Variance: dependence of the model on the particular training set that was
used to train it.

If a model has overfit it means its bias is low and variance is high.

If a model has underfit it means its bias is high and variance is low.

If the midpoint of these circles signifies where all the correct values lie, the predictions of high-bias and high-variance models will look like the yellow marks.

Solutions to high bias/variance


Regularization

Constraining a model to simplify it:
L1 / L2 regularization
Dropout
Early stopping

Adding more information:
Data augmentation

L1 / L2 regularization
Works by lowering the weights of the network. Achieves this by adding the weight values
to the cost function.

L1 regularization: Add the sum of the absolute values of the weights to the loss.

L2 regularization: Add the sum of the squared values of the weights to the loss.

The alpha parameter: how much attention to pay to this addition to the cost function.
Value between 0 (no penalty) and 1 (full penalty).
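
As a small sketch of the two penalties, the functions below add alpha times the L1 or L2 term to a given data loss. The function names, the weights and the data loss of 1.0 are assumptions for illustration; biases are left out of the penalty here, which is a common but not universal choice.

```python
def l1_penalty(weights):
    """Sum of the absolute values of the weights."""
    return sum(abs(w) for w in weights)

def l2_penalty(weights):
    """Sum of the squared values of the weights."""
    return sum(w * w for w in weights)

def regularized_loss(data_loss, weights, alpha, kind="l2"):
    """Data loss plus alpha times the chosen penalty."""
    penalty = l1_penalty(weights) if kind == "l1" else l2_penalty(weights)
    return data_loss + alpha * penalty

weights = [0.5, -2.0, 1.5]
loss_l1 = regularized_loss(1.0, weights, alpha=0.1, kind="l1")
loss_l2 = regularized_loss(1.0, weights, alpha=0.1, kind="l2")

print(loss_l1, loss_l2)
```

With alpha set to 0 both losses fall back to the plain data loss, which matches the "no penalty" end of the range described above.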

Tips
You can use them with all network types
Normalize input for best results
Use L1+L2 together

Note!
L1 and L2 regularization lower the weights of the network to combat overfitting, but what does overfitting have to do with high weights?

When you have high weights, that means you are exaggerating the importance of a certain input. When you overfit, this is what the model looks like, right?

[Figure: an overfitted curve, output plotted against input]

It looks like the importance of this input is so exaggerated that the model follows its pattern to the full.


Dropout regularization
Works by making some neurons inactive in every training step. Every neuron has a probability p of being inactive. This is called the dropout rate.

During training, on average a fraction p of the neurons is inactive. During test time all neurons are active, so each neuron receives more input than it did during training. That's why, at test time, the input to each neuron is multiplied by the keep probability, which is (1-p).
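
A minimal sketch of this scheme, with hypothetical function names and a made-up layer of 10,000 activations. Note that many libraries implement the equivalent "inverted" variant instead, dividing by the keep probability during training so that nothing needs to change at test time.

```python
import random

def dropout_train(activations, p, rng):
    """Training: each neuron is made inactive (set to 0) with probability p."""
    return [0.0 if rng.random() < p else a for a in activations]

def dropout_test(activations, p):
    """Test: all neurons are active, so scale by the keep probability (1 - p)."""
    return [a * (1.0 - p) for a in activations]

rng = random.Random(0)
p = 0.2                       # dropout rate
activations = [1.0] * 10_000  # stand-in for the outputs of one layer

dropped = dropout_train(activations, p, rng)
inactive_fraction = dropped.count(0.0) / len(dropped)
test_output = dropout_test(activations, p)

print(inactive_fraction, test_output[0])
```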

Early-stopping
Works by stopping training at the point the validation loss starts getting higher. This is
the same place we mentioned before where overfitting starts happening.

But early stopping is not the best approach to use for overfitting, because training and mitigating overfitting should be separate processes where separate approaches and techniques are used. Early stopping combines these two.
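
The stopping rule itself is simple to sketch. The version below adds a `patience` parameter, which is an assumption not mentioned above: it tolerates a few epochs without improvement before stopping, since validation loss is usually noisy. The loss values are made up.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch with the best validation loss, stopping
    once the loss has not improved for `patience` consecutive epochs."""
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            break  # validation loss has stopped improving
    return best_epoch

# Validation loss goes down, then starts rising again (illustrative values).
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
stop_at = early_stopping_epoch(val_losses)

print(stop_at)
```

In practice the weights saved at the returned epoch would be the ones kept.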

Data augmentation
Transforming your data in multiple ways before feeding it to your network. This way, you
can generate more data points from the same one. These transformations help the
model tolerate any of the changes made (e.g. flipping, different orientation, color
changes, RGB or black and white).

Images from http://datahacker.rs/tf-data-augmentation/ and Buslaev, Alexander & Parinov, Alex & Khvedchenya, Eugene &
Iglovikov, Vladimir & Kalinin, Alexandr. (2018). Albumentations: fast and flexible image augmentations.
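
A few of these transformations can be sketched on a tiny "image" stored as a list of pixel rows. This is only an illustration with made-up pixel values; real pipelines would typically use an image library such as the ones credited above.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Mirror the image top to bottom."""
    return image[::-1]

def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

image = [[1, 2, 3],
         [4, 5, 6]]

# One original data point becomes four training examples.
augmented = [image,
             flip_horizontal(image),
             flip_vertical(image),
             rotate_90_clockwise(image)]

for variant in augmented:
    print(variant)
```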


Unstable gradients
Gradients are what we use to update the parameters of the network.

Sometimes they get extremely small or extremely big.

This is because while calculating the gradient of a parameter (that is, how much this parameter affects the cost), we need to multiply a lot of values from other neurons.

In the example on the right, to find out the gradient of the parameters of the red neuron, we need to work our way all the way to the output neuron.

In this calculation a lot of small numbers are multiplied with each other, causing a very small number for the gradient to be calculated.

The reverse could happen if the parameter values happened to be big values. That way we would get the [...]

Solutions
Changing weight initializers
The way of initializing the weights might contribute to the unstable gradients problem. There is a need for a strategy that is better than just initializing them from a normal distribution with a mean of zero. Here are some of the initializers:

Glorot (Xavier) Initialization
He Initialization
LeCun Initialization

[Figure: the formula for each initializer, written in terms of the number of inputs and the number of outputs of a neuron]
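
Since the formulas from the original figure are not reproduced here, the sketch below uses the commonly cited normal-distribution variants of the three initializers; treat these as an assumption about what the figure showed. The names `fan_in` and `fan_out` (number of inputs and outputs) and the helper functions are also introduced only for this example.

```python
import math
import random

def init_std(fan_in, fan_out, scheme):
    """Standard deviation of a zero-mean normal initializer (assumed variant)."""
    if scheme == "glorot":
        return math.sqrt(2.0 / (fan_in + fan_out))   # variance = 1 / fan_avg
    if scheme == "he":
        return math.sqrt(2.0 / fan_in)               # variance = 2 / fan_in
    if scheme == "lecun":
        return math.sqrt(1.0 / fan_in)               # variance = 1 / fan_in
    raise ValueError(f"unknown scheme: {scheme}")

def init_weights(fan_in, fan_out, scheme, seed=0):
    """Draw a fan_out x fan_in weight matrix from the chosen initializer."""
    rng = random.Random(seed)
    std = init_std(fan_in, fan_out, scheme)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]

W = init_weights(fan_in=300, fan_out=100, scheme="he")
print(len(W), len(W[0]), init_std(300, 100, "he"))
```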


Using a non-saturating activation function

Some activation functions cause saturation at the extremes, as seen in the sigmoid function to the right. In combination with using the wrong initialization technique, this causes the unstable gradients problem.

The ReLU (Rectified Linear Unit) activation function does not cause saturation for positive values. But it still causes saturation for negative values. So there are a bunch of variations on it.

Rectified Linear Unit (ReLU)
Leaky ReLU (with parameter alpha)
Randomized Leaky ReLU (RReLU)
Parametric Leaky ReLU (PReLU)
Exponential Linear Unit (ELU) (with parameter alpha)
Scaled ELU (SELU)
Gaussian Error Linear Units (GELU)

Which activation function to use when?

Try the activation functions in this order:
SELU
ELU
Leaky ReLU (and/or variants)
ReLU
Tanh
Logistic (sigmoid)

If speed is a priority, ReLU is a good choice because most libraries have fast ReLU implementations.

You should choose the correct initializer for your activation function

Initializer    Activation function
Glorot         Linear, tanh, softmax
He             ReLU and variants of ReLU
LeCun          SELU

To use SELU, make sure:
your input is standardized (mean=0, std=1),
you are using LeCun initializer,
that you have a sequential network.


Batch normalization

Centers the inputs to all layers at zero and sets their standard deviation to 1. After that, scales and offsets the normalized values by two trainable values (scale and offset).

The scale and offset values that are best for this network are learned during training.
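
The forward calculation for one feature across a batch can be sketched as follows. Here `scale` and `offset` are fixed numbers purely for illustration (in a real network they would be learned), and the small constant `eps`, added to avoid division by zero, is an assumption not mentioned above.

```python
import math

def batch_norm(values, scale=1.0, offset=0.0, eps=1e-5):
    """Normalize one feature across a batch, then scale and offset it."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    normalized = [(v - mean) / math.sqrt(var + eps) for v in values]
    return [scale * n + offset for n in normalized]

batch = [2.0, 4.0, 6.0, 8.0]  # one input feature for a batch of 4 examples
out = batch_norm(batch, scale=2.0, offset=1.0)

print(out)
```

After the call, the values have a mean equal to the offset and a standard deviation close to the scale.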

Advantages of batch normalization

1. Even after setting a good activation function and initializer, unstable gradients might occur. Batch normalization greatly reduces the possibility of unstable gradients.
2. Eliminates the need for manually adding a standardization layer.
3. Makes the network converge to the optimum faster.
4. Reduces the need for regularization.

Gradient clipping

Batch normalization is very tricky to use on RNNs. That's why gradient clipping is used instead to deal with exploding gradients.

Gradient clipping is, like the name suggests, clipping the values of the gradients when they get too big.

[Figure: an example gradient vector clipped to the range [-1, 1] in two ways: clipping by absolute values and clipping by norm]
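
Both kinds of clipping fit in a few lines. In this sketch the example gradient vector [0.9, 100.0] and the function names are assumptions chosen for illustration, with a limit of 1.0 in both cases.

```python
import math

def clip_by_value(grad, limit=1.0):
    """Clip every component of the gradient to the range [-limit, limit]."""
    return [max(-limit, min(limit, g)) for g in grad]

def clip_by_norm(grad, max_norm=1.0):
    """Rescale the whole gradient vector if its norm exceeds max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    return [g * max_norm / norm for g in grad]

grad = [0.9, 100.0]
by_value = clip_by_value(grad)
by_norm = clip_by_norm(grad)

print(by_value, by_norm)
```

Clipping by value can change the direction of the gradient vector, because each component is treated separately. Clipping by norm keeps the direction and only shrinks the length.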


Techniques to speed up training

Applying good initialization
Using a good activation function
Using batch normalization
Reusing parts of a pretrained network
Normalizing the data
Using mini-batches
Learning rate scheduling
A faster optimization algorithm
Pruning the network

Normalizing the data

When you train your network with unnormalized data, the parameters of the network will have a more complicated relationship with the cost function.

Specifically, the graph of the relationship between the parameters and the cost will look like a very wide bowl. The edges of this bowl are steep, so gradient descent will make quick progress at first, but as we get closer to the minimum (the point where the cost is the lowest) the progress will get smaller and smaller because the slope of the graph is very small.

Whereas with normalized data, the relationship graph will look more like a smooth bowl. Thus, the progress will be faster since the direction of the minimum is very clear from anywhere on the bowl.

Images from www.coursera.org/learn/deep-neural-network/
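
One common way to normalize input data is standardization: subtract the mean of each feature and divide by its standard deviation. The sketch below assumes that this is the kind of normalization meant here; the two feature columns are made up to show features on very different scales.

```python
import math

def standardize(column):
    """Rescale one feature column to mean 0 and standard deviation 1."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]

# Two features on very different scales (illustrative values).
ages = [20.0, 30.0, 40.0, 50.0]
incomes = [20_000.0, 50_000.0, 80_000.0, 110_000.0]

ages_std = standardize(ages)
incomes_std = standardize(incomes)

print(ages_std)
print(incomes_std)
```

After standardization both features cover the same range, so neither dominates the shape of the cost surface. The mean and standard deviation should be computed on the training set and reused for validation and test data.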


Using mini-batches

Batch Gradient Descent (GD): Run the whole data through the network
Mini-batch GD: Run 2 or more but less than the whole data through the network
Stochastic GD: Run examples one by one

Batching rules of thumb
Small dataset (<2000 data points): use Batch GD
Otherwise use mini-batch GD

Use mini-batch sizes that are powers of 2.
Research suggests you can use as much as 8192*.
2 to 512 is typical.
Find a number that fits in your CPU/GPU.
Start big and lower if performance is bad.

*Elad Hoffer et al. and Priya Goyal et al., given that you use learning rate warm-up.

Using a faster optimization algorithm

Gradient Descent
Gradient Descent with Momentum
Nesterov Accelerated Gradient

Other algorithms
AdaGrad
RMSProp
Adam
Nadam

Here is a comparison of optimizers based on Aurélien Géron's book*

Optimization algorithm           Convergence speed    Convergence quality
Gradient descent (GD)            Bad                  Good
GD with momentum                 Average              Good
GD with momentum and Nesterov    Average              Good
AdaGrad                          Good                 Bad (stops too early)
RMSProp                          Good                 Average to Good
Adam                             Good                 Average to Good
Nadam                            Good                 Average to Good
Adamax                           Good                 Average to Good

*Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

Pruning the network

Pruning the network gets rid of parameters with very small values, creating a sparse network. This network will take up less space in memory and the prediction times of the network will be faster. Ways to prune a network are:

Using L1 regularization at training time
Removing very small weights manually (will probably lead to worse performance)
Using tools like TensorFlow Model Optimization Toolkit

Learning rate scheduling

Dynamically determining what the learning rate should be, in order to overcome the limitations of having a pre-determined static learning rate.

Manual approach
Train many models, give them increasing values for the learning rate and choose a value a bit lower than the one at which the loss starts shooting up.

Piecewise constant scheduling
Use a constant learning rate for a certain number of epochs, then switch to a lower one. Very similar to manual.

Can't hurt to try:

Power scheduling
The learning rate is a function of the epoch number. The rate of decrease, decreases.

Performance scheduling
Drop the learning rate when the validation error stops dropping.

Recommended:

Exponential scheduling
The learning rate is a function of the epoch number. It drops by a factor of 10 every s epochs.

1cycle scheduling
First increase the learning rate linearly towards a maximum value during the first half of training. Then lower it back to the starting value during the second half of training.
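
Three of these schedules can be written as functions of the epoch number. The exponential rule follows the description above directly. The power schedule uses one commonly cited form, lr0 / (1 + epoch / s)^c, which is an assumption since no formula is given here. All default values (`lr0`, `s`, `c`, the boundaries) are illustrative.

```python
def power_schedule(epoch, lr0=0.01, s=20, c=1):
    """Assumed form: lr0 / (1 + epoch / s) ** c. The drop slows down over time."""
    return lr0 / (1 + epoch / s) ** c

def exponential_schedule(epoch, lr0=0.01, s=20):
    """The learning rate drops by a factor of 10 every s epochs."""
    return lr0 * 0.1 ** (epoch / s)

def piecewise_constant_schedule(epoch, boundaries=(5, 15),
                                values=(0.01, 0.005, 0.001)):
    """Constant learning rate within each range of epochs."""
    for boundary, value in zip(boundaries, values):
        if epoch < boundary:
            return value
    return values[-1]

for epoch in (0, 5, 20, 40):
    print(epoch,
          power_schedule(epoch),
          exponential_schedule(epoch),
          piecewise_constant_schedule(epoch))
```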


Model lifecycle

Evaluating the model

Have only one metric to compare models
You can combine multiple metrics to create one metric
E.g.: combine precision and recall to have the F-score, or average multiple accuracy values
If necessary, you can add a helper metric
E.g.: model speed
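
As a small worked example of combining two metrics into one, the F-score below uses the standard F-beta formula (with beta = 1 it is the harmonic mean of precision and recall). The two models and their precision and recall values are made up.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Model A has higher precision, model B is more balanced (illustrative values).
model_a = f_score(precision=0.9, recall=0.6)
model_b = f_score(precision=0.75, recall=0.75)

print(model_a, model_b)
```

With a single number per model, the comparison becomes a simple ordering instead of a judgment call between two metrics.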

Levels of performance
Training set performance
Validation set performance
Test set performance
Real-life performance

The goal of diagnosis is to find out where in these levels the problem occurs and fix it.

How do we know what training set performance is good?

We compare it to human-level performance.

How do we know what real-life performance is good?

We evaluate user satisfaction or whatever other real-life evaluation metrics (this
could still be accuracy or precision in real-world tasks).


Bayes Optimal Performance

This is the best way a task can be done. It is a hypothetical limit. Many times we
don't know what it is. But most of the time, Bayes Optimal Performance is thought to
be the best performance achieved on this task by humans.

Bayes Optimal Performance is always either equal to or better than Human-level performance. Machine learning algorithms (shown by the orange line) progress fairly fast to the point of achieving human-level performance. But after they surpass that level, it takes much more effort to make progress towards Bayes Optimal Performance.

Human-level performance
Human-level performance is normally what models aspire to. But it can be defined in
many different ways. Let's say we observed these error percentages in a given task
by these groups of people:

In this case, the best performance is by the group of pharmacists. If we have no other information on how well this task can be done, we can accept 0.3% as the Bayes Optimal Error.

But what we should accept as human-level performance will depend on our task.


Improving the model

Deciding which difference is bigger will help us decide what actions to take to make the model better.

That is why choosing the correct human-level performance for each specific case matters.

As we talked about before, there are things we can do to address high bias or high variance.

But on top of that, here is what we should do to make the model better when we want to address a specific performance gap.

Between human-level performance and performance on training set:
Train more
Increase model complexity
Try more advanced architecture
Find better hyperparameters

Between performance on training set and performance on validation set:
More data
Regularization
Try different architecture
Find better hyperparameters

Between performance on validation set and performance on test set:
Have a bigger validation set

Between performance on test set and performance in real-life:
Change validation set
Change cost function


Hyperparameter tuning
There are many hyperparameters and many options for each hyperparameter in neural networks. That's why, many times, trying out all the combinations of hyperparameter settings is not feasible. Instead, we have some tactics we use to make life easier.

Grid search
Grid search is trying each option one by one. This might be possible for some traditional machine learning models but it is not really an option for neural networks.

Random search
Random search is trying out a subset of settings in the whole space of possibilities. It does not guarantee finding the best possible combination of hyperparameter settings but it works well enough most of the time.

Manual zooming in
This is a manual way of looking for the best settings. The idea is to do iterative random search. In each iteration, the search is focused around the settings that performed the best in the previous run.
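
Random search followed by one round of zooming in can be sketched as below. Everything here is hypothetical: `validation_loss` is a stand-in formula replacing the real step of training a model and measuring its loss, and the two hyperparameters, their ranges and the number of trials are chosen only for illustration.

```python
import random

def validation_loss(learning_rate, dropout_rate):
    """Hypothetical stand-in for training a model and returning its loss."""
    return (learning_rate - 0.01) ** 2 * 1e4 + (dropout_rate - 0.3) ** 2

def random_search(lr_range, dropout_range, n_trials, rng):
    """Try n_trials random settings and return (best loss, best settings)."""
    best = None
    for _ in range(n_trials):
        params = (rng.uniform(*lr_range), rng.uniform(*dropout_range))
        loss = validation_loss(*params)
        if best is None or loss < best[0]:
            best = (loss, params)
    return best

rng = random.Random(0)

# First pass: search the whole space.
loss_1, (lr_1, drop_1) = random_search((0.0001, 0.1), (0.0, 0.8), 30, rng)

# Zoom in: search again in a narrower region around the best settings so far.
loss_2, (lr_2, drop_2) = random_search(
    (max(0.0001, lr_1 - 0.01), lr_1 + 0.01),
    (max(0.0, drop_1 - 0.1), drop_1 + 0.1),
    30, rng)

best_loss = min(loss_1, loss_2)
print(loss_1, loss_2, best_loss)
```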

Bayesian search
Build a probabilistic model of the relationship between the hyperparameters and the
cost function.

Gradient-based search
Approaches the hyperparameter tuning problem like the learning problem.

Evolutionary computing based search

Uses processes like randomization, natural selection and survival of the fittest to find the best hyperparameter settings.

Early-stopping based search

The main approach of these search algorithms is to focus resources on settings that are promising. There are different types of algorithms (SHA, ASHA, Hyperband) that fall into this category.


Which approach to use?

The best way to do hyperparameter tuning, especially for your first couple of projects, is to use random search. Once you have a good grasp of it, you can use different approaches like Bayesian search or even evolutionary computing based search.

You do not need to implement these search algorithms yourself. Here are some libraries that can help you:

Bayesian search: Spearmint, Scikit-Optimize
Gradient-based search: Adatune
Evolutionary computing based search: Sklearn-Deap
Early-stopping based search: Hyperband
Other libraries and resources: Hyperopt, Kopt, Talos, Hyperas, Keras Tuner, SHERPA, Google Cloud AI Platform HP tuning service, SigOpt, Oscar