Unit-5 AI ETC

The document provides an overview of Artificial Neural Networks (ANNs) and their applications in various fields such as speech and image recognition, as well as game playing. It explains the role of activation functions, the structure of neural networks, and the process of training through steps like defining functions, assessing their goodness, and selecting the best function. Additionally, it discusses concepts like gradient descent, local minima, and backpropagation for efficient computation in neural networks.

Artificial Neural Networks

Machine Learning ≈ Looking for a Function

• Speech Recognition:  f(audio) = "How are you"
• Image Recognition:   f(image) = "Cat"
• Playing Go:          f(board position) = "5-5" (next move)
• Dialogue System:     f("Hi") = "Hello"   (what the user said → system response)
An Activation Function decides whether a neuron should be activated or not. That is, using simple mathematical operations, it decides whether the neuron's input is important to the network's prediction.

The role of the activation function is to derive an output from the set of input values fed to a node (or a layer).

But let's take a step back and clarify: what exactly is a node?

If we compare a neural network to our brain, a node is the analogue of a neuron: it receives a set of input signals, the external stimuli.
Image Recognition: the goal is a function f with f(image) = "cat".

Framework:

A set of functions (the Model): {f1, f2, ⋯}. For example, given a picture of a cat, f1 outputs "cat" while f2 outputs "money"; given a picture of a dog, f1 outputs "dog" while f2 outputs "snake". f1 is the better function.

Goodness of function f: supervised learning uses training data, i.e. function inputs (images) paired with the desired function outputs ("monkey", "cat", "dog").

Training and testing: using the training data, pick the "best" function f* (Steps 1-3 below); at test time, apply it to new inputs: f*(image) = "cat".
Three Steps for Learning

Step 1: define a set of functions  (the Neural Network)
Step 2: goodness of function
Step 3: pick the best function

Neural Network
Neuron: a simple function

z = a1w1 + ⋯ + akwk + ⋯ + aKwK + b
a = σ(z)

where a1, …, aK are the inputs, w1, …, wK are the weights, b is the bias, and σ is the activation function.
Neuron with the Sigmoid Function as activation:

σ(z) = 1 / (1 + e^(−z))

Example: inputs (1, −1), weights (1, −2), bias 1:

z = 1·1 + (−1)·(−2) + 1 = 4,   σ(4) ≈ 0.98
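
A minimal sketch of this single neuron in Python (NumPy for the vector arithmetic); the inputs, weights, and bias are the values from the worked example above.

    import numpy as np

    def sigmoid(z):
        """Sigmoid activation: squashes any real z into (0, 1)."""
        return 1.0 / (1.0 + np.exp(-z))

    def neuron(a, w, b):
        """A single neuron: weighted sum of inputs plus bias, then activation."""
        z = np.dot(w, a) + b
        return sigmoid(z)

    # Values from the worked example: inputs (1, -1), weights (1, -2), bias 1.
    a = np.array([1.0, -1.0])
    w = np.array([1.0, -2.0])
    b = 1.0
    print(neuron(a, w, b))  # z = 4, sigma(4) ~ 0.98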
Neural Network

Different connections lead to different network structures. The neurons have different values of weights and biases; weights and biases are the network parameters 𝜃.

Activation functions
• Transform a neuron's input into its output
• Desirable features of an activation function:
  • A squashing effect, to prevent accelerating growth of activation levels through the network
  • Simple and easy to calculate
Fully Connected Feedforward Network

With sigmoid activations, feeding the input (1, −1) through a three-layer example network gives hidden activations (0.98, 0.12) in layer 1, then (0.86, 0.11) in layer 2, and the outputs (0.62, 0.83). Feeding in (0, 0) instead gives (0.73, 0.5), then (0.72, 0.12), and the outputs (0.51, 0.85).

This is a function: input vector in, output vector out.

f(1, −1) = (0.62, 0.83)        f(0, 0) = (0.51, 0.85)

Given parameters 𝜃, the network defines a function. Given a network structure, it defines a function set.
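
A sketch of the forward pass in Python. The layer-1 weights and bias below reproduce the example's first layer ((1, −1) ↦ (0.98, 0.12)); the deeper layers' weights are hypothetical placeholders, since the slide does not list them.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def feedforward(x, layers):
        """Apply each (W, b) layer in turn: a <- sigmoid(W a + b)."""
        a = x
        for W, b in layers:
            a = sigmoid(W @ a + b)
        return a

    layers = [
        (np.array([[1.0, -2.0], [-1.0, 1.0]]), np.array([1.0, 0.0])),   # layer 1 (from the slide)
        (np.array([[2.0, -1.0], [-2.0, -1.0]]), np.array([0.0, 0.0])),  # hypothetical
        (np.array([[3.0, -1.0], [-1.0, 4.0]]), np.array([0.0, 0.0])),   # hypothetical
    ]

    x = np.array([1.0, -1.0])
    W1, b1 = layers[0]
    print(sigmoid(W1 @ x + b1))      # [0.98, 0.12], matching the slide's first layer
    print(feedforward(x, layers))    # full pass (placeholder layers, so values differ)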
Fully Connected Feedforward Network: general structure

Input layer:   x1, x2, …, xN
Hidden layers: Layer 1, Layer 2, …, Layer L, each made of neurons
Output layer:  y1, y2, …, yM

Every neuron in a layer is connected to every neuron in the next layer, so the input vector is transformed layer by layer into the output vector.
Deep = Many hidden layers

Network         Year    Layers    Error rate
AlexNet         2012      8        16.4%
VGG             2014     19         7.3%
GoogleNet       2014     22         6.7%
Residual Net    2015    152         3.57%   (special structure)
Output Layer

Ordinary layer as output: yi = σ(zi). In general, the outputs of the network can be any values, which may not be easy to interpret.

Softmax layer as the output layer: the outputs behave like a probability distribution:
◼ 1 > yi > 0
◼ Σi yi = 1

Softmax:   yi = e^(zi) / Σj e^(zj)

Worked example with (z1, z2, z3) = (3, 1, −3):
e^3 ≈ 20,  e^1 ≈ 2.7,  e^(−3) ≈ 0.05,  sum ≈ 22.75
(y1, y2, y3) ≈ (0.88, 0.12, ≈0)
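
A direct NumPy rendering of the softmax above; subtracting max(z) before exponentiating is a standard numerical-stability trick and does not change the result.

    import numpy as np

    def softmax(z):
        """Softmax: exponentiate, then normalize so the outputs sum to 1."""
        e = np.exp(z - np.max(z))  # subtract max for numerical stability
        return e / e.sum()

    z = np.array([3.0, 1.0, -3.0])
    print(softmax(z))  # ~ [0.88, 0.12, 0.002], matching the worked example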
Example Application: Handwriting Digit Recognition

Input: a 16 × 16 = 256-pixel image, encoded as a 256-dim vector x1, …, x256 (ink → 1, no ink → 0).
Output: a 10-dim vector y1, …, y10, where each dimension represents the confidence of a digit: y1 is the confidence that the image "is 1", y2 that it "is 2", …, y10 that it "is 0".

For example, if (y1, y2, …, y10) = (0.1, 0.7, …, 0.2), the image is "2".

What is needed is a function with a 256-dim vector as input and a 10-dim vector as output; the neural network serves as that function.
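
A small sketch of the input encoding: a 16 × 16 binary image (ink → 1) flattened into the 256-dim vector the network expects. The image here is randomly generated, purely for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    image = (rng.random((16, 16)) > 0.8).astype(float)  # fake "ink" pattern

    x = image.reshape(256)   # 256-dim input vector (ink -> 1, no ink -> 0)
    print(x.shape)           # (256,)
    # A digit recognizer is then any function mapping this 256-dim vector
    # to a 10-dim vector of per-digit confidences.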
Example Application

The network structure (input layer x1, …, xN; hidden layers 1 … L; output layer y1 "is 1", …, y10 "is 0") defines a function set containing the candidates for handwriting digit recognition.

You need to decide the network structure so that your function set contains a good function.
FAQ

• Q: How many layers? How many neurons for each layer?
  A: Trial and error, plus intuition.
• Q: Can we design the network structure?
  A: Yes, e.g. the Convolutional Neural Network (CNN).
• Q: Can the structure be automatically determined?
  A: Yes, but this is not widely studied yet.
Three Steps for Deep Learning

Step 1: define a set of functions
Step 2: goodness of function
Step 3: pick the best function


Training Data

• Preparing training data: images and their labels, e.g.

  "5"  "0"  "4"  "1"
  "9"  "2"  "1"  "3"

The learning target is defined on the training data. With the 256-dim input (16 × 16 = 256; ink → 1, no ink → 0) and a softmax output layer y1, …, y10, the target is:

• Input a "1": y1 has the maximum value.
• Input a "2": y2 has the maximum value.
Loss

A good function should make the loss over all examples as small as possible.

Given a set of parameters, feed in an image of "1": the target is y1 = 1 and every other output 0, so y1 should be as close to 1 as possible and y2, …, y10 as close to 0 as possible. The loss 𝑙 can be the square error or the cross entropy between the network output and the target.
Total Loss

For all R training examples (x1, ŷ1), …, (xR, ŷR), the total loss is

L = Σ_{r=1…R} 𝑙r

where 𝑙r is the loss between the network output yr and the target ŷr. We want L to be as small as possible: find the function in the function set, i.e. the network parameters 𝜽*, that minimizes the total loss L.
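
A sketch of the two loss choices mentioned above and the total loss over a batch; the `outputs` and `targets` arrays are hypothetical illustration data (3 classes instead of 10, for brevity).

    import numpy as np

    def square_error(y, t):
        """Square error between network output y and target t."""
        return np.sum((y - t) ** 2)

    def cross_entropy(y, t, eps=1e-12):
        """Cross entropy between softmax output y and one-hot target t."""
        return -np.sum(t * np.log(y + eps))

    def total_loss(outputs, targets, loss=cross_entropy):
        """L = sum of the per-example losses l_r over all R examples."""
        return sum(loss(y, t) for y, t in zip(outputs, targets))

    # Hypothetical data: R = 2 examples, 3 classes.
    outputs = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
    targets = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    print(total_loss(outputs, targets))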
Three Steps for Deep Learning

Step 1: define a set of functions
Step 2: goodness of function
Step 3: pick the best function


How to pick the best function

Find network parameters 𝜽∗ that minimize total loss L


Enumerating all possible values is infeasible: the network parameters 𝜃 = {w1, w2, w3, ⋯, b1, b2, b3, ⋯} number in the millions. E.g. in a speech recognition network with 8 layers and 1000 neurons per layer, each pair of adjacent layers alone contributes 1000 × 1000 = 10^6 weights.
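
A two-line check of that parameter count, for the hidden layers of the speech-recognition example above:

    layers = [1000] * 8   # 8 hidden layers of 1000 neurons each
    n_params = sum(m * n + n for m, n in zip(layers, layers[1:]))
    print(n_params)       # 7,007,000 weights + biases between the hidden layers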
Gradient Descent

Network parameters 𝜃 = {w1, w2, ⋯, b1, b2, ⋯}. Find the network parameters 𝜽* that minimize total loss L. For a single parameter w:

➢ Pick an initial value for w (random, or from pre-training; random is usually good enough).
➢ Compute 𝜕L/𝜕w: if negative, increase w; if positive, decrease w.
➢ Update: w ← w − η 𝜕L/𝜕w, where η is called the "learning rate".
➢ Repeat until 𝜕L/𝜕w is approximately zero (i.e. the update is tiny).
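
A minimal sketch of this update rule on a one-parameter toy problem; the quadratic loss L(w) = (w − 3)² and the learning rate 0.1 are illustrative choices, not from the slides.

    def gradient_descent(dL_dw, w0, eta=0.1, tol=1e-6, max_steps=10_000):
        """Repeat w <- w - eta * dL/dw until the gradient is ~0."""
        w = w0
        for _ in range(max_steps):
            g = dL_dw(w)
            if abs(g) < tol:
                break
            w -= eta * g
        return w

    # Toy loss L(w) = (w - 3)^2, so dL/dw = 2 (w - 3); minimum at w = 3.
    print(gradient_descent(lambda w: 2 * (w - 3), w0=0.0))  # ~ 3.0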
Local Minima

Depending on the value of the network parameter w, gradient descent can be very slow at a plateau (𝜕L/𝜕w ≈ 0), get stuck at a saddle point (𝜕L/𝜕w = 0), or get stuck at a local minimum (𝜕L/𝜕w = 0).

• Gradient descent never guarantees the global minimum: different initial points (e.g. w1 vs. w2) reach different minima, so they give different results.
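
A quick illustration of that initialization sensitivity on a double-well loss L(w) = (w² − 1)²; the function and starting points are illustrative.

    def gradient_descent(dL_dw, w0, eta=0.1, steps=2000):
        """Same update rule as above: w <- w - eta * dL/dw."""
        w = w0
        for _ in range(steps):
            w -= eta * dL_dw(w)
        return w

    # Double-well loss L(w) = (w^2 - 1)^2 has two minima, at w = -1 and w = +1.
    dL_dw = lambda w: 4 * w * (w**2 - 1)

    print(gradient_descent(dL_dw, w0=-0.5))  # ~ -1.0
    print(gradient_descent(dL_dw, w0=+0.5))  # ~ +1.0  (a different minimum)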
Gradient Descent

This is the "learning" of machines in deep learning ……
Even AlphaGo uses this approach.
People imagine …… Actually …..

I hope you are not too disappointed.


For example, you can do …….

• Image Recognition: a network that maps images to labels such as "monkey", "cat", "dog".

• Spam filtering: features such as whether "talk" or "free" appears in the e-mail are fed to a network that outputs 1/0 (Yes/No): 1 (Yes) for spam, 0 (No) otherwise.
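
A tiny sketch of how such a feature vector might be built; the two-word vocabulary and the sample e-mail are made up for illustration.

    # Hypothetical bag-of-words features for spam filtering.
    vocabulary = ["talk", "free"]
    email = "free money, free talk"
    x = [1.0 if word in email else 0.0 for word in vocabulary]  # -> [1.0, 1.0]
    # A trained network would map x to 1 (spam) or 0 (not spam).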
Backpropagation: an efficient way to compute 𝜕L/𝜕w in a neural network.

Backpropagation

Back Propagation algorithm – Illustration (slides by K Kotecha)

[Figure sequence, images not reproduced: forward phase → computing error → backward phase → weight update]
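
A compact sketch of those four phases for a one-hidden-layer sigmoid network with square-error loss; the shapes, data, and learning rate are illustrative assumptions, not values from the slides.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    x = np.array([1.0, -1.0])          # input (illustrative)
    t = np.array([1.0, 0.0])           # target (illustrative)
    W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)
    W2, b2 = rng.normal(size=(2, 2)), np.zeros(2)
    eta = 0.5

    for step in range(100):
        # Forward phase: compute activations layer by layer.
        z1 = W1 @ x + b1; a1 = sigmoid(z1)
        z2 = W2 @ a1 + b2; y = sigmoid(z2)

        # Computing error: square-error loss between output and target.
        loss = 0.5 * np.sum((y - t) ** 2)

        # Backward phase: propagate error derivatives back through the layers.
        delta2 = (y - t) * y * (1 - y)             # dL/dz2
        delta1 = (W2.T @ delta2) * a1 * (1 - a1)   # dL/dz1

        # Weight update: gradient descent step on every parameter.
        W2 -= eta * np.outer(delta2, a1); b2 -= eta * delta2
        W1 -= eta * np.outer(delta1, x);  b1 -= eta * delta1

    print(loss)  # decreases toward 0 as y approaches the target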
Advantages & Disadvantages

Advantages
• Massively parallel in nature
• Fault (noise) tolerant, because of the parallelism
• Can be designed to be adaptive

Disadvantages
• No clear rules or design guidelines for arbitrary applications
• No general way to assess the internal operation of the network (an ANN system is therefore seen as a "black box")
• Difficult to predict future network performance (generalization)

ANN: When?

◼ The input is high-dimensional, discrete or real-valued
◼ The target function is real-valued, discrete-valued, or vector-valued
◼ The data are possibly noisy
◼ The form of the target function is unknown
◼ Human readability of the result is not (very) important
◼ Long training time is acceptable
◼ Short classification/prediction time is required
