
Artificial Neural Network (ANN)

Military Artificial Intelligence


Prof. Dr. Eng. Wisnu Jatmiko, S.T., M.Kom.
Dr. Ario Yudo Husodo, S.T., M.T.
Grafika Jati, S.Kom., M.Kom.
© Fasilkom UI - 2023
Discussion Topic

Artificial Neural Network (Jaringan Saraf Tiruan)


Presentation Outline
• Basic Neural Network
• History of Artificial Neural Network
• Usage of Neural Network
• Activation Function
• Perceptron
• Calculation Example
• Deep Learning
Basic Neural Network
What is a neural network?
A neural network is a massively parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge and making it available for use.
A neural network mimics the way the human brain processes a task:

• To perform a task, a neural network employs a dense interconnection of computing cells known as "neurons"
• Knowledge is acquired by the neural network from its environment through a learning process
• Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge
Neural Network in Details

https://www.youtube.com/watch?v=bfmFfD2RIcg
Properties and Capabilities of Neural Networks (1)
• Nonlinearity
• Nonlinearity is a highly important property, particularly if the underlying physical mechanism responsible for generating the input signal (e.g., a speech signal) is itself nonlinear
• Input-Output Mapping
• A neural network learns from examples by constructing an input-output mapping for the problem at hand
• Adaptivity
• Neural networks have a built-in capability to adapt their synaptic weights to changes in the
surrounding environment
• Evidential Response
• In the context of pattern classification, a neural network can be designed to provide
information not only about which particular pattern to select, but also about the
confidence in the decision made
• Contextual Information
• Knowledge is represented by the very structure and activation state of a neural network
Properties and Capabilities of Neural Networks (2)
• Fault Tolerance
• Capable of robust computation, in the sense that its performance degrades
gracefully under adverse operating conditions
• VLSI (Very Large Scale Integration) Implementability
• VLSI provides a means of capturing truly complex behavior in a highly hierarchical
fashion
• Uniformity of Analysis and Design
• The same notation is used in all domains involving the application of neural networks
• Neurobiological Analogy
• The design of a neural network is motivated by analogy with the brain, which is a
living proof that fault tolerant parallel processing is not only physically possible but
also fast and powerful
Basic Equation in ANN

The net input of a single neuron is net = b + Σᵢ wᵢ xᵢ, where
b = bias,
w = weight,
x = value of a feature,
and the neuron's output is the activation function applied to this net input.
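To make the notation concrete, here is a minimal Python sketch of that computation; the feature values, weights, and bias below are illustrative, not taken from the slides.

```python
import numpy as np

def neuron_output(x, w, b, activation=lambda net: net):
    """Compute the basic ANN equation: output = f(sum_i w_i * x_i + b)."""
    net = np.dot(w, x) + b        # weighted sum of the feature values plus the bias
    return activation(net)

# Illustrative values: three features with arbitrary weights and bias.
x = np.array([0.5, -1.0, 2.0])    # feature values
w = np.array([0.1, 0.4, -0.2])    # weights
b = 0.3                           # bias
print(neuron_output(x, w, b))     # identity activation: just the weighted sum
```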
Neural Network Architectures (1)
• In general, we can group neural network architectures as follows:

1. Single-layer Feedforward Networks

2. Multilayer Feedforward Networks

3. Recurrent Networks
Neural Network Architectures (2)
Single-layer Feedforward Networks

• X : Input ; W : weight ; Y: output


Neural Network Architectures (3)
Multilayer Feedforward Networks

• X : input ; V, W : weight ; Y: output ;


Z : hidden layer
Neural Network Architectures (4)
Recurrent Neural Networks

• Contains cyclic (feedback) paths, which is what distinguishes it from feedforward networks
Artificial Neural Network Variation

• Single-Layer Perceptron
• Multi-Layer Perceptron
• Learning Vector Quantization (LVQ)
• Recurrent Neural Network (RNN)
• Convolutional Neural Network (CNN)
• Deep Learning
Neural Network Summary

https://www.youtube.com/watch?v=rEDzUT3ymw4
History of Neural Network
History of Neural Network (1)
• Progression (1943-1960)
• First mathematical model of neurons: Pitts & McCulloch (1943)
∗ Beginning of artificial neural networks
• Perceptron, Rosenblatt (1958)
∗ A single neuron for classification
∗ Perceptron learning rule
∗ Perceptron convergence theorem
• Degression (1960-1980)
• The perceptron cannot even learn the XOR function
• We did not know how to train an MLP
• 1963: Backpropagation (Bryson et al.), but it received little attention
• Progression (1980-)
• 1986: Backpropagation reinvented
∗ "Learning representations by back-propagating errors." Rumelhart et al., Nature
• Successful applications in character recognition, autonomous cars, etc.
• But some open questions remained: overfitting? network structure? number of neurons? number of layers? bad local minima? when to stop training?
• Hopfield nets (1982), Boltzmann machines, etc.

Introduction to Machine Learning (Lecture Notes), Perceptron. Lecturer: Barnabas Poczos

History of Neural Network (2)
• Degression (1993-)
• SVM: the Support Vector Machine is developed by Vapnik et al. However, the SVM is a shallow architecture.
• Graphical models become more and more popular.
• The great success of SVMs and graphical models almost kills ANN (Artificial Neural Network) research.
• Training deeper networks consistently yields poor results.
• However, Yann LeCun (1998) develops deep convolutional neural networks (a discriminative model).
• Progression (2006-)
• Deep learning is a rebranding of ANN research.
• Deep Belief Networks (DBN)
∗ "A fast learning algorithm for deep belief nets." Hinton et al., Neural Computation
∗ Generative graphical model
∗ Based on restricted Boltzmann machines
∗ Can be trained efficiently
• Deep autoencoder based networks
∗ "Greedy Layer-Wise Training of Deep Networks." Bengio et al., NIPS
• Convolutional neural networks running on GPUs
∗ Great success for neural networks since the massive usage of GPUs
∗ AlexNet (2012). Krizhevsky et al., NIPS

Introduction to Machine Learning (Lecture Notes), Perceptron. Lecturer: Barnabas Poczos
Usage of Neural Network
PROBLEMS THAT CAN BE SOLVED WITH NEURAL NETWORKS

• The training data is noisy
• The data is complex
• A symbolic representation (for example, a decision tree) is not required
NEURAL NETWORK PROBLEM CHARACTERISTICS

Neural networks are well suited to problems with the following characteristics:
• Instances are represented by many attribute-value pairs
• Output targets can be discrete-valued, real-valued, or a vector of discrete/real values
• A symbolic representation (for example, a decision tree) is not required
• The training data can contain errors
• Long training times are acceptable (there is no strict time limit for training)
• Fast evaluation of the learned function may be required
• Human ability to understand the learned function is not a concern
NEURAL NETWORK USAGE

• Signal Processing
• Control system
• Pattern recognition
• Medical
• Speech synthesis and recognition
• Business
SIGNAL PROCESSING

• Removing noise from audio signals
• Removing echo from audio signals

CONTROL SYSTEM

• Robot control systems

PATTERN RECOGNITION

• Letter recognition
• Image recognition

MEDICAL

• Diagnosing diseases

VOICE RECOGNITION AND SYNTHESIS

• Speech synthesis
• Voice recognition

BUSINESS

• Stock price prediction
• Predicting business failure
Implementation of Neural
Network : Tele-ECG
E-Cardio
E-Cardio is an integrated system that helps people examine their cardiovascular health without having to meet a doctor. This is especially useful in a country such as Indonesia.

1. Sensors
The system uses sensors to measure a person's heartbeat and visualizes and stores the heartbeat data on an Android smartphone.

2. Classification
The system can also provide an automatic classification of the person's cardiovascular health. In addition, the system sends the person's data to a doctor.

3. Transmission
A method was developed to compress the ECG signal so that it can be transmitted over the cellular network.
Activation Function
Learning Process (1)
Activation Function
• A function that determines the internal state (output) of a neuron in an ANN
• The output is sent to other neurons as their input
• Variations: identity, binary ladder, bipolar ladder, binary sigmoid, etc.
Activation Functions (1)
► Identity Function: f(x) = x, for all x

Commonly used for the input layer neurons of the network.

[Plot: f(x) = x, a straight line through the origin]
Activation Functions (2)
► Binary Ladder Function (Heaviside/threshold)

f(x) = 1, if x ≥ θ
f(x) = 0, if x < θ

Commonly used in single-layer networks.
Used to convert a continuous input into a binary output.

[Plot: step function jumping from 0 to 1 at x = θ]
Activation Functions (3)
► Bipolar Ladder Function

f(x) = 1, if x ≥ θ
f(x) = -1, if x < θ

Similar to the binary ladder, but with range {-1, 1}.

[Plot: step function jumping from -1 to 1 at x = θ]
Activation Functions (4)
► Binary Sigmoid Function

f(x) = 1 / (1 + exp(-σx))

• Logistic function and hyperbolic tangent (S curve)
• A simple relationship between the function and its derivative:
  f′(x) = σ f(x)[1 − f(x)]
• Low computational cost for back-propagation

[Plot: S-shaped curve rising from 0 to 1]
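As a small illustration of the last two points, the following Python sketch (variable names are mine) computes the binary sigmoid and reuses its output to obtain the derivative, which is why back-propagation with this function is cheap.

```python
import numpy as np

def binary_sigmoid(x, sigma=1.0):
    """Binary (logistic) sigmoid: f(x) = 1 / (1 + exp(-sigma * x))."""
    return 1.0 / (1.0 + np.exp(-sigma * x))

def binary_sigmoid_derivative(x, sigma=1.0):
    """Derivative expressed through the function itself: f'(x) = sigma * f(x) * (1 - f(x))."""
    fx = binary_sigmoid(x, sigma)
    return sigma * fx * (1.0 - fx)

print(binary_sigmoid(0.2))             # approx 0.55
print(binary_sigmoid_derivative(0.2))  # reuses f(0.2); no extra exponential is needed
```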
Perceptron
PERCEPTRON

• A perceptron is a neural network unit that performs certain computations to detect features or business intelligence in the input data.
• The simplest architecture of an artificial neural network consists of one input layer (in which the number of neurons corresponds to the number of components of the data to be recognized) and one output unit.
Basic Components of Perceptron
• Input Nodes or Input Layer
• Weight and Bias
• Activation Function
PERCEPTRON MODEL
PERCEPTRON TRAINING RULE (1)
• Perceptron learning assumes that the training procedure can be proven to drive the weights toward convergence.
• If there are no errors, the weight values do not change; in other words, the weights change only under certain conditions (when the prediction is wrong).
PERCEPTRON TRAINING RULE (2)
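The slide above does not reproduce the update formula, so the sketch below follows the standard perceptron training rule, wᵢ ← wᵢ + α (t − y) xᵢ, with a threshold activation; the AND data set and the parameter values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def train_perceptron(X, t, alpha=0.1, epochs=20):
    """Standard perceptron training rule: weights change only when the prediction is wrong."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, ti in zip(X, t):
            y = 1 if np.dot(w, xi) + b >= 0 else 0   # threshold (binary ladder) activation
            error = ti - y
            if error != 0:                           # no error -> weights stay unchanged
                w = w + alpha * error * xi
                b = b + alpha * error
    return w, b

# AND is linearly separable, so the rule converges to a separating line.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 0, 0, 1])
print(train_perceptron(X, t))
```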
Backpropagation
Back-propagation Learning Algorithm
Back-propagation
• Back-propagation was introduced by Rumelhart, Hinton, and Williams and popularized in the book Parallel Distributed Processing (Rumelhart and McClelland, 1986)
• The basic principle of the back-propagation algorithm has three phases:
• Feedforward phase for the learning input pattern
• Error calculation and back-propagation phase
• Weight adjustment phase
• Multilayer network
• Consists of one layer of input units, one or more hidden layers, and one layer of output units
• The basic structure is the same as the perceptron, so it is called a multilayer perceptron
• Every neuron in a layer of a back-propagation network receives an input signal from all neurons in the previous layer, along with one bias signal.
Back-propagation Learning Algorithm
► Back-propagation
► Characteristics of the activation function: continuous, differentiable, monotonically non-decreasing, and its derivative is easy to compute.
a. Binary sigmoid function (range: [0, 1])
b. Bipolar sigmoid function (range: [-1, 1])

[Plots: the binary sigmoid rises from 0 to 1; the bipolar sigmoid rises from -1 to 1]
Back-propagation Learning Algorithm
► Back-propagation (Architecture)
► The input layer is shown by units Xi (i = 1, …, n)
► The hidden layer is shown by units Zj (j = 1, …, p)
► The output layer is shown by units Yk (k = 1, …, m)
► Weights vij connect input unit Xi to hidden unit Zj; weights wjk connect hidden unit Zj to output unit Yk; each hidden and output unit also receives a bias input of 1.

[Diagram: X1 … Xn → (vij) → Z1 … Zp → (wjk) → Y1 … Ym]
Back-propagation Learning Algorithm
► Back-propagation learning
1. Weight initialization: set wi = 0 or to small random numbers, for i = b, 1, 2, 3, …, n. Set the learning rate α (0.1 ≤ α ≤ 1).
► While the stopping condition is not met:
a. Feedforward:
2. Each input unit (Xi, i = 1, …, n) receives the input signal xi and passes it on to all units in the layer above it (the hidden units).
3. Each hidden unit (Zj, j = 1, …, p) computes its total weighted input signal and applies its activation function.
Back-propagation Learning Algorithm
4. Each output unit (Yk, k = 1, …, m) computes its total weighted input signal and applies its activation function.

b. Backpropagation of error:
5. Each output unit (Yk, k = 1, …, m) receives the target pattern that corresponds to the training input pattern, computes its error information, and computes its weight corrections:
δk = (tk − yk) f′(y_ink)
Δwjk = α δk zj
Δw0k = α δk
Back-propagation Learning Algorithm
6. Each hidden unit (Zj, j = 1, …, p) computes its error term from the units in the layer above it:
δ_inj = Σ (k = 1, …, m) δk wjk
δj = δ_inj f′(z_inj)
Δvij = α δj xi
Δv0j = α δj
7. Each output unit (Yk, k = 1, …, m) and hidden unit (Zj, j = 1, …, p) updates its biases and weights (j = 0, …, p and i = 1, …, n):
wjk(new) = wjk(old) + Δwjk
vij(new) = vij(old) + Δvij
8. Check the stopping criteria.
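A compact Python sketch of one training epoch following steps 2-7 above, assuming the binary sigmoid activation; the array shapes and variable names are mine (V, v0: input-to-hidden weights and biases; W, w0: hidden-to-output weights and biases).

```python
import numpy as np

def f(x):
    """Binary sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-x))

def backprop_epoch(X, T, V, v0, W, w0, alpha=0.2):
    """One epoch of back-propagation for a single-hidden-layer network.
    X: (samples, n) inputs, T: (samples, m) targets,
    V: (n, p), v0: (p,), W: (p, m), w0: (m,)."""
    for x, t in zip(X, T):
        # a. Feedforward (steps 2-4)
        z = f(v0 + x @ V)                        # hidden unit activations
        y = f(w0 + z @ W)                        # output unit activations
        # b. Backpropagation of error (steps 5-6)
        delta_k = (t - y) * y * (1 - y)          # delta_k = (t_k - y_k) f'(y_in_k)
        delta_j = (W @ delta_k) * z * (1 - z)    # delta_j = (sum_k delta_k w_jk) f'(z_in_j)
        # Step 7: adjust weights and biases
        W += alpha * np.outer(z, delta_k); w0 += alpha * delta_k
        V += alpha * np.outer(x, delta_j); v0 += alpha * delta_j
    return V, v0, W, w0
```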
Back-propagation Learning Algorithm
► Back-propagation testing
1. Weight initialization: set the weights wi, i = b, 1, 2, 3, …, n, to the values obtained from learning.
2. Set the activation of the input units: xi = si, i = 1, 2, …, n.
3. For j = 1, …, p: compute z_inj = v0j + Σi xi vij and zj = f(z_inj).
4. For k = 1, …, m: compute y_ink = w0k + Σj zj wjk and yk = f(y_ink).
Back-propagation Learning Algorithm
Selection of initial weights and biases in back-propagation
• The choice of initial weights affects whether the network reaches a global (or merely local) error minimum and, if it is reached, how fast it converges.
• The weight update depends on the activation of the inner unit (the one sending the signal) and the derivative of the activation function of the outer unit (the one receiving the signal), so initial weights that make either of them 0 should be avoided.
• If the sigmoid function is used, the initial weights should not be too large, because the derivative can then become very small (falling in the saturation region). Nor should they be too small, because the net input to a hidden or output unit would then be too close to zero, which makes learning very slow.
Back-propagation Learning Algorithm

Selection of initial weights and biases in back-propagation

• Two examples of weight and bias initialization schemes:
• Random initialization
• Weights and biases are initialized to random values between -0.5 and 0.5 (or between -1 and 1, or on any other suitable interval).
• Nguyen-Widrow initialization
• This method typically gives faster learning. The following is an example for an architecture with one hidden layer.
• The weights from hidden units/neurons to output units/neurons are initialized to random values between -0.5 and 0.5.
Back-propagation Learning Algorithm
The weights from the input neurons to the hidden neurons are initialized as follows:
(1) Set
  n = number of input units
  p = number of hidden units
  β = scale factor = 0.7 · p^(1/n)
(2) For each hidden unit j (j = 1, …, p), do steps (3)-(6).
(3) For each i = 1, …, n (all input units), set vij(old) = a random number between -0.5 and 0.5 (or between -γ and γ).
(4) Compute the norm ||vj(old)||.
(5) Reinitialize the weights from the input units (i = 1, …, n): vij = β vij(old) / ||vj(old)||.
(6) Set the bias: v0j = a random number between -β and β.
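A short Python sketch of steps (1)-(6); the rescaling in step (5), vij = β · vij(old) / ||vj(old)||, is the usual Nguyen-Widrow form and is stated here as an assumption, since the slide leaves the formula implicit.

```python
import numpy as np

def nguyen_widrow_init(n_input, n_hidden, seed=0):
    """Nguyen-Widrow initialization of the input-to-hidden weights and biases."""
    rng = np.random.default_rng(seed)
    beta = 0.7 * n_hidden ** (1.0 / n_input)               # (1) scale factor
    V = rng.uniform(-0.5, 0.5, size=(n_input, n_hidden))   # (3) random initial weights
    norms = np.linalg.norm(V, axis=0)                      # (4) ||v_j(old)|| for each hidden unit
    V = beta * V / norms                                    # (5) rescale each column
    v0 = rng.uniform(-beta, beta, size=n_hidden)            # (6) biases in (-beta, beta)
    return V, v0

V, v0 = nguyen_widrow_init(n_input=2, n_hidden=3)
print(V)
print(v0)
```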
Back-propagation Learning Algorithm
Back-propagation with one hidden layer (with 3 units) to learn the XOR logic function, with learning rate α = 0.2.

Training data:
s1  s2  t
1   1   0
1   0   1
0   1   1
0   0   0
Back-propagation Learning Algorithm

Weight initialization

Input-to-hidden weights (v):
        Z1     Z2     Z3
X1      0.2    0.3   -0.1
X2      0.3    0.1   -0.1
bias   -0.3    0.3    0.3

Hidden-to-output weights (w):
        Y
Z1      0.5
Z2     -0.3
Z3     -0.4
bias   -0.1

Hidden Unit Output (Zj), for the first training pattern (x1 = 1, x2 = 1, t = 0):
z_net1 = -0.3 + 1 (0.2) + 1 (0.3) = 0.2
z_net2 =  0.3 + 1 (0.3) + 1 (0.1) = 0.7
z_net3 =  0.3 + 1 (-0.1) + 1 (-0.1) = 0.1

Applying the activation function:
z1 = f(z_net1) = 0.55
z2 = f(z_net2) = 0.67
z3 = f(z_net3) = 0.52
Back-propagation Learning Algorithm
Output unit (Yk):

y_netk = -0.1 + 0.55 (0.5) + 0.67 (-0.3) + 0.52 (-0.4) = -0.24
y = f(y_netk) = 0.44

Factor δ at output unit Yk:

δk = (tk − yk) f′(y_netk) = (t − y) y (1 − y) = (0 − 0.44) (0.44) (1 − 0.44) = -0.11

Weight changes wjk (Δwjk = α δk zj), with α = 0.2:

Δw01 = 0.2 (-0.11) (1)    = -0.02
Δw11 = 0.2 (-0.11) (0.55) = -0.01
Δw21 = 0.2 (-0.11) (0.67) = -0.01
Δw31 = 0.2 (-0.11) (0.52) = -0.01
Back-propagation Learning Algorithm
Sum of the errors reaching the hidden units and the error factor δ at each hidden unit (δ_inj = Σk δk wjk, δj = δ_inj f′(z_inj) = δ_inj zj (1 − zj)):
δ_in1 = (-0.11)(0.5)  = -0.055   δ1 = -0.055 (0.55)(1 − 0.55) ≈ -0.01
δ_in2 = (-0.11)(-0.3) =  0.033   δ2 =  0.033 (0.67)(1 − 0.67) ≈  0.01
δ_in3 = (-0.11)(-0.4) =  0.044   δ3 =  0.044 (0.52)(1 − 0.52) ≈  0.01
Back-propagation Learning Algorithm
Weight changes to the hidden units, Δvij = α δj xi:

        Z1                          Z2                         Z3
X1      Δv11 = (0.2)(-0.01)(1)      Δv12 = (0.2)(0.01)(1)      Δv13 = (0.2)(0.01)(1)
        = -0.002 ≈ 0                = 0.002 ≈ 0                = 0.002 ≈ 0
X2      Δv21 = (0.2)(-0.01)(1)      Δv22 = (0.2)(0.01)(1)      Δv23 = (0.2)(0.01)(1)
        = -0.002 ≈ 0                = 0.002 ≈ 0                = 0.002 ≈ 0
bias    Δv01 = (0.2)(-0.01)(1)      Δv02 = (0.2)(0.01)(1)      Δv03 = (0.2)(0.01)(1)
        = -0.002 ≈ 0                = 0.002 ≈ 0                = 0.002 ≈ 0
Back-propagation Learning Algorithm
Output unit weight updates (wjk(new) = wjk(old) + Δwjk):

w01(new) = -0.1 − 0.02 = -0.12     w21(new) = -0.3 − 0.01 = -0.31
w11(new) =  0.5 − 0.01 =  0.49     w31(new) = -0.4 − 0.01 = -0.41

Hidden unit weight updates:

        Z1                    Z2                    Z3
X1      v11 = 0.2 + 0 = 0.2   v12 = 0.3 + 0 = 0.3   v13 = -0.1 + 0 = -0.1
X2      v21 = 0.3 + 0 = 0.3   v22 = 0.1 + 0 = 0.1   v23 = -0.1 + 0 = -0.1
bias    v01 = -0.3 + 0 = -0.3 v02 = 0.3 + 0 = 0.3   v03 = 0.3 + 0 = 0.3

The same procedure continues for the 2nd-4th training patterns.
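For readers who want to check the arithmetic, this short Python script (the variable names are mine; the numbers are the ones in the tables above) reproduces the forward pass and the output-unit error for the first training pattern.

```python
import numpy as np

f = lambda x: 1.0 / (1.0 + np.exp(-x))       # binary sigmoid

V  = np.array([[0.2, 0.3, -0.1],             # weights X1 -> Z1, Z2, Z3
               [0.3, 0.1, -0.1]])            # weights X2 -> Z1, Z2, Z3
v0 = np.array([-0.3, 0.3, 0.3])              # hidden biases
W  = np.array([0.5, -0.3, -0.4])             # weights Z1, Z2, Z3 -> Y
w0 = -0.1                                    # output bias
x, t, alpha = np.array([1.0, 1.0]), 0.0, 0.2 # first training pattern

z_net = v0 + x @ V                           # [0.2, 0.7, 0.1]
z = f(z_net)                                 # approx [0.55, 0.67, 0.52]
y_net = w0 + z @ W                           # approx -0.24
y = f(y_net)                                 # approx 0.44
delta_k = (t - y) * y * (1 - y)              # approx -0.11
print(np.round(z, 2), round(float(y_net), 2), round(float(y), 2), round(float(delta_k), 2))
```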


Multi Layer Perceptron
Multi-layer Perceptrons (MLPs)
• Single-layer perceptrons can only represent linear decision surfaces.
• Multi-layer perceptrons can represent non-linear decision surfaces.
From Perceptron to Multi-Layer Perceptron

Minsky & Papert (1969): perceptrons can only represent linearly separable functions. XOR, for example, is not linearly separable, so a single perceptron cannot represent it.

Good news: adding a hidden layer allows more target functions to be represented.

[Figure: the four XOR points in the (x1, x2) plane, with the two classes (+ and −) placed so that no single line separates them]
Hidden Units
• Hidden units are nodes that are situated between the input nodes and the
output nodes.
• Hidden units allow a network to learn non-linear functions.
• Hidden units allow the network to represent combinations of the input
features.
Example: Boolean XOR

X1 ⊕ X2 ⇔ (X1 ∨ X2) ∧ ¬(X1 ∧ X2)

XOR is not linearly separable, so it cannot be represented by a single-layer perceptron.

Consider a network with a single hidden layer built from threshold units, where a unit fires when w1 x1 + w2 x2 − w0 > 0:
• Hidden unit h1 computes OR: weights 1 and 1, threshold 0.5.
• Hidden unit h2 computes AND: weights 1 and 1, threshold 1.5.
• The output unit o combines them with weights +1 (from h1) and −1 (from h2) and threshold 0.5.
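A tiny Python sketch of this construction (the thresholds and weights are read off the figure above), confirming that the two threshold units plus the output unit reproduce XOR on all four input pairs.

```python
def step(net, theta):
    """Threshold (binary ladder) unit: outputs 1 when net >= theta, else 0."""
    return 1 if net >= theta else 0

def xor_net(x1, x2):
    h1 = step(1 * x1 + 1 * x2, 0.5)    # OR unit
    h2 = step(1 * x1 + 1 * x2, 1.5)    # AND unit
    return step(1 * h1 - 1 * h2, 0.5)  # (X1 OR X2) AND NOT (X1 AND X2)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_net(a, b))   # prints the XOR truth table
```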


Multi-Layer Feedforward Networks

• Boolean functions:
• Every Boolean function can be represented by a network with a single hidden layer
• But this might require a number of hidden units that is exponential in the number of inputs

• Continuous functions:
• Every bounded continuous function can be approximated with arbitrarily small error by a network with a single hidden layer [Cybenko 1989; Hornik et al. 1989]
• Any function can be approximated to arbitrary accuracy by a network with two hidden layers [Cybenko 1988]

[Diagram: inputs x1, x2, …, xN feeding a feedforward network with outputs o1, o2, …, oO]
Multi-layer Feedforward Neural Networks
or Multi-Layer Perceptrons (MLP)

• A multilayer neural network with one hidden layer and 10 inputs
• Layers are often fully connected (but not always)
• The number of hidden units is typically chosen by hand
Learning Algorithms for MLP
Goal: minimize the sum of squared errors, where the error at output unit i is Erri = yi − oi.

Each output oi is a parameterized function of the inputs; the weights are the parameters of that function.

The error is directly observable at the output layer. How do we compute the errors for the hidden units? We can back-propagate the error from the output layer to the hidden layers. The back-propagation process emerges directly from a derivation of the overall error gradient.
Deep Learning Introduction
• Deep learning (DL) is a machine learning subfield that uses multiple layers for learning
data representations
• DL is exceptionally effective at learning patterns
Machine Learning (ML) vs Deep Learning (DL)
•Conventional machine learning methods rely on human-designed feature representations
• ML then reduces to optimizing weights to best make a final prediction
Machine Learning (ML) vs Deep Learning (DL)
● DL applies a multi-layer process for learning rich hierarchical features (i.e., data representations)
● Input image pixels → Edges → Textures → Parts → Objects
Why is Deep Learning Useful?
● DL provides a flexible, learnable framework for representing visual, textual, and linguistic information
● It can learn in both supervised and unsupervised manners
● DL is an effective end-to-end learning system
● It requires large amounts of training data
● Since about 2010, DL has outperformed other ML techniques
● First in vision and speech, then NLP, and other applications
Thank You
