Military AI-Week 03-ANN
• To perform a task, a neural network employs a strong interconnection of computing cells known as “neurons”
• Knowledge is acquired by the neural network from its environment through a learning process
• Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge
Neural Networks in Detail
https://www.youtube.com/watch?v=bfmFfD2RIcg
Properties and Capabilities of Neural Networks (1)
• Nonlinearity
• Nonlinearity is a highly important property, particularly if the underlying physical
mechanism responsible for generating the input signal (e.g., a speech signal) is itself nonlinear
• Input-Output Mapping
• A neural network learns from examples by constructing an input-output mapping for the
problem at hand
• Adaptivity
• Neural networks have a built-in capability to adapt their synaptic weights to changes in the
surrounding environment
• Evidential Response
• In the context of pattern classification, a neural network can be designed to provide
information not only about which particular pattern to select, but also about the
confidence in the decision made (a softmax sketch follows this list)
• Contextual Information
• Knowledge is represented by the very structure and activation state of a neural network
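To illustrate evidential response, here is a minimal Python sketch, assuming a softmax output layer (one common way to turn raw network outputs into class-confidence scores); the score values below are made up for illustration:

import numpy as np

def softmax(z):
    # Convert raw output-layer scores into probabilities that sum to 1.
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

scores = np.array([2.0, 0.5, -1.0])   # hypothetical raw outputs for 3 classes
probs = softmax(scores)
prediction = int(np.argmax(probs))
print(f"class {prediction}, confidence {probs[prediction]:.2f}")  # class 0, ~0.79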
Properties and Capabilities of Neural Networks (2)
• Fault Tolerance
• Capable of robust computation, in the sense that its performance degrades
gracefully under adverse operating conditions
• VLSI (Very Large Scale Integration) Implementability
• VLSI provides a means of capturing truly complex behavior in a highly hierarchical
fashion
• Uniformity of Analysis and Design
• The same notation is used in all domains involving the application of neural networks
• Neurobiological Analogy
• The design of a neural network is motivated by analogy with the brain, which is a
living proof that fault tolerant parallel processing is not only physically possible but
also fast and powerful
Basic Equation in ANN

y = f(Σi wi xi + b)

where:
b = bias,
w = weight,
x = value of a feature,
f = activation function
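A minimal Python sketch of this equation for a single neuron; the feature values, weights, and the choice of a sigmoid activation are illustrative assumptions:

import numpy as np

def neuron(x, w, b):
    # Net input: sum_i w_i * x_i + b, then an activation function f.
    z = np.dot(w, x) + b
    return 1.0 / (1.0 + np.exp(-z))   # sigmoid chosen as f for illustration

x = np.array([0.5, -1.2, 3.0])   # feature values
w = np.array([0.4, 0.1, -0.6])   # weights
b = 0.2                          # bias
print(neuron(x, w, b))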
Neural Network Architectures (1)
• In general, we can group neural network architectures as follows:
1. Single-Layer Feedforward Networks
2. Multilayer Feedforward Networks
3. Recurrent Networks
Neural Network Architectures (2)
Single-layer Feedforward Networks
https://www.youtube.com/watch?v=rEDzUT3ymw4
History of Neural Network
History of Neural Network (1)
• Progression (1943-1960)
• First mathematical model of neurons, Pitts & McCulloch (1943)
∗ Beginning of artificial neural networks
• Perceptron, Rosenblatt (1958)
∗ A single neuron for classification
∗ Perceptron learning rule
∗ Perceptron convergence theorem
• Degression (1960-1980)
• Perceptron can’t even learn the XOR function
• We don’t know how to train MLP
• 1963 Backpropagation (Bryson et al.)
∗ But not much attention
• Progression (1980-)
• 1986 Backpropagation reinvented:
∗ “Learning representations by back-propagating errors,” Rumelhart et al., Nature
• Successful applications in
∗ Character recognition, autonomous cars, ..., etc.
• But there were still some open questions
∗ Overfitting? Network structure? Neuron number? Layer number? Bad local minimum points? When to stop training?
• Hopfield nets (1982), Boltzmann machines, ..., etc.
Applications of Neural Networks
• Signal processing
• Control systems
• Pattern recognition
• Medical
• Speech synthesis and recognition
• Business
SIGNAL PROCESSING
• Removes sound distractions
• Removes sound echo
CONTROL SYSTEM
• Robot control systems
PATTERN RECOGNITION
• Diagnosing disease
VOICE RECOGNITION AND SYNTHESIS
• Sound synthesis
• Voice recognition
BUSINESS
• Stock price prediction
• Predicting business failure
Implementation of Neural Network: Tele-ECG
E-Cardio
E-Cardio is an integrated system that helps people examine their cardiovascular health without having to meet a doctor. This is especially useful in a country like Indonesia.
1. Sensors
The system utilizes sensors to measure a person’s heartbeat, and visualizes and stores the heartbeat data on an Android smartphone.
2. Classification
The system can also provide an automatic classification of the person’s cardiovascular health. In addition, the system sends the person’s data to a doctor.
3. Transmission
A method was developed to compress the ECG signal so it can be transmitted over a cellular network.
Activation Function
Learning Process (1)
Activation Function
• A function that determines the internal
state of a neuron in ANN
• The output will be sent to other
neurons as input
• Variation: identity, binary ladder,
bipolar ladder, binary sigmoid, etc.
Activation Functions (1)
► Identity Function: f(x) = x, for all x
Activation Functions (2)
► Binary Ladder Function (Heaviside/threshold)

f(x) = 1, if x ≥ θ
f(x) = 0, if x < θ

Activation Functions (3)
► Bipolar Ladder Function
• Similar to the binary ladder, but with range {−1, 1}
Activation Functions (4)
► Binary Sigmoid Function

f(x) = 1 / (1 + exp(−σx))

• Logistic function and hyperbolic tangent (S-curves); the logistic ranges over (0, 1), the hyperbolic tangent over (−1, 1)
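The four activation functions above, sketched in Python (θ and σ are free parameters, set here only for illustration):

import numpy as np

def identity(x):
    return x                                  # f(x) = x, for all x

def binary_ladder(x, theta=0.0):
    return np.where(x >= theta, 1.0, 0.0)     # 1 if x >= theta, else 0

def bipolar_ladder(x, theta=0.0):
    return np.where(x >= theta, 1.0, -1.0)    # same, but range {-1, 1}

def binary_sigmoid(x, sigma=1.0):
    return 1.0 / (1.0 + np.exp(-sigma * x))   # logistic S-curve in (0, 1)

xs = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (identity, binary_ladder, bipolar_ladder, binary_sigmoid):
    print(f.__name__, f(xs))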
Back-propagation Learning Algorithm
► Back-propagation (Architecture)
► The input layer consists of units Xi (X1, …, Xi, …, Xn)
► The hidden layer consists of units Zj (Z1, …, Zj, …, Zp); input unit Xi connects to hidden unit Zj with weight vij
► The output layer consists of units Yk (Y1, …, Yk, …, Ym); hidden unit Zj connects to output unit Yk with weight wjk
[Figure: X1 … Xn → Z1 … Zp → Y1 … Ym, with weights vij (input→hidden) and wjk (hidden→output); a constant-1 bias unit feeds the hidden and output layers]
Back-propagation Learning Algorithm
► Back-propagation learning
1. Weight initialization: set wi = 0 or to small random numbers, for i = b, 1, 2, 3, …, n. Set the learning rate α (0.1 ≤ α ≤ 1).
► As long as the stop condition has not been met:
a. Feedforward:
2. Each input unit (Xi, i = 1, …, n) receives input signal xi and passes it on to all units in the layer above it (the hidden units).
3. Each hidden unit (Zj, j = 1, …, p) calculates its total weighted input signal, z_in_j = v_0j + Σi xi v_ij, and applies its activation function, z_j = f(z_in_j).
Back-propagation Learning Algorithm
4. Each output unit (Yk, k = 1, …, m) calculates its total weighted input signal, y_in_k = w_0k + Σj z_j w_jk, and applies its activation function, y_k = f(y_in_k).
b. Backpropagation of error:
5. Each output unit (Yk, k = 1, …, m) receives a target pattern that matches the training input pattern. The unit calculates its error information and weight corrections:
δ_k = (t_k − y_k) f′(y_in_k)
Δw_jk = α δ_k z_j
Δw_0k = α δ_k
Back-propagation Learning Algorithm
6. Each hidden unit (Zj, j = 1, …, p) sums its delta inputs (from the units in the layer above it) and calculates its error information and weight corrections:
δ_in_j = Σ_{k=1}^{m} δ_k w_jk
δ_j = δ_in_j f′(z_in_j)
Δv_ij = α δ_j x_i
Δv_0j = α δ_j
7. Each output unit (Yk, k = 1, …, m) and each hidden unit (Zj, j = 1, …, p) update their biases and weights (j = 0, …, p; i = 1, …, n):
w_jk(new) = w_jk(old) + Δw_jk
v_ij(new) = v_ij(old) + Δv_ij
8. Test the stopping condition.
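A compact Python sketch of steps 1-8, training on the XOR patterns used later in these slides; the layer sizes, learning rate, and epoch budget are illustrative choices, not prescribed values:

import numpy as np

rng = np.random.default_rng(0)

def f(x):                       # binary sigmoid activation
    return 1.0 / (1.0 + np.exp(-x))

def f_prime(fx):                # its derivative, written in terms of f(x)
    return fx * (1.0 - fx)

X = np.array([[1, 1], [1, 0], [0, 1], [0, 0]], dtype=float)  # inputs s1, s2
T = np.array([[0], [1], [1], [0]], dtype=float)              # targets t (XOR)

n, p, m = 2, 3, 1               # input, hidden, output unit counts
alpha = 0.5                     # learning rate

# Step 1: initialize weights and biases with small random numbers.
v, v0 = rng.uniform(-0.5, 0.5, (n, p)), rng.uniform(-0.5, 0.5, p)
w, w0 = rng.uniform(-0.5, 0.5, (p, m)), rng.uniform(-0.5, 0.5, m)

for epoch in range(20000):      # step 8: fixed epoch budget as stop condition
    for x, t in zip(X, T):
        z = f(v0 + x @ v)                     # steps 2-3: hidden activations
        y = f(w0 + z @ w)                     # step 4: output activations
        delta_k = (t - y) * f_prime(y)        # step 5: (t_k - y_k) f'(y_in_k)
        delta_j = (w @ delta_k) * f_prime(z)  # step 6: sum_k delta_k w_jk, times f'(z_in_j)
        w += alpha * np.outer(z, delta_k); w0 += alpha * delta_k   # step 7
        v += alpha * np.outer(x, delta_j); v0 += alpha * delta_j

for x in X:
    print(x, f(w0 + f(v0 + x @ v) @ w))       # should approach 0, 1, 1, 0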
Back-propagation Learning Algorithm
► Back-propagation Testing
1. Weight initialization: set the weights wi, for i = b, 1, 2, 3, …, n, to the values obtained from training.
2. Set the activation of each input unit: xi = si, i = 1, 2, …, n.
3. For j = 1, …, p: compute the hidden unit activations, z_j = f(z_in_j).
4. For k = 1, …, m: compute the output unit activations, y_k = f(y_in_k).
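A matching Python sketch of the testing (feedforward-only) phase; the weights below merely stand in for trained values (v11 = 0.2, v12 = 0.3, v13 = −0.1 echo the worked example later in these slides; the rest are invented):

import numpy as np

def f(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict(x, v, v0, w, w0):
    z = f(v0 + x @ v)       # step 3: hidden activations z_j = f(z_in_j)
    return f(w0 + z @ w)    # step 4: output activations y_k = f(y_in_k)

# Stand-ins for the weights learned during training (step 1 of testing).
v  = np.array([[0.2, 0.3, -0.1],
               [0.3, 0.1, -0.1]])
v0 = np.zeros(3)
w  = np.array([[0.5], [-0.3], [-0.4]])
w0 = np.zeros(1)

print(predict(np.array([1.0, 0.0]), v, v0, w, w0))   # step 2: x_i = s_i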
Back-propagation Learning Algorithm
Selection of initial weights and bias in back-propagation
• The choice of initial weights affects whether the network reaches the global (or a local) minimum of the error and, if so, how fast it converges.
• The weight update depends on both the activation of the sending unit and the derivative of the activation function of the receiving unit, so initial weights that make both of these zero should be avoided.
• If the sigmoid function is used, the initial weights should not be too large, because the derivative can then become very small (the net input falls in the saturation region). Nor should they be too small, as this may bring the net input to the hidden or output units too close to zero, which makes learning very slow.
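A minimal sketch of the simple random-initialization strategy these guidelines suggest; the range [−0.5, 0.5] is a common textbook choice, assumed here rather than mandated by the slides:

import numpy as np

rng = np.random.default_rng(42)

# Small uniform random weights: large enough that net inputs are not all
# near zero (slow learning), small enough to stay out of the sigmoid's
# saturation region, where the derivative is nearly zero.
n, p = 2, 3                                    # illustrative layer sizes
v = rng.uniform(-0.5, 0.5, size=(n + 1, p))    # +1 row for the bias weights
print(v)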
Back-propagation Learning Algorithm
Worked example: training patterns (the XOR function)
s1  s2  t
1   1   0
1   0   1
0   1   1
0   0   0
Back-propagation Learning Algorithm
Δw11 = 0.2 (−0.11) (0.55) = −0.01;  Δw21 = 0.2 (−0.11) (0.67) = −0.01;  Δw31 = 0.2 (−0.11) (0.52) = −0.01
Back-propagation Learning Algorithm
The summed delta inputs of the hidden units (δ_in_j) and the error factor δ_j in each hidden unit
Back-propagation Learning Algorithm
Change of weights to the hidden units, Δv_ij = α δ_j x_i (with x1 = 1):
Δv11 = (0.2)(−0.01)(1) = −0.002 ≈ 0
Δv12 = (0.2)(0.01)(1) = 0.002 ≈ 0
Δv13 = (0.2)(0.01)(1) = 0.002 ≈ 0
New weights from X1:
v11 = 0.2 + 0 = 0.2;  v12 = 0.3 + 0 = 0.3;  v13 = −0.1 + 0 = −0.1
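These updates can be checked with a few lines of Python (values copied from the slides):

alpha, delta1 = 0.2, -0.11
for z in (0.55, 0.67, 0.52):                 # hidden activations z_j
    print(round(alpha * delta1 * z, 2))      # each rounds to -0.01

x1 = 1.0
for d in (-0.01, 0.01, 0.01):                # hidden error factors delta_j
    print(round(alpha * d * x1, 3))          # -0.002, 0.002, 0.002, all ~ 0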
Let’s consider a single-hidden-layer network, using threshold units as building blocks.
• Boolean functions:
• Every boolean function can be represented by a network with a single hidden layer (a worked XOR sketch follows this list)
• But this might require exponentially many (in the number of inputs) hidden units
• Continuous functions:
• Every bounded continuous function can be approximated with arbitrarily small error by a network with a single hidden layer
[Cybenko 1989; Hornik et al. 1989]
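For example, XOR, which a single perceptron cannot represent, needs only two hidden threshold units; the weights and thresholds below are one hand-picked solution, not learned values:

def step(x, theta=0.0):
    # Threshold (Heaviside) unit: fires 1 if the net input reaches theta.
    return 1.0 if x >= theta else 0.0

def xor_net(x1, x2):
    h1 = step(x1 + x2, theta=0.5)    # hidden unit computing OR
    h2 = step(x1 + x2, theta=1.5)    # hidden unit computing AND
    return step(h1 - h2, theta=0.5)  # output: OR and not AND, i.e., XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", int(xor_net(a, b)))   # 0 0->0, 0 1->1, 1 0->1, 1 1->0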
At each output unit, the error is Err_i = y_i − o_i (target minus actual output).
We can back-propagate this error from the output layer to the hidden layers.
The back-propagation process emerges directly from a derivation of the overall error gradient.
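A brief sketch of that derivation for an output-layer weight, assuming the usual squared-error cost and the notation used above:

E = ½ Σ_k (t_k − y_k)²,  where y_k = f(y_in_k) and y_in_k = w_0k + Σ_j z_j w_jk

∂E/∂w_jk = −(t_k − y_k) f′(y_in_k) z_j = −δ_k z_j

Δw_jk = −α ∂E/∂w_jk = α δ_k z_j,  recovering the update rule given earlier.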
Deep Learning Introduction
• Deep learning (DL) is a machine learning subfield that uses multiple layers for learning
data representations
• DL is exceptionally effective at learning patterns
Machine Learning (ML) vs Deep Learning (DL)
• Conventional machine learning methods rely on human-designed feature representations
• ML then reduces to optimizing weights to best make a final prediction
Machine Learning (ML) vs Deep Learning (DL)
● DL applies a multi-layer process for learning rich hierarchical features (i.e., data representations)
● Input image pixels → Edges → Textures → Parts → Objects
Why is Deep Learning Useful?
● DL provides a flexible, learnable framework for representing visual, textual, and linguistic information
● Can learn in supervised and unsupervised manner
● DL represents an effective end-to-end learning system
● Requires large amounts of training data
● Since about 2010, DL has outperformed other ML techniques
● First in vision and speech, then NLP, and other applications
Thank You