
Mod3 NN

The document provides an introduction to neural networks, detailing their structure, learning mechanisms, and historical development influenced by neuroscience. It contrasts classical AI with neural networks, highlighting the advantages of the latter in handling complex, real-world tasks. Additionally, it discusses various neural network models, learning methods, and the significance of backpropagation in training multilayer networks.

Uploaded by

aditric
Copyright
© All Rights Reserved

SC-mod3

Neural Network
Topic: Introduction to Neural Networks – Advent of Modern Neuroscience,
Classical AI and Neural Networks

1. Introduction to Neural Networks


A Neural Network (NN) is a computational paradigm inspired by the structure and functioning of the
human brain. Neural networks are a central component of soft computing, known for their capacity to
learn from data and adaptively improve over time.

They consist of layers of nodes (neurons), with each connection having a weight.

Neural networks are used in tasks like classification, regression, prediction, and pattern
recognition.

Neural networks fall under the broader category of machine learning algorithms and are capable of
function approximation, i.e., learning a mapping from input data to output.

2. Advent of Modern Neuroscience


The development of neural networks is deeply tied to progress in neuroscience, particularly in
understanding how the brain processes information.

Key Historical Milestones:


1943: McCulloch and Pitts Model

Proposed the first model of an artificial neuron using binary threshold logic.

This model laid the groundwork for connecting logic and neural computation.

1949: Hebbian Learning (Donald Hebb)

Suggested that connections between neurons strengthen when they are activated together, an idea often summarized as
“Cells that fire together, wire together.”

This concept influenced unsupervised learning and weight adjustment algorithms in ANNs.

1958: Perceptron Model (Frank Rosenblatt)

Introduced the perceptron, a single-layer neural network capable of binary classification.

Could learn weights using a supervised learning rule, known as the Perceptron Learning Rule.

1969: Minsky and Papert's Critique

Highlighted limitations of the perceptron, especially its inability to solve non-linearly separable
problems like XOR.

Temporarily stalled research in neural networks.

1980s: Revival with Backpropagation

Development of the Backpropagation algorithm enabled training of multi-layer neural networks.

Marked the beginning of modern neural network research.

3. Classical AI vs Neural Networks


Classical AI and neural networks are two fundamentally different paradigms for building intelligent
systems.

Classical AI:
Also known as symbolic AI or good old-fashioned AI (GOFAI).

Based on explicit rule-based systems, logic, and predefined knowledge bases.

Suited for problems with well-defined rules (e.g., theorem proving, games like chess).

Neural Networks (Connectionist AI):


Inspired by the human brain’s architecture and learning processes.

Focuses on learning from examples, rather than explicit rules.

Capable of handling ambiguity, noise, and incomplete data.

| Feature | Classical AI | Neural Networks |
| --- | --- | --- |
| Knowledge Representation | Explicit (rules, logic) | Implicit (learned weights) |
| Learning Mechanism | Manual rule encoding | Data-driven learning |
| Performance | Rigid, struggles with uncertain data | Flexible, generalizes from examples |
| Interpretability | High (rules are transparent) | Low (black-box nature) |
| Domains | Logical reasoning, expert systems | Vision, speech, language, robotics |

4. Transition from Classical AI to Neural Networks


Classical AI dominated the early stages of AI research but struggled with real-world tasks
involving noisy, fuzzy, or incomplete data.

Neural networks gained prominence as they offered robust performance in such environments.

The shift was driven by the success of NNs in tasks like image recognition, speech processing,
and natural language understanding.

Biological Neurons and Artificial Neural Network; Model of Artificial Neuron

Biological Neuron: The Inspiration

The concept of artificial neural networks is inspired by the structure and function of biological
neurons, which are the fundamental units of the human nervous system.

Structure of a Biological Neuron:


Dendrites: Receive electrical signals (inputs) from other neurons.

Cell Body (Soma): Processes incoming signals and determines if the neuron should fire.

Axon: Transmits the electrical signal away from the cell body.

Axon Terminals (Synaptic Knobs): Pass the signal to the next neuron via synapses using
neurotransmitters.

Working Principle:
When a neuron receives enough signals (i.e., the input exceeds a threshold), it fires and transmits
the signal to the next neuron.

This process is non-linear and adaptive, forming the basis of learning in the brain.

Artificial Neural Network (ANN)


An Artificial Neural Network is a computational system made up of artificial neurons that mimics the
behavior of biological neurons.

Key Characteristics:
Distributed parallel processing system: Each unit (neuron) works simultaneously with others.

Learning from data: Adjusts weights based on input-output examples using a learning algorithm.

Generalization: Capable of making predictions on unseen data after training.

Components of ANN:
Input Layer: Receives raw data.

Hidden Layers: Intermediate layers where computations happen via activation functions.

Output Layer: Produces the final output (e.g., class label, value).

Weights: Each connection has an associated weight that determines the importance of the input.

Bias: Allows shifting of the activation function to better fit the data.

Activation Function: Introduces non-linearity and determines the firing of the neuron.

Model of an Artificial Neuron (Mathematical Formulation)


An artificial neuron is a mathematical function that takes inputs, applies weights, and produces an
output using an activation function.

Structure:
Let the inputs be $x_1, x_2, \dots, x_n$

Let the corresponding weights be $w_1, w_2, \dots, w_n$

Let the bias be $b$

Then, the neuron performs the following operations:

1. Weighted Sum:

$$z = \sum_{i=1}^{n} w_i x_i + b$$

2. Activation:

$$y = f(z)$$

Where $f$ is the activation function (e.g., sigmoid, tanh, ReLU).
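The two operations above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function names `sigmoid` and `neuron_output` and the numeric values are assumptions chosen for the example.

```python
import math

def sigmoid(z):
    """Logistic activation: squashes z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Compute y = f(sum_i w_i * x_i + b) for a single artificial neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias  # weighted sum
    return activation(z)                                     # activation

# Illustrative values only (hypothetical, not from the notes).
x = [1.0, 0.5]
w = [0.4, -0.2]
b = -0.3
y = neuron_output(x, w, b)  # z = 0.4 - 0.1 - 0.3, about 0, so y is about 0.5
```

Passing a different `activation` (for example `math.tanh`) changes only the second step; the weighted sum is unchanged.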

Differences Between Biological and Artificial Neurons


| Feature | Biological Neuron | Artificial Neuron |
| --- | --- | --- |
| Signal Transmission | Electrochemical (via synapse) | Numerical (weighted sum) |
| Learning | Hebbian learning | Gradient-based (e.g., backpropagation) |
| Complexity | Very high (many interconnections) | Simplified abstraction |
| Adaptability | Real-time adaptation, neuroplasticity | Requires training epochs |

Practical Implications:
The artificial neuron is the core building block of neural networks.

ANN behavior heavily depends on weight adjustments, which occur during training.

The choice of activation function and number of hidden layers significantly affects the
performance of the model.

Learning Methods: Hebbian, Competitive, Boltzmann, etc.

Overview
Learning in neural networks refers to the method of updating the connection weights so that the
network can perform a specific task (e.g., classification or prediction) more accurately over time.
Learning is typically based on data and experience.

There are three broad categories of learning methods:

1. Supervised Learning: Target output is known.

2. Unsupervised Learning: No explicit target; patterns must be discovered.

3. Reinforcement Learning: Learns through feedback in the form of rewards/punishments.

The following learning rules fall into these categories.

Hebbian Learning (Unsupervised)


Proposed by: Donald Hebb (1949).

Principle: “Neurons that fire together, wire together.”

The weight between two neurons increases if both neurons are activated simultaneously.

Weight Update Rule:

$$\Delta w_{ij} = \eta \, x_i \, y_j$$

Where:

$x_i$: input from neuron $i$

$y_j$: output from neuron $j$

$\eta$: learning rate

Characteristics:
Unsupervised

Simple biologically inspired rule

Limited by uncontrolled weight growth unless normalized
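As a rough sketch of the Hebbian rule, the snippet below applies the update to a small weight matrix. The helper name `hebbian_update`, the matrix layout (`weights[i][j]` connects input `i` to output `j`), and the sample activations are assumptions made for illustration.

```python
def hebbian_update(weights, x, y, eta=0.1):
    """Return new weights after one Hebbian step: w_ij += eta * x_i * y_j."""
    return [[w_ij + eta * x_i * y_j for w_ij, y_j in zip(row, y)]
            for row, x_i in zip(weights, x)]

w = [[0.0, 0.0],
     [0.0, 0.0]]
x = [1.0, 0.0]  # only input 0 is active
y = [1.0, 1.0]  # both outputs are active

for _ in range(3):
    w = hebbian_update(w, x, y)
# Row 0 grows by eta on every step; row 1 (inactive input) stays at zero.
```

Because nothing in the rule ever shrinks a weight, repeating the loop makes row 0 grow without bound, which is the uncontrolled growth noted above.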

Competitive Learning (Unsupervised)


Mechanism: Neurons compete to become active.

The “winning” neuron (with the strongest response) updates its weights.

Encourages specialization—each neuron becomes tuned to a specific type of input.

Weight Update Rule (for winning neuron $j$):

$$\Delta w_j = \eta \, (x - w_j)$$

Applications:
Vector Quantization

Self-Organizing Maps (Kohonen Networks)
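A minimal winner-take-all sketch follows. It assumes the "strongest response" is measured as the smallest Euclidean distance between the input and a unit's weight vector, which is one common choice; the function names and the toy data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def competitive_step(weights, x, eta=0.2):
    """Move only the winning unit toward x; return the winner's index."""
    winner = min(range(len(weights)),
                 key=lambda j: squared_distance(weights[j], x))
    # Delta w_j = eta * (x - w_j), applied to the winner only.
    weights[winner] = [w + eta * (xi - w)
                       for w, xi in zip(weights[winner], x)]
    return winner

weights = [[0.0, 0.0], [1.0, 1.0]]  # two competing units
data = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]]  # two loose clusters

for _ in range(20):
    for x in data:
        competitive_step(weights, x)
# Unit 0 settles near the low cluster and unit 1 near the high cluster.
```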

Boltzmann Learning (Stochastic / Energy-Based)


Used in Boltzmann Machines – stochastic, generative neural networks.

Involves adjusting weights to minimize a global energy function.

Uses simulated annealing to explore solution space.

Learning Characteristics:
Stochastic, based on probabilities

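Weight updates rely on the difference between the desired joint probabilities of unit pairs (measured with the training data clamped) and the actual ones (measured while the network runs freely)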

Slow convergence but good for complex distribution modeling

Energy Function (simplified):

$$E = -\sum_{i<j} w_{ij} s_i s_j - \sum_{i} b_i s_i$$

Where $s_i$ are neuron states, $w_{ij}$ are weights, and $b_i$ are biases.
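To make the energy function concrete, the sketch below evaluates it for two states of a three-unit network. The function name `boltzmann_energy`, the binary 0/1 states, and all weight and bias values are illustrative assumptions.

```python
def boltzmann_energy(states, weights, biases):
    """E = -sum_{i<j} w_ij * s_i * s_j - sum_i b_i * s_i."""
    n = len(states)
    pair_term = sum(weights[i][j] * states[i] * states[j]
                    for i in range(n) for j in range(i + 1, n))
    bias_term = sum(b * s for b, s in zip(biases, states))
    return -pair_term - bias_term

# Symmetric weights with a zero diagonal (hypothetical values).
W = [[0.0, 1.0, -2.0],
     [1.0, 0.0, 0.5],
     [-2.0, 0.5, 0.0]]
b = [0.5, 0.0, -0.5]

e_low = boltzmann_energy([1, 1, 0], W, b)   # units 0 and 1 on: E = -1.5
e_high = boltzmann_energy([1, 0, 1], W, b)  # units 0 and 2 on: E = 2.0
```

The first state has lower energy because units 0 and 1 share a positive weight, so a Boltzmann Machine would visit it more often than the second state.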

Other Learning Methods (Brief Mentions)


Error-Correction Learning: Used in supervised learning like perceptron and backpropagation.
Weights are adjusted to reduce the output error.

Reinforcement Learning: Weights are updated based on a reward signal; no precise error signal is
given.

Backpropagation: A supervised method that propagates error from output layer to hidden layers to
adjust weights using gradient descent.

Comparative Table

| Learning Method | Type | Key Idea | Use Case |
| --- | --- | --- | --- |
| Hebbian | Unsupervised | Co-activation strengthens weights | Pattern recognition, early models |
| Competitive | Unsupervised | Winner-take-all updates | Clustering, feature mapping |
| Boltzmann | Stochastic | Minimizes energy via annealing | Complex pattern learning |
| Backpropagation | Supervised | Gradient descent on error | Deep learning models |
| Reinforcement | Feedback-based | Learns via reward signals | Game AI, robotics, control systems |

Neural Network Models: Perceptron, Adaline, Madaline; Single-Layer Networks

Introduction to Single-Layer Neural Networks


A single-layer neural network consists of:

An input layer (just passes inputs to the next layer)

A single layer of output neurons (no hidden layers)

These models are mainly suitable for linearly separable problems. While simple, they form the
foundation for more advanced architectures.

1. Perceptron Model

Proposed by:
Frank Rosenblatt (1958)

Structure:
Inputs $x_1, x_2, \dots, x_n$

Weights $w_1, w_2, \dots, w_n$

Bias $b$

Activation Function: Step function

Learning Rule (Perceptron Learning Rule):

$$w_i(t+1) = w_i(t) + \eta \, (d - y) \, x_i$$

Where:

d: desired output

y: actual output

η: learning rate

Properties:
Works only for linearly separable problems

Binary classifier (0/1)

Fast and simple

Cannot solve XOR-type problems
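The sketch below trains a perceptron on the AND and XOR truth tables. It assumes the standard perceptron learning rule, $w_i \leftarrow w_i + \eta (d - y) x_i$ with the bias updated the same way, which matches the symbols $d$, $y$, and $\eta$ defined above; the function names, the threshold at zero, and the choice $\eta = 1$ are illustrative.

```python
def step(z):
    """Threshold activation: output 1 when the weighted sum reaches 0."""
    return 1 if z >= 0 else 0

def predict(w, b, x):
    """Binary output of the perceptron for input vector x."""
    return step(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_perceptron(samples, eta=1.0, epochs=20):
    """samples is a list of (inputs, desired_output) pairs with 0/1 targets."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, d in samples:
            y = predict(w, b, x)
            if y != d:
                mistakes += 1
                # Perceptron rule: shift the boundary toward the missed sample.
                w = [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]
                b += eta * (d - y)
        if mistakes == 0:  # every sample classified correctly
            break
    return w, b

AND = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
XOR = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

w_and, b_and = train_perceptron(AND)
w_xor, b_xor = train_perceptron(XOR)
and_ok = all(predict(w_and, b_and, x) == d for x, d in AND)  # separable
xor_ok = all(predict(w_xor, b_xor, x) == d for x, d in XOR)  # not separable
```

AND is linearly separable, so training stops with every sample correct. No single line separates the XOR classes, so at least one XOR sample stays misclassified however long training runs.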

2. Adaline (Adaptive Linear Neuron)

Proposed by:
Bernard Widrow & Ted Hoff (1960)

Difference from Perceptron:


Uses a linear activation function (no thresholding)

Learning is based on continuous output, not binary

Output:

$$y = \sum_{i=1}^{n} w_i x_i + b$$

Learning Rule:
Minimizes Mean Squared Error (MSE) between actual and target output.

Based on Least Mean Squares (LMS) algorithm

$$w_i(t+1) = w_i(t) + \eta \, (d - y) \, x_i$$

Properties:
Better for regression and continuous-valued outputs

Converges more smoothly due to differentiability

Still limited to linear separability
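A small LMS sketch is shown below. It fits an Adaline to noiseless samples of a linear target; the target function, learning rate, epoch count, and the function name `train_adaline` are assumptions picked so the example converges quickly.

```python
def train_adaline(samples, eta=0.05, epochs=500):
    """Fit y = sum_i w_i * x_i + b with the LMS (Widrow-Hoff) rule."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, d in samples:
            y = sum(wi * xi for wi, xi in zip(w, x)) + b  # linear output
            err = d - y                                   # continuous error
            w = [wi + eta * err * xi for wi, xi in zip(w, x)]
            b += eta * err
    return w, b

# Hypothetical target: d = 2*x1 - x2 + 0.5, sampled on a small grid.
samples = [([x1, x2], 2.0 * x1 - x2 + 0.5)
           for x1 in (0.0, 1.0, 2.0) for x2 in (0.0, 1.0)]

w, b = train_adaline(samples)
# w approaches [2, -1] and b approaches 0.5.
```

Unlike the perceptron, the update uses the raw linear output, so the error shrinks gradually instead of switching between 0 and 1.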

3. Madaline (Multiple Adaline)

Proposed by:
Bernard Widrow and colleagues (early 1960s)

Architecture:
A multi-layer network built from multiple Adalines

Typically has:

Input layer

Hidden layer (comprising Adaline units)

Output layer (also Adaline units)

Learning Rule:

Uses Madaline Rule I and II (heuristic, not gradient-based)

Madaline Rule II adjusts only the weights that lead to incorrect outputs by flipping signs (based on
minimal disturbance principle)

Advantages:
One of the first functional multi-layer neural networks

Capable of solving some non-linearly separable problems

Predecessor of modern multilayer networks and backpropagation

Limitations:
Training is complex due to non-differentiable transfer functions

Not as efficient as backpropagation in deep networks

Comparison Table
| Feature | Perceptron | Adaline | Madaline |
| --- | --- | --- | --- |
| Activation | Step function | Linear | Step function (multi-unit) |
| Learning Rule | Classification error | LMS (gradient descent) | Heuristic (MR-II) |
| Output Type | Binary (0 or 1) | Continuous real values | Binary |
| Network Type | Single-layer | Single-layer | Multi-layer |
| Problem Type | Classification | Classification/Regression | Non-linear classification |
| Limitation | Linearly separable | Linearly separable | Training complexity |

Applications:
Perceptron: Pattern recognition, classification

Adaline: Signal processing, adaptive filters

Madaline: Early speech recognition, noise reduction

Backpropagation and Multilayer Networks

Introduction to Multilayer Networks


A Multilayer Neural Network (also known as a Multilayer Perceptron – MLP) is composed of:

Input layer (accepts input features)

One or more hidden layers

Output layer

Each neuron in a layer is connected to every neuron in the next layer — forming a fully connected
feedforward architecture.

Why Multilayer?
Single-layer networks (like Perceptron or Adaline) can only solve linearly separable problems.

Multilayer networks can solve non-linearly separable problems (e.g., XOR).

The power of the network comes from its hidden layers, which can model complex patterns.
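To see why one hidden layer is enough for XOR, the sketch below wires a tiny two-layer network by hand. The weights and thresholds are chosen for illustration rather than learned: one hidden unit computes OR, the other computes AND, and the output fires when OR is on but AND is off.

```python
def step(z):
    """Threshold activation: output 1 when z is non-negative."""
    return 1 if z >= 0 else 0

def xor_net(x1, x2):
    """Two hidden threshold units followed by one output unit."""
    h_or = step(x1 + x2 - 0.5)        # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)       # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)   # OR and not AND

truth_table = [(a, b, xor_net(a, b)) for a in (0, 1) for b in (0, 1)]
```

Each hidden unit draws one straight line in the input plane; the output unit combines the two half-planes into a region that no single line could describe.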

The Problem of Learning in Multilayer Networks


Unlike single-layer networks, multilayer networks:

Don’t have direct target outputs for hidden layers.

Require an algorithm that can compute the error and propagate it backward to update the hidden
layers.

This leads to the development of the Backpropagation Algorithm.

Backpropagation Algorithm

Objective:
To minimize the error between predicted and actual output using gradient descent by adjusting the
weights in all layers of the network.

Assumptions:
Uses a differentiable activation function (e.g., sigmoid, tanh); ReLU is also widely used even though it is not differentiable at exactly zero

Employs supervised learning

Working of Backpropagation (Two Phases)


1. Forward Pass:

Inputs are fed into the network.

Outputs are computed layer by layer using activation functions.

The final output is compared with the actual label to compute error (using Mean Squared Error
or Cross Entropy).

2. Backward Pass:

The error is propagated backward from the output layer toward the input layer.

Gradients are calculated using the chain rule of calculus.

Weights are updated to minimize the error:

$$w_{ij}(t+1) = w_{ij}(t) - \eta \, \frac{\partial E}{\partial w_{ij}}$$

Where $\eta$ is the learning rate, and $E$ is the loss function.
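The two phases can be sketched for a small network with one hidden layer. This is a minimal illustration under stated assumptions: sigmoid activations everywhere, squared error $E = \frac{1}{2}(d - y)^2$ per sample, per-sample (online) updates, three hidden units, and a fixed random seed. The names `forward`, `train_step`, and `mean_squared_error` are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(net, x):
    """Forward pass: return (hidden activations, output)."""
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
         for ws, b in zip(net["W1"], net["b1"])]
    y = sigmoid(sum(w * hi for w, hi in zip(net["W2"], h)) + net["b2"])
    return h, y

def train_step(net, x, d, eta):
    """One forward and backward pass for a single sample."""
    h, y = forward(net, x)
    # Output delta: dE/dz_out = (y - d) * y * (1 - y) for a sigmoid output.
    delta_out = (y - d) * y * (1.0 - y)
    # Hidden deltas by the chain rule, using the weights before the update.
    delta_hidden = [delta_out * w * hi * (1.0 - hi)
                    for w, hi in zip(net["W2"], h)]
    # Gradient descent: w <- w - eta * dE/dw.
    net["W2"] = [w - eta * delta_out * hi for w, hi in zip(net["W2"], h)]
    net["b2"] -= eta * delta_out
    for j, dj in enumerate(delta_hidden):
        net["W1"][j] = [w - eta * dj * xi for w, xi in zip(net["W1"][j], x)]
        net["b1"][j] -= eta * dj

def mean_squared_error(net, data):
    return sum((d - forward(net, x)[1]) ** 2 for x, d in data) / len(data)

random.seed(0)
n_hidden = 3
net = {
    "W1": [[random.uniform(-1, 1) for _ in range(2)] for _ in range(n_hidden)],
    "b1": [0.0] * n_hidden,
    "W2": [random.uniform(-1, 1) for _ in range(n_hidden)],
    "b2": 0.0,
}
XOR = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

loss_before = mean_squared_error(net, XOR)
for _ in range(5000):
    for x, d in XOR:
        train_step(net, x, d, eta=0.5)
loss_after = mean_squared_error(net, XOR)
```

The loss after training should be lower than before. How close it gets to zero depends on the initial weights, since gradient descent on this problem can settle on a plateau.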

Key Components

Activation Functions:
Sigmoid: Smooth curve, output between 0 and 1

Tanh: Zero-centered, output between –1 and 1

ReLU: Faster convergence in practice; mitigates the vanishing gradient problem because its gradient is 1 for positive inputs

Loss Functions:
MSE: Used for regression tasks

Cross-Entropy: Commonly used for classification tasks

Learning Rate:
Controls how large a step is taken while updating weights.

Too high: might overshoot optimal solution.

Too low: convergence becomes very slow.

Advantages of Backpropagation:
Efficient method for training deep neural networks.

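The networks it trains can approximate any continuous function on a bounded domain given enough hidden neurons (Universal Approximation
Theorem); the theorem guarantees that suitable weights exist, not that training will find them.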

Automates learning across multiple layers.

Limitations:
Can get stuck in local minima (non-convex optimization).

Suffers from the vanishing gradient problem in deep networks (especially with sigmoid/tanh).

Requires large datasets and computational resources.

Applications:
Image and speech recognition

Natural language processing

Financial prediction

Medical diagnostics

Competitive Learning Networks: Kohonen Self-Organizing Networks, Hebbian Learning

Competitive Learning: Core Idea
In competitive learning, neurons in a network compete to respond to a given input. Only one neuron
wins (or a small subset wins), and only that neuron’s weights are updated. This is in contrast to
backpropagation, where all weights are updated.

Used primarily in unsupervised learning

Neurons learn to recognize clusters or patterns in the input space

Often applied in clustering, dimensionality reduction, and vector quantization

Kohonen Self-Organizing Map (SOM)

Developed by:
Teuvo Kohonen (1982)

Objective:
To produce a topological mapping of input data — i.e., to convert high-dimensional data into a 2D
representation while preserving the relative structure of the data.

Structure:
Consists of:

Input layer: Feature vector

Output layer: Usually a 2D grid of neurons (map)

Each neuron in the output layer is associated with a weight vector of the same dimension as the
input

Learning Process:
1. Initialization: Randomly initialize weight vectors.

2. Input selection: Choose an input vector $x$.

3. Winner selection: Find the Best Matching Unit (BMU) or winning neuron whose weight vector $w_j$
is closest to $x$:

$$\text{BMU} = \arg\min_{j} \lVert x - w_j \rVert$$

4. Weight update:

$$w_j(t+1) = w_j(t) + \eta(t) \cdot h_{j,i}(t) \cdot \bigl(x - w_j(t)\bigr)$$

$\eta(t)$: learning rate

$h_{j,i}(t)$: neighborhood function (decays with distance from BMU)

5. Repeat for several epochs, reducing learning rate and neighborhood size over time
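The five steps above can be sketched as follows. The Gaussian neighborhood, the linear decay schedules, the 4x4 grid, and the random data are all assumptions made for this example; the notes do not prescribe a particular neighborhood function or schedule.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(weights,
               key=lambda pos: sum((w - xi) ** 2
                                   for w, xi in zip(weights[pos], x)))

def som_step(weights, x, eta, sigma):
    """Update the BMU and its grid neighbors toward x; return the BMU."""
    bmu = best_matching_unit(weights, x)
    for pos, w in weights.items():
        grid_dist_sq = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
        h = math.exp(-grid_dist_sq / (2.0 * sigma ** 2))  # 1 at the BMU
        weights[pos] = [wi + eta * h * (xi - wi) for wi, xi in zip(w, x)]
    return bmu

random.seed(1)
rows, cols, dim = 4, 4, 3
# Step 1: random initialization of one weight vector per grid cell.
weights = {(r, c): [random.random() for _ in range(dim)]
           for r in range(rows) for c in range(cols)}
data = [[random.random() for _ in range(dim)] for _ in range(50)]

epochs = 30
for t in range(epochs):
    frac = 1.0 - t / epochs      # shrinks from 1 toward 0
    eta = 0.5 * frac + 0.01      # decaying learning rate
    sigma = 2.0 * frac + 0.3     # decaying neighborhood radius
    for x in data:               # steps 2 to 4 for each input
        som_step(weights, x, eta, sigma)
```

Early epochs use a wide neighborhood so that large regions of the grid move together; later epochs narrow it so that individual units fine-tune to nearby inputs.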

Key Properties:
Performs dimensionality reduction

Learns topological relationships

Ideal for visualizing high-dimensional data

Applications:
Pattern recognition

Clustering

Feature compression

Data visualization (e.g., document classification, genomic data)

Hebbian Learning (Revisited)


Although already introduced earlier, in the context of competitive learning networks, Hebbian learning
often forms the foundation.

Core Principle:

"When neuron A repeatedly assists in firing neuron B, the connection between them strengthens."

Characteristics in Competitive Setting:


Reinforces weight changes in the direction of input patterns

Often combined with normalization to avoid unbounded weight growth

Encourages formation of feature detectors

Comparison: Kohonen SOM vs Hebbian Learning

| Feature | Kohonen SOM | Hebbian Learning |
| --- | --- | --- |
| Type | Competitive, unsupervised | Correlation-based, unsupervised |
| Weight Update | Only BMU and neighbors | All neurons involved |
| Output | Topological map | Not topological |
| Use Case | Dimensionality reduction, clustering | Pattern association, correlation |
| Biological Plausibility | Moderate | High |

Key Takeaways:
Competitive learning is useful for unsupervised pattern discovery.

Kohonen SOM organizes input into spatially meaningful output maps — particularly useful in high-
dimensional datasets.

Hebbian learning is the biological inspiration that underlies many unsupervised weight adjustment
mechanisms.

Hopfield Networks and Neuro-Fuzzy Modelling

Hopfield Networks

Developed by:
John J. Hopfield (1982)

Overview:
A Hopfield Network is a type of recurrent neural network where:

Each neuron is connected to every other neuron

Connections are symmetric: $w_{ij} = w_{ji}$, and $w_{ii} = 0$

The network acts as an associative memory system that stores patterns and retrieves them even
from noisy inputs

Key Characteristics:
Binary threshold units (output: 0 or 1 / –1 or +1 depending on version)

Works in discrete time with asynchronous updates

Designed to converge to a stable state (local minima of an energy function)

Energy Function:
Hopfield defined an energy function that decreases over time:

$$E = -\frac{1}{2} \sum_{i \neq j} w_{ij} s_i s_j + \sum_{i} \theta_i s_i$$

$s_i$: state of neuron $i$

$w_{ij}$: weight between neurons $i$ and $j$

$\theta_i$: threshold of neuron $i$

The network evolves until it reaches a stable minimum energy state, which corresponds to a stored
pattern.

Learning Rule (Hebbian-based):


$$w_{ij} = \sum_{k=1}^{p} x_i^{(k)} x_j^{(k)} \quad (i \neq j)$$

Where $x^{(k)}$ is the $k$-th pattern to be stored.
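The storage rule and the energy function can be tried out on a tiny example. The sketch assumes bipolar states (–1 or +1), zero thresholds, and asynchronous updates in index order; the two stored patterns and the function names are illustrative.

```python
def train_hopfield(patterns):
    """Hebbian storage: w_ij = sum_k x_i^(k) * x_j^(k), with w_ii = 0."""
    n = len(patterns[0])
    W = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W

def energy(W, s, theta=None):
    """E = -1/2 * sum_{i != j} w_ij s_i s_j + sum_i theta_i s_i."""
    n = len(s)
    theta = theta or [0] * n
    pair = sum(W[i][j] * s[i] * s[j]
               for i in range(n) for j in range(n) if i != j)
    return -0.5 * pair + sum(t * si for t, si in zip(theta, s))

def recall(W, s, sweeps=5):
    """Asynchronous updates: revisit units one at a time until stable."""
    s = list(s)
    for _ in range(sweeps):
        changed = False
        for i in range(len(s)):
            field = sum(W[i][j] * s[j] for j in range(len(s)))
            new_state = 1 if field >= 0 else -1
            if new_state != s[i]:
                s[i] = new_state
                changed = True
        if not changed:
            break
    return s

stored = [[1, 1, 1, 1, -1, -1, -1, -1],
          [1, -1, 1, -1, 1, -1, 1, -1]]
W = train_hopfield(stored)

noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # first pattern with one bit flipped
restored = recall(W, noisy)            # recovers the first stored pattern
```

The flipped bit is corrected on the first update, and the energy drops from the noisy state to the restored one, as the energy function predicts.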

Applications:
Associative memory

Error correction

Pattern completion (e.g., image restoration)

Limitations:
Limited capacity: can reliably store only about 0.15n patterns for n neurons

May converge to spurious (unstored) states

Only suitable for static pattern recall, not sequential data

Neuro-Fuzzy Modelling

Definition:
Neuro-Fuzzy systems combine:

Neural networks (learning from data)

Fuzzy logic (handling imprecision and uncertainty)

The goal is to build adaptive systems that learn fuzzy rules from data and tune membership functions
automatically.

Popular Model: Adaptive Neuro-Fuzzy Inference System (ANFIS)


ANFIS is a hybrid model that:

Uses a fuzzy inference system (FIS) (typically Sugeno-type)

Learns its parameters using neural network learning methods (e.g., backpropagation and/or least
squares)

ANFIS Architecture (5 Layers)


1. Input Layer:

Passes crisp inputs to the next layer.

2. Fuzzification Layer:

Applies membership functions to inputs (e.g., Gaussian, triangular).

Outputs degree of membership.

3. Rule Layer:

Each node represents a fuzzy rule.

Outputs the firing strength of a rule.

4. Normalization Layer:

Normalizes firing strengths of rules.

5. Output Layer:

Computes the final output as a weighted average of rule outputs.

Example: Rule Structure in ANFIS


Fuzzy rule:

IF Temperature is Hot AND Humidity is Low THEN FanSpeed = 0.8

Premise part is fuzzy (with membership functions)

Consequent is typically a linear function of inputs (in Sugeno FIS); a constant such as 0.8 is the zero-order special case
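A forward pass through the layers described above can be sketched for the fan-speed example. Everything numeric here is a hypothetical choice: Gaussian membership functions with made-up centers and widths, product as the fuzzy AND, and a second rule (cool and humid gives 0.2) added so that normalization has something to do. No learning is shown.

```python
import math

def gaussian_mf(x, center, width):
    """Degree of membership of x in a Gaussian fuzzy set."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def sugeno_output(temperature, humidity, rules):
    """Fuzzify, fire rules, normalize, then take the weighted average."""
    # Fuzzification and rule layers: firing strength via product AND.
    strengths = [gaussian_mf(temperature, *r["temp_mf"]) *
                 gaussian_mf(humidity, *r["hum_mf"]) for r in rules]
    # Normalization layer (assumes at least one rule fires above zero).
    total = sum(strengths)
    normalized = [s / total for s in strengths]
    # Consequents: linear in the inputs, p*T + q*H + r (Sugeno type).
    consequents = [r["p"] * temperature + r["q"] * humidity + r["r"]
                   for r in rules]
    # Output layer: weighted average of the rule outputs.
    return sum(n * c for n, c in zip(normalized, consequents))

rules = [
    # IF Temperature is Hot AND Humidity is Low THEN FanSpeed = 0.8
    {"temp_mf": (35.0, 5.0), "hum_mf": (30.0, 10.0),
     "p": 0.0, "q": 0.0, "r": 0.8},
    # Hypothetical second rule: IF Cool AND High humidity THEN FanSpeed = 0.2
    {"temp_mf": (15.0, 5.0), "hum_mf": (80.0, 10.0),
     "p": 0.0, "q": 0.0, "r": 0.2},
]

hot_dry = sugeno_output(35.0, 30.0, rules)    # dominated by the first rule
cool_wet = sugeno_output(15.0, 80.0, rules)   # dominated by the second rule
```

In ANFIS, the membership parameters (centers and widths) and the consequent parameters (`p`, `q`, `r`) are the quantities that training adjusts.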

Learning in ANFIS:
Uses hybrid learning:

Forward pass: Least Squares estimates for output parameters.

Backward pass: Gradient descent updates for membership function parameters.

Applications:
Forecasting (e.g., stock market, weather)

Control systems (e.g., robotics, HVAC)

Pattern classification

Function approximation

Comparison: Hopfield vs Neuro-Fuzzy


| Aspect | Hopfield Network | Neuro-Fuzzy System |
| --- | --- | --- |
| Type | Recurrent, associative memory | Hybrid (NN + fuzzy logic) |
| Learning | Hebbian rule | Backpropagation + fuzzy tuning |
| Output | Binary or bipolar | Crisp or fuzzy values |
| Suitable for | Pattern recall | Modeling imprecise, nonlinear systems |
| Interpretability | Low | High (rule-based reasoning) |
