
Introduction

Basics of Soft Computing

What is Soft Computing?

• The idea of soft computing was initiated in 1981, when Lotfi A. Zadeh published his first
paper on soft data analysis ["What is Soft Computing", Soft Computing, Springer-Verlag,
Germany/USA, 1997].

• Zadeh defined Soft Computing as one multidisciplinary system: the fusion of the
fields of Fuzzy Logic, Neuro-Computing, Evolutionary and Genetic Computing, and
Probabilistic Computing.

• Soft Computing is the fusion of methodologies designed to model and enable solutions to
real-world problems which cannot be modeled, or are too difficult to model, mathematically.

• The aim of Soft Computing is to exploit the tolerance for imprecision, uncertainty,
approximate reasoning, and partial truth in order to achieve close resemblance to
human-like decision making.
• The Soft Computing development history :

SC = EC + NN + FL
(Soft Computing = Evolutionary Computing + Neural Network + Fuzzy Logic)

- Soft Computing : Zadeh, 1981
- Evolutionary Computing : Rechenberg, 1960
- Neural Network : McCulloch, 1943
- Fuzzy Logic : Zadeh, 1965

EC = GP + ES + EP + GA
(Evolutionary Computing = Genetic Programming + Evolution Strategies + Evolutionary Programming + Genetic Algorithms)

- Evolutionary Computing : Rechenberg, 1960
- Genetic Programming : Koza, 1992
- Evolution Strategies : Rechenberg, 1965
- Evolutionary Programming : Fogel, 1962
- Genetic Algorithms : Holland, 1970
Definitions of Soft Computing (SC)

Lotfi A. Zadeh, 1992 : "Soft Computing is an emerging approach to computing which parallels
the remarkable ability of the human mind to reason and learn in an environment of uncertainty
and imprecision".

The Soft Computing consists of several computing paradigms mainly :

Fuzzy Systems, Neural Networks, and Genetic Algorithms.

• Fuzzy set : for knowledge representation via fuzzy If – Then rules.

• Neural Networks : for learning and adaptation

• Genetic Algorithms : for evolutionary computation

These methodologies form the core of SC.

Hybridization of these three creates a successful synergic effect; that is, hybridization creates a
situation where different entities cooperate advantageously for a final outcome.

Soft Computing is still growing and developing.

Hence, a clear definite agreement on what comprises Soft Computing has not yet been reached.
More new sciences are still merging into Soft Computing.

Goals of Soft Computing


Soft Computing is a new multidisciplinary field whose aim is to construct a new generation of
Artificial Intelligence, known as Computational Intelligence.

• The main goal of Soft Computing is to develop intelligent machines to provide solutions
to real-world problems which cannot be modeled, or are too difficult to model, mathematically.

• Its aim is to exploit the tolerance for Approximation, Uncertainty, Imprecision, and
Partial Truth in order to achieve close resemblance to human-like decision making.

Approximation : here the model features are similar to the real ones, but not the same.

Uncertainty : here we are not sure that the features of the model are the same as those of the
entity (belief).

Imprecision : here the model features (quantities) are not the same as the real ones, but are
close to them.

Importance of Soft Computing

Soft computing differs from hard (conventional) computing. Unlike hard computing, soft
computing is tolerant of imprecision, uncertainty, partial truth, and approximation. The
guiding principle of soft computing is to exploit this tolerance to achieve tractability,
robustness, and low solution cost. In effect, the role model for soft computing is the
human mind.

The four fields that constitute Soft Computing (SC) are : Fuzzy Computing (FC),
Evolutionary Computing (EC), Neural Computing (NC), and Probabilistic Computing (PC),
with the latter subsuming belief networks, chaos theory, and parts of learning theory.

Soft computing is not a concoction, mixture, or combination; rather, it is a partnership
in which each of the partners contributes a distinct methodology for addressing problems
in its domain. In principle, the constituent methodologies in Soft Computing are
complementary rather than competitive.

Soft computing may be viewed as a foundation component for the emerging field of
Computational Intelligence.

Fuzzy Computing

In the real world there exists much fuzzy knowledge, that is, knowledge which
is vague, imprecise, uncertain, ambiguous, inexact, or probabilistic in nature.

Humans can use such information because human thinking and reasoning
frequently involve fuzzy information, possibly originating from inherently
inexact human concepts and from matching similar rather than identical
experiences.

Computing systems based upon classical set theory and two-valued logic
cannot answer some questions as humans do, because they do not have
completely true answers.

We want computing systems that not only give human-like answers but
also describe their reality levels. These levels need to be calculated using
the imprecision and the uncertainty of the facts and rules that were applied.

Fuzzy Sets

Introduced by Lotfi Zadeh in 1965, the fuzzy set theory is an extension of


classical set theory where elements have degrees of membership.

• Classical Set Theory

− Sets are defined by a simple statement describing whether an

element having a certain property belongs to a particular set.

− When set A is contained in a universal space X, then we can state
explicitly whether each element x of space X "is or is not" an element of A.

− Set A is well described by a function called characteristic function A.


This function, defined on the universal space X, assumes :

value 1 for those elements x that belong to set A, and

value 0 for those elements x that do not belong to set A.

The notations used to express these mathematically are

Α : Χ → {0, 1}

A(x) = 1 , x is a member of A          Eq.(1)
A(x) = 0 , x is not a member of A

Alternatively, the set A can be represented for all elements x ∈ X
by its characteristic function A(x) defined as

A(x) = 1 if x ∈ A, and A(x) = 0 otherwise          Eq.(2)

− Thus, in classical set theory A(x) has only the values 0 ('false') and 1
('true'). Such sets are called crisp sets.
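The characteristic function of Eq.(1)–(2) can be sketched in plain Python. This is an illustrative sketch: the helper name `char_A` and the choice of subset A are our own, not from the text.

```python
# Characteristic function of a crisp set A within a universal space X.
X = set(range(1, 13))          # universal space {1, ..., 12}
A = {2, 3, 5, 7, 11}           # an example crisp subset of X

def char_A(x):
    """Eq.(2): returns 1 if x belongs to A, 0 otherwise."""
    return 1 if x in A else 0

print(sorted(char_A(x) for x in X))
```

Every element of X is mapped to exactly 0 or 1, which is what makes the set crisp.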

• Crisp and Non-crisp Set

− As said before, in classical set theory the characteristic function A(x)
of Eq.(2) has only the values 0 ('false') and 1 ('true').

Such sets are crisp sets.

− For non-crisp sets the characteristic function A(x) can be defined differently.

■ The characteristic function A(x) of Eq.(2) for the crisp set is
generalized for non-crisp sets.

■ This generalized characteristic function A(x) is called a
membership function.

Such non-crisp sets are called Fuzzy Sets.

− Crisp set theory is not capable of representing descriptions and
classifications in many cases; in fact, crisp sets do not provide
adequate representation for most cases.

− The proposition of Fuzzy Sets is motivated by the need to capture and
represent real-world data with uncertainty due to imprecise
measurement.

− The uncertainties are also caused by vagueness in the language.

• Example 1 : Heap Paradox

This example represents a situation where vagueness and uncertainty are inevitable.

- If we remove one grain from a heap of grains, we will still have a heap.
- However, if we keep removing grains one by one from a heap of grains, there will be a
time when we do not have a heap anymore.
- The question is, at what time does the heap turn into a countable collection of grains
that do not form a heap? There is no one correct answer to this question.

• Example 2 : Classify students for a basketball team

This example explains the grade of truth value.

- Tall students qualify and not-tall students do not qualify.
- If students 1.8 m tall are to be qualified, then should we exclude a
student who is 1/10" less? Or should we exclude a student who is 1" shorter?

Non-crisp representation of the notion of a tall person :

A student of height 1.79 m would belong to both the tall and the not-tall sets, with a
particular degree of membership in each. As the height increases, the membership grade
within the tall set would increase whilst the membership grade within the not-tall set
would decrease.

• Capturing Uncertainty

Instead of avoiding or ignoring uncertainty, Lotfi Zadeh introduced Fuzzy


Set theory that captures uncertainty.

■ In the case of Crisp Sets the members of a set are :

either out of the set, with membership of degree "0",
or in the set, with membership of degree "1".

Therefore, Crisp Sets ⊆ Fuzzy Sets. In other words, Crisp Sets are
special cases of Fuzzy Sets.

Example 1 : Set of prime numbers (a crisp set)

If we consider a space X consisting of natural numbers ≤ 12,

ie X = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12},

then the set of prime numbers can be described as follows.

PRIME = {x contained in X | x is a prime number} = {2, 3, 5, 7, 11}

Example 2 : Set of SMALL numbers (a non-crisp set)

The subset SMALL of X cannot be described in the same way;

for example, 1 is a member of SMALL and 12 is not a member of SMALL,
but the boundary between the two is not sharp.

Set A, as SMALL, has un-sharp boundaries and can be characterized by a
function that assigns a real number from the closed interval from 0 to
1 to each element x in the set X.

• Definition of Fuzzy Set

A fuzzy set A defined in the universal space X is a function defined in X


which assumes values in the range [0, 1].

A fuzzy set A is written as a set of pairs {x, A(x)} as

A = {{x , A(x)}} , x in the set X

where x is an element of the universal space X, and


A(x) is the value of the function A for this element.

The value A(x) is the membership grade of the element x in a fuzzy set A.

Example : Set SMALL in set X consisting of the natural numbers 1 to 12.

Assume:
SMALL(1) = 1, SMALL(2) = 1, SMALL(3) = 0.9, SMALL(4) = 0.6,
SMALL(5) = 0.4, SMALL(6) = 0.3, SMALL(7) = 0.2, SMALL(8) = 0.1,
SMALL(u) = 0 for u ≥ 9.

Then, following the notations described in the definition above :

Set SMALL = {{1, 1 }, {2, 1 }, {3, 0.9}, {4, 0.6}, {5, 0.4}, {6, 0.3}, {7, 0.2},

{8, 0.1}, {9, 0 }, {10, 0 }, {11, 0}, {12, 0}}

Note that a fuzzy set can be defined precisely by associating with each x ,
its grade of membership in SMALL.
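The membership grades above can be sketched as a plain-Python dictionary. This is an illustrative representation (our own, not the FuzzySet notation used in the text): each element x maps to its grade A(x), and absent elements default to grade 0.

```python
# The fuzzy set SMALL as a mapping from each x in X to its membership grade.
SMALL = {1: 1.0, 2: 1.0, 3: 0.9, 4: 0.6, 5: 0.4, 6: 0.3,
         7: 0.2, 8: 0.1, 9: 0.0, 10: 0.0, 11: 0.0, 12: 0.0}

def membership(fuzzy_set, x):
    """Membership grade A(x); elements outside the support have grade 0."""
    return fuzzy_set.get(x, 0.0)

print(membership(SMALL, 3))   # 0.9
```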

• Definition of Universal Space

Originally the universal space for fuzzy sets in fuzzy logic was defined only
on the integers. Now, the universal space for fuzzy sets and fuzzy
relations is defined with three numbers. The first two numbers specify the
start and end of the universal space, and the third argument specifies the
increment between elements. This gives the user more flexibility in
choosing the universal space.

Example : The fuzzy set of numbers, defined in the universal space

X = { xi } = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} is presented as

SetOption [FuzzySet, UniversalSpace → {1, 12, 1}]

• Graphic Interpretation of Fuzzy Sets SMALL

The fuzzy set SMALL of small numbers, defined in the universal space

X = { xi } = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} is presented as

SetOption [FuzzySet, UniversalSpace → {1, 12, 1}]

The Set SMALL in set X is :

SMALL = FuzzySet {{1, 1 }, {2, 1 }, {3, 0.9}, {4, 0.6}, {5, 0.4}, {6, 0.3},

{7, 0.2}, {8, 0.1}, {9, 0 }, {10, 0 }, {11, 0}, {12, 0}}

Therefore SetSmall is represented as

SetSmall = FuzzySet [{{1,1},{2,1}, {3,0.9}, {4,0.6}, {5,0.4},{6,0.3}, {7,0.2},


{8, 0.1}, {9, 0}, {10, 0}, {11, 0}, {12, 0}} , UniversalSpace → {1, 12, 1}]

FuzzyPlot [ SMALL, AxesLabel → {"X", "SMALL"}]

• Graphic Interpretation of Fuzzy Sets PRIME Numbers

The fuzzy set PRIME numbers, defined in the universal space

X = { xi } = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} is presented as

SetOption [FuzzySet, UniversalSpace → {1, 12, 1}]

The Set PRIME in set X is :

PRIME = FuzzySet {{1, 0}, {2, 1}, {3, 1}, {4, 0}, {5, 1}, {6, 0}, {7, 1}, {8, 0}, {9, 0}, {10, 0}, {11, 1},

{12, 0}}

Therefore SetPrime is represented as

SetPrime = FuzzySet [{{1,0},{2,1}, {3,1}, {4,0}, {5,1},{6,0}, {7,1},

{8, 0}, {9, 0}, {10, 0}, {11, 1}, {12, 0}} , UniversalSpace → {1, 12, 1}]

FuzzyPlot [ PRIME, AxesLabel → {"X", "PRIME"}]

• Graphic Interpretation of Fuzzy Sets UNIVERSALSPACE

In any application of sets or fuzzy sets theory, all sets are subsets of

a fixed set called universal space or universe of discourse denoted by X.


Universal space X as a fuzzy set is a function equal to 1 for all elements.

The fuzzy set UNIVERSALSPACE numbers, defined in the universal

space X = { xi } = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} is presented as

SetOption [FuzzySet, UniversalSpace → {1, 12, 1}]

The Set UNIVERSALSPACE in set X is :

UNIVERSALSPACE = FuzzySet {{1, 1}, {2, 1}, {3, 1}, {4, 1}, {5, 1}, {6, 1},

{7, 1}, {8, 1}, {9, 1}, {10, 1}, {11, 1}, {12, 1}}

Therefore SetUniversal is represented as

SetUniversal = FuzzySet [{{1,1}, {2,1}, {3,1}, {4,1}, {5,1}, {6,1}, {7,1},
{8,1}, {9,1}, {10,1}, {11,1}, {12,1}} , UniversalSpace → {1, 12, 1}]

FuzzyPlot [ UNIVERSALSPACE, AxesLabel → {"X", "UNIVERSALSPACE"}]

Finite and Infinite Universal Space

Universal sets can be finite or infinite.

Any universal set is finite if it consists of a specific number of different


elements, that is, if in counting the different elements of the set, the
counting can come to an end, else the set is infinite.

Examples:

1. Let N be the universal space of the days of the week.

N = {Mo, Tu, We, Th, Fr, Sa, Su}. N is finite.

2. Let M = {1, 3, 5, 7, 9, ...}. M is infinite.

3. Let L = {u | u is a lake in a city }. L is finite.

(Although it may be difficult to count the number of lakes in a
city, L is still a finite universal set.)

• Graphic Interpretation of Fuzzy Sets EMPTY

An empty set is a set that contains only elements with a grade of


membership equal to 0.

Example : Let EMPTY be the set of people in Minnesota older than 120.
The empty set is also called the null set.

The fuzzy set EMPTY , defined in the universal space

X = { xi } = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} is presented as

SetOption [FuzzySet, UniversalSpace → {1, 12, 1}]

The Set EMPTY in set X is :

EMPTY = FuzzySet {{1, 0}, {2, 0}, {3, 0}, {4, 0}, {5, 0}, {6, 0}, {7, 0},
{8, 0}, {9, 0}, {10, 0}, {11, 0}, {12, 0}}

Therefore SetEmpty is represented as

SetEmpty = FuzzySet [{{1,0},{2,0}, {3,0}, {4,0}, {5,0},{6,0}, {7,0},

{8, 0}, {9, 0}, {10, 0}, {11, 0}, {12, 0}} , UniversalSpace → {1, 12, 1}]

FuzzyPlot [ EMPTY, AxesLabel → {"X", "EMPTY"}]

Fuzzy Operations

Fuzzy set operations are operations on fuzzy sets. They are
generalizations of crisp set operations. Zadeh [1965] formulated
fuzzy set theory in terms of the standard operations :
Complement, Union, Intersection, and Difference.

In this section, the graphical interpretation of the following standard fuzzy
set terms and Fuzzy Logic operations is illustrated :

Inclusion :
FuzzyInclude [VERYSMALL, SMALL]

Equality :
FuzzyEQUALITY [SMALL, STILLSMALL]

Complement :
FuzzyNOTSMALL = FuzzyComplement [SMALL]

Union :
FuzzyUNION = [SMALL ∪ MEDIUM]

Intersection :
FUZZYINTERSECTION = [SMALL ∩ MEDIUM]

• Inclusion

Let A and B be fuzzy sets defined in the same universal space X.

The fuzzy set A is included in the fuzzy set B if and only if for every x in

the set X we have A(x) ≤ B(x)

Example :

The fuzzy set UNIVERSALSPACE numbers, defined in the universal

space X = { xi } = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} is presented as

SetOption [FuzzySet, UniversalSpace → {1, 12, 1}]

The fuzzy set B SMALL

The Set SMALL in set X is :

SMALL = FuzzySet {{1, 1 }, {2, 1 }, {3, 0.9}, {4, 0.6}, {5, 0.4}, {6, 0.3},

{7, 0.2}, {8, 0.1}, {9, 0 }, {10, 0 }, {11, 0}, {12, 0}}

Therefore SetSmall is represented as

SetSmall = FuzzySet [{{1,1},{2,1}, {3,0.9}, {4,0.6}, {5,0.4},{6,0.3}, {7,0.2},


{8, 0.1}, {9, 0}, {10, 0}, {11, 0}, {12, 0}} , UniversalSpace → {1, 12, 1}]

The fuzzy set A VERYSMALL

The Set VERYSMALL in set X is :

VERYSMALL = FuzzySet {{1, 1 }, {2, 0.8 }, {3, 0.7}, {4, 0.4}, {5, 0.2},
{6, 0.1}, {7, 0 }, {8, 0 }, {9, 0 }, {10, 0 }, {11, 0}, {12, 0}}

Therefore SetVerySmall is represented as

SetVerySmall = FuzzySet [{{1,1},{2,0.8}, {3,0.7}, {4,0.4}, {5,0.2},{6,0.1},
{7,0}, {8, 0}, {9, 0}, {10, 0}, {11, 0}, {12, 0}} , UniversalSpace → {1, 12, 1}]
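The inclusion test A(x) ≤ B(x) can be sketched in Python. The dictionary representation and the helper name `is_included` are illustrative choices, not from the text; zero grades are omitted from each dictionary for brevity.

```python
def is_included(A, B):
    """A ⊆ B  iff  A(x) <= B(x) for every x in the universal space."""
    return all(A.get(x, 0.0) <= B.get(x, 0.0) for x in set(A) | set(B))

# Non-zero membership grades of the two sets from the example above.
SMALL     = {1: 1.0, 2: 1.0, 3: 0.9, 4: 0.6, 5: 0.4, 6: 0.3, 7: 0.2, 8: 0.1}
VERYSMALL = {1: 1.0, 2: 0.8, 3: 0.7, 4: 0.4, 5: 0.2, 6: 0.1}

print(is_included(VERYSMALL, SMALL))   # True
```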

The Fuzzy Operation : Inclusion

Include [VERYSMALL, SMALL]
• Comparability

Two fuzzy sets A and B are comparable

if the condition A ⊂ B or B ⊂ A holds, ie,

if one of the fuzzy sets is a subset of the other set, they are comparable.

Two fuzzy sets A and B are incomparable

if the condition A ⊄ B and B ⊄ A holds, ie,

if neither of the fuzzy sets is a subset of the other.

Example 1:

Let A = {{a, 1}, {b, 1}, {c, 0}} and

B = {{a, 1}, {b, 1}, {c, 1}}.

Then A is comparable to B, since A is a subset of B.

Example 2 :

Let C = {{a, 1}, {b, 1}, {c, 0.5}} and

D = {{a, 1}, {b, 0.9}, {c, 0.6}}.

Then C and D are not comparable since

C is not a subset of D and

D is not a subset of C.

Property Related to Inclusion :

for all x in the set X, if A(x) ≤ B(x) ≤ C(x), then accordingly A ⊂ C.
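The comparability test above can be sketched in Python. The dictionary representation and the helper names are illustrative; the example sets are the ones from Examples 1 and 2.

```python
def is_included(A, B):
    """A ⊆ B iff A(x) <= B(x) everywhere."""
    return all(A.get(x, 0.0) <= B.get(x, 0.0) for x in set(A) | set(B))

def comparable(A, B):
    """Two fuzzy sets are comparable iff one is a subset of the other."""
    return is_included(A, B) or is_included(B, A)

A = {'a': 1.0, 'b': 1.0, 'c': 0.0}
B = {'a': 1.0, 'b': 1.0, 'c': 1.0}
C = {'a': 1.0, 'b': 1.0, 'c': 0.5}
D = {'a': 1.0, 'b': 0.9, 'c': 0.6}

print(comparable(A, B))   # True  (A is a subset of B)
print(comparable(C, D))   # False (neither contains the other)
```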

• Equality

Let A and B be fuzzy sets defined in the same space X.

Then A and B are equal, which is denoted A = B, if and only if

for all x in the set X, A(x) = B(x).

Example.

The fuzzy set B SMALL

SMALL = FuzzySet {{1, 1 }, {2, 1 }, {3, 0.9}, {4, 0.6}, {5, 0.4}, {6, 0.3},

{7, 0.2}, {8, 0.1}, {9, 0 }, {10, 0 }, {11, 0}, {12, 0}}

The fuzzy set A STILLSMALL

STILLSMALL = FuzzySet {{1, 1 }, {2, 1 }, {3, 0.9}, {4, 0.6}, {5, 0.4},

{6, 0.3}, {7, 0.2}, {8, 0.1}, {9, 0 }, {10, 0 }, {11, 0}, {12, 0}}

The Fuzzy Operation : Equality

Equality [SMALL, STILLSMALL]

If for some x in the set X, A(x) ≠ B(x), then we say that A is not equal to B.

• Complement

Let A be a fuzzy set defined in the space X.

Then the fuzzy set B is a complement of the fuzzy set A, if and only if,

for all x in the set X, B(x) = 1 - A(x).

The complement of the fuzzy set A is often denoted by A', Aᶜ, or Ā.

Fuzzy Complement : Ac(x) = 1 – A(x)

Example 1.

The fuzzy set A SMALL

SMALL = FuzzySet {{1, 1 }, {2, 1 }, {3, 0.9}, {4, 0.6}, {5, 0.4}, {6, 0.3},

{7, 0.2}, {8, 0.1}, {9, 0 }, {10, 0 }, {11, 0}, {12, 0}}

The fuzzy set Ac NOTSMALL

NOTSMALL = FuzzySet {{1, 0 }, {2, 0 }, {3, 0.1}, {4, 0.4}, {5, 0.6}, {6, 0.7},

{7, 0.8}, {8, 0.9}, {9, 1 }, {10, 1 }, {11, 1}, {12, 1}}

The Fuzzy Operation : Complement

NOTSMALL = Complement [SMALL]
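The complement rule B(x) = 1 − A(x) can be sketched in Python. The dictionary representation and the rounding step (which keeps the printed grades tidy despite floating-point arithmetic) are illustrative choices.

```python
def complement(A, universe):
    """Fuzzy complement: Ac(x) = 1 - A(x) for every x in the universal space."""
    return {x: round(1 - A.get(x, 0.0), 10) for x in universe}

X = range(1, 13)
SMALL = {1: 1.0, 2: 1.0, 3: 0.9, 4: 0.6, 5: 0.4, 6: 0.3, 7: 0.2, 8: 0.1}

NOTSMALL = complement(SMALL, X)
print(NOTSMALL[3], NOTSMALL[9])   # 0.1 1.0
```

Elements with grade 0 in SMALL (such as 9 through 12) get grade 1 in NOTSMALL, matching the NOTSMALL set listed above.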


Example 2.

The empty set Φ and the universal set X, as fuzzy sets, are
complements of one another :

Φ' = X , X' = Φ

The fuzzy set B EMPTY

Empty = FuzzySet {{1, 0 }, {2, 0 }, {3, 0}, {4, 0}, {5, 0}, {6, 0},

{7, 0}, {8, 0}, {9, 0 }, {10, 0 }, {11, 0}, {12, 0}}

The fuzzy set A UNIVERSAL

Universal = FuzzySet {{1, 1 }, {2, 1 }, {3, 1}, {4, 1}, {5, 1}, {6, 1},

{7, 1}, {8, 1}, {9, 1 }, {10, 1 }, {11, 1}, {12, 1}}

The fuzzy operation : Complement

EMPTY = Complement [UNIVERSALSPACE]

• Union

Let A and B be fuzzy sets defined in the space X.

The union is defined as the smallest fuzzy set that contains both A and
B. The union of A and B is denoted by A ∪ B.

The following relation must be satisfied for the union operation :

for all x in the set X, (A ∪ B)(x) = Max (A(x), B(x)).

Fuzzy Union : (A ∪ B)(x) = max [A(x), B(x)] for all x ∈ X

Example 1 : Union of Fuzzy A and B

A(x) = 0.6 and B(x) = 0.4 ∴ (A ∪ B)(x) = max [0.6, 0.4] = 0.6

Example 2 : Union of SMALL and MEDIUM

The fuzzy set A SMALL

SMALL = FuzzySet {{1, 1 }, {2, 1 }, {3, 0.9}, {4, 0.6}, {5, 0.4}, {6, 0.3},

{7, 0.2}, {8, 0.1}, {9, 0 }, {10, 0 }, {11, 0}, {12, 0}}

The fuzzy set B MEDIUM

MEDIUM = FuzzySet {{1, 0 }, {2, 0 }, {3, 0}, {4, 0.2}, {5, 0.5}, {6, 0.8},

{7, 1}, {8, 1}, {9, 0.7 }, {10, 0.4 }, {11, 0.1}, {12, 0}}

The fuzzy operation : Union

FUZZYUNION = [SMALL ∪ MEDIUM]

SetSmallUNIONMedium = FuzzySet [{{1,1}, {2,1}, {3,0.9}, {4,0.6}, {5,0.5},
{6,0.8}, {7,1}, {8,1}, {9,0.7}, {10,0.4}, {11,0.1}, {12,0}} , UniversalSpace → {1, 12, 1}]
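The max-based union can be sketched in Python. Names are illustrative, and zero grades are omitted from each dictionary for brevity.

```python
def fuzzy_union(A, B):
    """(A ∪ B)(x) = max(A(x), B(x)) for every x in either support."""
    return {x: max(A.get(x, 0.0), B.get(x, 0.0)) for x in set(A) | set(B)}

SMALL  = {1: 1.0, 2: 1.0, 3: 0.9, 4: 0.6, 5: 0.4, 6: 0.3, 7: 0.2, 8: 0.1}
MEDIUM = {4: 0.2, 5: 0.5, 6: 0.8, 7: 1.0, 8: 1.0, 9: 0.7, 10: 0.4, 11: 0.1}

U = fuzzy_union(SMALL, MEDIUM)
print(U[5], U[6])   # 0.5 0.8
```

At x = 5 the union takes MEDIUM's larger grade (0.5 > 0.4), reproducing the SetSmallUNIONMedium values above.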

The notion of the union is closely related to that of the connective "or".

Let A be a class of "Young" men and B a class of "Bald" men.

If "David is Young" or "David is Bald," then David is associated with the
union of A and B; this implies David is a member of A ∪ B.

• Properties Related to Union

The properties related to union are :

Identity, Idempotence, Commutativity and Associativity.


■ Identity :

A ∪ Φ = A

input = Equality [SMALL ∪ EMPTY , SMALL]
output = True

A ∪ X = X

input = Equality [SMALL ∪ UniversalSpace , UniversalSpace]
output = True

■ Idempotence :

A ∪ A = A

input = Equality [SMALL ∪ SMALL , SMALL]
output = True

■ Commutativity :

A ∪ B = B ∪ A

input = Equality [SMALL ∪ MEDIUM , MEDIUM ∪ SMALL]
output = True

■ Associativity :

A ∪ (B ∪ C) = (A ∪ B) ∪ C

input = Equality [SMALL ∪ (MEDIUM ∪ BIG) , (SMALL ∪ MEDIUM) ∪ BIG]
output = True

SMALL = FuzzySet {{1, 1 }, {2, 1 }, {3, 0.9}, {4, 0.6}, {5, 0.4}, {6, 0.3},
{7, 0.2}, {8, 0.1}, {9, 0.7 }, {10, 0.4 }, {11, 0}, {12, 0}}

MEDIUM = FuzzySet {{1, 0 }, {2, 0 }, {3, 0}, {4, 0.2}, {5, 0.5}, {6, 0.8},
{7, 1}, {8, 1}, {9, 0 }, {10, 0 }, {11, 0.1}, {12, 0}}

BIG = FuzzySet [{{1,0}, {2,0}, {3,0}, {4,0}, {5,0}, {6,0.1}, {7,0.2},
{8,0.4}, {9,0.6}, {10,0.8}, {11,1}, {12,1}}]

Medium ∪ BIG = FuzzySet [{{1,0}, {2,0}, {3,0}, {4,0.2}, {5,0.5}, {6,0.8},
{7,1}, {8,1}, {9,0.6}, {10,0.8}, {11,1}, {12,1}}]

Small ∪ Medium = FuzzySet [{{1,1}, {2,1}, {3,0.9}, {4,0.6}, {5,0.5},
{6,0.8}, {7,1}, {8,1}, {9,0.7}, {10,0.4}, {11,0.1}, {12,0}}]


• Intersection

Let A and B be fuzzy sets defined in the space X.

The intersection is defined as the greatest fuzzy set included in both A and
B. The intersection of A and B is denoted by A ∩ B.

The following relation must be satisfied for the intersection operation :

for all x in the set X, (A ∩ B)(x) = Min (A(x), B(x)).

Fuzzy Intersection : (A ∩ B)(x) = min [A(x), B(x)] for all x ∈ X

Example 1 : Intersection of Fuzzy A and B

A(x) = 0.6 and B(x) = 0.4 ∴ (A ∩ B)(x) = min [0.6, 0.4] = 0.4

Example 2 : Intersection of SMALL and MEDIUM

The fuzzy set A SMALL

SMALL = FuzzySet {{1, 1 }, {2, 1 }, {3, 0.9}, {4, 0.6}, {5, 0.4}, {6, 0.3},

{7, 0.2}, {8, 0.1}, {9, 0 }, {10, 0 }, {11, 0}, {12, 0}}

The fuzzy set B MEDIUM

MEDIUM = FuzzySet {{1, 0 }, {2, 0 }, {3, 0}, {4, 0.2}, {5, 0.5}, {6, 0.8},

{7, 1}, {8, 1}, {9, 0.7 }, {10, 0.4 }, {11, 0.1}, {12, 0}}

The fuzzy operation : Intersection

FUZZYINTERSECTION = [SMALL ∩ MEDIUM]

SetSmallINTERSECTIONMedium = FuzzySet [{{1,0}, {2,0}, {3,0}, {4,0.2},
{5,0.4}, {6,0.3}, {7,0.2}, {8,0.1}, {9,0}, {10,0}, {11,0}, {12,0}} , UniversalSpace →
{1, 12, 1}]
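The min-based intersection can be sketched the same way in Python (illustrative names, zero grades omitted from the dictionaries):

```python
def fuzzy_intersection(A, B):
    """(A ∩ B)(x) = min(A(x), B(x)) for every x in the universal space."""
    return {x: min(A.get(x, 0.0), B.get(x, 0.0)) for x in set(A) | set(B)}

SMALL  = {1: 1.0, 2: 1.0, 3: 0.9, 4: 0.6, 5: 0.4, 6: 0.3, 7: 0.2, 8: 0.1}
MEDIUM = {4: 0.2, 5: 0.5, 6: 0.8, 7: 1.0, 8: 1.0, 9: 0.7, 10: 0.4, 11: 0.1}

I = fuzzy_intersection(SMALL, MEDIUM)
print(I[4], I[5])   # 0.2 0.4
```

At each x the smaller of the two grades survives, reproducing the SetSmallINTERSECTIONMedium values above.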

Neural Computing

Neural Computers mimic certain processing capabilities of the human brain.

- Neural Computing is an information processing paradigm, inspired by
biological systems, composed of a large number of highly interconnected
processing elements (neurons) working in unison to solve specific problems.

- A neural net is an artificial representation of the human brain that tries to


simulate its learning process. The term "artificial" means that neural nets
are implemented in computer programs that are able to handle the large
number of necessary calculations during the learning process.

- Artificial Neural Networks (ANNs), like people, learn by example.

- An ANN is configured for a specific application, such as pattern recognition


or data classification, through a learning process.

- Learning in biological systems involves adjustments to the synaptic


connections that exist between the neurons. This is true for ANNs as well.

Biological Model :

The human brain consists of a large number (more than a billion) of neural
cells that process information. Each cell works like a simple processor. The
massive interaction between all cells and their parallel processing makes
the brain's abilities possible. The structure of a neuron is shown below.

Dendrites are the branching fibers
extending from the cell body or soma.

Soma, the cell body of a neuron, contains
the nucleus and other structures, and
supports chemical processing and the
production of neurotransmitters.

Axon is a single fiber that carries information
away from the soma to the synaptic sites
of other neurons (dendrites and somas),
muscles, or glands.

Fig. Structure of Neuron

Axon hillock is the site of summation
for incoming information. At any
moment, the collective influence of all
neurons that conduct impulses to a
given neuron will determine whether or
not an action potential will be initiated at
the axon hillock and propagated along the axon.

Myelin Sheath consists of fat-containing cells that insulate the axon from electrical
activity. This insulation acts to increase the rate of transmission of signals. A gap
exists between each myelin sheath cell along the axon. Since fat inhibits the
propagation of electricity, the signals jump from one gap to the next.

Nodes of Ranvier are the gaps (about 1 μm) between myelin sheath cells along the
axon. Since fat serves as a good insulator, the myelin sheaths speed the rate of
transmission of an electrical impulse along the axon.

Synapse is the point of connection between two neurons or a neuron and a muscle or
a gland. Electrochemical communication between neurons takes place at these
junctions.

Terminal Buttons of a neuron are the small knobs at the end of an axon that
release chemicals called neurotransmitters.

• Information flow in a Neural Cell

The input /output and the propagation of information are shown below.

Fig. Structure of a neural cell in the human brain

■ Dendrites receive activation from other neurons.

■ Soma processes the incoming activations and converts them into


output activations.

■ Axons act as transmission lines to send activation to other neurons.

■ Synapses are the junctions that allow signal transmission between the
axons and dendrites.

■ The process of transmission is by diffusion of chemicals called
neurotransmitters.

McCulloch and Pitts introduced a simplified model of these real neurons.

Artificial Neuron

• The McCulloch-Pitts Neuron

This is a simplified model of real neurons, known as a Threshold Logic Unit.

■ A set of synapses (ie connections) brings in activations from other


neurons.

■ A processing unit sums the inputs, and then applies a non-linear


activation function (i.e. transfer / threshold function).

■ An output line transmits the result to other neurons.

In other words, the input to a neuron arrives in the form of signals.

The signals build up in the cell. Finally the cell fires (discharges)

through the output. The cell can start building up signals again.

• Functions :

The function y = f(x) describes a relationship, an input-output mapping,


from x to y.

■ Threshold or Sign function sgn(x) : defined as

sgn(x) = 1 if x ≥ 0, and sgn(x) = 0 if x < 0

■ Sigmoid function sigmoid(x) : defined as a smoothed
(differentiable) form of the threshold function

sigmoid(x) = 1 / (1 + e^-x)
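Both functions can be sketched in Python. This is a minimal sketch; the convention that the threshold function fires (outputs 1) exactly at x = 0 is an assumption.

```python
import math

def sgn(x):
    """Threshold (sign) function: fires 1 when x >= 0, else 0."""
    return 1 if x >= 0 else 0

def sigmoid(x):
    """Smoothed, differentiable form of the threshold function."""
    return 1.0 / (1.0 + math.exp(-x))

print(sgn(-0.5), sgn(2.0))       # 0 1
print(round(sigmoid(0.0), 2))    # 0.5
```

The sigmoid approaches 0 for large negative x and 1 for large positive x, passing through 0.5 at x = 0.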

• McCulloch-Pitts (M-P) Neuron Equation

The figure below is the same simplified model of a real neuron shown
previously, as a Threshold Logic Unit.

Note : The McCulloch-Pitts neuron is an extremely simplified model of real


biological neurons. Some of its missing features include : non-binary input
and output, non-linear summation, smooth thresholding, stochastic
(non-deterministic) behavior, and temporal information processing.

• Basic Elements of an Artificial Neuron

It consists of three basic components - weights, thresholds, and a single


activation function.

Weighting Factors

The values W1 , W2 , . . . Wn are weights to determine the strength of input


row vector X = [x1 , x2 , . . . , xn]T. Each input is multiplied by the
associated weight of the neuron connection XT W. The +ve weight excites
and the -ve weight inhibits the node output.

Threshold

The node’s internal threshold Φ is the magnitude offset. It affects the
activation of the node output y as :

y = Σ (i = 1 to n) Xi Wi − Φk
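The threshold behavior can be sketched as a small McCulloch-Pitts unit in Python. The AND-gate weights and threshold are illustrative choices, not from the text.

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts unit: output 1 if the weighted sum reaches the threshold."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net >= threshold else 0

# An AND gate realized with an M-P neuron: both inputs must be active.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, mp_neuron([a, b], [1, 1], 2))
```

Only the input pair (1, 1) reaches the threshold of 2, so the unit behaves as a logical AND.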

Activation Function

An activation function performs a mathematical operation on the signal
output. The most common activation functions are the Linear Function,
Threshold Function, Piecewise Linear Function, Sigmoidal (S-shaped)
Function, and Tangent Hyperbolic Function; they are chosen depending
upon the type of problem to be solved by the network.

• Example :

With a binary activation function, the output of the neuron is :

y = 1 if the weighted sum of the inputs reaches the threshold, and y = 0 otherwise.

• Single and Multi - Layer Perceptrons

A perceptron is the name for a simulated neuron in a computer program.
The usual way to represent a neuron model is described below.

The neurons are shown as circles in the diagram. Each has several inputs and
a single output. The neurons have gone under various names.

- Each individual cell is called either a node or a perceptron.

- A neural network consisting of a layer of nodes or perceptrons between

the input and the output is called a single layer perceptron.

- A network consisting of several layers of single layer perceptron


stacked on top of other, between input and output , is called a

multi-layer perceptron
Fig. Single and Multi-Layer Perceptrons

Multi-layer perceptrons are more powerful than single-layer perceptrons.

• Perceptron

Any number of McCulloch-Pitts neurons can be connected together in any


way.

Definition : An arrangement of one input layer of McCulloch-Pitts neurons,


that is feeding forward to one output layer of McCulloch-Pitts neurons is
known as a Perceptron.

A Perceptron is a powerful computational device.

Genetic Algorithms

Genetic Algorithms (GAs) were invented by John Holland in the early 1970s to mimic
some of the processes observed in natural evolution.

Later in 1992 John Koza used GAs to evolve programs to perform certain tasks.
He called his method "Genetic Programming" (GP).

GAs simulate natural evolution, a combination of selection, recombination and


mutation to evolve a solution to a problem.

GAs simulate the survival of the fittest among individuals over consecutive
generations for solving a problem. Each generation consists of a population of
character strings that are analogous to the chromosomes in our DNA
(Deoxyribonucleic acid). DNA contains the genetic instructions used in the
development and functioning of all known living organisms.

What are Genetic Algorithms

■ Genetic Algorithms (GAs) are adaptive heuristic search algorithms
based on the evolutionary ideas of natural selection and genetics.

■ Genetic algorithms (GAs) are a part of evolutionary computing, a


rapidly growing area of artificial intelligence. GAs are inspired by
Darwin's theory about evolution - "survival of the fittest".

■ GAs represent an intelligent exploitation of a random search used to


solve optimization problems.
■ GAs, although randomized, exploit historical information to direct the

search into the region of better performance within the search space.
■ In nature, competition among individuals for scanty resources results
in the fittest individuals dominating over the weaker ones.

• Why Genetic Algorithms

"Genetic Algorithms are good at taking large, potentially huge search


spaces and navigating them, looking for optimal combinations of things,
solutions you might not otherwise find in a lifetime.” - Salvatore Mangano
Computer Design, May 1995.

- GA is better than conventional AI, in that it is more robust.

- Unlike older AI systems, GAs do not break easily even if the
inputs are changed slightly, or in the presence of reasonable noise.

- In searching a large state-space, multi-modal state-space, or
n-dimensional surface, a GA may offer significant benefits over more
typical search and optimization techniques, like linear programming,
heuristic search, depth-first, and breadth-first.

• Mechanics of Biological Evolution

Genetic Algorithms are a way of solving problems by mimicking the processes nature uses - Selection, Crossover, Mutation and Acceptance - to evolve a solution to a problem.

■ Every organism has a set of rules describing how that organism is built, encoded in the genes of the organism.

■ The genes are connected together into long strings called chromosomes.

42
■ Each gene represents a specific trait (feature) of the organism and has several different settings, e.g. the setting for a hair color gene may be black or brown.

■ The genes and their settings are referred to as an organism's genotype.

■ When two organisms mate they share their genes. The resultant offspring may end up having half the genes from one parent and half from the other parent. This process is called crossover (recombination).

■ The newly created offspring can then be mutated. A gene may be mutated and expressed in the organism as a completely new trait. Mutation means that elements of the DNA are slightly changed. This change is mainly caused by errors in copying genes from parents.

■ The fitness of an organism is measured by the success of the organism in its life.
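The selection, crossover, mutation and acceptance steps above can be sketched as a minimal genetic algorithm. This is an illustrative sketch, not from the original text: the fitness function (counting 1-bits, the classic "OneMax" problem) and all parameter values are assumptions.

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

def fitness(chromosome):
    # Illustrative fitness: number of 1-bits in the string ("OneMax").
    return sum(chromosome)

def select(population):
    # Tournament selection: the fitter of two random individuals survives.
    a, b = random.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2):
    # Single-point crossover: offspring takes part of its genes from each parent.
    point = random.randrange(1, len(p1))
    return p1[:point] + p2[point:]

def mutate(chromosome, rate=0.01):
    # Flip each bit with a small probability, mimicking copying errors.
    return [1 - g if random.random() < rate else g for g in chromosome]

def evolve(pop_size=20, length=16, generations=50):
    population = [[random.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population = [mutate(crossover(select(population), select(population)))
                      for _ in range(pop_size)]
    return max(population, key=fitness)

best = evolve()
print(fitness(best))  # typically close to the maximum of 16
```

The fittest individuals dominate successive generations, so the population drifts towards high-fitness bit strings even though each step is randomized.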

43
Artificial Evolution and Search Optimization

The problem of finding solutions to problems is itself a problem with no general solution. Solving problems usually means looking for solutions which will be the best among others.

■ In engineering and mathematics, finding the solution to a problem is often thought of as a process of optimization.

■ Here the process is : first formulate the problem as a mathematical model expressed in terms of functions; then, to find a solution, discover the parameters that optimize the model or the function components that provide optimal system performance.

The well-established search / optimization techniques are usually classified into three broad categories : Enumerative, Calculus-based, and Guided random search techniques. A taxonomy of Evolution & Search Optimization classes is illustrated in the next slide.

44
• Taxonomy of Evolution & Search Optimization Classes

■ Enumerative Methods

These are the traditional search and control strategies. They search for a solution in a problem space within the domain of artificial intelligence. There are many control structures for search; depth-first search and breadth-first search are the two most common search strategies. Here the search goes through every point related to the function's domain space (finite or discretized), one point at a time. These methods are very simple to implement but usually require significant computation, and they are not suitable for applications with large domain spaces.

In the field of AI, enumerative methods are subdivided into two categories : uninformed and informed methods.

45

◊ Uninformed or blind methods : such as the minimax algorithm, search all points in the space in a predefined order; this is used in game playing.

◊ Informed methods : such as Alpha-Beta and A*, perform a more sophisticated search using domain-specific knowledge in the form of a cost function or heuristic in order to reduce the cost of the search.

46
■ Calculus based techniques

Here a set of necessary and sufficient conditions must be satisfied by the solutions of an optimization problem. These techniques subdivide into direct and indirect methods.

◊ Direct or Numerical methods, such as Newton or Fibonacci, seek extremes by "hopping" around the search space and assessing the gradient of the new point, which guides the search. This is simply the notion of "hill climbing", which finds the best local point by climbing the steepest permissible gradient. These techniques can be used only on a restricted set of "well behaved" functions.

◊ Indirect methods search for local extremes by solving the usually non-linear set of equations resulting from setting the gradient of the objective function to zero. The search for possible solutions (function peaks) starts by restricting itself to points with zero slope in all directions.

47
■ Guided Random Search techniques

These are based on enumerative techniques but use additional information to guide the search. Two major subclasses are simulated annealing and evolutionary algorithms. Both are evolutionary processes.

◊ Simulated annealing uses a thermodynamic evolution process to search for minimum energy states.

◊ Evolutionary algorithms (EAs) use natural selection principles. This form of search evolves throughout generations, improving the features of potential solutions by means of biologically inspired operations. Genetic Algorithms (GAs) are a good example of this technique.

Our main concern is how an Evolutionary algorithm :

- implements and carries out search,

- describes the process of search,

- what elements are required to carry out search, and

- what the different search strategies are.

48
Evolutionary Algorithms (EAs)

Evolutionary algorithms are search methods. They take inspiration from natural selection and survival of the fittest in the biological world, and therefore differ from traditional search and optimization techniques. EAs involve search from a "population" of solutions, not from a single point. Each iteration of an EA involves a competitive selection that weeds out poor solutions. The solutions with high "fitness" are "recombined" with other solutions by swapping parts of one solution with another. Solutions are also "mutated" by making a small change to a single element of the solution. Recombination and mutation are used to generate new solutions that are biased towards regions of the space for which good solutions have already been seen.

Evolutionary search algorithm (issues related to search) :

In the search space, each point represents one feasible solution. Each feasible solution is marked by its value or fitness for the problem. The issues related to search are :

- Searching for a solution point means finding which one point (or more) among the many feasible solution points in the search space is the solution. This requires looking for some extremes, a minimum or a maximum.

- The whole search space may be known, but usually we know only a few points and we generate other points as the process of finding the solution continues.

49

- Search can be very complicated. One does not know where to look for the solution or where to start.

- What we find is some suitable solution, not necessarily the best solution. The solution found is often considered a good solution, because it is often not possible to prove what the real optimum solution is.
Associative Memory

An associative memory is a content-addressable structure that maps a set of input patterns to a set of output patterns. Associative memories are of two types : auto-associative and hetero-associative.

- An auto-associative memory retrieves a previously stored pattern that most closely resembles the current pattern.

- In a hetero-associative memory, the retrieved pattern is, in general, different from the input pattern not only in content but possibly also in type and format.

• Example : Associative Memory

The figure below shows a memory containing names of several people.

If the given memory is content-addressable, then using the erroneous string "Crhistpher Columbos" as key is sufficient to retrieve the correct name "Christopher Columbus".

In this sense, this type of memory is robust and fault-tolerant, because it exhibits some form of error-correction capability.
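The error-correcting recall in this example can be sketched with a simple nearest-match lookup; the positional character-overlap similarity used here is an illustrative stand-in for the pattern matching a real associative memory performs.

```python
def similarity(a, b):
    # Count positions where the two strings agree (padded to equal length).
    n = max(len(a), len(b))
    a, b = a.ljust(n), b.ljust(n)
    return sum(x == y for x, y in zip(a, b))

def recall(memory, key):
    # Content-addressable recall: return the stored pattern most similar
    # to the (possibly erroneous) key, not the pattern at some address.
    return max(memory, key=lambda stored: similarity(stored, key))

names = ["Christopher Columbus", "Albert Einstein", "Isaac Newton"]
print(recall(names, "Crhistpher Columbos"))  # Christopher Columbus
```

Even though the key contains spelling errors, the stored pattern with the highest similarity wins, which is exactly the error-correction behavior described above.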

50
Description of Associative Memory

An associative memory is a content-addressable structure that maps specific input representations to specific output representations.

A content-addressable memory is a type of memory that allows the recall of data based on the degree of similarity between the input pattern and the patterns stored in memory.

It refers to a memory organization in which the memory is accessed by its content, as opposed to an explicit address as in the traditional computer memory system.

This type of memory allows the recall of information based on partial knowledge of its contents.

51
■ It is a system that "associates" two patterns (X, Y) such that when one is encountered, the other can be recalled.

- Let X and Y be two vectors of length m and n respectively.

- Typically, X ∈ {-1, +1}^m and Y ∈ {-1, +1}^n.

- The components of the vectors can be thought of as pixels when the two patterns are considered as bitmap images.

■ There are two classes of associative memory :

- auto-associative and

- hetero-associative.

An auto-associative memory is used to retrieve a previously stored pattern that most closely resembles the current pattern.

In a hetero-associative memory, the retrieved pattern is, in general, different from the input pattern not only in content but possibly also in type and format.

■ Artificial neural networks can be used as associative memories. The simplest artificial neural associative memory is the linear associator. Other popular ANN models used as associative memories are the Hopfield model and the Bidirectional Associative Memory (BAM) model.

52
Adaptive Resonance Theory (ART)

ART stands for "Adaptive Resonance Theory", invented by Stephen Grossberg in 1976. ART encompasses a wide variety of neural networks based explicitly on neurophysiology. "Resonance" here is just a matter of being within a certain threshold of a second similarity measure.

The basic ART system is an unsupervised learning model, similar to many iterative clustering algorithms, where each case is processed by finding the "nearest" cluster seed that resonates with the case, and updating that cluster seed to be "closer" to the case. If no seed resonates with the case, then a new cluster is created.

Note : The terms nearest and closer are defined in many ways in clustering algorithms. In ART, these two terms are defined in a slightly different way by introducing the concept of "resonance".

53
• Definitions of ART and other types of Learning

ART is a neural network topology whose dynamics are based on Adaptive Resonance Theory (ART). Grossberg developed ART as a theory of human cognitive information processing. The emphasis of ART neural networks lies on unsupervised learning and self-organization to mimic biological behavior. Self-organization means that the system must be able to build stable recognition categories in real-time.

Unsupervised learning means that the network learns the significant patterns on the basis of the inputs only. There is no feedback. There is no external teacher that instructs the network or tells to which category a certain input belongs. Learning in biological systems always starts as unsupervised learning; for example, for the newly born, hardly any pre-existing categories exist.

The other two types of learning are reinforcement learning and supervised learning. In reinforcement learning the net receives only limited feedback, like "on this input you performed well" or "on this input you have made an error". In the supervised mode of learning a net receives for each input the correct response.

Note : A system that can learn in unsupervised mode can always be adjusted to learn in the other modes, like reinforcement mode or supervised mode. But a system specifically designed to learn in supervised mode can never perform in unsupervised mode.

54
• Description of Adaptive Resonance Theory

The basic ART system is an unsupervised learning model.

The model typically consists of :

− a comparison field and a recognition field composed of neurons,

− a vigilance parameter, and

− a reset module.

The functions of each of these constituents are explained below.

■ Comparison field and Recognition field

- The Comparison field takes an input vector (a 1-D array of values) and transfers it to its best match in the Recognition field; the best match is the single neuron whose set of weights (weight vector) matches the input vector most closely.

- Each Recognition field neuron outputs a negative signal (proportional to that neuron's quality of match to the input vector) to each of the other Recognition field neurons and inhibits their output accordingly.

- The Recognition field thus exhibits lateral inhibition, allowing each neuron in it to represent a category to which input vectors are classified.

■ Vigilance parameter

It has considerable influence on the system memories:

- higher vigilance produces highly detailed memories,

- lower vigilance results in more general memories

55
■ Reset module

After the input vector is classified, the Reset module compares the strength of the recognition match with the vigilance parameter.

- If the vigilance threshold is met, then training commences.

- Else, the firing recognition neuron is inhibited until a new input vector is applied.

• Training ART-based Neural Networks

Training commences only upon completion of a search procedure. What happens in this search procedure :

- The Recognition neurons are disabled one by one by the reset function until the vigilance parameter is satisfied by a recognition match.

- If no committed recognition neuron's match meets the vigilance threshold, then an uncommitted neuron is committed and adjusted towards matching the input vector.

Methods of training ART-based Neural Networks :

There are two basic methods, slow and fast learning.

- Slow learning method : here the degree of training of the recognition neuron's weights towards the input vector is calculated using differential equations, and is thus dependent on the length of time the input vector is presented.

- Fast learning method : here algebraic equations are used to calculate the degree of weight adjustments to be made, and binary values are used.

56

Note : While fast learning is effective and efficient for a variety of tasks, the slow learning method is more biologically plausible and can be used with continuous-time networks (i.e. when the input vector can vary continuously).
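The search, vigilance test and fast-learning update described above can be sketched for binary patterns in the style of ART-1. This is an illustrative sketch, not a full ART implementation: the match ratio, the vigilance value of 0.6, and the sample patterns are all assumptions.

```python
def match(p, proto):
    # Fraction of the input's 1-bits that are also present in the prototype.
    common = sum(a & b for a, b in zip(p, proto))
    ones = sum(p)
    return common / ones if ones else 1.0

def art1(patterns, vigilance=0.6):
    prototypes, labels = [], []
    for p in patterns:
        # Search: try categories in order of match quality; the reset
        # step rejects any whose match falls below the vigilance.
        order = sorted(range(len(prototypes)),
                       key=lambda j: match(p, prototypes[j]), reverse=True)
        for j in order:
            if match(p, prototypes[j]) >= vigilance:
                # Fast learning: prototype becomes the AND of itself and p.
                prototypes[j] = [a & b for a, b in zip(prototypes[j], p)]
                labels.append(j)
                break
        else:
            # No committed category resonates: commit a new (uncommitted) one.
            prototypes.append(list(p))
            labels.append(len(prototypes) - 1)
    return labels, prototypes

labels, protos = art1([[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1]])
print(labels)  # [0, 0, 1] — the first two patterns resonate with one category
```

Raising the vigilance towards 1.0 makes resonance harder to achieve, so more categories are committed (highly detailed memories); lowering it merges inputs into fewer, more general categories, as described above.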

• Types of ART Systems :

The ART Systems have many variations : ART 1, ART 2, Fuzzy ART, ARTMAP.

■ ART 1 : The simplest variety of ART networks, accepting only binary inputs.

■ ART 2 : It extends network capabilities to support continuous inputs.

■ Fuzzy ART : It implements fuzzy logic into ART's pattern recognition, thus enhancing the generalizing ability. One very useful feature of fuzzy ART is complement coding, a means of incorporating the absence of features into pattern classifications, which goes a long way towards preventing inefficient and unnecessary category proliferation.

■ ARTMAP : Also known as Predictive ART, it combines two slightly modified ART units, which may be two ART-1 or two ART-2 units, into a supervised learning structure where the first unit takes the input data and the second unit takes the correct output data; these are then used to make the minimum possible adjustment of the vigilance parameter in the first unit in order to make the correct classification.

57
Applications of Soft Computing

The applications of Soft Computing have proved two main advantages.

- First, in solving nonlinear problems where mathematical models are not available or not possible.

- Second, introducing human knowledge such as cognition, recognition, understanding, learning, and others into the fields of computing.

This has resulted in the possibility of constructing intelligent systems such as autonomous self-tuning systems and automated design systems.

The relevance of soft computing for pattern recognition and image processing has been established during the last few years. The subject has recently gained importance because of its potential applications in problems like :

- Remotely Sensed Data Analysis,

- Data Mining, Web Mining,

- Global Positioning Systems,

- Medical Imaging,

- Forensic Applications,

- Optical Character Recognition,

- Signature Verification,

- Multimedia,

- Target Recognition,

- Face Recognition and

- Man Machine Communication.

58
Fundamentals of Neural Networks

What is Neural Net ?

• A neural net is an artificial representation of the human brain that tries to simulate its learning process. An artificial neural network (ANN) is often called a "Neural Network" or simply a Neural Net (NN).

• Traditionally, the term neural network refers to a network of biological neurons in the nervous system that process and transmit information.

• An artificial neural network is an interconnected group of artificial neurons that uses a mathematical or computational model for information processing based on a connectionist approach to computation.

• Artificial neural networks are made of interconnected artificial neurons which may share some properties of biological neural networks.

• An artificial neural network is a network of simple processing elements (neurons) which can exhibit complex global behavior, determined by the connections between the processing elements and the element parameters.

59
Introduction

Neural Computers mimic certain processing capabilities of the human brain.

- Neural Computing is an information processing paradigm, inspired by biological systems, composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems.

- Artificial Neural Networks (ANNs), like people, learn by example.

- An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process.

- Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. This is true of ANNs as well.

60
Why Neural Network

Neural Networks follow a different paradigm for computing.

■ Conventional computers are good for fast arithmetic and doing what the programmer programs them to do.

■ Conventional computers are not so good at interacting with noisy data or data from the environment, massive parallelism, fault tolerance, and adapting to circumstances.

■ Neural network systems help where we cannot formulate an algorithmic solution, or where we can get lots of examples of the behavior we require.

■ Neural Networks follow a different paradigm for computing :

The von Neumann machines are based on the processing/memory abstraction of human information processing.

The neural networks are based on the parallel architecture of biological brains.

■ Neural networks are a form of multiprocessor computer system, with

- simple processing elements,

- a high degree of interconnection,

- simple scalar messages, and

- adaptive interaction between elements.
61
Research History

The history is relevant because for nearly two decades the future of neural networks remained uncertain.

McCulloch and Pitts (1943) are generally recognized as the designers of the
first neural network. They combined many simple processing units together
that could lead to an overall increase in computational power. They suggested
many ideas like : a neuron has a threshold level and once that level is
reached the neuron fires. It is still the fundamental way in which ANNs
operate. The McCulloch and Pitts's network had a fixed set of weights.

Hebb (1949) developed the first learning rule : if two neurons are active at the same time, then the strength of the connection between them should be increased.

In the 1950s and 60's, many researchers (Block, Minsky, Papert, and Rosenblatt) worked on the perceptron. The neural network model could be proved to converge to the correct weights that will solve the problem. The weight adjustment (learning algorithm) used in the perceptron was found more powerful than the learning rules used by Hebb. The perceptron caused great excitement. It was thought to produce programs that could think.

Minsky & Papert (1969) showed that perceptron could not learn those
functions which are not linearly separable.

The neural networks research declined throughout the 1970 and until mid
80's because the perceptron could not learn certain important functions.

62
Neural networks regained importance in 1985-86. The researchers Parker and LeCun independently discovered a learning algorithm for multi-layer networks, called back propagation, that could solve problems that were not linearly separable.

63
Biological Neuron Model

The human brain consists of a very large number, on the order of a hundred billion, of neural cells that process information. Each cell works like a simple processor. Only the massive interaction between all cells and their parallel processing makes the brain's abilities possible.

Dendrites are branching fibers that extend from the cell body or soma.

Soma or cell body of a neuron contains the nucleus and other structures, and supports chemical processing and production of neurotransmitters.

Axon is a singular fiber that carries information away from the soma to the synaptic sites of other neurons (dendrites and somas), muscles, or glands.

Axon hillock is the site of summation for incoming information. At any moment, the collective influence of all neurons that conduct impulses to a given neuron will determine whether or not an action potential will be initiated at the axon hillock and propagated along the axon.

Fig. Structure of Neuron

Myelin Sheath consists of fat-containing cells that insulate the axon from electrical activity. This insulation acts to increase the rate of transmission of signals. A gap exists between each myelin sheath cell along the axon. Since fat inhibits the propagation of electricity, the signals jump from one gap to the next.

64
Nodes of Ranvier are the gaps (about 1 m) between myelin sheath cells long
axons are Since fat serves as a good insulator, the myelin sheaths speed the rate of
transmission of an electrical impulse along the axon.

Synapse is the point of connection between two neurons or a neuron and a muscle or
a gland. Electrochemical communication between neurons takes place at these
junctions.

Terminal Buttons of a neuron are the small knobs at the end of an axon that
release chemicals called neurotransmitters.

• Information flow in a Neural Cell

The input /output and the propagation of information are shown below.

Fig. Structure of a neural cell in the human brain

■ Dendrites receive activation from other neurons.

■ Soma processes the incoming activations and converts them into output activations.

65

■ Axons act as transmission lines to send activation to other neurons.

■ Synapses are the junctions that allow signal transmission between the axons and dendrites.

■ The process of transmission is by diffusion of chemicals called neurotransmitters.

McCulloch and Pitts introduced a simplified model of these real neurons.

66
SC - Neural Network – Introduction

1.4 Artificial Neuron Model

An artificial neuron is a mathematical function conceived as a simple model of a real (biological) neuron.

• The McCulloch-Pitts Neuron

This is a simplified model of real neurons, known as a Threshold Logic Unit.

- A set of input connections brings in activations from other neurons.

- A processing unit sums the inputs, and then applies a non-linear activation function (i.e. a squashing / transfer / threshold function).

- An output line transmits the result to other neurons.

In other words,

- The input to a neuron arrives in the form of signals.

- The signals build up in the cell.

- Finally the cell discharges (the cell fires) through the output.

- The cell can start building up signals again.
67
Single Layer Feed-forward Network

The Single Layer Feed-forward Network consists of a single layer of weights, where the inputs are directly connected to the outputs via a series of weights. The synaptic links carrying weights connect every input to every output, but not the other way; this is why it is considered a network of feed-forward type. The sum of the products of the weights and the inputs is calculated in each neuron node, and if the value is above some threshold (typically 0) the neuron fires and takes the activated value (typically 1); otherwise it takes the deactivated value (typically -1).

Fig. Single Layer Feed-forward Network : inputs x1 … xn are connected to output neurons y1 … ym through the weights wij.

68
Multi Layer Feed-forward Network

As the name suggests, it consists of multiple layers. The architecture of this class of network, besides having the input and the output layers, also has one or more intermediary layers called hidden layers. The computational units of the hidden layer are known as hidden neurons.

Fig. Multilayer feed-forward network in (ℓ – m – n) configuration.

■ The hidden layer does intermediate computation before directing the input to the output layer.

■ The input layer neurons are linked to the hidden layer neurons; the weights on these links are referred to as input-hidden layer weights.

■ The hidden layer neurons are linked to the output layer neurons; the weights on these links are referred to as hidden-output layer weights.

69

■ A multi-layer feed-forward network with ℓ input neurons, m1 neurons in the first hidden layer, m2 neurons in the second hidden layer, and n output neurons in the output layer is written as (ℓ - m1 - m2 - n).

The Fig. above illustrates a multilayer feed-forward network with a configuration (ℓ - m – n).
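A forward pass through such a multilayer feed-forward network can be sketched as follows; the sigmoid activation and the hand-picked (2 - 2 - 1) weights are illustrative assumptions.

```python
import math

def forward(x, layers):
    # Forward pass through an (l - m - n) network: each layer is a list of
    # (weights, bias) pairs, one per neuron; sigmoid activation throughout.
    for layer in layers:
        x = [1 / (1 + math.exp(-(sum(w * xi for w, xi in zip(ws, x)) + b)))
             for ws, b in layer]
    return x

# Illustrative (2 - 2 - 1) configuration with hand-picked weights:
# two input values, two hidden neurons, one output neuron.
hidden = [([1.0, -1.0], 0.0), ([-1.0, 1.0], 0.0)]
output = [([1.0, 1.0], -1.0)]

y = forward([0.5, 0.25], [hidden, output])
print(y)  # a single activation between 0 and 1
```

Each layer's outputs become the next layer's inputs, which is the intermediate computation the hidden layer performs before directing the signal to the output layer.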

Recurrent Networks

The Recurrent Networks differ from the feed-forward architecture. A Recurrent network has at least one feedback loop.

Example :

There could be neurons with self-feedback links; that is, the output of a neuron is fed back into itself as input.

70
Learning Methods in Neural Networks

The learning methods in neural networks are classified into three basic types :

• Supervised Learning,
• Unsupervised Learning and
• Reinforced Learning

These three types are classified based on :

• presence or absence of a teacher and

• the information provided for the system to learn.

These are further categorized, based on the rules used, as

• Hebbian,

• Gradient descent,

• Competitive and

• Stochastic learning.

71
• Classification of Learning Algorithms

The figure below indicates the hierarchical representation of the algorithms mentioned in the previous slide. These algorithms are explained in subsequent slides.

Fig. Neural Network Learning algorithms (hierarchy)

72
■ Supervised Learning

A teacher is present during the learning process and presents the expected output.

Every input pattern is used to train the network.

The learning process is based on comparison between the network's computed output and the correct expected output, generating "error".

The "error" generated is used to change the network parameters, which results in improved performance.

■ Unsupervised Learning

No teacher is present.

The expected or desired output is not presented to the network.

The system learns on its own by discovering and adapting to the structural features in the input patterns.

■ Reinforced Learning

A teacher is present, but does not present the expected or desired output; it only indicates whether the computed output is correct or incorrect.

The information provided helps the network in its learning process.

A reward is given for a correct answer computed and a penalty for a wrong answer.

Note : The Supervised and Unsupervised learning methods are the most popular forms of learning compared to Reinforced learning.

73
• Hebbian Learning

Hebb proposed a rule based on correlative weight adjustment.

In this rule, the input-output pattern pairs (Xi , Yi) are associated by the weight matrix W, known as the correlation matrix, computed as

W = Σ_{i=1}^{n} Xi Yi^T

where Yi^T is the transpose of the associated output vector Yi.

There are many variations of this rule proposed by other researchers (Kosko, Anderson, Lippman).
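The correlation-matrix rule can be sketched in plain Python; the two bipolar pattern pairs are illustrative assumptions, and recall is done by thresholding x^T W, as in a simple hetero-associative (BAM-style) memory.

```python
def outer(x, y):
    # Outer product x y^T as a nested list.
    return [[xi * yj for yj in y] for xi in x]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def hebbian(pairs):
    # Correlation matrix W = sum_i X_i Y_i^T (Hebb's rule).
    m, n = len(pairs[0][0]), len(pairs[0][1])
    W = [[0] * n for _ in range(m)]
    for x, y in pairs:
        W = mat_add(W, outer(x, y))
    return W

def recall(W, x):
    # Hetero-associative recall: threshold each component of x^T W.
    net = [sum(xi * W[i][j] for i, xi in enumerate(x))
           for j in range(len(W[0]))]
    return [1 if v >= 0 else -1 for v in net]

# Two illustrative bipolar pattern pairs (X of length 3, Y of length 2).
pairs = [([1, -1, 1], [1, -1]), ([-1, 1, 1], [-1, 1])]
W = hebbian(pairs)
print(recall(W, [1, -1, 1]))  # [1, -1] — the associated output pattern
```

Presenting either stored X pattern recovers its associated Y pattern, because the correlation matrix accumulates each pairing as a weighted outer product.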

74
• Gradient descent Learning

This is based on the minimization of the error E defined in terms of the weights and the activation function of the network.

- Here, the activation function of the network is required to be differentiable, because the weight update depends on the gradient of the error E.

- If ∆Wij is the weight update of the link connecting the i th and the j th neuron of the two neighboring layers, then ∆Wij is defined as

∆Wij = − η (∂E / ∂Wij)

where η is the learning rate parameter and (∂E / ∂Wij) is the error gradient with reference to the weight Wij.

Note : The Widrow-Hoff Delta rule and the Back-propagation learning rule are examples of Gradient descent learning.
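The update rule can be sketched for a single linear neuron (the Delta / LMS rule, a one-layer case of gradient descent learning); the target function, learning rate and epoch count are illustrative assumptions.

```python
# Gradient descent for one linear neuron:
#   E = 0.5 * (d - y)^2,  y = sum_i w_i x_i,
# so dE/dw_i = -(d - y) * x_i, and the update
#   w_i <- w_i - eta * dE/dw_i = w_i + eta * (d - y) * x_i.

def train(samples, eta=0.1, epochs=100):
    w = [0.0, 0.0]
    for _ in range(epochs):
        for x, d in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))
            err = d - y
            w = [wi + eta * err * xi for wi, xi in zip(w, x)]
    return w

# Illustrative target function: d = 2*x1 - x2.
samples = [((1, 0), 2), ((0, 1), -1), ((1, 1), 1)]
w = train(samples)
print([round(wi, 2) for wi in w])  # approximately [2.0, -1.0]
```

Because the squared error is differentiable in the weights, each step moves the weights a small amount against the error gradient, which is exactly the requirement stated above.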

75
• Competitive Learning

In this method, those neurons which respond strongly to the input stimuli have their weights updated.

When an input pattern is presented, all neurons in the layer compete, and the winning neuron undergoes weight adjustment.

This strategy is called "winner-takes-all".
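The winner-takes-all strategy can be sketched as follows; the use of Euclidean distance to pick the winner, the learning rate, and the sample inputs are illustrative assumptions.

```python
def winner_take_all(weights, x):
    # The winning neuron is the one whose weight vector is closest to x.
    def dist2(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda j: dist2(weights[j]))

def update(weights, x, eta=0.5):
    # Only the winner's weights are moved towards the input; all the
    # losing neurons keep their weights unchanged ("winner-takes-all").
    j = winner_take_all(weights, x)
    weights[j] = [wi + eta * (xi - wi) for wi, xi in zip(weights[j], x)]
    return j

# Two illustrative prototype vectors and inputs drawn from two clusters.
weights = [[0.0, 0.0], [1.0, 1.0]]
for x in [[0.1, 0.0], [0.9, 1.0], [0.0, 0.2], [1.0, 0.8]]:
    update(weights, x)
print(weights)
```

After a few presentations each weight vector drifts towards the cluster of inputs it keeps winning, which is how competitive layers come to represent categories.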

• Stochastic Learning

In this method the weights are adjusted in a probabilistic fashion.

- Example : Simulated annealing, which is a learning mechanism employed by Boltzmann and Cauchy machines.

76
• Taxonomy Of Neural Network Systems

In the previous sections, the Neural Network Architectures and the Learning methods have been discussed. Here the popular neural network systems are listed. The grouping of these systems in terms of architectures and the learning methods is presented in the next slide.

• Neural Network Systems

– ADALINE (Adaptive Linear Neural Element)

– ART (Adaptive Resonance Theory)

– AM (Associative Memory)

– BAM (Bidirectional Associative Memory)

– Boltzmann machines

– BSB ( Brain-State-in-a-Box)

– Cauchy machines

– Hopfield Network

– LVQ (Learning Vector Quantization)

– Neocognitron

77
– Perceptron

– RBF ( Radial Basis Function)

– RNN (Recurrent Neural Network)

– SOFM (Self-organizing Feature Map)

78
• Classification of Neural Network

A taxonomy of neural network systems based on Architectural types and the Learning methods is illustrated below.

                    Learning Methods

                    Gradient descent       Hebbian          Competitive    Stochastic

Single-layer        ADALINE,               AM,              LVQ,           -
feed-forward        Hopfield,              Hopfield         SOFM
                    Perceptron

Multi-layer         CCM,                   Neocognitron     -              -
feed-forward        MLFF,
                    RBF

Recurrent           RNN                    BAM,             ART            Boltzmann and
networks                                   BSB,                            Cauchy
                                           Hopfield                        machines

Table : Classification of Neural Network Systems with respect to learning methods and Architecture types

79
■ Single-Layer NN Systems

Here, a simple Perceptron Model and an ADALINE Network Model are presented.

6.1 Single layer Perceptron

Definition : An arrangement of one input layer of neurons feeding forward to one output layer of neurons is known as a Single Layer Perceptron.

Fig. Simple Perceptron Model : inputs x1 … xn are connected to output neurons y1 … ym through the weights wij.

y j = f (net j) =  1  if net j ≥ 0
                   0  if net j < 0

where  net j = Σ_{i=1}^{n} xi wij
80
■ Learning Algorithm : Training Perceptron

The training of Perceptron is a supervised learning algorithm where weights are adjusted to minimize error whenever the output does not match the desired output.

− If the output is correct, then no adjustment of weights is done.

   i.e.  Wij^(K+1) = Wij^K

− If the output is 1 but should have been 0, then the weights are decreased on the active input links

   i.e.  Wij^(K+1) = Wij^K − α · xi

− If the output is 0 but should have been 1, then the weights are increased on the active input links

   i.e.  Wij^(K+1) = Wij^K + α · xi

where Wij^(K+1) is the new adjusted weight, Wij^K is the old weight, xi is the input, and α is the learning rate parameter.
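The three update rules above can be sketched directly in code; the training data (logical AND with a constant bias input) and the learning rate are illustrative assumptions.

```python
def predict(w, x):
    # Perceptron output: 1 if the weighted sum reaches the threshold 0.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0

def train(samples, alpha=0.1, epochs=20):
    # The three rules: no change when correct; subtract alpha*x when the
    # output is 1 but should be 0; add alpha*x when it is 0 but should be 1.
    w = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for x, d in samples:
            y = predict(w, x)
            if y == 1 and d == 0:
                w = [wi - alpha * xi for wi, xi in zip(w, x)]
            elif y == 0 and d == 1:
                w = [wi + alpha * xi for wi, xi in zip(w, x)]
    return w

# Learn logical AND; the last input component is a constant bias of 1.
samples = [((0, 0, 1), 0), ((0, 1, 1), 0), ((1, 0, 1), 0), ((1, 1, 1), 1)]
w = train(samples)
print([predict(w, x) for x, _ in samples])  # [0, 0, 0, 1]
```

Because AND is linearly separable, the perceptron convergence theorem guarantees the updates stop after a finite number of mistakes; for functions that are not linearly separable (such as XOR), this procedure never converges, which is exactly the limitation Minsky and Papert identified.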

81
