
CHAPTER-1

INTRODUCTION:
1.1 BACKGROUND:
Among the many applications of adaptive filtering, direct modeling and inverse modeling are especially important. Direct modeling, or system identification, finds applications in control system engineering including robotics [1], intelligent sensor design [2], process control [3], power system engineering [4], image and speech processing [4], geophysics [5], acoustic noise and vibration control [6] and biomedical engineering [7]. Similarly, inverse modeling is used in digital data reconstruction [8], channel equalization in digital communication [9], digital magnetic data recording [10], intelligent sensors [2] and deconvolution of seismic data [11]. Direct modeling mainly refers to the adaptive identification of unknown plants. Simple static linear plants are easily identified through parameter estimation using conventional derivative based least mean square (LMS) type algorithms [12], but most practical plants are dynamic, nonlinear or a combination of these two characteristics. In many applications Hammerstein and MIMO plants need to be identified. In addition, the output of the plant is corrupted by measurement or additive white Gaussian noise (AWGN). Identification of such complex plants is a difficult task and poses many challenging problems. Similarly, inverse modeling of telecommunication and magnetic medium channels is important for reducing the effect of inter-symbol interference (ISI) and achieving faithful reconstruction of the original data, and adaptive inverse modeling of sensors is required to extend their linear range for direct digital readout and enhancement of dynamic range. These two important and complex issues are addressed in the thesis, and attempts have been made to provide improved, efficient and promising alternative solutions.
The conventional LMS and recursive least square (RLS) [13] techniques work well for identification of static plants, but when the plants are dynamic the existing forward-backward LMS [14] and RLS algorithms very often lead to non-optimal solutions due to premature convergence of the weights to local minima [15]. This is a major drawback of the existing derivative based techniques. To alleviate this problem, the thesis suggests the use of derivative-free optimization techniques in place of the conventional ones.
In the recent past, population based optimization techniques have been reported which fall under the category of evolutionary computing [16] or computational intelligence [17]. These are also called bio-inspired techniques and include the genetic algorithm (GA) and its variants [18] and Differential Evolution (DE) [19]. These techniques are suitably employed to obtain efficient iterative learning algorithms for developing adaptive direct and inverse models of complex plants and channels.
Development of direct and inverse adaptive models essentially consists of two components. The first component is an adaptive network, which may be linear or nonlinear in nature. A nonlinear network is preferable when nonlinear plants or channels are to be identified or equalized. The linear networks used in the thesis are adaptive linear combiners, i.e., all-zero or FIR structures [7]; GA and DE are used as the derivative-free training algorithms.
1.2 MOTIVATION
In summary, the main motivations of the research work carried out in the present thesis are the following:
i. To formulate the direct and inverse modeling problems as squared-error optimization problems.
ii. To introduce bio-inspired optimization tools such as GA and DE and their variants to efficiently minimize the squared error cost function of the models, in other words, to develop alternative identification schemes.
iii. To achieve improved identification (direct modeling) of complex nonlinear plants and channel equalization (inverse modeling) of nonlinear noisy digital channels by introducing new and improved updating algorithms.
1.3 MAJOR CONTRIBUTIONS OF THE THESIS
The major contributions of the thesis are outlined below:
i. A GA based approach for both linear and nonlinear system identification is introduced. The GA based approach is found to be more efficient for nonlinear systems than standard derivative based learning. In addition, DE based identification has been proposed and shown to give better performance with less computational complexity.
ii. A GA based approach for linear and nonlinear channel equalization is introduced. The GA based approach is found to be more efficient than standard derivative based learning. In addition, DE based equalizers have been proposed and shown to give better performance with less computational complexity.
1.4 CHAPTER-WISE CONTRIBUTION
The research work undertaken is embodied in 7 chapters.
Chapter-1 gives an introduction to system identification and channel equalization and reviews various learning algorithms, such as the least-mean-square (LMS) algorithm, the recursive-least-square (RLS) algorithm, Artificial Neural Networks (ANN), the Genetic Algorithm (GA) and Differential Evolution (DE), used to identify the system and train the equalizer. It also includes the motivation behind undertaking the thesis work.
Chapter-2 discusses the general form of adaptive algorithms, the adaptive filtering problem, derivative based algorithms such as LMS, and gives an overview of derivative-free algorithms such as the Genetic Algorithm and Differential Evolution.
Chapter-3 discusses various system identification techniques, develops the GA algorithm for simulation of system identification, and presents a comparative study between LMS and GA on both linear and nonlinear systems.
Chapter-4 discusses various channel equalization techniques, develops the GA algorithm for simulation of channel equalization, and presents a comparison between LMS and GA on both linear and nonlinear channels.
Chapter-5 develops the DE algorithm for simulation of system identification and presents a comparison between LMS, GA and DE on both linear and nonlinear systems.
Chapter-6 develops the DE algorithm for simulation of channel equalization and presents a comparison between LMS, GA and DE on both linear and nonlinear channel equalizers.
Chapter-7 deals with the conclusions of the investigations made in the thesis. This chapter also suggests some future research related to the topic.

CHAPTER- 2
GENETIC ALGORITHM AND
DIFFERENTIAL EVOLUTION
2.1 INTRODUCTION:
There are many learning algorithms which are employed to train various adaptive models. The performance of these models depends on the rate of convergence, the training time, the computational complexity involved and the minimum mean square error achieved after training. The learning algorithms may be broadly classified into two categories: (a) derivative based and (b) derivative free. The derivative based algorithms include least mean square (LMS), IIR LMS (ILMS), back propagation (BP) and FLANN-LMS. Under the derivative-free category, the genetic algorithm (GA), differential evolution (DE), particle swarm optimization (PSO), bacterial foraging optimization (BFO) and artificial immune systems (AIS) have been employed. In this section the details of the LMS, GA and DE algorithms are outlined in sequel.
2.2 GRADIENT BASED ADAPTIVE ALGORITHM:
An adaptive algorithm is a procedure for adjusting the parameters of an adaptive filter to minimize a cost function chosen for the task at hand. In this section, we describe the general form of many adaptive FIR filtering algorithms and present a simple derivation of the LMS adaptive algorithm. In our discussion, we only consider an adaptive FIR filter structure. Such systems are currently more popular than adaptive IIR filters because
1. the input-output stability of the FIR filter structure is guaranteed for any set of fixed coefficients, and
2. the algorithms for adjusting the coefficients of FIR filters are in general simpler than those for adjusting the coefficients of IIR filters.
2.2.1 GENERAL FORM OF ADAPTIVE FIR ALGORITHMS:
The general form of an adaptive FIR filtering algorithm is

W(n+1) = W(n) + μ(n) G(e(n), X(n), Φ(n))   (2.1)

where G(·) is a particular vector-valued nonlinear function, μ(n) is a step size parameter, e(n) and X(n) are the error signal and input signal vector, respectively, and Φ(n) is a vector of states that stores pertinent information about the characteristics of the input and error signals and/or the coefficients at previous time instants. In the simplest algorithms, Φ(n) is not used, and the only information needed to adjust the coefficients at time n is the error signal, the input signal vector and the step size.
The step size is so called because it determines the magnitude of the change or step that is taken by the algorithm in iteratively determining a useful coefficient vector. Much research effort has been spent characterizing the role that μ(n) plays in the performance of adaptive filters in terms of the statistical or frequency characteristics of the input and desired response signals. Often, the success or failure of an adaptive filtering application depends on how the value of μ(n) is chosen or calculated to obtain the best performance from the adaptive filter.
2.2.2 THE MEAN-SQUARED ERROR COST FUNCTION:
The form of G(·) depends on the cost function chosen for the given adaptive filtering task. We now consider one particular cost function that yields a popular adaptive algorithm. Define the mean-squared error (MSE) cost function as

J_mse(n) = (1/2) ∫_{-∞}^{+∞} e²(n) p_n(e(n)) de(n)   (2.2)

         = (1/2) E[e²(n)]   (2.3)

where p_n(e) represents the probability density function of the error at time n and E[·] is shorthand for the expectation integral on the right-hand side of (2.3).
The MSE cost function is useful for adaptive FIR filters because
- J_mse(n) has a well-defined minimum with respect to the parameters in W(n),
- the coefficient values obtained at this minimum are the ones that minimize the power in the error signal e(n), indicating that y(n) has approached d(n), and
- J_mse(n) is a smooth function of each of the parameters in W(n), such that it is differentiable with respect to each of the parameters in W(n).
The third point is important in that it enables us to determine both the optimum coefficient values, given knowledge of the statistics of d(n) and x(n), as well as a simple iterative procedure for adjusting the parameters of an FIR filter.
2.2.3 THE WIENER SOLUTION:
For the FIR filter structure, the coefficient values in W(n) that minimize J_mse(n) are well-defined if the statistics of the input and desired response signals are known. The formulation of this problem for continuous-time signals and the resulting solution were first derived by Wiener. Hence, this optimum coefficient vector W_MSE(n) is often called the Wiener solution to the adaptive filtering problem. The extension of Wiener's analysis to the discrete-time case is attributed to Levinson. To determine W_MSE(n) we note that the function J_mse(n) is quadratic in the parameters {w_i(n)}, and the function is also differentiable. Thus, we can use a result from optimization theory which states that the derivative of a smooth cost function with respect to each of the parameters is zero at a minimizing point on the cost function error surface. Thus, W_MSE(n) can be found from the solution to the system of equations

∂J_mse(n)/∂w_i(n) = 0,   0 ≤ i ≤ L−1   (2.4)
Taking derivatives of J_mse(n) in (2.3) and noting that e(n) = d(n) − y(n) and

y(n) = W^T(n) X(n) = Σ_{i=0}^{L−1} w_i(n) x(n−i),

we obtain
∂J_mse(n)/∂w_i(n) = E[ e(n) ∂e(n)/∂w_i(n) ]   (2.5)

                  = −E[ e(n) ∂y(n)/∂w_i(n) ]   (2.6)

                  = −E[ e(n) x(n−i) ]   (2.7)

                  = −( E[d(n) x(n−i)] − Σ_{j=0}^{L−1} E[x(n−i) x(n−j)] w_j(n) )   (2.8)

where we have used the definitions of e(n) and of y(n) for the FIR filter structure to expand the last result in (2.8).
By defining the matrix R_XX(n) and the vector P_dx(n) as

R_XX(n) = E[ X(n) X^T(n) ]   (2.9)

P_dx(n) = E[ d(n) X(n) ]   (2.10)

respectively, we can combine the above equations to obtain the system of equations in vector form as

R_XX(n) W_MSE(n) − P_dx(n) = 0   (2.11)

where 0 is the zero vector. Thus, so long as the matrix R_XX(n) is invertible, the optimum Wiener solution vector for this problem is

W_MSE(n) = R_XX^{−1}(n) P_dx(n)   (2.12)
2.2.4 THE METHOD OF STEEPEST DESCENT:
The method of steepest descent is a celebrated optimization procedure for minimizing the value of a cost function J(n) with respect to a set of adjustable parameters W(n). This procedure adjusts each parameter of the system according to

w_i(n+1) = w_i(n) − μ(n) ∂J(n)/∂w_i(n)   (2.13)

In other words, the i-th parameter of the system is altered according to the derivative of the cost function with respect to the i-th parameter. Collecting these equations in vector form, we have

W(n+1) = W(n) − μ(n) ∂J(n)/∂W(n)   (2.14)
where ∂J(n)/∂W(n) is the vector whose i-th element is ∂J(n)/∂w_i(n).
For an FIR adaptive filter that minimizes the MSE cost function, we can use the result in (2.8) to give explicitly the form of the steepest descent procedure for this problem. Substituting these results into (2.14) yields the update equation for W(n) as

W(n+1) = W(n) + μ(n) ( P_dx(n) − R_XX(n) W(n) )   (2.15)

However, this steepest descent procedure depends on the statistical quantities E{d(n) x(n−i)} and E{x(n−i) x(n−j)} contained in P_dx(n) and R_XX(n), respectively. In practice, we only have measurements of d(n) and x(n) to be used within the adaptation procedure. While suitable estimates of the statistical quantities needed for (2.15) could be determined from the signals x(n) and d(n), we instead develop an approximate version of the method of steepest descent that depends on the signal values themselves. This procedure is known as the LMS algorithm.
2.2.5 THE LMS ALGORITHM:
The cost function J(n) chosen for the steepest descent algorithm determines the coefficient solution obtained by the adaptive filter. If the MSE cost function in (2.2) is chosen, the resulting algorithm depends on the statistics of x(n) and d(n) because of the expectation operation that defines this cost function. Since we typically only have measurements of d(n) and of x(n) available to us, we substitute an alternative cost function that depends only on these measurements. One such cost function is the least-squares cost function given by

J_LS(n) = Σ_{k=0}^{n} α(k) ( d(k) − W^T(n) X(k) )²   (2.16)

where α(k) is a suitable weighting sequence for the terms within the summation. This cost function, however, is complicated by the fact that it requires numerous computations to calculate its value as well as its derivatives with respect to each W(n), although efficient recursive methods for its minimization can be developed. Alternatively, we can propose the simplified cost function J_LMS(n) given by

J_LMS(n) = (1/2) e²(n)   (2.17)
This cost function can be thought of as an instantaneous estimate of the MSE cost function, as J_MSE(n) = E[J_LMS(n)]. Although it might not appear to be useful, the resulting algorithm obtained when J_LMS(n) is used for J(n) in (2.13) is extremely useful for practical applications. Taking derivatives of J_LMS(n) with respect to the elements of W(n) and substituting the result into (2.13), we obtain the LMS adaptive algorithm given by

W(n+1) = W(n) + μ(n) e(n) X(n)   (2.18)
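A minimal Python sketch of the update (2.18) with a fixed step size is shown below; the plant, noise level and step size are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)
L, N, mu = 3, 4000, 0.01
h = np.array([0.2600, 0.9300, 0.2600])   # unknown plant (illustrative)
w = np.zeros(L)                          # adaptive weight vector W(0)
x_buf = np.zeros(L)                      # tapped delay line X(n)

for n in range(N):
    x_n = rng.uniform(-np.sqrt(3), np.sqrt(3))
    x_buf = np.concatenate(([x_n], x_buf[:-1]))    # shift in the new sample
    d = h @ x_buf + 0.001 * rng.standard_normal()  # desired response + noise
    e = d - w @ x_buf                              # error signal e(n)
    w = w + mu * e * x_buf                         # LMS update, eq. (2.18)

print(w)   # converges near the plant coefficients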
Note that this algorithm is of the general form of (2.1). It also requires only multiplications and additions to implement. In fact, the number and type of operations needed for the LMS algorithm is nearly the same as that of the FIR filter structure with fixed coefficient values, which is one of the reasons for the algorithm's popularity. The behavior of the LMS algorithm has been widely studied, and numerous results concerning its adaptation characteristics under different situations have been developed. For now, we indicate its useful behavior by noting that the solution obtained by the LMS algorithm near its convergent point is related to the Wiener solution. In fact, analyses of the LMS algorithm under certain statistical assumptions about the input and desired response signals show that

lim_{n→∞} E[W(n)] = W_MSE   (2.19)

when the Wiener solution W_MSE(n) is a fixed vector. Moreover, the average behavior of the LMS algorithm is quite similar to that of the steepest descent algorithm, which depends explicitly on the statistics of the input and desired response signals. In effect, the iterative nature of the LMS coefficient updates is a form of time-averaging that smoothes the errors in the instantaneous gradient calculations to obtain a more reasonable estimate of the true gradient.
The problem is that gradient descent is a local optimization technique, which is limited because it is unable to converge to the global optimum on a multimodal error surface if the algorithm is not initialized in the basin of attraction of the global optimum. Several modifications exist for gradient based algorithms in an attempt to enable them to overcome local optima. One approach is to simply add noise or a momentum term to the gradient computation of the gradient descent algorithm to make it more likely to escape from a local minimum. This approach is only likely to be successful when the error surface is relatively smooth with minor local minima, or when some information can be inferred about the topology of the surface such that the additional gradient parameters can be assigned accordingly. Other approaches attempt to transform the error surface to eliminate or diminish the presence of local minima, which would ideally result in a unimodal error surface. The problem with these approaches is that the resulting minimum of the transformed error used to update the adaptive filter can be biased away from the true minimum output error, and the algorithm may not be able to converge to the desired minimum error condition. These algorithms also tend to be complex, slow to converge, and may not be guaranteed to emerge from a local minimum. Some work has been done with regard to removing the bias of equation error LMS and Steiglitz-McBride adaptive IIR filters, which adds further complexity with varying degrees of success. Another approach attempts to locate the global optimum by running several LMS algorithms in parallel, initialized with different initial coefficients. The notion is that a larger, concurrent sampling of the error surface will increase the likelihood that one process will be initialized in the global optimum valley. This technique does have potential, but it is inefficient and may still suffer the fate of a standard gradient technique in that it will be unable to locate the global optimum if none of the initial estimates is located in the basin of attraction of the global optimum. By using a similar congregational scheme, but one in which information is collectively exchanged between estimates and intelligent randomization is introduced, structured stochastic algorithms are able to hill-climb out of local minima. This enables the algorithms to achieve better, more consistent results using a smaller number of total estimates. These types of algorithms provide the framework for the algorithms discussed in the following sections.
2.3 DERIVATIVE-FREE ALGORITHMS:
Since the beginning of the nineteenth century, a significant evolution in optimization theory has been observed. Classical linear programming and traditional non-linear optimization techniques such as Lagrange's multipliers, Bellman's principle and Pontryagin's principle were prevalent until this century. Unfortunately, these derivative based optimization techniques can no longer be used to determine the optima on rough non-linear surfaces. One solution to this problem has already been put forward by the evolutionary algorithms research community. The genetic algorithm (GA), enunciated by Holland, is one such popular algorithm. This chapter presents a more recent algorithm for evolutionary optimization known as differential evolution (DE). These algorithms are inspired by biological and sociological motivations and can handle optimality on rough, discontinuous and multimodal surfaces. The chapter explores several schemes for controlling the convergence behavior of DE by a judicious selection of its parameters. Special emphasis is given to hybridizations of DE algorithms with other soft computing tools.
2.4 GENETIC ALGORITHM:
Genetic algorithms are a class of evolutionary computing techniques, which is a rapidly growing area of artificial intelligence. Genetic algorithms are inspired by Darwin's theory of evolution. Simply said, problems are solved by an evolutionary process resulting in a best (fittest) solution (survivor); in other words, the solution is evolved. Evolutionary computing was introduced in the 1960s by Rechenberg in his work "Evolution strategies" (Evolutionsstrategie in the original). His idea was then developed by other researchers. Genetic Algorithms (GAs) were invented by John Holland and developed by him and his students and colleagues. This led to Holland's book "Adaptation in Natural and Artificial Systems", published in 1975.
The algorithm begins with a set of solutions (represented by chromosomes) called the population. Solutions from one population are taken and used to form a new population. This is motivated by the hope that the new population will be better than the old one. The solutions which are selected to form new solutions (offspring) are selected according to their fitness: the more suitable they are, the more chances they have to reproduce. This is repeated until some condition (for example, the number of generations or the improvement of the best solution) is satisfied.
2.4.1 OUTLINE OF THE BASIC GA:
1. [Start] Generate a random population of n chromosomes (suitable solutions for the problem).
2. [Fitness] Evaluate the fitness f(x) of each chromosome x in the population.
3. [New population] Create a new population by repeating the following steps until the new population is complete:
   a. [Selection] Select two parent chromosomes from the population according to their fitness (the better the fitness, the bigger the chance to be selected).
   b. [Crossover] With a crossover probability, cross over the parents to form new offspring (children). If no crossover is performed, the offspring is an exact copy of the parents.
   c. [Mutation] With a mutation probability, mutate the new offspring at each locus (position in the chromosome).
   d. [Accepting] Place the new offspring in the new population.
4. [Replace] Use the newly generated population for a further run of the algorithm.
5. [Test] If the end condition is satisfied, stop and return the best solution in the current population.
6. [Loop] Go to step 2.
The outline of the basic GA provided above is very general. There are many parameters and settings that can be implemented differently for various problems. Elitism is often used as a method of selection: at least one of a generation's best solutions is copied without changes to the new population, so the best solution can survive to the succeeding generation. A minimal sketch of this loop is given below.
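The following Python sketch condenses the outline above; the toy fitness function (number of 1-bits), population size and probabilities are illustrative assumptions.

import random

L_BITS, POP, P_CROSS, P_MUT, GENS = 16, 20, 0.9, 0.01, 50

def fitness(ch):                          # toy fitness: count of 1-bits
    return sum(ch)

def select(pop):                          # better of two random individuals
    return max(random.sample(pop, 2), key=fitness)

pop = [[random.randint(0, 1) for _ in range(L_BITS)] for _ in range(POP)]
for gen in range(GENS):
    new_pop = [max(pop, key=fitness)]     # elitism: best survives unchanged
    while len(new_pop) < POP:
        p1, p2 = select(pop), select(pop)
        if random.random() < P_CROSS:     # single-point crossover
            pt = random.randrange(1, L_BITS)
            child = p1[:pt] + p2[pt:]
        else:
            child = p1[:]                 # exact copy of a parent
        child = [b ^ 1 if random.random() < P_MUT else b for b in child]
        new_pop.append(child)
    pop = new_pop

print(max(pop, key=fitness))              # best chromosome found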
2.4.2 OPERATORS OF GA:
OVERVIEW:
Crossover and mutation are the most important parts of the genetic algorithm. The performance is influenced mainly by these two operators.
ENCODING OF A CHROMOSOME:
A chromosome should in some way contain information about the solution it represents. The most commonly used encoding is a binary string. A chromosome could then look like this:
Table-2.1 (Encoding of a chromosome)
Chromosome 1   1101100100110110
Chromosome 2   1101111000011110
Each chromosome is represented by a binary string. Each bit in the string can represent some characteristic of the solution. There are many other ways of encoding; the encoding depends mainly on the problem to be solved. For example, one can directly encode integer or real numbers; sometimes it is useful to encode permutations, and so on.
CROSSOVER:
Crossover operates on selected genes from the parent chromosomes and creates new offspring. The simplest way to do this is to choose a crossover point at random, copy everything before this point from the first parent, and then copy everything after the crossover point from the other parent. Crossover is illustrated in the following (| is the crossover point):
Table-2.2 (Crossover of chromosomes)
Chromosome 1   11011 | 00100110110
Chromosome 2   11011 | 11000011110
Offspring 1    11011 | 11000011110
Offspring 2    11011 | 00100110110
There are other ways to perform crossover; for example, we can choose more crossover points.
MUTATION:
Mutation is intended to prevent all solutions in the population from falling into a local optimum of the solved problem. The mutation operation randomly changes the offspring resulting from crossover. In the case of binary encoding we can switch a few randomly chosen bits from 1 to 0 or from 0 to 1. Mutation can be illustrated as follows:
Table-2.3 (Mutation operation)
Original offspring 1   1101111000011110
Original offspring 2   1101100100110110
Mutated offspring 1    1100111000011110
Mutated offspring 2    1101101100110110
The technique of mutation (as well as crossover) depends mainly on the encoding of the chromosomes. For example, when we encode by permutations, mutation could be performed as an exchange of two genes.
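Both operators can be expressed directly on binary strings such as those in Tables 2.2 and 2.3; in the following Python sketch the crossover point and mutated positions are chosen at random rather than fixed, and the mutation probability is an illustrative value.

import random

def crossover(parent1, parent2):
    # Single-point crossover: swap the tails after a random point
    pt = random.randrange(1, len(parent1))
    return parent1[:pt] + parent2[pt:], parent2[:pt] + parent1[pt:]

def mutate(chrom, p_mut=0.01):
    # Flip each bit independently with probability p_mut
    return ''.join('10'[int(b)] if random.random() < p_mut else b for b in chrom)

c1, c2 = '1101100100110110', '1101111000011110'   # chromosomes of Table 2.1
o1, o2 = crossover(c1, c2)
print(o1, o2, mutate(o1))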
2.4.3 PARAMETERS OF GA:
There are two basic parameters of GA: the crossover probability and the mutation probability.
CROSSOVER PROBABILITY:
This indicates how often crossover will be performed. If there is no crossover, the offspring are exact copies of the parents. If there is crossover, the offspring are made from parts of both parents' chromosomes. If the crossover probability is 100%, then all offspring are made by crossover. If it is 0%, the whole new generation is made from exact copies of chromosomes from the old population (but this does not mean that the new generation is the same!). Crossover is performed in the hope that the new chromosomes will contain the good parts of the old chromosomes and will therefore be better. However, it is good to let some part of the old population survive into the next generation.
MUTATION PROBABILITY:
This signifies how often parts of a chromosome will be mutated. If there is no mutation, the offspring are generated immediately after crossover (or copied directly) without any change. If mutation is performed, one or more parts of a chromosome are changed. If the mutation probability is 100%, the whole chromosome is changed; if it is 0%, nothing is changed. Mutation generally prevents the GA from falling into local extrema. Mutation should not occur very often, because then the GA would in fact turn into a random search.
OTHER PARAMETERS:
There are also some other parameters of GA. One particularly important parameter is the population size.
POPULATION SIZE:
This signifies how many chromosomes are present in the population (in one generation). If there are too few chromosomes, the GA has few possibilities to perform crossover and only a small part of the search space is explored. On the other hand, if there are too many chromosomes, the GA slows down.
SELECTION:
Chromosomes are selected from the population to be parents for crossover. The problem is how to select these chromosomes. According to Darwin's theory of evolution, the best ones survive to create new offspring. There are many methods of selecting the best chromosomes, for example roulette wheel selection, Boltzmann selection, tournament selection, rank selection and steady state selection. In this thesis we have used tournament selection, as it performs better than the others.
TOURNAMENT SELECTION:
A selection strategy in GA is simply a process that favors the selection of better individuals in the population for the mating pool. There are two important issues in the evolution process of genetic search: population diversity and selective pressure. Population diversity means that the genes from the already discovered good individuals are exploited while promising new areas of the search space continue to be explored. Selective pressure is the degree to which the better individuals are favored. The tournament selection strategy provides selective pressure by holding a tournament competition among individuals, as sketched below.
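A sketch of tournament selection is given below, written for a cost to be minimized (as with the MSE used later in the thesis); the tournament size k is an illustrative choice.

import random

def tournament_select(population, cost, k=3):
    # Pick k individuals at random and return the one with the lowest cost
    contestants = random.sample(population, k)
    return min(contestants, key=cost)

# Example: select a parent from a population of 3-tap weight vectors
pop = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
mse = lambda w: sum((wi - 0.5) ** 2 for wi in w)   # toy cost function
parent = tournament_select(pop, mse)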
2.5 DIFFERENTIAL EVOLUTION:
The aim of optimization is to determine the best-suited solution to a problem under a given set of constraints. Several researchers over the decades have come up with different solutions to linear and non-linear optimization problems. Mathematically, an optimization problem involves a fitness function describing the problem under a set of constraints representing the solution space for the problem. Unfortunately, most of the traditional optimization techniques are centered around evaluating the first derivatives to locate the optima on a given constrained surface. Because of the difficulty of evaluating first derivatives on many rough and discontinuous optimization surfaces, several derivative-free optimization algorithms have emerged in recent times. The optimization problem is nowadays represented as an intelligent search problem, where one or more agents are employed to determine the optima on a search landscape representing the constrained surface for the optimization problem [20].
In the later quarter of the twentieth century, Holland pioneered a new concept of evolutionary search algorithms and came up with a solution to the so far open-ended non-linear optimization problem. Inspired by the natural adaptations of biological species, Holland echoed the Darwinian theory through his most popular and well known algorithm, currently known as the genetic algorithm (GA) [21]. Holland and his coworkers, including Goldberg and DeJong, popularized the theory of GA and demonstrated how biological crossovers and mutations of chromosomes can be realized in the algorithm to improve the quality of the solutions over successive iterations [22]. In the mid 1990s Eberhart and Kennedy enunciated an alternative solution to the complex non-linear optimization problem by emulating the collective behavior of bird flocks, particles, the boids method of Craig Reynolds [23] and socio-cognition, and called their brainchild particle swarm optimization (PSO) [23-27]. Around the same time, Price and Storn made a serious attempt to replace the classical crossover and mutation operators in GA by alternative operators, and consequently came up with a suitable differential operator to handle the problem. They proposed a new algorithm based on this operator and called it differential evolution (DE) [28].
Both algorithms require no gradient information about the function to be optimized, use only primitive mathematical operators, and are conceptually very simple. They can be implemented in any computer language very easily and require minimal parameter tuning. Algorithm performance does not deteriorate severely with the growth of the search space dimensions either. These features have perhaps played a great role in the popularity of the algorithms within the domain of machine intelligence and cybernetics.
2.5.1 CLASSICAL DE:
Like any other evolutionary algorithm, DE starts with a population of NP D-dimensional search variable vectors. We will represent subsequent generations in DE by discrete time steps like t = 0, 1, 2, ..., t, t+1, etc. Since the vectors are likely to change over different generations, we adopt the following notation for representing the i-th vector of the population at the current generation (i.e., at time t):

X_i(t) = [x_{i,1}(t), x_{i,2}(t), x_{i,3}(t), ..., x_{i,D}(t)]   (2.20)
These vectors are referred to in the literature as genomes or chromosomes. DE is a very simple evolutionary algorithm. For each search variable there may be a certain range within which the value of the parameter should lie for better search results. At the very beginning of a DE run, i.e., at t = 0, the problem parameters or independent variables are initialized somewhere in their feasible numerical range. Therefore, if the j-th parameter of the given problem has lower and upper bounds x_j^L and x_j^U respectively, then we may initialize the j-th component of the i-th population member as

x_{i,j}(0) = x_j^L + rand(0, 1) · (x_j^U − x_j^L),

where rand(0, 1) is a uniformly distributed random number lying between 0 and 1. Now in each generation (or one iteration of the algorithm), to change each population member X_i(t), a donor vector V_i(t) is created. It is the method of creating this donor vector which demarcates the various DE schemes. Here we discuss one specific mutation strategy known as DE/rand/1. In this scheme, to create V_i(t) for each i-th member, three other parameter vectors (say the r1-th, r2-th and r3-th vectors) are chosen at random from the current population. Next, a scalar number F scales the difference of any two of the three vectors, and the scaled difference is added to the third one, whence we obtain the donor vector V_i(t). We can express the process for the j-th component of each vector as

v_{i,j}(t+1) = x_{r1,j}(t) + F · (x_{r2,j}(t) − x_{r3,j}(t))   (2.21)
The process is illustrated in Fig. 1.1. The closed curves in Fig. 1.1 denote constant cost contours, i.e., for a given cost function f, a contour corresponds to f(X) = constant. Here the constant cost contours are drawn for the Ackley function. Next, to increase the potential diversity of the population, a crossover scheme comes into play. DE can use two kinds of crossover schemes, namely exponential and binomial. The donor vector exchanges its body parts, i.e., components, with the target vector X_i(t) under this scheme. In exponential crossover, we first choose an integer n randomly among the numbers [0, D−1]. This integer acts as the starting point in the target vector from where the crossover or exchange of components with the donor vector starts. We also choose another integer L from the interval [1, D]. L denotes the number of components the donor vector actually contributes to the target. After choosing n and L, the trial vector

U_i(t) = [u_{i,1}(t), u_{i,2}(t), ..., u_{i,D}(t)]   (2.22)

is formed with

u_{i,j}(t) = v_{i,j}(t)   for j = <n>_D, <n+1>_D, ..., <n+L−1>_D
           = x_{i,j}(t)   otherwise   (2.23)

where the angular brackets <>_D denote a modulo function with modulus D. The integer L is drawn from [1, D] according to the following pseudo-code.
Fig. 1.1. Illustrating creation of the donor vector in 2-D parameter space (The
constant cost contours are for two-dimensional Ackley Function)
L=0;
Do
{
L=L+1;
}
While (rand (0, 1) < CR) AND (L<D);
Hence, in effect, probability(L ≥ m) = (CR)^{m−1} for any m > 0. CR is called the crossover constant, and it appears as a control parameter of DE just like F. For each donor vector V, a new set of n and L must be chosen randomly as shown above. In the binomial crossover scheme, on the other hand, the crossover is performed on each of the D variables whenever a randomly picked number between 0 and 1 is within the CR value. The scheme may be outlined as
u_{i,j}(t) = v_{i,j}(t)   if rand(0, 1) < CR
           = x_{i,j}(t)   otherwise   (2.26)
In this way for each target vector X_i(t) an offspring vector U_i(t) is created. To keep the population size constant over subsequent generations, the next step of the algorithm calls for selection to determine which one of the target vector and the trial vector will survive in the next generation, i.e., at time t = t + 1. DE actually involves the Darwinian principle of survival of the fittest in its selection process, which may be outlined as

X_i(t+1) = U_i(t)   if f(U_i(t)) ≤ f(X_i(t))
         = X_i(t)   if f(U_i(t)) > f(X_i(t))   (2.27)

where f(·) is the function to be minimized. So if the new trial vector yields a better value of the fitness function, it replaces its target in the next generation; otherwise the target vector is retained in the population. Hence the population either gets better (with respect to the fitness function) or remains constant, but never deteriorates. The DE/rand/1 algorithm is outlined below.
2.5.2 PROCEDURE:
Input: Randomly initialized population vectors X_i(0)
Output: Position of the approximate global optimum X*
Begin
    Initialize population;
    Evaluate fitness;
    For i = 0 to max-iteration do
    Begin
        Create difference offspring;
        Evaluate fitness;
        If an offspring is better than its parent
            Then replace the parent by the offspring in the next generation;
        End If;
    End For;
End.
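Specialized to DE/rand/1 mutation (2.21), binomial crossover (2.26) and the selection rule (2.27), the procedure may be sketched in Python as follows; the sphere cost function, bounds and parameter values are illustrative assumptions.

import numpy as np

def de_rand_1_bin(cost, D=10, NP=50, F=0.8, CR=0.9, max_gen=200, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(-5, 5, (NP, D))              # initialize the population
    fX = np.array([cost(x) for x in X])
    for t in range(max_gen):
        for i in range(NP):
            r1, r2, r3 = rng.choice([j for j in range(NP) if j != i], 3, replace=False)
            V = X[r1] + F * (X[r2] - X[r3])      # donor vector, eq. (2.21)
            mask = rng.random(D) < CR            # binomial crossover, eq. (2.26)
            mask[rng.integers(D)] = True         # at least one component from V
            U = np.where(mask, V, X[i])
            fU = cost(U)
            if fU <= fX[i]:                      # selection, eq. (2.27)
                X[i], fX[i] = U, fU
    return X[np.argmin(fX)]

best = de_rand_1_bin(lambda x: float(np.sum(x ** 2)))   # minimize sphere function
print(best)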
2.5.3 THE COMPLETE DE FAMILY:
Actually, it is the process of mutation which demarcates one DE scheme from another. In the previous section, we illustrated the basic steps of a simple DE. The mutation scheme in (2.21) uses a randomly selected vector X_r1, and only one weighted difference vector F·(X_r2 − X_r3) is used to perturb it. Hence, in the literature, this particular mutation scheme is referred to as DE/rand/1. We can now have an idea of how different DE schemes are named. The general convention used is DE/x/y, where DE stands for differential evolution, x represents a string denoting the type of the vector to be perturbed (whether it is randomly selected or it is the best vector in the population with respect to fitness value) and y is the number of difference vectors considered for perturbation of x. Below we outline the other four different mutation schemes suggested by Price et al.
SCHEME DE/RAND-TO-BEST/1:
DE/rand-to-best/1 follows the same procedure as the simple DE scheme illustrated earlier, the only difference being that now the donor vector used to perturb each population member is created using any two randomly selected members of the population as well as the best vector of the current generation (i.e., the vector yielding the best objective function value at t = t). This can be expressed for the i-th donor vector at time t = t + 1 as

V_i(t+1) = X_i(t) + λ·(X_best(t) − X_i(t)) + F·(X_r2(t) − X_r3(t))   (2.28)

where λ is another control parameter of DE in [0, 2], X_i(t) is the target vector and X_best(t) is the best member of the population with respect to fitness at the current time step t = t. To reduce the number of control parameters, a usual choice is to set λ = F.
SCHEME DE/BEST/1:
In this scheme everything is identical to DE/rand/1 except that the trial vector is formed as

V_i(t+1) = X_best(t) + F·(X_r1(t) − X_r2(t))   (2.29)

Here the vector to be perturbed is the best vector of the current population, and the perturbation is caused by using a single difference vector.
SCHEME DE/BEST/2:
Under this method, the donor vector is formed by using two difference vectors, as shown below:

V_i(t+1) = X_best(t) + F·(X_r1(t) + X_r2(t) − X_r3(t) − X_r4(t))   (2.30)

Owing to the central limit theorem, the random variations in the parameter vector seem to shift slightly in the Gaussian direction, which seems to be beneficial for many functions.
SCHEME DE/RAND/2:
Here the vector to be perturbed is selected randomly, and two weighted difference vectors are added to it to produce the donor vector. Thus for each target vector, a totality of five other distinct vectors are selected from the rest of the population. The process can be expressed in the form of an equation as

V_i(t+1) = X_r1(t) + F_1·(X_r2(t) − X_r3(t)) + F_2·(X_r4(t) − X_r5(t))   (2.31)

Here F_1 and F_2 are two weighting factors selected in the range from 0 to 1. To reduce the number of parameters we may choose F_1 = F_2 = F.
SUMMARY OF ALL SCHEMES:
In 2001 Storn and Price [21] suggested a total of ten different working strategies of DE and some guidelines for applying these strategies to any given problem. These strategies were derived from the five different DE mutation schemes outlined above. Each mutation strategy was combined with either the exponential type crossover or the binomial type crossover. This yielded 5 × 2 = 10 DE strategies, which are listed below.
DE/best/1/exp
DE/rand/1/exp
DE/rand-to-best/1/exp
DE/best/2/exp
DE/rand/2/exp
DE/best/1/bin
DE/rand/1/bin
DE/rand-to-best/1/bin
DE/best/2/bin
DE/rand/2/bin
The general convention used above is again DE/x/y/z, where DE stands for differential evolution, x represents a string denoting the vector to be perturbed, y is the number of difference vectors considered for perturbation of x, and z stands for the type of crossover being used (exp: exponential; bin: binomial).
2.5.4 MORE RECENT VARIANTS OF DE:
DE is a stochastic, population-based, evolutionary search algorithm. The strength of the algorithm lies in its simplicity, speed (how fast the algorithm can find the optimal or suboptimal points of the search space) and robustness (producing nearly the same results over repeated runs). The rate of convergence of DE as well as its accuracy can be improved largely by applying different mutation and selection strategies. A judicious control of the two key parameters, namely the scale factor F and the crossover rate CR, can considerably alter the performance of DE. In what follows we illustrate some recent modifications of DE that make it suitable for tackling the most difficult optimization problems.
DE WITH TRIGONOMETRIC MUTATION:
Recently, Lampinen and Fan [29] have proposed a trigonometric mutation operator for DE to speed up its performance. To implement the scheme, for each target vector three distinct vectors are randomly selected from the DE population. Suppose for the i-th target vector X_i(t) the selected population members are X_r1(t), X_r2(t) and X_r3(t). The indices r1, r2 and r3 are mutually different and selected from [1, 2, ..., N], where N denotes the population size. Suppose the objective function values of these three vectors are given by f(X_r1(t)), f(X_r2(t)) and f(X_r3(t)).
Now three weighting coefficients are formed according to the following equations:

p = |f(X_r1)| + |f(X_r2)| + |f(X_r3)|   (2.32)

p_1 = |f(X_r1)| / p   (2.33)

p_2 = |f(X_r2)| / p   (2.34)

p_3 = |f(X_r3)| / p   (2.35)
Let rand(0, 1) be a uniformly distributed random number in (0, 1) and Γ be the trigonometric mutation rate in the same interval (0, 1). The trigonometric mutation scheme may now be expressed as

V_i(t+1) = (X_r1 + X_r2 + X_r3)/3 + (p_2 − p_1)·(X_r1 − X_r2)
           + (p_3 − p_2)·(X_r2 − X_r3) + (p_1 − p_3)·(X_r3 − X_r1)
           if rand(0, 1) < Γ   (2.36)

V_i(t+1) = X_r1 + F·(X_r2 − X_r3)   otherwise   (2.37)

Thus, we find that the scheme proposed by Lampinen et al. uses trigonometric mutation with probability Γ and the mutation scheme of DE/rand/1 with probability (1 − Γ).
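Under the reconstruction above, one donor-vector step of this operator might be sketched as follows; the absolute values in the weights and the parameter values are assumptions in line with the usual statement of the operator.

import numpy as np

def trig_mutation(X, f_vals, r1, r2, r3, F=0.8, gamma=0.05, rng=np.random.default_rng()):
    # One donor vector via trigonometric mutation (2.36) or DE/rand/1 (2.37)
    if rng.random() < gamma:                     # trigonometric branch
        p = abs(f_vals[r1]) + abs(f_vals[r2]) + abs(f_vals[r3])
        p1, p2, p3 = (abs(f_vals[r]) / p for r in (r1, r2, r3))
        return ((X[r1] + X[r2] + X[r3]) / 3.0
                + (p2 - p1) * (X[r1] - X[r2])
                + (p3 - p2) * (X[r2] - X[r3])
                + (p1 - p3) * (X[r3] - X[r1]))
    return X[r1] + F * (X[r2] - X[r3])           # ordinary DE/rand/1 branch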
DERANDSF (DE WITH RANDOM SCALE FACTOR):
In the original DE [28] the difference vector (X_r1(t) − X_r2(t)) is scaled by a constant factor F. The usual choice for this control parameter is a number between 0.4 and 1. We propose to vary this scale factor in a random manner in the range (0.5, 1) by using the relation

F = 0.5 · (1 + rand(0, 1))   (2.38)

where rand(0, 1) is a uniformly distributed random number within the range [0, 1]. We call this scheme DERANDSF (DE with Random Scale Factor). The mean value of the scale factor is 0.75. This allows for stochastic variations in the amplification of the difference vector and thus helps retain population diversity as the search progresses. Even when the tips of most of the population vectors point to locations clustered near a local optimum, due to the randomly scaled difference vector a new trial vector has fair chances of pointing at an even better location on the multimodal functional surface. Therefore, the fitness of the best vector in the population is much less likely to stagnate before a truly global optimum is reached.
DETVSF (DE WITH TIME VARYING SCALE FACTOR):
In most population-based optimization methods (except perhaps some hybrid global-local methods) it is generally believed to be a good idea to encourage the individuals (here, the tips of the trial vectors) to sample diverse zones of the search space during the early stages of the search. During the later stages it is important to adjust the movements of the trial solutions finely so that they can explore the interior of a relatively small space in which the suspected global optimum lies. To meet this objective we reduce the value of the scale factor linearly with time from a (predetermined) maximum to a (predetermined) minimum value:

F = F_min + (F_max − F_min) · (MAXIT − iter)/MAXIT   (2.39)

where F_max and F_min are the maximum and minimum values of the scale factor F, iter is the current iteration number and MAXIT is the maximum number of allowable iterations. The locus of the tip of the best vector in the population under this scheme may be illustrated as in Fig. 1.2. The resulting algorithm is referred to as DETVSF (DE with a time varying scale factor).

Fig. 1.2. Illustrating the DETVSF scheme on two-dimensional cost contours of the Ackley function
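The two scale-factor rules (2.38) and (2.39) reduce to one-liners; they are collected below for comparison, with the bounds as illustrative defaults.

import random

def F_random(F_base=0.5):
    # DERANDSF: F varies randomly in (0.5, 1), eq. (2.38)
    return F_base * (1.0 + random.random())

def F_time_varying(it, max_it, F_max=1.0, F_min=0.4):
    # DETVSF: F decreases linearly from F_max to F_min, eq. (2.39)
    return F_min + (F_max - F_min) * (max_it - it) / max_it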
DE WITH LOCAL NEIGHBORHOOD:
In 2006, a new DE variant based on the neighborhood topology of the parameter vectors was developed [30] to overcome some of the disadvantages of the classical DE versions. The authors proposed a neighborhood-based local mutation operator that draws inspiration from PSO. Suppose we have a DE population P = [X_1, X_2, ..., X_Np], where each X_i (i = 1, 2, ..., Np) is a D-dimensional vector. Now for every vector X_i we define a neighborhood of radius k, consisting of the vectors X_{i−k}, ..., X_i, ..., X_{i+k}. We assume the vectors to be organized in a circular fashion such that the two immediate neighbors of vector X_1 are X_Np and X_2. For each member of the population a local mutation is created by employing the fittest vector in the neighborhood; the model may be expressed as

L_i(t) = X_i(t) + λ·(X_nbest(t) − X_i(t)) + F·(X_p(t) − X_q(t))   (2.40)
where the subscript nbest indicates the best vector in the neighborhood of X_i and p, q ∈ (i−k, i+k). Apart from this, we also use a global mutation expressed as

G_i(t) = X_i(t) + λ·(X_best(t) − X_i(t)) + F·(X_r(t) − X_s(t))   (2.41)

where the subscript best indicates the best vector in the entire population, and r, s ∈ (1, NP). Global mutation encourages exploitation, since all members (vectors) of the population are biased by the same individual (the population best); local mutation, in contrast, favors exploration, since in general different members of the population are likely to be biased by different individuals. Now we combine these two models using a time-varying scalar weight w ∈ (0, 1) to form the actual mutation of the new DE as a weighted mean of the local and global components:

V_i(t) = w·G_i(t) + (1 − w)·L_i(t)   (2.42)

The weight factor varies linearly with time as follows:

w = w_min + (w_max − w_min) · iter/MAXIT   (2.43)

where iter is the current iteration number, MAXIT is the maximum number of iterations allowed, and w_max and w_min denote, respectively, the maximum and minimum values of the weight, with w_max, w_min ∈ (0, 1). Thus the algorithm starts at iter = 0 with w = w_min, but as iter increases towards MAXIT, w increases gradually and ultimately reaches w_max when iter = MAXIT. Therefore at the beginning emphasis is laid on the local mutation scheme, but with time the contribution from the global model increases. In the local model, attraction towards a single point of the search space is reduced, helping DE avoid local optima. This feature is essential at the beginning of the search process, when the candidate vectors are expected to explore the search space vigorously. Clearly, a judicious choice of w_max and w_min is necessary to strike a balance between the exploration and exploitation abilities of the algorithm. After some experimentation, it was found that w_max = 0.8 and w_min = 0.4 seem to improve the performance of the algorithm over a number of benchmark functions.
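A sketch of the combined donor vector (2.40)-(2.43) is given below; the ring-neighborhood indexing and the value of the scalar λ (lam) follow the reconstruction above and should be treated as assumptions.

import numpy as np

def neighborhood_donor(X, f_vals, i, it, max_it, k=2, F=0.8, lam=0.8,
                       w_min=0.4, w_max=0.8, rng=np.random.default_rng()):
    NP = len(X)
    ring = [(i + j) % NP for j in range(-k, k + 1)]         # circular neighborhood
    nbest = min(ring, key=lambda j: f_vals[j])              # fittest neighbor
    p, q = rng.choice([j for j in ring if j != i], 2, replace=False)
    L = X[i] + lam * (X[nbest] - X[i]) + F * (X[p] - X[q])  # local mutation (2.40)
    best = int(np.argmin(f_vals))                           # population best
    r, s = rng.choice([j for j in range(NP) if j != i], 2, replace=False)
    G = X[i] + lam * (X[best] - X[i]) + F * (X[r] - X[s])   # global mutation (2.41)
    w = w_min + (w_max - w_min) * it / max_it               # weight, eq. (2.43)
    return w * G + (1 - w) * L                              # combined donor (2.42)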


CHAPTER-3
ADAPTIVE SYSTEM IDENTIFICATION
USING GA
3.1 INTRODUCTION:
Generally the identification of linear systems is performed using the LMS algorithm, but most dynamic systems exhibit nonlinearity, and the LMS based technique [31] does not perform satisfactorily in identifying nonlinear systems. To improve the identification performance for nonlinear systems, various techniques such as the Artificial Neural Network (ANN) [32], the Functional Link Artificial Neural Network (FLANN) [33] and the Radial Basis Function (RBF) network [34] have been proposed.
In this chapter we propose a novel adaptive model based on the GA technique for identification of nonlinear systems. To apply GAs to system identification, each individual in the population must represent a model of the plant, and the objective becomes a quality measure of the model, evaluated through its capacity to predict the evolution of the measured outputs. The measured output predictions inherent to each individual are compared with the measurements made on the real plant. The obtained error is a function of the individual's quality: the smaller this error, the better the individual. There are many ways in which GAs can be used to solve system identification tasks.
3.2 BASIC PRINCIPLE OF ADAPTIVE SYSTEM IDENTIFICATION:
An adaptive filter can be used in modeling, that is, imitating the behavior of physical dynamic systems which may be regarded as unknown black boxes having one or more inputs and outputs. Modeling a single-input, single-output dynamic system is shown in Fig. 3.1. Noise is taken into consideration because in many practical cases the system to be modeled is noisy, that is, it has internal random disturbing forces. Internal system noise appears at the system output and is commonly represented there as additive noise. This noise is generally uncorrelated with the plant input. If this is the case, and if the adaptive model is an adaptive linear combiner whose weights are adjusted to minimize mean square error, it can be shown that the least squares solution will be unaffected by the presence of plant noise. This is not to say that the convergence of the adaptive process will be unaffected by system noise, only that the expected weight vector of the adaptive model after convergence will be unaffected. The least squares solution will be determined primarily by the impulse response of the system to be modeled. It could also be significantly affected by the statistical or spectral character of the system input signal.
Fig.3.1 Modeling the single-input, single-output system
The problem of determining a mathematical model for an unknown system by observing its input-output data is known as system identification, which is performed by suitably adjusting the parameters within a given model such that, for a particular input, the model output matches the corresponding actual system output. After a system is identified, the output can be predicted for a given input to the system, which is the goal of the system identification problem. When the plant behavior is completely unknown, it may be characterized using a certain adaptive model and its identification task then carried out using adaptive algorithms like the
LMS. The system identification task is at the heart of numerous adaptive filtering applications. We list several of these applications here:
Channel identification
Plant identification
Echo cancellation for long distance transmission
Acoustic echo cancellation
Adaptive noise cancellation
Fig. 3.2 represents a schematic diagram of the identification of a time-invariant, causal, discrete-time dynamic plant. The output of the plant is given by y = p(x), where x is the input, which is a uniformly bounded function of time, and the operator p describes the dynamic plant. The objective of the identification problem is to construct a model generating an output ŷ which approximates the plant output y when subjected to the same input x, so that the squared error (e²) is minimized.
Fig.3.2 Schematic block diagram of a GA based adaptive identification system
In this chapter the modeling is done in an adaptive manner such that, after training the model iteratively, y and ŷ become almost equal and the squared error becomes almost zero. The minimization of the error in an iterative manner is usually achieved by LMS or RLS methods, which
are basically derivative based. The shortcoming of these methods is that for certain types of plants the squared error cannot be optimally minimized because the error surface falls into local minima. In this chapter we propose a novel and elegant method which employs the genetic algorithm to minimize the squared error in a derivative-free manner. In essence, in this chapter the system identification problem is viewed as a squared error minimization problem.
The adaptive modeling constitutes two steps. In the first step the model is trained using the GA based updating technique. After successful training of the model, the performance evaluation is carried out by feeding zero mean, uniformly distributed random input. Before we proceed to the identification task using GA, let us discuss the basics of GA based optimization.
3.3 DEVELOPMENT OF THE GA BASED ALGORITHM FOR SYSTEM IDENTIFICATION:
Referring to Fig. 3.2, let the system p(x) be an FIR system represented by the transfer function

p(z) = a_0 + a_1 z^{-1} + a_2 z^{-2} + a_3 z^{-3} + ... + a_n z^{-n}   (3.1)
where a_0, a_1, a_2, ..., a_n represent the impulse response (parameters) of the system. The measurement noise of the system is given by n(k), which is assumed to be white and Gaussian distributed. The input x is also uniformly distributed white noise lying between −√3 and +√3, and has a variance of unity. The GA based model consists of an FIR system of equal order with unknown coefficients. The purpose of the adaptive identification model is to estimate the unknown coefficients â_0, â_1, â_2, ..., â_n such that they match the corresponding parameters a_0, a_1, a_2, ..., a_n of the actual system p(z). If the system is exactly identified (theoretically), then in the case of a linear system (for example the FIR system) the system parameters and the model parameters become equal, i.e., a_0 = â_0, a_1 = â_1, a_2 = â_2, ..., a_n = â_n. Also, the response of the actual system (y) coincides with the response of the model (ŷ). In the case of a nonlinear dynamic system, however, the system parameters do not match, but the responses of the system and the model will match.
The updating of the parameters of the model is carried out using the GA rule as outlined in the following steps.
I. As shown in fig.3.2 an unknown static dynamic system to be identified is connected is
parallel with an adaptive model to be developed using GA.
II. The coefficients () of the system are initially chosen from population of M
chromosomes. Each chromosome constitutes NL number of random binary bits, each
sequential group of L-bits represent one coefficient of the adaptive model, where N is
the number of parameters of the model.
III. Generate k(=500) number of input signal samples each of which is having zero mean
and uniformly distributed between -23 to +23 and having a variance of unity.
IV. Each of the input samples is passed through the plant P(Z) and the contaminated with
the additive noise of known strength .The resultant signal acts like the desired signal . in
this way k number of desired signals are produced by feeding all the k input samples.
V. Each of the input sample is also passed through the model using each chromosome as
model parameters and M sets of K estimated output are obtained.
VI. Each of the desired outputs is compared with the corresponding estimated output and K errors are produced. The mean square error (MSE) for the set of parameters corresponding to the m-th chromosome is determined by using the relation

MSE(m) = \frac{1}{K} \sum_{i=1}^{K} e_i^2        (3.2)

This is repeated M times.
VII. Since the objective is to minimize MSE(m), m = 1 to M, the GA based optimization is used.
VIII. The tournament selection, crossover, mutation and selection operators are sequentially carried out following the steps as given in Section-3.3.
IX. In each generation the minimum MSE, MMSE, is obtained and plotted against the generation to show the learning characteristics.
X. The learning process is stopped when the MMSE reaches the minimum level.
XI. At this step all the chromosomes attain almost identical genes, which represent the estimated parameters of the developed model. A minimal sketch of this procedure is given below.
3.4. SIMULATION STUDIES:
To demonstrate the performance of the proposed GA based approach, numerous simulation studies are carried out on several linear and nonlinear systems. The performance of the proposed structure is compared with the corresponding LMS structure. The block diagram shown in Fig.3.2 is used for the simulation study.
Case-1 (Linear System)
A unit-variance uniformly distributed random signal lying in the range -√3 to +√3 is applied to the known system having transfer function
Experiment-1: H(z) = 0.2090 + 0.9950z^{-1} + 0.2090z^{-2} and
Experiment-2: H(z) = 0.2600 + 0.9300z^{-1} + 0.2600z^{-2}
The output of the system is contaminated with white Gaussian noise of different strengths, -20 dB and -30 dB. The resultant signal y is used as the desired or training signal. The same random input is also applied to the GA based adaptive model having the same linear combiner structure as that of H(z) but with random initial weights. The coefficients or weights of the linear combiner are updated using the LMS algorithm as well as the proposed GA based algorithm. The training is complete when the MSE plot in dB becomes parallel to the x-axis. Under this condition, for a linear system, the parameters a_i match the corresponding estimated parameters â_i of the proposed model.
In Table-3.1 we present the actual and estimated parameters of a 3-tap linear combiner obtained by the LMS as well as the GA based models. From this table it is observed that the GA based model performs better than the LMS based model under different noise conditions.
Table-3.1 Comparison of actual and estimated parameters of LMS and GA based models

Experiment | Actual    |       LMS Based       |       GA Based
           | Parameter | NSR=-30dB | NSR=-20dB | NSR=-30dB | NSR=-20dB
-----------+-----------+-----------+-----------+-----------+----------
    01     |  0.2090   |  0.2092   |  0.2064   |  0.2100   |  0.2061
           |  0.9950   |  0.9941   |  1.0094   |  0.9943   |  0.9985
           |  0.2090   |  0.2071   |  0.2153   |  0.2077   |  0.2077
    02     |  0.2600   |  0.2631   |  0.2705   |  0.2582   |  0.2566
           |  0.9300   |  0.9308   |  0.9289   |  0.9301   |  0.9342
           |  0.2600   |  0.2563   |  0.2624   |  0.2598   |  0.2598
[Figure: MSE in dB vs. number of iterations (samples); CH:[0.2090,0.9950,0.2090], NL=0; NSR = -20 dB and -30 dB]
Fig.3.3 Learning characteristics of LMS based linear system identification (Experiment-1)
[Figure: MSE in dB vs. number of iterations (samples); CH:[0.2600,0.9300,0.2600], NL=0; NSR = -30 dB and -20 dB]
Fig.3.4 Learning characteristics of LMS based linear system identification (Experiment-2)
[Figure: mean square error in dB vs. generation; CH:[0.2090,0.9950,0.2090], NL=0; NSR = -30 dB and -20 dB]
Fig.3.5 Learning characteristics of GA based linear system identification (Experiment-1)
[Figure: mean square error in dB vs. generation; CH:[0.2600,0.9300,0.2600], NL=0; NSR = -20 dB and -30 dB]
Fig.3.6 Learning characteristics of GA based linear system identification (Experiment-2)
Case-2 (Non-Linear System)
In this simulation the actual system is assumed to be nonlinear in nature. Computer simulation results of two different nonlinear systems are presented; in this case the actual system is
Experiment-3: y_n(k) = tanh{y(k)}
Experiment-4: y_n(k) = y(k) + 0.2y²(k) - 0.1y³(k)
where y(k) is the output of the linear system and y_n(k) is the output of the nonlinear system. In case of a nonlinear system the parameters of the two systems do not match; however, the responses of the actual system and the adaptive model match. To demonstrate this observation, training is carried out using both the LMS and GA based algorithms.
[Figure: MSE in dB vs. number of iterations (samples); CH:[0.2090,0.9950,0.2090], NL: y = tanh(y); NSR = -20 dB and -30 dB]
Fig.3.7 Learning characteristics of LMS based nonlinear system identification (Experiment-3)
[Figure: MSE in dB vs. number of iterations (samples); CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y² - 0.1y³; NSR = -30 dB and -20 dB]
Fig.3.8 Learning characteristics of LMS based nonlinear system identification (Experiment-4)
[Figure: mean square error in dB vs. generation; CH:[0.2090,0.9950,0.2090], NL: y = tanh(y); NSR = -30 dB and -20 dB]
Fig.3.9 Learning characteristics of GA based nonlinear system identification (Experiment-3)
[Figure: mean square error in dB vs. generation; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y² - 0.1y³; NSR = -30 dB and -20 dB]
Fig.3.10 Learning characteristics of GA based nonlinear system identification (Experiment-4)
[Figure: output vs. sample index; curves for the actual, GA and LMS responses; CH:[0.2090,0.9950,0.2090], NL: y = tanh(y)]
Fig.3.11 Comparison of output responses (Experiment-3) at -30 dB NSR
[Figure: output vs. sample index; curves for the actual, GA and LMS responses; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y² - 0.1y³]
Fig.3.12 Comparison of output responses (Experiment-4) at -30 dB NSR
[Figure: MSE in dB vs. number of iterations (samples); CH:[0.2600,0.9300,0.2600], NL: y = tanh(y); NSR = -20 dB and -30 dB]
Fig.3.13 Learning characteristics of LMS based nonlinear system identification (Experiment-3)
[Figure: MSE in dB vs. number of iterations (samples); CH:[0.2600,0.9300,0.2600], NL: y = y + 0.2y² - 0.1y³; NSR = -30 dB and -20 dB]
Fig.3.14 Learning characteristics of LMS based nonlinear system identification (Experiment-4)
[Figure: mean square error in dB vs. generation; CH:[0.2600,0.9300,0.2600], NL: y = tanh(y); NSR = -30 dB and -20 dB]
Fig.3.15 Learning characteristics of GA based nonlinear system identification (Experiment-3)
[Figure: mean square error in dB vs. generation; CH:[0.2600,0.9300,0.2600], NL: y = y + 0.2y² - 0.1y³; NSR = -30 dB and -20 dB]
Fig.3.16 Learning characteristics of GA based nonlinear system identification (Experiment-4)
[Figure: output vs. sample index; curves for the actual, GA and LMS responses; CH:[0.2600,0.9300,0.2600], NL: y = tanh(y)]
Fig.3.17 Comparison of output responses (Experiment-3) at -30 dB NSR
[Figure: output vs. sample index; curves for the actual, GA and LMS responses; CH:[0.2600,0.9300,0.2600], NL: y = y + 0.2y² - 0.1y³]
Fig.3.18 Comparison of output responses (Experiment-4) at -30 dB NSR
The MSE plots of Experiment-3 and Experiment-4 (each applied after the linear system of Experiment-1) for two different noise conditions using the LMS based algorithm are obtained by simulation and shown in Fig.3.7 & 3.8 respectively. The corresponding plots for the same systems using the GA based model are shown in Fig.3.9 & 3.10 respectively. The comparison of the output responses of the two nonlinear models using the LMS and GA techniques is shown in Fig.3.11 & 3.12 respectively. Similarly the MSE plots of Experiment-3 and Experiment-4 (each applied after the linear system of Experiment-2) for two different noise conditions using the LMS based algorithm are shown in Fig.3.13 & 3.14 respectively, and the corresponding plots for the GA based model are shown in Fig.3.15 & 3.16 respectively. The comparison of the output responses of the two nonlinear models using the LMS and GA techniques is shown in Fig.3.17 & 3.18 respectively. Similar results are also observed for other nonlinear models and under various noise conditions.
3.5. RESULTS AND DISCUSSIONS:
Table-3.1 reveals that for the FIR linear system the coefficients of the adaptive model estimated using LMS match the coefficients of the actual system more closely than those estimated using GA. Hence for the linear FIR system LMS works well.
For the nonlinear systems the learning characteristics of the LMS technique are poor (Figs.3.7 and 3.8) for both noise cases, but are much improved in the case of GA (Figs.3.9 and 3.10).
The output response of the nonlinear system (Experiment-3) obtained with GA is better than its LMS counterpart because the GA response is closer to the desired response (Figs.3.11 and 3.12).
CHAPTER-4
ADAPTIVE CHANNEL EQUALIZATION
USING GENETIC ALGORITHM.
4.1 INTRODUCTION:
Digital communication systems suffer from the problem of ISI, which essentially deteriorates the accuracy of reception. The probability of error at the receiver can be minimized and reduced to an acceptable level by introducing an equalizer at the front end of the receiver. An adaptive digital channel equalizer is essentially an inverse system of the channel model which primarily combats the effect of ISI. Conventionally the LMS algorithm is employed to design and develop adaptive equalizers [35]. Such equalizers use a gradient based weight update algorithm, and therefore there is a possibility that during training the equalizer weights do not attain their optimal values due to the MSE being trapped in a local minimum. On the other hand, GA and DE are derivative free techniques and hence the local minima problem does not arise during weight updates. The present chapter develops a novel GA based adaptive channel equalizer.
4.2 BASIC PRINCIPLE OF CHANNEL EQUALIZATION:
In an ideal communication channel, the received information is identical to that transmitted. However, this is not the case for real communication channels, where signal distortions take place. A channel can interfere with the transmitted data through three types of distorting effects: power degradation and fades, multi-path time dispersion, and background thermal noise [36]. Equalization is the process of recovering the data sequence from the corrupted channel samples. A typical baseband transmission system is depicted in Fig.4.1, where an equalizer is incorporated within the receiver.
Fig. 4.1. A baseband communication system (input, transmitter filter, channel medium with additive noise, receiver filter, equalizer, output)
4.2.1 MULTIPATH PROPAGATION:
Within telecommunication channels multiple paths of propagation commonly occur. In practical terms this is equivalent to transmitting the same signal through a number of separate channels, each having a different attenuation and delay. Consider an open-air radio transmission channel that has three propagation paths, as illustrated in Fig.4.2. These could be direct, earth bound and sky bound.
Multipath interference between consecutively transmitted signals will take place if one signal is received whilst the previous signal is still being detected. In Fig.4.1 this would occur if the symbol transmission rate is greater than 1/τ, where τ represents the transmission delay. Because bandwidth efficiency demands high data rates, multi-path interference commonly occurs.
Fig.4.2 Impulse response of a transmitted signal in a channel which has 3 modes of propagation: (a) the signal transmission paths, (b) the received samples
4.2.2 MINIMUM & NON-MINIMUM PHASE CHANNELS:
When all the roots of H(z) lie within the unit circle, the channel is termed minimum phase. The inverse of a minimum phase [37] channel is convergent, as illustrated by (4.1):
H(z) = 1.0 + 0.5z^{-1}

H^{-1}(z) = \frac{1}{1.0 + 0.5z^{-1}} = \sum_{i=0}^{\infty} \left(-\frac{1}{2}\right)^{i} z^{-i} = 1 - 0.5z^{-1} + 0.25z^{-2} - 0.125z^{-3} + \cdots        (4.1)
Whereas the inverse of a non-minimum phase channel is not convergent as a causal series, as shown in (4.2):

H(z) = 0.5 + 1.0z^{-1}

H^{-1}(z) = \frac{1}{0.5 + 1.0z^{-1}} = z \sum_{i=0}^{\infty} \left(-\frac{1}{2}\right)^{i} z^{i} = z\,[1 - 0.5z + 0.25z^{2} - 0.125z^{3} + \cdots]        (4.2)
Since equalizers are designed to invert the channel distortion process, they will in effect model the channel inverse. The minimum phase channel has a linear inverse model, therefore a linear equalization solution exists. However, limiting the inverse model to m dimensions will approximate the solution, and it has been shown that non-linear solutions can provide a superior inverse model in the same dimension.
A linear inverse of a non-minimum phase channel does not exist without incorporating time delays. A time delay creates a convergent series for a non-minimum phase model, where longer delays are necessary to provide a reasonable equalizer. (4.3) describes a non-minimum phase channel with a single-delay inverse and a four-sample-delay inverse. The latter of these is the more suitable form for a linear filter.
H(z) = 0.5 + 1.0z^{-1}

z^{-1}H^{-1}(z) = 1 - 0.5z + 0.25z^{2} - 0.125z^{3} + \cdots  (non-causal)

z^{-4}H^{-1}(z) \approx z^{-3} - 0.5z^{-2} + 0.25z^{-1} - 0.125  (truncated and causal)        (4.3)
The three-tap non-minimum phase channel H(z) = 0.3410 + 0.8760z^{-1} + 0.3410z^{-2} is used throughout this thesis for simulation purposes. A channel delay, D, is included to assist in the classification so that the desired output becomes u(n - D).
4.2.3 INTERSYMBOL INTERFERENCE:
Inter-symbol interference (ISI) has already been described as the overlapping of the transmitted data. It is difficult to recover the original data from one channel sample dimension because there is no statistical information about the multipath propagation. Increasing the dimensionality of the channel output vector helps characterize the multipath propagation. This has the effect of not only increasing the number of symbols but also increasing the Euclidean distance between the output classes.
Fig. 4.3 Interaction between two neighboring symbols
When additive Gaussian noise is present within the channel, the input samples will form Gaussian clusters around the symbol centers. These symbol clusters can be characterized by a probability density function (PDF) with a noise variance σ², and the noise can cause the symbol clusters to interfere. Once this occurs, equalization filtering becomes inadequate to classify all of the input samples. Error control coding schemes can be employed in such cases, but these often require extra bandwidth.
4.2.4 SYMBOL OVERLAP:
The expected number of errors can be calculated by considering the amount of symbol interaction, assuming Gaussian noise. Taking any two neighboring symbols, the cumulative distribution function (CDF) can be used to describe the overlap between the two noise characteristics. The overlap is directly related to the probability of error between the two symbols, and if these two symbols belong to opposing classes a class error will occur.
Fig.4.3 shows two Gaussian functions that could represent two symbol noise distributions; the area of overlap between them represents the probability of error. The Euclidean distance L between the symbol centers and the noise variance σ² can be used in the cumulative distribution function of (4.4) to calculate the area of overlap between the two symbol noise distributions, and therefore the probability of error, as in (4.5).
CDF(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \int_{-\infty}^{x} \exp\left(-\frac{u^{2}}{2\sigma^{2}}\right) du        (4.4)

P(\epsilon) = 2\,CDF\left(-\frac{L}{2}\right)        (4.5)
Since each channel symbol is equally likely to occur, the probability of unrecoverable errors occurring in the equalization space can be calculated using the sum of all the CDF overlaps between each pair of opposing class symbols. The probability of error is more commonly described as the BER. (4.6) describes the BER based upon the Gaussian noise overlap, where N_sp is the number of symbols in the positive class, N_m is the number of symbols in the negative class, and Δ_i is the distance between the i-th positive symbol and its closest neighboring symbol in the negative class.
BER(\sigma_n) = \frac{1}{N_{sp} + N_{m}} \sum_{i=1}^{N_{sp}} CDF\left(-\frac{\Delta_i}{2\sigma_n}\right)        (4.6)
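The overlap computation can be checked numerically. The sketch below (our own illustration using Python's standard library, under the reconstruction of (4.4)-(4.5) assumed above) evaluates the Gaussian CDF via the error function.

```python
import math

def gaussian_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma, as in (4.4)."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

L, sigma = 2.0, 0.5                        # e.g. BPSK centers at -1/+1, sigma = 0.5
p_err = 2 * gaussian_cdf(-L / 2, sigma)    # area of overlap, as in (4.5)
print(p_err)                               # ~0.0455
```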
4.3 CHANNEL EQUALIZATION:
The inverse model of a system having an unknown transfer function is itself a system having a transfer function which is in some sense a best fit to the reciprocal of the unknown transfer function. Sometimes the inverse model response contains a delay which is deliberately incorporated to improve the quality of the fit. In Fig.4.4, a source signal s(n) is fed into an unknown system that produces the input signal x(n) for the adaptive filter. The output of the adaptive filter is subtracted from a desired response signal that is a delayed version of the source signal, such that

d(n) = s(n - \Delta)

where Δ is a positive integer value. The goal of the adaptive filter is to adjust its characteristics such that the output signal is an accurate representation of the delayed source signal.
There are many applications of the adaptive inverse model of a system. If the system is a communication channel then the inverse model is an adaptive equalizer which compensates the effects of inter symbol interference (ISI) caused by the restriction of channel bandwidth [38]. Similarly, if the system is the model of a high density recording medium then its corresponding inverse model reconstructs the recorded data without distortion [39]. If the system represents a nonlinear sensor then its inverse model represents a compensator of environmental as well as inherent nonlinearities [40]. The adaptive inverse model also finds applications in adaptive control [41] as well as in deconvolution in geophysics applications [42].
Fig. 4.4: Inverse Modeling
Channel equalization is a technique of decoding transmitted signals across non-ideal communication channels. The transmitter sends a sequence s(n) that is known to both the transmitter and receiver. However, in equalization, the received signal is used as the input signal x(n) to an adaptive filter, which adjusts its characteristics so that its output closely matches a delayed version s(n - Δ) of the known transmitted signal. After a suitable adaptation period, the coefficients of the system either are fixed and used to decode future transmitted messages, or are adapted using a crude estimate of the desired response signal that is computed from y(n). This latter mode of operation is known as decision-directed adaptation.
Channel equalization is one of the first applications of adaptive filters and is described in the pioneering work of Lucky. Today, it remains one of the most popular uses of an adaptive filter. Practically every computer telephone modem transmitting at rates of 9600 bits per second or greater contains an adaptive equalizer. Adaptive equalization is also useful for wireless communication systems. Qureshi [43] has written an excellent tutorial on adaptive equalization. A related problem to equalization is deconvolution, a problem that appears in the context of geophysical exploration.
In many control tasks, the frequency and phase characteristics of the plant hamper the convergence behavior and stability of the control system. We can use an adaptive filter as shown in Fig.4.4 to compensate for the nonideal characteristics of the plant and as a method for adaptive control. In this case, the signal s(n) is sent at the output of the controller, and the signal x(n) is the signal measured at the output of the plant. The coefficients of the adaptive filter are then adjusted so that the cascade of the plant and adaptive filter can be nearly represented by the pure delay z^{-Δ}.
Transmission and storage of high density digital information play an important role in the present age of information technology. Digital information obtained from audio, video or text sources needs high density storage or transmission through communication channels. Communication channels and recording media are often modeled as band-limited channels for which the channel impulse response is that of an ideal low pass filter. When sequences of symbols are transmitted or recorded, the low pass filtering of the channel distorts the transmitted symbols over successive time intervals, causing symbols to spread and overlap with adjacent symbols. This resulting linear distortion is known as inter symbol interference. In addition, nonlinear distortion is also caused by cross talk in the channel and the use of amplifiers. In the data storage channel, the binary data is stored in the form of tiny magnetized regions called bit cells, arranged along the recording track. At read back, noise and nonlinear distortions (ISI) corrupt the signal. An ANN based equalization technique has been proposed to alleviate the ISI present during read back from the magnetic storage channel. Recently, Sun et al. [44] have reported an improved Viterbi detector to compensate the nonlinearities and media noise. Thus adaptive channel equalizers play an important role in recovering digital information from digital communication channels/storage media. Preparata had suggested a simple and attractive scheme for dispersal recovery of digital information based on the discrete Fourier transform. Subsequently Gibson et al. have reported an efficient nonlinear ANN structure for reconstructing digital signals which have passed through a dispersive channel and been corrupted with additive noise. In a recent publication the authors have proposed optimal preprocessing strategies for perfect reconstruction of binary signals from dispersive communication channels. Touri et al. have developed a deterministic worst case framework for perfect reconstruction of discrete data transmission through a dispersive communication channel. In the recent past, new adaptive equalizers have been suggested using soft computing tools such as the artificial neural network (ANN), the polynomial perceptron network (PPN) and the functional link artificial neural network (FLANN). It is reported that these methods are best suited for nonlinear and complex channels. Recently, the Chebyshev artificial neural network has also been proposed for nonlinear channel equalization [45]. The drawback of these methods is that the estimated weights may fall into local minima during training. For this reason the genetic algorithm (GA) [46] and Differential Evolution [19] have been suggested for training adaptive channel equalizers. The main attraction of GA lies in the fact that it does not rely on Newton-like gradient-descent methods, and hence there is no need for the calculation of derivatives. This makes it less likely to be trapped in local minima. But only two parameters of GA, the crossover and the mutation, help to avoid the local minima problem.
4.3.1 TRANSVERSAL EQUALIZER:
The transversal equalizer uses a time-delay vector Y(n) (4.7) of channel output samples to determine the symbol class. The {m} TE notation used to represent the transversal equalizer specifies m inputs. The equalizer filter output is classified through a threshold activation device (Fig.4.5) so that the equalizer decision belongs to one of the BPSK states u(n) ∈ {-1, +1}:

Y(n) = [y(n), y(n-1), \ldots, y(n-(m-1))]        (4.7)
Considering the inverse of the channel H(z) = 1.0 + 0.5z^{-1} that was given in (4.1), this is an infinitely long convergent linear series:

H^{-1}(z) = \sum_{i=0}^{\infty} \left(-\frac{1}{2}\right)^{i} z^{-i}
Each coefficient of this inverse model can be used in a linear equalizer as an FIR tap weight. Each tap dimension will improve the accuracy; however, high input dimensions leave the equalizer susceptible to noisy samples. If a noisy sample is received, it will remain within the filter, affecting the output from each equalizer tap. Rather than designing a linear equalizer, a non-linear filter with a shorter input dimension can be used to provide the desired performance; this will reduce the sensitivity to noise.
Fig 4.5: Linear Transversal Equalizer
4.3.2 DECISION FEEDBACK EQUALIZER:
A basic structure of the decision feedback equalizer (DFE) is shown in Fig.4.6. The DFE consists of a transversal feed forward and a feedback filter. In the case when the communication channel causes severe ISI distortion, the LTE cannot provide satisfactory performance; instead, a DFE is required. The DFE feeds past corrected samples, w(n), from a decision device to the feedback filter and combines them with the feed forward filter output.
Figure 4.6: Decision Feedback Equalizer
In effect, the function of the feedback filter is to subtract the ISI produced by previously detected symbols from the estimates of future samples. Considering that the DFE is updated with a recursive algorithm, the feed forward filter weights and feedback filter weights can be jointly adapted by the LMS algorithm on a common error signal e(n), as shown in (4.8):
W(n+1) = W(n) + \mu\, e(n) V(n)        (4.8)

where e(n) = u(n) - y(n) and V(n) = [x(n), x(n-1), \ldots, x(n-k_1-1), u(n-k_2-1), \ldots, u(n)]^T. The feed forward and feedback filter weight vectors are written in a joint vector as W(n) = [w_0(n), w_1(n), \ldots, w_{k_1+k_2-1}(n)]^T, where k_1 and k_2 represent the feed forward and feedback filter tap lengths respectively. Suppose that the decision device causes an error in estimating the symbol u(n). This error can propagate into subsequent symbols until future input samples compensate for it. This is called error propagation, and it causes a burst of errors. The detrimental potential of error propagation is the most serious drawback of decision feedback equalization. Traditionally, the DFE is described as being a non-linear equalizer because the decision device is non-linear. However, the DFE structure is still a linear combiner and the adaptation loop is also linear. It has therefore been described as a linear equalizer structure.
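A compact sketch of the joint LMS adaptation in (4.8) is given below (Python/NumPy, our own illustration; the step size mu, the tap lengths and the data layout are assumptions rather than the thesis' exact configuration).

```python
import numpy as np

def dfe_lms_step(W, x_taps, u_taps, u_n, mu=0.01):
    """One joint LMS update of the DFE of Fig.4.6, following (4.8).

    W      : joint weight vector [feed forward | feedback], length k1 + k2
    x_taps : the last k1 received samples
    u_taps : the last k2 past decisions fed back from the decision device
    u_n    : the training symbol u(n)
    """
    V = np.concatenate([x_taps, u_taps])   # joint input vector V(n)
    y = W @ V                              # equalizer output y(n)
    e = u_n - y                            # common error signal e(n)
    W = W + mu * e * V                     # W(n+1) = W(n) + mu e(n) V(n)
    return W, np.sign(y)                   # updated weights and hard decision
```

In decision-directed operation u(n) is replaced by the hard decision sign(y(n)), which is exactly where the error propagation described above enters.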
4.4. EQUALIZATION USING GA:
High speed data transmission over communication channels distorts the transmitted signals in both amplitude and phase due to the presence of Inter Symbol Interference (ISI). Other impairments like thermal noise, impulse noise and cross talk also cause further distortions to the received symbols. Adaptive equalization of the digital channels at the receiver removes/reduces the effects of such ISI and attempts to recover the transmitted symbols. Basically an equalizer is a filter which is placed in cascade between the transmitter and receiver with the aim of having an inverse transfer function of that of the channel, in order to augment the accuracy of reception. The Least-Mean-Square (LMS), Recursive-Least-Square (RLS) and Multilayer Perceptron (MLP) based equalizers aim to minimize the ISI present in the channels, particularly for nonlinear channels. However, they suffer from long training times and undesirable local minima during training. The disadvantages of these derivative based algorithms have been discussed in Chapter-3. In the present chapter we propose a new adaptive channel equalizer using the Genetic Algorithm (GA) optimization technique, which is essentially a derivative free optimization tool. This algorithm is suitably used to update the weights of the equalizer. The performance of the proposed equalizer has been evaluated and compared with its LMS based counterpart. However, being a population based algorithm, the standard Genetic Algorithm (SGA) suffers from a slower convergence rate.
4.4.1 FORMULATION OF CHANNEL EQUALIZATION PROCESS AS
AN OPTIMIZATION PROBLEM:
An adaptive channel equalizer is basically an adaptive tapped-delay digital filter with its order higher than that of the channel filter. A typical diagram of a channel equalizer is shown in Fig.4.7. At any k-th instant the equalizer output is given by

Y(k) = \sum_{n=0}^{N-1} x(n+k)\, h_n(k)        (4.8)
where N is the order of the equalizer filter. The desired signal d(k) at the k-th instant is formed by delaying the input sequence x(n+k) by m samples. In actual practice m is usually taken as N/2 or (N+1)/2 depending upon whether N is even or odd; that is, d(k) = x(n+k-m).
In the beginning of training the initial weights h_n(0), n = 0, 1, ..., N-1, are taken to be random values within a certain bound. Subsequently these weights are updated by GA based adaptive rules. The error signal e(k) at the k-th instant is given by

e(k) = d(k) - Y(k)        (4.9)
In LMS type learning algorithms e²(k), instead of e(k), is taken as the cost function for deriving the steepest descent algorithm, because e²(k) is always positive and represents the instantaneous power of the difference signal.
In adaptive equalizers the parameters which can vary during training are the filter weights. The objective of an adaptive algorithm is to change the filter weights iteratively so that e²(k) is minimized and subsequently reduced towards zero. In developing the GA based algorithm, a set of chromosomes within a bound is selected, each representing the weight vector of the equalizer.
Fig.4.7 An 8-tap adaptive digital channel equalizer.
The GA then starts from the initial random strings and proceeds repeatedly from generation to generation through three genetic operators. The selection procedure reproduces highly fit individuals which provide minimum mean square error (MMSE) at the equalizer output. A flow chart of the GA based adaptive algorithm for channel equalization is shown in Fig.4.8.
Fig.4.8. A flow chart of genetic based adaptive algorithm for channel equalizer.
[Flow chart: initialize the population (random set of equalizer filter weights); evaluate the fitness of the whole population (MSE of the equalizer); apply selection (sort based on decreasing MSE); create a new generation through crossover and mutation operators; terminate and stop when the MMSE is reached, else repeat]
4.4.2. STEPWISE REPRESENTATION OF GA BASED CHANNEL EQUALIZATION ALGORITHM:
The updating of the weights of the GA based equalizer is carried out using the GA rules as outlined in the following steps:
1. As shown in Fig.4.8, a GA based adaptive equalizer is connected in series with the channel.
2. The structure of the equalizer is an FIR system whose coefficients are initially chosen from a population of M chromosomes. Each chromosome constitutes NL random binary bits, where each sequential group of L bits represents one coefficient of the adaptive model and N is the number of parameters of the model.
3. Generate K (=1000) input signal samples which are random binary in nature.
4. Each of the input samples is passed through the channel and then contaminated with additive noise of known strength. The resultant signal is passed through the equalizer. In this way K equalizer outputs are produced by feeding all the K input samples.
5. Each of the input signals is delayed by m samples and acts as the desired signal.
6. Each of the desired outputs is compared with the corresponding equalizer output and K errors are produced. The mean square error (MSE) for a given group of parameters (corresponding to the m-th chromosome) is determined by using the relation

MSE(m) = \frac{1}{K} \sum_{i=1}^{K} e_i^2

This is repeated for all M chromosomes.
7. Since the objective is to minimize MSE(m), m = 1 to M, the GA based optimization is used.
8. The crossover, mutation and selection operators are sequentially carried out.
9. In each generation the minimum MSE, MMSE (expressed in dB), is stored, which shows the learning behavior of the adaptive model from generation to generation.
10. When the MMSE has reached a pre-specified level the optimization is stopped.
11. At this step all the chromosomes attain almost identical genes, which represent the desired filter coefficients of the equalizer. A compact sketch of the fitness evaluation used in these steps is given below.
4.5. SIMULATIONS:
In this section we carry out the simulation study of the new channel equalizer. The block diagram of Fig.4.7 is simulated, where the equalizer coefficients are adapted based on LMS and GA. The algorithm proposed in Section 4.4 is used in the simulation for GA. Four different channels (two linear and two nonlinear) and additive channel noise strengths of -30 dB and -20 dB are used for the simulation.
The following channel models are used:
a. Linear channel coefficients
(i) CH1: [0.2090, 0.9950, 0.2090]
(ii) CH2: [0.3040, 0.9030, 0.3040]
b. Nonlinear channels
(i) NCH1: b(k) = a(k) + 0.2a²(k) - 0.1a³(k)
(ii) NCH2: b(k) = a(k) + 0.2a²(k) - 0.1a³(k) + 0.5cos(πa(k))
where a(k) is the output of the linear channel and b(k) is the output of the nonlinear channel.
The desired signal is generated by delaying the input binary sequence by m samples, where m = N/2 or (N+1)/2 depending upon whether N is even or odd. In the simulation study N = 8 has been taken. The convergence characteristics and bit error rate (BER) plots obtained from simulation for the different channels under different noise conditions using LMS and GA are shown in the following figures.
[Figure: MSE in dB vs. number of iterations/generations; CH:[0.2090,0.9950,0.2090], NL=0]
Fig.4.9 Plot of convergence characteristic of linear channel CH1 at -30 dB using LMS
[Figure: MSE in dB vs. number of iterations/generations; CH:[0.3040,0.9030,0.3040], NL=0]
Fig.4.10 Plot of convergence characteristic of linear channel CH2 at -30 dB using LMS
[Figure: MSE in dB vs. number of generations; CH:[0.2090,0.9950,0.2090], NL=0; NSR = -20 dB and -30 dB]
Fig.4.11 Plot of convergence characteristic of linear channel CH1 using GA
[Figure: MSE in dB vs. number of generations; CH:[0.3040,0.9030,0.3040], NL=0; NSR = -20 dB and -30 dB]
Fig.4.12 Plot of convergence characteristic of linear channel CH2 using GA
[Figure: probability of error vs. SNR in dB; CH:[0.2090,0.9950,0.2090], NL=0; curves for GA and LMS]
Fig.4.13 Comparison of BER of linear channel CH1 between LMS and GA based equalizers at -30 dB noise
[Figure: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040], NL=0; curves for LMS and GA]
Fig.4.14 Comparison of BER of linear channel CH2 between LMS and GA based equalizers at -30 dB noise
[Figure: MSE in dB vs. number of iterations/generations; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y² - 0.1y³]
Fig.4.15 Plot of convergence characteristic of nonlinear channel NCH1 for CH1 using LMS
[Figure: MSE in dB vs. number of iterations/generations; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y² - 0.1y³]
Fig.4.16 Plot of convergence characteristic of nonlinear channel NCH1 for CH2 using LMS
[Figure: MSE in dB vs. number of generations; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y² - 0.1y³; NSR = -20 dB and -30 dB]
Fig.4.17 Plot of convergence characteristic of nonlinear channel NCH1 for CH1 using GA
[Figure: MSE in dB vs. number of generations; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y² - 0.1y³; NSR = -20 dB and -30 dB]
Fig.4.18 Plot of convergence characteristic of nonlinear channel NCH1 for CH2 using GA
[Figure: probability of error vs. SNR in dB; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y² - 0.1y³; curves for LMS and GA]
Fig.4.19 Comparison of BER of nonlinear channel NCH1 for CH1 between LMS and GA based equalizers at -30 dB noise
[Figure: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y² - 0.1y³; curves for LMS and GA]
Fig.4.20 Comparison of BER of nonlinear channel NCH1 for CH2 between LMS and GA based equalizers at -30 dB noise
[Figure: MSE in dB vs. number of iterations/generations; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y² - 0.1y³ + 0.5cos(πy)]
Fig.4.21 Plot of convergence characteristic of nonlinear channel NCH2 for CH1 using LMS
[Figure: MSE in dB vs. number of iterations/generations; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y² - 0.1y³ + 0.5cos(πy)]
Fig.4.22 Plot of convergence characteristic of nonlinear channel NCH2 for CH2 using LMS
[Figure: MSE in dB vs. number of generations; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y² - 0.1y³ + 0.5cos(πy); NSR = -30 dB and -20 dB]
Fig.4.23 Plot of convergence characteristic of nonlinear channel NCH2 for CH1 using GA
[Figure: MSE in dB vs. number of generations; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y² - 0.1y³ + 0.5cos(πy); NSR = -30 dB and -20 dB]
Fig.4.24 Plot of convergence characteristic of nonlinear channel NCH2 for CH2 using GA
[Figure: probability of error vs. SNR in dB; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y² - 0.1y³ + 0.5cos(πy); curves for LMS and GA]
Fig.4.25 Comparison of BER of nonlinear channel NCH2 for CH1 between LMS and GA based equalizers at -30 dB noise
[Figure: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y² - 0.1y³ + 0.5cos(πy); curves for GA and LMS]
Fig.4.26 Comparison of BER of nonlinear channel NCH2 for CH2 between LMS and GA based equalizers at -30 dB noise
4.6 RESULTS AND DISCUSSIONS:
The convergence characteristics obtained from simulation are shown in Fig.4.9 & 4.11 using LMS and GA respectively for linear channel a(i), and in Fig.4.10 & 4.12 using LMS and GA respectively for linear channel a(ii). Similarly the BER plot for channel a(i) is shown in Fig.4.13 and for channel a(ii) in Fig.4.14.
The convergence characteristics for channels a(i & ii) with nonlinearity b(i) are shown in Fig.4.15 & 4.16 using LMS and in Fig.4.17 & 4.18 using GA. Similarly the BER plots for channels a(i & ii) with b(i) are shown in Fig.4.19 & 4.20.
The convergence characteristics for channels a(i & ii) with nonlinearity b(ii) are shown in Fig.4.21 & 4.22 using LMS and in Fig.4.23 & 4.24 using GA. Similarly the BER plots for channels a(i & ii) with b(ii) are shown in Fig.4.25 & 4.26.
It is observed from the convergence characteristics and BER plots that the GA based equalizers outperform the corresponding LMS counterparts. This is true for both linear and nonlinear channels. Under high noise conditions the results of the GA based equalizers are distinctly better.
CHAPTER-5
ADAPTIVE SYSTEM IDENTIFICATION
USING DIFFERENTIAL EVOLUTION
5.1. INTRODUCTION:
The identification of linear and nonlinear systems was performed in Chapter-3 using the LMS and GA techniques. From that chapter we conclude that linear system identification is performed well by the LMS technique and nonlinear system identification is performed well by the GA technique. But GA has a longer convergence time and more computational complexity, and requires binary coding of the parameters. To improve the identification performance for nonlinear systems various techniques such as ANN, FLANN, RBF, etc. are used.
In this chapter we propose a novel model based on the DE technique for identification. DE is an efficient and powerful population based stochastic search technique for solving optimization problems over continuous spaces, which has been widely applied in many scientific and engineering fields. However, the success of DE in solving a specific problem crucially depends on appropriately choosing the trial vector generation strategy and the associated control parameter values.
5.2. DE BASED OPTIMIZATION:
The DE is based on the mechanics of natural selection and the evolutionary behavior of biological systems. It has been successfully applied to diverse fields such as mechanical engineering, communication and pattern recognition. In DE there exist many trial vector generation strategies, out of which a few may be suitable for solving a particular problem. The three crucial control parameters involved in DE, namely the population size (NP), the scaling factor (F) and the crossover rate (CR), may significantly influence the optimization performance of the DE. Fig.5.1 shows the basic operation of DE.
Fig.5.1 Block Diagram of Differential Evolution Algorithm cycle.
5.2.1 OPERATORS OF DE:
(a) Population:
Parameter vectors in a population are randomly initialized and evaluated using the fitness function. The initial population consists of NP D-dimensional parameter vectors, the so-called individuals,

X_{i,G} = \{x^{1}_{i,G}, x^{2}_{i,G}, \ldots, x^{D}_{i,G}\}, \quad i = 1, 2, \ldots, NP

where NP is the number of population vectors, D is the dimension and G is the generation.
(b) Mutation:
The mutation operation produces a mutant or noisy vector V_{i,G} with respect to each individual X_{i,G}, the so-called target vector, in the current population. For each target vector X_{i,G} at
the generation G, its associated mutant vector V_{i,G} = \{v^{1}_{i,G}, v^{2}_{i,G}, \ldots, v^{D}_{i,G}\} can be generated via a certain mutation strategy, i.e.

V_{i,G} = X_{r_1,G} + F\,(X_{r_2,G} - X_{r_3,G})        (5.1)
where r_1, r_2 and r_3 are mutually exclusive integers randomly generated within the range [1, NP], which are also different from the index i, and F is a positive control parameter for scaling the difference vector, called the scaling factor.
(c) Crossover operation:
After mutation, a crossover operation is applied to each pair of the target vector X_{i,G} and its corresponding mutant vector V_{i,G} to generate a trial vector

U_{i,G} = \{u^{1}_{i,G}, u^{2}_{i,G}, \ldots, u^{D}_{i,G}\}        (5.2)
In the basic version, DE employs the binomial (uniform) crossover defined as follows:

u^{j}_{i,G} = \begin{cases} v^{j}_{i,G}, & \text{if } rand_j[0,1) \le CR \text{ or } j = j_{rand} \\ x^{j}_{i,G}, & \text{otherwise} \end{cases}        (5.3)

where j = 1, 2, ..., D. The crossover rate (CR) is a user specified constant within the range [0, 1), which controls the fraction of parameter values copied from the mutant vector, and j_rand is a randomly chosen integer in the range [1, D].
(d) Selection operation:
In the selection operation the objective function value of each trial vector, f(U_{i,G}), is compared to that of its corresponding target vector, f(X_{i,G}), in the current population. If the trial vector has a less than or equal objective function value compared with the corresponding target vector, the trial vector will
replace the target vector and enter the population of the next generation. Otherwise, the target
vector will remain in the population for the next generation. The selection operation can be
expressed as follows:
X_{i,G+1} = \begin{cases} U_{i,G}, & \text{if } f(U_{i,G}) \le f(X_{i,G}) \\ X_{i,G}, & \text{otherwise} \end{cases}        (5.4)
The steps (b), (c) and (d) are repeated generation after generation until some specific termination criteria are satisfied. A compact sketch of one such generation is given below.
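A minimal sketch of one DE generation (the common DE/rand/1/bin variant, consistent with (5.1)-(5.4)) follows in Python/NumPy; the default F and CR values are illustrative, not the thesis' tuned settings.

```python
import numpy as np

rng = np.random.default_rng(2)

def evolve(pop, fitness, F=0.8, CR=0.9):
    """One DE generation: mutation (5.1), binomial crossover (5.3), selection (5.4)."""
    NP, D = pop.shape
    fit = np.array([fitness(x) for x in pop])
    new_pop = pop.copy()
    for i in range(NP):
        # mutation: three mutually exclusive indices, all different from i
        r1, r2, r3 = rng.choice([j for j in range(NP) if j != i], 3, replace=False)
        v = pop[r1] + F * (pop[r2] - pop[r3])     # mutant vector V
        # binomial crossover: at least one gene always comes from the mutant
        cross = rng.random(D) < CR
        cross[rng.integers(D)] = True             # the j_rand position
        u = np.where(cross, v, pop[i])            # trial vector U
        # greedy selection: the trial replaces the target only if not worse
        if fitness(u) <= fit[i]:
            new_pop[i] = u
    return new_pop
```

Note that the greedy selection of (5.4) guarantees that the best fitness in the population never worsens from one generation to the next.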
5.3. STEPWISE REPRESENTATION OF DE BASED
ADAPTIVE SYSTEM IDENTIFICATION
ALGORITHM:
i. As shown in Fig.3.2, the unknown static/dynamic system to be identified is connected in parallel with an adaptive model to be developed using DE.
ii. The coefficients of the model are initially chosen as a population of NP target vectors. Each target vector constitutes D random numbers, each representing one coefficient of the adaptive model, where D is the number of parameters of the model.
iii. Generate K (=500) input signal samples, each of zero mean and uniformly distributed between -√3 and +√3, and hence of unit variance.
iv. Each of the input samples is passed through the plant P(z) and then contaminated with additive noise of known strength. The resultant signal acts as the desired signal; in this way K desired signals are produced by feeding all the K input samples.
v. Each of the input samples is also passed through the model using each target vector as the model parameters, and NP sets of K estimated outputs are obtained.
vi. Each of the desired outputs is compared with the corresponding estimated output and K errors are produced. The mean square error (MSE) for the set of parameters corresponding to the n-th target vector is determined by using the relation

MSE(n) = \frac{1}{K} \sum_{i=1}^{K} e_i^2        (5.5)

This is repeated NP times.
This is repeated for NP times
vii. Since the objective is to minimize MSE(m),m=1 to NP the GA based optimization is
used.
viii. The mutation operation, crossover operation and selection operation are sequentially
carried out following the steps as given in section-5.2.
ix. In each generation the minimum MSE, MMSE, is obtained and plotted against the generation to show the learning characteristics.
x. The learning process is stopped when the MMSE reaches the minimum level.
xi. At this step all the individuals attain almost identical parameters, which represent the estimated parameters of the developed model. A short sketch of how these steps combine with the evolve() routine above is given below.
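As a sketch of how these steps fit together (reusing the hypothetical evolve() above and an mse() fitness over the desired signal, as in the Chapter-3 sketch):

```python
# Illustration only: DE based identification loop.
pop = rng.uniform(-1.0, 1.0, (40, 3))   # step ii: 40 random 3-parameter vectors
for G in range(100):                    # steps vii-x: iterate over generations
    pop = evolve(pop, mse)              # mutation, crossover, selection
best = min(pop, key=mse)                # step xi: estimated model parameters
```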
5.4. SIMULATION STUDIES:
To demonstrate the performance of the proposed DE based approach, numerous simulation studies are carried out on several linear and nonlinear systems. The performance of the proposed structure is compared with the corresponding LMS and GA structures. The block diagram shown in Fig.3.2 is used for the simulation study.
Case-1 (Linear System)
A unit-variance uniformly distributed random signal lying in the range -√3 to +√3 is applied to the known system having transfer function
Experiment-1: H(z) = 0.2090 + 0.9950z^{-1} + 0.2090z^{-2} and
Experiment-2: H(z) = 0.2600 + 0.9300z^{-1} + 0.2600z^{-2}
The output of the system is contaminated with white Gaussian noise of different strengths, -20 dB and -30 dB. The resultant signal y is used as the desired or training signal. The same random input is also applied to the DE based adaptive model having the same linear combiner structure as that of H(z) but with random initial weights. By adjusting the scaling factor (F) and crossover rate (CR), it has been seen that for the linear system the actual and estimated parameters become the same.
Case-2 (Non-Linear System)
In this simulation the actual system is assumed to be nonlinear in nature. Computer simulation results of two different nonlinear systems are presented; in this case the actual system is
Experiment-3: y_n(k) = tanh{y(k)}
Experiment-4: y_n(k) = y(k) + 0.2y²(k) - 0.1y³(k)
where y(k) is the output of the linear system and y_n(k) is the output of the nonlinear system. In case of a nonlinear system the parameters of the two systems do not match; however, the responses of the actual system and the adaptive model match. To demonstrate this observation, training is carried out using the DE based algorithm.
[Figure: MSE in dB vs. generation; CH:[0.2090,0.9950,0.2090], NL: y = tanh(y); NSR = -30 dB and -20 dB]
Fig.5.2 Learning characteristics of DE based nonlinear system identification at -20 dB and -30 dB NSR (Experiment-3)
[Figure: output vs. sample index; curves for the actual, DE and LMS responses; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y² - 0.1y³]
Fig.5.3 Comparison of output responses (Experiment-3) at -30 dB NSR
[Figure: MSE in dB vs. generation; CH:[0.2600,0.9300,0.2600], NL: y = tanh(y); NSR = -30 dB and -20 dB]
Fig.5.4 Learning characteristics of DE based nonlinear system identification at -20 dB and -30 dB NSR (Experiment-3)
[Figure: output vs. sample index; curves for the actual, DE and LMS responses; CH:[0.2600,0.9300,0.2600], NL: y = tanh(y)]
Fig.5.5 Comparison of output responses (Experiment-3) at -30 dB NSR
5.5. RESULTS AND DISCUSSIONS:
Fig.5.2 & 5.4 show the learning characteristics for the nonlinearity of Experiment-3 cascaded with the linear systems of Experiment-1 and Experiment-2 respectively.
Fig.5.3 & 5.5 show the output responses for the nonlinearity of Experiment-3 cascaded with Experiment-1 and Experiment-2 respectively.
The output response of the nonlinear system (Experiment-3) obtained with the DE based model is better than the LMS based and GA based ones, because the response of DE is closer to the desired response, as seen by comparing Fig.3.11 with Fig.5.3 and Fig.3.17 with Fig.5.5.
CHAPTER-6
ADAPTIVE CHANNEL EQUALIZATION
USING DIFFERENTIAL EVOLUTION
6.1. INTRODUCTION:
The equalization of linear and nonlinear channels was performed in Chapter-4 using the LMS and GA techniques. From that chapter we conclude that linear channel equalization is performed well by the LMS technique and nonlinear channel equalization is performed well by the GA technique. But GA has a longer convergence time and more computational complexity, and requires binary coding of the parameters. To improve the equalization performance for nonlinear channels various techniques such as ANN, FLANN, RBF, etc. are used.
In this chapter we propose a novel model based on the DE technique for equalization. DE is an efficient and powerful population based stochastic search technique for solving optimization problems over continuous spaces, which has been widely applied in many scientific and engineering fields. However, the success of DE in solving a specific problem crucially depends on appropriately choosing the trial vector generation strategy and the associated control parameter values.
6.2. STEPWISE PRESENTATION OF DE BASED CHANNEL EQUALIZATION ALGORITHM:
The updating of the weights of the DE based equalizer is carried out using the DE rules as outlined in the following steps:
i. As shown in Fig.4.8, a DE based adaptive equalizer is connected in series with the channel.
ii. The structure of the equalizer is an FIR system whose coefficients are initially chosen from a population of NP target vectors. Each target vector constitutes D random numbers, where each random number represents one coefficient of the adaptive model and D is the number of parameters of the model.
iii. Generate K (=1000) input signal samples which are random binary in nature.
iv. Each of the input samples is passed through the channel and then contaminated with additive noise of known strength. The resultant signal is passed through the equalizer. In this way K equalizer outputs are produced by feeding all the K input samples.
v. Each of the input signals is delayed by m samples and acts as the desired signal.
vi. Each of the desired outputs is compared with the corresponding equalizer output and K errors are produced. The mean square error (MSE) for a given group of parameters (corresponding to the n-th target vector) is determined by using the relation

MSE(n) = \frac{1}{K} \sum_{i=1}^{K} e_i^2

This is repeated NP times.
vii. Since the objective is to minimize MSE(n), n = 1 to NP, the DE based optimization is used.
viii. The mutation, crossover and selection operations are sequentially carried out following the steps given in Section 5.2.
ix. In each generation the minimum MSE, MMSE (expressed in dB), is stored, which shows the learning behavior of the adaptive model from generation to generation.
x. When the MMSE has reached a pre-specified level the optimization is stopped.
xi. At this step all the target vectors attain almost identical parameters, which represent the desired filter coefficients of the equalizer. A short sketch combining these steps with the earlier code fragments is given below.
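Putting the pieces together (again reusing the hypothetical evolve() of the Chapter-5 sketch and the mse()/ber() helpers of the Chapter-4 sketch):

```python
# Illustration only: DE based equalizer training plus a BER check.
pop = rng.uniform(-1.0, 1.0, (40, 8))   # 40 candidate 8-tap equalizer weight vectors
for G in range(200):
    pop = evolve(pop, mse)              # minimize MSE against the delayed input
best = min(pop, key=mse)
print("training MSE:", mse(best), "BER:", ber(best))
```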
6.3. SIMULATIONS:
In this section we carry out the simulation study of the new channel equalizer. The block diagram of Fig.4.7 is simulated, where the equalizer coefficients are adapted based on LMS, GA and DE. The algorithm proposed in Section 6.2 is used in the simulation for DE. Four different channels (two linear and two nonlinear) and additive channel noise strengths of -30 dB and -20 dB are used for the simulation.
The following channel models are used:
a. Linear channel coefficients
(i) CH1: [0.2090, 0.9950, 0.2090]
(ii) CH2: [0.3040, 0.9030, 0.3040]
b. Nonlinear channels
(i) NCH1: b(k) = a(k) + 0.2a²(k) - 0.1a³(k)
(ii) NCH2: b(k) = a(k) + 0.2a²(k) - 0.1a³(k) + 0.5cos(πa(k))
where a(k) is the output of the linear channel and b(k) is the output of the nonlinear channel.
The desired signal is generated by delaying the input binary sequence by m samples, where m = N/2 or (N+1)/2 depending upon whether N is even or odd. In the simulation study N = 8 has been taken. The convergence characteristics and bit error rate (BER) plots obtained from simulation for the different channels under different noise conditions using LMS, GA and DE are shown in the following figures.
[Figure: MSE in dB vs. generation; CH:[0.3040,0.9030,0.3040], NL=0]
Fig.6.1 Plot of convergence characteristic of linear channel CH2 at -30 dB NSR
[Figure: MSE in dB vs. generation; CH:[0.2090,0.9950,0.2090], NL: y = y + 0.2y² - 0.1y³]
Fig.6.2 Plot of convergence characteristic of nonlinear channel NCH1 using CH1 at -30 dB NSR
[Figure: MSE in dB vs. generation; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y² - 0.1y³; NSR = -20 dB and -30 dB]
Fig.6.3 Plot of convergence characteristic of nonlinear channel NCH1 using CH2 at -30 dB and -20 dB NSR
[Figure: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040], NL=0; curves for DE, LMS and GA]
Fig.6.4 Comparison of BER plot of linear channel CH2 between LMS, GA and DE at -30 dB NSR
[Figure: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040], NL: y = tanh(y); curves for LMS, GA and DE]
Fig.6.5 Comparison of BER plot of nonlinear channel NCH2 using CH2 between LMS, GA and DE at -30 dB NSR
[Figure: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y² - 0.1y³; curves for DE, LMS and GA]
Fig.6.6 Comparison of BER plot of nonlinear channel NCH1 using CH2 between LMS, GA and DE at -30 dB NSR
[Figure: probability of error vs. SNR in dB; CH:[0.3040,0.9030,0.3040], NL: y = y + 0.2y² - 0.1y³; curves for DE, GA and LMS]
Fig.6.7 Comparison of BER plot of nonlinear channel NCH1 using CH2 between LMS, GA and DE at -20 dB NSR
6.4 RESULTS AND DISCUSSIONS:
Fig.6.1 shows the convergence characteristic of linear channel CH2 at -30 dB NSR. Fig.6.2 & 6.3 show the convergence characteristics of nonlinear channel NCH1 using CH1 at -30 dB and using CH2 at -30 dB & -20 dB respectively. Fig.6.4 & 6.5 show the comparison of BER plots between LMS, GA and DE for the linear channel and for nonlinear channel NCH2 using CH2 at -30 dB NSR respectively. Fig.6.6 & 6.7 show the comparison of BER plots between LMS, GA and DE for nonlinear channel NCH1 using CH2 at -30 dB and -20 dB respectively. Using the same channels and the same noise conditions, the corresponding results are obtained for the LMS and GA based equalizers; these are used for comparison.
It is observed from the plots of Fig.6.4, 6.5, 6.6 & 6.7 that the DE based equalizers outperform the corresponding LMS and GA counterparts. This is true for both linear and nonlinear channels.
CHAPTER-7
CONCLUSIONS, REFERENCES AND SCOPE FOR FUTURE WORK:
7.1 CONCLUSIONS:
Chapter-3 presents a novel GA based adaptive model identification of dynamic nonlinear systems. The problem of nonlinear system identification has been formulated as an MSE minimization problem. The GA is then successfully used in an iterative manner to optimize the coefficients of linear and nonlinear adaptive models. It is demonstrated through simulations that the proposed approach exhibits superior performance compared to its LMS counterpart in identifying both linear and nonlinear systems under various additive Gaussian noise conditions. Thus GA is a useful alternative to the LMS algorithm for nonlinear system identification.
Chapter-4 proposes a novel adaptive digital channel equalizer using GA based optimization. Through computer simulation it is shown that the GA based equalizer yields superior performance compared to its LMS counterpart. This observation is true for both linear and nonlinear channels.
Chapter 5 presents a novel DE based adaptive model identification of linear and dynamic nonlinear systems. The problem of nonlinear system identification is again formulated as an MSE minimization problem, and the DE is used iteratively to optimize the coefficients of linear and nonlinear adaptive models. It is demonstrated through simulations that the proposed approach exhibits superior performance compared to its LMS and GA counterparts in identifying both linear and nonlinear systems under various additive Gaussian noise conditions. Thus DE is a useful alternative to the LMS and GA algorithms for nonlinear system identification.
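The DE update can likewise be sketched in a few lines of Python; this is a hedged illustration of the classical DE/rand/1/bin scheme (parameter values F, CR and the population size are my assumptions, not the thesis settings): each target vector is perturbed by the scaled difference of two other population members, recombined by binomial crossover, and replaced only if the trial vector's cost does not worsen.

import numpy as np

def de_identify(cost, n_dim, pop=30, gens=200, F=0.8, CR=0.9, seed=0):
    # DE/rand/1/bin over weight vectors; cost is any scalar objective, e.g. model MSE.
    rng = np.random.default_rng(seed)
    P = rng.uniform(-1.0, 1.0, (pop, n_dim))       # initial population of weight vectors
    f = np.array([cost(w) for w in P])
    for _ in range(gens):
        for i in range(pop):
            r1, r2, r3 = rng.choice([j for j in range(pop) if j != i], 3, replace=False)
            v = P[r1] + F * (P[r2] - P[r3])        # mutant vector
            cross = rng.random(n_dim) < CR
            cross[rng.integers(n_dim)] = True      # guarantee at least one gene from the mutant
            u = np.where(cross, v, P[i])           # binomial crossover -> trial vector
            fu = cost(u)
            if fu <= f[i]:                         # greedy selection
                P[i], f[i] = u, fu
    return P[np.argmin(f)]                         # best weight vector found

Pairing de_identify with an MSE cost such as the model_mse of the previous sketch, e.g. de_identify(lambda w: model_mse(w, x, d), n_dim=3), reproduces the identification loop; only the cost function changes between applications.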
Chapter 6 proposes a novel adaptive digital channel equalizer using DE based optimization. Through simulation it is shown that the DE based equalizers yield superior performance compared to their LMS and GA counterparts. This observation holds for both linear and nonlinear channels.
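Relative to identification, the only structural change for equalization is the fitness function: a candidate weight vector filters the received signal and its output is scored against a delayed copy of the transmitted training symbols. A hedged sketch, assuming a known decision delay (names are illustrative):

import numpy as np

def equalizer_mse(w, r, x, delay):
    # Equalizer fitness: filter the received signal r with candidate weights w
    # and score the output against the transmitted training symbols x, shifted
    # by the assumed decision delay.
    z = np.convolve(r, w, mode="full")[: len(x)]
    return np.mean((x[: len(x) - delay] - z[delay:]) ** 2)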
7.2 REFERENCES:
[1] K. S. Narendra and K. Parthasarathy, "Identification and control of dynamical systems using neural networks," IEEE Trans. on Neural Networks, vol. 1, pp. 4-26, January 1990.
[2] J. C. Patra, A. C. Kot and G. Panda, "An intelligent pressure sensor using neural networks," IEEE Trans. on Instrumentation and Measurement, vol. 49, issue 4, pp. 829-834, Aug. 2000.
[3] M. Pachter and O. R. Reynolds, "Identification of a discrete time dynamical system," IEEE Trans. on Aerospace and Electronic Systems, vol. 36, issue 1, pp. 212-225, 2000.
[4] G. B. Giannakis and E. Serpedin, "A bibliography on nonlinear system identification," Signal Processing, vol. 83, no. 3, pp. 533-580, 2001.
[5] E. A. Robinson and T. Durrani, Geophysical Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1986.
[6] D. P. Das and G. Panda, "Active mitigation of nonlinear noise processes using a novel filtered-s LMS algorithm," IEEE Trans. on Speech and Audio Processing, vol. 12, issue 3, pp. 313-322, May 2004.
[7] B. Widrow and S. D. Stearns, Adaptive Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1985.
[8] G. J. Gibson, S. Siu and C. F. N. Cowan, "The application of nonlinear structures to the reconstruction of binary signals," IEEE Trans. on Signal Processing, vol. 39, no. 8, pp. 1877-1884, Aug. 1991.
[9] R. W. Lucky, "Techniques for adaptive equalization of digital communication systems," Bell Sys. Tech. J., vol. 45, pp. 255-286, Feb. 1966.
[10] H. Sun, G. Mathew and B. Farhang-Boroujeny, "Detection techniques for high density magnetic recording," IEEE Trans. on Magnetics, vol. 41, no. 3, pp. 1193-1199, March 2005.
[11] L. J. Griffiths, F. R. Smolka and L. D. Trembly, "Adaptive deconvolution: a new technique for processing time-varying seismic data," Geophysics, June 1977.
[12] B. Widrow, J. M. McCool, M. G. Larimore and C. R. Johnson, Jr., "Stationary and nonstationary learning characteristics of the LMS adaptive filter," Proc. IEEE, vol. 64, no. 8, pp. 1151-1162, Aug. 1976.
[13] B. Friedlander and M. Morf, "Least-squares algorithms for adaptive linear phase filtering," IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. ASSP-30, no. 3, pp. 381-390, June 1982.
[14] S. A. White, "An adaptive recursive digital filter," Proc. 9th Asilomar Conf. on Circuits, Systems and Computers, p. 21, Nov. 1975.
[15] J. J. Shynk, "Adaptive IIR filtering," IEEE ASSP Magazine, pp. 4-21, April 1989.
[16] A. E. Eiben and J. E. Smith, Introduction to Evolutionary Computing, Springer, 2003, ISBN 3-540-40184-9.
[17] A. Engelbrecht, Computational Intelligence: An Introduction, Wiley & Sons, ISBN 0-470-84870-7.
[18] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, 1989.
[19] A. K. Qin, V. L. Huang and P. N. Suganthan, "Differential evolution algorithm with strategy adaptation for global numerical optimization," IEEE Trans. on Evolutionary Computation, vol. 13, no. 2, April 2009.
[20] A. Konar, Computational Intelligence: Principles, Techniques and Applications, Springer, Berlin Heidelberg New York, 2005.
[21] J. H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, 1975.
[22] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, MA, 1989.
[23] J. Kennedy, R. Eberhart and Y. Shi, Swarm Intelligence, Morgan Kaufmann, Los Altos, CA, 2001.
[24] J. Kennedy and R. Eberhart, "Particle swarm optimization," Proc. of the IEEE International Conference on Neural Networks, pp. 1942-1948, 1995.
[25] R. Storn and K. Price, "Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces," Journal of Global Optimization, vol. 11, no. 4, pp. 341-359, 1997.
[26] G. Venter and J. Sobieszczanski-Sobieski, "Particle swarm optimization," AIAA Journal, vol. 41, no. 8, pp. 1583-1589, 2003.
[27] X. Yao, Y. Liu and G. Lin, "Evolutionary programming made faster," IEEE Trans. on Evolutionary Computation, vol. 3, no. 2, pp. 82-102, 1999.
[28] Y. Shi and R. C. Eberhart, "Parameter selection in particle swarm optimization," Evolutionary Programming VII, Lecture Notes in Computer Science 1447, Springer, pp. 591-600, 1998.
[29] S. Das, A. Konar and U. K. Chakraborty, "Particle swarm optimization with a differentially perturbed velocity," ACM-SIGEVO Proc. of GECCO 2005, Washington D.C., pp. 991-998, 2005.
[30] F. van den Bergh, "Particle swarm weight initialization in multi-layer perceptron artificial neural networks," Development and Practice of Artificial Intelligence Techniques, Durban, South Africa, pp. 41-45, 1999.
[31] B. Widrow and S. D. Stearns, Adaptive Signal Processing, Chapter 6, pp. 99-166, Second Edition, Pearson.
[32] S. Chen, S. A. Billings and P. M. Grant, "Nonlinear system identification using neural networks," Int. J. Control, vol. 51, no. 6, pp. 1191-1214, June 1990.
[33] J. C. Patra, R. N. Pal, B. N. Chatterji and G. Panda, "Identification of nonlinear dynamic systems using functional link artificial neural network," IEEE Trans. on Systems, Man, and Cybernetics - Part B: Cybernetics, vol. 29, no. 2, pp. 254-262, April 1999.
[34] S. V. T. Elanayar and Y. C. Shin, "Radial basis function neural network for approximation and estimation of nonlinear stochastic dynamic systems," IEEE Trans. on Neural Networks, vol. 5, pp. 594-603, July 1994.
[35] C. A. Belfiore and J. H. Park, Jr., "Decision feedback equalization," Proc. IEEE, vol. 67, pp. 1143-1156, Aug. 1979.
[36] S. Siu, "Non-linear adaptive equalization based on multi-layer perceptron architecture," Ph.D. dissertation, University of Edinburgh, 1990.
[37] O. Macchi, Adaptive Processing: The Least Mean Squares Approach with Applications in Transmission, John Wiley and Sons, West Sussex, England, 1995.
[38] R. W. Lucky, "Techniques for adaptive equalization of digital communication systems," Bell Sys. Tech. J., vol. 45, pp. 255-286, Feb. 1966.
[39] S. K. Nair and J. Moon, "A theoretical study of linear and nonlinear equalization in nonlinear magnetic storage channels," IEEE Trans. on Neural Networks, vol. 8, no. 5, pp. 1106-1118, Sept. 1997.
[40] J. C. Patra, A. C. Kot and G. Panda, "An intelligent pressure sensor using neural networks," IEEE Trans. on Instrumentation and Measurement, vol. 49, issue 4, pp. 829-834, Aug. 2000.
[41] B. Widrow and E. Walach, Adaptive Inverse Control, Prentice-Hall, Upper Saddle River, NJ, 1996.
[42] E. A. Robinson and T. Durrani, Geophysical Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1986.
[43] S. U. H. Qureshi, "Adaptive equalization," Proc. IEEE, vol. 73, no. 9, pp. 1349-1387, Sept. 1985.
[44] H. Sun, G. Mathew and B. Farhang-Boroujeny, "Detection techniques for high density magnetic recording," IEEE Trans. on Magnetics, vol. 41, no. 3, pp. 1193-1199, March 2005.
[45] J. C. Patra, W. B. Poh, N. S. Chaudhari and A. Das, "Nonlinear channel equalization with QAM signal using Chebyshev artificial neural network," Proc. of the International Joint Conference on Neural Networks, Montreal, Canada, pp. 3214-3219, August 2005.
[46] G. Panda, B. Majhi, D. Mohanty, A. Choubey and S. Mishra, "Development of novel digital channel equalizers using genetic algorithms," Proc. of the National Conference on Communication (NCC-2006), IIT Delhi, pp. 117-121, 27-29 January 2006.
7.3 FUTURE WORK:
A comparison of GA and DE based approaches to adaptive system identification and channel equalization using IIR model structures remains as future work.