Informatics in Control, Automation and Robotics
Volume 24
Joaquim Filipe · Jean-Louis Ferrier ·
Juan Andrade-Cetto (Eds.)
Informatics in Control,
Automation and Robotics
Selected Papers from the International
Conference on Informatics in Control,
Automation and Robotics 2007
Joaquim Filipe
INSTICC
Av. D. Manuel I, 27A 2º Esq.
2910-595 Setúbal
Portugal
jfilipe@insticc.org

Juan Andrade Cetto
Univ. Politècnica de Catalunya
Institut de Robòtica i Informàtica Industrial
Llorens i Artigas, 4-6, Edifici U
08028 Barcelona
Spain
cetto@cvc.uab.es

Jean-Louis Ferrier
Institut des Sciences et Techniques de l'Ingénieur d'Angers (ISTIA)
Labo. d'Ingénierie des Systèmes Automatisés (LISA)
62 avenue Notre Dame du Lac
49000 Angers
France
ferrier@istia.univ-angers.fr
© Springer-Verlag Berlin Heidelberg 2009
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting,
reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9,
1965, in its current version, and permission for use must always be obtained from Springer. Violations are
liable to prosecution under the German Copyright Law.
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.
springer.com
Preface
The present book includes a set of selected papers from the fourth International Conference on Informatics in Control, Automation and Robotics (ICINCO 2007), held at the University of Angers, France, from 9 to 12 May 2007. The conference was organized in three simultaneous tracks: "Intelligent Control Systems and Optimization", "Robotics and Automation" and "Systems Modeling, Signal Processing and Control". The book follows the same structure.

ICINCO 2007 received 435 paper submissions from more than 50 countries across all continents. After a blind review process, only 52 were accepted as full papers, of which 22 were selected for inclusion in this book on the basis of the classifications provided by the Program Committee. The selected papers reflect the interdisciplinary nature of the conference. The diversity of topics is an important feature of this conference, enabling an overall perception of several important scientific and technological trends. These high quality standards will be maintained and reinforced at ICINCO 2008, to be held in Funchal, Madeira, Portugal, and in future editions of this conference.
Conference Co-chairs
Jean-Louis Ferrier, University of Angers, France
Joaquim Filipe, Polytechnic Institute of Setúbal / INSTICC, Portugal
Program Co-chairs
Juan Andrade Cetto, Institut de Robòtica i Informàtica Industrial, CSIC-UPC, Spain
Janan Zaytoon, CReSTIC, URCA, France
Organising Committee
Paulo Brito, INSTICC, Portugal
Marina Carvalho, INSTICC, Portugal
Helder Coelhas, INSTICC, Portugal
Andreia Costa, INSTICC, Portugal
Bruno Encarnação, INSTICC, Portugal
Vítor Pedrosa, INSTICC, Portugal
Programme Committee
Eugenio Aguirre, Spain
Arturo Hernandez Aguirre, Mexico
Frank Allgower, Germany
Fouad AL-Sunni, Saudi Arabia
Bala Amavasai, UK
Francesco Amigoni, Italy
Yacine Amirat, France
Nicolas Andreff, France
Stefan Andrei, Singapore
Plamen Angelov, UK
Luis Antunes, Portugal
Peter Arato, Hungary
Helder Araújo, Portugal
Gustavo Arroyo-Figueroa, Mexico
Marco Antonio Arteaga, Mexico
Vijanth Sagayan Asirvadam, Malaysia
Nikos Aspragathos, Greece
Robert Babuska, The Netherlands
Ruth Bars, Hungary
Karsten Berns, Germany
Robert Bicker, UK
Stjepan Bogdan, Croatia
Patrick Boucher, France
Alan Bowling, USA
Edmund Burke, UK
Kevin Burn, UK
Clifford Burrows, UK
Luis M. Camarinha-Matos, Portugal
Marco Campi, Italy
Marc Carreras, Spain
Jorge Martins de Carvalho, Portugal
Alicia Casals, Spain
Alessandro Casavola, Italy
Christos Cassandras, USA
Riccardo Cassinis, Italy
Raja Chatila, France
Auxiliary Reviewers
Rudwan Abdullah, UK
Luca Baglivo, Italy
Prasanna Balaprakash, Belgium
João Balsa, Portugal
Alejandra Barrera, Mexico
Frederik Beutler, Germany
Alecio Binotto, Brazil
Nizar Bouguila, Canada
Dietrich Brunn, Germany
Maria Paola Cabasino, Italy
Joao Paulo Caldeira, Portugal
Aneesh Chauhan, Portugal
Paulo Gomes da Costa, Portugal
Xevi Cufi, Spain
Sérgio Reis Cunha, Portugal
Paul Dawson, USA
Mahmood Elfandi, Libya
Michele Folgheraiter, Italy
Diamantino Freitas, Portugal
Reinhard Gahleitner, Austria
Nils Hagge, Germany
Onur Hamsici, USA
Renato Ventura Bayan Henriques, Brazil
Matthias Hentschel, Germany
Marco Huber, Germany
Invited Speakers
Dimitar Filev, The Ford Motor Company, USA
Mark W. Spong, University of Illinois at Urbana-Champaign, USA
Patrick Millot, Université de Valenciennes, France
Invited Papers

Toward Human-Machine Cooperation
Patrick Millot
1 Introduction
In this field of research, the term "machine" refers not only to computers, but also to
diverse control devices in complex dynamic situations, such as industrial processes or
transportation networks. Human activities are mainly oriented toward decision-
making, including monitoring and fault detection, fault anticipation, diagnosis and
prognosis, and fault prevention and recovery. The objectives of this decision-making
are related to human-machine system performance (production quantity and quality)
as well as to overall system safety.
In this context, human operators play a double role: a negative role in that they may perform unsafe or erroneous actions affecting the process, and a positive role in that they are able to detect, prevent or recover from unsafe process states.
The influence of the human role and the degree of human involvement on overall
human-machine system performance (production, safety) has been studied since the
early 1980s. Sheridan [1] defined the well-known degrees of automation and their
consequences: at one extreme, in fully manual controlled systems, safety depends
entirely on the human controller’s reliability; at the other extreme, fully automated
systems eliminate the human operator from the supervision and control loop, which
can lead to a lack of vigilance and a loss of skill, preventing operators from assuming
responsibility for the system and, consequently, making system safety almost totally
dependent on technical reliability. Between the two extremes, there is an intermediate
solution consisting of establishing supervisory control procedures that will allow task-
sharing between the human operators and the automated control systems. In addition,
dedicated assistance tools (e.g., DSS, or Decision Support Systems) can be introduced
into the supervision and control loop in order to enhance the human ability to apply
the right decision and/or to manage the wrong decisions.
Technical failures and human errors generally increase with the size and/or the
complexity of the system (i.e., the number of interconnections between the controlled
variables and their degree of interconnection). For instance, large systems, such as
[Figure: abstraction hierarchy of a human-machine system, from the goal and function levels (ends, cognitive sciences) down to the component level and means (engineering sciences), the whole defining the system behaviour.]
[Figure: Rasmussen's decision ladder: DETECT abnormal conditions and EXECUTE at the skill-based level; SEARCH for information and apply an explicitly FORMULATED procedure at the rule-based level; IDENTIFY (ANTICIPATE) the system state, INTERPRET the consequences for system goals, set the target goal and DEFINE the task, with hypothesis elaboration and test, at the knowledge-based level.]
Reason [7] divides human errors into two categories: non-intentional and intentional. These categories are further sub-divided into slips and lapses for non-intentional actions, and mistakes and violations for intentional decisions and actions, giving four kinds of human error in total. Violations differ from mistakes in that the decision-maker is conscious of violating the procedure, with either negative intent (e.g., sabotage) or positive intent (e.g., preventing an accident). Amalberti [8] explains the production of certain violations through the need for human operators to reach a compromise between three joint, sometimes contradictory, objectives: performance standards, imposed either by the organization or by the individual operator; system and/or operator safety; and the cognitive and physiological costs of attaining the first two objectives (e.g., workload, stress). For Rasmussen [9], these three dimensions are bounded and together they limit the field of human action. An action that crosses this limit can lead to a loss of control and, subsequently, an incident or an accident.
Technical, organizational or procedural defenses can sometimes remedy faulty
actions or decisions. Thus, several risk analysis methods have been proposed for
detecting risky situations and providing such remedies [10], [11], [12], [13], [14].
Usually, risk management involves three complementary steps, which must be
foreseen when designing the system:
- Prevention: the first step is to prevent risky behaviors. Unexpected behaviors
should be foreseen when designing the system, and technical, human and
organizational defenses should be implemented to avoid these behaviors (e.g., norms,
procedures, maintenance policies, supervisory control).
- Correction: if prevention fails, the second step allows these unexpected behaviors
to be detected (e.g., alarm detection system in a power plant) and corrected (e.g., fast
train brakes).
- Recovery: if the corrective action fails, an accident may occur. The third step
attempts to deal with the consequences of a failed corrective action, by intervening to
minimize the negative consequences of this accident (e.g., emergency care on the
road).
These three steps provide prevention, correction or recovery tasks that can be
performed, some by the human operators and some by the machine. The question is
then: “How should the tasks be shared between the human and the machine?”
[Figure: supervisory control structure: a human supervisor interacts, in the control room or at the work station, with a supervision computer (control, information, assistance requests, assistance) that communicates with local control computers 1..N for remote interaction with the subsystems (continuous process, vehicle, robot, ...); the vertical axis runs from symbolic information, ends and long-term horizon at the top to numerical information, means and short-term/real-time horizon at the bottom.]
Fig. 3. Supervisory Control (adapted from Sheridan [15] and completed by Millot [16]).
specified in terms of their objectives, acceptable means (e.g., sensors, actuators) and
functions. (Section 3.2 examines this task-sharing process in more detail.)
- At this point, it is necessary to implement the automated processors for managing
future automated tasks and the human-machine interfaces that will facilitate future
human tasks.
- Finally, the entire system must be evaluated in terms of technical and ergonomic
criteria.
[Figure: task allocation according to technical feasibility criteria, distinguishing TA, TH, TAh and THa as defined in the legend below.]
Legend: TA: automatable tasks; TH: tasks that cannot be automated and must be performed by humans.
TAh: tasks that can be performed by both humans and machines. THa: tasks that cannot be performed by
machines or humans working alone.
Decision Support Systems (DSS) provide assistance that makes the Human Operator's tasks easier and helps prevent faulty actions. Both the DSS and the Human Operator are called agents. Agents (either human or machine) can be modelled according to three classes of capabilities: Know-How, Know-How-to-Cooperate, and Need-to-Cooperate.
1) Know-How (KH) is applied to solve problems and perform tasks autonomously; it combines problem-solving capabilities (e.g., sources of knowledge, processing abilities) with the ability to communicate with the environment and other agents through sensors and control devices.
2) Know-How-to-Cooperate (KHC) is a class of specific capabilities needed for Managing Interferences between goals (MI) and for Facilitating other agents' Goals (FG), with respect to the definition of cooperation given in the next section [20].
3) Need-to-Cooperate (NC) is a new class combining [21]:
- the Adequacy of the agent's personal KH (i.e., knowledge and processing abilities) with respect to the task constraints;
- the Ability to perform the task (the human agent's workload (WL) produced by the task, perceptual abilities, and control abilities);
- the Motivation-to-Cooperate of the agent (motivation to achieve the task, self-confidence, trust [22], confidence in the cooperation [23]).
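As a purely illustrative summary, the three capability classes above can be captured in a small data structure; the class and field names below (Agent, KnowHow and so on) are assumptions made for the example, not notation from the paper.

```python
from dataclasses import dataclass
from typing import List

# Illustrative data structure for an agent's KH, KHC and NC (all names are hypothetical).
@dataclass
class KnowHow:                       # KH: autonomous problem solving and task execution
    knowledge_sources: List[str]
    processing_abilities: List[str]

@dataclass
class KnowHowToCooperate:            # KHC: managing interference (MI) and facilitating goals (FG)
    manages_interference: bool
    facilitates_goals: bool

@dataclass
class NeedToCooperate:               # NC: adequacy of own KH, ability (workload), motivation
    kh_adequacy: float               # match between the agent's KH and the task constraints (0..1)
    workload: float                  # current workload; high values reduce the ability to act alone
    motivation: float                # motivation, self-confidence, trust in the other agent (0..1)

@dataclass
class Agent:
    name: str
    kh: KnowHow
    khc: KnowHowToCooperate
    nc: NeedToCooperate

# Example: a human controller with full KH but a high need to cooperate under load.
rc = Agent("radar controller",
           KnowHow(["traffic rules"], ["conflict detection", "conflict resolution"]),
           KnowHowToCooperate(manages_interference=True, facilitates_goals=True),
           NeedToCooperate(kh_adequacy=0.9, workload=0.8, motivation=0.7))
```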
In a multi-disciplinary approach, drawing on research in cognitive psychology and
human engineering, we try to exploit these basic concepts and highlight the links
between them, in order to propose a method for designing cooperative human-
machine systems.
In the field of cognitive psychology, Hoc [6] and Millot & Hoc [20] have proposed
the following definition: “two agents are cooperating if 1) each one strives towards
goals and can interfere with the other, and 2) each agent tries to detect and process
such interference to make the other’s activities easier”.
From this definition, two classes of cooperative activities can be derived and combined; together they constitute the know-how-to-cooperate (KHC) defined by Millot [24], [25]:
- The first activity, Managing Interference (MI), requires the ability to detect and
manage interferences between goals. Such interferences can be seen as positive (e.g.,
common goal or sub-goal) or negative (e.g., conflicts between goals or sub-goals or
about common shared resources).
- The second activity, Facilitating Goals (FG), requires the ability to make it easier for other agents to achieve their goals.
In an organization, agents play roles and thus perform tasks combining the different
activities needed to acquire and process information and make decisions. The
decisions may, or may not, result in actions. Defining the organization has often been
seen as a way to prevent or resolve decisional conflicts between agents, especially in
human engineering in which agents may be human or artificial DSS. This aspect is
also studied under the name of Distributed Artificial Intelligence. In terms of purely
structural organization, two generic structures exist: vertical (hierarchical) and
horizontal (heterarchical) [26], [27].
In the vertical structure (Fig. 5), agent AG1 is at the upper level of the hierarchy and is responsible for all the decisions. If necessary, it can call upon agent AG2, which can give advice.
[Fig. 5: vertical structure: the human operator (KNOW-HOW) controls the automated process according to the production objectives, sends assistance requests to the DSS (KNOW-HOW) and receives its advice; the exchange relies on know-how-to-cooperate.]
In the horizontal structure (Fig. 6), both agents are on the same hierarchical level
and can behave independently if their respective tasks are independent. Otherwise,
they must manage the interferences between their goals using their MI and FG
abilities.
[Fig. 6: horizontal structure: a task allocator (KNOW-HOW-TO-COOPERATE) shares control of the automated process, driven by the production objectives, between the human operator (KNOW-HOW) and the machine (KNOW-HOW).]
Let us consider an agent AGx that has know-how KHx and know-how-to-cooperate KHCx, and operates within a structure. The objective is to specify KHCx using MIx and FGx in the different cooperative situations that can be encountered (or built). This can be done by adapting the generic typology of cooperative forms proposed by Schmidt [29]: augmentative, debative, integrative.
4.4.1 Augmentative
4.4.2 Debative
Cooperation is debative when agents have similar know-how and are faced with a
single task T that is not divided into STi. Each agent solves the task and then debates
the results (or the partial results) with the other agents. Conflicts can arise and the
KHC must allow these conflicts to be solved through explanations based on previous
partial results along the problem-solving pathway and on a common frame of
reference, for instance [20].
Before task execution, each agent’s KH is related to its ability to acquire the task
context and build a plan (i.e., establish a goal, sub-goals & means). Each agent’s
KHC consists of acquiring the other agents’ KH either by inferring the other agents’
KH models or by asking the other agents for their KH. These inferences and/or
requests are part of MI capabilities. The other agents’ responses to these requests
constitute FG capabilities.
After task execution (complete or partial), each agent transmits its own results to
the others, receives results from the other agents and compares them to its own
results. In addition to MI (asking for results from the others) and FG (transmitting its own results) capabilities, this process requires that each agent have specific competencies for understanding the others' results, comparing them with its own, and deciding whether or not to agree with them. These competencies are all included in MI.
In case of conflict, each agent must be able to ask for explanations (e.g., the other
agent’s view of the task context, its partial results, its goal and/or sub-goals) in order
to compare these explanations with its own view-point and to decide whether or not
the conflict should continue. In addition, each agent must be able to acknowledge its own errors and learn the lessons needed to avoid such errors in the future. This last ability can have important consequences for the agent's KH.
4.4.3 Integrative
An example of the augmentative form of cooperation can be observed in banks: when the line in front of a window is too long, a second window is opened, thus cutting the line in half and reducing the first teller's workload. An example of the debative form is found in the mutual control established between the pilot flying and the co-pilot in the plane cockpit. An example of the integrative form can be seen in the coordination of the different tasks required to build a house. The innovation lies in implementing these forms in human-machine systems.
This section presents an analysis of the kind of structure that should be chosen to
support the different cooperative forms; the recommended forms are illustrated with
examples.
In this example, both agents have similar KH, and each performs a subtask STi
resulting from the division of task T into similar subtasks. In order to prevent
conflicts between the agents, the coordinator must decompose T into independent
subtasks.
In Air Traffic Control (ATC), the objectives consist of monitoring and controlling
the traffic in such a way that the aircraft cross the air space with a maximum level of
safety. The air space is divided into geographical sectors, each of them controlled by
two controllers. The first one is a tactical controller, called the “radar controller” (RC)
who supervises the traffic using a radar screen and dialogues with the aircraft pilots.
The supervision task entails detecting possible traffic conflicts between planes that
may violate separation norms and result in a collision, and then solving them. Conflict resolution usually involves asking one pilot to modify his/her flight level, heading, or speed.
The second controller is a strategic controller, called the "planning controller" (PC). The PC coordinates the traffic in his/her own sector with the traffic in other sectors in order to avoid irreconcilable conflicts at the sector's borders. The PC is also supposed to anticipate traffic density and regulate the workload of the RC. In addition, in traffic overload conditions, the PC assists the RC by taking charge of some tactical tasks. To support the RC, a dedicated DSS called SAINTEX has been
developed. In this system, each agent (i.e., the RC and SAINTEX) was allowed to
perform actions affecting the traffic, and the tasks were dynamically distributed
between these two agents based on performance and workload criteria.
To accomplish this dynamic task allocation, a task allocator control system was
introduced at the strategic level of the organization [30], which can be:
- a dedicated artificial decisional system with the ability to assess human workload
and performance, in which case the dynamic task allocation is called implicit, or
- the human operator, who plays a second role dealing with strategic and
organizational tasks, in which case the dynamic task allocation is called explicit.
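As a rough illustration of the implicit mode, the sketch below allocates each detected conflict either to the radar controller or to the assistance tool. The allocation rule (the tool only takes simple two-aircraft conflicts, and only when the estimated controller workload exceeds a threshold) and all names are assumptions made for the example, not the actual SAINTEX allocation logic.

```python
from dataclasses import dataclass
from typing import Dict, List

# Illustrative sketch of implicit dynamic task allocation (rule and names are assumptions).
@dataclass
class TrafficConflict:
    conflict_id: int
    aircraft_involved: int            # the tool's KH is limited to simple two-aircraft conflicts

def allocate_implicitly(conflicts: List[TrafficConflict],
                        controller_workload: float,
                        workload_threshold: float = 0.7) -> Dict[int, str]:
    """Assign each conflict to 'RC' or 'SAINTEX' from workload and feasibility criteria."""
    allocation = {}
    for c in conflicts:
        tool_can_handle = c.aircraft_involved == 2
        controller_overloaded = controller_workload > workload_threshold
        allocation[c.conflict_id] = "SAINTEX" if (tool_can_handle and controller_overloaded) else "RC"
    return allocation

# Example: three pending conflicts while the controller is overloaded.
pending = [TrafficConflict(1, 2), TrafficConflict(2, 3), TrafficConflict(3, 2)]
print(allocate_implicitly(pending, controller_workload=0.85))
# {1: 'SAINTEX', 2: 'RC', 3: 'SAINTEX'}
```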
These two task allocation modes were implemented on a realistic Air Traffic
Control (ATC) simulator and evaluated by professional Air-Traffic Controllers.
A series of experiments implemented both implicit and explicit dynamic task
allocation between the radar controller and SAINTEX. The task allocation depended
on the know-how (KH) of the two decision-makers. The SAINTEX KH was limited
to simple aircraft conflicts (i.e., between only two planes). The RC’s know-how was
only limited by the workload. The experiments showed better performance, in terms of overall traffic safety and fuel consumption, and better human regulation of the workload, in the implicit allocation mode than in the explicit one. However, the
responses to the questionnaires showed that the professional Air Traffic Controllers
would not easily accept implicit allocation in real situations because (a) the
different tasks were not completely independent, and (b) they had no control over the
tasks assigned to SAINTEX, but retained total responsibility for all tasks.
Thus, it seems that if AGx and AGy are both provided with all the KH and KHC
capabilities of a coordinator, a purely horizontal structure like the ones used in
Distributed Artificial Intelligence must be envisaged. However, if only one agent, for
instance AGx, is assigned the capabilities needed to be a coordinator, the result is a de
facto hierarchy in which AGx manages the cooperation. AGy will then have FG
capabilities and become an assistant in the cooperation. This situation is quite realistic
in Human-Machine Cooperation, and the dynamic task allocation aiming for this form
of cooperation can be analyzed from this perspective. In the experiment involving only the RC and SAINTEX, there was an asymmetry between the KHC of the two agents, creating a de facto hierarchy in which the RC held the higher position. In the explicit
mode, this hierarchy was respected, but in implicit mode, it was reversed, which
could explain the RC’s refusal of this type of organization. In addition, the sub-tasks
were not really independent since solving some traffic conflicts increased the risk of
creating new ones. Thus, the cooperative form was not purely augmentative; a purely
augmentative form would have required SAINTEX to have other KHC related to the
other cooperative forms.
In this example, both agents have similar KH and are faced with a single task T that is not (or cannot be) divided into sub-tasks. After each agent has performed the task, the agents compare their results (or partial results) and, when there is a conflict, they debate.
If both agents are given all the KH and KHC abilities, a purely horizontal
structure can be imagined. The ability to recognize and acknowledge errors may then
depend on trust and self-confidence [22]. On the other hand, giving only one agent
full KHC results in a de facto hierarchy; if such a hierarchical structure is chosen, the
conflict resolution process can be aided (or perturbed) by the hierarchy. This
situation is realistic in human-machine cooperation, because the machine capabilities
can be reduced to FG capabilities. In this case, the designer of the machine must have
simulated the human user’s conflict resolution pathway so as to allow the machine to
help the human to cooperate with it.
[Figure: debative human-DSS cooperation in process supervision: both agents run through alarm detection of an abnormal state, observation of information, identification of the system state (diagnosis), predictions, evaluation of strategy alternatives, definition of the task and of a procedure; justifications, sets of relevant variables and consensus points are exchanged through the human-DSS interface, and the selected procedure is executed once no conflict remains.]
In this example, both agents have different and complementary KH, and each performs a subtask STi resulting from the division of T into complementary subtasks. The task can be decomposed and managed by a coordinator, which can be a third agent or one of the two original agents, provided it has all the KHC capabilities.
As for the other cooperative forms, a horizontal structure, in which each agent has
all KHC capabilities, can be imagined. This is generally the case in Human-Human
Cooperation, for instance between the pilot and the co-pilot in the plane cockpit.
When the KHC capabilities of one agent are only partial, as is usually the case in
Human-Machine Cooperation, the structure is a de facto hierarchy, either for reasons
of competency or legal responsibilities, or both as is the case in ATC. Thus, the
designer must respect this hierarchical organization when creating the structure.
Let us consider the form of cooperation found in the diagnosis task, in which two
main tasks are essential for quickly focusing on the failures affecting the system:
- The first task is to interpret the data collected on the system and to generate a set
of failure hypotheses. The hypotheses are then crossed to determine a minimal failure
set that explains the effects observed.
- The second task is to check the consistency of the hypotheses at each step in the
reasoning, according to the system model.
[Fig. 8: integrative human-machine cooperation in supervisory control: within the detection, diagnosis and correction loop on the process, hypothesis generation is allocated to the human (flexibility) and consistency maintenance to the machine (large storage capacity).]
The first task requires a flexible global view of the system in order to quickly
generate consistent failure hypotheses. The necessary Know-How resembles human
abilities and thus is allocated to the human operator. The second task requires
calculating power in order to check the consistency of the hypotheses and to consider
multiple alternatives rapidly. The KH needed is best suited to machine abilities and
thus is allocated to the machine (Fig. 8). After the tasks have been allocated, the main
problem remaining is to define the means for coordinating both decision-makers’
activities during the diagnosis process because, in fact, the partial results must be
aggregated. As the tasks are shared, the decision-makers must exchange data and
interpret them. Furthermore, both decision-makers must share knowledge about the
process (e.g., external data); a shared workspace is also needed for coordinating the
reasoning processes of the human operator and the machine [31]. The shared
knowledge can be represented as a causal network of links between symptoms and the
causes of failure.
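A minimal sketch of such a shared symptom-failure network is given below: the human proposes candidate failure sets, and the machine keeps only those that are consistent with (i.e., explain) every observed symptom, preferring minimal sets. The network, the symptom names and the function are illustrative assumptions, not the system evaluated in [31], [32].

```python
# Illustrative shared causal network linking failures to the symptoms they can produce
# (all names are made up for the example).
CAUSAL_LINKS = {
    "line_fault":     {"no_dial_tone", "noise_on_line"},
    "handset_fault":  {"no_dial_tone", "no_ring"},
    "customer_error": {"no_ring"},
}

def consistent_hypotheses(observed_symptoms, hypotheses):
    """Machine-side task: keep the failure sets that explain every observed symptom."""
    consistent = []
    for hypothesis in hypotheses:                          # each hypothesis is a set of failures
        explained = set().union(*(CAUSAL_LINKS[f] for f in hypothesis))
        if observed_symptoms <= explained:                 # every observed symptom must be covered
            consistent.append(hypothesis)
    consistent.sort(key=len)                               # prefer minimal failure sets
    return consistent

# Human-side task: quickly propose candidate failure sets from a global view of the situation.
proposed = [{"customer_error"}, {"handset_fault"}, {"line_fault", "customer_error"}]
print(consistent_hypotheses({"no_dial_tone", "no_ring"}, proposed))
# [{'handset_fault'}, {'line_fault', 'customer_error'}]
```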
An example of such a diagnosis task was studied in a domestic phone network. Customers having difficulties with their phone call a "hotline" service, and an operator must make a diagnosis. The problem can come from the hardware, from a customer mistake, or from a combination of the hardware and the line itself. A DSS
was built to assist the operators of such hotlines and was evaluated in well-defined
experimental conditions: in the experimental protocol, the network could have 49
possible phone system failures; these failures were linked to 150 symptoms. The
result was a causal network with 500 possible links. In less than 3 minutes, hotline
operators must find one diagnosis among the possible 49, using knowledge of the
actual symptoms among 150 possible ones. Operators gather information about the
symptoms through direct dialogue with the customer and through test devices. The
experiments showed that with integrative cooperation with the DSS, the average rate of correct diagnoses increased from 64% to 82% [32].
Generally, pure cooperative forms do not exist in the real world; most often, a
combination of the three forms is encountered. This is the case in Air Traffic Control
(ATC). The AMANDA (Automation and MAN-machine Delegation of Action)
project has studied a new version of cooperation between Human Controllers and a
new tool called STAR in the ATC context. The objective of the project was to build a
common frame of reference, called Common Work Space (CWS), using the support
system STAR [31]. STAR is able to take controller strategies into account in order to
calculate precise solutions and then transmits the corresponding command to the
plane pilot. The common frame of reference of air traffic controllers was first
identified experimentally by coding the cognitive activities of air traffic controllers
[33]. The common workspace (CWS) resulting from this common frame of reference
was implemented on the graphic interface of the AMANDA platform [21]. The CWS plays a role similar to a blackboard, displaying the problems to be solved cooperatively. As each agent contributes pieces of the solution, the CWS displays the evolution of the solution in real time.
[Fig. 9: cooperation between the human controller and STAR through the CWS: both agents perceive and process information, the controller feeds schematic decisions (strategies) into the CWS, and precise decision making produces precise solutions; the exchanges are marked a, b, c, corresponding to the three cooperative forms listed below.]
The cooperation between STAR and the human controller can take the three forms (Fig. 9): a) debative, b) integrative, c) augmentative. The experimental evaluation shows that
this cooperative organization allows the controllers to better anticipate air traffic
conflicts, thus increasing the safety level. In addition, the common workspace seems
to provide a good representation of air traffic conflicts and thus is a good tool for
conflict resolution. Furthermore, this organization provides for better task sharing
between the two types of controllers (RC and PC), which results in a better regulated
workload [21].
6 Conclusions
This paper reviews the objectives and the methods used in human engineering to
enhance the safety of automated systems, focusing on the parameters related to
human-machine interaction (degree of automation, system complexity, the richness and complexity of the human component) among the different classes of parameters
that influence safety. One solution approach is to implement cooperation between
human and DSS. This paper proposes a framework for integrating human-machine
cooperation. Clearly, in order to implement human-machine cooperation, it is
necessary to cope not only with the KH of the different agents (human or machine),
but also with their respective KHC. Three cooperation forms have been introduced for describing the activities composing the KHC of each agent. These activities can be gathered into two groups: MI, corresponding to a coordination activity, and FG, corresponding to a benevolent behavior for facilitating the other agent's goals. In
addition, the appropriate cooperative structure must be chosen. Several examples
were presented, regarding each form of cooperation and related to different
application fields: Air Traffic Control, production process supervision and
Telecommunication networks. In the case of human-machine cooperation, the ability of a machine to achieve coordination tasks is discussed in each of these examples.
References
1. Sheridan, T.B.: "Supervisory Control of Remote Manipulators, Vehicles and Dynamic Processes: Experiments in Command and Display Aiding", Advances in Man-Machine Systems Research, vol. 1 (1984)
2. Lemoigne, J.L.: "La théorie du système général, théorie de la modélisation", PUF, Paris (1984, re-edited 1994)
3. Lind, M.: "Representing Goals and Functions of Complex Systems: an Introduction to Multilevel Flow Modelling", Technical Report 90-D-381, TU Denmark (1990)
4. Lind, M.: "Making sense of the abstraction hierarchy in the power plant domain", Cognition Technology and Work, vol. 5, no. 2 (2003) 67-81
5. Rasmussen, J.: "Skills, Rules and Knowledge: signals, signs and symbols and other distinctions in human performance models", IEEE SMC, no. 3 (1983)
6. Hoc J.M.: “Supervision et contrôle de processus, la cognition en situation dynamique”.
Presses Universitaires de Grenoble (1996)
7. Reason J.: “Human error” Cambridge University Press (1990) (Version française traduite
par J.M. Hoc, L’erreur humaine PUF (1993))
8. Amalberti R. :, “La conduite des systèmes à risques” PUF (1996)
9. Rasmussen J.: “Risk management in a dynamic society: a modelling problem”, Safety
Sciences, 27, 2/3, (1997) 183-213
10. Fadier E., Actigny B. et al.: “Etat de l’art dans le domaine de la fiabilité humaine”, ouvrage
collectif sous la direction de E. Fadier, Octarès, Paris (1994)
11. Hollnagel E.: “Cognitive Reliability and Errors Analysis Method”, CREAM, Elsevier,
Amsterdam (1999)
12. Vanderhaegen F.: “APRECIH: a human unreliability analysis method-application to
railway system”,Control Engineering Practice, 7 (1999) 1395-1403
13. Polet P., Vanderhaegen F., Wieringa P.A.: “Theory of Safety-related violations of a System
Barriers”, Cognition Technology and Work, 4 (2002) 171-179
14. Van der Vlugt M., Wieringa P.A: “Searching for ways to recover from fixation: proposal
for a different view-point”, Cognitive Science Approach for Process Control CSAPC’03,
Amsterdam, September (2003)
15. Sheridan T.: «Forty-Five Years of Man-Machine Systems: History and Trends», 2nd IFAC
Conference Analysis, Design and Evaluation of Man-Machine Systems, Varese, september (1985)
16. Millot P.: «Systèmes Homme-Machine et Automatique», Journées Doctorales
d’Automatique JDA’99, Conférence Plénière, Nancy, septembre (1999)
17. Fadier E.: «Fiabilité humaine : Méthodes d’analyse et domaines d’application», In J. Leplat
et G. de Terssac éditeurs; Les Facteurs humains de la fiabilité dans les systèmes complexes,
Edition Octarés, Marseille (1990)
18. Villemeur A.: «Sûreté de fonctionnement des systèmes industriels : fiabilité, facteur
humain, informatisation», Eyrolles, Paris (1988)
19. Reason J.: «Intentions, errors and machines: a cognitive science perspective», Aspects of
consciousness and awareness, Bielefeld, W. Germany, december (1986)
20. Millot P., Hoc J.M.: “Human-Machine Cooperation: Metaphor or possible reality?”
European Conference on Cognitive Sciences, ECCS’97, Manchester UK, April (1997)
21. Millot P., Debernard S.: “An Attempt for conceptual framework for Human-Machine
Cooperation”, IFAC/IFIP/IFORS/IEA Conference Analysis Design and Evaluation of
Human-machine Systems Seoul Korea, September (2007)
22. Moray N., Lee, Muir,: “Trust and Human Intervention in automated Systems”, in Hoc,
Cacciabue, Hollnagel editors : Expertise and Technology cognition and Human Computer
Interaction. Lawrence Erlbaum Publ. (1995)
23. Rajaonah B., Tricot N., Anceaux F., Millot P.: Role of intervening variables in driver-ACC
cooperation, International Journal of Human Computer Studies (2006)
24. Millot P.: “Concepts and limits for Human-Machine Cooperation”, IEEE SMC CESA’98
Conference, Hammamet, Tunisia, April (1998)
25. Millot P.,Lemoine M.P.: “An attempt for generic concepts Toward Human-Machine
Cooperation”, IEEE SMC, San Diego, USA, October (1998)
26. Millot P., Taborin V., Kamoun A.: «Two approaches for man-computer Cooperation in
supervisory Tasks», 4th IFAC Congress on “Analysis Design and Evaluation of man-
machine Systems”, XiAn China, September (1989)
27. Grislin-Le Strugeon E., Millot P.: «Specifying artificial cooperative agents through a
synthesis of several models of cooperation», 7th European Conference on Cognitive Science
Approach to Process Control CSAPC’99, p. 73-78, Villeneuve d’Ascq, september (1999)
28. Rasmussen J.: “Modelling distributed decision making”, in Rasmussen J., Brehmer B., and
Leplat J. (Eds), Distributed decision-making: cognitive models for cooperative work
pp111-142, John Willey and Sons, Chichester UK(1991)
29. Schmidt K.:« Cooperative Work: a conceptual framework », In J. Rasmussen, B. Brehmer,
and J. Leplat (Eds), Distributed decision making: Cognitive models for cooperative work
(1991) 75-110
30. Vanderhaegen F., Crévits I., Debernard S., Millot P.: « Human-Machine cooperation:
Toward an Activity Regulation Assistance for Different Air Traffic Control Levels »,
International Journal of Human Computer Interactive, 6(1) (1994) 65-104
31. Pacaux-Lemoine M.P., Debernard S.: “Common work space for Human-Machine
Cooperation in Air Traffic Control”, Control Engineering and Practice, 10 (2002) 571-576
32. Jouglet D., Millot P.: “Performance improvement of Technical diagnosis provided by
human-machine cooperation”, IFAC Human-Machine Systems: Analysis Design and
Evaluation of Human-Machine Systems, Kassel, Germany, September (2001)
33. Guiost B, Debernard S., Millot P.:”Definition of a Common Work Space”. In 10th
International Conference of Human-Computer Interaction, Crete, Greece, January (2003) 442-446
PART I

Planning of Maintenance Operations for a Motorway Operator
C. Sanchez et al.

1.1 Context
The Escota Company, founded in 1956, is the leading operator of toll motorways in France. Owing to its integration into the Provence-Alpes-Côte d'Azur region, Escota is committed, as is every motorway operator, to a sustainable development approach covering the social, economic and environmental aspects of its activities. Every year, specific initiatives are undertaken, or repeated, to include the motorway network in this sustainable development approach. Within this scope, the Escota Company aims to formalize and improve the decision process for preventive maintenance and property management, with the desire for transparency in decisions relative to property management, personal accountability, and justification of the decision-making logic in a multi-actor, multi-criteria (MC) environment [6], [7]. These decisions concern upkeep, improvement and upgrading operations.
Periodic inspections are performed to detect and measure, as early as possible, any malfunction symptoms affecting an element of the infrastructure (EI). The expert in charge of an operating domain then analyses the technical diagnosis relative to the EI and evaluates the seriousness of the situation in terms of technical risk. This evaluation relies on a specific set of n criteria relative to his domain. An aggregation with a weighted arithmetic mean (WAM) is then performed to assign a global degree of emergency to the corresponding maintenance operation. This evaluation is then submitted to the official in charge of the operating network, who coordinates the experts' needs and demands for operation planning purposes.
This paper deals more particularly with the MC evaluation process performed by the expert of an operating domain, i.e., the assignment of an emergency degree to an operation. Several methods exist to identify the parameters of a WAM and perform the aggregation. The Analytic Hierarchy Process (AHP) is probably the most famous one in industry [1]. However, because it explicitly guarantees consistency between the commensurable scales it aggregates and the WAM operator it identifies, the Measuring Attractiveness by a Categorical Based Evaluation Technique (MACBETH) method has achieved recent successes [2], [3]. In our application, MACBETH is first used to build the valuation scale associated with each emergency criterion of a domain. It is then applied to determine the WAM parameters.
Furthermore, the way experts give their assessment in natural language raises another problem [4]. These labels are commonly converted into numerical values to perform the aggregation process. No particular attention is generally paid to this "translation", yet its consequences for the aggregation results can be damaging. In civil engineering, the culture of numbers is strongly developed. People commonly manipulate symbolic labels but may convert them into more or less arbitrary numerical values when necessary, without further care. This cultural viewpoint explains why an aggregation operator is generally preferred to a rule base even though appraisals are expressed in terms of symbolic labels [4]. A completely symbolic evaluation over finite scales could be envisaged [5].
Let us illustrate the scales problem with the following example. Let us suppose
that the semantic universe of an expert w.r.t. the seriousness of a symptom is:
{insignificant, serious, alarming}. We can imagine that a corresponding possible set
of discrete numerical values (in [0; 1]) could be: {0; 0.5; 1}. There are several
assumptions behind this translation concerning the nature of the scale. This point will
be discussed later. Let us just note here that the numerical values are commonly
chosen equidistant. Now let us consider another semantic universe: {insignificant,
minor, alarming}. This time, the associated set of numerical values {0; 0.5; 1}
intuitively appears more questionable. The expert should prefer {0; 0.25; 1}. When
seriousness degrees of several symptoms are to be aggregated, the result of the WAM
The purpose of this section is to explain how we have worked with Escota experts of the different operating domains in order to properly identify their emergency scales. There is one emergency scale for each criterion of the domain and one scale for the aggregated emergency value. In the following we consider the case of the operating domain "carriageway". Eight criteria (n = 8) are related to it: security, durability, regulation, comfort, public image, environment protection, sanitary and social aspects.
It has been checked a priori that Escota emergency scales are of cardinal nature:
the emergency scale relative to any of the criteria is an interval scale.
Let us consider a finite set X. When the elements of X can be ranked w.r.t. their attractiveness, this is ordinal information. It means that a number n(x) can be associated with any element x of X such that:

∀x, y ∈ X: [x P y ⇔ n(x) > n(y)]    (1)

where the relation P, "is more attractive than", is asymmetric and non-transitive, and the relation I, "is as attractive as", is an equivalence relation. n(x) defines an ordinal scale.
Based upon this first level of information, an interval scale can then be built. The
next step consists in evaluating the difference of intensity of preference between
elements of X. It implies the following constraints:
n(x) − n(y) = kα,  k ∈ ℕ    (3)

where k characterizes the intensity of preference and α makes it possible to respect the limits of the domain (for example [0, 1]). The resolution of a system of equations of types (1), (2) and (3) provides an interval scale. This is the principle used in the MACBETH method [2].
Fig. 1. MACBETH – Pairwise comparison of operations and cardinal scale for the security criterion.
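As a rough illustration of this principle, the sketch below builds a cardinal scale on [0, 1] from integer intensities of preference between consecutive elements of an already-ranked set, under the simplifying assumption that the judgments are mutually consistent. It is not the actual MACBETH algorithm (which handles a full pairwise comparison matrix and checks consistency); the names and numbers are illustrative.

```python
# Illustrative sketch (not the MACBETH algorithm): build an interval scale on [0, 1]
# from integer intensities of preference k between consecutive, already-ranked elements.
def build_cardinal_scale(elements, intensities):
    """elements: ranked from least to most attractive.
    intensities: one k value (1, 2, 3, ...) per consecutive pair, len(elements) - 1 of them."""
    assert len(intensities) == len(elements) - 1
    raw = [0.0]
    for k in intensities:                      # n(x) - n(y) = k * alpha between neighbours
        raw.append(raw[-1] + k)
    alpha = 1.0 / raw[-1]                      # choose alpha so the scale spans [0, 1]
    return {e: round(v * alpha, 3) for e, v in zip(elements, raw)}

# Example: a larger preference gap between "minor" and "alarming" than between the first two.
print(build_cardinal_scale(["insignificant", "minor", "alarming"], [1, 3]))
# {'insignificant': 0.0, 'minor': 0.25, 'alarming': 1.0}
```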
This procedure is then applied to identify the weights of the WAM operator. The pairwise comparison is carried out over the eight criteria of the carriageway domain (Fig. 2). The resulting interval scale of weights is also given in Fig. 2. Let us note the
weights p_i, i = 1..n (n = 8 for the carriageway domain). At this stage of the modelling, the carriageway expert has identified his 8 emergency scales and his WAM parameters. He is then able to compute the global degree of emergency of any operation when partial quotations u_i are available w.r.t. each criterion:

WAM(OP) = Σ_{i=1}^{n} p_i · u_i
Fig. 2. MACBETH – Pairwise comparison of carriageway criteria and weights identification.
The aggregation of partial scores does not cause any problem when the quotations refer to continuous cardinal scales. As explained in section 1, it is more questionable when partial scores are expressed on a discrete or finite scale. Indeed, Escota experts express their assessment w.r.t. each criterion on a finite set of 3 labels {U_1, U_2, U_3}. The different U_i define a discrete cardinal scale. However, computing the WAM value necessitates assigning numerical values to each U_i. In the following, we describe the way this assignment can be achieved in a manner consistent with the previous MACBETH identification phases.
A continuous cardinal scale has been identified with the MACBETH method for the emergency scale of each criterion. The problem is now to assign a set of numerical values {u_1^i, u_2^i, u_3^i} to {U_1, U_2, U_3} for criterion i. Let us suppose the continuous cardinal scale for criterion i has been identified with a training set of q operations.
These operations are grouped into 3 clusters corresponding to U_1, U_2, U_3. The computation of the clusters and their associated centres is achieved by minimizing the quadratic difference

Σ_{k=1}^{3} Σ_{j=1}^{q_k} (u_k^i − u^i(OP_j))²

where q_k is the number of operations in class U_k (Σ_{k=1}^{3} q_k = q) and u^i(OP_j), j = 1..q, is the emergency degree of operation OP_j.

In the example of Fig. 1, the computation of the clusters gives: u_1^security = 0.91, u_2^security = 0.52 and u_3^security = 0.11.
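A minimal sketch of this clustering step is shown below: a one-dimensional k-means with k = 3 groups the training scores obtained on the continuous MACBETH scale and returns the three cluster centres used as u_1^i, u_2^i, u_3^i. It assumes a plain k-means is an acceptable way to perform the minimization; the training scores are made up.

```python
# Illustrative 1-D k-means (k = 3) for the cluster-centre computation described above.
def cluster_centres(scores, iterations=50):
    centres = [max(scores), sum(scores) / len(scores), min(scores)]   # initial U1, U2, U3 centres
    for _ in range(iterations):
        groups = [[], [], []]
        for s in scores:
            nearest = min(range(3), key=lambda c: (s - centres[c]) ** 2)
            groups[nearest].append(s)
        centres = [sum(g) / len(g) if g else centres[k] for k, g in enumerate(groups)]
    return centres                             # [u1, u2, u3] for the criterion considered

# Example with fictitious MACBETH scores of q = 9 training operations (security criterion).
training_scores = [0.95, 0.90, 0.88, 0.55, 0.50, 0.48, 0.15, 0.10, 0.08]
print([round(c, 2) for c in cluster_centres(training_scores)])   # [0.91, 0.51, 0.11]
```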
This assignment is repeated for each criterion relative to the carriageway domain. Then, the WAM can be numerically computed:
• For each criterion i, i = 1..n (n = 8), a value U_k is assigned to an operation OP. Let us note this emergency degree U_k(i);
• OP is thus described by its vector of emergency degrees [U_k(1), .., U_k(n)];
• the corresponding vector of numerical values is {u_{k(1)}^1, u_{k(2)}^2, .., u_{k(n)}^n};

WAM(OP) = Σ_{i=1}^{n} p_i · u_{k(i)}^i    (4)
The last constraint to be satisfied is that the WAM values must be converted back into the semantic universe {U_1, U_2, U_3}: the output of the WAM operator must be discretized in {U_1, U_2, U_3}. The problem is thus to determine the centres of the U_k clusters of the aggregated emergency scale (WAM values).
Let us note that the WAM operator is idempotent. Therefore, we must have:

∀U_k, k ∈ {1, 2, 3}: WAM(U_k, ..., U_k) = U_k    (5)

A sufficient condition for (5) is that the centres of the U_k clusters of the aggregated emergency scale are the images, by the WAM function, of the corresponding U_k centres of the partial emergency scales, i.e.:

WAM(u_k^1, .., u_k^n) = Σ_{i=1}^{n} p_i · u_k^i = u_k^Ag    (6)

where u_k^Ag is the centre of class U_k in the aggregated emergency scale.
Consequently, when an operation is defined by its partial emergency vector [U_k(1), .., U_k(n)], equation (4) provides the numerical value

WAM(OP) = Σ_{i=1}^{n} p_i · u_{k(i)}^i    (7)
The value of k in {1, 2, 3} that minimizes the distance to the aggregated centres, |WAM(OP) − u_k^Ag| (8), provides the class U_k of operation OP.
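Putting the pieces together, the sketch below evaluates one operation end to end: the labels given by the expert are mapped to the numerical centres of each criterion scale, the WAM is computed with the identified weights as in (4), and the result is discretized to the class with the nearest aggregated centre as in (8). The weights, centres and label vector are fictitious illustrations, not Escota's actual values.

```python
# Illustrative end-to-end evaluation of one operation (all numbers are fictitious).
criteria = ["security", "durability", "regulation", "comfort",
            "public image", "environment", "sanitary", "social"]
weights = {c: 1 / 8 for c in criteria}                                   # p_i, uniform for simplicity
centres = {c: {"U1": 0.91, "U2": 0.52, "U3": 0.11} for c in criteria}    # u_k^i per criterion
agg_centres = {k: sum(weights[c] * centres[c][k] for c in criteria)      # u_k^Ag, as in (6)
               for k in ("U1", "U2", "U3")}

def evaluate(operation_labels):
    """operation_labels: dict criterion -> label in {U1, U2, U3}; returns (WAM value, class)."""
    wam = sum(weights[c] * centres[c][operation_labels[c]] for c in criteria)   # (4)/(7)
    label = min(agg_centres, key=lambda k: abs(wam - agg_centres[k]))           # nearest centre, (8)
    return wam, label

operation = {c: "U2" for c in criteria}
operation["security"] = "U1"            # one alarming partial quotation
print(evaluate(operation))              # (0.56875, 'U2')
```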
Fig. 3 summarizes the whole evaluation process of an operation OP. The validation of this process has been carried out with a test base of 23 operations in the carriageway domain. The carriageway expert has analysed each of these operations and, for each of them, has attributed emergency degrees in the Escota normalized semantic universe {U_1, U_2, U_3} w.r.t. each of his 8 criteria.
Then, the aggregated emergency degree in this semantic universe can be computed
using the 3-step process described in this paper (white arrows in Fig. 3). Besides these
computations, the expert has been asked to directly attribute an overall emergency
degree to each of the 23 operations (grey arrow in Fig. 3).
Fig. 4 reports these data. The last line corresponds to the direct expert evaluation
(grey arrow). The last but one line provides the corresponding computed values with
the 3-step method (white arrows). No error has been observed. However, the poor
In this paper, the study has focused on the MC evaluation performed by the expert of an operating domain. However, as mentioned in section 1, the planning of operations by Escota is more complex. The emergency assessment by operating domain experts
described here is only part of a hierarchical MC evaluation process. From symptoms
detection on elements of infrastructure to operation planning, a similar MC evaluation
is carried out at different functional levels in the Escota organization.
The complete information processing used for Escota preventive maintenance can
be formalized as the following sequence of risk analysis. Periodic inspections are
performed to detect and measure any malfunction symptoms as early as possible. The
expert in charge of a domain then analyses these technical diagnoses and evaluates the
situation seriousness. The official in charge of the operating network coordinates and
ponders the experts' needs and demands. Each actor in this information processing system participates in a tripartite MC decision-making logic: measurement, evaluation
and decision. To each step of this process corresponds a specific set of criteria and an
aggregation operator: the seriousness of a malfunction results from a prescribed aggregation of the symptom quotations; the expert's interpretation of the diagnosis
associates an emergency degree to the corresponding maintenance operation w.r.t. the
criteria relating to his operating domain (technical risks assessment); finally, the
manager attributes a priority degree to the operation on the basis of a set of more
strategic criteria (strategic risks analysis).
This hierarchical MC evaluation process enables the decision-making to be broken down into elementary steps. Each step contributes to the enrichment of information, from measurements to priority degrees, and thus to the final step, i.e., operation planning.
We have developed a dynamic Information Processing System (IPS) to support
this hierarchical MC evaluation of the infrastructure condition and facilitate the way
decision-makers use their reasoning capabilities through an adequate information processing procedure. Fig. 5 illustrates the man-machine interface the expert has at his disposal to fill in an emergency form relative to an operation.
Let us now consider a last step in the evaluation process: assessment of the risk of
erroneous estimation w.r.t. the emergency of an operation, i.e., the risk of
underestimation or overestimation of the aggregated emergency score of an operation.
It relies on a robustness analysis of the evaluation procedure based upon the WAM.
Two aims are assigned to this step; it must answer the following questions: 1) when an erroneous partial estimation is made w.r.t. criterion i, what is the risk that the aggregated emergency degree is affected? 2) when an operation appears to be underestimated (resp. overestimated), which criteria could most likely explain this faulty result? The first question corresponds to an a priori estimation of the risk of erroneous evaluation; the second question is related to a diagnosis analysis.
Let us first define the notion of the neighbourhood of a vector of emergency degrees [U_k(1), .., U_k(n)] associated with an operation OP. The vectors of the neighbourhood of [U_k(1), .., U_k(n)] are all the vectors [U_k'(1), .., U_k'(n)] such that ∀i ∈ {1..n}, U_k'(i) = U_k(i) or U_k'(i) is the value just above (resp. below) U_k(i), when defined (indeed, there is no value below zero and no value above U_1). The neighbourhood is the set of vectors denoted N([U_k(1), .., U_k(n)]). In the two-dimensional example of Fig. 7, U_k(1) = U_2 and U_k(2) = U_2. The value of component i (i = 1 or 2) of a neighbour vector may be U_2, U_1 or U_3, so there are 8 neighbours. In the general case, the maximal number of neighbours is 3^n − 1.
[Fig. 7: neighbourhood of the vector (U_2, U_2) in dimension 2: the grid of label pairs over criteria 1 and 2 (values 0, U_3, U_2, U_1), with the subsets Over_1(U_2, U_2) and Under_1(U_2, U_2) of overestimating and underestimating neighbours.]
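The neighbourhood can be enumerated directly, as in the sketch below, which generates the up to 3^n − 1 neighbour vectors of a label vector; counting how many neighbours change the overall class would then give a rough risk indicator. The function name and the handling of the scale ends are illustrative assumptions.

```python
from itertools import product

# Illustrative enumeration of the neighbourhood of a vector of emergency degrees.
# Labels are ordered from least to most urgent; there is no value below 0 or above U1.
SCALE = ["0", "U3", "U2", "U1"]

def neighbourhood(vector):
    """Return every neighbour vector (the vector itself excluded), at most 3**n - 1 of them."""
    options = []
    for label in vector:
        idx = SCALE.index(label)
        lo, hi = max(idx - 1, 0), min(idx + 1, len(SCALE) - 1)
        options.append(SCALE[lo:hi + 1])          # the label itself plus its defined neighbours
    return [list(v) for v in product(*options) if list(v) != list(vector)]

# Example in dimension 2, as in Fig. 7: the vector (U2, U2) has 8 neighbours.
print(len(neighbourhood(["U2", "U2"])))   # 8
```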
underestimated itself (resp. overestimated) when the overall emergency degree of the
concerned operation is underestimated (resp. overestimated).
[Figure: percentage risk matrix, rows: criteria, columns: the 23 test operations; caption not recovered.]
[Figure data for Fig. 9: percentage risk matrix, rows: criteria, columns: the 23 test operations.]
Fig. 9. Risk of overall overestimation of the operations induced by partial overestimations w.r.t.
criteria.
[Figure: two further percentage matrices, rows: criteria, columns: the 23 test operations; captions not recovered.]
6 Conclusions
The method proposed in this paper allows the experts: 1) to express their assessment values in their own discrete semantic universe, 2) to convert the labels into adequate numerical values using the MACBETH method and clustering techniques, 3) to compute the WAM-based aggregated value and convert it back into the experts' semantic universe, and 4) to carry out a robustness analysis of the evaluation process to assess the risk of misclassification of the operations and to diagnose these misclassifications. This method is implemented in an IPS, SINERGIE, which supports decisions concerning the planning of maintenance operations by the motorway operator Escota.
References
1. Saaty, T.L.: The Analytic Hierarchy Process. McGraw-Hill, New York (1980)
2. Bana e Costa, C.A., Vansnick, J.C.: MACBETH - an interactive path towards the
construction of cardinal value functions. International transactions in Operational Research,
vol. 1, pp. 489–500 (1994)
3. Clivillé, V. : Approche Systémique et méthode multicritère pour la définition d’un système
d’indicateurs de performance. Thèse de l’Université de Savoie, Annecy (2004)
4. Jullien, S., Mauris, G., Valet, L., Bolon, Ph.: Decision aiding tools for Animated film
selection from a mean aggregation of criteria preferences over a finite scale. 11th
Int. Conference on Information processing and Management of uncertainty in Knowledge-
Based Systems, IPMU, Paris, France (2006)
5. Grabisch, M.: Representation of preferences over a finite scale by a mean operator.
Mathematical Social Sciences, vol. 52, pp. 131–151 (2006)
6. Akharraz A., Montmain J., Mauris G.: A project decision support system based on an
elucidative fusion system, Fusion 2002, 5th International Conference on Information
Fusion, Annapolis, Maryland, USA (2002)
7. Akharraz A., Montmain J., Denguir A., Mauris G., Information System and Decisional Risk
Control for a Cybernetic Modeling of Project Management. 5th international conference on
computer science (MCO 04), Metz, France, pp. 407–414 (2004).
A Multiple Sensor Fault Detection Method based on
Fuzzy Parametric Approach
Abstract. This paper presents a new approach to model-based diagnosis. The model relies on an adaptation with a variable forgetting factor, whose variation is managed by fuzzy logic. On this basis, we propose a design method for a sensor-fault diagnosis system. In this study, the adaptive model is developed theoretically for Multiple-Input Multiple-Output (MIMO) systems. We present the design stages of the fuzzy adaptive model and detail the Fault Detection and Isolation (FDI) principle. The approach is validated on a benchmark: a hydraulic process with three tanks. Different sensor faults are simulated with the fuzzy adaptive model, and the fuzzy diagnosis approach is compared with the classical residual method. The method efficiently detects and isolates one or more faults. The results obtained are promising and appear applicable to a wide class of MIMO systems.
Keywords. Adaptive model, fuzzy system models, diagnosis, Fault Detection and
Isolation (FDI).
1 Introduction
The automatic control of technical systems requires fault detection to improve reliability, safety and economy. Diagnosis is the detection, isolation and identification of the type as well as the probable cause of a failure, using logical reasoning based on information coming from an inspection, a control or a test (AFNOR, CEI) [1], [2]. Model-based diagnosis is widely studied in the literature [3], [4], [5]. These methods rely on parameter estimation, parity equations or state observers [3], [4], [6]. The goal is to generate fault indicators through the generation of residuals [7] (Fig. 1).
This paper deals with the problem of model-based diagnosis using a parametric estimation method. We particularly focus our study on an approach with an adaptive model. Many methods exist for designing such adaptive models [3]. Many works tackle model-based diagnosis from a fuzzy model of the process [8], [9], [10], [11], [12].
Sala et al. [13] notice that higher decision levels in process control also use rule bases for decision support, and that supervision, diagnosis and condition monitoring are examples of successful application domains for fuzzy reasoning strategies.
In our work, unlike these approaches, fuzzy logic is used to design the parametric model.
In all cases, for model-based approaches, the quality of fault detection and isolation depends on the quality of the model.
It is possible to improve the model identification by implementing an original method based on parameter adjustment using a Fuzzy Forgetting Factor (FFF) [14]. The idea, in this study, is to use the variations of the fuzzy forgetting factors for fault detection and isolation. Thus, we propose an original method based on a fuzzy adaptation of the parameter adjustments by introducing a fuzzy forgetting factor. From these factors (one per output), we can generate residuals for fault detection and isolation.
This paper is an extension of the approach presented at the fourth International Conference on Informatics in Control, Automation and Robotics [15]; it is now developed to detect and isolate several simultaneous faults. The paper is organised as follows: first, we present the principle of the fuzzy forgetting factor. Then, we summarize the different stages of residual generation and decision-making. In Section 4, we present the application of the method to the diagnosis of a hydraulic process. A numerical example, with several types of sensor faults (bias and calibration faults), is presented to show the performance of this method.
In this section, after having recalled the classical approach for on-line identification, we present a new adaptation method based on the fuzzy variation of the forgetting factor [15].
We consider the modeling of non-linear and non-stationary systems. Consequently, an on-line adaptation is necessary to obtain a valid model capable of describing the process and allowing an adaptive control to be realized [16]. A common technique for estimating the unknown parameters is the Recursive Least Squares (RLS) algorithm with forgetting factor [17], [18], [19].
At each moment k, we obtain a model such that:
$$P(k+1) = \frac{1}{\lambda(k)}\left[P(k) - \frac{P(k)\,\varphi(k)\,\varphi^{T}(k)\,P(k)}{\lambda(k) + \varphi^{T}(k)\,P(k)\,\varphi(k)}\right] \qquad (6)$$
with $\hat{\theta}(k)$ the estimated parameter vector (initialized with the least-squares algorithm), $\varphi(k)$ the regression vector, $\varepsilon(k+1)$ the a posteriori error, $P(k)$ the adaptation gain matrix and $\lambda(k)$ the forgetting factor.
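As an illustration, the following Python sketch shows a recursive least-squares update with a forgetting factor in the spirit of Eq. (6); the variable names, the constant forgetting factor in the toy loop and the noise level are illustrative assumptions, not the authors' implementation.

import numpy as np

def rls_step(theta, P, phi, y, lam):
    """One recursive least-squares update with forgetting factor lam (Eq. (6))."""
    phi = phi.reshape(-1, 1)
    err = y - float(phi.T @ theta.reshape(-1, 1))        # a priori prediction error
    denom = lam + float(phi.T @ P @ phi)
    gain = (P @ phi) / denom
    theta_new = theta + gain.flatten() * err
    P_new = (P - (P @ phi @ phi.T @ P) / denom) / lam     # gain matrix update, Eq. (6)
    return theta_new, P_new

# Toy usage: identify a 4-parameter model on-line with a constant forgetting factor
rng = np.random.default_rng(0)
theta, P = np.zeros(4), 0.1 * np.eye(4)                   # P(0) = G*I with G = 0.1
true_theta = np.array([0.8, -0.2, 0.5, 0.1])
for _ in range(200):
    phi = rng.standard_normal(4)
    y = float(phi @ true_theta) + 0.01 * rng.standard_normal()
    theta, P = rls_step(theta, P, phi, y, lam=0.95)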
If the process is only slightly excited, the gain matrix P(k) grows exponentially [20]. To avoid this problem, and the drift of the parameters, a measure m(k) is introduced as:
$$m(k+1) = 1 \quad \text{if } \frac{u(k+1) - u(k)}{u_{\max}} > S_u \qquad (7)$$
$$\text{or if } \frac{y(k+1) - \hat{\theta}^{T}(k)\varphi(k)}{y_n} > S_y \qquad (8)$$
$$m(k+1) = 0 \quad \text{if } \frac{u(k+1) - u(k)}{u_{\max}} < S_u \qquad (9)$$
$$\text{and if } \frac{y(k+1) - \hat{\theta}^{T}(k)\varphi(k)}{y_n} < S_y \qquad (10)$$
$$P(0) = G\,I \qquad (11)$$
with $G \ll 1$ or $\mathrm{Trace}(P(0)) < 1$, and $I$ the identity matrix.
We choose as initial values:
$$P(0) = \begin{bmatrix} 0.1 & 0 & 0 & 0\\ 0 & 0.1 & 0 & 0\\ 0 & 0 & 0.1 & 0\\ 0 & 0 & 0 & 0.1 \end{bmatrix} \qquad (12)$$
The considered class of the system imposes to use a method with a variable forgetting
factor in order to take into account the non-stationarity of the process.
Generally, the adaptation of a model is obtained by using a RLS algorithm with
forgetting factor. The forgetting factor can be constant or variable.
There are different classical methods for varying the forgetting factor, for example the exponential forgetting factor. In our approach, the variation of λ is defined from the error variation Δ(k), where Δ(k) represents the variation of the mean error over the N last samples:
$$\Delta(k) = \frac{1}{N}\sum_{j=k-N+1}^{k}\left(\varepsilon(j) - \varepsilon(j-1)\right) \qquad (17)$$
Δ(k) is described by three membership functions: one for the negative error, one for the null error and one for the positive error (Fig. 3). A study of the observed process allows the values {−ηmax; −ηmin; ηmin; ηmax} to be determined.
$$\upsilon \rightarrow \begin{cases} \mu_{\text{negative}}(\upsilon) = 1 & \text{if } \upsilon \le -\eta_{\min}\\ \mu_{\text{negative}}(\upsilon) = \dfrac{-1}{\eta_{\min}}\,\upsilon & \text{if } -\eta_{\min} < \upsilon < 0\\ \mu_{\text{negative}}(\upsilon) = 0 & \text{if } \upsilon \ge 0 \end{cases} \qquad (19)$$
$$\upsilon \rightarrow \begin{cases} \mu_{\text{null}}(\upsilon) = 0 & \text{if } \upsilon \le -\eta_{\min}\\ \mu_{\text{null}}(\upsilon) = \dfrac{1}{\eta_{\min}}\,\upsilon + 1 & \text{if } -\eta_{\min} < \upsilon < 0\\ \mu_{\text{null}}(\upsilon) = \dfrac{-1}{\eta_{\min}}\,\upsilon + 1 & \text{if } 0 \le \upsilon < \eta_{\min}\\ \mu_{\text{null}}(\upsilon) = 0 & \text{if } \upsilon \ge \eta_{\min} \end{cases} \qquad (21)$$
$$\upsilon \rightarrow \begin{cases} \mu_{\text{positive}}(\upsilon) = 0 & \text{if } \upsilon \le 0\\ \mu_{\text{positive}}(\upsilon) = \dfrac{1}{\eta_{\min}}\,\upsilon & \text{if } 0 < \upsilon < \eta_{\min}\\ \mu_{\text{positive}}(\upsilon) = 1 & \text{if } \upsilon \ge \eta_{\min} \end{cases} \qquad (23)$$
The membership functions of the input λ(k) and the output λ(k + 1) are identical
(Fig. 4).
$$\nu \rightarrow \begin{cases} \mu_{\text{small}}(\nu) = 1 & \text{if } \nu \le 0.85\\ \mu_{\text{small}}(\nu) = -20\,\nu + 18 & \text{if } 0.85 < \nu < 0.9\\ \mu_{\text{small}}(\nu) = 0 & \text{if } \nu \ge 0.9 \end{cases} \qquad (25)$$
$$\nu \rightarrow \begin{cases} \mu_{\text{mean}}(\nu) = 0 & \text{if } \nu \le 0.85\\ \mu_{\text{mean}}(\nu) = 20\,\nu - 17 & \text{if } 0.85 < \nu < 0.9\\ \mu_{\text{mean}}(\nu) = -20\,\nu + 19 & \text{if } 0.9 \le \nu < 0.95\\ \mu_{\text{mean}}(\nu) = 0 & \text{if } \nu \ge 0.95 \end{cases} \qquad (27)$$
$$\nu \rightarrow \begin{cases} \mu_{\text{great}}(\nu) = 0 & \text{if } \nu \le 0.9\\ \mu_{\text{great}}(\nu) = 20\,\nu - 18 & \text{if } 0.9 < \nu < 0.95\\ \mu_{\text{great}}(\nu) = 1 & \text{if } \nu \ge 0.95 \end{cases} \qquad (29)$$
According to the application, the bounds [0.8 ; 1] can be reduced.
The inference rules are based on the variation method of the exponential forgetting factor. In this case, the forgetting factor must be maximal when the modeling of the system is correct (small error variation). We have also been inspired by Andersson's work: when there is an important non-stationarity, the forgetting factor must decrease.
If $\lambda(k)$ is $F^{1}_{n_\lambda}$ and $\Delta(k)$ is $F^{2}_{n_\varepsilon}$ then $\lambda(k+1)$ is $F^{3}_{n'_\lambda}$, where $F^{1}_{n_\lambda} \in \{F^{1}_{1}, F^{1}_{2}, F^{1}_{3}\}$ is the set of membership functions of the input variable $\lambda(k)$, $F^{2}_{n_\varepsilon} \in \{F^{2}_{1}, F^{2}_{2}, F^{2}_{3}\}$ is the set of membership functions of the input variable $\Delta(k)$, and $F^{3}_{n'_\lambda} \in \{F^{3}_{1}, F^{3}_{2}, F^{3}_{3}\}$ is the set of membership functions of $\lambda(k+1)$, with $n_\lambda = 1$ to 3, $n_\varepsilon = 1$ to 3 and $n'_\lambda = 1$ to 3.
The rules for the output $\lambda(k+1)$ are defined in Table 1. The inference method is max-min and the defuzzification is the centre of gravity:
$$\lambda(k+1) = \frac{\int \mu(z)\,z\,dz}{\int \mu(z)\,dz} \qquad (31)$$
The number of forgetting factors is equal to the number of model outputs.
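For concreteness, here is a minimal Python sketch of how such a fuzzy update of the forgetting factor could be computed from Eqs. (19)–(31); the rule table, which the paper gives in Table 1, is replaced here by an assumed illustrative mapping, so the numbers are not the authors' exact design.

import numpy as np

def memberships_delta(d, eta_min):
    """Membership degrees of the error variation Delta(k), Eqs. (19), (21), (23)."""
    neg = 1.0 if d <= -eta_min else (max(0.0, -d / eta_min) if d < 0 else 0.0)
    null = max(0.0, 1.0 - abs(d) / eta_min)
    pos = 1.0 if d >= eta_min else (max(0.0, d / eta_min) if d > 0 else 0.0)
    return {"neg": neg, "null": null, "pos": pos}

def memberships_lambda(v):
    """Membership degrees of the forgetting factor, Eqs. (25), (27), (29)."""
    small = 1.0 if v <= 0.85 else (max(0.0, -20 * v + 18) if v < 0.9 else 0.0)
    mean = max(0.0, min(20 * v - 17, -20 * v + 19, 1.0))
    great = 0.0 if v <= 0.9 else min(1.0, 20 * v - 18)
    return {"small": small, "mean": mean, "great": great}

# Illustrative rule table (NOT the paper's Table 1): small error variation keeps
# or raises lambda, large error variation lowers it.
RULES = {("small", "null"): "mean", ("mean", "null"): "great", ("great", "null"): "great",
         ("small", "neg"): "small", ("mean", "neg"): "small", ("great", "neg"): "mean",
         ("small", "pos"): "small", ("mean", "pos"): "small", ("great", "pos"): "mean"}

def next_lambda(lam, delta, eta_min, grid=np.linspace(0.8, 1.0, 201)):
    """Max-min inference and centre-of-gravity defuzzification, Eq. (31)."""
    mu_l = memberships_lambda(lam)
    mu_d = memberships_delta(delta, eta_min)
    out = np.zeros_like(grid)
    for (lt, dt), ot in RULES.items():
        w = min(mu_l[lt], mu_d[dt])                       # rule firing strength (min)
        clipped = np.minimum(w, [memberships_lambda(g)[ot] for g in grid])
        out = np.maximum(out, clipped)                    # aggregation (max)
    return float((out * grid).sum() / out.sum()) if out.sum() > 0 else lam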
The residuals are quantities generated by analytical redundancy, representing the difference between the observed and the expected system behaviour. When a fault occurs, the residual signal allows the deviation from the normal operating conditions to be evaluated (Fig. 5).
The residuals are processed and examined under certain decision rules to determine the change of the system status. Thus, the fault is detected, isolated (to distinguish the abnormal behaviours and determine the faulty component) and identified (to characterize the duration of the fault and its amplitude in order to deduce its severity) (Fig. 6).
A threshold on the difference between the outputs of the system and the estimated outputs is chosen in order to proceed to the decision-making.
Fig. 6. Decision-making.
In the classical approach, the residuals $r_j = |y_j - \hat{y}_j|$ are calculated to distinguish the case where there is no failure from the case of a sensor fault. A threshold t is taken: if $r_j \le t$ then $r_j = 0$. At each instant k, the different $r_j$ are checked in order to establish a diagnosis.
Our method uses the fuzzy lambda to detect and isolate a fault on a sensor. For a MIMO system, the algorithm generates one lambda for each output. Let $\lambda_j$, with j = 1 to n, where n is the number of outputs. The residuals $r_j = 1 - \lambda_j$ are calculated to distinguish the case where there is no failure from the case of a sensor fault. A threshold t is taken: if $r_j \le t$ then $r_j = 0$. At each instant k, the different $r_j$ are checked in order to establish a diagnosis, as shown in Table 2.
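A minimal sketch of this decision step, assuming a fixed threshold t and one fuzzy forgetting factor per output (the isolation logic is a placeholder for the paper's Table 2):

def fuzzy_residuals(lambdas, t=0.02):
    """Residuals r_j = 1 - lambda_j, thresholded at t (illustrative value)."""
    return [0.0 if (1.0 - lam) <= t else (1.0 - lam) for lam in lambdas]

def diagnose(lambdas, t=0.02):
    """Return the indices of the outputs (sensors) flagged as faulty."""
    return [j for j, r in enumerate(fuzzy_residuals(lambdas, t)) if r > 0.0]

# Example: lambda_2 has dropped, suggesting a fault on the second sensor
print(diagnose([0.99, 0.90, 0.995]))   # -> [1]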
4 Application
4.1 Benchmark Example: A Hydraulic Process [23]
The approach proposed previously has been validated on a benchmark: a hydraulic process composed of three tanks (Fig. 7). The objective of the regulation is to maintain a constant volume of fluid. The three tanks have the same section S.
The physical model of this system is obtained from the difference between the entering and outgoing flows, which makes the level of each tank evolve.
The state model is described by:
$$\dot{X} = AX + BU, \qquad Y = CX + DU \qquad (32)$$
$$X = [h_1\; h_2\; h_3]^{T}, \qquad U = [q_1\; q_2]^{T}, \qquad Y = X \qquad (33)$$
The vector of outputs is the same as the state vector and, thus, the observation matrix C is a 3x3 identity matrix. This system, considered as linear around an operating point, has been identified using an ARX structure. The discrete model is obtained with a sampling period equal to 0.68 seconds.
The model describes the dynamical behaviour of the system in terms of input/output variations around the operating point $(U_0, Y_0)$:
$$U_0 = (0.8\;\; 1)^{T}, \qquad Y_0 = (400\;\; 300\;\; 200)^{T}$$
The sensor noise no(k) considered is a normal distribution with zero mean and unit variance. This system is completely observable and controllable.
A linear quadratic control, associated with an integrator, enables the feedback gain matrix K to be calculated from the minimization of the following cost function:
$$J = \frac{1}{2}\sum_{k=0}^{N}\left[x^{T}(k)\,Q\,x(k) + u^{T}(k)\,R\,u(k)\right] \qquad (35)$$
4.2 Results
Two types of faults have been tested: the bias and the calibration fault. The bias fault has been simulated by subtracting a constant value β from the real value: for example, $h_{1real} = h_{1sensor} - \beta$.
The calibration fault is obtained by multiplying the real value by a coefficient γ: for example, $h_{1real} = h_{1sensor} \cdot \gamma$.
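As a simple illustration of these two fault models, the following Python sketch injects a bias or a calibration fault into a simulated measurement from a given sample onwards; the sample index and the β and γ values are arbitrary illustrative choices.

import numpy as np

def inject_fault(y_sensor, kind, start, beta=8.0, gamma=0.96):
    """Return a faulty copy of the measurement sequence y_sensor.

    kind  : "bias" (constant offset, sign immaterial for detection) or
            "calibration" (multiplicative gain error)
    start : sample index at which the fault appears
    """
    y = np.asarray(y_sensor, dtype=float).copy()
    if kind == "bias":
        y[start:] = y[start:] - beta      # bias fault: constant offset beta
    elif kind == "calibration":
        y[start:] = y[start:] * gamma     # calibration fault: gain gamma
    return y

# Example: bias fault on h1 from sample 10 onwards
h1 = 400.0 + np.random.default_rng(0).standard_normal(50)   # noisy level around 400
h1_faulty = inject_fault(h1, "bias", start=10)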
For each type of fault, we present two cases of simulated faults: a fault simulated on one sensor, and the same fault simulated on two or three sensors at the same time.
The supervision environment makes it possible to check the correct detection of faults. As soon as a failure is detected, the algorithm stops and indicates which sensor is faulty (Fig. 8).
The physical model is represented by the dotted-line curve and the parametric model by the solid-line curve. For this example, the fault is simulated, at sample 10, on sensor h1. The algorithm detected the fault at sample 12.
Case 1. For these first experiments, we simulated one sensor fault per type of fault. We ran the classical method and our approach with the bias fault and the calibration fault for the three sensors (λ1, λ2, λ3). To compare the two methods, we vary the values of β and γ.
Tables 3 and 4 show the performance of the two methods. For this, we define a rate which is the percentage of detections over 100 tests.
We can note that the fuzzy method gives better results: when the fault is weak (β < 7 or γ > 0.97), the detection rate is higher. Otherwise, the results are similar. To improve the detection with the classical method, the threshold t could be decreased, but that implies a high false-alarm rate: if the threshold is smaller than the noise level, the algorithm stops inopportunely.
Fig. 8. Supervision.
Case 2. In this case, for each type of fault, several faults are simulated in several sensors at the same time. Table 5 shows the performance of the proposed method when the sensors h1 and h2, or h1 and h3, or h2 and h3, or h1, h2 and h3 present a fault.
The detection rate obtained is lower than in Case 1; nevertheless, the FFF method is able to detect several faults simulated at the same time (mainly when β ≥ 8 or γ ≤ 0.96).
Table 5. Rate of detection for several faults for the fuzzy method.
The measurement noise has a great influence on the fault detection. The values presented are the minimal values that the method can detect.
When the measurement noise is larger, these results can be improved by modifying the values ηmin and ηmax defined in Section 2.2. If the measurement noise is very large, it is necessary to increase these initial values; by doing so, a tolerance with respect to the noise is admitted. A compromise should be found between the noise level and the values of ηmin and ηmax: otherwise, the algorithm can raise false alarms.
5 Conclusions
This paper presents an original method of model-based diagnosis with a fuzzy parametric approach. The method is applicable to non-linear MIMO systems for which knowledge of the physical model is not required. We define a Fuzzy Forgetting Factor which improves the estimation of the model parameters and makes it possible to detect and isolate several types of faults. Thus, the fuzzy adaptation of the forgetting factors is used to detect and isolate sensor faults. The results are illustrated on a benchmark system (a hydraulic process), and comparisons between the classical method and the FFF method are shown in Tables 3 and 4. Moreover, the method has been evaluated for several simultaneous sensor faults, with the results presented in Table 5.
In conclusion, the method efficiently detects and isolates one or more faults simultaneously. The proposed approach is able to detect faults corresponding to a bias or a calibration fault on each sensor.
A possible extension would be to determine the values ηmin and ηmax, described in Section 2.2, automatically according to the sensor noise. Moreover, it would be interesting to develop the FFF method for actuator faults.
References
1. Noura, H.: Methodes d’accommodation aux defauts: Theorie et application. Memoire
d’Habilitation a Diriger des Recherches. University Henri Poincare, Nancy 1, (2002)
2. Szederkenyi, G.: Model-Based Fault Detection of Heat Exchangers. Department of Applied
Computer Science. University of Veszprem, (1998)
3. Ripoll, P.: Conception d’un systeme de diagnostic flou appliqu au moteur automobile. Thesis.
University of Savoie, (1999)
The Shape of a Local Minimum and the Probability of its Detection in Random Search

1 Introduction
network is initialized at random, and during the second stage the neural network
relaxes into one of the possible stable states, i.e. it optimizes the energy value. Since
the sought result is unknown and the search is done at random, the neural network is
to be initialized many times in order to find as deep an energy minimum as possible.
But the question about the reasonable number of such random starts and whether the
result of the search can be regarded as successful always remains open.
In this paper we obtain expressions that demonstrate the relationship between the depth of a local minimum of the energy and the size of its basin of attraction [14]. Based on these expressions, we present the probability of finding a local minimum as a function of the depth of the minimum. Such a relation can be used in optimization applications: it allows one, based on a series of already found minima, to estimate the probability of finding a deeper minimum, and to decide in favor of or against further running of the program. Our expressions are obtained from the analysis of the generalized Hopfield model, namely, of a neural network with a Hebbian matrix. They are, however, valid for any matrices, because any kind of matrix can be represented as a Hebbian one constructed on an arbitrary number of patterns. A good agreement between our theory and experiment is obtained.
and its (asynchronous) dynamics consists in the following. Let S be an initial state of the network. Then the local field $h_i = -\partial E/\partial s_i$, which acts on a randomly chosen i-th spin, can be calculated, and the energy of the spin in this field, $\varepsilon_i = -s_i h_i$, can be determined. If the direction of the spin coincides with the direction of the local field ($\varepsilon_i < 0$), then its state is stable, and at the subsequent moment (t + 1) its state will undergo no changes. In the opposite case ($\varepsilon_i > 0$) the state of the spin is unstable and it flips along the direction of the local field, so that $s_i(t+1) = -s_i(t)$ with the energy $\varepsilon_i(t+1) < 0$. Such a procedure is applied sequentially to all the spins of the neural network. Each spin flip is accompanied by a lowering of the neural network energy. It means that after a finite number of steps the network relaxes to a stable state, which corresponds to a local energy minimum.
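The relaxation procedure just described can be sketched in a few lines of Python; the quadratic energy and the random symmetric connection matrix J used below are standard Hopfield-type ingredients and are meant only to illustrate the dynamics, not to reproduce the authors' experimental code.

import numpy as np

def relax(J, s, rng=np.random.default_rng(0)):
    """Asynchronous relaxation to a local energy minimum.

    J : symmetric connection matrix with zero diagonal, shape (N, N)
    s : initial spin configuration of +/-1 values, shape (N,)
    """
    s = s.copy()
    stable = False
    while not stable:
        stable = True
        for i in rng.permutation(len(s)):
            h_i = J[i] @ s                 # local field acting on spin i
            if s[i] * h_i < 0:             # unstable spin (eps_i > 0)
                s[i] = -s[i]               # flip along the local field
                stable = False
    return s                               # stable point = local minimum

def energy(J, s):
    return -0.5 * s @ J @ s

# One random start; many such starts are used to search for deep minima
N = 100
J = np.random.default_rng(1).standard_normal((N, N)); J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)
s0 = np.sign(np.random.default_rng(2).standard_normal(N))
s_min = relax(J, s0)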
3 Basin of Attraction
Let us examine under which conditions the pattern $S_m$ embedded in the matrix (1) will be a stable point at which the energy E of the system reaches its (local) minimum $E_m$. In order to obtain correct estimates we consider the asymptotic limit $N \rightarrow \infty$. We define the basin of attraction of a pattern $S_m$ as the set of points of the N-dimensional space from which the neural network relaxes into the configuration $S_m$.
Let us try to estimate the size of this basin. Let the initial state of the network S be located in a vicinity of the pattern $S_m$. Then the probability of the network converging to the point $S_m$ is given by the expression:
$$\Pr = \left(\frac{1 + \operatorname{erf}\gamma}{2}\right)^{N} \qquad (3)$$
where $\operatorname{erf}\gamma$ is the error function of the variable $\gamma$:
$$\gamma = \frac{r_m \sqrt{N}}{\sqrt{2(1 - r_m^2)}}\left(1 - \frac{2n}{N}\right) \qquad (4)$$
$$n_m = \frac{N}{2}\left(1 - \frac{r_0}{r_m}\sqrt{\frac{1 - r_m^2}{1 - r_0^2}}\right) \qquad (5)$$
where
$$r_0 = \sqrt{2\ln N / N} \qquad (6)$$
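A small numerical sketch of Eqs. (3)–(6), useful for checking the scaling of the convergence probability; it is a direct transcription of the formulas, with illustrative parameter values.

import math

def r0(N):
    """Minimal statistical weight forming a local minimum, Eq. (6)."""
    return math.sqrt(2.0 * math.log(N) / N)

def basin_halfwidth(N, rm):
    """Basin of attraction size n_m, Eq. (5); zero if rm <= r0."""
    r = r0(N)
    if rm <= r:
        return 0.0
    return 0.5 * N * (1.0 - (r / rm) * math.sqrt((1.0 - rm**2) / (1.0 - r**2)))

def convergence_probability(N, rm, n):
    """Probability of relaxing into S_m from a state at distance n, Eqs. (3)-(4)."""
    gamma = rm * math.sqrt(N) * (1.0 - 2.0 * n / N) / math.sqrt(2.0 * (1.0 - rm**2))
    return ((1.0 + math.erf(gamma)) / 2.0) ** N

# Example: N = 500 neurons, a pattern with statistical weight rm = 0.3
N, rm = 500, 0.3
print(r0(N), basin_halfwidth(N, rm), convergence_probability(N, rm, n=20))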
Fig. 1. A typical dependence of the width of the basin of attraction $n_m$ on the statistical weight of the pattern $r_m$. A local minimum exists only for those patterns whose statistical weight is greater than $r_0$. For $r_m \rightarrow r_0$ the size of the basin of attraction tends to zero, i.e. the patterns whose statistical weight $r_m \le r_0$ do not form local minima.
From the analysis of Eq. (2) it follows that the energy of a local minimum $E_m$ can be represented in the form:
$$E_m = -r_m N^2 \qquad (7)$$
with an accuracy up to an insignificant fluctuation of the order of
$$\sigma_m = N\sqrt{1 - r_m^2} \qquad (8)$$
Then, taking into account Eqs. (5) and (7), one can easily obtain the following expression:
$$E_m = \frac{E_{\min}}{\sqrt{(1 - 2n_m/N)^2 + E_{\min}^2/E_{\max}^2}} \qquad (9)$$
where
$$E_{\min} = -N\sqrt{2N\ln N}, \qquad E_{\max} = -\left(\sum_{m=1}^{M} E_m^2\right)^{1/2} \qquad (10)$$
which yield a relationship between the depth of the local minimum and the size of its
basin of attraction. One can see that the wider the basin of attraction, the deeper the
local minimum and vice versa: the deeper the minimum, the wider its basin of
attraction (see Fig.2).
Fig. 2. The dependence of the energy of a local minimum on the size of the basin of attraction:
a) N=50; b) N=5000.
We have also introduced here a constant $E_{\max}$, which we make use of in what follows. It denotes the maximal possible depth of a local minimum. In the adopted normalization there is no special need to introduce this new notation, since it follows from (7)–(9) that $E_{\max} = -N^2$. However, for other normalizations other dependencies of $E_{\max}$ on N are possible, which could lead to a misunderstanding.
The quantity $E_{\min}$ introduced in (10) characterizes two parameters of the neural network simultaneously. First, it determines the half-width of the Lorentzian distribution (9). Second, it follows from (9) that:
$$E_{\max} \le E_m \le E_{\min} \qquad (11)$$
i.e. $E_{\min}$ is the upper boundary of the local minimum spectrum and characterizes the minimal possible depth of a local minimum. These results are in good agreement with the results of computer experiments aimed at checking whether there is a local minimum at the point $S_m$ or not. The results of one of these experiments (N = 500, M = 25) are shown in Fig. 3. One can see a good linear dependence of the energy of the local minimum on the statistical weight of the pattern. Note that the overwhelming number of the experimental points corresponding to local minima are situated in the lower right quadrant, where $r_m > r_0$ and $E_m < E_{\min}$. One can also see from Fig. 3 that, in accordance with (8), the dispersion of the energies of the minima decreases as the statistical weight increases.
Fig. 3. The dependence of the energy Em of a local minimum on the statistical weight rm of
the pattern.
Now let us describe the shape of the local minimum landscape. Let $S_m^{(n)}$ be any point in the n-vicinity of the local minimum $S_m$ (n is the Hamming distance between $S_m^{(n)}$ and $S_m$). It follows from (2) that the energy $E_m^{(n)}$ in the state $S_m^{(n)}$ can be described as
where
$$D = 4\sum_{\mu \ne m}^{M}\sum_{i=1}^{n}\sum_{j=n+1}^{N} r_\mu\, s_{mi}\, s_{\mu i}\, s_{\mu j}\, s_{mj} \qquad (13)$$
As we see, all the minima have the same shape, described by Eq. (15), which is independent of the matrix type. This can be checked experimentally by generating random matrices and investigating the shape of randomly chosen minima (see Fig. 4). As we see, the experiment is in good agreement with the theory.
Fig. 4. The shape of randomly detected minima: curve – theory (15), marks – experiment.
Equations (5) and (16) implicitly define a connection between the depth of the local minimum and the probability of finding it. Applying the asymptotic Stirling expansion to the binomial coefficients and passing from summation to integration, one can represent (16) as
$$W = W_0\, e^{-Nh} \qquad (17)$$
where h is the generalized Shannon function
$$h = \frac{n_m}{N}\ln\frac{n_m}{N} + \left(1 - \frac{n_m}{N}\right)\ln\left(1 - \frac{n_m}{N}\right) + \ln 2 \qquad (18)$$
The probability (17) is visibly non-zero only for deep enough minima $|E_m| \gg |E_{\min}|$, whose basin of attraction sizes are comparable with N/2. Taking into account (9), the expression (18) can in this case be transformed into a dependence $W = W(E_m)$ given by
$$W = W_0 \exp\left[-N E_{\min}^2\left(\frac{1}{E_m^2} - \frac{1}{E_{\max}^2}\right)\right] \qquad (19)$$
It follows from (18) that the probability of finding a minimum increases with its depth. This dependence, "the deeper the minimum → the larger the basin of attraction → the larger the probability of getting into this minimum", is confirmed by the results of numerous experiments. In Fig. 5 the solid line is computed from Eq. (17), and the points correspond to the experiment (Hebbian matrix with a small loading parameter $M/N \le 0.1$). One can see that a good agreement is achieved first of all for the deepest minima, which correspond to the patterns $S_m$ (the energy interval $E_m \le -0.49\,N^2$ in Fig. 5). The experimentally found minima of small depth (the points in the region $E_m > -0.44\,N^2$) are the so-called "chimeras". In the standard Hopfield model ($r_m \equiv 1/M$) they appear at relatively large loading parameters $M/N > 0.05$. In the more general case considered here, they can also appear earlier. The reasons leading to their appearance are well examined with the help of the methods of statistical physics in [17], where it was shown that the chimeras appear as a consequence of the interference of the minima of $S_m$. At a small loading parameter the chimeras are separated from the minima of $S_m$ by an energy gap, clearly seen in Fig. 5.
Fig. 5. The dependence of the probability W to find a local minimum on its depth Em: theory -
solid line, experiment – points.
6 Discussion
Our analysis shows that the properties of the generalized model are described by three parameters: $r_0$, $E_{\min}$ and $E_{\max}$. The first determines the minimal value of the statistical weight at which a pattern forms a local minimum. The second and third are, respectively, the minimal and the maximal depth of the local minima. It is important that these parameters are independent of the number of embedded patterns M.
Fig. 6. The comparison of the predicted probabilities (solid line) and the experimentally found
values (points connected with the dashed line).
Now we are able to formulate a heuristic approach to finding the global minimum of the functional (2) for any given matrix (not necessarily a Hebbian one). The idea is to use the expression (19) with unknown parameters $W_0$, $E_{\min}$ and $E_{\max}$. To do this, one starts the procedure of random search and finds some minima. Using the data obtained, one determines typical values of $E_{\min}$ and $E_{\max}$ and the fitting parameter $W_0$ for the given matrix. Substituting these values into (19), one can estimate the probability of finding an unknown deeper minimum $E_m$ (if it exists) and decide in favor of or against (if the estimate is a pessimistic one) further running of the program.
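A minimal sketch of this stopping heuristic in Python, assuming the minima found so far and their empirical hit frequencies are available; the least-squares fit of $W_0$, $E_{\min}$ and $E_{\max}$ in Eq. (19) is one possible way to implement the estimation step and is not prescribed by the paper.

import numpy as np
from scipy.optimize import curve_fit

def w_of_e(E, W0, Emin, Emax, N=1000):
    """Probability of hitting a minimum of depth E, Eq. (19)."""
    return W0 * np.exp(-N * Emin**2 * (1.0 / E**2 - 1.0 / Emax**2))

def should_continue(found_E, found_W, target_E, N=1000, threshold=1e-4):
    """Fit Eq. (19) on the minima found so far and estimate the probability
    of reaching a deeper (more negative) minimum target_E."""
    p0 = (max(found_W), max(found_E), min(found_E))   # rough initial guess
    params, _ = curve_fit(lambda E, W0, Emin, Emax: w_of_e(E, W0, Emin, Emax, N),
                          found_E, found_W, p0=p0, maxfev=10000)
    p_deeper = w_of_e(np.array([target_E]), *params, N=N)[0]
    return p_deeper > threshold, p_deeper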
This approach was tested with Hebbian matrices at relatively large values of the loading parameter ($M/N$ in the range 0.2–10). The result of one of the experiments is shown in Fig. 6. In this experiment, with the aid of the minima found (the points A), the parameters $W_0$, $E_{\min}$ and $E_{\max}$ were calculated, and the dependence $W = W(E_m)$ (solid line) was found. After repeating the procedure of random search over and over again ($\sim 10^5$ random starts), other minima (points B) and the precise probabilities of getting into them were found. One can see that, although some dispersion is present, the predicted values agree in order of magnitude with the precise probabilities.
Fig. 7. The case of matrix with a quasi-continuous type of spectrum. a) The upper part of the
figure shows the spectrum of minima distribution – each vertical line corresponds to a
particular minimum. The solid line denotes the spectral density of minima (the number of
minima at length ΔE ). The Y-axis presents spectral density and the X-axis is the normalized
values of energy minima E / N 2 . b) Probability of finding a minimum with energy E . The Y-
axis is the probability of finding a particular minimum (%) and the X-axis is the normalized
values of energy minima.
In conclusion, we stress once again that any given symmetric matrix can be represented in the form of a Hebbian matrix (1) constructed on an arbitrary number of patterns (for instance, $M \rightarrow \infty$) with arbitrary statistical weights. It means that the dependence "the deeper the minimum ↔ the larger the basin of attraction ↔ the larger the probability of getting into this minimum", as well as all the other results obtained in this paper, is valid for all kinds of matrices. To verify this dependence, we generated random matrices with elements uniformly distributed on the segment [-1, 1]. The results of a local-minima search on one of these matrices are shown in Fig. 7: the normalized energy is shown on the X-axis and the spectral density on the Y-axis. As we can see, there are a lot of local minima, and most of them are concentrated in the central part of the spectrum (Fig. 7a). Despite this complex structure of the spectrum of minima, the deepest minimum is found with maximal probability (Fig. 7b). The same good agreement between the theory and the experimental results is also obtained in the case of random matrices whose elements follow a Gaussian distribution with zero mean.
References
1. Hopfield, J.J.: Neural Networks and physical systems with emergent collective
computational abilities. Proc. Nat. Acad. Sci.USA. v.79, pp.2554-2558 (1982)
2. Hopfield, J.J., Tank, D.W.: Neural computation of decisions in optimization problems.
Biological Cybernetics, v.52, pp.141-152 (1985)
3. Fu, Y., Anderson, P.W.: Application of statistical mechanics to NP-complete problems in
combinatorial optimization. Journal of Physics A., v.19, pp.1605-1620 (1986)
4. Poggio, T., Girosi, F.: Regularization algorithms for learning that are equivalent to
multilayer networks. Science 247, pp.978-982 (1990)
5. Smith, K.A.: Neural Networks for Combinatorial Optimization: A Review of More Than a
Decade of Research. INFORMS Journal on Computing v.11 (1), pp.15-34 (1999)
6. Hartmann, A.K., Rieger, H.: New Optimization Algorithms in Physics, Wiley-VCH, Berlin
(2004)
7. Huajin Tang; Tan, K.C.; Zhang Yi: A Columnar Competitive Model for Solving
Combinatorial optimization problems. IEEE Trans. Neural Networks v.15, pp.1568–1574 (2004)
8. Kwok, T., Smith, K.A.: A noisy self-organizing neural network with bifurcation dynamics
for combinatorial optimization. IEEE Trans. Neural Networks v.15, pp.84 – 98 (2004)
9. Salcedo-Sanz, S.; Santiago-Mozos, R.; Bousono-Calzon, C.: A hybrid Hopfield network-
simulated annealing approach for frequency assignment in satellite communications
systems. IEEE Trans. Systems, Man and Cybernetics, v. 34, 1108 – 1116 (2004)
10. Wang, L.P., Li, S., Tian F.Y, Fu, X.J.: A noisy chaotic neural network for solving
combinatorial optimization problems: Stochastic chaotic simulated annealing. IEEE Trans.
System, Man, Cybern, Part B - Cybernetics v.34, pp. 2119-2125 (2004)
11. Wang, L.P., Shi, H.: A gradual noisy chaotic neural network for solving the broad-
cast scheduling problem in packet radio networks. IEEE Trans. Neural Networks, vol.17,
pp.989 – 1000 (2006)
12. Joya, G., Atencia, M., Sandoval, F.: Hopfield Neural Networks for Optimization: Study of
the Different Dynamics. Neurocomputing, v.43, pp. 219-237 (2002)
13. Kryzhanovsky, B., Magomedov, B.: Application of domain neural network to optimization
tasks. Proc. of ICANN’2005. Warsaw. LNCS 3697, Part II, pp.397-403 (2005)
14. Kryzhanovsky, B., Magomedov, B., Fonarev, A.: On the Probability of Finding Local
Minima in Optimization Problems. Proc. of International Joint Conf. on Neural Networks
IJCNN-2006 Vancouver, pp.5882-5887 (2006)
15. Kryzhanovsky, B.V.: Expansion of a matrix in terms of external products of configuration
vectors. Optical Memory & Neural Networks, v. 17, No.1, pp. 17-26 (2008)
16. Perez-Vincente, C.J.: Finite capacity of sparce-coding model. Europhys. Lett, v.10,
pp. 627-631 (1989)
17. Amit, D.J., Gutfreund, H., Sompolinsky, H.: Spin-glass models of neural networks.
Physical Review A, v.32, pp.1007-1018 (1985)
Rapidly-exploring Sorted Random Tree: A Self
Adaptive Random Motion Planning Algorithm
Nicolas Jouandeau
Abstract. We present a novel single-shot randomized algorithm, named RSRT for Rapidly-exploring Sorted Random Tree, based on an analysis of the inherent relations between RRT components. Experimental results are obtained on a wide set of path-planning problems involving a free-flying object in a static environment. The results show that our RSRT algorithm is faster than existing ones. These results can also serve as a starting point for a massive motion-planning benchmark.
The problem of motion planning can be solved only by systems with high computational power due to its inherent complexity [1]. As the main goal of the discipline is to develop practical and efficient solvers that produce motions automatically, random sampling searches successfully reduce the deterministic-polynomial complexity of the resolution [2]. In compensation, the resolution, which consists in exploring the space, produces non-deterministic solutions [3]. The principal alternatives of this search are realized in the configuration space C [4], in the state space X [5] and in the state-time space ST [6]. C is intended for motion planning in static environments; X adds differential constraints; ST adds the possibility of a dynamic environment. The concept of high-dimensional configuration spaces was initiated by J. Barraquand et al. [7] to use a manipulator with 31 degrees of freedom. P. Cheng [8] uses these methods with a 12-dimensional state space involving rotating rigid objects in 3D space. S. M. LaValle [9] presents such a space with a hundred dimensions for a robot manipulator or a couple of mobiles. The RRT method [10] is based on the construction of a tree T in the considered space S. Starting from the initial position qinit, the construction of the tree is carried out by integrating control commands iteratively. Each iteration aims at bringing the mobile M closer to an element e randomly selected in S. To avoid cycles, two elements e of T cannot be identical. In practice, RRT is used to solve various problems such as negotiating narrow passages made of obstacles [11], finding motions that satisfy obstacle-avoidance and dynamic balance constraints [12], making Mars exploration vehicle strategies [13], searching hidden objects [14], rallying a set of points or playing hide-and-seek with another mobile [15], and many others mentioned in [9]. Thus the RRT method can be considered the most general one, given its efficiency in solving a large set of problems.
In its initial formulation, the RRT algorithm is defined without a goal. The exploration tree covers the surrounding space and progresses blindly towards free space. A geometrical path-planning problem generally aims at reaching a final configuration qobj. To solve the path-planning problem, the RRT method searches for a solution by building a tree (Alg. 1) rooted at the initial configuration qinit. Each node of the tree results from the integration of the mobile constraints. Its edges are commands that are applied to move the mobile from one configuration to another.
The RRT method is a random incremental search which can be cast in the framework of Las Vegas Algorithms (LVA). It repeats successively a loop made of three phases: generating a random configuration qrand, selecting the nearest configuration qprox, and generating a new configuration qnew obtained by numerical integration over a fixed time step Δt. The mobile M and its constraints are not explicitly specified; therefore, modifications for additional constraints (such as non-holonomic ones) are considered minor in the algorithm formulation.
In this first version, C is presented without obstacles in an arbitrary space dimension. At each iteration, a local planner is used to connect each couple (qnew, qprox) in C. The distance between two configurations in T is defined by the time step Δt. The local planner is composed of temporal and geometrical integration constraints. The resulting solution accuracy is mainly due to the chosen local planner. k defines the maximum depth of the search. If no solution is found after k iterations, the search can be restarted with the previous T without re-executing the init function (Alg. 1, line 1).
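As an illustration of this loop, here is a minimal Python sketch of the basic RRT construction for a holonomic point mobile; the sampling ranges, the step size and the straight-line integration are illustrative assumptions rather than the paper's exact setting.

import random, math

def rrt(q_init, sample, collision_free, step=0.5, k=5000):
    """Basic RRT loop: sample q_rand, pick nearest q_prox, extend towards it.

    sample()            : returns a random configuration (tuple of floats)
    collision_free(a,b) : True if the straight segment a -> b lies in C_free
    """
    tree = {q_init: None}                      # node -> parent
    for _ in range(k):
        q_rand = sample()
        q_prox = min(tree, key=lambda q: math.dist(q, q_rand))   # nearest node
        d = math.dist(q_prox, q_rand)
        if d == 0.0:
            continue                           # avoid duplicate nodes
        # one integration step of length 'step' towards q_rand
        q_new = tuple(p + step * (r - p) / d for p, r in zip(q_prox, q_rand))
        if q_new not in tree and collision_free(q_prox, q_new):
            tree[q_new] = q_prox
    return tree

# Example in a 2D configuration space without obstacles
tree = rrt((0.0, 0.0),
           sample=lambda: (random.uniform(-10, 10), random.uniform(-10, 10)),
           collision_free=lambda a, b: True, k=200)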
The RRT method, inspired by traditional Artificial Intelligence techniques for finding sequences between an initial and a final element (i.e. qinit and qobj) in a well-known environment, can become a bidirectional search (shortened to Bi-RRT [16]). Its principle is based on the simultaneous construction of two trees (called Tinit and Tobj), where the first grows from qinit and the second from qobj. The two trees are developed towards each other while no connection is established between them. This bidirectional search is justified because the meeting configuration of the two trees lies near the half-course of the configuration space separating qinit and qobj; therefore, the resolution time complexity is reduced [17].
RRT-Connect [18] is a variation of Bi-RRT that considerably increases the convergence of Bi-RRT towards a solution thanks to the enhancement of the convergence of the two trees. This has been settled to:
– ensure a fast resolution for "simple" problems (in a space without obstacles, the RRT growth should be faster than in a space with many obstacles);
– maintain the probabilistic convergence property. Using heuristics modifies the convergence probability towards the goal and may also modify its evolving distribution. Modifying the random sampling can create local minima that could slow down the algorithm convergence.
In RRT-Connect, the two graphs previously called Tinit and Tobj are now called Ta and Tb (Alg. 3). Ta (respectively Tb) replaces Tinit and Tobj alternately (respectively Tobj and Tinit). The main contribution of RRT-Connect is the ConnectT function, which moves towards the same configuration as long as possible (i.e. without collision). As the incremental nature of the algorithm is reduced, this variation is designed for non-differential constraints. This is iteratively realized by the expansion function (Alg. 2). A connection is defined as a succession of successful extensions. An expansion towards a configuration q becomes either an extension or a connection. After connecting qnew successfully to Ta, the algorithm tries as many extensions as possible towards qnew in Tb. The configuration qnew becomes the convergence configuration qco (Alg. 3, lines 8 and 10).
Inherent relations inside the adequate construction of T in Cfree shown in previous works are:
– the deviation of random sampling in the variations Bi-RRT and RRT-Connect. Variations included in RRT-Connect are called RRT-ExtCon, RRT-ConCon and RRT-ExtExt; they modify the construction strategy of one of the two trees of the RRT-Connect method by changing the priorities of the extension and connection phases [19];
– the selection of the qprox element according to its collision probability in the CVP variation, and the integration of collision detection into the qprox generation [20];
– the adaptation of C to the vicinity accessibility of qprox in the RC-RRT variation [21];
– the parallel execution of growing operations for n distinct graphs in the OR parallel Bi-RRT variation, and the growing of a shared graph with parallel qnew sampling in the embarrassingly parallel Bi-RRT variation [22];
– the adaptation of the sampling to the RRT growth [23–27].
By adding the collision detection in the given space S during the expansion phase,
the selection of nearest neighbor qprox is realized in S ∩ Cf ree (Alg. 4). Although the
2 RSRT Algorithm
The variations of the RRT method presented in the previous section are based on the following sequence:
– generating qrand;
– selecting qprox in T;
– generating each successor of qprox defined in U;
– performing a collision test for each successor previously defined;
– selecting the configuration qnew that is the closest to qrand among the previously defined successors; this selected configuration has to be collision-free;
– the insertion of qnew in T (i.e. without obstacle along the path between qprox and qnew);
– the rejection of all qprox successors (i.e. when at least one obstacle lies along each successor path rooted at qprox).
The rejection of qnew induces an expansion probability related to its vicinity (and thus also to the vicinity of qprox): the closer the configuration qprox is to obstacles, the weaker its expansion probability. This recalls one of the fundamental RRT paradigms: free spaces are made of configurations that admit various numbers of available successors; good configurations admit many successors and bad configurations admit only a few. Therefore, the more good configurations are inserted in T, the better the RRT expansion will be. The problem is that we do not know in advance which good and bad configurations are needed during the RRT construction, because the solution of the considered problem is not yet known. This problem is also underlined by the parallel variation OR Bi-RRT [22] (i.e. defining the depth of a search in a specific vicinity). For a path-planning problem p with a solution s available after n integrations starting from qinit, the question is to maximize the probability of finding a solution. According to the concept of "rational action", the response of the P3 class for adapting an on-line search can be solved by the definition of a formula that defines the cost of the search in terms of "local effects" and "propagations" [28]. These problems find a way in the tuning of the algorithm behavior, as CVP did [20].
In the case of a space made of a single narrow passage, the use of bad configurations (whose successors generally collide) is necessary to solve the problem. The weak extension probability of such configurations is one of the weaknesses of the RRT method.
To bypass this weakness, we propose to reduce the search from the closest element (Alg. 4) to the first free element of Cfree. This is realized by reversing the relation between collision detection and the distance metric: the solution of each iteration is validated by subordinating collision tests to the distance metric, and the first successful call to the collision detector validates a solution. A sketch of the resulting selection step is given below.
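The following Python sketch illustrates this inversion under simple assumptions (a straight-line local planner and a Euclidean metric): instead of testing collisions for every successor and then taking the closest free one, the candidates are sorted by distance and the first collision-free candidate is accepted. The names and data structures are illustrative, not the paper's Alg. 5.

import math

def select_first_free(tree, q_rand, successors, collision_free):
    """Sorted selection: subordinate collision tests to the distance metric.

    tree                 : iterable of existing configurations
    successors(q)        : candidate new configurations reachable from q
    collision_free(a, b) : True if the path a -> b lies in C_free
    Returns (q_prox, q_new) or None if every candidate collides.
    """
    # Build all (parent, candidate) pairs and sort them by distance to q_rand
    candidates = [(q, q_new) for q in tree for q_new in successors(q)]
    candidates.sort(key=lambda pair: math.dist(pair[1], q_rand))
    for q_prox, q_new in candidates:
        if collision_free(q_prox, q_new):   # first free candidate wins
            return q_prox, q_new
    return None

In the best case a single collision test validates the iteration, which is where the expected speed-up over testing every successor of the nearest node comes from.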
3 Experiments
This section presents experiments performed on a RedHat Linux cluster that consists of 8 dual-core 2.8 GHz Pentium 4 processors (5583 bogomips) with 512 MB DDR RAM.
Fig. 1. 20 obstacles problem and its solution (upper couple). 100 obstacles problem and its solu-
tion (lower couple).
To perform the run-time behavior analysis of our algorithm, we have generated series of problems that gradually contain more 3D obstacles. For each problem, we have randomly generated ten different instances. The number of obstacles is defined by the sequence 20, 40, 60, . . . , 200, 220. In each instance, all obstacles are cubes and their sizes vary randomly between (5, 5, 5) and (20, 20, 20). The mobile is a cube with a fixed size (10, 10, 10). Obstacle and mobile coordinates vary between (−100, −100, −100) and (100, 100, 100). For each instance, a set of 120 qinit and 120 qobj are generated in Cfree. By combining each qinit and each qobj, 14400 configuration tuples are available for each instance of each problem. For all that, our benchmark is made of more than 1.5 million problems. An instance with 20 obstacles is shown in the upper part of Fig. 1 and another instance with 100 obstacles in the lower part. In these two examples, qinit and qobj are also visible. We used the Proximity Query Package (PQP) library presented in [29] to perform the collision detection. The mobile is a free-flying object controlled by a discretized command that contains 25 different inputs uniformly dispatched over translations and rotations. The performance was compared between RRT-Connect (using the RRT-ExtCon strategy) and our RSRT algorithm (Alg. 5).
The choice of the distance metric has important consequences for the connexity of configurations in Cfree. It defines the next convergence node qco for the local planner. The distance metric must be selected according to the behavior of the local planner so as to limit its failures. The local planner chosen is the straight line in C. To validate the robustness of our algorithm with respect to RRT-Connect, we used three different distance metrics:
– the Euclidean distance (denoted Eucl in Figs. 2 to 4)
$$d(q, q') = \left(\sum_{k}(c_k - c'_k)^2 + n_f^2 \sum_{k}(\alpha_k - \alpha'_k)^2\right)^{1/2}$$
Figure 2 shows that the average resolving time of our algorithm is between 10 and 4 times smaller than that of the original RRT-Connect algorithm. As the space obstruction grows linearly, the resolving time of RRT-Connect grows exponentially while that of the RSRT algorithm grows linearly. Figure 3 shows that the standard deviation follows the same profile, indicating that the RSRT algorithm is more robust than RRT-Connect. Figure 4 shows that the distributions of the medians follow the behavior of the average resolving time. This reinforces the success of the RSRT algorithm and shows that half of the time distribution is 10 to 4 times faster than RRT-Connect.
(Figs. 2–4 plot the resolving times of RRT-Connect ("Rrt") and of the new RSRT algorithm ("new Rrt") under the Eucl, Eucl2 and Manh distance metrics, as a function of the number of obstacles.)
4 Conclusions
We have described a new RRT-based algorithm, called the RSRT algorithm, to solve motion-planning problems in static environments. The RSRT algorithm considerably reduces the resulting resolving time. The experiments show the practical performance of the RSRT algorithm, and the results reflect its classical behavior. The results given above have been evaluated on a cluster, which provides a massive experimental analysis. The challenging goal is now to extend the proposed benchmark to every motion-planning method. The benchmark will be enhanced with specific situations that allow RRT to deal with motion-planning strategies based on statistical analysis.
References
1. Canny, J.: The complexity of robot motion planning. PhD thesis, Massachusetts Institute of
Technology. Artificial Intelligence Laboratory. (1987)
2. Schwartz, J., Sharir, M.: On the piano movers problem:I, II, III, IV, V. Technical report,
New York University, Courant Institute, Department of Computer Sciences (1983)
3. Latombe, J.: Robot Motion Planning (4th edition). Kluwer Academic (1991)
4. Lozano-Pérez, T.: Spatial Planning: A Configuration Space Approach. In: Trans. on Computers. (1983)
5. Donald, B., Xavier, P., Canny, J., Reif, J.: Kinodynamic Motion Planning. Journal of the
ACM (1993)
6. Fraichard, T.: Dynamic trajectory planning with dynamic constraints: a ”state-time space”
approach. In: Int. Conf. Robotics and Automation (ICRA’93). (1993)
7. Barraquand, J., Latombe, J.: A Monte-Carlo Algorithm for Path Planning with many degrees
of Freedom. In: Int. Conf. on Robotics and Automation (ICRA’90). (1990)
8. Cheng, P.: Reducing rrt metric sensitivity for motion planning with differential constraints.
Master’s thesis, Iowa State University (2001)
9. LaValle, S.: Planning Algorithms. [on-line book] (2004)
http://msl.cs.uiuc.edu/planning/.
10. LaValle, S.: Rapidly-exploring random trees: A new tool for path planning. Technical Report
98-11, Dept. of Computer Science, Iowa State University (1998)
11. Ferré, E., Laumond, J.: An iterative diffusion algorithm for part disassembly. In: Int. Conf.
Robotics and Automation (ICRA’04). (2004)
12. Kuffner, J., Nishiwaki, K., Kagami, S., Inaba, M., Inoue, H.: Motion planning for humanoid
robots. In: Int’l Symp. Robotics Research (ISRR’03). (2003)
13. Williams, B.C., B.C., Kim, P., Hofbaur, M., How, J., Kennell, J., Loy, J., Ragno, R., Stedl,
J., Walcott, A.: Model-based reactive programming of cooperative vehicles for mars explo-
ration. In: Int. Symp. on Artificial Intelligence, Robotics and Automation in Space. 2001
14. Tovar, B., LaValle, S., Murrieta, R.: Optimal navigation and object finding without geometric
maps or localization. In: Int. Conf. on Robotics and Automation (ICRA’03). (2003)
15. Simov, B., LaValle, S., Slutzki, G.: A complete pursuit-evasion algorithm for two pursuers
using beam detection. In: Int. Conf. on Robotics and Automation (ICRA’02). (2002)
16. LaValle, S., Kuffner, J.: Randomized kinodynamic planning. In: Int. Conf. on Robotics and
Automation (ICRA’99). (1999)
17. Russell, S., Norvig, P.: Artificial Intelligence, A Modern Approach (2nd edition). Prentice
Hall (2003)
18. Kuffner, J., LaValle, S.: RRT-Connect: An efficient approach to single-query path planning.
In: Int. Conf. on Robotics and Automation (ICRA’00). (2000)
19. LaValle, S., Kuffner, J.: Rapidly-exploring random trees: Progress and prospects. In: Work-
shop on the Algorithmic Foundations of Robotics (WAFR’00). (2000)
20. Cheng, P., LaValle, S.: Reducing Metric Sensitivity in Randomized Trajectory Design. In:
Int. Conf. on Intelligent Robots and Systems (IROS’01). (2001)
21. Cheng, P., LaValle, S.: Resolution Complete Rapidly-Exploring Random Trees. In: Int.
Conf. on Robotics and Automation (ICRA’02). (2002)
22. Carpin, S., Pagello, E.: On Parallel RRTs for Multi-robot Systems. In: 8th Conf. of the
Italian Association for Artificial Intelligence (AI*IA’02). (2002)
23. Jouandeau, N., Chérif, A.A.: Fast approximation to Gaussian random sampling for randomized motion planning. In: Int. Symp. on Intelligent Autonomous Vehicles (IAV'04). (2004)
24. Cortés, J., Siméon, T.: Sampling-based motion planning under kinematic loop-closure con-
straints. In: Workshop on the Algorithmic Foundations of Robotics (WAFR’04). (2004)
25. Lindemann, S.R., LaValle, S.M.: Current issues in sampling-based motion planning. In: Int.
Symp. on Robotics Research (ISRR’03). (2003)
26. Lindemann, S., LaValle, S.: Incrementally reducing dispersion by increasing Voronoi bias in
RRTs. In: Int. Conf. on Robotics and Automation (ICRA’04). (2004)
27. Yershova, A., Jaillet, L., Simeon, T., LaValle, S.M.: Dynamic-domain rrts: Efficient ex-
ploration by controlling the sampling domain. In: Int. Conf. on Robotics and Automation
(ICRA’05). (2005)
28. Russell, S.: Rationality and Intelligence. In Press, O.U., ed.: Common sense, reasoning, and
rationality. (2002)
29. Gottschalk, S., Lin, M., Manocha, D.: Obb-tree: A hierarchical structure for rapid interfer-
ence detection. In: Proc. of ACM Siggraph’96. (1996)
Applying an Intensification Strategy
on Vehicle Routing Problem
1 Introduction
The Vehicle Routing Problem (VRP), which is an NP-hard problem [1], is usually dealt with in the logistics context [2],[3]. It can be described as a set of customers that have to be served by a fleet of vehicles, satisfying some constraints [4],[3]. Transport is one of the most costly activities in logistics, typically accounting for between one and two thirds of the total costs [5]. Therefore, the necessity of improving the efficiency of this activity is of great importance: a small percentage saved in this activity could result in a substantial total saving [6]. There are many variants and constraints that can be considered: the fleet may be heterogeneous, the vehicles may have to perform collections and deliveries, there may be more than one depot, etc. In this paper we deal with the classic version of this problem, where only the vehicle capacity constraint is considered.
A classical definition is presented in Barbarasoglu and Ozgur [7]. The VRP is defined on a complete, undirected graph G = (V, A) at which a fleet of Nv vehicles of homogeneous capacity is located. All remaining vertices are customers to be served. A non-negative matrix C = (cij) is defined on A representing the distance between the vertices; the costs are the same in both directions. A non-negative demand di is associated with each vertex, representing the customer demand at vi. The routes must start and finish at the depot. The clients must be visited just once, by only one vehicle, and the total demand of a route cannot exceed the capacity Qv of the vehicle. In some cases, there is a limitation on the total route duration; in this case, tij represents the travel time for each (vi, vj), ti represents the service time at vertex vi, and it is required that the total duration of any route does not exceed Tv. A typical formulation based on [7] is used in this paper:
$$\text{Minimize } \sum_{i}\sum_{j}\sum_{v} c_{ij}\, X_{ij}^{v} \qquad (1)$$
$$\sum_{i}\sum_{v} X_{ij}^{v} = 1 \quad \text{for all } j \qquad (2)$$
$$\sum_{j}\sum_{v} X_{ij}^{v} = 1 \quad \text{for all } i \qquad (3)$$
$$\sum_{i} X_{ip}^{v} - \sum_{j} X_{pj}^{v} = 0 \quad \text{for all } p, v \qquad (4)$$
$$\sum_{i} d_i \Big(\sum_{j} X_{ij}^{v}\Big) \le Q_v \quad \text{for all } v \qquad (5)$$
$$\sum_{j=1}^{n} X_{0j}^{v} \le 1 \quad \text{for all } v \qquad (6)$$
$$\sum_{i=1}^{n} X_{i0}^{v} \le 1 \quad \text{for all } v \qquad (7)$$
$$(X_{ij}^{v}) \in Z \quad \text{for all } i, j \text{ and } v \qquad (8)$$
where the Xij are binary variables indicating whether arc (vi, vj) is traversed by vehicle v. The objective function of distance/cost/time is expressed by eq. (1). Constraints (2) and (3) together state that each demand vertex is served by exactly one vehicle. Eq. (4) guarantees that a vehicle leaves a demand vertex as soon as it has served it. Vehicle capacity is expressed by (5), where Qv is the capacity. Constraints (6) and (7) express that the vehicle availability cannot be exceeded. The subtour elimination constraints are given in eq. (8), where Z can be defined by:
$$Z = \Big\{(X_{ij}^{v}) : \sum_{i \in B}\sum_{j \in B} X_{ij}^{v} \le |B| - 1, \;\; \text{for } B \subseteq V\setminus\{0\},\; |B| \ge 2 \Big\} \qquad (9)$$
3 Resolution Methods
Since the VRP is NP-hard, heuristics are used to obtain good solutions in an acceptable time, and this is the reason why the majority of researchers and scientists direct their efforts to heuristics development [8],[9],[3]. Osman and Laporte [10] define a heuristic as a technique which seeks good solutions at a reasonable computational cost without being able to guarantee optimality. Laporte et al. [11] define two main groups of heuristics: classical heuristics, developed mostly between 1960 and 1990, and metaheuristics. The classical heuristics are divided into three groups: constructive methods, two-phase methods and improvement methods. Since 1990, metaheuristics have been applied to the VRP. For Osman and Laporte [10], a metaheuristic is formally defined as an iterative generation process which guides a subordinate heuristic by combining intelligently different concepts for exploring and exploiting the search space in order to find efficiently near-optimal solutions. Several metaheuristics have been proposed to solve the VRP; among them, Tabu Search is considered the best metaheuristic for the VRP. For a review of works with Tabu Search and other metaheuristics, some readings are suggested [12],[13].
Tabu Search was proposed by Glover [14] and had its concepts detailed in Glover and
Laguna [15]. It is a technique for solving combinatorial optimization problems [14] that
consists of an iterative routine to construct neighborhoods, emphasizing the prohibition
of stopping at a local optimum. Tabu Search searches for the best solution through an
aggressive exploration [15], choosing the best move at each iteration, regardless of
whether this move improves the value of the current solution or not. In Tabu Search,
intensification and diversification strategies are alternated through the analysis of the
tabu attributes. Diversification strategies direct the search to new regions, aiming to
cover the whole search space, while intensification strategies reinforce the search in
the neighborhood of a historically good solution [15]. The stop criterion makes it
possible to stop the search; it can be defined as the iteration where the best results were
found or as a maximum number of iterations without an improvement in the value of
the objective function. The tabu list is a structure that keeps some attributes of
solutions that are considered tabu. The objective of this list is to forbid the use of these
solutions for some defined time.
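The generic loop described above can be sketched as follows; this is an illustrative skeleton, assuming problem-specific neighborhood, cost and move-attribute functions, and is not the authors' implementation:

# Generic Tabu Search skeleton (illustrative only; neighborhood(), cost() and
# attribute() are problem-specific placeholders).
def tabu_search(initial, neighborhood, cost, attribute,
                tabu_tenure=10, max_iterations=1000, max_no_improve=100):
    current = best = initial
    best_cost = cost(best)
    tabu = {}          # move attribute -> iteration until which it stays tabu
    no_improve = 0
    for it in range(max_iterations):
        # Aggressive exploration: pick the best admissible neighbor,
        # even if it is worse than the current solution.
        candidates = sorted(neighborhood(current), key=cost)
        moved = False
        for cand in candidates:
            attr = attribute(current, cand)
            cand_cost = cost(cand)
            # Aspiration: a tabu move is accepted if it beats the best solution.
            if tabu.get(attr, -1) < it or cand_cost < best_cost:
                current, current_cost = cand, cand_cost
                tabu[attr] = it + tabu_tenure
                moved = True
                break
        if not moved:
            break
        if current_cost < best_cost:
            best, best_cost, no_improve = current, current_cost, 0
        else:
            no_improve += 1
        if no_improve >= max_no_improve:   # stop criterion described above
            break
    return best, best_cost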
4 VRP Application
An elite solution list is used to keep the best results that were found during the search.
An intensification strategy was proposed, to be used every time the search executes
15 iterations without an improvement in the objective function value. In this strategy,
we visit every solution that is in the elite list, generating a large neighborhood for each one.
Two movements were defined for neighborhood generation: V1, which exchanges
vertices, and V2, which relocates vertices. In V1, one route r1 is selected and then one
vertex of this route is chosen. We try to exchange this vertex with every vertex of all
the other routes. The exchange is done if the addition of the two new demands does
not exceed the vehicle capacity of either route. This procedure is done for every vertex
of route r1, and every exchange that is made generates one neighbor. In V2, we select
one route, choose one vertex, and then try to relocate it into all other routes, provided
the vehicle capacity of the receiving route is not exceeded. When a vertex can be
inserted into a route, we try to insert it into all possible positions inside this route, and
every position into which the vertex is inserted generates one neighbor.
When these movements are used in the search with intensification, they are called
V1' and V2', because with intensification not only one route is selected as in V1
and V2, but all routes of the solution are considered. Aiming to increase the
neighborhood size and the diversification between the solutions, we proposed to use
the movements both alone and together.
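The two moves can be sketched as follows (illustrative code under the assumption that routes are lists of customer indices with a common vehicle capacity; V1 exchanges a vertex between two routes and V2 relocates a vertex into another route, both under the capacity checks described above):

# Illustrative sketches of the V1 (exchange) and V2 (relocation) moves.
def load(route, demand):
    return sum(demand[i] for i in route)

def v1_exchange(routes, r1, demand, capacity):
    """V1: try to exchange each vertex of route r1 with vertices of the other routes."""
    for i, vi in enumerate(routes[r1]):
        for r2, other in enumerate(routes):
            if r2 == r1:
                continue
            for j, vj in enumerate(other):
                new_load_r1 = load(routes[r1], demand) - demand[vi] + demand[vj]
                new_load_r2 = load(other, demand) - demand[vj] + demand[vi]
                if new_load_r1 <= capacity and new_load_r2 <= capacity:
                    neighbor = [list(r) for r in routes]
                    neighbor[r1][i], neighbor[r2][j] = vj, vi
                    yield neighbor          # one neighbor per feasible exchange

def v2_relocate(routes, r1, demand, capacity):
    """V2: try to relocate each vertex of route r1 into every position of the other routes."""
    for i, vi in enumerate(routes[r1]):
        for r2, other in enumerate(routes):
            if r2 == r1 or load(other, demand) + demand[vi] > capacity:
                continue
            for pos in range(len(other) + 1):
                neighbor = [list(r) for r in routes]
                neighbor[r1].pop(i)
                neighbor[r2].insert(pos, vi)
                yield neighbor              # one neighbor per insertion position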
5 Computational Experience
There were generated 54 experiments for each group, combining all values proposed
for Nbmax with all Tabu List sizes. Thus, for each problem, 162 experiments were
generated using intensification and 162 experiments without it. Two types of analyses
were done: in one, the best result obtained for a fixed value of Nbmax used with all
Tabu List sizes was evaluated; in the other, the best results obtained for a fixed Tabu
List size used with all values proposed for Nbmax were evaluated. Analyses were also
made comparing the best result found for each group of experiments, in this case
comparing the quality of the different movements.
5.1 Analyzing the Nbmax Variation for each Tabu List Size
By analyzing the results from this perspective, the variation of Nbmax for each Tabu
List size is evaluated. The objective is to verify whether large values of Nbmax can
improve the quality of the Tabu Search process. We created a "lower average" from
the results obtained with Nbmax = 100 and Nbmax = 250 and an "upper average" from
the results obtained with Nbmax = 1750 and 2000. For all problems, analyzing each
one of the 6 groups of experiments, the "upper average" was always lower than the
"lower average", indicating that large values of Nbmax can improve the search quality.
The experiments showed that an increase in the Nbmax value can improve the search
quality by decreasing the costs of the solutions.
Table 2 shows the number of best results that were found for each Nbmax value:
Table 2. Quantity and localization of the best results found for problems 1, 2, 3, 4 and 5.
This table shows that most of the best results were found when the search used large
values of Nbmax.
5.2 Analyzing the Tabu List Size Variation for each Nbmax Value
By analyzing the results from this perspective, the variation of the tabu list size for
each Nbmax value is evaluated. The objective is to verify whether a large tabu list size
can improve the quality of the Tabu Search process. For problem 1, not using
intensification, 66,66% of the best results were found with Tabu list size ≥ 75; using
intensification, this was 48,14%. For problem 2 the percentages were 55,55% and
59,25%, not using and using intensification. For problem 3 the percentages were
77,77% and 96,29%, not using and using intensification. For problem 4 these
percentages were 74,05% and 77,77%, and for problem 5 they were 63,88% and
92,59%. So, by analyzing these results, we can see that a large tabu list size can
improve the quality of the search process.
5.3 Comparing the Search Process Using and not Using Intensification
This analysis intends to compare the search process using and not using the
intensification strategy, to see if it can improve the generated results.
Figure 1 shows an example of the graphs produced with the results obtained in both
search processes, used to compare their quality.
[Figure 1: cost of the solutions (y-axis, approx. 1250-1500) for each Tabu List/Nbmax combination (x-axis), with and without intensification.]
Fig. 1. Costs obtained from both search processes for Tabu List size = 75 and using V2 for
problem 5.
This figure shows that the intensification strategy improves all results of the search
process using V2 for problem 5. A comparison of the results generated by the search
process using and not using intensification was done. Figures 2 to 6 show the number
of results that were improved with the intensification strategy. Figure 2 shows that, for
problem 1, out of the 162 results generated, 97 were improved with intensification.
[Fig. 2: best results quantity, not using vs. using intensification, for problem 1.]
For problem 2, from 162 results, 121 were improved using intensification strategy.
[Fig. 3: best results quantity, not using vs. using intensification, for problem 2.]
For problem 3, from 162 results 102 were improved using intensification strategy.
[Fig. 4: best results quantity, not using vs. using intensification, for problem 3.]
For problem 4, from 162 results 117 were improved using intensification strategy.
[Fig. 5: best results quantity, not using vs. using intensification, for problem 4.]
For problem 5, from 162 results 135 were improved using intensification strategy.
[Fig. 6: best results quantity, not using vs. using intensification, for problem 5.]
The figures show that, for every problem, at least 50% of the results improved when
the intensification strategy was used.
The average results and the standard deviations for problems 1 to 5 are shown in
Tables 3 to 7.
Table 3. Problem 1 – average results and standard deviations, with and without intensification.

         Average                      Standard deviation
         Not using     Using          Not using     Using
         Intensif.     Intensif.      Intensif.     Intensif.
V1       657,55        650,95         24,79         28,30
V2       582,08        570,30         21,16         22,21
V1,V2    537,36        542,01          8,56         13,62

Table 4. Problem 2 – average results and standard deviations, with and without intensification.

         Average                      Standard deviation
         Not using     Using          Not using     Using
         Intensif.     Intensif.      Intensif.     Intensif.
V1       951,07        943,02         21,68         25,84
V2       895,75        883,03         20,67         21,71
V1,V2    867,96        863,06         13,00         14,70

Table 5. Problem 3 – average results and standard deviations, with and without intensification.

         Average                      Standard deviation
         Not using     Using          Not using     Using
         Intensif.     Intensif.      Intensif.     Intensif.
V1       954,01        948,12         19,60         16,95
V2       903,63        901,17         14,50         17,84
V1,V2    879,90        870,10         15,05         20,73

Table 6. Problem 4 – average results and standard deviations, with and without intensification.

         Average                      Standard deviation
         Not using     Using          Not using     Using
         Intensif.     Intensif.      Intensif.     Intensif.
V1       1215,35       1210,79        12,79         10,70
V2       1124,93       1118,83        15,66         12,52
V1,V2    1087,72       1079,54        18,15         16,70

Table 7. Problem 5 – average results and standard deviations, with and without intensification.

         Average                      Standard deviation
         Not using     Using          Not using     Using
         Intensif.     Intensif.      Intensif.     Intensif.
V1       1569,11       1561,59        10,12         16,22
V2       1421,41       1393,56        34,40         12,75
V1,V2    1387,93       1377,82        26,64         16,37
From the results presented in Tables 3 to 7 we can see that the results generated by
the combined movements are better than the results obtained using the movements
alone. The reason is that when the movements are used together, the size of the
generated neighborhood is bigger than the neighborhood generated by V1 or V2
alone. The movements together also increase the diversification of the solutions, and
when the search generates more results, it is doing a deeper search in the space. As
was shown, the intensification strategy also helps the search to produce higher quality
results.
When comparing the V1 and V2 movements, we can see that V2 produces higher
quality results. If we analyze the policy behind the movements, we can say that V2 is
more flexible than V1: V1 needs two constraints to be satisfied to generate one
neighbor, while in V2 just one capacity must be verified (the capacity of the vehicle
serving the route into which the vertex is being relocated), whereas in V1 both routes
must be verified to check that the vehicle capacities are not exceeded.
The next table shows the best results obtained for each problem. All the best results
were obtained during the search using movements V1 and V2 together and with the
intensification strategy.
5.4 Comparisons
Aiming to evaluate the quality of the developed application, some papers were
selected from the literature to compare the results. Some classical heuristics and some
papers that also used Tabu Search to solve the VRP were selected. The selected papers
were: {WL} Willard [19], {PF} Pureza and França [20], {OM1} Osman [21],
{OM2} Osman [22], {RG} Rego [23], {GHL} Gendreau, Hertz and Laporte [24],
{BO} Barbarasoglu and Ozgur [7], {XK} Xu and Kelly [3], {TV} Toth and Vigo
[25], {CW} Clarke and Wright [26], {GM} Gillett and Miller [27], {MJ} Mole and
Jameson [28], {CMT} Christofides, Mingozzi and Toth [18].
Table 9 shows the comparison with the results from these papers. The reference
results were obtained from Barbarasoglu and Ozgur [7] and from Gendreau, Hertz and
Laporte [24]. The first column shows the compared paper. Columns 2 and 4 present
the best results from the papers, and columns 3 and 5 show the difference in
percentage between our results and the compared results; this difference is called the
"gap". A (+) indicates that our result is that percentage higher than the result from the
compared paper; a (-) indicates that our result is that percentage lower.
Table 9. Comparison with the best results from the literature.

            Problem 1              Problem 2
            Best       %Gap        Best       %Gap
WL          588        11,91(-)    893        5,33(-)
RG          557,86      6,17(-)    847        0,10(+)
PF          536         2,10(-)    842        0,69(+)
OM1         524,61      0,15(+)    844        0,45(+)
OM2         524,61      0,15(+)    844        0,45(+)
GHL         524,61      0,15(+)    835,77     1,42(+)
BO          524,61      0,15(+)    836,71     1,31(+)
XK          524,61      0,15(+)    835,26     1,48(+)
TV          524,61      0,15(+)    838,60     1,09(+)
CW          578,56     10,11(-)    888,04     4,74(-)
GM          546         3,92(-)    865        2,03(-)
MJ          575         9,44(-)    910        7,33(-)
CMT         534         1,63(-)    871        2,73(-)
By analyzing this table we can see that our application produces higher quality
results than all the classical heuristics used in the comparison, since our result was
better than all of their results. Comparing with the other tabu search algorithms, we
can say that our algorithm is very competitive: it dominates at least 2 of the 9 results
used for each problem. Moreover, the gap to the other results was smaller than 5% in
all cases, and in 25 cases out of 45 it was smaller than 2%.
6 Final Considerations
In this paper an application using Tabu Search to solve the vehicle routing problem
was proposed. The application was divided into 3 modules: a net generation module,
an initial solution module and a tabu search module. We used two movements, based
on relocation of vertices and exchange of vertices, to create the neighborhood, and we
used the movements alone and together, intending to diversify the solutions. We used
an elite solution list to keep the best results found during the search, and proposed an
intensification strategy to be used every time the search executes 15 iterations without
improvement in the objective value. We proposed experiments to test whether the
solution quality increases with an increase in the Nbmax value and in the Tabu List
size. We also compared the search process using and not using intensification,
intending to see whether the solution quality is improved with the intensification
strategy. The experiments showed that large values of Nbmax and of the Tabu list size
could improve the results, and also that the intensification strategy can improve the
quality of the search.
References
1. Lenstra, J.K., Rinnooy Kan, A.H.G.: Complexity of Vehicle Routing and Scheduling Problems.
Networks 11, 221-227 (1981)
2. Ho, S.C., Haugland, D.: A tabu search heuristic for the vehicle routing problem with time
windows and split deliveries. Computers & Operations Research 31, 1947-1964 (2004)
3. Xu, J., Kelly, J.P.: A Network Flow-Based Tabu Search Heuristic for the Vehicle
Routing Problem. Transportation Science 30, 379-393 (1996)
4. Laporte, G.: The Vehicle Routing Problem: An overview of exact and approximate
algorithms. European Journal of Operational Research 59, 345-358 (1992)
5. Ballou, R.H.: Gerenciamento da cadeia de Suprimentos – Planejamento, Organização e
Logística Empresarial, 4th edn. Bookman, Porto Alegre (2001)
6. Bodin, L.D., Golden, B.L., Assad, A.A., Ball, M.O.: Routing and Scheduling of Vehicles
and Crews: The State of the Art. Computers and Operations Research 10, 69-211 (1983)
7. Barbarasoglu, G., Ozgur, D.: A tabu search algorithm for the vehicle routing problem.
Computers & Operations Research 26, 255-270 (1999)
8. Thangiah, S.R., Petrovik, P.: Introduction to Genetic Heuristics and Vehicle Routing
Problems with Complex Constraints. In: Woodruff, D.L. (ed.) Advances in Computational
and Stochastic Optimization, Logic Programming, and Heuristic Search: Interfaces in
Computer Science and Operations Research. Kluwer Academic Publishers (1997)
9. Nelson, M.D., Nygard, K.E., Griffin, J.H., Shreve, W.E.: Implementation Techniques for
the Vehicle Routing Problem. Computers & Operations Research 12, 273-283 (1985)
Abstract. When new data are obtained or simply when time goes by, the pre-
diction accuracy of models in use may decrease. However, the question is when
prediction accuracy has dropped to a level where the model can be considered
out of date and in need of updating. This article describes a method that was de-
veloped for detecting the need for a model update. The method was applied in
the steel industry, and the models whose need of updating is under study are two
regression models, a property model and a deviation model, developed to facili-
tate planning of optimal process settings by predicting the yield strength of steel
plates beforehand. To decide on the need for updating, information from similar
past cases was utilized by introducing a limit called an exception limit for both
models. The limits were used to indicate when a new observation was from an
area of the model input space where the results of the models are exceptional.
Moreover, an additional limit was formed to indicate when too many exceedings
of the exception limit have occurred within a certain time scale. These two limits
were then used to decide when to update the model.
1 Introduction
At Ruukki’s steel works in Raahe, Finland, liquid steel is cast into steel slabs that are
then rolled into steel plates. Many different variables and mechanisms affect the me-
chanical properties of the final steel plates. The desired specifications of the mechani-
cal properties of the plates vary, and to fulfil the specifications, different treatments are
required. Some of these treatments are complicated and expensive, so it is possible to
optimize the process by predicting the mechanical properties beforehand on the basis
of planned production settings [1].
Regression models have been developed for Ruukki to help development engineers
control mechanical properties such as yield strength, tensile strength, and elongation
of the metal plates [2]. Two different regression models are used for every mechani-
cal property: a property model and a deviation model. The first one tells the predicted
quality value, while the second one tells the actual working limits around this value.
However, the acquisition of new data and the passing of time decrease the reliability of
these models, which can result in economic losses to the plant. For example, when
mechanical properties required by the customer are not satisfied in qualification tests,
the testing lot in question needs to be reproduced. If retesting also gives an unsatisfac-
tory result, the whole order has to be produced again. Because of the volumes produced
in a steel mill, this can cause huge losses. Thus, updating of the models emerges as
an important step in improving modelling in the long run. This study is a follow-up to
article [3], in which the need to update the regression model developed to model the
yield strength of steel plates was studied from the point of view of the property model.
However, in this article the deviation model is added to the study, making the solution
more complete.
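As an illustration of how the two models might be combined when planning process settings, the sketch below uses placeholder linear predictors and a hypothetical width factor k; the actual fitted models are described later in the paper:

import numpy as np

# Illustrative combination of a property model (predicted value) and a deviation
# model (spread around it); beta, tau and k are placeholders, not fitted values.
def predict(x, z, beta, tau):
    mu = float(np.dot(x, beta))        # property model: predicted yield strength
    sigma = float(np.dot(z, tau))      # deviation model: predicted deviation
    return mu, sigma

def working_limits(mu, sigma, k=2.0):
    """Hypothetical working limits around the predicted value, e.g. mu +/- k*sigma."""
    return mu - k * sigma, mu + k * sigma

# Example with toy inputs.
mu, sigma = predict(x=[1.0, 0.5], z=[1.0, 0.2], beta=[300.0, 40.0], tau=[10.0, 5.0])
print(working_limits(mu, sigma))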
In practice, because the model is used in advance to plan process settings and since
the employees know well how to produce common steel plate products, modelling of
rare and new events becomes the most important aspect. However, to make new or
rarely manufactured products, a reliable model is needed. Thus, when comparing the
improvement in the model’s performance, rare events are emphasized.
In this study model adaptation was approached by searching for the exact time when
the performance of the model has decreased too much. In practice, model adaptation
means retraining the model at optimally selected intervals. However, because the sys-
tem has to adapt quickly to a new situation in order to avoid losses to the plant, periodic
retraining, used in many methods ([4], [5]), is not considered the best approach. More-
over, there are also disadvantages if retraining is done unnecessarily. For example, extra
work is needed to take a new model into use in the actual application environment. In
the worst case, this can result in coding errors that affect the actual accuracy of the
model.
Some other studies, for example [6], have considered model adaptation as the model’s
ability to learn behaviour in areas from which information has not been acquired. In this
study, adaptation of the model is considered to be the ability to react to time-dependent
changes in the modelled causality. In spite of extensive literature searches, studies com-
parable with the approach used in this article were not found. Thus, it can be assumed
that the approach is new, at least in an actual industrial application.
The data for this study were collected from Ruukki’s steel works production database
between July 2001 and April 2006. The whole data set consisted of approximately
250,000 observations. Information was gathered from element concentrations of actual
ladle analyses, normalization indicators, rolling variables, steel plate thicknesses, and
other process-related variables [7]. The observations were gathered during actual prod-
uct manufacturing. The volumes of the products varied, but if there were more than
500 observations from one product, the product was considered a common product.
Products with less than 50 observations were categorized as rare products.
In the studied prediction model, the response variable used in the regression mod-
elling was the Box-Cox-transformed yield strength of the steel plates. The Box-Cox
transformation was selected to produce a Gaussian-distributed error term. The deviation
in yield strength also depended strongly on input variables. Thus, the studied prediction
model included separate link-linear models for both mean and variance
y_i \sim N(\mu_i, \sigma_i^2), \quad \mu_i = f(x_i \beta), \quad \sigma_i = g(z_i \tau).    (1)
The length of the parameter vector of mean model β was 130 and the length of the
parameter vector of variance model τ was 30. The input vectors xi and zi included 100
carefully chosen non-linear transformations of the 30 original input variables; for ex-
ample, many of these transformations were products of two or three original inputs. The
link functions f and g that were used were power transformations selected to maximize
the fit with data. The results are presented in the original (nontransformed) scale of the
response variable [2].
where
n      = number of observations in a neighbourhood,
ε_i    = the prediction error of the ith observation of the neighbourhood,
d_i    = the Euclidean distance from the new observation to the ith observation of the neighbourhood,
max(d) = the maximum allowed Euclidean distance between the new observation and the previous observations in the neighbourhood (= 3.5).
ASSRN = \frac{\sum_{i=1}^{n} \left(1 - \frac{d_i}{\max(d)}\right) \cdot \frac{\varepsilon_i^2}{\hat{\sigma}_i^2}}{\sum_{i=1}^{n} \left(1 - \frac{d_i}{\max(d)}\right)} ,    (3)
where \hat{\sigma}_i^2 is the predicted variance of the ith observation of the neighbourhood and the remaining terms are as defined above.
4 Study
The method used to observe the need for a model update was based on a combination
of the APEN and ASSRN terms. To be precise, in both cases a limit, called the
exception limit, was used to decide when the value of a term differed too much from
the expected value and should be called an exception. The exception limits were
developed differently for the property and deviation models. However, because the
property and deviation models are both used to solve the final working limits of
product manufacturing, it was considered that they should be updated at the same
time. Thus, the exceptions were combined to find the actual update moment.
In the property model case, it was considered an exception if the model's average
prediction error in the neighbourhood of a new observation (APEN) differed too much
from zero compared with the number of similar past cases. When there are plenty
of accurately predicted similar past cases, the APEN is always near zero. When the
number of similar past cases decreases, the sensitivity of the APEN (in relation to
measurement variation) increases, also in situations where the actual model would be
accurate. In other words, the sensitivity of the APEN and the number of neighbours
are negatively correlated. The exception limit for the APEN is shown in Figure 1. It
defines how high the absolute value of the average prediction error of a neighbourhood
(= |APEN|) has to be, in relation to the size of the neighbourhood, before it can be
considered an exception. This design was introduced to avoid possible sensitivity
issues of the APEN. In practice, if the size of the neighbourhood was 500 (the area is
well known), prediction errors higher than 8 were defined as exceptions, while with a
neighbourhood of size 5, the error had to be over 50. The values of the prediction
errors used were decided by relating them to the average predicted deviation,
σ̂̄_i (≈ 14.4).
In the deviation model case the exceptions were related to the average of the squared
standardized residuals in the neighbourhood. Two different boundary values were for-
med so that if the ASSRN was smaller than the first value or greater than the second
value, it was considered an exception. The actual exceptions were thus achieved using
the equation:
ev(i) = \begin{cases} 1, & \text{when } ASSRN < 1/3 \\ 1, & \text{when } ASSRN > 3 \\ 0, & \text{else} \end{cases}    (4)
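A possible reading of equations (3) and (4) in code is sketched below (illustrative only; the neighbourhood search itself is assumed to be given, and the 1/3 and 3 bounds follow the boundary values given above):

import numpy as np

# Illustrative computation of the ASSRN of eq. (3) and the exception rule of eq. (4).
def assr_n(errors, sigmas, dists, max_d=3.5):
    """Distance-weighted average of squared standardized residuals in a neighbourhood."""
    w = 1.0 - np.asarray(dists) / max_d
    ssr = (np.asarray(errors) ** 2) / (np.asarray(sigmas) ** 2)
    return np.sum(w * ssr) / np.sum(w)

def deviation_exception(assr, lower=1/3, upper=3.0):
    """Eq. (4): an exception if the ASSRN falls outside [lower, upper]."""
    return 1 if (assr < lower or assr > upper) else 0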
[Fig. 1. The exception limit for the APEN: the required |APEN| as a function of the size of the neighbourhood (0-500).]
A second limit, the update limit, was formed to merge the information on exceptions.
The limit was defined as being exceeded if the sum of the exceptions within a certain
time interval was more than 10 percent of the number of observations in the interval.
The chosen interval was 1000 observations, which represents measurements from
approximately one week of production. Thus, the model was retrained every time the
sum of the exception values exceeded 100 during the preceding 1000 observations.
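The update limit can be sketched as a sliding-window rule (illustrative code; the exception values are the flags produced by the two models):

from collections import deque

# Illustrative sliding-window update rule: retrain when more than 10% of the
# last 1000 observations are exceptions.
def needs_update(exception_stream, window=1000, limit=100):
    recent = deque(maxlen=window)
    for i, ev in enumerate(exception_stream):   # ev is 0 or 1 per observation
        recent.append(ev)
        if sum(recent) > limit:
            return i                            # index where the update limit is exceeded
    return None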
The study was started by training the parameters of the regression model using the
first 50,000 observations (approximately one year). After that, the trained model was
used to give the APENs and ASSRNs of new observations. The point where the
update limit was exceeded for the first time was located, and the parameters of the
model were updated using all the data acquired by then. The study was carried on by
studying the accuracy of the model after each update and repeating the steps
iteratively until the whole data set was passed.
5 Results
The effect of the update was studied by comparing the results of the actual prediction
error and the deviation achieved using our approach to the case without an update. The
accuracy of the models was measured in the property prediction case using the weighted
mean absolute prediction error
MAE = \frac{1}{\sum_{i=1}^{N} w(i)} \sum_{i=1}^{N} w(i)\, |y_i - \hat{\mu}_i| .    (5)
In the variation model case, a robustified negative log-likelihood was employed to take
into account the variance:

robL = \frac{1}{\sum_{i=1}^{N} w(i)} \sum_{i=1}^{N} w(i) \left[ \log(\hat{\sigma}_i^2) + \rho\!\left( \frac{(y_i - \hat{\mu}_i)^2}{\hat{\sigma}_i^2} \right) \right] ,    (6)

with

\rho(t) = \begin{cases} t, & \text{when } t \le 25 \\ 25, & \text{when } t > 25 \end{cases}    (7)

which truncates the squared standardized residuals if the standardized residual is below
-5 or above +5.
Two different methods were used to define the weights, w(i). They were chosen to
reflect the usability value of the models. In the first goodness criterion the weights w(i)
were defined productwise, meaning the weight of the observations of a product could be
at most as much as the weight of T observations. Let Ti be the number of observations
that belong to the same product as the ith observation. Then the weight of observation
i is
w(i) = \begin{cases} 1, & \text{when } T_i \le T \\ T/T_i, & \text{when } T_i > T \end{cases}    (8)
Here the value of T = 50, meaning that if there are more than 50 observations of a
product, the weight is scaled down. The second goodness criterion was formed to take
only rare observations into account. These were cases for which there were fewer
than 30 previous observations within a distance of 0.9 or within a distance of 1.8, with
the weight of the latter doubled (equation 9).
w(i) = \begin{cases} 1, & \text{when } \#\{x_j : \|x_i - x_j\| < 0.9,\ j < i\} < 30 \\ 2, & \text{when } \#\{x_j : \|x_i - x_j\| < 1.8,\ j < i\} < 30 \\ 0, & \text{else} \end{cases}    (9)
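The first goodness criterion and the weighted error of equation (5) can be sketched as follows (illustrative code; the rare-case weights of equation (9) would additionally require the pairwise input-space distances and are omitted here):

import numpy as np

# Illustrative product-wise weights of eq. (8) and weighted MAE of eq. (5).
def productwise_weights(product_ids, T=50):
    counts = {}
    for p in product_ids:
        counts[p] = counts.get(p, 0) + 1
    return np.array([1.0 if counts[p] <= T else T / counts[p] for p in product_ids])

def weighted_mae(y, mu, w):
    y, mu, w = map(np.asarray, (y, mu, w))
    return np.sum(w * np.abs(y - mu)) / np.sum(w)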
In the property model case, the results of the mean absolute prediction errors (MAE)
using the goodness criteria are shown in Table 1. The step size indicates the length
of the iteration step. The results are averages of the absolute prediction errors of the
observations between each iteration step. Thus, the size of the data set used to calculate
the average is the same as the step size. In addition to this, to compare the results, the
prediction error averages are presented in three different cases: predicted with a newly
updated model, with a model updated in the previous iteration step, and with a model
that was not updated at all. The results in Table 1 show that, although the differences
between the new model and the model from the previous iteration step are not in every
case very big, the update improves the prediction in each of the steps. In addition, in
some steps it can be seen that even small addition of training data improves the accuracy
remarkably. For example, when adding the data of step size 743, the accuracy of the
newly updated model is remarkably better compared with the model of the previous
iteration step. Naturally, the benefit of the model update is obvious when the results of
the updated model and the model without an update are compared.
Table 2 shows the results for the deviation model using the two different goodness
criteria. Also in this case, the positive effect of the model update on the robustified
negative log-likelihood (robL) can be seen, although the difference between new and
previous models is not so high. However, also in this case the update improves the
model when compared with the model without an update.
Table 1. Means of absolute property prediction errors with the goodness criteria.
Step size With new With previous Without With new With previous Without
model model update model model update
goodness 1 goodness 1 goodness 1 goodness 2 goodness 2 goodness 2
12228 - - - - - -
36703 11.59 11.74 11.74 14.40 14.62 14.62
18932 10.48 11.01 10.70 13.20 14.15 14.03
5626 10.93 11.07 11.25 12.94 13.11 13.10
17636 12.48 12.57 12.97 16.48 16.60 18.23
6104 11.47 12.39 13.35 12.56 13.87 14.67
743 19.94 20.01 25.72 57.18 57.48 71.18
3772 12.77 14.75 17.77 21.10 36.00 49.63
13533 12.47 12.62 13.68 16.34 17.03 22.37
43338 11.78 11.97 12.51 13.66 13.98 15.61
14831 12.78 13.26 13.77 21.05 20.98 25.55
21397 12.45 12.70 14.26 22.43 23.84 36.60
622 16.20 18.28 25.39 73.18 99.29 156.71
mean 11.99 12.26 12.79 15.43 16.08 18.10
Table 2. Robustified negative log-likelihoods (robL) of the deviation model with the goodness criteria.
Step size With new With previous Without With new With previous Without
model model update model model update
goodness 1 goodness 1 goodness 1 goodness 2 goodness 2 goodness 2
12228 - - - - - -
36703 6.29 6.30 6.30 6.87 6.92 6.92
18932 6.13 6.27 6.19 6.57 6.76 6.72
5626 6.22 6.23 6.29 6.52 6.52 6.56
17636 6.38 6.39 6.46 7.15 7.13 7.36
6104 6.32 6.42 6.53 6.62 6.90 6.84
743 8.02 8.06 10.66 15.84 16.11 29.56
3772 6.36 6.94 8.58 7.16 11.58 23.04
13533 6.43 6.45 6.84 7.15 7.20 10.13
43338 6.35 6.38 6.47 6.67 6.71 7.33
14831 6.55 6.65 6.94 7.47 7.52 11.13
21397 6.40 6.45 7.13 7.17 7.54 15.39
622 6.58 6.76 11.47 9.65 11.66 70.85
mean 6.36 6.41 6.61 6.91 7.05 8.28
With this data set, determination of the need for a model update and the actual
update process proved their efficiency. The number of iteration steps seems to be quite
large, but the iteration steps get longer when more data are used to train the model (the
smallness of the last iteration step is due to the end of the data). Thus, the amount of
updates decreases as time goes on. However, the developed approach can also adapt to
changes rapidly, when needed, as when new products are introduced or the production
method of an existing product is changed. Finally, the benefits of this more intelligent
updating procedure are obvious in comparison with a dummy periodic update procedure
(when the model is updated at one-year intervals, for example, the prediction error
means of the property model for the whole data sets are 12.24 using criterion 1 and
16.34 using criterion 2, notably worse than the results achieved with our method, 11.99
and 15.43). The periodic procedure could not react to changes quickly or accurately
enough, and in some cases it would react unnecessarily.
6 Conclusions
This paper described the development of a method for detecting the need for model up-
dates in the steel industry. The prediction accuracy of regression models may decrease
in the long run, and a model update at periodic intervals may not react to changes
rapidly and accurately enough. Thus, there was a need for a reliable method for de-
termining suitable times for updating the model. Two limits were used to detect these
update times, and the results appear promising. In addition, it is possible to rework the
actual values of the limits to optimize the updating steps and improve the results before
implementing the method in an actual application environment. Although the procedure
was tested using a single data set, the extent of the data set clearly proves the usability
of the procedure. Nevertheless, the procedure will be validated when it is adapted also
to regression models developed to model the tensile strength and elongation of metal
plates.
In this study the model update was performed by using all the previously gathered
data to define the regression model parameters. However, in the future the amount of
data will increase, making it hard to use all the gathered data in an update. Thus, new
methods for intelligent data selection are needed to form suitable training data.
References
1. Khattree, R., Rao, C., eds.: Statistics in industry - Handbook of Statistics 22. Elsevier (2003)
2. Juutilainen, I., Röning, J.: Planning of strength margins using joint modelling of mean and
dispersion. Materials and Manufacturing Processes 21 (2006) 367–373
3. Koskimäki, H., Juutilainen, I., Laurinen, P., Röning, J.: Detection of the need for a model
update in steel manufacturing. Proceedings of International Conference on Informatics in
Control, Automation and Robotics (2007) 55–59
4. Haykin, S.: Neural Networks, A Comprehensive Foundation. Prentice Hall, Upper Saddle
River, New Jersey (1999)
5. Yang, M., Zhang, H., Fu, J., Yan, F.: A framework for adaptive anomaly detection based on
support vector data description. Lecture Notes in Computer Science, Network and Parallel
Computing (2004) 443–450
6. Gabrys, B., Leiviskä, K., Strackeljan, J.: Do Smart Adaptive Systems Exist, A Best-Practice
for Selection and Combination of Intelligent Methods. Springer-Verlag, Berlin, Heidelberg
(2005)
7. Juutilainen, I., Röning, J., Myllykoski, L.: Modelling the strength of steel plates using regres-
sion analysis and neural networks. Proceedings of International Conference on Computational
Intelligence for Modelling, Control and Automation (2003) 681–691
8. Juutilainen, I., Röning, J.: A method for measuring distance from a training data set. Com-
munications in Statistics- Theory and Methods 36 (2007) 2625–2639
Robust Optimizers for Nonlinear Programming in
Approximate Dynamic Programming
Some of the most traditional fields of stochastic dynamic programming, e.g. energy
stock-management, which have a strong economic impact, have not been studied thor-
oughly in the reinforcement learning or approximate dynamic programming (ADP)
community. This is detrimental to reinforcement learning, as it has been pointed out
that there are not yet many industrial realizations of reinforcement learning. Energy
stock-management leads to continuous problems that are usually handled by traditional
linear approaches in which (i) convex value functions are approximated by linear cuts
(leading to piecewise linear approximations, PWLA) and (ii) decisions are solutions of a
linear problem. However, this approach does not work in high dimension, due to the
curse of dimensionality, which strongly affects PWLA. These problems should be han-
dled by other learning tools. In this case, however, the action selection, minimizing the
expected cost-to-go, can no longer be done using linear programming, as the Bellman
function is no longer a convex PWLA.
The action selection is therefore a nonlinear programming problem. There are not
many works dealing with continuous actions, and they often do not study the non-
linear optimization step involved in action selection. In this paper, we focus on this
part: we compare many non-linear optimization-tools, and we also compare these tools
to discretization techniques to quantify the importance of the action-selection step.
We here roughly introduce stochastic dynamic programming. The interested reader
is referred to [1] for more details.
Consider a dynamical system that stochastically evolves in time depending upon
your decisions. Assume that time is discrete and has finitely many time steps. Assume
that the total cost of your decisions is the sum of instantaneous costs. Precisely:
cost = c1 + c2 + · · · + cT
ci = c(xi , di ), xi = f (xi−1 , di−1 , ωi )
di−1 = strategy(xi−1 , ωi )
where xi is the state at time step i, the ωi are a random process, cost is to be minimized,
and strategy is the decision function that has to be optimized. We are interested in a
control problem: the element to be optimized is a function.
Stochastic dynamic programming, a tool to solve this control problem, is based on
Bellman’s optimality principle that can be informally stated as follows:
”Take the decision at time step t such that the sum ”cost at time step t due to your
decision” plus ”expected cost from time step t + 1 to ∞” is minimal.”
Bellman’s optimality principle states that this strategy is optimal. Unfortunately, it
can only be applied if the expected cost from time step t + 1 to ∞ can be guessed,
depending on the current state of the system and the decision. Bellman’s optimality
principle reduces the control problem to the computation of this function. If xt can be
computed from xt−1 and dt−1 (i.e., if f is known) then the control problem is reduced
to the computation of a function
Note that this function depends on the strategy (we omit for short dependencies on
the random process). We consider this expectation for any optimal strategy (even if
many strategies are optimal, V is uniquely determined as it is the same for any optimal
strategy).
Stochastic dynamic programming is the computation of V backwards in time, thanks
to a backward recursion (Equation 1) that expresses V(t, ·) in terms of V(t + 1, ·): the
decision at time step t minimizes the cost at time step t plus the expected value of V
at time step t + 1.
For each t, V (t, xt ) is computed for many values of xt , and then a learning algorithm
(here by support vector machines) is applied for building x → V (t, x) from these ex-
amples. Thanks to Bellman’s optimality principle, the computation of V is sufficient to
define an optimal strategy. This is a well known, robust solution, applied in many areas
including power supply management. A general introduction, including learning, is [2,
1]. Combined with learning, it can lead to positive results in spite of large dimensions.
Many developments, including RTDP and the field of reinforcement learning, can be
found in [3].
Equation 1 is used many times during a run of dynamic programming. For
T time steps, if N points are required for efficiently approximating each V_t, then there
are T × N optimizations. Furthermore, the derivative of the function to optimize is not
always available, due to the fact that complex simulators are sometimes involved in
the transition f. Convexity sometimes holds, but sometimes not. Binary variables are
sometimes involved, e.g. in power plant management. This suggests that evolutionary
algorithms are a possible tool.
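The overall scheme can be sketched as follows; this is an illustrative backward pass assuming a generic regressor and a generic non-linear optimizer for the action selection, not the authors' implementation (which uses SVM regression and the optimizers compared below):

# Illustrative backward pass of approximate dynamic programming (a sketch):
# `fit` is any regressor factory returning a callable, `argmin` is any
# non-linear optimizer over the decision domain.
def adp_backward(T, sample_states, sample_noise, f, c, fit, argmin,
                 n_points=300, n_noise=16):
    V = [None] * (T + 1)
    V[T] = lambda x: 0.0                      # no cost after the last time step
    for t in range(T - 1, -1, -1):
        xs, vs = [], []
        for x in sample_states(t, n_points):
            def expected_cost(d, x=x, t=t):
                # Monte-Carlo estimate of c(x, d) + E[ V_{t+1}(f(x, d, w)) ]
                ws = [sample_noise(t) for _ in range(n_noise)]
                return c(x, d) + sum(V[t + 1](f(x, d, w)) for w in ws) / n_noise
            best_d = argmin(expected_cost)    # the non-linear action-selection step
            xs.append(x)
            vs.append(expected_cost(best_d))
        V[t] = fit(xs, vs)                    # learn x -> V(t, x) from the examples
    return V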
Robustness is one of the main issues in non-linear optimization and has various mean-
ings.
1. A first meaning is the following: robust optimization is the search of x such that
in the neighborhood of x the fitness is good, and not only at x. In particular, [4] has
introduced the idea that evolutionary algorithms are not function-optimizers, but rather
tools for finding wide areas of good fitness.
2. A second meaning is that robust optimization is the avoidance of local minima.
It is known that iterative deterministic methods are often more subject to local minima
than evolutionary methods; however, various forms of restarts (relaunch the optimiza-
tion from a different initial point) can also be efficient for avoiding local minima.
3. A third possible meaning is the robustness with respect to fitness noise. Various
models of noise and conclusions can be found in [5–9].
4. A fourth possible meaning is the robustness with respect to unsmooth fitness
functions, even in cases in which there are no local minima. Evolutionary algorithms are
usually rank-based (the next iterate point depends only on the ranks of previously visited
points) and therefore do not depend on increasing transformations of the fitness function.
It is known that they have optimality properties w.r.t. this kind of transformation [10]. For
example, ||x|| (or some C^∞ functions close to this one) leads to very bad behavior of
standard Newton-based methods like BFGS [11–14], whereas a rank-based evolutionary
algorithm behaves the same for ||x||^2 and ||x||.
5. The fifth possible meaning is the robustness with respect to the non-deterministic
choices made by the algorithm. Even algorithms that are considered deterministic
often have a random part (or, if not random, a deterministic but arbitrary part, such as
the initial point or the initial step-size): the choice of the initial point. Population-based
methods are more robust in this sense, even if they use more randomness for the initial
step (a fully random initial population compared to only one initial point): a bad
initialization that would lead to a disaster is much more unlikely.
The first sense of robustness given above, i.e. avoiding too narrow areas of good
fitness, fully applies here. Consider for example a robot navigating in an environment
in order to find a target. The robot has to avoid obstacles. The strict optimization of the
cost-to-go leads to choices just tangent to obstacles. As at each step the learning is far
from perfect, being tangent to obstacles leads to hitting the obstacles in 50% of cases.
We see that some local averaging of the fitness is suitable.
The second sense, robustness in front of non-convexity, of course also holds here.
Convex and non-convex problems both exist. The law of increasing marginal costs im-
plies the convexity of many stock management problems, but approximations of V are
usually not convex, even if V is theoretically convex. Almost all problems of robotics
are not convex.
The third sense, fitness (or gradient) noise, also applies. The fitness functions are
based on learning from finitely many examples. Furthermore, the gradient, when it can
be computed, can be pointless even if the learning is somewhat successful; even if fˆ
approximates f in the sense that ||f − fˆ||p is small, ∇fˆ can be very far from ∇f .
The fourth sense is also important. Strongly discontinuous fitnesses can exist: ob-
stacle avoidance is a binary reward, as well as target reaching. Also, a production-unit
can be switched on or not, depending on the difference between demand and stock-
management, and that leads to large binary-costs.
The fifth sense is perhaps the most important. SDP can lead to thousands of opti-
mizations, similar to each other. Being able to solve very precisely 95% of families
of optimization problems is not the goal; here it is better to solve 95% of any fam-
ily of optimization problems, possibly in a suboptimal manner. We do think that this
requirement is a main explanation of the results below.
Many papers have been devoted to ADP, but comparisons are usually far from being
extensive. Many papers present an application of one algorithm to one problem, but do
not compare two techniques. Problems are often adapted to the algorithm, and therefore
comparing results is difficult. Also, the optimization part is often neglected; sometimes
not discussed, and sometimes simplified to a discretization.
In this paper, we compare experimentally many non-linear optimization tools. The
list of methods used in the comparison is given in Section 2. Experiments are presented in
Section 3. Section 4 concludes.
We include in the comparison standard tools from mathematical programming, but also
evolutionary algorithms and some discretization techniques. Evolutionary algorithms
can work in continuous domains [15–17]; moreover, they are compatible with mixed-
integer programming (e.g. [18]). However, as there are not so many algorithms that
could naturally work on mixed-integer problems and in order to have a clear com-
parison with existing methods, we restrict our attention to the continuous framework.
We can then easily compare the method with tools from derivative free optimization
[19], and limited-BFGS with finite differences [20, 21]. We also considered some very
naive algorithms that are possibly interesting thanks to the particular requirement of ro-
bustness within a moderate number of iterates: random search and some quasi-random
improvements. The discretization techniques are techniques that test a predefined set
of actions, and choose the best one. As detailed below, we will use dispersion-based
samplings or discrepancy-based samplings.
Cma-ES from Beagle [22, 23] is similar to Cma-ES from EO[24] and therefore it
has been removed. We now provide details about the methods integrated in the exper-
iments. For the sake of neutrality and objectivity, none of these source codes has been
implemented for this work: they are all existing codes that have been integrated to our
platform, except the baseline algorithms.
– random search: randomly draw N points in the domain of the decisions ; compute
their fitness ; consider the minimum fitness.
– quasi-random search: idem, with low discrepancy sequences instead of random se-
quences [25]. Low discrepancy sequences are a wide area of research [25, 26], with
clear improvements on Monte-Carlo methods, in particular for integration but also
for learning [27], optimization [25, 28], path planning [29]. Many recent works
are concentrated on high dimension [30, 31], with in particular successes when the
”true” dimensionality of the underlying distribution or domain is smaller than the
apparent one [32], or with scrambling-techniques [33].
– Low-dispersion optimization is similar, but uses low-dispersion sequences [25, 34,
35] instead of random i.i.d sequences ; low-dispersion is related to low-discrepancy,
but easier to optimize. A dispersion-criterion is
This sequence has the advantage of being much faster to compute than the non-
greedy one, and that one does not need a priori knowledge of the number of points.
Of course, it is not optimal for Equation 3 or Equation 2.
– Equation 3 pushes points onto the frontier, which is not the case for Equation 2; there-
fore, we also considered low-dispersion sequences "far-from-frontier", where Equa-
tion 3 is replaced by:
tion and d the dimension of the space. The crossover between two individuals x and
y gives birth to two individuals (1/3)x + (2/3)y and (2/3)x + (1/3)y. Let λ1, λ2, λ3, λ4 be
such that λ1 + λ2 + λ3 + λ4 = 1; we define S1 as the set of the λ1·n best indi-
viduals and S2 as the λ2·n best individuals among the others. At each generation, the
new offspring is (i) a copy of S1, (ii) nλ2 cross-overs between individuals from
S1 and individuals from S2, (iii) nλ3 mutated copies of individuals from S1, and (iv)
nλ4 individuals randomly drawn uniformly in the domain (a sketch of this offspring
scheme is given after this list). The parameters are
σ = 0.08, λ1 = 1/10, λ2 = 2/10, λ3 = 3/10, λ4 = 4/10; the population size
is the square root of the number of fitness evaluations allowed. These parameters
are standard ones from the library. We also use a "no memory" (GANM) version,
that provides as solution the best point in the final offspring, instead of the best
visited point. This is done in order to avoid choosing a point from a narrow area of
good fitness.
– limited-BFGS with finite differences, thanks to the LBFGSB library [20, 21]. Rough-
ly speaking, LBFGS uses an approximated Hessian in order to approximate Newton-
steps without the huge computational and space cost associated to the use of a full
Hessian.
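As a concrete reading of the genetic algorithm's offspring scheme described in the list above, the sketch below reproduces the copy/crossover/mutation/random proportions; the mutation operator and the domain sampling are simplified placeholders, not the library's actual operators:

import random

# Illustrative offspring generation with the proportions described above:
# copies of S1, crossovers S1 x S2, mutated copies of S1, and uniform random points.
def next_generation(pop, fitness, domain, sigma=0.08,
                    lambdas=(0.1, 0.2, 0.3, 0.4)):
    n = len(pop)
    ranked = sorted(pop, key=fitness)          # fitness is a cost: lower is better
    s1 = ranked[: max(1, int(lambdas[0] * n))]
    s2 = ranked[len(s1): len(s1) + max(1, int(lambdas[1] * n))]
    def crossover(x, y):                       # children are (1/3)x+(2/3)y or (2/3)x+(1/3)y
        a = random.choice((1 / 3, 2 / 3))
        return [a * xi + (1 - a) * yi for xi, yi in zip(x, y)]
    def mutate(x):                             # simplified Gaussian mutation (placeholder)
        return [min(max(xi + random.gauss(0, sigma * (hi - lo)), lo), hi)
                for xi, (lo, hi) in zip(x, domain)]
    def uniform():
        return [random.uniform(lo, hi) for lo, hi in domain]
    offspring = list(s1)
    offspring += [crossover(random.choice(s1), random.choice(s2))
                  for _ in range(int(lambdas[1] * n))]
    offspring += [mutate(random.choice(s1)) for _ in range(int(lambdas[2] * n))]
    while len(offspring) < n:
        offspring.append(uniform())
    return offspring[:n]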
In our experiments with restart, any optimization that stops due to machine precision is
restarted from a new random (independent, uniform) point.
For algorithms based on an initial population, the initial population is chosen ran-
domly (uniformly, independently) in the domain. For algorithms based on an initial
point, the initial point is the middle of the domain. For algorithms requiring step sizes,
the step size is the distance from the middle to the frontier of the domain (for each di-
rection). Other parameters were chosen by the authors with equal work for each method
on a separate benchmark, and then plugged in our dynamic programming tool. The de-
tailed parametrization is available in http://opendp.sourceforge.net, with
the command-line generating tables of results.
Some other algorithms have been tested and rejected due to their huge compu-
tational cost: the DFO-algorithm from Coin [19],http://www.coin-or.org/;
Cma-ES from Beagle [22, 23] is similar to Cma-ES from EO[24] and has also been
removed.
3 Experiments
The characteristics of the problems are summarized in table 1; problems are scalable
and experiments are performed with dimension (i) the baseline dimension in table 1
(ii) twice this dimensionality (iii) three times (iv) four times. Both the state space di-
mension and the action space are multiplied. Results are presented in tables below. The
detailed experimental setup is as follows: the learning of the function value is performed
by SVM with Laplacian-kernel (SVMTorch,[39]), with hyper-parameters heuristically
chosen; each optimizer is allowed to use a given number of points (specified in tables
of results); 300 points for learning are sampled in a quasi-random manner for each time
step, non-linear optimizers are limited to 100 function-evaluations. Each result is av-
eraged among 66 runs. We can summarize results below as follows. Experiments are
performed with:
3.2 Results
Results varies from one benchmark to another. We have a wide variety of benchmarks,
and no clear superiority of one algorithm onto others arises. E.g., CMA is the best
algorithm in some cases, and the worst one in some others. One can consider that it
would be better to have a clear superiority of one and only one algorithm, and therefore
a clear conclusion. Yet, it is better to have plenty of benchmarks, and as a by-product of
our experiments, we claim that conclusions extracted from one or two benchmarks, as
done in some papers, are unstable, in particular when the benchmark has been adapted
to the question under study. The significance of each comparison (for one particular
benchmark) can be quantified and in most cases we have sufficiently many experiments
to make results significant. But, this significance is for each benchmark independently;
in spite of the fact that we have chosen a large set of benchmarks, coming from robotics
or industry, we can not conclude that the results could be extended to other benchmarks.
However, some (relatively) stable conclusions are:
Table 1. Summary of the characteristics of the benchmarks. The stock management problems
theoretically lead to convex Bellman-functions, but their learnt counterparts are not convex. The
”arm” and ”away” problem deal with robot-hand-control; these two problems can be handled
approximately (but not exactly) by bang-bang solutions. Walls and Multi-Agent problems are
motion-control problems with hard penalties when hitting boundaries; the loss functions are very
unsmooth.
2
We include CMA in order-2 techniques in the sense that it uses a covariance matrix which is
strongly related to the Hessian.
Table 2. Experimental results. For the ”best algorithm” column, bold indicates 5% significance
for the comparison with all other algorithms and italic indicates 5% significance for the com-
parison with all but one of the other algorithms; y holds for 10% significance. Detailed results show
that many comparisons are significant for larger families of algorithms, e.g. if we group GA and
GANM, or if we compare algorithms pairwise. Problems with a star are problems for which bang-
bang solutions are intuitively appealing; LD, which over-samples the frontiers, is a natural candi-
date for such problems. Problems with two stars are problems for which strongly discontinuous
penalties can occur; the first meaning of robustness discussed in section 1.1 is fully relevant for
these problems. Conclusions: 1. GA outperforms random and often QR. 2. For *-problems with
nearly bang-bang solutions, LD is significantly better than random and QR in all but one case,
and it is the best in 7 out of 8 problems. It is also in some cases the worst of all the tested techniques,
and it outperforms random less often than QR or GA. LD therefore appears as a natural efficient
tool for generating nearly bang-bang solutions. 3. In **-problems, GA and GANM are often the
two best tools, with strong statistical significance; their robustness for various meanings cited
in section 1.1 make them robust solutions for solving non-convex and very unsmooth problems
with ADP. 4. Stock management problems (the two first problems) are very efficiently solved by
CMA-ES, which is a good compromise between robustness and high-dimensional-efficiency, as
soon as dimensionality increases.
4 Conclusions
We presented an experimental comparison of non-linear optimization algorithms in
the context of ADP. The comparison involves evolutionary algorithms, (quasi-)random
References
1. Bertsekas, D., Tsitsiklis, J.: Neuro-dynamic Programming. Athena Scientific (1996)
2. Bertsekas, D.: Dynamic Programming and Optimal Control, vols I and II. Athena Scientific
(1995)
3. Sutton, R., Barto, A.: Reinforcement learning: An introduction. MIT Press., Cambridge,
MA (1998)
4. DeJong, K.A.: Are genetic algorithms function optimizers ? In Manner, R., Manderick, B.,
eds.: Proceedings of the 2nd Conference on Parallel Problems Solving from Nature, North
Holland (1992) 3–13
5. Jin, Y., Branke, J.: Evolutionary optimization in uncertain environments, a survey. IEEE
Transactions on Evolutionary Computation 9 (2005) 303–317
6. Sendhoff, B., Beyer, H.G., Olhofer, M.: The influence of stochastic quality functions on
evolutionary search. Recent Advances in Simulated Evolution and Learning (2004) 152–172
7. Tsutsui, S.: A comparative study on the effects of adding perturbations to phenotypic pa-
rameters in genetic algorithms with a robust solution searching scheme. In: Proceedings of
the 1999 IEEE System, Man, and Cybernetics Conference SMC 99. Volume 3., IEEE (1999)
585–591
8. Fitzpatrick, J., Grefenstette, J.: Genetic algorithms in noisy environments. Machine Learn-
ing: Special Issue on Genetic Algorithms 3 (1988) 101–120
9. Beyer, H.G., Olhofer, M., Sendhoff, B.: On the impact of systematic noise on the evo-
lutionary optimization performance - a sphere model analysis. Genetic Programming and
Evolvable Machines 5 (2004) 327–360
10. Gelly, S., Ruette, S., Teytaud, O.: Comparison-based algorithms: worst-case optimality, op-
timality w.r.t a bayesian prior, the intraclass-variance minimization in eda, and implementa-
tions with billiards. In: PPSN-BTP workshop. (2006)
11. Broyden., C.G.: The convergence of a class of double-rank minimization algorithms 2. The
New Algorithm. J. of the Inst. for Math. and Applications 6 (1970) 222–231
12. Fletcher, R.: A new approach to variable-metric algorithms. Computer Journal 13 (1970)
317–322
13. Goldfarb, D.: A family of variable-metric algorithms derived by variational means. Mathe-
matics of Computation 24 (1970) 23–26
14. Shanno, D.F.: Conditioning of quasi-newton methods for function minimization. Mathemat-
ics of Computation 24 (1970) 647–656
15. Bäck, T., Hoffmeister, F., Schwefel, H.P.: A survey of evolution strategies. In Belew, R.K.,
Booker, L.B., eds.: Proceedings of the 4th International Conference on Genetic Algorithms,
Morgan Kaufmann (1991) 2–9
16. Bäck, T., Rudolph, G., Schwefel, H.P.: Evolutionary programming and evolution strategies:
Similarities and differences. In Fogel, D.B., Atmar, W., eds.: Proceedings of the 2nd Annual
Conference on Evolutionary Programming, Evolutionary Programming Society (1993) 11–
22
17. Beyer, H.G.: The Theory of Evolutions Strategies. Springer, Heidelberg (2001)
18. Bäck, T., Schütz, M.: Evolution strategies for mixed-integer optimization of optical multi-
layer systems. In McDonnell, J.R., Reynolds, R.G., Fogel, D.B., eds.: Proceedings of the 4th
Annual Conference on Evolutionary Programming, MIT Press (1995)
19. Conn, A., Scheinberg, K., Toint, L.: Recent progress in unconstrained nonlinear optimization
without derivatives (1997)
20. Zhu, C., Byrd, R., P.Lu, Nocedal, J.: L-BFGS-B: a limited memory FORTRAN code for
solving bound constrained optimization problems. Technical Report, EECS Department,
Northwestern University (1994)
21. Byrd, R., Lu, P., Nocedal, J., C.Zhu: A limited memory algorithm for bound constrained
optimization. SIAM J. Scientific Computing, vol.16, no.5 (1995)
22. Hansen, N., Ostermeier, A.: Adapting arbitrary normal mutation distributions in evolution
strategies: The covariance matrix adaption. In: Proc. of the IEEE Conference on Evolutionary
Computation (CEC 1996), IEEE Press (1996) 312–317
23. Gagné, C.: Openbeagle 3.1.0-alpha. Technical report (2005)
24. Keijzer, M., Merelo, J.J., Romero, G., Schoenauer, M.: Evolving objects: A general purpose
evolutionary computation library. In: Artificial Evolution. (2001) 231–244
25. Niederreiter, H.: Random Number Generation and Quasi-Monte Carlo Methods. SIAM
(1992)
26. Owen, A.B.: Quasi-Monte Carlo sampling. In Jensen, H.W., ed.: Monte Carlo Ray Tracing:
Siggraph 2003 Course 44, SIGGRAPH (2003) 69–88
27. Cervellera, C., Muselli, M.: A deterministic learning approach based on discrepancy. In:
Proceedings of WIRN’03, pp53-60. (2003)
28. Auger, A., Jebalia, M., Teytaud, O.: Xse: quasi-random mutations for evolution strategies.
In: Proceedings of EA’2005. (2005) 12–21
29. Tuffin, B.: On the use of low discrepancy sequences in monte carlo methods. In: Technical
Report 1060, I.R.I.S.A. (1996)
30. Sloan, I., Woźniakowski, H.: When are quasi-Monte Carlo algorithms efficient for high
dimensional integrals? Journal of Complexity 14 (1998) 1–33
31. Wasilkowski, G., Wozniakowski, H.: The exponent of discrepancy is at most 1.4778. Math.
Comp 66 (1997) 1125–1132
32. Hickernell, F.J.: A generalized discrepancy and quadrature error bound. Mathematics of
Computation 67 (1998) 299–322
33. L’Ecuyer, P., Lemieux, C.: Recent advances in randomized quasi-monte carlo methods. In
Dror, M., L’Ecuyer, P., Szidarovszkin, F., eds.: Modeling Uncertainty: An Examination of its
Theory, Methods, and Applications, Kluwer Academic (2002) 419–474
34. Lindemann, S.R., LaValle, S.M.: Incremental low-discrepancy lattice methods for motion
planning. In: Proceedings IEEE International Conference on Robotics and Automation.
(2003) 2920–2927
35. LaValle, S.M., Branicky, M.S., Lindemann, S.R.: On the relationship between classical grid
search and probabilistic roadmaps. I. J. Robotic Res. 23 (2004) 673–692
36. Hooke, R., Jeeves, T.A.: Direct search solution of numerical and statistical problems. Journal
of the ACM, Vol. 8, pp. 212-229 (1961)
37. Kaupe, A.F.: Algorithm 178: direct search. Commun. ACM 6 (1963) 313–314
38. Wright, M.: Direct search methods: Once scorned, now respectable. Numerical Analysis (D.
F. Griffiths and G. A. Watson, eds.), Pitman Research Notes in Mathematics (1995) 191–208
http://citeseer.ist.psu.edu/wright95direct.html.
39. Collobert, R., Bengio, S.: Svmtorch: Support vector machines for large-scale regression
problems. Journal of Machine Learning Research 1 (2001) 143–160
PART II
1 Introduction
A servomechanism under conventional PID control in the presence of friction tends to limit cycle around the reference position. This is a particular problem near zero velocities
where friction is highly nonlinear and the servomechanism is most likely to stick-slip.
Stick-slip can be reduced or eliminated by using impulsive control near or at zero
velocities. The impulsive controller is used to overcome static friction by impacting
the mechanism and moving it by microscopic amounts. By combining the impulsive
controller and conventional controller together, the PID part can be used to provide
large scale movement and stability when moving towards the reference position,
while the impulse controller is used to improve accuracy for the final positioning
where the error signal is small.
By applying a short impulse of sufficient force, plastic deformation occurs
between the asperities of mating surfaces resulting in permanent controlled
movement. If the initial pulse causes insufficient movement, the impulsive controller
produces additional pulses until the position error is reduced to a minimum.
A number of investigators have devised impulsive controllers which achieve
precise motion in the presence of friction by controlling the height or width of a pulse.
Yang and Tomizuka [17] applied a standard rectangular shaped pulse whereby the
height of the pulse is a force about 3 to 4 times greater than the static friction to
guarantee movement. The width of the pulse is adaptively adjusted proportional to the
error and is used to control the amount of energy required to move the mechanism
towards the reference position. Alternatively, Popovic [12] described a fuzzy logic
pulse controller that determines both the optimum pulse amplitude and pulse width
simultaneously using a set of membership functions. Hojjat and Higuchi [6] limited
the pulse width to a fixed duration of 1 ms and varied the amplitude, applying a force
about 10 times the static friction. Rathbun et al. [14] identified that a flexible-body plant
can result in a position error limit cycle and that this limit cycle can be eliminated by
reducing the gain using a piecewise-linear-gain pulse width control law.
In a survey of friction controllers by Armstrong-Hélouvry [2], it is commented
that underlying the functioning of these impulsive controllers is the requirement for
the mechanism to be in the stuck or stationary position before subsequent impulses
are applied. Thus, previous impulse controllers required each small impacting pulse
to be followed by an open loop slide ending in a complete stop.
In this paper, a hybrid PID + impulsive controller is used to improve the precision
of a servomechanism in the presence of static and Coulomb friction. The design
and functioning of the controller does not require the mechanism to come to rest
between subsequent pulses, making it suitable for both point to point positioning and
speed regulation. The experimental results of this paper show that the shape of the
impulse can be optimised to increase the overall precision of the controller. It is
shown that the smallest available movement of the servomechanism can be
significantly reduced without modification to the mechanical plant.
On a broad scale, the properties of friction are both well understood and documented.
Armstrong-Hélouvry [2] have surveyed some of the collective understandings of how
friction can be modelled to include the complexities of mating surfaces at a
microscopic level. Canudas de Wit [3] add to this contribution by presenting a new
model that more accurately captures the dynamic phenomena of rising static friction
[13], frictional lag [13], varying break-away force [7], [15], dwell time [9], pre-sliding
displacement [4], [5], [8] and the Stribeck effect [11].
Fig. 1. Bristle model; Figure a) shows the deflection of a single bristle. Figure b) shows the
resulting static friction model for a single instance in time.
dz/dt = v − (|v| / g(v)) z                                        (1)

g(v) = (1/σ0) ( FC + (Fs − FC) e^(−(v/vs)²) )                     (2)

F = σ0 z + σ1(v) dz/dt + Fv v                                     (3)

σ1(v) = σ1 e^(−(v/vd)²)                                           (4)
where v is the relative velocity between the two surfaces and z is the average
deflection of the bristles. σ0 is the bristle stiffness and σ1 is the bristle damping. The
term vs is used to introduce the velocity at which the Stribeck effect begins while the
parameter vd determines the velocity interval around zero for which the velocity
damping is active. Fig. 1(b) shows the friction force as a function of velocity. Fs is the
average static friction while FC is the average Coulomb friction. For very low
velocities, the viscous friction Fv is negligible but is included for model completeness.
Fs, FC, and Fv are all estimated experimentally by subjecting a real mechanical system
to a series of steady state torque responses. The parameters σ0, σ1, vs and vd are also
determined by measuring the steady state friction force when the velocity is held
constant [3].
For these experiments, only the A and B axes of the Hirata robot are controlled.
Both the A and B axes have a harmonic gearbox between the motor and robot arm.
Their gear ratios are respectively 100:1 and 80:1. All of the servomotors on the Hirata
robot are permanent magnet DC type and the A and B axis motors are driven with
Baldor® TSD series DC servo drives. Each axis has characteristics of high nonlinear
friction whose parameters are obtained by direct measurement. For both axes, the
static friction is approximately 1.4 times the Coulomb friction.
MATLAB’s xPC Target environment was used to provide control to each of the
servomotor drives. For these experiments, each digital drive was used in current
control mode, which in effect means the output voltage from the 12-bit D/A converter
gives a torque command to the actuator’s power electronics. The system controller
was compiled from Simulink® block code and run in real time under xPC Target. A 12-bit
A/D converter was used to read the actuator’s shaft position signal.
Fig. 3 shows the block diagram of a PID linear controller + impulsive controller. This
hybrid controller has been suggested by Li [10] whereby the PID driving torque and
impulsive controller driving torque are summed together. It is unnecessary to stop at
the end of each sampling period and so the controller can be used for both position
and speed control.
The controller can be divided into two parts; the upper part is the continuous
driving force for large scale movement and control of external force disturbances. The
lower part is an additional proportional controller kpwm with a pulse width modulated
sampled-data hold (PWMH), and is the basis of the impulsive controller for the
control of stick-slip.
The system controller is sampled at 2 kHz. The impulse itself is sampled and
applied at one twentieth of the overall sampling rate (i.e. 100 Hz) to match the
mechanical system dynamics. Fig. 4 shows a typical output of the hybrid controller
for one impulse sampling period τ s. The pulse with height fp is added to the PID
output. Because the PID controller is constantly active, the system has the ability to
counteract random disturbances applied to the servomechanism. The continuous part
of the controller is tuned to react to large errors and high velocity, while the impulse
part is optimized for final positioning where stick-slip is most prevalent.
Fig. 4. Friction controller output: a pulse of height fp and width Δ is added to the PID output once per impulse sampling period τs.
For large errors, the impulse width approaches the full sample period τs, and for
very large errors, it transforms into a continuous driving torque. When this occurs, the
combined control action of the PID controller and the impulsive controller will be
continuous. Conversely, for small errors, the PID output is too small to have any
substantial effect on the servomechanism dynamics.
The high impulse sampling rate, combined with a small error, ensures that the
integral (I) part of the PID controller output has insufficient time to rise and produce
limit cycling. To counteract this loss of driving torque, when the error is below a
threshold, the impulsive controller begins to segment into individual pulses of varying
width and becomes the primary driving force. One way of achieving this is to make
the pulse width determined by:
Δ = kpwm · e(k) · τs / fp        if kpwm · |e(k)| ≤ |fp|
Δ = τs                           otherwise                        (6)

In (6),

fp = |fp| · sign( e(k) )                                          (7)
where e(k) is the error input to the controller, |fp| is a fixed pulse height greater than
the highest static friction and τs is the overall sampling period. For the experimental
results of this paper, the impulsive sampling period τ s was 10ms and the pulse width
could be incrementally varied by 1 ms intervals. The pulse width gain kpwm is
experimentally determined by matching the mechanism’s observed displacement d to
the calculated pulse width tp using the equation of motion:
d = fp (fp − fC) / (2 m fC) · tp² ,      fp > 0                   (8)
The gain is iteratively adjusted until the net displacement for each incremental pulse
width is as small as practical.
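As an illustration of how Eqs. (6)–(8) fit together, the sketch below computes the signed pulse height, the pulse width and the predicted displacement for a given pulse width. The struct and its member names are hypothetical and not taken from the paper; all numerical values would have to be identified for the plant at hand.

#include <cmath>

// Illustrative sketch of the pulse width law, Eqs. (6)-(7), and of the
// displacement prediction of Eq. (8). Field values are placeholders.
struct ImpulseShaper {
    double kpwm;   // pulse width gain
    double fpAbs;  // fixed pulse height |fp|, larger than the highest static friction
    double tauS;   // impulse sampling period (s)

    // signed pulse height, Eq. (7)
    double pulseHeight(double e) const {
        return (e >= 0.0 ? fpAbs : -fpAbs);
    }
    // pulse width Delta, Eq. (6)
    double pulseWidth(double e) const {
        if (kpwm * std::fabs(e) <= fpAbs)
            return kpwm * e * tauS / pulseHeight(e);
        return tauS;            // saturates into a continuous driving torque
    }
};

// Predicted net displacement for a pulse of width tp, Eq. (8), assuming fp > fC > 0.
double predictedDisplacement(double fp, double fC, double m, double tp) {
    return fp * (fp - fC) / (2.0 * m * fC) * tp * tp;
}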
The point to point steady state precision of the system is governed by the smallest
incremental movement which will be produced from the smallest usable width pulse.
Because the shape of the pulse is affected by the system’s electrical circuit response, a
practical limit is placed on the amplitude of the pulse over very short durations and
this restricts the amount of energy that can be contained within a very thin pulse.
Consequently, there exists a minimum pulse width that is necessary to overcome the
static friction and guarantee plastic movement.
For the Hirata robot, the minimum pulse width guaranteeing plastic displacement
was determined to be 2ms and therefore the pulse width is adjusted between 2 and
10ms. Any pulse smaller than 2ms results in elastic movement of the mating surfaces
in the form of pre-sliding displacement. In this regime, short impulses can produce
unpredictable displacement or even no displacement at all. In some cases, the
mechanism will spring back greater than the forward displacement resulting in a
larger error. Fig. 5 shows the displacement of the experimental system for five
consecutive positive impulses followed by five negative impulses. The experiment
compares impulses of width 2 ms and 1.5 ms. For impulses of 2 ms, the displacement is
represented by the consistent staircase movement. For the lesser width of 1.5 ms, the
displacement is inconsistent.
Fig. 5. Experimentally measured displacement for both positive and negative impulses using
successive pulse widths 1.5ms and 2ms.
3.1 Motivation
Fig. 6 shows the simulated displacements for varying pulse widths, labelled d1, d2,
d3, …, dn respectively, where d1 is the displacement produced by the minimum pulse width
that generates non-elastic movement and thus defines the system’s resolution.
Using the variable pulse width PID + impulse controller for a position pointing
task, the torque will incrementally move the mechanism towards the reference set
point in an attempt to reach steady state. Around the set point, the system will
inevitably begin to limit cycle when the error e(k) is approximately the same
magnitude as the system resolution (the displacement for the minimum pulse width
d1).
For the limit cycle to be extinguished, the controller must be disabled. As an
example, the limit cycle in Fig. 7 is extinguished by disabling the impulse controller
at t=0.18s, and in this case, the resulting error is approximately half the displacement
of the minimum pulse width d1.
Model parameter   Fs   FC   σ0        σ1       Fv    vs      vd
Value             2    1    4.5×10^5  12,000   0.4   0.001   0.0004
Fig. 7. Simulation of the impulse controller limit cycling around a position reference set-point
where the final torque output is a pulse with a minimum width and the mean peak to peak
oscillation is d1. The friction parameters used for the simulation are also given in the
accompanying table.
Limit cycling will occur for all general servomechanisms using a torque pulse
because every practical system inherently has a minimum pulse width that defines the
system’s resolution. Fig. 7 simulates a typical limit cycle with a peak to peak
oscillation equal to the displacement of the minimum pulse width d1.
One way to automatically extinguish the limit cycle is to include a dead-zone that
disables the controller output when the error is between an upper and lower bound of
the reference point (see Fig. 7). The final error is then dependent on the amount of
offset the limit cycle has in relation to the reference point.

Fig. 8. Conceptual example of reducing the steady state error using ‘Limit Cycle Offset’, with the limit cycle shifted up by d2 − d1 and the new error guaranteed to fall within the dead-zone.

Fig. 7 shows a unique case
where the ± amplitude of the limit cycle is almost evenly distributed either side of the
reference set point; i.e. the centre line of the oscillation lies along the reference set
point. In this instance, disabling the controller would create an error e(k) equal to
approximately d1/2. This, however, would vary in practice and the centreline is likely
to be offset by some arbitrary amount. The maximum precision of the system will
therefore be between d1 and zero.
By controlling the offset of the limit cycle centreline, it is possible to guarantee that
the final error lies within the dead-zone, and therefore to increase the precision of the
system. As a conceptual example, Fig. 8 shows a system limit cycling either side of
the reference point by the minimum displacement d1. By applying the next smallest
pulse d2, then followed by the smallest pulse d1, the limit cycle can be shifted by d2 –
d1. The effect is that the peak to peak centreline of the oscillation has now been
shifted away from the reference point.
However, at least one of the peaks of the oscillation has been shifted closer to the
set point. If the controller is disabled when the mechanism is closest to the reference
set point, a new reduced error is created. For this to be realised, the incremental
difference in displacement between successively increasing pulses must be less than
the displacement from the minimum pulse width; i.e. d2 – d1 < d1.
For the limit cycle to be offset at the correct time, the impulse controller must have a
set of additional control conditions which identify that a limit cycle has been initiated
with the minimum width pulse. The controller then readjusts itself accordingly using a
‘switching bound’ and finally disables itself when within a new specified error
‘dead-zone’. One way to achieve this is to adjust the pulse width so that it is increased
by one Pulse Width Increment (PWI) when satisfying the following conditions:
if switching bound > |e(k)| ≥ dead-zone
    then      Δ = kpwm · e(k) · τs / fp + PWI
    otherwise Δ = kpwm · e(k) · τs / fp                           (9)
Fig. 9. Simulation of the limit cycle offset function used with the PID + impulse controller.
To demonstrate the limit cycle offset function, the modified controller is simulated
using a simple unit mass with the LuGre friction model [11] of Eqs. 1 to 4.
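A minimal sketch of the switching rule of Eq. (9) is given below. The function name and the way the tuning values (switching bound, dead-zone, pulse width increment) are passed in are hypothetical; the nominal width is the one of Eq. (6).

#include <cmath>

// Sketch of the 'Limit Cycle Offset' switching rule, Eq. (9).
// switchingBound, deadZone and PWI are tuning values (placeholders here).
double offsetPulseWidth(double e, double kpwm, double fp, double tauS,
                        double switchingBound, double deadZone, double PWI) {
    double base = kpwm * e * tauS / fp;              // nominal width, cf. Eq. (6)
    if (std::fabs(e) < switchingBound && std::fabs(e) >= deadZone)
        return base + PWI;                           // widen by one pulse width increment
    return base;
}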
4 Experimental
For continuous coordinated motion, a 100mm diameter circle was drawn using the A
and B axes of the robot to compare a conventional PID controller to the PID +
impulse control. Fig. 10 shows the relative motion of each axis from the control
reference inputs. Velocity reversals occur at t = 40 s and t = 120 s for the A axis, and
at t = 0 and t = 100 s for the B axis. The robot’s tool tip angular velocity is ω = 31.4 mrad/s.
Fig. 10. Reference control signals for the A and B axes (ω=31.4 mrad/s).
The experimental results are shown in Fig. 11 (a) and Fig. 11 (b) respectively
whereby the classic staircase stick-slip motion is extinguished when using the PID +
impulse controller. The deviation of the desired 100mm diameter circle is shown in
Fig. 12. This is a polar plot where each of the reference position errors for each
controller is compared. The maximum deviation from the circle using the PID only
controller is ±3.5mm. The maximum deviation using the PID + impulse controller is
significantly less with an error of ±0.1mm.
Fig. 11. Circle using PID only (a) and PID + impulse control (b).
This section evaluates the limit cycle offset function using the experimental Hirata
robot having position dependent variables. Fig. 13 shows a steady state limit cycle for
a position pointing step response of 0.001 radians using a PID + impulse hybrid
controller. The mean peak to peak displacement of the smallest non-elastic part of the
limit cycle is μd.
The experiment was repeated using the limit cycle offset function with the same
position step reference of 0.001 radians. Fig. 14 shows a sample experiment and in
this example, the limit cycle offset function is activated at t=0.9s. At this time, the
amplitude of the non-elastic part of the limit cycle is identified as lying between the
switching bounds. The switching bounds and dead-zone are set according to the
methodology given earlier. Once the offset function is activated, the controller adjusts
itself by forcing the following pulse to be one increment wider before returning to
the smallest pulse width. This results in the limit cycle being shifted down into the dead-zone.
Fig. 13. Steady state limit cycle for the PID + impulse hybrid controller when applying a unit
step input to the Hirata robot. The mean peak to peak displacement μd is the non-elastic part of
the limit cycle.
Fig. 14. Using the ‘Limit Cycle Offset’ function to reduce the final steady state error of the
Hirata robot.
This set of results demonstrates that the Limit Cycle Offset function can be successfully
applied to a commercial robot manipulator having characteristics of high nonlinear
friction. The results show that the unmodified controller will cause the robot to limit
cycle near steady state position and that the peak to peak displacement is equal to the
displacement of the smallest usable width pulse.
By using the Limit Cycle Offset function, the limit cycle can be detected and the
pulse width adjusted so that at least one of the peaks of the limit cycle is moved
towards the reference set point. Finally, the results show that the controller recognises
the limit cycle as being shifted into a defined error dead-zone whereby the controller
is disabled. The steady state error is therefore guaranteed to fall within a defined
region so that the steady state error is reduced. For the SCARA robot, the
improvement in accuracy demonstrated was 1.1e-4 radians in comparison to 4.5e-4
radians achieved without the limit cycle offset.
5 Conclusions
Advances in digital control have allowed the power electronics of servo amplifiers to
be manipulated in a way that improves a servomechanism’s precision without
modification to the mechanical plant. This is particularly useful for systems with
highly nonlinear friction, where conventional control schemes alone underperform. A
previously developed hybrid PID + impulse controller which does not require the
mechanism to come to a complete stop between pulses has been modified to further
improve accuracy. This modification shifts the limit cycling into a different position
to provide substantial additional improvement in the mechanism’s position accuracy.
This improvement has been demonstrated both in simulations and in experimental
results on a SCARA robot arm. The mechanism does not have to come to a complete
stop between pulses, and no mechanical modification has to be made to the robot.
References
Abstract. This study focuses on the estimation of car dynamic variables for
the improvement of vehicle safety, handling characteristics and comfort. More
specifically, a new estimation process is proposed to estimate longitudinal/lateral
tire-road forces, velocity, sideslip angle and wheel cornering stiffness. This
method uses measurements from currently available standard sensors (yaw rate,
longitudinal/lateral accelerations, steering angle and angular wheel velocities).
The estimation process is separated into two blocks: the first block contains an
observer whose principal role is to calculate tire-road forces without a descriptive
force model, while in the second block an observer estimates sideslip angle and
cornering stiffness with an adaptive tire-force model. The different observers are
based on an Extended Kalman Filter (EKF). The estimation process is applied
and compared to real experimental data, notably sideslip angle and wheel force
measurements. Experimental results show the accuracy and potential of the esti-
mation process.
1 Introduction
The last few years have seen the emergence in cars of active security systems to re-
duce dangerous situations for drivers. Among these active security systems, Anti-lock
Braking Systems (ABS) and Electronic Stability Programs (ESP) significantly reduce
the number of road accidents. However, these systems may be improved if the dynamic
potential of a car is well known. For example, information on tire-road friction means a
better definition of potential trajectories, and therefore a better management of vehicle
controls. Nowadays, certain fundamental data relating to vehicle-dynamics are not mea-
surable in a standard car for both technical and economic reasons. As a consequence,
dynamic variables such as tire forces and sideslip angle must be observed or estimated.
Vehicle-dynamic estimation has been widely discussed in the literature, e.g. ([6],
[16], [8], [15], [2]). The vehicle-road system is usually modeled by combining a vehi-
cle model with a tire-force model in one block. One particularity of this study is that it
separates the estimation modeling into two blocks (shown in figure 1), where the first
block concerns the car body dynamics while the second is devoted to the tire-road
interface dynamics. The first block contains an Extended Kalman Filter (denoted as O1,4w )
constructed with a four-wheel vehicle model and a random walk force model. The first
observer O1,4w estimates longitudinal/lateral tire forces, velocity and yaw rate, which
are inputs to the observer in the second block (denoted as O2,LAM ). This second ob-
server is developed from a sideslip angle model and a linear adaptive force model.
Some studies have described observers which take road friction variations into ac-
count ([7], [12], [13]). In the works of [7] road friction is considered as a disturbance.
Alternatively, as in [12], the tire-force parameters are identified with an observer, while
in [13] tire forces are modeled with an integrated random walk model. In this study a
linear adaptive tire force model is proposed (in block 2) with an eye to studying road
friction variations.
The rest of the paper is organized as follows. The second section describes the
vehicle model and the observer O1,4w (block 1). The third section presents the sideslip
angle and cornering stiffness observer (O2,LAM in block 2). In the fourth section an
observability analysis is performed. The fifth section provides experimental results: the
two observers are evaluated with respect to sideslip angle and tire-force measurements.
Finally, concluding remarks are given in section 6.
This section describes the first observer O1,4w constructed from a four-wheel vehicle
model (figure 2),
Fig. 2. Four-wheel vehicle model (tire forces Fx,ij and Fy,ij, steering angles δ1 and δ2, sideslip angle β, velocity Vg, yaw angle ψ, track E, distances L1 and L2).
where ψ̇ is the yaw rate, β the center of gravity sideslip angle, Vg the center of gravity
velocity, and L1 and L2 the distance from the vehicle center of gravity to the front and
rear axles respectively. Fx,y,i,j are the longitudinal and lateral tire-road forces, δ1,2 are
the front left and right steering angles respectively, and E is the vehicle track (lateral
distance from wheel to wheel).
In order to develop an observable system (notably in the case of null steering angles),
rear longitudinal forces are neglected relative to the front longitudinal forces. The sim-
plified equation for yaw acceleration (four-wheel vehicle model) can be formulated as
the following dynamic relationship (O1,4w model):
ψ̈ = (1/Iz) { L1 [Fy11 cos(δ1) + Fy12 cos(δ2) + Fx11 sin(δ1) + Fx12 sin(δ2)]
             − L2 [Fy21 + Fy22]
             + (E/2) [Fy11 sin(δ1) − Fy12 sin(δ2) + Fx12 cos(δ2) − Fx11 cos(δ1)] },   (1)
where m is the vehicle mass and Iz the yaw moment of inertia. The different force evolutions
are modeled with a random walk model:

[Ḟxij , Ḟyij ] = [0, 0],   i = 1, 2;  j = 1, 2.                    (2)
where Fx1 is the sum of front longitudinal forces (Fx1 = Fx11 + Fx12 ). Tire forces and
force sums are associated according to the dispersion of vertical forces:
Fx11 = Fz11 Fx1 / (Fz11 + Fz12),    Fx12 = Fz12 Fx1 / (Fz11 + Fz12),    (5)
Fy11 = Fz11 Fy1 / (Fz11 + Fz12),    Fy12 = Fz12 Fy1 / (Fz11 + Fz12),    (6)
Fy21 = Fz21 Fy2 / (Fz21 + Fz22),    Fy22 = Fz22 Fy2 / (Fz21 + Fz22),    (7)
where Fzij are the vertical forces. These are calculated, neglecting roll and suspension
movements, with the following load transfer model:
Fz11 = (L2 m g − hcog m γx) / (2(L1 + L2)) − L2 hcog m γy / ((L1 + L2) E),   (8)
Fz12 = (L2 m g − hcog m γx) / (2(L1 + L2)) + L2 hcog m γy / ((L1 + L2) E),   (9)
Fz21 = (L1 m g + hcog m γx) / (2(L1 + L2)) − L2 hcog m γy / ((L1 + L2) E),   (10)
Fz22 = (L1 m g + hcog m γx) / (2(L1 + L2)) + L2 hcog m γy / ((L1 + L2) E),   (11)
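The load transfer of Eqs. (8)–(11) and the force dispersion of Eqs. (5)–(7) translate directly into code. In the sketch below the function and struct names are hypothetical, and the lateral transfer term is written exactly as it appears in Eqs. (8)–(11) above.

// Sketch of the load transfer model, Eqs. (8)-(11), and of the dispersion of
// tire forces according to the vertical forces, Eqs. (5)-(7). Names are hypothetical.
struct VerticalLoads { double Fz11, Fz12, Fz21, Fz22; };

VerticalLoads loadTransfer(double m, double g, double hcog,
                           double L1, double L2, double E,
                           double ax, double ay) {          // ax = gamma_x, ay = gamma_y
    double wheelbase   = L1 + L2;
    double frontStatic = (L2 * m * g - hcog * m * ax) / (2.0 * wheelbase);
    double rearStatic  = (L1 * m * g + hcog * m * ax) / (2.0 * wheelbase);
    double lateral     = L2 * hcog * m * ay / (wheelbase * E);   // as in Eqs. (8)-(11)
    return { frontStatic - lateral, frontStatic + lateral,
             rearStatic  - lateral, rearStatic  + lateral };
}

// Dispersion of the front axle forces Fx1, Fy1 and the rear axle force Fy2
// over the individual wheels, Eqs. (5)-(7).
void disperseForces(const VerticalLoads& Fz, double Fx1, double Fy1, double Fy2,
                    double& Fx11, double& Fx12,
                    double& Fy11, double& Fy12, double& Fy21, double& Fy22) {
    double front = Fz.Fz11 + Fz.Fz12, rear = Fz.Fz21 + Fz.Fz22;
    Fx11 = Fz.Fz11 * Fx1 / front;  Fx12 = Fz.Fz12 * Fx1 / front;
    Fy11 = Fz.Fz11 * Fy1 / front;  Fy12 = Fz.Fz12 * Fy1 / front;
    Fy21 = Fz.Fz21 * Fy2 / rear;   Fy22 = Fz.Fz22 * Fy2 / rear;
}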
hcog being the center of gravity height and g the gravitational constant. Following the
superposition principle, the load transfer model assumes independent longitudinal and
lateral acceleration contributions [8]. The input vectors U of the
Fig. 3. Lateral tire force models: linear, linear adaptive, Burckhardt for various road surfaces.
where Ci is the wheel cornering stiffness, a parameter closely related to tire-road fric-
tion.
When road friction changes or when the nonlinear tire domain is reached, the “real” wheel
cornering stiffness varies. In order to take the wheel cornering stiffness variations into
account, we propose an adaptive tire-force model (known as the linear adaptive tire-force
model, illustrated in figure 3). This model is based on the linear model, to which a
readjustment variable ΔCai is added to correct wheel cornering stiffness errors:
The variable ΔCai is included in the state vector of the O2,LAM observer and its evolution
equation is formulated according to a random walk model (ΔĊai = 0). The state
X ∈ R3, input U ∈ R4 and measurement Y ∈ R3 are chosen as:
where
β1 = u1 − x1 − L1 u2 /u3 ,
(19)
β2 = −x1 + L2 u2 /u3 .
Given the state estimation denoted as X = [x1, x2, x3], the state evolution model of
O2,LAM is:
ẋ1 = (1/(m u3)) [u4 sin(u1 − x1) + Fyw1,aux cos(u1 − x1) + Fyw2,aux cos(x1)] − u2,
ẋ2 = 0,                                                           (20)
ẋ3 = 0,
where the auxiliary variables Fyw1,aux and Fyw2,aux are calculated as:
Fyw1,aux = (C1 + x2)(u1 − x1 − L1 u2/u3),
Fyw2,aux = (C2 + x3)(−x1 + L2 u2/u3).                             (21)
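For illustration, Eqs. (19)–(21) can be evaluated as follows. Reading the inputs as u1 = steering angle, u2 = yaw rate, u3 = Vg and u4 = Fx1 is an interpretation inferred from Eq. (19) and the four-wheel model, not a statement made explicitly in this excerpt; the struct and function names are hypothetical.

#include <cmath>

// Sketch of the O2,LAM state evolution, Eqs. (19)-(21). The returned struct
// holds the time derivatives of the state (x1, x2, x3). Assumes u[2] != 0.
struct O2lamState { double beta; double dCa1; double dCa2; };   // x1, x2, x3

O2lamState evolve(const O2lamState& x, const double u[4],
                  double C1, double C2, double L1, double L2, double m) {
    double beta1 = u[0] - x.beta - L1 * u[1] / u[2];            // Eq. (19)
    double beta2 = -x.beta + L2 * u[1] / u[2];
    double Fyw1  = (C1 + x.dCa1) * beta1;                       // Eq. (21)
    double Fyw2  = (C2 + x.dCa2) * beta2;
    double betaDot = (u[3] * std::sin(u[0] - x.beta)
                    + Fyw1 * std::cos(u[0] - x.beta)
                    + Fyw2 * std::cos(x.beta)) / (m * u[2]) - u[1];   // Eq. (20)
    return { betaDot, 0.0, 0.0 };   // dCa1, dCa2 follow a random walk (zero drift)
}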
4 Estimation Method
The different observers (O1,4w , O2,LAM ) were developed according to an extended
Kalman filter method. In 1960 R. E. Kalman published a paper describing a recur-
sive solution to the discrete-data linear filtering problem [5]. Since this publication,
Kalman’s method, usually known as the “Extended Kalman Filter”, has been the object
of extensive research and numerous applications. For example, in [9], Mohinder and Angus
present a broad overview of Kalman filtering.
This paragraph describes the EKF algorithm. bs,k , be,k and bm,k represent the measurement,
input and model noise, respectively, at time tk . This noise is assumed to be Gaussian,
white and centered. Qs , Qe and Qm are the noise variance-covariance matrices for
bs,k , be,k and bm,k , respectively. The discrete form of the models is:
F(Xk , Uk*) = Xk + ∫ from tk to tk+1 of f(Xk , Uk*) dt,
Xk+1 = F(Xk , Uk*) + bm,k ,                                       (22)
Yk = h(Xk , Uk*) + bs,k ,
Uk* = Uk + be,k .
Xk− and Xk+ are the state prediction and estimation vectors, respectively, at time tk . f and
h are the evolution and measurement functions. The first step of the EKF is to linearize
the evolution equation around the estimated state and input:
Ak = ∂F/∂X (Xk+, Uk*),
Bk = ∂F/∂U (Xk+, Uk*).                                            (23)
The second step is the prediction of the next state, from the previous state and measured
input:
Xk+1− = F(Xk+, Uk*)                                               (24)
The covariance matrix of state estimation uncertainty is then:
Pk+1− = Ak Pk+ Akᵀ + Bk Qe Bkᵀ + Qm                               (25)
The third step is to calculate the Kalman gain matrix from the linearization of the mea-
surement matrix:
Ck = ∂h/∂X (Xk+1−, Uk*),
Bk = ∂F/∂U (Xk+1−, Uk*),                                          (26)
Dk = ∂h/∂U (Xk+1−, Uk*).
The following intermediate variables are used:
Rk = Ck Pk+1− Ckᵀ + Dk Qe Dkᵀ,
Sk = Bk Qe Dkᵀ,                                                   (27)
Tk = Pk+1− Ckᵀ + Sk ,

and the Kalman gain matrix is:

Kk = Tk (Rk + Qs + Ck Sk + Skᵀ Ckᵀ)⁻¹                             (28)
The estimation step is to correct the state vector in line with measurement errors:
Xk+1+ = Xk+1− + Kk (Yk+1 − h(Xk+1−, Uk+1*))                       (29)
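The prediction and correction steps (22)–(29) map onto a few lines of linear algebra. The sketch below uses the Eigen library and generic callbacks for F, h and their Jacobians, neither of which is prescribed by the paper; the covariance correction in the last line is the standard update associated with the gain (28) and does not appear in the excerpt above.

#include <Eigen/Dense>
#include <functional>

using Eigen::MatrixXd; using Eigen::VectorXd;

// One EKF prediction/correction step following Eqs. (22)-(29). The caller supplies
// the discrete evolution F, the measurement h and their Jacobians.
struct EkfStep {
    MatrixXd Qs, Qe, Qm;   // measurement, input and model noise covariances

    void step(VectorXd& X, MatrixXd& P, const VectorXd& U, const VectorXd& Y,
              const std::function<VectorXd(const VectorXd&, const VectorXd&)>& F,
              const std::function<VectorXd(const VectorXd&, const VectorXd&)>& h,
              const std::function<MatrixXd(const VectorXd&, const VectorXd&)>& dFdX,
              const std::function<MatrixXd(const VectorXd&, const VectorXd&)>& dFdU,
              const std::function<MatrixXd(const VectorXd&, const VectorXd&)>& dhdX,
              const std::function<MatrixXd(const VectorXd&, const VectorXd&)>& dhdU) const {
        MatrixXd A = dFdX(X, U), B = dFdU(X, U);                   // Eq. (23)
        VectorXd Xpred = F(X, U);                                  // Eq. (24)
        MatrixXd Ppred = A * P * A.transpose()
                       + B * Qe * B.transpose() + Qm;              // Eq. (25)
        MatrixXd C = dhdX(Xpred, U), D = dhdU(Xpred, U);           // Eq. (26)
        MatrixXd B2 = dFdU(Xpred, U);
        MatrixXd R = C * Ppred * C.transpose() + D * Qe * D.transpose();
        MatrixXd S = B2 * Qe * D.transpose();                      // Eq. (27)
        MatrixXd T = Ppred * C.transpose() + S;
        MatrixXd K = T * (R + Qs + C * S + S.transpose() * C.transpose()).inverse(); // Eq. (28)
        X = Xpred + K * (Y - h(Xpred, U));                         // Eq. (29)
        P = Ppred - K * T.transpose();   // standard covariance update for this gain (not in the text)
    }
};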
5 Observability
From the two vehicle-road systems (O1,4w , O2,LAM ), two observability functions were
calculated. The two systems are nonlinear, so the observability definition is local and
uses the Lie derivative [10].
with

L1f hi(X) = (∂hi(X)/∂X) f(X, U)                                   (32)
where p is the dimension of the Y vector. Fig. 4 illustrates observability analysis of the
two systems for an experimental test, presented in section 6. Ranks of the two observ-
ability functions were 4 (for O1,4w ) and 3 (for O2,LAM ) (state dimensions) throughout
the test; consequently, the states of the two systems were locally observable.
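In practice, such a rank test can also be carried out numerically once the output functions and their Lie derivatives are available. The sketch below approximates the gradients by central finite differences and computes the rank with Eigen; it is only an illustration, whereas the paper relies on the Lie-derivative formulation of [10].

#include <Eigen/Dense>
#include <functional>
#include <vector>

// Sketch: numerical test of the local observability rank. Each entry of 'rows' is
// a scalar function of the state (an output h_i or a Lie derivative L_f h_i); its
// gradient is approximated by central finite differences, and the rank of the
// stacked gradient matrix is computed.
int observabilityRank(const std::vector<std::function<double(const Eigen::VectorXd&)>>& rows,
                      const Eigen::VectorXd& X, double eps = 1e-6) {
    const int m = static_cast<int>(rows.size());
    const int n = static_cast<int>(X.size());
    Eigen::MatrixXd O(m, n);
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < n; ++j) {
            Eigen::VectorXd xp = X, xm = X;
            xp(j) += eps; xm(j) -= eps;
            O(i, j) = (rows[i](xp) - rows[i](xm)) / (2.0 * eps);
        }
    return static_cast<int>(Eigen::FullPivLU<Eigen::MatrixXd>(O).rank());
}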
6 Experimental Results
The experimental vehicle (see figure 5) is a Peugeot 307 equipped with a number of sen-
sors including GPS, accelerometer, odometer, gyrometer, steering angle, correvit and
dynamometric hubs. Among these sensors, the correvit (a non-contact optical sensor)
gives measurements of rear sideslip angle and vehicle velocity, while the dynamometric
hubs are wheel-force transducers.
This study uses an experimental test representative of both longitudinal and lateral
dynamic behaviors. The vehicle trajectory and the acceleration diagram are shown in
figure 6. During the test, the vehicle first accelerated up to γx ≈ 0.3g, then negotiated
Fig. 4. Ranks of the two observability functions for systems O1,4w and O2,LAM , during an ex-
perimental test (slalom).
a slalom at an approximate velocity of 12m/s (−0.6g < γy < 0.6g), before finally
decelerating to γx ≈ −0.7g. The results are presented in two forms: figures of esti-
mations/measurements and tables of normalized errors. The normalized error εz for an
estimation z is defined in [15] as
During the test, the sideslip angle input of O1,4w is estimated from the O2,LAM ob-
server. Figure 7 and table 1 present O1,4w observer results.
The state estimations were initialized using the maximum values for the measure-
ments during the test (for instance, the estimation of the front lateral force Fy1 was set
to 5155 N ). In spite of these false initializations the estimations converge quickly to the
measured values, showing the good convergence properties of the observer. Moreover,
the O1,4w observer produces satisfactory estimations close to the measurements
(normalized mean and standard deviation errors are less than 7 %). These good experimental
results confirm that the observer approach may be appropriate for the estimation of
tire-forces.
During the test, (Fx1 ,Fy1 ,Fy2 ,Vg ) inputs of O2,LAM were originally those from the
O1,4w observer. In order to demonstrate the improvement provided by the observer
using the linear adaptive force model (O2,LAM , equation 16), another observer con-
structed with a linear fixed force model is used in comparison (denoted Orl , equation
15, described in [1]). The robustness of the two observers is tested with respect to tire-
road friction variations by performing the tests with different cornering stiffness param-
eters ([C1 , C2 ] ∗ 0.5, 1, 1.5). The observers were evaluated for the same test presented
in section 6.
Figure 8 shows the estimation results of observer Orl for rear sideslip angle. Ob-
server Orl gives good results when cornering stiffnesses are approximately known
([C1 , C2 ] ∗ 1). However, this observer is not robust when cornering stiffnesses change
([C1 , C2 ] ∗ 0.5, 2).
Fig. 8. Observer Orl using a fixed linear force model, rear sideslip angle estimations with different
cornering stiffness settings.
Figure 9 and table 2 show estimation results for the adaptive observer O2,LAM . The
performance robustness of O2,LAM is very good, since sideslip angle is well estimated
irrespective of cornering stiffness settings. This result is confirmed by the normalized
mean errors (Table 2) which are approximately constant (about 7 %). The front and rear
cornering stiffness estimations (Ci + ΔCi ) converge quickly to the same values after
the beginning of the slalom at 12 s.
Table 1. Maximum absolute values, O1,4w normalized mean errors and normalized standard
deviation (Std).
Fig. 9. O2,LAM adaptive observer: sideslip angle estimation results, and front and rear cornering
stiffness estimations Ci + ΔCi , with different cornering stiffness settings.
This study deals with two vehicle-dynamic observers constructed for use in a two-block
estimation process. Block 1 mainly estimates tire-forces (without an explicit tire-force
model), while block 2 calculates sideslip angle and corrects cornering stiffnesses (with
an adaptive tire-force model).
The first observer O1,4w (block 1), an extended Kalman Filter, is constructed with a ran-
dom walk force model. The experimental evaluations of O1,4w are satisfactory, showing
Table 2. Observer O2,LAM , rear sideslip angle estimation results: maximum absolute value and
normalized mean errors.
References
1. Baffet, G., Stephant, J., Charara, A.: Vehicle Sideslip Angle and Lateral Tire-Force Estima-
tions in Standard and Critical Driving Situations: Simulations and Experiments. Proceedings
of the 8th International Symposium on Advanced Vehicle Control AVEC2006, Taipei Tai-
wan, (2006)
2. Baffet, G., Stephant, J., Charara, A.: Sideslip angle lateral tire force and road friction es-
timation in simulations and experiments. Proceedings of the IEEE conference on control
application CCA, Munich, Germany, (2006)
3. Bolzern, P., Cheli, F., Falciola, G., Resta, F.: Estimation of the nonlinear suspension tyre
cornering forces from experimental road test data. Vehicle system dynamics. Vol. 31 (1999)
23–34
4. Canudas-De-Wit, C., Tsiotras, P., Velenis, E., Basset, M., Gissinger, G.: Dynamic friction
models for road/tire longitudinal interaction. Vehicle System Dynamics, Vol. 39 (2003) 189–
226
5. Kalman, R.E.: A New Approach to Linear Filtering and Prediction Problems. Transactions
of the ASME - Journal of Basic Engineering, Vol. 82 (1960) 35–45
6. Kiencke, U., Nielsen, L.: Automotive Control Systems. Springer (2000)
7. Lakehal-ayat, M., Tseng, H.E., Mao, Y., Karidas, J.: Disturbance Observer for Lateral Ve-
locity Estimation. Proceedings of the 8th International Symposium on Advanced Vehicle
Control AVEC2006, Taipei Taiwan (2006)
8. Lechner, D.: Analyse du comportement dynamique des vehicules routiers legers: developpe-
ment d’une methodologie appliquee a la securite primaire. Ph.D. dissertation Ecole Centrale
de Lyon, France (2002)
9. Mohinder, S.G., Angus, P.A.: Kalman filtering theory and practice. Prentice hall, (1993)
10. Nijmeijer, H., Van der Schaft, A.J.: Nonlinear Dynamical Control Systems. Springer-Verlag,
(1990)
11. Pacejka, H.B., Bakker, E.: The magic formula tyre model. Int. colloq. on tyre models for
vehicle dynamics analysis, (1991) 1–18
12. Rabhi, A., M’Sirdi, N.K., Zbiri, N., Delanne, Y: Vehicle-road interaction modelling for esti-
mation of contact forces. Vehicle System Dynamics, Vol. 43 (2005) 403–411
13. Ray, L.: Nonlinear Tire Force Estimation and Road Friction Identification : Simulation and
Experiments. Automatica, Vol. 33 (1997) 1819–1833
14. Segel, M.L.: Theoretical prediction and experimental substantiation of the response of the
automobile to steering control. Automobile Division of the Institution of Mechanical Engineers,
Vol. 7 (1956) 310–330
15. Stephant, J., Charara, A., Meizel, D.: Evaluation of a sliding mode observer for vehicle
sideslip angle. Control Engineering Practice, Available online 5 June 2006
16. Ungoren, A.Y., Peng, H., Tseng, H.E.: A study on lateral speed estimation methods. Int. J.
Vehicle Autonomous Systems, Vol. 2 (2004) 126–144
SmartMOBILE and its Applications to Guaranteed
Modeling and Simulation of Mechanical Systems
Keywords. Validated method, interval, Taylor model, initial value problem, guar-
anteed multibody modeling and simulation.
1 Introduction
Modeling and simulation of kinematics and dynamics of mechanical systems is em-
ployed in many branches of modern industry and applied science. This fact contributed
to the appearance of various tools for automatic generation and simulation of models of
multibody systems, for example, MOBILE [1]. Such tools produce a model (mostly a
system of differential or algebraic equations or both) from a formalized description of
the goal mechanical system. The system is then solved using a corresponding numerical
algorithm. However, the usual implementations are based on finite precision arithmetic,
which might lead to unexpected errors due to round-off and similar effects. For example,
unreliable numerics might ruin an election (German Green Party Convention in 2002)
or even cost human lives (Patriot Missile failure during the Gulf War), see [2], [3].
Aside from finite precision errors, possible measurement uncertainties in model
parameters and errors induced by model idealization encourage the employment of a
technique called interval arithmetic and its extensions in multibody modeling and sim-
ulation tools. Essential ideas of interval arithmetic were developed simultaneously and
independently by several people whereas the most influential theory was formulated by
R. E. Moore [4]. Instead of providing a point on the real number axis as an (inexact)
answer, intervals supply the lower and upper bounds that are guaranteed to contain the
true result. These two numbers can be chosen so as to be exactly representable in a given
finite precision arithmetic, which cannot be always ensured in the usual finite precision
case. The ability to provide a guaranteed result supplied a name for such techniques –
“validated arithmetics”. Their major drawback is that the output might be too uncertain
(e.g. [−∞; +∞]) to provide a meaningful answer. Usually, this is an indication that
the problem might be ill conditioned or inappropriately formulated, and that the finite
precision result is therefore also wrong.
To minimize the possible influence of overestimation on the interval result, this tech-
nique was extended with the help of such notions as affine [5] or Taylor forms/models
[6]. Besides, strategies and algorithms much less vulnerable to overestimation were de-
veloped. They include rearranging expression evaluation, coordinate transformations,
or zonotopes [7].
SmartMOBILE enhances the usual, floating point based MOBILE with vali-
dated arithmetics and initial value problem (IVP) solvers [8]. In this way, it can model
and perform validated simulation of the behavior of various classes of mechanical sys-
tems including non-autonomous and closed-loop ones as well as provide more realistic
models by taking into account the uncertainty in parameters.
In this paper, we give an overview of the structure and the abilities of SmartMOBILE.
The main validated techniques and software are referenced briefly in Section 2.
In Section 3, we describe in short the main features of MOBILE and focus on the im-
plementation of SmartMOBILE. Finally, a number of applications of this tool are
provided in Section 4. We summarize the paper in Section 5. On the whole, this paper
contains an overview of the potential of validated methods in mechanical modeling,
and, in particular, the potential of SmartMOBILE.
Note that the result of an interval operation is also an interval. Every possible combi-
nation of x ◦ y, where x ∈ [x̲; x̄] and y ∈ [y̲; ȳ], lies inside this interval. (For division, it
is assumed that 0 ∉ [y̲; ȳ].)
To be able to work with this definition on a computer using a finite precision arith-
metic, a concept of a machine interval is necessary. The machine interval has machine
numbers as the lower and upper bounds. To obtain the corresponding machine inter-
val for the real interval [x; x], the lower bound is rounded down to the largest machine
number equal or less than x, and the upper bound is rounded up to the smallest machine
number equal or greater than x.
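A toy illustration of a machine interval with outward rounding is given below: after each operation the bounds are widened by one unit in the last place with std::nextafter. Validated libraries switch the hardware rounding mode instead, so this sketch only conveys the idea of enclosing the exact result.

#include <cfloat>
#include <cmath>

// Toy machine interval with outward rounding via nextafter(); validated
// libraries change the rounding mode instead, so this only illustrates the
// concept of enclosing the true result between machine numbers.
struct Interval {
    double lo, hi;
};

Interval add(const Interval& a, const Interval& b) {
    double lo = a.lo + b.lo, hi = a.hi + b.hi;
    return { std::nextafter(lo, -DBL_MAX),    // round lower bound down
             std::nextafter(hi,  DBL_MAX) };  // round upper bound up
}

Interval mul(const Interval& a, const Interval& b) {
    double c[4] = { a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi };
    double lo = c[0], hi = c[0];
    for (double v : c) { if (v < lo) lo = v; if (v > hi) hi = v; }
    return { std::nextafter(lo, -DBL_MAX), std::nextafter(hi, DBL_MAX) };
}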
Consider an algorithm for solving the IVP
ẋ(t) = f (x(t)),
(1)
x(t0 ) ∈ [x0 ],
where t ∈ [t0 , tn ] ⊂ R for some tn > t0 , f ∈ C p−1 (D) for some p > 1, D ⊆ Rm is
open, f : D → Rm , and [x0 ] ⊂ D. The problem is discretized on a grid t0 < t1 < · · · <
tn with hk−1 = tk − tk−1 . Denote the solution with the initial condition x(tk−1 ) =
xk−1 by x(t; tk−1 , xk−1 ) and the set of solutions {x(t; tk−1 , xk−1 ) | xk−1 ∈ [xk−1 ]}
by x(t; tk−1 , [xk−1 ]). The goal is to find interval vectors [xk ] for which the relation
x(tk ; t0 , [x0 ]) ⊆ [xk ], k = 1, . . . , n holds.
The (simplified) kth time step of the algorithm consists of two stages [21] :
1. Proof of existence and uniqueness. Compute a step size hk−1 and an a priori enclo-
sure [x̃k−1 ] of the solution such that
(i) x(t; tk−1 , xk−1 ) is guaranteed to exist for all t ∈ [tk−1 ; tk ] and all xk−1 ∈ [xk−1 ],
(ii) the set of solutions x(t; tk−1 , [xk−1 ]) is a subset of [x̃k−1 ] for all t ∈ [tk−1 ; tk ].
Here, Banach’s fixed-point theorem is applied to the Picard iteration.
2. Computation of the solution. Compute a tight enclosure [xk ] ⊆ [x̃k−1 ] of the solution
of the IVP such that x(tk ; t0 , [x0 ]) ⊆ [xk ]. The prevailing algorithm is as follows.
2.1. Choose a one-step method
x(t; tk , xk ) = x(t; tk−1 , xk−1 ) + hk−1 ϕ(x(t; tk−1 , xk−1 )) + zk ,
where ϕ (·) is an appropriate method function, and zk is the local error which takes into
account discretization effects. The usual choice for ϕ (·) is a Taylor series expansion.
2.2. Find an enclosure for the local error zk . For the Taylor series expansion of order
p − 1, this enclosure is obtained as [zk] = h_{k−1}^p f[p]([x̃k−1]), where f[p]([x̃k−1]) is an
enclosure of the pth Taylor coefficient of the solution over the state enclosure [x̃k−1 ]
determined by the Picard iteration in Stage One.
2.3. Compute a tight enclosure of the solution. If mean-value evaluation for computing
the enclosures of the ranges of f [i] ([xk ]), i = 1, ..., p−1, instead of the direct evaluation
of f [i] ([xk ]) is used, tighter enclosures can be obtained.
Note that Taylor coefficients and their Jacobians (used in the mean-value evaluation)
are necessary to be able to use this algorithm.
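Stage One can be illustrated by a crude one-dimensional sketch: the candidate enclosure is inflated until the Picard operator [x_{k-1}] + [0, h]·f([candidate]) maps it into itself. The interval type, the epsilon-inflation heuristic and the first-order Picard form below are simplifications made for illustration only; in particular, outward rounding is ignored.

#include <algorithm>
#include <functional>

// Crude sketch of Stage One (a priori enclosure via the Picard operator),
// one-dimensional and without outward rounding: inflate the candidate box
// until [x_k] + [0,h] * f([candidate]) is contained in it.
struct Iv { double lo, hi; };

inline bool contains(const Iv& outer, const Iv& inner) {
    return outer.lo <= inner.lo && inner.hi <= outer.hi;
}

// f maps an interval to an interval enclosure of the range of f over it.
bool aprioriEnclosure(const Iv& xk, double h,
                      const std::function<Iv(const Iv&)>& f,
                      Iv& candidate, int maxIter = 20) {
    candidate = xk;
    for (int i = 0; i < maxIter; ++i) {
        Iv fx = f(candidate);
        Iv picard{ xk.lo + std::min(0.0, h * fx.lo),   // [x_k] + [0,h]*f(candidate)
                   xk.hi + std::max(0.0, h * fx.hi) };
        if (contains(candidate, picard)) { candidate = picard; return true; }
        // epsilon inflation: widen the candidate slightly and retry
        double w = 0.1 * (picard.hi - picard.lo) + 1e-12;
        candidate = { std::min(candidate.lo, picard.lo) - w,
                      std::max(candidate.hi, picard.hi) + w };
    }
    return false;   // no enclosure found: reduce the step size h
}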
3 SmartMOBILE
In this Section, we first describe the main features of MOBILE in short to provide a
better understanding of the underlying structure of SmartMOBILE. The implementa-
tion and features of this latter tool, which produces guaranteed results within the constraints
of the given model, are summarized afterwards. Note that not only simulation, but mod-
eling itself can be enhanced in SmartMOBILE by taking into account the uncertainty
in parameters, which might result, for example, from measurements.
3.1 MOBILE
MOBILE is an object oriented C++ environment for modeling and simulation of kine-
matics and dynamics of mechanical systems based on the multibody modeling method.
Its central concept is a transmission element which maps motion and force between
system states. For example, an elementary joint modeling revolute and prismatic joints
is such a transmission element. Mechanical systems are considered to be concatena-
tions of these entities. In this way, serial chains, tree type or closed loop systems can
be modeled. With the help of the global kinematics, the transmission function of the
complete system chain can be obtained from transmission functions of its parts. The
inverse kinematics and the kinetostatic method [22] help to build dynamic equations of
motion, which are solved with common IVP solvers. MOBILE belongs to the numer-
ical type of modeling software, that is, it does not produce a symbolic description of
the resulting model. Only the values of output parameters for the user-defined values of
input parameters and the source code of the program itself are available. In this case,
it is necessary to integrate verified techniques into the core of the software itself, as
opposed to the tools of the symbolical type, where the task is basically reduced to the
application of the validated methods to the obtained system of equations.
All transmission elements in MOBILE are derived from the abstract class MoMap,
which supplies their main functionality including the methods doMotion() and
doForce() for transmission of motion and force. For example, elementary joints
are modeled by the class MoElementaryJoint. Besides, there exist elements for
modeling mass properties and applied forces. Transmission elements are assembled
to chains implemented by the class MoMapChain. The methods doMotion() and
doForce() can be used for a chain representing the system to determine the cor-
responding composite transmission function. The class MoEqmBuilder is respon-
sible for generation of equations of motion, which are subsequently transferred into
their state-space form by MoMechanicalSystem. Finally, the corresponding IVP is
solved by an appropriate integrator algorithm, for example, Runge–Kutta’s using the
class MoRungeKuttaIntegrator derived from the basic class MoIntegrator.
consume a lot of CPU time in case of such a large program as MOBILE. An alternative
is to make use of the system’s mechanics for this purpose. This option is not provided by
MOBILE developers yet and seems to be rather difficult to algorithmize for (arbitrary)
higher orders of derivatives. That is why it was decided to employ the first possibility
in SmartMOBILE.
To obtain the derivatives, SmartMOBILE uses the overloading technique. In ac-
cordance with Subsection 2.1, all relevant occurrences of MoReal (an alias of double
in MOBILE) have to be replaced with an appropriate new data type. Almost every val-
idated solver needs a different basic validated data type (cf. Table 1). Therefore, the
strategy in SmartMOBILE is to use pairs type/solver. To provide interval validation
with the help of the VNODE-based solver TMoAWAIntegrator, the basic data type
TMoInterval including data types necessary for algorithmic differentiation should
be used. The data type TMoFInterval enables the use of TMoValenciaIntegrator,
an adjustment of the basic version of ValEncIA-IVP. The newly developed
TMoRiotIntegrator is based on the IVP solver from the library RiOT, an indepen-
dent C++ version of COSY and COSY VI, and requires the class TMoTaylorModel,
a SmartMOBILE-compatible wrapper of the library’s own data type TaylorModel.
Analogously, to be able to use an adjustment of COSY VI, the wrapper RDAInterval
is necessary. Modification of the latter solver for SmartMOBILE is currently work in
progress.
In general, kinematics can be simulated with the help of all of the above mentioned
basic data types. However, other basic data types might become necessary for more
specific tasks such as finding equilibrium states of a system, since such tasks require spe-
cific solvers. SmartMOBILE provides an option for modeling equilibrium states in
a validated way with the help of the interval-based data type MoFInterval and the
class MoIGradientStaticEquilibriumFinder, a version of the zero-finding
algorithm from the C-XSC Toolbox.
Table 1. Basic validated data types and the corresponding solvers in SmartMOBILE.
The availability of several basic data types in SmartMOBILE points out its sec-
ond feature: the general data type independency through its template structure. That is,
MoReal is actually replaced with a placeholder and not with a concrete data type. For
example, the transmission element MoRigidLink from MOBILE is replaced with
its template equivalent TMoRigidLink, the content of the placeholder for which (e.g.
TMoInterval or MoReal, cf. Fig. 1) can be defined at the final stage of the system
assembly. This allows us to use a suitable pair consisting of the data type and solver
depending on the application at hand. If only a reference about the form of the solution
is necessary, MoReal itself and a common numerical solver (e.g. Runge-Kutta’s) can
be used. If a relatively fast validation of dynamics without much uncertainty in param-
eters is of interest, TMoInterval and TMoAWAIntegrator might be the choice.
For validation of highly nonlinear systems with a considerable uncertainty, the slower
combination of TMoTaylorModel and TMoRiOTIntegrator can be used.
Fig. 1 (MOBILE vs. SmartMOBILE):
MOBILE:       MoRigidLink R;
SmartMOBILE:  TMoRigidLink<TMoInterval> R;   or   TMoRigidLink<MoReal> R;
A MOBILE user can easily switch to SmartMOBILE because the executable
programs for the models in both environments are similar (cf. Fig. 4). In the validated
environment, the template syntax should be used. The names of transmission elements
are the same aside from the preceding letter T. The methods of the classes have the
same names, too. Only the solvers are, of course, different, although they follow the
same naming conventions.
4 Applications
The modeling of the five arm manipulator, the system defined in detail in [26], can be
enhanced in SmartMOBILE by using so-called sloppy joints [27] instead of usual
revolute ones. In the transmission element responsible for the modeling of the sloppy
joint, it is no longer assumed that the joint connects the rotational axes of two bodies
exactly concentrically. Instead, the relative distance between the axes is supposed to be
within a specific (small) range. Two additional parameters are necessary to describe the
sloppy joint (cf. Fig. 2): the radius li ∈ [0; lmax] and the relative orientation angle αi ∈
[0; 2π) (the parameter ϕi that describes the relative orientation between two connected
bodies is the same both for the sloppy and the usual joint).
Fig. 2. Sloppy joint between body i and body i + 1, with radius li and orientation angle αi; the transmitted force and moment are Fi+1 = Fi and Mi+1 = Mi + li × Fi.
Data type        x       y       CPU (s)
TMoInterval      1.047   1.041   0.02
TMoTaylorModel   0.163   0.290   0.14
CPU time, which is not the case with SmartMOBILE. Additionally, the results are
proven to be correct there through the use of validated methods.
Table 3. Performance of validated integrators for the double pendulum over [0; 0.4].
(a) SmartMOBILE model:
double l_max;
for(int i=0;i<5;i++){
  TMoSlacknessJoint<type> joint(K[2*i],K[2*i+1],phi[i],zAxis,l_max);
  TMoRigidLink<type> link(K[2*i],K[2*i+1],l[i]);
  R[i]=joint; L[i]=link;
  Manipulator<<R[i]<<L[i];
}
Manipulator.doMotion();
cout<<"Position="<<K[10].R*K[10].r;
(b) Enclosures of the tip position (x and y in m).
MOBILE:
MoFrame K0, K1, K2, K3, K4;
MoAngularVariable psi1, psi2;
// transmission elements
MoVector l1(0,0,-1), l2(0,0,-1);
MoElementaryJoint R1(K0,K1,psi1,xAxis);
MoElementaryJoint R2(K2,K3,psi2,xAxis);
MoRigidLink rod1(K1,K2,l1), rod2(K3,K4,l2);
MoReal m1(1), m2(1);
MoMassElement Tip1(K2,m1), Tip2(K4,m2);
// the complete system
MoMapChain Pend;
Pend << R1<<rod1<<Tip1<<R2<<rod2<<Tip2;
// dynamics
MoVariableList q; q << psi1<<psi2;
MoMechanicalSystem S(q,Pend,K0,zAxis);
MoAdamsIntegrator I(S);
for(int i=0;i<100;i++) I.doMotion();

SmartMOBILE:
#define TMoInterval t;
TMoFrame<t> K0, K1, K2, K3, K4;
TMoAngularVariable<t> psi1, psi2;
// transmission elements
TMoVector<t> l1(0,0,-1), l2(0,0,-1);
TMoElementaryJoint<t> R1(K0,K1,psi1,xAxis);
TMoElementaryJoint<t> R2(K2,K3,psi2,xAxis);
TMoRigidLink<t> rod1(K1,K2,l1), rod2(K3,K4,l2);
t m1(1), m2(1);
TMoMassElement<t> Tip1(K2,m1), Tip2(K4,m2);
// the complete system
TMoMapChain<t> Pend;
Pend << R1<<rod1<<Tip1<<R2<<rod2<<Tip2;
// dynamics
TMoVariableList<t> q; q << psi1<<psi2;
TMoMechanicalSystem<t> S(q,Pend,K0,zAxis);
TMoAWAIntegrator I(S,0.0001,ITS QR,15);
I.doMotion();

Fig. 4. The double pendulum in MOBILE (above) and SmartMOBILE (below).
Fig. 5. Interval enclosures for the first and second state variable of the double pendulum: (a) enclosure of the first joint angle; (b) enclosure of the second joint angle. Results shown for TMoAWAIntegrator (h = 0.0001), TMoRiotIntegrator (0.0002 < h < 0.02) and TMoValenciaIntegrator (h = 0.0001).
Since we do not have any uncertainties in the model, the intervals obtained are very
close to point intervals, that is, the lower and upper bounds of each βi nearly coincide. The difference is noticeable only after the 12th
digit after the decimal point. Note that if the same problem is modeled using the non-
verified model in MOBILE, only one (unstable) equilibrium state [β1 , β2 ] =[3.142;-3.142]
is obtained (using the identical initial guess).
Fig. 6. Interval enclosures of the knee angle with (left) and without (right) ±0.1% uncertainty in the thigh length (model data without uncertainty vs. gait lab data).

5 Conclusions

In this paper, we presented the tool SmartMOBILE for guaranteed modeling and sim-
ulation of kinematics and dynamics of mechanical systems. With its help, the behavior of
different classes of systems can be obtained with the guarantee of correctness, an option
which is not given by tools based on floating point arithmetics. Besides, the uncertainty
in parameters can be taken into account in a natural way. Moreover, SmartMOBILE
is flexible and allows the user to choose the kind of underlying arithmetics according to
the task at hand. The tool was applied to four mechanical problems.
The main directions of future development include the enhancement of validated options for modeling and simulation of closed-loop systems in SmartMOBILE, as well as the integration of further verified solvers into its core.
Path Planning for Cooperating Unmanned Vehicles over 3-D Terrain

1 Introduction
Path planning is the generation of a space path between an initial location and the
desired destination that has an optimal or near-optimal performance under specific
constraints [1]. The main concerns during the comparison of various candidate
solutions are feasibility and optimality [2]. Searching for optimality is not a trivial
task and in most cases results in prohibitive computation time, even in simple
problems. Therefore, in most cases we search for suboptimal or just feasible solutions.
In this work the path planning for cooperating unmanned vehicles moving over a
3-D terrain is considered; the vehicles can be either Unmanned Aerial Vehicles
(UAVs) or Autonomous Underwater Vehicles (AUVs). UAVs and AUVs share the common features of operating in a 3-D environment and having six degrees of freedom, although their kinematic characteristics are not the same. The upper ceiling for AUVs is the sea surface, while a similar upper ceiling exists for UAVs due to stealth considerations or flight envelope restrictions.
Path planning for UAVs and AUVs involves special characteristics that have to be considered [3], [4], [5], such as: (a) physical feasibility, (b) performance related to the mission, (c) real-time implementation, (d) cooperation between the vehicles, (e) stealth (low observability due to the selected path). Besides their common features, differences also exist between the two categories as far as coordination and path planning are concerned; these are mainly related to the different sensors and electronic equipment needed in order to cooperate and perform their mission.
Cooperation between robotic vehicles has recently gained increased interest, as systems of multiple vehicles engaged in cooperative behavior show specific benefits compared to a single one [6], [7].
Path planning problems are computationally demanding multi-objective multi-
constraint optimization problems [8]. The problem complexity increases when multiple vehicles are involved. Various approaches have been reported for coordinated UAV route planning, such as Voronoi diagrams [9], mixed integer linear programming [10], [11] and dynamic programming [12] formulations.
In Beard et al. [9] the motion-planning problem was decomposed into a waypoint
path planner and a dynamic trajectory generator. The path-planning problem was
solved via a Voronoi diagram and Eppstein’s k-best paths algorithm, while the
trajectory generator problem was solved via a real-time nonlinear filter.
In [13] the motion-planning problem for a limited resource of Mobile Sensor
Agents (MSAs) is investigated, in an environment with a number of targets larger
than the available MSAs. The problem is formulated as an optimization one, whose
objective is to minimize the average time duration between two consecutive
observations of each target.
Computational intelligence methods, such as Neural Networks [14], Fuzzy Logic
[15] and Evolutionary Algorithms (EAs) [5], [16] have been successfully used to
produce trajectories for guiding mobile robots in known, unknown or partially known
environments. Despite their computational cost, EAs are considered a viable candidate for solving path planning problems effectively; the reasons are their high robustness, their ease of implementation, and their high adaptability to different optimization problems, with or without constraints [16].
EAs have been successfully used in the past for the solution of the path-finding
problem in ground based or sea surface navigation [17], [18], [19], or for solving the
path-finding problem in a 3-D environment for underwater vehicles [20], [21].
Zheng et al. [5] proposed a route planner for UAVs based on evolutionary computation. The generated routes enable the vehicles to arrive at their destination simultaneously, taking into account the exposure of the UAVs to potential threats. The flight route consists of straight-line segments connecting the way points from the starting to the goal points. The cost function penalizes the route length, high-altitude flights, and routes that come dangerously close to known ground threats.
In [22] a multi-task assignment problem for cooperating UAVs is formulated as a
combinatorial optimization problem; a Genetic Algorithm is utilized for assigning the
multiple agents to perform various tasks on multiple targets.
Fig. 1. A representation of the proposed concept: three vehicles are moving along curved path
lines over a 3-D terrain; an upper ceiling is enforced (either sea surface or the maximum
allowed flying height); on-board sensors are scanning the environment within a certain range in
front of each vehicle.
Initially the off-line planner will be presented; it generates collision free paths in
environments with known characteristics and flight restrictions. The on-line planner,
being an extension of the off-line one, was developed to generate collision free paths
in unknown environments. As each vehicle moves towards its destination, its on-
board sensors are scanning the environment within a certain range and certain angles;
this information is exchanged between the members of the team, resulting in a gradual
mapping of the environment (Fig. 1). The on-line planner uses the acquired
knowledge of the environment to generate a near optimum path for each vehicle that
will guide it safely to an intermediate position within the known territory. The process
is repeated until the corresponding final position is reached by one or more members
of the team. Then, each one of the remaining members of the team either uses the off-
line planner to compute a path that connects its current position and the final
destination, or continues in the on-line mode until it reaches the common destination.
Both path planning problems are formulated as minimization problems, where
specially constructed functions take into account mission and cooperation objectives
and constraints, with a Differential Evolution algorithm to serve as the optimizer.
The rest of the paper is organized as follows: in section 2 the off-line path planner
for a single vehicle will be briefly discussed. Section 3 deals with the concept of on-
line path planning for cooperating vehicles. The problem formulation is described,
including assumptions, objectives, constraints, cost function definition and path
modeling. Simulation results are presented in section 4, followed by a discussion in
section 5.
2 Off-Line Path Planner

The off-line planner generates collision free paths in environments with known
characteristics and flight restrictions, where the solid boundaries are interpreted as 3-
D surfaces. The derived path line for each vehicle is a single continuous 3-D B-Spline
curve with fixed starting and ending control points. A third point, placed at a pre-specified distance from the starting one, is also fixed, determining the initial flight
direction for the corresponding vehicle. Between the fixed control points, free-to-
move control points determine the shape of the curve. For each path, the number of
the free-to-move control points is user-defined.
Straight line segments that connect a number of way points have been used in the
past to model UAV paths in 2D or 3D space [23], [5]. However, these simplified
paths cannot be used for an accurate simulation of UAV’s flight, unless a large
number of way points is used. In [9], paths from the initial vehicle location to the
target location are derived from a graph search of a Voronoi diagram that is
constructed from the known threat locations. The resulting paths, consisting of line
segments, are subsequently smoothed around each way point. The Dubins car formulation [24] has been proposed as an alternative approach to modeling UAV dynamics [25]. This approach seems inefficient for modeling scenarios that include 3D terrain avoidance and the following of stealthy routes. However, it seems sufficient for task assignment purposes for cooperating UAVs flying at safe altitudes [13], [22], [25].
B-Spline curves have been used in the past for trajectory representation in 2-D [26]
or in 3-D environments [16], [27]. They are well suited to an optimization procedure, as they need only a few variables (the coordinates of their control points) to define
complicated curved paths [28], [29]. The use of B-Spline curves for the determination
of a path-line provides the advantage of describing complicated non-monotonic 3-
dimensional curves with controlled smoothness with a small number of design
parameters, i.e. the coordinates of the control points. Another valuable characteristic
of the adopted B-Spline curves is that the curve is tangential to the control polygon at
the starting and ending points. This characteristic can be used in order to define the
starting or ending direction of the curve, by inserting an extra fixed point after the
starting one, or before the ending control point.
Fig. 2. Schematic representation of the B-Spline control polygon (top) and its projection on the
horizontal plane (bottom).
In this work each path is constructed using a 3-D B-Spline curve; each B-Spline
control point is defined by its three Cartesian coordinates xk,j, yk,j, zk,j (k=0,…,n,
j=1,…,N, N being the number of vehicles, while n+1 is the number of control points
in each B-Spline curve, the same for all curves). The first (k=0) and last (k=n) control
points of the control polygon are the initial and target points of the jth UAV, which are
predefined by the user. The second (k=1) control point is positioned in a pre-specified
distance from the first one, in a given altitude, and in a given direction, in order to
define the initial direction of the corresponding path.
The control polygon of each B-Spline curve is defined by successive straight line
segments (Fig. 2). Each segment of the control polygon is defined using its projection
on the horizontal plane (Fig. 2); the length seg_lengthk,j, and the direction seg_anglek,j
of this projection are used as design variables (k=2,…,n-1, j=1,…,N). Design
variables seg_anglek,j are defined as the difference between the direction (in deg) of
the current segment’s projection and the projection of the previous one. For the first
segment (k=1) of each control polygon seg_angle1,j is measured with respect to the x-
axis (Fig. 2). Additionally, the control points’ altitudes zk,j are used as design
variables, except for the three fixed points (k=0, k=1, and k=n), which are predefined.
For the first segment (k=1), seg_length1,j, and seg_angle1,j are pre-specified in order to
define the initial direction of the path, and they are not included in the design
variables of the optimization procedure.
The horizontal coordinates of each B-Spline control point xk,j and yk,j can be easily
calculated by using seg_lengthk,j and seg_anglek,j along with the coordinates of the
previous control point xk-1,j and yk-1,j. The use of seg_lengthk,j and seg_anglek,j as design
variables instead of xk,j and yk,j was adopted for three reasons. First, abrupt turns of each flight path can easily be avoided by explicitly imposing tight lower and upper bounds on the seg_anglek,j design variables. Second, with the proposed design variables a better convergence rate was achieved than with the B-Spline control points' coordinates (xk,j, yk,j, zk,j) as design variables; this is a consequence of the reduction of the search space obtained with the proposed formulation. Third, using seg_lengthk,j as design variables allows an easier determination of the upper bound of each curve's length, along with a smoother variation of the lengths of each curve's segments. The lower and upper bounds of each independent design variable are predefined by the user.
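To make this mapping concrete, a minimal C++ sketch is given below. It only restates the construction described above (the heading is accumulated from seg_angle, the Cartesian coordinates follow from seg_length, and the altitude is taken directly from z); all names and the function interface are hypothetical and not taken from the authors' implementation.

#include <cmath>
#include <vector>

struct ControlPoint { double x, y, z; };

// Hypothetical helper: converts the design variables of one path into B-Spline
// control point coordinates. The fixed points p0 (start) and p1 (initial direction)
// are assumed given; the fixed target point is appended separately by the caller.
std::vector<ControlPoint> buildControlPoints(const ControlPoint& p0,
                                             const ControlPoint& p1,
                                             const std::vector<double>& seg_length,
                                             const std::vector<double>& seg_angle,  // degrees, relative to previous segment
                                             const std::vector<double>& z)          // altitudes of the free-to-move points
{
    const double pi = 3.14159265358979323846;
    std::vector<ControlPoint> cp{p0, p1};
    // heading of the first (fixed) segment, measured from the x-axis
    double heading = std::atan2(p1.y - p0.y, p1.x - p0.x);
    for (std::size_t k = 0; k < seg_length.size(); ++k) {
        heading += seg_angle[k] * pi / 180.0;          // accumulate the relative direction change
        ControlPoint prev = cp.back();
        cp.push_back({prev.x + seg_length[k] * std::cos(heading),
                      prev.y + seg_length[k] * std::sin(heading),
                      z[k]});
    }
    return cp;
}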
For the case of a single vehicle the optimization problem to be solved minimizes a
set of five terms, connected to various objectives and constraints; they are associated
with the feasibility of the curve, its length and a safety distance from the ground. The
cost function to be minimized is defined as:
f = Σ_{i=1}^{5} wi fi    (1)
Term f1 penalizes the non-feasible curves that pass through the solid boundary. In
order to compute this term, discrete points along each curve are computed, using B-
Spline theory [28] [29] and a pre-specified step for B-Spline parameter u. The value
of f1 is proportional to the number of discrete curve points located inside the solid
boundary. Term f2 is the length of the curve (non-dimensional with the distance
between the starting and destination points) and is used to provide shorter paths. Term
f3 is designed to provide flight paths with a safety distance from solid boundaries. For
each discrete point i (i=1,…,nline, where nline is the number of discrete curve points)
of the B-Spline curve its distance from the ground is calculated (the ground is
described by a mesh of nground discrete points). Then the minimum distance of the
curve and the ground dmin is computed. Term f3 is then defined as:
f3 = (dsafe / dmin)²,    (2)
to ensure that curves inside the pre-specified space have a smaller cost function than
those having control points outside of it. This can be formally written as
if f4 > 0 ⇒ f4 = f4 + C2,    (4)
where C2 is a constant.
Term f5 was designed to provide path lines within the already scanned terrain. Each
control point of the B-Spline curve is checked for whether it is placed over a known
territory. The ground is modeled as a mesh of discrete points and the algorithm
computes the mesh shell (on the x-y plane) that includes each B-Spline control point.
If the corresponding mesh shell is characterized as unknown then a constant penalty is
added to f5. A mesh shell is characterized as unknown if all its 4 nodes are unknown
(have not been detected by a sensor).
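For illustration, the sketch below evaluates discrete points of a clamped cubic B-Spline with the Cox-de Boor recursion and counts the samples lying below the ground, which is the information needed for f1. It is a simplified illustration under assumed interfaces (terrainHeight(x, y) standing for the ground mesh, a uniform clamped knot vector); the authors' implementation based on [28], [29] may differ.

#include <cstddef>
#include <vector>

struct Pt { double x, y, z; };

// Cox-de Boor recursion for one basis function N_{i,p}(u) over the knot vector U.
double basis(std::size_t i, std::size_t p, double u, const std::vector<double>& U) {
    if (p == 0)
        return (u >= U[i] && u < U[i + 1]) ? 1.0 : 0.0;
    double left = 0.0, right = 0.0;
    if (U[i + p] > U[i])
        left = (u - U[i]) / (U[i + p] - U[i]) * basis(i, p - 1, u, U);
    if (U[i + p + 1] > U[i + 1])
        right = (U[i + p + 1] - u) / (U[i + p + 1] - U[i + 1]) * basis(i + 1, p - 1, u, U);
    return left + right;
}

// Samples nline points of a clamped cubic B-Spline defined by the control points cp
// and returns the fraction of samples that fall below the terrain (proportional to f1).
double collisionTerm(const std::vector<Pt>& cp, int nline,
                     double (*terrainHeight)(double, double))
{
    const std::size_t p = 3, n = cp.size();            // cubic curve, n control points
    std::vector<double> U(n + p + 1);
    for (std::size_t j = 0; j < U.size(); ++j) {        // assumed clamped uniform knot vector
        if (j < p + 1) U[j] = 0.0;
        else if (j >= n) U[j] = 1.0;
        else U[j] = double(j - p) / double(n - p);
    }
    int inside = 0;
    for (int s = 0; s < nline; ++s) {
        double u = (s + 0.5) / nline;                    // keep u strictly inside [0,1)
        double x = 0.0, y = 0.0, z = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            double b = basis(i, p, u, U);
            x += b * cp[i].x; y += b * cp[i].y; z += b * cp[i].z;
        }
        if (z < terrainHeight(x, y)) ++inside;           // point below the solid boundary
    }
    return double(inside) / nline;
}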
Weights wi are experimentally determined, using as criterion the almost uniform
effect of the last four terms in the objective function. Term w1 f1 has a dominant role in Eq. 1, providing feasible curves in a few generations, since path feasibility is the main
concern. The minimization of Eq. 1 results in a set of B-Spline control points, which
actually represent the desired path.
For the solution of the minimization problem a Differential Evolution (DE) [30]
algorithm is used. The classic DE algorithm evolves a fixed size population, which is
randomly initialized. After initializing the population, an iterative process is started
and at each generation G, a new population is produced until a stopping condition is
satisfied. At each generation, each element of the population can be replaced with a
new generated one. The new element is a linear combination between a randomly
selected element and the difference between two other randomly selected elements. A
detailed description of the DE algorithm used in this work can be found in [31].
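The following C++ sketch shows one generation of such a scheme (the common DE/rand/1/bin variant) for a population of design-variable vectors; the cost functor stands for Eq. 1, and F and Cr are the mutation and crossover constants. It only illustrates the mechanism described above and is not necessarily the exact variant of [30], [31].

#include <random>
#include <vector>

// One DE generation: for each element, build a trial vector from a random base
// element plus the scaled difference of two other random elements, cross it over
// with the current element, and keep the better of the two (greedy selection).
void deGeneration(std::vector<std::vector<double>>& pop,
                  std::vector<double>& fitness,
                  double F, double Cr,
                  double (*cost)(const std::vector<double>&),
                  std::mt19937& rng)
{
    const std::size_t NP = pop.size(), D = pop[0].size();
    std::uniform_int_distribution<std::size_t> pick(0, NP - 1);
    std::uniform_int_distribution<std::size_t> pickDim(0, D - 1);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    for (std::size_t i = 0; i < NP; ++i) {
        std::size_t r1, r2, r3;
        do { r1 = pick(rng); } while (r1 == i);
        do { r2 = pick(rng); } while (r2 == i || r2 == r1);
        do { r3 = pick(rng); } while (r3 == i || r3 == r1 || r3 == r2);
        std::vector<double> trial = pop[i];
        std::size_t jrand = pickDim(rng);                 // ensure at least one mutated variable
        for (std::size_t j = 0; j < D; ++j)
            if (j == jrand || unif(rng) < Cr)
                trial[j] = pop[r1][j] + F * (pop[r2][j] - pop[r3][j]);
        double ft = cost(trial);
        if (ft <= fitness[i]) { pop[i] = trial; fitness[i] = ft; }
    }
}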
3 On-Line Path Planner for Cooperating Vehicles

The on-line path planner was designed for navigation and collision avoidance of a
small team of autonomous vehicles moving over a completely unknown static 3-D
terrain. The general constraint of the problem is the collision avoidance between the
vehicles and the ground. The route constraints are: (a) predefined initial and target
coordinates for all vehicles, (b) predefined initial directions for all vehicles, (c)
predefined minimum and maximum limits of allowed-to-move space. The first two
route constraints are explicitly taken into account by the optimization algorithm. The
third route constraint is implicitly handled by the algorithm, through the cost function.
The cooperation objective is that all members of the team should reach the same
target point.
The on-line planner is based on the ideas developed in [16] for a single UAV. The
on-line planner rapidly generates a near optimum path, modeled as a 3-D B-Spline
curve that will guide each vehicle safely to an intermediate position within the already
scanned area. The information about the already scanned area by each vehicle is
passed to the rest cooperating vehicles, in order to maximize the knowledge of the
environment. The process is repeated until the final position is reached by one or more members of the team (it is possible for several members of the team to reach the target simultaneously, i.e., in the same number of on-line steps). Then the remaining members of the team switch to the off-line mode and a single B-Spline path for each
vehicle is computed to guide it from its current position, through the already scanned
territory to the common final destination. An alternative approach, which was also tested, is to keep the remaining vehicles in the on-line mode instead of switching to the off-line mode after a vehicle has reached the target.
In the on-line problem only four control points define each B-Spline curve, the first
two of which are fixed and determine the direction of the path of the current vehicle.
The remaining two control points are allowed to take any position within the already
scanned space, taking into consideration the given constraints. The second control point is used to ensure that at least first-derivative continuity of two connected curves is provided at their common point. Hence, the second control point of the next curve
should lie on the line defined by the last two control points of the previous curve (Fig.
3). The design variables that define each B-Spline segment are the same as in the off-
line case, i.e. seg_lengthk,j , seg_anglek,j, and zk,j (k=2, 3, and j=1,…,N).
The path-planning algorithm considers the scanned surface as a group of quadratic
mesh nodes. All ground nodes are initially considered unknown. An algorithm is used
to distinguish between nodes visible by the on-board sensors and nodes not visible.
The algorithm uses a predefined range RS for each sensor as well as two angles, one
for the horizontal aH and one for the vertical scanning aV (Fig. 4). The range and the
two angles are predefined by the user and depend on the type of the sensors used. A
node is not visible by a sensor if it is not within the sensor’s range and angles of sight,
or if it is within the sensor’s range and angles of sight but is hidden by a ground
section that lies between it and the vehicle. The corresponding algorithm simulates
the sensor and checks whether the ground nodes within the sensor’s range are
“visible” or not and consequently “known” or not. If a newly scanned node is
characterized as “visible”, it is added to the set of scanned ground nodes, which is
common for all cooperating vehicles.
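A hedged sketch of the range-and-angle part of this visibility test is given below; the occlusion check against intermediate terrain mentioned in the text is omitted, aH and aV are interpreted here as full sector angles centred on the vehicle heading, and all names are illustrative.

#include <cmath>

// Returns true if the ground node (xn, yn, zn) lies within the sensor range Rs and
// within the horizontal/vertical half-angles aH/2 and aV/2 of a vehicle located at
// (xv, yv, zv) with heading yaw (rad). Terrain occlusion is not checked here.
bool nodeVisible(double xv, double yv, double zv, double yaw,
                 double xn, double yn, double zn,
                 double Rs, double aH, double aV)
{
    double dx = xn - xv, dy = yn - yv, dz = zn - zv;
    double range = std::sqrt(dx * dx + dy * dy + dz * dz);
    if (range > Rs) return false;
    double azimuth = std::atan2(dy, dx) - yaw;                   // horizontal bearing relative to heading
    azimuth = std::atan2(std::sin(azimuth), std::cos(azimuth));  // wrap to (-pi, pi]
    double elevation = std::atan2(dz, std::sqrt(dx * dx + dy * dy));
    return std::fabs(azimuth) <= aH / 2.0 && std::fabs(elevation) <= aV / 2.0;
}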
The information from its sensors is used to produce the first path line segment for
the corresponding vehicle. As the vehicle is moving along its first segment and until it
has traveled about 3/4 of its length, its sensor scans the surrounding area, returning a
new set of visible nodes, which are subsequently added to the common set of scanned
nodes. This (simulated) scanning is performed for 11 intermediate positions along
each path segment. The on-line planner, then, produces a new segment for each
vehicle, whose first point is the last point of the previous segment and whose last
point lies somewhere in the already scanned area, its position being determined by the
on-line procedure. The on-line process is repeated until the ending point of the current
path line segment of one vehicle lies close to the final destination. Then the remaining members of the team can either switch to the off-line process, in order to reach the target using B-Spline curves that pass through the scanned terrain, or remain in
the on-line mode.
Fig. 3. Schematic representation of the formation of the complete path by successive B-Spline
segments (projected on the horizontal plane).
Fig. 4. Schematic representation of the scanned area in front of each vehicle; aH and aV are the
solid angles in the horizontal and vertical directions that define the scanned sector.
The position at which the algorithm starts to generate the next path line segment
for each vehicle (here taken as 3/4 of the segment length) depends on the range of the sensors, the vehicle's velocity and the computational demands of the algorithm. The
computation of intermediate path segments for each vehicle is formulated as a
minimization problem. The cost function to be minimized is formulated as the
weighted sum of seven different terms
f = Σ_{i=1}^{7} wi fi ,    (5)
where wi are the weights and fi are the corresponding terms described below.
Terms f1, f2, and f3 are similar to terms f1, f3, and f4 respectively of the off-line
procedure. Term f1 penalizes the non-feasible curves that pass through the solid
boundary. Term f2 is designed to provide flight paths with a safety distance from solid
boundaries. Only already scanned ground points are considered for this calculation.
Additionally, the points that are lower than a pre-specified (small) vertical distance
from the current level of flight are not considered for this calculation. Term f3 is
designed to provide B-Spline curves with control points inside the pre-specified
working space.
Term f4 is designed to provide flight segments with their last control point having a
safety distance from solid boundaries. This term was introduced to ensure that the
next path segment will not start very close to a solid boundary (which may lead to
infeasible paths or paths with abrupt turns). The minimum distance Dmin from the
ground is calculated for the last control point of the current path segment. Only
already scanned ground points are considered for this calculation. As in term f2 the
points that are lower than a pre-specified (small) vertical distance from the current
level of flight are not considered for this calculation. Term f4 is then defined as
f4 = (dsafe / Dmin)²,    (6)
where Npo is the number of the discrete curve points produced so far by all vehicles
and rk is their distance from the last point of the current curve segment.
Term f7 represents another potential field, which is developed around the final
target and has the form
f7 = r2²,    (9)
where r2 is the distance between the last point of the current curve and the final
destination. Thus, when the vehicle is near its target, the value of this term is quite
small and prevents the vehicle from moving away.
Weights wi in Eq. 5 are experimentally determined, using as criterion the almost
uniform effect of all the terms, except the first one. Term w1 f1 has a dominant role, in
order to provide feasible curve segments in a few generations, since path feasibility is
the main concern.
Fig. 5. Test Case 1: On-line path planning for a single UAV. The maximum allowed height for
the vehicle is shown using a cutting plane.
4 Simulation Results
The same artificial environment was used for all the test cases considered, with
different starting and target points. The artificial environment is constructed within a
rectangle of 20x20 (non-dimensional distances). The (non-dimensional) range of the
sensors (Rs) that scan the environment was set equal to 4 for all vehicles. The safety
distance from the ground was set equal to dsafe=0.25. The (experimentally optimized)
settings of the Differential Evolution algorithm during the on-line procedure were as
follows: population size = 20, F = 0.6, Cr = 0.45, number of generations = 70. For the
on-line procedure we have two free-to-move control points, resulting in 6 design
variables. The corresponding settings during the off-line procedure were as follows:
population size = 30, F = 0.6, Cr = 0.45, number of generations = 70. For the off-line procedure eight control points were used to construct each B-Spline curve, including the initial (k=0) and the final (k=7) ones. These correspond to five free-to-move control points, resulting in 15 design variables. All B-Spline curves have a degree equal to 3.
All experiments have been designed in order to search for path lines between
“mountains”. For this reason, an upper ceiling has been enforced in the optimization
procedure, by explicitly providing an upper boundary for the z coordinates of all B-
Spline control points. Test Case 1 corresponds to the on-line path planning for a
single vehicle over an unknown environment (Fig. 5). The horizontal and vertical
angles aH and aV, used for the sensor’s simulation, were set equal to 45 degrees. The
complete path consists of 6 B-Spline segments; the final curve is smooth enough to be
followed by a vehicle. The first turn in the path line is due to the presence of an
obstacle (solid ground) in front of the vehicle (Fig. 5); the second turn forces the
vehicle towards its final destination.
Fig. 6. Test Case 2 corresponds to the on-line path planning for 3 vehicles. The picture shows
the status of the path lines when the first vehicle (near the upper corner) reaches the target.
Fig. 7. The final status of the path lines of Test Case 2. The off-line path planner was used by
the remaining vehicles to drive them, from their current position to the final destination,
through already scanned area.
Test Case 2 corresponds to the on-line path planning for 3 unmanned vehicles (Fig.
1, 6, and 7). The horizontal and vertical angles aH and aV, used for the sensor’s
simulation were set equal to 45 and 30 degrees respectively. Figure 1 shows the status
of the three path lines when the first line segment has been computed for all three
vehicles. Figure 6 shows the status of the three path lines when the first vehicle
reaches the target, after two steps in the on-line procedure. The final status is shown in Fig. 7; the remaining two vehicles switch to off-line mode to reach
the target. A curved path is computed for each one of the remaining vehicles, which
drives the vehicle from its current position to the target, through the already scanned
area.
Fig. 8. Test Case 3: Successive snapshots of the path-line for three vehicles, computed using
only the on-line planner. Two of the vehicles are reaching the target using 3 segments.
An alternative strategy was considered in Test Case 3. Instead of switching to off-line mode when a vehicle (or more) reaches the target, the on-line path planner is always used to guide all vehicles to the target. In this case three vehicles are
considered. The horizontal and vertical angles aH and aV, used for the sensor’s
simulation, were set equal to 45 degrees. Figure 8 contains successive snapshots of the path lines produced using the on-line path planner. As can be observed, two of the vehicles arrive at the target after the same number of steps (Fig. 8).
Two more steps of the procedure are needed for the third vehicle to reach the target
(Fig. 9).
Fig. 9. Test Case 3: Two more steps are needed for the third vehicle to reach the target using
the on-line path planner.
5 Discussion
References
22. Shima, T., Rasmussen, S.J., Sparks, A.G.: UAV Cooperative Multiple Task Assignments
using Genetic Algorithms. In: 2005 American Control Conference, June 8-10, Portland,
OR, USA (2005)
23. Moitra, A., Mattheyses, R.M., Hoebel, L.J., Szczerba, R.J., Yamrom, B.: Multivehicle
Reconnaissance Route and Sensor Planning. IEEE Trans. on Aerospace and Electronic
Syst. 37, 799–812 (2003)
24. Dubins, L.: On Curves of Minimal Length with a Constraint on Average Curvature, and
with Prescribed Initial and Terminal Position. Amer. J. of Math. 79, 497–516 (1957)
25. Shima, T., Schumacher, C.: Assignment of Cooperating UAVs to Simultaneous Tasks
Using Genetic Algorithms. In: AIAA Guidance, Navigation, and Control Conference and
Exhibit, San Francisco (2005)
26. Martinez-Alfaro, H., Gomez-Garcia, S.: Mobile Robot Path Planning and Tracking using Simulated Annealing and Fuzzy Logic Control. Expert Systems with Applications 15, 421–429 (1998)
27. Nikolos, I.K., Tsourveloudis, N., and Valavanis, K.P.: Evolutionary Algorithm Based 3-D
Path Planner for UAV Navigation. In: 9th Mediterranean Conference on Control and
Automation, Dubrovnik, Croatia (2001)
28. Piegl, L., Tiller, W.: The NURBS Book. Springer (1997)
29. Farin, G.: Curves and Surfaces for Computer Aided Geometric Design, a Practical Guide.
Academic Press (1988)
30. Price, K.V., Storn, R.M., Lampinen, J.A.: Differential Evolution, a Practical Approach to
Global Optimization. Springer-Verlag, Berlin Heidelberg (2005)
31. Nikolos, I.K., Tsourveloudis, N., Valavanis, K.: Evolutionary Algorithm Based Path
Planning for Multiple UAV Cooperation. In: Valavanis, K. (ed.), Advances in Unmanned
Aerial Vehicles, State of the Art and the Road to Autonomy, pp. 309–340. Springer (2007)
Tracking of Manoeuvring Visual Targets
1 Introduction
During the last few years, the use of visual servoing and visual tracking has been more
and more common due to the increasing power of algorithms and computers.
Visual servoing and visual tracking are techniques that can be used to control a
mechanism according to visual information. This visual information is available with a time delay; therefore, predictive algorithms are widely used (notice that prediction of the object's motion can be used to obtain smooth movements without discontinuities).
The Kalman filter [1] has become a standard method to provide predictions and
solve the delay problems (considered the predominant problem of visual servoing) in
visual based control systems [2], [3] and [4].
The time delay is one of the biggest problems in this type of system. For practically all processing architectures, the vision system requires a minimum delay of two cycles, but for on-the-fly processing, only one cycle of the control loop is needed [5].
The authors of [6] demonstrate that steady-state Kalman filters (αβ and αβγ filters) perform better than the KF in the presence of abrupt changes in the trajectory, but not as well as the KF for smooth movements. Some research works on motion esti-
mation are presented in [7] and [8]. Further, some motion understanding and trajectory
planning based on the Frenet-Serret formula are described in [9], [10] and [11]. Using
the knowledge of the motion and the structure, identification of the target dynamics may
be accomplished.
To solve delay problems, taking into account these considerations, we propose a new
prediction algorithm. This new filter can be called Fuzzy predictor. This filter minimizes
the tracking error and works better than the classic KF because it decides which of the constituent filters (αβ_slow/αβ_fast [5], αβγ, Kv, Ka and Kj) must be employed. The transition between them is smooth, avoiding discontinuities.
These five filters are used in combination for the following reasons. The Kalman filter is considered one of the reference algorithms for position prediction, but the right model must be chosen depending on the object's dynamics (velocity, acceleration or jerk). When the object is outside the image plane, the best prediction is given by steady-state filters (αβ/αβγ, depending on the object's dynamics: velocity or acceleration). Obviously, the Fuzzy predictor could be improved by considering more filters and more behaviour cases, but the computational cost of additional considerations can be a problem in real-time execution. The authors consider these five filters the best compromise between prediction quality and computational cost, and this is the reason for combining them to obtain the Fuzzy predictor.
This work is focused on the new Fuzzy prediction filter and is structured as follows: in section 2 we present the considered dynamics, a jerk model with adaptable parameters obtained by KFs [12], [13] and [14]. In section 3, we present the block diagram for the visual servoing task; this block diagram is widely used in several works like [2] or [5]. Section 4 presents the basic fuzzy idea applied in our case (see [15]), while the main work is concentrated in one of the blocks described in section 3: the Fuzzy predictor, described in section 5.
In section 6, we present the results with simulated data. These results show that the Fuzzy predictor can be used to improve high speed visual servoing tasks. This section is organized in two parts: in the first one (Subsection 6.1), the behaviour of the Fuzzy predictor is analyzed, and in the second one (Subsection 6.2) its results are compared with those achieved by Chroust and Vincze [5] and with the CPA algorithm [16] (an algorithm used for aeronautic/aerospace applications). Conclusions and future work are presented in section 7.
The object’s movement is not known (a priori) in a general visual servoing scheme.
Therefore, it is treated as an stochastic disturbance justifying the use of a KF as a
stochastic observer. The KF algorithm presented by Kalman [1] starts with the system
description given by 1 and 2.
xk+1 = F · xk + G · ξk (1)
yk = C · xk + N · ηk (2)
where xk ∈ R^{n×1} is the state vector and yk ∈ R^{m×1} is the output vector. The matrix F ∈ R^{n×n} is the so-called system matrix, which describes the propagation of the state from k to k+1, and C ∈ R^{m×n} describes the way in which the measurement is generated out of the state xk. In our case of visual servoing m is 1 (because only the position is measured) and n = 4. The matrix G ∈ R^{n×1} distributes the system noise ξk to the states and ηk is the measurement noise. In the KF the noise sequences ηk and ξk are assumed
to be gaussian, white and uncorrelated. The covariance matrices of ξk and ηk are Q and
R respectively (these expressions consider 1D movement). A basic explanation for the
assumed gaussian white noise sequences is given in [17].
In the general case of tracking, the usual model considered is a constant acceleration
model [5], but in our case we consider a constant jerk model, described by the matrices F and C:

F = [ 1  T  T²/2  T³/6
      0  1  T     T²/2
      0  0  1     T
      0  0  0     1 ],        C = [ 1  0  0  0 ]

where T is the sampling time. This model is called a constant jerk model because it assumes that the jerk (d³x(t)/dt³) is constant between two sampling instants.
The F and C matrices are obtained from expressions (3) to (7):

(a − ai)/(t − ti) = Δa/Δt = J0    (3)

x(t) = xi + vi (t − ti) + (1/2) ai (t − ti)² + (1/6) J0 (t − ti)³    (4)

v(t) = vi + ai (t − ti) + (1/2) J0 (t − ti)²    (5)

a(t) = ai + J0 (t − ti)    (6)

J(t) = J0    (7)
where, x is the position, v is the velocity, a is the acceleration and J is the jerk. So the
relation between them is:
x(t) = f(t);   ẋ(t) = v(t);   ẍ(t) = a(t);   d³x(t)/dt³ = J(t)
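As a small illustration, the prediction step implied by this model simply propagates the state [x, v, a, J] with the matrix F given above; a minimal C++ sketch (illustrative only) follows.

#include <array>

// One prediction step of the constant jerk model: the state [x, v, a, J] is
// propagated over the sampling time T according to the matrix F above.
std::array<double, 4> predictJerkState(const std::array<double, 4>& s, double T)
{
    double x = s[0], v = s[1], a = s[2], J = s[3];
    return { x + v * T + a * T * T / 2.0 + J * T * T * T / 6.0,
             v + a * T + J * T * T / 2.0,
             a + J * T,
             J };                         // the measurement y = C x returns only the position x
}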
The main objective of visual servoing is to bring the target to a position of the image plane and to keep it there for any object movement. In Fig. 1 we can see the visual control loop presented by Corke in [2]. The block diagram can be used for a moving camera and for a fixed camera controlling the motion of a robot. Corke uses a KF to incorporate a feed-forward structure. We incorporate the Fuzzy prediction algorithm in the same structure (see Fig. 2), but reorder the blocks for easier comprehension.
V(z) in Fig. 2 represents the camera behaviour, which is modeled as a simple delay: V(z) = kv · z⁻² (see [2], [18], [19], [20] and [21]). C(z) is the controller (a simple proportional controller is implemented in the experiments presented in this work). R(z) is the robot (for this work R(z) = z/(z − 1)) and the Prediction filter generates the feedforward signal by predicting the position of the target. The variable to be minimized is Δx (generated by the vision system), which represents the deviation of the target with respect to the desired position (error). The controller calculates a velocity signal ẋd which moves
the robot in the right direction to decrease the error. Using this approach, no path planning is needed (the elimination of this path planning is important because it decreases the computational load [2]).

Fig. 1. Operation diagram presented by Corke using KF for the E(z) block.
The transfer function of the robot describes the behaviour from the velocity input to
the position reached by the camera, which includes a transformation in the image plane.
Therefore, the transfer function considered is [5]:
R(z) = z / (z − 1)
The Fuzzy predictor block is explained in the next sections (sections 4 and 5).
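To illustrate the structure of this loop, the following minimal discrete-time simulation uses the same building blocks (camera V(z) = kv·z⁻², robot R(z) = z/(z − 1), proportional controller) but leaves the feedforward prediction branch out; the gains and the target motion are illustrative values, not those used later in the paper.

#include <cstdio>
#include <vector>

int main()
{
    const int N = 100;
    const double kv = 1.0, kp = 0.3;      // camera gain and proportional controller gain (illustrative)
    double robot = 0.0;                   // position reached by the robot/camera
    std::vector<double> err(N + 2, 0.0);  // buffer implementing the two-cycle vision delay
    for (int k = 0; k < N; ++k) {
        double target = 0.05 * k;                 // illustrative ramp-like target motion
        err[k + 2] = kv * (target - robot);       // V(z) = kv*z^-2: measurement seen two cycles later
        double dx = err[k];                       // delayed error available to the controller
        double xdot_d = kp * dx;                  // C(z): velocity command
        robot += xdot_d;                          // R(z) = z/(z-1): integrate the velocity command
        std::printf("%3d  target=%6.2f  robot=%6.2f  delayed error=%6.3f\n",
                    k, target, robot, dx);
    }
    return 0;
}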
The main difference between Mamdani and Sugeno is that the Sugeno output membership
functions are either linear or constant (for more information see [23]).
For Sugeno regulators, we have a linear dynamic system as the output function so that
the ith rule has the form:
If z˜1 is Ãj1 and z˜2 is Ãk2 and, ..., and z˜p is Ãlp Then ẋi (t) = Ui x(t) + Vi u(t)
where x(t) = [x1 (t), x2 (t), ..., xn (t)]T is the state vector,
u(t) = [u1 (t), u2 (t), ..., um (t)]T , Ui and Vi are the state and input matrices and z(t) =
[z1 (t), z2 (t), ..., zp (t)]T is the input to the fuzzy system, so:
ẋ(t) = [ Σ_{i=1}^{R} (Ui x(t) + Vi u(t)) μi(z(t)) ] / [ Σ_{i=1}^{R} μi(z(t)) ]

or

ẋ(t) = ( Σ_{i=1}^{R} Ui ξi(z(t)) ) x(t) + ( Σ_{i=1}^{R} Vi ξi(z(t)) ) u(t)

where

ξ^T = [ξ1, ..., ξR] = ( 1 / Σ_{i=1}^{R} μi ) [μ1, ..., μR]
Our work is based on this idea and these expressions (see [23] for more details). We have mixed Mamdani's and Sugeno's ideas, because we have implemented an algorithm similar to Sugeno's but not restricted to linear systems. We obtain a normalized weighting of several nonlinear recursive expressions. The system works as shown in Fig. 3 (see section 5).
We have developed a new filter that mixes different types of Kalman filters depending
on the conditions of the object’s movement. The main advantage of this new algorithm
is the non-abrupt change of the filter’s output.
Consider the nonlinear dynamic system
ẋ = f1 (x, u); y = g1 (x, u)
as each one of the filters used. The application of the fuzzy regulator in our case produces the following state-space expression:

ẋ = Σ_{i=1}^{N} fi(x, u) · ωi(x, u)

where

ωi(x, u) = μi(x, u) / Σ_{j=1}^{N} μj(x, u)
The final system obtained has the same structure as the filters used.
Figure 3 shows the Fuzzy prediction block diagram (in this work a Fuzzy predictor using different types of Kalman filters is presented, so in this case it can be named Fuzzy predictor using Kalman Filters, or FKF, although the Fuzzy predictor is intended as a general idea). In this figure, we can see that the general input is the
position sequence of the target (xk ). Using this information, we estimate the velocity,
acceleration and jerk of the target in three separate KFs (Nomura and Naito present the
advantages of this hybrid technique in [12]). This information is used as ’Input MF’
to obtain F1 (Ins), F2 (v), F3 (a) and F4 (j). These MF inputs are the fuzzy membership
functions defined in Fig. 4. The biggest KF block (rounded) shown in this figure is a combination of all the algorithms used in the fuzzy filter (αβ_slow and αβ_fast [5], αβγ, Kv, Ka and Kj). This block obtains the outputs of all the specified filters. The 'Output MF'
calculates the final output using the Ri rules.
Now, we present the rules (Ri ) considered for the fuzzy filter:
R1 : IF object IS inside AND velocity IS low AND acceleration IS low AND
jerk IS low THEN Fuzzy-prediction=Kv
These rules have been obtained empirically, based on the authors' experience using the Kalman filter in different applications.
Notice that rule R10 (when the jerk is high) shows that the best filter considered is Kj, and that it does not depend on the object's position (inside or outside) or on the velocity/acceleration value (low, medium or high).
We have used a product inference engine, singleton fuzzifier and centre average de-
fuzzifier. Figure 4 presents the fuzzy sets definition where (umax , vmax ) is the image
size, μvel = μacc = 2m/s, σvel = σacc = 0.5, cvel = cacc = 1, dvel = dacc = 3,
ivel = iacc = 1 and jvel = jacc = 1 (these values have been empirically obtained).
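A minimal sketch of the centre-average combination performed by the 'Output MF' block is given below; the rule firing degrees mu[i] and the per-filter predictions pred[i] are assumed to be supplied by the blocks of Fig. 3, and the names are illustrative.

#include <vector>

// Centre-average defuzzification: each constituent filter i contributes its
// predicted position pred[i], weighted by the firing degree mu[i] of the
// corresponding rule.
double fuzzyPrediction(const std::vector<double>& pred, const std::vector<double>& mu)
{
    double num = 0.0, den = 0.0;
    for (std::size_t i = 0; i < pred.size(); ++i) {
        num += mu[i] * pred[i];
        den += mu[i];
    }
    return den > 0.0 ? num / den : pred.back();   // fall back to the last filter if no rule fires
}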
6 Results
This section is composed of two different parts: first (section 6.1), we analyze the prediction algorithm originally presented in this work (the Fuzzy prediction block diagram shown in Fig. 3) and second (section 6.2), some simulations of the visual servoing scheme (see Fig. 2) are carried out including the Fuzzy prediction algorithm.
In Fig. 5, we show the effectiveness of our algorithm’s prediction compared with the
classical KF methods. In this figure, we can see positions P_k^r (actual object position), P_{k−1}^r (object position at k − 1) and P_{k−2}^r (object position at k − 2). The next real position of the object will be P_{k+1}^r, and the points P_{k+1}^1 to P_{k+1}^6 represent the prediction obtained by each single filter. The best prediction is given by the Fuzzy filter presented as a novelty in this work. This experiment is done for a parabolic trajectory of an object affected by the gravity acceleration (see Fig. 5 and Fig. 6).
[Fig. 5 (left): the real positions P_{k−2}^r, P_{k−1}^r, P_k^r, P_{k+1}^r and the predicted points P_{k+1}^1 to P_{k+1}^6; position (pixels) versus t (milliseconds). Fig. 6 (right): real trajectory and the trajectories predicted by the αβ, αβγ, Kv, Ka, Kj and FKF filters.]
We have carried out many experiments for different movements of the object and we have concluded that our Fuzzy prediction algorithm works better than the other filters compared (αβ, αβγ, Kv, Ka, Kj and CPA; see section 6.2). Figure 6 shows the real trajectory and the trajectory predicted by each filter. For this experiment, we have used the first four real positions of the object as input for all filters, and they predict the trajectory using only this information. As we can see in this figure, the best prediction is again that of the Fuzzy predictor.
To test the control scheme presented in Fig. 2, we have used the object motion shown in Fig. 7 (top). This target motion represents a ramp-like motion for 1 < t < 4 seconds and a sinusoidal motion for t > 6 seconds. The motion model is corrupted with a noise of σ = 1 pixel. This motion is used by Stefan Chroust and Markus Vincze in [5] to analyze the switching Kalman filter (SKF).
For this experiment, we compare the proposed filter (Fuzzy) with a well known filter, the Circular Prediction Algorithm (CPA) [16]. In Fig. 7 (bottom), we can see the results of the Fuzzy predictor and CPA algorithms. For changes of motion behaviour, the Fuzzy predictor produces less error than CPA. For the change at t=1, the error of the Fuzzy predictor is [+0.008, -0] and that of the CPA is [+0.015, -0.09]. For the change at t=4, the Fuzzy predictor error is [+0, -0.0072] and the CPA error is [+0.09, -0.015]. For the change at t=6, the Fuzzy predictor error is [+0.022, -0] and the CPA error is [+0.122, -0.76]. For the region 6 < t < 9 (sinusoidal movement between 2.5 m and 0.5 m) both algorithms work quite similarly: Fuzzy predictor error = [±0.005] and CPA error = [±0.0076]. The CPA filter works well because it is designed for movements similar to a sine shape, but if we compare these results with the SKF filter proposed in [5], SKF works better (due to the AKF (Adaptive Kalman Filter) effect). Therefore, the proposed Fuzzy predictor works better than CPA for all cases analyzed; comparing the Fuzzy predictor with SKF, the Fuzzy predictor is better for t=1, t=4 and t=6 but not for 6 < t < 9 (sinusoidal movement).
Figure 9 shows the zoom region 0 < t < 2 and −0.02 < Δxp < 0.02 of the same
experiment. In this figure, we can see the fast response of the Fuzzy predictor.
[Plots: Fig. 7: target motion (m) and the prediction error Δxp (m) of the FKF and CPA algorithms versus time (seconds); Fig. 9: zoom of the FKF prediction error Δxp (m) for 0 < t < 2.]
Experimental results are obtained for this work using the following setup: Pulnix GE
series high speed camera (200 frames per second), Intel PRO/1000 PT Server Adapter
card, 3.06GHz Intel processor PC computer, Windows XP Professional O.S. and OpenCV
blob detection library.
For this configuration, the bounce of a ball on the ground is processed to obtain the data shown in Fig. 10. The results of this experiment are presented in Table 1. In this table, we can see the dispersion of several filters. The dispersion of the Fuzzy predictor is lower than that of αβ, αβγ, Kv, Ka and Kj, although the Fuzzy predictor is a combination of them. This table contains data from this particular experiment (the bounce of a ball on the ground). For this experiment, the position of the ball is fed to the filters to evaluate their behaviour. The filter proposed (Fuzzy predictor) is the best of those analyzed.
Table 1. Numerical comparison of the dispersion values of all implemented filters (bounce of a ball experiment).
In Fig. 11 we can see some frames of the experiment ’bounce of a ball on the
ground’. For each frame the center of gravity of the tennis ball is obtained.
References
1. Kalman, R.: A New Approach to Linear Filtering and Prediction Problems. Transactions of the ASME, Journal of Basic Engineering 82 (1960) 35–45
2. Corke, P.: Visual Control of Robots: High Performance Visual Servoing. 1996 edn. Research Studies Press, Wiley, New York (1998)
3. Dickmanns, E., Graefe, V.: Dynamic monocular machine vision. In: Applications of dynamic monocular machine vision, Machine Vision and Applications (1988)
4. Wilson, W., Williams Hulls, C., Bell, G.: Relative end-effector control using cartesian po-
sition based visual servoing. In: IEEE Transactions on Robotics and Automation, IEEE
Computer Society (1996)
5. Stefan, C., Markus, V.: Improvement of the prediction quality for visual servoing with a
switching kalman filter. I. J. Robotic Res. 22 (2003) 905–922
6. Chroust, S., Zimmer, E., Vincze, M.: Pros and cons of control methods of visual servoing. In:
In Proceedings of the 10th International Workshop on Robotics in the Alpe-Adria-Danube
Region, IEEE Computer Society (2001)
7. Soatto, S., Frezza, R., Perona, P.: Motion estimation via dynamic vision. In: IEEE Transac-
tions on Automatic Control, IEEE Computer Society (1997)
8. Duric, Z., Fayman, J., Rivlin, E.: Function from motion. In: Transactions on Pattern Analysis
and Machine Intelligence, IEEE Computer Society (1996)
9. Angeles, J., Rojas, A., Lopez-Cajun, C.: Trajectory planning in robotics continuous-path
applications. In: Journal of Robotics and Automation, IEEE Computer Society (1988)
10. Duric, Z., Rivlin, E., Rosenfeld, A.: Understanding the motions of tools and vehicles. In:
Proceedings of the Sixth International Conference on Computer Vision, IEEE Computer
Society (1998)
11. Duric, Z., Rivlin, E., Davis, L.: Egomotion analysis based on the frenet-serret motion model.
In: Proceedings of the 4th International Conference on Computer Vision, IEEE Computer
Society (1993)
12. Nomura, H., Naito, T.: Integrated visual servoing system to grasp industrial parts moving on a conveyor by controlling a 6DOF arm. In: International Conference on Systems, Man, and Cybernetics, IEEE Computer Society (2000)
13. Li, X., Jilkov, V.: A survey of maneuvering target tracking: Dynamic models. In: Signal and
Data Processing of Small Targets, The International Society for Optical Engineering (2000)
14. Mehrotra, K., Mahapatra, P.: A jerk model for tracking highly maneuvering targets. In:
Transactions on Aerospace and Electronic Systems, IEEE Computer Society (1997)
15. Wang, L.: Course in Fuzzy Systems and Control Theory. Pearson US Imports & PHIPEs.
Pearson Higher Education (1997)
16. Tenne, D., Singh, T.: Circular prediction algorithms-hybrid filters. In: American Control
Conference, IEEE Computer Society (2002)
17. Maybeck, P.: Stochastic Models, Estimation and Control. Academic Press, New York (1982)
18. Hutchinson, S., Hager, G., Corke, P.: Visual servoing: a tutorial. In: Transactions on Robotics
and Automation, IEEE Computer Society (1996)
19. Markus, V., Gregory, D.: Robust Vision for Vision-Based Control of Motion. SPIE Press /
IEEE Press, Bellingham, Washington (2000)
20. Vincze, M., Weiman, C.: On optimizing windowsize for visual servoing. In: International
Conference on Robotics and Automation, IEEE Computer Society (1997)
21. Vincze, M.: Real-time vision, tracking and control dynamics of visual servoing. In: International Conference on Robotics and Automation, IEEE Computer Society (2000)
22. Sugeno, M.: Industrial applications of fuzzy control. Elsevier Science Publications Company
(1985)
23. Passino, K., Yurkovich, S.: Fuzzy Control. Addison-Wesley, Ohio, USA (1998)
Motion Control of an Omnidirectional Mobile Robot
1 Introduction
Recently, omnidirectional wheeled robots have received more attention in mobile robots
applications, because they have full mobility in the plane, which means that they can
move at each instant in any direction without any reorientation [1]. Unlike nonholo-
nomic robots, such as car-like robots, having to rotate before implementing any desired
translation velocity, omnidirecitonal robots have higher maneuverability and are widely
used in dynamic environments, for example, in the middle-size league of the annual
RoboCup competition.
Most motion control methods of mobile robots are based on dynamic models [2–5]
or kinematic models [6–8] of robots. A dynamic model directly describes the relation-
ship between the forces exerted by the wheels and the robot movement, with the applied
voltage of each wheel as the input and the robot movement in terms of linear and an-
gular accelerations as the output. But the dynamic variations caused by changes in the robot's moment of inertia and perturbations from the mechanical components [9] make the controller design more complex. Under the assumptions that no wheel slippage occurs, that the sensors have high accuracy and that the ground is sufficiently planar, kinematic models are widely used in designing robot behaviors because of their simpler model structures. As
the inputs of kinematic models are the robot wheel velocities, and the outputs are the robot linear and angular velocities, the actuator dynamics of the robot are assumed to be fast enough to be ignored, which means that the desired wheel velocities can be achieved immediately. However, the actuator dynamics limit and even degrade the robot performance in real situations.
Another important practical issue of robot control is actuator saturation. Because
the commanding motor speeds of the robot wheels are bounded by the saturation limits,
the actuator saturation can affect the robot performance, even destroy the stability of
the controlled robot systems [10, 11].
This paper presents a motion control method for an omnidirectional robot, based
on the inverse input-output linearization of the kinematic model. It takes into account
not only the identified actuator dynamics but also the actuator saturation in designing a
controller, and guarantees the stability of the closed-loop control system.
The remainder of this paper introduces the kinematic model of an omnidirectional
middle-size Robocup robot in section 2; Path following and orientation tracking prob-
lems are solved based on the inverse input-output linearized kinematic model in section
3, where the actuator saturation is also analyzed; section 4 presents the identification of
actuator dynamics and their influence on the control performance. Finally, the experimental results and conclusions are discussed in sections 5 and 6, respectively.
Besides the fixed world coordinate system [Xw , Yw ], a mobile robot fixed frame
[Xm , Ym ] is defined, which is parallel to the floor and whose origin is located at R. θ
denotes the robot orientation, which is the direction angle of the axis Xm in the world
coordinate system. α and ϕ denote the direction of the robot translation velocity vR
observed in the world and robot coordinate system, respectively. The kinematic model
with respect to the robot coordinate system is given by :
v = [  √3/3     −√3/3      0
       1/3       1/3      −2/3
       1/(3L)    1/(3L)    1/(3L) ] q̇,    (1)
where v = [ẋ_R^m  ẏ_R^m  ω]^T is the vector of robot velocities observed in the robot coordinate system; ẋ_R^m and ẏ_R^m are the robot translation velocities and ω is the robot rotation velocity.
q̇ is the vector of wheel velocities [q̇1 q̇2 q̇3 ]T , and q̇i (i = 1, 2, 3) is the i-th wheel’s
velocity, which is equal to the wheel’s radius multiplied by the wheel’s angular velocity.
Introducing the transformation matrix from the robot coordinate system to the world
coordinate system as
^{w}R_m = [ cos θ   −sin θ
            sin θ    cos θ ],    (2)
the kinematic model with respect to the world coordinate system is deduced as:
ẋ = [  (2/3) cos(θ + δ)    −(2/3) cos(θ − δ)     (2/3) sin θ
       (2/3) sin(θ + δ)    −(2/3) sin(θ − δ)    −(2/3) cos θ
       1/(3L)               1/(3L)               1/(3L)       ] q̇,    (3)
where ẋ = [ẋR ẏR θ̇]T is the vector of robot velocities with respect to the world co-
ordinate system; ẋR and ẏR are the robot translation velocities; θ̇ is the robot rotation
velocity; δ refers to the wheel’s orientation in the robot coordinate system and is equal
to 30 degrees.
It is important to notice that the transformation matrix in the kinematic models is
full rank, which denotes that the translation and rotation of the robot are decoupled, and
guarantees the separate control of these two movements.
For the high level control laws without considering the wheel velocities, the kine-
matic model
ẋ = Gv (4)
is used in our control method, where the transformation matrix G is equal to [^{w}R_m 0; 0 1]. Because G is full rank, the decoupling of the translation and rotation movements is also kept.
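For illustration, the inverse of the kinematic model (1), i.e. the mapping from the desired robot velocities in the robot frame to the three wheel velocities, can be written as in the sketch below; the matrix used there is the algebraic inverse of the 3×3 matrix in Eq. (1), computed here for illustration rather than quoted from the paper.

#include <array>
#include <cmath>

// Wheel velocities from desired robot-frame velocities (vx, vy, omega), using the
// inverse of the matrix in Eq. (1); L is the distance from the robot centre to a wheel.
std::array<double, 3> wheelVelocities(double vx, double vy, double omega, double L)
{
    const double s = std::sqrt(3.0) / 2.0;
    return {  s * vx + 0.5 * vy + L * omega,
             -s * vx + 0.5 * vy + L * omega,
                      -1.0 * vy + L * omega };
}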
This linear system shown in Fig. 2 is completely decoupled and allows controlling the robot's translation and rotation separately. When a controller K
is designed based on this simple linear system, the controller of the original system is
generated as CK. The overall control loop, which consists of the nonlinear system, the
compensator and the controller, is shown in Fig. 3,
where x denotes the robot state vector [xR yR θ]T and xd is the desired state vector;
xR and yR are robot position observed in the world coordinate system.
Based on this input-output linearized system, path following and orientation tracking problems are analyzed with respect to the robot translation and rotation control in the following subsections. The influence of actuator saturation is also taken into account to preserve the decoupling between the translation and rotation movements.
Based on the above definitions, the path following problem is to find proper control
values of the robot translation velocity vR and angular velocity α̇ such that the deviation
distance xn and angular error θ̃R = α − θP tend to zero.
To solve this problem, a Lyapunov candidate function
$$V = \tfrac{1}{2} K_d x_n^2 + \tfrac{1}{2} K_\theta \tilde{\theta}_R^2 \qquad (5)$$
can be considered, where K_d and K_θ are positive constants. The time derivative of V results in
$$\dot{V} = K_d x_n \dot{x}_n + K_\theta \tilde{\theta}_R \dot{\tilde{\theta}}_R. \qquad (6)$$
Mojaev [12] presents a simple control law based on the deviation xn , where R
is controlled to move along an exponential curve and to converge to the axis xt . The
and its gain characteristics illustrated in Fig. 5, we can take the saturation function as
a dynamic gain block ka , which has maximum value one and converges to zero when
the input saturates. Then the closed-loop system of controlling the robot orientation is
as shown in Fig. 6, in which a PD controller is used to control the robot orientation
converging to the ideal θd ,
ω = k1 (eθ + k2 ėθ ), (24)
where eθ = θd − θ, and k1 and k2 are the proportional and derivative gains, respectively. It can be obtained that the closed loop has only one pole, −ka k1 /(1 + ka k1 k2 ), and one zero, −1/k2. Therefore, when k2 and k1 are positive, the stability of the closed-loop system can be guaranteed no matter how much ka decreases.
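For completeness, a short derivation of this pole and zero, under the assumption that the plant from ω to θ is a pure integrator and the saturation block is replaced by the static gain ka, reads:
$$\theta = \frac{1}{s}\,\omega, \qquad \omega = k_a k_1 (1 + k_2 s)(\theta_d - \theta) \;\Longrightarrow\; \frac{\theta}{\theta_d} = \frac{k_a k_1 (1 + k_2 s)}{(1 + k_a k_1 k_2)\,s + k_a k_1},$$
so the single closed-loop pole is $s = -k_a k_1/(1 + k_a k_1 k_2)$ and the zero is $s = -1/k_2$, both in the left-half plane for positive gains.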
4 Actuator Dynamics
The results in the last section are only practical when we assume that the low level actuator dynamics are faster than the kinematics, or that the delay of the actuator dynamics can be ignored. Therefore, it is necessary to analyze the actuator dynamics and take them into account when designing a controller. In the following subsections, the actuator dynamics are identified based on the observed input-output data, and their influence on the robot motion control is presented.
The system identification problem is to estimate a model based on the observed input-output data such that a performance criterion is minimized. Because the transformation matrix in the low level dynamics model (1) is full rank, the outputs ẋ_R^m, ẏ_R^m and ω are mutually independent, so we identify the actuator models for these three values separately. The inputs of the actuator models are the required velocity values (ẋ_Rc^m, ẏ_Rc^m and ω_c), and the outputs are the corresponding measured velocities.
[Figure: model outputs vs. measured outputs over the data number axis.]
To coincide with the robot’s continuous model, the identified models are trans-
formed from discrete ones into continuous ones using the ’zoh’(zero-order hold) method,
$$\dot{x}_R^m = \frac{8.7948\,(s + 58.47)}{(s + 73.66)(s + 6.897)}\;\dot{x}_{Rc}^m, \qquad (31)$$
Fig. 8. Identified model for ẏ_R^m (model outputs vs. measured outputs over the data number axis).
$$\omega = \frac{1.667\,(s + 45.37)}{s^2 + 6.759\,s + 76.11}\;\omega_c. \qquad (33)$$
respect to the robot coordinate system. Because the poles of the actuator dynamics (31) and (32) have negative real parts, these two systems are stable. That means there exists a finite short time t∗, after which the real velocities ẋ_R^m and ẏ_R^m converge to the desired ones ẋ_Rc^m and ẏ_Rc^m, and the inputs u1 and u2 begin to take effect. Therefore, the
above path following law can also guarantee that the robot approaches the reference path, although during t∗ the deviation distance x_n and the angular error θ̃_R may increase.
In the orientation tracking control, as the dynamic system (33) adds another two poles to the closed-loop system, shown in Fig. 11, the controller parameters determined in the previous section may cause the system to lose stability.
By setting the positions of the poles and zeros of the closed-loop system with the root locus technique, we obtain that the conditions k1 > 0 and k2 > 0.0515 guarantee the closed-loop system's stability, even when the actuators saturate. Fig. 12 shows the root locus of the open-loop system in the critical situation with k2 = 0.0515, where all the poles of the closed-loop system are located in the left-half plane for any positive value of ka k1. Otherwise, when k2 is less than 0.0515, the root locus may cross the imaginary axis, and the poles of the closed-loop system may move to the right-half plane when ka goes to zero.
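As a rough numerical cross-check of this condition (a sketch only, not the authors' code: the loop is assumed to consist of the PD law (24), the saturation gain ka, the actuator model (33) and an integrator from ω to θ; the value of k1 is hypothetical), the closed-loop poles follow from the characteristic polynomial s(s² + 6.759 s + 76.11) + ka k1 · 1.667 (1 + k2 s)(s + 45.37) = 0:

```python
import numpy as np

def closed_loop_poles(k1, k2, ka):
    """Roots of s*(s^2 + 6.759 s + 76.11)
       + ka*k1*1.667*(1 + k2*s)*(s + 45.37) = 0 (assumed loop structure)."""
    p_plant = np.array([1.0, 6.759, 76.11, 0.0])          # s^3 + 6.759 s^2 + 76.11 s
    g = ka * k1 * 1.667
    p_ctrl = g * np.array([0.0, k2, 1.0 + 45.37 * k2, 45.37])
    return np.roots(p_plant + p_ctrl)

# sweep the saturation gain at the critical k2 = 0.0515 (k1 value is hypothetical)
for ka in (1.0, 0.5, 0.1, 0.01):
    poles = closed_loop_poles(k1=10.0, k2=0.0515, ka=ka)
    print(f"ka={ka}: max Re(pole) = {poles.real.max():.4f}")
```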
5 Experiment
The control algorithm discussed above has been tested in our robot laboratory, which contains a half-field of the RoboCup middle size league. The omnidirectional robot is shown in Fig. 13.
An AVT Marlin F-046C color camera with a resolution of 780 × 580 is assembled pointing up towards a hyperbolic mirror, which is mounted on top of the omnidirectional robot, such that a complete view of the robot's surroundings can be captured.
6 Conclusions
In this paper a new motion control method for an omnidirectional robot is presented.
This approach is based on the inverse input-output linearized robot kinematic model,
Fig. 14. Reference path and robot path. Fig. 15. Reference path and robot path.
which completely decouples the robot translation and rotation. The robot translation is steered to follow a reference path, and the robot rotation is controlled to track the desired orientation. Because the actuator dynamics and saturation can greatly affect the robot performance, they are taken into account when designing the controller. With Lyapunov stability theory, the global stability of the path following control law has been proven. The root locus technique is used to analyze and choose suitable parameters of the PD controller, such that the robot orientation converges to the desired one even when the wheel velocities saturate.
In real-world experiments, the robot was controlled to follow an eight-shaped curve with a constant translation velocity of 1 m/s, and to track sharply changing orientations. The results show the effectiveness of the proposed control method in the cases of both actuator saturation and non-saturation.
Fig. 20. Real wheel velocities. Fig. 21. Real wheel velocities.
References
1. Campion, G., Bastin, G., D’Andréa-Novel, B.: Structural properties and classification of
kinematic and dynamic models of wheeled mobile robots. In: IEEE Transactions on Robotics
and Automation. Volume 12. (1996)
2. Watanabe, K.: Control of an omnidirectional mobile robot. In: KES'98, 2nd International
Conference on Knowledge-Based Intelligent Electronic Systems. (1998)
3. Liu, Y., Wu, X., Zhu, J.J., Lew, J.: Omni-directional mobile robot controller design by
trajectory linearization. In: ACC’03, Proceeding of the 2003 American Control Conference.
(2003)
4. Purwin, O., Andrea, R.D.: Trajectory generation and control for four wheeled omnidirec-
tional vehicles. In: Robotics and Autonomous Systems. Volume 54(1). (2006)
5. Tsai, C.C., Huang, H.C., Wang, T.S., Chen, C.M.: System design, trajectory planning and
control of an omnidirectional mobile robot. In: 2006 CACS Automatic Control Conference.
(2006)
6. Muir, P.F., Neuman, C.P.: Kinematic modeling for feedback control of an omnidirectional
wheeled mobile robot. In: Autonomous Robot Vehicles, Springer-Verlag (1990)
7. Terashima, K., Miyoshi, T., Urbano, J., Kitagawa, H.: Frequency shape control of omni-
directional wheelchair to increase user’s comfort. In: ICRA’04, Proceedings of the 2004
IEEE International Conference on Robotics and Automation. (2004)
8. Rojas, R., Förster, A.G.: Holonomic Control of a Robot with an Omni-directional Drive.
BöttcherIT Verlag, Bremen (2006)
9. Scolari Conceição, A., j. Costa, P., Moreira, A.: Control and model identification of a mobile robot's motors based in least squares and instrumental variable methods. In: MMAR'05, 11th International Conference on Methods and Models in Automation and Robotics. (2005)
10. Indiveri, G., Paulus, J., Plöger, P.G.: Motion control of swedish wheeled mobile robots in the
presence of actuator saturation. In: 10th annual RoboCup International Symposium. (2006)
11. Scolari Conceição, A., Moreira, A., j. Costa, P.: Trajectory tracking for omni-directional mo-
bile robots based on restrictions of the motor’s velocities. In: SYROCO’06, 8th International
IFAC Symposium on Robot Control. (2006)
12. Mojaev, A., Zell, A.: Tracking control and adaptive local navigation for nonholonomic mo-
bile robot. In: Proceedings of the IAS-8 conference. (2004)
13. Heinemann, P., Rueckstiess, T., Zell, A.: Fast and accurate environment modelling using omnidirectional vision. In: Dynamic Perception, Infix (2004)
A Strategy for Exploration with a Multi-robot System
Abstract. The present paper develops a novel strategy for the exploration of an
unknown environment with a multi-robot system. Communication between the
robots is restricted to line-of-sight and to a maximum inter-robot distance. The
algorithm we propose is related to methods used for complete coverage of an area,
where all free space is physically covered. In the present paper it is required that
the entire free space is covered by the sensors of the robots, enabling us to scan
more space in less time, compared to complete coverage algorithms. The area
to be scanned contains disjoint convex obstacles of unknown size and shape. The
geometry of the robot group has a zigzag shape, which is stretched or compressed
to adapt to the environment. The robot group is allowed to split and rejoin when
passing obstacles. A direct application of the algorithm is mine field clearance.
1 Introduction
The research domain of multi-agent robot systems can be divided into subdomains ac-
cording to the task given to the robot group [1]. At present well-studied subdomains are
motion-planning (also called path-planning), formation-forming, region-sweeping, and
combinations of the foregoing. The problem considered in the present paper belongs to
the discipline comprising region-sweeping. In this discipline two different robot tasks
are usually considered.
In the first task a group of robots receives the order to explore/map an unknown re-
gion. The goal is to obtain a detailed topography of the desired area. A typical approach
to tackle the above problem with multiple robots assumes unlimited communication [2]:
since exploration algorithms are already devised for a single robot it seems straightfor-
ward to divide the area to be explored into disjoint regions, each of which is assigned
to a single robot. The robots communicate to each other the area they have explored so
that no part of the free space will be explored twice unnecessarily. At no point during
the task are the robots trying to form a fixed formation. Each robot explores a different
part of the unknown region and sends its findings to a central device which combines
the data received from the robots into one global map of the area.
Closely related to the exploring/mapping task is the second task, called complete
coverage, where the robots have to move over all of the free surface in configuration
space. Typical applications are mine field clearance, lawn mowing and snow cleaning.
The coverage problem has been addressed in the literature both in a deterministic and
a probabilistic setting. In the probabilistic approach the robots are considered as if they
were fluid or gas molecules satisfying the appropriate physical laws of motion [3], [4].
Just as a gas by diffusion fills an entire space, the robots will cover all free space when
time tends to infinity. In the remainder of the paper we focus on the deterministic setting.
In this setting the robot group typically forms (partial) formations to solve the task.
Reference [5] gives a short overview of existing techniques for multi-robot coverage
problems. Different approaches to the coverage problem are found in [6], [7], [8], [9], [10] and [11].
The problem statement of the present paper does not differ that much from the
common exploration/mapping task and the complete coverage problem, but is rather a
combination of both. It is required that all of the free space is sensed by the robots, but
not necessarily physically covered. However, unlike the common exploration case, the
sensing of the area does not have as goal to map the topography of the free space and
the location of the obstacles in it. Our aim is to locate several unknown targets within
the free space. Moreover, similar to the complete coverage setting we demand a 100%
certainty that all free space has been covered by the sensors at the end of the exploration
procedure, implying that all targets have been found. Since the robots no longer have to
cover all free space physically, the novel algorithm will yield a time gain compared to
complete coverage strategies. It is assumed that the space to be explored does not have
a maze-like structure with many narrow corridors, but is an open space containing only
convex obstacles sparsely spread throughout. Our algorithm is presented in Section 2
of the paper. A short comparison between the sensor coverage algorithm presented here
and the physical coverage algorithm of [10] is given in Section 3.
A specific application we have in mind is mine field clearance using chemical va-
por microsensors [12]. Once a landmine is deployed, the environment near the mine
becomes contaminated with explosives derived from the charge contained in the mine.
The vapor microsensors are able to detect the chemical vapor signature of the explo-
sives emanating from the landmines. This implies that complete coverage algorithms
may be too restrictive with respect to the demining problem. Performing the algorithm
of the present paper, with the weaker requirement of sensor coverage, will result in a
gain of time.
The algorithm can be used in problems where a robot group has to traverse a terrain
containing sparsely spread obstacles. There is a natural trade-off between coherence of
the formation and avoidance of the obstacles. The robot group is allowed to split in
order to pass the obstacles, resulting in faster progress of the group across the terrain.
The algorithm ensures that once the obstacle is passed, the robots regroup.
2.1 Setting
Consider a population of N identical robots, with N even. Each robot is equipped with
two types of sensors. One type serves as a means to detect the goal targets to be found
in the assigned area, e.g. landmines; the other type is used to detect and locate other
robots and obstacles in the neighborhood of the robot. Both sensors have a maximum
detection range st and sr respectively. It is assumed that targets which come within the
radius of the corresponding sensor area st or sr of the robot are always detected, and
that if they are located farther away than the distance st , sr they are never detected.
The robot configuration allows limited communication. First, this is expressed by the
maximum detection range sr as described above. Second, line-of-sight communication
is assumed: two robots can only sense each other if they are sufficiently close to each
other and if there is no obstacle located on the straight line connecting both robots.
Two robots are called connected to each other when they sense each other. Every
robot is assigned an index number. The initial state of the robot configuration is such
that robot i is connected to robots i − 1 and i + 1, ∀i ∈ {2, . . . , N − 1}. (Robot 1 is
only connected to robot 2 and robot N is only connected to robot N − 1.) Furthermore,
each robot keeps a constant distance d < sr with its neighbors and observes them at
preferred angles with respect to its forward direction. With notation from Figure 1 these
angles are defined as follows. For robots with indices i < N/2, the angles are
$$\alpha_i = \begin{cases} \pi/6, & i \text{ even}, \\ 5\pi/6, & i \text{ odd}, \end{cases} \qquad \beta_i = \begin{cases} -\pi/6, & i \text{ even}, \\ -5\pi/6, & i \text{ odd} \setminus \{N/2+1\}. \end{cases} \qquad (1)$$
To obtain the angles of the remaining robots, with indices i ≥ N/2, simply replace i by i + N/2 in the formulas above and define β_{N/2} := −π/2 and α_{N/2+1} := π/2. Each
robot is equipped with a compass. Together with the above defined angles, the forward
direction of each robot (the same for all robots) is imposed at the initialization of the
algorithm. The above conditions imply a robot formation with zigzag shape, as depicted
in Figure 2 for N = 6. The dashed circles have radius st and signify the areas sensed for goal targets.
[Fig. 1: robot i with its neighbors i−1 and i+1 and the angles α_i, β_i. Fig. 2: zigzag formation for N = 6, robots 1–6.]
The lower bound on st in (2) ensures that the areas sensed for goal targets of neighbor-
ing robots partially overlap, as illustrated by Figure 2.
Fig. 3. A depiction of the algorithm. The arrows indicate the constant velocity of both leader
robots. The dashed lines represent the strip boundaries.
– the position of the obstacle at an angle with its forward direction inside the interval
(−γ, γ), with γ a fixed value inside the interval (0, π/4).
The presence of the obstacle is communicated to all the robots in the group. Each robot
takes on a different role such that two subgroups will be formed. The robots with index
i ∈ S1 := {2, . . . , N/2} now follow the neighboring robot with corresponding index
i − 1. Similarly, robots with index i ∈ S2 := {N/2 + 1, . . . , N − 1} follow the
neighboring robot with index i + 1. More precisely, the robot with index i tries to reach
the following coordinates:
(xi−1 + d sin π6 , yi−1 + (−1)i d cos π6 ), if i ∈ S1 ,
(3)
(xi+1 − d sin π6 d, yi+1 + (−1)i d cos π6 ), if i ∈ S2 .
These coordinates are considered with respect to a right-handed (x, y)-frame with the
y-axis parallel to the strip boundary, and directed into the driving direction of the leader
robots. Each robot still tries to stay in the preferred formation, but in order to do so only
takes information of one neighbor into account. Moreover, the condition on the relative
position between the neighboring robots N/2 and N/2 + 1 is suspended, which will
lead to the splitting of the robot group. Notice that regardless of which robot observes the obstacle first, the group will split between robots N/2 and N/2 + 1. This choice is
motivated in Section 2.4.
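A minimal sketch of the desired-position rule (3), assuming 1-based robot indices and (x, y) positions in the frame defined above (the numerical values are hypothetical):

```python
import numpy as np

def desired_position(i, N, d, pos):
    """Eq. (3): target of robot i while the group splits around an obstacle.
    pos[j] holds the current (x, y) of robot j, with 1-based indices."""
    dx, dy = d * np.sin(np.pi / 6), d * np.cos(np.pi / 6)
    if 2 <= i <= N // 2:                       # i in S1: follow robot i - 1
        x, y = pos[i - 1]
        return (x + dx, y + (-1) ** i * dy)
    if N // 2 + 1 <= i <= N - 1:               # i in S2: follow robot i + 1
        x, y = pos[i + 1]
        return (x - dx, y + (-1) ** i * dy)
    return pos[i]                              # leaders (i = 1 and i = N) keep their course

# hypothetical use: 6 robots with d = 3.3 m, positions stored in a dict
pos = {j: (float(j), 0.0) for j in range(1, 7)}
target = desired_position(3, N=6, d=3.3, pos=pos)
```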
Consider the situation for robot i where one of the following occurs:
– The obstacle is blocking the straight path between the present position of robot i
and its desired position,
– Robot i does not detect its neighbor necessary to determine its preferred position.
If this situation occurs, the robot receives the order to follow the edge of the obstacle,
keeping the obstacle on its right if i ∈ S1 , or its left if i ∈ S2 . This behavior is called
wall-following. The robot continues to wall-follow around the obstacle until none of
the above conditions is satisfied. After that, it assumes its desired position again. If all robots have passed the obstacle, each robot is again able to reach its desired position in the preferred formation. In particular, robots N/2 and N/2 + 1 will meet again in their desired relative position. When this happens a signal is sent to all robots with the message that the group has passed the obstacle. A simulation of the above described
algorithm is presented in Figure 4 with N = 10.
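The trigger and side choice for this behavior can be sketched as follows (illustrative only; the two boolean flags are assumed to come from the robot's sensing layer):

```python
def should_wall_follow(path_blocked, neighbor_visible):
    """Trigger for wall-following: the straight path to the desired position
    is blocked, or the neighbor needed to compute that position is not detected."""
    return path_blocked or not neighbor_visible

def wall_follow_side(i, N):
    """Side on which robot i keeps the obstacle: right for S1, left for S2."""
    return "right" if i <= N // 2 else "left"
```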
Remark. It may occur that a robot cannot reach its desired position because it is located too far away from its present position. In that case the robot simply drives towards the desired position at the maximum allowed velocity, trying to catch up.
Fig. 5. A group of 10 robots passing an obstacle. It is demonstrated how the left subgroup adjusts
its angles αi and βi in order to spread and scan the area between left strip boundary and obstacle.
obstacle and the left boundary of the scanning strip. Once past the gap the robots in this
subgroup have to spread, since the distance between the obstacle and the left boundary
increases and we want to sense all of free space between the boundary and the obstacle.
The obstacle has such a shape that the robots have to spread out across almost the entire
width of the scanning strip before meeting a robot member of the right subgroup. The
basic algorithm is modified as follows. When robot N/2 (resp. N/2 + 1) encounters the
obstacle, it is now programmed to follow the obstacle’s edge until it meets its neighbor
N/2 + 1 (resp. N/2). Additionally, it ensures that its neighbor N/2 − 1 (resp. N/2 + 2)
stays in its detection range by sending a signal to the other robots of its subgroup to
increase the angle π/4 of (3). This changes the desired position of each robot in the
subgroup resulting in a stretching of the group, as far as necessary.
The above modified algorithm justifies our choice of initial formation and width
of the scanning strip. If we had naively chosen a value (N − 1)d as the width of a
scanning strip, the initial preferred robot formation would be able to span this entire
distance, namely by forming a line with the angles defined in Section 2.1 equal to αi =
−βi = π/2. However, one subgroup, consisting of only half of the number of robots,
would not be able to span this distance, resulting in either an error message from the
algorithm or in unscanned areas, if a situation described in Figure 5 was encountered.
Closely related to this observation is the choice to split the robot group precisely
in the middle. Since the sensor range of each robot is limited and the robots operate in
an unknown environment, the shape of each obstacle is unknown. To guarantee that the
area around the obstacle is fully covered by the sensors, we have to supply a sufficient
number of robots to both sides of the obstacle. For instance, when the shape of the
obstacle in Figure 5 is known a priori, one can decide to send more than half of the
robots to the left of the obstacle. Consider the case where the obstacle is reflected with
respect to the vertical axis. In this case sending less than half of the robots to the right
would lead to uncovered areas or an error message in the algorithm. With limited sensor
information it is not possible to discriminate between the situation of Figure 5 and its
reflected version. This leads us to always split the group into two equal parts.
Fig. 6. Two situations where an obstacle is located on the boundary between strips. On the left
hand side a dead end situation arises; on the right hand side one of the leader robots guides the
group around the obstacle.
Before tackling the dead end problem, let us treat the case presented on the right
hand side of Figure 6, which does not lead to a dead end situation. Consider an (x, y)-
frame with the y-axis parallel to the strip boundary, and directed into the driving direc-
tion of the leader robots. When the leader robot encounters the obstacle, the algorithm
assigns to this leader a wall-following procedure around the obstacle. The leader keeps
the obstacle either on its right or left (depending on its position in the robot formation)
while moving into the interior of the strip away from the strip boundary. As can be con-
cluded from the picture, the y-coordinate of the leader increases while moving around
the obstacle. We wish to keep the velocity component of the leader robot parallel to
the strip boundary equal to v. Since the robot deviates from its straight path parallel to
the strip boundary, this implies it has to speed up. When the leader reaches the strip
boundary again, it switches back to the original task of moving parallel to the boundary.
Now consider the left hand side of Figure 6. A dead end is detected by the algorithm
when two conditions are satisfied:
– one of the leader robots cannot move into the desired direction parallel to the strip
boundary, because an obstacle is blocking the way.
– when the leader robot starts wall-following the obstacle as described above, the
value of its y-coordinate decreases.
As soon as a dead end is observed by the leader robot, it changes its behavior and
stops its wall following algorithm. Instead, it projects its corresponding strip boundary
(N/2 − 1)d/8 units outwards and resumes the original scanning algorithm with respect
to the new boundary. If the extra width turns out to be insufficient to guide the robot
subgroup around the obstacle outside of the original scanning strip, the boundary is
projected a second (third,...) time. This way the subgroup which was stuck in the dead
end is guided around the obstacle. When both subgroups reestablish contact, the leader
robot returns to the original strip boundary. This behavior is faster and easier to imple-
ment than a turning-back scenario, where the subgroup of robots which meets a dead
end retraces its steps to go around the obstacle inside the original scanning strip.
Remark. The above situation with a solid wall as strip boundary, forcing a turning-back
maneuver, is precluded.
When the robot group reaches the end of a scanning strip, it needs to be transported to
the next strip. This is done in a few easy steps. Consider the situation of Figure 7. First
the right leader changes its behavior into that of a robot in the interior of the formation,
i.e. it tries to attain the desired formation. The left leader moves (N/2 − 1)d units
to the left perpendicular to the strip boundary. The rightmost robot resumes its leader
role and all robots reverse their forward direction with respect to the desired direction
in the previous strip. At this moment the robots are not yet positioned in the desired
formation: the indices of the robots are reversed. Each robot i assumes a new index
Fig. 7. The robot group moves from the end of a scanning strip to the start of the next strip.
number f (i) = (N + 1) − i, and is ordered to reassume its desired position in the robot
group without the leader robots advancing. The preferred formation is attained, and the
robots are ready to start the algorithm in the next strip. Naturally, every time the end of
a strip is reached, the roles of left and right leader alternate, so that the robot group does
not get trapped into a loop consisting of two strips.
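The bookkeeping of this strip change can be sketched as follows (illustrative only; leader-role alternation is implicit in the index reversal):

```python
def strip_transition(N, forward):
    """End of a scanning strip: every robot i takes the new index
    f(i) = (N + 1) - i and the common forward direction is reversed;
    the leader roles alternate implicitly because the extreme indices swap."""
    new_index = {i: (N + 1) - i for i in range(1, N + 1)}
    return new_index, -forward

new_index, new_forward = strip_transition(N=10, forward=+1)
```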
In the remainder of this section we will compare the speed of performance of the present
algorithm with the algorithm of [10]. In order to do so, realistic distance values are con-
sidered. Chemical vapor sensors detecting mines have a range st = 1.70 m. Obstacles
and other robots can be detected by laser based sensors with a range of sr = 3.3 m such
that (2) is satisfied. Assume the robots themselves possess a diameter of 0.3 m and set
the fixed interdistance d between neighboring robots in the preferred formation equal to
sr . With N the number of robots in the group, this yields a strip width of 1.65(N −2) m.
When no obstacles are encountered, the robots are allowed to move at a preset
maximum velocity vmax . In the algorithm of the present paper vmax is directed parallel
to the strip boundary, whereas the interior robots in [10] travel back and forth inside the
strip at vmax . It can be proven that for the latter case with the values given above the
speed of progress parallel to the strip boundary is vmax /6.
In the presence of obstacles a comparison is more difficult. First consider the com-
plete coverage algorithm [10]. As can be concluded from Figure 8, in the presence of
an obstacle the robots will advance faster parallel to the strip boundary, since the space
occupied by the obstacle does not have to be covered. The robot group will proceed
fastest when the shape of the obstacle is such that there is no space left for the robots
to travel back and forth between obstacle and strip boundary. Hence, depending on size
and shape of the obstacle the robots advance with a speed between vmax /6 and vmax .
Now, consider the algorithm of the present paper. Some interior robots perform wall-
following around the obstacles. This implies their path is longer than the path of the
leader robots. If the leader robots keep moving at the maximum allowed velocity, those
interior robots will never again be able to reach their desired position inside the formation after the obstacle is passed. Hence, when an obstacle is encountered the leaders have
to take on a velocity v0 which is smaller than vmax . This velocity v0 is determined as fol-
lows. The middle robots N/2 and N/2 + 1 transmit their positions via the other robots
to their respective leader robots. The leaders adjust their velocity v0 such that the differ-
ence between their y-coordinate and the y-coordinate of the corresponding robot N/2 or N/2 + 1 stays at all times within a prespecified bound. The middle robots only slow
down the group significantly during the first and last stage of their obstacle following,
i.e. when moving away from or towards the strip boundary without significantly ad-
vancing parallel to it. As soon as there is enough free space ahead of the middle robots,
the subgroup is again allowed to move parallel to the strip boundary with a speed close
to vmax .
From the above observations the following is concluded. The robot group in the
present algorithm slows down to pass an obstacle, but for most of the time the speed will
be close to vmax . The robot group of the complete coverage algorithm speeds up when
passing an obstacle, but for most obstacles the algorithm still requires a robot group
moving back and forth between the obstacle and the strip boundary. This implies that
the increased speed will on average be closer to vmax /6 than to vmax . Hence, in generic
cases, the present algorithm performs faster than the complete coverage strategy even
in the presence of obstacles.
4 Conclusions
The present paper described an algorithm for multi-robot exploration in an unknown
environment. The algorithm guarantees that all free space is covered by the robot sen-
sors. The robots form a zigzag-shaped formation which scans the area in strips. In order
to pass an obstacle, of which size and shape are not known a priori, the robot group
splits in the middle. If necessary, the zigzag shape of each subgroup may stretch out in
order to cover the free area between the obstacle and the strip boundary. The algorithm
is also able to handle obstacles located on the strip boundary.
References
1. Ota, J.: Multi-agent robot systems as distributed autonomous systems. Advanced Engineer-
ing Informatics 20 (2006) 59 – 70
2. Burgard, W., Moors, M., Stachniss, C., Schneider, F.: Coordinated multi-robot exploration.
IEEE Transactions on Robotics 21 (2005) 376–386
3. Kerr, W., Spears, D., Spears, W., Thayer, D.: Two formal gas models for multi-agent sweep-
ing and obstacle avoidance. In: Formal Approaches to Agent-Based Systems, Third Interna-
tional Workshop. (2004) 111–130
4. Keymeulen, D., Decuyper, J.: The fluid dynamics applied to mobile robot motion: the stream
field method. In: Proceedings of 1994 IEEE International Conference on Robotics and Au-
tomation, Piscataway, NJ, USA (1994) 378–385
5. Choset, H.: Coverage for robotics – a survey of recent results. Annals of Mathematics and
Artificial Intelligence 31 (2001) 113–126
6. Cortés, J., Martı́nez, S., Karatas, T., Bullo, F.: Coverage control for mobile sensing networks.
IEEE Transactions on Robotics and Automation 20 (2004) 243–255
7. Kurabayashi, D., Ota, J., Arai, T., Yosada, E.: Cooperative sweeping by multiple robots. In:
Proc. 1996 IEEE International Conference on Robotics and Automation. (1996)
8. Wong, S., MacDonald, B.: Complete coverage by mobile robots using slice decomposition
based on natural landmarks. In: Proc. Eighth Pacific Rim International Conference on Arti-
ficial Intelligence. Lecture Notes in Artificial Intelligence. Volume 3157. (2004) 683–692
9. Zheng, X., Jain, S., Koenig, S., Kempe, D.: Multi-robot forest coverage. In: Proceedings of
the IEEE International Conference on Intelligent Robots and Systems. (2005)
10. Rekleitis, I., Lee-Shue, V., New, A.P., Choset, H.: Limited communication, multi-robot team
based coverage. In: Proc. 2004 IEEE International Conference on Robotics and Automation.
(2004)
11. Kong, C.S., Peng, N.A., Rekleitis, I.: Distributed coverage with multi-robot system. In:
Proceedings of 2006 IEEE International Conference on Robotics and Automation, Orlando,
Florida, USA (2006) 2423–2429
12. Gage, D.: Many-robots mcm search systems. In: Proceedings of Autonomous Vehicles in
Mine Countermeasures Symposium. (1995)
Tracking of Constrained Submarine Robot Arms
1 Introduction
In the last decade, we have witnessed a surprising leap in scientific knowledge and technological achievements for AUVs, from simple torpedoes to modern AUVs. These vehicles pose at the same time paramount scientific and technological challenges in robotics, control, man-machine interfaces and mechatronics. Modern efforts around the world on AUVs still focus more on how to provide an acceptable level of (perhaps autonomous
or automatic) navigation capabilities for the main body of the AUV, rather than on the manipulation capabilities of its tools, perhaps a SRA, carried on the AUV. Therefore, we bring attention to a new breed of AUV whose main task is manipulation, perhaps with more than one robot arm, where the underlying issue is that the main body, the AUV, is considered as the fully actuated (or underactuated) free-floating base of the robot arm. In this case, we reasonably assume that the AUV is several times heavier than the SRA so as to provide inertial decoupling between the AUV and the robot arm. In this case, we have n thrusters to drive the AUV and m actuators to drive the SRA.
Pioneering efforts on SRA were focused on motion control with simple PD regulators in unconstrained motion, similar to the case of fixed-base robots in our labs. Acceptable tracking performance has been obtained using more complicated (saturated or nonlinear) PID schemes, and a few model-based controllers have been proposed for tracking under lab conditions [16, 5]. Of course, stable contact for a SRA is a more complex problem in comparison to the typical force/position control problem of robot manipulators fixed to the ground, not only because additional complex dynamics, such as buoyancy and added masses, are present in the SRA, but also because the vehicle reference frame is no longer inertial, see [12, 6].
However, more interesting submarine tasks require solving the more challenging problem of establishing stable contact while moving along the contact surface, like pushing against a wall, polishing the hull of a sunken vessel or manipulating tools on submarine pipelines; in such tasks contact forces are present, and little is known about the structural properties of these contact forces, let alone how to exploit them either for design or control. This problem leads us to study the simultaneous force and pose (position and orientation) control of SRA under realistic conditions; thus we have the following assumptions:
– i. the dynamical model, and its parameters, are hardly known in practice
– ii. the full state is not available
– iii. the geometric description of the contact surface is not completely known
As a first step to deal with this problem, in this paper we consider assumption i, while the quantities in ii and iii are assumed available. Notice that in any case, the controlled thrust force of the AUV must not only maintain stable contact, but must move, or keep still, the base of the SRA in whatever position is required to achieve tracking of desired time-varying trajectories of force and posture of the end-effector. How to achieve this is still an open problem, and subject of future study. Finally, we emphasize that issues i-iii (and iv mentioned below) pose such challenges nowadays that they deserve particular attention away from the already complex issues of AUV control, and thus the control problem of SRA requires a particular treatment, which has not been completely addressed in the AUV literature.
The main general reason why the force/posture problem remains open is that little is known about, on the one hand, how to properly model and control a fully free-floating immersed vehicle constrained by a rigid object, and, on the other hand, submarine force control technology lags behind system requirements, such as very fast and uniform sampling rates of sensors and actuators, even when the bandwidth of the submarine robot is very low.
Despite brilliant control schemes for free-motion submarine robots proposed in the past few years, in particular those of [17, 4, 2], those schemes do not formally guarantee convergence of tracking errors, let alone simultaneous convergence of posture and contact force tracking errors. There are several results suggesting empirically that a simple PD control structure behaves as stiffness control for submarine robots and produces acceptable, though low, performance in contact tasks. However, for more precise and faster tasks, the simultaneous convergence of time-varying contact forces and posture remains an open problem. Since assumption i is of great concern, control schemes that do not depend on the model or its regressor are quite important, because for an AUV or SRA it is very hard to know exactly the dynamic model and its dynamic parameters. Neural networks could be an option; however, because of the limited processing capabilities of typical SRA, we need to resort to other control schemes. Recently, some efforts have focused on how to obtain simple control structures to control the time-varying pose of the AUV under the assumption that the relative velocities are low [2, 8].
For force control of SRA under assumption i, virtually no complete control system has been published. However, to move towards more complex force controllers, we believe that a better understanding of the structural properties of submarine robots in stable contact with rigid objects is required. To this end, we consider the rigid body dynamics of a SRA subject to holonomic constraints (rigid contact), which exhibits structural properties similar to those of fixed-base constrained robots. Thus, in this paper we have chosen the orthogonalization principle [9], extended from fixed-base to free-floating-base robots, to propose a simple, yet high performance, controller with advanced tracking stability properties.
1.4 Contribution
Firstly, we draw attention to the control problem of constrained SRA, which deserves a particular treatment apart from the AUV control problem. To this end we go through the full dynamic model. Then, a quite simple force/posture model-free decentralized control structure is proposed in this paper, which guarantees robust tracking of
2 The Model
The model of a submarine can be obtained with the momentum conservation theory
and Newton’s second law for rigid objects in free space via the Kirchhoff formulation
[11], the inclusion of hydrodynamic effects such as added mass, friction and buoyancy
and the account of external forces/torques like contact effects [6]. The model is then
expressed by the next set of equations:
From this set, (1) is called the dynamic equation while (2) is called the kinematic equation. The generalized coordinates vector q ∈ IR^6 is given on one hand by the 3 Cartesian positions x, y, z of the origin of the submarine frame (Σ_v) with respect to an inertial frame (Σ_0), and on the other hand by any set of attitude parameters that represent the rotation of the vehicle's frame with respect to the inertial one. Most common sets of attitude representation, such as Euler angles, in particular roll-pitch-yaw (φ, θ, ψ), use only 3 variables (which is the minimal number of orientation variables).
Then, for a submarine, the generalized coordinates q = (xv , yv , zv , φv , θv , ψv ) repre-
sents its 6 degrees of freedom. The vehicle velocity ν ∈ IR6 is the vector representing
both linear and angular velocity of the submarine in the vehicle’s frame. This vector
is then defined as ν = (v_v^{(v)T}, ω^{(v)T})^T. The relationship between this vector and
the generalized coordinates is given by the kinematic equation. The linear operator
Jν (q) ∈ IR6×6 in (2), is built by the concatenation of two transformations. The first is
J_q(q) ∈ IR^{6×6}, which converts time derivatives of the attitude parameters into angular velocity. The second is J_R(q) = diag{R_0^v, R_0^v} ∈ IR^{6×6}, which transforms a 6-dimensional tensor from the inertial frame to the vehicle's frame. Thus, the linear operator is defined as J_ν(q) = J_R^T(q) J_q(q). A detailed discussion of the terms of (1) can be found in [1]. The
disturbance η_v(ν, ζ(t), ζ̇(t)) of the surrounding fluid depends mainly on the incidence velocity, i.e. the relative velocity between the vehicle and the fluid velocity ζ(t). The latter is not an autonomous function but an external perturbation. This disturbance has the property η_v(ν, 0, 0) = 0, that is, all disturbances are null when the fluid velocity and acceleration are null.
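As a concrete (but assumed) construction, the following Python sketch builds a 6×6 operator of this form for a roll-pitch-yaw parameterization with R = R_z(ψ)R_y(θ)R_x(φ); the paper's exact ordering and sign conventions may differ:

```python
import numpy as np

def J_nu(phi, theta, psi):
    """Assumed construction of the 6x6 operator J_nu(q) mapping
    q_dot = [x', y', z', phi', theta', psi'] to the body velocity nu."""
    cf, sf = np.cos(phi), np.sin(phi)
    ct, st = np.cos(theta), np.sin(theta)
    cp, sp = np.cos(psi), np.sin(psi)
    # rotation from the body frame to the inertial frame, R = Rz(psi) Ry(theta) Rx(phi)
    R = np.array([[cp * ct, cp * st * sf - sp * cf, cp * st * cf + sp * sf],
                  [sp * ct, sp * st * sf + cp * cf, sp * st * cf - cp * sf],
                  [-st,     ct * sf,                ct * cf]])
    # body angular velocity from roll-pitch-yaw rates
    T = np.array([[1.0,  0.0, -st],
                  [0.0,  cf,   sf * ct],
                  [0.0, -sf,   cf * ct]])
    J = np.zeros((6, 6))
    J[:3, :3] = R.T   # inertial linear velocity expressed in the body frame
    J[3:, 3:] = T
    return J
```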
The dynamic model (1)-(2) can be rearranged by replacing (2) and its time derivative
into (1). The result is one single equation model:
which, whenever ζ(t) = ζ̇(t) = 0, i.e. η_q(·) = 0, has the form of a Lagrangian system. Its components fulfill all the properties of such systems, i.e. positive definiteness of the inertia and damping matrices, skew symmetry of the Coriolis matrix and appropriate bounds on all components [15]. The contact effect is also obtained by the same transformation.
However, it can be expressed directly from the contact wrench in the inertial frame (Σ_0) by the relationship τ_c = J_ν^T(q) F_c^{(v)} = J_q^T(q) F_c^{(0)}, where the contact force F_c^{(0)} is the one expressed in the inertial frame. For simplicity it will be denoted F_c from this point on. The relationship with the force expressed in the vehicle's frame is given by F_c = J_R^T(q) F_c^{(v)}. This wrench represents the contact forces/torques exerted by the environment on the submarine as if measured in a non-moving frame. These forces/torques are given by the normal force of a holonomic constraint when in contact, and the friction due to the same contact. For simplicity, tangential friction is not considered in this work. The equivalent of the disturbance is also obtained with the linear operator: η_q(·) = J_ν^T(q) η_v(·).
A holonomic constraint (or infinitely rigid contact with an object) can be expressed as a function of the generalized coordinates of the submarine as
ϕ(q) = 0, (4)
with ϕ(q) ∈ IR^r, where r stands for the number of independent contact points between the SRA and the motionless rigid object. Equation (4) means that stable contact holds as long as the SRA does not detach from the object, i.e. ϕ(q) = 0. Evidently all time derivatives of (4) are zero, which for r = 1 gives
Jϕ (q)q̇ = 0, (5)
where J_ϕ(q) = ∂ϕ(q)/∂q ∈ IR^{r×n} is the constraint Jacobian. The last equation means that the velocities of the submarine in the directions of the constraint Jacobian are restricted to be zero. These directions are normal to the constraint surface ϕ(q) at the contact point.
As a consequence, the normal component of the contact force has exactly the same direction as the one defined by J_ϕ(q); consequently, the contact force wrench can be expressed as
$$F_c = J_{\varphi+}^T(q)\,\lambda, \qquad (6)$$
where J_{ϕ+}(q) = J_ϕ/‖J_ϕ‖ is a normalized version of the constraint Jacobian and λ ∈ IR^r is the magnitude of the normal contact force at the origin of the vehicle frame: λ = ‖F_c‖. The
free-moving model expressed by (1)-(2), with no fluid disturbance and in contact with the holonomic constraint, can be rewritten as:
$$M_v \dot{\nu} + h_v(q, \nu, t) = u + J_R^T(q)\, J_{\varphi+}^T(q)\,\lambda, \qquad (7)$$
$$\nu = J_\nu(q)\,\dot{q}, \qquad (8)$$
$$\varphi(q) = 0, \qquad (9)$$
where hv (q, ν, t) = Cv (ν)ν + Dv (q, ν, t)ν + g v (q). Equivalently, the model (3) is
also expressed as
with hq (q, q̇, t) = Cq (q, q̇)q̇ + Dq (q, q̇, t)q̇ + g q (q) and Jϕ̄ (q) = Jϕ+ (q)Jq (q).
Equations (10)-(11) are a set of index-2 Differential Algebraic Equations (DAE-2). To solve them numerically, a DAE solver is required. This last representation has the same structure and properties as those reported in [3].
3 Control Design
Similar to [7], the orthogonal projection onto the tangent space at the contact point, arising from J_ϕ(q), is given by the following operator
Q(q) = I_n − J_ϕ^T(q) J_ϕ(q) ∈ IR^{n×n}, (12)
where the regressor Y (q, q̇, q̈) ∈ IRn×p is composed of known nonlinear functions and
Θ ∈ IRp by p unknown but constant parameters.
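A minimal numerical sketch of the projector in (12), assuming the constraint Jacobian row has been normalized (J_ϕ J_ϕ^T = 1) and using the planar constraint ϕ(q) = x − 2 from the simulation section as an example:

```python
import numpy as np

def tangent_projector(J_phi):
    """Q = I_n - J_phi^T J_phi, eq. (12); J_phi is the normalized 1 x n
    constraint Jacobian, so Q annihilates the constrained (normal) direction."""
    n = J_phi.shape[1]
    return np.eye(n) - J_phi.T @ J_phi

J_phi = np.array([[1.0, 0.0, 0.0]])        # constraint phi(q) = x - 2 in q = (x, z, theta)
Q = tangent_projector(J_phi)
assert np.allclose(Q @ J_phi.T, 0.0)       # motions left by Q are tangent to the surface
```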
In order to design the controller, we need to work out the open loop error equation using (13), in terms of the nominal reference q̇_r and Y_r, as follows. Consider
$$M_q(q)\,\ddot{q}_r + \left[C_q(q,\dot{q}) + D_q(\cdot)\right]\dot{q}_r + g_q(q) = Y_r(q, \dot{q}, \dot{q}_r, \ddot{q}_r)\,\Theta, \qquad (14)$$
where q̈_r is the time derivative of q̇_r, to be defined. Then the open loop (10) can be
written by adding and subtracting (14) as
Mq (q)ṡ = − [Cq (q, q̇) + Dq (·)] s − Y r (q, q̇, q̇ r , q̈ r )Θ + Jϕ̄T (q)λ + uq , (15)
where s = q̇ − q̇ r is called the extended error. The problem of control design for the
open loop (15) is to find uq such that exponential convergence arises when Y r Θ is not
available.
Consider that q d (t) and λd (t) are the desired smooth trajectories of position and contact
force, with q̃ = q(t) − q d (t) and λ̃ = λ(t) − λd (t) as the position and force tracking
errors, respectively. Then, let the following reference q̇_r be:
$$\dot{q}_r = Q\left( \dot{q}_d - \sigma\tilde{q} + S_{dp} - \gamma_1 \int_{t_0}^{t} \mathrm{sgn}\{S_{qp}(t)\}\, dt \right) + \beta J_\varphi^T\left( S_F - S_{dF} + \gamma_2 \int_{t_0}^{t} \mathrm{sgn}\{S_{qF}(t)\}\, dt \right), \qquad (16)$$
where the orthogonal extended position and force manifolds S_vp and S_vF, respectively, are given by
$$S_{vp} = S_{qp} + \gamma_1 \int \mathrm{sgn}(S_{qp}(\varsigma))\, d\varsigma, \qquad (20)$$
$$S_{vF} = S_{qF} + \gamma_2 \int \mathrm{sgn}(S_{qF}(\varsigma))\, d\varsigma. \qquad (21)$$
with μ > 0 and K_d = K_d^T > 0 ∈ IR^{n×n}. This nominal control, designed in the q-space, can be mapped to the original coordinates model, expressed by the set (1)-(2), using the relationship u = J_ν^{-T}(q) u_q. Thus, the physical controller u is implemented in terms of a key inverse mapping J_ν^{-T}.
Closed-loop System. The open loop system (15) under the continuous model-free second order sliding mode control (22) yields
Stability Analysis
Theorem 1. Consider a constrained submarine (10)-(33) under the continuous model-
free second order sliding mode control (22). The underwater system yields a second order sliding mode regime with local exponential convergence of the position and force tracking errors.
Proof. A passivity analysis of ⟨S, τ*⟩ indicates that the following candidate function V qualifies as a Lyapunov function:
$$V = \tfrac{1}{2}\left( s^T M_q\, s + \beta\, S_{vF}^T S_{vF} \right), \qquad (24)$$
for a scalar β > 0. The time derivative of the Lyapunov candidate immediately leads to
where the skew-symmetry of Ṁ − 2C(q, q̇) has been used, together with the boundedness of the feedback gains and of the submarine dynamic equation (there exist upper bounds for M, C(q, q̇), g(q), q̇_r, q̈_r), the smoothness of ϕ(q) (so there exist upper bounds for J_ϕ and Q(q)), and finally the boundedness of Z. All these arguments establish the existence of the bounding functional ε. Then, if K_d, β and η are large enough, s converges into a neighborhood of size ε centered at the equilibrium s = 0, namely
s → ε as t → ∞. (26)
This result stands for local stability of s provided that the state is near the desired trajectories for any initial condition. This boundedness, in the L∞ sense, leads to the existence of constants ε_3 > 0 and ε_4 > 0 such that
sional image of Q, we have that S*_qp = Q S_qp ∈ IR^n. Consider now, under an abuse of notation, that S_qp = S*_qp, such that for small initial conditions, if we multiply the derivative of S_qp in (20) by S_qp^T, we obtain
which uses (27) and γ_1 > ε_3 to guarantee the existence of a sliding mode at S_qp(t) = 0 at time t ≤ ‖S_qp(t_0)‖/(γ_1 − ε_3); according to the definition of S_qp (below (20)), S_qp(t_0) = 0, which simply means that S_qp(t) = 0 for all time. We see
that if we multiply the derivative of (21) by S_qF^T, we obtain
$$S_{qF}^T \dot{S}_{qF} = -\gamma_2 \|S_{qF}\| + S_{qF}^T \dot{S}_{vF} \le -\gamma_2 \|S_{qF}\| + \|S_{qF}\|\,\|\dot{S}_{vF}\| \le -(\gamma_2 - \varepsilon_4)\|S_{qF}\|,$$
which uses (28) and γ_2 > ε_4 to guarantee the existence of a sliding mode at S_qF(t) = 0 at time t ≤ ‖S_qF(t_0)‖/(γ_2 − ε_4) and, according to (21), S_qF(t_0) = 0, which simply means that S_qF(t) = 0 for all time; this implies that λ → λ_d exponentially fast.
4 Simulation Results
Simulations have been made on a simplified platform of a real submarine. Data have been obtained from the Vortex system of IFREMER. The simulator presents only vertical planar results (only in the x–z plane), so the generalized coordinates for this case of study are:
$$q = \begin{pmatrix} x_v \\ z_v \\ \theta_v \end{pmatrix} \qquad (29)$$
ϕ(q) ≡ x − 2 (31)
¹ The strict analysis follows Liu et al.
Fig. 1. Generalized coordinates q and errors q̃ = q(t) − q_d for the set-point disturbance-free case (continuous line for model-free second order sliding mode control, dotted line for PD control).
Fig. 2. Input controlled forces u, in the vehicle's frame, for the set-point disturbance-free case (continuous line for model-free second order sliding mode control, dotted line for PD control).
Initial conditions were calculated to be at the contact surface with no force. Simulations with a simple PD were also performed as a comparison tool. The model-free control parameters are as follows:
Kd | γ1 | γ2 | σ | α | β | η | μ
200 M̂v | 0.0025 | 10⁻³ | 5 | 4 | 0.025 | 1000 | 1
where M̂v is formed by the diagonal values of the constant inertia matrix expressed in the vehicle's frame. For the PD controller the gains were defined as Kp = 100 M̂v and Kd = 20 M̂v.
The set of equations (7)-(8) with (32), or the set (10)-(33), describes the constrained motion of the submarine when in contact with the infinitely rigid surface described by (4). Numerical solutions of these sets can be obtained by simulation; however the numerical solution, using a DAE solver, can take too much effort to converge due to the fact that these sets of equations represent a highly stiff system. In order to minimize this numerical drawback,
Fig. 3. Contact force λ, force error λ̃ = λ(t) − λ_d and control variables S_qF for the model-free second order sliding mode control; all for the set-point disturbance-free case (continuous line for model-free second order sliding mode control, dotted line for PD control).
Fig. 4. Control variables S_qp for the model-free second order sliding mode control for the set-point disturbance-free case.
the holonomic constraint has been treated as a numerically compliant surface whose dynamics is represented by
This is known in the force control literature of robot manipulators as the constraint stabilization method, which bounds the nonlinear numerical error of integration of the backward differentiation formula. With an appropriate choice of parameters P ≫ 1 and D ≫ 1, the solution ϕ(q, t) → 0 is bounded. This dynamics is chosen to be fast enough to allow the numerical method to work properly. In this way, only a very small deviation in the computation of λ is allowed, typically of the order of 10⁻⁶ or less, which may produce, according to some experimental comparisons, less than 0.001% numerical error. Then, the value of the normal contact force magnitude
becomes:
$$\lambda = \left( J_\varphi J_\nu^{-1} M_v^{-1} J_R^T J_{\varphi+}^T \right)^{-1}\!\left( J_\varphi J_\nu^{-1} M_v^{-1}\,(h_v - u) - \frac{d}{dt}\!\left( J_\varphi J_\nu^{-1} \right)\nu - D\, J_\varphi J_\nu^{-1}\nu - P\,\varphi(q) \right), \qquad (35)$$
$$\phantom{\lambda} = \left( J_\varphi M_q^{-1} J_{\bar\varphi}^T \right)^{-1}\!\left( J_\varphi M_q^{-1}\,(h_q - u_q) - \dot{J}_\varphi \dot{q} - D\, J_\varphi \dot{q} - P\,\varphi(q) \right). \qquad (36)$$
The numerical dynamics induced on the constraint surface used P = 9 × 10⁶ and D = 36 × 10³.
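For reference, (36) can be evaluated at each integration step as in the following sketch (all arguments are placeholders assumed to be supplied by the simulator, with Jphi_bar denoting the weighted Jacobian J_ϕ̄ of (36); this is not the authors' implementation):

```python
import numpy as np

def contact_force(Mq, hq, uq, Jphi, Jphi_bar, dJphi, q_dot, phi, P=9e6, D=36e3):
    """Eq. (36): constraint-stabilized normal force magnitude lambda.
    Jphi is the r x n constraint Jacobian, Jphi_bar the weighted Jacobian of (36),
    dJphi its time derivative and phi the current constraint value."""
    Minv = np.linalg.inv(Mq)
    A = Jphi @ Minv @ Jphi_bar.T
    b = (Jphi @ Minv @ (hq - uq)
         - dJphi @ q_dot
         - D * (Jphi @ q_dot)
         - P * phi)
    return np.linalg.solve(A, b)
```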
4.2 Disturbance-free
Disturbance forces were calculated using the function η_v(ν, t) in (1), which is explicitly described in [6]. The velocity of the fluid was calculated considering possible values of tides and currents. The horizontal fluid velocity (x component) is composed of a constant drift of 0.5 m/s (about 1 knot) and a periodic wave of 1 m/s amplitude (about 2 knots) with a period of 7 s. The vertical fluid velocity (z component) is composed only of a periodic wave of 1 m/s amplitude (about 2 knots) with a period of 6 s.
Tracking Case. For this case, the desired position/force trajectories are as follows: a variable depth, centered at 2 m with 1 m peak-to-peak amplitude and a 10 s period; a desired orientation trajectory of 10 degrees amplitude with an offset of −5 degrees and a period of 10 s; and a desired contact force of 70 N amplitude with an offset of 100 N and a period of 4 s.
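Read literally, these references can be generated as in the following sketch (the sinusoidal form and zero phase are assumptions; the depth amplitude is interpreted as peak-to-peak):

```python
import numpy as np

def references(t):
    """Desired depth [m], orientation [deg] and contact force [N] at time t [s]."""
    z_d     = 2.0 + 0.5 * np.sin(2 * np.pi * t / 10.0)    # 1 m peak-to-peak, 10 s period
    theta_d = -5.0 + 10.0 * np.sin(2 * np.pi * t / 10.0)  # 10 deg amplitude, -5 deg offset
    f_d     = 100.0 + 70.0 * np.sin(2 * np.pi * t / 4.0)  # 70 N amplitude, 100 N offset
    return z_d, theta_d, f_d
```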
Figures 5 and 6 show respectively the positions and the force inputs. The difference in stabilization velocity has been explicitly computed in order to visualize the qualitative differences. In any case, this velocity can be modified with appropriate tuning of the gain parameters. Notice that there are some transients and variability in the position of the contact point in the direction normal to the contact force (the x coordinate). The same transients are present in the force plot of Figure 7, where a noticeable difference between the simple PD and the model-free second order sliding mode control is present. The big difference is that although the PD can regulate with some practical relative accuracy, it is not capable of tracking nor regulating the force reference.
Fig. 5. Generalized coordinates q and errors q̃ = q(t) − q_d for the tracking disturbance case (continuous line for model-free second order sliding mode control, dotted line for PD control).
Fig. 6. Input controlled forces u, in the vehicle's frame, for the tracking disturbance case (continuous line for model-free second order sliding mode control, dotted line for PD control).
Figure 7 shows the contact force magnitude λ, the force error λ̃ = λ(t) − λ_d and the force manifold S_qF for the model-free second order sliding mode control. In this plot it is clear that no force is induced with the PD scheme. In the case of the model-free 2nd order sliding mode, force regulation is achieved very fast indeed.
Figure 8 shows the convergence of the extended position manifolds S_qp. They do converge to zero and thereafter induce the sliding mode dynamics.
In the beginning we expected that a controller designed for robot manipulators could also be implemented for submarine robots if some explicit additional terms were added, such as control terms that compensate for hydrodynamic and buoyancy forces. However, surprisingly, no additional terms were required! It suffices to properly map the gradient of the contact forces. Some remarks are in order.
Fig. 7. Contact force λ, force error λ̃ = λ(t) − λ_d and control variables S_qF, all for the tracking disturbance case (continuous line for model-free second order sliding mode control, dotted line for PD control).
Fig. 8. Control variables S_qp for the model-free second order sliding mode control for the tracking disturbance case.
generalized coordinates time derivative with a generalized physical velocity. This relationship is especially important for the angular velocity of a free moving object, due to the fact that the time derivative of an attitude representation (such as roll-pitch-yaw) is not the angular velocity. However, there is always a correspondence between these vectors. For external forces this mapping is indeed important. It relates a physical force/torque wrench to the generalized coordinates q = (x_v, y_v, z_v, φ_v, θ_v, ψ_v), whose last 3 components do not represent a unique physical space. In this work such a mapping is given by J_q, and it appears in the contact force mapping through the normalized operator J_ϕ̄.
Notice that the controller exhibits a PD structure plus a nonlinear I-tame control action, with nonlinear time-varying feedback gains, for each orthogonal subspace. It is in fact a decentralized PID-like controller, coined the "Sliding PD" controller. It is indeed surprising that similar control structures can be implemented seamlessly for a robot on the surface or below water, with similar stability properties and simultaneous tracking of contact force and posture. Of course, this is possible only under a proper mapping of the Jacobians, which is key to implementing this controller.
When friction at the contact point arises, which is expected for submarine tasks wherein the contact object exhibits a rough surface with high friction coefficients, a tangential friction model should be added on the right-hand side. Particular care must be taken to map the velocities. Complex friction compensators can be designed, similar to the case of force control for robot manipulators, so the details are omitted.
6 Conclusions
Modelling Robot Dynamics with Masses and Pulleys

1 Introduction
The concept of impedance and its generalization, reactance, has been used to define equivalent circuits of mechanical and electro-mechanical systems since the development of the Maxwell model of solids. The idea that driving point impedances
could be decomposed into terms that parallel electrical elements was initiated by [5]
who showed that the frequency response of any system is determined by the poles and
zeros of its transfer function. The conditions for network synthesis are described by
[1] and later applied by [8] who introduced bond graphs to distinguish and represent
effort and flow variables in a graphical setting. Examples of electro-mechanical
system simulations are numerous and include magnetic circuits [6], mechatronics and
electromechanical transducers [11], [12], [9].
Mechanical block diagrams are routinely used to model robot dynamics although
some [3] limit them to a single axis while others [13] rely entirely on equivalent
electric circuits to avoid the inherent difficulties of creating mechanical models of
multi-axis devices, transmission systems or other systems with coupled dynamics.
Section 2 of this paper describes the conventional electro-mechanical analogy
and points out a limitation of the mass model. It goes on to describe a new
mass/pulley (MP) model which overcomes the inherent deficiency in the conventional
mass model. In Section 3, it is shown how the new MP model can be used to model
the dynamics of devices which have coupled effective masses. Examples are provided
which include both 2-DOF and 3-DOF serial and parallel manipulators. Lastly,
concluding remarks are made in Section 4.
2 Electro-Mechanical Analogies
Each of the components in Figure 1 has two terminals except for the mass which has
only one. This is due to the fact that the dynamic equation of a mass (3) does not
accommodate an arbitrary reference. Acceleration is always taken with respect to the
global reference, or ground. Consider the two systems in Figure 2 which are well
known to be analogous.
In Figure 2, the voltage across the capacitor ec corresponds to the velocity of the
mass vm. Both of these are relative measurements that only correspond to one another
because both are taken with respect to ground. Consider, on the other hand, the circuit
in Figure 3 which contains a capacitor with one terminal open circuited.
In Figure 3, the open circuit at n2 prevents any current from flowing through the
capacitor. Since there is no current shunted into the capacitor at n1, the voltage at n1 is
unaffected by the capacitor. In the mechanical “equivalent”, it is not possible to
connect a non-zero mass M to node n1 without affecting the output velocity vo. This is
due to the implicit ground reference of the mass (shown by a dotted line) which is
physically impossible to interrupt. Note that this same limitation does not apply to the
spring or damper since they both have two terminals which can be connected or left
floating, as desired.
Because of the above limitation, there are mechanical systems which can not be
modelled using a mechanical system diagram. Elaborate transmission systems such as
robotic manipulators may contain mass elements that are only present when relative
motion occurs between individual motion stages. Currently, systems such as these can
only be modelled using electric circuits since capacitors can be used to model this
type of behaviour but masses cannot.
$\Delta x_o = \tfrac{1}{2}(\Delta x_1 + \Delta x_2)$ , (4)
$v_o = \tfrac{1}{2}(v_1 + v_2)$ , (5)
Note from (5) that although the pulley provides the desired differential velocity
input, it also introduces an undesired 2:1 reduction ratio. However, setting v1 to 0 (i.e.
connecting n1 to ground) results in (6). Therefore, a similar pulley system with one
input tied to ground could be used to scale up velocity by an equivalent ratio.
$v_2 = 2 v_o$ . (6)
The MP model uses ideal cables with zero mass and infinite length and stiffness.
The ideal cables travel through the system of massless, frictionless pulleys without
any loss of energy. The MP model operates in zero gravity so the mass is only
accelerated as a result of cable tension and/or compression. Unlike practical cables,
the ideal cables never become slack. When an attractive force is applied between n1
and n2, F<0 and the mass is accelerated downward. A block diagram of the MP model
is presented in Figure 6 where P has the same value as M in Figure 5. Note that,
unlike a pure mass, the MP model has two terminals, n1 and n2 which correspond to
the two ends of the primary cable.
Consider Figure 7 which is the mechanical system from Figure 3 with the mass
replaced by an MP model. With terminal n2 left unconnected, the primary cable of the
MP model travels freely through the primary pulley without accelerating the mass or
consuming energy. The MP model behaves just like the capacitor in Figure 3. Also
note the topological similarity between the electrical circuit in Figure 3 and its true
mechanical equivalent in Figure 7. This is a direct result of the topological
consistency between the capacitor and the MP model, both of which have two
symmetric terminals. As pointed out in [10], this consistency allows one to analyze
mechanical systems using electric circuit analysis techniques once all masses have
been replaced by MP models.
Consider the simplified dynamics of a 2-DOF robot (9) where M is the mass matrix, B
is the damping matrix, F is a vector of joint forces/torques (10), R is a vector of joint
rates r1 and r2 (10), and s is the Laplace operator. Spring constants, gravitational and Coriolis effects are assumed to be negligible for the purpose of this example. If the
damping in the system is dominated by the actuator damping coefficients, B is a
diagonal matrix (10). M, on the other hand, represents the effective mass perceived by
each joint and is not diagonal or otherwise easily simplified in general.
F = BR + MsR (9)
$\begin{bmatrix} f_1 \\ f_2 \end{bmatrix} = \begin{bmatrix} b_1 & 0 \\ 0 & b_2 \end{bmatrix}\begin{bmatrix} r_1 \\ r_2 \end{bmatrix} + M s \begin{bmatrix} r_1 \\ r_2 \end{bmatrix}$ (10)
$M = \begin{bmatrix} m_1 & m_2 \\ m_2 & m_2 \end{bmatrix}$ (11)
$\begin{bmatrix} i_1 - i_2 \\ i_2 \end{bmatrix} = \begin{bmatrix} g_1+g_2 & -g_2 \\ -g_2 & g_2 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} + s\begin{bmatrix} c_1 & 0 \\ 0 & c_2 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ (12)
$\begin{bmatrix} f_1 - f_2 \\ f_2 \end{bmatrix} = B'\begin{bmatrix} r_1 \\ r_1+r_2 \end{bmatrix} + M' s\begin{bmatrix} r_1 \\ r_1+r_2 \end{bmatrix}$ (13)
$M' = \begin{bmatrix} m'_1 & 0 \\ 0 & m'_2 \end{bmatrix} = \begin{bmatrix} m_1+m_2 & 0 \\ 0 & m_2 \end{bmatrix}$ (15)
$\begin{bmatrix} b'_1 \\ b'_2 \\ m'_1 \\ m'_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ m_1+m_2 \\ m_2 \end{bmatrix}$ (16)
In this simple example, masses are sufficient to model the system behaviour but
only because the device has a single degree of freedom so M’ is diagonal and there is
no cross-coupling between actuators. In general, however, effective mass is not
always decoupled and the off-diagonal elements of M’ can be expected to be non-
zero. When M' is not diagonal, conventional single-terminal masses are unable to model the entire effective mass of the system; they cannot model the off-diagonal terms that describe inertial effects resulting from relative motion of the actuators.
Consider the 2-DOF serial robot shown in Figure 10. The mass matrix for this
mechanism is approximated in [2] by two point masses d1 and d2 placed at the distal
actuator and end-effector as indicated below. The resulting mass matrix (17) has the
terms shown in (18-20) where q1 and q2 are the joint angles and l1 and l2 are the link
lengths. Just as in the previous example, actuator damping coefficients b1 and b2 are
taken to dominate the total system damping.
$M(q) = \begin{bmatrix} m_1(q) & m_3(q) \\ m_3(q) & m_2(q) \end{bmatrix}$ (17)
$m_1 = l_2^2 d_2 + 2 l_1 l_2 d_2 \cos(q_2) + l_1^2 (d_1 + d_2)$ (18)
$m_2 = l_2^2 d_2$ (19)
$m_3 = l_2^2 d_2 + l_1 l_2 d_2 \cos(q_2)$ (20)
The equivalent circuit model of this system is shown in Figure 11. It is similar to
Figure 9 except that the capacitor values are configuration dependent and a third
capacitor c12 is included to model the coupled mass terms that are present. Performing
nodal analysis results in (21) and the corresponding M’ matrix in (22) which can be
rearranged to solve for the mechanical model parameters in terms of the physical
mass values in (23). B’ is the same diagonal matrix as in (14).
Note from (22) that M’ is diagonal (i.e. p’12=0) when m2=m3. From (19,20), this
is merely the special case when q2=±π/2. Therefore, it is not possible to model this
system using only masses due to their implicit ground reference, as described in
Section 2.1. The off-diagonal terms can, however, be modelled using the MP model
proposed in Section 2.2. It results in a mechanical system model that is topologically
identical to the equivalent circuit in Figure 11 where each grounded capacitor (c1,c2)
is replaced by a regular mass and each ungrounded capacitor (c12) is replaced by an
MP model since the MP model is able to accommodate a non-zero reference
acceleration. The resulting mechanical system is shown in Figure 12.
$\begin{bmatrix} i_1 - i_2 \\ i_2 \end{bmatrix} = \begin{bmatrix} g_1+g_2 & -g_2 \\ -g_2 & g_2 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} + s\begin{bmatrix} c_1+c_{12} & -c_{12} \\ -c_{12} & c_2+c_{12} \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ (21)
$\begin{bmatrix} m'_1 \\ m'_2 \\ p'_{12} \end{bmatrix} = \begin{bmatrix} m_1+m_3 \\ m_3 \\ m_2-m_3 \end{bmatrix}$ (23)
Fig. 12 shows the resulting mechanical model, with actuator forces f1 and f2, dampers b'1 and b'2, and configuration-dependent masses $m'_1(q) = 2 l_2^2 d_2 + 3 l_1 l_2 d_2 \cos(q_2) + l_1^2 (d_1 + d_2)$ and $m'_2(q) = l_2^2 d_2 + l_1 l_2 d_2 \cos(q_2)$.
Although p’12 has a negative value when -π/2<q2<π/2, the net mass perceived by
each actuator is always positive because M is positive definite. When p’12 is negative,
it simply means that the motion of actuator 1 reduces the net mass perceived by
actuator 2, but the net mass perceived by actuator 2 is always greater than zero.
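As a quick numerical illustration of the relations (17)-(23), the short Python sketch below evaluates the mass-matrix entries (18)-(20) for the 2-DOF serial arm and converts them into the equivalent mechanical parameters of (23). The link lengths and point masses are assumed placeholder values, not data from the paper.

import numpy as np

def serial_2dof_mass_matrix(q2, l1, l2, d1, d2):
    """Mass matrix entries (18)-(20) for the 2-DOF serial arm."""
    m1 = l2**2 * d2 + 2 * l1 * l2 * d2 * np.cos(q2) + l1**2 * (d1 + d2)
    m2 = l2**2 * d2
    m3 = l2**2 * d2 + l1 * l2 * d2 * np.cos(q2)
    return m1, m2, m3

def mp_model_parameters(m1, m2, m3):
    """Equivalent mechanical model parameters m'_1, m'_2, p'_12 from (23)."""
    return m1 + m3, m3, m2 - m3

# Illustrative numbers (assumed, not taken from the paper).
l1, l2, d1, d2 = 0.4, 0.3, 1.5, 1.0
for q2 in (0.0, np.pi / 4, np.pi / 2):
    m1, m2, m3 = serial_2dof_mass_matrix(q2, l1, l2, d1, d2)
    mp1, mp2, p12 = mp_model_parameters(m1, m2, m3)
    # p'_12 vanishes at q2 = +/- pi/2, where M'(q) becomes diagonal.
    print(f"q2={q2:5.3f}  m'_1={mp1:6.3f}  m'_2={mp2:6.3f}  p'_12={p12:+6.3f}")

Running the loop confirms the observation above: the coupling parameter p'_12 changes sign with q2 and is exactly zero at q2 = ±π/2.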
The same technique can be applied to parallel manipulators such as the 2-DOF 5-bar
linkage used by (Hayward et al., 1994). In the case of parallel manipulators, each
actuator is referenced to ground but there remains a coupling between the effective
mass perceived by each actuator which, like a serial manipulator, is configuration
dependent. This coupling is modelled by c12 and p’12 in the equivalent electrical and
mechanical models shown in Figure 13. Typically, parallel manipulators also have
coupled damping terms due to their passive joints which would be modelled by a
conductance g12 added between nodes 1 and 2 (i.e. in parallel with c12). However, for
the sake of simplicity, the damping of the passive joints is neglected here.
$\begin{bmatrix} i_1 \\ i_2 \end{bmatrix} = \begin{bmatrix} g_1 & 0 \\ 0 & g_2 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} + s\begin{bmatrix} c_1+c_{12} & -c_{12} \\ -c_{12} & c_2+c_{12} \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ (24)
$\begin{bmatrix} m'_1 \\ m'_2 \\ p'_{12} \end{bmatrix} = \begin{bmatrix} m_1+m_3 \\ m_2+m_3 \\ -m_3 \end{bmatrix}$ (26)
For a serial manipulator, each joint is driven by its own actuator, and the damping B and spring K matrices are diagonal (27,28). With parallel manipulators, the B and K matrices typically contain off-diagonal terms, but they are
easily modelled using conventional techniques since springs and dampers are 2-
terminal devices which can be placed at any two nodes in a system diagram.
$B = \mathrm{diag}(b_1, b_2, \ldots, b_n)$ (27)
$K = \mathrm{diag}(1/k_1, 1/k_2, \ldots, 1/k_n)$ (28)
To account for inertial cross-coupling, the model must contain a capacitor and/or
MP model between every pair of actuators. For example, the electric circuit model
and corresponding mechanical system model of a serial 3-DOF manipulator are
shown in Figure 14. The capacitance C matrix resulting from the nodal analysis (29)
of the circuit in Figure 14 is shown in (30).
$\begin{bmatrix} i_1-i_2 \\ i_2-i_3 \\ i_3 \end{bmatrix} = G(q)\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} + C(q)\, s\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$ (29)
$C(q) = \begin{bmatrix} c_1+c_{12}+c_{13} & -c_{12} & -c_{13} \\ -c_{12} & c_2+c_{12}+c_{23} & -c_{23} \\ -c_{13} & -c_{23} & c_3+c_{23}+c_{13} \end{bmatrix}$ (30)
Just as in previous examples, the 3x3 mass matrix M’ (32) is rearranged into the
form shown in (31) to parallel the current/voltage relationship of (29). For a mass
matrix M of the form shown in (33), the entries of the M’ matrix are solved for in
(34). Similarly, for a parallel 3-DOF robot, the electric circuit model and
corresponding mechanical system model are shown in Figure 15. For a mass matrix of
the form shown in (33), the elements of M’ are shown in (35).
$\begin{bmatrix} f_1-f_2 \\ f_2-f_3 \\ f_3 \end{bmatrix} = B'\begin{bmatrix} r_1 \\ r_1+r_2 \\ r_1+r_2+r_3 \end{bmatrix} + M' s\begin{bmatrix} r_1 \\ r_1+r_2 \\ r_1+r_2+r_3 \end{bmatrix}$ (31)
$M(q) = \begin{bmatrix} m_1(q) & m_4(q) & m_5(q) \\ m_4(q) & m_2(q) & m_6(q) \\ m_5(q) & m_6(q) & m_3(q) \end{bmatrix}$ (33)
$\begin{bmatrix} m'_1 \\ m'_2 \\ m'_3 \\ p'_{12} \\ p'_{23} \\ p'_{13} \end{bmatrix} = \begin{bmatrix} m_1-m_4 \\ m_4-m_5 \\ m_5 \\ m_2+m_5-m_4-m_6 \\ m_3-m_6 \\ m_6-m_5 \end{bmatrix}$ (34)
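The mapping (34) is straightforward to transcribe. The following Python sketch converts a symmetric 3x3 mass matrix of the form (33) into the serial-manipulator model parameters; the randomly generated positive definite matrix is an assumed stand-in for a real robot, not an example from the paper.

import numpy as np

def serial_3dof_mp_parameters(M):
    """Map a symmetric 3x3 mass matrix (33) to the MP-model parameters (34)."""
    m1, m2, m3 = M[0, 0], M[1, 1], M[2, 2]
    m4, m5, m6 = M[0, 1], M[0, 2], M[1, 2]
    return {
        "m'_1": m1 - m4,
        "m'_2": m4 - m5,
        "m'_3": m5,
        "p'_12": m2 + m5 - m4 - m6,
        "p'_23": m3 - m6,
        "p'_13": m6 - m5,
    }

# Illustrative positive definite mass matrix (assumed values).
A = np.random.default_rng(0).normal(size=(3, 3))
M = A @ A.T + 3 * np.eye(3)
print(serial_3dof_mp_parameters(M))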
4 Conclusions
It is argued that a plain mass is not a complete and general model of a capacitor since
a mass only has one terminal whereas a capacitor has two. The response of a mass
corresponds to its acceleration with respect to ground and, therefore, can only be used
to simulate a capacitor which has one terminal connected to ground; it cannot be used to model a capacitor whose terminals both float with respect to ground.
It is shown that the MP model can be used to model systems with cross-coupled effective masses, which are otherwise impossible to model with pure masses alone. This includes both serial and parallel manipulators with any number of degrees of freedom. The mechanical system model that is obtained fully describes the dynamic response of the system and is topologically identical to its electric circuit equivalent. As shown in (Stocco & Yedlin, 2007), this makes it possible to apply electric circuit analysis techniques directly to mechanical systems.
References
1 Introduction
Model Predictive Control (MPC), which is also referred to as Receding or Rolling Hori-
zon Control, has become more and more important for control applications from var-
ious fields. This is due to the fact that not only the current system state, but also a
model-based prediction of future system states over a finite N stage prediction horizon
is considered in the control law. For this prediction horizon, an open-loop optimal con-
trol problem with a corresponding cost function is solved. The resulting control input is
then applied in an open-loop feedback fashion to the system.
The well understood and widely used MPC for linear system models [1] together
with linear or quadratic cost functions is not always sufficient when it is necessary
to achieve even higher quality control, e.g., in high precision robot control or in the
process industry. Steadily growing requirements on the control quality can be met by
incorporating nonlinear system models and cost functions in the control. The typically
significant increase in computational demand arising from the nonlinearities has been
mitigated in the last years by the steadily increasing available computation power for
control processes [2] and advances in the employed algorithms to solve the necessary
optimizations [3].
example system, which has been introduced in previous sections. Concluding remarks and perspectives on future work are given in Section 6.
2 Problem Formulation
The considered discrete-time system is given by
$x_{k+1} = a(x_k, u_k, w_k)$ , (1)
where $x_k$ denotes the vector-valued random variable of the system state, $u_k$ is the applied control input, and $a(\cdot)$ is a nonlinear, time-invariant function. $w_k$ denotes the white stationary noise affecting the system additively element-wise, i.e., the elements of $w_k$ are processed in $a(\cdot)$ just additively; for details see Section 3.3. Please note that random variables are denoted by lower case bold face letters and an underscore denotes a vector-valued quantity.
Example System
A mobile miniature walking robot (Fig. 1) is supposed to move along a given trajectory,
e.g. along a wall, with constant velocity. This robot is able to superimpose left and
right turns onto the forward motion. The robot’s motion can be modeled similar to
the motion of a two-wheeled differential-drive robot, where the system state $x_k = [x_k, \alpha_k]^T$ comprises the distance to the wall $x_k$ and the robot's orientation relative to the wall $\alpha_k$. This leads to the nonlinear discrete-time system equation (2), where $s$ is the robot's constant step width and $w^x_k$ as well as $w^\alpha_k$ denote the noise influence on the system. The control input $u_k$ is a steering action, i.e., a change of direction of the robot. Furthermore, the robot is equipped with sensors to measure the distance $y^x_k$ and the orientation $y^\alpha_k$ with respect to the wall according to
$y^x_k = x_k + v^x_k$ , $y^\alpha_k = \alpha_k + v^\alpha_k$ , (3)
where $v^x_k$ and $v^\alpha_k$ describe the measurement noise.
At any time step k, the system state is predicted over a finite N-step prediction horizon. Within this horizon, an open-loop optimal control problem is solved, i.e., the optimal input $u_k^*$ is determined according to
$u_k^* = \arg\min_{u_k} V_k(x_k, u_k)$ ,
where
$V_k(x_k, u_k) = \min_{u_{k,1:N-1}} \mathrm{E}_{x_{k,1:N}}\Big\{ g_N(x_{k,N}) + \sum_{n=0}^{N-1} g_n(x_{k,n}, u_{k,n}) \Big\}$ , (4)
with $x_k = x_{k,0}$. $V_k(x_k, u_k)$ comprises the step costs $g_n(x_{k,n}, u_{k,n})$, depending on the predicted system states $x_{k,n}$ and the corresponding control inputs $u_{k,n}$, as well as a terminal cost $g_N(x_{k,N})$. This open-loop optimal control input $u_k^*$ is then applied to the system at time step k. In the next time step k + 1, the whole procedure is repeated, which leads to an open-loop feedback control scheme.
For most nonlinear systems, the analytical evaluation of (2) is not possible. One
reason for this is the required prediction of system states for a noise-affected nonlinear
system. The other one is the necessity for calculating expected values, which also cannot
be performed in closed form. In the next sections an integrated approach to overcome
these two problems is presented.
3 State Prediction
Predicting the system state is an important part of stochastic NMPC for noise-affected systems. The probability density $\tilde{f}^x_{k+1}(x_{k+1})$ of the system state $x_{k+1}$ for the next time step k + 1 has to be computed utilizing the so-called Chapman-Kolmogorov equation [12]
$\tilde{f}^x_{k+1}(x_{k+1}) = \int_{\mathbb{R}^d} \tilde{f}^T_{u_k}(x_{k+1}\,|\,x_k)\, \tilde{f}^x_k(x_k)\, \mathrm{d}x_k$ . (5)
The transition density $\tilde{f}^T_{u_k}(x_{k+1}|x_k)$ depends on the system described by (1). For linear systems with Gaussian noise, the Kalman filter [13] provides an exact solution to (5), as (5) is reduced to the evaluation of an integral over a product of two Gaussian densities, which is analytically solvable.
For nonlinear systems, an approximate description of the predicted density $\tilde{f}^x_{k+1}(x_{k+1})$ is inevitable, since an exact closed-form representation is generally impossible to obtain. One possible approach to stochastic NMPC is linearizing the system and then applying the Kalman filter [14]. The resulting single Gaussian density is typically not sufficient for approximating $\tilde{f}^x_{k+1}(x_{k+1})$. Hence, we propose representing all densities involved in (5) by means of Gaussian mixtures, which can be done due to their universal approximation property [15].
To reduce the complexity of approximating all density functions corresponding to
system (1) and to allow an efficient state prediction, the concept of modularization
is proposed, see Section 3.3. Here, (1) is decomposed into vector-valued subsystems.
Approximations for these subsystems in turn can be reduced to the scalar case, as stated
in Section 3.2. For that purpose, in the following section a short review on the closed-
form prediction approach for scalar systems with additive noise is given. Combining
these techniques enables state prediction for system (1) based on Gaussian mixture
approximations of the transition density functions corresponding to scalar systems.
For a scalar system with additive noise,
$x_{k+1} = a(x_k, u_k) + w_k$ ,
the predicted density is represented by the Gaussian mixture
$f^x_{k+1}(x_{k+1}) = \sum_{i=1}^{L} \omega_i \cdot \mathcal{N}(x_{k+1} - \mu_i;\, \sigma_i^2)$ , (6)
where L is the number of Gaussian components, $\mathcal{N}(x_{k+1} - \mu_i; \sigma_i^2)$ is a Gaussian density with mean $\mu_i$ and standard deviation $\sigma_i$, and the weighting coefficients $\omega_i$ satisfy $\omega_i > 0$ as well as $\sum_{i=1}^{L} \omega_i = 1$. For obtaining this approximate representation of the true predicted density that provides high accuracy especially with respect to higher-order moments and multimodalities, the corresponding transition density $\tilde{f}^T_{u_k}(x_{k+1}|x_k)$ from (5) is approximated off-line by the Gaussian mixture
$f^T_{u_k}(x_{k+1}, x_k, \eta) = \sum_{i=1}^{L} \omega_i \cdot \mathcal{N}(x_{k+1} - \mu_{i,1};\, \sigma_{i,1}^2) \cdot \mathcal{N}(x_k - \mu_{i,2};\, \sigma_{i,2}^2)$ .
The axis-aligned structure of the approximate transition density allows performing repeated prediction steps with constant complexity, i.e., a constant number L of mixture components for $f^x_{k+1}(x_{k+1})$.
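The reason the component count stays constant can be seen in a few lines: with an axis-aligned transition mixture, the Chapman-Kolmogorov integral (5) only requires integrals of products of two scalar Gaussians, which are available in closed form. The Python sketch below performs one such scalar prediction step; the mixture parameters are arbitrary placeholders, not an approximation of any particular system.

import numpy as np

def gauss(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

def predict_scalar_gm(prior_w, prior_mu, prior_var, trans):
    """One scalar prediction step (5) with an axis-aligned Gaussian mixture
    transition density: components (w_i, mu_i1, var_i1, mu_i2, var_i2).
    The result keeps exactly len(trans) components."""
    w_new, mu_new, var_new = [], [], []
    for w_i, mu_i1, var_i1, mu_i2, var_i2 in trans:
        # closed-form integral of N(x_k - mu_i2; var_i2) * prior(x_k) dx_k
        mass = sum(pw * gauss(mu_i2, pm, var_i2 + pv)
                   for pw, pm, pv in zip(prior_w, prior_mu, prior_var))
        w_new.append(w_i * mass)
        mu_new.append(mu_i1)
        var_new.append(var_i1)
    w_new = np.array(w_new)
    # renormalize, since the off-line approximation need not integrate to one
    return w_new / w_new.sum(), np.array(mu_new), np.array(var_new)

# Placeholder prior mixture and transition mixture (illustrative only).
prior = ([0.5, 0.5], [0.0, 1.0], [0.2, 0.3])
trans = [(0.6, 1.5, 0.1, 0.0, 0.2), (0.4, -0.5, 0.2, 1.0, 0.3)]
print(predict_scalar_gm(*prior, trans))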
This efficient prediction approach can be directly applied to vector-valued systems,
like (1). However, off-line approximation of the multi-dimensional transition density
corresponding to such a system is computationally demanding. Therefore, in the next
two sections techniques to lower the computational burden are introduced.
Consider now the vector-valued system with element-wise additive noise
$x_{k+1} = a(x_k, u_k) + w_k$ , (7)
with state vector $x_{k+1} = [x_{k+1,1}, x_{k+1,2}, \ldots, x_{k+1,d}]^T \in \mathbb{R}^d$ and noise $w_k = [w_{k,1}, w_{k,2}, \ldots, w_{k,d}]^T \in \mathbb{R}^d$. Under the assumption that $w_k$ is white and stationary (but not necessarily Gaussian or zero-mean), with mutually independent elements $w_{k,j}$, approximating the corresponding transition density $\tilde{f}^T_{u_k}(x_{k+1}|x_k) = \tilde{f}^w(x_{k+1} - a(x_k, u_k))$ can be reduced to the scalar system case.
For our proposed stochastic NMPC framework, we assume that the nonlinear system is
corrupted by element-wise additive noise. By incorporating this specific noise structure,
the previously stated closed-form prediction step can indirectly be utilized for system
(1). Similar to Rao-Blackwellized particle filters [16], we can reduce the system in (1)
to a set of less complex subsystems. These subsystems are of a form according to (7),
$x_{k+1} = a(x_k, u_k, w_k) = a^{(m)}(x_k^{(m)}, u_k) + w_k^{(m)}$ ,
$x_k^{(m)} = a^{(m-1)}(x_k^{(m-1)}, u_k) + w_k^{(m-1)}$ ,
$\vdots$
$x_k^{(2)} = a^{(1)}(x_k, u_k) + w_k^{(1)}$ .
We name this approach modularization, where the subsystems are
$x_k^{(i+1)} = a^{(i)}(x_k^{(i)}, u_k) + w_k^{(i)}$ , for $i = 1, \ldots, m$ .
For the example system, this decomposition yields
$x_{k+1} = x_k + x_k^{(2)}$ ,
$\alpha_{k+1} = \alpha_k + u_k + w^\alpha_k$ .
The auxiliary system state $x_k^{(2)}$ is stochastically dependent on $\alpha_k$. We omit this dependence in further investigations of the example system for simplicity.
Please note that there are typically stochastic dependencies between several auxil-
iary system states. To consider this fact, the relevant auxiliary system states have to be
augmented to conserve the dependencies. Thus, the dimensions of these auxiliary states
need not all be equal.
4 Cost Functions
In this section, two possibilities to model cost functions, the well-known quadratic devi-
ation and a novel approach employing Gaussian mixture cost functions, are presented.
Exploiting the fact that the predicted state variables are, as explained in the previous
section, described by Gaussian mixture densities, the necessary evaluation of the ex-
pected values in (4) can be performed efficiently in closed-form for both options.
In the following, cumulative cost functions according to (4) are considered, where
gn (xk,n , uk,n ) denotes a step cost within the horizon and gN (xk,N ) a cost depending
on the terminal state at the end of the horizon.
For simplicity, step costs that are additively decomposable according to
$g_n(x_{k,n}, u_{k,n}) = g^x_n(x_{k,n}) + g^u_n(u_{k,n})$
are considered, although the proposed framework is not limited to this case.
One of the most popular cost functions is the quadratic deviation from a target value $\check{x}$ or $\check{u}$ according to
$g^x_n(x_n) = (x_n - \check{x}_n)^T (x_n - \check{x}_n)$ .
As in our framework the probability density function of the state $x_n$ is given by an axis-aligned Gaussian mixture $f^x_n(x_n)$ with L components, the calculation of $\mathrm{E}_{x_n}\{g^x_n(x_n)\}$, which is necessary to compute (4), can be performed analytically, as it can be interpreted as the sum over shifted and dimension-wise calculated second-order moments
$\mathrm{E}_{x_n}\{g^x_n(x_n)\} = \mathrm{E}_{x_n}\{(x_n - \check{x}_n)^T (x_n - \check{x}_n)\} = \mathrm{trace}\, \mathrm{E}^2_{x_n}\{(x_n - \check{x}_n)\} = \mathrm{trace}\Big( \sum_{i=1}^{L} \omega_i \big[ (\mu_i - \check{x}_n)(\mu_i - \check{x}_n)^T + \mathrm{diag}(\sigma_i)^2 \big] \Big)$ ,
employing $\mathrm{E}^2_x\{x\} = \sum_{i=1}^{L} \omega_i (\mu_i^2 + \sigma_i^2)$.
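The closed-form expression above is easy to cross-check numerically. The sketch below compares it against a Monte Carlo estimate for an arbitrary illustrative mixture (the parameters are assumptions, not values from the paper).

import numpy as np

def expected_quadratic_cost(weights, means, sigmas, x_target):
    """E{(x - x_target)^T (x - x_target)} for an axis-aligned Gaussian mixture."""
    total = 0.0
    for w, mu, sig in zip(weights, means, sigmas):
        diff = mu - x_target
        total += w * (diff @ diff + np.sum(sig ** 2))
    return total

rng = np.random.default_rng(1)
weights = np.array([0.3, 0.7])
means = np.array([[0.0, 1.0], [2.0, -1.0]])
sigmas = np.array([[0.5, 0.2], [0.3, 0.4]])
x_target = np.array([1.0, 0.0])

analytic = expected_quadratic_cost(weights, means, sigmas, x_target)

# Monte Carlo cross-check by sampling from the mixture.
idx = rng.choice(2, size=200000, p=weights)
samples = means[idx] + sigmas[idx] * rng.standard_normal((200000, 2))
mc = np.mean(np.sum((samples - x_target) ** 2, axis=1))
print(analytic, mc)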
Here, the calculation of the expected value $\mathrm{E}_{x_n}\{g^x_n(x_n)\}$, which is necessary for the calculation of (4), can also be performed analytically:
$\mathrm{E}_{x_n}\{g^x_n(x_n)\} = \int_{\mathbb{R}^d} f^x_n(x_n) \cdot g^x_n(x_n)\, \mathrm{d}x_n = \int_{\mathbb{R}^d} \sum_{i=1}^{L} \omega_i\, \mathcal{N}(x_n - \mu_i; \mathrm{diag}(\sigma_i)^2) \cdot \sum_{j=1}^{M} \omega_j\, \mathcal{N}(x_n - \mu_j; \mathrm{diag}(\sigma_j)^2)\, \mathrm{d}x_n = \sum_{i=1}^{L} \sum_{j=1}^{M} \omega_{ij} \underbrace{\int_{\mathbb{R}^d} \mathcal{N}(x_n - \mu_{ij}; \mathrm{diag}(\sigma_{ij})^2)\, \mathrm{d}x_n}_{=1}$ , (8)
with combined weights $\omega_{ij}$, means $\mu_{ij}$, and standard deviations $\sigma_{ij}$ resulting from the pairwise products of Gaussian components, where $f^x_n(x_n)$ denotes the L-component Gaussian mixture probability density function of the system state (6) and $g^x_n(x_n)$ the cost function, which is a Gaussian mixture with M components.
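The key identity behind (8) is that the integral of a product of two Gaussians is a Gaussian evaluated at the difference of the means. The scalar sketch below uses exactly this identity to evaluate the expected Gaussian mixture cost; the mixture parameters are arbitrary illustrations.

import numpy as np

def gauss(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def expected_gm_cost(state_gm, cost_gm):
    """E{g(x)} for a scalar Gaussian mixture state density and Gaussian
    mixture cost, using int N(x-mu_i; v_i) N(x-mu_j; v_j) dx = N(mu_i - mu_j; v_i + v_j)."""
    return sum(wi * wj * gauss(mi, mj, vi + vj)
               for wi, mi, vi in state_gm
               for wj, mj, vj in cost_gm)

# (weight, mean, variance) triples; negative cost weights reward regions,
# as in the asymmetric cost functions of Fig. 3 (values are illustrative).
state = [(0.4, 0.0, 0.3), (0.6, 2.0, 0.5)]
cost = [(-1.0, 2.0, 0.2), (-0.5, 5.0, 1.0)]
print(expected_gm_cost(state, cost))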
The versatility of the cost function representation can even be increased if also Dirac
mixtures, i.e., weighted sums of Dirac delta distributions, are employed. This allows penalizing (or rewarding) individual discrete configurations of the continuous-valued state
space. The calculation of the expected value Exn {gnx (xn )} can be carried out similarly
to (8) for Dirac mixtures as well as for the sum of Gaussian and Dirac mixtures [17].
Fig. 3. Asymmetric and multimodal cost functions consisting of four and three components
(gray), respectively.
Fig. 4. First 40 steps of a simulation: (a) position and orientation, (b) cost per step (red solid line: stochastic NMPC, green dotted line: stochastic NMPC with DP, blue dashed line: deterministic NMPC).
Using the efficient state prediction presented in Section 3 together with the value
function representations presented above, (2) can be solved analytically for a finite set of
control inputs. Thus, an efficient closed-form solution for the optimal control problem
within stochastic NMPC is available. Its capabilities will be illustrated by simulations
in the next section.
5 Simulations
Based on the above example scenario, several simulations are conducted to illustrate
the modeling capabilities of the proposed framework as well as to illustrate the benefits
that can be gained by the direct consideration of noise in the control. The considered
system is given by (2) and (3), with s = 1 and $u_k \in \{-0.2, -0.1, 0, 0.1, 0.2\}$. The considered noise influences on the system, $w^x_k$ and $w^\alpha_k$, are zero-mean white Gaussian noise with standard deviations $\sigma^x_w = 0.5$ and $\sigma^\alpha_w = 0.05 \approx 2.9^\circ$, respectively. The
measurement noise is also zero-mean white Gaussian noise, with standard deviations $\sigma^x_v = 0.5$ and $\sigma^\alpha_v = 0.1 \approx 5.7^\circ$. All simulations are performed for an N = 4 step prediction horizon, with a cumulative cost function according to (4), where $g_N(x_{k,N})$ is the function depicted in Fig. 3 (a) and $g_n(x_{k,n}, u_{k,n}) = g_N(x_{k,N})\ \forall n$. In addition,
the modularization is employed as described above.
To evaluate the benefits of the proposed NMPC framework, three different kinds of
simulations are performed:
Direct calculation of the optimal input considering all noise influences (stochastic
NMPC):
The direct calculation of the open-loop feedback control input with consideration of
the noise is performed using the techniques presented in the previous sections. Thus,
it is possible to execute all calculations analytically without the need for any numer-
ical method. Still, this approach has the drawback that the computational demand for
the optimal control problem increases exponentially with the length of the horizon N ,
which makes it only suitable for short horizons.
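The exponential growth can be made concrete with a rough sketch: with a finite input set U, every length-N input sequence must be evaluated, i.e., |U|^N expected-cost evaluations. The expected-cost routine below is a dummy placeholder; in the framework above it would be the closed-form Gaussian mixture evaluation described in Sections 3 and 4.

from itertools import product

U = [-0.2, -0.1, 0.0, 0.1, 0.2]   # finite control set from the example
N = 4                              # prediction horizon

def expected_sequence_cost(u_seq, x0):
    """Placeholder for the expected cumulative cost of applying the open-loop
    sequence u_seq from state x0 (here: a dummy quadratic, for illustration)."""
    x, cost = x0, 0.0
    for u in u_seq:
        x = x + u                  # stand-in for the predicted state statistics
        cost += (x - 2.0) ** 2     # stand-in for E{g_n}
    return cost

def direct_open_loop_input(x0):
    # Enumerate all |U|^N sequences and keep the first input of the best one.
    best = min(product(U, repeat=N), key=lambda seq: expected_sequence_cost(seq, x0))
    return best[0]

print(direct_open_loop_input(0.0), "out of", len(U) ** N, "sequences")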
Calculation of the optimal input with a value function approximation scheme and Dy-
namic Programming (stochastic NMPC with Dynamic Programming):
In order to be able to use the framework efficiently also for long prediction horizons
as well as to consider state information within the prediction horizon (closed-loop
feedback or optimal control), it is necessary to employ Dynamic Programming (DP).
Unfortunately, this is not directly possible, as no closed-form solution for the value
function Jn is available. One easy but either not very accurate or computationally de-
manding solution would be to discretize the state space. More advanced solutions can
be found by value function approximation [18]. For the simulations, a value function
approximation as described in [7] is employed that is well-suited with regard to closed-
form calculations. Here, the state space is discretized by covering it with a finite set
of Gaussians with fixed means and covariances. Then weights, i.e., scaling factors,
are selected in such a way that the approximate and the true value function coincide
at the means of every Gaussian. Using these approximate value functions together
with the techniques described above, again all calculations can be executed analyti-
cally. In contrast to the direct calculation, now the computational demand increases
only linearly with the length of the prediction horizon but quadratically in the num-
ber of Gaussians used to approximate the value function. Here, the value functions are
approximated by a total of 833 Gaussians equally spaced over the state space within
(x̂n , α̂n ) ∈ Ω := [−2, 10] × [−2, 2].
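The grid-of-Gaussians approximation described above amounts to solving one linear system per stage: with fixed means and covariances, the scaling weights are chosen so that the weighted sum matches the true value function at every mean. The one-dimensional sketch below illustrates this interpolation step; the grid, bandwidth, and target function are illustrative assumptions.

import numpy as np

def fit_gaussian_value_function(centers, sigma, v_at_centers):
    """Choose weights so that sum_j w_j * N(c_i - c_j; sigma^2) = V(c_i)
    for every grid center c_i (interpolation at the means)."""
    diff = centers[:, None] - centers[None, :]
    A = np.exp(-0.5 * (diff / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
    return np.linalg.solve(A, v_at_centers)

centers = np.linspace(-2.0, 10.0, 25)                  # fixed, equally spaced means
sigma = 0.5                                            # fixed standard deviation
v_true = lambda x: np.minimum((x - 2.0) ** 2, 20.0)    # illustrative value function

w = fit_gaussian_value_function(centers, sigma, v_true(centers))

def v_approx(x):
    return np.sum(w * np.exp(-0.5 * ((x - centers) / sigma) ** 2)
                  / (np.sqrt(2 * np.pi) * sigma))

print(v_true(3.3), v_approx(3.3))   # approximation evaluated off-grid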
For each simulation run, a particular noise realization is used that is applied to the
different controllers. In Fig. 4(a), the first 40 steps of a simulation run are shown. The
distance to the wall xk is depicted by the position of the circles, the orientation αk by
the orientation of the arrows. It can be clearly seen that the system is heavily influenced
by noise and that the robot under deterministic control behaves very differently from
the other two. The deterministic controller just tries to move the robot to the minimum
of the cost function at x̌k = 2 and totally neglects the asymmetry of the cost function.
The stochastic controllers lead to a larger distance to the wall, as they consider the noise
affecting the system in conjunction with the non-symmetric cost function.
In Fig. 4(b), the evaluation of the cost function for each step is shown. As expected,
both stochastic controllers perform much better, i.e., they generate less cost, than the
deterministic one. This finding has been validated by a series of 100 Monte Carlo sim-
ulations with different noise realizations and initial values. The uniformly distributed
initial values are sampled from the interval x0 ∈ [0, 8] and α0 ∈ [−π/4, π/4]. In Ta-
ble 1, the average step costs of the 100 simulations with 40 steps each are shown. To
facilitate the comparison, also normalized average step costs are given. Here, it can be
seen that the stochastic controller outperforms the deterministic one by over 10% in
terms of cost. In 82% of the runs, the stochastic controller gives overall better results
than the deterministic one. By employing dynamic programming together with value
function approximation the benefits are reduced. Here, the deterministic controller is
only outperformed by approximately 3.5%. The analysis of the individual simulations leads to the conclusion that the control quality degrades significantly when the robot attains a state that is less well approximated by the value function approximation because it lies outside Ω. Still, the dynamic programming approach produced better results than
the deterministic approach in 69% of the runs.
These findings illustrate the need for advanced value function approximation tech-
niques in order to gain good control performance. One approach that seamlessly inte-
grates into the presented SNMPC framework and that outperforms the employed value
function approximation significantly is presented in [19]. This is possible by abandon-
ing the grid approximation approach and using Gaussian kernels that are placed in an
optimized fashion.
6 Conclusions
A novel framework for closed-form stochastic Nonlinear Model Predictive Control for
a continuous state space and a finite set of control inputs has been presented that di-
rectly incorporates the noise influence in the corresponding optimal control problem.
By using the proposed state prediction methods, which are based on transition density
approximation by Gaussian mixture densities and complexity reduction techniques, the
otherwise not analytically solvable state prediction of nonlinear noise affected systems
can be performed in an efficient closed-form manner. Another very important aspect of
NMPC is the modeling of the cost function. The proposed methods also use Gaussian
mixtures, which leads to a level of flexibility far beyond the traditional representations.
By employing the same representation for both the predicted probability density func-
tions and the cost functions, stochastic NMPC is solvable in closed-form for nonlin-
ear systems with consideration of noise influences. The effectiveness of the presented
framework and the importance of the consideration of noise in the controller have been
shown in simulations of a walking robot following a specified trajectory.
One interesting future extension will be the incorporation of the state estimation in
the control, which is important for nonlinear systems and nonquadratic cost functions,
as here the separation principle does not hold. An additional interesting aspect will be
the consideration of effects of inhomogeneous noise, i.e., noise with state and/or input
dependent noise levels. Here, the consideration of the stochastic behavior of the system
is expected to have an even greater impact on the control quality. Also the extension
to new application fields is intended. Of special interest is the extension to the related
emerging field of Model Predictive Sensor Scheduling [20, 21], which is of special importance, e.g., in sensor-actuator networks.
References
1. Qin, S.J., Badgwell, T.A.: An Overview of Industrial Model Predictive Control Technology. Chemical Process Control 93(316) (1997) 232–256
2. Findeisen, R., Allgöwer, F.: An Introduction to Nonlinear Model Predictive Control. In: 21st Benelux Meeting on Systems and Control. (March 2002) 119–141
3. Ohtsuka, T.: A Continuation/GMRES Method for Fast Computation of Nonlinear Receding
Horizon Control. Automatica 40(4) (April 2004) 563–574
4. Camacho, E.F., Bordons, C.: Model Predictive Control. 2 edn. Springer-Verlag London Ltd.
(June 2004)
5. Kappen, H.J.: Path integrals and symmetry breaking for optimal control theory. Journal of
Statistical Mechanics: Theory and Experiments 2005(11) (November 2005) P11011
6. Deisenroth, M.P., Weissel, F., Ohtsuka, T., Hanebeck, U.D.: Online-Computation Approach
to Optimal Control of Noise-Affected Nonlinear Systems with Continuous State and Control
Spaces. In: Proceedings of the European Control Conference (ECC 2007), Kos, Greece (July
2007)
7. Nikovski, D., Brand, M.: Non-Linear Stochastic Control in Continuous State Spaces by Ex-
act Integration in Bellman’s Equations. In: Proceedings of the 2003 International Conference
on Automated Planning and Scheduling. (June 2003) 91–95
8. Marecki, J., Koenig, S., Tambe, M.: A Fast Analytical Algorithm for Solving Markov Deci-
sion Processes with Real-Valued Resources. In: Proceedings of the Twentieth International
Joint Conference on Artificial Intelligence (IJCAI-07). (January 2007)
9. Huber, M., Brunn, D., Hanebeck, U.D.: Closed-Form Prediction of Nonlinear Dynamic
Systems by Means of Gaussian Mixture Approximation of the Transition Density. In: Pro-
ceedings of the IEEE International Conference on Multisensor Fusion and Integration for
Intelligent Systems (MFI 2006). (September 2006) 98–103
10. Weissel, F., Huber, M.F., Hanebeck, U.D.: A Closed–Form Model Predictive Control Frame-
work for Nonlinear Noise–Corrupted Systems. In: 4th International Conference on Informat-
ics in Control, Automation and Robotics (ICINCO 2007). Volume SPSMC., Angers, France
(May 2007) 62–69
11. Weissel, F., Huber, M.F., Hanebeck, U.D.: Test-Environment based on a Team of Miniature
Walking Robots for Evaluation of Collaborative Control Methods. In: IEEE/RSJ Interna-
tional Conference on Intelligent Robots and Systems (IROS 2007). (November 2007)
Abstract. The conjugate gradient is the most popular optimization method for solving large systems of linear equations. In a system identification problem, for example, where very large impulse responses are involved, it is necessary to apply a particular strategy which diminishes the delay while improving the convergence time. In this paper we propose a new scheme which combines frequency-domain adaptive filtering with a conjugate gradient technique in order to solve a high order multichannel adaptive filter, while being delayless and guaranteeing a very short convergence time.
1 Introduction
The multichannel adaptive filtering problem’s solution depends on the correlation be-
tween the channels, the number of channels and the order and nature of the impulse re-
sponses involved in the system. The multichannel acoustic echo cancellation (MAEC)
application, for example, can be seen as a system identification problem with extremely
large impulse responses (depending on the environment and its reverberation time, the
echo paths can be characterized by FIR filters with thousands of taps).
In these cases a multirate adaptive scheme such as the partitioned block frequency-domain adaptive filter (PBFDAF) [8] is a good alternative and is widely used in commercial systems nowadays. However, the convergence speed may not be fast enough
under certain circumstances.
Figure 1 shows the working framework, where $x_p$ represents the input signal of channel p, d the desired signal, y the output of the adaptive filter and e the error signal we try to minimize. In typical scenarios, the filter input signals $x_p$, $p = 1, \cdots, P$ (where P is the number of channels), are highly correlated, which further reduces the overall convergence of the adaptive filter coefficients $w_{pm}$, $m = 1, \cdots, L$ (L is the filter length),
$y(n) = \sum_{p=1}^{P} \sum_{m=1}^{L} x_p(n - m)\, w_{pm}$ . (1)
The mean square error (MSE) minimization of the multichannel signal with respect
to the filter coefficients is equivalent to the Wiener-Hopf equation
Rw = r . (2)
2 PBFDAF
The PBFDAF was developed to deal efficiently with such situations. The PBFDAF is a more efficient implementation of the Least Mean Square (LMS) algorithm in the frequency domain. It reduces the computational burden and keeps the user-delay bounded. In general, the PBFDAF is widely used due to its good trade-off between speed, computational complexity and overall latency. However, when working with long impulse responses, such as the acoustic impulse responses (AIR) used in MAEC, the convergence properties provided by the algorithm may not be sufficient. Besides, the multichannel adaptive filter is structurally more difficult, in general, than the single channel case [4].
This technique makes a sequential partition of the impulse response in the time domain prior to a frequency-domain implementation of the filtering operation. The partitioned filtering operation can then be written as
$y(n) = \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{m=0}^{K-1} x_p(n - qK - m)\, w_{p(qK+m)}$ , (3)
where the total filter length L for each channel is a multiple of the length K of each segment, L = QK, K ≤ L. Thus, using the appropriate data sectioning procedure, the Q linear convolutions (per channel) of the filter can be independently carried out in the frequency domain with a total delay of K samples instead of the QK samples needed in standard FDAF implementations.
Figure 2 shows the block diagram of the algorithm using the overlap-save method. In the frequency domain, with matrix notation, equation (3) can be expressed as
$\mathbf{Y} = \mathbf{X} \otimes \mathbf{W}$ . (4)
Notice that the sums are performed prior to the time-domain translation. In this way
we reduce (P − 1)(Q − 1) FFTs in the complete filtering process. As in any adaptive
system the error can be obtained as
$e = d - y$ , (6)
with $d = [\, d(mK) \cdots d((m+1)K - 1) \,]^T$.
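The partitioning idea of (3)-(4) can be illustrated with a single-channel overlap-save sketch: the length-L filter is split into Q blocks of length K, each block is applied in the frequency domain with a 2K-point FFT, and the partial products are summed before returning to the time domain. This sketch covers the filtering path only, not the adaptive update, and all sizes are assumed example values.

import numpy as np

def partitioned_overlap_save(x, w, K):
    """Filter x with w (length L = Q*K) block-by-block in the frequency domain."""
    L = len(w)
    Q = L // K
    # 2K-point spectra of the Q filter partitions (zero-padded).
    W = np.array([np.fft.rfft(np.r_[w[q * K:(q + 1) * K], np.zeros(K)]) for q in range(Q)])
    x = np.r_[np.zeros(L), x]                 # input history for the oldest partition
    y = []
    for m in range(L, len(x) - K + 1, K):     # one output block of K samples per step
        acc = np.zeros(K + 1, dtype=complex)
        for q in range(Q):                    # sum the partial products in frequency
            block = x[m - q * K - K:m - q * K + K]
            acc += np.fft.rfft(block) * W[q]
        y.append(np.fft.irfft(acc)[K:])       # keep the last K samples (overlap-save)
    return np.concatenate(y)

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
w = rng.standard_normal(512)                  # L = 512, K = 128, Q = 4
y_fast = partitioned_overlap_save(x, w, K=128)
y_ref = np.convolve(x, w)[:len(y_fast)]
print(np.max(np.abs(y_fast - y_ref)))         # should be numerically tiny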
The error in the frequency domain (for the update of the filter coefficients) can be obtained as
$\underline{e} = F \begin{bmatrix} 0_{K \times 1} \\ e \end{bmatrix}$ . (7)
3 PBFDAF-CG
The CG algorithm is a technique originally developed to minimize quadratic functions such as (2), and was later adapted to the general case [6]. Its main advantage is its speed, as it converges in a finite number of steps. In the first iteration it starts by estimating the gradient, as in the steepest descent (SD) method, and from there it builds successive directions that form a set of mutually conjugate vectors with respect to the positive definite Hessian (in our case, the auto-correlation matrix R in the frequency domain).
In each m-block iteration the conjugate gradient algorithm iterates $k = 1, \cdots, \min(N, K)$ times, where N represents the memory of the gradient estimation, $N \leq K$. In a practical system the algorithm is stopped when it reaches a user-determined MSE level. To apply this conjugate gradient approach to the PBFDAF algorithm, the weight update equation (8) must be modified as
$w(m+1) = w(m) + \alpha\, v(m)$ . (11)
where w is the coefficient vector of dimension $MQP \times 1$ which results from rearranging the matrix W (in the notation $w \leftarrow W$), and v belongs to a finite R-conjugate vector set which satisfies $v_i^H R v_j = 0,\ \forall i \neq j$. The R-conjugacy property is useful, as the linear independence of the conjugate vector set allows expanding the solution $w^\bullet$ as
$w^\bullet = \alpha_0 v_0 + \cdots + \alpha_{K-1} v_{K-1} = \sum_{k=0}^{K-1} \alpha_k v_k$ . (12)
Starting at any point $w_0$ of the weighting space, we can define $v_0 = -g_0$, with $g_0 \leftarrow \bar{G}_0$, $\bar{G}_0 = \nabla(W_0)$, $p_0 \leftarrow \bar{P}_0$, $\bar{P}_0 = \nabla(W_0) - \bar{G}_0$, and
$w_{k+1} = w_k + \alpha_k v_k$ , (13)
$\alpha_k = \dfrac{g_k^H v_k}{v_k^H (g_k - p_k)}$ , (14)
This alternative approach requires neither knowledge of the Hessian nor the employment of a line search. Notice that all the operations (13-17) are vector operations that keep the computational complexity low. Equation (17) is known as the Hestenes-Stiefel method, but there are different approaches for calculating $\beta_k$: the Fletcher-Reeves (19), Polak-Ribière (20) and Dai-Yuan (21) methods.
$\beta_k^{FR} = \dfrac{g_{k+1}^H g_{k+1}}{g_k^H g_k}$ , (19)
$\beta_k^{PR} = \dfrac{g_{k+1}^H (g_{k+1} - g_k)}{g_k^H g_k}$ , (20)
$\beta_k^{DY} = \dfrac{g_{k+1}^H g_{k+1}}{v_k^H (g_{k+1} - g_k)}$ . (21)
The constant $\beta_k$ is chosen to provide R-conjugacy of the vector $v_k$ with respect to the previous direction vectors $v_{k-1}, \cdots, v_0$. Instability occurs whenever $\beta_k$ exceeds unity.
In this approach, the successive directions are not guaranteed to be conjugate to each other, even when one uses the exact value of the gradient at each iteration. To ensure the stability of the algorithm, the gradient can be reinitialized by forcing $\beta_k = 1$ in (16) whenever $\beta_k > 1$.
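For reference, the sketch below shows a plain conjugate gradient iteration for a linear system of the form Rw = r as in (2), using the Fletcher-Reeves rule (19) to generate the conjugate directions. It is the textbook linear CG with an exact gradient, not the frequency-domain, gradient-estimated PBFDAF-CG variant; the test matrix is an assumed stand-in for a correlation matrix.

import numpy as np

def conjugate_gradient(R, r, iters=None, tol=1e-10):
    """Solve R w = r for symmetric positive definite R with linear CG,
    using the Fletcher-Reeves beta (19)."""
    n = len(r)
    w = np.zeros(n)
    g = R @ w - r                   # gradient of 0.5 w^T R w - r^T w
    v = -g                          # first direction: steepest descent
    for _ in range(iters or n):
        Rv = R @ v
        alpha = (g @ g) / (v @ Rv)          # exact step length for a quadratic
        w = w + alpha * v
        g_new = g + alpha * Rv
        if np.linalg.norm(g_new) < tol:
            break
        beta = (g_new @ g_new) / (g @ g)    # Fletcher-Reeves
        v = -g_new + beta * v
        g = g_new
    return w

# Illustrative SPD system standing in for the multichannel correlation matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
R = A @ A.T + 50 * np.eye(50)
r = rng.standard_normal(50)
print(np.linalg.norm(R @ conjugate_gradient(R, r) - r))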
4 Computational Cost
Table 1 shows a comparative analysis of both algorithms in terms of the number of operations (multiplications and additions), clustered by functionality. Note that the constants A, B and C in the PBFDAF computational burden estimation are used as a reference for the number of operations in the PBFDAF-CG. For one iteration (k = 1), the computational cost of the PBFDAF-CG is 40 times higher than that of the PBFDAF.
5 Simulation Examples
The impulse responses are calculated using the image method [2] with an expected reverberation time of 70 ms (reflection coefficients [0.8 0.8; 0.5 0.5; 0.6 0.6]). The speech source, microphones and loudspeakers are situated as in Fig. 3. In the emitting room, the source is located at [1000 500 1000] and the microphones at [{800 900 1000 1100 1200} 2000 750]. Notice that the microphone separation is only 10 cm, which is a worst-case scenario that provides highly correlated signals. In the reception room the loudspeakers are situated at [{500 750 1000 1250 1500} 100 750] and the microphone at [1000 2000 750].
The directivity patterns of the loudspeakers ([elevation 0 ◦ , azimuth -90 ◦ , aperture
beam 180 ◦ ]) and the microphones ([0 ◦ 90 ◦ 180 ◦ ]) are modified so that they are face
to face. We are considering P = 5 channels as it is a realistic situation for home appli-
cations; enough for obtaining good spatial localization and significantly more complex
than the stereo case.
6 Conclusions
The PBFDAF algorithm is widely used in multichannel adaptive filtering applications such as commercial MAEC systems, with good results (in general for the stereo case). However, especially when working in multichannel, highly reverberant environments (like teleconferencing) its convergence may not be fast enough. In this article we
have presented a novel algorithm, the PBFDAF-CG, based on the same structure but using the much more powerful CG technique to speed up the convergence and improve the MSE and misalignment performance.
As shown in the results, the proposed algorithm improves the MSE and misalignment performance and converges much faster than its counterpart, while keeping the computational cost relatively low, because all the operations are performed between vectors in the frequency domain.
We are working on better gradient estimation methods in order to reduce the computational cost. Besides, it is possible to arrive at a compromise between complexity and speed by modifying the maximum number of iterations.
Figure 6 shows the PBFDAF-CG iterations versus time. The total number of iterations for this experiment is 992 for the PBFDAF and 1927 for the PBFDAF-CG (80 times higher computational cost).
Figure 7 shows the result of the PBFDAF-CG with an MLS source (identical settings) and Fig. 8 the iterations versus time. Notice the more uniform MSE convergence and the better misalignment. The computational cost decreases as time increases. A better performance is possible by increasing the SNR and lowering the MSE level threshold.
References
1. Aguado, A., Martínez, M.: Identificación y Control Adaptativo. Prentice Hall (2003).
2. Allen, J.B., Berkley, D.A.: Image method for efficiently simulating small-room acoustics. In
J.A.S.A. 65 (1979) 943–950.
3. Bendel, Y., Burshtein, D.: Delayless Frequency Domain Acoustic Echo Cancellation. In IEEE Transactions on Speech and Audio Processing. 9:5 (2001) 589–587.
4. Benesty, J., Huang, Y. (Eds.): Adaptive Signal Processing: Applications to Real-World Prob-
lems, Springer (2003).
5. Boray, G., Srinath, M.D.: Conjugate Gradient Techniques for Adaptive Filtering. In IEEE
Transactions on Circuits and Systems-I: Fundamental Theory and Application. 39:1 (1992)
1–10.
6. Luenberger, D.G.: Introduction to Linear and Nonlinear Programming, MA: Addison-
Wesley, Reading, Mass (1984).
7. Shynk, J.J.: Frequency-Domain and Multirate Adaptive Filtering. In IEEE Signal Processing Magazine. 9:1 (1992) 15–37.
8. Páez Borrallo, J., Garcı́a Otero, M.: On the implementation of a partitioned block frequency-
domain adaptive filter (PBFDAF) for long acoustic echo cancellation. In Signal Processing.
27 (1992) 301–315.
Appendix
The “conjugacy” relation vH i Rvj = 0, ∀i = j means that two vectors, vi and vj , are
orthogonal with respect to any symmetric positive matrix R. This can be looked upon as
a generalization of the orthogonality, for which R is the unity matrix. The best way to
visualize the working of conjugate directions is by comparing the space we are working
in with a “stretched” space.
The SD method is slow due to the successive gradient orthogonality that results from minimizing the recursive updating equation (8) with respect to μ(m): the movement toward a minimum has a zigzag form. The left part of Fig. 9 shows the contours of the quadratic function in the real space (for r = 0 in (2) they are elliptical). Any pair of vectors that appear perpendicular in this space would be orthogonal. The right part shows the same drawing in a space that is stretched along the eigenvector axes, so that the elliptical contours from the left part become circular. Any pair of vectors that appear to be perpendicular in this space is in fact R-orthogonal. The search for a minimum of the quadratic function starts at $w_0$ and takes a step in the direction $v_0$, stopping at the point $w_1$. This is a minimum point along that direction, determined in the same way as for the SD method. While the SD method would search in the direction $g_1$, the CG method chooses $v_1$. In this stretched space, the direction $v_0$ appears to be tangent to the now circular contours at the point $w_1$. Since the next search direction $v_1$ is constrained to be R-orthogonal to the previous one, they appear perpendicular in this modified space. Hence, $v_1$ takes us directly to the minimum point of the quadratic function (of 2nd order in the example).
Guaranteed Characterization of Capture Basins of
Nonlinear State-Space Systems
Abstract. This paper proposes a new approach to solve the problem of computing the capture basin C of a target T. The capture basin corresponds to the set of initial states from which the target can be reached in finite time without leaving a given constrained set. We present an algorithm, based on interval analysis, able to characterize an inner and an outer approximation C− ⊂ C ⊂ C+ of the capture basin. The resulting algorithm is illustrated on the Zermelo problem.
1 Introduction
The purpose of this paper is to present an algorithm based on guaranteed numerical
computation which, given the dynamics of the system, provides an inner and outer
approximation of the capture basin. We recall some definitions and notations related to
capture basin. In the sequel, we consider nonlinear continuous-time dynamical systems
of the form
$\dot{x}(t) = f(x(t), u(t)), \qquad x(0) = x_0$ , (1)
where $x \in \mathbb{R}^n$ is the state of the system with initial condition $x_0$ at t = 0 and $u \in \mathbb{R}^m$ is
the control vector. We shall assume that the function f is sufficiently regular to guarantee
that for all piecewise continuous function u(.) the solution of (1) is unique. The state
vector x(t) is not allowed to exit a given compact set K ⊂ Rn and the input u(t) should
belong to a given compact set U ⊂ Rm .
We define the flow (see [1]) $\phi_t(x_0, u)$ as the solution of (1) for the initial vector $x_0$ and for the input function $u(\cdot)$. The path from $t_1$ to $t_2$ is defined by
$\phi_{[t_1, t_2]}(x_0, u) \overset{\mathrm{def}}{=} \{\, x \in \mathbb{R}^n \mid \exists t \in [t_1, t_2],\ x = \phi_t(x_0, u) \,\}$ . (2)
Define a target set $T \subset K \subset \mathbb{R}^n$ as a closed set we would like to reach at some $t \geq 0$. The capture basin C of T is the set
$C \overset{\mathrm{def}}{=} \{\, x_0 \in K \mid \exists t \geq 0,\ \exists u(\cdot) \in \mathcal{F}([0, t] \to U),\ \phi_t(x_0, u) \in T \ \text{and}\ \phi_{[0,t]}(x_0, u) \subset K \,\}$ , (3)
where F ([0, t] → U) represents the set of piecewise continuous functions from [0, t] →
U. Then, C is the set of initial states x ∈ K for which there exists an admissible control
u, and a finite time t ≥ 0 such that the trajectory φ[0,t] (x0 , u) with the dynamic f under
the control u lives in K and reaches T at time t.
The aim of the paper is to provide an algorithm able to compute an inner and an outer
approximation of C, i.e., to find two subsets C− and C+ such that
C− ⊂ C ⊂ C+ .
2 Interval Analysis
Interval theory was born in the 1960s with the aim of performing rigorous computations using finite precision computers (see [5]). Since its birth it has been further developed, and it offers today original algorithms for solving problems independently of the finite precision of computer arithmetic, although reliable computation using finite precision remains one important advantage of interval-based algorithms [6].
An interval [x] is a closed and connected subset of R. A box [x] of Rn is a Cartesian
product of n intervals. The set of all boxes of Rn is denoted by IRn . Note that Rn =
] − ∞, ∞[× · · · ×] − ∞, ∞[ is an element of IRn . Basic operations on real numbers or
vectors can be extended to intervals in a natural way.
Example 1. If $[t] = [t_1, t_2]$ is an interval and $[x] = [x_1^-, x_1^+] \times [x_2^-, x_2^+]$ is a box, then
$[t_1, t_2] * \begin{bmatrix} [x_1^-, x_1^+] \\ [x_2^-, x_2^+] \end{bmatrix} = \begin{bmatrix} [t_1, t_2] * [x_1^-, x_1^+] \\ [t_1, t_2] * [x_2^-, x_2^+] \end{bmatrix} = \begin{bmatrix} [\min(t_1 x_1^-, t_1 x_1^+, t_2 x_1^-, t_2 x_1^+),\ \max(t_1 x_1^-, t_1 x_1^+, t_2 x_1^-, t_2 x_1^+)] \\ [\min(t_1 x_2^-, t_1 x_2^+, t_2 x_2^-, t_2 x_2^+),\ \max(t_1 x_2^-, t_1 x_2^+, t_2 x_2^-, t_2 x_2^+)] \end{bmatrix}$ .
This methodology can easily be applied for any box [x1 ] × [x2 ] and the resulting algo-
rithm corresponds to an inclusion function for f .
The interval union $[x] \sqcup [y]$ of two boxes $[x]$ and $[y]$ is the smallest box which contains the union $[x] \cup [y]$. The width $w([x])$ of a box $[x]$ is the length of its largest side.
The ε-inflation (4) of a box $[x] = [x_1^-, x_1^+] \times \cdots \times [x_n^-, x_n^+]$ is the box obtained by enlarging each of its component intervals by ε.
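A few lines of Python are enough to reproduce the operations used above (multiplication of an interval by a box as in Example 1, interval hull, width, and ε-inflation). This is an illustrative toy without the outward rounding that a guaranteed library such as Profil/BIAS provides.

class Interval:
    """Toy closed interval [lo, hi]; no directed rounding, illustration only."""
    def __init__(self, lo, hi):
        self.lo, self.hi = min(lo, hi), max(lo, hi)

    def __mul__(self, other):
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))

    def hull(self, other):                  # interval union (smallest enclosing interval)
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def width(self):
        return self.hi - self.lo

    def inflate(self, eps):                 # epsilon-inflation of one component
        return Interval(self.lo - eps, self.hi + eps)

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

# Example 1: multiply the interval [t] by each component of the box [x].
t = Interval(1, 2)
box = [Interval(-1, 3), Interval(0, 2)]
print([t * xi for xi in box])               # componentwise interval product
print(box[0].hull(box[1]), box[0].width(), box[0].inflate(0.1))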
Interval analysis for ordinary differential equations was introduced by Moore [5] (see [7] for a description of and a bibliography on this topic). These methods provide numerically reliable enclosures of the exact solution of differential equations. These techniques are based on the Picard theorem.
Theorem 1. Let $t_1$ be a positive real number. Assume that $x(0)$ is known to belong to the box $[x](0)$ and that $u(t) \in [u]$ for all $t \in [0, t_1]$. Let $[w]$ be a box (that is expected to enclose the path $x(\tau)$, $\tau \in [0, t_1]$). If
$[x](0) + [0, t_1] \cdot [f]([w], [u]) \subset [w]$ ,
where $[f]([x], [u])$ is an inclusion function of $f(x, u)$, then, for all $t \in [0, t_1]$, the state $x(t)$ belongs to $[w]$ and is such that
$x(t) \in [x](0) + [0, t] \cdot [f]([w], [u])$ .
Using Theorem 1, one can build an algorithm computing an enclosure [x]([t]) for the
path x([t]) = {x(t), t ∈ [t]} from an enclosure [x] for x(0). The principle of this algo-
rithm is illustrated by Figure 2.
Comments : The interval [t] = [t1 , t2 ] is such that t1 ≥ 0. Step 2 computes an es-
timation [x̂](t2 ) for the domain of all x(t1 ) consistent with the fact that x(0) ∈ [x].
Note that, at this level, it is not certain that [x̂](t2 ) contains x(t2 ). Step 3 computes the
smallest box [v] containing [x](t1 ) and [x̂](t2 ). At Step 4, [v] is inflated (see (4)) to
provide a good candidate for [w]. α and β are small positive numbers. Step 5 checks
the condition of Theorem 1. If the condition is not satisfied, no bounds can be computed
for x(t2 ) and Rn is returned. Otherwise, Step 8 computes a box containing x(t2 ) using
theorem 1.
The algorithm to we gave to compute the interval flow is very conservative. The pes-
simism can drastically be reduced by using the Lohner method [8].
3 Algorithm
This section presents an algorithm to compute an inner and an outer approximation of
the capture basin. It is based on Theorem 2.
Theorem 2. If C− and C+ are such that C− ⊂ C ⊂ C+ ⊂ K, if [x] is a box and if
u ∈ F( [0, t] → U), then
(i) [x] ⊂ T ⇒ [x] ⊂ C
(ii) [x] ∩ K = ∅ ⇒ [x] ∩ C = ∅
(iii) φ (t, [x], u) ⊂ C− ∧ φ ([0, t] , [x], u) ⊂ K ⇒ [x] ⊂ C
(iv) φ (t, [x] , U) ∩ C+ = ∅ ∧ φ (t, [x] , U) ∩ T = ∅ ⇒ [x] ∩ C = ∅
Proof. (i) and (ii) are due to the inclusion T ⊂ C ⊂ K. Since T ⊂ C− ⊂ C, (iii) is a
consequence of the definition of the capture basin (see (3)). The proof of (iv) is easily
obtained by considering (3) and in view of fact that C ⊂ C+ ⊂ K.
Finally, a simple but efficient bisection algorithm is then easily constructed. It is sum-
marized in Algorithm 2. The algorithm computes both an inner and outer approximation
of the capture basin C. In what follows, we shall assume that the set U of feasible input
vectors is a box [u]. The box [x] to be given as an input argument for E NCLOSE should
contain set K.
Comments. Steps 4 and 7 uses Theorem 2, (i)-(iii) to inflate C− . Steps 5 and 8 uses
Theorem 2, (ii)-(iv) to deflate C+ .
where
C− ⊂ C ⊂ C+ .
270 N. Delanoue et al.
Algorithm 2: ENCLOSE.
Data: K, T, [x],[u]
Result: C− , C+
begin
1 C− ← ∅; C+ ← [x]; L ← {[x]} ;
2 while L = ∅ do
3 pop the largest box [x] from L;
4 if [x] ⊂ T then
C− ← C− ∪ [x];
5 else if [x] ∩ K = ∅ then
C+ ← C+ \[x];
6 take t ≥ 0 and u ∈ [u]
7 if [φ] (t, [x], u) ⊂ C− and [φ] ([0, t], [x], u) ⊂ K then
C− ← C− ∪ [x];
8 else if [φ] (t, [x], [u]) ∩ C+ = ∅ and [φ] (t, [x], [u]) ∩ T = ∅ then
C+ ← C+ \[x];
9 else if w([x]) ≥ ε then
bisect [x] and store the two resulting boxes into L;
end
4 Experimentations
This section presents an application of Algorithm 2. The algorithm has been imple-
mented in C ++ using Profil/BIAS interval library and executed on a PentiumM 1.4Ghz
processor. As an illustration of the algorithm we consider the Zermelo problem [9, 10].
In control theory, Zermelo has described the problem of a boat which wants to reach
an island from the bank of a river with strong currents. The magnitude and direction
of the currents are known as a function of position. Let f (x1 , x2 ) be the water current
of the river at position (x1 , x2 ). The method for computing the expression of the speed
vector field of two dimensional flows can be found in [11]. In our example the dynamic
is nonlinear,
x22 − x21 −2x1 x2
f (x1 , x2 ) 1 + 2 , .
(x1 + x22 )2 (x21 + x22 )2
The speed vector field associated to the dynamic of the currents is represented on
Figure 3.
Let T B (0, r) with r = 1 be the island and we set K = [−8, 8] × [−4, 4], where
K represents the river. The boat has his own dynamic. He can sail in any direction at a
speed v. Figure 4 presents the two-dimensional boat. Then, the global dynamic is given
by ⎧
⎪ x (t) = 1 + x2 − x1 + v cos(θ)
⎪ 2 2
⎨ 1 2 2
(x1 + x2 )2 ,
⎪
⎪ x2 (t) = −2x 1 x2
⎩ + v sin(θ)
(x21 + x22 )2
where the controls 0 ≤ v ≤ 0.8 and θ ∈ [−π, π].
Guaranteed Characterization of Capture Basins of Nonlinear State-Space Systems 271
Figure 5 shows the result of the ENCLOSE algorithm, where the circle delimits the
border of the target T. Then, C− corresponds to the union of all dark grey boxes and
C+ corresponds to the union of both grey and light grey boxes. Thus, we have the
following inclusion relation :
C− ⊂ C ⊂ C+ .
272 N. Delanoue et al.
5 Conclusions
In this paper, a new approach to deal with capture basin problems is presented. This ap-
proach uses interval analysis to compute an inner an outer approximation of the capture
basin for a given target. To fill out this work, different perspectives appear. It could be in-
teresting to tackle problems in significantly larger dimensions. The limitation is mainly
due to the bisections involved in the interval algorithms that makes the complexity ex-
ponential with respect to the number of variables. Constraint propagation techniques
[12] make it possible to push back this frontier and to deal with high dimensional prob-
lems (with more than 1000 variables for instance). In the future, we plan to combine our
algorithm with graph theory and guaranteed numerical integration [7, 13] to compute a
guaranteed control u.
References
1. Hirsch, M. W., Smale, S.: Differential Equations, Dynamical Systems, and Linear Algebra.
ap, San Diego (1974)
2. Aubin, J.: Viability theory. Birkhuser, Boston (1991)
3. Saint-Pierre, P.: Approximation of the viability kernel. Applied Mathematics & Optimization
29 (1994) 187-209
4. Cruck, E., Moitie, R., Seube, N.: Estimation of basins of attraction for uncertain systems
with affine and lipschitz dynamics. Dynamics and Control 11(3) (2001) 211-227
5. Moore, R.E.: Interval Analysis. Prentice-Hall, Englewood Cliffs, NJ (1966)
6. Kearfott, R. B., Kreinovich, V., eds.: Applications of Interval Computations. Kluwer, Dor-
drecht, the Netherlands (1996)
7. Nedialkov, N. S., Jackson, K. R., Corliss, G. F.: Validated solutions of initial value problems
for ordinary differential equations. Applied Mathematics and Computation 105 (1999) 21-68
8. Lohner, R.: Enclosing the solutions of ordinary initial and boundary value problems. In
Kaucher, E., Kulisch, U., Ullrich, C., eds.:Computer Arithmetic: Scientific Computation and
Programming Languages. BG Teubner, Stuttgart, Germany (1987) 255-86
9. Bryson, A.E., Ho, Y.C.: Applied optimal control: optimization, estimation, and control. Hal-
sted Press (1975)
10. Cardaliaguet, P., Quincampoix, M., Saint-Pierre, P.: Optimal times for constrained nonlinear
control problems without local controllability. Applied Mathematics and Optimization 36
(1997) 21-42
11. Batchelor, G.K.: An introduction to fluid dynamics. Cambridge university press(2000)
12. L. Jaulin, M. Kieffer, O. Didrit, E. Walter: Applied Interval Analysis, with Examples in Pa-
rameter and State Estimation, Robust Control and Robotics. Springer-Verlag, London (2001)
13. Delanoue, N.: Algorithmes numriques pour l’analyse topologique. PhD dissertation,
Universit d’Angers, ISTIA, France (decembre 2006) Available at www.istia.univ-
angers.fr/∼delanoue/.
In Situ Two-Thermocouple Sensor Characterisation
using Cross-Relation Blind Deconvolution
with Signal Conditioning for Improved Robustness
Peter Hung1, Seán McLoone1, George Irwin2, Robert Kee2 and Colin Brown2
1
Department of Electronic Engineering, National University of Ireland Maynooth
Maynooth, Co. Kildare, Ireland
{phung, sean.mcloone}@eeng.nuim.ie
2
Virtual Engineering Centre, Queen’s University Belfast
Belfast, BT9 5HN, Northern Ireland
{g.irwin, r.kee, cbrown17}@qub.ac.uk
1 Introduction
In order to achieve high quality, low cost production with low environmental impact,
modern industry is turning more and more to extensive sensing of processes and
machinery, both for diagnostic purposes and as inputs to advanced control systems.
Of particular interest in many applications is the accurate measurement of temperature
transients in gas or liquid flows. For example, in an internal combustion engine the
dynamics of the exhaust gas temperature is a key indicator of its performance as well
as a valuable analytical input for on-board diagnosis of catalyst malfunction, while in
the pharmaceutical industry precise control of transient temperatures is sometimes
necessary in lypholisers used in drug manufacture to ensure the quality and
consistency of the final product. These and many other applications, thus require the
availability of fast response temperature sensors.
274 P. Hung et al.
In 1936 Pfriem [12] suggested using two thermocouples with different time constants
to obtain in situ sensor characterisation. Since then, various thermocouple
compensation techniques incorporating this idea have been proposed in an attempt to
achieve accurate and robust temperature compensation [2] [5] [6] [7] [13]. However,
the performance of all these algorithms deteriorates rapidly with increasing noise
power, and many are susceptible to singularities and sensitive to offsets [14].
Some of these two-thermocouple methods rely on the restrictive assumption that
the ratio of the thermocouple time constants α (α <1 by definition) is known a priori.
Hung et al. [2] [13] developed difference equation methods that do not require any a
priori assumption about the time constant ratio. The equivalent discrete time
representation for the thermocouple model (1) is:
276 P. Hung et al.
T (k ) = aT (k − 1) + bTf (k − 1) , (4)
where a and b are difference equation ARX parameters and k is the sample instant.
Assuming ZOHs and sampling interval τs, the parameters of the discrete and
continuous time thermocouple models are related by
a = exp(− τ s τ ) , b = 1 − a . (5)
For two thermocouples we have
T1 (k ) = a1T1 (k − 1) + b1Tf (k − 1) and (6)
where the pseudo-sensor output ΔT2k and inputs ΔT1k and ΔT12k −1 are defined as
ΔT1k = T1 (k ) − T1 (k − 1)
ΔT2k = T2 (k ) − T2 (k − 1) (10)
ΔT12k −1 = T1 (k − 1) − T2 (k − 1).
For an M-sample data set (9) can be expressed in ARX vector form
Y = Xθ , (11)
with Y = ΔT2k , X = [ΔT1k ΔT12k −1 ], and θ = [ β b2 ]T . Here ΔT1k , ΔT2k and ΔT12k −1 are
vectors containing M-1 samples of the corresponding composite signals ΔT1k , ΔT2k
and ΔT12k −1 .
This characterisation model, referred to as the β-formulation, can be identified
using least squares techniques. Due to the form of the composite input and output
signals, the noise terms in the X and Y data blocks are not independent with the result
that conventional least squares and total least squares both generate biased parameter
estimates even when the measurement noise on the thermocouples is independent.
However, by formulating identification as a generalised total least squares (GTLS)
problem, unbiased parameter estimates can be obtained. The resulting β-GTLS
algorithm is more robust than other difference equation formulations [15] and
Two-Thermocouple Sensor Characterisation using Cross-Relation Blind Deconvolution 277
One of the best known deterministic blind deconvolution approaches is the method of
cross-relation (CR) proposed by Liu et al. [17]. Such techniques exploit the
information provided by output measurements from multiple systems of known
structure but unknown parameters, for the same input signal.
This new approach to characterisation of thermocouples is completely different
from those in Section 2. As commutation is a fundamental assumption for the method
of cross-relation, the thermocouple models are both assumed to be linear. This is
reasonably realistic as long as the thermocouples concerned are used within well-
defined temperature ranges. Nonetheless, linearisation can easily be carried out using
either the data capture hardware or software, even if the thermocouple response is
nonlinear. Further, the approach requires constant model parameters, therefore the
fluid or gas flow velocity v is assumed to be constant, such that the two thermocouple
time constants τ1 and τ2 are time-invariant.
Equation (13) is then minimised with respect to τˆ1 and τˆ2 to yield the estimates of
the unknown thermocouple time constants. Clearly, the cross-relation cost function
J MSE (τˆ1 , τˆ2 ) is zero when τˆ1 = τ 1 and τˆ2 = τ 2 . In practice it will not be possible to
obtain an exact match between T12 and T21 due to measurement noise and other factors
such as thermocouple modelling inaccuracy and violations of the assumption that the
two thermocouples are experiencing identical environmental conditions.
Xu et al. [18] suggest that one of the necessary conditions for multiple finite-
impulse-response channels to be identifiable is that their transfer function
polynomials do not share common roots. Applying this condition to the two-
thermocouple characterisation problem corresponds to requiring that the time
constants, and hence the diameters (3), of the thermocouples are different, that is
τ1 ≠ τ 2 ⇒ d1 ≠ d 2 . (14)
Not surprisingly, this requirement is consistent with all other two-thermocouple
characterisation techniques mentioned in Section 2. Thus, cross-relation
deconvolution converts the problem of sensor characterisation into an optimisation
one.
A 3-D surface plot and a contour map of a typical J MSE (τˆ1 , τˆ2 ) cost function are
shown in Fig. 2. Unfortunately, J MSE (τˆ1 , τˆ2 ) is not quadratic and cannot therefore be
minimised using linear least squares. More importantly, the cost function has a second
minimum when both time constant values approach infinity. Under these conditions,
both low-pass filters (12) take infinite amounts of time to respond. In other words,
they are effectively open-circuited and their differences will always be zero. The
existence of this minimum applies regardless of the noise conditions or any violations
of the modelling assumptions. The minimum at infinity is thus in fact the global
minimum, while the true time constant value is located at a local minimum. In the
absence of noise, it is noted that J MSE = 0 at both the global and local minima.
Two-Thermocouple Sensor Characterisation using Cross-Relation Blind Deconvolution 279
0.05
0.1
0.045
0.08 global
minimum
0.04
0.06
MSE
local minimum
τ (sec)
0.035
and optimum
J
0.04
1
0.03
0.02
local 0.025
0 minimum
0.01
0.02 0.02
0.03
τ1 (sec) 0.04
0.16 0.14 0.12 0.1 0.08
0.015
0.05 0.2 0.18
0.08 0.1 0.12 0.14 0.16 0.18 0.2
τ2 (sec) τ (sec)
2
(a) (b)
Fig. 2. A typical JMSE cost function for noiseless thermocouple measurements: (a) 3-D plot of
cost function; and (b) corresponding contour map.
0.2
0.1
0.18
0.08 0.16
0.14
0.06
JNMSE
0.12
JNMSE
0.04 0.1
J
0.08
0.02
local 0.06
0 minimum 0.04 JMSE
0.01
0.02
0.03 0.02
τ1 (sec) 0.04
0.14 0.12 0.1 0.08
0
0.05 0.18 0.16
0.2 0 0.2 0.4 0.6 0.8 1
τ (sec) τ2 (sec)
2
(a) (b)
Fig. 3. A typical JNMSE cost function for noiseless thermocouple measurements: (a) 3-D plot;
and (b) a comparison of 1-D cross sections of the MSE and NMSE CR cost functions.
The narrow basin of attraction of the desired local minimum coupled with the
global minimum at infinity has serious implications for optimisation complexity since
search bounds have to be carefully selected to avoid divergence of gradient search
algorithms to the global minimum. Further, with increasing noise level the local
minima becomes shallower and shallower, and eventually disappears causing the
optimisation problem to become ill-posed.
As noted in [3] the ill-posed problem can be resolved by employing a normalised
mean squared error (NMSE) cost function defined as
E[(T12 (t ) − T21 (t )) 2 ]
J NMSE (τˆ1 , τˆ 2 ) = . (15)
0.5 [var(T12 ) + var(T21 )]
A typical example of this cost function is plotted in Fig. 3(a). To highlight the effect
of normalisation, the 1-D cross sections of both the MSE and NMSE cost functions
along the line τˆ1 = α τˆ2 is also plotted in Fig. 3(b). Essentially, normalisation
penalises large time constants, thereby eliminating the minimum at infinity giving a
well conditioned convex cost function.
280 P. Hung et al.
A weakness of the MSE and NMSE cross-relation algorithms is that they generate
biased estimates. In fact, a statistical analysis of the algorithms [8] reveals that the
MSE implementation yields postiviely biased estimates, while the NMSE
implementation results in negatively biased estimates at high noise levels, though the
latter is less significant when temperature variation is broadband.
4 Signal Conditioning
One approach to reducing the noise induced estimation bias is to introduce signal
conditioning filters (Fc(s)) prior to the CR characterisation algorithm as illustrated in
Fig. 4. Provided the filters are identical, linear (thereby ensuring commutativity) and
do not completely block the measured signals, the operation of the CR algorithm is
unaffected. Within these constraints there is substantial freedom in the design of the
filters.
Assuming white measurement noise, which has a constant power spectrum profile
across all frequencies, the obvious choice is to match the passband of the conditioning
filters to the bandwidth of the temperature fluctuations. However, this is not the
optimum choice, since it does not take into account the effect of the thermocouples.
2
Consider the magnitude squared transfer function G ( jω ) from the input signal, Tf,
to the cross-relation (CR) error signal, e, when Hˆ 1 = Hˆ 2 = 1 , defined as
2
2T ( jω ) − T2 ( jω ) 2
G ( jω ) = 1 = H 1 ( jw) − H 2 ( jw) . (16)
Tf ( jω )
decay occurs because there is less and less difference between thermocouple signals
while moving into the passband of the lowest bandwidth thermocouple (i.e. < 1/τ2).
Fig. 5. Normalised Cross-relation error transfer function as a function of frequency for a two-
thermocouple probe with time constants 0.02 and 0.1 seconds respectively.
5 Performance Evaluation
Temperature ( C)
o
55
T2
50
50
45
40
35 45
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
(a) (b)
Fig. 6. Simulated temperature profiles: (a) sinusoidal; and (b) random band-limited to 20 Hz.
The test rig, depicted in Fig. 7, was specifically designed to produce periodic
temperature fluctuations at constant fluid velocity [10]. It is supplied with air through
a pressure regulator and a needle valve in order to obtain approximately constant mass
flow rate. The flow is divided into two streams, one heated and the other at the
supplied temperature. The streams are balanced using ball valves to ensure a uniform
velocity profile across the air outlet. Both streams are then passed to isolated
reservoirs before leaving their corresponding orifices. Finally, the warm and cool
streams are combined in the mixing chamber before reaching the temperature probe.
The frequency of periodic temperature fluctuations is controlled by the frequency of
crank rotation that is connected to the rig via a linkage. The temperature probe
consists of two thermocouples of unequal diameters (50 and 127 µm) and a constant-
current thermal anemometer (3.8 µm) used to provide a reference temperature
measurement. The gas velocity was measured using a pitot-static tube and a fast
response pressure transducer which was fixed directly above the temperature
measurement probe.
Using this test rig data was collected for periodic temperature fluctuations with a
fundamental frequency of 38 rad/s at a sampling frequency of 1 kHz (Fig. 8(a)). Table
1 shows the time constant estimates obtained with each of the three characterisation
methods. For comparison purposes, the best estimate of the time constants using the
anemometer signal as an approximation to the true temperature is also included in the
table. This essentially represents a lower bound on the true time constant values.
Two-Thermocouple Sensor Characterisation using Cross-Relation Blind Deconvolution 283
Tf
T12
3
90
2
80 T1
T2
Temperature ( oC)
1
CR Signal ( C)
o
70
60
−1
50
−2
T21
40
−3
30 −4
0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5
(a) (b)
Fig. 8. Test rig data: (a) temperature profiles, and; (b) comparison of CR signals.
As can be seen, the results are consistent across all methods, suggesting that the
data has low noise contamination and is consistent with the two-thermocouple probe
modelling assumptions. This is further confirmed by the close fit obtained between
the CR signals (T12 and T21 from Fig. 1) as shown in Fig. 8(b). Consequently, for the
purposes of the Monte Carlo simulations, the 500 point test rig data set is taken to be
essentially noise free with τ 1 ≈ 38 and τ 2 ≈ 188 ms.
284 P. Hung et al.
For each of the three data sets 100-run Monte Carlo simulations were performed
for zero-mean white Gaussian measurement noise added to the noise free
thermocouple outputs. The amount of noise added was quantified in terms of the
noise level Le, defined as
var(ni )
Le = ⋅ 100% , i = 1, 2 , (18)
var(Tf )
where n1 and n2 are the noises added to the thermocouple measurements. For a given
Le, the performance of each characterisation algorithm was assessed in terms of the
percentage error in estimating the time constants, that is:
τˆi − τ i
eτ = ⋅100%, i = 1, 2 . (19)
i
τi
The means and standard deviations of this estimation error are recorded for a range
of noise levels in Table 2 for each of the characterisation algorithms under
consideration (SCCR, CR and β-GTLS). In SCCR the bandwidth of the conditioning
filters was chosen as fL=60, fU=90 rad/s for the sinusoidal data, fL=5, fU=120 rad/s for
the random data and fL=25, fU=125 rad/s for the test rig data. Note that results for τˆ2
have been omitted as they show a similar pattern to those observed for τˆ1 .
The results clearly show that the inclusion of signal conditioning filters has the
desired effect. SCCR consistently has much lower bias than CR, particularly at higher
noise levels. The picture for variance is less clear. SCCR estimates have slightly
greater variance on average than the CR estimates for the simulated data, but
substantially less variance in the case of the test rig data. This is currently the subject
of further study.
While β-GTLS is theoretically unbiased, the variance in the estimates grows very
rapidly with noise, and the algorithm essentially breaks down for Le > 5 in the
simulated examples and Le > 1 for the test rig data. The substantially worse β-GTLS
results in the last problem are due to the higher sample rate and fewer data points in
this problem, both of which amplify the sensitivity of GTLS to noise. In contrast, CR
and SCCR perform very well on this problem, though the estimation variance is
significantly higher than in the simulated examples due to the smaller number of data
points (500 compared to 5000). In practice, pre-filtering of the data can substantially
improve the robustness of GTLS to noise at the expense of introducing some bias [2],
but this has not been investigated here due to space constraints.
The cross-relation (CR) method of blind deconvolution provides an attractive
framework for two-thermocouple sensor characterisation. It does not require a priori
knowledge of the thermocouple time constant ratio α, as required in many other
characterisation algorithms, though this information can be exploited if available. CR
is more noise-tolerant in the sense of reduced parameter estimation variance when
compared to the alternatives such as β-GTLS. The standard CR implementation yields
Two-Thermocouple Sensor Characterisation using Cross-Relation Blind Deconvolution 285
biased estimates, but this is significantly reduced with the inclusion of signal
conditioning filters. The resulting SCCR algorithm has been shown to be superior to
other methods on both simulated and experimental data.
Table 2. Means and (standard deviations) of τˆ1 estimation errors (%) obtained with β-GTLS,
CR and SCCR for each data set for a range of noise levels.
Noise Level
(Le) 1 3 5 7 10 15 20
Sinusoidal simulation
-0.17 -0.77 -0.57 3.14 5.47 35.85 -6.43
β-GTLS (0.69) (5.29) (13.90) (26.93) (60.83) (318.50) (969.42)
-0.13 -1.64 -3.95 -6.94 -12.10 -19.78 -26.3
CR
(0.32) (1.01) (1.56) (2.39) (3.22) (4.29) (4.43)
-0.07 -0.36 -0.57 -1.53 -2.57 -5.68 -9.53
SCCR
(0.36) (1.02) (1.58) (2.41) (3.25) (4.70) (6.43)
Random simulation
-0.01 0.33 1.27 5.23 17.12 49.57 98.31
β-GTLS (0.96) (7.62) (20.27) (42.64) (88.72) (465.72) (549.96)
0.01 -0.34 0.73 1.55 2.87 5.27 10.18
CR
(0.21) (0.71) (1.24) (1.37) (2.39) (3.36) (4.39)
-0.04 0.22 -0.08 0.53 1.44 2.51 5.09
SCCR
(0.33) (0.97) (1.54) (2.28) (3.07) (4.80) (6.60)
Test rig
0.05 20.84 -127.97 -180.31 -281.23 -229.94 -168.53
β-GTLS (9.76) (89.11) (1237.54) (1028.05) (697.22) (604.96) (1000.33)
-1.72 -1.08 -1.21 -1.83 -4.43 -3.35 -2.51
CR
(1.29) (4.23) (6.12) (7.50) (11.50) (19.73) (31.30)
-0.94 -0.98 -0.28 -0.59 -1.67 -1.87 1.57
SCCR
(1.03) (2.89) (4.51) (5.90) (9.68) (14.59) (19.74)
References
1. Kee, R. J., Blair, G. P.: Acceleration test method for a high performance two-stroke racing
engine. In: SAE Motorsports Conference, Detroit, MI, Paper No. 942478 (1994)
2. Hung, P. C., McLoone, S., Irwin G., Kee, R.: A difference equation approach to two-
thermocouple sensor characterisation in constant velocity flow environments. Rev. Sci.
Instrum. 76, Paper No. 024902 (2005)
3. Hung, P. C., Kee, R. J., Irwin G. W., McLoone, S. F.: Blind Deconvolution for Two-
Thermocouple Sensor Characterisation. ASME Dyn. Sys. Measure. Cont. 129, 194–202
(2007)
4. Hung, P. C., McLoone, S. F., Irwin, G. W., Kee, R. J.: Blind Two-Thermocouple Sensor
Characterisation. In: International Conference on Informatics Control, Automation and
Robotics (ICINCO 2007), Angers, France, pp.10–16 (2007)
5. Forney, L. J., Fralick G. C.: Two wire thermocouple: Frequency response in constant flow.
Rev. Sci. Instrum. 65, 3252–3257 (1994)
6. Tagawa, M., Ohta, Y.: Two-Thermocouple Probe for Fluctuating Temperature
Measurement in Combustion – Rational Estimation of Mean and Fluctuating Time
Constants. Combust. and Flame. 109, 549–560 (1997)
286 P. Hung et al.
7. Kee, R. J, O'Reilly, P. G., Fleck, R., McEntee, P. T.: Measurement of Exhaust Gas
Temperature in a High Performance Two-Stroke Engine. SAE Trans. J. Engines. 107, Paper
No. 983072 (1999)
8. McLoone, S. F., Hung, P. C., Irwin, G. W., Kee, R. J.: On the stability and biasedness of
the cross-relation blind thermocouple characterisation method. In: IFAC World Congress
2008, Seoul, South Korea, accepted (2008)
9. McLoone, S., Hung, P., Irwin, G., Kee, R.: Difference equation sensor characterisation
algorithms for two-thermocouple probes. Trans. InstMC, accepted (2008)
10. Brown, C., Kee, R. J., Irwin, G. W., McLoone, S. F., Hung, P.: Identification Applied to
Dual Sensor Transient Temperature Measurement. In: UKACC Control 2008, Manchester,
UK, submitted (2008)
11. Petit, C., Gajan, P., Lecordier, J. C., Paranthoen, P.: Frequency response of fine wire
thermocouple. J. Phy. Part E. 15, 760–764 (1982)
12. Pfriem, H.: Zue messung verandelisher temperaturen von ogasen und flussigkeiten. Forsch.
Geb. Ingenieurwes. 7, 85–92 (1936)
13. Hung, P., McLoone, S., Irwin G., Kee, R.: A Total Least Squares Approach to Sensor
Characterisations. In: 13th IFAC Symposium on Sys. Id., Rotterdam, The Netherlands, pp.
337–342 (2003)
14. Kee, J. K., Hung, P., Fleck, B., Irwin, G., Kenny, R., Gaynor, J., McLoone, S.: Fast
response exhaust gas temperature measurement in IC Engines. In: SAE 2006 World
Congress, Detroit, MI, Paper No. 2006-01-1319 (2006)
15. McLoone, S., Hung, P., Irwin, G., Kee, R.: Exploiting A Priori Time Constant Ratio
Information in Difference Equation Two-Thermocouple Sensor Characterisation. IEEE
Sensors J. 6, 1627–1637 (2006)
16. Van Huffel S., Vandewalle, J.: The Total Least Squares Problem: Computational Aspects
and Analysis, SIAM, Philadelphia, 1st edition (1991)
17. Liu, H., Xu, G., Tong, L.: A deterministic approach to blind identification of multichannel
FIR systems. In: 27th Asilomar Conference on Signals, Systems and Computers, Asilomar,
CA, pp. 581–584 (1993)
18. Xu, G., Liu, H., Tong, L., Kailath, T.: A least-squares approach to blind channel
identification. IEEE Trans. Signal Proc. 43, 2982–2993 (1995)
Dirac Mixture Approximation
for Nonlinear Stochastic Filtering
Abstract. This work presents a filter for estimating the state of nonlinear dy-
namic systems. It is based on optimal recursive approximation the state densities
by means of Dirac mixture functions in order to allow for a closed form solu-
tion of the prediction and filter step. The approximation approach is based on a
systematic minimization of a distance measure and is hence optimal and deter-
ministic. In contrast to non-deterministic methods we are able to determine the
optimal number of components in the Dirac mixture. A further benefit of the pro-
posed approach is the consideration of measurements during the approximation
process in order to avoid parameter degradation.
1 Introduction
In this article, we present a novel stochastic state estimator for nonlinear dynamic sys-
tems suffering from system as well as measurement noise. The estimate is described by
means of probability density functions. The problem that arises with the application of
stochastic filters to nonlinear systems is that the complexity of the density representa-
tion increases and the exact densities cannot be calculated directly in general. Common
solutions to this problem in order to build practical estimators can be devided into two
classes. The approaches of the first class approximate or modify the system and mea-
surement functions and apply a standard filter to this modified system. The idea of the
second class is to approximate the resulting density functions themselves in order to
calculate the filter steps in closed form.
A common representative of the first class is the extended Kalman filter (EKF).
It is based on linearization of the system and measurement functions and applying a
standard Kalman filter to this modified system. This approach is applicable to systems
with negligible nonlinearities and additive noise, but fails in more general cases.
Another approach is to approximate the system together with its noise as a proba-
bilistic model by means of a conditional density function. The application of adequate
representations of the model like Gaussian mixtures with axis-aligned components [1],
allows for efficient implementation of the filter steps.
Filters approximating the density functions instead of the system function can be
divided into two main approaches found in the literature: i) sample-based density rep-
resentations and ii) analytic density representations.
288 O.C. Schrempf and U.D. Hanebeck
Sample-based filters like the popular particle filter [2] apply Monte Carlo methods
for obtaining a sample representation. Since these sample are usually produced by a ran-
dom number generator, the resulting estimate is not deterministic. Furthermore, Markov
Chain Monte Carlo Methods (MCMC) are iterative algorithms that are unsuited for re-
cursive estimation, hence, importance sampling like in [3] is often applied. The problem
of sample degradation is usually tackled by bootstrap methods [4].
Other methods describe the probability density functions by means of their mo-
ments. A popular approach is the so called Unscented Kalman filter (UKF) [5] that uses
the first moment and the second central moment for representing the densities. This
allows for an efficient calculation of the update but fails in representing highly com-
plex densities arising from nonlinear systems. Furthermore, the assumption of jointly
Gaussian distributed states and measurements is made, which is not valid in general.
An approach that represents the state densities by means of Gaussian mixture den-
sity function is the so called Gaussian sum filter [6]. The Gaussian mixture represen-
tation allows for approximating arbitrary density functions, but finding the appropriate
parameters is a tough problem. A more recent approach is the Progressive Bayes filter
[7] which uses a distance measure for approximating the true densities. The key idea
in this approach is to transform the approximation problem into an optimization prob-
lem. This is a major motivation for the approximation applied in the approach presented
here.
The filter method we propose here follows the idea of approximating the density
functions instead of the system itself, but the approximation is performed in a system-
atic manner. The general idea is to approximate the continuous density function by
means of a Dirac mixture function that minimizes a certain distance measure to the true
density. The approximation process itself is described in [8] and will therefore only be
discussed briefly in this work. We will focus here on the complete filter consisting of
approximation, prediction [9] and filter step.
Since we make use of a distance measure, we are able to quantify the quality of our
approximation. Furthermore, it is possible to find an optimal number of components
required for sufficient estimates. Following this idea we will extend our optimization
method to a full estimation cycle by considering the measurement as well.
This work is based on a publication entitled A State Estimator for Nonlinear Stochas-
tic Systems Based on Dirac Mixture Approximations [10] and is organized as follows:
We will give a problem formulation in Section 2 followed by an overview of the com-
plete filter in Section 3. The building blocks of the filter are described in Section 4
whereas Section 5 presents further optimization methods. Experimental results com-
paring the proposed filter to state-of-the-art filters are given in Section 6 followed by
conclusions in Section 7.
2 Problem Formulation
xk+1 = ak (xk , uk , wk ) .
Dirac Mixture Approximation for Nonlinear Stochastic Filtering 289
The measurements of the system are given according to the nonlinear function
yk = hk (xk , vk ) .
xk+1 = gk (xk ) + wk
yk = hk (xk ) + vk .
3 Filter Outline
In this section, we will give a brief overview of the recursive filtering scheme depicted
as a block diagram in Figure 1. The left part of the figure shows the nonlinear system
suffering from additive noise as described in Sec. 2. The right part shows the estimator.
The input of the estimator is a measurement ŷk coming from the system. The output of
the estimator is a probability density function f e (xk ) from which a point estimate x̂k
can be derived. The estimator itself works recursively as can be seen from the loop in
the diagram. Each recursion consists of a prediction step, an approximation step, and a
filter step.
The prediction step receives a density f e (xk ) from the previous filter step. This
density is an approximation represented by means of a Dirac mixture allowing for an
290 O.C. Schrempf and U.D. Hanebeck
wk vk
Fig. 1. A block diagram of the recursive estimator. The left grey box shows the system given by
the system and measurement equations. The estimator, shown in the grey box at the right, consists
of a filter step, a prediction step and an approximation step. From the output of the estimator a
point estimate can be derived.
analytically exact solution of the Bayesian prediction integral with respect to this ap-
proximation. The prediction yields a continuous mixture density representation (e.g., a
Gaussian mixture) f˜p (xk+1 ). Details are given in Sec. 4.2.
The continuous mixture density f˜p (xk+1 ) resulting from the prediction step serves
as input to the approximation step. The density is systematically approximated by
p
means of a Dirac mixture f (xk+1 ) minimizing a distance measure G f (xk+1 ) , ˜p
4 Filter Components
4.1 Density Approximation
We will now introduce Dirac mixture functions and explain how they can be interpreted
as parametric density functions. Subsequently, we will briefly describe the systematic
approximation scheme.
Dirac Mixture Density Representation. Dirac mixtures are a sum of weighted Dirac
delta functions according to
L
X
f (x, η) = wi δ(x − xi ) , (1)
i=1
where
T
η = [x1 , x2 , . . . , xL , w1 , w2 , . . . , wL ]
is a parameter vector consisting of locations xi , i = 1, . . . , L and weighting coeffi-
cients wi , i = 1, . . . , L. The Dirac delta function is an impulse representation with the
properties
0, x 6= 0
δ(x) =
not defined, x = 0
and Z
δ(x) dx = 1 .
IR
Dirac Mixture Approximation for Nonlinear Stochastic Filtering 291
A mixture of Dirac delta functions as given in (1) can be used for representing arbitrary
density functions if the following requirements are Rconsidered. Since the properties of
a density function f (x) demand that f (x) ≥ 0 and IR f (x) dx = 1, we have
wi ≥ 0, i = 1, . . . , L
and
L
X
wi = 1 .
i=1
where only L parameters and L degrees of freedom are used. This results in a simpler,
less memory consuming representation with less approximation capabilities.
Dirac mixtures are a generic density representation useful for approximating com-
plicated densities arising in estimators for nonlinear dynamic systems.
where r(x) is a nonnegative weighting function. r(x) will be later in the filter step
selected in such a way that only those portions of the predicted probability density
function are approximated with high accuracy, where a certain support of the likelihood
function is given. This avoids to put much approximation effort into irrelevant regions
of the state space.
The goal is now to find a parameter vector η that minimizes (3) according to η =
arg minη G(η). Unfortunately, it is not possible to solve this optimization problem di-
rectly. Hence, we apply a progressive method introduced in [8]. For this approach, we
introduce a so called progression parameter γ into F̃ (x) that goes from 0 . . . 1. The
purpose of this parameter is to find a very simple and exact approximation of F̃ (x, γ)
for γ = 0. Further we must guarantee that F̃ (x, γ = 1) = F̃ (x). By varying γ from 0
to 1 we track the parameter vector η that minimizes the distance measure.
In order to find the minimum of the distance measure, we have to find the root of
the partial derivative with respect to η according to
∂G(η,γ)
∂G(η, γ) ∂x !
= ∂G(η,γ) = 0 . (4)
∂η
∂w
Together with (2) and (3) this results in the system of equations
L
X
F̃ (xi , γ) = wj H(xi − xj ) ,
j=1
Z ∞ L
X Z ∞
r(x)F̃ (x, γ) dx = wj r(x)H(x − xj ) dx ,
xi j=1 xi
where i = 1, . . . , L.
In order to track the minimum of the distance measure we have to take the derivative
of (4) with respect to γ.
Dirac Mixture Approximation for Nonlinear Stochastic Filtering 293
This results in a system of ordinary first order differential equations that can be
written in a vector–matrix–form as
b = P η̇ , (5)
where
∂ F̃ (x1 ,γ)
∂γ
..
.
∂ F̃ (x ,γ)
L
R∞ ∂γ
∂ F̃ (x,γ) dx
∂γ
b=
xR∞
0
∂ F̃ (x,γ)
dx
x1 ∂γ
..
∞ .
R ∂ F̃ (x,γ)
∂γ dx
xL
and
∂η T
η̇ = = ẋ1 , . . . , ẋL , ẇ0 , ẇ1 , . . . , ẇL .
∂γ
η̇ denotes the derivative of η with respect to γ.
The P matrix is given by
P11 P12
P=
P21 P22
with
˜
−f (x1 , γ) 0 ··· 0
0 −f˜(x2 , γ) · · · 0
P11 = ,
.. .. .. ..
. . . .
0 0 · · · −f˜(xL , γ)
1
2 0 0 ··· 0
1
2 0 ···
1 0
P12 = . .. ,
.. .. . .
.. . . . .
1
1 1 1 ··· 2
and
c − x1 c − x2 c − x3 · · · c − xL
c − x2 c − x2 c − x3 · · · c − xL
P22 = c − x3 c − x3 c − x3 · · · c − xL
.
.. .. .. .. ..
. . . . .
c − xL c − xL c − xL · · · c − xL
where the transition density f (xk+1 |xk ) of the considered nonlinear system with addi-
tive noise is given by
f (xk+1 |xk ) = f w (xk+1 − g(xk )) .
f w ( · ) is the density of the system noise (e.g. Gaussian).
In general, the integral involved in (6) cannot be solved analytically for arbitrary
prior densities f e (xk ). For a given input point x̄k , however, represented by the Dirac
delta function f e (xk ) = δ(xk − x̄k ), (6) can be solved in closed form according to
fp (xk+1 ) = f w (xk+1 − g(x̄k )) .
In the case of zero mean Gaussian system noise with
f w (w) = N (w, 0, σ w ) ,
this yields
fp (xk+1 ) = N (xk+1 , g(x̄k ), σ w ) ,
which is a Gaussian Density with a standard deviation σ w .
For a given Dirac mixture prior f e (xk ) according to (1) given by
L
(i) (i)
X
f e (xk ) = wk δ(xk − xk ) , (7)
i=1
with
(i) (i) (i)
w̄k = c · wk · f v (ŷk − h(xk )) ,
(i) (i)
where wk is the i’th weight and xk is the i’th position of the approximated prediction
f p (xk ). The normalization constant can be calculated as
L
!−1
(i) (i)
X
c= wk · f v (ŷk − h(xk )) .
i=1
Naive approximation of the predicted density in a fixed interval may lead to many
small weights, since not all regions of the state space supported by the prediction are
as well supported by the likelihood. This phenomenon can be described as parameter
degradation. To circumvent this problem, we make use of the weighting function r(x)
in (3). Details on this approach are presented in the next section.
296 O.C. Schrempf and U.D. Hanebeck
In this section, we describe how to tackle the problem of parameter degradation that
is inherent to all filter approaches considering only discrete points of the density. We
further describe a method for finding an optimal number of components for the approx-
imation taking into account the prediction and filter steps as well.
To fight the problem of parameter degradation described in the previous section we
make use of the fact, that although the likelihood function is not a proper density it
decreases to zero for x → ±∞ in many cases. Therefore, we can define an area of
support in the state space where the likelihood is larger than a certain value. This area
of support is an interval and can be represented by the weighting function r(x) in (3).
It guarantees, that all components of the approximation are located in this interval and
are therefor not reweighed to zero in the filter step. In other words, the complete mass
of the approximation function accounts for the main area of interest.
In [9], we introduced an algorithm for finding the optimal number of components
required for the approximation with respect to the subsequent prediction step. We will
now extend this algorithm in order to account for the preceding filter step as well.
efficient procedure for approximating arbitrary mixture densities with Dirac mixtures
comprising a large number of components is given in [9].
In each search step of the algorithm, the distance measure of the approximated
density at hand to the density defined by κt is calculated. In this way the smallest
number of components for a prespecified error can be found.
6 Experimental Results
In order to compare the performance of the proposed filter to state-of-the-art filters, we
have simulated a nonlinear dynamic system according to the left part of Figure 1. We
apply the filter to a cubic system and measurement function motivated by the cubic
sensor problem introduced in [13].
The simulated system function is given by
and the additive noise is Gaussian with σ w = 0.2 standard deviation. The measurement
function is
h(xk ) = xk − 0.5x3k + v
with additive Gaussian noise and σ v = 0.5.
The generated measurements are used as input to our filter as well as to an unscented
Kalman filter and a particle filter. The particle filter is applied in a variant with 20
particles and a variant with 1000 particles in order to compare the performance.
In a first run of the experiment we show T = 3 steps in Figure 2. The upper row
shows the prediction steps, the lower row shows the corresponding filter steps. The
continuous prediction f p (xk+1 ) of the Dirac mixture (DM) filter is depicted by the dark
gray line. The black line underneath shows the true prediction computed numerically
as a reference. The black plot in the lower line shows the likelihood function given
by the current measurement and the filled light gray triangles depict the Dirac mixture
components after the filter step.
Both rows also show the point estimates of the various applied filters in the current
step. The filled black marker indicates the true (simulated) state, whereas dark gray
stands for the Dirac mixture point estimate. Light gray with triangle is the UKF es-
timate and the other light gray markers are the particle filter estimates. The particle
filter indicated by the circle uses 20 particles, the one indicated by the square uses 1000
particles.
We simulated the system a further 10 times for T = 7 steps in order to calculate
the root means square error erms of the 4 filters. The results are shown in Figure 3. The
plot shows that the point estimates of the Dirac mixture filter are much closer to the true
state than the point estimates of the other filters.
7 Conclusions
In this paper, we presented a complete Dirac mixture filter that is based on the approx-
imation of the posterior density. The filter heavily utilizes the properties of the Dirac
298 O.C. Schrempf and U.D. Hanebeck
f p (x)
f p (x)
0.4 0.4 0.4
0 0 0
−2 0 2 −2 0 2 −2 0 2
x x x
k=1 L=17 k=2 L=12 k=3 L=29
1 1 1
f e (x)
f e (x)
f e (x)
0.5 0.5 0.5
0 0 0
−2 0 2 −2 0 2 −2 0 2
x x x
Fig. 2. The recursive filter for T = 3 steps. k indicates the step number and L the number of
components for the Dirac mixture. The upper row shows the prediction steps, the lower row
shows the filter steps. Upper row: The dark grey curve is the continuous density predicted by
the DM filter, the thick black line underneath is the true density. The filles black marker depicts
the true (simulated) system state, the other markers depict the predicted point estimates of the
following filters: dark gray=DM, light gray triangle=UKF, light gray circle=PF20, light gray
square=PF1000. Lower row: The black line shows the likelihood. The encoding of the point
estimates are similar to the upper line. Light gray filled triangles depict the Diracs.
0.9
DM
0.8 UKF
PF20
0.7 PF1000
0.6
0.5
erms
0.4
0.3
0.2
0.1
0
1 2 3 4 5 6 7
k
Fig. 3. Root mean square error for 10 runs and T = 7 steps.
mixture approximation for recursively calculating a closed form estimate. The key idea
is that the approximation can be seen as an optimal representation of the true continuous
density function. After each prediction step a full continuous density representation is
used again in order to allow for an optimal reapproximation.
Dirac Mixture Approximation for Nonlinear Stochastic Filtering 299
The new approach is natural, mathematically rigorous, and based on efficient algo-
rithms [14, 8] for the optimal approximation of arbitrary densities by means of Dirac
mixtures with respect to a given distance measure.
Compared to particle filters, the proposed method has several advantages. First,
the Dirac components are systematically placed in order to minimize a given distance
measure. The distance measure accounts for the actual measurement and guarantees
that the prediction of the approximate densities is close to the true density of the next
time step. As a result, very few components are sufficient for achieving an excellent
estimation quality. Second, the optimization does not only include the parameters of
the Dirac mixture approximation, i.e., weights and locations, but also the number of
components. As a result, the number of components is automatically adjusted according
to the complexity of the underlying true distribution and the support area of a given
likelihood. Third, as the approximation is fully deterministic, it guarantees repeatable
results.
Compared to the Unscented Kalman Filter, the Dirac mixture filter has the advan-
tage, that it is not restricted to first and second order moments. Hence, multi-modal
densities, which cannot be described sufficiently by using only the first two moments,
can be treated very efficiently. Such densties occur quite often in strongly nonlinear sys-
tems. Furthermore, no assumptions on the joint distribution of state and measurement
have to be made.
References
1. Huber, M., Brunn, D., Hanebeck, U.D.: Closed-Form Prediction of Nonlinear Dynamic
Systems by Means of Gaussian Mixture Approximation of the Transition Density. In: In-
ternational Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI
2006), Heidelberg, Deutschland. (2006) 98–103
2. Doucet, A., Freitas, N.D., Gordon, N.: Sequential Monte Carlo Methods in Practice.
Springer-Verlag, New York (2001)
3. Geweke, J.: Bayesian Inference in Econometric Models using Monte Carlo Integration.
Econometrica 24 (1989) 1317–1399
4. Gordon, N.: Bayesian Methods for Tracking. PhD thesis, University of London (1993)
5. Julier, S., Uhlmann, J.: A New Extension of the Kalman Filter to Nonlinear Systems. In: Pro-
ceedings of SPIE AeroSense, 11th International Symposium on Aerospace/Defense Sensing,
Simulation, and Controls, Orlando, FL. (1997)
6. Alspach, D.L., Sorenson, H.W.: Nonlinear Bayesian Estimation Using Gaussian Sum Ap-
proximation. IEEE Transactions on Automatic Control AC–17 (1972) 439–448
7. Hanebeck, U.D., Briechle, K., Rauh, A.: Progressive Bayes: A New Framework for Non-
linear State Estimation. In: Proceedings of SPIE. Volume 5099., Orlando, Florida (2003)
256–267 AeroSense Symposium.
8. Schrempf, O.C., Brunn, D., Hanebeck, U.D.: Dirac Mixture Density Approximation Based
on Minimization of the Weighted Cramér–von Mises Distance. In: Proceedings of the In-
ternational Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI
2006), Heidelberg, Germany. (2006) 512–517
9. Schrempf, O.C., Hanebeck, U.D.: Recursive Prediction of Stochastic Nonlinear Systems
Based on Dirac Mixture Approximations. In: Proceedings of the American Control Confer-
ence (ACC ’07), New York City, USA. (2007)
300 O.C. Schrempf and U.D. Hanebeck
10. Schrempf, O.C., Hanebeck, U.D.: A State Estimator for Nonlinear Stochastic Systems Based
on Dirac Mixture Approximations. In: 4th Intl. Conference on Informatics in Control, Au-
tomation and Robotics (ICINCO 2007). Volume SPSMC., Angers, France (2007) 54–61
11. Kullback, S., Leibler, R.A.: On Information and Sufficiency. Annals of Mathematical Statis-
tics 22 (1951) 79–86
12. Boos, D.D.: Minimum Distance Estimators for Location and Goodness of Fit. Journal of the
American Statistical association 76 (1981) 663–670
13. Bucy, R.S.: Bayes Theorem and Digital Realizations for Non-Linear Filters. Journal of
Astronautical Sciences 17 (1969) 80–94
14. Schrempf, O.C., Brunn, D., Hanebeck, U.D.: Density Approximation Based on Dirac Mix-
tures with Regard to Nonlinear Estimation and Filtering. In: Proceedings of the 45th IEEE
Conference on Decision and Control (CDC’06), San Diego, California, USA. (2006)
On the Geometry of Predictive Control with Nonlinear
Constraints
1 Introduction
The philosophy behind Model-based Predictive Control (MPC) is to exploit in a ”reced-
ing horizon” manner the simplicity of the open-loop optimal control. The control action
ut for a given state xt is obtained from the control sequence k∗u = [uTt , . . . , uTt+N −1 ]T
as a result of the optimization problem:
N
−1
min ϕ(xt+N ) + l(xt+k , ut+k )
ku k=0 (1)
subj. to : xt+1 = f (xt ) + g(xt )ut ;
h(xt , ku ) ≤ 0
constructed for a finite prediction horizon N , cost per stage l(.), terminal weight ϕ(.),
the system dynamics described by f (.), g(.) and the constraints written in a compact
form using elementwise inequalities on functions linking the states and the control ac-
tions, h(.).
The control sequence k∗u is optimal for an initial condition - xt and produces an
open-loop trajectory which contrasts with the need for a feedback control law. This
drawback is overcome by solving the local optimization (1) for every encountered (mea-
sured) state, thus indirectly producing a state feedback law.
For the optimization problem (1) within MPC, the current state serves as an initial
condition and influences both the objective function and the topology of the feasible
302 S. Olaru et al.
domain. Globally, the system state can be interpreted as a vector of parameters, and
the problems to be solved are part of the multiparametric optimization programming
family. From the cost function point of view, the parametrization is somehow easier to
deal with and eventually can be entirely translated toward the set of constraints to be
satisfied (the MPC literature contains references to schemes based on suboptimality or
even to algorithms restraining the demands to feasible solution of the receding horizon
optimization [1]). Unfortunately, similar observation cannot be made about the feasible
domain and its adjustment with respect to the parameters evolution. The optimal solu-
tion is thus often influenced by the constraints activation, the process being forced to
operate at the designed constraints for best performance. The distortion of the feasible
domain during the parameters evolution will consequently affect the structure of the op-
timal solution. Starting from this observation the present paper focuses on the analysis
of the geometry of the domains described by the MPC constraints.
The structure of the feasible domain is depending on the model and the set of con-
straints taken into consideration in (1). If the model is linear, the linear constraints on
inputs and states can be easily expressed by a system of linear inequalities. In the case
of nonlinear systems, these properties are lost but there are several approaches to trans-
form the dynamics to those of a linear system over the operating range as for example
by piecewise linear approximation, feedback linearisation or the use of time-varying
linear models.
In the present paper, the feasible domains will be analyzed with a focus on the
parametrization mainly upon the concept of parameterized polyhedra [2], which appears
in the MPC formulations like:
min F (xt , ku )
ku ⎧
⎨ Ain ku ≤ bin + Bin xt (2)
subj. to : Aeq ku = beq + Beq xt
⎩
h(xt , ku ) ≤ 0
where the objective function F (xt , ku ) is usually linear or quadratic. Secondly it will
be shown that the optimization problem may take advantage during the real-time im-
plementation from the construction of the explicit solution.
In the presence of nonlinearities h(xt , ku ) ≤ 0 two cases can be treated:
In the following, Section 2 introduces the basic concepts related to the parameterized
polyhedra. Section 3 presents the use of the feasible domain analysis for the construc-
tion of the explicit solution for linear and quadratic objective functions. In Section 4 an
extension to nonlinear type of constraints is addressed.
On the Geometry of Predictive Control with Nonlinear Constraints 303
with αi , βi , γi the coefficients describing the convex, non-negative and linear combina-
tions in (3). Numerical methods like the Chernikova algorithm [5] are implemented for
constructing the double description, either starting from constraints (3) either from the
generators (4) representation.
This dual representation of the parameterized polyhedral domain reveals the fact that
only the vertices are concerned by the parametrization (resulting the so-called parame-
terized vertices - vi (x)), whereas the rays and the lines do not change with the param-
eters’ variation. In order to effectively use the generators representation in (5), several
aspects have to be clarified regarding the parametrization of the vertices (see for exem-
ple [6]). The idea is to identify the parameterized polyhedron by a non-parameterized
one in an augmented space:
ku k k
P̃ = ∈ Rp+n | [ Aeq | − Beq ] u = beq ; [ Ain | − Bin ] u ≤ bin (6)
x x x
304 S. Olaru et al.
The original polyhedron in (5) can be found forany particular value of the parameters
vector x through P (x) = Projku P̃ ∩ H(x) , for any given hyperplane H(x0 ) =
ku
∈ Rp+n | x = x0 and using Projku (.) as the projection from Rp+n to the
x
first p coordinates Rp .
Within the polyhedral domains P̃, the correspondent of the parameterized vertices
in
(5) can be found among the faces of dimension n. After enumerating these n-faces:
F1 (P̃), . . . Fj (P̃), . . . , Fς (P̃) , one can write:
n n n
T
∀i, ∃j ∈ {1, . . . , ς} s.t. vi (x)T xT ∈ Fjn (P̃) or equivalently:
vi (x) = Projku Fjn (P̃) ∩ H(x) (7)
From this relation it can be seen that not all the n-faces correspond to parameterized
vertices. However it is still easy to identify those which can be ignored in the process
of construction of parameterized vertices based on the relation Projx Fjn (P̃ ) < n
with Projx (.) the projection from Rp+n to the last n coordinates Rn (corresponding to
the parameters’ space). Indeed the projections are to be computed for all the n-faces,
those which are degenerated are to be discarded and all the others are stored as validity
domains - Dvi ∈ Rn , for the parameterized vertices that they are identifying:
Dvi = Projn Fjn (P̃ ) (8)
Once the parameterized vertices are identified and their validity domains stored, the dependence
on the parameters vector can be found using the supporting hyperplanes of each
n-face:

v_i(x) = [A_eq; Ā_in,j]^{-1} ( [B_eq; B̄_in,j] x + [b_eq; b̄_in,j] )    (9)

where Ā_in,j, B̄_in,j, b̄_in,j represent the subset of inequalities satisfied by saturation
on F_j^n(P̃). The inversion is well defined as long as the faces with degenerate projections
are discarded.
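A minimal numerical sketch of relation (9): once the saturated rows of a non-degenerate n-face are stored (the matrices M, N, c below are placeholder saturated-constraint data), the corresponding parameterized vertex is an affine function of x obtained by solving one linear system.

# Sketch of (9): v_i(x) = [A_eq; A_bar_in_j]^{-1} ([B_eq; B_bar_in_j] x + [b_eq; b_bar_in_j]).
# M, N, c are placeholder data; M is assumed invertible.
import numpy as np

M = np.array([[1.0, 0.0],
              [1.0, 1.0]])
N = np.array([[0.5, 0.0],
              [0.0, 0.3]])
c = np.array([1.0, 2.0])

def vertex(x):
    """Parameterized vertex v_i(x), valid only on its stored domain D_{v_i}."""
    return np.linalg.solve(M, N @ x + c)

print(vertex(np.array([0.0, 0.0])))    # vertex at the origin of the parameter space
print(vertex(np.array([1.0, -1.0])))   # the vertex moves affinely with x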
The complement of the union of these validity domains,

ℵ = R^n \ { ∪ D_{v_i} ; i = 1 ... ϑ }    (10)

represents the set of infeasible states, for which no control sequence can be designed
because the limitations are overly constraining. As a consequence, a complete description
of the infeasibility is obtained.
When sufficiently large memory resources are available, the construction of the explicit solution
of the multiparametric optimization problem (2) can be an interesting alternative to
iterative optimization routines. In this direction, recent results were reported at
least for the case of linear and quadratic cost functions (see [8], [9], [10], [11], [12]). In
the following it will be shown that a geometrical approach based on parameterized
polyhedra can bring useful insight as well.
Linear cost functions are extensively used in connection with model-based predictive
control, especially in the robust case ([13], [14]). In a compact form, the multiparametric
optimization problem is:

k_u^*(x_t) = arg min_{k_u} f^T k_u
subject to  A_in k_u ≤ B_in x_t + b_in    (11)
The problem deals with a polyhedral feasible domain which can be described, as
previously, by a double representation. Further, the explicit solution can be constructed
based on the relation between the parameterized vertices and the linear cost function
(as in [5]). The next result summarizes this idea.
Proposition. The solution of a multiparametric linear problem is characterized as
follows:

a) For the subdomain ℵ ⊂ R^n where the associated parameterized polyhedron has
no valid parameterized vertex, the problem is infeasible;

b) If there exists a bidirectional ray l such that f^T l ≠ 0, or a unidirectional ray r
such that f^T r < 0, then the minimum is unbounded;

c) If all bidirectional rays l are such that f^T l = 0 and all unidirectional rays r are
such that f^T r ≥ 0, then there exists a decomposition of the parameter space into zones
R_j, j = 1 ... ρ, with ∪_{j=1...ρ} R_j = R^n \ ℵ, where the parameterized polyhedron has a
regular shape. For each region R_j the minimum is computed with respect to the given
linear cost function and over all the valid parameterized vertices:

m(x) = min { f^T v_i(x) | v_i(x) vertex of P(x) }    (12)

where v_i^* are the vertices attaining the minimum m(x) over R_j and r_i^* are such
that f^T r_i^* = 0.
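A minimal sketch of case c): for a given parameter value, the optimum of (11) can be read off by enumerating the parameterized vertices whose validity domains contain x. All numerical data below (cost vector, affine vertex maps, validity domains) are placeholder assumptions.

# Sketch of (12): m(x) = min { f^T v_i(x) | v_i(x) valid vertex of P(x) }.
import numpy as np

f = np.array([1.0, -2.0])                        # placeholder linear cost

# Each entry: (V_i, w_i, D_A, D_b) with v_i(x) = V_i x + w_i valid on {x | D_A x <= D_b}.
vertices = [
    (np.eye(2), np.array([0.0, 1.0]), np.array([[1.0, 0.0]]), np.array([1.0])),
    (0.5 * np.eye(2), np.array([1.0, 0.0]), np.array([[-1.0, 0.0]]), np.array([0.0])),
]

def m(x):
    values = [f @ (V @ x + w) for V, w, D_A, D_b in vertices if np.all(D_A @ x <= D_b)]
    return min(values) if values else None       # None: x lies in the infeasible set

print(m(np.array([0.5, 0.5])))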
This result provides the entire family of solutions of the linear multiparametric
optimization, even in cases where this family is not finite (for example, when several
vertices attain the minimum). From the control point of view, a continuous piecewise
candidate is preferred, possibly minimizing the number of partitions in the parameter
space [15].
The construction mechanism uses the parameterized vertices in order to split the
regions neighboring the feasible domain into zones characterized by the same type of
projection.
h(x, k_u) ≤ 0
A_in k_u ≤ b_in + B_in x
The idea is to exploit the existence of linear constraints in (15) and to construct exact
solutions as long as the unconstrained optimum can be projected onto them. In a second
stage, if the unconstrained optimum is projected onto the convex part of the nonlinear
constraints, then an approximate solution is obtained by their linearization. Finally, if the
unconstrained optimum has to be projected onto the nonconvex constraints, then a Voronoi
partition is used to construct the explicit solution. Before detailing the algorithms, several
useful tools are introduced.
Gridding of the Parameter Space. The parameters (state) space is sampled in
order to obtain a representative grid G. The distribution of the points in the state
space may be uniform, logarithmic, or tailored according to a priori knowledge of the
nonlinearities.
For each point of the grid x ∈ G, a set of points V_x on the frontier of the feasible domain
D(x) can be obtained by the same kind of parceling. By collecting V_x for all x ∈ G,
a distribution of points V_G in the extended arguments+parameters space is obtained.
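A small sketch of this gridding step, assuming a box-shaped region of the state space (the bounds and the number of points per axis are arbitrary choices):

# Sketch: building a representative grid G over a box of the parameter (state) space.
import numpy as np

def uniform_grid(lo, hi, points_per_axis=11):
    """Cartesian grid with uniformly spaced points along each axis."""
    axes = [np.linspace(l, h, points_per_axis) for l, h in zip(lo, hi)]
    return np.array(np.meshgrid(*axes)).reshape(len(lo), -1).T

def log_grid(lo, hi, points_per_axis=11):
    """Grid refined near the centre of the box (signed logarithmic spacing)."""
    t = np.linspace(-1.0, 1.0, points_per_axis)
    warp = np.sign(t) * np.expm1(np.abs(t) * np.log(2.0))   # in [-1, 1], denser near 0
    axes = [l + (h - l) * (warp + 1.0) / 2.0 for l, h in zip(lo, hi)]
    return np.array(np.meshgrid(*axes)).reshape(len(lo), -1).T

G = uniform_grid(lo=[-5.0, -5.0], hi=[5.0, 5.0])
print(G.shape)    # (121, 2)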
Convex Hulls. A basic operation is the construction of the convex hull (or an adequate
approximation) of the feasible domain in (15). Writing this parameterized feasible
domain as:

D(x) = { k_u | h(x, k_u) ≤ 0 ;  A_in k_u ≤ b_in + B_in x }    (20)

and using the distribution of points on the frontier V_G, one can define, in the extended
(arguments+parameters) space, a convex hull CV_G:
CV_G = { [k_u; x] ∈ R^{mN+n} | ∃ [k_u,i; x_i] ∈ V_G, i = 1 ... mN+n+1,
         s.t. [k_u; x] = Σ_{i=1}^{mN+n+1} λ_i [k_u,i; x_i],  Σ_{i=1}^{mN+n+1} λ_i = 1,  λ_i ≥ 0 }    (21)
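In practice the hull (21) can be computed directly on the collected boundary points V_G, for instance with the Qhull routines wrapped by scipy; the random point cloud below merely stands in for V_G.

# Sketch: convex hull CV_G of the boundary points V_G in the extended (ku, x) space.
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(0)
V_G = rng.uniform(-1.0, 1.0, size=(200, 3))   # placeholder points [ku; x], here mN + n = 3

hull = ConvexHull(V_G)
A = hull.equations[:, :-1]        # facet normals: A @ p + offset <= 0 inside the hull
b = -hull.equations[:, -1]

def in_CV_G(p, tol=1e-9):
    """Membership test for the convex hull CV_G."""
    return bool(np.all(A @ p <= b + tol))

print(in_CV_G(np.zeros(3)))       # True: the origin lies inside this point cloud's hull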
V_i = { x ∈ R^n | ‖x − s_i‖_2 ≤ ‖x − s_j‖_2, ∀ j ≠ i }    (22)
It can be observed that each facet of V_i lies on the bisecting hyperplane between
s_i and one of the neighboring points s_j. As a consequence, the regions V_i
are polyhedra. Globally, the Voronoi partition is a decomposition of the space R^n into ν
polyhedral regions.
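A minimal sketch of how such a partition is used: locating a query point in the Voronoi decomposition reduces to a nearest-neighbour test among the generating points (the points below are placeholders).

# Sketch of (22): locating a point in the Voronoi partition by nearest-neighbour search.
import numpy as np

S = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])        # placeholder generating points s_1, ..., s_nu

def voronoi_cell(x):
    """Index i of the region V_i containing x (ties resolved to the smallest index)."""
    return int(np.argmin(np.linalg.norm(S - x, axis=1)))

print(voronoi_cell(np.array([0.2, 0.1])))   # -> 0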
In the following, F(X) and Int(X) represent the frontier and the interior, respectively,
of a compact set X.
h(k_u) ≤ 0
A_in k_u ≤ b_in + B_in x

In relation with the feasible domain D in (23), we define:
Algorithm:
7. If k_u^* saturates a subset of constraints K ⊂ H:
   (a) Retain the set of points:
       S = { v ∈ V | ∀ k_u ∈ CV s.t. Sat(H, k_u) = K ;  B(R_NL, k_u) = Sat(R_NL, v) }
Algorithm:
on CV:
    U^* ← Proj_CV(U)
8. If ∃ x_0 such that the point
    [k_u^*; x_0] = U^* ∩ { [k_u; x] | x = x_0 },
then:
(a) Construct:
    U_NL(x_0) = { [k_u; x] ∈ U | [k_u; x] ∈ U^* s.t. S(H, [k_u; x_0]) = K(x_0) }
(b) Perform:
    U^* ← U^* \ { [k_u; x] | S(H, [k_u; x_0]) = K(x_0) }
(c) Retain the set of points:
    S = { v ∈ V | ∀ [k_u; x] ∈ CV with S(H, [k_u; x]) = K(x_0) ⇒ B(R_NL, x) = S(R_NL, v) }
(d) Construct the Voronoi partition for the collection of points in S.
(e) Position U_NL(x_0) with respect to this partition and map the suboptimal solution
    U^*_NL(x_0) ← U_NL(x_0) by using the vertex v of each active region:
    [k_u^*; x] = v ← [k_u; x]
else: jump to (10)
9. Return to point (8)
10. If the quality of the solution is not satisfactory, improve the distribution of the points
VG and restart from (2).
5 Numerical Example
Consider the MPC problem implemented using the first control action of the optimal
sequence:
k_u^* = arg min_{k_u}  Σ_{k=0}^{N−1} [ x_{t+k|t}^T Q x_{t+k|t} + u_{t+k|t}^T R u_{t+k|t} ] + x_{t+N|t}^T P x_{t+N|t}    (25)

subject to:

x_{t+k+1|t} = [1 1; 0 1] x_{t+k|t} + [1 0; 2 1] u_{t+k|t},   k ≥ 0

[−2; −2] ≤ u_{t+k|t} ≤ [2; 2],   −2√3 ≤ (u^1_{t+k|t})^2 + u^2_{t+k|t} ≤ 2√3,   0 ≤ k ≤ N_u − 1

u_{t+k|t} = −[0.59 0.76; 0.42 −0.16] x_{t+k|t} = −K_LQR x_{t+k|t},   N_u ≤ k ≤ N_y − 1

with

Q = [10 0; 0 1];  R = [2 0; 0 3];  P = [13.73 2.46; 2.46 2.99];  N_u = 1;  N = 2
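For a single state value, the finite-horizon problem (25) can be checked with a generic NLP solver. The sketch below fixes x_t, uses N_u = 1 (so the LQR law is applied beyond the first step) and encodes the input constraints as reconstructed above; since the exact form of the nonlinear bound is uncertain in the source, the band −2√3 ≤ (u^1)^2 + u^2 ≤ 2√3 used here is an assumption, and the chosen x_t is arbitrary.

# Sketch: solving (25) for one state x_t with Nu = 1, N = 2.
# The band constraint on (u1)^2 + u2 is an assumption on the garbled source.
import numpy as np
from scipy.optimize import minimize

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[1.0, 0.0], [2.0, 1.0]])
Q = np.diag([10.0, 1.0])
R = np.diag([2.0, 3.0])
P = np.array([[13.73, 2.46], [2.46, 2.99]])
K_lqr = np.array([[0.59, 0.76], [0.42, -0.16]])
x_t = np.array([1.0, 1.0])                      # arbitrary illustrative state

def cost(u0):
    x1 = A @ x_t + B @ u0
    u1 = -K_lqr @ x1                            # beyond Nu - 1 the fixed LQR law applies
    x2 = A @ x1 + B @ u1
    return (x_t @ Q @ x_t + u0 @ R @ u0 +
            x1 @ Q @ x1 + u1 @ R @ u1 + x2 @ P @ x2)

cons = [
    {"type": "ineq", "fun": lambda u: 2 * np.sqrt(3) - (u[0] ** 2 + u[1])},   # upper bound
    {"type": "ineq", "fun": lambda u: (u[0] ** 2 + u[1]) + 2 * np.sqrt(3)},   # lower bound
]
res = minimize(cost, np.zeros(2), bounds=[(-2, 2), (-2, 2)],
               method="SLSQP", constraints=cons)
print(res.x, res.fun)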
By following the previous algorithm, in the first stage the partition of the state space
is performed by considering only the linear constraints (figure 1(a)). Each such region
corresponds to a specific projection law. By simply verifying in which regions this
projection law obeys the nonlinear constraints, the exact part of the explicit solution is
obtained (figure 1(b)).

Fig. 1. a) Partition of the arguments space (linear constraints only). b) Retention of the regions
with feasible linear projections.

Fig. 2. Partition of the arguments space (nonlinear case) - a) 10 points per active nonlinear
constraint; b) 100 points per nonlinear constraint.

Fig. 3. Partition of the state space - a) 10 points per nonlinear constraint; b) 100 points per
nonlinear constraint.
Further, a distribution of points on the nonlinear frontier of the feasible domain has
to be obtained, together with the associated Voronoi partition. By superposing it on the
regions not covered at the previous step, a complete partition of the arguments space is
realized. Figures 2(a) and 2(b) depict such partitions for 10 and 100 points per nonlinear
constraint, respectively.
Correspondingly, figures 3(a) and 3(b) describe the partition of the state space for the
explicit solution. Finally, the complete explicit solutions for the two cases are described
in figures 4(a) and 4(b). The discontinuities are observable, as well as the increase in
resolution around the nonlinearity as the number of points in the Voronoi partition grows.
In order to give an idea of the complexity, the explicit solutions have 31 and 211 regions
respectively, and the computational effort was less than 2 s in the first case and 80 s in
the second, mainly spent in the construction of supplementary regions of the Voronoi
partition.

Fig. 4. Explicit control law - a) 10 points per nonlinear constraint; b) 100 points per nonlinear
constraint.
6 Conclusions
References
1. Scokaert, P.O., Mayne, D.Q., Rawlings, J.B.: Suboptimal model predictive control (feasi-
bility implies stability). IEEE Transactions on Automatic Control 44 (1999) 648–654
2. Olaru, S., Dumur, D.: A parameterized polyhedra approach for explicit constrained predic-
tive control. In: 43rd IEEE Conference on Decision and Control (2004) 1580–1585
3. Motzkin, T.S., Raiffa, H., Thompson, G.L., Thrall, R.M.: The Double Description Method
(1953). Republished in: Theodore S. Motzkin: Selected Papers. Birkhäuser (1983)
4. Schrijver, A.: Theory of Linear and Integer Programming. John Wiley and Sons, NY (1986)
5. Le Verge, H.: A note on Chernikova's algorithm. Technical Report 635, IRISA, France
(1994)
6. Loechner, V., Wilde, D.K.: Parameterized polyhedra and their vertices. International Journal
of Parallel Programming V25 (1997) 525–549
7. Olaru, S., Dumur, D.: Compact explicit mpc with guarantee of feasibility for tracking. In:
44th IEEE Conference on Decision and Control, and European Control Conference. (2005)
969–974
8. Seron, M., Goodwin, G., Dona, J.D.: Characterisation of receding horizon control for con-
strained linear systems. Asian Journal of Control 5 (2003) 271–286
9. Bemporad, A., Morari, M., Dua, V., Pistikopoulos, E.: The Explicit Linear Quadratic Regu-
lator for Constrained Systems. Automatica 38 (2002) 3–20
10. Goodwin, G., Seron, M., Dona, J.D.: Constrained Control and Estimation. Springer, Berlin
(2004)
11. Borrelli, F.: Constrained Optimal Control of Linear and Hybrid Systems. Springer-Verlag,
Berlin (2003)
12. Tondel, P., Johansen, T., Bemporad, A.: Evaluation of piecewise affine control via binary
search tree. Automatica 39 (2003) 945–950
13. Bemporad, A., Borrelli, F., Morari, M.: Robust Model Predictive Control: Piecewise Linear
Explicit Solution. In: European Control Conference. (2001) 939–944
14. Kerrigan, E., Maciejowski, J.: Feedback min-max model predictive control using a single
linear program: Robust stability and the explicit solution. International Journal of Robust
and Nonlinear Control 14 (2004) 395–413
15. Olaru, S., Dumur, D.: On the continuity and complexity of control laws based on multipara-
metric linear programs. In: 45th IEEE Conference on Decision and Control. (2006)
16. Grancharova, A., Tondel, P., Johansen, T.A.: International Workshop on Assessment and
Future Directions of Nonlinear Model Predictive Control. (2005)
Author Index