© Zhenyou Zhang
Printed by NTNU-trykk
Acknowledgements
It gives me immense pleasure to present this thesis in its completed form.
First of all, I would like to thank my main supervisor, Prof. Kesheng Wang, for his
guidance and, more importantly, for his moral support throughout this research work.
Without his timely advice and thorough knowledge of Data Mining and Condition-
based Maintenance, the research would not have been accomplished with such great
success. I am extremely thankful for his support.
I thank my co-supervisor Odd Myklebust for his support and help during these
years.
I thank my colleagues PhD Candidate Quan Yu, Associate Prof. Yiliu Liu, Dr.
Lijuan Dai, Dr. Rhythm S. Wadhwa, and everyone else who helped and supported me
during these years.
I thank Prof. Lilan Liu, who came to NTNU as a visiting scholar from Shanghai
University. It was a great pleasure to work together with her during her half-year
visit.
I thank Dr. Guijuan Lin, who came to NTNU as an exchange PhD student from
Tongji University. I was very pleased to work with her during her three-month stay.
I also thank two Master students, Deborah Cruciani and Roberta Cusanno, from the
University of Bologna, for working together on fault diagnosis and maintenance
scheduling optimization.
Last, but most important, I am extremely grateful to my parents and my wife
for being there for me all these years, and for their patience, support and
encouragement.
Abstract
The outcomes of the thesis can be applied to mechanical and electrical systems in
industries such as manufacturing and wind and hydro power plants.
Table of Contents
Acknowledgements
Abstract
Table of Contents
List of Figures
List of Tables
Abbreviations
1 Introduction
1.1 Motivation of Present Work
1.2 Literature Review
1.2.1 Review of Maintenance Strategies
1.2.1.1 Corrective Maintenance (CM)
1.2.1.2 Preventive Maintenance
1.2.1.3 Predictive Maintenance (PM)
1.2.2 Review of Sensor System and Sensor Placement Optimization
1.2.2.1 Sensor Classification
1.2.2.2 Wireless Sensor Networks (WSNs)
1.2.2.3 Radio-frequency Identification (RFID)
1.2.2.4 Sensor Placement Optimization
1.2.3 Review of Fault Diagnosis and Prognosis
1.2.4 Review of Maintenance Scheduling Optimization
1.3 Contributions
1.4 List of Scientific Articles
1.5 Outline of Thesis
2 Framework of Intelligent Fault Diagnosis and Prognosis Systems (IFDPS) for CBM
2.1 Introduction
2.2 Objectives and Benefits
2.3 Structure of IFDPS
2.3.1 Data Acquisition
2.3.1.1 Classification of Sensors
2.3.1.2 Sensor Placement Optimization
2.3.2 Signal Preprocessing and Feature Extraction
2.3.3 Fault Diagnosis and Identification
2.3.4 Fault Prognosis and Remaining Useful Life Evaluation
2.3.5 Maintenance Scheduling Optimization
2.4 Summary
3 Data Mining Techniques for IFDPS
3.1 Introduction
3.2 Artificial Neural Networks (ANN)
3.2.1 Supervised Learning ANNs
3.2.1.1 Forward Phase
3.2.1.2 Backward Phase
3.2.2 Self-Organizing Map (SOM)
3.3 Semi-supervised Learning Methods (Manifold Regularization)
3.4 Association Rules
3.4.1 Market-basket Analysis
3.4.2 Mining Association Rules Steps
3.4.3 The Apriori Algorithm
3.4.4 Generating Association Rules from Frequent Itemset
3.4.5 Improving the Efficiency of the Apriori Algorithm
3.4.5.1 Partition-based Apriori
3.4.5.2 Sampling
3.4.5.3 Hashing
3.4.5.4 Transaction Removal
3.5 Swarm Intelligence
3.5.1 Ant Colony Optimization (ACO)
3.5.2 Particle Swarm Optimization
3.5.2.1 Biological Metaphor
3.5.2.2 Basis Algorithm of PSO
3.5.2.3 The Parameters of PSO
3.5.2.4 Variants of PSO
3.5.3 Bee Colony Algorithm
3.5.3.1 Biological Metaphor
3.5.3.2 Algorithm of BCA
3.6 Summary
4 Sensor Classification and Sensor Placement Optimization
4.1 Introduction
4.2 Classification of Sensors
4.3 Wireless Sensor Networks
4.4 RFID Sensor Networks
4.4.1 RFID System
4.4.2 Embedded RFID Sensor Monitoring
4.5 General Sensor Networks
4.6 Sensor Placement Optimization (SPO)
4.6.1 Problem Description
4.6.2 Application of PSO in Sensor Placement Optimization
4.6.2.1 The Process of PSO Application in Sensor Placement Optimization
4.6.2.2 Case Study and Its Results
4.6.3 Application of BCA in Sensor Placement Optimization
4.6.3.1 The Process of Application of BCA in Sensor Placement Optimization
4.6.3.2 Case Study and Its Results
4.7 Summary
5 Signal Preprocessing and Feature Extraction
5.1 Introduction
5.2 Signal Preprocessing
5.3 Feature Extraction
5.3.1 Feature Extraction in Time Domain
5.3.2 Feature Extraction in Frequency Domain
5.3.3 Feature Extraction in Time-Frequency Domain
5.3.3.1 Continuous Wavelet Transform (CWT)
5.3.3.2 Discrete Wavelet Transform (DWT)
5.3.3.3 Wavelet Packet Decomposition
5.3.3.4 Wavelet-based Features
5.4 Feature Selection
5.5 Summary
6 Fault Diagnosis based on Data Mining Techniques
6.1 Introduction
6.2 Fault Diagnosis based on SBP
6.3 Fault Diagnosis based on SOM
6.4 Fault Diagnosis based on Semi-supervised Learning
6.5 Fault Diagnosis based on Association Rules
6.6 Case Study 1: Fault Diagnosis Integration of WPD, PCA and BP Network
6.6.1 Experimental Setup
6.6.2 Experimental Procedure
6.6.3 Features Extraction in Wavelet Domain
6.6.4 Principal Component Analysis (PCA)
6.6.5 Fault Diagnosis using BP Network
6.6.6 Results and Discussion
6.7 Case Study 2: Fault Diagnosis Integration of WPD, FFT and BP Network
6.7.1 Feature Extraction
6.7.2 Fast Fourier Transform to WPD Signals
6.7.3 Fault Diagnosis Procedure of Integrating WPD, FFT and BP Network
6.7.4 Experiment and Results
6.7.5 Discussion
6.8 Case Study 3: Fault Diagnosis based on Self-organizing Map
6.8.1 Experimental Setup
6.8.2 Fault Types of Centrifugal Pump System
6.8.3 Experiment and Results
6.9 Summary
7 Fault Prognosis based on Artificial Neural Network
List of Figures
Fig. 4.11 Initial Placement of Measuring Points on the Blower
Fig. 4.12 The Finite Element Model of Blower and Its First Four Modes
Fig. 4.13 Fitness Changes with Change of Iteration PSO (n = 5) for Total Displacement Mode
Fig. 4.14 Fitness Changes with Change of Iteration PSO (n = 5) for X Direction Displacement Mode
Fig. 4.15 Fitness Changes with Change of Iteration PSO (n = 5) for Y Direction Displacement Mode
Fig. 4.16 Fitness Changes with Change of Iteration PSO (n = 5) for Z Direction Displacement Mode
Fig. 4.17 Structure of BCA Application in Sensor Placement Optimization
Fig. 4.18 Fitness Changes with Change of Iteration BCA (n = 5) for Total Displacement Mode
Fig. 4.19 Fitness Changes with Change of Iteration BCA (n = 5) for X Direction Displacement Mode
Fig. 4.20 Fitness Changes with Change of Iteration BCA (n = 5) for Y Direction Displacement Mode
Fig. 4.21 Fitness Changes with Change of Iteration BCA (n = 5) for Z Direction Displacement Mode
Fig. 5.1 Vibration Signal in Time Domain
Fig. 5.2 Frequency Response Function of Vibration Signal in Fig. 5.1
Fig. 5.3 3-layer Signal Decomposition by Discrete Wavelet Transform
Fig. 5.4 Decomposed Signals by DWT
Fig. 5.5 Decomposed Signals by WPD
Fig. 5.6 Wavelet Packet Coefficients and Their Relevant Standard Deviation
Fig. 6.1 Procedure of Fault Diagnosis BP Network
Fig. 6.2 Procedure of SOM in Fault Diagnosis
Fig. 6.3 Solution of Two-moon Problem without Unlabelled Dataset
Fig. 6.4 Solution of Two-moon Problem with Unlabelled Dataset
Fig. 6.5 Procedure of Semi-supervised Learning in Fault Diagnosis
Fig. 6.6 The Structure of Association Rule-based Fault Diagnosis
Fig. 6.7 Hardware of Experimental Setup
Fig. 6.8 Sensors Setup on Blower
Fig. 6.9 Parts for Simulation Degradations
List of Tables
Table 2.1 The Methods of Signal Pre-process and Signal Process
Table 3.1 A Model of a Simple Transaction Database
Table 4.1 Measurands of Sensors
Table 4.2 Technological Aspects of Sensors
Table 4.3 Detection Means Used in Sensors
Table 4.4 Sensor Conversion Phenomena
Table 4.5 Fields of Application
Table 4.6 Sensor Materials
Table 4.7 Main Natural Frequencies of Blower
Table 4.8 Total Displacement Mode for Each Point Order
Table 4.9 X Directional Displacement Mode for Each Point Order
Table 4.10 Y Directional Displacement Mode for Each Point Order
Table 4.11 Z Directional Displacement Mode for Each Point Order
Table 4.12 Optimal Sensor Placement for Different Number of Measuring Point using Total Displacement Mode
Table 4.13 Optimal Sensor Placement for Different Number of Measuring Point using X Direction Displacement Mode
Table 4.14 Optimal Sensor Placement for Different Number of Measuring Point using Y Direction Displacement Mode
Table 4.15 Optimal Sensor Placement for Different Number of Measuring Point using Z Direction Displacement Mode
Table 5.1 The Methods of Signal Pre-process and Signal Process
Table 5.2 Comparing Different Time-Frequency Analysis Methods
Table 6.1 Variance for each Component
Table 6.2 Part of Training Data
Table 6.3 Test Data and the Results
Table 6.4 Measurement Points and Their Corresponding Vibration Types
Table 6.5 Parameters Calculated from Vibration Signals
Table 7.1 Input and Outputs of ANN Model
Table 8.1 Weekly Peak Load in Percent of Annual Peak (%)
Table 8.2 Data of Generators
Abbreviations
CI Computational Intelligence
AR Association Rules
DT Decision Trees
SI Swarm Intelligence
CFT Continuous Fourier Transform
DFT Discrete Fourier Transform
WVD Wigner-Ville Distribution
TSA Time Synchronous Averaging
WT Wavelet Transform
WP Wavelet Packet
SBP Supervised Back-Propagation
SSL Semi-supervised Learning
RKHS Reproducing Kernel Hilbert Spaces
TID Transaction Identifier
HBCA Honey Bee Colony Algorithm
MEMS Microelectromechanical System
RTD Resistance Temperature Detector
WSN Wireless Sensor Network
GPS Global Positioning System
ARR Analytical Redundancy Relation
PCA Principal Component Analysis
BTA Boosting Tree Algorithm
CWT Continuous Wavelet Transform
DWT Discrete Wavelet Transform
WPD Wavelet Packet Decomposition
CBR Case-based Reasoning
SDWPC Standard Deviation of Wavelet Packet Coefficients
PDF Probability Density Function
GMS Generating Unit Maintenance Scheduling
IPSO Improved PSO
RSOM Routing and Scheduling Optimization of Maintenance
OWT Offshore Wind Turbines
1 Introduction
Condition, the state of a machine, is related to the Remaining Useful Life (RUL).
In the industrial and manufacturing arenas, fault prognosis can be used to estimate
the remaining useful life of a machine or a component once an impending failure
condition is detected, isolated and identified. Fault prognosis, together with fault
diagnosis, is therefore a basis for predictive maintenance scheduling. Accordingly,
the present work also proposes methods for fault prognosis to support the CBM
policy.
To carry out CBM, condition monitoring is very important, and obtaining parameter
information about the machine is the basis for all subsequent processes, including
diagnostics, prognostics and predictive maintenance decision making. Normally,
sensors are used to collect information about machines. Two issues need to be
considered for sensors. The first is what kind of sensors should be chosen to collect
the information. The second is where the sensors should be placed on the machine
to obtain the information continuously or periodically. The present work focuses on
the second issue, i.e. sensor placement optimization.
Data Mining (DM) techniques can be very useful for maintenance scheduling,
prognostics, diagnostics and sensor placement selection. Many companies, such as
BMW, ABB, Boeing and Statoil, have large amounts of historical data, but these
data are currently not used effectively. DM techniques can be used to extract useful
information from the historical data to support all the processes mentioned above.
Therefore, during the three years of PhD work, an Intelligent Fault Diagnosis and
Prognosis System (IFDPS) for Condition-based Maintenance in manufacturing
systems and processes has been established. The IFDPS framework covers sensor
selection, sensor placement optimization, fault diagnosis and prognosis, and
maintenance scheduling optimization. It is hoped that IFDPS can help companies to
achieve near-zero-breakdown manufacturing and, further, zero-defect manufacturing.
As mentioned above, the present work is mainly concerned with maintenance
policies, methods of diagnostics and prognostics, signal processing and sensor
strategies. In this section, the state of the art for these topics is reviewed briefly.
Maintenance is defined [EN 13306: 2001, 2001] as the combination of all technical,
administrative and managerial actions during the life cycle of an item intended to
retain it in, or restore it to, a state in which it can perform the required function (a
function or a combination of functions of an item which are considered necessary
to provide a given service). It is a set of organized activities that are carried out in
order to keep an item in its best operational condition at minimum cost.
The maintenance actions could be either repair or replacement activities, which are
necessary for an item to reach its acceptable productivity condition, and these
activities should be carried out at the minimum possible cost. In the period before
World War II, people saw maintenance as an added cost to the plant which did not
increase the value of the finished product; maintenance in that era was therefore
restricted to fixing the unit when it broke, because this was the cheapest option.
During and after World War II, as engineering and scientific technology advanced,
other, much cheaper types of maintenance were developed, such as preventive
maintenance, and maintenance came to be regarded as a function of the production
system. Nowadays, increased awareness of issues such as environmental safety and
the quality of products and services makes maintenance one of the most important
functions contributing to the success of industry, and world-class companies are in
continuous need of a very well organized maintenance plan to compete world-wide.
The brief history of maintenance mentioned above can be seen in Fig. 1.1 [Shenoy
& Bhadbury, 1998].
The importance of maintenance is reflected in the action of keeping production
machines and facilities in the best possible condition. Typically, the objectives of
maintenance can be classified into three groups [Boucly, 2001; Marquez, 2007;
Wireman, 1990]:
- Technical objectives. These objectives are the operational imperatives from
the business sector of a company or plant. In general, operational imperatives are
linked to a satisfactory level of equipment availability and people safety. A
generally accepted method to measure the fulfillment of this goal is the Overall
Equipment Effectiveness (OEE), as described in the TPM method [Nakajima, 1988].
- Legal objectives/mandatory regulations. Normally it is a maintenance
objective to fulfill all existing regulations for electrical devices, pressure equipment,
vehicles, protection means, etc.
- Financial objectives. To satisfy the technical objectives at minimum cost.
From a long-term perspective, the global equipment life cycle cost is a suitable
measure for this.
Generally, the objectives can be listed as follows:
1) Maximizing production or increasing facilities availability at the lowest
cost and at the highest quality and safety standards.
2) Reducing breakdowns and emergency shutdowns.
3) Optimizing resources utilization.
4) Reducing downtime.
5) Improving spares stock control.
6) Improving equipment efficiency and reducing scrap rate.
7) Minimizing energy usage.
8) Optimizing the useful life of equipment.
9) Providing reliable cost and budgetary control.
10) Identifying and implementing cost reductions.
A maintenance action may include a set of maintenance activities: inspection,
monitoring, routine maintenance, overhaul, rebuilding and repair. Inspection can be
performed by measuring, observing, testing or gauging the relevant features of an
item before, during or after another maintenance activity. Monitoring is an activity
performed manually or automatically, continuously or periodically, with the
intention of obtaining the actual state of the equipment; it can be used to evaluate
changes in the equipment's parameters while the equipment is in an operating state.
Routine maintenance consists of regular elementary maintenance activities, such as
cleaning, tightening of connections and checking lubrication, which usually do not
need special qualification, authorization or tools. Overhaul is a comprehensive set of
examinations and actions performed at prescribed intervals of time or numbers of
operations in order to maintain the required level of reliability, availability and
safety, and may sometimes require partial or complete dismantling of the item.
Rebuilding is performed when the equipment or components are approaching the
end of their useful life or should be regularly replaced, in order to provide the
equipment with a useful life that may be greater than the lifespan of the original
equipment. Repairing is a physical action to restore the required functions of faulty
equipment [Marquez, 2007]. A maintenance action could include one or more of the
above activities. Maintenance may also require fault diagnosis and prognosis for the
monitored equipment.
Maintenance has made great progress over its long history. At the beginning,
maintenance actions were performed only when the equipment failed. However,
this kind of maintenance policy cannot meet the requirements of industry, and many
other types of maintenance have emerged during the past decades, as seen in Fig.
1.2 [EN 13306: 2001, 2001]. In much of the literature, Condition-based
Maintenance (CBM) is also called predictive maintenance. This section reviews
corrective maintenance and preventive maintenance briefly, and reviews predictive
maintenance in detail.
Deferred corrective maintenance means that the maintenance is not immediately
carried out after fault detection but is delayed according to given maintenance rules.
The CM policy has its advantages. Its planning is very simple, because a
maintenance action is needed only when a failure happens, and the plan only has to
consider the failure rate. The maintenance work is not scheduled until it is really
needed. However, it has major disadvantages [Holmberg et al., 2010]:
- Failure can, and probably will, occur at an inconvenient time, e.g. when the
plant is at full load, or while it is starting.
- A component fault may go unnoticed, leading to expensive consequential
damage, e.g. bearing seizure causes damage to a shaft.
- Dangerous and/or expensive failure consequences should be expected.
- No data are available regarding the past, present and possible future state of
the machine.
- A large breakdown crew may need to be available on standby. All the
required expertise should be either within the plant or easily accessed from
external resources, which is almost always costly, or a longer waiting time
should be expected.
- A large spares inventory is necessary to ensure quick repair.
- Failures exceeding the capacity of the repair team lead to “fire-fighting”.
knowledge of the failure causes and effects and the deterioration patterns of
equipment [Ahmad & Kamaruddin, 2012].
The condition monitoring process can be carried out in two ways: on-line and
off-line. On-line processing is carried out during the running state of the equipment
(operating state), while off-line processing is performed when the equipment is not
running. In addition, condition monitoring can be performed either periodically or
continuously. Typically, periodic monitoring is carried out at certain intervals, such
as every hour or at the end of every working shift, with the aid of portable
indicators, such as hand-held meters, acoustic emission units and vibration pens.
The condition monitoring process also includes evaluations based on human senses
to measure or evaluate equipment conditions, such as the degree of dirtiness and
abnormal color. As for continuous monitoring, as its name suggests, monitoring is
performed continuously and automatically based on special measurement devices,
such as vibration and acoustic sensors.
Two main limitations of continuous monitoring exist: it is expensive, because many
special devices are required, and inaccurate information may be obtained, because
the continuous flow of data creates increased noise. In contrast, the main limitation
of periodic monitoring is the possibility of missing some important information
about equipment failure between monitoring intervals [Jardine et al., 2006]. Most
equipment failures are preceded by certain signs, conditions or indications that such
a failure is going to occur, and many condition monitoring techniques can be used
to monitor equipment conditions [Bloch & Geitner, 2012].
PM has some advantages over other maintenance policies: 1) improving availability
and reliability by reducing downtime; 2) enhancing equipment life by reducing wear
from frequent rebuilding, minimizing the potential for problems in disassembly and
reassembly, and detecting problems as they occur; 3) saving maintenance costs by
reducing repair costs, overtime and parts inventory requirements; 4) decreasing the
number of maintenance operations, which reduces the influence of human error.
However, there are still some challenges with PM: 1) initiating PM is costly,
because the cost of sufficient instruments can be quite large, especially if the goal is
to monitor already installed equipment; 2) the goal of PM is accurate maintenance,
but this is difficult to achieve because of the complexity of the equipment and
environment; 3) introducing PM will invoke a major change in how maintenance is
performed, and potentially in the whole maintenance organization of a company,
and organizational changes are in general difficult.
Many kinds of techniques, such as sensor techniques, signal processing techniques,
fault diagnosis techniques, fault prognosis techniques and maintenance optimization
techniques, can be used to support maintenance decision making. All these
techniques are reviewed below.
Mechanical sensor systems have been studied extensively, and a large number of
such devices are currently in use to monitor system performance for operational
state assessment and tracking of fault indicators. A number of mechanical
quantities - position, speed, acceleration, torque, strain, temperature, etc. - are
commonly measured in dynamic systems. Most devices for measuring these
quantities are available commercially, and their operation has been amply
described in textbooks and publications [Silva, 1989; Stuart & Allocca, 1984].
However, the most useful mechanical sensors for condition monitoring are
accelerometers and strain gauges.
System performance and operational data are monitored routinely in all industrial
establishments, utility operations, transportation systems, etc. for process control,
performance evaluation, quality assurance, fault diagnosis and prognosis, and
material flow, especially suitable for large production networks [Ilie-zudor et al.,
2006]. RFID is the use of a wireless non-contact radio system to transfer data from
a tag attached to an object, for the purposes of identification and tracking
[http://en.wikipedia.org/wiki/Radio-frequency_identification]. In general terms, it
is a means of identifying a person or object using a radio frequency transmission.
The technology can be used to identify, track, sort or detect a wide variety of
objects [Lewis, 2004]. Recently, RFID has become an increasingly interesting
technology in many fields, such as agriculture, manufacturing and supply chain
management.
RFID vibration sensing tag. The other is to connect a vibration sensor to an RFID
tag, with the RFID system used only to transmit the vibration data to the RFID
reader and further to a host computer. This application makes vibration
measurement very flexible and effective [Wang & Zhang, 2012].
During a system failure, only a small fraction of the downtime is spent on
maintaining or repairing the components that caused the fault; up to 80% of the
time is spent locating the source of the fault [Kegg, 1984]. In a complex installation
such as an automotive manufacturing plant, one minute of downtime may cost as
much as $20,000 [Spiewak et al., 2000]. Early fault diagnosis is therefore crucial for
avoiding major malfunctions and massive losses in economy and productivity. In
diagnosing rotating machinery, sound emissions or vibration signals are used to
monitor the performance of the machine and can be used to judge whether the
machine has failed or is degrading. Many useful techniques for signal analysis have
been applied. These techniques can be classified into three types: time domain
[Chen et al., 2008; Wang et al., 2010], frequency domain, such as the Fast Fourier
Transform [Corinthios, 1971; Liu et al., 2010; Rai & Mohanty, 2007], and
time-frequency domain, such as the Short Time Fourier Transform [Portnoff, 1980],
the Hilbert-Huang Transform [Yu et al., 2007], the Wigner-Ville distribution
[Andria et al., 1994; Staszewski et al., 1997; Wang et al., 2008] and the Wavelet
Transform [Dongyan Chen & Trivedi, 2005; Lin & Qu, 2000; Prabhakar et al.,
2002; Seker & Ayaz, 2003; Tse et al., 2004; Wu & Chen, 2006; Wu & Kuo, 2009;
Wu & Liu, 2009; Zheng et al., 2002]. Autoregressive model methods can also be
used to extract features of a machine or component for fault diagnosis and
prognosis [Li et al., 2009]. The wavelet transform is the best of these tools, because
the short time Fourier transform only provides a constant time-frequency resolution,
and the Wigner-Ville distribution produces interference terms in the time-frequency
domain [Wu & Chen, 2006]. It has particular advantages for characterizing signals
at different localization levels in time, and has been applied in signal processing,
image processing, pattern recognition, seismology and machine fault diagnosis.
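To make the three analysis domains concrete, the following minimal Python sketch computes a frequency-domain feature via the FFT and wavelet band energies via a discrete wavelet decomposition. The synthetic signal, the sampling frequency and the wavelet choice are illustrative assumptions (the PyWavelets package is used for the decomposition); it is not the exact processing chain applied later in the thesis.

```python
# Minimal sketch: frequency-domain and wavelet-based features for a vibration signal.
# The signal, sampling frequency and wavelet are illustrative assumptions.
import numpy as np
import pywt  # PyWavelets

fs = 5000.0                        # sampling frequency (Hz), assumed
t = np.arange(0, 1.0, 1.0 / fs)
x = np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 320 * t) + 0.05 * np.random.randn(t.size)

# Frequency domain: amplitude spectrum via the FFT
spectrum = np.abs(np.fft.rfft(x)) / x.size
freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
dominant_freq = freqs[np.argmax(spectrum[1:]) + 1]   # skip the DC bin

# Time-frequency domain: 3-level discrete wavelet decomposition (Daubechies-4)
coeffs = pywt.wavedec(x, 'db4', level=3)              # [cA3, cD3, cD2, cD1]
band_energy = [float(np.sum(c ** 2)) for c in coeffs]

print("dominant frequency [Hz]:", dominant_freq)
print("wavelet band energies  :", band_energy)
```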
After processing the vibration signals and extracting the features, the more
important task is identifying the fault and predicting the remaining useful life.
Many methods can be used in this area. Support vector machine (SVM) learning is
a popular machine learning method due to its high accuracy and good
generalization capabilities [Saravanan et al., 2008]. Li et al. [Li et al., 2005]
proposed a hidden Markov model (HMM)-based fault diagnosis of the speed-up and
speed-down processes of rotary machinery. In the implementation of the system,
one PC was used for data sampling and another PC was used for data storage and
analysis. Wu and Chow [Wu & Chow, 2004] presented a self-organizing map
(SOM) based radial-basis-function (RBF) neural network method for induction
machine fault detection. The system was implemented by utilizing a PC and
additional data acquisition equipment. Many methods based on ANN have been
developed for online surveillance with knowledge discovery, novelty detection and
learning abilities [Kasabov, 2001; Markou & Singh, 2003; Marzi, 2004]. ANN,
Fuzzy Logic System (FLS), Genetic Algorithms (GA) and Hybrid Computational
Intelligence (HCI) systems have been applied in fault diagnosis, and a case of a
centrifugal pump was used to show how the methods work [Wang, 2002]. A
decision tree method was used to identify mean shifts in bivariate processes in real
time [He et al., 2011]. Probability-based Bayesian network methods have been used
to identify vehicle faults and can be used to diagnose single faults and multiple
faults [Huang et al., 2008]. Lee et al. [Lee et al., 2006] developed an intelligent
prognostics and e-maintenance system named “Watchdog Agent”, based on
statistical matching and performance signatures, with a Support Vector Machine
(SVM) based diagnostic tool.
Some studies have integrated these techniques for fault diagnosis and prognosis.
Momoh and Button integrated FFT and ANN to analyze and identify faults of
aerospace DC arcing [Momoh & Button, 2003]. The Fourier transform and wavelet
transform were integrated to detect and identify faults of induction motors using
stator current information [Lee, 2011]. Wavelet analysis techniques and ANN were
integrated for fault diagnosis in induction motors [Lee, 2011], automotive
generators [Wu & Kuo, 2009] and gear boxes [Saravanan & Ramachandran, 2010],
with good results. In the present PhD work, some of these techniques are integrated
to classify and predict faults and, further, to predict the remaining useful life. The
results can be used to support maintenance decision making and to optimize the
maintenance schedule.
1.3 Contributions
Optimal sensor placement can improve the effectiveness and reliability of condition
monitoring and improve the quality of data collection. This thesis proposes a
method for sensor placement optimization at the machine level by combining Finite
Element Analysis (FEA) and Swarm Intelligence, i.e. Particle Swarm Optimization
(PSO) and Bee Colony Algorithm (BCA). The method can find the optimal
positions for a given number of sensors.
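The following minimal Python sketch illustrates the idea: a random-key particle encoding selects n measuring points out of N candidates, and the fitness is simply the sum of the modal displacement magnitudes of the selected points. The modal data are random stand-ins for values that would come from FEA, and the encoding and fitness are simplified assumptions rather than the exact formulation used in Chapter 4.

```python
# Minimal sketch: PSO with a random-key encoding for choosing n measuring points
# out of N candidates; modal displacements are random stand-ins for FEA results.
import numpy as np

rng = np.random.default_rng(0)
N, n = 40, 5                                   # candidate points, sensors to place
modal_disp = rng.random((N, 4))                # |displacement| of first 4 modes (stand-in)

def fitness(keys):
    """Decode a real-valued key vector: the n largest keys select the points."""
    selected = np.argsort(keys)[-n:]
    return modal_disp[selected].sum()

# Standard global-best PSO
particles, iters = 30, 100
w, c1, c2 = 0.7, 1.5, 1.5
pos = rng.random((particles, N))
vel = np.zeros((particles, N))
pbest = pos.copy()
pbest_val = np.array([fitness(p) for p in pos])
gbest = pbest[np.argmax(pbest_val)].copy()
gbest_val = pbest_val.max()

for _ in range(iters):
    r1, r2 = rng.random((particles, N)), rng.random((particles, N))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    vals = np.array([fitness(p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    if vals.max() > gbest_val:
        gbest_val, gbest = vals.max(), pos[np.argmax(vals)].copy()

print("selected measuring points:", sorted(np.argsort(gbest)[-n:].tolist()))
print("fitness:", gbest_val)
```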
Techniques for signal processing and feature extraction are crucial for obtaining
key performance information so that the system can diagnose and prognose
effectively. The thesis analyzes vibration signals with traditional methods, such as
the fast Fourier transform (FFT) and short-time Fourier transform (STFT), and with
modern signal analysis techniques such as the wavelet transform. These techniques,
together with feature extraction methods such as Principal Component Analysis
(PCA), are used to obtain the condition indicators for diagnosis and prognosis.
For fault diagnosis, the thesis combines the methods of signal processing and
feature extraction mentioned above with data mining techniques such as Artificial
Neural Networks (ANN) and the Self-organizing Map (SOM). These methods can
detect and diagnose faults effectively.
For fault prognosis, the thesis proposes a methodology to predict an indicator of
component faults based on the information collected by sensors and an ANN, rather
than on traditional statistical methods. This methodology has already been applied
to wind turbine fault prognosis, where it works effectively. The method establishes
an ANN model of the indicator under the normal condition of the wind turbine,
using historical SCADA data which are collected by the wind farm operator but not
used effectively. Then the thresholds for different conditions can be set by using
historical data with faults of different extents. Finally, the ANN model can be
applied online to monitor the wind turbines and give staff early warning of faults,
so that they can schedule maintenance actions in advance to reduce downtime,
production loss and maintenance cost.
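The following minimal sketch illustrates this normal-behaviour modelling approach with synthetic data: an ANN (scikit-learn's MLPRegressor) is trained to predict a hypothetical gearbox bearing temperature from power output and ambient temperature, and warning and alarm thresholds are derived from the training residuals. The signal names, thresholds and data are illustrative assumptions, not the configuration used in Chapter 7.

```python
# Minimal sketch of the normal-behaviour ANN approach for wind turbine SCADA data.
# Data and signal names are synthetic stand-ins.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
power = rng.uniform(0, 2000, 3000)             # kW, healthy operation
ambient = rng.uniform(-10, 30, 3000)           # deg C
bearing_temp = 30 + 0.01 * power + 0.5 * ambient + rng.normal(0, 1.0, 3000)

X = np.column_stack([power, ambient])
model = MLPRegressor(hidden_layer_sizes=(20, 10), max_iter=2000, random_state=0)
model.fit(X, bearing_temp)

residual = bearing_temp - model.predict(X)
warn_level = residual.mean() + 3 * residual.std()      # warning threshold
alarm_level = residual.mean() + 5 * residual.std()     # alarm threshold

# Online use: compare the residual of a new sample against the thresholds
x_new = np.array([[1500.0, 10.0]])
temp_new = 58.0                                         # measured value, assumed abnormally high
r_new = temp_new - model.predict(x_new)[0]
status = "alarm" if r_new > alarm_level else "warning" if r_new > warn_level else "normal"
print(f"residual = {r_new:.2f}, status = {status}")
```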
For different purposes, different maintenance models are established. Based on
these models, the maintenance schedule can be optimized by Swarm Intelligence
(SI), i.e. Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO) and
Bee Colony Algorithm (BCA). The algorithms of PSO, ACO and BCA are
improved or modified in order to be applied to maintenance scheduling
optimization.
1.4 List of Scientific Articles
- Zhang, Z., Wang, Y. and Wang, K. (2013). Fault Diagnosis and Prognosis using Wavelet Packet Decomposition, Fourier Transform and Artificial Neural Network. Journal of Intelligent Manufacturing, vol. 24 (6), pp. 1213-1227 (doi: 10.1007/s10845-012-0657-2).
- Zhang, Z., Wang, Y. and Wang, K. (2013). Intelligent Fault Diagnosis and Prognosis Approach for Rotating Machinery integrating Wavelet Transform, Principal Component Analysis, and Artificial Neural Networks. International Journal of Advanced Manufacturing Technology, vol. 68 (1-4), pp. 763-773 (doi: 10.1007/s00170-013-4797-0).
- Zhang, Z. and Wang, K. (2014). Wind turbine fault detection based on SCADA data analysis using ANN. Advances in Manufacturing, 2(1), pp. 70-78 (doi: 10.1007/s40436-014-0061-6).
- Liu, Y., Zhang, Z. and Liu, Z. (2011). Customized Configuration for Hierarchical Products: Component Clustering and Optimization with PSO. The International Journal of Advanced Manufacturing Technology, 57, pp. 9-12.
- Zhang, Z. and Wang, K. (2013). Wind Turbine Fault Detection Based on SCADA Data Analysis Using ANN. International Workshop of Advanced Manufacturing and Automation (IWAMA2013), Nov. 27, pp. 323-335.
- Wang, K., Sharma, V. and Zhang, Z. (2013). SCADA Data Interpretation for Condition-based Monitoring of Wind Turbines. International Workshop of Advanced Manufacturing and Automation (IWAMA2013), Nov. 27, pp. 307-321.
- Zhang, Z. and Wang, K. (2012). Sensors Placement Optimization for Condition Monitoring. Proceedings of the International Workshop of Advanced Manufacturing and Automation (IWAMA2012), June 20-21, pp. 69-76.
- Zhang, Z. and Wang, K. (2012). IFDPS - Intelligent Fault Diagnosis and Prognosis System for Condition-based Maintenance. International Workshop of Advanced Manufacturing and Automation (IWAMA2012), June 20-21, pp. 77-84.
- Zhang, Z. and Wang, K. (2010). Application of Improved Discrete Particle Swarm Optimization (IDPSO) in Generating Unit Maintenance Scheduling. International Workshop of Advanced Manufacturing and Automation (IWAMA2010), pp. 79-86.
- Zhang, Z. and Wang, K. (2012). Dynamic Condition-Based Maintenance Scheduling Using Bee Colony Algorithm. Proceedings of the International Asia Conference on Industrial Engineering and Management Innovation (IEMI2012), Oct. 27-29, pp. 1607-1618.
- Zhang, Z. and Wang, K. (2011). Fault isolation using self-organizing map (SOM) ANNs. IET International Conference of Wireless Mobile & Computing, Nov. 14-16, pp. 425-431.
- Wang, K. and Zhang, Z. (2012). Application of Radio Frequency Identification (RFID) to Manufacturing. SFI-Norman, SINTEF (ISBN 978-82-14-05388-3), pp. 1-24.
- Wang, K. and Zhang, Z. (2011). Intelligent Fault Diagnosis and Prognosis System (IFDAPS) for Condition-based Maintenance. Trondheim: SINTEF A17147 (ISBN 978-82-14-05057-8), pp. 1-21.
- Wang, K., Sharma, V. and Zhang, Z. (2013). SCADA Data Mining for Wind Turbine Fault Diagnosis and Failure Prognosis: Principles, Trends, Applications and Research Areas. Trondheim: SINTEF (ISBN 978-82-14-05496-5), pp. 1-20.
- Zhang, Z. and Wang, K. (2012). Fault Diagnosis using Association Rules. In Wang, K. and Wang, Y. (Eds.): Data Mining for Zero-Defect Manufacturing, pp. 53-75.
- Cusanno, R., Zhang, Z., Regattieri, A. and Wang, K. (2012). Apply Particle Swarm Optimization for Condition-based Maintenance Scheduling. In Wang, K. and Wang, Y. (Eds.): Data Mining for Zero-Defect Manufacturing, pp. 117-131.
- Cruciani, D., Zhang, Z. and Wang, K. (2012). Fault Diagnosis and Prognosis Using Self-organizing Map. In Wang, K. and Wang, Y. (Eds.): Data Mining for Zero-Defect Manufacturing, pp. 101-115.
2 Framework of Intelligent Fault Diagnosis and Prognosis Systems (IFDPS) for CBM
2.1 Introduction
The main objective of IFDPS is to establish a framework that shows how to use
signals, databases, analysis tools and maintenance decision-making techniques for
Condition-based Maintenance.
Fig. 2.1 shows the general structure of IFDPS, which spans machine degradation,
sensors, signal processing, fault diagnosis and prognosis, and maintenance
scheduling optimization. The main tasks performed by IFDPS are the following:
- Continuous collection of data from the different sensors mounted on the
machine, including information about the machine and its environment.
- Continuous processing of the data collected from the sensors in order to obtain
useful information for evaluating the off-line and on-line health condition of
the machine, and for detecting whether symptoms of degradation or
anomalies are present or could become present.
- Based on this information, the condition or the fault can be identified. If any
degradation becomes unacceptable, the system can tell staff which
components or machines should be maintained or repaired, and when.
- According to the condition of the component or machine, the remaining
useful life can be predicted.
- According to the condition of the equipment and the predicted remaining
useful life, the maintenance action plan can be scheduled by intelligent
computational optimization algorithms.
The techniques for these subtasks are presented in the following sections.
Data acquisition is the first phase of IFDPS; it is the basis for fault diagnosis and
prognosis, and hence the foundation of intelligent Condition-based Maintenance
scheduling. The tasks of this phase are selecting suitable sensors and an optimal
sensing strategy. Sensors and sensing strategies constitute the foundational basis for
fault diagnosis and prognosis. Strategic issues arising with sensor suites employed
to collect data that eventually will lead to online realization of diagnostic and
prognostic algorithms are associated with the type, number, and location of sensors;
their size, weight, cost, dynamic range, and other characteristic properties; whether
they are of the wired or wireless variety; etc. [Vachtsevanos et al., 2006]. Data
collected by transducing devices are rarely useful in their raw form. Such data must
be processed appropriately so that useful information can be extracted that is a
reduced version of the original data but preserves as far as possible those
characteristic features or fault indicators that are indicative of the fault events we
are seeking to detect, isolate, and predict the time evolution of. Thus such data
must be preprocessed, that is, filtered, compressed, correlated, etc., in order to
remove artifacts and reduce noise levels and the volume of data to be processed
subsequently. Furthermore, the sensors providing the data must be validated; that is,
it must be confirmed that the sensors themselves are not subject to fault conditions.
Once the preprocessing module confirms that the sensor data are “clean” and
formatted appropriately, features or signatures of normal or faulty conditions must
be extracted. This is the most significant step in the IFDPS architecture, whose
output sets the stage for accurate and timely diagnosis of fault modes. The extracted
feature vector serves as one of the essential inputs to the fault diagnostic algorithms.
The following subsections introduce these two aspects: sensor classification and
sensor placement optimization.
flaw is analyzed; this information can also be used to optimize the sensor
placement.
Generally, there are two steps in dealing with the signals from sensors. The first is
signal preprocessing, which is intended to enhance the signal characteristics so as to
facilitate the efficient extraction of useful information, i.e. indicators of the
condition of the monitored component or subsystem. The tools in this category
include filtering, amplification, data compression, data validation and de-noising,
which generally aim at improving the signal-to-noise ratio. The second is extracting,
from the preprocessed signals, features that are characteristic of an incipient failure
or fault. Generally, features can be extracted in three domains: the time domain, the
frequency domain and the time-frequency domain. The available signal
preprocessing and feature extraction methods are shown in Table 2.1; which
features should be selected depends on the actual machine or system. All these
methods are selectable in IFDPS, and which methods are applied is decided by
analysis of the actual machine or system. Moreover, in order to express enough
information about the condition, the methods in the table can be combined to form
indicators of the condition.
Table 2.1 The Methods of Signal Pre-process and Signal Process
Signal Preprocessing: Filter, Amplification, Signal Conditioning, Extracting Weak Signals, De-noising, Vibration Signal Compression and Time Synchronous Averaging (TSA)
Signal Process - Time Domain: Mean, RMS, Shape factor, Skewness, Kurtosis, Crest factor, Entropy Error, Entropy estimation, Histogram Lower and Histogram Upper
Signal Process - Frequency Domain: Continuous Fourier Transform (CFT), Discrete Fourier Transform (DFT), Fast Fourier Transform (FFT), Wigner-Ville Distribution (WVD) and Short Time Fourier Transformation (STFT)
Signal Process - Wavelet Domain: Wavelet Transform (WT) and Wavelet Packet (WP)
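As an illustration of how a preprocessing step and a few of the time-domain features in Table 2.1 fit together, the following minimal Python sketch band-pass filters a synthetic vibration signal and then computes the mean, RMS, skewness, kurtosis, crest factor and shape factor. The signal and the filter band are assumptions made for the example only.

```python
# Minimal sketch: signal preprocessing (band-pass filtering) followed by a few
# time-domain features from Table 2.1. Signal and filter band are assumed.
import numpy as np
from scipy.signal import butter, filtfilt
from scipy.stats import kurtosis, skew

fs = 10000.0                                    # sampling frequency (Hz), assumed
t = np.arange(0, 1.0, 1.0 / fs)
x = np.sin(2 * np.pi * 120 * t) + 0.2 * np.random.randn(t.size)

# Preprocessing: zero-phase band-pass filter (de-noising / band selection)
b, a = butter(4, [50.0, 500.0], btype="bandpass", fs=fs)
x_f = filtfilt(b, a, x)

# Time-domain features
rms = np.sqrt(np.mean(x_f ** 2))
features = {
    "mean": np.mean(x_f),
    "rms": rms,
    "skewness": skew(x_f),
    "kurtosis": kurtosis(x_f),
    "crest_factor": np.max(np.abs(x_f)) / rms,
    "shape_factor": rms / np.mean(np.abs(x_f)),
}
for name, value in features.items():
    print(f"{name:>12s}: {value:.4f}")
```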
Fault diagnosis strategies have been developed in recent years and have found
extensive utility in a variety of application domains. The enabling technologies
typically fall into two major categories: model based and data-driven, as shown in
Fig. 2.2. Model-based techniques rely on an accurate dynamic model of the system
and are capable of detecting even unanticipated faults. They take advantage of the
actual system and model outputs to generate a ‘‘discrepancy’’ or residual, as it is
known, between the two outputs that is indicative of a potential fault condition. On
the other hand, data-driven techniques often address only anticipated fault
conditions, where a fault ‘‘model’’ now is a construct or a collection of constructs
such as neural networks, expert systems, etc. that must be trained first with known
prototype fault patterns (data) and then employed online to detect and determine
the faulty component’s identity.
IFDPS focuses on data-driven and hybrid techniques. If historical data can be
obtained easily, the data-driven approach is very good for identifying faults and
evaluating the condition. When only part of the historical data can be obtained,
hybrid techniques, which combine data-driven and model-based techniques, can be
used to evaluate the condition of the machine effectively. Semi-supervised learning
methods can also be used to evaluate the condition and identify faults when only
part of the historical data is labelled, and they are very effective. All these
techniques are selectable according to the analysis of the actual manufacturing
system.
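The following minimal sketch illustrates the partially labelled, data-driven situation described above on synthetic two-dimensional features: only a few samples per class are labelled, and the labels are propagated to the rest by a graph-based semi-supervised learner (scikit-learn's LabelSpreading), which is used here merely as a simple stand-in for the manifold regularization method described in Chapter 3. The feature values and class layout are invented.

```python
# Minimal sketch: semi-supervised fault identification with few labelled samples.
# LabelSpreading is a stand-in for the manifold regularization used in the thesis.
import numpy as np
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(2)
# Two fault classes in a 2-D feature space (e.g. RMS and kurtosis), synthetic
healthy = rng.normal([1.0, 3.0], 0.2, size=(100, 2))
faulty = rng.normal([2.0, 6.0], 0.2, size=(100, 2))
X = np.vstack([healthy, faulty])
y_true = np.array([0] * 100 + [1] * 100)

# Only 5 samples per class are labelled; the rest are unlabelled (-1)
y = np.full(200, -1)
labelled_idx = np.concatenate([np.arange(5), 100 + np.arange(5)])
y[labelled_idx] = y_true[labelled_idx]

clf = LabelSpreading(kernel="rbf", gamma=5.0)
clf.fit(X, y)
accuracy = (clf.transduction_ == y_true).mean()
print(f"transductive accuracy on unlabelled data: {accuracy:.2%}")
```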
In a first step, a physical model must be established if a good model is not available.
This can be challenging and requires good knowledge about the problem that is
modelled. However, once a good model is available, it can be applied to all
comparable problems where good estimates or measurements of the model input
parameters are available. Since processes in the real world may be quite
complicated and may be affected by many mechanisms and effects, one usually
does not have the possibility to take all of them into account. Thus, a physical
model may be restricted to include only the main mechanisms and main effects.
Physical models are often empirical, which means they are based on observation or
experiments. Physical models can basically be used for all kinds of predictions,
both long-term and short-term, depending on what they are designed for.
Most stochastic models are of a general nature and can be applied to many different
problems. An advantage of stochastic models applied to lifetime prediction is that
both an estimate of the mean lifetime and various estimates of uncertainty can be
established, such as the variance of the lifetime and confidence intervals for
parameters and predictions. Parameter estimation in stochastic modelling is based
on observation of the model output. Thus, observations of the model output, such
as observations of lifetime or degradation, are usually collected as the basis for
parameter estimation. When possible, one should fit different stochastic models to
the data and choose the model that gives the best prediction. Many techniques exist
to choose the best model and to check the goodness of fit (e.g. p-values, confidence
intervals, comparison of maximum likelihood values and various graphical methods
such as probability plots). As an alternative to data collection, expert judgement can
be used for parameter estimation; different techniques for expert judgement exist,
e.g. [Cooke, 1992]. Stochastic models can basically be used for both short-term and
long-term predictions. However, for lifetime prediction they are mostly used to
make medium- and long-term predictions. Furthermore, they are often used in
system modelling or as input to other models (such as maintenance and
optimization models) where the main interest is in long-term averages (such as
failure rates). They can also successfully be applied for comparing and explaining
the lifetime influence of different designs or other factors, either by looking at the
results from different samples or by incorporating explanatory variables in the
model.
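As a concrete example of such a stochastic lifetime model, the following minimal sketch fits a two-parameter Weibull distribution to synthetic lifetime observations and reports the estimated mean lifetime and a 90% lifetime interval. The data and parameter values are illustrative assumptions.

```python
# Minimal sketch: fitting a two-parameter Weibull lifetime model to observed
# lifetimes (synthetic stand-ins for field data) and deriving mean life and range.
import numpy as np
from scipy.stats import weibull_min

lifetimes = weibull_min.rvs(c=2.2, scale=1800.0, size=60, random_state=4)  # hours, synthetic

# Maximum-likelihood fit with the location fixed at zero
shape, loc, scale = weibull_min.fit(lifetimes, floc=0)
mean_life = weibull_min.mean(shape, loc=loc, scale=scale)
lower, upper = weibull_min.interval(0.90, shape, loc=loc, scale=scale)

print(f"shape (beta) = {shape:.2f}, scale (eta) = {scale:.0f} h")
print(f"mean lifetime      = {mean_life:.0f} h")
print(f"90% lifetime range = [{lower:.0f}, {upper:.0f}] h")
```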
However, in the absence of a reliable and accurate system model, and statistical
data, another approach to determine the remaining useful life is to monitor the
trajectory of a developing fault and predict the amount of time until the developing
fault reaches a predetermined level requiring action; this is the so-called data-driven prognostic method. Data-driven techniques utilize monitored operational
data related to system health. They can be beneficial when understanding of first
principles of system operation is not straightforward or when the system is so
complex that developing an accurate alternative model is prohibitively expensive.
An added value of data-driven techniques is their ability to transform high-
dimensional noisy data into lower dimensional information useful for decision-
making [Dragomir et al., 2007]. Furthermore, recent advances in sensor technology
and refined simulation capabilities enable us to continuously monitor the health of
operating components and manage the related large amount of reference data.
[Figure (referenced as Fig. 2.3 in the text): distributions of remaining useful life for each condition level; vertical axis 0–0.1 (probability density), horizontal axis 0–500 (hours).]
After the condition of the component is determined, the remaining useful life can be evaluated according to that condition. Most current RUL estimation methods are based on event data or condition monitoring data and seek the relationship between RUL and the operating time of the component, or between RUL and feature values
[Lee et al., 2006; Si et al., 2011]. The method of Fig. 2.3 instead relates the RUL to the condition of a component, evaluating the RUL from the condition and from an RUL distribution for each condition level. The distributions of RUL are obtained by statistical methods. For example, if the condition index of a component is 0, the remaining useful life is 350 h with a certain standard deviation. When the condition index is 1.0, the RUL is very close to 0, which means the component has to be maintained or repaired. Fig. 2.3 also shows that the RUL distributions become narrower as the condition approaches failure, which means the RUL estimate becomes more accurate. Therefore, the confidence in the RUL estimate increases as the condition deteriorates.
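To make the mapping from condition to RUL concrete, the following minimal Python sketch interpolates a mean RUL and an approximate 95% interval from a small condition-to-RUL lookup table. The numerical values, the linear interpolation and the normality assumption are illustrative assumptions only and are not taken from the thesis data.

    import numpy as np

    # Illustrative lookup table (hypothetical values): condition index in [0, 1]
    # versus mean RUL in hours and its standard deviation. In IFDPS these
    # distributions would come from statistical analysis of historical data.
    condition_levels = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
    mean_rul_hours   = np.array([350., 260., 180., 110.,  50.,   0.])
    std_rul_hours    = np.array([ 60.,  50.,  38.,  25.,  12.,   2.])

    def estimate_rul(condition):
        """Interpolate the RUL distribution for a measured condition index."""
        mu = np.interp(condition, condition_levels, mean_rul_hours)
        sigma = np.interp(condition, condition_levels, std_rul_hours)
        # Approximate 95% interval assuming a normal RUL distribution per condition.
        return mu, max(mu - 1.96 * sigma, 0.0), mu + 1.96 * sigma

    mu, lo, hi = estimate_rul(0.7)
    print(f"Estimated RUL: {mu:.0f} h (95% interval {lo:.0f}-{hi:.0f} h)")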
IFDPS also proposes another method to predict the RUL, by establishing an ANN model of the machine in normal condition and setting thresholds at several different levels. This method has been applied in real industries such as the wind power industry, as described in Chapter 7.
The functions of maintenance are to determine the fault status of the maintenance object and the effect of maintenance, and to select the right maintenance policy to achieve the expected maintenance effect. Its purpose is to make maintenance decisions based on current information in order to effectively prevent the occurrence and development of failures, ensure the safety of equipment and personnel, and reduce the economic losses caused by failure. Maintenance scheduling optimization is an NP-hard problem, and Swarm Intelligence (SI) algorithms are well suited to solving this kind of problem. IFDPS applies the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO) and the Bee Colony Algorithm (BCA) to find the optimal dynamic predictive maintenance schedule. All of these methods can be selected in IFDPS to solve maintenance scheduling optimization problems.
2.4 Summary
Chapter 3: Data Mining Techniques for IFDPS
3.1 Introduction
There are many aspects of the IFDPS framework that can be researched. The volumes of data produced by sensors and processing procedures become tremendous, filling computers and networks. Sometimes the data are too large and too complicated to analyze effectively, and how to extract useful information from these data therefore becomes a very significant issue. This PhD work mainly focuses on the application of Data Mining (DM) techniques in all processes of IFDPS, such as sensor placement optimization, fault diagnosis, fault prognosis and maintenance scheduling optimization. There is already some research in these areas, but most of it focuses on a single process. DM technology has recently become a hot topic for decision-makers because it provides valuable, hidden business and scientific "intelligence" from a large amount of historical data. It is a family of methods for extracting information and knowledge from recorded data. This Chapter describes some DM techniques used in the PhD work.
Data mining can be defined as the analysis of (often large) observational data sets
to find unsuspected relationships and to summarize the data in novel ways that are
both understandable and useful to the data owner [Hand et al., 2001]. It is the entire
process of applying computer-based methodology, including new techniques for
knowledge discovery, from data. It draws ideas and resources from multiple
disciplines, including machine learning, statistics, database research, high
performance computing and commerce. This explains the dynamic, multifaceted
and rapidly evolving nature of the data mining discipline. Generally, there are two
main goals of data mining: prediction and description. Prediction involves using
some variables or fields in the dataset to predict unknown or future values of other
variables of interest. Description focuses on finding patterns describing the data
that can be interpreted by humans. Therefore, the data mining activities can be
classified into two categories: predictive data mining which produces the model of
the system described by the given dataset, and the descriptive data mining which
produces new, nontrivial information based on the available dataset. The main
tasks of DM techniques are [Kantardzic, 2003]:
• Classification – discovery of a predictive learning function that classifies a data item into one of several predefined classes.
• Regression – discovery of predictive learning function, which maps a data item to a real-value prediction variable.
• Clustering – a common descriptive task in which one seeks to identify a finite set of categories or clusters to describe the data.
• Summarization – an additional descriptive task that involves methods for finding a compact description for a set of data.
• Dependency Modeling – finding a local model that describes significant dependencies between variables or between the values of a feature in a data set or in a part of a data set.
Pattern classification theory has become a key factor in fault diagnosis and prognosis. Some classification methods for equipment performance monitoring use the relationship between the type of fault and a set of patterns extracted from the collected signals, without establishing explicit models. Currently, ANN is one of the most popular methods in this domain. An ANN is a model that emulates a biological neural network [Wang, 2005]. The origin of ANN can be traced back to a seminal paper by McCulloch and Pitts [McCulloch & Pitts, 1943], which demonstrated that a collection of connected processors, loosely modeled on the organization of the brain, could theoretically perform any logical or arithmetic operation. Since then, ANN techniques have developed rapidly into many categories, including Back-Propagation (BP), Self-Organizing Map (SOM) and Radial Basis Function (RBF) networks. The appeal of artificial neural network models lies in the fact that they can be used to infer a function from observations. This is particularly useful in applications where the complexity of the data or task makes the design of such a function by hand impractical, which is typically the case in diagnostic problems. The BP neural network is a main type of ANN used to solve fault diagnosis and prognosis problems.
ANN can deal with complex non-linear problems without sophisticated and specialized knowledge of the real systems. It is an effective classification technique with low operational response times after training. The relationship between the condition of a component and the extracted features is non-linear rather than linear. A BP neural network does not need to know the exact form of the analytical function on which the model should be built; neither the functional type nor the number and position of the parameters in the model function need to be known. It can deal with multi-input, multi-output, quantitative or qualitative, complex systems, with very good abilities of data fusion, self-adaptation and parallel processing. Therefore, it is very suitable as a method for fault diagnosis and prognosis. There are many papers dealing with the use of ANN, and most of their contributions concern ANN training efficiency and strategies for the ANN itself. In IFDPS, ANN is used to detect and predict the condition of machines together with other techniques such as wavelet analysis and the Fourier transform. Two ANN techniques, the Supervised Back-Propagation (SBP) network and the Self-Organizing Map, are introduced in this subsection.
A BP network is a multilayer feed-forward network, usually containing an input layer, a hidden layer and an output layer (Fig. 3.1), which is trained by an error back-propagation algorithm. The biggest advantage of ANNs trained by back-propagation is that there is no need to know the exact form of the analytical function on which the model should be built; neither the function type nor the number and position of the parameters in the model function are required. Moreover, a BP network can learn and store a large number of input–output mappings without the mathematical equations describing these mappings. The learning method of BP is the steepest descent method, which adjusts the weights and thresholds of the network to minimize the sum of squared errors. The general procedure of BP network training can be summarized as follows [Wang, 2005].
1) Initialize the weights to small random values in (-1, 1);
2) Select a training vector pair (input and the corresponding desired output) from the training set and present the input vector to the input layer of the ANN;
3) Calculate the actual outputs (forward phase);
4) Adjust the weights to reduce the difference between the actual output and the target (backward phase);
5) Return to step 2 and repeat for each pattern p until the total error has reached an acceptable level;
6) Stop.
The backward phase and forward phase are described separately in this section.
In the forward phase, the net input to hidden-layer node $j$ is $H_j = \sum_{i=1}^{m} v_{ji} x_i$, and the net input to node $k$ of the output layer is

$$I_k = \sum_{j=1}^{h} w_{kj}\, y_j, \qquad k = 1, 2, \ldots, n \tag{3.2}$$

where $H_j$ is the combined (net) input to node $j$ of the hidden layer, while $I_k$ is the net input to node $k$ of the output layer. The output of each node of the hidden layer and of the output layer is obtained, respectively, as

$$y_j = f(H_j), \qquad j = 1, 2, \ldots, h \tag{3.3}$$

$$z_k = f(I_k) \tag{3.4}$$

where $f$ is an activation function. Finally, the output of node $k$ of the output layer can be rewritten as

$$z_k = f\left[ \sum_{j=1}^{h} w_{kj}\, f\!\left( \sum_{i=1}^{m} v_{ji} x_i \right) \right] \tag{3.5}$$

In the backward phase, the weight change for the output-layer weights is

$$\Delta w_{kj} = \eta\, \delta_k\, y_j = \eta\, y_j (t_k - z_k) f'(I_k) \tag{3.6}$$

where $\eta$ is a constant often called the learning rate. The update rule for the hidden-layer weights can then be obtained as

$$\Delta v_{ji} = \eta\, \delta_j\, x_i = \eta\, x_i f'(H_j) \sum_{k=1}^{n} \delta_k w_{kj} = \eta\, x_i f'(H_j) \sum_{k=1}^{n} \left[ (t_k - z_k) f'(I_k) w_{kj} \right] \tag{3.7}$$

All weights $w_{kj}$ and $v_{ji}$ can be updated according to Eq. (3.6) and Eq. (3.7) respectively as follows:

$$w_{kj}^{new} = w_{kj}^{old} + \Delta w_{kj} = w_{kj}^{old} + \eta\, y_j (t_k - z_k) f'(I_k) \tag{3.8}$$

$$v_{ji}^{new} = v_{ji}^{old} + \Delta v_{ji} = v_{ji}^{old} + \eta\, x_i f'(H_j) \sum_{k=1}^{n} \left[ (t_k - z_k) f'(I_k) w_{kj} \right] \tag{3.9}$$
This is the process of the forward and backward phases in BP network training. The whole training process can then be carried out according to the steps described above. The objective of ANN training is to obtain suitable weights that map the inputs to the targets of the training data. After the training of the BP network, for each set of test or query data there is a set of outputs calculated by the final updated weights. The BP network is a very useful model in real applications, especially when a physical or mathematical model is unavailable. It acts as a black box, which allows no physical interpretation of its internal parameters and functions. This property is very important when applying the BP network to condition monitoring, because most real mathematical models are unavailable. For a specific application in fault diagnosis and prognosis, after training with features extracted from processed historical data, the BP network can classify faults and predict the states of the monitored components or machine units.
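As an illustration of the training procedure and the update rules in Eqs. (3.2)–(3.9), the following minimal NumPy sketch trains a single-hidden-layer BP network with a sigmoid activation on a toy data set. The network sizes, learning rate and data are illustrative assumptions, not values used in IFDPS.

    import numpy as np

    rng = np.random.default_rng(0)

    def f(x):              # sigmoid activation
        return 1.0 / (1.0 + np.exp(-x))

    def f_prime(x):        # derivative of the sigmoid w.r.t. its net input
        s = f(x)
        return s * (1.0 - s)

    # Hypothetical toy data: 4 feature vectors (m = 3 features), 2 output classes.
    X = rng.random((4, 3))
    T = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)

    m, h, n = 3, 5, 2                     # input, hidden and output layer sizes
    V = rng.uniform(-1, 1, (h, m))        # hidden-layer weights v_ji
    W = rng.uniform(-1, 1, (n, h))        # output-layer weights w_kj
    eta = 0.5                             # learning rate

    for epoch in range(2000):
        for x, t in zip(X, T):
            # Forward phase: Eqs. (3.2)-(3.5)
            H = V @ x                     # net input to hidden nodes
            y = f(H)
            I = W @ y                     # net input to output nodes
            z = f(I)
            # Backward phase: Eqs. (3.6)-(3.9)
            delta_k = (t - z) * f_prime(I)
            delta_j = f_prime(H) * (W.T @ delta_k)
            W += eta * np.outer(delta_k, y)
            V += eta * np.outer(delta_j, x)

    print("trained outputs:\n", f(W @ f(V @ X.T)).T.round(2))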
In general, the learning process of the SOM network consists of the following steps [Wang, 2005].
1) Initialize the weight vectors $\omega_{ij}$ randomly, the learning rate $\eta$ and the other relevant training parameters.
2) For each input vector, calculate the responses of all neurons in the output layer and select the winning node $U_c$, i.e. the node whose weight vector $\omega_{ij}$ best matches the input vector (the Euclidean distance is the smallest among all nodes).
3) After the winning node is selected, identify the neighborhood around $U_c$, that is, the set of competitive units close to the winning node. Fig. 3.3 shows two examples of a neighborhood around the winning node: a rectangular lattice and a hexagonal lattice. The size of the neighborhood begins with a sufficiently large value and then decreases with the number of iterations of the network.
4) Update the weight vectors of node $U_c$ and of all nodes in the neighborhood around it using the following functions:

$$\omega_j(t+1) = \begin{cases} \omega_j(t) + \eta(t)\, f(d_c - d_i)\,\big(x - \omega_j(t)\big), & j \in H(t) \\ \omega_j(t), & \text{otherwise} \end{cases} \tag{3.10}$$

$$H(t) = H_0 \left(1 - \frac{t}{T}\right) \tag{3.11}$$

$$\eta(t) = (\eta_{max} - \eta_{min}) \frac{T - t}{T - 1} + \eta_{min} \tag{3.12}$$
where:
$t$: the current learning epoch;
$x$: the input vector;
$T$: the total number of learning epochs;
$H_0$: the initial neighborhood size;
$d_c - d_i$: the topological distance between the central neuron $c$ and the current neuron $i$;
$f$: a topology-dependent function;
$H(t)$: the actual neighborhood size in epoch $t$;
$\eta(t)$: the learning rate in epoch $t$.
5) Update the learning rate using Eq. (3.12).
6) Reduce the neighborhood size using Eq. (3.11).
7) Loop from 2) to 6) until there are no noticeable changes in the feature map.
The SOM network has both advantages and disadvantages. SOM allows data to be clustered when there is no prior knowledge of the results or of the clustering. It is able to convert multi-dimensional data clusters into a two-dimensional grid while preserving the topological relationships of the data. It may be used where there is an ample supply of "good normal" data containing some, but little, bad or unusual data, as in engine monitoring or alarm monitoring. The SOM also has serious computational disadvantages, which affect the performance of large-scale applications running on parallel computers. In order to find which neuron is to be stimulated, the program has to check all of the neurons; this is a significant restriction when large SOM networks are to be trained. Sometimes the grid size may need to be adjusted in response to the number of clusters expected.
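The following minimal NumPy sketch illustrates the SOM learning steps above, using a rectangular neighbourhood that shrinks according to Eq. (3.11) and a linearly decreasing learning rate as in Eq. (3.12); the topology function f in Eq. (3.10) is simplified to a constant value of 1 inside the neighbourhood. The grid size, number of epochs and input data are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical feature vectors (e.g. extracted from vibration signals).
    X = rng.random((200, 4))

    grid = (6, 6)                                   # output map size
    T_epochs = 100                                  # total number of learning epochs
    H0 = 3                                          # initial neighbourhood radius
    eta_max, eta_min = 0.5, 0.01
    W = rng.random(grid + (X.shape[1],))            # weight vectors w_ij

    coords = np.stack(np.meshgrid(np.arange(grid[0]), np.arange(grid[1]),
                                  indexing="ij"), axis=-1)

    for t in range(T_epochs):
        eta = (eta_max - eta_min) * (T_epochs - t) / (T_epochs - 1) + eta_min  # Eq. (3.12)
        radius = max(H0 * (1 - t / T_epochs), 1.0)                             # Eq. (3.11)
        for x in X:
            # Winning node: smallest Euclidean distance between x and the weights.
            d = np.linalg.norm(W - x, axis=-1)
            c = np.unravel_index(np.argmin(d), grid)
            # Rectangular neighbourhood around the winner (topological distance).
            topo = np.abs(coords - np.array(c)).max(axis=-1)
            mask = topo <= radius
            # Eq. (3.10): move the winner and its neighbours towards the input.
            W[mask] += eta * (x - W[mask])

    # Map a new sample to its best-matching unit on the 6 x 6 grid.
    sample = rng.random(4)
    bmu = np.unravel_index(np.argmin(np.linalg.norm(W - sample, axis=-1)), grid)
    print("best-matching unit:", bmu)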
Semi-supervised learning (SSL) is provided with some supervision information, but not necessarily for all examples. In this case, the data set $X = (x_i)_{i \in [n]}$ can be divided into two parts: the points $X_l = (x_1, \ldots, x_l)$, for which labels $Y_l = (y_1, \ldots, y_l)$ are provided, and the points $X_u = (x_{l+1}, \ldots, x_{l+u})$, the labels of which are not known. SSL is very useful in real industrial applications, especially when the historical data are huge but only a small part of them are labeled. Semi-supervised learning methods fall into five categories: SSL with generative models, SSL with low-density separation, graph-based methods, co-training methods, and self-training methods [Blum & Chawla, 2001; Yuan, 2012].
Recently, graph-based methods with more widely applicable assumptions have attracted considerable attention. Specifically, graph-based manifold regularization [Belkin et al., 2006] exploits the geometric structure of the marginal distribution of the CM data in the feature space. The incorporation of unlabeled data has demonstrated the potential for improved accuracy in time series prediction [Wei & Keogh, 2006], speech recognition [Jansen & Niyogi, 2005] and the calibration-effort reduction problem [Pan et al., 2001]. In this work, we are looking for a general semi-supervised classification framework for fault detection, and the manifold regularization based methods are a good option.
Manifold regularization combines the ideas of spectral graph theory, manifold learning and kernel methods in a coherent and natural way, incorporating both the cluster assumption and the manifold assumption in a Reproducing Kernel Hilbert Space (RKHS) regularization framework. In this section, we briefly present the manifold regularization based SSL framework, following the description of [Belkin et al., 2006]; for more details, refer to [Sindhwani et al., 2005].
As mentioned above, consider a set of $l$ labelled samples $\{(x_i, z_i)\}_{i=1}^{l}$ and a set of $u$ unlabelled samples $\{x_j\}_{j=l+1}^{l+u}$, where $x_i, x_j \in \mathbb{R}^d$ are the feature vectors collected from the input space $\mathcal{X}$ according to the marginal distribution $P_{\mathcal{X}}$, and $z_i \in \mathbb{R}$ is the class label determined by the conditional distribution $P(z \mid x)$. Manifold regularization introduces an additional regularizer into the regularized risk functional, which serves to impose the manifold assumption on the learning problem. The learning problem corresponds to solving the following optimization problem:

$$f^{*} = \arg\min_{f \in H_K} \; \frac{1}{l} \sum_{i=1}^{l} V(x_i, z_i, f) + \gamma_A \|f\|_K^2 + \gamma_I \int_{\mathcal{M}} \|\nabla_{\mathcal{M}} f\|^2 \, dP_{\mathcal{X}}(x) \tag{3.13}$$

which finds the optimal function $f^{*}$ in the RKHS $H_K$ of functions $f : \mathcal{X} \to \mathbb{R}$ corresponding to a Mercer kernel $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$, e.g. a linear or Gaussian kernel. The first term of the regularized risk functional in Eq. (3.13) is defined by the loss function $V$, which measures the discrepancy between the predicted value $f(x_i)$ and the actual label $z_i$. The second term controls the complexity of $f$ in terms of the RKHS norm, with $\gamma_A$ being the RKHS norm regularization parameter. The third term is specific to manifold regularization and is based on the assumption that the support of $P_{\mathcal{X}}$ forms a compact sub-manifold $\mathcal{M}$. It controls the complexity of $f$ in the intrinsic geometry of $P_{\mathcal{X}}$, with $\gamma_I$ being the corresponding manifold regularization
parameter. The third term is approximated using the graph Laplacian defined on all $l+u$ labelled and unlabelled examples, without using the label information. Hence the optimization problem can be reformulated as:

$$f^{*} = \arg\min_{f \in H_K} \; \frac{1}{l} \sum_{i=1}^{l} V(x_i, z_i, f) + \gamma_A \|f\|_K^2 + \frac{\gamma_I}{(l+u)^2} \hat{f}^{T} L \hat{f} \tag{3.14}$$

where $\hat{f} = (f(x_1), \ldots, f(x_{l+u}))^{T}$ and $L$ is the Laplacian matrix of a graph that models the underlying geometric structure.
From the extended Representer theorem [Belkin et al., 2006], the optimal function can be expressed in the following form:

$$f^{*}(x) = \sum_{i=1}^{l+u} \alpha_i K(x_i, x) \tag{3.15}$$
When the loss function $V$ in Eq. (3.14) is taken to be the squared loss function $V(x_i, z_i, f) = (z_i - f(x_i))^2$, the Laplacian Regularized Least Squares (LapRLS) algorithm is obtained, whose objective in terms of the expansion coefficients is

$$\alpha^{*} = \arg\min_{\alpha \in \mathbb{R}^{l+u}} \; \frac{1}{l} (Z - JK\alpha)^{T}(Z - JK\alpha) + \gamma_A\, \alpha^{T} K \alpha + \frac{\gamma_I}{(l+u)^2}\, \alpha^{T} K L K \alpha \tag{3.16}$$

For LapRLS, the optimal solution of Eq. (3.16), $\alpha^{*} = (\alpha_1^{*}, \ldots, \alpha_{l+u}^{*})^{T}$, is given by:

$$\alpha^{*} = \left( JK + \gamma_A l I + \frac{\gamma_I l}{(l+u)^2} L K \right)^{-1} Z \tag{3.17}$$

where $K$ is the $(l+u) \times (l+u)$ Gram matrix over all labelled and unlabelled samples, $Z$ is the $(l+u)$-dimensional label vector given by $Z = (z_1, \ldots, z_l, 0, \ldots, 0)^{T}$, and $J = \mathrm{diag}(1, \ldots, 1, 0, \ldots, 0)$ is an $(l+u) \times (l+u)$ diagonal matrix with the first $l$ diagonal entries being 1 and the rest being 0.
When the loss function $V$ in Eq. (3.14) is taken to be the hinge loss function $V(x_i, z_i, f) = \max(0, 1 - z_i f(x_i))$, the algorithm becomes the Laplacian Support Vector Machine (LapSVM); please refer to [Belkin et al., 2006] for details. The manifold regularization SSL algorithm can be summarized as follows [Belkin et al., 2006]:
Input: $l$ labelled examples $\{(x_i, y_i)\}_{i=1}^{l}$ and $u$ unlabelled examples $\{x_j\}_{j=l+1}^{l+u}$.
Step 5: Compute $\alpha^{*}$ using Eq. (3.17) for the squared loss (Laplacian RLS).
Step 6: Output the function $f^{*}(x) = \sum_{i=1}^{l+u} \alpha_i^{*} K(x_i, x)$.
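The following NumPy sketch illustrates the LapRLS solution of Eq. (3.17), using a Gaussian (RBF) kernel and a k-nearest-neighbour graph Laplacian with binary weights. The data, the graph construction and the hyper-parameter values are illustrative assumptions; a practical IFDPS implementation would build the graph and tune the regularization parameters on real condition monitoring features.

    import numpy as np

    def rbf_kernel(A, B, gamma=1.0):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    def laprls_fit(X, z_labeled, l, gamma_A=1e-2, gamma_I=1e-1, k_nn=5, gamma=1.0):
        """X: all (l+u) samples, first l rows labelled; z_labeled: labels in {-1,+1}."""
        n = X.shape[0]
        u = n - l
        K = rbf_kernel(X, X, gamma)                         # (l+u) x (l+u) Gram matrix
        # k-nearest-neighbour adjacency graph with binary weights.
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        Wadj = np.zeros((n, n))
        for i in range(n):
            nbrs = np.argsort(d2[i])[1:k_nn + 1]
            Wadj[i, nbrs] = 1.0
        Wadj = np.maximum(Wadj, Wadj.T)                     # symmetrize
        L = np.diag(Wadj.sum(1)) - Wadj                     # graph Laplacian
        J = np.diag(np.r_[np.ones(l), np.zeros(u)])         # first l diagonal entries = 1
        Z = np.r_[z_labeled, np.zeros(u)]                   # zero-padded label vector
        # Eq. (3.17): closed-form LapRLS coefficients.
        alpha = np.linalg.solve(J @ K + gamma_A * l * np.eye(n)
                                + gamma_I * l / (l + u) ** 2 * (L @ K), Z)
        return alpha

    def laprls_predict(alpha, K_train_query):
        # f*(x) = sum_i alpha_i K(x_i, x), Eq. (3.15); the sign gives the class.
        return np.sign(alpha @ K_train_query)

    # Hypothetical usage: 10 labelled + 90 unlabelled feature vectors.
    rng = np.random.default_rng(2)
    X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(3, 1, (50, 3))])
    z = np.r_[-np.ones(5), np.ones(5)]                      # labels for 5 + 5 samples
    Xall = np.vstack([X[:5], X[50:55], X[5:50], X[55:]])    # labelled samples first
    alpha = laprls_fit(Xall, z, l=10)
    print(laprls_predict(alpha, rbf_kernel(Xall, X[[0, 60]])))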
Association rules are one of the major techniques of data mining and perhaps the most common form of local-pattern discovery in unsupervised learning systems. Association rule mining retrieves all possible interesting associations (patterns, relationships or dependencies) in large sets of data items, which are stored in the form of transactions that can be generated by an external process or extracted from a relational database or data warehouse. Owing to the good scalability characteristics of association rule algorithms and the ever-growing size of the accumulated data, association rules are an essential data mining tool for extracting useful knowledge from databases. The most valuable result in this case is a rule that is interesting, i.e. one that tells you something about your database that you did not already know and probably were not able to articulate explicitly.
From a database of sales transactions, one can discover important associations among items, such that the presence of some items in a transaction implies the presence of other items in the same transaction. Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of literals called items. Let $D$ (the database) be a set of transactions, where each transaction $T$ is a set of items such that $T \subseteq I$. Note that the quantities of the items bought in a transaction are not considered, which means each item is a binary variable indicating whether the item was bought or not. Each transaction is associated with an identifier called a transaction identifier ($TID$). An example of such a transaction database is given in Table 3.1.
Table 3.1: An example transaction database
TID   Items
001   Apples, Celery, Diapers
002   Beer, Celery, Eggs
003   Apples, Beer, Celery, Eggs
004   Beer, Eggs
A transaction $T$ is said to contain a set of items $X$ if and only if $X \subseteq T$. An association rule is an implication of the form $X \Rightarrow Y$, where $X \subset I$, $Y \subset I$, and $X \cap Y = \emptyset$. The rule $X \Rightarrow Y$ holds in the transaction set $D$ with confidence $c$ if $c\%$ of the transactions in $D$ that contain $X$ also contain $Y$. The rule $X \Rightarrow Y$ has support $s$ in the transaction set $D$ if $s\%$ of the transactions in $D$ contain $X \cup Y$. Two important concepts are defined below:
— Support, which indicates the frequency (probability) of the entire rule with respect to $D$. It is defined as the ratio of the number of transactions containing $A \cup B$ to the total number of transactions (the probability of $A$ and $B$ co-occurring in $D$):

$$\mathrm{support}(A \Rightarrow B) = P(A \cup B) = \frac{|\{T \in D \mid A \cup B \subseteq T\}|}{|D|} \tag{3.18}$$
It is often desirable to pay attention to only those rules that may have a reasonably
large support. Such rules with high confidence and strong support are referred to as
strong rules. The task of mining association rules is essentially to discover strong
association rules in large databases.
guaranteed by using frequent itemsets, and thus we only need to generate the rules and prune those that do not satisfy the minimum confidence. The confidence can be defined based on the corresponding support values as follows:

$$\mathrm{confidence}(A \Rightarrow B) = P(B \mid A) = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)} \tag{3.20}$$
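The following short Python sketch computes the support of Eq. (3.18) and the confidence of Eq. (3.20) over the transactions of Table 3.1 and enumerates the strong rules among 2-itemsets. The minimum support and confidence thresholds are illustrative assumptions.

    from itertools import combinations

    # Transactions from Table 3.1.
    D = {
        "001": {"Apples", "Celery", "Diapers"},
        "002": {"Beer", "Celery", "Eggs"},
        "003": {"Apples", "Beer", "Celery", "Eggs"},
        "004": {"Beer", "Eggs"},
    }

    def support(itemset):
        """Eq. (3.18): fraction of transactions containing every item in the itemset."""
        itemset = set(itemset)
        return sum(itemset <= t for t in D.values()) / len(D)

    def confidence(A, B):
        """Eq. (3.20): support(A u B) / support(A)."""
        return support(set(A) | set(B)) / support(A)

    # Enumerate rules over frequent 2-itemsets (illustrative thresholds).
    min_sup, min_conf = 0.5, 0.7
    items = set().union(*D.values())
    for A, B in combinations(items, 2):
        if support({A, B}) >= min_sup:
            for X, Y in [({A}, {B}), ({B}, {A})]:
                c = confidence(X, Y)
                if c >= min_conf:
                    print(f"{X} => {Y}  support={support(X | Y):.2f}  confidence={c:.2f}")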
[Figure residue: a worked frequent-itemset example with support counts — 2-itemsets {A,B}:1, {A,C}:2, {B,C}:2, {A,E}:1, {B,E}:3, {C,E}:2; retained 2-itemsets {A,C}:2, {B,C}:2, {B,E}:3, {C,E}:2; 3-itemsets {A,B,C}:1, {A,B,E}:1, {A,C,E}:1, {B,C,E}:2.]
3.4.5.2 Sampling
As the database size increases, sampling appears to be an attractive approach to
data mining. Sampling generates association rules based on a sampled subset of
transactions in D . In this case, a randomly selected subset S of D is used to
search for the frequent itemsets. The generation of frequent itemsets from S is
more efficient (faster), but some of the rules that would have been generated from
D may be missing, and some rules generated from S may not be present in D , i.e.,
the “accuracy” of the rules may be lower. Usually the size of S is selected so that
the transactions can fit into the main memory, and thus only one scan of the data is
required (no paging). To reduce the possibility that we will miss some of the
frequent itemsets from D when generating frequent itemsets from S , we may use
a lower support threshold for S as compared with the support threshold for D .
This approach is especially valuable when the association rules are computed on a
very frequent basis.
3.4.5.3 Hashing
Hashing is used to reduce the size of the candidate k-itemsets $C_k$, i.e., the itemsets generated from the frequent itemsets of iteration $k-1$, for $k > 1$. For instance, when scanning $D$ to generate $L_1$ from the candidate 1-itemsets in $C_1$, we can at the same time generate all 2-itemsets for each transaction, hash (map) them into the different buckets of a hash table structure, and increase the corresponding bucket counts. A 2-itemset whose corresponding bucket count is below the support threshold cannot be frequent, and thus we can remove it from the candidate set $C_2$. In this way, we reduce the number of candidate 2-itemsets that must be examined to obtain $L_2$.
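The following Python sketch illustrates the bucket-count pruning idea on the transactions of Table 3.1 (items abbreviated to single letters). The hash table size is an illustrative assumption, and Python's built-in hash function stands in for the hash function of a real implementation.

    from itertools import combinations
    from collections import Counter

    # Transactions as in Table 3.1 (items abbreviated to single letters).
    transactions = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
    min_support_count = 2
    num_buckets = 7                      # illustrative hash table size

    bucket_counts = Counter()
    pair_bucket = {}
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            b = hash(pair) % num_buckets # hash each 2-itemset into a bucket
            pair_bucket[pair] = b
            bucket_counts[b] += 1

    # A 2-itemset whose bucket count is below the support threshold cannot be
    # frequent, so it is removed from the candidate set C2 without a second scan.
    C2 = [p for p, b in pair_bucket.items() if bucket_counts[b] >= min_support_count]
    print("pruned candidate 2-itemsets:", C2)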
Ant colonies can accomplish complex tasks that far exceed the individual capabilities of a single ant [Dorigo & Stützle, 2004]. The ACO model was first applied to solve the Travelling Salesman Problem (TSP). The two main phases of the algorithm are solution construction and pheromone update. For the TSP, m ants concurrently build tours, selecting their starting cities randomly at the beginning
of the tour construction. At each construction step, ant k decides which city to visit
next according to a random proportional rule. The probability with which ant k ,
currently at city i , chooses to go to city j is:
$$p_{ij}^{k} = \frac{[\tau_{ij}]^{\alpha}\,[\eta_{ij}]^{\beta}}{\sum_{l \in N_i^k} [\tau_{il}]^{\alpha}\,[\eta_{il}]^{\beta}}, \qquad \text{if } j \in N_i^k \tag{3.21}$$

where $\tau_{ij}$ is the pheromone deposited on arc $(i, j)$; $\eta_{ij} = 1/d_{ij}$ represents the visibility of city $j$ from city $i$, which is inversely proportional to the distance $d_{ij}$; $\alpha$ and $\beta$ are two parameters which determine the relative influence of the pheromone trail and the heuristic information; and $N_i^k$ is the set of cities that ant $k$ has not visited yet [Dorigo & Stützle, 2004].
The pheromone trails are updated after the tours have been constructed, by evaporation at a constant rate and accumulation of new deposits:

$$\tau_{ij} \leftarrow (1 - \rho)\,\tau_{ij} + \sum_{k=1}^{m} \Delta \tau_{ij}^{k}, \qquad \forall (i, j) \in L \tag{3.22}$$

where $0 < \rho \leq 1$ is the pheromone evaporation rate and $\Delta \tau_{ij}^{k}$ is the amount of pheromone that ant $k$ deposits on the arcs it has visited, defined as follows:

$$\Delta \tau_{ij}^{k} = \begin{cases} 1 / C^{k}, & \text{if arc } (i, j) \text{ belongs to } T^{k} \\ 0, & \text{otherwise} \end{cases}$$

where $C^{k}$ is the length of the tour $T^{k}$ built by ant $k$. By using this rule, the probability increases that forthcoming ants will use this arc. The implementation of ACO can be summarized by the following pseudo-code and Fig. 3.8.
Begin
  Initialization
  While stopping criterion not satisfied do
    Deploy each ant in a starting city
    For each ant
      Repeat
        Calculate the probability of each remaining city being selected as the next city
        Choose the next city according to the probability using the roulette wheel selection algorithm
      Until all cities are visited
      Update pheromone
    End for
    Update the best route (best solution)
  End while
  Record and output the best route (solution)
End
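A compact Python sketch of the procedure above is given below, applying the random proportional rule of Eq. (3.21) and the pheromone update of Eq. (3.22) to a small random TSP instance. The instance, the number of ants and the parameter values for α, β and ρ are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(3)

    # Hypothetical symmetric TSP instance: random city coordinates.
    n_cities, n_ants, n_iters = 12, 12, 200
    alpha, beta, rho = 1.0, 3.0, 0.5
    xy = rng.random((n_cities, 2))
    d = np.linalg.norm(xy[:, None] - xy[None, :], axis=-1) + np.eye(n_cities)
    eta = 1.0 / d                         # visibility eta_ij = 1 / d_ij
    tau = np.ones((n_cities, n_cities))   # pheromone trails tau_ij

    best_len, best_tour = np.inf, None
    for _ in range(n_iters):
        tours, lengths = [], []
        for _ in range(n_ants):
            tour = [int(rng.integers(n_cities))]          # random starting city
            unvisited = set(range(n_cities)) - {tour[0]}
            while unvisited:
                i = tour[-1]
                cand = np.array(sorted(unvisited))
                w = (tau[i, cand] ** alpha) * (eta[i, cand] ** beta)
                p = w / w.sum()                           # random proportional rule, Eq. (3.21)
                j = int(rng.choice(cand, p=p))            # roulette wheel selection
                tour.append(j)
                unvisited.remove(j)
            L = sum(d[tour[k], tour[(k + 1) % n_cities]] for k in range(n_cities))
            tours.append(tour)
            lengths.append(L)
            if L < best_len:
                best_len, best_tour = L, tour
        # Pheromone update, Eq. (3.22): evaporation plus deposits 1/C^k on visited arcs.
        tau *= (1 - rho)
        for tour, L in zip(tours, lengths):
            for k in range(n_cities):
                i, j = tour[k], tour[(k + 1) % n_cities]
                tau[i, j] += 1.0 / L
                tau[j, i] += 1.0 / L

    print(f"best tour length: {best_len:.3f}")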
[Figure: geometric illustration of the PSO update — the new velocity $\vec{v}_i(t+1)$ combines the current velocity $\vec{v}_i(t)$ with the vectors $(\vec{p}_i(t) - \vec{x}_i(t))$ and $(\vec{p}_g - \vec{x}_i(t))$, moving the particle from $\vec{x}_i(t)$ to $\vec{x}_i(t+1)$.]
The current position $\vec{x}_i$ can be considered as a set of coordinates describing a point in space. If the current position is better than any that has been found so far, the coordinates are stored in the vector $\vec{p}_i$. The value of the best function result so far is stored in a variable that can be called $\vec{p}_g$. The objective, of course, is to keep finding better positions and updating $\vec{p}_i$ and $\vec{p}_g$. New points are chosen by adding the $\vec{v}_i$ coordinates to $\vec{x}_i$, and the algorithm operates by adjusting $\vec{v}_i$, which can effectively be seen as a step size. The steps for implementing PSO are as follows:
Step 1: Initialize a population array of particles with random positions and
velocities on D dimensions in the search space.
Step 2: Loop
Step 3: For each particle, evaluate the desired optimization fitness function in
D variables.
Step 4: Compare the particle's fitness evaluation with that of its $\vec{p}_i$. If the current value is better than that of $\vec{p}_i$, then set $\vec{p}_i$ equal to the current coordinates.
Step 5: Identify the particle in the neighborhood with the best success so far, and assign it to the variable $\vec{p}_g$.
Step 6: Change the velocity and position of the particle according to the following equations:

$$\vec{v}_i(t+1) = \omega\, \vec{v}_i(t) + c_1 r_1 \big(\vec{p}_i - \vec{x}_i(t)\big) + c_2 r_2 \big(\vec{p}_g - \vec{x}_i(t)\big) \tag{3.23}$$

$$\vec{x}_i(t+1) = \vec{x}_i(t) + \vec{v}_i(t+1) \tag{3.24}$$

where $\omega$ is the inertia weight; $c_1$ and $c_2$ are acceleration coefficients (positive constants); $r_1$ and $r_2$ are random numbers drawn from a uniform distribution on [0, 1]; and $t$ denotes the $t$-th iteration.
Step 7: If a criterion is met (usually a sufficiently good fitness or a maximum number of iterations), exit the loop.
The flowchart of PSO is shown in Fig. 3.10. In PSO, every particle remembers its own previous best value as well as the neighborhood best; therefore it has a more effective memory capability than the GA. PSO is also more efficient in maintaining the diversity of the swarm, since all the particles use information related to the most successful particle in order to improve themselves, whereas in GA the worse solutions at every generation are discarded and only the good ones are kept for the next generation, so that the GA population evolves around a subset of the best individuals. In addition, PSO is easier to implement and there are fewer parameters to adjust compared with GA [Valle et al., 2008].
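The following minimal NumPy sketch implements the basic PSO loop of Eqs. (3.23) and (3.24) with a global-best neighbourhood, minimizing a simple sphere function. The swarm size, the inertia weight and the acceleration coefficients are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(4)

    def sphere(x):                       # illustrative fitness function to minimise
        return np.sum(x ** 2, axis=-1)

    n_particles, dim, n_iters = 30, 5, 200
    w, c1, c2 = 0.7, 1.49, 1.49          # inertia weight and acceleration coefficients

    x = rng.uniform(-5, 5, (n_particles, dim))        # positions
    v = rng.uniform(-1, 1, (n_particles, dim))        # velocities
    pbest, pbest_val = x.copy(), sphere(x)            # personal bests p_i
    g = np.argmin(pbest_val)
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]  # global best p_g

    for _ in range(n_iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Eqs. (3.23) and (3.24): velocity and position updates.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        val = sphere(x)
        improved = val < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], val[improved]
        g = np.argmin(pbest_val)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]

    print(f"best value found: {gbest_val:.6f}")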
The inertia weight $\omega$ is usually selected or adapted according to the problems under consideration [Liu & Abraham, 2005; Shi & Eberhart, 2001].
The parameters $c_1$ and $c_2$ in Eq. (3.23) are not critical for the convergence of PSO. However, proper fine-tuning may result in faster convergence and alleviation of local minima. As default values, $c_1 = c_2 = 2$ are usually used, but some experimental results indicate that $c_1 = c_2 = 1.49$ might provide even better results. From Eq. (3.23), the algorithm favours local exploitation when $c_1 > c_2$, while it favours global exploration when $c_1 < c_2$. Recent work reports that it might be even better to choose a larger cognitive parameter $c_1$ than the social parameter $c_2$, but with $c_1 + c_2 \leq 4$ [Clerc & Kennedy, 2002]. Therefore, the parameter $c_1$ can be changed from $c_{1,min}$ to $c_{1,max}$ and the parameter $c_2$ from $c_{2,max}$ to $c_{2,min}$ over the iterations, so that the algorithm promotes global exploration in the beginning and obtains more refined solutions (local exploitation) in the end.
In some applications it is advisable to adapt the PSO parameters for better performance (adaptive PSO). In other cases, the nature of
the problem to be solved requires the PSO to work under complex environments as
in the case of the multi-objective or constrained optimization problems or tracking
dynamic systems. There are also some discrete variants of PSO and other
variations to the original formulation that can be included to improve its
performance. This section will present some of them.
A. Binary PSO
Kennedy and Eberhart proposed a discrete binary version of PSO for binary problems [Kennedy & Eberhart, 1997]. In their model a particle decides on "yes" or "no", "true" or "false", "include" or "do not include", etc.; these binary values can also be a representation of a real value in a binary search space.
In the binary PSO, the particle's personal best and the global best are updated as in the continuous version. The major difference between the binary PSO and the continuous version is that the velocities of the particles are instead defined in terms of probabilities that a bit will change to one. With this definition a velocity must be restricted to the range [0, 1], so a mapping is introduced to map all real-valued velocities to the range [0, 1] [Kennedy & Eberhart, 1997]. The normalization function used here is the sigmoid function:
$$\mathrm{Sig}(v_{ij}(t)) = \frac{1}{1 + e^{-v_{ij}(t)}} \tag{3.25}$$

where $v_{ij}(t)$ denotes the $j$-th component of the vector $\vec{v}_i(t)$. Eq. (3.23) is still used to update the velocity vector of the particle, and the new position of the particle is obtained using the following equation:

$$x_{ij}(t+1) = \begin{cases} 1, & r_{ij} < \mathrm{Sig}(v_{ij}(t+1)) \\ 0, & \text{otherwise} \end{cases} \tag{3.26}$$

where $r_{ij}$ is a uniform random number in [0, 1].
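A minimal sketch of the binary PSO update of Eqs. (3.25) and (3.26) is given below; the objective (maximizing the number of ones in the bit string) and all parameter values are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(5)

    def sigmoid(v):
        # Eq. (3.25): map real-valued velocities to probabilities in [0, 1].
        return 1.0 / (1.0 + np.exp(-v))

    def fitness(bits):                   # illustrative objective: number of ones
        return bits.sum(axis=-1)

    n_particles, n_bits, n_iters = 20, 10, 100
    w, c1, c2 = 0.7, 1.49, 1.49

    x = rng.integers(0, 2, (n_particles, n_bits)).astype(float)
    v = rng.uniform(-1, 1, (n_particles, n_bits))
    pbest, pbest_val = x.copy(), fitness(x)
    g = np.argmax(pbest_val)
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]

    for _ in range(n_iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # Eq. (3.23)
        # Eq. (3.26): a bit becomes 1 with probability sigmoid(v_ij).
        x = (rng.random(x.shape) < sigmoid(v)).astype(float)
        val = fitness(x)
        improved = val > pbest_val
        pbest[improved], pbest_val[improved] = x[improved], val[improved]
        g = np.argmax(pbest_val)
        if pbest_val[g] > gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]

    print("best bit string:", gbest.astype(int), "fitness:", int(gbest_val))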
There are also some other variants of PSO such as Fuzzy PSO [Shi & Eberhart,
2001; D. Tian & Li, 2009], Adaptive PSO [Valle et al., 2008], Gaussian PSO
[Krohling, 2004, 2005; Secrest & Lamont, 2002], Dissipative PSO (DPSO) [Biskas
et al., 2006; Xie et al., 2002], PSO With Passive Congregation (PSOPC) [He et al.,
2004], Stretching PSO (SPSO) [Kannan et al., 2004; Parsopoulos & Vrahatis,
2002], Cooperative PSO (CPSO) [Baskar & Suganthan, 2004; Bergh &
Engelbrecht, 2004; El-Abd & Kamel, 2006], and Comprehensive Learning PSO
(CLPSO) [Liang et al., 2006]. Each of the PSO variants mentioned above improves performance in one or more respects, and a suitable one can be chosen whenever PSO or one of its variants is needed to find an optimal solution.
The distance between the hive and the recruitment target is indicated by the duration of the waggle runs: the farther the target, the longer the waggle phase, with a rate of increase of about 75 milliseconds per 100 meters.
After unloading the collected food, a foraging bee returning to the beehive from a food source (an employed bee) decides whether or not to abandon the food source. If the food source is abandoned, the bee either observes the dances of other employed bees and follows one of the sources advertised by them as an onlooker (follower) bee, or starts to search for an entirely new source as a scout bee. However, if the food source is not abandoned, the employed bee decides whether to dance for the source to recruit other bees, or to keep going to the same food source without advertising it. Fig. 3.12 shows the decision model of the bees' behaviour.
[Fig. 3.12: Decision model of bees' behaviour — employed bees (EB), onlooker bees (OB) and scout bees (SB) move between the hive (unloading nectar, dance area) and the food sources FS(B) and FS(New).]
3) Repeat (Cycle = 1)
4) Produce new solutions $\vec{v}_i$ (food source positions) in the neighbourhood of $\vec{x}_i$ for the employed bees using Eq. (3.27) and evaluate these solutions using the fitness function:

$$v_{ij} = x_{ij} + \phi_{ij}\,(x_{ij} - x_{kj}) \tag{3.27}$$

where
$\phi_{ij}$: a random number in [-1, 1];
$i \in \{1, 2, \ldots, SN\}$: the index of the $i$-th food source;
$j$: the $j$-th component (dimension) of the parameter vector;
$k \in \{1, 2, \ldots, SN\}$: a randomly chosen food source index, which must be different from $i$.
5) Apply the greedy selection process for the employed bees between $\vec{v}_i$ and $\vec{x}_i$.
6) Calculate the probability value $p_i$ of the solutions $\vec{x}_i$ by means of their fitness values using Eq. (3.28):

$$p_i = \frac{fit_i}{\sum_{n=1}^{SN} fit_n} \tag{3.28}$$

7) Produce the new solutions $\vec{v}_i$ (new positions) for the onlookers from the solutions $\vec{x}_i$, selected depending on $p_i$, using Eq. (3.27), and evaluate them.
8) Apply the greedy selection process for the onlooker bees between $\vec{v}_i$ and $\vec{x}_i$.
9) Determine the abandoned solution (source), if one exists, and replace it with a new randomly produced solution $\vec{x}_i$ for the scout using the following equation:

$$x_{ij} = x_{ij}^{min} + \mathrm{rand}[0, 1]\,(x_{ij}^{max} - x_{ij}^{min}) \tag{3.29}$$

10) Memorize the best food source position (solution) achieved so far.
11) Cycle = Cycle + 1
12) Exit if Cycle = maxCycle or another stopping criterion is met.
In the BCA process, steps 4) and 5) constitute the employed bee phase, steps 6) to 8) constitute the onlooker bee phase, and step 9) is the scout bee phase. Dynamic CBM scheduling is an NP-hard problem, and the BCA is a good method for finding the optimal solution to this kind of problem.
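The following NumPy sketch illustrates the employed, onlooker and scout bee phases described above, using Eqs. (3.27)–(3.29) to minimize a simple sphere function. The colony size, the abandonment limit and the search bounds are illustrative assumptions; a CBM scheduling application would replace the cost function with the maintenance scheduling objective.

    import numpy as np

    rng = np.random.default_rng(6)

    def cost(x):                          # illustrative objective to minimise
        return np.sum(x ** 2, axis=-1)

    SN, dim, max_cycle, limit = 10, 4, 200, 20
    lo_b, hi_b = -5.0, 5.0
    X = rng.uniform(lo_b, hi_b, (SN, dim))          # food source positions x_i
    trials = np.zeros(SN)
    fit = 1.0 / (1.0 + cost(X))                     # fitness for a minimisation problem

    def neighbour(i):
        # Eq. (3.27): v_ij = x_ij + phi_ij (x_ij - x_kj), k != i, one random dimension j.
        j = rng.integers(dim)
        k = rng.choice([s for s in range(SN) if s != i])
        v = X[i].copy()
        v[j] += rng.uniform(-1, 1) * (X[i, j] - X[k, j])
        return np.clip(v, lo_b, hi_b)

    def greedy(i, v):
        fv = 1.0 / (1.0 + cost(v))
        if fv > fit[i]:
            X[i], fit[i], trials[i] = v, fv, 0
        else:
            trials[i] += 1

    for _ in range(max_cycle):
        for i in range(SN):                          # employed bee phase
            greedy(i, neighbour(i))
        p = fit / fit.sum()                          # Eq. (3.28)
        for _ in range(SN):                          # onlooker bee phase
            i = rng.choice(SN, p=p)
            greedy(i, neighbour(i))
        worst = np.argmax(trials)                    # scout bee phase
        if trials[worst] > limit:
            X[worst] = lo_b + rng.random(dim) * (hi_b - lo_b)   # Eq. (3.29)
            fit[worst] = 1.0 / (1.0 + cost(X[worst]))
            trials[worst] = 0

    best = np.argmax(fit)
    print("best solution:", X[best].round(3), "cost:", float(cost(X[best])))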
3.6 Summary
This Chapter introduced the basic concepts of Data Mining techniques and algorithms. ANN, which includes both supervised and unsupervised learning, is mainly applied when an accurate physical or mathematical model is unavailable but a large amount of historical data is available. When there is a large amount of historical
data but only a small part of it is labeled, semi-supervised learning can be a very good method for building the model. Association rules are mainly used to find relations between features. Swarm Intelligence techniques, such as Particle Swarm Optimization and the Bee Colony Algorithm, are mainly used to solve optimization problems and to find optimal solutions to NP-hard problems.
There are many more Data Mining methods and algorithms; this Chapter introduced only those that will be applied in the IFDPS.
Chapter 4: Sensor Classification and Sensor Placement Optimization
4.1 Introduction
A sensor is a converter that measures a physical quantity and converts it into a signal which can be read by an observer or by instruments. It is a device that detects changes in the ambient conditions or in the state of another device or system, and conveys or records this information in a certain manner. Sensors and sensing strategies constitute the foundational basis for fault diagnosis and prognosis systems. Most sensors are well developed and commercially available, and customers only need to choose suitable sensors to collect data that can be used to monitor the condition of components or machines. When choosing sensors for diagnostics and prognostics, many parameters and features of the sensors must be considered: the type, number, and location of the sensors; their size, weight, cost, dynamic range, and other characteristic properties; whether they are of the wired or wireless variety; etc. The raw data collected from the sensors are rarely useful as such, because they may contain much noise or no explicit features. These data must be processed appropriately so that useful information can be extracted: a reduced version of the original data that preserves, as far as possible, the characteristic features or fault indicators of the fault events we are seeking to detect, isolate, and predict the time evolution of. Thus such data must be preprocessed, that is, filtered, compressed, correlated, etc., in order to remove artifacts and reduce noise levels and the volume of data to be processed subsequently. Furthermore, the sensors providing the data must be validated; that is, it must be confirmed that the sensors themselves are not subject to fault conditions. Once the preprocessing module confirms that the sensor data are "clean" and formatted appropriately, features or signatures of normal or faulty conditions must be extracted. This is most important in the framework of IFDPS, because it is the input to the diagnostics and prognostics processes [Wachtsevanos et al., 2006].
Sensor suites are specific to the application domain, and they are intended to
monitor such typical state awareness variables as temperature, pressure, speed,
vibrations, etc. Some sensors are inserted specifically to measure quantities that are
directly related to fault modes identified as candidates for diagnosis. Among them
are strain gauges, ultrasonic sensors, proximity devices, acoustic emission sensors,
electrochemical fatigue sensors, interferometers, etc., whereas others are of the
multipurpose variety, such as temperature, speed, flow rate, etc., and are designed
to monitor process variables for control and/or performance assessment in addition
to diagnosis. More recently we have witnessed the introduction of wireless devices
in the area of condition monitoring.
The device commonly referred to as a 'sensor' actually has two components: a sensor and a transducer. A sensor is defined as a device that is sensitive to light, temperature, electrical impedance, or radiation level and transmits a signal to a measuring or control device. A transducer, on the other hand, is defined as a device that receives energy from one system and retransmits it, often in a different form, to another system. A measuring device passes through two stages while measuring a signal.
Sensor strategies mainly concern two issues: which kind of sensors is suitable for measuring the signals, and where the sensors should be placed. This Chapter addresses both issues.
There are many kinds of sensors on the market. White [White, 1987] presented a sensor classification scheme for categorizing sensors, which is recalled in the following tables. Table 4.1 lists most measurands for which sensors may be needed under the headings acoustic, biological, chemical, electric, magnetic, mechanical, optical, radiation (particle), thermal, etc. For a particular measurand, one is primarily interested in sensor characteristics such as sensitivity, selectivity, and speed of response, which are shown in Table 4.2 as the technological aspects of sensors. Table 4.3 shows the detection means used in sensors. Table 4.4 indicates the primary phenomena used to convert the measurand into a form suitable for producing the sensor output. The application fields are listed in Table 4.5. Most sensors contain a variety of materials (for example, almost all contain some metal); the entries in Table 4.6 should be understood to refer to the materials chiefly responsible for sensor operation.
Recent years have seen an increased requirement for a greater understanding of the
causes of vibration and the dynamic response of failing structures and machines to
vibratory forces. An accurate, reliable, and robust vibration transducer therefore is
required to monitor online such critical components and structures. Piezoelectric
accelerometers offer a wide dynamic range and rank among the optimal choices for
vibration-monitoring apparatus. They exhibit such desirable properties as
[Wachtsevanos et al., 2006]:
• Usability over very wide frequency ranges;
• Excellent linearity over a very wide dynamic range;
• Electronically integrated acceleration signals to provide velocity and displacement data;
• Vibration measurement in a wide range of environmental conditions while still maintaining excellent accuracy;
• Self-generating power supply;
• No moving parts and hence extreme durability;
• Extremely compact size plus a high sensitivity-to-mass ratio.
Piezoelectric accelerometers are used to measure all types of vibrations regardless
of their nature or source in the time or frequency domain as long as the
accelerometer has the correct frequency and dynamic ranges.
A strain-gauge sensor is based on a simple principle from basic electronics: the resistance of a conductor is directly proportional to its length and resistivity and inversely proportional to its cross-sectional area. Applied stress or strain causes the metal transduction element to vary in length and cross-sectional area, thus causing a change in resistance that can be measured as an electrical signal. Certain substances, such as semiconductors, exhibit the piezoresistive effect, in which the application of strain greatly affects their resistivity; strain gauges of this type have a sensitivity approximately two orders of magnitude greater than the former type. The transducer is usually used within a Wheatstone bridge arrangement, with one, two, or all four of the bridge arms being individual strain gauges, so that the output voltage change is proportional to the measured strain.
System performance and operational data are monitored routinely in all industrial establishments, utility operations, transportation systems, etc., for process control, performance evaluation, quality assurance, and fault diagnosis purposes. A large number of sensor systems have been developed and employed over the years. The list includes devices intended to measure such critical properties as temperature; pressure; fluid, thermodynamic, and optical properties; and biochemical elements, among many others. Sensors based on classic measuring elements (inductive, capacitive, ultrasound) have found extensive applications. Temperature variations in many mechanical, electrical, and electronic systems are excellent indicators of impending failure conditions. Temperatures in excess of control limits should be monitored and used in conjunction with other condition indicators.
Of interest also are sensor systems that can be produced inexpensively, singly or in
an array, while maintaining a high level of operational reliability.
Microelectromechanical systems (MEMS) and sensors based on fiber-optic
technologies are finding popularity because of their size, cost, and ability to
integrate multiple transducers in a single device. Micro-machined MEMS devices
in silicon or other materials are fabricated in a batch process with the potential for
integration with electronics, thus facilitating on-board signal processing and other
‘‘smart’’ functions. A number of MEMS transducer and sensor systems have been
manufactured in the laboratory or are available commercially, monitoring such
critical parameters as temperature, pressure, acceleration, etc. [Wachtsevanos et al.,
2006].
Fiber-optic technologies have penetrated the telecommunications and other high-technology sectors in recent years. They find utility in the sensor field because of their
compact and flexible geometry, potential for fabrication into arrays of devices,
batch fabrication, etc. Fiber optic sensors have been designed to measure strain,
temperature, displacement, chemical concentration, and acceleration, among other
material and environmental properties. Their main advantages include small size,
light weight, immunity to electromagnetic and radio frequency interference
(EMI/RFI), high- and low-temperature endurance, fast response, high sensitivity,
and low cost. Fiber optic technologies are based on extrinsic Fabry-Perot
interferometry (EFPI), chemical change in the fiber cladding, optical signal
changes owing to fiber stress and deformation, etc.
There are also other kinds of sensors available on the market, and most of them are well able to meet monitoring requirements; one only needs to choose suitable ones to collect data from the machines for monitoring.
• Noise levels,
• The presence or absence of certain kinds of objects,
• Mechanical stress levels on attached objects, and
• The current characteristics, such as speed, direction, and size, of an object.
A sensor network consists of multiple detection stations called sensor nodes, each
of which is small, lightweight and portable. Every sensor node is equipped with a
transducer, microcomputer, transceiver and power source. The transducer generates
electrical signals based on sensed physical effects and phenomena. The
microcomputer processes and stores the sensor output. The transceiver, which can
be hard-wired or wireless, receives commands from a central computer and
transmits data to that computer. The power for each sensor node is derived from
the electric utility or from a battery.
Sensor networks can be deployed in the following two ways [Intanagonwiwat et
al., 2000]:
• Sensors can be positioned far from the actual phenomenon, i.e., something known by sense perception. In this approach, large sensors that use some complex techniques to distinguish the targets from environmental noise are required.
• Several sensors that perform only sensing can be deployed. The positions of the sensors and the communications topology are carefully engineered (Fig. 1.5). They transmit time series of the sensed phenomenon to the central nodes, where computations are performed and data are fused.
The features described above enable a wide range of applications for sensor networks. Some of the application areas are health, military, and security. For example, the physiological data of a patient can be monitored remotely by a doctor; this is more convenient for the patient, and it also allows the doctor to better understand the patient's current condition. Sensor networks can also be used to detect foreign chemical agents in the air and in water, and can help to identify the type, concentration, and location of pollutants. In essence, sensor networks provide the end user with intelligence and a better understanding of the environment [Akyildi et al., 2002]. Sensor networks can also be very helpful in condition monitoring of manufacturing machines, wind turbines, transporters and infrastructure, because such assets may be distributed over different places. Potential applications of sensor networks include:
• Condition monitoring for factory or infrastructure;
• Industrial automation;
• Automated and smart homes;
• Video surveillance;
• Traffic monitoring;
• Medical device monitoring;
• Monitoring of weather conditions;
• Air traffic control;
• Military applications;
• Robot control.
While many sensors connect to controllers and processing stations directly (e.g., using local area networks), an increasing number of sensors communicate the collected data wirelessly to a centralized processing station; together they compose a Wireless Sensor Network (WSN). This is important since many network applications require hundreds or thousands of sensor nodes, often deployed in remote and inaccessible areas. Therefore, a wireless sensor has not only a sensing component, but also on-board processing, communication, and storage capabilities.
With these enhancements, a sensor node is often not only responsible for data
collection, but also for in-network analysis, correlation, and fusion of its own
sensor data and data from other sensor nodes. When many sensors cooperatively
monitor large physical environments, they form a WSN. Sensor nodes
communicate not only with each other but also with a base station (BS which could
be a gateway) using their wireless radios, allowing them to disseminate their sensor
data to remote processing, visualization, analysis, and storage systems. For
example, Fig. 4.4 shows two sensor fields monitoring two different geographic
regions and connecting to the Internet using their base stations [Dargie and
Poellabauer, 2010].
The capabilities of sensor nodes in a WSN can vary widely, that is, simple sensor
nodes may monitor a single physical phenomenon, while more complex devices
may combine many different sensing techniques (e.g., acoustic, optical, magnetic).
They can also differ in their communication capabilities, for example, using
ultrasound, infrared, or radio frequency technologies with varying data rates and
latencies. While simple sensors may only collect and communicate information
about the observed environment, more powerful devices (i.e., devices with large
processing, energy, and storage capacities) may also perform extensive processing
and aggregation functions. Such devices often assume additional responsibilities in
a WSN, for example, they may form communication backbones that can be used by
other resource-constrained sensor devices to reach the base station. Finally, some
devices may have access to additional supporting technologies, for example,
Global Positioning System (GPS) receivers, allowing them to accurately determine
their position. However, such systems often consume too much energy to be
feasible for low-cost and low-power sensor nodes [Dargie and Poellabauer, 2010].
The well-known IEEE 802.11 family of standards was introduced in 1997 and is
the most common wireless networking technology for mobile systems. It uses
different frequency bands, for example, the 2.4-GHz band is used by IEEE 802.11b
and IEEE 802.11g, while the IEEE 802.11a protocol uses the 5-GHz frequency
band. IEEE 802.11 was frequently used in early wireless sensor networks and can
still be found in current networks when bandwidth demands are high (e.g., for
multimedia sensors). However, the high-energy overheads of IEEE 802.11-based
networks make this standard unsuitable for low-power sensor networks. Typical
data rate requirements in sensor networks are comparable to the bandwidths provided by dial-up modems; therefore the data rates provided by IEEE 802.11 are
typically much higher than needed. This has led to the development of a variety of
protocols that better satisfy the networks’ need for low power consumption and low
data rates. For example, the IEEE 802.15.4 protocol [Callaway et al., 2002] has
been designed specifically for short- range communications in low-power sensor
networks and is supported by most academic and commercial sensor nodes.
Typical network topologies are shown in Fig. 1.5; the most widely used are the star and the mesh. When the transmission ranges of the radios of all
sensor nodes are large enough and the sensors can transmit their data directly to the
base station, they can form a star topology as shown on the left in Fig. 4.5. In this
topology, each sensor node communicates directly with the base station using a
single hop. However, sensor networks often cover large geographic areas and radio
transmission power should be kept at a minimum in order to conserve energy;
consequently, multi-hop communication is the more common case for sensor networks (shown on the right in Fig. 4.5). In this mesh topology, sensor nodes must
not only capture and disseminate their own data, but also serve as relays for other
sensor nodes, that is, they must collaborate to propagate sensor data towards the
base station. This routing problem, that is, the task of finding a multi-hop path from
a sensor node to the base station, is one of the most important challenges and has
received immense attention from the research community. When a node serves as a
relay for multiple routes, it often has the opportunity to analyze and pre-process
sensor data in the network, which can lead to the elimination of redundant
information or aggregation of data that may be smaller than the original data. The
more detailed information about Wireless Sensor Networks can be found at the
reference of [Dargie & Poellabauer, 2010].
Typical RFID systems fundamentally consist of four elements: the RFID tags, the
RFID readers, the antennas and choice of radio characteristics, and the computer
network (if any) that is used to connect the readers (Fig. 4.6). Tags are attached to
objects, and each of them has a certain amount of internal memory (E2PROM) in which it stores information about the object, such as its unique ID number, or in some cases more details, including manufacturing data and product composition.
When these tags pass through a field generated by a reader, they transmit this
information back to the reader, thereby identifying the object. Until recently, the
focus of RFID technology was mainly on tags and readers which were being used
in systems where relatively low volumes of data are involved. This is now
changing as RFID in the supply chain is expected to generate huge volumes of data,
which will have to be filtered and routed to the backend IT systems. To solve this
problem companies have developed special software packages (Middleware),
which act as buffers between the RFID front end and the IT backend [Wang &
Zhang, 2012].
There are two main communication principles between RFID readers/antennas and
RFID Tags: inductive coupling and backscatter reflection which are used in near
field and far field respectively (Fig. 4.7). The principle of inductive coupling
means transferring energy from one circuit to another through mutual inductance.
Near field employs inductive coupling of the tag to the magnetic field circulating
around the reader antenna (like a transformer). In RFID systems using inductive
coupling, the reader antenna and the RFID tag antenna each have a coil which
together form a magnetic coupling, so that the tag draws energy from the field and changes the electrical load on the tag antenna. The change is picked up by the reader and read as a unique serial number. The far field uses techniques similar to radar (backscatter reflection) by coupling with the electric field: RFID tags using backscatter technology reflect radio waves at the same carrier frequency back to the tag reader, using modulation to transmit the data.
The communication process between the reader and tag is managed and controlled by one of several protocols, such as ISO 15693 and ISO 18000-3 for HF, or ISO 18000-6 and EPC for UHF. When the reader is switched on, it starts emitting a signal in the selected frequency band (typically 860-915 MHz for UHF or 13.56 MHz for HF). Any corresponding tag in the vicinity of the reader detects the signal and uses its energy to wake up and supply operating power to its internal circuits. Once the tag has decoded the signal as valid, it replies to the reader and indicates its presence by modulating (affecting) the reader field.
This communication principle can also be used to build parts of a wireless sensor network.
As a result, RFID sensor-enabled tags have emerged; they can be used in fields such as project tracking, environmental monitoring, automotive electronic systems, telemedicine and manufacturing process control. They will play important roles in more and more areas as the technology matures. Roughly, the primary sensors in use today can be classified by function into categories such as temperature, pressure, acceleration, inclination, humidity, light, gas and chemical sensors [Ruhanne et al., 2008].
Fig. 4.8 shows the system architecture for a generic sensor tag and its interaction with RFID systems as it passes through various stages of manufacturing, assembly and the supply chain. RFID tags can be combined with sensor devices (many different kinds of sensors) and transfer the sensed data via radio waves to the RFID reader and further on to the database. A typical supply chain has a number of RFID portals, and at each of them the passive RFID tag is interrogated. The data obtained can be used to improve the processes and scheduling of the supply chain and production [Wang & Zhang, 2012]. For manufacturing systems and processes, many sensors mounted on the machines can be combined with RFID tags. The collected data can be transmitted to the RFID reader and database, and, after suitable processing, used to monitor the condition of the machine and improve its performance.
The basic problem of condition monitoring is to deduce the existence of a defect in a structure from measurements taken at sensors distributed on the structure. The correctness of defect diagnosis depends on the pattern recognition method used for the fault and on the effectiveness of the signals from the sensors mounted on the machine. When carrying out on-site condition monitoring of a machine, an inappropriate distribution of sensors may result in weak excitation of certain modes and affect the accuracy of fault identification. The aim of optimizing the placement of sensors is to obtain as much structural information about the machine as possible with as few sensors as possible, which benefits the company economically. Because of the constraints of machine structure and environment, and for reasons of economy, only a small number of sensors are installed when a condition monitoring system is established. It is therefore very important to determine the optimal mounting positions of the sensors in order to ensure the accuracy and correctness of monitoring and fault judgement.
The goal is thus to capture as much information as possible about the machine structure using a small number of sensors and to ensure the accuracy and correctness of condition monitoring.
Modal analysis (finite element analysis) is a very important method for fault diagnosis and condition monitoring. Faults of a machine, such as cracks, axis loosening and fatigue, are usually accompanied by changes in physical parameters such as the natural frequencies, modal damping, vibration modes and frequency response functions, and the faults can be diagnosed from these changes. The machine's vibration is modeled as an n degree-of-freedom linear time-invariant system whose differential equation can be written as [Wei and Pan 2010]:
$$M\ddot{x}(t) + C\dot{x}(t) + Kx(t) = f(t) \qquad (4.1)$$

where $M$, $C$ and $K$ are the $n \times n$ system mass, damping and stiffness matrices respectively; $x(t)$, $\dot{x}(t)$ and $\ddot{x}(t)$ are the $n$-order response vectors of system displacement, velocity and acceleration respectively; and $f(t)$ is the $n$-order excitation force vector. Applying the Fourier transform and setting $x(t) = Xe^{j\omega t}$, the frequency displacement response is obtained as

$$x(\omega) = H(\omega)F(\omega) \qquad (4.2)$$

where $H(\omega)$ is the frequency displacement response function matrix. If the excitation is applied at point $i$ of the machine, the frequency response function at point $j$ can be written as

$$H_{ij}(\omega) = \sum_{r=1}^{n} \frac{\phi_{jr}\,\phi_{ir}}{-\omega^{2}M_{r} + j\omega C_{r} + K_{r}} \qquad (4.3)$$
where $\phi_{jr}$ and $\phi_{ir}$ denote the $j$th and $i$th components of the $r$th vibration mode, $M_r$, $C_r$ and $K_r$ are the modal mass, damping and stiffness of the $r$th mode, and the subscript $o$ indicates that the corresponding vectors refer to non-measurement points. Comparing Eq. (4.3) with Eq. (4.4), the task reduces to finding the minimum value of Eq. (4.4) over the possible sensor distributions; Eq. (4.4) is therefore chosen as the fitness function for finding the optimal sensor placement.
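To make Eq. (4.3) concrete, the following minimal sketch evaluates the frequency response function at a response point j for excitation at point i by modal superposition. The mode shapes, modal masses, dampings and stiffnesses used in the example are hypothetical placeholders for illustration, not values from the blower model discussed below.

```python
import numpy as np

def modal_frf(phi, M_r, C_r, K_r, i, j, omega):
    """Frequency response H_ij(omega) by modal superposition, as in Eq. (4.3).

    phi   : (n_points, n_modes) mode shape matrix, phi[p, r] = r-th mode at point p
    M_r, C_r, K_r : (n_modes,) modal mass, damping and stiffness
    i, j  : excitation and response point indices (0-based)
    omega : array of angular frequencies [rad/s]
    """
    omega = np.asarray(omega, dtype=float)
    H = np.zeros_like(omega, dtype=complex)
    for r in range(phi.shape[1]):
        H += phi[j, r] * phi[i, r] / (-omega**2 * M_r[r] + 1j * omega * C_r[r] + K_r[r])
    return H

# Hypothetical two-mode example (illustrative values only)
phi = np.array([[1.0, 0.8], [0.6, -0.9]])
M_r = np.array([1.0, 1.0])
K_r = np.array([(2 * np.pi * 50) ** 2, (2 * np.pi * 120) ** 2])  # assumed natural frequencies
C_r = 2 * 0.02 * np.sqrt(K_r * M_r)                              # assumed 2% modal damping
freqs = np.linspace(1, 200, 400)
H = modal_frf(phi, M_r, C_r, K_r, i=0, j=1, omega=2 * np.pi * freqs)
```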
to 0.3. The 3D solid model of the blower is built in the three-dimensional CAD software SolidWorks and then imported into ANSYS 13.0 for finite element analysis and modal analysis. In the real installation the blower is bolted to the floor, so the boundary condition of the blower baseboard is set as a fixed constraint. The first 10 natural frequencies (Table 4.7) and the corresponding 10 vibration mode shapes of the blower are calculated. The finite element model and the first four vibration modes are shown in Fig. 4.12; Fig. 4.12(a) to Fig. 4.12(d) show the first to fourth mode shapes respectively, where the arrows indicate the movement directions of each mode. The total displacement results are shown in Table 4.8, and those in the three different directions (X, Y and Z) are shown in Table 4.9-Table 4.11.
Fig. 4.12 The Finite Element Model of Blower and Its First Four Modes
Measuring Point | 1st order | 2nd order | 3rd order | 4th order | 5th order | 6th order | 7th order | 8th order | 9th order | 10th order
1 | 0.17836 | 9.060e-2 | 7.433e-2 | 6.391e-2 | 4.649e-2 | 0.2028 | 0.32562 | 0.10588 | 4.435e-2 | 3.509e-2
2 | 0.16318 | 9.666e-2 | 6.106e-2 | 6.268e-2 | 4.406e-2 | 0.18447 | 0.30796 | 9.893e-2 | 4.048e-2 | 4.081e-2
3 | 0.17333 | 0.13224 | 8.327e-2 | 7.811e-2 | 4.548e-2 | 0.259 | 0.31935 | 0.11036 | 4.282e-2 | 2.617e-2
4 | 0.17537 | 0.21849 | 0.12898 | 0.10694 | 4.355e-2 | 0.38771 | 0.30514 | 0.1153 | 4.059e-2 | 3.384e-2
5 | 5.862e-3 | 1.840e-3 | 6.917e-4 | 1.681e-3 | 2.018e-3 | 1.222e-2 | 2.505e-2 | 9.816e-3 | 6.142e-4 | 4.418e-3
6 | 2.433e-2 | 2.246e-2 | 9.974e-3 | 1.343e-2 | 6.225e-3 | 5.518e-2 | 3.595e-2 | 3.119e-2 | 3.028e-3 | 3.148e-2
7 | 5.9661e-10 | 2.6120e-10 | 2.7633e-10 | 7.6294e-11 | 2.009e-11 | 1.951e-10 | 4.300e-10 | 5.0898e-11 | 2.5647e-11 | 8.9125e-11
8 | 0.2517 | 9.544e-3 | 8.621e-2 | 3.165e-2 | 3.646e-2 | 0.10458 | 0.14308 | 0.19168 | 9.056e-2 | 0.12752
9 | 0.31278 | 4.151e-2 | 9.815e-2 | 5.240e-2 | 4.757e-2 | 0.11371 | 0.1602 | 0.18728 | 4.534e-2 | 0.12261
10 | 0.3195 | 6.082e-2 | 0.10333 | 6.415e-2 | 5.369e-2 | 0.20737 | 0.15862 | 0.2448 | 4.327e-2 | 0.16902
Measuring Point | 1st order | 2nd order | 3rd order | 4th order | 5th order | 6th order | 7th order | 8th order | 9th order | 10th order
1 | -5.07e-3 | 9.585e-2 | -4.30e-3 | 6.389e-2 | 3.341e-3 | 0.19631 | -5.48e-2 | 0.10362 | -7.03e-4 | 2.618e-2
2 | -4.91e-3 | 8.861e-2 | -4.54e-3 | 6.260e-2 | 3.204e-3 | 0.18686 | -5.21e-2 | 9.645e-2 | -6.69e-4 | 2.406e-2
3 | -6.96e-3 | 0.1349 | -5.64e-3 | 7.755e-2 | 4.552e-3 | 0.25141 | -6.62e-2 | 0.10779 | -1.02e-3 | 1.254e-2
4 | -1.23e-2 | 0.21812 | -8.48e-3 | 0.10665 | 7.873e-3 | 0.38268 | -9.18e-2 | 0.10778 | -1.79e-3 | -2.65e-2
5 | 4.178e-5 | 9.337e-4 | -1.40e-4 | 1.545e-3 | 6.528e-5 | 3.761e-3 | -1.82e-3 | 2.943e-3 | -3.67e-5 | 1.914e-3
6 | -7.02e-3 | 6.974e-3 | 7.819e-4 | 7.756e-3 | 1.026e-3 | 3.040e-2 | 7.178e-3 | -1.79e-4 | 2.035e-3 | -3.20e-3
7 | 6.717e-12 | -1.3584e-10 | 1.3694e-11 | -5.702e-11 | -9.6312e-12 | -3.1304e-10 | 6.3539e-11 | 1.2883e-11 | 1.4565e-12 | 3.8828e-11
8 | -9.822e-4 | 8.667e-3 | 2.214e-4 | 3.020e-2 | -4.575e-3 | 3.122e-2 | -3.36e-2 | 0.19303 | 3.795e-4 | 0.12611
9 | 2.46e-3 | 3.002e-2 | -2.68e-3 | 4.980e-2 | -2.146e-3 | 9.763e-2 | -4.31e-2 | 0.18305 | 1.835e-5 | 0.11062
10 | -2.6e-3 | 3.381e-2 | -3.37e-3 | 5.157e-2 | -2.602e-3 | 0.10477 | -4.23e-2 | 0.18105 | -4.48e-4 | 0.10807
All parameters are presented in the figures and tables above. Following the process of PSO application to sensor placement optimization in Fig. 4.10 and the fitness function Eq. (4.4), the optimal sensor placement for the blower can be obtained using PSO. The number of particles is initialized to 10, and n (1 ~ 10) sensors are assumed to be placed on the blower measuring points. The inertia weight ω decreases linearly from 1.2 to 0.8, and the acceleration coefficients c1 and c2 are both set to 1.2. The vibration mode parameters in Table 4.8-Table 4.11 are fed to the PSO in turn and used to evaluate the fitness value.
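A minimal sketch of how such a discrete PSO can be organized is given below. Because Eq. (4.4) is not reproduced here, `fitness_of_subset` is a hypothetical placeholder that would have to be replaced by the actual modal-information criterion evaluated on the mode-shape rows of the selected measuring points; particles carry real-valued scores and are decoded by selecting the n highest-scoring points.

```python
import numpy as np

def pso_sensor_placement(fitness_of_subset, n_points=10, n_sensors=5,
                         n_particles=10, iters=200, c1=1.2, c2=1.2,
                         w_start=1.2, w_end=0.8, seed=0):
    """Simple PSO over real-valued scores; each particle selects the
    n_sensors measuring points with the highest scores."""
    rng = np.random.default_rng(seed)
    pos = rng.random((n_particles, n_points))
    vel = np.zeros_like(pos)

    def decode(p):                      # scores -> index set of selected points
        return tuple(sorted(np.argsort(p)[-n_sensors:]))

    pbest_pos = pos.copy()
    pbest_val = np.array([fitness_of_subset(decode(p)) for p in pos])
    g = np.argmin(pbest_val)
    gbest_pos, gbest_val = pbest_pos[g].copy(), pbest_val[g]

    for t in range(iters):
        w = w_start + (w_end - w_start) * t / max(iters - 1, 1)  # linearly decreasing weight
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest_pos - pos) + c2 * r2 * (gbest_pos - pos)
        pos = pos + vel
        vals = np.array([fitness_of_subset(decode(p)) for p in pos])
        improved = vals < pbest_val
        pbest_pos[improved], pbest_val[improved] = pos[improved], vals[improved]
        if pbest_val.min() < gbest_val:
            g = np.argmin(pbest_val)
            gbest_pos, gbest_val = pbest_pos[g].copy(), pbest_val[g]
    return decode(gbest_pos), gbest_val

# Hypothetical fitness placeholder: replace with Eq. (4.4) evaluated on the
# selected measuring points' mode-shape values (Table 4.8-4.11).
def fitness_of_subset(subset):
    return -len(subset)  # dummy value for illustration only

best_points, best_fitness = pso_sensor_placement(fitness_of_subset)
```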
Table 4.12 shows the smallest fitness value and the corresponding sensor placement for different numbers of measuring points, using the total displacement mode of each measuring point (Table 4.8). The table shows that the amount of information captured about the blower increases as measuring points are added, because the fitness becomes smaller and smaller. The smallest fitness is still large (8.824) when only one sensor is placed on the blower, while it becomes very small, eventually reaching 0, as the number of sensors increases to 8, 9 and 10. The relative importance of the measuring points is also evident from the table: point 4 is the most important and point 7 the least important. The information contributed by each point can likewise be read from the table; taking measuring point 6 as an example, its contribution is the fitness value for five sensors minus that for six sensors (2.422 - 1.213 = 1.209).
Table 4.13, Table 4.14 and Table 4.15 present the smallest fitness values and the corresponding sensor placements for different numbers of measuring points, using the displacement modes of each measuring point in the X, Y and Z
directions respectively. These tables lead to the same conclusions as Table 4.12; moreover, when the same number of sensors is to be installed on the blower, the optimal locations may differ depending on which displacement mode is used. When optimal sensor placement is applied to a real machine, it is therefore important to know which deformation direction is most relevant to machine failure.
Fig. 4.13 to Fig. 4.16 show how the fitness value changes with the PSO iterations (n = 5) for the total, X-direction, Y-direction and Z-direction displacement modes respectively. In all cases the optimal sensor placement is obtained within 20 PSO iterations. Taken together, these figures and tables show that PSO can successfully solve the optimal sensor placement problem.
Because PSO has important advantages in solving optimization and NP-hard problems, it is employed here to solve the sensor placement optimization problem in support of product design and fault diagnosis. A fitness function for the PSO application is established based on the analysis of placement guidelines for vibration sensors. In general, the proposed method combines structural finite element modelling and modal analysis with PSO to compute the optimal sensor distribution. Because it needs a finite element model and modal analysis of the structure, the combined PSO/FEM approach can be applied at the machine and component levels but not at the system level. Future research will therefore address methods for optimal sensor distribution at the system level.
Table 4.12 Optimal Sensor Placement for Different Numbers of Measuring Points using the Total Displacement Mode

Number of measuring points | Sensor placement positions | Fitness
1 | 4 | 8.824
2 | 4, 10 | 6.793
3 | 3, 4, 10 | 5.183
4 | 3, 4, 9, 10 | 3.786
5 | 1, 3, 4, 9, 10 | 2.422
6 | 1, 2, 3, 4, 9, 10 | 1.213
7 | 1, 2, 3, 4, 8, 9, 10 | 0.059
8 | 1, 2, 3, 4, 6, 8, 9, 10 | 0.004
9 | 1, 2, 3, 4, 5, 6, 8, 9, 10 | 4.1E-18
10 | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | 0
Table 4.13 Optimal Sensor Placement for Different Numbers of Measuring Points using the X Direction Displacement Mode

Number of measuring points | Sensor placement positions | Fitness
1 | 4 | 1.7706
2 | 3, 4 | 1.3236
3 | 1, 3, 4 | 1.0166
4 | 1, 3, 4, 10 | 0.7351
5 | 1, 2, 3, 4, 10 | 0.4605
6 | 1, 2, 3, 4, 9, 10 | 0.1886
7 | 1, 2, 3, 4, 8, 9, 10 | 0.0046
8 | 1, 2, 3, 4, 6, 8, 9, 10 | 0.0002
9 | 1, 2, 3, 4, 5, 6, 8, 9, 10 | 4.26E-19
10 | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | 0
Table 4.14 Optimal Sensor Placement for Different Numbers of Measuring Points using the Y Direction Displacement Mode

Number of measuring points | Sensor placement positions | Fitness
1 | 4 | 0.3971
2 | 3, 4 | 0.2888
3 | 3, 4, 10 | 0.1860
4 | 3, 4, 9, 10 | 0.1257
5 | 2, 3, 4, 9, 10 | 0.0729
6 | 1, 2, 3, 4, 9, 10 | 0.0299
7 | 1, 2, 3, 4, 8, 9, 10 | 0.0017
8 | 1, 2, 3, 4, 6, 8, 9, 10 | 2.240e-5
9 | 1, 2, 3, 4, 5, 6, 8, 9, 10 | 4.17E-20
10 | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | 0
Table 4.15 Optimal Sensor Placement for Different Numbers of Measuring Points using the Z Direction Displacement Mode

Number of measuring points | Sensor placement positions | Fitness
1 | 10 | 2.7715
2 | 9, 10 | 2.0660
3 | 1, 9, 10 | 1.4790
4 | 1, 3, 9, 10 | 0.9906
5 | 1, 2, 3, 9, 10 | 0.5705
6 | 1, 2, 3, 8, 9, 10 | 0.2538
7 | 1, 2, 3, 4, 8, 9, 10 | 0.0478
8 | 1, 2, 3, 4, 6, 8, 9, 10 | 0.0032
9 | 1, 2, 3, 4, 5, 6, 8, 9, 10 | 2.287E-18
10 | 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | 0
Fig. 4.13 Fitness Changes with Iteration of PSO (n = 5) for the Total Displacement Mode
Fig. 4.14 Fitness Changes with Iteration of PSO (n = 5) for the X Direction Displacement Mode
Fig. 4.15 Fitness Changes with Iteration of PSO (n = 5) for the Y Direction Displacement Mode
Fig. 4.16 Fitness Changes with Iteration of PSO (n = 5) for the Z Direction Displacement Mode
Fig. 4.18 Fitness Changes with Iteration of BCA (n = 5) for the Total Displacement Mode
Fig. 4.19 Fitness Changes with Iteration of BCA (n = 5) for the X Direction Displacement Mode
Fig. 4.20 Fitness Changes with Iteration of BCA (n = 5) for the Y Direction Displacement Mode
Fig. 4.21 Fitness Changes with Iteration of BCA (n = 5) for the Z Direction Displacement Mode
4.7 Summary
This chapter introduced a sensor classification scheme and listed some criteria for categorizing sensors. Most sensors are mature products on the market: when a machine needs to be monitored, the properties of the signals and the required sensor parameters can first be determined, and suitable sensors can then be sourced commercially. More importantly, this chapter defined the sensor placement optimization problem, which is an NP-hard problem, and introduced two swarm intelligence algorithms, PSO and BCA, to solve it. Swarm intelligence algorithms are well suited to NP-hard problems and are therefore suitable for sensor placement optimization. Finally, a case study was described which shows that both BCA and PSO can find the optimal sensor placement accurately and quickly.
When a machine needs to be monitored, one always wants to use as few sensors as possible to obtain as much information about the machine as possible. Finding the optimal sensor placement can thus serve as a basis for the condition monitoring of manufacturing machines, reducing the number of sensors used and therefore the cost.
Chapter 5: Signal Preprocessing and Feature Extraction
5.1 Introduction
Table 5.1 Signal preprocessing, feature extraction and feature selection methods

Signal preprocessing: filtering, amplification, signal conditioning, extraction of weak signals, de-noising, vibration signal compression, etc.
Feature extraction - time domain: mean, RMS, shape factor, skewness, kurtosis, crest factor, entropy error, entropy estimation, etc.
Feature extraction - frequency domain: Continuous Fourier Transform (CFT), Discrete Fourier Transform (DFT), Fast Fourier Transform (FFT), etc.
Feature extraction - time-frequency domain: Short Time Fourier Transform (STFT), Wavelet Transform (WT), Wavelet Packet (WP), etc.
Feature selection: Principal Component Analysis, Support Vector Machine, Boosting Tree Algorithm, etc.
There are many methods for signal preprocessing, as shown in Table 5.1. As mentioned above, the main aims of signal preprocessing are to improve the signal-to-noise ratio, enhance the signal characteristics, and facilitate the efficient extraction of useful information from the signals. The electrical signals generated by sensors are often not directly suitable for information extraction because they may be very noisy, of low amplitude, biased, and dependent on secondary parameters such as temperature and humidity. Moreover, the quantities of interest sometimes cannot be measured directly, and only related quantities can be measured. Signal conditioning is therefore required; it can be performed in hardware and/or software and may include amplification, filtering, converting, range matching, isolation and any other processing required to make the sensor output suitable for further processing [Gutierrez-Osuna et al., 2003].
Denoising techniques aim at eliminating noise from measured data while preserving the important signal features (such as texture and edges) as much as possible [Ramani et al., 2008]. Denoising is a very important step for enhancing data reliability and improving the accuracy of signal analysis methods. Wavelet-based denoising methods have been applied successfully in signal analysis to improve the signal-to-noise ratio [Benouaret et al., 2012; Patil & Chavan, 2012]. Soft-thresholding [Donoho, 1995] and wavelet-shrinkage denoising [Zheng et al., 2000] are two popular denoising methods. Other denoising techniques include: adaptive threshold denoising for fault detection in power systems [Yang & Liao, 2001], acoustic emission signal denoising for fatigue crack detection in rotor heads [Menon et al., 2000], denoising using the modulus maxima algorithm for structural fault detection in fighter aircraft [Hu et al., 2000], signal-decomposition-based denoising (wavelets, wavelet packets and the matching pursuit method) for improving the signal-to-noise ratio of knee-joint vibration signals [Krishnan & Rangayyan, 2000], and reducing the background noise level using the second-order displaced power spectral density (SDPSD) function for localized defects in roller bearings [Piñeyro et al., 2000].
The amount of data collected from industrial systems tends to be voluminous and, in most cases, difficult to manage because of the increasing number of sensors and sample rates. Data compression is therefore very important for condition monitoring systems, especially those implemented online or as Internet-based systems. Transient analysis is mostly used to compress data because, with careful instrument design and sampling procedures, it can significantly improve the performance of sensor arrays by improving selectivity, reducing acquisition time and increasing sensor lifetime. There are three main classes of transient analysis methods: sub-sampling methods [Gutierrez-Osuna et al., 1999; Kermani et al., 1998; Roussel et al., 1998; White et al., 1996], parameter-extraction methods [Eklöv et al., 1997; Gibson et al., 1997; Llobet et al., 1997; D. M. Wilson & DeWeerth, 1995], and system-identification methods [Eklöv et al., 1997; Gutierrez-Osuna et al., 1999; Nakamoto et al., 2000]. The signal preprocessing techniques for condition monitoring are mature; further techniques and details can be found in the literature [Gutierrez-Osuna et al., 2003; Marwala, 2012; Vachtsevanos et al., 2006].
Feature and condition indicator extraction and selection play crucial roles in condition monitoring, especially for the accuracy and reliability of fault diagnosis and prognosis. Condition monitoring mainly depends on a set of features extracted from sensor data that can distinguish between the fault categories of interest and detect and isolate a specific fault at an early stage. These features should be fairly insensitive to noise and to within-class variations, and care must be taken not to lose useful information at the feature extraction stage. For time series signals, such as vibration, voltage and current signals, features can be extracted from four domains: the time domain, the frequency domain, the time-frequency domain and the wavelet domain.
Time-domain features are a traditional approach to feature extraction but are still very widely used in fault diagnosis and prognosis; they mainly consist of statistical parameters computed from the signals. Some of these statistical parameters are the following [Vachtsevanos et al., 2006; Wang & Zhang, 2010]:
Peak value,
$$P_v = \frac{1}{2}\left[\max(x_i) - \min(x_i)\right] \qquad (5.1)$$
where $x_i\ (i = 1, 2, \ldots, N)$ is the amplitude at sampling point $i$ and $N$ is the number of sampling points.

RMS value,
$$RMS = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^{2}} \qquad (5.2)$$

Standard deviation,
$$SD = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(x_i - \bar{x}\right)^{2}} \qquad (5.3)$$

Kurtosis value,
$$K_v = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^{4}}{\left(RMS\right)^{4}} \qquad (5.4)$$

Crest factor,
$$C_{rf} = \frac{\text{Peak Value}}{\text{RMS Value}} \qquad (5.5)$$

Clearance factor,
$$C_{lf} = \frac{\text{Peak Value}}{\left(\frac{1}{N}\sum_{i=1}^{N}\sqrt{|x_i|}\right)^{2}} \qquad (5.6)$$

Impulse factor,
$$I_{mf} = \frac{\text{Peak Value}}{\frac{1}{N}\sum_{i=1}^{N}|x_i|} \qquad (5.7)$$

Shape factor,
$$S_{hf} = \frac{\text{RMS Value}}{\frac{1}{N}\sum_{i=1}^{N}|x_i|} \qquad (5.8)$$
The Weibull negative log-likelihood value has recently been used for feature extraction from vibration signals. The Weibull negative log-likelihood value ($W_{nl}$) and the normal negative log-likelihood value ($N_{nl}$) of the time-domain vibration signal are used as input features, along with the other features defined above, in this study. The negative log-likelihood function is defined as:

$$\Lambda = -\sum_{i=1}^{N} \log\left[f\left(x_i, \theta_1, \theta_2\right)\right]$$

where $f(x_i, \theta_1, \theta_2)$ is the probability density function (pdf). For the Weibull and normal negative log-likelihood functions, the pdfs are computed as follows.

Weibull pdf:
$$f(x_i, \beta, \eta) = \beta\,\eta^{-\beta}\,|x_i|^{\beta-1}\exp\!\left[-\left(\frac{x_i}{\eta}\right)^{\beta}\right] \qquad (5.9)$$
where $\beta$ and $\eta$ are the shape and the scale parameters respectively.

Normal pdf:
$$f(x_i, \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\exp\!\left[-\frac{\left(x_i-\mu\right)^{2}}{2\sigma^{2}}\right] \qquad (5.10)$$
where $\mu$ and $\sigma$ are the mean and the standard deviation respectively.
Three further time-domain parameters, i.e. activity, mobility and complexity, can also be used for feature extraction [Hjorth, 1970; Xinyang Li et al., 2011]:

$$\text{Activity} = \operatorname{var}\left(x(t)\right) \qquad (5.11)$$

$$\text{Mobility} = \sqrt{\frac{\operatorname{Activity}\left(\dot{x}(t)\right)}{\operatorname{Activity}\left(x(t)\right)}} \qquad (5.12)$$

$$\text{Complexity} = \frac{\operatorname{Mobility}\left(\dot{x}(t)\right)}{\operatorname{Mobility}\left(x(t)\right)} \qquad (5.13)$$
The above three parameters are often referred to as the Hjorth parameters and have been widely applied [Cecchin et al., 2010; Obermaier et al., 2001]. Other parameters that can be used for feature extraction include time-domain morphology and gradient [Mazomenos et al., 2012], and correlation, covariance and convolution [Vachtsevanos et al., 2006]. Details of these techniques can be found in the references cited above.
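As a small illustration of the time-domain features in Eqs. (5.1)-(5.8) and the Hjorth parameters in Eqs. (5.11)-(5.13), the following sketch computes them for a sampled signal with NumPy. The function name, the dictionary packaging of the results and the synthetic test signal are illustrative choices, not part of the original text.

```python
import numpy as np

def time_domain_features(x, fs):
    """Time-domain features of Eqs. (5.1)-(5.8) and Hjorth parameters (5.11)-(5.13)
    for a 1-D signal x sampled at fs Hz."""
    x = np.asarray(x, dtype=float)
    peak = 0.5 * (x.max() - x.min())                       # Eq. (5.1)
    rms = np.sqrt(np.mean(x ** 2))                         # Eq. (5.2)
    sd = np.std(x)                                         # Eq. (5.3)
    kurt = np.mean((x - x.mean()) ** 4) / rms ** 4         # Eq. (5.4)
    crest = peak / rms                                     # Eq. (5.5)
    clearance = peak / np.mean(np.sqrt(np.abs(x))) ** 2    # Eq. (5.6)
    impulse = peak / np.mean(np.abs(x))                    # Eq. (5.7)
    shape = rms / np.mean(np.abs(x))                       # Eq. (5.8)

    dx = np.diff(x) * fs                                   # numerical first derivative
    ddx = np.diff(dx) * fs                                 # numerical second derivative
    activity = np.var(x)                                   # Eq. (5.11)
    mobility = np.sqrt(np.var(dx) / np.var(x))             # Eq. (5.12)
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility   # Eq. (5.13)

    return {"peak": peak, "rms": rms, "sd": sd, "kurtosis": kurt,
            "crest": crest, "clearance": clearance, "impulse": impulse,
            "shape": shape, "activity": activity, "mobility": mobility,
            "complexity": complexity}

# Example on a synthetic signal (not thesis data)
fs = 4096
t = np.arange(fs) / fs
feats = time_domain_features(np.sin(2 * np.pi * 46 * t) + 0.05 * np.random.randn(fs), fs)
```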
The FRF matrix is related to the spatial properties by the following expression:

$$\left[\alpha(\omega)\right] = \left[-\omega^{2}[M] + j\omega[C] + [K]\right]^{-1} \qquad (5.17)$$

Here $\alpha$ is the frequency response function, $\omega$ is the frequency, $[M]$ is the mass matrix, $[C]$ is the damping matrix, $[K]$ is the stiffness matrix and $j = \sqrt{-1}$. The above transform applies to continuous signals. For discrete signals, the frequency content can be obtained with the discrete Fourier transform:

$$X(k) = \sum_{n=1}^{N} x(n)\,e^{-j2\pi(k-1)(n-1)/N}, \qquad k = 1, 2, \ldots, N \qquad (5.18)$$

where $N$ is the length of the time series $x(n)$. Fig. 5.1 shows a vibration signal sampled at 4096 Hz over one second, while Fig. 5.2 shows the corresponding frequency spectrum. From Fig. 5.2, the base frequency of this vibration signal is 46 Hz, the second order frequency is 92 Hz and the third order frequency is 138 Hz. Other features can also be extracted from the spectrum to characterize the signal for fault diagnosis and prognosis. For example, the power spectral density (PSD):

$$\Psi_x = \frac{1}{N}X(k)X^{*}(k) = \frac{1}{N}\left|X(k)\right|^{2} \qquad (5.19)$$

is used for fault diagnosis and prognosis because details in the frequency response are easier to see than when using $X(k)$ directly [Vachtsevanos et al., 2006].
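A minimal NumPy sketch of Eqs. (5.18)-(5.19), computing the amplitude spectrum and power spectral density of a sampled signal. The synthetic 46/92/138 Hz test signal only mimics the frequencies mentioned above; it is not the measured blower data.

```python
import numpy as np

fs = 4096                                   # sampling rate [Hz], as in Fig. 5.1
t = np.arange(fs) / fs                      # one second of data
# Synthetic stand-in signal with 46, 92 and 138 Hz components (illustrative only)
x = (0.01 * np.sin(2 * np.pi * 46 * t)
     + 0.005 * np.sin(2 * np.pi * 92 * t)
     + 0.003 * np.sin(2 * np.pi * 138 * t))

N = x.size
X = np.fft.rfft(x)                          # one-sided DFT, Eq. (5.18)
freqs = np.fft.rfftfreq(N, d=1.0 / fs)      # frequency axis [Hz]
amplitude = 2.0 / N * np.abs(X)             # amplitude spectrum
psd = (1.0 / N) * np.abs(X) ** 2            # power spectral density, Eq. (5.19)

print(freqs[np.argmax(amplitude)])          # dominant frequency, ~46 Hz
```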
[Fig. 5.1 Vibration signal in the time domain: amplitude (m/s²) versus time (s), 0-1 s]
[Fig. 5.2 Frequency spectrum of the vibration signal: amplitude (m/s²) versus frequency (Hz), 0-2000 Hz]
Although FFT-based methods are powerful tools for fault diagnosis and prognosis, they are not suitable for non-stationary signals. For analysis in the time-frequency domain, the Wigner-Ville distribution (WVD) and the short time Fourier transform (STFT) are the most popular methods for non-stationary signal analysis. However, the WVD suffers from interference terms appearing in the decomposition, and the STFT cannot provide good time and frequency resolution simultaneously because it uses a constant resolution at all frequencies. Moreover, no orthogonal bases exist for the STFT that could be used to implement a fast and effective STFT algorithm [Okamura, 2011; Vachtsevanos et al., 2006]. The methods for time-frequency analysis are compared in Table 5.2 [Vachtsevanos et al., 2006]. This section mainly introduces the wavelet transform for time-frequency analysis and feature extraction.
The wavelet transform is a time-frequency decomposition of a signal onto a set of "wavelet" basis functions. Wavelet analysis has proven its great capability in decomposing, denoising and analyzing signals; it makes the analysis of non-stationary signals achievable and can detect transient feature components that other methods are unable to capture, since wavelets can concurrently represent time and frequency structure. The Wavelet Transform (WT) gives good time and poor frequency resolution at high frequencies, and good frequency and poor time resolution at low frequencies. Analysis with wavelets involves breaking up a signal into shifted and scaled versions of the original (or mother) wavelet, i.e., one high-frequency term from each level and one low-frequency residual from the last level of decomposition. There are three categories of this transformation: the Continuous Wavelet Transform (CWT), the Discrete Wavelet Transform (DWT) and Wavelet Packet Decomposition (WPD).
$$CT(a,b) = \int_{-\infty}^{\infty} x(t)\,\psi^{*}_{(a,b)}(t)\,dt \qquad (5.20)$$

where $\psi_{(a,b)}(t)$ is a function continuous in both the time domain and the frequency domain, called the mother wavelet, and $*$ denotes the complex conjugate. $\psi_{(a,b)}(t)$ can be expressed as:

$$\psi_{(a,b)}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right), \qquad a,b \in \mathbb{R},\ a \neq 0 \qquad (5.21)$$

The main purpose of the mother wavelet is to provide a source function for generating the daughter wavelets, which are simply translated and scaled versions of the mother wavelet. As seen in Eq. (5.21), the transformed signal $CT(a,b)$ is defined on the $a$-$b$ plane, where $a$ and $b$ adjust the frequency and the time location of the wavelet. A small $a$ produces a high-frequency wavelet when high frequency resolution is needed, and the reverse is also true. The WT's superior time-localization properties stem from the finite support of the analysis wavelet: as $b$ increases, the analysis wavelet traverses the length of the input signal, and $a$ increases or decreases in response to changes in the signal's local time and frequency content. Finite support implies that the effect of each term in the wavelet
representation is purely localized. This sets the WT apart from the Fourier
Transform, where the effects of adding higher frequency sine waves are spread
throughout the frequency axis.
$$DT(j,k) = \int_{-\infty}^{\infty} x(t)\,\psi^{*}_{(j,k)}(t)\,dt \qquad (5.22)$$

$$\psi_{(j,k)}(t) = \frac{1}{\sqrt{2^{j}}}\,\psi\!\left(\frac{t-2^{j}k}{2^{j}}\right) \qquad (5.23)$$

where $D_{j}(t)$ denotes the wavelet detail and $A_{j}(t)$ stands for the wavelet approximation at the $j$th level. DWT analysis is more efficient while achieving comparable accuracy [Goumas et al., 2001].
As discussed above, the DWT decomposes the signal into two parts: a low-frequency approximation A1 and a high-frequency detail D1. In this decomposition, the information removed from the low-frequency part is captured by the high-frequency part. At the next level, the method decomposes A1 in the same way into a low-frequency part A2 and a high-frequency part D2; the information removed from A2 is captured by D2, and deeper levels of decomposition can be carried out in the same manner. The 3-level structure of a signal decomposed by the DWT, in which only the approximation is further decomposed, is shown in Fig. 5.3.
For a signal with maximum frequency 2048 Hz, D1, D2, D3 and A3 represent the frequency bands 1024~2048 Hz, 512~1024 Hz, 256~512 Hz and 0~256 Hz respectively. The signals obtained by DWT decomposition of the vibration signal in Fig. 5.1 are shown in Fig. 5.4.
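A short sketch of such a three-level DWT decomposition using the PyWavelets package; the choice of the `db4` wavelet and the synthetic stand-in signal are assumptions for illustration, since the thesis does not specify them here.

```python
import numpy as np
import pywt

fs = 4096
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 46 * t) + 0.05 * np.random.randn(fs)   # stand-in vibration signal

# Three-level DWT: coeffs = [A3, D3, D2, D1]
coeffs = pywt.wavedec(x, wavelet='db4', level=3)
A3, D3, D2, D1 = coeffs

# With fs = 4096 Hz (Nyquist 2048 Hz): D1 ~ 1024-2048 Hz, D2 ~ 512-1024 Hz,
# D3 ~ 256-512 Hz and A3 ~ 0-256 Hz.
for name, c in zip(['A3', 'D3', 'D2', 'D1'], coeffs):
    print(name, len(c))
```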
[Fig. 5.4 DWT decomposition of the vibration signal in Fig. 5.1: details D1, D2, D3 and approximation A3, amplitude versus time (0-1 s)]
because the same frequency bandwidths can provide good resolution regardless of whether the frequencies are high or low.
The wavelet packet functions $\psi_{j}$ can be obtained from the following recursive relations:

$$\psi_{2j}(t) = \sqrt{2}\sum_{k=-\infty}^{\infty} h(k)\,\psi_{j}(2t-k) \qquad (5.26)$$

$$\psi_{2j+1}(t) = \sqrt{2}\sum_{k=-\infty}^{\infty} g(k)\,\psi_{j}(2t-k) \qquad (5.27)$$
It is observed that the scaling function has a low-pass form, whereas the wavelet
function has a high-pass form. Thus, the wavelet function is essentially responsible
for extracting the detail (high-frequency components) of the original signal.
Fig. 5.6. Wavelet Packet Coefficients and Their Relevant Standard Deviation
The number of features extracted from the signals collected by the sensors may be so large that extracting useful and understandable information from them becomes difficult. The dimensionality of the feature set therefore needs to be reduced. Feature selection is primarily performed to select relevant and informative features, which reduces the dimensionality of the feature set effectively. It can also have other motivations, including [Guyon & Elisseef, 2006]:
1) General data reduction, to limit storage requirements and increase algorithm speed;
2) Feature set reduction, to save resources in the next round of data collection or during utilization;
3) Performance improvement, to gain predictive accuracy;
4) Data understanding, to gain knowledge about the process that generated the data or simply to visualize the data.
Many data mining algorithms can be used to carry out feature selection: neural
network ensemble (NNE) [Hansen & Salamon, 1990], neural network (NN) [Liu,
2001; Siegelmann & Sontag, 1994], boosting regression tree (BRT) [Friedman,
2001, 2002; Smola & Scholkopf, 2003], support vector machine (SVM) [Schölkopf
et al., 1999; Steinwart & Christmann, 2008], random forest with regression (RF)
[Breiman, 2001], standard classification and regression tree (CART) [Speybroeck,
2012], k nearest neighbour neural network (kNN) [Shakhnarovich et al., 2005],
wrapper approach integrated with the genetic or the best-first search algorithm
[Espinosa et al., 2005; Tan et al., 2006] and principal component analysis (PCA)
[Jolliffe, 2002]. All these algorithms are widely used for feature selection. Zhang and Kusiak applied them to parameter selection in wind turbine condition monitoring and compared their performance [Kusiak & Verma, 2011; Kusiak & Zhang, 2010; Zhang & Kusiak, 2012].
PCA is an unsupervised learning approach to dimensionality reduction that uses the correlation coefficients of the parameters to combine and transform them into a reduced-dimensional space [Miranda et al., 2008]. The concept of Principal Component Analysis (PCA) was invented in 1901 by Karl Pearson [Pearson, 1901]. It is a mathematical procedure that uses an orthogonal transform to convert a set of observations of possibly correlated variables into a set of values of uncorrelated variables called principal components. The transform is defined in such a way that the first principal component has the highest possible variance, accounting for as much of the variability in the data as possible, and each succeeding component in turn has the highest possible variance under the constraint that it be uncorrelated with the preceding components. PCA can reduce the data dimension and eliminate multi-collinearity. Currently, PCA is mostly used to reduce dimensionality while retaining the main information when mining data and building models. This section mainly introduces the principle of PCA.
PCA computes a new set of uncorrelated multivariate (vector) samples by a coordinate rotation of the original correlated multivariate samples. A
matrix composed of $n$ rows (the $n$ collected samples) and $m$ columns (the number of features) is expressed as:

$$X = \begin{bmatrix} x_{11} & \cdots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nm} \end{bmatrix} \qquad (5.36)$$

PCA obtains a new set of vectors according to the following steps:
1) Calculate the correlation coefficient matrix
The correlation coefficient matrix is calculated according to:

$$R = \left[r_{ij}\right]_{m\times m}, \qquad r_{ij} = \mathrm{Cor}(i,j) = \frac{(n-1)\,\mathrm{Cov}(i,j)}{\sqrt{\sum_{k=1}^{n}\left(x_i(k)-\mu_i\right)^{2}}\,\sqrt{\sum_{k=1}^{n}\left(x_j(k)-\mu_j\right)^{2}}} \qquad (5.37)$$

$$\mathrm{Cov}(i,j) = \frac{1}{n-1}\sum_{k=1}^{n}\left(x_i(k)-\mu_i\right)\left(x_j(k)-\mu_j\right), \qquad i,j = 1,2,\ldots,m \qquad (5.38)$$

where $\mu_i$ and $\mu_j$ are the averages of the $i$th and $j$th feature columns of the matrix $X$ respectively.
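A minimal sketch of this PCA step with scikit-learn. Standardizing the features first and keeping the components that explain 95% of the variance are illustrative assumptions, not prescriptions from the text; the random stand-in data only plays the role of the feature matrix X in Eq. (5.36).

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# X: (n samples) x (m features) matrix as in Eq. (5.36); random stand-in data here
rng = np.random.default_rng(0)
X = rng.normal(size=(800, 48))              # e.g. 48 wavelet-packet features per sample

X_std = StandardScaler().fit_transform(X)   # zero mean, unit variance per feature
pca = PCA(n_components=0.95)                # keep components covering 95% of the variance
scores = pca.fit_transform(X_std)           # reduced-dimension feature matrix

print(scores.shape, pca.explained_variance_ratio_[:5])
```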
5.5 Summary
Chapter 6: Fault Diagnosis based on Data Mining Techniques
6.1 Introduction
Fault diagnosis has become the subject of numerous investigations over the past two decades. Researchers in many disciplines, such as medicine, engineering, the sciences, business and finance, have been developing methodologies to detect fault (failure) or anomaly conditions, to pinpoint or isolate which component or object in a system or process is faulty, and to decide on the potential impact of a failing or failed component on the health of the system [Vachtsevanos et al., 2006]. Fault diagnostic algorithms must be able to detect system performance degradation and faults (failures) based on physical property changes that manifest themselves through detectable phenomena. Regarding fault diagnosis and condition monitoring, the following concepts need to be defined and distinguished [Vachtsevanos et al., 2006]:
• Fault diagnosis. Detecting, isolating, and identifying an impending or incipient failure condition; the affected component (subsystem, system) is still operational, even though in a degraded mode.
• Failure diagnosis. Detecting, isolating, and identifying a component (subsystem, system) that has ceased to operate.
• Fault (failure) detection. An abnormal operating condition is detected and reported.
• Fault (failure) isolation. Determining which component (subsystem, system) is failing or has failed.
• Fault (failure) identification. Estimating the nature and extent of the fault (failure).
Therefore, the aim of fault diagnosis is to detect the abnormal condition of a machine before failure happens and to identify which component of the machine will fail. To evaluate the fault diagnosis techniques of a condition monitoring system, several qualification factors can be used [Vachtsevanos et al., 2006]:
• Isolability. A measure of the model's ability to distinguish between certain specific failure modes. Enabling technologies include incidence matrices involving both deterministic (zero-threshold) and statistical (high-threshold) isolability.
• Sensitivity. A qualitative measure characterizing the size of failures that can be detected. This factor depends on the size of the respective elements in the system's matrices, the noise properties, and the time to failure. Filtering is typically used to improve sensitivity, but it is rather difficult to construct a straightforward framework.
• Robustness. This factor refers to the model's ability to isolate a failure in the presence of modeling errors. Improvements in robustness rely on algebraic cancellation that desensitizes the residuals to certain modeling errors.
Many techniques can be used for fault diagnosis. The development of model-based fault diagnosis began in the early 1970s [Dirilten, 1972; Hayes, 1971], and this approach to fault detection in dynamic systems has received more and more attention over the last two decades [Schubert et al., 2011; Soman et al., 2012; Van den Kerkhof et al., 2012]. It has much to offer in addressing system-based fault diagnosis for complex systems [De Kleer & Williams, 1987; Isermann, 2005]. It is used to detect any discrepancy between the system outputs and the model outputs, on the assumption that this discrepancy signal is related to a fault. The method works well when the mathematical or physical model is accurate and the system outputs are noise-free. However, the same discrepancy signal can also arise from model-plant mismatch or from noise in the real measurements, which is then erroneously detected as a fault. Moreover, it is sometimes impossible to model nonlinear systems with analytical equations [Mendonqa, 2006]. Model-based fault diagnosis techniques are therefore not well suited to cases such as nonlinear systems for which no mathematical model is available.
Case-based Reasoning (CBR) [Aamodt & Plaza, 1994; Reisbec & Schank, 1989] is another method that can be used for fault diagnosis; it offers a reasoning paradigm similar to the way people routinely solve problems. CBR began to be applied to fault diagnosis in the 1990s [Grant et al., 1996; Patterson & Hughes, 1997] and became very popular afterwards [Fu et al., 2011; Tsai, 2009]. The cyclic process of CBR can be described as follows. When a new problem occurs, one or more similar cases are retrieved from the case base. A solution suggested by the matching cases is then reused and tested for success. Unless the retrieved case is a close match, the solution will probably have to be revised, producing a new case that can be retained. Currently, this cycle rarely occurs without human intervention, and most CBR systems are used mainly as case retrieval and reuse systems [Watson & Marir, 2009]. The CBR designer faces two major challenges: coding the cases to be stored in the case library or case base, and adaptation, that is, reasoning about new cases so as to maximize the chances of success while minimizing the uncertainty about the outcomes or actions. Additional issues relate to the types of information to be coded in a case, the type of database to be used, and the programming language to be adopted [Vachtsevanos et al., 2006].
Model-based fault diagnosis techniques can obviously detect and identify any fault, even unanticipated ones, but they require an accurate mathematical or physical model, which is usually not available for complex machines. Data-driven methods can therefore be a better solution for fault diagnosis when a model is unavailable and CBR does not work well.
In contrast to model-based approaches, data-driven fault diagnostic techniques rely primarily on process data from sensors specifically designed to respond to fault signals, and model a relationship between fault features or fault characteristic indicators and fault classes. Such "models" may be cast as expert systems, artificial neural networks, or a combination of these computational intelligence tools. They require a sufficient database (of both baseline and fault conditions) to train and validate the diagnostic algorithms before their final online implementation. They lack the insight that model-based techniques provide
regarding the physics of failure mechanisms, but they do not require accurate dynamic models of the physical system under study. They respond only to anticipated fault conditions that have been identified and prioritized in advance in terms of their severity and frequency of occurrence, whereas model-based methods may be deployed to detect even unanticipated faults because they rely on a discrepancy or residual between the actual system and model outputs [Vachtsevanos et al., 2006]. In the past few years, many Computational Intelligence (CI) techniques have been applied as tools for fault diagnosis [Sun et al., 2012; Wang, 1996]. This chapter mainly introduces the application of data mining techniques, especially CI techniques, to fault diagnosis, and uses several case studies to show how these techniques work.
Pattern classification theory has become a key factor in fault diagnosis. Some classification methods for equipment performance monitoring use the relationship between the type of fault and a set of patterns extracted from the collected signals, without establishing explicit models. Currently, the ANN is one of the most popular methods in this domain. The principles of ANNs were introduced in Section 3.2, which covered Back-propagation (BP) and Self-Organizing Mapping (SOM) networks. The value of artificial neural network models lies in the fact that they can be used to infer a function from observations. This is particularly useful in applications where the complexity of the data or task makes designing such a function by hand impractical, which is very often the case in diagnostic problems. The BP neural network is the main type of ANN used to solve fault diagnosis and prognosis problems.
An ANN can deal with complex non-linear problems without sophisticated, specialized knowledge of the real system. It is an effective classification technique with low operational response times after training. The relationship between the condition of a component and its features is not linear, and a BP neural network does not need to know the exact analytical form of the function on which the model should be built: neither the functional type nor the number and position of the parameters in the model function need to be known. It can handle multi-input, multi-output, quantitative or qualitative, complex systems, with very good abilities for data fusion, self-adaptation and parallel processing. It is therefore very suitable as a method of fault diagnosis.
Fig. 6.1 shows the procedure of fault diagnosis based on a BP network, which has three main phases. The first is the training phase, which establishes an ANN model for a specific type of fault. The training data can be historical data or data collected from sensors. Raw signals such as vibration and acoustic signals are very hard to use directly for training an ANN model, so features must be extracted from them. Vibration and acoustic signals may also contain electrical or mechanical noise, so they need to be processed to filter out the noise, improve the signal-to-noise ratio and amplify weak signals. The extracted features can then be used to train the ANN and establish the model of the fault. Once the ANN model is established, it can be used to judge whether the machine has a fault and to identify which component will fail; this is the test phase. The data used to test the ANN model must consist of the same kind of features as the training data, so the techniques used for signal processing and feature extraction must be the same as for the training data. The last phase is maintenance decision making based on the test results of the ANN model; this phase is completed in Chapter 8.
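A minimal sketch of the training and test phases of such a diagnosis model, using a feed-forward network from scikit-learn as a stand-in for the BP network described above. The feature matrix, condition labels and network size are illustrative placeholders, not the thesis data or parameters.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Placeholder data: rows are feature vectors extracted from preprocessed signals,
# labels encode the machine condition (e.g. 0 = healthy, 1..3 = fault classes).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 16))
y = rng.integers(0, 4, size=400)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Training phase: one hidden layer trained by gradient descent, playing the role
# of the BP network in Fig. 6.1.
model = make_pipeline(StandardScaler(),
                      MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0))
model.fit(X_train, y_train)

# Test phase: features of the same kind as the training data.
print("test accuracy:", model.score(X_test, y_test))
```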
Unsupervised learning [Jain et al., 1999; Oja, 2002] is another method of data
classification and clustering in addition to the supervised methods (for example BP
network) in the field of data analysis. Supervised methods mostly deal with
training classifiers for known symptoms, while unsupervised learning (clustering)
provides exploratory techniques for finding hidden patterns in the data. With huge
volumes of data being generated from different systems every day, what makes a
system intelligent is its ability to analyze the data for efficient decision-making
based on known or new cluster discovery. Unsupervised data clustering is an
intelligent tool for delving deep into the unknown and unexplored data. It is a tool
that brings out the hidden patterns and association between different variables in a
multivariate dataset. When the knowledge of the data is not well known and
explored, unsupervised learning can be used to analyze it. In the field of fault diagnosis, this means it can be used to understand the data and to cluster the faults hidden in the database.
Self-Organizing Mapping (SOM) is a competitive learning network; it uses an unsupervised, undirected self-learning mode, and its algorithm is simple, with a lateral association function [Brando et al., 2007]. It is one of the most popular unsupervised learning algorithms, able to extract useful information from poorly understood data, and it can therefore be applied to fault diagnosis when little is known about the historical data. The principle of SOM was introduced in Section 3.2.2. Fig. 6.2 shows the procedure for applying SOM to fault diagnosis, which has three main phases: a training phase, a test phase and a decision-making phase. The overall procedure is similar to that of the BP application. In the training phase, sensors collect data from the monitored mechanical equipment. The data can be signals such as vibration and acoustics, or time series such as temperature; the former need to be processed in order to extract useful features, while the latter can be used directly as features. The features are then used to train the SOM and establish a classifier model for the different types of faults. Once the classifier is established, the test phase can be carried out; the features used to test the classifier must be of the same type as the training data. The final phase is maintenance decision making based on the test results of the SOM classifier.
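A minimal sketch of such a SOM training and test step, here using the third-party MiniSom package (an assumption; the thesis does not specify an implementation). Map units are assigned condition labels by majority vote over the training samples mapped to them; the data, map size and training length are illustrative.

```python
import numpy as np
from minisom import MiniSom   # third-party package, assumed available

# Placeholder feature matrix (rows = samples) and known condition labels
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 16))
y = rng.integers(0, 3, size=300)

som = MiniSom(6, 6, X.shape[1], sigma=1.0, learning_rate=0.5, random_seed=0)
som.random_weights_init(X)
som.train_random(X, 2000)          # training phase

# Label each map unit by the majority condition of the samples mapped to it
unit_labels = {}
for xi, yi in zip(X, y):
    unit_labels.setdefault(som.winner(xi), []).append(yi)
unit_labels = {u: max(set(v), key=v.count) for u, v in unit_labels.items()}

def diagnose(sample):
    """Test phase: return the condition label of the winning map unit (if seen)."""
    return unit_labels.get(som.winner(sample), None)

print(diagnose(X[0]))
```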
in the high-dimensional fault patterns. Thus, the well-trained model can be utilized for further condition-based monitoring as well as fault diagnosis and prognostics.
The fault diagnosis process based on semi-supervised learning is shown in Fig. 6.5. The manifold regularization approach, based on semi-supervised manifold learning, can be described as follows:
1) Build up a general condition monitoring system to collect labelled and unlabelled data from both the locally monitored machines and the test rig;
2) Perform feature extraction and feature selection on the labelled and unlabelled examples, according to criteria that determine a feature set representing the geometric structure well;
3) Construct a data adjacency graph with labelled and unlabelled nodes using a graph kernel, which describes an intrinsic manifold; regularize the classification decision boundary with the manifold regularization algorithm; and then classify the online patterns in the feature space with the classified labels (a sketch of this graph-construction step is given after the list);
4) Obtain diagnosis information by classifying the results, then determine the failure causes and feed the corresponding decision or control measures back to the local condition monitoring system.
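A minimal sketch of the adjacency-graph and graph-Laplacian construction used as the manifold regularizer in step 3. The k-nearest-neighbour graph, the choice of k and the random stand-in features are illustrative assumptions, not the thesis' exact algorithm.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import laplacian

# Placeholder features: a few labelled samples and many unlabelled ones
rng = np.random.default_rng(0)
X_labelled = rng.normal(size=(20, 8))
X_unlabelled = rng.normal(size=(180, 8))
X_all = np.vstack([X_labelled, X_unlabelled])

# Step 3: data adjacency graph over labelled + unlabelled points (k-NN, assumed k = 7)
W = kneighbors_graph(X_all, n_neighbors=7, mode='connectivity', include_self=False)
W = 0.5 * (W + W.T)                    # symmetrize the graph
L = laplacian(W, normed=True)          # graph Laplacian describing the intrinsic manifold

# L would then enter the manifold regularization term f^T L f that constrains the
# decision function alongside the usual supervised loss on the labelled samples.
print(L.shape)
```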
Association rule mining is a data mining technique that can discover significant association rules between items in a database [Agrawal et al., 1993]. The basic concepts and process of association rule mining were introduced in Section 3.4. This part proposes an association rule-based fault diagnosis approach, whose structure is shown in Fig. 6.6.
Whether for machines, cars or robots, performance may degrade or fail after long running times, so suitable sensors should be selected to monitor their condition. The data should be pre-processed before feature extraction because the raw sensor data may contain noise. After the features have been extracted, all the data are stored in a database called the "Raw Training Database", which is used to mine the association rules. For each kind of fault, several rules can be mined from the training data. These rules are then selected and combined into the complete set of association rules, which can classify the fault or judge the condition of the monitored equipment. Finally, the features extracted from the pre-processed real-time data are diagnosed against the association rules generated above, and based on this result, the maintenance or control decision can be made correctly and efficiently.
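A minimal sketch of the rule-mining step on such a feature database, using the mlxtend package (an assumption; the thesis does not name a tool). Continuous features would first have to be discretized into items such as "RMS_high"; the item names, the toy records and the support/confidence thresholds below are purely illustrative.

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Hypothetical discretized training database: each row is one monitoring record,
# each column a boolean item derived from a feature or the known condition.
records = pd.DataFrame({
    "RMS_high":      [1, 1, 0, 1, 0, 1, 0, 0],
    "Kurtosis_high": [1, 0, 0, 1, 0, 1, 0, 0],
    "Temp_high":     [0, 1, 0, 0, 0, 1, 1, 0],
    "Fault_bearing": [1, 1, 0, 1, 0, 1, 0, 0],
}).astype(bool)

frequent = apriori(records, min_support=0.3, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.8)

# Keep rules whose consequent is the fault class, i.e. "symptoms => fault"
is_fault_rule = rules["consequents"].apply(lambda c: c == frozenset({"Fault_bearing"}))
print(rules.loc[is_fault_rule, ["antecedents", "consequents", "support", "confidence"]])
```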
collect and store the signals from the sensors. This process is repeated until all the degrading signals simulated by the simulation parts have been collected. Fig. 6.10 shows the signals of the second sensor from the perfect state to complete failure.
[Fig. 6.10 Signals of the second sensor at different degradation conditions (including Condition 0.7 and Condition 1): amplitude versus time, 0-1 s]
In this case, each node represents a frequency bandwidth of 64 Hz, so node (4, 0) represents the signal character of the band between 0 Hz and 64 Hz.
For each signal, the wavelet packet transform was applied up to the fourth level, giving 16 sets of signal coefficients. The wavelet packet coefficients (Eq. (5.30)) and their corresponding standard deviations for one signal are shown in Fig. 6.12. The Standard Deviation of the Wavelet Packet Coefficients (SDWPC) of the processed signals is selected as the feature vector, which is used to train the ANN after PCA analysis.
Fig. 6.12 Wavelet Packet Coefficients (WPC) and Their Relevant Standard Deviations
Three vibration sensors are mounted on the blower, and 16 parameters are extracted from the signal of each sensor, giving 48 features in total for each time signal. Principal Component Analysis (PCA) is therefore employed to reduce the dimensionality of the features, as sketched below.
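A minimal sketch of this feature pipeline with PyWavelets and scikit-learn: a fourth-level wavelet packet decomposition, the standard deviation of each terminal node's coefficients (SDWPC), concatenation over the three sensor channels, and PCA. The `db4` wavelet, the stand-in signals and the number of retained components are illustrative assumptions.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA

def sdwpc(signal, wavelet='db4', level=4):
    """Standard deviation of the wavelet packet coefficients of each
    terminal node at the given level (16 values for level 4)."""
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, maxlevel=level)
    nodes = wp.get_level(level, order='freq')
    return np.array([np.std(node.data) for node in nodes])

# Placeholder signals: 100 samples x 3 sensor channels x 1024 points each
rng = np.random.default_rng(0)
signals = rng.normal(size=(100, 3, 1024))

# 16 SDWPC values per channel -> 48 features per sample
features = np.array([np.concatenate([sdwpc(ch) for ch in sample]) for sample in signals])

# PCA to reduce the 48-dimensional feature vectors (8 components assumed here)
reduced = PCA(n_components=8).fit_transform(features)
print(features.shape, reduced.shape)
```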
[Figure: values of each principal component for each sample, plotted against sample index (0-800)]
Fig. 6.14 Procedure of Fault Diagnosis Integrating BP Network, PCA and WPC
[Fig. 6.15 - Fig. 6.18 Error from nominal value versus the number of training data sets for each degradation condition]
These four figures show the differences between the predicted and nominal values for the four conditions, using either the SDWPC features or the new features generated from the SDWPC by PCA as inputs to the ANN. Fig. 6.15 shows the result for condition 0: the error is much smaller when the PCA-generated features are used than when the SDWPC features are used directly. Fig. 6.16 and Fig. 6.18 show the results for condition 0.3 and condition 1 respectively. When the number of training sets is very small, the results using the PCA-generated features are much better than those using the SDWPC features; however, as the amount of training data increases, the results for both feature sets become almost identical, and both are correct and precise. Fig. 6.17 shows the result for condition 0.7: with both kinds of features the performance is effective and correct regardless of the amount of training data, but the result with the PCA-generated features is still better than with the SDWPC features. Fig. 6.16 and Fig. 6.17 show that when the condition is neither perfect nor complete failure, the result obtained with the SDWPC features is not reliable if the amount of training data is very small, because the error from the nominal value is large; the result obtained with the PCA-generated features remains reliable under these conditions. Overall, the four figures show that the precision is better with the new features generated by PCA from the SDWPC than with the SDWPC features themselves, for every condition and every amount of training data.
This case study is taken from [Zhang et al., 2012]. The experimental setup and procedure are the same as in Section 6.6.1 (Experimental Setup) and Section 6.6.2 (Experimental Procedure), but the data analysis and diagnostic algorithm are different.
For this case, the maximum signal frequency is 512 Hz, and thus D1, D2, D3 and A3 represent the frequency bands 256~512 Hz, 128~256 Hz, 64~128 Hz and 0~64 Hz respectively in Fig. 6.19. In this experiment, only these four parts are analyzed to judge the degradation of performance. The signals decomposed by WPD from the different degrading signals are shown in Fig. 6.20-Fig. 6.23.
[Fig. 6.20 - Fig. 6.23 Decomposed signals (details D1, D2, D3 and approximation A3) for the different degradation conditions: amplitude versus time, 0-1 s]
[FFT spectra FD1, FD2, FD3 and FA3 of the decomposed signals for the different degradation conditions (|Y(fft)| vs. frequency, 0–1000 Hz)]
There are 20 sets of test data, five for each condition. In this table, the nominal condition is denoted NC and the output condition of the test is denoted TC. With the above parameter settings, the results in the table are 100% correct; however, the outputs are not exactly equal to the nominal conditions and there are deviations between them. The precision of the output is discussed in the next section.
6.7.5 Discussion
In this section, three issues are discussed. The first is how many training sets are needed for the BP network to estimate the condition of the machine with sufficient accuracy. The second is the relationship between the accuracy and the number of hidden layer nodes. The last is the convergence time of BP network training.
For the first issue, the number of training sets for each condition is varied from 1 to 200, with the number of hidden layer nodes set to 20 and the number of training epochs set to 5000. For each test set, the output of the ANN is compared with the nominal value; the "error from nominal value" is the average of this deviation over the test data for each condition. These errors are shown in Fig. 6.29. The figure shows that the result is reliable for every condition of the component once the number of training sets exceeds 20. For condition 0 and condition 1, the result is reliable even with fewer than 20 training sets. In other words, if there are only two conditions (0 and 1, i.e. good and faulty), the result is reliable even with very few training sets, but with more conditions the number of training sets must be increased. The number of conditions should therefore be taken into account when deciding how many training sets are used to train the BP network.
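The following sketch illustrates this first experiment under stated assumptions: synthetic features replace the real data, and scikit-learn's MLPRegressor stands in for the BP network with 20 hidden nodes and 5000 epochs.

# Illustrative sketch: train a BP-style network with an increasing number of
# training sets per condition and record the mean deviation of the test
# outputs from the nominal condition values ("error from nominal value").
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
conditions = [0.0, 0.3, 0.7, 1.0]

def make_samples(cond, n):
    # hypothetical features whose mean shifts with the degradation level
    return rng.normal(loc=cond, scale=0.1, size=(n, 8))

X_test = np.vstack([make_samples(c, 5) for c in conditions])
y_test = np.repeat(conditions, 5)

for n_train in (1, 20, 80, 200):
    X_tr = np.vstack([make_samples(c, n_train) for c in conditions])
    y_tr = np.repeat(conditions, n_train)
    net = MLPRegressor(hidden_layer_sizes=(20,), max_iter=5000, random_state=0)
    net.fit(X_tr, y_tr)
    err = np.abs(net.predict(X_test) - y_test)       # error from nominal value
    for c in conditions:
        print(n_train, 'training sets, condition', c,
              'mean error:', float(err[y_test == c].mean()))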
Table 6.3 Test Data and the Results
(Each row lists the extracted feature values followed by the nominal condition NC, the test output condition TC and the error from the nominal value.)
5.346 6.322 4.353 257.69 4.934 3.491 3.361 4175.62 5.982 5.922 3.044 1212.37   0.3   0.28   0.02
3.852 3.516 3.548 303.43 4.874 3.825 3.852 4193.24 3.817 3.952 3.428 1233.34   0.3   0.29   0.011
4.699 3.421 3.31  311.82 4.327 4.911 5.273 7101.38 4.158 3.558 3.183 2098.35   0.7   0.71   0.005
4.087 4.644 3.008 278.09 3.865 3.482 5.644 7211.46 5.392 4.981 3.51  2147.95   0.7   0.64   0.059
3.978 3.719 3.463 286.09 4.321 3.635 5.177 7122.77 4.196 3.682 3.883 2094.52   0.7   0.7    0.002
4.347 2.976 3.434 279.05 5.284 5.405 4.546 7157.75 3.978 4.449 3.333 2126.02   0.7   0.67   0.033
4.44  3.505 3.345 262.11 5.521 3.628 4.63  7080.4  3.991 3.587 4.037 2082.84   0.7   0.67   0.031
3.603 4.235 8.397 910.92 5.633 5.258 9.274 21669.1 3.839 3.875 5.985 6824.31   1     1      0.002
5.451 3.87  6.187 885.07 4.922 6.128 12.67 21416.9 5.687 3.643 8.407 6661.59   1     1      0
5.957 3.575 8.829 918.7  5.764 5.873 8.867 21244.5 4.995 4.161 4.444 6594.82   1     1      0.002
4.818 3.049 8.352 882.03 6.918 5.931 10.12 20684.6 5.035 3.511 8.835 6461      1     1      0.002
3.745 3.083 8.32  885.89 6.902 5.976 10.9  20605   5.183 3.216 8.642 6455.84   1     1      0.002
[Fig. 6.29: errors from nominal value vs. number of training data (1–200) for each condition]
For the second issue, the number of hidden layer nodes is varied from 5 to 135, with the number of training sets fixed at 80 and the maximum number of training epochs set to 5000. For each training run, several test sets for every condition are used to test the trained BP networks. The results are shown in Fig. 6.30. As the number of hidden layer nodes increases, the fluctuations of the output for each condition remain small, so changing the number of hidden layer nodes does not noticeably affect the accuracy of the output. Moreover, there is no mathematical method for determining the best number of hidden layer nodes. The number of hidden layer nodes therefore does not need to be considered in detail.
For the last issue, the number of hidden layer nodes is set to 20 and the number of training epochs to 2000. Fig. 6.31 shows the BP network training time as the number of training sets increases from 10 to 200. The training time does not increase appreciably with the number of training sets, so when a BP network is needed, as many data sets as possible should be used for training. Fig. 6.32 shows how the training time changes as the number of hidden layer nodes increases, with the number of training sets fixed at 200 and the number of training epochs at 2000. The training time increases gradually with the number of hidden layer nodes, so the number of hidden layer nodes should be considered when a BP network is to be trained. From the experience of previous work, the number of hidden layer neurons depends on both the number of input neurons and the number of output neurons, but it cannot be too large [Meng & Meng, 2010].
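A small sketch of these timing experiments is given below, again with synthetic data and MLPRegressor as a stand-in for the BP network; it only illustrates how the training time can be measured as the number of hidden nodes grows.

# Illustrative sketch: measure training time as the number of hidden layer
# nodes increases (200 training sets per condition, 2000 epochs, as in the text).
import time
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
conditions = [0.0, 0.3, 0.7, 1.0]
X = np.vstack([rng.normal(loc=c, scale=0.1, size=(200, 8)) for c in conditions])
y = np.repeat(conditions, 200)

for n_hidden in (5, 20, 60, 135):
    net = MLPRegressor(hidden_layer_sizes=(n_hidden,), max_iter=2000, random_state=0)
    start = time.perf_counter()
    net.fit(X, y)
    print(n_hidden, 'hidden nodes:', round(time.perf_counter() - start, 2), 's')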
Fig. 6.31 BP Network Training Time with the Increasing of Training Data
Fig. 6.32 BP Network Training Time with the Increasing of Hidden Layer Nodes
Fig. 6.35 shows the results of the SOM classification for the centrifugal pump system. The labels "B", "C", "L", "M", "N", "U" and "O" represent the fault types. In this map, a neuron labelled with a fault type means that the inputs with that fault are mapped onto that node. Notably, more than one fault type may be located in the same node, which means the corresponding input data represent those fault types. It is also noticeable that neurons representing the same fault type may not lie in the same area, which means a given fault type may be caused by different parameters. The numbers placed in the neurons give the sequence numbers of the test data sets. From the map, the fault type of the test data can be classified very clearly.
[Fig. 6.35: U-matrix and SOM classification map of the centrifugal pump system, with neurons labelled by fault type (B, C, L, M, N, O, U) and by the sequence numbers 1–20 of the test data sets]
6.9 Summary
This chapter has described how data mining techniques can be used for fault diagnosis of mechanical machines. The techniques covered are BP networks, SOM, semi-supervised learning and association rules. Case studies are used to verify all of these techniques except semi-supervised learning, for which no data were available and a two-moon problem was used instead to show how it works. The case studies show that these data mining techniques are suitable for diagnosing faults of mechanical equipment. Each case study is discussed in the following paragraphs.
Case Study 1 and Case Study 2 described two examples that integrate the BP network with two other techniques: the former integrates the BP network with PCA and WPD, while the latter integrates the BP network with WPD and FFT. To verify the correctness and effectiveness of these two methods, a blower fault diagnosis system was established. Both methods proved highly effective in diagnosing machine faults and can classify the condition of the monitored components.
In the former case study, PCA was applied to reduce the input dimension (the number of variables) of the BP network without discarding useful information. A BP network model can become over-specified, i.e. have more input variables than strictly necessary, when superfluous variables that are uninformative, weakly informative or redundant are included [May et al., 2011]. In that case, the volume of the modelling problem domain grows exponentially as the variable dimensionality grows linearly, which is known as the curse of dimensionality [Bellman, 1961]. This causes several problems, such as an increased computational burden, which strongly affects the training speed, and training difficulties due to the inclusion of redundant and irrelevant input variables. By reducing the dimensionality of the variables, PCA alleviates these problems and improves the effectiveness of BP network training; the method therefore provides a faster, more effective and more precise solution for fault diagnosis and prognosis. The latter case study applies FFT after WPD to extract features, which does not produce too many variables, so PCA is not applied.
In both cases, the minimum bandwidth of 0~64 Hz is chosen in the WPD because the fundamental frequency of the vibration signal is 47.5 Hz. In a real system, the minimum bandwidth of the WPD (i.e. how many levels should be decomposed) should be selected according to the actual fundamental frequency. Only one type of fault (unbalance) was simulated; multi-fault diagnosis should be a topic for future research. The two methods can also be applied to detect many other faults, such as wear, cracks and fatigue of bearings and gearboxes, which are reflected in vibration signals. To apply these methods, the fundamental frequency has to be known first; the sampling rate of the vibration signals, the level of wavelet decomposition and the structure of the BP network can then be determined properly. The degradation information can be very useful for maintenance decision making, and how to apply this degradation information in maintenance decision making should also be a research issue in the future.
Chapter 7: Fault Prognosis based on Artificial Neural Network
7.1 Introduction
Prognosis is the ability to predict accurately and precisely the Remaining Useful Life (RUL) of a failing component or subsystem. The task of the prognostic module is to monitor and track the time evolution (growth) of the fault. In the industrial and manufacturing arenas, prognosis is interpreted as answering the question: "What is the RUL of a machine or a component once an impending failure condition is detected, isolated and identified?" It is a basis of a Condition-Based Maintenance (CBM) system and presents major challenges to the CBM system designer, primarily because it entails large-grain uncertainty. Long-term prediction of the fault evolution to the point where it may result in a failure requires means to represent and manage this inherent uncertainty. Moreover, accurate and precise prognosis demands good probabilistic models of the fault growth and statistically sufficient samples of failure data to assist in training, validating and fine-tuning prognostic algorithms. Fault prognosis has been approached through probabilistic, artificial intelligence and other methodologies. Specific techniques include the fuzzy-adaptive Kalman predictor [Tian et al., 2011], autoregressive models [Xin et al., 2012], fuzzy-filtered neural networks [Li et al., 2013] and case-based reasoning [Berenji, 2006]. However, there are still some challenges in this area [Vachtsevanos et al., 2006]:
• How can we infer the actual crack dimension over time in the absence of techniques for measuring the crack length directly?
• How do we predict accurately and precisely the temporal progression of the fault?
• How do we prescribe the uncertainty bounds or confidence limits associated with the prediction?
• Once we have predicted the time evolution of the fault and prescribed the initial uncertainty bounds, how do we improve such performance metrics as prediction accuracy, confidence and precision?
The techniques of fault prognosis can be classified into three categories: model-based, probability-based and data-driven methodologies. Model-based techniques can predict any fault of a machine or component if an accurate physical or mathematical model is available. The advantages of this approach are apparent: it can predict any type of fault in any component at any stage of fault development. However, determining a complete dynamic model, in terms of differential equations relating the inputs and outputs of the system under consideration, may be impractical in some instances, since machines are becoming more and more complex and integrated. Often, historical data from previous failures of a given class of machinery can be used to establish a probabilistic model [Hu et al., 2011] based on statistical methods. These methods require less detailed information than model-based techniques, because the information needed for prognosis resides in various probability density functions (PDFs) rather than in dynamic differential
equations. The advantages are that the required PDFs can be estimated from observed statistical data and that the PDFs are sufficient to predict the quantities of interest in prognosis. Moreover, these methods generally also give confidence limits on the results, which are important for conveying the accuracy and precision of the predictions. In many instances, one has historical fault/failure data in the form of time plots of various signals leading up to the failure, or statistical data sets. In such cases it is very difficult to determine any sort of model for prediction purposes, and nonlinear network approximators can instead be used to predict failures, providing the desired outputs directly from the data using well-established formal algorithms. This is the so-called data-driven fault prognosis technique. This chapter describes the process of fault prognosis based on neural networks.
As mentioned before, most large companies possess huge amounts of historical data that are currently not used effectively. Such data can be used to predict and identify machine faults before failures occur. Fig. 7.1 shows how historical data can be used for fault prognosis with an ANN; the figure and the following sections take the SCADA data of wind turbines as the research object. Historical data of this kind normally contain performance parameters such as temperatures, vibrations, speeds and lubrication, as well as the alarm/fault/warning lists of all components of the machines. The first step of fault prognosis is to select the right parameters to analyze. For a specific fault or failure, there is normally one or more performance parameters that can serve as an indicator of whether the fault or failure occurs; for instance, the bearing temperature can be an indicator of a bearing defect. It is usually not difficult to choose the right indicator for a specific fault through data analysis or experience. In addition, the performance parameters related to the indicator should be selected through data analysis, experience, or algorithms such as the boosting tree algorithm [Kudo & Matsumoto, 2004] and the wrapper with genetic search [Kohavi & John, 1997]. An ANN model of normal behaviour can then be trained, with the selected performance parameters as inputs and the fault indicator as the output; the trained ANN is the so-called normal behaviour model. The second step is to establish the ANN predictor for fault prognosis. In this step, historical data containing faults are used: with the normal behaviour ANN and the selected performance parameter values, the theoretical values of the indicator are estimated and compared with the real values from the historical data. Based on this comparison, and on how early the customer wants the early warning, close alarm and emergency stop to be issued, the thresholds of these levels can be set; Fig. 7.5 gives an example of these functions. Finally, the ANN model together with these thresholds acts as the fault predictor of the machine.
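The sketch below illustrates the two steps under stated assumptions (it is not the thesis implementation): a normal behaviour model is fitted to synthetic "healthy" SCADA records with assumed column names, and the residual between the measured and estimated bearing temperature is then checked against assumed warning and alarm thresholds.

# Illustrative sketch: fit a normal-behaviour ANN on healthy SCADA records and
# compare the measured rear bearing temperature with the model estimate.
import numpy as np
import pandas as pd
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n = 2000
scada = pd.DataFrame({                              # hypothetical 10-min SCADA records
    'active_power':  rng.uniform(0, 3000, n),       # kW
    'turbine_speed': rng.uniform(5, 16, n),         # rpm
    'nacelle_temp':  rng.uniform(10, 35, n),        # deg C
})
scada['bearing_temp'] = (20 + 0.004 * scada['active_power']
                         + 0.3 * scada['nacelle_temp']
                         + rng.normal(0, 0.5, n))   # synthetic "healthy" target

inputs = ['active_power', 'turbine_speed', 'nacelle_temp']
model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
model.fit(scada[inputs], scada['bearing_temp'])     # normal-behaviour model

residual = scada['bearing_temp'] - model.predict(scada[inputs])
EARLY_WARNING, ALARM = 1.5, 4.0                     # deg C, assumed threshold levels
print('warnings:', int((residual > EARLY_WARNING).sum()),
      'alarms:', int((residual > ALARM).sum()))

In practice the thresholds would be tuned on historical fault cases, as described above, rather than fixed in advance.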
[Fig. 7.1: Fault prognosis by ANN using historical data (parameter selection, normal behaviour model training and real-time performance comparison)]
Renewable energy sources play an important role in the global energy mix as a means of reducing the impact of energy production on climate change. Wind energy is the most developed renewable energy technology worldwide, with more than 282.48 GW of installed capacity at the end of 2012 [GWEC, 2013]. Certain forecasts indicate that the share of wind in Europe's energy production will reach up to 20% in the near future [Krohn et al., 2007]. Today, large wind turbines (2–6 MW) are becoming established as economically viable alternatives to traditional fossil-fuelled power generation. In some countries, such as Denmark, Germany and Spain, wind turbines have become a key part of the national power networks [Pinar Pérez et al., 2013].
Condition monitoring of wind turbines is of increasing importance, as the size and remote locations of modern wind turbines make the technical availability of a turbine crucial. Unexpected faults, especially of large and critical components, can lead to excessive downtime and cost because of restricted turbine accessibility, in particular for remotely controlled wind farms in mountainous areas and offshore. Even smaller issues and faults of auxiliary equipment such as pumps or fans can cause expensive turbine downtime for the same reasons. From an operator's point of view it is therefore worth increasing the effort spent on monitoring the turbine condition in order to reduce unscheduled downtime and thus operational costs. The key task of a wind turbine monitoring system is to detect and predict turbine faults (fault diagnosis and prognosis) as early as possible, so that the maintenance staff can manage and prepare maintenance actions in advance.
Most wind turbines installed nowadays are equipped with a SCADA system that monitors the main components. A SCADA system typically monitors parameters such as the temperatures of bearings, lubricating oil and windings, and the vibration levels of the drive train [Becker & Poste, 2006]. The monitored data are collected and stored by the SCADA system, which archives the information in a convenient manner, usually for all of the turbines in the wind farm. The data quickly accumulate into large and unmanageable volumes that can hinder attempts to deduce the health of a turbine's components. From the perspective of utility companies it would be beneficial if the data could be analyzed and interpreted automatically to support the operators in identifying defects. One main function of SCADA data analysis is therefore to detect and predict faults as early as possible in order to support maintenance and operation decisions.
Model-based methods require a comprehensive physical or mathematical model, which is normally unavailable. The success of data-based methods depends on the significance of the historical data and on the mathematical method used to detect patterns in the data. For wind turbine systems, where a large amount of data is stored regularly by the SCADA system and a process model is not available, data-driven methods are preferred [Nassim, 2011].
This section describes how an Artificial Neural Network (ANN) can be used to predict and identify incipient faults in the main components of a turbine, such as the main bearing, gearbox and blades, through analysis of this SCADA data. The BP network is a type of ANN that can solve non-linear problems without sophisticated and specialized knowledge of the real system; it is well suited to fault detection and prediction, and its principle was described in Section 3.2.1. The SCADA data sets are already collected and stored, so no installation of additional sensors or diagnostic equipment is required. The technique develops a normal behaviour model from the ANN and SCADA data analysis, which calculates the theoretical values of related parameters and compares them with the real measurements of the same parameters. These parameters can be indicators of the abnormal behaviour of an incipient component failure. In this way, only the interesting information is highlighted to the operator, significantly reducing the volume of data they are faced with. This section takes the monitoring of the main shaft rear bearing as an example to show how the technique works.
An operational wind farm typically generates vast quantities of data, commonly known as SCADA data.
• The SCADA data contain information about every aspect of a wind farm, from power output and wind speed to any errors registered within the system. By keeping track of both the wind speed and the power output, the overall health of a turbine can be supervised.
• SCADA data may be used effectively to "tune" a wind farm, providing early warning of possible failures and optimizing power output across many turbines in all conditions.
It is common for condition monitoring to be applied to a wind farm. However, this involves the addition of extra instrumentation, with associated wind farm downtime, extra cost and potential warranty implications. In contrast, performance monitoring, which uses existing instrumentation to analyze the SCADA data of wind turbines, requires no extra instrumentation, no downtime and no additional cost; it has the advantage of using data that are already routinely gathered. With specially designed software tools, a great deal of information can be gathered and analyzed to provide a detailed view of the performance of the wind farm. The parameters typically recorded by SCADA on wind turbines, which can be used in fault detection and diagnosis, can be broadly categorized into the following types [Verma & Kusiak, 2012]:
• Wind parameters, such as wind speed and wind deviations;
• Performance parameters, such as power output, rotor speed and blade pitch angle;
• Vibration parameters, such as tower acceleration and drive train acceleration; and
One parameter of the main shaft rear bearing in the SCADA data, the rear bearing temperature, gives an indication of how hot the bearing is running and therefore offers the possibility of detecting rear bearing overheating. A straightforward threshold check, which is already applied in real wind farms, can be used to flag temperatures exceeding a certain limit, but this may be too late to avoid significant damage to the main shaft rear bearing. The desired functionality should take into account the relevant aspects of turbine operation; this allows temperatures to be flagged that are too high in the context of the concurrent level of power generation, leading to quicker and more effective identification of abnormal behaviour.
components of this type of wind turbine. One of the key components, the main shaft rear bearing, is therefore the main monitoring object in this section.
Accordingly, the parameters that may affect the rear bearing temperature include the active power output, the nacelle temperature, the turbine speed and the cooling fan status. Unfortunately, the cooling fan status is not available in the current SCADA data, and the parameters selected to establish the ANN model for the main shaft rear bearing temperature are therefore those listed in Table 7.1.
continuous number of instances, i.e. over a prolonged period of time and not as a minor fluctuation, then this is flagged as a fault.
Fig. 7.2 Neural Network Turbine Rear Bearing Temperature Model Training Data
[Figure: model input data from 2009-05-31 to 2009-07-26: turbine speed (RPM), nacelle temperature (°C) and active power (kW)]
[Figure: (a) rear bearing temperature (°C) and (b) deviation between estimated and actual temperature (°C), 2009-05-31 to 2009-07-26]
Once the normal behaviour ANN model of the rear bearing has been trained, it can be used to detect and predict the corresponding rear bearing fault by comparing the estimated and actual temperatures. Fig. 7.5(a) shows the evolution of the rear bearing temperature over the eight months from July 2010 to March 2011, at the end of which the bearing eventually fails. Fig. 7.5(b) shows the trend of the difference between the estimated and actual rear bearing temperatures in this period. The first important deviation from the model estimates occurred at the start of October 2010, i.e. at point 1. The frequency of the deviations and their duration increased in the following months. From point 2, the deviation from the model estimates increased to 4 °C and persisted until point 3, where the turbine was stopped because of overheating. The wind farm operator then tried twice to solve the problem, at points 3 and 4, but without success, and the turbine was finally stopped completely because of the same overheating. As the figure shows, the method can give the operator a warning as early as three months before the failure (point 1). As the failure evolves, the deviation from the model estimate increases, and an alarm can be given to the operator when the deviation reaches the level of point 2; the alarm can thus be given as early as 10 days before the failure.
The results produced by the ANN model for rear bearing fault detection and prediction are very positive. They provide an early warning of problems developing in the bearing before the absolute temperature becomes noticeably high. The results of the fault detection can be used to help the operator schedule
maintenance actions before the failure occurs, reducing the maintenance cost and unanticipated downtime and improving the reliability of the wind turbine.
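As a hedged illustration of how such warnings and alarms might be raised only for persistent deviations (and not for minor fluctuations, as required earlier in this chapter), the following sketch flags a residual series when it stays above an assumed threshold for several consecutive records.

# Illustrative sketch: raise warning/alarm flags only when the temperature
# residual persists above a threshold for consecutive samples.
import numpy as np

def flag_deviation(residuals, threshold, min_consecutive):
    """Return True where `residuals` has exceeded `threshold`
    for at least `min_consecutive` consecutive samples."""
    above = residuals > threshold
    flags = np.zeros_like(above, dtype=bool)
    run = 0
    for i, a in enumerate(above):
        run = run + 1 if a else 0
        if run >= min_consecutive:
            flags[i] = True
    return flags

residuals = np.array([0.2, 1.7, 1.8, 1.9, 0.1, 4.2, 4.5, 4.8, 5.0])  # deg C, example values
warning = flag_deviation(residuals, threshold=1.5, min_consecutive=3)
alarm = flag_deviation(residuals, threshold=4.0, min_consecutive=3)
print('warning flags:', warning.tolist())
print('alarm flags:', alarm.tolist())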
[Fig. 7.5: (a) estimated and actual rear bearing temperature (EstimatedTemp, BearTemp) and (b) their difference (°C) with points 1–4 marked, 2010-08-01 to 2011-03-01]
7.3.4 Discussion
This section discusses whether the model established from the SCADA data of one turbine, as in Fig. 7.2, can be applied to fault detection for other turbines. Another turbine is therefore selected for testing. However, the same fault has not occurred elsewhere in this wind farm, so SCADA data from a new turbine in normal condition are used. If the differences between the actual and estimated temperatures remain within 1.5 °C, as in Fig. 7.4, it can be concluded that the rear bearing ANN model can be applied to fault detection and prediction for other turbines.
Fig. 7.6 shows three months of SCADA data from a new turbine in the same wind farm in normal condition. The data in this figure are also highly varied: the turbine speed varies with starts and stops, the active power changes from 500 kW to more than 3000 kW, and the temperatures also vary. These three months of SCADA data are thus assumed to cover a wide range of the parameters in normal condition. Fig. 7.7 shows the results of the ANN model using the SCADA data from the new turbine. The estimated rear bearing temperature is very close to the actual value; the maximum difference between the estimated and actual temperatures is less than 1.5 °C, which is the early warning level shown in Fig. 7.5. This means the
new turbine is in normal condition. The rear bearing ANN model trained with SCADA data from one turbine can therefore be applied to other turbines of the same type.
Fig. 7.6 Rear Bearing Model Testing Input Data of New Turbine
Fig. 7.7 Rear Bearing Model Output in Normal Condition of New Turbine
7.4 Summary
This chapter described an ANN technique for early fault prediction and identification for the main components of wind turbines, especially the bearings, based on the existing data collected by a commercial SCADA system. The results show that the technique can handle large volumes of SCADA data and give wind farm operators very early warnings and close alarms, helping them to make the right maintenance schedules and action decisions in advance. In this way, the information presented to the operator is dramatically reduced without omitting useful information. The maintenance and operation costs can also be reduced by optimizing the maintenance plan, staffing and preparation of tools according to the early warnings and alarms. The example presented in this section established a normal behaviour ANN model for only one component, the main shaft rear bearing; in the future, normal behaviour models should be established for more components. Furthermore, the ANN model established from one turbine was tested on a new turbine only in normal condition; in the future, it should be tested under different conditions, including the normal, warning and alarm levels.
Chapter 8: Maintenance Scheduling Optimization based on Data Mining Techniques
8.1 Introduction
Maintenance costs range from 15% of the total cost of manufactured parts and machines for manufacturing companies to 40% for the iron and steel industry [Mobley, 1990]. The corresponding cost in the United States is more than 200 billion dollars every year [Chu et al., 1998]. This shows the economic significance of maintenance.
Generally, there are three types of maintenance strategies. The first is Corrective Maintenance (CM), which is similar to repair work and is undertaken after a breakdown or when an obvious failure has been located. CM should at best be used only in non-critical areas where capital costs are small, the consequences of failure are slight, no immediate safety risks exist, and quick failure identification and rapid repair are possible. The second is preventive maintenance, which is scheduled without any monitoring activities; the scheduling can be based on the number of hours in use, the number of times an item has been used, the number of kilometres an item has been used, or prescribed dates. Preventive maintenance may lead to far too many or far too few maintenance activities, which can increase maintenance cost or create hazards for personnel and equipment. The last is Predictive Maintenance (PM), a set of activities that detect changes in the physical condition of equipment (signs of failure) in order to carry out the appropriate maintenance work, maximizing the service life of the equipment without increasing the risk of failure. PM is a dynamic schedule based on the state of the machines obtained from continuous and/or periodic inspection. It utilizes the product degradation information extracted and identified by on-line sensing techniques to minimize system downtime by balancing the risk of failure against the achievable profit.
PM has several advantages over other maintenance policies: 1) it improves availability and reliability by reducing downtime; 2) it extends equipment life by reducing the wear caused by frequent rebuilding, minimizing the potential for problems during disassembly and reassembly, and detecting problems as they occur; 3) it saves maintenance costs by reducing repair costs, overtime and parts inventory requirements; and 4) a reduced number of maintenance operations reduces the influence of human error. However, PM still faces some challenges: 1) initiating PM is costly, because sufficient instrumentation can be expensive, especially if the goal is to monitor already installed equipment; 2) the goal of PM is accurate maintenance, which is difficult to achieve given the complexity of the equipment and its environment; and 3) introducing PM invokes a major change in how maintenance is performed, and potentially in the whole maintenance organization of a company, and organizational changes are in general difficult. The objective of maintenance scheduling optimization is to optimize the maintenance
scheduling in order to maximize the overall profit, ensure safety and increase availability.
Mathematically, the maintenance scheduling problem is a multiple-constraint, non-linear, stochastic optimization problem. This kind of problem has been studied for several decades and many different methods have been applied to solve it. Two methods for PM optimization were developed during the 1980s. The first method [Perla, 1984; Walker, 1987] performs a cost/benefit analysis of each analyzed piece of manufacturing equipment; it first identifies the important equipment and then predicts its future performance with and without changes to the regularly scheduled maintenance programme. The second approach is Reliability-Centred Maintenance (RCM) [Crellin, 1986; Hook et al., 1987; Vasudevan, 1985]. This methodology was adopted from the commercial air transport industry and is based on a series of orderly steps, including the identification of system/subsystem functions and failure modes, the prioritization of failures and failure modes (using a decision logic tree), and finally the selection of PM tasks that are both applicable (i.e. have the potential of reducing the failure rate) and effective (i.e. economically worth doing). In the last two decades, many intelligent computational methods, such as artificial neural networks, simulated annealing, expert systems, fuzzy systems and evolutionary optimization, have been applied to the maintenance scheduling problem with very encouraging results [Huang, 1998; Miranda et al., 1998; Satoh & Nara, 1991; Sutoh et al., 1994; Yoshimoto et al., 1993]. With the rapid development of evolutionary theory, genetic algorithms (GAs) have become a very powerful optimization tool and have been widely applied in this area [Arroyo & Conejo, 2002; Back et al., 1997; Huang et al., 1992; Lai, 1998; Lee & Yang, 1998; Y. Wang & Handschin, 2000]. In recent years, several newer intelligent computational methods, such as Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO), have been applied to preventive maintenance scheduling [Benbouzid-Sitayeb et al., 2008; Pereira et al., 2010; Yare & Venayagamoorthy, 2010].
All the above maintenance scheduling methods are based on specified time periods rather than on the condition of the equipment or facilities. PM is a good strategy that can be used to improve reliability, increase the useful life of equipment and reduce maintenance cost according to the condition of the machine. When the condition of a system, such as its degradation level, can be continuously monitored, a PM policy can be implemented in which the decision to maintain the system is taken dynamically on the basis of the observed condition of the system. Recently, genetic algorithms, the Monte Carlo method and Markov and semi-Markov methods have been applied to PM [Amari et al., 2006; Barata et al., 2001, 2002; Bérenguer et al., 2000; Grall et al., 2008; Marseguerra et al., 2002]. However, there is very little literature on applying intelligent computational methods to predictive maintenance based on the conditions (degradation) of the monitored machines.
This chapter builds PM scheduling models and optimizes them using swarm intelligence algorithms.
Power generating companies must generate sufficient electrical power to meet the varying demands of consumers. Electricity cannot be stored easily or cheaply, so it must be generated continuously according to customer demand. With the increasing demand for electricity, the generating unit maintenance scheduling (GMS) of a power system has become a complex, multi-objective, constrained optimization problem. Within the last three decades, several techniques have appeared in the literature addressing such optimization problems under different scenarios [Marwali & Shahidehpour, 2000; Negnevitsky & Kelareva, 1999]. The primary goal of GMS is the effective allocation of generating units for maintenance while ensuring high system reliability, reducing production cost and prolonging generator lifetime, subject to unit and system constraints [Yare et al., 2008].
In order to obtain approximate solutions of complex GMS problems, several new concepts have been proposed in recent years, including probabilistic approaches [Billinton & Abdulwhab, 2003], simulated annealing [Satoh & Nara, 1991], decomposition techniques [Yellen et al., 1992] and genetic algorithms (GA) [Firmo & Legey, 2002]. A flexible GMS that considers uncertainties has been proposed using a fuzzy 0-1 integer programming technique and applied to the Taiwan power system. The application of GA to GMS has been compared with and confirmed to be superior to other conventional algorithms, such as heuristic approaches and branch-and-bound (B&B), in the quality of its solutions [Firmo & Legey, 2002]. However, the application of particle swarm optimization (PSO) and
its variants to GMS has not been fully explored in the literature. This section is retrieved from [Zhang & Wang, 2010].
Generally, there are two main categories of objective functions in GMS problems: reliability-based and economic-cost-based. The reliability criterion of levelling the reserve generation over the entire study period is considered here. As the objective function of the GMS problem, we use the levelling of the annual supply reserve ratio, a deterministic index. Because an algorithm for levelling the supply reserve ratio is easy to implement without probabilistic simulation of the operating cost, it is possible to formulate an annual GMS problem (52-week horizon). Its weak point is that it does not consider probabilistic conditions such as the forced outage of generators. In practice, generation companies have used the minimization of the annual supply reserve ratio more than probabilistic index methods. Since this research focuses on applying the PSO algorithm to the GMS problem, it is sufficient to formulate the objective function as annual supply reserve ratio levelling.
The problem studied here is solved by minimizing the annual supply reserve ratio. It has a number of unit and system constraints to be satisfied, described as follows:
• Load constraint – the total capacity of the units running in any interval must not be less than the predicted load in that interval.
• Crew constraint – in each period, the capacity of the units under maintenance cannot exceed the maximum available maintenance capacity, considering the crew available in that period.
• Start week of maintenance – each unit has its allowed maintenance periods, and the maintenance schedule cannot fall outside these periods.
• Maintenance window constraint – maintenance starts at the beginning of an interval and finishes at the end of the same interval, which may contain one or several weeks; maintenance cannot be aborted or finished earlier than scheduled.
The objective function to be minimized is given by Eq. (8.1), subject to the constraints given by Eq. (8.2)–(8.5).
\min \sum_{t=1}^{T}\left[\frac{AC_t - L_t}{L_t} - \frac{1}{T}\sum_{t=1}^{T}\frac{AC_t - L_t}{L_t}\right]^2 = \min \sum_{t=1}^{T}\left[\frac{IC - SL_t - L_t}{L_t} - \frac{1}{T}\sum_{t=1}^{T}\frac{IC - SL_t - L_t}{L_t}\right]^2    (8.1)
where:
T : length of the maintenance planning horizon (normally 52 weeks);
AC_t : available generation capacity in the t-th week;
S_j^{min} : feasible minimum starting week for the maintenance scheduling of the j-th unit;
S_j^{max} : feasible maximum starting week for the maintenance scheduling of the j-th unit;
C_j : capacity of the j-th unit.
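The sketch below illustrates the levelling objective of Eq. (8.1) under stated assumptions: given a candidate schedule (one start week per unit), it computes the weekly available capacity and the spread of the supply reserve ratio. The unit data and load profile are placeholders, not the 32-unit test system.

# Illustrative sketch: evaluate the reserve-ratio levelling objective of Eq. (8.1)
# for a candidate maintenance schedule.
import numpy as np

T = 52
load = np.full(T, 900.0)                       # predicted weekly load L_t (MW), placeholder
units = [                                      # (capacity MW, maintenance duration in weeks)
    (300, 4), (250, 3), (400, 5), (200, 2),
]
installed = sum(c for c, _ in units)           # installed capacity IC

def reserve_levelling(start_weeks):
    """Objective of Eq. (8.1) for a schedule given as one start week per unit."""
    out_capacity = np.zeros(T)
    for (cap, dur), s in zip(units, start_weeks):
        out_capacity[s:s + dur] += cap         # capacity removed while the unit is maintained
    ac = installed - out_capacity              # available capacity AC_t
    ratio = (ac - load) / load                 # supply reserve ratio per week
    return float(np.sum((ratio - ratio.mean()) ** 2))

print(reserve_levelling([0, 10, 20, 40]))

This function can serve as the fitness function evaluated by PSO/IPSO for each candidate particle.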
PSO performs well in the early iterations but has problems approaching a near-optimal solution. If a particle's current position coincides with the global best and its inertia weight multiplied by its previous velocity is close to zero, the particle will stay at that position. If the previous velocities are very close to zero, all particles stop moving around the near-optimal solution, which may lead to premature convergence of the algorithm: all particles have converged to the best position discovered so far, which may not be the optimal solution. An improved PSO (IPSO) is therefore proposed here.
In IPSO, before the velocities and positions are updated in each iteration, the particles are ranked according to their fitness values. With an assumed mutation rate \alpha, the first (1 - \alpha) fraction of the particles is passed directly to the next iteration, while the remaining fraction \alpha is regenerated randomly. In this case, the positions and velocities of the regenerated particles are drawn according to Eq. (8.6) instead of Eq. (3.23)–(3.24):
x_{id} = \mathrm{round}\big(\mathrm{rand} \times (S^{max}(j) - S^{min}(j)) + S^{min}(j)\big)    (8.6)
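A minimal sketch of this regeneration step follows, assuming a minimization problem, a placeholder fitness function and illustrative start-week windows; it only shows how the worst fraction of the particles could be redrawn within [S^min, S^max] as in Eq. (8.6).

# Illustrative sketch: keep the best (1 - alpha) particles and regenerate the
# rest uniformly within the feasible start-week windows (Eq. (8.6)).
import numpy as np

rng = np.random.default_rng(0)
n_particles, n_units = 150, 4
s_min = np.array([1, 5, 10, 20])               # feasible earliest start weeks (placeholder)
s_max = np.array([10, 20, 30, 45])             # feasible latest start weeks (placeholder)
alpha = 0.2                                    # assumed mutation (regeneration) rate

positions = np.rint(rng.uniform(s_min, s_max, size=(n_particles, n_units)))
velocities = rng.uniform(-3, 3, size=(n_particles, n_units))

def fitness(p):                                # placeholder objective (lower is better)
    return float(np.sum(p))

order = np.argsort([fitness(p) for p in positions])        # best particles first
keep = order[: int((1 - alpha) * n_particles)]
regen = order[int((1 - alpha) * n_particles):]

positions[regen] = np.rint(rng.uniform(s_min, s_max, size=(len(regen), n_units)))
velocities[regen] = rng.uniform(-3, 3, size=(len(regen), n_units))
print('kept', len(keep), 'particles, regenerated', len(regen))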
In order to investigate the performance of IPSO on the GMS problem, a test system comprising 32 units over a planning period of 52 weeks was used. The case study is described below and was implemented in a MATLAB environment.
There are 32 generating units, the annual peak load demand is 2,850 MW, and the installed capacity is 3,450 MW. The weekly peak loads as a percentage of the annual peak are shown in Table 8.1. The specific data of the generators, including capacity (MW), maintenance period and load constraints, are shown in Table 8.2. The crew constraint is constant at 800 MW.
To implement PSO and IPSO, a population size of 150 particles was chosen to provide sufficient diversity in the population, taking into account the dimensionality and complexity of the problem. This population size ensured that the domain was examined fully, at the expense of an increased execution time. The other parameters of PSO and IPSO were: c1 = c2 = 2.0, the inertia weight ω decreasing linearly from 1.2 to 0.8, a total of 300 iterations, and the velocity range V ∈ [-3, 3].
The annual supply reserve ratio as a function of the number of iterations is shown in Fig. 8.2, comparing the simulation results of the PSO and IPSO algorithms. The figure shows that IPSO performs better than PSO in finding optimal solutions to GMS problems; the particles of IPSO have a higher probability of finding the optimal solution than those of PSO. The optimal GMS solutions obtained with PSO and IPSO are shown in Table 8.3, which lists the global best particles of PSO and IPSO, i.e. the best maintenance periods satisfying the maintenance continuity and crew constraints, etc.
\mathit{Prod}_{tot} = \sum_{i=1}^{M} (\mathit{Prod}_i \times Z_i)    (8.9)
8.3. The states for all the machines can be used as parameters in predictive
maintenance scheduling.
\sum_{j=1}^{n} P_{ij} = 1, \quad i = 1, \ldots, n    (8.11)
This model is very similar to a Markov model except that it lacks a random variable for the inspection time. With a Markov model, the mean time between CM actions and the mean time between PM actions can be estimated [Amari et al., 2004], but the accumulated error is very difficult to eliminate, and the result is only the mean time between CM and between PM rather than an actual CM or PM plan or schedule. With such a result, the CM and PM actions could be much more or much less frequent than necessary because of the uncertainty of mechanical products. Therefore, the inspection action is performed at the beginning of every
period, as mentioned in Section 8.4.1.2. Moreover, in this model no CM or PM action is taken while the state of the machine lies between S_1 and S_{k-1}. A PM plan is made when the state of the machine lies between S_k and S_{n-1}, and, as mentioned above, a CM action is performed if and only if the state of the machine reaches or exceeds S_n. To simplify the analysis, for the elements of the state transition matrix in Eq. (8.10), from S_1 to S_{k-1} only P_{ii} and P_{i,i+1} (i = 1, 2, ..., k-1) have positive values and all others are zero, while from S_k to S_{n-1} only P_{ii}, P_{i,i+1} and P_{i1} (i = k, k+1, ..., n-1) have positive values and all others are zero as well. The new matrix can be expressed as Eq. (8.12).
P = \begin{bmatrix}
P_{11} & P_{12} & 0      & 0      & \cdots & 0      & 0            & \cdots & 0            & 0 \\
0      & P_{22} & P_{23} & 0      & \cdots & 0      & 0            & \cdots & 0            & 0 \\
0      & 0      & P_{33} & P_{34} & \cdots & 0      & 0            & \cdots & 0            & 0 \\
\vdots & \vdots & \vdots & \ddots &        & \vdots & \vdots       &        & \vdots       & \vdots \\
P_{k1} & 0      & 0      & 0      & \cdots & P_{kk} & P_{k,k+1}    & \cdots & 0            & 0 \\
P_{k+1,1} & 0   & 0      & 0      & \cdots & 0      & P_{k+1,k+1}  & \cdots & 0            & 0 \\
P_{k+2,1} & 0   & 0      & 0      & \cdots & 0      & 0            & \ddots & 0            & 0 \\
\vdots & \vdots & \vdots & \vdots &        & \vdots & \vdots       &        & \vdots       & \vdots \\
P_{n-1,1} & 0   & 0      & 0      & \cdots & 0      & 0            & \cdots & P_{n-1,n-1}  & P_{n-1,n} \\
1      & 0      & 0      & 0      & \cdots & 0      & 0            & \cdots & 0            & 0
\end{bmatrix}    (8.12)
The ideal values of the elements in Eq. (8.12) for the perfect deterioration model are expressed by Eq. (8.13) and Eq. (8.14):
P_{ii} = 0 \;\&\; P_{i,i+1} = 1, \quad i = 1, \ldots, k    (8.13)
For state S_n in Eq. (8.12), P_{n1} = 1 and all other elements are 0, which means that when the state reaches S_n, CM has to be performed. These values could reflect the real situation of a manufacturing machine, but it is difficult to make them realistic. To achieve this, the transition probabilities for the states S_1 to S_n should be adjusted statistically after a number of periods.
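The following sketch illustrates a transition matrix of the form of Eq. (8.12) and a simulated inspection history; the probabilities, the number of states n and the threshold k are placeholders, since in practice they would be estimated from inspection statistics as described above.

# Illustrative sketch: build a transition matrix with the structure of Eq. (8.12)
# (stay, degrade one step, or return to S1 via PM/CM) and simulate inspections.
import numpy as np

n, k = 6, 4                                    # n states, PM considered from S_k onwards
P = np.zeros((n, n))
for i in range(k - 1):                         # S1 .. S_{k-1}: stay or degrade
    P[i, i], P[i, i + 1] = 0.7, 0.3
for i in range(k - 1, n - 1):                  # S_k .. S_{n-1}: stay, degrade, or PM back to S1
    P[i, i], P[i, i + 1], P[i, 0] = 0.5, 0.2, 0.3
P[n - 1, 0] = 1.0                              # S_n: CM always restores the machine to S1
assert np.allclose(P.sum(axis=1), 1.0)         # each row is a probability distribution

rng = np.random.default_rng(0)
state, history = 0, []
for _ in range(20):                            # inspected state at the start of each period
    state = rng.choice(n, p=P[state])
    history.append(state + 1)                  # report states as S1..Sn
print(history)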
where C_prod represents the production cost for a period and C_piece represents the cost of producing one piece.
Maintenance cost: this is the cost of performing PM and CM, i.e. how much money is needed to perform the PM and CM actions.
C_M = \sum_{i=1}^{M} (CM_i \times C_{ci} + PM_i \times C_{pi})    (8.16)
where C_tot is the total cost in one period and C_I is the inspection cost; because in this model all machines are inspected every period, C_I is fixed.
\sum_{i=1}^{M} (CM_i + PM_i) \le M_{max}    (8.20)
where Pr is the price of one piece of product, Prod_min is the minimum production quantity required in one period, and M_max is the maximum number of maintenance actions that can be performed. In this model, to find the optimal dynamic predictive maintenance plan for each period, Eq. (8.18) is used as the objective function with Eq. (8.19) and Eq. (8.20) as constraints; the aim is to find the PM schedule that maximizes the profit subject to the two constraints of Eq. (8.19) and Eq. (8.20).
[Figure: fitness value vs. iteration (0–2500)]
The wind energy industry has experienced extensive worldwide growth in the past years. Certain forecasts indicate that the share of wind in Europe's energy production will reach up to 20% in the near future [Krohn et al., 2007]. The efficient operation of installed turbines is of increasing significance. Among operational decisions, the planning and scheduling of maintenance tasks is decisive for both turbine availability and operational costs. Considering the spread of offshore installations, and the fact that their operational costs, including specialized support resources for offshore operations such as service vessels and personnel, can be estimated to be five to ten times higher than those of onshore farms [Bussel & Zaaijer, 2001; Markard & Petersen, 2009], maintenance scheduling will receive even more emphasis. Meanwhile, the support resources are often restricted by the environmental conditions at the site, and certain operations are allowed only in short weather windows; missing a weather window may lead to production interruption and economic loss.
This chapter aims to investigate an operational decision problem, the routing and scheduling of a maintenance fleet for offshore wind farms, which can replace the time-consuming process of manually planning the scheduling and routing with a presumably suboptimal outcome. The mathematical model of RSOM is retrieved from the literature [Dai, 2014], and a swarm intelligence algorithm, Ant Colony Optimization (ACO), is modified into Duo-ACO and applied to solve this problem.
Let there be n offshore wind turbines (OWTs) indexed by i. Associate with the delivery location of OWT i a node i, and with its pick-up location a node n + i. Also associate the harbour with nodes 0 and 2n + 1. The variables are defined as follows:
Sets:
Z : the set of delivery nodes, Z = {1, 2, 3, ..., n}.
Z_V ⊆ Z : the set of nodes that require the vessel to be present during the maintenance operations.
N : the set of all nodes (the delivery and pick-up nodes plus the harbour nodes 0 and 2n+1).
T_{vd}^{MAX} : the maximum working hours on day d for vessel v, used as the weather limitation for the different vessels.
L_v^{MAX} : the load capacity of vessel v.
T_i^{LATE} : the latest day on which the maintenance task on turbine i can be performed without incurring a penalty cost.
C_i^{PE} : the penalty cost per day for delaying the maintenance task on turbine i beyond T_i^{LATE}.
Decision variables:
x_{vijd} = 1 if vessel v travels from node i to node j on maintenance day d, and 0 otherwise.
k_{vid} : the total load weight on vessel v just after it leaves node i on maintenance day d.
q_{vid} : the total number of personnel on vessel v just after it leaves node i on maintenance day d.
Objective function:
\min \left\{ \sum_{v \in V} \sum_{d \in T} C_v \, t_{v(2n+1)d} + \sum_{i \in Z} C_i^{PE} \, y_i \right\}    (8.21)
Constraints
\sum_{j \in N} \sum_{v \in V} \sum_{d \in T} x_{vijd} = 1, \quad \forall i \in Z    (8.22)
\sum_{i \in N} x_{v0id} = 1, \quad \forall v \in V, d \in T    (8.23)
\sum_{j \in N} x_{vjid} = \sum_{j \in N} x_{vijd}, \quad \forall v \in V, d \in T, i \in N    (8.24)
\sum_{i \in N} x_{vi(2n+1)d} = 1, \quad \forall v \in V, d \in T    (8.25)
\sum_{j \in N} x_{vjid} = \sum_{j \in N} x_{v(n+i)jd}, \quad \forall v \in V, d \in T, i \in Z    (8.26)
\sum_{v \in V} \sum_{d \in T} x_{vi(i+n)d} = 1, \quad \forall i \in Z_V    (8.27)
t_{v(n+i)d} - t_{vid} \ge T_i^M, \quad \forall i \in Z, v \in V, d \in T    (8.28)
\sum_{j \in N} \sum_{v \in V} \sum_{d \in T} d \cdot x_{vijd} - y_i \le T_i^{LATE}, \quad \forall i \in Z    (8.29)
0 \le k_{vid} \le L_v^{MAX}, \quad \forall i \in N, v \in V, d \in T    (8.36)
t_{v0d} = 0, \quad \forall v \in V, d \in T    (8.39)
y_i \ge 0, \quad \forall i \in Z    (8.40)
The constraints can be explained as follows:
1) Eq. (8.22) ensures that each OWT is visited only once for delivery and once for pick-up.
2) Eq. (8.23) and (8.25) ensure that each vessel leaves and returns to the harbour only once every day.
3) Eq. (8.24) and (8.26) ensure flow conservation at each node.
4) Eq. (8.27) means that if the vessel needs to be present during the maintenance operation on an OWT, it only leaves the OWT when the operation is completed.
5) Eq. (8.28) is a precedence constraint which ensures that the pick-up is not done before the maintenance operation on the same OWT is completed.
6) Eq. (8.29) is a soft constraint which requires that the maintenance task is performed within the preferred time.
7) Eq. (8.30) keeps the travelling times of each vessel compatible.
8) Eq. (8.31) ensures that the service vessels are not overloaded.
9) Eq. (8.32) expresses the compatibility requirements between routes and vessel loads.
10) Eq. (8.33) ensures that no extra load is added when the vessels pick up from OWTs.
11) Eq. (8.34) and (8.35) describe the compatibility requirements between routes and the number of personnel on the vessels.
12) Eq. (8.36) and (8.37) guarantee that neither the load nor the number of personnel exceeds the vessel limits.
13) Eq. (8.38) imposes a maximum working time for the service vessels on each day.
14) Eq. (8.39) means that the time is counted from the moment the vessels leave the harbour.
15) Eq. (8.40) sets the number of delayed maintenance days to be non-negative.
[Fig. 8.6: Flowchart of Duo-ACO: initialize the parameters and set the iteration counter NC = 0; for each ant pair (ant1(k), ant2(k)), compute the selection probabilities of the unvisited nodes by Eq. (3.21) and choose the next node by roulette-wheel selection until all nodes are visited; update pheromone1 and pheromone2 by Eq. (3.22); repeat for all ants and iterate (NC = NC + 1) until the maximum iteration NCmax or another termination criterion is reached]
The process of the program is shown in Fig. 8.6. There are two groups of ants, and each group represents a vessel. The route of each ant represents the routing and scheduling of the maintenance. From experience, the number of ants in each group should be approximately equal to the number of nodes to be visited, which are the offshore turbines in this case. Therefore, the parameters of Duo-ACO are set as follows: the number of ants in each group is 10, the maximum number of iterations is 300, the importance coefficients of the pheromone and the heuristic information, α and β, are set to 1 and 5 respectively, and the pheromone evaporation coefficient is 0.1.
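As an illustration of the node-selection step used by each ant group, the following Python sketch applies the standard ACO transition rule (pheromone raised to α multiplied by heuristic information raised to β, as in Eq. (3.21)) together with roulette-wheel selection. It is a simplified sketch rather than the thesis code; the variable names, the 4-node travel-time matrix and the use of inverse travel time as the heuristic are assumptions.

```python
import numpy as np

# Sketch of the node-selection step of one ant, assuming the standard ACO
# transition rule: p(i -> j) proportional to tau[i, j]**alpha * eta[i, j]**beta
# over the unvisited nodes. Names and data are illustrative only.

def choose_next_node(current, unvisited, tau, eta, alpha=1.0, beta=5.0,
                     rng=np.random.default_rng()):
    """Roulette-wheel selection of the next node for one ant."""
    unvisited = list(unvisited)
    weights = np.array([tau[current, j] ** alpha * eta[current, j] ** beta
                        for j in unvisited])
    probs = weights / weights.sum()          # normalize to a probability vector
    return rng.choice(unvisited, p=probs)    # roulette-wheel draw


# Hypothetical 4-node example: eta is the inverse travel time between nodes.
travel_time = np.array([[np.inf, 1.0, 2.0, 4.0],
                        [1.0, np.inf, 1.5, 3.0],
                        [2.0, 1.5, np.inf, 1.0],
                        [4.0, 3.0, 1.0, np.inf]])
eta = 1.0 / travel_time
tau = np.ones((4, 4))                        # uniform initial pheromone
print(choose_next_node(current=0, unvisited=[1, 2, 3], tau=tau, eta=eta))
```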
The results of the maintenance scheduling and routing with 8 offshore turbines are shown in Table 8.9. Vessel 1 and vessel 2 visit and maintain 5 turbines and 3 turbines respectively. The route numbering here has the same meaning as in Table 8.7. The result shows that the two vessels can visit and maintain these turbines within one day (5.4915 and 8.7097 hours respectively), and the objective value of Eq. (8.21) is 3848.5.
[Figure: Convergence of the objective value of Eq. (8.21) over 300 iterations for the 8-turbine case; the final objective value is 3848.5.]
In order to examine the performance of Duo-ACO for a wind farm with a larger number of turbines, a new offshore wind farm with 28 turbines is tested. The information on the two vessels is the same as shown in Table 8.6, and the maximum working hours for each day are the same as in Table 8.8. The conditions and parameters of the 28 turbines are shown in Table 8.10. The parameters of Duo-ACO change because of the increased number of wind turbines: the number of ants in each group is set to 30 and the maximum number of iterations is set to 1000. The results are shown in Table 8.11 and Fig. 8.9. Vessel 1 and vessel 2 visit 19 turbines and 9 turbines respectively for repair, inspection or replacement. Vessel 1 needs four days to visit and maintain all these 19 turbines, requiring 5.859, 5.7938, 6.9541 and 4.8947 hours on each day, which are less than the maximum working hours of vessel 1 in Table 8.8. Vessel 2 needs 2 days to visit and maintain 9 turbines, requiring 7.4814 and 7.0311 hours on each day, which are also less than the maximum working hours of vessel 2 in Table 8.8. The objective value of the fitness function of Eq. (8.21) is 9641.6, as shown in Table 8.11.
These two numerical examples show how to apply Duo-ACO to the scheduling and routing of a maintenance fleet for offshore wind farms, which is a complex non-linear problem. Example 1 showed the problem solution with 8 offshore turbines while example 2 shows that with 28 offshore turbines, and both examples show the effectiveness of applying Duo-ACO to the scheduling and routing problems of offshore wind farms.
Table 8.10 Conditions and parameters of the 28 turbines. Columns: Unit; Turbines; Task type; Time window T_i^LATE (day); Penalty cost C_i^PE (euro/day); Required load L_i (kg); Task duration T_i^M (hours); Required personnel P_i.
1 T3 Replacement 3 2000 800 3 3
2 T4 Repair 6 500 50 2 2
3 T6 Replacement 4 1500 800 3 3
4 T11 Inspection 12 0 20 1 1
5 T12 Repair 4 1600 200 2 3
6 T13 Replacement 2 2500 500 3 2
7 T14 Replacement 2 2000 500 3 2
8 T16 Repair 5 1000 300 3 2
9 T19 Replacement 1 3000 200 2 2
10 T21 Repair 7 1000 50 1 2
11 T23 Inspection 12 0 20 1 1
12 T25 Inspection 10 0 20 1 1
13 T27 Replacement 2 2500 500 2 3
14 T30 Repair 4 1200 100 2 1
15 T36 Replacement 3 2000 800 5 3
16 T38 Inspection 12 0 20 1 1
17 T39 Replacement 4 2000 300 2 2
18 T42 Repair 5 1000 200 2 2
19 T44 Inspection 10 0 20 1 1
20 T45 Repair 8 1000 500 2 2
21 T49 Replacement 1 3000 800 4 3
22 T52 Replacement 2 2000 800 4 3
23 T54 Repair 5 1000 50 1 1
24 T55 Replacement 3 2000 500 3 2
25 T58 Inspection 13 0 20 1 1
26 T60 Repair 6 1000 500 3 4
27 T61 Repair 7 1000 300 2 3
28 T62 Inspection 12 0 20 1 1
Table 8.11 (Vessel 2 row): Vessel2, 9 turbines, route 0-2-1-13-15-2-1-13-15-0-3-7-6-10-14-14-7-6-10-3-0, working hours per day 7.4814 and 7.0311.
[Fig. 8.9 Convergence of the objective value of Eq. (8.21) over 1000 iterations for the 28-turbine case; the final objective value is 9641.6.]
There is also a drawback of using Duo-ACO to solve this problem: as the number of turbines increases, the process of finding a solution with Duo-ACO becomes time-consuming. However, this problem is not very time-sensitive, which means that the key point is to find the optimal solution regardless of how much time it takes. Therefore, Duo-ACO is a suitable algorithm for solving this non-linear scheduling and routing problem.
8.6 Summary
The generating unit maintenance scheduling model is not based on the condition of the machines but on a fixed period (a year), and so it can be seen as preventive maintenance scheduling. PSO was improved with a mutation rate α, called improved PSO (IPSO), and applied to generating unit maintenance scheduling. Both PSO and IPSO can find an optimal maintenance schedule of the generating units, but IPSO has better performance, with a faster convergence speed and a better fitness value.
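The following Python sketch illustrates one iteration of a PSO variant with an added mutation step applied at rate α, in the spirit of the IPSO discussed above. It is a generic sketch under stated assumptions rather than the thesis implementation; in particular, the mutation operator here simply re-initializes a mutated particle within the search bounds, which may differ from the exact IPSO operator.

```python
import numpy as np

# Generic sketch of one PSO iteration with a mutation step applied at rate
# alpha. The mutation operator (random re-initialization within the bounds)
# is an assumption made for illustration.

def pso_step(x, v, pbest, gbest, bounds, w=0.7, c1=1.5, c2=1.5, alpha=0.05,
             rng=np.random.default_rng()):
    n, dim = x.shape
    r1, r2 = rng.random((n, dim)), rng.random((n, dim))
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # velocity update
    x = x + v                                                   # position update
    lo, hi = bounds
    mutate = rng.random(n) < alpha                              # mutation rate alpha
    x[mutate] = rng.uniform(lo, hi, size=(mutate.sum(), dim))   # random restart
    return np.clip(x, lo, hi), v


# Hypothetical usage: 10 particles in a 3-dimensional unit box.
x = np.random.default_rng(0).uniform(0.0, 1.0, (10, 3))
v = np.zeros_like(x)
x, v = pso_step(x, v, pbest=x.copy(), gbest=x[0], bounds=(0.0, 1.0))
```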
For the application of BCA in predictive scheduling optimization, a dynamic model of condition-based maintenance was established. The dynamic predictive maintenance model is based on the condition of the machines rather than on a fixed period as in preventive maintenance. The main aim of predictive maintenance (PM) is to avoid unnecessary maintenance tasks by taking maintenance action only when evidence of abnormal performance is detected in the physical condition. A PM program can significantly reduce the maintenance cost by decreasing the number of needless scheduled preventive activities. A PM program allows the maintenance function to do only the right things at the correct time, minimizing spare parts cost, system downtime and time spent on maintenance. Based on the model and the condition of each machine, a dynamic schedule of PM and CM can be generated using BCA. The result obtained from the numerical example confirms the trend of successful application of this algorithm in the field of PM, where a dynamic approach is of fundamental importance. Although the desired results have been fully achieved, and the analysis has helped to highlight and solve many critical issues, it is clear that more careful analysis should be done when analyzing the PM model. In this chapter, only a single condition parameter was considered for each machine. In most cases, however, more than one parameter jointly determines the state of a machine. Therefore, how to obtain the state of a machine from different parameters could be a future research field. Furthermore, the case study in this chapter only considers one period because of the limitation of our resources. In the future, a longer history period should be considered, and methods for adjusting the states S_i (i = 1, 2, ..., n) could be a good research topic.
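The condition-driven part of this dynamic model can be illustrated with a small sketch that maps each machine's condition state S_i to a required maintenance action before the resulting schedule is optimized with BCA. The state scale and the thresholds below are illustrative assumptions, not values from the thesis.

```python
# Minimal sketch of the dynamic, condition-driven part of the model: deciding
# per machine whether predictive maintenance (PM) or corrective maintenance
# (CM) should enter the schedule, based on its current condition state S_i.
# The [0, 1] state scale and the threshold values are illustrative assumptions.

def maintenance_demand(states, pm_threshold=0.7, failure_threshold=1.0):
    """Map condition states S_i in [0, 1] (1 = failed) to required actions."""
    actions = {}
    for machine, s in states.items():
        if s >= failure_threshold:
            actions[machine] = "CM"        # machine has failed: corrective action
        elif s >= pm_threshold:
            actions[machine] = "PM"        # degradation detected: predictive action
        else:
            actions[machine] = "none"      # healthy: no maintenance needed
    return actions


print(maintenance_demand({"M1": 0.2, "M2": 0.85, "M3": 1.0}))
```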
For the application of ACO in maintenance scheduling, a model of the scheduling and routing of a maintenance fleet for offshore wind farms was established. ACO was extended with two groups of ants, which is called Duo-ACO. The numerical examples show that Duo-ACO can solve this problem effectively even when the number of turbines increases. The drawback of the methodology is that it is impossible to know whether the solution found by Duo-ACO is the global optimum.
Chapter 9: Conclusion and Future Work
This chapter provides overall comments and concluding remarks about the work presented in this thesis, as well as some suggestions for future work.
The goals of this thesis are to develop a framework of intelligent Condition-based Maintenance (CBM) and to apply data mining techniques in its phases. CBM is an efficient maintenance strategy that takes maintenance action just before failure, based on the condition of the equipment, in order to increase the reliability and availability of the equipment while reducing maintenance and operation costs. It can also improve safety for both the equipment and the operating staff. There are two main tasks in CBM: one is fault diagnosis and prognosis for the equipment, and the other is to optimize the maintenance scheduling based on the diagnosis and prognosis results.
Chapter 2 presented the framework of the Intelligent Fault Diagnosis and Prognosis System (IFDPS) for CBM, which showed the phases of CBM and the data mining techniques applied in the system.
Chapter 3 presented the data mining techniques applied in IFDPS, including Artificial Neural Networks (ANN), Swarm Intelligence (SI) and Association Rules (AR). The techniques of ANN and AR are intended to be applied in fault diagnosis and prognosis, while the SI techniques are intended to be applied in sensor placement optimization and maintenance optimization.
Chapter 4 introduced sensor classification and sensor placement optimization techniques. The presented sensor placement optimization methods, which combine Finite Element Analysis (FEA) with SI algorithms such as PSO and BCO, are suitable for component-level and machine-level sensor placement optimization. However, system-level sensor placement optimization needs to be researched further.
Chapter 5 presented methods of signal processing, typically for vibration signals, and feature extraction. The vibration signals can be processed by time-domain, frequency-domain, time-frequency-domain and wavelet-domain analysis, from which many features (parameters) can be extracted. The parameters extracted from the signals may be too many to be classified or predicted using data mining techniques, and thus feature selection techniques need to be used to reduce the dimensionality of the parameters. PCA is an unsupervised learning approach for dimensionality reduction that uses the correlation coefficients of the parameters to combine and transform them into a reduced-dimensional space. It transforms high-dimensional features into a lower-dimensional representation but does not select features from the original features directly. Therefore, feature selection directly from the original features should be researched.
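For illustration, the following numpy sketch performs PCA in the way described above: the extracted parameters are standardized, their correlation matrix is eigen-decomposed, and the data are projected onto the leading components. The feature matrix is synthetic and the function is a simplified sketch, not the implementation used in the thesis.

```python
import numpy as np

# Minimal numpy sketch of correlation-based PCA. Rows of X are samples and
# columns are extracted vibration features; the data here are synthetic.

def pca_transform(X, n_components=2):
    Z = (X - X.mean(axis=0)) / X.std(axis=0)        # standardize each feature
    corr = np.corrcoef(Z, rowvar=False)             # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)         # eigen-decomposition
    order = np.argsort(eigvals)[::-1]               # sort by explained variance
    components = eigvecs[:, order[:n_components]]   # leading principal directions
    return Z @ components                           # reduced-dimension scores


X = np.random.default_rng(0).normal(size=(100, 10))
print(pca_transform(X, n_components=3).shape)       # -> (100, 3)
```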
Chapter 6 presented the methods of fault diagnosis, i.e. fault detection and classification, based on data mining techniques such as the BP network, SOM and Association Rules. The conclusions were presented at the end of that chapter. When historical data are available but a physical or mathematical model is not available or not accurate, data-driven techniques can be effectively applied in fault diagnosis.
Chapter 7 presented fault prognosis based on predicting the fault indicator using a BP network. The traditional methods of data-driven fault prognosis are based on statistics of the historical data [Lee et al., 2006]. The ANN model is intended to be used for multi-component, multi-fault prognosis, but in the wind turbine fault prognosis case study of this chapter only one component and one fault were considered. In the future, the multi-component, multi-fault ANN model should be researched further.
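The indicator-prediction idea can be sketched as follows: past values of a fault indicator in a sliding window are used as inputs to a small feed-forward network (trained with a backpropagation-style optimizer) that predicts the next value. The synthetic indicator series, the window length and the use of scikit-learn's MLPRegressor are assumptions made for illustration only, not the thesis setup.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Sketch of indicator-based prognosis: predict the next value of a fault
# indicator from a sliding window of its past values. The indicator series
# here is synthetic (a slowly growing degradation trend).

def make_windows(series, window=5):
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array(series[window:])
    return X, y


rng = np.random.default_rng(0)
indicator = np.cumsum(rng.normal(0.05, 0.02, size=300))    # degradation trend
X, y = make_windows(indicator, window=5)
model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
model.fit(X[:250], y[:250])                                # train on early data
print(model.predict(X[250:255]))                           # one-step-ahead forecast
```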
Chapter 8 presented maintenance optimization based on data mining techniques. Three different models and Swarm Intelligence methods (variants of PSO, BCA and ACO) were presented in this chapter. Generating unit maintenance scheduling is a preventive maintenance optimization, while the following two examples are predictive maintenance, or so-called CBM, and both types can be solved using data mining techniques.
References
Electronic Testing: Theory and Applications, Vol. 17, No. 1, pp. 29–36.
doi:10.1023/A:1011141724916.
Andersen, T. M., & Rasmussen, M. (1999). Decision support in a condition
based environment. Journal of Quality in Maintenance Engineering,
Vol. 5, No. 2, pp. 89–102. doi:10.1108/13552519910271793.
Andria, G., Savino, M., & Trotta, A. (1994). Application of Wigner-Ville
Distribution to Measurements on Transient Signals. IEEE Transactions
on Instrumentation and Measurement, Vol. 43, No. 2, pp. 187–193.
doi:10.1109/19.293418.
Ansari, F. (1998). Fiber Optic Sensors for Construction Materials and
Bridges. Lancaster: Taylor & Francis.
Arroyo, J. M., & Conejo, A. J. (2002). A parallel repair genetic algorithm to
solve the unit commitment problem. IEEE Transactions on Power
Systems, Vol. 17, No. 4, pp. 1216–1224.
doi:10.1109/TPWRS.2002.804953.
Back, T., Hammel, U., & Schwefel, H.-P. (1997). Evolutionary computation:
comments on the history and current state. IEEE Transactions on
Evolutionary Computation, Vol. 1, No. 1, pp. 3–17.
doi:10.1109/4235.585888.
Barata, J., Guedes Soares, C., Marseguerra, M., & Zio, E. (2001). Monte
Carlo simulation of deteriorating systems. Proceedings ESREL, pp.
879–886.
Barata, J., Soares, C. G., Marseguerra, M., & Zio, E. (2002). Simulation
modelling of repairable multi-component deteriorating systems for “on
condition” maintenance optimisation. Reliability Engineering & System
Safety, Vol. 76, No. 3, pp. 255–264.
Baskar, S., & Suganthan, P. N. (2004). A novel concurrent particle swarm
optimization. Proceedings of the 2004 Congress on Evolutionary
Computation (IEEE Cat. No.04TH8753), pp. 792–796. IEEE.
doi:10.1109/CEC.2004.1330940.
Bérenguer, C., Grall, A., & Castanier, B. (2000). Simulation and evaluation
of Condition-based maintenance policies for multi-component
continuous-state deteriorating systems. Proceedings of the Foresi.
Becker, E., & Poste, P. (2006). Keeping the Condition Monitoring of Wind
Turbine Gears. Wind Energy, Vol. 7, No. 2, pp. 26–32.
Beigel, M. (1982). Identification Device. US.
Belkin, M., Niyogi, P., & Sindhwani, V. (2006). Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples. Journal of Machine Learning Research, Vol. 7, pp. 2399–2434.
Cecchin, T., Ranta, R., Koessler, L., Caspary, O., Vespignani, H., &
Maillard, L. (2010). Seizure lateralization in scalp EEG using Hjorth
parameters. Clinical Neurophysiology: official journal of the
International Federation of Clinical Neurophysiology, Vol. 121, No. 3,
pp. 290–300. doi:10.1016/j.clinph.2009.10.033.
Chen, D., & Trivedi, K. S. (2005). Optimization for condition-based
maintenance with semi-Markov decision process. Reliability
Engineering & System Safety, Vol. 90, pp. 25–29.
doi:10.1016/j.ress.2004.11.001.
Chen, D., & Wang, W. J. (2002). Classification of Wavelet Map Patterns
Using Multilayer Neural Networks for Gear Fault Detection.
Mechanical Systems and Signal Processing, Vol. 16, No. 4, pp. 695–
704. doi:10.1006/mssp.2002.1488.
Chen, G., Liu, Y., Zhou, W., & Song, J. (2008). Research on intelligent fault
diagnosis based on time series analysis algorithm. The Journal of
China Universities of Posts and Telecommunications, Vol. 15, No. 1,
pp. 68–74.
Chen, S. Y., & Li, Y. F. (2002). A method of automatic sensor placement
for robot vision in inspection tasks. Proceedings 2002 IEEE
International Conference on Robotics and Automation (Cat.
No.02CH37292), Vol. 3, pp. 2545–2550. IEEE.
doi:10.1109/ROBOT.2002.1013614.
Cherng, A.-P. (2003). Optimal sensor placement for modal parameter identification using signal subspace correlation techniques. Mechanical Systems and Signal Processing, Vol. 17, No. 2, pp. 361–378. doi:10.1006/mssp.2001.1400.
Chong, C., Hean Low, M., Sivakumar, A., & Gay, K. (2006). A Bee Colony
Optimization Algorithm to Job Shop Scheduling. Proceedings of the
2006 Winter Simulation Conference, pp. 1954–1961. IEEE.
doi:10.1109/WSC.2006.322980.
Chu, C., Proth, J., & Wolff, P. (1998). Predictive maintenance: The One-
unit Replacement Model. International Journal of Production
Economics, Vol. 54, No. 3, pp. 285–295.
doi:http://dx.doi.org/10.1016/S0925-5273(98)00004-8.
Clerc, M., & Kennedy, J. (2002). The particle swarm - explosion, stability,
and convergence in a multidimensional complex space. IEEE
Transactions on Evolutionary Computation, Vol. 6, No. 1, pp. 58–73.
doi:10.1109/4235.985692.
Dorigo, M., Maniezzo, V., & Colorni, A. (1996). Ant system: optimization
by a colony of cooperating agents. IEEE transactions on systems, man,
and cybernetics. Part B, Cybernetics: a publication of the IEEE
Systems, Man, and Cybernetics Society, Vol. 26, No. 1, pp. 29–41.
doi:10.1109/3477.484436.
Dorigo, M., & Stützle, T. (2004). Ant colony optimization. Cambridge, Mass:
MIT Press.
Dragomir, O. E., Gouriveau, R., Zerhount, N., & Dragomir, F. (2007).
Framework for a distributed and hybrid prognostic system. In B.
Octavian (Ed.), 4th IFAC Conference on Management and Control of
Production and Logistics (2007), pp. 431–436. doi:10.3182/20070927-
4-RO-3905.00072.
Du, M., Cai, J., Liu, L., & Chen, P. (2011). ARRs based sensor placement
optimization for fault diagnosis. Procedia Engineering, Vol. 16, pp.
42–47. doi:10.1016/j.proeng.2011.08.1049.
Eberhart, R. C., & Shi, Y. (2000). Comparing inertia weights and
constriction factors in particle swarm optimization. Proceedings of the
2000 Congress on Evolutionary Computation. CEC00 (Cat.
No.00TH8512), Vol. 1, pp. 84–88. IEEE.
doi:10.1109/CEC.2000.870279.
Eisenmann, R. C. S., & Eisenmann, R. C. J. (1998). Machinery Malfunction
Diagnosis and Correction. Englewood Cliffs, NJ: Prentice-Hall.
Eklöv, T., Mårtensson, P., & Lundström, I. (1997). Enhanced selectivity of
MOSFET gas sensors by systematical analysis of transient parameters.
Analytica Chimica Acta, Vol. 353, No. 2-3, pp. 291–300.
doi:10.1016/S0003-2670(97)87788-4.
El-Abd, M., & Kamel, M. S. (2006). A hierarchical cooperative Particle
Swarm Optimizer. Proc. of Swarm Intelligence Symposium, pp. 43–47.
EN 13306: 2001 Maintenance Terminology, European Standard (2001).
CEN (European Committee for Standardization), Brussels.
Espinosa, J., Vandewalle, J., & Wertz, V. (2005). Fuzzy Logic,
Identification and Predictive Control. London: Springer-Verlag.
Estrin, D., Govindan, R., Heidemann, J., & Kumar, S. (1999). Next century
challenges. Proceedings of the 5th annual ACM/IEEE international
conference on Mobile computing and networking - MobiCom ’99, pp.
263–270. New York, New York, USA: ACM Press.
doi:10.1145/313451.313556.
Hook, T. G., Hughes, E. A., Levline, R. E., Morgan, T. A., & Parker, L. M.
(1987). Application of reliability-centered maintenance to San Onofre
Units 2 & 3 auxiliary feed water system, EPRI NP-5430.
Hu, J., Zhang, L., Ma, L., & Liang, W. (2011). An integrated safety
prognosis model for complex system based on dynamic Bayesian
network and ant colony algorithm. Expert Systems with Applications,
Vol. 38, No. 3, pp. 1431–1446. doi:10.1016/j.eswa.2010.07.050.
Hu, S., Zhou, C., & Hu, W. (2000). A New Structure Fault Detection
Method Based on Wavelet Singularity. Journal of Applied Sciences,
Vol. 18, No. 3, pp. 198–201.
Huang, C. J., Lin, C. E., & Huang, C. L. (1992). Fuzzy approach for
generator maintenance scheduling. Electric Power Systems Research,
Vol. 24, No. 1, pp. 31–38. doi:10.1016/0378-7796(92)90042-Y.
Huang, S. (1998). A genetic-evolved fuzzy system for maintenance
scheduling of generating units. International Journal of Electrical
Power & Energy Systems, Vol. 20, No. 3, pp. 191–195.
doi:http://dx.doi.org/10.1016/S0142-0615(97)00080-X.
Huang, Y., McMurran, R., & Jones, D. R. P. (2008). Probability based vehicle fault diagnosis: Bayesian network method, pp. 301–311.
doi:10.1007/s10845-008-0083-7.
Ilie-zudor, E., Kemény, Z., Egri, P., & Monostori, L. (2006). The RFID
Technology and Its Current Applications. Proceeding of The Modern
Information Technology in the Innovation Process of the Industrial
Enterprise-MITIP, pp. 29–36.
Intanagonwiwat, C., Govindan, R., & Estrin, D. (2000). Directed Diffusion:
A Scalable and Robust Communication Paradigm for Sensor Networks.
Proceedings of the ACM Mobi- Com’00, pp. 56–67. Boston.
Isermann, R. (2005). Model-based fault-detection and diagnosis – status and
applications. Annual Reviews in Control, Vol. 29, No. 1, pp. 71–85.
doi:10.1016/j.arcontrol.2004.12.002.
Jain, A. K., Murty, M. N., & Flynn, P. J. (1999). Data clustering: a review.
ACM Computing Surveys, Vol. 31, No. 3, pp. 264–323.
doi:10.1145/331499.331504.
Jansen, A., & Niyogi, P. (2005). A Geometric Perspective on Speech Sounds,
pp. 1–50.
Jardine, A. K. S., Lin, D., & Banjevic, D. (2006). A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mechanical Systems and Signal Processing, Vol. 20, No. 7, pp. 1483–1510.
Loucks, D. P., van Beek, E., Stedinger, J. R., Dijkman, J. P. M., & Villars,
M. T. (2005). Water Resources Systems Planning and Management:
An Introduction to Methods, Models and Applications, p. 690. Paris:
UNESCO.
Lovbjerg, M., & Krink, T. (2002). Extending particle swarm optimisers with
self-organized criticality. Proceedings of the 2002 Congress on
Evolutionary Computation. CEC’02 (Cat. No.02TH8600), Vol. 2, pp.
1588–1593. IEEE. doi:10.1109/CEC.2002.1004479.
Mallat, S. G. (1989). A theory for multiresolution signal decomposition: the
wavelet representation. IEEE Transactions on Pattern Analysis and
Machine Intelligence, Vol. 11, No. 7, pp. 674–693.
doi:10.1109/34.192463.
Markard, J., & Petersen, R. (2009). The offshore trend: Structural changes
in the wind power sector. Energy Policy, Vol. 37, No. 9, pp. 3545–
3556. doi:10.1016/j.enpol.2009.04.015.
Markou, M., & Singh, S. (2003). Novelty detection: a review–part 2: neural
network based approaches. Signal Processing, Vol. 83, No. 12, pp.
2499–2521. doi:10.1016/j.sigpro.2003.07.019.
Marquez, A. C. (2007). The Maintenance Management Framework, p. 340.
Marseguerra, M., Zio, E., & Podofillini, L. (2002). Condition-based
maintenance optimization by means of genetic algorithms and Monte
Carlo simulation. Reliability Engineering & System Safety, Vol. 77, No.
2, pp. 151–165. doi:10.1016/S0951-8320(02)00043-1.
Marwala, T. (2012). Condition Monitoring Using Computational
Intelligence Methods. London: Springer London.
Marwali, M. K. C., & Shahidehpour, S. M. (2000). Coordination between
long-term and short-term generation scheduling with network
constraints. IEEE Transactions on Power Systems, Vol. 15, No. 3, pp.
1161–1167. doi:10.1109/59.871749.
Marzi, H. (2004). Real-time fault detection and isolation in industrial
machines using learning vector quantization. Proceedings of the
Institution of Mechanical Engineers, Part B: Journal of Engineering
Manufacture, Vol. 218, No. 8, pp. 949–959.
doi:10.1243/0954405041486109.
May, R., Dandy, G., & Maier, H. (2011). Review of Input Variable
Selection Methods for Artificial Neural Networks. In K. Suzuki (Ed.),
Artificial Neural Networks - Methodological Advances and Biomedical
Applications. InTech. doi:10.5772/16004.
Roussel, S., Forsberg, G., Steinmetz, V., Grenier, P., & Bellon-Maurel, V.
(1998). Optimisation of electronic nose measurements. Part I:
Methodology of output feature selection. Journal of Food Engineering,
Vol. 37, No. 2, pp. 207–222. doi:10.1016/S0260-8774(98)00081-8.
Ruhanne, A., Hanhikorpi, M., Bertuccelli, F., Colonna, A., Malik, W.,
Ranasinghe, D., López, T. S., et al. (2008). Sensor-enabled RFID Tag
Handbook, pp. 1–47.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning Internal
Representations by Error Propagation. In D. E. Rumenhart & J. L.
McCelland (Eds.), Parallel Distributed Processing: Explorations in the
Microstructure of Cognition, pp. 318–362. Cambridge: MIT Press.
Saravanan, N., Kumar Siddabattuni, V. N. S., & Ramachandran, K. I.
(2008). A comparative study on classification of features by SVM and
PSVM extracted using Morlet wavelet for fault diagnosis of spur bevel
gear box. Expert Systems with Applications, Vol. 35, No. 3, pp. 1351–
1366. doi:10.1016/j.eswa.2007.08.026.
Saravanan, N., & Ramachandran, K. I. (2010). Incipient gear box fault
diagnosis using discrete wavelet transform (DWT) for feature extraction and classification using artificial neural network (ANN).
Expert Systems With Applications, Vol. 37, No. 6, pp. 4168–4181.
doi:10.1016/j.eswa.2009.11.006.
Satoh, T., & Nara, K. (1991). Maintenance scheduling by using simulated
annealing method (for power plants). IEEE Transactions on Power
Systems, Vol. 6, No. 2, pp. 850–857. doi:10.1109/59.76735.
Saxena, A., & Vachtsevanos, G. (2005). A methodology for analyzing vibration data from planetary gear systems using complex Morlet wavelets. Proceedings of the 2005 American Control Conference, Vol. 2, pp. 4730–4735. IEEE. doi:10.1109/ACC.2005.1470743.
Schölkopf, B., Burges, C. J. C., & Smola, A. J. (1999). Advances in Kernel
Methods: Support Vector Learning. Cambridge, MA: The MIT Press.
Schubert, U., Kruger, U., Arellano-Garcia, H., de Sá Feital, T., & Wozny, G.
(2011). Unified model-based fault diagnosis for three industrial
application studies. Control Engineering Practice, Vol. 19, No. 5, pp.
479–490. doi:10.1016/j.conengprac.2011.01.009.
Secrest, B. R., & Lamont, G. B. (2002). Visualizing particle swarm
optimization - Gaussian particle swarm optimization. Proceedings of
the 2003 IEEE Swarm Intelligence Symposium. SIS’03 (Cat.
No.03EX706), pp. 198–204. IEEE. doi:10.1109/SIS.2003.1202268.
Seker, S., & Ayaz, E. (2003). Feature extraction related to bearing damage
in electric motors by wavelet analysis. Journal of the Franklin Institute,
Vol. 340, No. 2, pp. 125–134. doi:10.1016/S0016-0032(03)00015-2.
Shakhnarovich, G., Darrell, T., & Indyk, P. (2005). Nearest-Neighbor
Methods in Learning and Vision: Theory and Practice. Cambridge,
MA: MIT Press.
Shenoy, D. B., & Bhadbury, B. (1998). Maintenance resources management:
Adapting MRP. London: Taylor & Francis.
Shi, Y., & Eberhart, R. C. (2001). Fuzzy adaptive particle swarm
optimization. Proceedings of the 2001 Congress on Evolutionary
Computation (IEEE Cat. No.01TH8546), Vol. 1, pp. 101–106. IEEE.
doi:10.1109/CEC.2001.934377.
Shibata, K., Takahashi, A., & Shirai, T. (2000). Fault Diagnostics of
Rotating Machinery Through Visualization of Sound Signals.
Mechanical Systems and Signal Processing, Vol. 14, No. 2, pp. 229–
241. doi:10.1006/mssp.1999.1255.
Shinde, A. D. (2004). A Wavelet Packet Based Sifting Process and Its
Application for Structural Health Monitoring. Worcester Polytechnic
Institute.
Si, X., Wang, W., Hu, C., & Zhou, D. (2011). Remaining useful life
estimation – A review on the statistical data driven approaches.
European Journal of Operational Research, Vol. 213, No. 1, pp. 1–14.
doi:10.1016/j.ejor.2010.11.018.
Siegelmann, H. T., & Sontag, E. D. (1994). Analog computation via neural
networks. Theoretical Computer Science, Vol. 131, No. 2, pp. 331–360.
doi:10.1016/0304-3975(94)90178-3.
Silva, C. W. De. (1989). Control sensors and actuators. Englwood Cliff, NJ:
Prentice Hall.
Sindhwani, V., Niyogi, P., & Belkin, M. (2005). Beyond the point cloud:
from transductive to semi-supervised learning. Proceeding of
ICML ’05, pp. 824–831. New York.
Smola, A. J., & Scholkopf, B. (2003). A Tutorial on Support Vector
Regression. Statistics and Computing, Vol. 14, No. 3, pp. 199–222.
Soman, K. P., & Ramachandran, K. I. (2005). Insight into Wavelets from
Theory to Practice, 2nd ed., p. 404. India: PHI Learning Pvt. Ltd.
Soman, R. R., Davidson, E. M., McArthur, S. D. J., Fletcher, J. E., &
Ericsen, T. (2012). Model-based methodology using modified sneak
Valle, Y., Venayagamoorthy, G. K., & Harley, R. G. (2008). Particle Swarm Optimization: Basic Concepts, Variants and Applications in Power Systems. IEEE Transactions on Evolutionary Computation, Vol. 12, No. 2, pp. 171–195.
Van den Kerkhof, P., Gins, G., Vanlaer, J., & Van Impe, J. F. M. (2012).
Dynamic model-based fault diagnosis for (bio)chemical batch
processes. Computers & Chemical Engineering, Vol. 40, pp. 12–21.
doi:10.1016/j.compchemeng.2012.01.013.
Vasudevan, R. (1985). Application of reliability-centered maintenance to
component cooling-water system at Turkey Point Units 3 and 4, EPRI
NP-4271.
Verma, A., & Kusiak, A. (2012). Fault Monitoring of Wind Turbine
Generator Brushes: A Data-Mining Approach. Journal of Solar Energy
Engineering, Vol. 134, No. 2, pp. 021001. doi:10.1115/1.4005624.
Walker, I. (1987). Development of a maintenance program. Proceedings of
the 14th Inter-RAM Conference. Toronto.
Wang, C., Kang, Y., Shen, P., Chang, Y., & Chung, Y. (2010). Applications
of fault diagnosis in rotating machinery by using time series analysis
with neural network. Expert Systems with Applications, Vol. 37, No. 2,
pp. 1696–1702. doi:10.1016/j.eswa.2009.06.089.
Wang, C., Zhang, Y., & Zhong, Z. (2008). Fault Diagnosis for Diesel Valve
Trains based on Time–frequency Images. Mechanical Systems and
Signal Processing, Vol. 22, No. 8, pp. 1981–1993.
doi:10.1016/j.ymssp.2008.01.016.
Wang, D. D. (1996). Computational intelligence based machine fault
diagnosis. Proceedings of the IEEE International Conference on
Industrial Technology (ICIT’96), pp. 465–469. IEEE.
doi:10.1109/ICIT.1996.601632.
Wang, H., Song, Z., & Wang, H. (2002). Statistical process monitoring
using improved PCA with optimized sensor locations. Journal of
Process Control, Vol. 12, pp. 735–744.
doi:http://dx.doi.org/10.1016/S0959-1524(01)00048-8.
Wang, K. (2002). Intelligent Condition Monitoring and Diagnosis Systems.
Amsterdam: IOS Press.
Wang, K. (2005). Applied Computational Intelligence in Intelligent
Manufacturing Systems. Australia: Advanced Knowledge International
Pty Ltd.
Wang, K., & Zhang, Z. (2010). Intelligent Fault Diagnosis and Prognosis
systems (IFDPS) for Condition-based Maintenance, pp. 1–21.
Trondheim.
Wang, K., & Zhang, Z. (2012). Application of Radio Frequency
Identification (RFID) to Manufacturing, pp. 1–24. Trondheim.
Wang, Y., & Handschin, E. (2000). A new genetic algorithm for preventive
unit maintenance scheduling of power systems. International Journal
of Electrical Power & Energy Systems, Vol. 22, No. 5, pp. 343–348.
doi:10.1016/S0142-0615(99)00062-9.
Watson, I., & Marir, F. (2009). Case-based reasoning: A review. The
Knowledge Engineering Review, Vol. 9, No. 04, pp. 327.
doi:10.1017/S0269888900007098.
Wei, L., & Keogh, E. (2006). Semi-supervised time series classification.
Proceedings of the 12th ACM SIGKDD international conference on
Knowledge discovery and data mining - KDD ’06, p. 748. New York,
New York, USA: ACM Press. doi:10.1145/1150402.1150498.
Wei, X., & Pan, H. (2010). Particle Swarm Optimization and Intelligent
Fault Diagnosis. Beijing: National Defence Industry Press.
White, J., Kauer, J. S., Dickinson, T. a, & Walt, D. R. (1996). Rapid analyte
recognition in a device based on optical sensors and the olfactory
system. Analytical chemistry, Vol. 68, No. 13, pp. 2191–202.
doi:10.1021/ac9511197.
White, R. M. (1987). A sensor classification scheme. IEEE transactions on
ultrasonics, ferroelectrics, and frequency control, Vol. 34, No. 2, pp.
124–6.
Wilson, A. (2002). Asset Maintenance Management: A Guide to Developing
Strategy and Improving Performance. New York: Industrial Press, Inc.
Wilson, D. M., & DeWeerth, S. P. (1995). Odor discrimination using
steady-state and transient characteristics of tin-oxide sensors. Sensors
and Actuators B: Chemical, Vol. 28, No. 2, pp. 123–128.
doi:10.1016/0925-4005(95)80036-0.
Wireman, T. (1990). World Class Maintenance Management. New York:
Industrial Press.
Worden, K., & Burrows, A. P. (2001). Optimal sensor placement for fault
detection. Engineering Structures, Vol. 23, pp. 885–901.
Wu, J., & Chen, J.-C. (2006). Continuous wavelet transform technique for fault signal diagnosis of internal combustion engines. NDT & E International.
Yen, G. G., & Lin, K. (1999). Conditional health monitoring using vibration
signatures. Proceedings of the 38th IEEE Conference on Decision and
Control (Cat. No.99CH36304), Vol. 5, pp. 4493–4498. IEEE.
doi:10.1109/CDC.1999.833249.
Yen, G. G., & Lin, K. (2000). Wavelet Packet Feature Extraction for
Vibration Monitoring, Vol. 47, No. 3, pp. 650–667.
Yoshimoto, K., Yasuda, K., Yokoyama, R., & Cory, B. J. (1993).
Decentralized Hopfield neural network applied to maintenance
scheduling of generating units in power systems. Third International
Conference on Artificial Neural Networks, 1993, pp. 277–281.
Brighton.
Yu, D., Yang, Y., & Cheng, J. (2007). Application of Time–frequency
Entropy Method based on Hilbert–Huang Transform to Gear Fault
Diagnosis. Measurement, Vol. 40, No. 9-10, pp. 823–830.
doi:10.1016/j.measurement.2007.03.004.
Yuan, J. (2012). Manifold Assumption and Semi-Supervised Learning for
Fault Diagnosis. Data Mining for Zero-Defect Manufacturing, pp. 133–
148. Trondheim: Tapir Academic Press.
Zaher, A., McArthur, S. D. J., Infield, D. G., & Patel, Y. (2009). Online
wind turbine fault detection through automated SCADA data analysis.
Wind Energy, Vol. 12, No. 6, pp. 574–593. doi:10.1002/we.319.
Zhang, Z., & Kusiak, A. (2012). Monitoring Wind Turbine Vibration Based
on SCADA Data. Journal of Solar Energy Engineering, Vol. 134, No.
2, pp. 021004. doi:10.1115/1.4005753.
Zhang, Z., & Wang, K. (2010). Application of Improved Discrete Particle
Swarm Optimization (IDPSO) in Generating Unit Maintenance
Scheduling. In K. Wang, O. Myklebust, & D. Tu (Eds.), International
Workshop of Advanced Manufacturing and Automation (IWAMA2010),
pp. 79–86. Shanghai: Tapir Academic Press.
Zhang, Z., & Wang, K. (2011). Fault Isolation Using Self-organizing Map
(SOM) ANNs. IET International Conference of Wireless Mobile &
Computing, pp. 425–431. Shanghai: Institute Engineering and
Technology.
Zhang, Z., & Wang, K. (2013). Dynamic Condition-Based Maintenance
Scheduling Using Bee Colony Algorithm (BCA). In E. Qi, J. Shen, &
R. Dou (Eds.), Proceedings of International Asia Conference on
Industrial Engineering and Management Innovation (IEMI2012), pp.
1607–1618. Berlin, Heidelberg: Springer Berlin Heidelberg.
doi:10.1007/978-3-642-38445-5_169.
Zhang, Z., Wang, Y., & Wang, K. (2012). Fault diagnosis and prognosis
using wavelet packet decomposition, Fourier transform and artificial
neural network. Journal of Intelligent Manufacturing.
doi:10.1007/s10845-012-0657-2.
Zhang, Z., Wang, Y., & Wang, K. (2013). Intelligent fault diagnosis and
prognosis approach for rotating machinery integrating wavelet
transform, principal component analysis, and artificial neural networks.
The International Journal of Advanced Manufacturing Technology,
Vol. 68, No. 1-4, pp. 763–773. doi:10.1007/s00170-013-4797-0.
Zheng, H., Li, Z., & Chen, X. (2002). Gear Fault Diagnosis Based on
Continuous Wavelet Transform. Mechanical Systems and Signal
Processing, Vol. 16, No. 2-3, pp. 447–457.
doi:10.1006/mssp.2002.1482.
Zheng, Y., Tay, D. B. H., & Li, L. (2000). Signal extraction and power
spectrum estimation using wavelet transform scale space filtering and
Bayes shrinkage. Signal Processing, Vol. 80, No. 8, pp. 1535–1549.
doi:10.1016/S0165-1684(00)00054-2.
Zou, M., Dayan, J., & Green, I. (2000). Dynamic simulation and monitoring
of a non-contacting flexibly mounted rotor mechanical face seal.
Proceedings of the Institution of Mechanical Engineers, Part C:
Journal of Mechanical Engineering Science, Vol. 214, No. 9, pp.
1195–1206. doi:10.1243/0954406001523632.