PREDICTIVE MAINTENANCE MODEL OF ROTATING MACHINERY USING AI

Anar Ahadov
Master Thesis
Spring 2024
Modern Software and Computing Solutions
Oulu University of Applied Sciences
ABSTRACT

Oulu University of Applied Sciences


Modern Software and Computing Solutions

Author(s): Anar Ahadov


Title of the thesis: Predictive Maintenance Model of Rotating Machinery Using AI
Thesis examiner(s): Manne Hannula
Term and year of thesis completion: Spring 2024 Pages: 92

The primary objective of this thesis is to formulate a predictive maintenance framework capable of effectively monitoring and analysing the operational status of diverse rotating machinery. This framework is designed to promptly alert operational personnel regarding the deterioration of machine components, enabling them to anticipate maintenance requirements and plan accordingly.

In contemporary industrial contexts, the practice of condition-based maintenance has gained widespread adoption, particularly in the realm of rotating machinery. Employing an array of sensors and programmable logic controllers, diverse signals encompassing parameters such as vibration, temperature, speed, and pressure are systematically captured. Nonetheless, the conclusive assessment of machine health frequently relies on the expertise of rotating machinery engineers, who scrutinize trends and provide recommendations for component inspections to ascertain instances of degradation.

Industry paradigms are rapidly evolving, echoing the principles of Industry 4.0, which champion the creation of fully automated industrial ecosystems through the symbiotic integration of machine learning and the Internet of Things (IoT). The adept utilization of refined machine learning algorithms empowers proficient decision-making processes, enabling proactive maintenance planning well in advance. This confluence of advanced technologies heralds a transformative era in industrial practices, ushering in unprecedented efficiency and foresight in maintenance strategies.

This thesis is dedicated to the intricate analysis of diverse maintenance methodologies, accentuating the juxtaposition between condition-based monitoring and predictive maintenance strategies. The narrative meticulously explores the distinctive attributes, advantages, and limitations inherent to each approach. A crucial focal point of this research is the development of a predictive maintenance model using AI, fuelled by insights gained from open data sources to train machine learning algorithms. This innovative approach seeks to advance the understanding and application of predictive maintenance within the broader landscape of artificial intelligence-driven maintenance practices.

Keywords: Predictive maintenance, Machine learning, Industry 4.0, Condition monitoring.

CONTENTS

1 INTRODUCTION
2 LITERATURE REVIEW
3 MAINTENANCE
   3.1 Preventive maintenance
   3.2 Corrective maintenance
4 CONDITION MONITORING
   4.1 Vibration Analysis
      4.1.1 Evaluating Overall Vibration
      4.1.2 Vibration sensors
      4.1.3 Mounting location of vibration transducers
   4.2 Temperature Monitoring
      4.2.1 Infrared temperature monitoring
5 MACHINE LEARNING
   5.1 Supervised Learning
   5.2 Unsupervised Learning
   5.3 Deep Learning
   5.4 Long short-term memory (LSTM)
6 PREDICTIVE MAINTENANCE
   6.1 Datasets
      6.1.1 Exploring Open Datasets
      6.1.2 Microsoft Azure Predictive Maintenance Dataset
   6.2 Creation of PdM
      6.2.1 Data Collection and Preparation
      6.2.2 Model Selection and Training
      6.2.3 Model Evaluation and Validation
      6.2.4 Classification Metrics
      6.2.5 Regression Evaluation Metrics
7 CONCLUSION
REFERENCES

1 INTRODUCTION

Industries heavily rely on essential machinery for tasks ranging from power generation to fluid compression and pumping, as well as driving various processes. In these operational environments, machinery breakdowns are not merely an inconvenience; they have the potential to result in catastrophic consequences. The financial burden of repair expenses, substantial as they may be, pales in comparison to the partial or complete production standstills that can incur losses amounting to millions of dollars per day. These losses can make the difference between a profitable year and a loss-making one. Given the high stakes involved, mechanical condition monitoring is not merely a suggestion; it is an absolute necessity.

The continuous surveillance of critical asset parameters, including vibration, temperature, speed, and an array of other condition indicators, has proven itself as a reliable method for anticipating and averting mechanical failures. This method's effectiveness has been validated in tens of thousands of industrial facilities worldwide, yielding tangible benefits such as heightened protection against catastrophic failures, improved machinery reliability and availability, fewer disruptions in processes, enhanced planning for maintenance and outages, reduced costs associated with maintenance and repairs, longer intervals between outages, and even a potential decrease in insurance premiums.

Maintenance is a crucial part of running any facility. When parts of equipment wear down, it can lead to breakdowns, causing the facility to stop working, lose production, and even create safety risks. That's why regularly checking and maintaining equipment is so important (Lucas Brito 2022).

However, companies are often reluctant to spend excessive money on maintenance. To balance this, a smart maintenance plan is very effective: it helps control the condition of the facility, predicts how much money is needed for maintenance, plans when to pause production, and manages resources and spare parts.

Nowadays, many companies use two main types of maintenance plans: regular check-ups and checking only when needed based on equipment condition. These are helpful but need a lot of resources. Rotating machinery engineers look at trends and suggest when to check machines. After this, they plan and procure the needed parts. This process can be difficult due to time, resource, and budget limits.

This is where predictive maintenance comes in, which uses machine learning. A well-trained model looks at data from sensors, how equipment operates, and its conditions. Then it predicts when the equipment will need maintenance. Predictive maintenance is a smarter approach than traditional preventive maintenance. Instead of fixing things before they break, predictive maintenance pinpoints when something might fail and fixes it just in time, saving money and avoiding downtime.

Manufacturing has changed a lot over time, from water-powered factories to electricity-powered mass production to computer-controlled factories. Now a new era has started, called Industry 4.0, which is like a digital makeover for manufacturing (Lu 2017).

Industry 4.0 is all about creating links between machines, controllers, sensors, and HMIs to make a smart network that talks to each other. This makes factories more flexible, efficient, and able to make customized products. Industry 4.0 uses lots of new technologies like RFID tags, ERP systems, cloud computing, the Internet of Things, and social product development (Lu 2017). These technologies all work together to make manufacturing better. Implementing Industry 4.0 may require some investment in new tools and data processing solutions, but the long-term benefits are worth it. These include more stable production, lower maintenance costs, and fewer unexpected breakdowns (Das, Das and Birant 2023).

The interconnected benefits of PdM and Industry 4.0 result in a cascade of advantages for manufacturers. Proactively predicting failures and scheduling maintenance interventions eliminates unscheduled downtime, maximizing asset utilization and enhancing overall productivity. This, in turn, leads to improved profitability and reduced maintenance costs by preventing breakdowns and emergency repairs.

This thesis aims to build a robust predictive maintenance model using advanced artificial intelligence (AI) methods. The main goal is to improve maintenance practices, especially in the context of Industry 4.0. By using different machine learning techniques, the research aims to create a system that can predict and detect potential issues in industrial machinery. Integrating AI not only follows the principles of Industry 4.0 but also aims to make maintenance more efficient, reliable, and cost-effective. The thesis works towards connecting traditional methods with modern technologies, promoting a new way of making decisions based on data for better planning and improved industrial performance.

2 LITERATURE REVIEW

There are already several applications of predictive maintenance in different industries, but this work focuses on applications for rotating machinery. According to the review by Liu et al., the Naive Bayes classifier, k-NN, ANN, and SVM are the most commonly used machine learning algorithms. The k-NN (k-Nearest Neighbours) algorithm belongs to the instance-based learning category: it postpones the process of induction or generalization until after classification. Unlike k-means, which computes over the entire dataset, k-NN only focuses on the nearest neighbours. This approach reduces training time compared to eager-learning algorithms like ANN (Artificial Neural Networks) and Bayes nets, albeit demanding more computation time during classification (Ruonan Liu 2018).

In contrast, the Naive Bayes algorithm operates on a probabilistic approach, distinguishing itself
from other AI models. It assigns probabilities to an instance belonging to each class, instead of a
straightforward classification.

SVM (Support Vector Machine) excels in generalization and can even perform well with limited training data. Using kernel functions, it achieves appropriate nonlinear mapping, enabling the separation of data from multiple categories using a hyperplane. This characteristic enables SVM to achieve high classification accuracy in tasks like fault diagnosis and condition monitoring of rotating machinery.

ANN is modelled after the human brain's structure. This architecture allows it to approximate intricate non-linear functions with multiple inputs and outputs. By adapting its structure, ANN demonstrates commendable fault diagnosis capabilities in various rotating machinery applications (Das, Das and Birant 2023).

Deep learning emerges as a potent technique for automatic feature learning across multiple abstraction levels. It facilitates the direct learning of complicated input-to-output functions, eliminating the need for independent feature extractors. This quality is particularly advantageous for fault diagnosis of industrial rotating machinery. In broad terms, ANN, SVM, and deep learning techniques perform well where data frequently presents a high level of dimensionality and continuous characteristics. In contrast, naive Bayes and k-NN algorithms showcase superior performance when dealing with discrete features (Roberto M. Souza 2021).

3 MAINTENANCE

This section delves into traditional maintenance models, highlighting their merits and limitations. Maintenance practices involve the assessment of equipment conditions and functionalities. Maintenance can be categorized into several distinct approaches, including preventive maintenance, condition-based maintenance, predictive maintenance, and corrective maintenance.

3.1 Preventive maintenance

Even while equipment is operational, certain functions remain dormant until emergency scenarios
arise. Consequently, it becomes imperative to assess the functionality of these features during
maintenance routines. This genre of maintenance is termed "preventive maintenance," which is
commonly complemented by routine maintenance procedures. Typically, an annual preventive
maintenance schedule is instituted to ascertain the functionality of equipment without necessitating
the dismantling or inspection of internal components.

For more comprehensive evaluations, preventive maintenance cycles spanning intervals of three, five, six years, and so forth, are strategically planned within corporate frameworks. These temporal benchmarks often align with recommendations provided by manufacturers and documented manuals. Periodically, engineers might propose extensions or contractions of these cycles based on their observations of failures and perceived risks during routine maintenance activities. In rare instances, the execution of periodic maintenance might be postponed or rescheduled to accommodate production continuity. However, this decision is usually contingent upon a meticulous analysis conducted by both operational and maintenance personnel, with safety and risk considerations at the forefront of the decision-making process.

3.2 Corrective maintenance

Typically, preventive maintenance primarily focuses on upkeep tasks rather than extensive repairs, especially if the repair is intricate or substantial. Nevertheless, in instances where abnormal situations are detected during preventive maintenance activities, these occurrences are documented within maintenance systems as instances of "corrective maintenance." True to its name, corrective maintenance pertains to the rectification of malfunctioning equipment functions. This necessitates supplementary planning and requisitioning of materials if components require replacement.

Furthermore, corrective maintenance can be requisitioned at any point if anomalies or failures manifest during the production process. Ordinarily, the initial step in corrective maintenance involves an inspection aimed at verifying the equipment's functionality. Following this assessment, maintenance personnel scrutinize the operational status and subsequently propose either repairs or the replacement of equipment, contingent upon the nature of the identified issue.

4 CONDITION MONITORING

Condition monitoring serves as a fundamental predictive maintenance paradigm that, while not directly involved in executing maintenance decisions, assumes a pivotal role in precluding equipment failures. This model centres on the systematic acquisition of data through a diverse array of sensors, meticulously tracking equipment health and expeditiously alerting operators in anticipation of potential malfunctions.

Embracing the principles of Industry 4.0, the proposed predictive maintenance model integrates seamlessly into the broader landscape of fully automated industrial ecosystems. Within this paradigm, condition monitoring transcends its traditional role, becoming an integral component in the symbiotic relationship between machines and the Internet of Things (IoT). The systematic acquisition of data through a myriad of sensors aligns with the tenets of Industry 4.0, where the interconnectedness of machines facilitates the seamless flow of information. This interconnectedness allows for real-time data aggregation and analysis, enabling predictive maintenance decisions based on a broad understanding of equipment health. This thesis not only underscores the importance of condition monitoring but also positions it as a key enabler for the predictive maintenance strategies essential for the evolving landscape of Industry 4.0.

Operationalizing this approach entails a carefully orchestrated interplay between data aggregation, analysis, and decision-making. The data stream, garnered via an assortment of sensors, is channelled to a central controller, which undertakes the continuous scrutiny of this incoming information. Notably prominent within this suite of sensors are those dedicated to monitoring vibration, temperature, and pressure – a trinity of paramount parameters that form the foundational basis for evaluating equipment health (Eleonora, Fabio and Ilenia 2021).

The primary function of the controller extends beyond mere data collation; it assumes the mantle of assessing equipment conditions, identifying deviations from established baselines, and triggering real-time alerts in the presence of anomalies. This prompt communication empowers operators with timely insights, enabling them to take immediate action – whether through recalibrations to reinstate the process within operational thresholds or, if exigent, initiating equipment shutdowns to avert potential catastrophic repercussions.

Despite its apparent simplicity, this model's applications encompass a spectrum of intricacies. Condition monitoring finds its niche in the diagnosis of rotating machinery, encompassing compressors, pumps, turbines, and gearboxes. The fusion of insights gleaned from vibration, temperature, and pressure data – synergized with the expertise of rotating machinery engineers – facilitates the identification of compromised components and the extrapolation of remaining operational lifespans.

In summation, condition monitoring serves as the bedrock of proactive maintenance, capitalizing on real-time data to furnish a dynamic overview of equipment health. Although it does not independently dictate maintenance actions, it significantly empowers decision-makers with the information requisite for circumventing disruptions, mitigating downtime, and optimizing overarching maintenance strategies.

FIGURE 1. Condition Monitoring System (Siemens 2016)

There is a wide variety of analysis techniques in condition monitoring systems. The following sections cover some of them.

4.1 Vibration Analysis

Vibration, in the context of machinery, is a valuable source of information highlighting the condition of mechanical parts as they respond to internal and external forces. It serves as a crucial indicator of a machine's mechanical condition, with most issues in rotating machinery manifesting as unusual vibration patterns. Each mechanical problem generates a distinct vibration signature, making vibration analysis an indispensable diagnostic tool. When conducting vibration analysis, two essential aspects of the vibration signal come into focus: frequency and amplitude (Mais and Brady 2002).

Frequency reveals the rate at which vibration events, defined as one vibration cycle, occur within a specific time frame. Different types of faults often manifest at distinct frequencies, making frequency analysis a potent tool for fault identification.

Amplitude signifies vibration signal magnitudes. It directly correlates with the severity of a fault, with
higher amplitudes indicating more significant issues. Importantly, amplitude is assessed relative to
the normal vibration level of a properly operating machine.
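To make these two quantities concrete, the short Python sketch below (a minimal illustration on a synthetic signal, not real sensor data) computes a one-sided amplitude spectrum with NumPy's FFT; the peak locations indicate fault frequencies and the peak heights indicate severity:

import numpy as np

fs = 5000                                    # sampling rate in Hz (assumed)
t = np.arange(0, 1.0, 1 / fs)                # one second of samples

# Synthetic vibration: 25 Hz shaft rotation plus a weaker 120 Hz fault tone
signal = 1.0 * np.sin(2 * np.pi * 25 * t) + 0.3 * np.sin(2 * np.pi * 120 * t)

# One-sided amplitude spectrum and matching frequency axis
spectrum = np.abs(np.fft.rfft(signal)) * 2 / len(signal)
freqs = np.fft.rfftfreq(len(signal), 1 / fs)

# The two strongest peaks: frequency identifies the fault, amplitude its severity
for i in spectrum.argsort()[-2:][::-1]:
    print(f"{freqs[i]:6.1f} Hz, amplitude {spectrum[i]:.2f}")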

4.1.1 Evaluating Overall Vibration

In the domain of condition monitoring, an effective starting point often involves assessing a machine's overall vibration level, a practice known as trending. This entails measuring the total vibration energy in a specific frequency span. Deviations can be identified by gauging the overall rotor vibration and comparing it to ordinary values. An elevated vibration trend indicates that an abnormal condition has been observed; however, pinpointing the exact cause is the challenge.

Vibration is a crucial parameter for evaluating machine conditions like misalignment, imbalance, structural resonance, mechanical looseness, excessive bearing wear, and so on. To identify the contributing factor, understanding a vibration signal's signature is essential; it comprises two main components: frequency range and scale factors (Mais and Brady 2002).

The frequency range for overall vibration measurement can vary depending on the equipment or
be user selectable. While there may be ongoing discussions about the most appropriate frequency
range, consistency in measurement from the same range is essential.

To illustrate, consider the frequency range as a bucket placed on the ground during rainfall. Rainfall
collected in the bucket represents the defined frequency range, while rain falling outside the bucket
represents frequencies beyond that range.

Scale factors, which include Peak, Peak-to-Peak, Average, and RMS, determine how measurements are quantified. These factors are interrelated, and it is crucial to maintain consistency when comparing overall values. For example (figure 2):

The distance from zero to the highest point of the waveform is known as the peak value. The distance from the highest point to the lowest point is the peak-to-peak value, which measures the full amplitude swing. The average value signifies the waveform's mean amplitude, typically zero for pure sine waves but nonzero for others. The RMS (root mean square) value, slightly more complex, is the one most commonly used for waveform analysis (Mais and Brady 2002).

FIGURE 2. Scale Factors, Frequency and Amplitude
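As a cross-check of these definitions, the following Python sketch (a minimal example on an ideal sine wave, independent of any particular sensor) computes the four scale factors and confirms the classic relation RMS ≈ 0.707 × peak for a pure sine:

import numpy as np

t = np.linspace(0, 1, 10000, endpoint=False)
x = 2.0 * np.sin(2 * np.pi * 50 * t)      # 50 Hz sine wave with a peak of 2.0

peak = np.max(np.abs(x))                  # zero to the highest point
peak_to_peak = np.max(x) - np.min(x)      # highest point to the lowest point
average = np.mean(x)                      # approximately zero for a pure sine
rms = np.sqrt(np.mean(x ** 2))            # root mean square

print(peak, peak_to_peak, round(average, 6), round(rms, 3))  # 2.0 4.0 ~0.0 1.414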

4.1.2 Vibration sensors

One of the essential tools for machine condition monitoring is vibration sensors, as they provide
valuable information about the performance and condition of rotating machinery. By measuring
vibration levels, vibration sensors can help identify potential problems early on before they lead to
costly downtime or equipment failure.

There are three main types of vibration sensors: accelerometers, proximity probes, and seismic velocity transducers. Accelerometers measure acceleration, the rate of change of velocity. Proximity probes measure vibration displacement, the change in position of a machine component. Seismic velocity transducers measure vibration velocity, the rate of change of displacement.

The choice of vibration sensor depends on the specific application. For example, accelerometers
are typically used to monitor high-frequency vibrations, while proximity probes are better suited for
low-frequency vibrations.

Velocity measurement directly assesses the rate of displacement change, providing a crucial metric for evaluating the vibration levels and potential fatigue experienced by a mechanical system. Seismic velocity sensors can be used for these measurements, although accelerometers are more commonly employed due to better frequency response and cost-effectiveness. Accelerometer signals can be readily converted into velocity units using dedicated signal processing techniques.

Acceleration measurement delves deeper into the underlying forces driving vibration phenomena; the relationship is defined as force = mass × acceleration. Conversion between these measurement types is possible: for example, mathematical differentiation can be used to estimate velocity or acceleration from displacement. However, differentiation often introduces noise into the signal and is therefore rarely utilized. Instead, integration is the preferred method for converting acceleration to velocity and velocity to displacement, and it can be carried out accurately using simple electronic circuits or software. This flexibility is one of the primary reasons why accelerometers are the standard transducers for vibration measurements: their output can be readily converted into either velocity or displacement readings through a simple integration process (Muruganantham 2022).
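As a rough illustration of that conversion (a sketch assuming a clean, zero-mean acceleration signal; real measurements need detrending and filtering before integration), cumulative trapezoidal integration turns acceleration samples into velocity:

import numpy as np

fs = 1000                                     # sampling rate in Hz (assumed)
t = np.arange(0, 1.0, 1 / fs)
accel = 9.81 * np.sin(2 * np.pi * 10 * t)     # synthetic acceleration in m/s^2

# Cumulative trapezoidal integration: acceleration -> velocity in m/s
velocity = np.concatenate(([0.0], np.cumsum((accel[1:] + accel[:-1]) / 2) / fs))

# Applying the same step to velocity would yield displacement.
print(round(velocity.max(), 3))   # peak ~ 2 * 9.81 / (2 * pi * 10) ≈ 0.312 m/s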

Proximity probes stand as specialized sensors that measure the relative distance between a nearby surface and the probe tip. These probes operate on the principle of changes in electrical inductance, triggered by alterations in the gap between the target surface and the probe. A proximity pickup system comprises the sensor itself and a signal conditioner responsible for amplifying and filtering the induced signal. Proximity probes are particularly well-suited for measuring shaft radial or axial displacement. However, it is essential to ensure that the surface being measured is electrically conducting and that the gap contains a high-dielectric medium to avoid interference. Proximity probes can detect vibrations up to 10 kHz, but most machinery vibrations occur below 1000 Hz, primarily related to rotating speed, limiting diagnostic capabilities (Randall 2021).

Velocity seismic transducers, also known as seismic velocity pickups or velocity transducers, are vibration sensors that measure the rate of change of displacement, or velocity, of a vibrating object. They are specifically designed to detect low-frequency vibrations, typically within the range of 1 to 200 Hz. Their principle of operation relies on electromagnetic induction: a permanent magnet is positioned at the centre of a coil of wire. When the object vibrates, the coil and magnet move relative to each other, creating a fluctuating magnetic field. This fluctuating field induces a voltage in the coil, and the magnitude of this voltage is directly proportional to the velocity of the vibration.

Velocity seismic transducers offer several advantages over other types of vibration sensors, including accelerometers:

• High sensitivity to low frequencies: As mentioned earlier, velocity seismic transducers are particularly well-suited for monitoring low-frequency vibrations, which are often associated with critical machinery components like bearings and shafts.

• Ease of installation: Unlike accelerometers, which often require specialized mounting brackets, velocity seismic transducers can be directly attached to objects in horizontal or vertical orientations without complex installation procedures.

• Self-powered operation: Velocity seismic transducers do not require an external power source, making them ideal for applications where power access is limited.

However, velocity seismic transducers also have some limitations:

• Bulkiness: Compared to accelerometers, velocity seismic transducers tend to be larger and less compact, which may restrict their use in certain applications.

• Susceptibility to magnetic fields: Velocity seismic transducers can be affected by strong magnetic fields, so they should be used with caution in environments with nearby magnets or electrical equipment.

Despite these limitations, velocity seismic transducers remain valuable tools for vibration monitoring in a variety of applications, particularly in industrial settings where low-frequency vibrations are prevalent.

Here are some specific examples of how velocity seismic transducers are used:

• Monitoring bearing health: Vibration analysis using velocity seismic transducers can detect
early signs of bearing wear, such as excessive radial or axial movement, allowing for timely
maintenance intervention and preventing costly downtime.

• Analysing shaft imbalance: Velocity data can be used to assess shaft imbalance, a common source of vibrations that can lead to premature wear and machine failure.

• Troubleshooting motor issues: By monitoring vibration levels and patterns, engineers can
identify and diagnose problems with motors, such as loose bearings or worn brushes.

Overall, velocity seismic transducers play a crucial role in condition monitoring and predictive
maintenance strategies, helping to ensure the reliable operation of critical machinery and prevent
costly breakdowns. (Muruganantham 2022)

Accelerometers are versatile and valuable tools for measuring high-frequency vibrations. These are some of the reasons why accelerometers are well-suited for measuring high-frequency vibrations:

• High sensitivity: Accelerometers can measure very small accelerations, which is important for detecting high-frequency vibrations, whose displacements are small.

• Wide frequency range: Accelerometers can measure vibrations over a wide range of frequencies, from a few hertz to several thousand hertz. This makes them well-suited for a wide variety of applications.

• Small size: Accelerometers are relatively small and lightweight, which makes them easy to mount on machinery.

Some specific applications of accelerometers for measuring high-frequency vibrations:

• Monitoring the health of rotating machinery: Accelerometers can be used to detect early signs of bearing wear, looseness, and imbalance.

• Identifying structural defects: Accelerometers can be used to identify cracks, loose bolts, and other structural defects in bridges, buildings, and other structures.

• Monitoring the performance of automotive components: Accelerometers can be used to monitor the performance of engine components, such as pistons and valves.

The frequency response of accelerometers depends on the mounting method. These sensors are
suitable for detecting faults in rotating machinery, but the choice of mounting method is critical for
optimal performance (Mais and Brady 2002).

Piezoelectric accelerometers (figure 3) are a type of accelerometer that generates an electrical signal when subjected to sudden acceleration. They are well-suited for measuring shocks and vibrations because they are very sensitive and can respond quickly to changes in acceleration.

The sensing crystal in a piezoelectric accelerometer is made of a material such as quartz or lead zirconate titanate (PZT). When the crystal is subjected to force, it produces an electrical charge. The amount of charge produced is proportional to the force applied.

FIGURE 3. Piezoelectric Accelerometer (Fajar n.d.)

The output voltage from a piezoelectric accelerometer can be amplified and filtered to produce a clean signal that can be used to measure acceleration. Piezoelectric accelerometers are typically used in applications where high-frequency vibrations or shocks are present, such as in machinery condition monitoring, automotive safety, and blast protection (Omega n.d.).

Piezoresistive accelerometers are a type of accelerometer that measures acceleration by changing its resistance. They are less sensitive than piezoelectric accelerometers, but they are more durable and can be used to measure higher accelerations.

Piezoresistive accelerometers typically use a Wheatstone bridge circuit to measure resistance. The bridge consists of four resistors, and the output voltage from the bridge is proportional to the change in resistance. The output voltage can be amplified and filtered to produce a clean signal that can be used to measure acceleration.
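A quick numerical sketch of that proportionality, assuming a quarter-bridge configuration with one sensing element and the standard small-change approximation Vout ≈ Vex · ΔR / (4R):

def quarter_bridge_output(v_excitation, r_nominal, delta_r):
    # Approximate output of a Wheatstone quarter bridge for small delta_r
    return v_excitation * delta_r / (4 * r_nominal)

# 5 V excitation, 350-ohm bridge arms, 0.5-ohm change under acceleration
print(quarter_bridge_output(5.0, 350.0, 0.5))   # ~0.00179 V, i.e. about 1.8 mV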

Piezoresistive accelerometers are typically used in applications where high accelerations are pre-
sent, such as in vehicle crash testing, weapons testing, and industrial machinery safety.

Capacitive accelerometers are a type of accelerometer that measures acceleration by changing its capacitance. They are very common in smartphones and other consumer electronics because they are small, lightweight, and low power.

Capacitive accelerometers typically consist of two conductive plates and a diaphragm between them. The space between the plates is filled with a dielectric material, such as air or a polymer. When the diaphragm moves due to acceleration, the capacitance between the plates changes, and this change in capacitance is proportional to the acceleration.

Capacitive accelerometers are typically used in applications where low-frequency vibrations or low-g accelerations are present, such as in smartphones, wearable devices, and automotive applications. The output voltage from a capacitive accelerometer can be amplified and filtered to produce a clean signal that can be used to measure acceleration.

Triaxial accelerometers, as the name suggests, measure acceleration in three orthogonal directions: X, Y, and Z. This allows them to capture the full range of vibrations experienced by an object. By measuring acceleration in all three axes, triaxial accelerometers provide a more extensive view of the vibrational behaviour of a system.

The combination of three sensing elements, each oriented perpendicular to the others, enables triaxial accelerometers to accurately measure acceleration components in each direction. This is particularly useful in applications where vibrations are complex and may occur simultaneously in multiple directions.

Triaxial accelerometers find extensive applications in various fields, including:

• Structural Health Monitoring (SHM): Triaxial accelerometers are widely applied in SHM to record the vibrational behaviour of buildings, bridges, and other structures. By analysing vibration data, engineers can identify potential structural defects and assess the overall health of the structure.

• Turbine Operations: Triaxial accelerometers are crucial for monitoring the vibration levels
of turbines, which are critical components in power generation systems. By detecting early
signs of vibration anomalies, operators can prevent potential turbine failures and ensure
the efficient operation of power plants.

• High-Speed Machinery: High-speed machinery, such as engines and motors, generates significant vibrations that can impact their performance and lifespan. Triaxial accelerometers are used to monitor vibration levels in these machines, providing valuable insights into their operational health.

The ability of triaxial accelerometers to measure acceleration in three orthogonal directions makes them indispensable tools for a wide range of applications where accurate vibration analysis is essential (Omega n.d.).

4.1.3 Mounting location of vibration transducers

The mounting of vibration sensors is crucial for ensuring accurate measurements. When a sensor
is affixed to a machine, the internal vibrations of the machine induce vibrations in the sensor, which
are then detected by the internal electronics. An incorrectly mounted sensor can lead to spurious
vibrations unrelated to the machine's condition, rendering the vibration data useless.

The method used to mount a vibration sensor to a machine has a remarkable impact on the ability of sensors to accurately measure vibrations, especially at high frequencies. This is because the mounting method can introduce a mechanical low-pass filter that attenuates high-frequency vibrations.

The ideal mounting method for vibration sensors is to securely screw them directly to the machine's
surface. This method provides the most direct and precise transfer of vibration energy from the
machine to the sensor. However, screwing the sensor in place is often not practical or feasible,
especially for temporary or portable monitoring systems (Muruganantham 2022).

For such applications, alternative mounting methods can be used, such as direct stud mounting or the use of epoxy and cementing pads. These methods provide a more secure attachment than using adhesives or greases, but they may still introduce some loss of high-frequency vibration energy.

It is important to mount the vibration sensor as close as possible to the point of interest on the machine. This will minimize the amount of vibration energy that is lost due to transmission through the mounting material. Additionally, the sensor should not be mounted on thin sections, guards, cantilevers, or vibration-free areas (nodes), as these locations may not accurately reflect the vibrations of the machine itself (Muruganantham 2022).

Finally, it is crucial to prevent debris from accumulating between the sensor and the mounting surface. This debris can act as a dampening material and significantly reduce the sensor's ability to measure high-frequency vibrations (Muruganantham 2022).

FIGURE 4. Mounting methods of vibration transducers and frequency response

4.2 Temperature Monitoring

Temperature monitoring is a crucial facet of condition monitoring for rotating machinery. It is well understood that the temperature of rotating components within machines can escalate due to friction. Typically, under normal operating conditions, these parameters remain comfortably within the prescribed operational limits. However, in instances of abnormal operation or impending issues, temperature becomes a secondary parameter that exhibits rapid changes.

The utility of temperature monitoring lies in its ability to serve as a sentinel for the rotating parts of
machinery, offering insights into their condition. When temperatures surge beyond the established
operational thresholds, it often serves as an early indicator of potential problems such as friction-
induced wear and tear, imbalance, or internal component damage.

In essence, monitoring temperature alongside other parameters provides a holistic view of a machine's health, enabling timely intervention and proactive maintenance measures to avoid costly breakdowns and operational disruptions. This multifaceted approach to condition monitoring is indispensable in today's industrial landscape, where machine reliability and productivity are paramount.

Effective temperature monitoring plays a pivotal role in comprehensive condition monitoring strategies. It is a crucial indicator in evaluating the health and condition of machinery and components. Continuous monitoring of temperature variations helps to identify abnormal patterns or deviations, which are usually signs of an abnormal condition or malfunction. Temperature is a sensitive indicator of equipment stress, wear, or inadequate lubrication, which is essential in monitoring the condition of the machinery (S. Bagavathiappan 2013).

In addition to identifying potential issues, temperature monitoring is crucial for optimizing operational parameters. It allows for the early detection of overheating, enabling timely intervention to prevent critical failures. Moreover, temperature data contributes to predictive maintenance models, where machine learning analyses temperature trends to predict future behaviour and anticipate maintenance needs. Integrating temperature monitoring into the broader framework of condition monitoring enhances the ability to proactively manage and optimize the performance of industrial assets.

Traditional temperature measurement systems, typically employing thermocouples and RTDs, rely on physical contact for temperature readings but cannot provide a visual representation of the object being measured. Infrared thermography (IRT) emerges as a cutting-edge non-destructive testing (NDT) technique that revolutionizes temperature assessment. Unlike traditional methods, IRT allows for remote temperature measurement, providing a thermal image of a machine component without the need for physical contact. This innovative approach enhances the efficiency and scope of temperature monitoring, offering a comprehensive and visually insightful perspective on the thermal characteristics of the subject (S. Bagavathiappan 2013).

4.2.1 Infrared temperature monitoring

Infrared thermography (IRT) stands out as a non-contact method of temperature measurement, relying on the detection of infrared radiation emitted by a body. This approach utilizes the Stefan–Boltzmann law (formula 1) to derive the temperature of the target without direct physical contact. A key advantage of IRT is its minimal equipment requirements: the essential component is an infrared camera that records infrared thermal images.

q/A = εσT⁴ (1),

in which q is the energy emission rate (W), T is the absolute temperature (K), A is the emitting surface area (m²), σ is the Stefan–Boltzmann constant (σ = 5.676 × 10⁻⁸ W m⁻² K⁻⁴), and ε is the emissivity of the emitting surface for absolute temperature T and a fixed wavelength (S. Bagavathiappan 2013).
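As a worked example of formula 1 (a minimal sketch using the constant quoted above; the emissivity of 0.95 is an assumed value for a painted machine surface):

SIGMA = 5.676e-8    # Stefan-Boltzmann constant (W m^-2 K^-4), as quoted above

def emissive_power(temp_kelvin, emissivity):
    # Energy emission rate per unit area: q/A = eps * sigma * T^4, in W/m^2
    return emissivity * SIGMA * temp_kelvin ** 4

# A bearing housing at 80 °C (353.15 K) with an assumed emissivity of 0.95
print(round(emissive_power(353.15, 0.95), 1))   # ~838.7 W/m^2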

The simplicity of the setup contributes to the versatility and accessibility of IRT-based monitoring techniques. The key considerations in choosing an infrared camera revolve around performance parameters that significantly impact the quality and accuracy of thermal images. One such critical parameter is the spectral range, which determines the sensitivity of the camera to different wavelengths of infrared radiation. Appropriate spectral range selection is important for capturing relevant thermal information and ensuring the effectiveness of the IRT system in detecting temperature variations in diverse industrial applications.

FIGURE 5. Thermal imaging (Hitchcock 2004)

Thermal imaging emerges as an asset in predictive maintenance strategies, offering a non-intrusive and comprehensive approach to monitoring the health of industrial machinery. The application of thermal imaging in predictive maintenance involves capturing and analysing thermal patterns to detect early signs of potential issues, enabling proactive interventions before failures occur. By harnessing machine learning models, thermal images can be automatically processed, and patterns associated with specific fault modes can be identified.

Convolutional neural networks (CNNs) excel in image recognition tasks, making them well-suited for analysing thermal images and extracting relevant features indicative of machinery health. These models can learn to discern subtle temperature variations, thermal gradients, and hotspots that may signify impending faults. In contrast, recurrent neural networks (RNNs) are remarkable at sequential data processing, which makes them well-suited for analysing time-dependent thermal data. The integration of machine learning models with thermal imaging not only enhances the accuracy of fault detection but also enables the development of predictive maintenance models that leverage historical thermal data to forecast potential issues.

In practical terms, a predictive maintenance framework utilizing thermal imaging and machine learning involves regularly capturing thermal images of machinery, preprocessing the images for analysis, and training models to recognize thermal patterns associated with specific failure modes. The trained models can then be deployed for real-time monitoring, continuously analysing incoming thermal data to identify deviations from normal operating conditions. This proactive approach facilitates timely maintenance interventions, minimizes downtime, and extends the lifespan of industrial equipment.
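To make this pipeline tangible, the following minimal PyTorch sketch shows a CNN classifier for thermal images (an illustrative architecture with an assumed 64×64 grayscale input and two classes, healthy versus faulty; it is not the specific model developed later in this thesis):

import torch
import torch.nn as nn

class ThermalCNN(nn.Module):
    # Tiny CNN: two convolutional blocks followed by a linear head for 2 classes
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)                 # (N, 32, 16, 16) for a 64x64 input
        return self.classifier(x.flatten(1))

# One grayscale 64x64 thermal image -> logits for healthy / faulty
model = ThermalCNN()
logits = model(torch.randn(1, 1, 64, 64))
print(logits.shape)                          # torch.Size([1, 2])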

FIGURE 6. Brief process of CNN (D. Manno 2021)

5 MACHINE LEARNING

The core concept of machine learning revolves around enabling computers to extract patterns from
data, even when the underlying rules or relationships between inputs and outputs are not explicitly
defined.

In mathematical terms, the task is to find a function f that can take inputs x and accurately predict the outputs Y. However, what this function f looks like is unknown; if it were known, machine learning would not be needed, and the function would be used directly.

But here is the tricky part: there is always some level of error (denoted as e) in predictions, which
is not related to input data. This error can occur for various reasons, like missing information or
randomness in the data.
So, the prediction equation becomes (Brownlee 2019):

Y = f(x) + e (2),

in which e represents an irreducible error: no matter how sophisticated the machine learning model becomes, this error can never be eliminated. It is like trying to predict the exact weather a year from now; there will always be some uncertainty.

This is why machine learning is a challenging field. It is used to uncover hidden patterns in data and create a function f that can make accurate predictions. Machine learning algorithms are tools to tackle this complex task: they help learn and refine f from data, even when what f looks like is unknown in advance.

Machine learning models are primarily categorized into two broad categories: supervised and unsupervised learning. Supervised learning equips the model with labelled data, in which every input is paired with its corresponding output. This structured approach enables the model to establish intricate associations between input features and target outcomes. As a result, the trained model acquires the expertise to analyse unlabelled data and generate accurate predictions based on the patterns it has absorbed during training (Andreas Theissler 2021).

On the other hand, unsupervised learning operates without the luxury of labelled data. In this approach, the model autonomously explores the dataset without human intervention. By identifying patterns, structures, or relationships within the data, the unsupervised model develops its own algorithms and parameter settings. This method is particularly useful when the inherent structure of the data needs to be uncovered, often for tasks like dimensionality reduction, clustering of similar data, or anomaly detection (Garcia-Canadilla, et al. 2020).

In summary, supervised learning relies on labelled data for training, while unsupervised learning delves into data independently to uncover hidden insights and relationships. Both methods are invaluable in the realm of machine learning, offering diverse capabilities for solving real-world problems.

FIGURE 7. Machine learning hierarchy (Garcia-Canadilla, et al. 2020)

5.1 Supervised Learning

When delving into the realm of machine learning, the first concepts that often come to mind are classification and regression models. These two foundational approaches are important in shaping the landscape of predictive analytics. This section dissects their structures, outlines their diverse use cases, and highlights the advantages they offer.

Classification Models: At its core, a classification model is designed to assign labels to input data, thereby making predictions. Picture it as the digital equivalent of a sorting hat, determining which category an input belongs to. This process essentially partitions datasets into distinct classes, a partitioning driven by a myriad of parameters.

Binary Classification: One of the simplest forms of classification is binary classification, where the
model predicts one of two possible labels. For instance, think of a spam email filter deciding whether
an incoming email is spam or not. This is akin to a controller output, where the model's decision is
either a resounding "true" or "false." Binary classification finds applications in numerous scenarios,
including financial decisions like predicting whether to buy or sell an asset.

Multi-Class Classification: On the flip side, multi-class classification extends beyond the binary realm, encompassing scenarios where the model needs to predict from a palette of more than two classes. Imagine a model classifying objects in images into various categories, such as animals, plants, or vehicles. It is the go-to choice when dealing with diverse categories where one needs to discern which class best fits the input (Zhiqin Zhu 2023).

In essence, classification models serve as powerful tools for sorting and categorizing data, and
they find utility in diverse fields, from natural language processing to medical diagnosis. They offer
the ability to bring order to the chaos of data by organizing it into meaningful classes, making them
an indispensable asset in the machine learning toolkit.
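A compact scikit-learn sketch of a binary classifier (illustrative only, trained on a synthetic dataset; a multi-class problem uses the same API with more than two label values):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary problem, e.g. "will fail soon" versus "healthy"
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
print("first predictions:", clf.predict(X_te[:5]))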

FIGURE 8. Classification Model

Regression Models: If continuous values are required, a regression model is one of the most widely used machine learning models. What does a continuous value mean? If the task is to predict market prices or trends, quantities that vary continuously, then a regression model is a great option.

Linear regression stands out as one of the most fundamental and straightforward models in machine learning and statistics. It is usually the first choice for modelling relationships between variables due to its simplicity and interpretability.

In a linear regression model, the relationship between the input variables (predictors) and the output variable (response) can be represented as a straight line. This line is characterized by two parameters: the intercept (the point where the line intersects the Y-axis) and the slope (the rate at which the output variable changes with respect to changes in the input variable). These parameters are learned from the data (Oliveira Ewerton Cristhian Lima de 2022).

While linear regression assumes linearity, it is worth noting that not all relationships in the real world are truly linear. However, linear regression can still provide valuable insights and predictions in many cases, especially when the relationship is approximately linear or when a simple, interpretable model is sought.
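A minimal illustration with scikit-learn on synthetic data (the learned slope and intercept are the two parameters described above):

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data following y = 3x + 2 plus noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X.ravel() + 2 + rng.normal(0, 0.5, size=100)

model = LinearRegression().fit(X, y)
print("slope:", model.coef_[0], "intercept:", model.intercept_)  # ~3 and ~2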

Non-linear regression: When relationships are more complex and non-linear, that's when more
advanced techniques like non-linear regression or other machine learning models come into play.
Nonlinear regression is a powerful tool in statistical modelling and data analysis, breaking away
from the constraints of linear regression's straight-line assumption (typically represented as y = mx
+ b). Nonlinear regression allows for a more flexible, curved relationship between variables.

The main target of nonlinear regression is to identify the mathematical function that most precisely aligns with the provided data. This function is defined by parameters that are fine-tuned to minimize the discrepancies between the observed data points and the predicted points generated by the model.

This process involves selecting an appropriate nonlinear model, such as logarithmic functions, trigonometric functions, exponential functions, power functions, and more, depending on the relationships being modelled and the data structure. The model parameters are then adjusted iteratively using optimization techniques until the best-fit curve is obtained.

Nonlinear regression is a versatile tool that can capture complex relationships in data, making it useful in various fields, including science, engineering, economics, and biology. It allows researchers to describe and understand relationships that cannot be adequately represented by linear models.
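A short sketch of that iterative fitting with SciPy (an assumed exponential model fitted to synthetic data, purely illustrative):

import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    # Assumed exponential relationship y = a * exp(b * x)
    return a * np.exp(b * x)

x = np.linspace(0, 2, 50)
y = 2.0 * np.exp(1.3 * x) + np.random.default_rng(0).normal(0, 0.1, 50)

# Parameters are refined iteratively from the initial guess p0
params, _ = curve_fit(model, x, y, p0=(1.0, 1.0))
print("fitted a, b:", params)               # close to (2.0, 1.3)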

FIGURE 9. Regression Model

5.2 Unsupervised Learning

The algorithm is exposed to unlabelled and unclassified data and tasked with identifying patterns,
similarities, and differences within that data without prior guidance or training.

Unlike supervised learning, which relies on labelled data and a teacher to guide its learning process, unsupervised learning operates in a more autonomous manner. It delves into the inherent structure and relationships embedded within unlabelled data to extract meaning.

Clustering is one of the primary goals of unsupervised learning, which involves the identification of
natural groups or clusters within a dataset. This task involves grouping data points that share similar
characteristics together, effectively segmenting the data into distinct categories. By doing so, the
algorithm discovers hidden structures within the data without any external assistance. This can be
particularly useful for tasks like customer segmentation, anomaly detection, or data compression.

In unsupervised learning, the machine is essentially left to its own devices to uncover meaningful insights and relationships within raw, unlabelled data, revealing patterns that may not be apparent from simply looking at the data itself.

Clustering Model: Clustering is a powerful technique for exploring and organizing unlabelled data, enabling the discovery of relationships and hidden patterns within it. During the clustering process, similar data points are grouped together into clusters, forming distinct categories based on their shared characteristics. This process reveals the inherent structure and groupings within the data, providing valuable insights that may not be immediately apparent from simply observing the data itself (Andreas Theissler 2021).

In essence, clustering is a technique for organizing and categorizing data based on inherent pat-
terns of similarity and dissimilarity. It is a way of grouping together objects or data points that share
common characteristics or properties. This process allows for the identification of underlying struc-
tures within the data, making it easier to gain insights and draw conclusions.

Clustering is a versatile machine learning technique that finds widespread applications in various domains, such as image recognition, recommendation systems, and anomaly detection. It can be used to uncover hidden patterns, relationships, and outliers, which is important for data scientists, analysts, and decision-makers across a wide range of industries.
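A minimal k-means sketch with scikit-learn (synthetic blobs standing in for, say, distinct machine operating regimes; illustrative only):

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Unlabelled data containing three hidden groups
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster sizes:", [int((kmeans.labels_ == k).sum()) for k in range(3)])
print("centroids:")
print(kmeans.cluster_centers_)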

Dimensionality reduction: Dimensionality reduction is a powerful technique for simplifying complicated data and extracting meaningful insights. Its primary objective is to streamline and enhance the interpretability of datasets by transforming them into lower-dimensional representations that retain the most essential and valuable information. In essence, dimensionality reduction bridges the gap between the intricate nature of high-dimensional data and the ability to comprehend and extract meaningful insights from it.

By reducing dimensionality, data scientists and analysts can achieve several benefits. High-dimen-
sional data can be challenging to work with and visualize. Dimensionality reduction simplifies the
dataset, making it more manageable and comprehensible. Algorithms and models often perform
better with fewer features, as they become less prone to overfitting and require less computational
resources. Irrelevant or noisy features can be a hindrance in data analysis. Dimensionality reduc-
tion helps filter out these unimportant variables. Lower-dimensional data is easier to visualize, al-
lowing for better insights and pattern recognition (Oliveira Ewerton Cristhian Lima de 2022).

FIGURE 10. ML Models

Dimensionality reduction is a versatile and powerful technique in machine learning and data anal-
ysis, aimed at addressing the challenges posed by datasets with many features or variables. As
datasets grow in complexity, the number of features may become unwieldy, leading to increased
computational demands and the risk of overfitting. Dimensionality reduction methods are employed
to mitigate these issues and extract the most salient information from the data while reducing its
dimensionality (Jiawei Han 2012).

To convert high-dimensional data into a lower-dimensional subspace while preserving the maxi-
mum variance, Principal Component Analysis (PCA) is widely used. This enables a more compact representation of the dataset which retains the important information.

Another widely used method is t-Distributed Stochastic Neighbour Embedding (t-SNE), particularly
effective in visualizing high-dimensional data and identifying clusters or groups of data points. t-
SNE emphasizes maintaining the local structure of data points, making it valuable for exploratory data
analysis and clustering.
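
A brief sketch of both techniques with scikit-learn is given below; the random high-dimensional data, the two-component targets, and the perplexity value are illustrative assumptions.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Synthetic high-dimensional data: 200 samples with 20 features
rng = np.random.default_rng(0)
high_dim = rng.normal(size=(200, 20))

# PCA: project onto the two directions of maximum variance
pca_2d = PCA(n_components=2).fit_transform(high_dim)

# t-SNE: nonlinear embedding that preserves local neighbourhoods
tsne_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(high_dim)

print(pca_2d.shape, tsne_2d.shape)  # both (200, 2)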

The benefits of dimensionality reduction extend beyond computational efficiency. It can enhance
model interpretability, reduce overfitting risks, and facilitate the visualization of data patterns. However, practitioners should carefully consider the specific characteristics of their dataset and analysis before choosing a dimensionality reduction method.

5.3 Deep Learning

Deep learning is a captivating subset of ML which draws inspiration from the remarkable complexity of
the human brain. In recent years, it has gained widespread recognition and success, particularly in
applications such as natural language processing, speech recognition, image recognition and au-
tonomous decision-making. Deep neural networks, the foundational elements of deep learning, are
constructed with multiple layers of interconnected nodes. These layers progressively extract in-
creasingly intricate and abstract features from input data.

FIGURE 11. Artificial Intelligence (SANGEETHA, Shree and Upadhyay 2019)

Imagine a neural network as a digital model that replicates the human brain's intricate network of
neurons. However, instead of simply processing data on a surface level, deep learning delves
deeper, peeling away layers of information to unveil concealed patterns, akin to a detective solving
a complex puzzle. These layers of artificial neurons constitute the essence of deep learning, with
each layer building upon the knowledge acquired by the preceding one, much like chapters in a
book revealing a larger narrative.

Within the realm of deep learning, each neuron serves as a fundamental building block. These
intricate units receive and analyse information, akin to individual musicians contributing their unique
notes to create a harmonious symphony. This collective orchestration of neurons culminates in the
network's ability to make predictions, classifications, or decisions, resembling the synergy of instru-
ments in an orchestra crafting a captivating musical composition.

In this symphony of computation, the input and output layers serve as the central stage. The input
layer acts as a diligent usher, receiving and preparing data for its journey through the neural net-
work. It serves as the starting point, where raw information enters the system. Conversely, the
output layer functions as a spotlight on this grand stage, delivering the final verdict: a prediction, classification, or decision based on the neural network's analysis (figure 12).

Deep learning is not solely about machines learning from data. It is the technology that powers self-
driving vehicles, facilitates real-time language translation, and fuels the imaginative creativity of AI-
generated art and music. Layer by layer, connection by connection, deep learning propels us into
a future where machines achieve a profound understanding of our complexities, revolutionizing the
way we interact with technology and unlocking new realms of innovation (Meriem Bahi 2018).

FIGURE 12. Neural Network

Neural networks are versatile across different applications. Each type has its own pros and cons. Some of them are described below.

Convolutional Neural Network (CNN) is like a superhero in the world of deep learning. It is famous
for being incredibly good at understanding and working with images. This superhero is made up of
different parts, and each part has its own important job in understanding what's in a picture.

The heart of a CNN is the convolutional layers. These layers are like a team of detectives who
carefully inspect every pixel in an image. They look for important things like lines, textures, and
shapes that make up the image. It is like they are uncovering hidden clues in a picture (figure 13).

But the story does not end there. The information from the convolutional layers goes on a journey
to the pooling layers. Here, it is like the data is getting refined, keeping only the most important stuff
while letting go of the less important details. This makes the data simpler and easier to work with.

Finally, after this adventure through the convolutional and pooling layers, the data is ready for the
big reveal. This is where the fully connected layers come into play. They are like the magicians who
take all the refined information and turn it into a final answer. They might say, "This picture shows
a cat!" or "This is a picture of a dog!" (Hoeser and Kuenzer 2020).

FIGURE 13. CNN

Recurrent Neural Network (RNN): Unlike traditional neural networks, RNNs possess a unique
ability to process sequential information by introducing a concept known as recurrence. This inherent capability positions them as powerful tools in domains where understanding temporal dependencies is critical, such as time-series prediction, natural language processing, and speech recognition (Donges 2023).

At the core of an RNN lies a network of interconnected nodes, each equipped with memory that
retains information about previous inputs. This memory mechanism enables RNNs to capture and
consider historical context when processing current inputs, fostering an understanding of sequen-
tial patterns. The recurrence feature allows information to persist within the network, addressing
the challenge of contextual awareness in tasks where the order of data matters (Saeed 2023).

Imagine information flowing like a river, calmly moving from one point to another. A feed-forward neural network and a Recurrent Neural Network (RNN) can be pictured as two vessels on this river.

In the feed-forward network, it is like a straightforward journey down a river. The information flows
in one direction, passing through different layers until it reaches its destination. However, this net-
work lacks memory—it does not remember where it came from, and it cannot predict what lies
ahead. It is like a traveller with no recollection of the path taken, just focusing on the current sur-
roundings.

In the RNN, however, the information takes a circular route, cycling through a loop. When the RNN makes a decision, it does not just consider the current input; it reflects on what it has learned from all the
inputs encountered along the way. It is akin to a seasoned traveller who not only appreciates the
current view but also recalls the entire journey, learning from each step taken.

These two images (figure 14) vividly illustrate the contrast in information flow. While the feed-for-
ward neural network moves linearly, the RNN forms a continuous loop, capturing the essence of
past experiences to make informed decisions for the future.

FIGURE 14. RNN vs Feed Forward Neural Network (Donges 2023)

Four types of RNN are available. One-to-one means there is only one input and one output. One-to-many has only one input and multiple outputs. Many-to-one has multiple inputs and one output. Finally, many-to-many has multiple inputs and multiple outputs (figure 15) (IBM n.d.).

FIGURE 15. Types of RNN

However, traditional RNNs have limitations, particularly in capturing long-term dependencies. This
challenge led to the evolution of more advanced architectures like Long Short-Term Memory
(LSTM) networks and Gated Recurrent Units (GRUs). These networks incorporate gating mechanisms that control the flow of information through the network, effectively mitigating the vanishing gradient problem and enabling them to capture long-term dependencies more effectively.

5.4 Long short-term memory (LSTM)

Two notable variants of RNN are widely known: Bidirectional Recurrent Neural Networks (BRNNs) and
Long Short-Term Memory (LSTM) networks. While BRNNs facilitate information flow in both direc-
tions, finding their applications predominantly in Natural Language Processing (NLP), our focus
here leans towards the LSTM variant.

Long short-term memory is an advanced version of RNN. It handles long-range dependencies more effectively than plain recurrent neural networks. It was introduced to solve the vanishing gradient issue, which often plagued traditional RNNs and made them struggle to learn from data points that were temporally distant from one another. The vanishing gradient problem is the issue where the gradient of the loss function with respect to the network weights becomes increasingly smaller as the gradient propagates through the network's layers.

Neural networks, with their complex architecture and intricate learning mechanisms, are typically trained using gradient descent algorithms. These algorithms minimize the difference between the predicted outputs and the actual target values (i.e., the loss function) by adjusting the network's parameters. Gradients provide crucial information about how much each parameter should be updated to reduce this difference. During training, gradients are computed by backpropagating the error from the output layer backward through the network. Backpropagation refers to the process of adjusting the weights within a neural network. To increase model reliability, proper tuning of the weights is required. This means computing gradients for each layer with respect to the loss function. The vanishing gradient issue is observed during backpropagation, when the gradients of certain layers are very close to zero. When gradients are close to zero, the network's weights are not updated significantly in those layers. As a result, these layers do not learn effectively from the training data (Olah 2015).

The vanishing gradient problem can lead to slow convergence during training, where the network
learns very slowly or not at all. It can also result in suboptimal models that cannot detect long-range
dependencies, which is an essential capability in many applications, like natural language pro-
cessing or time series prediction.

There are several methods to mitigate the vanishing gradient problem; one of the most effective solutions is to build the model on an LSTM architecture.

Recurrent neural networks (RNNs) are characterized by their sequential structure, consisting of
repeating modules. In traditional RNNs, each module is just a single tanh layer (figure 16).

FIGURE 16. Standard RNN model (Olah 2015)

Long Short-Term Memory networks, or LSTMs, maintain this sequential structure. However, their
repeating modules are distinct, featuring four interconnected layers that collaborate in a special
and complex manner.

FIGURE 17. LSTM model (Olah 2015)

In the diagram above (figure 17), each line serves as a conduit for complete vectors, transferring information from one node's output to another node's input. Element-wise operations are symbolised by pink circles, while the yellow boxes denote neural network layers that adapt and learn. Lines
coming together indicate combining information, while a line branching off represents the content
being duplicated and sent to multiple destinations.

The cell state is a crucial element of LSTMs, seen as the horizontal line running along the top of the model (figure 18). The cell state can be thought of as a conveyor belt: it runs through the whole LSTM with only a few changes. Information moves along it without getting distorted, like a smooth ride for data. These changes are controlled by a few mathematical operations (Staudemeyer and Morris 2019) (Liang Huang 2021).

FIGURE 18. LSTM conveyor and Sigmoid gate at the right (Olah 2015)

An LSTM can decide to keep or discard information from its memory bank, and this is all managed by gates. Gates act like guards, holding and releasing information. They are made of a sigmoid neural layer and a pointwise multiplication, which together decide how important each piece of information is. An output of zero means "keep nothing," and an output of one means "keep everything" (Olah 2015) (Aditianu1998 2023).

LSTMs use gates, which are like filters or controllers, to manage the flow of data and prevent long-
term dependency issues. Here is how they work step by step:

1. Forget Gate: helps the network forget unimportant information from the cell state. The decision is made by a sigmoid function based on the previous hidden state (h_(t-1)) and the current input (x_t) at time t. It produces a vector called the forget gate (f_t) with values between 0 and 1. These values decide what gets discarded (figure 19) (Yan Yan 2022).

f_t = σ(W_f · [h_(t-1), x_t] + b_f), (3)

FIGURE 19. Forget Gate (Olah 2015)

2. Input Gate: This step involves adding new information (x_t) to the cell state and updating it. It consists of two parts: a sigmoid layer that decides whether new information should be added (1) or ignored (0), and a tanh layer that assigns weights to the values to be added, determining their importance (between -1 and 1). They work together to update the cell state. To create the updated cell state (C_t), the new memory (i_t * C̃_t) is added to the old memory (C_(t-1)) (figure 20) (Yan Yan 2022) (Aditianu1998 2023).

i_t = σ(W_i · [h_(t-1), x_t] + b_i), (4)

C̃_t = tanh(W_C · [h_(t-1), x_t] + b_C), (5)

FIGURE 20. Input Gate (Olah 2015)

At this stage of the process, the focus is on transforming the past cell state C_(t-1) into the present cell state C_t. The forget gate f_t and the new candidate values C̃_t have already determined the course of action. Now, it is time to put the decisions into action (Yan Yan 2022). This involves two key operations:

Forgetting the Old Information: The old cell state C_(t-1) is multiplied elementwise by the forget gate f_t. This operation selectively retains or discards information based on the forget gate's decision. If f_t is close to 1, it means "completely keep this," and if it is close to 0, it means "completely get rid of this" (Yan Yan 2022).

Incorporating the New Information: The scaled new candidate values i_t * C̃_t are added to the product from the first step. This introduces the information deemed relevant for inclusion, with the scaling factor reflecting the decision on how much to update each state value (figure 21).

C_t = f_t * C_(t-1) + i_t * C̃_t, (6)

FIGURE 21. Input Gate (Olah 2015)

3. Output Gate: Finally, the output values (h_t) are derived using the output gate (o_t). Here is how it is done: a sigmoid layer acts as the decision maker, producing o_t, while a tanh layer creates new values from the cell state (C_t); the output of the sigmoid gate (o_t) is then multiplied with them. This multiplication results in the final output values (h_t) (Yan Yan 2022).

o_t = σ(W_o · [h_(t-1), x_t] + b_o), (7)

h_t = o_t * tanh(C_t), (8)

FIGURE 22. Output Gate (Olah 2015)

So, LSTMs are designed to selectively retain, update, and output information, and they use these
gates to ensure that only relevant data is passed along the memory chain. This is essential for
managing long sequences of data (Le 2019).

6 PREDICTIVE MAINTENANCE

Predictive maintenance is an innovative maintenance strategy which integrates condition monitoring and advanced analytics to forecast when maintenance should be performed on machinery or equipment. This approach leverages machine learning and sensor data analysis to predict maintenance needs without the need for continuous human intervention. Achieving a dependable predictive maintenance system involves selecting appropriate machine learning models, selecting a quality dataset, and creating a suitable testing environment.

Indeed, the world of machine learning offers a plethora of models, each with its own set of strengths
and weaknesses. The key to success lies in choosing the right model for the specific task at hand,
coupled with high-quality data.

In the realm of predictive maintenance, where the goal is to detect changes in the condition of
machines promptly, Recurrent Neural Networks (RNNs) emerge as a powerful choice. RNNs, as explored earlier, excel at learning from both current and past inputs, making them well-
suited for continuous monitoring and timely anomaly detection.

LSTMs are particularly compelling for predictive maintenance. These networks possess the re-
markable ability to capture and remember patterns over extended sequences of data, a trait inval-
uable when dealing with the intricacies of machine condition monitoring. With their capacity to retain
and leverage long-term dependencies, LSTMs can effectively detect subtle shifts in machine be-
haviour, providing a solid foundation for predictive maintenance efforts (Pemmada 2022).

6.1 Datasets

In the vast landscape of predictive maintenance, where the echoes of machinery resonate with the
hum of data, datasets emerge as the unsung heroes shaping the future of operational efficiency.
This chapter unfurls the intricate tapestry of our dataset, a canvas adorned with the rhythmic pul-
sations of telemetry, the suspenseful twists of errors, and the climactic moments of failures. As this exploration unfolds, the dataset becomes more than a collection of numbers; it transforms into a storyteller, narrating the saga of machines, their challenges, and the quest for longevity.

The journey begins with an examination of the dataset's composition—a melange of telemetry,
errors, maintenance records, failures, and the distinctive characteristics of each machine encapsu-
lated in metadata. Telemetry, with its hourly cadence, captures the heartbeat of machinery, while
errors and maintenance records chronicle the challenges and interventions that punctuate the op-
erational rhythm. Failures, the turning points in this narrative, embody the culmination of vulnera-
bilities and the resilience of the machines.

In the realm of metadata, machines unveil their identities, each with a unique story etched in model
types and ages. Yet, before the dataset steps into the limelight, it undergoes a metamorphosis.
Preprocessing becomes the chisel that sculpts raw data into a refined form, ready to be embraced
by the algorithms that will breathe life into predictive models.

As the sections of this chapter are navigated, the dataset transforms from a silent observer to an active participant in the quest for predictive maintenance excellence. Its nuances and intricacies set the stage for a journey that transcends traditional maintenance paradigms, ushering in an era where data becomes the compass guiding us through the labyrinth of machinery health. The tapestry of data is unravelled in search of insights that will illuminate the path to operational resilience and efficiency.

6.1.1 Exploring Open Datasets

In the vast landscape of data, open datasets shine like hidden treasures waiting to be discovered.
These datasets, generously shared with the world, hold valuable insights and opportunities for ex-
ploration. This chapter embarks on a journey to explore some of these open datasets, each a
unique piece of the data universe.

Our first stop is Mendeley, a repository of academic papers and datasets. Here, a wealth of knowledge contributed by researchers and scholars can be found. Kaggle, our next destination, is a
playground for data enthusiasts. It hosts diverse datasets and engaging competitions that challenge
and inspire.

Google Datasets is our final stop, a platform that opens the door to a wide array of publicly available
datasets. From images to text and beyond, Google Datasets offers a glimpse into the diverse
realms of data.

As these platforms are navigated, datasets ranging from the ordinary to the extraordinary will be uncovered. Whether it is predicting machine failures, understanding climate patterns, or delving into the world of natural language processing, each dataset presents an opportunity to learn, innovate, and contribute to the collective pool of knowledge.

The journey promises not only to expand our understanding of the world but also to inspire new ideas and discoveries. The treasure chest of open datasets and the potential it holds for advancing our understanding of the data-driven universe is discussed below.

Mechanical faults in rotating machinery dataset (normal, unbalance, misalignment, looseness) is one of the open datasets in the Mendeley data source. The dataset was meticulously crafted with the primary objective of tackling the classic faults often encountered in rotating machinery. These faults encompass a range of issues, including unbalance, misalignment, and mechanical looseness, along with a representation of the normal operating condition. To create a comprehensive and diverse dataset, various faults were deliberately introduced and simulated on a dedicated test bench (figure 23) (Lucas Brito 2022).

The faults introduced in this dataset span multiple components within the rotating machinery, providing a holistic view of potential issues that can arise in real-world scenarios. These components include the frequency inverter, motor, two distinct bearings, two pulleys, bearing house, the belt, and the rotor (disc). By incorporating these elements, the dataset offers a nuanced understanding of how each component's condition can impact the overall performance and health of the machinery (Lucas, et al. 2023).
FIGURE 23. Test bench

Researchers and practitioners can leverage this meticulously developed dataset to train and evaluate predictive maintenance models. It serves as a
valuable resource for testing the effectiveness of diagnostic algorithms in identifying and classifying
faults across these critical components. Moreover, the inclusion of normal operating conditions
allows for a baseline comparison, enabling the detection of anomalies and deviations from optimal
performance.

In essence, this dataset represents a vital tool in the realm of predictive maintenance, aiding in the
advancement of condition monitoring techniques and fault detection for rotating machinery. It em-
powers engineers and data scientists to proactively address issues, minimize downtime, and ex-
tend the lifespan of crucial industrial equipment.

The experimentation process involved a rigorous series of 20 tests, each thoughtfully designed to
encompass various operational conditions. These tests were evenly distributed across four distinct
conditions: normal operation, misalignment, unbalance, and mechanical looseness. In each of
these conditions, a set of five tests was meticulously executed, resulting in a comprehensive da-
taset for analysis and modelling (Lucas Brito 2022).

Each individual test was conducted with precision, involving the continuous collection of data
through accelerometers. Specifically, every test encompassed four distinct sets of data acquisition,
with each set comprising 420 signal readings. These signals were gathered continuously through-
out the testing process, with each file containing a substantial 25,000 data points.

The sequence of tests was intentionally randomized, reflecting the unpredictability and variability
often encountered in real industrial settings. To further emulate the dynamic nature of industrial
environments, the test bench returned to its normal operating condition before the commencement
of each test. This realistic approach allowed for the introduction of specific faults and variations,
closely mirroring the challenges faced in practical industrial scenarios.

In essence, the experimental procedure was carefully designed to bridge controlled laboratory
conditions and the complexities of industrial reality. This dataset not only serves as a valuable
resource for predictive maintenance but also as a testament to the commitment to accuracy and
authenticity in the pursuit of advancing fault detection and condition monitoring techniques.

Vibration, acoustic, temperature, and motor current dataset of rotating machine under var-
ying operating conditions for fault diagnosis is another great open dataset (Wonho Jung 2023).

This article introduces a comprehensive time-series dataset that encompasses a diverse range of
data types, shedding light on the intricate behaviour of rotating machines across varying operational
scenarios. The dataset is a product of meticulous data acquisition efforts, capturing essential pa-
rameters such as temperature, acoustic, vibration, and driving current data. This wealth of infor-
mation offers invaluable insights into the performance and health of these machines under different
operating conditions (figure 24).

The data collection process was executed with precision, employing a suite of sensors and instru-
ments. Specifically, the dataset comprises data acquired from four ceramic shear ICP-based ac-
celerometers, a sensitive microphone, two thermocouples for temperature monitoring, and three
current transformers (CT) designed in accordance with the international organization for standard-
ization (ISO) standards. This comprehensive sensor array ensures that critical aspects of the ma-
chine's behaviour are thoroughly captured (Wonho Jung 2023).

FIGURE 24. Test Bench

The dataset's conditions encompass a spectrum of operational scenarios, including normal opera-
tion, shaft misalignment, various bearing faults (both inner and outer races), and rotor unbalance.
To further enhance its utility, the dataset includes data collected under three torque load conditions: 0 Nm, 2 Nm, and 4 Nm. This multifaceted approach enables a holistic understanding of
how different faults and load conditions manifest in the collected data.

Additionally, this article presents a specific subset of the dataset, focusing on driving current data
and the vibration originating from a rolling element. These were acquired at different speeds, ranging
from 680 RPM to 2460 RPM. This subset provides a deep dive into the behaviour of rolling element
bearings across different rotational speeds, a crucial aspect of machinery analysis and predictive
maintenance.

In building this thesis, the emphasis on a time-series prediction model highlights the crucial need for datasets that come with clear and well-defined time stamps. Consequently, the Microsoft Azure predictive maintenance dataset has been selected to meet this requirement. The inclusion of time stamps in this dataset not only aligns with the temporal nature of the predictive maintenance model but also ensures a robust foundation for accurate and effective time-series predictions. The
Microsoft Azure dataset, renowned for its quality and relevance, enhances the credibility and ap-
plicability of the LSTM model, contributing to the overall robustness of the research endeavour.

6.1.2 Microsoft Azure Predictive Maintenance Dataset (Biswas 2020)

The dataset at the core of this thesis encapsulates a comprehensive array of information crucial for
predictive maintenance modelling (Biswas 2020). Each component is described below:

Telemetry Time Series Data (PdM_telemetry.csv):
• Content: This dataset provides averages of rotation, voltage, vibration, and pressure detected from 100 machines every hour throughout 2015 (Biswas 2020).
• Significance: The telemetry data serves as the foundation for analysing the operational parameters of the machines, offering insights into their performance over time.

Error (PdM_errors.csv):
• Content: Documents machines errors in operation. They are not caused machine shut-
downs. However, to understand operational challenges they are important (Biswas 2020).
Maintenance (PdM_maint.csv):
• Content: Records instances where components of a machine are replaced, reflecting both
proactive and reactive maintenance scenarios. It encompasses records from both 2014
and 2015.
• Preventive Maintenance: Scheduled component replacements during regular visits.
• Reactive Maintenance: Unscheduled component replacements in response to break-
downs.

Failures (PdM_failures.csv):
• Content: Chronicles components replaced when failure is detected. These instances sig-
nify critical failures.

Metadata of Machines (PdM_Machines.csv):
• Content: Provides metadata including the age of the machines and the model type.
• Significance: Offers contextual information about the machines, enabling a nuanced analysis of how different models and ages correlate with maintenance patterns.

In essence, this dataset (Biswas 2020) not only encapsulates the operational nuances of the ma-
chines but also meticulously records errors, maintenance events, and critical failures. The temporal
alignment with hourly telemetry data ensures a cohesive and synchronized foundation for the pre-
dictive maintenance model outlined in the thesis (Arnab 2023).

FIGURE 25. Reading the CSV files

After reading the data (figure 25), it is time to print (figure 26) basic information about each dataset, such as the row counts, columns, and data types.

FIGURE 26. Basic information about each dataset

To see the characteristics of the telemetry data, a statistical summary was conducted using the 'describe' method in pandas (figure 27), with the parameter 'datetime_is_numeric=True'. This allowed for the inclusion of descriptive statistics for the datetime column, providing key information about the temporal aspects of the dataset. The summary outlines the count, minimum, 25th percentile, mean, median, 75th percentile, and maximum values for the datetime feature, offering a comprehensive overview of its distribution. This analysis aids in understanding the central tendencies, variabilities, and temporal ranges present in the telemetry data, forming a crucial foundation for subsequent temporal analyses and interpretations in this study (table 1).

FIGURE 27. Code to describe telemetry data.
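
The call in figure 27 presumably resembles the sketch below; the file name follows the dataset description, and the snippet assumes a pandas version in which the datetime_is_numeric parameter is still supported.

import pandas as pd

# Load the telemetry data and parse the timestamp column
telemetry = pd.read_csv('PdM_telemetry.csv', parse_dates=['datetime'])

# Include the datetime column in the statistical summary
print(telemetry.describe(datetime_is_numeric=True))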

TABLE 1. Description of telemetry data

The distribution of key features in the telemetry data is visualized below (figure 28). Understanding this distribution is essential for gaining insights into the patterns and variability of the crucial parameters that characterize the operational state of machines. In the provided visualization, a
figure size of 15 by 8 inches accommodates histograms for the features 'volt,' 'rotate,' 'pressure,'
and 'vibration,' with the added benefit of kernel density estimation (KDE) for a smoother represen-
tation. Each histogram provides a snapshot of the distribution of values for its respective feature.
The x-axis in each subplot denotes the range of feature values, and the y-axis represents the count
of occurrences (frequency) within each range. This visual exploration aids in identifying central
tendencies, outliers, and the overall spread of values for each key telemetry feature. Such an anal-
ysis is crucial for understanding the data's statistical properties, uncovering potential anomalies,
and guiding subsequent steps in data preprocessing or machine learning model development.

FIGURE 28. Distribution of Key Telemetry Features

The histogram depicting the distribution of machine ages offers a succinct overview of the age
composition within the dataset (figure 30). With a figure size of 15 by 8 inches, the histogram pro-
vides a clear visualization of the frequency or count of machines across various age intervals. The
inclusion of a kernel density estimate (KDE) plot enhances the representation by offering a smooth
curve that captures the underlying pattern of age distribution (figure 29). The x-axis, labelled "Ma-
chine Age," indicates the age intervals, while the y-axis, labelled "Count," denotes the frequency of
machines within each interval. This graphical representation serves as a valuable exploratory tool,
shedding light on the age demographics of the machines under consideration. The insights derived
from this visualization contribute to a comprehensive understanding of the dataset, aiding in the
identification of any predominant age groups or patterns that may impact subsequent analyses or
decision-making processes.

FIGURE 29. Code to plot histogram of machine ages

FIGURE 30. Histogram showing Machine Ages

In the exploration of machinery health and maintenance dynamics, a crucial aspect is understand-
ing the components that are regularly replaced across the fleet. The horizontal bar plot presented
here offers a visual representation of the maintenance components replaced for each machine
(figure 32). With Machine IDs on the y-axis and the count of maintenance activities on the x-axis,
the plot provides an insightful snapshot of the fleet's maintenance patterns. The distinctive colours
assigned to various components enable a quick identification of the types of replacements occur-
ring. This visualization serves as a powerful tool for maintenance analysts and engineers to identify
potential areas for improvement in the machinery maintenance processes.

FIGURE 31. Code to plot replaced components for each machine

FIGURE 32. Maintenance components replaced for each machine

FIGURE 33. Code to plot time series of telemetry data

The time series analysis of key telemetry features is paramount in understanding the longitudinal
behaviour of critical machine parameters. In this code snippet (figure 33), the 'datetime' column,
initially in standard format, is converted into a datetime type to enable temporal analysis. Subse-
quently, the 'datetime' column is set as the index, facilitating efficient time-based operations. To
achieve a clearer perspective of trends, the high-frequency telemetry data is resampled to a coarser
daily frequency, providing a more comprehensible overview.
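
A condensed sketch of the steps just described (figure 33) is given below; it assumes the frame holds a single machine's readings, or that averaging across machines is acceptable for trend inspection.

import pandas as pd
import matplotlib.pyplot as plt

telemetry = pd.read_csv('PdM_telemetry.csv')
telemetry['datetime'] = pd.to_datetime(telemetry['datetime'])  # enable temporal analysis
telemetry = telemetry.set_index('datetime')                    # index by time

# Resample the hourly signals to a coarser daily frequency
daily = telemetry[['volt', 'rotate', 'pressure', 'vibration']].resample('D').mean()
daily.plot(subplots=True, figsize=(15, 10))
plt.tight_layout()
plt.show()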

The resulting time series plots vividly capture the variations in voltage, rotation, pressure, and vi-
bration over time. Each subplot represents one of these key features, offering a visual narrative of
how these parameters evolve daily. Such visualizations are invaluable for engineers and analysts
seeking to identify temporal trends, detect anomalies, and draw correlations between different te-
lemetry variables. The careful arrangement of subplots allows for a holistic exploration of machine
health dynamics, aiding in predictive maintenance and proactive decision-making.

FIGURE 34. Timeseries plots of key telemetry features

To visualize the patterns and trends preceding failures, telemetry data was plotted (figure 34) for a sub-
set of machines. This subset was randomly selected to ensure a representative view across the
entire fleet. The plots showcase the fluctuations in vibration, voltage, and pressure parameters over
time. Additionally, failure events are marked on the plots, allowing for a direct correlation between
parameter variations and impending failures. This analysis provides a comprehensive understand-
ing of the machinery's behaviour prior to failure, offering valuable insights for predictive mainte-
nance strategies.

As observed from the plotted telemetry data, during normal machine operation, the values of vibra-
tion, voltage, and pressure exhibit a relatively stable pattern. This consistency in parameter values
is indicative of the machinery operating under typical conditions. However, a compelling insight emerges as the period leading up to failures is examined. Near failure events, there is a noticeable shift in the behaviour of these parameters. Shortly before a failure occurs, all three parameters (vibration, voltage, and pressure) experience rapid fluctuations. This abrupt change in the parameter
values serves as a precursor to the impending failure. The temporal proximity of these fluctuations
to the failure events suggests a potential causal relationship. The identification of these distinctive
patterns provides a crucial foundation for the development of predictive maintenance strategies.
By leveraging these insights, maintenance teams can proactively address irregularities in parame-
ter values, minimizing downtime and optimizing machinery performance.

FIGURE 35. Code to plot telemetry data with failures.

FIGURE 36. Telemetry data with Failures

In the presented analysis, this work delves into the telemetry data of industrial machines to gain insights into the patterns preceding failures. By focusing on two randomly selected machines, the behaviour of key parameters, namely vibration, voltage, and pressure, is scrutinized in the proximity of failure events. To enhance clarity, a smoothing technique was applied to mitigate minor fluctuations. The resultant plots (figure 36) vividly illustrate that, during normal operation, these parameters exhibit a relatively stable trend. However, a distinctive pattern emerges as failures approach. Several days before a failure, there is a notable surge in the vibration parameter, indicating heightened mechanical activity. This observation is complemented by fluctuations in voltage and pressure, suggesting potential correlations between these variables and the impending
failure. These findings underscore the potential of predictive maintenance, where abnormal devia-
tions in telemetry parameters can serve as early indicators, allowing for proactive interventions to
prevent machinery failures.

FIGURE 37. Code to plot telemetry data around the failure points

FIGURE 38. Telemetry data around the failure points

In the subsequent analysis, the understanding of machinery failures is enriched by incorporating information on the age of each industrial machine (figure 39 & 40). By merging the failures dataset with the machines dataset, which includes the age of each machine, the aim is to discern any discernible patterns or trends related to the age of the machinery and its susceptibility to failures. The resulting bar plot, displaying the count of failures categorized by machine age, provides a useful view of the distribution of failures across different age groups. From the data, a noteworthy trend emerges, indicating that as machines get older, the count of failures tends to increase. This insightful observation is instrumental in identifying whether there exists a correlation between the age of a machine and its likelihood to encounter failures. Such insights are crucial for formulating effective maintenance strategies, as they can inform decisions regarding the replacement or refurbishment of aging machinery to mitigate the risk of unexpected failures.

FIGURE 39. Code to plot count of failures by machine age

FIGURE 40. Count of failures by machine age

In this chapter, the Microsoft Azure Predictive Maintenance (PdM) dataset was meticulously exam-
ined, encompassing various dimensions such as telemetry data, maintenance records, error logs,
machine specifications, and machine failures (Arnab 2023).

The exploration began with an assessment of data completeness and cleanliness, addressing
missing values and ensuring consistency across datasets. Subsequent analysis of the telemetry
data revealed patterns and fluctuations in machine parameters such as vibration, voltage, pressure,
and rotation.

The temporal aspects of failures were elucidated by integrating failures data with telemetry data.
This facilitated the visualization of parameter changes leading up to machinery failures, offering
valuable insights into potential precursors or indicators of malfunction.

The relationship between machine age and failures was also explored, revealing a notable trend –
an increase in failure occurrences as machines age. This finding carries practical implications for
maintenance strategies, prompting considerations for the replacement or refurbishment of aging
machinery.

By merging technical expertise with data-driven insights, this exploration of the Microsoft Azure
PdM dataset provides a solid foundation for subsequent chapters, where predictive modelling and
a robust maintenance development strategy with the help of machine learning algorithms will be
discussed.

6.2 Creation of PdM

In this chapter, PdM model is developed for predicting the next failure of rotating machines using
historical data. As the core of the exploration delves into Predictive Maintenance (PdM), a revolu-
tionary approach comes to light. Here, historical data and sophisticated algorithms intertwine, al-
lowing machines to whisper when they might need extra care. The focus lies on harnessing data
and intelligence to create a symbiotic relationship between machines.

Data Collection and Preparation: The initial step involves amassing a reservoir of historical data—
an indispensable asset. The data is meticulously groomed, standing as a testament to orderliness,
ready for algorithms to orchestrate.

Feature Engineering: Features take the spotlight—a cast of traits extracted from machine vibra-
tions, timestamps, and the rhythm of the week. This process mirrors the curation of a culinary
masterpiece, where each ingredient contributes a unique nuance to the overall flavour of the model.

Model Selection and Training: The next act unfolds in the realm of model creation. Choosing the
right ensemble of tools, infusing smart algorithms, and teaching the model to discern patterns be-
come the focal points. This parallels the art of instructing a keen learner—a digital companion eager
to grasp the intricacies of the machine's narrative.

Model Evaluation and Validation: But how is the brilliance of the creation measured? Testing it
against unseen data ensures it transcends mere memorization, understanding the broader context.
It is akin to a final exam for the model, where excellence is the benchmark.

6.2.1 Data Collection and Preparation

The first step of PdM model development is the collection of a dataset. This data should include both
failure events and non-failure events. For rotating machines, relevant data may include sensor
readings, vibration measurements, and maintenance records (Gunawan 2021).

In this work, the focus is on the telemetry and failures data. Therefore, they were merged on 'datetime' and 'machineID'. The data should be pre-processed and cleaned after collection (figure 41). This involves removing outliers, scaling the data to a consistent range, and handling missing values. The goal of data preparation is to ensure high-quality data that is suitable for use in machine learning algorithms.

FIGURE 41. Code to merge telemetry and failure data
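
A minimal sketch of what the merge in figure 41 likely resembles is shown below; a left join is assumed so that every telemetry reading is kept and failure labels attach only where a failure was recorded.

import pandas as pd

telemetry = pd.read_csv('PdM_telemetry.csv')
failures = pd.read_csv('PdM_failures.csv')

# Left join keeps every telemetry reading; failure information is
# attached only to the hours at which a failure was recorded
merged_data = telemetry.merge(failures, on=['datetime', 'machineID'], how='left')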

When the data types of merged_data were checked, it was identified that 'datetime' was stored as an object. The astype('datetime64[ns]') method is used to convert the 'datetime' column to the datetime data type. This conversion is important for accurate date-based analysis and visualization (figure 42).
FIGURE 42. Code to check data types, convert objects to int and datetime64 format

The 'failure' column is mapped to numerical values using the map method and the specified comp
mapping dictionary. This conversion is useful for numerical analysis and modelling, providing a
clear representation of failure types.
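
Continuing from the merge above, the two conversions might look like the sketch below; the component-to-number values in comp_mapping are a hypothetical example of the dictionary the text refers to.

# Convert the 'datetime' column from object to a true datetime type
merged_data['datetime'] = merged_data['datetime'].astype('datetime64[ns]')

# Hypothetical mapping of component labels to numeric failure codes
comp_mapping = {'comp1': 1, 'comp2': 2, 'comp3': 3, 'comp4': 4}
merged_data['failure'] = merged_data['failure'].map(comp_mapping)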

The chosen time span for the dataset encompasses one year, starting from January 1, 2015, to
January 1, 2016. This deliberate selection allows for a comprehensive exploration of machine be-
haviour across seasons and operational conditions. To rigorously assess the model's predictive
prowess, a strategic division is made: the final three months serve as a dedicated testing ground,
unveiling the model's effectiveness in foreseeing impending issues. The preceding nine months
are earmarked for training, facilitating the model's immersion in the nuances of historical data (fig-
ure 43). This temporal segregation aligns with the best practices in machine learning, ensuring a
robust and reliable Predictive Maintenance model.

FIGURE 43. Splitting dataset to train and test
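
A sketch of the temporal split in figure 43 is given below, assuming the merged_data frame from the previous steps; the boundary dates follow the nine-month/three-month division described above.

# Nine months for training, the final three months for testing
train_data = merged_data[(merged_data['datetime'] >= '2015-01-01') &
                         (merged_data['datetime'] < '2015-10-01')]
test_data = merged_data[(merged_data['datetime'] >= '2015-10-01') &
                        (merged_data['datetime'] < '2016-01-01')]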

The dataset contains data from 100 machines. To avoid overloading the model and to improve visualization clarity, data from a single machine is used (figure 44).

FIGURE 44. One machine data

Feature Engineering

Feature engineering is the process of selecting relevant features from the raw data. The create_feature function is used to generate features for the selected machine based on vibration, timestamp, and day of the week. The features are combined, and the dataset is split into input (X) and output (y) for training a machine learning model. The vibration data is normalized using Min-Max scaling, and the scaler is returned for later use (figure 45).

The shape_sequence function is used to create sequences of a specified length from the input data. It then shapes the input (X) and output (y) sequences and splits them into training and validation sets using train_test_split.
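
The two functions might be sketched as follows; the full implementation appears in figure 45, so the exact feature set, sequence length, and split ratio below are assumptions built around the behaviour described above.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

def create_feature(machine_df):
    # Scale vibration to [0, 1] and derive simple time features
    scaler = MinMaxScaler()
    vib = scaler.fit_transform(machine_df[['vibration']])
    hour = machine_df['datetime'].dt.hour.to_numpy()[:, None] / 23.0
    weekday = machine_df['datetime'].dt.dayofweek.to_numpy()[:, None] / 6.0
    features = np.hstack([vib, hour, weekday])
    # Predict the next vibration value from the current features
    X, y = features[:-1], vib[1:]
    return X, y, scaler

def shape_sequence(X, y, seq_len=24):
    # Roll flat samples into overlapping sequences for the LSTM
    Xs = np.array([X[i:i + seq_len] for i in range(len(X) - seq_len)])
    ys = y[seq_len:]
    return train_test_split(Xs, ys, test_size=0.2, shuffle=False)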

FIGURE 45. Code to process relevant features from raw data

6.2.2 Model Selection and Training

Developing the Predictive Maintenance (PdM) model involves a meticulous design and configura-
tion of the underlying neural network architecture. The pivotal function in this process is the "cre-
ate_model" function, which serves as the blueprint for the LSTM (Long Short-Term Memory) model.

At the heart of the model lie the LSTM layers, specialized for capturing temporal dependencies and long-range relations within the time-series data. These layers are augmented by BatchNormalization, a technique that enhances training stability and convergence by normalizing inputs between layers. The inclusion of Dense layers adds capacity to the network, allowing it to learn relationships between data points that are not directly visible.

To safeguard against overfitting, a common challenge in machine learning, a dropout layer is intro-
duced. Dropout probabilistically disables a proportion of neurons during training, preventing any
single node from becoming overly specialized and enhancing the model's generalization capabili-
ties.

The loss function and optimizer are paramount in guiding the training process. In this model, the RMSprop optimizer is employed with a specified learning rate. RMSprop adjusts the learning rate for each parameter individually, providing a dynamic and efficient optimization process. MeanSquaredError serves as the loss function, evaluating the deviation between estimated and actual outcomes. This choice aligns with the regression nature of the problem, aiming to minimize the squared differences.

A crucial addition to the model is the implementation of early stopping. The performance of the model on the validation dataset is tracked during training, and when it stops improving, the training process is halted. In this way, the model is prevented from overfitting the training data.

In summary, the "create_model" function orchestrates a sophisticated architecture, leveraging LSTM layers, BatchNormalization, Dense layers, dropout, and careful choices of optimizer and loss function. This amalgamation of components forms the backbone of a robust Predictive Maintenance model, poised to glean insights and predict impending issues in rotating machinery (figure 46).
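
A sketch of such a create_model function in Keras is given below; the layer sizes, dropout rate, learning rate, and patience are illustrative assumptions, while the building blocks themselves follow the description above (the thesis version appears in figure 46).

from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import LSTM, Dense, Dropout, BatchNormalization
from tensorflow.keras.losses import MeanSquaredError
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import RMSprop

def create_model(seq_len, n_features):
    model = Sequential([
        LSTM(64, return_sequences=True, input_shape=(seq_len, n_features)),
        BatchNormalization(),           # stabilize training between layers
        LSTM(32),
        Dropout(0.2),                   # guard against overfitting
        Dense(16, activation='relu'),
        Dense(1),                       # regression output, e.g. next vibration value
    ])
    model.compile(optimizer=RMSprop(learning_rate=1e-3), loss=MeanSquaredError())
    return model

# Stop training once the validation loss stops improving
early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)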

FIGURE 46. Model

6.2.3 Model Evaluation and Validation

Following the intricate process of training the Predictive Maintenance (PdM) model, a comprehen-
sive evaluation of its performance becomes imperative. This evaluation unfolds in two stages: a
systematic examination based on established metrics, as per conventional practices, and the vali-
dation of the model through real-world examples.

In adherence to the standard protocols of model assessment, the focus shifts to metrics that provide
a quantitative measure of its efficacy. Accuracy, F1-score, recall, and precision take centre stage
in this analysis. Accuracy quantifies the proportion of instances correctly classified by the model,
providing a measure of its overall predictive performance. Precision delves into the accuracy of positive predictions, determining the ability of the model to avoid false positives. In turn, recall assesses the proficiency of the model in capturing all relevant instances, especially those designated as positive. The F1-score combines precision and recall, providing a balanced and comprehensive assessment of the model's effectiveness.

Cross Validation is a strategic method which diverges from using the entire dataset for training,
opting instead to reserve a portion for testing the model. Among various types, K Fold Cross Vali-
dation stands out as a commonly utilized approach. The K Fold Cross Validation technique divides
the initial dataset into k smaller parts, often referred to as folds. This process iterates k times: in each iteration, one fold is exclusively allocated for testing, while the remaining k-1 folds contribute to model training. Every data point, in turn, serves as both a testing and a training subject. This technique proves effective in
generalizing the model and minimizing error rates (Jhimlic1 2023).
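
A minimal sketch of K Fold Cross Validation with scikit-learn follows; the toy arrays stand in for the real feature matrix and targets.

import numpy as np
from sklearn.model_selection import KFold

X, y = np.arange(20).reshape(10, 2), np.arange(10)   # toy data
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(X):
    # Each fold takes one turn as the held-out test set
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
    # ...fit and score the model on this fold...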

Holdout: Holdout, a simpler approach, finds application in neural networks and numerous classifiers. In this technique, the dataset is simply split into training and test datasets, typically adhering to ratios like 70:30 or 80:20. Training takes the larger data partition, while testing reserves only a small part. This uncomplicated yet efficient method provides a clear delineation between training and testing phases, facilitating a straightforward evaluation of model performance (Jhimlic1 2023).
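
In code, the holdout split is a single call; the sketch below reuses the toy X and y from above with an assumed 80:20 ratio.

from sklearn.model_selection import train_test_split

# Simple holdout: 80% for training, 20% reserved for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)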

6.2.4 Classification Metrics

The main goal of a classification task is to predict the target variable, typically represented by dis-
crete categories. Here are key benchmarks used to assess and compare the performance of clas-
sification models (Amsten 2023):

Classification Accuracy. This is a common metric, measuring the proportion of correctly classified cases out of the entire dataset. While it performs adequately when classes are evenly distributed, it can present an inaccurate assessment when class imbalances exist.

Accuracy = Number of correct predictions / Total number of predictions, (9)

For instance, consider a binary classification problem in which 90% of the samples belong to Class A and 10% to Class B. A model that simply predicts all samples as Class A would achieve a high accuracy of 90%, yet it completely misses the samples of Class B. If the same model is tested on a set with 40% of samples from Class B and 60% from Class A, it will achieve an accuracy of only 60% (Amsten 2023).

Logarithmic Loss. Log Loss is indeed a critical metric in the realm of classification, particularly for
multi-class scenarios. Its fundamental principle lies in penalizing false classifications, especially
false positives.

Here is how it operates: the classifier is required to assign probabilities for each class across all
samples. The key idea is not just whether the model predicts the correct class but also how certain
or uncertain it is about its predictions. This is where the probabilistic nature of Log Loss comes into
play.

To break it down a bit further: for each sample, the classifier provides a set of probabilities
corresponding to each class. Log Loss then evaluates how well these predicted probabilities align
with the true class labels. The more confident and accurate the predictions, the lower the Log Loss.

The penalization aspect is crucial, particularly when dealing with false positives. A high Log Loss
indicates that the model's predictions differ significantly from the actual outcomes, emphasizing
the need for precise and well-calibrated probability estimates for each class.

In essence, Logarithmic Loss goes beyond simple accuracy metrics, providing a nuanced under-
standing of a model's performance in scenarios where probabilities and uncertainties matter, as is
often the case in multi-class classification tasks.
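
A short sketch with scikit-learn illustrates the point; the labels and probabilities below are invented toy values.

from sklearn.metrics import log_loss

y_true = [0, 1, 2, 1]                      # true class labels
y_prob = [[0.8, 0.1, 0.1],                 # predicted probability per class
          [0.2, 0.7, 0.1],
          [0.1, 0.2, 0.7],
          [0.4, 0.5, 0.1]]

# Confident, correct probability estimates yield a low log loss
print(log_loss(y_true, y_prob))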

The F1 Score is a metric nestled between recall and precision, operating as their harmonic mean, and resides within the range [0, 1]. This metric serves as a comprehensive indicator of how precise (it correctly classifies instances) and robust (it does not miss significant instances) the classifier is.

The components of the F1 Score are shown below:

Precision serves as a vital gauge of a model's performance, quantifying the proportion of true positive instances among all positive classifications. The precision score is obtained by dividing the number of true positive predictions by the sum of true positive and false positive predictions (Amsten 2023).

Precision = TP / (TP + FP), (10)

Recall (sensitivity) assesses the ability of the model to detect relevant instances. It is found by dividing the number of true positive predictions by the sum of true positive and false negative predictions (Amsten 2023).

Recall = TP / (TP + FN), (11)

While lower recall and higher precision contribute to overall accuracy, they might result in missing
a substantial number of instances. This is where the F1 Score comes into play, offering a balanced
perspective. The F1 Score harmonises recall and precision. It is expressed mathematically as
(Amsten 2023):

F1 = 2 / (1/Precision + 1/Recall), (12)

If F1 Score is high, it signifies a more balanced and effective performance, considering both recall
and precision in tandem.
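
The three metrics can be computed directly with scikit-learn, as in the sketch below with invented toy labels.

from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print(precision_score(y_true, y_pred))  # TP / (TP + FP)
print(recall_score(y_true, y_pred))     # TP / (TP + FN)
print(f1_score(y_true, y_pred))         # harmonic mean of the two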

By combining these metrics, a comprehensive understanding is gained of the model's ability to detect positive instances while avoiding false negatives.

6.2.5 Regression Evaluation Metrics

When tasked with predicting continuous values of the target variable, several evaluation metrics
come into play to gauge model performance. These metrics are explored below (Amsten
2023):

Mean Absolute Error (MAE) shows the degree to which predicted values deviate from actual val-
ues. It gives a quantitative assessment of how closely the model's predictions resemble the true
output. Mathematically, it is expressed as (Amsten 2023):

MAE = (1/N) · Σ_(j=1..N) |y_j − ŷ_j|, (13)

Mean Squared Error (MSE): Like MAE, MSE takes the average of the squared differences between the original and predicted values. This metric holds an advantage in gradient calculations compared to MAE. It emphasizes larger errors, allowing a focus on significant deviations. Mathematically (Amsten 2023):

MSE = (1/N) · Σ_(j=1..N) (y_j − ŷ_j)², (14)

Root Mean Square Error (RMSE): RMSE is derived by taking the square root of the MSE, offering
a metric that is sensitive to larger errors. It acknowledges the impact of outliers on predictions. The
formula is (Amsten 2023):

RMSE = √( (1/N) · Σ_(j=1..N) (y_j − ŷ_j)² ), (15)

Root Mean Squared Logarithmic Error (RMSLE): RMSLE comes into play when the target variable exhibits a broad range of values. The metric is more sensitive to underestimation than overestimation, catering to scenarios where precision in lower values is crucial. The formula is a modification of the RMSE, accommodating logarithmic transformations (Amsten 2023):

RMSLE = √( (1/N) · Σ_(j=1..N) (log(ŷ_j + 1) − log(y_j + 1))² ), (16)

The R2 Score, also known as the coefficient of determination, evaluates the performance of a regression model. It assesses how well the observed results align with the predictions, considering the possibility of diverse output values shaped by the input variables.
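
The regression metrics above are available directly in scikit-learn; the sketch below computes them on invented toy values, taking the square root of MSE to obtain RMSE.

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, 2.5, 4.0, 5.1])   # toy actual values
y_pred = np.array([2.8, 2.7, 4.2, 5.0])   # toy predictions

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)                        # square root of MSE
r2 = r2_score(y_true, y_pred)              # coefficient of determination
print(mae, mse, rmse, r2)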

The journey into the evaluation phase of the code shows how these metrics come to life, enriching the understanding of the correctness and precision of the predictive models. A hands-on exploration of the model's prowess is described below (figure 47).

FIGURE 47. Metrics of the model.

Mean Squared Error (MSE) assesses a model's predictive accuracy by computing the mean squared difference between forecasted and actual values. A lower MSE signifies greater precision in the model's performance. Here, the model's MSE of 0.0199 indicates minimal squared error between predicted and actual values.

Root Mean Squared Error (RMSE), derived by taking the square root of the MSE, expresses the error magnitude in the same units as the target variable (Zahra Khalilzadeh 2023). With an RMSE of 0.1411, the model's predictions tend to lie close to the actual values.

Mean Absolute Error (MAE): A value of 0.1097 shows that the mean absolute deviation between forecasted and actual values is small. It indicates that the model's predictions are generally close to the true values.

This segment scrutinizes the proficiency of the model on the testing data, specifically focusing on its capacity to detect faults. By plotting the down sampled predictions alongside the original values, an informative graphical illustration of how well the model identifies anomalies in the vibration patterns is gained. To quantify its precision, MSE is employed to highlight discrepancies between actual and predicted values. The resulting visualizations, depicted on a matplotlib graph with datetime on the x-axis and vibration values on the y-axis, serve as a fault detection tool, offering a comprehensive understanding of the model's fault-sensing capabilities.
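The plot behind figure 48 can be reproduced in outline as follows; the synthetic arrays below are stand-ins for the thesis variables, and the downsampling step of 10 is an assumption rather than the value used in the actual code.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-ins for the test index, actual vibration and predictions.
idx = pd.date_range("2015-10-01", periods=2000, freq="h")
y_test = 0.5 + 0.05 * np.sin(np.arange(2000) / 24) + 0.01 * np.random.randn(2000)
y_pred_test = y_test + 0.01 * np.random.randn(2000)

step = 10                                    # keep every 10th point (assumed)
mse = np.mean((y_test - y_pred_test) ** 2)   # quantify the discrepancies

plt.figure(figsize=(12, 4))
plt.plot(idx[::step], y_test[::step], label="Original vibration")
plt.plot(idx[::step], y_pred_test[::step], label="Predicted vibration")
plt.xlabel("datetime")
plt.ylabel("vibration")
plt.title(f"Down sampled predictions vs original values (MSE = {mse:.4f})")
plt.legend()
plt.tight_layout()
plt.show()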

FIGURE 48. Down sampled predictions vs original values

The next step evaluates the performance of the model on a specific date in the testing data, computes the testing error using mean squared error, and visualizes the predictions and original values for the first 200 datapoints. The x-axis shows datetime, and the y-axis illustrates vibration values (figure 49).

FIGURE 49. Predicted vibration vs original when failure detected

As evident, the predicted vibration closely aligns with the actual values, particularly during abnormal conditions. The subsequent test data serves as a litmus test, gauging the model's capability to accurately predict the next three months of data, a segment not incorporated in the training dataset.

FIGURE 50. Selecting test data for machine 17
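The selection shown in figure 50 can be sketched roughly as follows; the telemetry file name, the frame name df, and the exact date bounds are assumptions, and df_sel_test_17 is a hypothetical name mirroring the df_sel_test_82 used later.

import pandas as pd

# Load the telemetry data (file name assumed from the Kaggle dataset).
df = pd.read_csv("PdM_telemetry.csv", parse_dates=["datetime"])

# Reserve roughly the last three months of machine 17 as unseen test data
# (the date bounds here are illustrative, not the thesis's exact values).
mask = (
    (df["machineID"] == 17)
    & (df["datetime"] >= "2015-10-01")
    & (df["datetime"] < "2016-01-01")
)
df_sel_test_17 = df.loc[mask].sort_values("datetime")
print(df_sel_test_17.shape)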

A similar procedure will be executed with the reserved test data, mirroring the process (figures 50, 51 and 52).

FIGURE 51. Preparing raw data to process

FIGURE 52. Predicted pressure vs original on test data.

The plotted graph below illustrates the timeframe during which a failure was detected (figure 53).

FIGURE 53. Predicted pressure vs original when failure detected

The accuracy of the model is further validated using different machine data, confirming the robust-
ness of the predictions.

FIGURE 54. Selecting test data from machine 82

The same processes as described above are repeated once more for the data from machine 82,
reaffirming the model's consistency and adaptability across various machine datasets (figures 54,
55 and 56).

FIGURE 55. Preparing raw data to process

FIGURE 56. Predicted pressure vs original on machine 82

The time frame during which failures are detected is utilized to create a more comprehensive and
visually insightful comparison diagram (figure 57).

FIGURE 57. Predicted pressure vs original on machine 82 when failure is detected

In the ongoing refinement of our predictive maintenance model, the introduction of a failure trigger
point takes centre stage, enabling proactive planning for anticipated failure dates. This trigger point
strategically flags the predicted failure instances, offering a foresighted perspective for planning
and pre-emptive maintenance. By incorporating this trigger mechanism, we shift from mere antici-
pation to actionable insights, allowing us to prepare and strategize well in advance. This approach
not only amplifies the practical utility of our predictive model but also empowers us to undertake
timely interventions, minimizing downtime and optimizing the overall efficiency of our machinery.

The threshold for triggering failures was applied to the dataset represented by failures_trig-
gered_82, which was a copy of the df_sel_test_82 data frame within the specified time range. To
ensure compatibility, the lengths of the predicted values (y_pred_test) and failures_triggered_82
were compared, and the column 'predicted_failure' was initialized with zeros. A threshold value of
0.75 was chosen for triggering failures (figure 58).

Subsequently, the code checked if the number of data points (num_points) was greater than zero
before attempting to set values based on the calculated mask. The mask, determined by comparing
the predicted values to the threshold, was applied to the 'predicted_failure' column of failures_trig-
gered_82. The dates corresponding to the triggered failures were then printed.

FIGURE 58. Creating trigger points
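The trigger logic of figure 58 can be reconstructed in outline as follows; this is a sketch under the assumption that df_sel_test_82 and y_pred_test already exist in memory, not the verbatim thesis code.

import numpy as np

threshold = 0.75  # predicted level above which a failure is triggered

# Work on an independent copy so the source frame stays untouched.
failures_triggered_82 = df_sel_test_82.copy()
failures_triggered_82["predicted_failure"] = 0

# Compare lengths so the mask and the frame stay compatible.
num_points = min(len(y_pred_test), len(failures_triggered_82))
if num_points > 0:
    mask = np.asarray(y_pred_test[:num_points]).ravel() > threshold
    col = failures_triggered_82.columns.get_loc("predicted_failure")
    failures_triggered_82.iloc[:num_points, col] = mask.astype(int)

# Print the dates corresponding to the triggered failures
# (assumes 'datetime' is a column rather than the index).
triggered = failures_triggered_82[failures_triggered_82["predicted_failure"] == 1]
print(triggered["datetime"])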

Copying df_sel_test_82 was done to create a separate DataFrame, failures_triggered_82, that is independent of the original dataset. This copy ensures that modifications made to the 'predicted_failure' column, and any other changes during the threshold application process, do not affect the original dataset.

Working with a copy maintains the integrity of the source data, allowing for a clean and isolated application of the failure-triggering mechanism. This approach is good practice in data analysis and modelling, as it avoids unintended consequences and facilitates the exploration of various thresholds and parameters without altering the original dataset.

FIGURE 59. Triggered failure dates and times

From the tests conducted, the model exhibits a commendable ability to predict telemetry data across various machines,
providing a solid foundation for fault detection. It is noteworthy that these results are derived from
a dataset spanning only one year, showcasing the model's promising fault-prediction capabilities
within a limited timeframe. As the dataset continues to grow, the model is expected to enhance its
understanding and further refine its predictions over time, particularly in detecting and addressing
faults. The current performance sets a promising baseline, and with the influx of continuous data,
the model's accuracy is anticipated to ascend to even greater heights, particularly in the realm of
fault detection.

This affirmation is rooted in the inherent nature of machine learning models, where prolonged ex-
posure to evolving data leads to enhanced learning and adaptability. The model's predictive prow-
ess, already evident in the existing dataset, lays the foundation for future advancements in accu-
racy and reliability. As industries continue to accumulate data, the model's capacity to discern pat-
terns, anomalies, and subtle shifts in machinery behaviour is expected to flourish.

7 CONCLUSION

In conclusion, this thesis delves into the intricate realm of Predictive Maintenance (PdM) for rotating machines, employing a meticulous approach that combines data collection, preprocessing, feature engineering, and the deployment of advanced machine learning models. The foundational strategy involves the comprehensive collection of historical data, encompassing both malfunctioning and functional events. Telemetry and failure data take centre stage, merged on 'datetime' and 'machineID' for a detailed understanding of the machine's operational dynamics.

The subsequent phase focuses on refining and preparing the dataset, involving cleaning, prepro-
cessing, and meticulous handling of missing values and outliers. Key conversions, such as map-
ping the 'failure' column to numerical values, enhance the dataset's utility for numerical analysis
and modelling.

Feature engineering emerges as a crucial stride, where vibration, timestamp, and day of the week
amalgamate to generate meaningful features. The sequences are then shaped and split into input
(X) and output (y), setting the stage for model training. A detailed examination of the data's temporal
and vibrational aspects ensures the model is attuned to the details of machine behaviour.

In navigating the intricacies of this research journey, the strategic selection and training of the Long
Short-Term Memory (LSTM) model emerge as a pivotal crossroads. The architecture of the model,
intricately weaving LSTM layers, BatchNormalization, Dense layers, and dropout mechanisms, is
a testament to meticulous design aimed at preventing overfitting. The optimizer and loss function
choices are calibrated to the nuanced challenges of the problem, fortified by early stopping mech-
anisms that refine the model's robustness.

The subsequent phases of evaluation and validation cast a spotlight on the model's proficiency,
subjecting it to meticulous scrutiny through metrics like precision, accuracy, F1-score and recall.
The marriage of these quantitative assessments with insightful visualizations, powered by mat-
plotlib, vividly depicts the predictive prowess of the model as it navigates through down sampled
predictions and original values.

As the model strides confidently through the validation phase, the spotlight shifts to a rigorous
examination of its performance on unseen data. Tasked with predicting the next three months of
data, this phase serves as a critical litmus test, assessing the model's adaptability to real-world
conditions. The verification process seeks to unveil the model's capacity to seamlessly apply its
acquired knowledge to unforeseen scenarios.

In essence, this thesis unravels the intricate dance between data science and the realm of predic-
tive maintenance. The fusion of scrupulous data handling, innovative feature engineering, and the
deployment of sophisticated models culminates in a robust framework adept at anticipating and
mitigating machinery failures. Beyond its academic contributions, this work lays a pragmatic foun-
dation for industries in pursuit of efficient and proactive maintenance strategies.

Amidst the ever-evolving landscape of machine learning and predictive analytics, this thesis stands
as a guiding light, showcasing the transformative power of insights derived from data-driven meth-
odologies in reshaping the landscape of maintenance strategies. The quest for enhanced reliability
and operational efficiency in rotating machinery finds a formidable ally in the precision and foresight
offered by predictive maintenance models. The addition of triggered predicted failure dates ampli-
fies this foresight, enabling proactive planning for optimal machinery performance.

REFERENCES

Aditianu1998. “Understanding of LSTM Networks.” Geeksforgeeks. 05 June 2023. https://www.geeksforgeeks.org/understanding-of-lstm-networks/ (accessed November 29, 2023).

Amsten. “Evaluation Metrics in Machine Learning.” Geeksforgeeks. 05 May 2023. https://www.geeksforgeeks.org/metrics-for-machine-learning-model/ (accessed December 01, 2023).

Andreas Theissler, Judith Pérez-Velázquez, Marcel Kettelgerdes, Gordon Elger. “Predictive maintenance enabled by machine learning: Use cases and challenges in the automotive industry.” Reliability Engineering & System Safety 215 (2021): 107864.

Biswas, Arnab. “Predictive Maintenance: Exploratory Data Analysis.” Kaggle. 2023. https://www.kaggle.com/code/arnabbiswas1/predictive-maintenance-exploratory-data-analysis (accessed November 30, 2023).

Biswas, Arnab. “Microsoft Azure Predictive Maintenance.” Kaggle. 2020. https://www.kaggle.com/datasets/arnabbiswas1/microsoft-azure-predictive-maintenance/data (accessed November 30, 2023).

Brownlee, Jason. How Machine Learning Algorithms Work (they learn a mapping of input to output).
12 08 2019. https://machinelearningmastery.com/how-machine-learning-algorithms-work/
(accessed November 28, 2023).

D. Manno, G. Cipriani, G. Ciulla, V. Di Dio, S. Guarino, V. Lo Brano. “Deep learning strategies for
automatic fault diagnosis in photovoltaic systems by thermographic images.” Energy Conversion
and Management 241 (2021).

Das, Oguzhan, Duygu Bagci Das, and Derya Birant. “Machine learning for fault analysis in rotating
machinery: A comprehensive review.” Heliyon 9, no. 6 (2023): e17584.

Donges, Niklas. A Guide to Recurrent Neural Networks: Understanding RNN and LSTM Networks.
28 02 2023. https://builtin.com/data-science/recurrent-neural-networks-and-lstm (accessed
November 25, 2023).

Eleonora, Florian, Sgarbossa Fabio, and Zennaro Ilenia. “Machine learning-based predictive
maintenance: A cost-oriented model for implementation.” International Journal of Production
Economics 236 (2021): 108114.

Fajar. A Brief Introduction to Vibration Analysis of Process Plant Machinery (VI). n.d.
https://freevibrationanalysis.blogspot.com/2011/08/brief-introduction-to-vibration_5714.html?m=1
(accessed November 26, 2023).

Garcia-Canadilla, Patricia, Sergio Sánchez Martínez, Fatima Crispi, and Bart Bijnens. “Machine
Learning in Fetal Cardiology: What to Expect.” Fetal Diagnosis and Therapy 47, no. 5 (2020): 1-
10.

Gunawan, Jeffrey. “Predictive Maintenance: Time-Series Forecasting.” Kaggle. 2021. https://www.kaggle.com/code/jegun19/predictive-maintenance-time-series-forecasting (accessed November 27, 2023).

Hitchcock, Leith. Machinery Lubrication. March 2004. https://www.machinerylubrication.com/Read/583/thermal-imaging-lubrication (accessed November 29, 2023).

Hoeser, Thorsten, and Claudia Kuenzer. “Object Detection and Image Segmentation with Deep
Learning on Earth Observation Data: A Review-Part I: Evolution and Recent Trends.” Remote
Sensing, 2020: 12.

IBM. IBM. n.d. https://www.ibm.com/topics/recurrent-neural-networks (accessed October 29, 2023).

Jhimlic1. “Machine Learning Model Evaluation.” Geeksforgeeks. 09 January 2023. https://www.geeksforgeeks.org/machine-learning-model-evaluation/ (accessed November 30, 2023).

Jiawei Han, Micheline Kamber, Jian Pei. “3 - Data Preprocessing.” In Data Mining (Third Edition),
by Morgan Kaufmann, 83-124. 2012.

Le, Xuan Hien & Ho, Hung & Lee, Giha & Jung, Sungho. “Application of Long Short-Term Memory
(LSTM) Neural Network for Flood Forecasting.” Water 11, no. 7 (2019): 1387.

Liang Huang, Shijie Chen, Zaixun Ling, Yibo Cui, Qiong Wang. “Non-invasive load identification
based on LSTM-BP neural network.” Energy Reports 7, no. 1 (2021): 485-492.

Lu, Yang. “Industry 4.0: A survey on technologies, applications and open research issues.” Journal
of Industrial Information Integration, 2017: 1-10.

Lucas Brito, Antonio Susto Gian, Nei Brito Jorge, Marcus Duarte. “An explainable artificial
intelligence approach for unsupervised fault detection and diagnosis in rotating machinery.”
Mechanical Systems and Signal Processing 163 (January 2022).

Lucas, Brito, Susto Gian Antonio, Brito Jorge Nei, and Duarte Marcus. “Mechanical faults in rotating
machinery dataset (normal, unbalance, misalignment, looseness).” Vers. 3. Mendeley Data. 28 July
2023. https://data.mendeley.com/datasets/zx8pfhdtnb/3 (accessed October 20, 2023).

Mais, Jason, and Scott Brady. Introduction guide to vibration. SKF USA. May 2002. https://cdn.skfmediahub.skf.com/api/public/0901d196802179ea/pdf_preview_medium/0901d196802179ea_pdf_preview_medium.pdf (accessed September 15, 2023).

Meriem Bahi, Mohamed Batouche. “Deep Learning for Ligand-Based Virtual Screening in Drug
Discovery.” 3rd International Conference on Pattern Analysis and Intelligent Systems (PAIS). 2018.
1-5.

Muruganantham, Dr Bubathi. “Introduction to Vibration-Based Condition Monitoring.” Application Engineering Manager, Honeywell Process Solutions, 2022.

Olah, Christopher. “colah's blog.” 27 August 2015. http://colah.github.io/posts/2015-08-Understanding-LSTMs/ (accessed November 28, 2023).

Oliveira Ewerton Cristhian Lima de, Santana Kauê, Junior Paulo Sergio Taube, Lima Anderson
Júnior Claudomiro de Souza de Sales. “Biological Membrane-Penetrating Peptides: Computational
Prediction and Applications.” Frontiers in Cellular and Infection Microbiology, 2022: 12.

Omega. How to Measure Acceleration? n.d. https://www.omega.com/en-us/resources/accelerometers (accessed September 17, 2023).

Pemmada, Suresh Kumar & Behera, Prof. Dr. H. & Nayak, Janmenjoy & Naik, Bighnaraj.
“Correlation-based modified long short-term memory network approach for software defect
prediction.” Evolving Systems 13, no. 7 (2022).

Randall, Robert Bond. Vibration–based Condition Monitoring: Industrial, Automotive and
Aerospace Applications, Second Edition. John Wiley & Sons, Ltd, 2021.

Roberto M. Souza, Erick G.S. Nascimento, Ubatan A. Miranda, Wenisten J.D. Silva, Herman A.
Lepikson. “Deep learning for diagnosis and classification of faults in industrial rotating machinery.”
Computers & Industrial Engineering 153 (2021): 107060.

Ruonan Liu, Boyuan Yang, Enrico Zio, Xuefeng Chen. “Artificial intelligence for fault diagnosis of
rotating machinery: A review.” Mechanical Systems and Signal Processing 108 (2018): 33-47.

S. Bagavathiappan, B.B. Lahiri, T. Saravanan, John Philip, T. Jayakumar. “Infrared thermography for condition monitoring – A review.” Infrared Physics & Technology 60 (2013): 35-55.

Saeed, Mehreen. An Introduction to Recurrent Neural Networks and the Math That Powers Them. 06 January 2023. https://machinelearningmastery.com/an-introduction-to-recurrent-neural-networks-and-the-math-that-powers-them/ (accessed October 11, 2023).

Sangeetha, Padigi Reddy, Kaavya Shree, and Shravani Upadhyay. Computer Vision in Space Science Technology: Advancements and Applications. August 2019. https://medium.com/@prsangeetha/computer-vision-in-space-science-technology-advancements-and-applications-fcfcaf3aea8d (accessed September 21, 2023).

Siemens. “Condition Monitoring Systems SIPLUS CMS2000.” Siemens operating instruction, 2016.

Staudemeyer, Ralf, and Eric Morris. “Understanding LSTM -- a tutorial into Long Short-Term
Memory Recurrent Neural Networks.” 2019.

Wonho Jung, Seong-Hu Kim, Sung-Hyun Yun, Jaewoong Bae, Yong-Hwa Park. “Vibration,
acoustic, temperature, and motor current dataset of rotating machine under varying operating
conditions for fault diagnosis.” Data in Brief 28 (2023): 109049.

Yan Yan, Hong-yan Xing. “A sea clutter detection method based on LSTM error frequency domain conversion.” Alexandria Engineering Journal 61, no. 1 (2022): 883-891.

Zahra Khalilzadeh, Motahareh Kashanian, Saeed Khaki, Lizhi Wang. “A HYBRID DEEP LEARNING-BASED APPROACH FOR OPTIMAL.” arxiv.org. 25 September 2023. https://export.arxiv.org/pdf/2309.13021 (accessed January 27, 2024).

Zhiqin Zhu, Yangbo Lei, Guanqiu Qi, Yi Chai, Neal Mazur, Yiyao An, Xinghua Huang. “A review of the application of deep learning in intelligent fault diagnosis of rotating machinery.” Measurement 206 (2023): 112346.
