Developing A Maintenance Strategy

The document discusses developing an effective maintenance strategy. It explains that a maintenance strategy should be developed in three stages: 1) Identifying the maintenance requirements of each asset based on its functions, 2) Acquiring the necessary resources like personnel, spare parts and tools to execute the strategy, and 3) Implementing systems to manage the resources efficiently. It recommends using a Reliability-Centered Maintenance (RCM) approach which involves asking questions about an asset's functions, potential failures, causes, effects and preventative actions. RCM helps formulate a customized maintenance strategy for each asset.

DEVELOPING A MAINTENANCE STRATEGY

Presented by
Peter Stock
New Dimension Solutions, Inc.

1 INTRODUCTION
The world of physical asset management has changed dramatically over the last twenty to thirty years.
These changes have come as businesses have become more and more dependent on machines, leading
to an explosive growth in the number of machines that need to be maintained throughout the world. The
designs of physical assets have also changed, from the robust, over-designed machines of the 1940s that
required minimal maintenance to the more complex and highly automated, mechanized processes of today.
Along with this growth, our expectations as users and owners of these machines have also changed.
Previously our focus was primarily on minimizing downtime and reducing maintenance costs, whereas today
we focus not just on higher availability but also on higher reliability and better product quality.
Our attention has also turned to those failures that have serious safety and environmental
consequences. It is no longer acceptable to allow equipment to fail in ways that do not conform to
society’s safety and environmental expectations; otherwise we will be shut down. Finally, as our
dependence on physical assets grows, so does the cost of owning and operating them.
This will affect our return on investment if it is not managed properly, hence we need to
ensure that the machines operate efficiently and are maintained cost effectively throughout their
technologically useful lives.

Another area of change is challenging our fundamental belief about the relationship between operating age
and failure. In the past we were led to believe that most failures were age related, and as a result we developed
maintenance policies accordingly. This was generally true for those failures where the equipment came into
direct contact with the product, e.g. a pump impeller. However, new research has made it apparent that
there is less and less connection between the operating age of most physical assets and how likely they are
to fail. It should be pointed out, though, that the equipment itself has become more complex, and this has led to
substantial changes in failure behavior. Consequently, there is not just one pattern of failure but six, and
these need to be recognized when selecting suitable failure management policies.

Finally, another area where there has been rapid growth is in new maintenance techniques and concepts.
Looking at condition monitoring on its own there are between 300 and 400 techniques of which about a
third can be applied effectively to modern day physical assets. Engineers are also a lot more focused
on reliability and maintainability when it comes to designing equipment. Another trend with
organizations today is the shift towards teamwork and more involvement of the work force in decision-
making. In light of the above, the challenge facing organizations today is to find out which of these
techniques will be worthwhile and cost effective.

The next part of this paper describes a process for developing a physical asset management strategy that
enables users to address the most important of these changes simultaneously. The paper ends with how this
process was applied at the Massachusetts Water Resources Authority.

2 DEVELOPING A MAINTENANCE STRATEGY


Given all the day-to-day pressures facing maintenance managers, the first question is where does one start?

The answer lies in the fact that every physical asset is put into service because someone wants it to do
something. In other words, it is expected to fulfil a specific function or functions. Therefore maintenance is
all about preserving the functions of physical assets to ensure they continue to do what their users want
them to do. It is only when these functions have been defined that it becomes clear exactly what
maintenance is trying to achieve and precisely what is meant by “failed”. This makes it possible to move on
to the next step, which is to identify the reasonably likely causes and effects of each failed state. Once failure
causes (or failure modes) and effects have been identified, we are then in a position to assess how much
each failure matters. This in turn enables us to determine which of the full array of failure management
options should be used to manage each failure mode.

At this point, we have decided what must be done to preserve the functions of our assets. This process is
often called “work identification”.

When the tasks that need to be done – the maintenance requirements of each asset – have been clearly
identified, the next step is to decide sensibly what resources are needed to do each task by asking the
following questions:
• Who is to do each task: A skilled maintainer? The operator? A contractor? The training department
(if training is required)? Engineers (if the asset must be redesigned)?
• What spares and tools are needed to do each task, including tools such as condition monitoring
equipment?
It is only when the resource requirements are clearly understood that we can decide exactly what systems
are needed to manage the resources in such a way that the task gets done, and hence that the functions of
the assets are preserved.

This process can be likened to building a house. The foundations are the maintenance requirements of each
asset, the walls are the resources required to fulfil the maintenance requirements (people/skills and
spares/materials/tools) and the roof represents the systems needed to manage the resources (CMMS).

To summarize, a maintenance strategy is developed and executed in three stages:


• Formulate a maintenance strategy for each asset (work identification)
• Acquire the resources needed to execute the strategy effectively (people, spares, tools)
• Execute the strategy (acquire, deploy and operate the systems needed to manage the resources
efficiently).
In other words, build your foundations first, then your walls, then your roof.

In the absence of any comparable asset management strategy formulation processes, the only really
effective way to do all this at once for modern, complex industrial processes is to arrange for groups of
appropriately trained operators, maintainers, supervisors and specialists who live with the asset on a day-to-
day basis to apply Reliability-Centered Maintenance under the guidance of a suitably qualified facilitator.

3 RELIABILITY-CENTERED MAINTENANCE
Reliability-Centered Maintenance (RCM) is defined as ‘a process used to determine what must be done to ensure
that any physical asset continues to do whatever its users want it to do in its present operating context’. The
RCM process entails asking seven questions about the asset or system under review, as follows:

• What are the functions and associated performance standards of the asset in its present operating
context?
• In what ways does it fail to fulfill its functions?
• What causes each functional failure?
• What happens when each failure occurs?
• In what way does each failure matter?
• What can be done to predict or prevent each failure?
• What should be done if a suitable proactive task cannot be found?

These questions are reviewed in the following paragraphs.

3.1 Functions and Performance Standards
Part two of this paper mentioned that it is only when the functions of an asset have been defined that it
becomes clear exactly what maintenance is trying to achieve, and also precisely what is meant by “failed”.

For this reason the first step in the RCM process is to define the functions of each asset in its operating
context, together with the associated desired standards of performance. The users of the assets are usually
in the best position to know exactly what contribution each asset makes to the physical and financial well-
being of the organization as a whole, so it is essential that they are involved in the RCM process from the
outset.

3.2 Functional Failures


The objectives of maintenance are defined by the functions and associated performance expectations of the
asset. But how does maintenance achieve these objectives?

The only occurrence that is likely to stop any asset performing to a standard required by its users is some
kind of failure. However, before we can apply a suitable blend of failure management tools, we need to
identify what failures can occur. The RCM process does this at two levels:
• By identifying what circumstances amount to a failed state
• Then by asking what events can cause the asset to get into a failed state.

In the world of RCM, failed states are known as functional failures because they occur when an asset is
unable to fulfill a function to a standard of performance which is acceptable to the user. In addition to the
total inability to function, this definition encompasses partial failures, where the asset still functions but at
an unacceptable level of performance (including situations where the asset cannot sustain acceptable levels
of quality or accuracy).
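
The distinction between total and partial functional failure can be sketched in a few lines of Python. This is only an illustration, not part of the RCM process as published: the class and function names, the pump function statement and the 800 litres per minute standard are all hypothetical, and a single numeric performance standard is assumed for simplicity.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    FUNCTIONING = "functioning"
    PARTIAL_FAILURE = "partial functional failure"
    TOTAL_FAILURE = "total functional failure"


@dataclass(frozen=True)
class FunctionStatement:
    """A function plus the minimum performance its users will accept."""
    description: str
    required_performance: float  # hypothetical units, e.g. litres per minute


def classify(function: FunctionStatement, measured_performance: float) -> State:
    """Compare measured performance with the users' required standard."""
    if measured_performance <= 0:
        # The asset does nothing at all: total inability to function.
        return State.TOTAL_FAILURE
    if measured_performance < function.required_performance:
        # The asset still works, but below the acceptable standard.
        return State.PARTIAL_FAILURE
    return State.FUNCTIONING


# Hypothetical example: a pump that must deliver at least 800 litres per minute.
pump_function = FunctionStatement("Transfer water from tank X to tank Y", 800.0)
for flow in (850.0, 600.0, 0.0):
    print(flow, classify(pump_function, flow).value)
```

The point of the sketch is that "failed" is defined relative to the standard the user requires, not relative to whether the asset is still running.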

3.3 Failure Modes


Once each functional failure has been identified, the next step is to try to identify all the events that are
reasonably likely to cause each failed state. These events are known as failure modes. ‘Reasonably likely’
failure modes include those that have occurred on the same or similar equipment operating in the same
context, failures that are currently being prevented by existing maintenance regimes, and failures that
have not happened yet but that are considered to be real possibilities in the context in question.

Most traditional lists of failure modes incorporate failures by deterioration or normal wear and tear.
However, the list should include failures caused by human errors (by operators or maintainers) and design
flaws so that all reasonably likely causes of equipment failure can be identified and dealt with
appropriately. It is also important to identify the cause of each failure in enough detail to make it possible
to identify a suitable failure management policy.

3.4 Failure Effects


The fourth step in the RCM process entails listing failure effects, which describe what physically happens
when each failure mode occurs. These descriptions should include all the information needed to support the
evaluation of the consequences of failure, such as:
• What evidence (if any) is there that the failure has occurred?
• In what ways (if any) does it pose a threat to safety or the environment?
• In what ways (if any) does it affect production or operations?
• What physical damage (if any) is caused by the failure?
• What must be done to repair the failure?
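
The first four steps (functions, functional failures, failure modes and failure effects) form a hierarchy that a review group has to record somewhere. A minimal sketch of such a record is shown below, assuming a simple nested structure. The class names, field names and the pump example are hypothetical and are not taken from any particular RCM worksheet or software package.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FailureMode:
    cause: str    # the event that causes the failed state
    effects: str  # what physically happens when this failure mode occurs


@dataclass
class FunctionalFailure:
    description: str
    failure_modes: List[FailureMode] = field(default_factory=list)


@dataclass
class AssetFunction:
    statement: str  # function plus its desired standard of performance
    functional_failures: List[FunctionalFailure] = field(default_factory=list)


def count_failure_modes(functions: List[AssetFunction]) -> int:
    """Total number of failure modes recorded across all functions."""
    return sum(
        len(failure.failure_modes)
        for function in functions
        for failure in function.functional_failures
    )


# Hypothetical entries for a single pump.
worksheet = [
    AssetFunction(
        statement="Transfer water from tank X to tank Y at not less than 800 L/min",
        functional_failures=[
            FunctionalFailure(
                description="Unable to transfer any water",
                failure_modes=[
                    FailureMode("Bearing seizes", "Motor trips, low-level alarm sounds"),
                    FailureMode("Impeller comes adrift", "Motor runs but no flow"),
                ],
            ),
            FunctionalFailure(
                description="Transfers less than 800 L/min",
                failure_modes=[FailureMode("Impeller worn", "Flow gradually drops")],
            ),
        ],
    )
]
print(count_failure_modes(worksheet))
```

Keeping each failure mode attached to the functional failure it causes makes it straightforward to evaluate consequences one failure mode at a time in the later steps.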

3.5 Failure Consequences
A detailed analysis of an average industrial undertaking is likely to yield between three and ten thousand
possible failure modes. As mentioned in the introduction of this paper, each of these failures affects the
organization in some way, but in each case, the consequences are different. The RCM process classifies
these consequences into four groups, as follows:

• Hidden failure consequences: Hidden failures have no direct impact, but they expose the
organization to multiple failures with serious, often catastrophic, consequences.

• Safety and environmental consequences: A failure has safety consequences if it could hurt or kill
someone. It has environmental consequences if it could lead to a breach of any corporate, regional,
national or international environmental standard.

• Operational consequences: A failure has operational consequences if it affects production
(output, product quality, customer service or operating costs in addition to direct cost of repair).

• Non-operational consequences: Evident failures that fall into this category affect neither safety nor
production, so they involve only the direct cost of repair.

The RCM process uses these categories as the basis of a strategic framework for maintenance decision-
making. By forcing a structured review of the consequences of each failure mode in terms of these
categories, it focuses attention on the maintenance activities which have the most effect on the performance
of the organization, and diverts energy away from those that have little or no effect (or which may even be
actively counterproductive). It also encourages users to think more broadly about different ways of
managing failure, rather than to concentrate only on failure prevention.
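
The four consequence groups can be illustrated with a small classification function. This is a sketch under stated assumptions: the field names are hypothetical, the answers are reduced to yes/no flags, and the order in which the questions are asked (hidden first, then safety and environment, then operations) is an assumption of this example rather than something spelled out in the text above.

```python
from dataclasses import dataclass
from enum import Enum


class Consequence(Enum):
    HIDDEN = "hidden failure consequences"
    SAFETY_ENVIRONMENTAL = "safety and environmental consequences"
    OPERATIONAL = "operational consequences"
    NON_OPERATIONAL = "non-operational consequences"


@dataclass(frozen=True)
class FailureAssessment:
    """Yes/no answers a review group might record for one failure mode."""
    evident_to_operators: bool
    could_hurt_or_kill: bool
    could_breach_environmental_standard: bool
    affects_production: bool


def classify_consequence(assessment: FailureAssessment) -> Consequence:
    # Assumed order of questions: hidden, then safety/environment, then operations.
    if not assessment.evident_to_operators:
        return Consequence.HIDDEN
    if assessment.could_hurt_or_kill or assessment.could_breach_environmental_standard:
        return Consequence.SAFETY_ENVIRONMENTAL
    if assessment.affects_production:
        return Consequence.OPERATIONAL
    return Consequence.NON_OPERATIONAL


# Hypothetical failure mode: an evident failure that only stops production.
example = FailureAssessment(
    evident_to_operators=True,
    could_hurt_or_kill=False,
    could_breach_environmental_standard=False,
    affects_production=True,
)
print(classify_consequence(example).value)
```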

3.6 Failure Management Policy Selection


Failure management policies are divided into two categories:

• Proactive tasks: these tasks are undertaken before a failure occurs, in order to prevent the item from
getting into a failed state. As discussed below, RCM further subdivides these tasks into scheduled
restoration, scheduled discard and on-condition maintenance.

• Default actions: these deal with the failed state, and are chosen when it is not possible to identify
an effective proactive task. Default actions include failure-finding, redesign and run-to-failure.

Scheduled restoration and scheduled discard tasks


Scheduled restoration entails remanufacturing a component or overhauling an assembly at or before a
specified age limit, regardless of its condition at the time. Similarly, scheduled discard entails discarding an
item at or before a specified life limit, regardless of its condition at the time.
Collectively, these two types of tasks are now generally known as preventive maintenance.

On-condition tasks
On-condition techniques rely on the fact that most failures give some warning that they are about
to occur. These warnings are known as potential failures, and are defined as identifiable physical
conditions that indicate that a functional failure is about to occur or is in the process of occurring.
On-condition tasks are used to detect potential failures so that action can be taken to reduce or eliminate the
consequences that could occur if they were to degenerate into functional failures. This category of tasks
includes all forms of predictive maintenance, condition-based maintenance and condition monitoring.

Failure-finding
Failure-finding entails checking hidden functions to find out if they have failed (as opposed to the on-
condition tasks described above, which entail checking if something is failing).

Redesign
Redesign entails making any one-time change to the built-in capability of a system. This includes changes
to hardware, one-time changes to procedures and, if necessary, training.

No scheduled maintenance
This default entails making no effort to anticipate or prevent the failure modes to which it is applied, so
those failures are simply allowed to occur and are then repaired. This default is also called run-to-failure.

3.7 The RCM Task Selection Process


The RCM process applies a highly structured consequence evaluation and policy selection algorithm to
each failure mode. It incorporates precise and easily understood criteria for deciding which (if any) of the
proactive tasks is technically feasible in any context, and if so for deciding how often and by whom the
tasks should be done. It also incorporates criteria for deciding whether any task is worth doing, a decision
that is governed by how well the candidate task deals with the consequences of the failure. Finally, if a
proactive task cannot be found that is both technically feasible and worth doing, the algorithm leads users
to the most suitable default action for dealing with the failure.
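
The overall shape of this selection logic can be sketched as follows. This is a deliberately simplified illustration and not the RCM decision algorithm itself: the names are hypothetical, the feasibility and worth-doing judgements are reduced to flags supplied by the review group, candidates are simply tried in the order given, and the rule used here to pick a default action (failure-finding for hidden failures, redesign for safety or environmental consequences, otherwise run-to-failure) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class CandidateTask:
    name: str  # e.g. "on-condition", "scheduled restoration", "scheduled discard"
    technically_feasible: bool
    worth_doing: bool


def select_policy(
    candidates: Iterable[CandidateTask],
    hidden: bool,
    safety_or_environmental: bool,
) -> str:
    """Return the first acceptable proactive task, else a default action."""
    for task in candidates:
        # A proactive task must pass both tests to be selected.
        if task.technically_feasible and task.worth_doing:
            return task.name
    # No suitable proactive task: fall back to a default action (assumed mapping).
    if hidden:
        return "failure-finding"
    if safety_or_environmental:
        return "redesign"
    return "run-to-failure"


# Hypothetical failure mode where no proactive task passes both tests.
candidates = [
    CandidateTask("on-condition", technically_feasible=False, worth_doing=False),
    CandidateTask("scheduled restoration", technically_feasible=True, worth_doing=False),
    CandidateTask("scheduled discard", technically_feasible=False, worth_doing=False),
]
print(select_policy(candidates, hidden=False, safety_or_environmental=False))
```

Even in this reduced form, the structure shows why routine workloads fall: a task that is feasible but not worth doing is never scheduled.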

This approach means that proactive tasks are only specified for failures that really need them, which in turn
leads to substantial reductions in routine workloads. In fact, if RCM is correctly applied to existing
maintenance programs, it reduces the amount of routine work (in other words, tasks to be undertaken on
a cyclic basis) issued in each period, usually by 40% to 70%. On the other hand, if RCM is used to develop a
new maintenance program, the resulting scheduled workload is much lower than if the program is
developed by traditional methods. Less routine work also means that the remaining tasks are more likely to
be done properly. This together with the elimination of counterproductive tasks leads to more effective
maintenance.

4 APPLYING RCM AT THE MASSACHUSETTS WATER RESOURCES AUTHORITY
The Massachusetts Water Resources Authority (MWRA) is responsible for providing wholesale water and
sewerage services, in whole and in part, to 61 communities and 2.6 million people in the greater Boston
metropolitan area. In addition to its operating responsibilities, MWRA is responsible for rehabilitating,
repairing, and maintaining the regional water and sewerage systems.

Since its assumption of the ownership and operations of the systems in 1985, MWRA has undertaken an
ambitious program of water and wastewater system capital improvements with estimated expenditures for
fiscal years 1986 through 2009 of more than $7 billion. Under one massive construction effort, the Boston
Harbor Project, the MWRA assumed maintenance responsibility for the $3.8 billion Deer Island Treatment
Plant (DITP) designed to treat 1.2 billion gpd. Deer Island is the second largest wastewater treatment
facility in the USA. The new treatment plant’s operations and discharge water quality are closely monitored
by state and federal agencies and environmental organizations through an extremely stringent permit.

Given the significant value and critical nature of the MWRA assets, maintenance is of paramount
importance. In 1996, the Facilities Asset Management Program (FAMP) initiative was created as a
comprehensive agency-wide effort to most effectively manage the region’s water and sewerage
infrastructure. The purpose of the FAMP initiative is to optimize the effectiveness and efficiency of
MWRA equipment maintenance practices (i.e., minimize critical equipment failures, minimize unnecessary
maintenance practices, improve equipment reliability, and heighten system knowledge within the work
force).

In 1999, the MWRA Capital Programs Group selected New Dimension Solutions (NDS) to help facilitate
changes in MWRA maintenance practices. One part of the FAMP initiative (Task 4) was for NDS to
review various industry-wide maintenance optimization strategies and recommend a maintenance strategy
that would be best suited to the DITP. After reviewing these strategies, NDS recommended
Reliability-centered Maintenance; however, the final decision lay with MWRA management. RCM was
selected for the pilot because it addressed many of the questions and concerns posed by the MWRA. The
following paragraphs address the steps followed and the results achieved by applying RCM to a pilot
project on the physical assets of Primary Clarifier Battery A.

4.1 Planning
The successful application of RCM at the MWRA depended first and foremost on meticulous
planning and preparation. The key elements of the planning were as follows:

Scope of the Pilot Project


Twelve pre-selected sub-systems were identified within an area of the DITP located in the Primary Clarifier
Battery A area. The systems selected were as follows:
• Primary Sludge Pumps
• Primary Scum Collectors
• Channel Aeration Blowers
• Cross Collector
• Upper and Lower Long Collectors
• Chlorine Monitoring
• HVAC
• Power Supply
• Fire Protection
• Sump Pumps
• Sample Pumps
• Hot Water Flushing

Define and Quantify the Objectives of Each Project


Performance metrics were identified for each of the above sub-systems in order to measure the impact of
the pilot project in terms of system reliability, PM-hours increase/reduction, safety related failures and
overall reduction in maintenance costs.

Estimate the time needed to review each sub-system


NDS estimated the number of three-hour meetings needed to review each of the sub-systems, as well as
identifying the system boundaries for each analysis.

Identify Project Manager and Facilitators


An MWRA project manager was identified, as well as eight potential facilitators.

Identify Participants
Each of the review groups for each sub-system was identified by name and by title. These involved
personnel from operations, maintenance and engineering.

RCM Training
In support of the implementation of the RCM process, NDS conducted three three-day RCM training
courses, training 60 personnel, ranging from senior management to engineers, supervisors, union
representatives, and review group participants. The training was absolutely essential to develop a culture
that was going to be responsive to change. A ten-day facilitator course was also conducted, in which NDS trained
and coached eight facilitators.

Project Planning
Each and every meeting was planned along with the scheduled audit meeting and the implementation of the
recommendations.

4.2 Management Buy-In
Management support was extremely important for the successful RCM effort at the MWRA. The
commitment of resources for the RCM analyses and training was significant and required the support of
senior management. The Director of DITP made a point of kicking off several one-hour RCM presentations
given to over 300 DITP staff, where he communicated the importance of the whole program. He also made
sure he was in attendance at each audit meeting, lending further support to the whole process. Credit
must also be given to other managers at the MWRA for maintaining extremely high visibility throughout
the pilot project.

4.3 The Outcome of the RCM Analyses


The RCM analyses resulted in three tangible outcomes, as follows:
• Maintenance schedules to be carried out by the maintenance department
• Revised operating procedures for the operators of the asset
• A list of one-off changes that must be made to the design of the asset or to the way it is operated
to deal with situations where the asset cannot deliver the desired performance in its current
configuration.

4.4 Auditing
After the review had been completed for each sub-system, senior managers with overall responsibility
for the assets satisfied themselves that the decisions made by the review group members were
sensible and defensible. The Director of DITP made a point of attending each audit meeting, which added
tremendous support for the project.

4.5 Results Achieved


Preventive Maintenance Results
The RCM pilot resulted in an overall reduction of approximately 25% in scheduled PM
hours. Preventive maintenance tasks such as inspections of components for noise and leaks, and
flow tests, were assigned to operations staff, who were doing these tasks anyway. In addition, low value PMs and
intrusive PMs that could introduce failures were eliminated.

Primary Sludge Pumps - The primary sludge pumps transfer sludge from the 12 clarifiers to the residual
facility for further processing. Each set of two clarifiers has three dedicated sludge pumps. During normal
operation, one sludge pump is aligned to each clarifier leaving one redundant pump for each set of two
clarifiers.

Prior to the RCM analysis, the primary sludge pumps were rotated to provide equal run time and wear on
each pump. As part of the recommendations from the RCM analysis, two pumps were designated as duty pumps and
the remaining pump as a standby. The PM tasks for the standby pump were reduced significantly
and replaced by a three-monthly functional test to ensure its availability in the event that a duty pump failed.
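
The pump arrangement described above can be checked with a short calculation. The clarifier and pump counts come from the text. Two things are assumptions of this sketch: that the duty/standby designation applies to every set of two clarifiers, and that the functional test is repeated every three months. The variable names are illustrative only.

```python
# Figures taken from the text.
CLARIFIERS = 12
CLARIFIERS_PER_SET = 2
PUMPS_PER_SET = 3
DUTY_PUMPS_PER_SET = 2

# Assumption: the standby functional test is repeated every three months.
TEST_INTERVAL_MONTHS = 3

pump_sets = CLARIFIERS // CLARIFIERS_PER_SET
total_pumps = pump_sets * PUMPS_PER_SET
duty_pumps = pump_sets * DUTY_PUMPS_PER_SET
standby_pumps = total_pumps - duty_pumps

tests_per_standby_per_year = 12 // TEST_INTERVAL_MONTHS
functional_tests_per_year = standby_pumps * tests_per_standby_per_year

print(f"pump sets: {pump_sets}")
print(f"total pumps: {total_pumps} ({duty_pumps} duty, {standby_pumps} standby)")
print(f"standby functional tests per year: {functional_tests_per_year}")
```

Under these assumptions there are 6 sets and 18 pumps, of which 12 are duty and 6 are standby, giving 24 standby functional tests per year across the battery.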

The primary sludge pumps were installed with a gear reducer and internal needle bearings that required
greasing every 10,000 hours or 3.5 years. This task was time consuming, requiring the pump to be
disassembled to repack the gear reducer bearings with grease. In addition, 39 gear reducer shaft failures had
occurred due to the bearings seizing. The pump vendor was brought in to review the shaft failures, and the
conclusion was that the bearings were overheating due to friction from the gear reducer components. A
redesign of the pump was recommended, which replaced a metallic component with a ceramic component.
As a result, no failures have occurred since.

Aeration Blowers – Five aeration blowers supply air to the primary and secondary influent channels for
aeration and mixing to suspend solids and improve the performance of the primary clarifiers.

The RCM analysis concluded that the plant demand requires a maximum of two of the five aeration
blowers. As a result, two aeration blowers were designated as duty blowers and one aeration blower was
designated as a standby unit. The remaining two were mothballed, resulting in savings in PM and operational
costs.

References:
“Responsible Custodianship” presented by Aladon in collaboration with John Moubray. Sections 2 and 3
are reproduced in part by kind permission of Aladon Ltd.
“RCM Pilot Results at Deer Island Treatment Plant” presented by Dan Keough at the SMRP Conference
Nashville, Tennessee, October 2002.

About Peter Stock:


Peter Stock is qualified in mining and industrial engineering. He lives in Cape Canaveral, Florida and is
VP of RCM Consulting for New Dimension Solutions, based in New York. After graduating in 1975 he
worked for a major gold mining company in South Africa for three years, thereafter working as an
industrial engineer for a large industrial battery manufacturer for the next four years.

For the next ten years, he worked for large multi-disciplinary management consulting companies,
specializing in the development and implementation of manual and computerized maintenance
management systems for a wide variety of clients in mining, paper and steel manufacturing and electric
utility companies.

In 1991 under the guidance and mentorship of John Moubray, Peter Stock started his own company in
South Africa specializing in the application of RCM. He later joined Aladon in Canada and the USA,
playing a major role in the development and dissemination of the RCM methodology. He researched and
co-authored Appendix 4 – “Condition Monitoring Techniques” in the book written by John Moubray on
RCM. He joined New Dimension Solutions in 1999 and has since played a major role in the successful
RCM initiatives at several major organizations within the USA.
