Failure Modes and Effects Analysis
Dr. M. Hodkiewicz
Contents
• Definitions and background
• Design and Process FMEA
• Terminology
• The FMEA Process
• Discussion
• Future developments
Hodkiewicz – 2
“FMEA”
Learning Outcomes
• After this session you will be able to:
– Explain the role of FMEA/FMECA in the AM life-cycle
process
– Identify key components in the FMEA process
– Conduct and report on a simple FMEA exercise
– Appreciate challenges with FMEA implementation
– Appreciate how FMEA can be updated by
integration into the routine maintenance
environment
References (1 of 2)
[1] Dailey, K.W., The FMEA Pocket Handbook. 2004: DW Publishing Company.
[2] McDermott, R.E., R.J. Mikulak, and M.R. Beauregard, The Basics of FMEA. 1996: Productivity.
[3] SAE J1739: Potential Failure Modes and Effects Analysis in Design and Potential Failure Effects in Manufacturing and Assembly Processes Reference Manual - Draft for review. 2005.
[4] MIL-STD-1629A: Procedure for performing a failure mode, effects and criticality analysis. 1980.
References (2 of 2)
[5]Tweeddale, M., Managing Risk and Reliability of Process
Plants. 2003: Gulf Publishing.
[6]ISO 14224: Petroleum and natural gas industries -
Collection and exchange of reliability and maintenance data
for equipment. 1999.
[7]IEC 60050-191: International Electrotechnical Vocabulary
- Dependability and Quality of Service. 1990.
[8] MIL-STD-721C Definitions of terms for reliability and
maintainability. 1995.
[9] Macaulay, D., The Way Things Work. 1988: RD Press.
[10] What's wrong with your existing FMEAs, 24/7 Quality.com.
Software/Internet Resources
• FMEA Info Centre:
http://www.fmeainfocentre.com/
• http://www.weibull.com/basics/fmea.htm
• On-line paper: B. S. Dhillon, Failure modes
and effects analysis-Bibliography,
Microelectronics
Informal definition
• “FMEA is a non-quantitative analysis that aims to
identify the nature of the failures that can occur in a
system, machine, or piece of equipment by examining the
sub-systems or components in turn, considering for each
the full range of possible failure types and the effect on
the system of each type of failure.
• FMECA is an extension of FMEA that assigns a ranking
to both the severity of the possible effects and their
likelihood, enabling the risks to be ranked” [From 5]
Philosophy
• FMEA is a ‘common sense’ procedure.
• The aim is to provide a framework/process to assist the
thought process of a competent person engaged in
identifying system or design problems.
• The process focuses on what we want the equipment to do,
not on what it actually is. By identifying what functions need
to be achieved, we can then identify situations when the
equipment does not perform the required function, and
focus attention on the related causes and effects.
An example FMEA report
For what activities is FMEA
appropriate?
• New designs, new technology, or new process
• Modifications to existing design or process
• Use of existing design or process in a new
environment, location or application
• Identifying monitoring and inspection practices for
equipment
• Identifying failure codes for the CMMS
• As part of the RCM process
Design and Process FMEA
Types of FMEA
• FMEA can be applied to a physical
entity or to a functional entity.
• For example, it can be applied to:
– a particular piece of equipment
(design FMEA), or
– a process function (process FMEA).
Example
[Figure: schematic of the 2nd gas export compression train. Recoverable labels: gas export header; LP and HP suction coolers; LP and HP suction drums; LP stage compression; HP stage compression; gearbox; discharge coolers; anti-surge (LP); anti-surge (HP); PCV 1 (LP); PCV 2; dehydration package; to subsea pipeline.]
Design FMEA (DFMEA)
• Identifies functional requirements of a design
• Evaluates the initial design for manufacturing,
assembly, service and recycling requirements.
• Used by Design Team. The customer for the
design team may be the end user, the design
engineer of the higher level assemblies or the
manufacturing process/assembly team.
Design FMEA in the AM
context
• If you are the maintenance engineer in an oil and
gas or similar facility, it is unlikely that you will
be involved in a design FMEA process.
• However, if you are (1) troubleshooting
equipment, (2) developing failure codes or (3)
engaged in RCM, then information from the
design FMEA conducted by the manufacturer may
be helpful.
Process FMEA (PFMEA)
• Identifies the process functions, process
requirements, potential product and process
failures and the effects on the customer.
• Identifies process/operational variables on which to
focus controls.
• Traditionally used by Manufacturing/ Assembly/
Process team. The customer can be a downstream
team, a service operation, or even government
regulations.
• In the AM arena, there is some overlap between
HAZOP and Process FMEA for operational
equipment
Machinery FMEA (MFMEA)
• This is a new category in the draft SAE
J1739-2005 aimed at Plant Machinery and
Tools.
• In AM, machinery FMEA may be
applied to important maintenance
support tools such as lathes, cranes,
milling machines etc.
• There are similarities in approach between
DFMEA, PFMEA and MFMEA.
Relationship
SYSTEM: Components, sub-systems, main systems
DESIGN: Components, sub-systems, main systems
PROCESS: Manpower, Machine, Method, Material, Measurement, Environment
MACHINERY: Tools, Work stations, production lines, operator training, processes, gauges
Approaches to FMEA
• A FMEA may be based on (a) a hardware/physical approach or
(b) a functional approach.
• (a) The hardware approach lists individual hardware
items and analyses their possible failure modes.
• (b) The functional approach recognises that every item
is designed to perform a number of functions that can
be classified as outputs. The outputs are listed and
their failure modes analysed.
• For complex systems, a combination of (a) and (b) may
be required.
Maintenance
• For Maintenance Personnel, FMEA is a direct approach
to the reduction of maintenance costs through the
elimination of faults that give rise to the maintenance
task.
• FMEA identifies the most critical problems first, paving the
way for improved maintenance techniques
• FMEA on installed equipment provides suggestions for
redesign and ‘proactive maintenance’
– (From: Hastings, 1998, Reliability and Maintenance Course
notes, QUT)
Terminology
Terms & definitions (1)
• Failure: Termination of the ability of an item to
perform a required function [7]
• Required function: Function, or combination of
functions, of an item which is considered
necessary to provide a given service [7]
• Failure mode: The manner by which a failure is
observed. Generally describes the way the failure
occurs and its impact on equipment operation [4].
Terms & definitions (2)
• Failure cause:
• (1) Circumstances during design, manufacture or use
which have led to failure [7].
• (2) The physical or chemical processes, design
defects, quality defects, part misapplication, or
other processes which are the basic reason for
failure or which initiate the physical process by
which deterioration proceeds [4].
• Failure mechanism: Physical, chemical or other
process which has led to failure [7]
Terms & definitions (3)
• Failure effect: The consequence a failure
mode has on the operation, function, or
status of an item [4].
• Critical failure: Failure of an equipment unit
which causes an immediate cessation of the
ability to perform its required function [6]
• Non-critical failure: Failure of an equipment unit
which does not cause an immediate cessation of
the ability to perform its required
function [6]
Terms & definitions (4)
• Criticality: A relative measure of the
consequences of a failure mode and its
frequency of occurrence [4]
• Severity: The consequence of a failure mode.
Severity considers the worst potential
consequence of a failure, determined by degree of
injury, property damage, or system damage that
could ultimately occur.
Terms & definitions (5)
• Reliability [8]: The probability that an item will
perform its intended function(s) for a specified
interval under stated conditions.
• Undetectable (Hidden) failure: A postulated
failure mode in the FMEA for which there is no
failure detection method by which the operator
is made aware of the failure [4].
The FMEA Process
[Figure: FMEA process flowchart. Recoverable steps: Define scope; Define level of analysis; Identify causes of failure; Assign occurrence (frequency) rating; Calculate risk priority number for each effect.]
Steps in the FMEA process
(from [3])
1. Define the SCOPE of the study (System
boundary)
2. Decide on the LEVEL of analysis (System,
sub-system, components)
3. For the selected system or sub-systems,
IDENTIFY and list functions and the potential
failure modes. Failure modes may be assessed at
the hardware or functional level, or a
combination of both.
Step 3 in the maintenance
context
• IDENTIFY and list failure modes …
– Information on what failed and when on a specific
piece of equipment or in a system should be available
in the maintenance management system (CMMS)
– Depending on the organization of the system and the
data quality processes, there may be a failure code
indicating the cause of failure.
Continued …
4. For the selected system or sub-system and for each
identified failure mode, identify the POTENTIAL
EFFECT(s) on the machine, system or process and the
relative importance (SEVERITY) of the effect(s).
Continued …
4. continued.
The effects could include:
– Injury to people
– Damage to the environment
– Damage to equipment
– Loss of production
– Reduced quality of production
– Increased cost of operation
Continued …
5. Assign an OCCURRENCE ranking to each failure
mode
6. For each failure mode for each element, identify
CONTROLS
– The means of preventing the failure by design,
operating and maintenance practices, and
management.
– The means of detecting the failure and responding
effectively to it
– The means (if any) of limiting the impact of the
failure, particularly by design changes.
Continued …
7. For each of the controls, assign a DETECTION ranking
8. Calculate the Risk Priority Number (RPN) for each
effect
9. Prioritise the failure modes for action (RANKING)
10. Take ACTION to eliminate or reduce the high-risk
failure modes
11. Calculate the resulting RPN as the failure modes are
reduced or eliminated.
Selecting the team
• Have you got representatives from all the
stakeholders?
• Do you have a facilitator?
• Are the team members familiar with the
subject but from diverse vantage points?
Setting up the meeting
• Provide advance notice
• Who will record meeting minutes?
• Who will facilitate?
• Establish ground rules
• Provide and follow an agenda
• Evaluate meetings
• Who will you report the results to?
• Allow no interruptions
Brainstorming rules [1]
• Participants must be enthusiastic and give their
imagination free rein
• The recorder must be given time to record ideas
• The ideas must be concisely recorded and placed in
clear view of participants
• Idea evaluation occurs after the session
• Set a firm time limit
• Clearly define the problem you want solved
• The moderator must keep the group on subject and
moving
• When time is up, the group rank the ideas
Selecting systems/sub-
systems and components
• It is important to have an agreed taxonomy when
breaking systems down into sub-systems and
components.
• This may be agreed by the team for a specific
FMEA, or the team may choose to use a taxonomy
described in a Standard, for example [6].
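An agreed taxonomy can be held as a simple nested structure so that every FMEA row refers to the same item names. The sketch below is illustrative only: the subunit names follow the subunit headings listed under "Deciding level for analysis [6]", while the component names and the `item_paths` helper are assumptions made for this example, not content taken from the Standard.

```python
# Hypothetical equipment taxonomy: system -> subunit -> components.
# Component names are placeholders chosen for illustration.
taxonomy = {
    "Pump": {
        "Power transmission": ["Gearbox", "Coupling"],
        "Pump unit": ["Casing", "Impeller", "Shaft"],
        "Control & monitoring": ["Pressure sensor", "Temperature sensor"],
        "Lubrication": ["Oil pump", "Filter"],
        "Miscellaneous": ["Purge air"],
    }
}


def item_paths(node, prefix=()):
    """Yield one (system, subunit, component) tuple per component."""
    if isinstance(node, dict):
        for name, child in node.items():
            yield from item_paths(child, prefix + (name,))
    else:
        for component in node:
            yield prefix + (component,)


paths = list(item_paths(taxonomy))
for path in paths:
    print(" / ".join(path))
```

Listing the full paths makes it easy for the team to check, before the workshop starts, that everyone is analysing items at the same level.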
Deciding level for analysis [6]
[Table: subunit headings are Power transmission, Pump unit, Control & monitoring, Lubrication, Miscellaneous. Rows not recovered.]
Defining functions and failure
modes
• Required function: Function, or combination of
functions, of an item which is considered
necessary to provide a given service [7].
• Be explicit so it is clear when a functional
failure has occurred.
• Failure mode: The manner by which a failure is
observed. Generally describes the way the failure
occurs and its impact on equipment operation [4].
Defining functions and failures
• Equipment: Diesel Engine Crankshaft
• Function: To convert reciprocating force from
pistons and connecting rods into rotational force
through the bearings and crankshaft to the drive
coupling at a maximum rate of up to ‘x’ kW per
cylinder at up to ‘y’ rpm continuously or ‘z’ kW
per cylinder at ‘w’ rpm for up to ‘v’ hours in 12.
• Question: What are some possible functional
failures?
Functional failure
• Function: To convert reciprocating force from
pistons and connecting rods into rotational force
through the bearings and crankshaft to the drive
coupling
• Functional Failure: Unable to convert and
transmit any force from the pistons
Failure mode (1 of 2)
• Failure Mode (1): Damaged crankshaft axial
alignment bearing (ball race) due to lubrication
failure
• Failure Effect (1): Crankshaft will float axially and
foul on crankcase, misalignment of gear drives.
• Existing controls (1): Daily fuel dilution test,
weekly oil screen, change oil and filters as
required.
Failure mode (2 of 2)
• Failure Mode (2): Damaged crankshaft axial
alignment bearing (ball race) due to bearing
material failure
• Failure Effect (2): Same as (1). Crankshaft
will float axially and foul on crankcase,
misalignment of gear drives.
• Existing controls (2): Routine vibration
monitoring. Replace bearing as required.
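The crankshaft example can be captured as structured records, which is roughly what an FMEA worksheet row holds. This is a minimal sketch, assuming a worksheet layout of function, functional failure, failure mode, cause, effect and existing controls; the class and field names are hypothetical and are not taken from any standard form.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FailureMode:
    """One worksheet row (field names are illustrative)."""
    description: str
    cause: str
    effect: str
    existing_controls: List[str] = field(default_factory=list)


@dataclass
class FunctionEntry:
    function: str
    functional_failure: str
    failure_modes: List[FailureMode] = field(default_factory=list)


# Both failure modes share the same effect, as stated on the slides.
EFFECT = ("Crankshaft will float axially and foul on crankcase, "
          "misalignment of gear drives")
BEARING = "Damaged crankshaft axial alignment bearing (ball race)"

crankshaft = FunctionEntry(
    function=("Convert reciprocating force from pistons and connecting rods "
              "into rotational force at the drive coupling"),
    functional_failure="Unable to convert and transmit any force from the pistons",
    failure_modes=[
        FailureMode(BEARING, "Lubrication failure", EFFECT,
                    ["Daily fuel dilution test", "Weekly oil screen",
                     "Change oil and filters as required"]),
        FailureMode(BEARING, "Bearing material failure", EFFECT,
                    ["Routine vibration monitoring",
                     "Replace bearing as required"]),
    ],
)


def causes_by_effect(entry: FunctionEntry) -> Dict[str, List[str]]:
    """Group causes under the effect they lead to."""
    grouped: Dict[str, List[str]] = {}
    for mode in entry.failure_modes:
        grouped.setdefault(mode.effect, []).append(mode.cause)
    return grouped


grouped = causes_by_effect(crankshaft)
```

Grouping by effect shows at a glance that two different causes, each with its own controls, lead to the same consequence.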
Examples of failure modes
• For mechanical equipment
– Cracked, Loosened, Fractured, Leaking,
Oxidised, Loss of structural support,
Deformed, Slips, Disengages too fast,
Failure to transmit torque.
• For electrical equipment
– No signal, Intermittent signal, Inadequate
signal, Sticking, Drift,
Process Failure modes
(from [4])
[Table: failure modes and their definitions. Rows not recovered.]
Examples of design failure causes
• Improper tolerances
• Incorrect (stress or other)
calculations
• Wrong assumptions
• Wrong material
• Lower grade components
• Lack of design standards
• Incorrect algorithm
• Insufficient lubrication capability
• Excessive heat
Examples of failure causes in
manufacturing & process
1. Skipped steps
2. Processing errors
3. Set up errors
4. Missing parts
5. Wrong parts
6. Processing incorrect work piece
7. Mis-operation
8. Adjustment error
9. Equipment improperly set-up
10. Tools improperly prepared
11. Poor control procedures
12. Improper equipment maintenance
13. Bad ‘recipe’
14. Fatigue
15. Lack of safety
16. Hardware failure
17. Failure to enforce controls
18. Environment
19. Stress connections
20. Poor FMEAs
Example of design failure
mechanisms
• Yield
• Fatigue
• Material instability
• Creep
• Wear
• Corrosion
• Chemical oxidation
Examples of design controls
• Prototype testing
• Design reviews
• Worst case stress analysis
• FEA
• Fault tree analysis
ACTIVITY
• Workshop activity to identify functions
and failure modes
• The aim of this activity is to see how the
functions are broken down and assessed at
the different levels.
Bicycle for (Male) Commuter [from 3]
• List some design objectives (functions) of a
regular commuter bicycle
• For two of the functions, identify potential
failure modes.
Bicycle example continued
• Identify some of the sub-systems of the
bicycle
• For one subsystem: Identify at least two
functions and failure modes.
Recording the FMEA process
Severity (S)
• A relative ranking, within the scope of the
individual FMEA.
• A reduction in Severity can be achieved by
design change to system, sub-system or
component, or a redesign of the process.
• The rank depends on the evaluation criteria.
Examples of suitable tables are available in the
literature. Some companies may have standard
tables.
Severity Tables (from [1])
Occurrence (O)
• This is the likelihood that a specific
cause/mechanism (listed in the previous
column) will occur. Occurrence is
usually based on ranking charts and is a
relative rating within the scope of the
FMEA.
Occurrence Tables (from [1])
Controls
• Controls are measures that:
– (1) prevent, to the extent possible, the failure
mode or cause from occurring, or reduce the rate
of occurrence, or
– (2) detect the cause/mechanism and lead to
corrective action, or
– (3) detect the failure mode or cause should it
occur.
Detection ranking (D)
• A rank associated with the best type of
control listed in the previous column.
Detection is a relative ranking within the
scope of the FMEA.
Detection Tables (from [1])
Risk Priority Number
• RPN = (S) × (O) × (D)
• Within the scope of the individual
FMEA, the resulting value (between 1
and 1000) can be used to rank order the
concerns identified by the process. This
allows the highest ranking items to be
identified and addressed.
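The RPN calculation and rank ordering can be sketched in a few lines. The example assumes each ranking is on a 1 to 10 scale, which is what the stated range of 1 to 1000 implies; the concern names and their S, O and D values are hypothetical, chosen only to show the arithmetic.

```python
def rpn(severity, occurrence, detection):
    """Return the Risk Priority Number, S x O x D (1-10 scales assumed)."""
    rankings = {"severity": severity, "occurrence": occurrence,
                "detection": detection}
    for name, value in rankings.items():
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return severity * occurrence * detection


# Hypothetical concerns with (S, O, D) rankings.
concerns = {
    "Bearing damage due to lubrication failure": (7, 4, 3),
    "Bearing damage due to material failure": (7, 2, 5),
    "Seal leak": (4, 6, 2),
}

# Highest RPN first.
ranked = sorted(concerns, key=lambda name: rpn(*concerns[name]), reverse=True)
for name in ranked:
    print(rpn(*concerns[name]), name)
```

Because each ranking is relative to the scope of the individual FMEA, the resulting order is only meaningful within that one study.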
Action plans
• Recommended action(s)
• Corrective action should be directed at high-severity,
high-RPN issues. The intent of the action is to reduce
rankings in the order of preference: severity, occurrence
and detection.
• Actions taken and resulting revised ratings
• After a preventative/corrective action has been identified,
estimate and record the resulting S, O and D rankings. All
revised rankings should be reviewed to see if further
action is necessary.
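Recording the rankings before and after an action makes the review step concrete. This is a sketch under stated assumptions: rankings use 1 to 10 scales, and the two review limits (`rpn_limit` and `severity_limit`) are hypothetical values that a team would set for itself; they are not prescribed by the references.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ranking:
    """S, O and D rankings for one failure mode (1-10 scales assumed)."""
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection


def needs_further_action(revised, rpn_limit=100, severity_limit=9):
    """Flag a revised ranking for review; both limits are hypothetical."""
    return revised.severity >= severity_limit or revised.rpn >= rpn_limit


# Hypothetical rankings: the action lowered occurrence and detection only.
before = Ranking(severity=8, occurrence=5, detection=6)
after = Ranking(severity=8, occurrence=2, detection=3)

print(before.rpn, after.rpn)
print(needs_further_action(after))
```

Checking severity separately from the RPN reflects the point that a high-severity item deserves attention even when its RPN is low.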
Design FMEA actions
• An increase in design validation/verification
actions will result in a reduction of the ‘detection’
ranking only
• Occurrence ranking can be reduced by
removing or controlling the causes or
mechanisms through design revision
• Design revision can also affect severity
ranking
Discussion
Drawbacks (1 of 2)
• 1. The ratings and the RPN are subjective
• 2. The categorisation into failure mode and cause does not
allow for thinking in terms of causal chains, the ‘5 Whys’
or other processes. For each mode you must have a failure
cause. This cause may have a deeper cause. Sometimes the
cause and the mode are the same.
• 3. It can be difficult to control brainstorming sessions
Drawbacks (2 of 2)
• 4. Legal ramifications: if you have identified a failure
mode but you have not eliminated it, are you culpable of
negligence?
• 5. The approach makes it difficult to allow for the
interaction of two benign failure modes.
• 6. FMEA often assumes that the part is ‘in tolerance’. To
assume otherwise expands the scope of FMEA
considerably. However, in real life, out-of-spec parts are
common.
Common problems with FMEA
(from [10])
• Engineers often do not follow a recognised standard and
company format
• Multiple descriptions of the exact same failure mode,
cause or effect
• No recommendations or corrective action for high
RPN items
• Inconsistent documents between parts of the study
• No document control or revision control
• Engineers don’t see value in a FMEA; they find
it a pain to perform and labour intensive
• Companies perform a FMEA study when it is too late.
Benefits of design FMEA
(1 of 2)
• Aids in objective evaluation of design, including
functional requirements and design
alternatives
• Evaluating the initial design for manufacturing,
assembly, service and recycling requirements
• Increasing the probability that potential failure
modes and their effects on the system have been
considered in the design/development process.
Benefits of design FMEA
(2 of 2)
• Developing a ranked list of potential failure modes
according to their effect on the customer (can be the
assembly team), thus establishing a priority system for
design improvements, development and validation
testing/analysis.
• Providing an open-issue format for recommending and
tracking risk reduction actions
• Providing future reference, e.g. lessons learned, to aid in
analysing field concerns, evaluating design changes and
developing advanced designs
Benefits of Process FMEA
• Identifies the process functions and requirements
• Identifies potential product and process related failure
modes
• Assesses the potential customer effects of the
failures
• Identifies process variables on which to focus
process controls
• Develops a ranked list of potential failure modes thus
establishing a priority system for preventative/corrective
action considerations
• Documents the results of the analysis of the
manufacturing, assembly or production process
Future developments
FMEA and Failure Analysis: Closing
the Loop Between Theory and
Practice
What happens now?
• FMEA process:
– Proactive but subjective analysis of hypothetical
– Integrated into other methodologies
– Large upfront costs
– Results or benefits rarely substantiated
– Static process
– Non-inclusive
– Completed reports collect dust
– Data & process owned by engineering
What happens now?
• Failure analysis process & storage in CMMS:
– Retrospective & selective view of reality
– Hampered by bad/missing data
– Evolved functionality
– Dictated by accountants
– Interfaces ruled by codes & structure
– Poor integration with non-financial systems
– Widely distributed, used & disliked
– Data & process owned by ops/maintenance
What happens now?
• FMEA & CMMS data rarely linked &/or integrated
• Why?
– Different process owners
– Non-uniform coding between FMEA & CMMS
systems
– Reporting may be at different hierarchy levels
– Hierarchies may be different
– Tradition
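The coding mismatch can be made visible with a simple set comparison. The sketch below is purely illustrative: the failure codes, the work order records and their field names are invented for the example and do not reflect any particular CMMS product or FMEA coding scheme.

```python
from collections import Counter

# Hypothetical FMEA failure-mode codes and descriptions.
fmea_codes = {
    "BRG-LUB": "Bearing damage due to lubrication failure",
    "BRG-MAT": "Bearing damage due to material failure",
    "SEAL-LEAK": "Seal leaking",
}

# Hypothetical CMMS work orders; failure_code may be missing.
work_orders = [
    {"order": 1001, "failure_code": "BRG-LUB"},
    {"order": 1002, "failure_code": "VIB"},
    {"order": 1003, "failure_code": "BRG-LUB"},
    {"order": 1004, "failure_code": None},
]

# Count how often each code was actually recorded.
observed = Counter(
    wo["failure_code"] for wo in work_orders if wo["failure_code"]
)

# Codes used in the CMMS that the FMEA does not know about.
unmatched_cmms = sorted(set(observed) - set(fmea_codes))
# Failure modes postulated in the FMEA but never recorded in the CMMS.
never_observed = sorted(set(fmea_codes) - set(observed))
# Work orders with no failure code at all.
uncoded = sum(1 for wo in work_orders if not wo["failure_code"])

print(unmatched_cmms, never_observed, uncoded)
```

Each of the three results points at a different problem: non-uniform coding, postulated failure modes that have not been seen in practice, and missing data.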
Issues to overcome
• Structural issues
– Consistency of coding & reporting
– Adapt systems to users not administrators
Issues to overcome
• Organizational issues
– Implement cultural change
– Include the disenfranchised
– Increase frequency & quality of feedback
– Improve status of data collectors
What is our vision?
• Living FMEA
• Live links between theoretical (FMEA) &
actual (CMMS)
• Inclusive process, shared ownership for both
datasets
• Managed, audited & utilized data processes
• Live feeds into various business improvement
systems
How can we get there?
• Living FMEA model
• See Handout
Benefits
• Aids prioritization & guides business response
• Improves reliability analysis from:
– Reduced reliance on free text
– More consistent failure classification
• Facilitates maintenance optimization
• Uncovers disparity between theory & reality
• Creates knowledge workers
• Ensures a recorded, managed & auditable process
More benefits
• Reviews & measures success of FMEA
process
End