
Konstantoulakisi Rootcauseanalysis

The document is a comprehensive guide on Root Cause Analysis (RCA), detailing its definitions, methodologies, and applications in various fields such as quality control and failure analysis. It outlines a structured process for performing RCA, emphasizing the importance of addressing root causes rather than just symptoms to prevent recurrence of issues. Additionally, it discusses various RCA methods, principles, and their relevance in the maritime industry and other sectors.


NATIONAL TECHNICAL UNIVERSITY OF ATHENS

SCHOOL OF NAVAL ARCHITECTURE AND MARINE ENGINEERING

ROOT CAUSE ANALYSIS

IOANNIS KONSTANTOULAKIS

Supervising professor: Vasileios I. Papazoglou

ATHENS 2010
INDEX
CHAPTER 1 – DEFINITIONS

1.1 ROOT CAUSE ANALYSIS
1.2 QUALITY CONTROL
1.2.1 Quality assurance
1.2.2 Failure testing
1.2.3 Statistical control
1.2.4 Company quality
1.2.5 Total quality control
1.3 FAILURE ANALYSIS
1.3.1 Forensic investigation
1.4 SYSTEMS ANALYSIS
1.4.1 Overview
1.4.2 Practitioners
1.5 GENERAL PRINCIPLES OF ROOT CAUSE ANALYSIS

CHAPTER 2 - GENERAL PROCESS FOR PERFORMING AND DOCUMENTING AN RCA-BASED CORRECTIVE ACTION

2.1 PHASE I – DATA COLLECTION
2.2 PHASE II – ASSESSMENT
2.2.1 Assessment and Reporting Guidance
2.2.1.1 Analyze and determine the events and causal factor chain
2.2.1.2 Summarize findings, list the causal factors, and list corrective actions
2.3 PHASE III – CORRECTIVE ACTIONS
2.4 PHASE IV – INFORM
2.5 PHASE V – FOLLOW-UP

CHAPTER 3 - ROOT CAUSE ANALYSIS METHODS

3.1 BAYESIAN INFERENCE
3.1.1 Evidence and changing beliefs
3.2 CURRENT REALITY TREE
3.2.1 Simplified explanation
3.2.2 Contextual explanation
3.2.3 Example
3.3 FAILURE MODE AND EFFECTS ANALYSIS
3.3.1 History
3.3.2 Implementation
3.3.3 Using FMEA when designing
3.3.4 Timing of FMEA
3.3.5 Uses of FMEA
3.3.6 Advantages
3.3.7 Limitations
3.3.8 Software
3.3.9 Types of FMEA
3.4 FAULT TREE ANALYSIS
3.4.1 History
3.4.2 Why Fault Tree Analysis?
3.4.3 Methodology
3.4.4 Analysis
3.5 5-WHYS
3.5.1 Example
3.5.2 History
3.5.3 Criticism
3.6 ISHIKAWA DIAGRAM
3.6.1 Overview
3.6.2 Causes
3.6.3 Categories
3.7 KEPNER-TREGOE PROBLEM ANALYSIS
3.7.1 Kepner-Tregoe (company)
3.7.2 Kepner-Tregoe (technique)
3.8 PARETO ANALYSIS
3.8.1 Steps to identify the important causes using Pareto analysis
3.9 RPR PROBLEM DIAGNOSIS
3.9.1 Overview
3.9.2 Limitations
3.9.3 History

CHAPTER 4 - BASIC ELEMENTS OF ROOT CAUSE ANALYSIS

CHAPTER 5 - ROOT CAUSE ANALYSIS AND CASUALTY INVESTIGATION IN MARITIME INDUSTRY

5.1 THE WAY INVESTIGATIONS USED TO BE DONE
5.2 WHY INVESTIGATE INCIDENTS?

CHAPTER 6 - REPORTS AND ANALYSIS OF NON-CONFORMITIES AS DESCRIBED IN THE COMPANY’S SAFETY MANAGEMENT SYSTEM MANUAL

6.1 GENERAL
6.2 RESPONSIBILITIES
6.3 DEFINITIONS
6.3.1 Non-conformities
6.3.2 Accidents
6.3.3 Hazardous Occurrences (Near-misses)
6.4 PROCEDURES
6.5 REPORTING NON-CONFORMITIES (NCRs)
6.6 CORRECTIVE ACTIONS
6.6.1 Treatment
6.6.2 Corrective actions
6.6.3 Assessment of the cause
6.6.3.1 Root Cause Analysis Methodology
6.6.3.2 Training for Root Cause Analysis
6.6.4 Recording

CHAPTER 7 - ANALYSIS OF REAL CASES WITH NON-CONFORMITIES AND NEAR-MISSES

CHAPTER 8 - SUMMARY AND RECOMMENDATIONS

CHAPTER 9 – REFERENCES

ROOT CAUSE ANALYSIS

CHAPTER 1
DEFINITIONS
1.1 ROOT CAUSE ANALYSIS
Root cause analysis (RCA) is a class of problem solving methods
aimed at identifying the root causes of problems or events. The practice
of RCA is predicated on the belief that problems are best solved by
attempting to correct or eliminate root causes, as opposed to merely
addressing the immediately obvious symptoms. By directing corrective
measures at root causes, it is hoped that the likelihood of problem
recurrence will be minimized. However, it is recognized that complete
prevention of recurrence by a single intervention is not always possible.
Thus, RCA is often considered to be an iterative process, and is
frequently viewed as a tool of continuous improvement.

Initially, RCA is a reactive method of problem detection and
solving: the analysis is performed after an event has occurred.
As expertise in RCA is gained, it becomes a proactive method, able
to forecast the possibility of an event before it occurs.

Root cause analysis is not a single, sharply defined methodology;
there are many different tools, processes, and philosophies of RCA in
existence. However, most of these can be classed into five very broadly
defined "schools" that are named here by their basic fields of origin:
safety-based, production-based, process-based, failure-based, and
systems-based.

· Safety-based RCA descends from the fields of accident analysis
and occupational safety and health.
· Production-based RCA has its origins in the field of quality control
for industrial manufacturing.
· Process-based RCA is basically a follow-on to production-based
RCA, but with a scope that has been expanded to include business
processes.
· Failure-based RCA is rooted in the practice of failure analysis as
employed in engineering and maintenance.
· Systems-based RCA has emerged as an amalgamation of the
preceding schools, along with ideas taken from fields such as
change management, risk management, and systems analysis.

Despite the seeming disparity in purpose and definition among the
various schools of root cause analysis, there are some general principles
that could be considered as universal. Similarly, it is possible to define a
general process for performing RCA.

1.2 QUALITY CONTROL


In engineering and manufacturing, quality control and quality
engineering are used in developing systems to ensure products or services
are designed and produced to meet or exceed customer requirements.

Quality control is the branch of engineering and manufacturing
which deals with assurance and failure testing in design and production of
products or services, to meet or exceed customer requirements.

1.2.1 Quality assurance


One of the most widely used paradigms for quality assurance
management is the PDCA (Plan-Do-Check-Act). This problem solving
process was made popular by Dr. W. Edwards Deming, who is
considered by many to be the father of modern quality control.

1.2.2 Failure testing


A valuable process to perform on a whole consumer product is
failure testing (also known as stress testing), the operation of a product
until it fails, often under stresses such as increasing vibration, temperature
and humidity. This exposes many unanticipated weaknesses in a product,
and the data is used to drive engineering and manufacturing process
improvements.

1.2.3 Statistical control


Many organizations use statistical process control to bring the
organization to Six Sigma levels of quality, in other words, so that the
nearest specification limit lies six standard deviations from the process
mean on the normal distribution. With the conventional allowance for a
long-term drift of the mean of 1.5 standard deviations, the probability
of a defect is about 3.4 one-millionths. Items
controlled often include clerical tasks such as order-entry as well as
conventional manufacturing tasks.
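
The 3.4-per-million figure can be checked numerically. The following is a minimal sketch, not taken from the original text. It assumes a normally distributed characteristic, a one-sided specification limit, and the conventional Six Sigma allowance of a 1.5-standard-deviation long-term shift of the process mean. The function name `defects_per_million` is hypothetical.

```python
import math


def defects_per_million(sigma_distance: float) -> float:
    """One-sided normal tail beyond `sigma_distance` standard deviations,
    expressed as defects per million opportunities."""
    tail_probability = 0.5 * math.erfc(sigma_distance / math.sqrt(2.0))
    return tail_probability * 1_000_000


# Assumption: the mean may drift 1.5 standard deviations toward the limit,
# so a "six sigma" process is evaluated at 6.0 - 1.5 = 4.5 standard deviations.
ASSUMED_SHIFT = 1.5
shifted = defects_per_million(6.0 - ASSUMED_SHIFT)
unshifted = defects_per_million(6.0)

print(f"with 1.5-sigma shift: {shifted:.2f} defects per million")
print(f"without shift:        {unshifted:.5f} defects per million")
```

Under these assumptions the shifted case comes out at roughly 3.4 defects per million, whereas an unshifted six-standard-deviation tail is several orders of magnitude smaller.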
Traditional statistical process controls in manufacturing operations
usually proceed by randomly sampling and testing a fraction of the
output. Variances of critical tolerances are continuously tracked, and
manufacturing processes are corrected before bad parts can be produced.
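
The sampling-and-tracking idea can be illustrated with a simple control-limit check. This is a sketch only: the text does not name a particular chart type, so limits at three standard deviations around the mean of a baseline sample are an assumption, and the measurements and function names are invented for illustration.

```python
import statistics


def control_limits(baseline):
    """Return (lower, center, upper) limits from an in-control baseline sample.
    Assumption: limits are placed three sample standard deviations from the mean."""
    center = statistics.fmean(baseline)
    spread = statistics.stdev(baseline)
    return center - 3 * spread, center, center + 3 * spread


def out_of_control(samples, limits):
    """Return the indices of samples that fall outside the control limits."""
    lower, _center, upper = limits
    return [i for i, value in enumerate(samples) if value < lower or value > upper]


# Hypothetical measurements of a critical dimension, in millimetres.
baseline = [10.02, 9.98, 10.01, 9.99, 10.00, 10.03, 9.97, 10.00]
limits = control_limits(baseline)

new_samples = [10.01, 9.99, 10.12, 10.00]
flagged = out_of_control(new_samples, limits)
print("limits:", limits)
print("flagged sample indices:", flagged)
```

A flagged index is a signal to investigate and correct the process before further parts are produced; it does not by itself identify the cause.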

1.2.4 Company quality


During the 1980s, the concept of “company quality” with the focus
on management and people came to the fore. It was realized that success
was possible if all departments approached quality with an open mind and
management led the quality improvement process.
The company-wide quality approach places an emphasis on three aspects:

1. Elements such as controls, job management, defined and well
managed processes, performance and integrity criteria and
identification of records.
2. Competence such as knowledge, skills, experience, qualifications.
3. Soft elements, such as personnel integrity, confidence,
organizational culture, motivation, team spirit and quality
relationships.

The quality of the outputs is at risk if any of these three aspects is
deficient in any way.

1.2.5 Total quality control


Total Quality Control is the most necessary inspection control of
all in cases where sales decrease despite the statistical quality control
techniques or quality improvements that have been implemented.

If the original specification does not reflect the correct quality
requirements, quality cannot be inspected or manufactured into the
product.

For instance, all parameters for a pressure vessel should include not
only the material and dimensions but also operating, environmental, safety,
reliability and maintainability requirements.


1.3 FAILURE ANALYSIS


Failure analysis is the process of collecting and analyzing data to
determine the cause of a failure and how to prevent it from recurring. It is
an important discipline in many branches of manufacturing industry, such
as the electronics industry, where it is a vital tool used in the development
of new products and for the improvement of existing ones. It relies on
collecting failed components for subsequent examination of the cause or
causes of failure using a wide array of methods, especially microscopy
and spectroscopy. The NDT or nondestructive testing methods are
valuable because the failed products are unaffected by analysis, so
inspection always starts using these methods.

1.3.1 Forensic investigation


Forensic inquiry into the failed process or product is the starting
point of failure analysis. Such inquiry is conducted using scientific
analytical methods such as electrical and mechanical measurements, or by
analysing failure data such as product reject reports or examples of
previous failures of the same kind. The methods of forensic engineering
are especially valuable in tracing product defects and flaws. These may
include, for example, fatigue cracks and brittle cracks produced by
stress corrosion cracking or environmental stress cracking. Witness
statements can be valuable for reconstructing the likely sequence of
events and hence the chain of cause and effect. Human factors can also be
assessed when the cause of the failure is determined. There are several
useful methods to prevent product failures occurring in the first place,
including Failure Mode and Effects Analysis (FMEA) and Fault Tree
Analysis (FTA), methods which can be used during prototyping to
analyse failures before a product is marketed.

Failure theories can only be constructed on such data, but when
corrective action is needed quickly, the precautionary principle demands
that measures be put in place. In aircraft accidents, for example, all planes
of the type involved can be grounded immediately pending the outcome
of the inquiry.

Another interesting aspect of failure analysis is associated with No
Fault Found (NFF), a term used in the field of failure analysis to
describe a situation where an originally reported mode of failure cannot
be duplicated by the evaluating technician and therefore the potential
defect cannot be fixed.
NFF can be attributed to oxidation, defective connections of
electrical components, temporary shorts or opens in the circuits, software
bugs, temporary environmental factors, and also to operator error.
Many devices that are reported as NFF during the first
troubleshooting session later return to the failure analysis lab with the
same NFF symptoms or a permanent mode of failure.

The term failure analysis also applies to other fields such as
business management and military strategy.

1.4 SYSTEMS ANALYSIS


Systems analysis is the interdisciplinary part of science, dealing
with analysis of sets of interacting entities, the systems, often prior to
their automation as computer systems, and the interactions within those
systems. This field is closely related to operations research. It is also an
explicit formal inquiry carried out to help someone, referred to as the
decision maker, identify a better course of action and make a better
decision than would otherwise have been made.

1.4.1 Overview
The terms analysis and synthesis come from classical Greek where
they mean respectively "to take apart" and "to put together". These terms
are used in scientific disciplines from mathematics and logic to economics
and psychology to denote similar investigative procedures. In general,
analysis is defined as the procedure by which we break down an
intellectual or substantial whole into parts or components. Synthesis is
defined as the opposite procedure: to combine separate elements or
components in order to form a coherent whole.

The systems discussed within systems analysis can be within any
field, such as industrial processes, management, decision making
processes, environmental protection processes, etc. The brothers Howard
T. Odum and Eugene Odum began applying a systems view to ecology in
1953, building on the work of Raymond Lindeman (1942) and Arthur
Tansley (1935).

Systems analysis researchers apply mathematical methodology to
the analysis of the systems involved, trying to form a detailed overall
picture.

1.4.2 Practitioners
Practitioners of systems analysis are often called upon to dissect
systems that have grown haphazardly to determine the current
components of the system. This was shown during the year 2000 re-
engineering effort as business and manufacturing processes were
examined and simplified, as part of the Year 2000 Problem (also known
as the Y2K problem or the millennium bug) automation upgrades.
Current employment titles utilizing systems analysis include, but are not
limited to, Systems Analyst, Business Analyst, Manufacturing Engineer,
Enterprise Architect, etc.

While practitioners of systems analysis can be called upon to create
entirely new systems, their skills are more often used to modify, expand
or document existing systems (processes, procedures and methods).

1.5 GENERAL PRINCIPLES OF ROOT CAUSE ANALYSIS
1. Aiming performance improvement measures at root causes is more
effective than merely treating the symptoms of a problem.
2. To be effective, RCA must be performed systematically, with
conclusions and causes backed up by documented evidence.
3. There is usually more than one root cause for any given problem.
4. To be effective, the analysis must establish all known causal
relationships between the root cause(s) and the defined problem.
5. Root cause analysis transforms an old culture that reacts to
problems into a new culture that solves problems before they
escalate, creating a variability reduction and risk avoidance
mindset.


CHAPTER 2
GENERAL PROCESS FOR PERFORMING AND
DOCUMENTING AN RCA-BASED
CORRECTIVE ACTION
Every root cause investigation and reporting process should
include five phases. While there may be some overlap between phases,
every effort should be made to keep them separate and distinct. [1]

Phase I. Data Collection. It is important to begin the data collection phase
of root cause analysis immediately following the occurrence
identification to ensure that data are not lost. (Without compromising
safety or recovery, data should be collected even during an occurrence).
The information that should be collected consists of conditions before,
during, and after the occurrence; personnel involvement (including
actions taken); environmental factors; and other information having
relevance to the occurrence.

Phase II. Assessment. Any root cause analysis method may be used that
includes the following steps:
1. Identify the problem.
2. Determine the significance of the problem.
3. Identify the causes (conditions or actions) immediately preceding and
surrounding the problem.
4. Identify the reasons why the causes in the preceding step existed,
working back to the root cause (the fundamental reason which, if
corrected, will prevent recurrence of these and similar occurrences
throughout the facility).

Phase III. Corrective Actions. Implementing effective corrective actions
for each cause reduces the probability that a problem will recur and
improves reliability and safety.

Phase IV. Inform. Entering the report on the Occurrence Reporting and
Processing System (ORPS) is part of the inform process. Also included is
discussing and explaining the results of the analysis, including corrective
actions, with management and personnel involved in the occurrence. In
addition, consideration should be given to providing information of
interest to other facilities.

Phase V. Follow-up. Follow-up includes determining if corrective action
has been effective in resolving problems. An effectiveness review is
essential to ensure that corrective actions have been implemented and are
preventing recurrence. Management involvement and adequate allocation
of resources are essential to successful execution of the five root cause
investigation and reporting phases.

2.1 PHASE I – DATA COLLECTION


As mentioned before, it is important to begin the data collection
phase of the root cause process immediately following occurrence
identification to ensure that data are not lost. (Without compromising
safety or recovery, data should be collected even during an occurrence).
The information that should be collected consists of conditions before,
during, and after the occurrence; personnel involvement (including
actions taken); environmental factors; and other information having
relevance to the condition or problem. For serious cases, photographing
the area of the occurrence from several views may be useful in analyzing
information developed during the investigation. Every effort should be
made to preserve physical evidence such as failed components, ruptured
gaskets, burned leads, blown fuses, spilled fluids, partially completed
work orders and procedures. This should be done despite operational
pressures to restore equipment to service. Occurrence participants and
other knowledgeable individuals should be identified.

Once all the data associated with this occurrence have been
collected, the data should be verified to ensure accuracy. The
investigation may be enhanced if some physical evidence is retained.
Establishing a quarantine area, or the tagging and segregation of pieces
and material, should be performed for failed equipment or components.

The basic need is to determine the direct, contributing and root causes
so that effective corrective actions can be taken that will prevent
recurrence. Some areas to be considered when determining what
information is needed include:

· Activities related to the occurrence
· Initial or recurring problems
· Hardware (equipment) or software (programmatic-type issues)
associated with the occurrence
· Recent administrative program or equipment changes
· Physical environment or circumstances.

Some methods of gathering information include:

· Conducting interviews/collecting statements - Interviews must be
fact finding and not fault finding. Preparing questions before the
interview is essential to ensure that all necessary information is
obtained.

Interviews should be conducted, preferably in person, with
those people who are most familiar with the problem. Individual
statements could be obtained if time or the number of personnel
involved make interviewing impractical. Interviews can be
documented using any format desired by the interviewer. Consider
conducting a "walk-through" as part of this interview if time
permits.

Although preparing for the interview is important, it should
not delay prompt contact with participants and witnesses. The first
interview may consist solely of hearing their narrative. A second,
more-detailed interview can be arranged, if needed. The
interviewer should always consider the interviewee’s objectivity
and frame of reference.
· Interviewing others - Consider interviewing other personnel who
have performed the job in the past. Consider using a "walk-
through" as part of the interview.
· Reviewing records - Review relevant documents or portions of
documents as necessary and reference their use in support of the
root cause analysis. Record appropriate dates and times associated
with the occurrence on the documents reviewed. Examples of
documents include the following:

Operating logs
Correspondence
Inspection/surveillance records
Maintenance records
Meeting minutes
Computer process data
Procedures and instructions
Vendor Manuals
Drawings and specifications
Functional retest specification and results
Equipment history records
Design basis information
Safety Analysis Report (SAR)/Technical Specifications
Related quality control evaluation reports
Operational Safety Requirements
Safety Performance Measurement System/Occurrence Reporting and Processing System (SPMS/ORPS) Reports
Radiological surveys
Trend charts and graphs
Facility parameter readings
Sample analysis and results (chemistry, radiological, air, etc.)
Work orders

· Acquiring related information - Some additional activities that
an evaluator should consider when analyzing the causes include
the following:

Evaluating the need for laboratory tests, such as
destructive/nondestructive failure analysis.
Viewing the physical layout of the system, component, or work area;
developing layout sketches of the area; and taking photographs to
better understand the condition.
Determining if operating experience information exists for similar
events at other facilities.
Reviewing equipment supplier and manufacturer records to determine
if correspondence has been received addressing this problem.

2.2 PHASE II – ASSESSMENT


The assessment phase includes analyzing the data to identify the
causal factors, summarizing the findings, and categorizing the findings by
the cause categories. The major cause categories are:

· Equipment/Material Problem
· Procedure Problem
· Personnel Error
· Design Problem
· Training Deficiency
· Management Problem
· External Phenomena

These categories have been carefully selected with the intent to
address all problems that could arise in conducting DOE operations.
Those elements necessary to perform any task are equipment/material,
procedures (instructions), and personnel. Design and training determine
the quality and effectiveness of equipment and personnel. These five
elements must be managed; therefore, management is also a necessary
element. Whenever there is an occurrence, at least one of these six program
elements was inadequate to prevent the occurrence. (External phenomena
beyond operational control serve as a seventh cause category.) These
causal factors can be associated in a logical causal factor chain. (Note that
a direct, contributing, or root cause can occur any place in the causal
factor chain; that is, a root cause can be an operator error while a
management problem can be a direct cause, depending on the nature of
the occurrence.)
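
The categorization of findings can be sketched in code. The following is an illustrative sketch only: the enumeration mirrors the seven categories listed above, while the findings and the helper name `count_by_category` are invented for the example and do not come from the source.

```python
from collections import Counter
from enum import Enum


class CauseCategory(Enum):
    """The seven major cause categories listed in this section."""
    EQUIPMENT_MATERIAL = "Equipment/Material Problem"
    PROCEDURE = "Procedure Problem"
    PERSONNEL_ERROR = "Personnel Error"
    DESIGN = "Design Problem"
    TRAINING = "Training Deficiency"
    MANAGEMENT = "Management Problem"
    EXTERNAL_PHENOMENA = "External Phenomena"


def count_by_category(findings):
    """Count findings per category; `findings` holds (description, category) pairs."""
    return Counter(category for _description, category in findings)


# Hypothetical findings, invented purely for illustration.
findings = [
    ("Valve seat worn beyond tolerance", CauseCategory.EQUIPMENT_MATERIAL),
    ("Work instruction omitted the torque check", CauseCategory.PROCEDURE),
    ("Revised instruction issued without review", CauseCategory.MANAGEMENT),
    ("Replacement valve of wrong rating installed", CauseCategory.EQUIPMENT_MATERIAL),
]

counts = count_by_category(findings)
for category, number in counts.most_common():
    print(f"{category.value}: {number}")
```

The category says nothing about a finding's role in the causal factor chain, since any category can be a direct, contributing, or root cause.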

2.2.1 Assessment and Reporting Guidance


To perform the assessment and report the causal factors and
corrective actions:

2.2.1.1 Analyze and determine the events and causal factor chain

Any root cause analysis method that includes the following basic steps
may be used.

(a) Identify the problem. Remember that actuation of a protective
system constitutes the occurrence but is not the real problem; the
unwanted, unplanned condition or action that resulted in actuation is the
problem to be solved. For example, dust in the air actuates a false fire
alarm. In this case, the occurrence is the actuation of an engineered safety
feature. The smoke detector and alarm functioned as intended; the
problem to be solved is the dust in the air, not the false fire alarm.
Another example is when an operator follows a defective procedure and
causes an occurrence. The real problem is the defective procedure; the
operator has not committed an error. However, if the operator had been
correctly trained to perform the task and, therefore, could reasonably have
been expected to detect the defect in the procedure, then a personnel
problem may also exist.

(b) Determine the significance of the problem. Were the
consequences severe? Could they be next time? How likely is recurrence?
Is the occurrence symptomatic of poor attitude, a safety culture problem,
or other widespread program deficiency? Base the level of effort of
subsequent steps of your assessment upon the estimation of the level of
significance.

(c) Identify the causes (conditions or actions) immediately
preceding and surrounding the problem (the reason the problem
occurred).

(d) Identify the reasons why the causes in the preceding
identification step existed, working your way back to the root cause (the
fundamental reason that, if corrected, will prevent recurrence of this and
similar occurrences throughout the facility and other facilities under your
control). This root cause is the stopping point in the assessment of causal
factors. It is the place where, with appropriate corrective action, the
problem will be eliminated and will not recur.
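
Steps (c) and (d) amount to walking a causal factor chain backwards until no further reason is recorded. The following is a minimal sketch under simplifying assumptions: each cause has at most one recorded reason (real occurrences may branch into several causes), and the class `Cause`, the function `trace_to_root`, and the example chain are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cause:
    description: str
    because: Optional["Cause"] = None  # the reason this cause existed, if known


def trace_to_root(direct_cause: Cause) -> List[Cause]:
    """Follow the `because` links from the direct cause to the stopping point.
    Returns the chain ordered from direct cause to root cause."""
    chain = [direct_cause]
    visited = {id(direct_cause)}
    while chain[-1].because is not None:
        deeper = chain[-1].because
        if id(deeper) in visited:
            raise ValueError("circular causal chain")
        visited.add(id(deeper))
        chain.append(deeper)
    return chain


# Hypothetical chain, loosely following the defective-procedure example above.
root = Cause("Procedure revisions are issued without technical review")
middle = Cause("The procedure lists the valve line-up steps in the wrong order",
               because=root)
direct = Cause("Operator opened the valves in the wrong order", because=middle)

chain = trace_to_root(direct)
for depth, cause in enumerate(chain):
    print("  " * depth + cause.description)
```

The last element of the chain is only a candidate root cause; whether correcting it would actually prevent recurrence still has to be judged by the analyst.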

2.2.1.2 Summarize findings, list the causal factors, and list corrective
actions

Summarize your findings, and classify each finding or cause by the
cause categories.

Select the one (most) direct cause and the root cause (the one for
which corrective action will prevent recurrence and have the greatest,
most widespread effect). In cause selection, focus on programmatic and
system deficiencies and avoid simple excuses such as blaming the
employee. Note that the root cause must be an explanation (the why) of
the direct cause, not a repeat of the direct cause. In addition, a cause
description is not just a repeat of the category code description; it is a
description specific to the occurrence. Also, up to three (contributing)
causes may be selected. Describe the corrective actions selected to
prevent recurrence, including the reason why they were selected, and how
they will prevent recurrence. Collect additional information as necessary.


2.3 PHASE III – CORRECTIVE ACTIONS


The root cause analysis enables the improvement of reliability and
safety by selecting and implementing effective corrective actions. To
begin, identify the corrective action for each cause; then apply the
following criteria to the corrective actions to ensure they are viable. If the
corrective actions are not viable, re-evaluate the solutions.

1. Will the corrective action prevent recurrence?
2. Is the corrective action feasible?
3. Does the corrective action allow meeting primary objectives or
mission?
4. Does the corrective action introduce new risks? Are the assumed
risks clearly stated? (The safety of other systems must not be
degraded by the proposed corrective action.)
5. Were the immediate actions taken appropriate and effective?

A systems approach, such as Kepner-Tregoe (a structured, systematic
process designed to maximize the critical thinking skills of key
stakeholders when facing a situation, a potential or real problem, a
decision, or an opportunity), should be used in determining
appropriate corrective actions. It should consider not only the impact they
will have on preventing recurrence, but also the potential that the
corrective actions may actually degrade some other aspect of nuclear
safety. Also, the impact the corrective actions will have on other facilities
and their operations should be considered. The proposed corrective
actions must be compatible with facility commitments and other
obligations. In addition, those affected by or responsible for any part of
the corrective actions, including management, should be involved in the
process. Proposed corrective actions should be reviewed to ensure the
above criteria have been met, and should be prioritized based on
importance, scheduled (a change in priority or schedule should be
approved by management), entered into a commitment tracking system,
and implemented in a timely manner. A complete corrective action
program should be based, not only on specific causes of occurrences, but
also on items such as lessons learned from other facilities, appraisals, and
employee suggestions.

A successful corrective action program requires management that is
involved at the appropriate level and is willing to take responsibility and
allocate adequate resources for corrective actions.

Additional specific questions and considerations in developing and
implementing corrective actions include:

· Do the corrective actions address all the causes?
· Will the corrective actions cause detrimental effects?
· What are the consequences of implementing the corrective actions?
· What are the consequences of not implementing the corrective
actions?
· What is the cost of implementing the corrective actions (capital
costs, operations, and maintenance costs)?
· Will training be required as part of the implementation?
· In what time frame can the corrective actions reasonably be
implemented?
· What resources are required for successful development of the
corrective actions?
· What resources are required for successful implementation and
continued effectiveness of the corrective actions?
· What impact will the development and implementation of the
corrective actions have on other work groups?
· Is the implementation of the corrective actions measurable?

2.4 PHASE IV – INFORM


Electronic reporting to ORPS (Occurrence Reporting and
Processing System) is part of the inform process for all occurrences. (For
those occurrences containing classified information, an unclassified
version shall be entered into ORPS.) Effectively preventing recurrences
requires the distribution of these reports (especially the lessons learned)
to all personnel who might benefit. Methods and procedures for
identifying personnel who have an interest are essential to effective
communications.

In addition, an internal self-appraisal report identifying
management and control system defects should be presented to
management for the more serious occurrences. The defective elements
can be identified using MORT (Management Oversight and Risk Tree
Analysis) or Mini-MORT.

Consideration should be given to directly sharing the details of root
cause information with similar facilities where significant or
long-standing problems may also exist.

2.5 PHASE V – FOLLOW-UP


Follow-up includes determining if corrective actions have been
effective in resolving problems. First, the corrective actions should be
tracked to ensure that they have been properly implemented and are
functioning as intended. Second, a periodic structured review of the
corrective action tracking system, normal process and change control
system, and occurrence tracking system should be conducted to ensure
that past corrective actions have been effectively handled. The recurrence
of the same or similar events must be identified and analyzed. If an
occurrence recurs, the original occurrence should be re-evaluated to
determine why corrective actions were not effective. Also, the new
occurrence should be investigated using change analysis. The process
change control system should be evaluated to determine what
improvements are needed to keep up with changing conditions. Early
indications of deteriorating conditions can be obtained from tracking and
trend analyses of occurrence information. In addition, the ORPS database
should be reviewed to identify good practices and lessons learned from
other facilities. Prompt corrective actions should be taken to reverse
deteriorating conditions or to apply lessons learned.


CHAPTER 3
ROOT CAUSE ANALYSIS METHODS

Many of the Root Cause Analysis (RCA) methods are specialized
and apply to specific situations or objectives. Most have their own cause
categorizations, but all are very effective when used within the scope for
which they were designed.

The most common methods are:

· Barrier analysis - a technique often used in process industries. It
is based on tracing energy flows, with a focus on barriers to those
flows, to identify how and why the barriers did not prevent the
energy flows from causing harm.
· Bayesian inference.
· Causal factor tree analysis - a technique based on displaying causal
factors in a tree-structure such that cause-effect dependencies are
clearly identified.
· Change analysis - an investigation technique often used for
problems or accidents. It is based on comparing a situation that
does not exhibit the problem to one that does, in order to identify
the changes or differences that might explain why the problem
occurred.
· Current Reality Tree - a method developed by Eliahu M. Goldratt in
his Theory of Constraints that guides an investigator to identify and
relate all root causes using a cause-effect tree whose elements are
bound by rules of logic (Categories of Legitimate Reservation).
The CRT begins with a brief list of the undesirable things we see
around us, and then guides us towards one or more root causes.
This method is particularly powerful when the system is complex,
there is no obvious link between the observed undesirable things,
and a deep understanding of the root cause(s) is desired.
· Failure mode and effects analysis, also known as FMEA.
· Fault tree analysis.
· 5 Whys.
· Ishikawa diagram, also known as the fishbone diagram or cause
and effect diagram.
· Kepner-Tregoe Problem Analysis - a root cause analysis process
developed in 1958, which provides a fact-based approach to
systematically rule out possible causes and identify the true cause.
· Pareto analysis.

· RPR Problem Diagnosis - an ITIL-aligned method for diagnosing
IT problems.

3.1 BAYESIAN INFERENCE


Bayesian inference is statistical inference in which evidence or
observations are used to update or to newly infer the probability that a
hypothesis may be true. The name "Bayesian" comes from the frequent
use of Bayes' theorem in the inference process. Bayes' theorem was
derived from the work of the Reverend Thomas Bayes. [2]

3.1.1 Evidence and changing beliefs


Bayesian inference uses aspects of the scientific method, which
involves collecting evidence that is meant to be consistent or inconsistent
with a given hypothesis. As evidence accumulates, the degree of belief in
a hypothesis ought to change. With enough evidence, it should become
very high or very low. Thus, proponents of Bayesian inference say that it
can be used to discriminate between conflicting hypotheses: hypotheses
with very high support should be accepted as true and those with very
low support should be rejected as false. However, detractors say that this
inference method may be biased due to initial beliefs that one holds
before any evidence is ever collected. (This is a form of inductive bias.)
Bayesian inference uses a numerical estimate of the degree of
belief in a hypothesis before evidence has been observed and calculates a
numerical estimate of the degree of belief in the hypothesis after evidence
has been observed. (This process is repeated when additional evidence is
obtained.) Bayesian inference usually relies on degrees of belief, or
subjective probabilities, in the induction process and does not necessarily
claim to provide an objective method of induction. Nonetheless, some
Bayesian statisticians believe probabilities can have an objective value
and therefore Bayesian inference can provide an objective method of
induction.
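
The repeated updating described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hypothesis ("a worn seal caused the leak"), the prior of 0.20 and the likelihood values are all invented for the example, and the function name bayes_update is hypothetical.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(H | E) from P(H), P(E | H) and P(E | not H) via Bayes' theorem."""
    numerator = p_evidence_if_true * prior
    evidence = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / evidence


# Hypothetical illustration: H = "a worn seal caused the leak".
belief = 0.20  # assumed degree of belief before any evidence is observed
observations = [
    (0.90, 0.30),  # assumed (P(E | H), P(E | not H)) for each piece of evidence
    (0.80, 0.40),
    (0.70, 0.10),
]

history = [belief]
for p_true, p_false in observations:
    # The posterior after one observation becomes the prior for the next.
    belief = bayes_update(belief, p_true, p_false)
    history.append(belief)

print([round(b, 3) for b in history])
```

Each piece of evidence here is more likely under the hypothesis than against it, so the degree of belief rises with every update. A different prior would give different posteriors, which is the dependence on initial beliefs that detractors point to.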

3.2 CURRENT REALITY TREE


A Current Reality Tree (CRT), one of the Thinking Processes in the
Theory of Constraints, is a way of analyzing many system or
organizational problems at once. By identifying root causes common to
most or all of the problems, the CRT can greatly aid focused
improvement of the system. [3]

3.2.1 Simplified explanation


This process treats multiple problems as symptoms arising from a
few ultimate root causes. It describes, in a simple visual drawing, the
main perceived symptoms (along with secondary/hidden ones that lead up
to the perceived symptom(s)) of a problem scenario and ultimately the
apparent root cause(s) or conflict. The benefit of doing this is that it is
much easier to identify the connections or dependencies among these.
Thus, focus can be placed on the bits which would cause the biggest
positive change if tackled. [4]

3.2.2 Contextual explanation


A current reality tree is a statement of an underlying core problem
and the symptoms that arise from it. It maps out a sequence of cause and
effect from the core problem to the symptoms. Most of the symptoms will
arise from the one core problem or a core conflict. Remove the core
problem and we may well be able to remove each of the symptoms as
well. Operationally we work backwards from the apparent undesirable
effects or symptoms to uncover or discover the underlying core cause. [5]

3.2.3 Example
A CRT begins with a list of problems, known as undesirable
effects (UDEs). These are assumed to be symptoms of a deeper common
cause. To take a somewhat frivolous example, a car owner may have the
following UDEs:

1. The car's engine will not start.
2. The air conditioning is not working.
3. The car's radio sounds distorted.

The CRT depicts a chain of cause-and-effect reasoning
(IF...AND...THEN) in graphical form, where ellipses or circles represent
an "AND".


The graphic is constructed by:

· attempting to link any two UDEs using cause-and-effect
reasoning. For example, IF the engine needs fuel in order to
run AND fuel is not getting to the engine, THEN the car's engine will
not start.
· elaborating the reasoning to ensure it is sound and plausible. For
example, IF the air intake is full of water THEN air conditioning is
not working. The elaboration (because air is not able to circulate)
is added as an in-between step.
· linking each of the remaining UDEs to the existing tree by
repeating the previous steps.

This approach tends to converge on a single root cause. In the
illustrated case, the root cause of the above UDEs is seen as being a
faulty handbrake.
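
As a rough sketch of how such a tree can be traced back to a common root cause, the example can be written as a small cause-effect graph in Python. Only the three UDEs, the fuel and air-intake statements and the root cause come from the text; the remaining intermediate links (for instance "car is submerged") are hypothetical and stand in for the tree in the illustration. The names causes_of and root_causes are likewise assumptions.

```python
# Each effect maps to the causes that jointly (AND) produce it.
# Intermediate links not stated in the text are hypothetical.
causes_of = {
    "engine will not start": ["engine needs fuel to run",
                              "fuel is not reaching the engine"],
    "air conditioning is not working": ["air cannot circulate"],
    "air cannot circulate": ["air intake is full of water"],
    "radio sounds distorted": ["speakers are under water"],
    "fuel is not reaching the engine": ["car is submerged"],
    "air intake is full of water": ["car is submerged"],
    "speakers are under water": ["car is submerged"],
    "car is submerged": ["car rolled into the pool"],
    "car rolled into the pool": ["faulty handbrake"],
}


def root_causes(effect, graph):
    """Return every entity behind `effect` that has no recorded cause of its own."""
    parents = graph.get(effect, [])
    if not parents:
        return {effect}
    roots = set()
    for parent in parents:
        roots |= root_causes(parent, graph)
    return roots


udes = [
    "engine will not start",
    "air conditioning is not working",
    "radio sounds distorted",
]
per_ude = {ude: root_causes(ude, causes_of) for ude in udes}

# A root cause shared by every UDE is the candidate core problem.
common = set.intersection(*per_ude.values())
print(common)
```

The statement "engine needs fuel to run" also has no recorded cause, but it lies behind only one UDE, so the intersection leaves the handbrake as the single shared root cause.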


3.3 FAILURE MODE AND EFFECTS ANALYSIS

A failure mode and effects analysis (FMEA) is a procedure in
operations management for analysis of potential failure modes within a
system for classification by severity or determination of the effect of
failures on the system. It is widely used in manufacturing industries in
various phases of the product life cycle and is now increasingly finding
use in the service industry. Failure modes are any errors or defects in a
process, design, or item, especially those that affect the customer, and can
be potential or actual. Effects analysis refers to studying the
consequences of those failures. [6]

3.3.1 History
Learning from each failure is both costly and time consuming, and
FMEA is a more systematic method of studying failure. As such, it is
considered better to first conduct some thought experiments.

FMEA was formally introduced in the late 1940s for military usage
by the US Armed Forces. Later it was used for aerospace/rocket
development to avoid errors in small sample sizes of costly rocket
technology. An example of this is the Apollo Space program. The
primary push came during the 1960s, while developing the means to put a
man on the moon and return him safely to earth. In the late 1970s the
Ford Motor Company introduced FMEA to the automotive industry for
safety and regulatory consideration after the Pinto affair. They also used
it to improve production and design. [7]

Although initially developed by the military, FMEA methodology
is now extensively used in a variety of industries including semiconductor
processing, food service, plastics, software, and healthcare. It is integrated
into Advanced Product Quality Planning (APQP) to provide primary risk
mitigation tools and timing in the prevention strategy, in both design and
process formats. The Automotive Industry Action Group (AIAG) requires
the use of FMEA in the automotive APQP process and publishes a
detailed manual on how to apply the method. Each potential cause must
be considered for its effect on the product or process and, based on the
risk, actions are determined and risks revisited after actions are complete.
Toyota has taken this one step further with its Design Review Based on
Failure Mode (DRBFM) approach. The method is now supported by the
American Society for Quality, which provides detailed guides on applying
the method.

3.3.2 Implementation
In FMEA, failures are prioritized according to how serious their
consequences are, how frequently they occur and how easily they can be
detected. An FMEA also documents current knowledge and actions about
the risks of failures for use in continuous improvement. FMEA is used
during the design stage with an aim to avoid future failures. Later it is
used for process control, before and during ongoing operation of the
process. Ideally, FMEA begins during the earliest conceptual stages of
design and continues throughout the life of the product or service.

The purpose of the FMEA is to take actions to eliminate or reduce
failures, starting with the highest-priority ones. It may be used to evaluate
risk management priorities for mitigating known threat vulnerabilities.
FMEA helps select remedial actions that reduce cumulative impacts of
life-cycle consequences (risks) from a system failure (fault).
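
One common way to turn the three criteria above (seriousness, frequency and ease of detection) into a priority order is to score each on an ordinal scale and multiply the scores into a Risk Priority Number (RPN). The sketch below assumes 1 to 10 scales; the failure modes and their scores are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # assumed scale: 1 (negligible) to 10 (catastrophic)
    occurrence: int  # assumed scale: 1 (rare) to 10 (frequent)
    detection: int   # assumed scale: 1 (almost certainly detected) to 10 (undetectable)

    @property
    def rpn(self) -> int:
        """Risk Priority Number: product of the three rankings."""
        return self.severity * self.occurrence * self.detection


# Hypothetical failure modes and scores.
modes = [
    FailureMode("seal leak", severity=7, occurrence=4, detection=3),
    FailureMode("bearing seizure", severity=9, occurrence=2, detection=6),
    FailureMode("loose fastener", severity=4, occurrence=6, detection=2),
]

# Highest-priority failure modes first.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for m in ranked:
    print(f"{m.name:16s} RPN = {m.rpn}")
```

Corrective actions would then be considered from the top of the list downward, and the scores revisited once the actions are complete.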

3.3.3 Using FMEA when designing


FMEA provides an analytical approach to dealing with potential
failure modes and their associated causes. When considering possible
failures in a design (in terms of safety, cost, performance, quality and
reliability), an engineer can obtain a great deal of information about how
to alter the development/manufacturing process in order to avoid these
failures. FMEA provides a simple tool for determining which risk is of
greatest concern, and therefore where action is needed to prevent a
problem before it arises. The development of these specifications helps
ensure that the product will meet the defined requirements.

3.3.4 Timing of FMEA


The FMEA should be updated whenever:

· A cycle begins (new product/process)
· Changes are made to the operating conditions
· A change is made in the design
· New regulations are instituted
· Customer feedback indicates a problem

3.3.5 Uses of FMEA


· Development of system requirements that minimize the likelihood
of failures.
· Development of methods to design and test systems to ensure that
the failures have been eliminated.
· Evaluation of the requirements of the customer to ensure that those
do not give rise to potential failures.
· Identification of certain design characteristics that contribute to
failures, and minimization or elimination of their effects.
· Tracking and managing potential risks in the design. This helps
avoid the same failures in future projects.
· Ensuring that any failure that could occur will not injure the
customer or seriously impact a system.
· Production of world-class quality products.

3.3.6 Advantages
· Improve the quality, reliability and safety of a product/process.
· Improve company image and competitiveness.
· Increase user satisfaction.
· Reduce system development timing and cost.
· Collect information to reduce future failures, capture engineering
knowledge.
· Reduce the potential for warranty concerns.
· Early identification and elimination of potential failure modes.
· Emphasis on problem prevention.
· Minimize late changes and associated cost.
· Catalyst for teamwork and idea exchange between functions.
· Reduce the possibility of the same kind of failure in future.


3.3.7 Limitations
Since FMEA is effectively dependent on the members of the
committee which examines product failures, it is limited by their
experience of previous failures. If a failure mode cannot be identified,
then external help is needed from consultants who are aware of the many
different types of product failure. FMEA is thus part of a larger system of
quality control, where documentation is vital to implementation. General
texts and detailed publications are available in forensic engineering and
failure analysis. It is a general requirement of many specific national and
international standards that FMEA is used in evaluating product integrity.
If used as a top-down tool, FMEA may only identify major failure modes
in a system. Fault tree analysis (FTA), discussed in 3.4, is better suited
for "top-down" analysis. When used as a "bottom-up" tool FMEA can
augment or complement FTA and identify many more causes and failure
modes resulting in top-level symptoms. FMEA is not able to discover complex
failure modes involving multiple failures within a subsystem, or to report
expected failure intervals of particular failure modes up to the upper level
subsystem or system.

Additionally, the multiplication of the severity, occurrence and
detection rankings may result in rank reversals, where a less serious
failure mode receives a higher Risk Priority Number (RPN) than a more
serious failure mode. The reason for this is that the rankings are ordinal
scale numbers, and multiplication is not a valid operation on them. The
ordinal rankings only say that one ranking is better or worse than another,
but not by how much. For instance, a ranking of "2" may not be twice as
bad as a ranking of "1," or an "8" may not be twice as bad as a "4," but
multiplication treats them as though they are.
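
A rank reversal of this kind is easy to reproduce. In the sketch below, the two failure modes and their rankings are hypothetical, and the rankings are assumed to lie on 1 to 10 ordinal scales.

```python
# Hypothetical rankings: s = severity, o = occurrence, d = detection.
serious = {"name": "loss of steering", "s": 10, "o": 2, "d": 2}
minor = {"name": "cosmetic scratch", "s": 3, "o": 5, "d": 4}


def rpn(mode):
    """Risk Priority Number as the product of the three ordinal rankings."""
    return mode["s"] * mode["o"] * mode["d"]


rpn_serious = rpn(serious)
rpn_minor = rpn(minor)

# A reversal occurs when the more severe mode receives the lower RPN.
reversal = serious["s"] > minor["s"] and rpn_serious < rpn_minor
print(rpn_serious, rpn_minor, "rank reversal:", reversal)
```

Here the far more severe failure mode ends up below the minor one in the RPN ordering, which is why some practitioners review severity separately rather than relying on the product alone.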

3.3.8 Software
The usage of software will improve the documentation process of
FMEA. When selecting the software package, it is important to choose
one that is easy to learn and promotes consistent updating of the
documentation. It is not necessary to spend a lot of money to have an
effective, user-friendly system. Some FMEA software companies provide
free upgrades, free support, and software with unlimited licenses. This is
especially helpful in ensuring the long-term acceptance, understanding,
and implementation of FMEAs. FMEA is applicable to all engineering
processes.


3.3.9 Types of FMEA


· Process: analysis of manufacturing and assembly processes.
· Design: analysis of products prior to production.
· Concept: analysis of systems or subsystems in the early design
concept stages.
· Equipment: analysis of machinery and equipment design before
purchase.
· Service: analysis of service industry processes before they are
released to impact the customer.
· System: analysis of the global system functions.
· Software: analysis of the software functions.

3.4 FAULT TREE ANALYSIS


Fault tree analysis (FTA) is a failure analysis in which an undesired
state of a system is analyzed using Boolean logic to combine a series of
lower-level events. This analysis method is mainly used in the field of
safety engineering to quantitatively determine the probability of a safety
hazard. [8]

3.4.1 History
Fault Tree Analysis (FTA) attempts to model and analyze failure
processes of engineering and biological systems. FTA is basically
composed of logic diagrams that display the state of the system and is
constructed using graphical design techniques. Originally, engineers were
responsible for the development of Fault Tree Analysis, as a deep
knowledge of the system under analysis is required. Often, FTA is
defined as another part, or technique, of reliability engineering. Although
both model the same major aspect, they have arisen from two different
perspectives. Reliability engineering was, for the most part, developed by
mathematicians, while FTA, as stated above, was developed by
engineers.

Fault Tree Analysis usually involves events arising from hardware
wear-out, material failure or malfunctions, or combinations of
deterministic contributions to the event, stemming from assigning a
hardware/system failure rate to branches or cut sets. Typically, failure
rates are carefully derived from substantiated historical data such as
mean time between failures of the components, unit, subsystem or
function. Predictor data may be assigned. Assigning a software failure
rate is elusive and generally not possible. Since software is a vital
contributor to, and an inclusive part of, the system operation, it is
assumed that the software will function normally as intended.
There is no such thing as a software fault tree unless considered in the
system context. Software is an instruction set to the hardware or overall
system for correct operation. Since basic software events do not fail in the
physical sense, attempting to predict manifestation of software faults or
coding errors with any reliability or accuracy is impossible, unless
assumptions are made. Predicting and assigning human error rates is not
the primary intent of a fault tree analysis, but may be attempted to gain
some knowledge of what happens with improper human input or
intervention at the wrong time.

Fault Tree Analysis was initially developed for projects where
errors are intolerable (e.g., an error in a nuclear reactor is not tolerated).
Bell Telephone Laboratories started the development of FTA during the
early 1960s for the United States Air Force's Minuteman System
(Intercontinental Ballistic Missiles and Bombers). Later, U.S. nuclear
power plants and the Boeing Company used the system extensively. FTA
can be used as a valuable design tool, can identify potential accidents, and
can eliminate costly design changes. It can also be used as a diagnostic
tool, predicting the most likely system failure in a system breakdown.
FTA is used in safety engineering and in all major fields of engineering.

3.4.2 Why Fault Tree Analysis?


Since no system is perfect, dealing with a subsystem fault is a
necessity, and any working system eventually will have a fault in some
place. However, the probability for a complete or partial success is
greater than the probability of a complete failure or partial failure.
Assembling an FTA is thus not as tedious as assembling a success tree,
which can turn out to be very time consuming.

Because assembling an FTA can be a costly and cumbersome
experience, the preferred method is to consider subsystems. Dealing with
smaller systems reduces both the probability of errors in the work and
the amount of system analysis required. Afterward, the subsystems are
integrated to form the well-analyzed overall system.


3.4.3 Methodology
In the technique known as "fault tree analysis", an undesired effect
is taken as the root ('top event') of a tree of logic. There should be only
one Top Event and all concerns must tree down from it. Then, each
situation that could cause that effect is added to the tree as a series of
logic expressions. When fault trees are labeled with actual numbers about
failure probabilities (which are often in practice unavailable because of
the expense of testing), computer programs can calculate failure
probabilities from fault trees.
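
The calculation such programs perform can be sketched as follows, under the simplifying assumption that all basic events are independent and that no basic event appears in more than one branch. The tree, the event names and the probabilities are hypothetical, and the nested-tuple representation is chosen only for brevity.

```python
from math import prod


def probability(node, basic):
    """Probability of a fault-tree node, assuming independent basic events."""
    if isinstance(node, str):
        return basic[node]  # leaf: a basic event with a known probability
    gate, children = node
    probs = [probability(child, basic) for child in children]
    if gate == "AND":
        # All inputs must occur.
        return prod(probs)
    if gate == "OR":
        # At least one input occurs: complement of "none occurs".
        return 1.0 - prod(1.0 - p for p in probs)
    raise ValueError(f"unknown gate: {gate}")


# Hypothetical top event: "no cooling water flow".
tree = ("OR", [
    ("AND", ["pump A fails", "pump B fails"]),
    "supply valve stuck closed",
])
basic = {
    "pump A fails": 0.01,
    "pump B fails": 0.02,
    "supply valve stuck closed": 0.001,
}

top = probability(tree, basic)
print(f"P(top event) = {top:.6f}")
```

In this example the single valve dominates the result, because the redundant pumps must both fail for their branch to contribute.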

The Tree is usually written out using conventional logic gate
symbols. The route through a tree between an event and an initiator in the
tree is called a Cut Set. The shortest credible way through the tree from
fault to initiating event is called a Minimal Cut Set.

Some industries use both Fault Trees and Event Trees. An Event
Tree starts from an undesired initiator (loss of critical supply, component
failure, etc.) and follows possible further system events through to a
series of final consequences. As each new event is considered, a new
node on the tree is added with a split of probabilities of taking either
branch. The probabilities of a range of 'top events' arising from the initial
event can then be seen.
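
A minimal sketch of the event tree calculation is given below. It assumes that every node splits into exactly two branches (success or failure of a barrier) and that the branch probabilities are independent; the initiator, the barriers and their success probabilities are hypothetical.

```python
from itertools import product

# Hypothetical barriers following the initiator "loss of main power supply".
# Each entry is (barrier name, assumed probability that the barrier succeeds).
barriers = [
    ("backup generator starts", 0.95),
    ("operator restores supply", 0.80),
]

# Probability of each path through the tree, conditional on the initiator.
outcomes = {}
for path in product([True, False], repeat=len(barriers)):
    p = 1.0
    for (name, p_success), succeeded in zip(barriers, path):
        p *= p_success if succeeded else (1.0 - p_success)
    outcomes[path] = p

for path, p in outcomes.items():
    labels = ["ok" if succeeded else "fails" for succeeded in path]
    print(labels, round(p, 4))
```

The path probabilities sum to one, and the path on which every barrier fails gives the probability of the worst final consequence given that the initiating event has occurred.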

Classic computer programs include the Electric Power Research
Institute's (EPRI) CAFTA software, which is used by many of the US
nuclear power plants and by a majority of US and international aerospace
manufacturers, and the Idaho National Laboratory's SAPHIRE, which is
used by the U.S. Government to evaluate the safety and reliability of
nuclear reactors, the Space Shuttle, and the International Space Station.
Outside the US, the software RiskSpectrum is a popular tool for Fault
Tree and Event Tree analysis and is licensed for use at almost half of the
world's nuclear power plants for Probabilistic Safety Assessment.

3.4.4 Analysis
Many different approaches can be used to model an FTA, but the
most common and popular way can be summarized in a few steps.
Remember that a fault tree is used to analyze a single fault event, and that
one and only one event can be analyzed in a single fault tree. Even
though the "fault" may vary dramatically, an FTA follows the same
procedure for any event, be it a delay of 0.25 msec for the generation of
electrical power, or the random, unintended launch of an ICBM.

FTA involves five steps:

1. Define the undesired event to study

The undesired event can be very hard to define, although some
events are easy and obvious to observe. An engineer with a wide
knowledge of the design of the system, or a system analyst with an
engineering background, is the best person to help define and number the
undesired events. Each undesired event is then used to build one FTA; no
two events are combined in a single FTA.

2. Obtain an understanding of the system

Once the undesired event is selected, all causes that could
contribute to the undesired event, however small their probability, are
studied and analyzed. Getting exact numbers for the probabilities leading
to the event is usually impossible, because it may be very costly and time
consuming to do so. Computer software is used to study probabilities;
this may lead to less costly system analysis.

System analysts can help with understanding the overall system. System
designers have full knowledge of the system, and this knowledge is very
important for not missing any cause affecting the undesired event. For the
selected event, all causes are then numbered and sequenced in the order of
occurrence, and are used in the next step, which is constructing the
fault tree.

3. Construct the fault tree

After selecting the undesired event and having analyzed the system
so that we know all the causes (and, if possible, their probabilities),
we can now construct the fault tree. The fault tree is based on AND and
OR gates, which define its major characteristics.


4. Evaluate the fault tree

After the fault tree has been assembled for a specific undesired
event, it is evaluated and analyzed for any possible improvement; in
other words, the risk management is studied and ways of improving the
system are sought. This step serves as an introduction to the final step,
which is to control the hazards identified. In short, in this step we
identify all possible hazards affecting the system directly or indirectly.

5. Control the hazards identified

This step is very specific and differs largely from one system to
another, but the main point will always be that after identifying the
hazards, all possible methods are pursued to decrease the probability of
occurrence.

3.5 5-WHYS
The 5 Whys is a question-asking method used to explore the cause/effect
relationships underlying a particular problem. Ultimately, the goal of
applying the 5 Whys method is to determine a root cause of a defect or
problem.

3.5.1 Example
The following example demonstrates the basic process:

· My car will not start. (the problem)

1. Why? - The battery is dead. (first why)
2. Why? - The alternator is not functioning. (second why)
3. Why? - The alternator belt has broken. (third why)
4. Why? - The alternator belt was well beyond its useful service life
and has never been replaced. (fourth why)
5. Why? - I have not been maintaining my car according to the
recommended service schedule. (fifth why, a root cause)

The questioning for this example could be taken further to a sixth,
seventh, or even higher level. This would be legitimate, as the "five" in 5
Whys is not gospel; rather, it is postulated that five iterations of asking
why is generally sufficient to get to a root cause. The real key is to
encourage the troubleshooter to avoid assumptions and logic traps and
instead to trace the chain of causality in direct increments from the effect
through any layers of abstraction to a root cause that still has some
connection to the original problem.
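
The chain of questions in the example can be sketched as a simple lookup that is followed one step at a time. The dictionary below only restates the answers from the example; the function name ask_why and the max_depth parameter are hypothetical, and the depth limit of five is a convention rather than a rule.

```python
# Each statement maps to the answer given when asking "why?" about it.
answers = {
    "My car will not start":
        "The battery is dead",
    "The battery is dead":
        "The alternator is not functioning",
    "The alternator is not functioning":
        "The alternator belt has broken",
    "The alternator belt has broken":
        "The alternator belt was well beyond its useful service life",
    "The alternator belt was well beyond its useful service life":
        "I have not been maintaining my car according to the service schedule",
}


def ask_why(problem, answers, max_depth=5):
    """Follow the chain of answers from `problem`, asking "why?" at most max_depth times."""
    chain = [problem]
    while chain[-1] in answers and len(chain) <= max_depth:
        chain.append(answers[chain[-1]])
    return chain


chain = ask_why("My car will not start", answers)
for depth, statement in enumerate(chain):
    print(depth, statement)
```

The last entry in the chain is the candidate root cause. Raising max_depth and adding further answers would take the questioning to a sixth or seventh level.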

3.5.2 History
The technique was originally developed by Sakichi Toyoda and
was later used within Toyota Motor Corporation during the evolution of
their manufacturing methodologies. It is a critical component of problem
solving training delivered as part of the induction into the Toyota
Production System. The architect of the Toyota Production System,
Taiichi Ohno, described the 5 whys method as "... the basis of Toyota's
scientific approach ... by repeating why five times, the nature of the
problem as well as its solution becomes clear." The tool has seen
widespread use beyond Toyota, and is now used within Kaizen, lean
manufacturing, and Six Sigma. [9]

3.5.3 Criticism
While the 5 Whys is a powerful tool for engineers or technically savvy
individuals to help get to the true causes of problems, it has been
criticized by Teruyuki Minoura, former managing director of global
purchasing for Toyota, as being too basic a tool to analyze root causes to
the depth that is needed to ensure that the causes are fixed. Reasons for
this criticism include:

· Tendency for investigators to stop at symptoms rather than going
on to lower level root causes.
· Inability to go beyond the investigator's current knowledge - the
investigator cannot find causes that he does not already know.
· Lack of support to help the investigator to ask the right "why"
questions.
· Results aren't repeatable - different people using 5 Whys come up
with different causes for the same problem.
· The tendency to isolate a single root cause, whereas each question
could elicit many different root causes

These can be significant problems when the method is applied through
deduction only. On-the-spot verification of the answer to the current
"why" question, before proceeding to the next, is recommended as a good
practice to avoid these issues.

3.6 ISHIKAWA DIAGRAM


Ishikawa diagrams (also called fishbone diagrams or cause-and-effect
diagrams) are diagrams that show the causes of a certain event. Common
uses of the Ishikawa diagram are product design and quality defect
prevention, to identify potential factors causing an overall effect.

3.6.1 Overview
Ishikawa diagrams were proposed by Kaoru Ishikawa in the 1960s,
who pioneered quality management processes in the Kawasaki shipyards,
and in the process became one of the founding fathers of modern
management. [10]

The diagram was first used in the 1960s and is considered one of the
seven basic tools of quality management, along with the histogram, Pareto
chart, check sheet, control chart, flowchart, and scatter diagram. It is
known as a fishbone diagram because of its shape, similar to the side
view of a fish skeleton.

Mazda Motors famously used an Ishikawa diagram in the
development of the Miata sports car, where the required result was "Jinba
Ittai" or "Horse and Rider as One". The main causes included such
aspects as "touch" and "braking" with the lesser causes including highly
granular factors such as "50/50 weight distribution" and "able to rest
elbow on top of driver's door". Every factor identified in the diagram was
included in the final design.

3.6.2 Causes
Causes in the diagram are often grouped into categories, for example
the 4 M's described below. Cause-and-effect diagrams can reveal key
relationships among various variables, and the possible causes provide
additional insight into process behavior.

Causes can be derived from brainstorming sessions, successively
sorted through affinity-grouping to collect similar ideas together. These
groups can then be labeled as categories of the fishbone. They will
typically be one of the traditional categories mentioned above but may be
something unique to the application in a specific case. Causes can be
traced back to root causes with the 5 Whys technique.
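
The grouping of brainstormed causes into categories can be sketched in a few lines of Python. This is only an illustration: the function names build_fishbone and render are hypothetical, the effect "Quality defect" is a placeholder, and the assignment of each sample cause to one of the 4 M's is an assumption made for the example.

```python
from collections import defaultdict

# Hypothetical brainstorming output: (category, cause) pairs.
# The category assigned to each cause is an assumption for illustration.
brainstormed = [
    ("Machine", "Defective equipment or tool"),
    ("Method", "No or poor procedures"),
    ("Material", "Defective raw material"),
    ("Man power", "Training or education lacking"),
    ("Machine", "Poor maintenance or design"),
]


def build_fishbone(effect, tagged_causes):
    """Group causes by category; each category is one 'bone' of the diagram."""
    bones = defaultdict(list)
    for category, cause in tagged_causes:
        bones[category].append(cause)
    return {"effect": effect, "bones": dict(bones)}


def render(fishbone):
    """Return a plain-text outline of the diagram."""
    lines = [f"Effect: {fishbone['effect']}"]
    for category, causes in fishbone["bones"].items():
        lines.append(f"  {category}")
        lines.extend(f"    - {cause}" for cause in causes)
    return "\n".join(lines)


diagram = build_fishbone("Quality defect", brainstormed)
print(render(diagram))
```

Each cause on a bone can then be taken as the starting problem of a separate 5 Whys chain.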

3.6.3 Categories
The original 4 M's

· Machine (Equipment)
· Method (Process/Inspection)
· Material (Raw, Consumables etc.)
· Man power

More categories

· Mother Nature (Environment)
· Man Power (physical work)
· Mind Power (Brain Work): Kaizens, Suggestions
· Measurement (Inspection)
· Maintenance
· Money Power
· Management

The 8 P's (Used in Service Industries)

· People
· Process
· Policies
· Procedures
· Price
· Promotion
· Place/Plant
· Product

The 4 S's (Used in Service Industries)

· Surroundings
· Suppliers
· Systems
· Skills

3.7 KEPNER-TREGOE PROBLEM ANALYSIS

3.7.1 Kepner-Tregoe (company)


Founded in 1958 by Dr. Charles Kepner and Dr. Benjamin Tregoe,
Kepner-Tregoe, Inc., is a global organisation providing consulting and
training services around problem solving, decision making and project
execution methodologies.[11]

3.7.2 Kepner-Tregoe (technique)


Kepner-Tregoe's trademark technique, Rational Process, which is
commonly referred to as the 'KT Process', is the creation of structured,
systematic processes which are used to maximise the critical thinking
skills of key stakeholders in a particular situation, problem (potential or
real), decision or opportunity.
The Rational Processes are broken down into the following:

SITUATION APPRAISAL - the process of ensuring that priority and
order are established for multiple concerns associated with a specific
issue.
Example - The company's in-house built payroll system is becoming
outdated and increasingly difficult to support.

PROBLEM ANALYSIS - a systematic process for finding the cause of a
positive or negative deviation.
Example - The payroll system is grinding to a halt at 7am each day.

DECISION ANALYSIS - a systematic process for making a balanced
choice.
Example - What is the best alternative payroll solution to fit with the
company's needs?

POTENTIAL PROBLEM (OR OPPORTUNITY) ANALYSIS - a
systematic process for protecting an action or a plan.
Example - Determine key risks associated with implementation of a new
payroll system.

3.8 PARETO ANALYSIS


Pareto analysis is a statistical technique in decision making that is
used for selection of a limited number of tasks that produce significant
overall effect. It uses the Pareto principle – the idea that by doing 20% of
work you can generate 80% of the advantage of doing the entire job. Or,
in terms of quality improvement, a large majority of problems (80%) are
produced by a few key causes (20%).

Pareto analysis is a formal technique useful where many possible
courses of action are competing for one’s attention. In essence, the
problem-solver estimates the benefit delivered by each action, then
selects a number of the most effective actions that deliver a total benefit
reasonably close to the maximal possible one.

Pareto analysis is a creative way of looking at causes of problems
because it helps stimulate thinking and organize thoughts. However, it
can be limited by its exclusion of possibly important problems which may
be small initially, but which grow with time. It should be combined with
other analytical tools such as failure mode and effects analysis and fault
tree analysis.

3.8.1 Steps to identify the important causes using Pareto analysis
· Step 1: Form a table listing the causes and their frequency as a
percentage.
· Step 2: Arrange the rows in decreasing order of importance of
the causes (i.e., the most important cause first).
· Step 3: Add a cumulative percentage column to the table.
· Step 4: Plot the causes on the x-axis and the cumulative percentage
on the y-axis.
· Step 5: Join the above points to form a curve.
· Step 6: Plot (on the same graph) a bar graph with the causes on the
x-axis and the percent frequency on the y-axis.
· Step 7: Draw a line at 80% on the y-axis, parallel to the x-axis. Then
drop a vertical line from its point of intersection with the curve to the
x-axis. This point on the x-axis separates the important causes (on the
left) from the trivial ones (on the right).
· Step 8: Review the chart to ensure you are capturing at least 80%
of the causes.
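
The tabular part of these steps (Steps 1 to 3 and the 80% cut of Step 7) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name pareto_table, the cause labels and the occurrence counts are all hypothetical, and plotting (Steps 4 to 6) is left out.

```python
def pareto_table(counts, threshold=80.0):
    """Return rows of (cause, percent, cumulative_percent, important)."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("at least one occurrence is required")
    # Steps 1-2: frequency as a percentage, most frequent cause first.
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows = []
    cumulative = 0.0
    threshold_reached = False
    for cause, count in ordered:
        percent = 100.0 * count / total
        cumulative += percent  # Step 3: cumulative percentage column
        # Step 7: causes up to and including the one that crosses the
        # threshold line are flagged as important.
        rows.append((cause, percent, cumulative, not threshold_reached))
        if cumulative >= threshold:
            threshold_reached = True
    return rows


# Hypothetical occurrence counts per cause, for illustration only.
incident_counts = {
    "Cause C": 15,
    "Cause A": 45,
    "Cause E": 3,
    "Cause B": 30,
    "Cause D": 7,
}

table = pareto_table(incident_counts)
for cause, percent, cumulative, important in table:
    marker = "important" if important else "trivial"
    print(f"{cause:<10}{percent:6.1f}%{cumulative:7.1f}%  {marker}")

important_causes = [row[0] for row in table if row[3]]
```

With these sample counts the cumulative percentage passes 80% at the third cause, so the first three causes fall on the "important" side of the cut.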

3.9 RPR PROBLEM DIAGNOSIS


RPR is a problem diagnosis method specifically designed to determine
the Root Cause of IT problems.

3.9.1 Overview
RPR (Rapid Problem Resolution) deals with failures, incorrect output
and performance issues, and its particular strengths are in the diagnosis of
ongoing and recurring grey problems. The method comprises:

· Core Process, and
· Supporting Techniques

The Core Process defines a step-by-step approach to problem
diagnosis and has three phases:

· Discover
o Gather and review existing information
o Reach an agreed understanding
· Investigate
o Create and execute a diagnostic data capture plan
o Analyse the results and iterate if necessary
o Identify Root Cause
· Fix
o Translate diagnostic data
o Determine and implement fix
o Confirm Root Cause addressed

The Supporting Techniques detail how the objectives of the Core
Process steps are achieved, and cite examples using tools and techniques
that are available in every business.

3.9.2 Limitations
RPR has some limitations and considerations, including:

· RPR deals with a single symptom at a time
· RPR is not a forensic technique and so historical data alone is
rarely sufficient
· The Investigate phase requires the user to experience the problem
one more time

3.9.3 History
The method was originally developed in 1990 as the Rapid
Problem Resolution Method, with the first fully documented version
produced in 1995. Early versions included problem management
guidance but this was removed over time as the method became more
closely aligned to the Information Technology Infrastructure Library (ITIL).
RPR is now focused on Problem Diagnosis based on Root Cause
Identification. Due to the highly practical nature of the Supporting
Techniques and the ever changing IT landscape, Advance7 continues to
develop RPR to keep it relevant to current IT environments.
Until November 2007 Advance7 made the RPR material available to its
employees only, although a limited number of other IT professionals had
been trained in the use of the method. In late 2007 the company
announced its intention to make RPR training and material more widely
available.

CHAPTER 4

BASIC ELEMENTS OF ROOT CAUSE ANALYSIS

· Materials
o Defective raw material
o Wrong type for job
o Lack of raw material
· Machine / Equipment
o Incorrect tool selection
o Poor maintenance or design
o Poor equipment or tool placement
o Defective equipment or tool
· Environment
o Orderly workplace
o Job design or layout of work
o Surfaces poorly maintained
o Physical demands of the task
o Forces of nature
· Management
o No or poor management involvement
o Inattention to task
o Task hazards not guarded properly
o Other (horseplay, inattention....)
o Stress demands
o Lack of Process
· Methods
o No or poor procedures
o Practices are not the same as written procedures
o Poor communication
· Management system
o Training or education lacking
o Poor employee involvement
o Poor recognition of hazard
o Previously identified hazards were not eliminated
o 4ME (Man, Machine, Materials, Method and Environment).

CHAPTER 5

ROOT CAUSE ANALYSIS AND CASUALTY
INVESTIGATION IN MARITIME INDUSTRY

Investigation of accidents, casualties, or near misses is critical in
the prevention of future recurrence. In the maritime industry valuable
lessons can be learned from mishaps if they are investigated properly and
expeditiously; unfortunately such investigations are becoming a rarity.
Let me explain.

5.1 THE WAY INVESTIGATIONS USED TO BE DONE

If one goes back to the 1970s and earlier there was a standard
format. First, there was a gathering of facts called FINDINGS OF
FACTS. Each of these was substantiated by some document included or
referenced. When the facts were all accumulated, the investigator would
then draw some CONCLUSIONS directly from the FACTS. Lastly,
RECOMMENDATIONS were made based strictly upon the FACTS and
CONCLUSIONS.

So basically everything was supported by something. Simple
reports were typically completed within a month, whereas more
complicated casualty investigations took about 6-8 months. The
reports were released while the casualty was still fresh in mariners’ minds
and were of value to the maritime community.

5.2 WHY INVESTIGATE INCIDENTS?

Maritime industries experience incidents that range from major
accidents to near misses (or more appropriately, near hits). Why should
these incidents be investigated? International agreements mandate it (such
as the IMO “International Safety Management Code”), many flag
administrations require it, and industry initiatives (such as the Oil
Companies International Marine Forum’s Tanker Management Self
Assessment scheme) encourage it. Investigating incidents is also good
business: if one can prevent recurrence and reduce the likelihood of other
incidents with the same root causes, costs (human, environmental and
property) associated with incidents can be eliminated.

CHAPTER 6

REPORTS AND ANALYSIS OF NON-CONFORMITIES
AS DESCRIBED IN THE COMPANY’S SAFETY
MANAGEMENT SYSTEM MANUAL

6.1 GENERAL

Reports of non-conformities, accidents and hazardous occurrences are
sent to the Company by the Master as soon as possible after the
occurrence. The first report may be verbal. However, a written report
must always be sent as soon as possible. [12]

Reports are sent by the Master to the responsible Managers and to
the Designated Person in the following cases of incidents:

a. Accidents
b. Hazardous occurrences
c. Non-conformities

The Designated Person will analyse all reports on non-conformities
and will arrange for reviews with the Senior Management at the annual
Management review. Necessary corrective actions will be also discussed
at the annual Management review.

6.2 RESPONSIBILITIES

All personnel, both ashore and onboard ship, are responsible for
identifying and reporting non-conformities.

The Manager/Master under whose responsibility the non-conformity
rests informs other relevant Departments or personnel and
initiates the required corrective action.

The Designated Person is responsible for:

a. Recording all non-conformity reports (NCRs)
b. Evaluating each NCR to determine whether there has been a failure
in the Safety Management System (SMS)
c. Ensuring that the measures taken to correct the deficiencies are
effective

6.3 DEFINITIONS

6.3.1 Non-conformities

Operations which do not comply with established procedures or
specifications are considered as NON-CONFORMITIES. Material
breakdowns due to fair wear and tear and with no impact on work tasks
shall not be regarded as non-conformities.

6.3.2 Accidents

All undesired events that result in harm to people, damage to
property or loss to process are considered as ACCIDENTS.

6.3.3 Hazardous Occurrences (Near-misses)

All undesired events, which, under slightly different circumstances,
could have resulted in harm to people, damage to property or loss to
process, are considered as HAZARDOUS OCCURRENCES.

6.4 PROCEDURES

The following procedures should be in place in order to ensure that a
non-conformity of any nature is identified and corrective action taken:

a. Record, analyse and make decisions on matters which have resulted in
complaints from Owners, Charterers, Terminal or Port Senior
Management and flag or port state competent authorities.

b. Provide resources to enable corrective action to be taken when
non-conformities have been identified.

c. Maintain effective communications to ensure all applicable parties
are kept informed on:
1. Corrective action taken or planned
2. Progress of the corrective action

d. Review and update the Company’s procedures, when necessary.

6.5 REPORTING NON-CONFORMITIES (NCRs)

All shore-based and marine personnel may report non-conformities
to their respective Manager or Master, either in writing or verbally.

Non-conformities are distinguished into the following categories:

a. Shipboard non-conformities will be noted by the Master of the
vessel concerned and will be reported to the appropriate Manager,
who in turn will evaluate the report and, where necessary, will raise an
NCR.

b. Shore-based non-conformities will be brought to the attention of
the responsible Manager or alternatively identified by him.

c. Non-conformities identified by internal and external auditors are
brought to the attention of the Designated Person.

All NCRs are finally brought to the attention of the Designated
Person, who reviews and evaluates them and takes all the necessary
actions.

The following three main sources may provide evidence of
non-conformities:

a. Information within the Company. Such information may come from:

· Deck and Engine log books.
· Communications to/from vessels.
· Voyage abstracts.
· Inspection and maintenance reports.
· Internal and External audit reports.
· Port State control and Flag State Inspection reports (detentions,
safety and pollution control items, drills, maintenance, crew and
vessel certificates).
· Classification Society reports.
· Loading/discharging documents.

Detentions, in particular, shall always be treated as internal
non-conformities and processed accordingly.

b. The Owner or Charterer. This may be in the form of a letter of
protest, complaints or claims.
c. A Third Party in general. Such information may be obtained through
survey inspection reports, incident requests from another vessel or
claims.

The person who reports the NCR must describe the incident in
detail, give information about the possible causes, and inform the
Company on the corrective actions already taken or suggested to be
taken.

6.6 CORRECTIVE ACTIONS

6.6.1 Treatment

The treatment of a non-conformity depends on:

· The competent Administration and/or the classification society
· The range of severity
· The cause
· The effect
· The type

Management undertakes to correct immediately any damages or
failures that affect the safety of the ship and the protection of the
environment.

6.6.2 Corrective actions

The Department’s Manager or the Master, as appropriate, shall
plan and initiate all corrective actions and follow up to ensure that such
actions have been effective. All corrective actions shall be reported to the
Designated Person.

Appropriate corrective action may be:

· Revision of a procedure or operating instructions
· Issue of a new procedure or operating instructions
· Removal of a supplier or sub-contractor from the Company’s
approved list
· Ensuring that personnel adhere to Safety procedures
· Further Training/Education

6.6.3 Assessment of the cause

Following initial corrective action, an investigation shall be made to
determine the underlying cause of the non-conformity and plans shall be
formulated for permanent correction.

The Manager or Master concerned should make an assessment of the
underlying cause of the non-conformity, with assistance from the
Designated Person.

All incidents, near misses or hazardous occurrences,
non-conformances and defects where the underlying cause is not clear shall
be investigated using the Root Cause Analysis Methodology and relevant
form RCA (Chapter 7). The root cause may comprise a number of
contributory factors, which are interlinked or result from interacting
systems or work activities.

The “immediate cause” is the circumstance of an event, defect or
failure, which results in an immediate consequence.

The “root cause” is the underlying cause and may be made up of a
number of factors such as training, planning or lack of inspection,
reporting and review.

Some accidents result from the untimely “conspiracy” of events or
the effects of energized systems being unexpectedly operated
concurrently.

6.6.3.1 Root Cause Analysis Methodology

The Root Cause Analysis comprises an investigation into the
defect, NC or incident in order to identify the factors that led to the
problem.

There are a number of stages that can be implemented in the Root
Cause Analysis Process:

1. Determine the significance of the event, defect or NC.
2. Deploy the “5 WHYs” principle:
WHY – do we have the defect, incident or non-conformance?
WHY – was there malfunction or poor condition?
WHY – immediate cause (what happened at the time)?
WHY – underlying cause?
WHY – was the underlying cause allowed to happen?
3. Investigate the proximate cause (being non-compliance with a
Legal Requirement – such as a particular SOLAS Regulation),
relating to the incident.
4. Identify and investigate the “process” and “non-process” factors
that may have conspired to contribute to the incident.
5. Gather relevant data and evidence.
6. Select interview methods and interview relevant persons.
7. Use the fault tree diagram to assist in the identification of
possible contributory factors, sometimes known as the “causal
causes”.
8. Analyze data to identify all contributory factors.
9. Develop a cause and effect analysis out of contributory factors.
10. Develop a “fault tree” to help you find all possible scenarios and
determine the most likely scenario.
11. Develop conclusions and produce recommendations.

Contributory factors may be made up of a number of potential causes.
The findings are evaluated for significance and corrective actions
recommended in order to improve the processes which were ineffective in
preventing the defect or incident, for the purpose of improving procedures
and preventing recurrence.

A Root Cause Analysis Report (Form RCA) should be attached to
the incident report, non-conformity report (NCR) or defect report and
given an appropriate serial number.

The corrective and preventive measures shall be reviewed
immediately by the Designated Person Ashore (DPA) and implemented
with appropriate changes and amendments. Implementation can take
place either by direct inclusion into the Safety Management System
(SMS) or by inclusion in a relevant Circular for a reasonable trial period
before final inclusion in the SMS. In either case corrective and preventive
measures shall be audited and reviewed to assess their effectiveness, and
any changes made to improve the measures taken to prevent recurrence.

6.6.3.2 Training for Root Cause Analysis

The DPA, Asst. DPA, Operations Manager(s) and Technical
Manager(s) shall be trained in the principles and methods of “Root
Cause Analysis”.

Training - verifiable from records - shall include Form RCA and
methods used to:

- Define a near miss, hazardous occurrence, defect and
Non-conformance.
- Investigate the immediate or proximate cause of the event, defect or
Non-conformance.
- Train others to recognize and report defects and incidents.
- Initiate and conduct investigations.
- Gather information and evidence during an investigation, including
by interviewing.
- Review data.
- Establish a team or work with other personnel.
- Identify possible contributory causes.

6.6.4 Recording

All raised non-conformities are recorded using form NCR. The
reports are numbered in sequence using separate numbering for each
month in the format mm/yy/xx, where xx is a two-digit serial number and
mm and yy are the month number and the last two digits of the year
respectively. Each vessel and the Office have separate numbering.
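
The numbering rule described above can be illustrated with a short Python sketch. The class name NcrNumbering, the unit names and the dates are hypothetical and serve only to show the mm/yy/xx format with a separate sequence per month and per vessel or Office; the limit of 99 reports per month is an assumption that follows from the two-digit serial number.

```python
from collections import defaultdict
from datetime import date


class NcrNumbering:
    """Issues NCR numbers in the format mm/yy/xx, per unit and per month."""

    def __init__(self):
        # (unit, year, month) -> last serial number issued
        self._last_serial = defaultdict(int)

    def next_number(self, unit, raised_on):
        key = (unit, raised_on.year, raised_on.month)
        serial = self._last_serial[key] + 1
        if serial > 99:
            # Assumption: a two-digit serial allows at most 99 NCRs a month.
            raise ValueError("two-digit serial number exhausted for this month")
        self._last_serial[key] = serial
        return f"{raised_on.month:02d}/{raised_on.year % 100:02d}/{serial:02d}"


numbering = NcrNumbering()
print(numbering.next_number("Vessel A", date(2010, 3, 5)))   # first NCR in March
print(numbering.next_number("Vessel A", date(2010, 3, 20)))  # second NCR in March
print(numbering.next_number("Office", date(2010, 3, 20)))    # Office has its own sequence
print(numbering.next_number("Vessel A", date(2010, 4, 2)))   # sequence restarts in April
```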

The non-conformity reports are always prepared in duplicate. One
copy is kept by the Designated Person in a separate file, namely the
“NON-CONFORMITIES” file, while the other copy is kept by the
Master or the Head of the Department concerned.

The DPA, in co-operation with the responsible department, should
decide on the time limit for the completion of corrective actions. This
time limit must be clearly stated on the report. If, however, it is
anticipated that the set time limit will be exceeded due to objective or
unforeseen reasons, it can be extended and the NCR be rescheduled
accordingly.

A non-conformity remains “open” until the decided corrective
actions have been applied and this has been verified accordingly. For
each corrective action applied by the responsible person, the form is
suitably endorsed. After verification of all corrective actions, the
non-conformity is closed-out and the relevant form is suitably endorsed.
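
The open/closed rule can be expressed as a small Python sketch. The class and field names below are hypothetical, as is the NCR number used in the example; treating an NCR with no corrective actions yet decided as "open" is an assumption, not something the procedure states.

```python
from dataclasses import dataclass, field


@dataclass
class CorrectiveAction:
    description: str
    applied: bool = False   # endorsed on the form by the responsible person
    verified: bool = False  # verification that the action has been applied


@dataclass
class NonConformityReport:
    number: str
    actions: list = field(default_factory=list)

    @property
    def status(self) -> str:
        # Open until every decided corrective action is applied and verified.
        # Assumption: with no actions decided yet, the NCR is still open.
        if self.actions and all(a.applied and a.verified for a in self.actions):
            return "closed"
        return "open"


ncr = NonConformityReport(
    "03/10/01",  # hypothetical NCR number
    [
        CorrectiveAction("Revision of a procedure"),
        CorrectiveAction("Further Training/Education"),
    ],
)
print(ncr.status)  # no action applied yet

ncr.actions[0].applied = ncr.actions[0].verified = True
print(ncr.status)  # one action still outstanding

ncr.actions[1].applied = ncr.actions[1].verified = True
print(ncr.status)  # all actions applied and verified
```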

CHAPTER 7
ANALYSIS OF REAL CASES WITH NON-
CONFORMITIES AND NEAR-MISSES
During the previous year fourteen (14) cases of Non-Conformities
and Near-Misses were raised in a Shipping Company. In order to analyse
each case, the form “RCA” has to be filled in and a fault tree
diagram constructed. A blank form “RCA” is shown on the next page.

RCA No:

RCA Date:

Type of Incident:

Incident Details:

Immediate Cause:

Procedures Reviewed:

5 WHYs Process:

1 - WHY

2 - WHY

3 - WHY

4 - WHY

5 - WHY

Analysis process:
Contributory Causes:

Root Cause(s):

Recommendations:
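
For illustration, the fields of the blank form can be mirrored in a simple record. The following Python sketch is an assumption about how such a form might be held electronically; the class name RcaForm, the field names and the missing_fields helper are hypothetical. The sample values are taken from Case No2.

```python
from dataclasses import dataclass, field, fields


@dataclass
class RcaForm:
    """One field per entry of the blank form 'RCA'."""
    rca_no: str = ""
    rca_date: str = ""
    incident_type: str = ""
    incident_details: str = ""
    immediate_cause: str = ""
    procedures_reviewed: str = ""
    whys: list = field(default_factory=list)  # answers for 1 - WHY to 5 - WHY
    analysis_process: str = ""
    contributory_causes: str = ""
    root_causes: str = ""
    recommendations: str = ""

    def missing_fields(self):
        # Names of the entries that have not been filled in yet.
        return [f.name for f in fields(self) if not getattr(self, f.name)]


form = RcaForm(
    rca_no="002",
    incident_type="Non-conformity",
    incident_details="Emergency lights found inoperative during internal audit.",
    analysis_process="Fault Tree Diagram",
)
print(form.missing_fields())
```

A check of this kind shows at a glance which entries still have to be completed before the form is attached to the NCR.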

The fourteen cases are as follows:

1. Bad record keeping of fuel consumption and change-over
procedure of low Sulphur. Old version of Oil Record Book on
board. Consumption of high sulphur fuel oil in restricted area.
2. Emergency lights found inoperative during internal audit.
3. On repeated occasions obsolete forms were in use on-board
company vessels instead of updated ones. This was despite
corrective actions taken after each occurrence.
4. Failing to identify, report and correct vessel defects and non-
conformances in timely manner, thereby causing delay in
implementation of corrective actions.
5. S-VDR and Inmarsat C found out of order / Flag not advised
6. Various failures in monitoring of lub-oil analysis, including
frequency of sampling, availability of results and follow-up on
remedial actions when results indicate "caution" or "alert" status.
7. Maintenance and inspection of safety equipment items (life boats,
fire pump, CO2 bottles) and incorrect/insufficient implementation
of existing procedures.
8. During cargo operations the crane jib fell on deck due to hoisting
wire failure.
9. Poisoning of Master, all officers and crew due to cargo hold
fumigation (aluminium phosphide) entering accommodation.
10. Only one deck officer designated to carry out tank and hold
inspections and there was no evidence that this person was in any
event familiar with the publication "Guidelines for surveys and
assessment of hull structures" and may be regarded as untrained as
per requirements of the SMS.
11. The new Chief Officer had only recently joined the vessel (one
month ago) and he had not yet been trained up as the designated
hold and tank inspector. There was no evidence that this person
was in any event familiar with the publication "Guidelines for
surveys and assessment of hull structures" and may be regarded as
untrained as per requirements of the SMS.
12. The existing multi gas detector had been supplied with the wrong
accessory for remote sampling.
13. The fuel tank of the Emergency Generator was found half full.
14. It was noted that no printouts are kept for intermediate loading /
discharging stages and it could not be verified that such
information was readily available for all cargo operations through
the ship's loading calculations software.

The above-mentioned cases are analysed below:

Case No1

Bad record keeping of fuel consumption and change-over procedure of
low Sulphur. Old version of Oil Record Book on-board. Consumption of
high sulphur fuel oil in restricted area.

Regarding this case


From 19 May 2006, all vessels operating within SOx Emission
Control Areas (SECAs) must use fuel with a sulphur content that does not
exceed 1.5% m/m.

These areas are:


1. Baltic Sea - came into force on 19 May 2006
2. North Sea and English Channel - came into force on 11 August
2007
3. California - came into force on 1 July 2009

The sulphur content of the fuel oil received on-board shall be
documented by the supplier in the Bunker Delivery Note. Each vessel
shall have fuel tanks with sufficient capacity designated for the storage of
low-sulphur fuel. Prior to entering a Sulphur Emission Control Area
(SECA), the Chief Engineer shall verify that all necessary arrangements
have been made to ensure that all machinery on-board affected by the
Regulation will be operating on low-sulphur fuel only (sulphur content
less than or equal to 1.5%) for the entire passage through the SECA. The
volume of low-sulphur fuel oils in each tank, as well as the date, time and
position of the ship when any fuel changeover operation between
high-sulphur fuel and low-sulphur fuel is completed, shall be duly recorded
in the Marine Fuel Sulphur Record Book. Also, in order to ensure that
low-sulphur fuel is not mixed with high-sulphur fuel residues, vessels
without dedicated low-sulphur fuel tanks should arrange fuel transfers in
such a way that at least one fuel tank is stripped and drained to the
maximum extent possible prior to bunkering low-sulphur fuel.

It was noted that in several vessels, the above procedures were not
properly followed. Furthermore, in many cases the Oil Record Book was
not filled in correctly.

In order to avoid such deficiencies in the future it is important to
provide detailed instructions in the native language of the crew regarding
new regulations and on how to fill in Record Books.

RCA No: 001

Type of Incident: Non-conformity


Incident Details: Bad record keeping of fuel consumption and change-over procedure of low
Sulphur. Old version of Oil Record Book on board. Consumption of high sulphur fuel oil in restricted
area.

Immediate Cause: Incomplete records in the Marine Fuel Sulphur Record Book, incorrect records
of fuel quality/quantity.

5 WHYs Process:

1 - WHY ch. Engineer was not fully aware how to keep records in the Marine fuel Sulphur record book?

2 - WHY he did not read instructions properly?

3 - WHY the instruction how to fill the book were not translated into native language of engineer?

4 - WHY the other ships were not informed?

5 - WHY

Analysis process: Fault Tree Diagram - See Attached


Contributory Causes: Lack of participation, training of ch. Engineer and auditing from the office.
Lack of corrective action. Lack of information from the office (not supplying new forms of oil record
book)

Root Cause(s): General lack of English language and office processing skills of individual ch.
Engineers.

Recommendations: When new record book issued it is better to provide detailed instruction in
native language of crew how to fill in. Office control of supply of new publications to be improved.

Case No2

Emergency lights found inoperative during internal audit.

Regarding this case

All Shipping Companies have adopted specific testing procedures
for equipment and systems on-board the vessels, identified as critical, in
order to ensure their functional reliability. Such systems include all
emergency equipment, like:

1. Emergency steering gear
2. Emergency fire pump
3. Emergency Diesel Generators or batteries
4. Emergency air compressors
5. Emergency lights

As confirmation that such testing procedures and inspections are
carried out, forms must be completed, in which all defects are reported in
conjunction with their possible causes and corrective actions, if any.

In conclusion, it is clearly unacceptable for part of the emergency
equipment to be found inoperative during an internal audit, because any
defects should have been identified beforehand by the crew.

RCA No: 002

Type of Incident: Non-conformity


Incident Details: Emergency lights found inoperative during internal audit.

Immediate Cause: Emergency lights (Failure of some bulbs) were not tested by safety officer in
operation as required.

5 WHYs Process:

1 - WHY emergency lights were not operational?

2 - WHY lights were not tested during inspection?

3 - WHY did senior officers not supervise quality of inspection?

4 - WHY lack of supervision allowed to happen?

5 - WHY the other ships were not informed?

Analysis process: Fault Tree Diagram - See Attached


Contributory Causes: Lack of participation and supervision. Incorrect analysis of criticality.
Electrician did not carry out regular inspections as described in the Safety Management System
Manual.

Root Cause(s): General Lack of the safety culture of the individual ship's officers

Recommendations: Training in emergency awareness to be carried out for safety officers. Masters
of the other ships of the fleet to be informed about the incident in the form of a Fleet circular. In order
to improve emergency awareness, crew members to be provided on a regular basis with information
about accidents/casualties that have happened on merchant vessels.

Case No3

On repeated occasions obsolete forms were in use on-board company
vessels instead of updated ones. This was despite corrective actions taken
after each occurrence.

Regarding this case


Any changes to the documented system are recorded on
Amendment Record Sheets shown in the front of each copy of the Safety
Management System Manual (SMSM). Modifications in the SMSM
consist of revisions of manual pages and version changes of the entire
manual. Modifications in Safety Management System (SMS) Forms
consist of version changes only.

The Master has overall responsibility for maintaining the SMS
filing system and ensuring that any amendments therein are complied
with. Together with the Chief Officer, he is responsible for maintaining
the SMS files related to deck operations, navigation, safety, training and
deck maintenance.

The Chief Engineer is responsible for maintaining the SMS files related
to machinery operations and machinery maintenance.

All SMS forms are kept in separate files distinguished by different
colours and/or numbers. The forms are divided according to the
operations and procedures they cover. Changes in Forms result in
replacement of modified forms, thus modifying the form’s Version No.
The version number and issue date are indicated on the top right of all
form pages.
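
Because every form carries a version number, the withdrawal of obsolete forms can in principle be verified mechanically. The following is a minimal sketch, assuming a hypothetical office register of current versions and a hypothetical inventory of the versions filed on board; the form codes and version numbers are invented for illustration and do not come from the company's SMS.

```python
def find_obsolete_forms(current_versions, onboard_versions):
    """Return the codes of forms whose on-board version is older than the office version."""
    obsolete = []
    for form_code, onboard_version in sorted(onboard_versions.items()):
        current = current_versions.get(form_code)
        # Forms unknown to the office register are skipped in this sketch.
        if current is not None and onboard_version < current:
            obsolete.append(form_code)
    return obsolete


# Hypothetical data: form code -> Version No.
office_register = {"F-01": 3, "F-02": 2, "F-03": 5}
vessel_files = {"F-01": 3, "F-02": 1, "F-03": 4}

print(find_obsolete_forms(office_register, vessel_files))  # ['F-02', 'F-03']
```

A check of this kind could be run at each internal audit or whenever a new version is forwarded to the vessels, so that the office becomes aware of an outdated form before it is used again.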


RCA No: 003

Type of Incident: Non-conformity


Incident Details: On repeated occasions obsolete forms were in use on-board company vessels
instead of updated ones. This was despite corrective actions taken after each occurrence.

Immediate Cause: New versions of forms were forwarded to vessels but the Master failed to update
the controlled Safety Management System Manual (SMSM) on board and/or to ensure that
the obsolete form was withdrawn. As a result the old version kept being used.

5 WHYs Process:
1 - WHY were old versions of forms found in use?
2 - WHY was the SMSM on board not controlled effectively?
3 - WHY did the Master fail to control the SMSM on board?
4 - WHY did the procedures for Document Control fail?
5 - WHY was the office not aware of the above failure?

Analysis process: Fault Tree Diagram - See Attached


Contributory Causes: Masters did not realise the importance of document control. Lack of training
and guidance of Masters. Inadequate reviews of Non Conformities / monitoring of the effect of
corrective actions. Corrective actions failed to address the long-term problem, dealing instead
with the current symptom each time.

Root Cause(s): Lack of clear procedures for document control on-board vessels. Insufficient
monitoring/auditing.

Recommendations: Clear relevant procedure to be included in SMSM. Awareness training of
Masters on the subject. Effectiveness / compliance to be audited.


Case No4

Failing to identify, report and correct vessel defects and
non-conformances in a timely manner, thereby causing delay in
implementation of corrective actions.

Regarding this case

The Company has established procedures for reporting
defects/malfunctions observed during scheduled inspections and damages
discovered during malfunctions or breakdowns of hull, machinery and
equipment. The relevant forms must be used and immediately submitted
to the responsible Technical Manager. A copy of each form must also be
kept in a separate file onboard.

Furthermore, the Master shall immediately notify the Designated
Person Ashore of any major defects, malfunctions or breakdowns of hull,
machinery and equipment, seriously affecting the safety and health of the
personnel, the ship or the pollution prevention arrangements, which
cannot be repaired by the shipboard personnel.

The procedures are clear, but the Captain, owing to inadequate
training in and awareness of the requirements of the company Safety
Management System Manual (SMSM), was not able to identify and
report defects in a timely manner.


RCA No: 004

Type of Incident: Non-conformity


Incident Details: Failing to identify, report and correct vessel defects and non-conformances in
a timely manner, thereby causing delay in implementation of corrective actions.

Immediate Cause: Failure by Master to ensure adequate inspections of the vessel and failure to
report defects using form M001.

Procedures Reviewed: Master Responsibilities 10.1, Reporting Non-compliance 5.2, Crew
motivation 5.3, Masters Reviews of SMS 5.5, Maintenance Guidelines 5.4, Monitoring Effectiveness
of SMS 5.5, Reporting defects/damage SMS 10.1.3.3.1, Critical Equipment 10.1.4

5 WHYs Process:
1 - WHY were the defects not reported?
2 - WHY were the defects not identified?
3 - WHY, if defects were identified, were they not corrected?
4 - WHY was SMS procedure 10.1.3.3.1 not complied with?
5 - WHY was the Master/Ch. Engineer not aware of the requirements of SMS and SOLAS?

Analysis process: Fault Tree Diagram - See Attached

Contributory Causes: Lack of comprehensive inspection. Training of safety officers and those
responsible for the maintenance of LSA/FFE and critical equipment. Inadequate record keeping.
Non-compliance with SMS 10.1.3.3.1. Inadequate training and awareness of Masters in the
requirements of the company SMS.

Root Cause(s): Lack of understanding of the principles of the ISM Code by the Masters.
Inadequate understanding of the importance of reporting and correcting defects on a regular basis.
Lack of appreciation of the significance of Port State Control Inspections and ultimately of the
management of defects/NCNs.

Recommendations: Training of management, Masters and Chief Engineers in the principles of the
ISM Code and in the implementation of the company SMS, especially in the areas of maintenance
and reporting. Improved internal auditing to capture onboard management failings. Longer term
review of an increase in the frequency of vessel visits by shore Management.


Case No5

S-VDR and Inmarsat C found out of order / Flag not advised

Regarding this case

S-VDR and Inmarsat C are part of the safety equipment, therefore
both devices should be checked daily for faults or errors.

Furthermore, when defects or faulty items are discovered during
scheduled inspections or when breakdowns occur, they must be reported
through the vessel’s Technical Manager to the vessel’s Administration
and/or Classification Society, because seaworthiness is affected.


RCA No: 005

Type of Incident: Non-conformity


Incident Details: S-VDR and Inmarsat C found out of order / Flag not advised

Immediate Cause: S-VDR not reported as it was not considered critical to safety. It was newly
fitted and under maker's warranty. Inmarsat C - second unit temporarily out of order - other unit
still functioning and situation not considered a risk during lead up to repair.

Procedures Reviewed: Safety Management System Manual sections 1.2.2.2, 1.2.3, 1.4, 5, 6, 8,
10, 11, 12 Nothing in particular SMS Para 10.2.3 Statutory and Class Surveys. SOLAS Ch. 1 reg 9,
reg 11, ch. V reg 18, reg 20.

5 WHYs Process:
1 - WHY were the defects not reported to Class and/or Flag?
2 - WHY did the office not report the defects to Class/Flag?
3 - WHY were the requirements of SOLAS ch. 1 reg 11 not complied with?
4 - WHY was the Master not aware of the requirement to advise Class and/or Flag?
5 - WHY

Analysis process: Fault Tree Diagram - See Attached

Contributory Causes: No company procedures for reporting defects under SOLAS ch. 1 Reg 11
of items of equipment mandatorily required for the safety equipment survey and ships radio survey
certificate, noting SOLAS ch. V reg 18 and 20 and taking into consideration IMO MSC 163(78)
concerning SVDRs. The Master failed to ensure that Class was informed. DPA did not identify
the requirement to report these defects. Failure to monitor and review effectiveness of SMS.

Root Cause(s): Lack of relevant procedures, of adequate SMS audits and of effective preventive
action.

Recommendations: Improvement to SMS procedures, esp. 10.2.3 - Statutory & Class Surveys, to
include guidelines on reporting to Class any defects that either directly affect the safety of the
vessel and/or concern items included in the Class Survey Certificate. Company is to provide
guidance in the SMS to Masters, Chief Engineers and other officers on what might constitute Class
items that require reporting. Awareness training of Technical Managers and DPA.


Case No6

Various failures in monitoring of lub-oil analysis, including frequency of
sampling, availability of results and follow-up on remedial actions when
results indicate "caution" or "alert" status.

Regarding this case

It has come to the attention of the company that on several
occasions, oil samples for Main Engine, Auxiliary Engines, Steering Gear
and Stern Tube were not landed for analysis at the required frequencies.
Furthermore, it has been noted that landed samples have been lost on
occasions without this being identified in time. In this respect, the Master
and Chief Engineer are requested to send notifications to the Technical
Department each time oil samples are landed for analysis.

This notification should include:


i. the date when samples were landed,
ii. the place (port) where landed, and
iii. equipment from which samples were taken.

Samples should be collected according to the following schedule:


1. Main Engine: every 3 months, samples before and after purifier.
2. Diesel Generators: every 3 months, unless the oil of the
generators was changed recently. Oil changes to be reported
in the remarks section of the engine log abstract.
3. Steering gear and Stern tube: every 6 months

In order to follow this time schedule the company set up an
automated reminder system to monitor due dates for analysis (and track
dates becoming overdue) and issue appropriate reminders to the vessels
and the Technical Department. Office personnel should check for
availability of results on the basis of submittal dates of samples and
forward results to the vessels and the Technical Department with a reminder
to co-ordinate follow-up actions when this is required by the results.
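
The core of such a reminder system is a due-date calculation per item of equipment. The sketch below is only an illustration of that idea, not the company's actual system: the intervals are taken from the schedule above, while the function names and landing dates are hypothetical, and the exception for recently changed generator oil is not modelled.

```python
import calendar
from datetime import date

# Sampling intervals in months, taken from the schedule above.
SAMPLING_INTERVAL_MONTHS = {
    "Main Engine": 3,
    "Diesel Generators": 3,
    "Steering Gear": 6,
    "Stern Tube": 6,
}


def add_months(start, months):
    """Return the date `months` calendar months after `start`, clamping the day of month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def overdue_samples(last_landed, today):
    """List (equipment, due_date) pairs whose next sample was due before `today`."""
    overdue = []
    for equipment, interval in SAMPLING_INTERVAL_MONTHS.items():
        due = add_months(last_landed[equipment], interval)
        if due < today:
            overdue.append((equipment, due))
    return overdue


# Hypothetical dates on which samples were last landed.
last_landed = {
    "Main Engine": date(2010, 1, 15),
    "Diesel Generators": date(2010, 3, 1),
    "Steering Gear": date(2009, 11, 30),
    "Stern Tube": date(2010, 2, 10),
}

for equipment, due in overdue_samples(last_landed, today=date(2010, 6, 1)):
    print(f"{equipment}: sample was due on {due.isoformat()}")
```

With these example dates the Main Engine and Steering Gear samples are reported as overdue. The landing dates would come from the notifications that the Master and Chief Engineer send to the Technical Department.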


RCA No: 006

Type of Incident: Non-conformity


Incident Details: Various failures in monitoring of lub-oil analysis, including frequency of sampling,
availability of results and follow-up on remedial actions when results indicate "caution" or "alert"
status.

Immediate Cause: Existing procedures not properly followed.

Procedures Reviewed: SMS documentation or relevant procedure in paragraphs 10.1.2.4,
10.2.2.7

5 WHYs Process:
1 - WHY were sampling frequencies not followed?
2 - WHY were analysis results not monitored?
3 - WHY was there no follow-up when remedial actions were required?
4 - WHY were no means for taking samples provided?
5 - WHY

Analysis process: Fault Tree Diagram - See Attached


Contributory Causes: Poor control of supply of sampling kits, inconvenient trading schedule
(including long voyages or remote ports), poor co-operation from port agents.

Root Cause(s): Lack of proper monitoring of sampling frequencies. Lack of monitoring of results.
Lack of provision of resources required to perform the requested task (availability of sampling kits).

Recommendations: To set up an automated reminder system to monitor due dates for analysis
(and track dates becoming overdue) and issue appropriate reminders to the vessels and the Tech.
dept. To assign office personnel to check for availability of results on the basis of submittal dates of
samples and forward results to the vessels and the Technical dept. with a reminder to co-ordinate
follow-up actions when this is required by the results. To set up a procedure for monitoring the
availability of sufficient sampling kits on board and requesting replenishment when used up.


Case No7

Maintenance and inspection of safety equipment items (life boats, fire
pump, CO2 bottles) and incorrect/insufficient implementation of existing
procedures.

Regarding this case

The ship’s Safety Officer (in many shipping companies this is the
third officer), who works under the Master, is responsible for
ensuring that regular inspections and tests of all shipboard Safety and
Environment Protection equipment are carried out and that the crew
follow the emergency drills and onboard training.
Furthermore, he is responsible for recording the test date and condition of
such equipment on the relevant Card enclosed in the aforementioned manual,
duly signed and dated by both the Master and Chief Officer. Any defect
must also be reported in accordance with the relevant procedures
explained in the company’s Safety Management System Manual.


RCA No: 007

Type of Incident: Non-conformity


Incident Details: Maintenance and inspection of safety equipment items (life boats, fire pump,
CO2 bottles) and incorrect/insufficient implementation of existing procedures.

Immediate Cause: Failure by Master and other responsible officers to report defects and ensure
proper follow-up.

5 WHYs Process:
1 - WHY was the maintenance ineffective?
2 - WHY was the inspection regime ineffective?
3 - WHY were there no defect reports?
4 - WHY were there no corrective actions?
5 - WHY

Analysis process: Fault Tree Diagram - See Attached


Contributory Causes: Inadequate understanding of the importance of reporting defects.
Insufficient company procedures to monitor routine check lists completed on board.

Root Cause(s): Insufficient or inaccurate information provided in shipboard reports.

Recommendations: Training of Master and responsible officers in the areas of maintenance and
reporting. Improve onboard supervision of safety officer. Evaluate quality of reports by reviewing
observed vessel condition during attendances against submitted reports. Possible increase of
attending frequency.


Case No8

During cargo operations the crane jib fell on deck due to hoisting wire
failure.

Regarding this case

The investigation revealed that the hoisting wire was not in good
condition although greasing schedules had been maintained. In order to
deal with this problem specific instructions were given to all fleet vessels,
requiring that all crane wires be changed five years after the
installation date, regardless of operating hours. Appropriate records must
be kept and timely requisitions made in order to ensure that this
procedure is adhered to.
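
The five-year rule and the need for timely requisitions can be expressed as a simple date check on the installation records. This is a minimal sketch under stated assumptions: the five-year service life comes from the instruction above, whereas the 90-day requisition lead time and the function names are hypothetical.

```python
from datetime import date, timedelta

WIRE_SERVICE_LIFE_YEARS = 5   # from the fleet instruction above
REQUISITION_LEAD_DAYS = 90    # assumption for illustration, not from the source


def replacement_deadline(installed):
    """Return the date five years after installation."""
    target_year = installed.year + WIRE_SERVICE_LIFE_YEARS
    try:
        return installed.replace(year=target_year)
    except ValueError:
        # Installed on 29 February and the target year is not a leap year.
        return installed.replace(year=target_year, day=28)


def wire_status(installed, today):
    """Classify a crane wire as 'in service', 'raise requisition' or 'replace now'."""
    deadline = replacement_deadline(installed)
    if today >= deadline:
        return "replace now"
    if today >= deadline - timedelta(days=REQUISITION_LEAD_DAYS):
        return "raise requisition"
    return "in service"


print(wire_status(date(2005, 6, 1), today=date(2010, 4, 1)))  # raise requisition
```

The check deliberately ignores operating hours, in line with the instruction that the wire is changed regardless of them.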


RCA No: 008

Type of Incident: Hazardous Occurrence


Incident Details: During cargo operations the crane jib fell on deck due to hoisting wire failure.

Immediate Cause: Fortunately there were no injuries and the crane sustained minor damage.
Lack of analysis of reporting incidents with objective to improve safety.

5 WHYs Process:
1 - WHY was the incident not formally investigated although internally reported?
2 - WHY were corrective/preventive actions not properly documented?
3 - WHY did management not disclose relevant information to concerned parties?
4 - WHY
5 - WHY

Analysis process: Fault Tree Diagram - See Attached


Contributory Causes: The investigation revealed that the hoisting wire was not in good condition
although greasing schedules had been maintained. Insufficient monitoring and follow-up on
corrective/preventive actions. Poor co-operation between responsible departments.

Root Cause(s): Incomplete instructions in relevant SMS procedures.

Recommendations: All crane wires shall be changed after five years of the installation date
regardless of operating hours. Appropriate records must be kept and timely requisitions made in
order to ensure that this procedure is adhered to.


Case No9

Poisoning of Master, all officers and crew due to cargo hold fumigation
(aluminium phosphide) entering accommodation.

Regarding this case

Aluminum Phosphide is a chemical that reacts with moisture to
release the fumigant, phosphine, or hydrogen phosphide. The aluminum
phosphide fumigant formulation contains approximately 55 percent
aluminum phosphide and 45 percent inert ingredients to regulate the
release of the fumigant and suppress flammability. Inert ingredients may
include ammonium carbonate, ammonium bicarbonate, urea, and
paraffin. It reacts with moisture in the air to produce phosphine (hydrogen
phosphide), which is highly toxic to all forms of animal and human life.
Phosphine is a colorless, odorless gas.

Symptoms of exposure to phosphine are:

1. Slight or mild poisoning which produces a feeling of fatigue,
ringing in the ears, nausea, pressure in the chest, and uneasiness.
All of these symptoms will normally disappear when the person is
removed to fresh air.
2. Moderate exposure that leads to general fatigue, nausea, gastro-
intestinal symptoms accompanied by vomiting, stomach ache,
diarrhea, disturbance of equilibrium, strong pains in the chest, and
difficulty in breathing.
3. Exposure to very high concentrations which rapidly produces
strong difficulty in breathing, bluish-purple skin color, difficulty in
walking or reaching, subnormal blood oxygen content,
unconsciousness, and death. Death can be immediate or may be
delayed until several days later.

If a member of the crew experiences any of the symptoms previously
described, he should be immediately removed to fresh air and a physician
should be contacted as soon as possible.


RCA No: 009

Type of Incident: Hazardous Occurrence


Incident Details: Poisoning of Master, all officers and crew due to cargo hold fumigation
(aluminium phosphide) entering accommodation.

Immediate Cause: Master reported all crew with medium degree of poisoning. Lack of analysis of
reporting incidents with objective to improve safety.

5 WHYs Process:
1 - WHY was the incident not formally investigated although internally reported?
2 - WHY were corrective/preventive actions not properly documented?
3 - WHY did management not disclose relevant information to concerned parties?
4 - WHY
5 - WHY

Analysis process: Fault Tree Diagram - See Attached


Contributory Causes: Since the holds were sealed and had been hose tested recently we can
only assume that a quantity of fumigant remained by the cargo holds, undetected by the crew,
causing the poisoning. Insufficient monitoring and follow-up on corrective/preventive actions. Poor
co-operation between responsible departments.

Root Cause(s): Incomplete instructions in relevant SMS procedures.

Recommendations: When carrying fumigated cargo, the fumigators are to be requested to provide
the vessel with appropriate detectors, and living quarters adjacent to the cargo holds are to be
regularly monitored (using these detectors) in order to identify any presence of poisonous
substance for a minimum of two days after fumigation. All crew must be properly informed of the
fumigant's hazardous properties and of the symptoms of poisoning.


Case No10

Only one Deck Officer designated to carry out tank and hold inspections
and there was no evidence that this person was in any event familiar with
the publication "Guidelines for surveys and assessment of hull structures"
and may be regarded as untrained as per requirements of the SMS.

Regarding cases No10 and No11

Cargo holds and ballast tanks should be inspected in greater detail,
according to the following procedure:

One (1) cargo hold is to be inspected during every ballast voyage with
a duration of 5 days or longer.

Two (2) ballast tanks are to be inspected during every loaded voyage with
a duration of 5 days or longer.

The condition of each inspected compartment must be reported.
Reports can be supported by photos covering all areas of the
compartment, with additional close-ups of suspect areas.

Furthermore, at least two Officers, one of whom shall be the Chief
Officer, should be assigned to carry out these inspections and prepare the
reports. These officers should familiarize themselves with the
“Guidelines for Surveys, Assessment and Repair of Hull Structures”
available on board. This training should be recorded and shall be repeated
at each employment, regardless of previous service on board the same or
similar vessels. The reports are prepared in duplicate. The original remains
on board while a copy is sent to the Technical department.
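
The inspection rule and the staffing requirement above are simple enough to be checked automatically, for example when a voyage plan is entered ashore. The following sketch is an illustration only; the function names and the rank strings are assumptions, while the thresholds are those stated in the procedure.

```python
MIN_VOYAGE_DAYS = 5  # inspections apply to voyages of 5 days or longer


def required_inspections(voyage_type, duration_days):
    """Return how many cargo holds and ballast tanks must be inspected on a voyage."""
    if voyage_type not in ("ballast", "loaded"):
        raise ValueError("voyage_type must be 'ballast' or 'loaded'")
    if duration_days < MIN_VOYAGE_DAYS:
        return {"cargo_holds": 0, "ballast_tanks": 0}
    if voyage_type == "ballast":
        # Holds are empty and accessible on a ballast voyage.
        return {"cargo_holds": 1, "ballast_tanks": 0}
    # Ballast tanks are empty and accessible on a loaded voyage.
    return {"cargo_holds": 0, "ballast_tanks": 2}


def inspection_team_valid(ranks):
    """At least two designated officers, one of whom is the Chief Officer."""
    return len(ranks) >= 2 and "Chief Officer" in ranks


print(required_inspections("ballast", 7))
print(inspection_team_valid(["Second Officer"]))
```

The second call returns False, which corresponds to the situation found in Case No10, where only one deck officer had been designated.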


RCA No: 010

Type of Incident: Non-conformity


Incident Details: Only one deck officer designated to carry out tank and hold inspections and
there was no evidence that this person was in any event familiar with the publication "Guidelines for
surveys and assessment of hull structures" and may be regarded as untrained as per requirements
of the SMS.

Immediate Cause: Failure of the Master to designate a second deck officer as tank and hold
inspector due to lack of familiarity with the company requirements.

Procedures Reviewed: SMS procedure 10.2.6.1.2 (inspection of cargo holds and ballast tanks)
and 10.2.6.1.3 responsibilities and reporting.

5 WHYs Process:
1 - WHY was a second deck officer not allocated?
2 - WHY had an officer who had been designated as a tank and hold inspector not been trained?
3 - WHY are the SMS procedures not clear enough on the subject of designation and familiarisation?
4 - WHY
5 - WHY

Analysis process: Fault Tree Diagram - See Attached


Contributory Causes: Failure of Masters to familiarise themselves with the relevant SMS
procedures.

Root Cause(s): Effective failure of SMS.

Recommendations: The procedure should clarify that the Master should designate two such
officers as tank and hold inspectors, one of which is the ch. officer. Company to improve auditing
methods so as to identify areas of non implementation of SMS.


Case No11

The new Chief Officer had only recently joined the vessel (one month
ago) and he had not yet been trained up as the designated hold and tank
inspector. There was no evidence that this person was in any event
familiar with the publication "Guidelines for surveys and assessment of
hull structures" and may be regarded as untrained as per requirements of
the SMS.


RCA No: 011

Type of Incident: Non-conformity


Incident Details: The new Chief Officer had only recently joined the vessel (one month ago) and
he had not yet been trained up as the designated hold and tank inspector. There was no evidence
that this person was in any event familiar with the publication "Guidelines for surveys and
assessment of hull structures" and may be regarded as untrained as per requirements of the SMS.

Immediate Cause: Failure of the Master to train the Chief Officer in a short time period as tank
and hold inspector due to lack of familiarity with the company requirements.

Procedures Reviewed: SMS procedure 10.2.6.1.2 (inspection of cargo holds and ballast tanks)
and 10.2.6.1.3 responsibilities and reporting.

5 WHYs Process:
1 - WHY was the Ch. officer not immediately allocated and familiarised at the time of signing on?
2 - WHY are the SMS procedures not clear enough on the subject of designation and familiarisation?
3 - WHY
4 - WHY
5 - WHY

Analysis process: Fault Tree Diagram - See Attached


Contributory Causes: Failure of Masters to familiarise themselves with the relevant SMS
procedures.

Root Cause(s): Effective failure of SMS.

Recommendations: The procedure should clarify that the Master should train the Chief Officer in
a short time period. Company to improve auditing methods so as to identify areas of non
implementation of Safety Management System (SMS).


Case No12

The existing multi gas detector had been supplied with the wrong
accessory for remote sampling.

Regarding this case

Entering and working in enclosed spaces like cargo holds, tanks and
void spaces can be dangerous. Oxygen may have been depleted, or CO2
or other toxic gases may have contaminated the atmosphere. Prior to
entering such spaces and during work, monitoring of gases is of the utmost
importance for workers' safety. For this purpose ships need appropriate
portable gas detection instruments suited to their vessel and cargo.

Without the correct accessories, like the aspirator hoses and probes
the equipment would be unfit for remote sampling.


RCA No: 012

Type of Incident: Non-conformity


Incident Details: The existing multi gas detector had been supplied with the wrong accessory for
remote sampling.

Immediate Cause: The equipment was unfit for sampling cargo holds in line with the BC Code
requirements.

5 WHYs Process:
1 - WHY was the accessory for remote sampling wrong?
2 - WHY did nobody check it?
3 - WHY did nobody inform the OPS?
4 - WHY
5 - WHY

Analysis process: Fault Tree Diagram - See Attached


Contributory Causes: The crew failed to identify the problem and inform the company.

Root Cause(s): Lack of proper monitoring of onboard equipment inventory.

Recommendations: Requisition to be sent to Operations dept. for replacement of incompatible
accessory. Accessory to be replaced prior to loading of cargo that requires atmosphere readings to
be taken.


Case No13

The fuel tank of the Emergency Generator was found half full.

Regarding this case

As described before in Case No2, all equipment and systems
on-board the vessels, identified as critical, should be regularly tested and
maintained in a good working condition.

In this case we can assume that after several tests of the Emergency
Generator, the fuel was consumed and the Chief Engineer, who is
responsible for the emergency equipment, did not keep the tank full.


RCA No: 013

Type of Incident: Non-conformity


Incident Details: The fuel tank of the Emergency Generator was found half full.

Immediate Cause: The Emergency Generator (E.G.) would run for only half the intended time in
case of a drill or an emergency.

5 WHYs Process:
1 - WHY did the Ch. Engineer not fill up the tank?
2 - WHY was he not aware of his responsibilities?
3 - WHY was he not familiarized with the SMS procedures?
4 - WHY
5 - WHY

Analysis process: Fault Tree Diagram - See Attached


Contributory Causes: The Ch. Engineer was not fully aware of his responsibilities.

Root Cause(s): Insufficient familiarization of Ch. Engineer with the SMS procedures.

Recommendations: Emergency Generator fuel tank to be kept full at all times. A sign to be
posted by the tank, written in the working language of the crew.


Case No14

It was noted that no printouts are kept for intermediate loading /
discharging stages and it could not be verified that such information was
readily available for all cargo operations through the ship's loading
calculations software.

Regarding this case

Bulk carriers must be handled with care in port as well as at sea.
Ships’ Officers responsible for cargo operations become key partners for
ship safety, because the lives of seafarers depend on careful cargo
handling.
The ship’s loading calculations software provides a means to
calculate the shear forces and bending moments in any load or ballast
condition and to assess these against the assigned maximum permissible
values. Therefore intermediate loading and discharging stages are also
very important, in order to minimize the risk of over-stressing the hull
structure, something that can lead to catastrophic failure.
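
The assessment against permissible values can be illustrated for a sequence of intermediate stages. This is a simplified sketch and not the logic of any actual loading program: it assumes that the calculated shear forces and bending moments are already available at a few stations along the hull, and all names, values and units are hypothetical.

```python
def exceeds_limits(calculated, permissible):
    """True if any calculated value is larger in magnitude than its permissible value."""
    if len(calculated) != len(permissible):
        raise ValueError("one permissible value is needed per station")
    return any(abs(c) > p for c, p in zip(calculated, permissible))


def overstressed_stages(stages, permissible_sf, permissible_bm):
    """Return the names of stages where shear force or bending moment is exceeded."""
    flagged = []
    for name, shear_forces, bending_moments in stages:
        if exceeds_limits(shear_forces, permissible_sf) or exceeds_limits(
            bending_moments, permissible_bm
        ):
            flagged.append(name)
    return flagged


# Hypothetical values at three stations (units are illustrative only).
permissible_sf = [40.0, 55.0, 40.0]
permissible_bm = [900.0, 1500.0, 900.0]
stages = [
    ("Stage 1", [12.0, -30.0, 18.0], [400.0, 1100.0, 350.0]),
    ("Stage 2", [25.0, -58.0, 20.0], [600.0, 1400.0, 500.0]),
    ("Stage 3", [10.0, -20.0, 15.0], [300.0, -1600.0, 200.0]),
]

print(overstressed_stages(stages, permissible_sf, permissible_bm))  # ['Stage 2', 'Stage 3']
```

In this example the initial stage is within limits while two later stages are not, which is exactly the kind of condition that cannot be demonstrated afterwards if no printouts of the intermediate stages are kept.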


RCA No: 014

Type of Incident: Non-conformity


Incident Details: It was noted that no printouts are kept for intermediate loading / discharging
stages and it could not be verified that such information was readily available for all cargo
operations through the ship's loading calculations software.

Immediate Cause: Incomplete records for intermediate loading / discharging stages.

5 WHYs Process:
1 - WHY was the C/O not fully aware of his responsibilities?
2 - WHY did he not know the correct procedure?
3 - WHY did the Master not monitor his records in order to identify any problems?
4 - WHY
5 - WHY

Analysis process: Fault Tree Diagram - See Attached


Contributory Causes: Lack of training of C/O. Lack of clear procedures.

Root Cause(s): Familiarisation with SMS procedures.

Recommendations: Procedures to be submitted in the company as evidence.


CHAPTER 8

SUMMARY and RECOMMENDATIONS


The basic reason for investigating and reporting the causes of
occurrences is to enable the identification of corrective actions adequate
to prevent recurrence and thereby protect the health and safety of the
public, the workers, and the environment.

To make it simple, ROOT CAUSE ANALYSIS is just a process
where you keep asking WHY until you get to the root of the matter. It is
peeling away the layers until you reach the heart of the problem/issue
(generally this is about 5 layers of WHYs). Most important is the
follow-up. It does little good to identify the problem and do nothing about it!
Another benefit of learning the ROOT CAUSE ANALYSIS process is
that it can be applied to problem solving, not just casualty analysis. [13]
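
The repeated WHY questioning can be represented as a simple chain in which each answer becomes the subject of the next question. The sketch below is one possible representation, not a prescribed tool; the function name is hypothetical and the statements are paraphrased from RCA No: 013.

```python
def build_why_chain(problem, causes):
    """Link a problem to successive causes, one WHY level per cause.

    Returns a list of (level, statement, cause) tuples and the last cause,
    which is the candidate root cause reached so far.
    """
    chain = []
    statement = problem
    for level, cause in enumerate(causes, start=1):
        chain.append((level, statement, cause))
        statement = cause  # the next WHY is asked about this cause
    candidate_root = causes[-1] if causes else None
    return chain, candidate_root


# Statements paraphrased from RCA No: 013.
chain, root = build_why_chain(
    "Emergency Generator fuel tank found half full",
    [
        "Ch. Engineer did not fill up the tank",
        "Ch. Engineer was not aware of his responsibilities",
        "Ch. Engineer was not familiarized with the SMS procedures",
    ],
)

for level, statement, cause in chain:
    print(f"{level} - WHY: {statement} -> {cause}")
print("Candidate root cause:", root)
```

The last answer is only a candidate: as in the forms above, the chain may stop before five levels, and the follow-up actions still have to be defined and monitored.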

From the analysis of the fourteen cases, the following points are worth
mentioning as recommendations:

1. The office should provide adequate training and guidance to Masters
and Chief Engineers.
2. It is better to provide detailed instructions in the native language of
the crew.
3. Masters of the other ships of the fleet should be informed about
incidents in order to prevent recurrence.
4. It is important to have clear procedures in the Company’s Safety
Management System Manual.
5. Masters should understand the importance of reporting defects and
incidents.


CHAPTER 9

REFERENCES
1. Root Cause Analysis Guidance Document, February 1992,
DOE-NE-STD-1004-92, pp 4-16

2. Douglas Hubbard "How to Measure Anything: Finding the Value of
Intangibles in Business" pg. 46, John Wiley & Sons, 2007

3. Dettmer, H. W., (1997) Goldratt’s Theory of Constraints: a systems
approach to continuous improvement. ASQC Quality Press, pp 62-119.

4. Dettmer, H. W., (1998) Breaking the constraints to world class
performance. ASQ Quality Press, pp 69-102.

5. Scheinkopf, L., (1999) Thinking for a change: putting the TOC
thinking processes to use. St Lucie Press/APICS series on constraint
management, pp 143-169

6. Langford, J. W., Logistics: Principles and Applications, McGraw Hill,
1995, pp-488.

7. Sperber, William H. and Richard F. Stier. "Happy 50th Birthday to
HACCP: Retrospective and Prospective". FoodSafety magazine.
December 2009-January 2010. pp. 42, 44-46

8. Acharya, Sarbes; et al. (1990). Severe Accident Risks: An
Assessment for Five U.S. Nuclear Power Plants. Washington, DC:
U.S. Nuclear Regulatory Commission. NUREG–1150. Retrieved
2010-01-17.

9. Taiichi Ohno; foreword by Norman Bodek (1988). Toyota production
system: beyond large-scale production. Portland, Or: Productivity
Press. ISBN 0915299143.

10. Hankins, Judy (2001). Infusion Therapy in Clinical Practice. pp. 42.

11. Thinking About Problems: Kepner-Tregoe.
http://www.itsmsolutions.com/newsletters/DITYvol2iss24.htm

12. Shipping Company’s “Safety Management System Manual”

13. Root Cause Analysis and Casualty Investigation by Alan Dujenski
www.ardujenski.com
