Chapter 1
By far the greater number of aeroplane accidents are due to precisely the same
circumstances that have caused previous accidents. A distressing feature of
these accidents is the evidence they afford of the unwillingness, or the inabil-
ity, of many pilots to profit from the experiences and mistakes of others.
- Gustav Hamel and Charles C. Turner
In the past, aviation safety improvement was characterized by a fly-crash-fix-fly
approach. We would fly airplanes, have the occasional unfortunate crash, and inves-
tigate the cause(s) to prevent it from happening again. Sometimes the causes would
be weather-related or a mechanical failure, but more often the cause was determined
to be human error – usually the pilot. Essentially, the prevailing philosophy was that
once the cause was determined to be the pilot, we simply needed to encourage other
pilots not to make the same mistakes.
Today, we realize that it is much more productive to engineer a system in which,
to the extent possible, causes of failure have been designed out. As one might imag-
ine, there are many elements to this engineering effort, and many of these will be
discussed in this book. The modern, well-informed aviation safety practitioner must
have a working understanding of hazard identification, risk management, systems
theory, human factors engineering, organizational culture, quality engineering and
management, quantitative methods, and decision theory.
Safety Management Systems (SMSs), of course, are not just for aviation – they
are found in a wide variety of diverse industries, such as chemical, oil, construction,
occupational health, food, highway, electrical, and fire protection, among others.
SMS is not a new concept in these industries – references to SMS in the literature of
some of these industries can be found as far back as the early 1980s. Many of these
industries had historically poor safety records and have benefited from the philoso-
phy and structure SMS provides.
SMS is not just in the United States. Many people mistakenly think that the
United States always leads when it comes to aviation safety. While the United States
does have an enviable safety record, other countries are considerably further along in
their efforts to develop and implement aviation SMS. Transport Canada committed
to the implementation of SMS in aviation organizations in 2005. Europe and New
Zealand, to name two others, have moved forward with SMS more rapidly than the
United States.
DOI: 10.1201/9781003286127-2
14 Safety Management Systems in Aviation
into occupational health and SMSs, environmental management system, and others.
Current SMS concepts stress a process approach, similar to those in ISO 9000. The
ISO (International Organization for Standardization) standard also evolved, start-
ing in the 1980s as a quality control standard, moving into quality assurance in
the 1990s, and finally into the process approach in the 2000 and 2008 versions of
the 9000 standard. There were other initiatives such as the New Zealand Internal
Quality Assurance program requirements that emanated from a 1988 Review of
Civil Aviation Safety Regulations and the Resources, Structure and Functions of
the New Zealand Ministry of Transport, known as the Swedavia-McGregor report.
This review questioned the continued efficacy of the approach that stressed intensive
inspection of end items, products, and activities and moved toward an emphasis on
the systems that produced them.
The European Joint Aviation Authority (JAA – now replaced by EASA) also
had a requirement for a QMS that could be combined with another requirement for
an accident prevention program – a separate, ICAO (International Civil Aviation
Organization) required program. This moved toward a combined, risk-based assur-
ance system with a lot of the same features of the systems that have become known
as SMS.
In the United States, the Federal Aviation Administration (FAA) began looking
at system safety in oversight systems in the late 1990s, resulting in the development
of the Air Transportation Oversight System (ATOS). No longer in use, ATOS took
a more systemic view of the operator’s processes but remained an FAA oversight system.
For system safety to truly work, it must be practiced by the system/process owner –
the operator. This led the FAA to investigate the application of QMS principles and
subsequently to look at what other countries were doing in SMS. This began about
2001–2002 and coalesced into an SMS initiative from 2003 to 2004. The FAA published
its first air operator SMS guidance in June 2006, in AC 120-92 Introduction
to Safety Management Systems for Air Operators. In parallel, the FAA’s air traffic
organization (ATO), along with air traffic management organizations in a number
of other countries, began developing SMS at about the same time. On January 8,
2015, the FAA published in the Federal Register (Vol. 80, No. 5) its final SMS rule
for air carriers (Safety Management Systems for Domestic, Flag, and Supplemental
Operations Certificate Holders, 2015).
When did SMS begin and who started it? As you can discern from the above
explanation, SMS did not begin at a specific time with any single event. Rather, it
has been the result of an evolutionary process with a lot of combining of ideas from
other management and scientific domains and a lot of sharing of information within
the air safety community.
(ICAO, 2018, p. viii). The following paragraphs will decompose the term SMSs to bet-
ter understand it and present a definition that attempts to capture the essence of SMS.
At the simplest level, we collect data about events and try to discern trends. We
will get into greater depth later concerning the appropriate use of statistical tools in
SMS, but for now let’s take a look at some statistical charts to help understand the
state of safety in our system today. First, Figure 1.1 depicts the number of fatal accidents
for U.S. air carriers from 1987 through 2020 (14 CFR 121, scheduled service)
(NTSB, 2022a, Table 6). The dashed line is a best-fit logarithmic trendline (selected
due to the variability and the leveling of the data), which comfortingly indicates that
the trend during this period is decreasing. Of course, these are raw numbers and,
consequently, they do not really tell us if we’re becoming more or less safe. To get
that picture, we must normalize the data, as described below.
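As a minimal sketch of how a logarithmic trendline of this kind might be fitted, the Python below performs an ordinary least-squares fit of y = a + b ln(x), where x is the position of each year in the series. The function name fit_log_trend and the yearly counts are our own illustrative assumptions; the counts are made up and are not NTSB data.

```python
import math


def fit_log_trend(values):
    """Least-squares fit of y = a + b*ln(x), with x = 1, 2, ... as the period index."""
    if len(values) < 2:
        raise ValueError("at least two observations are needed to fit a trend")
    xs = [math.log(i) for i in range(1, len(values) + 1)]
    n = len(values)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up yearly counts for illustration only (not NTSB data).
hypothetical_counts = [5, 4, 6, 3, 3, 2, 3, 1, 2, 1]
intercept, slope = fit_log_trend(hypothetical_counts)
# A negative slope indicates a decreasing trend over the period.
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

A negative slope corresponds to the downward trendline described above; like the raw counts themselves, it says nothing about exposure.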
Figure 1.2 shows the accident rate per 100,000 air carrier departures (14 CFR 121
scheduled service) from 1987 to 2020 (NTSB, 2022a, Table 6); thus, it normalizes the
data by taking into account the increase in air travel. The figure shows two rates – one
for all accidents and the other for accidents with fatalities. Both logarithmic trendlines
show a slight decrease over time. Most people might examine this chart and conclude
that these data support the use of the word “safe,” considering the ICAO definition.
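The normalization itself is simple arithmetic. The sketch below, under our own assumptions, divides an event count by an exposure figure and scales the result to 100,000 departures; the function name rate_per_100k and the two sample years are hypothetical and are not drawn from the NTSB tables.

```python
def rate_per_100k(event_count, departures):
    """Return events per 100,000 departures."""
    if departures <= 0:
        raise ValueError("departures must be positive")
    return event_count * 100_000 / departures


# Made-up figures for illustration only: the same raw count in two years
# with different amounts of flying.
hypothetical_years = {
    "year A": (30, 8_000_000),
    "year B": (30, 10_000_000),
}

for label, (accidents, departures) in hypothetical_years.items():
    print(label, round(rate_per_100k(accidents, departures), 3))
```

Both hypothetical years have the same raw count, yet the year with more departures has the lower rate, which is exactly what the raw numbers cannot show.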
Let’s also take a look at general aviation safety statistics (see Figure 1.3) (NTSB,
2022c, Table 10). As opposed to the airline safety data, which are based on the number
of departures, the general aviation statistics use the number of flight hours. Again,
the (dashed) logarithmic trendlines (all accidents rate is the top line and fatal accidents
rate is the bottom line) are on a downward slope, which is good, but that still
doesn’t tell us whether we’re safe.
FIGURE 1.1 U.S. airline accidents with fatalities from 1987 to 2020, 14 CFR 121 scheduled
service.
Introduction to SMS 17
FIGURE 1.2 U.S. airline accident rate per 100,000 departures from 1987 to 2020, “all” and
“with fatalities,” 14 CFR 121 scheduled service.
Finally, it’s worth noting the relative safety among airlines (14 CFR 121, scheduled),
air taxis (14 CFR 135, scheduled) (National Transportation Safety Board,
2022b, Table 8), and general aviation operations (see Figure 1.4). Clearly, general
aviation has a significantly higher accident rate (although it had been steadily
decreasing until plateauing in the 1990s) than either 14 CFR 135 commuter airlines
or 14 CFR 121 air carriers, and air carriers have the lowest of the three.
Probably everyone who views this chart will have a different perception about the
level of safety that exists in each of these operations, the trends associated with each,
and the reasons for the differences.
FIGURE 1.3 U.S. general aviation accident rates per 100,000 flight hours, “all” and “with
fatalities,” from 1987 to 2020.
FIGURE 1.4 Comparison of accident rates per 100,000 flight hours among airlines (14 CFR
121, scheduled), air taxis (14 CFR 135, scheduled), and general aviation in the United States
from 1987 to 2020.
This discussion underscores the point that, as safety professionals or others inter-
ested in safety as a discipline, we must understand the concept of safety as compli-
cated and only having real meaning when considered in light of processes designed
to control the outcome.
Management
A generally accepted definition for management is that management is the process of
getting activities completed efficiently and effectively with and through other people.
The functions normally associated with management are planning, organizing, staff-
ing, directing, controlling, and (sometimes) budgeting. Management is leading and
directing an organization or an activity through the deployment and manipulation
of resources (something managers do), whether the resources are human, financial,
intellectual, material, or other.
Systems
The dictionary defines “systems” as “a regularly interacting or interdependent group
of items forming a unified whole” (“systems” Merriam-Webster Online Dictionary,
2022). A system is more than the sum of its parts. A useful way to think about the
concept of systems is that it is an amalgam of people, procedures and processes,
and equipment that is integrated to perform a specific function or activity within a
particular environment.
Definition of SMS
The following is offered as a comprehensive definition of SMS: A dynamic risk man-
agement system based on Quality Management System (QMS) principles in a structure
scaled appropriately to the operational risk, applied in a safety culture environment.
This definition and its components will be examined in some detail throughout
this book. This section includes only a cursory overview.
R=S×L
A few decades ago, even the best safety analyses were forensic in nature. Note that
this definition of risk is forensic as well. The two measures on which this tradi-
tional calculation of risk is based both depend upon an analysis of undesired events.
Moreover, the data from which these calculations are drawn are historical. For
example, suppose that a hard landing occurs. A forensic approach to risk analysis
would have the safety department look into the various safety databases maintained
by the airline and review the “hard landing” reports on file. After review of those
reports, subject matter experts would assign a measure of severity to the reports
and then aggregate those assignments into an index that describes the severity of
the hard landing event. Then an attempt would be made to calculate a rate statistic
(the number of hard landings divided by the exposure statistic, in this case the total
number of landings in the system), thus deriving the likelihood of occurrence index.
Using these two indices, a final “risk index” would be obtained by referencing a risk
matrix. ICAO’s risk matrix is shown in Figure 1.5 (ICAO, 2018, p. 2–16).
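The forensic calculation just described can be sketched in a few lines of Python. Everything numeric here is an assumption made for illustration: the 1-to-5 scales, the likelihood bands, the expert ratings, and the event counts are hypothetical, and the simple product R = S × L stands in for a lookup in a risk matrix; it is not a reproduction of ICAO's matrix.

```python
from statistics import mean

# Hypothetical likelihood bands: (minimum rate per landing, likelihood score).
# These thresholds are illustrative assumptions, not ICAO values.
LIKELIHOOD_BANDS = [(1e-3, 5), (1e-4, 4), (1e-5, 3), (1e-6, 2), (0.0, 1)]


def likelihood_score(event_count, exposure):
    """Convert a rate (events divided by exposure) into a 1-5 likelihood score."""
    if exposure <= 0 or event_count < 0:
        raise ValueError("exposure must be positive and event_count non-negative")
    rate = event_count / exposure
    for minimum_rate, score in LIKELIHOOD_BANDS:
        if rate >= minimum_rate:
            return score
    return 1


def severity_score(expert_ratings):
    """Aggregate per-report severity ratings (1-5) into a single index."""
    return round(mean(expert_ratings))


def risk_index(expert_ratings, event_count, exposure):
    """R = S x L on the hypothetical scales above."""
    return severity_score(expert_ratings) * likelihood_score(event_count, exposure)


# Made-up example: 12 hard landings in 150,000 landings, with five rated reports.
ratings = [2, 3, 3, 2, 3]
print(risk_index(ratings, event_count=12, exposure=150_000))
```

Note that every input is historical: the ratings come from reports already filed, and the rate comes from events that have already occurred.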
Most operators have a management guidance document that describes appropri-
ate mitigating action and allowable timelines for corrective and preventive actions
based upon this risk index.
Accomplishing this process on the various types of undesired events experienced
in operations would also give the management team the ability to prioritize actions
based upon the relative value of the risk indices assigned to each event type.
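A minimal sketch of that prioritization, assuming each event type has already been assigned a risk index, is simply a sort. The event types and index values below are hypothetical.

```python
def prioritize(risk_indices):
    """Return (event type, risk index) pairs, highest risk index first."""
    return sorted(risk_indices.items(), key=lambda item: item[1], reverse=True)


# Made-up risk indices for illustration only.
hypothetical_risk_indices = {
    "hard landing": 9,
    "unstable approach": 12,
    "ramp vehicle incursion": 6,
}

for event_type, index in prioritize(hypothetical_risk_indices):
    print(f"{index:>3}  {event_type}")
```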
Risk Management
Risk management is the process of measuring risk and developing strategies to
manage it. These strategies usually include reducing the negative effect of the risk.
In the forensic risk analysis equation above, changing the severity (S) or likelihood
(L) would accomplish that task. A quality engineered approach will include a rigorous
analysis of the system of interest – identifying hazards, understanding the interactions
between these hazards, engineering detection systems, incorporating
parallel and/or redundant systems when appropriate, and determining clear go/no-go
decision points. Finally, as SMS is incorporated into an integrated management
system, strategic risk planning will include transferring the risk (e.g., to insurance
carriers), avoiding the risk, and/or accepting the consequences – in whole or in
part – of the risk (see Figure 1.6).
It is important to note the following regarding risk management strategies:
1. They are not mutually exclusive. Choosing one of them doesn’t mean you
can’t also choose others. In fact, often these strategies will be mixed and
combined to varying degrees.
2. The individual strategies are not an all-or-nothing proposition. Strategies
can be and often are partially deployed.
3. The strategies are somewhat of a balancing act.
4. The decisions regarding risk management must be made by the organiza-
tion’s management.
performance. SMSs are based on QMS principles. These principles can be found
throughout effective SMSs. One of the primary organizations that promotes quality
is the ISO, which is a non-governmental, international standards-setting body
composed of 157 member countries. A brief overview of the principles of QMS as
articulated by ISO includes the following (ISO, 2022):
These seven principles should be kept at the forefront of any effort to develop and
manage an SMS.
The astute reader would notice that one major distinction between the FAA and
ICAO definitions, and the one that we presented earlier in this chapter, is that our defini-
tion includes the notion that SMS must be scaled appropriately to the operational risk.
designed to mitigate this risk must therefore be appropriately scaled to include all
stakeholders as active participants, with common goals and shared accountability.
The challenge of designing a management system that can be effective in such col-
laborations will be discussed in later chapters.
Finally, some safety issues require attention all the way from the bottom to the top
of the food chain. These issues, national or even global in scope, are being addressed
in many creative and promising programs, such as the Commercial Aviation Safety
Team and its component Joint Implementation Measurement and Data Analysis
Team, the General Aviation Joint Steering Committee, and others. These programs
are evolving into central components of SMSs.
Finally, from our definition, an SMS is applied in a safety culture environment.
While all members of the organization must know their responsibilities and be both
empowered and involved with respect to safety, the ultimate responsibility for the
safety of the system cannot be delegated down from top management. SMS identifies
key behaviors that demonstrate this involvement, such as inclusion of safety goals
into strategic plans and regular management review of the SMS. Executive manage-
ment involvement is a key requirement in the SMS policy documentation.
One of the authors was invited to participate in the FAA’s Design and
Manufacturing SMS Pilot Project from 2010 to 2012 and was tasked to develop a
process that could be used by organizations for defining their system description. A
brief synopsis of that work is included in Chapter 12.
Hazard Identification
Once processes are well understood, hazards in the system and its operating envi-
ronment can be identified, documented, and controlled. The first step in the pro-
cess, hazard identification, is based on a thorough understanding of the system,
emphasizing the importance of the previous steps concerning system description.
Once the system is well understood, one can review the written system description
or the process workflow diagram and at each component of the workflow, ask the
question “what if ….” What if this component failed? What if that threat appeared?
What if the other error was made? As with system and task descriptions, judgment
is required to determine the adequate level of detail. While identification of every
conceivable hazard would be impractical, organizations are expected to exercise
due diligence in identifying significant and reasonably foreseeable hazards related
to their operations.
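One way to make the “what if” review systematic is to generate the same set of questions for every component of the workflow. The sketch below is only an illustration of that idea; the question templates paraphrase the three questions above, and the fueling workflow is a hypothetical example, not one taken from any operator.

```python
# The three question types mirror the "what if" questions in the text.
WHAT_IF_TEMPLATES = (
    "What if {component} failed?",
    "What if a threat appeared at {component}?",
    "What if an error was made at {component}?",
)


def what_if_prompts(workflow_components):
    """Return one prompt per (component, question type) pair, in workflow order."""
    return [
        template.format(component=component)
        for component in workflow_components
        for template in WHAT_IF_TEMPLATES
    ]


# Hypothetical workflow steps for illustration only.
hypothetical_workflow = [
    "fuel order",
    "fuel quality check",
    "fueling",
    "fuel slip reconciliation",
]

for prompt in what_if_prompts(hypothetical_workflow):
    print(prompt)
```

The list is a prompt for expert judgment, not a substitute for it; deciding which answers describe significant and reasonably foreseeable hazards remains the analyst's task.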
Controlling Risk
Once the preceding steps have been completed, measures to reduce or control the
risk must be designed and implemented. These may be additional or revised proce-
dures, new controls, changes to training, additional or modified equipment, changes
to staffing arrangements, or any of a number of other system changes. SMS requires
that clear lines of responsibility and authority be drawn that assign the task of con-
trolling risk.
System Operation
The risk management component of SMS should be designed to not only continu-
ously monitor, assess, analyze, and control risk, but also provide the next component,
Safety Assurance – an efficient means of auditing, analyzing, and reviewing the
results of its efforts. Risk management works in concert with Safety Assurance to
ensure effective functioning in a changing operational environment.
Data Acquisition
Former FAA Associate Administrator for Aviation Safety, Nick Sabatini, stated that
data are the lifeblood of an SMS. Safety Assurance uses information from a variety
of sources: audits, investigations of safety-related events, monitoring of key pro-
cess indicators in routine operations, and information submitted by employees into
employee reporting systems. A key concept in SMS is that these various oversight
systems should feed into a system of management review. As you will read through-
out this book, SMS is about fact-based decision-making, and getting those facts is
a vital component of Safety Assurance. Continuous monitoring of processes, particularly
by line managers, is a key source of information for Safety Assurance. Line
managers are the process owners and are in the best position to assess the perfor-
mance of those processes through continuous monitoring.
SMS assigns immediate responsibility for the safety of every process within an
organization to the owner of that process. Process owners are the domain technical
experts in any organization and thus the most knowledgeable about the technical pro-
cesses involved. Managers within operational departments are assigned the respon-
sibility for monitoring their own processes through an internal auditing program.
SMS also defines an additional audit function – internal evaluation – at the orga-
nizational level. This level provides a quality assurance function to assure that the
more in-depth and technical reviews accomplished by the departmental internal
audits are accomplishing organizational goals by assessing and mitigating risk. In
U.S. airline operations, this is the Internal Evaluation Program or IEP. These audits
provide executive management with the information for decision-making required
for the evaluation of the overall SMS.
External audits provide yet another level of Safety Assurance. These audits may
be required by regulation or may be third-party audits initiated by the organization
to provide an objective evaluation of its processes. Once again, SMS does not sup-
plant the need for these external oversight systems, but rather considers these audits
as another information source for management review.
Investigations should be focused on discovering why a safety event happened as
opposed to assigning blame for the event. Information gathered during investigations
should be fed back into the Safety Assurance system.
Employee Reporting Systems are truly an essential part of any effective SMS. A
robust, confidential reporting system is one in which all employees feel empowered
to report any and all safety concerns without fear of reprisal. Data gathered from
employee reporting systems should be monitored to identify hazards and also to
inform Safety Assurance processes.
Management of Change
Simply put, change management is a structured approach to moving employees
and organizations from a current state to a desired state. Effective change manage-
ment helps ensure this can be done without disaffecting workers or causing other
undesirable or unintended outcomes and, importantly, ensure that the desired state
becomes institutionalized – that is, the change sticks. Management of change pro-
cesses should ensure safety performance throughout the implementation of the
change.
Continuous Improvement
Like quality, a key feature of SMS is continuous improvement. Continuous improve-
ment is a cyclical, data-driven process to ensure that risk controls and SMS effective-
ness are improved through intentional actions of the organization.
Safety Cultures
One of the most challenging elements of SMS is the creation and nurturing of a
safety culture, in which every person, from CEO to a new hire, understands his or
her role in maintaining a safe operation and actively participates in controlling and
minimizing risk.
Creating a safety culture begins at the top of the organization, with the incor-
poration of policies and procedures that cultivate a reporting culture (where struc-
tures are in place that allow safety-related information to flow from all levels of
the organization into a system empowered to correct problems) and a just culture
(in which individuals are both held accountable for their actions and treated fairly
by the organization). Maintaining a safety culture requires constant attention by
every layer of management and every department within the organization. A central
tenet of SMS is this realization – that the safety department does not own
safety; rather, safety is owned by every employee. Safety culture is discussed in
detail in Chapter 3.
EMERGENCY RESPONSE
In both ICAO and FAA documentation, emergency response is included as an inte-
gral part of SMS. For readers already familiar with existing emergency response
requirements in ICAO and FAA regulations, and with existing emergency response
programs at large air carriers and airports, the inclusion of this topic in SMS plan-
ning and implementation can immediately arouse suspicion. Why, one might ask, do
we once again need to revisit something that is already very highly regulated, which
already requires significant resource assignment within the organization, and which
already works pretty well? A common concern is that any additional requirements
imposed by an SMS will only be burdensome and increase complexity, with
little return on investment as to the quality of emergency response.
To that criticism we would point the concerned reader to a component of our
own definition of SMS, as “scaled appropriately to the operational risk.” The natural
evolution of safety management in our industry has driven the reactive response to
disasters such that emergency response is already well-developed in areas of poten-
tially high operational risk, such as at Class I airports or major air carriers. Anyone
involved in safety planning at a hub airport or large airline knows that existing ERPs
(emergency response plans, as ICAO designates them) are very well developed and
extensive, and regularly tested as required by regulation.
For those operators, it is very likely that existing ERPs will fulfill all SMS
requirements, so fluster or panic is not necessary. For those existing well-developed
programs, the extent of the burden in incorporating their ERPs into an SMS frame-
work will probably be very low. But we ask the patience of the reader with this
particular orientation because, as mentioned earlier, SMS is intentionally a scalable
system, whose principles apply to both the large and the small service provider.
Therefore, the general outlines for emergency response in SMS are worthy of con-
sideration, with the knowledge that some levels of operation already have robust
systems, while others will benefit from a review.
Appendix 3 to Chapter 5 of the ICAO Safety Management Manual (third edition)
is devoted to emergency response planning and is, in our judgment, an excellent ref-
erence for service providers to use to review the fundamentals of their own programs.
There are of course specific regulations governing levels of emergency planning and
response, dependent upon the location, scale, and type of operations involved, but
the purpose of our review here is to highlight the essences. Exhaustively covering the
topic of emergency response is far beyond the scope of this book, and for a detailed
review of the essentials, the reader is referred to the ICAO document. But a quick
review is in order so that the SMS practitioner can understand how the ERP fits into
the larger philosophies and techniques of SMS.
The ICAO Safety Management Manual states that:
The overall objective of the ERP is the safe continuation of operations and the return
to normal operations as soon as possible. This should ensure an orderly and efficient
transition from normal to emergency operations, including assignment of emergency
responsibilities and delegation of authority. It includes the period of time required to
re-establish “normal” operations following an emergency. The ERP identifies actions
to be taken by responsible personnel during an emergency. Most emergencies will
require coordinated action between different organizations, possibly with other service
providers and with other external organizations such as non-aviation-related emer-
gency services. The ERP should be easily accessible to the appropriate key personnel
as well as to the coordinating external organizations.
(ICAO, 2018, 9-8)
The purpose of requiring that an ERP be a part of an SMS is to ensure that a service
provider has thought through each one of the enumerated items above and has estab-
lished a plan of operations prior to the need to use the plan. This purpose is entirely
driven by the same underlying motivation that energizes SMS in the first place – the
control of risk. In this case, the risk being controlled is not specifically aimed at the
circumstances that led to the emergency (though SMS would drive the need to consider
corrective action to prevent the emergency in the future, of course). Rather, the risk that
is mitigated by having the ERP is that associated with handling the emergency itself.
An emergency is an event that is by its very nature high risk, certainly for those
victims at the immediate scene, but also for first responders, for those assisting those
responders, and especially for those other customers who continue to receive ser-
vices from the organization while the emergency is in progress, even if those cus-
tomers are a thousand miles away from the scene. An ERP exists to control the
organizational response to the emergency in such a way as to minimize the risk for
all facets of the operation. An ERP is a control mechanism.
An earlier version of the ICAO document mentioned several constituent elements
of a well-designed ERP.
Governing Policies
An ERP should have explicit references to the regulations governing emergency
response in the organization’s operational environment and should contain the com-
pany policies and procedures that determine how the organization will respond to
the emergency.
Organization
Emergency response is a process and ideally should be created using the same disci-
pline as applies to the creation of any process under SMS. The ERP should describe
who has responsibility and authority in various aspects of the response, how that
response is conducted, what resources will be available, and so on.
Notifications
The ERP should contain a very clear notification process so that assistance is avail-
able when needed. Not to be neglected, of course, is the terrible task of notifying
relatives of those involved in the event. Other steps in the ERP will also address the
responsibilities the operator has to the families involved.
Initial Response
The initial response to an emergency is potentially a very high-risk environment.
This section should be especially well considered, keeping first responders in mind.
Additional Assistance
The ERP should be designed such that backup is immediately available when
needed. All available resources should be considered. This step feeds back into the
notifications step.
Records
There are both regulatory and practical requirements for good record-keeping dur-
ing an emergency. The ERP planning team should assure that all record-keeping
requirements are identified and that someone is assigned the responsibility for main-
taining these records.
Accident Site
The accident site itself is an extremely high-risk environment, and the operator must
assure that no further harm is done in responding to the event. That means access
control must be a part of the plan, and protective equipment must be available for
first responders. There are regulatory responsibilities the operator has concerning
the protection of the site, and those responsibilities must be assigned.
News Media
It is inevitable that an operator involved in a serious emergency will have contact
with the media. Having a plan to control that contact might not immediately seem
like risk management, but it is. The media interfaces with other groups the operator
clearly has responsibilities to, such as the families of victims and employees. Not
the least of the reasons to have a media plan in an ERP is to assure that those actu-
ally managing the crisis are isolated from a barrage of questions and requests by the
media so that they can do their jobs.
Formal Investigations
The operator needs to plan on how to support the formal investigations that are an
inevitable part of post-incident operations. The time required to support such inves-
tigations can be quite significant, and good planning beforehand can help assure that
company interests are represented without removing critical personnel from routine
operations.
Family Assistance
The operator clearly has a responsibility to the families of victims – not only a moral
one, but also a legal one. Those responsibilities include setting up family assistance
services, travel accommodations for family members, financial assistance in some
circumstances, and especially satisfying the need for accurate and up-to-date infor-
mation concerning the event.
Post-Occurrence Review
The essence of SMS is continuous improvement. As such, an ERP should include
plans to debrief everyone involved in the event and should require a post-incident
review of activity.
Readers familiar with ERPs existing at major air carriers or airports will recog-
nize these elements as already existing in the emergency planning documentation
required by regulation. Smaller operators would be well served to review their own
procedures in light of these ICAO suggestions. An ERP that covers these issues will
satisfy all the requirements in SMS – as long as one other step is included.
Management review is one of the most important steps in any quality process,
and since SMS is exactly that – a quality process – it is essential that an ERP con-
tain within it the requirement for regular management review of the plan. And with
something as serious as the ERP, that review cannot be only a document review. It
is necessary to exercise the plan on a regular basis. Large operators and airports
already have such a requirement mandated by regulation. For smaller operators
embracing SMS, it is very important that the ERP is actually taken off the shelf at
regular intervals and run in a simulated environment.
For those in the United States, the Department of Homeland Security has created
outstanding resources for operators to use to create a schedule for emergency
response exercises. There is no need to reinvent the wheel – a smart management
team will take advantage of this excellent work and appoint someone in the
organization to become fully certified in HSEEP (the Homeland Security
Exercise and Evaluation Program). This training is designed to support emergency
management, and a well-designed toolkit exists to assist in the creation of all levels
of emergency response practice – from tabletop drills, to functional practice, and to
full-scale exercises.
SMS IN PRACTICE
Jill Wilson, Head of Safety at Joby Aviation, writes about the Importance of
Practiced Emergency Response:
I began my career as a safety professional and aircraft accident investigator,
working for established aircraft manufacturers and operations. During this
time, I was made acutely aware of the value of having a well-documented,
rehearsed, and reviewed emergency response plan (ERP), which is no surprise
as this is standard practice for experienced aviation organizations. Additionally,
the development and use of an ERP is a key element of a successful SMS.
My traditional experience paid off when I joined a startup leveraging
cutting-edge technology to revolutionize the aviation industry. This organiza-
tion had a wide variety of operations including Part 91, Flight Test, and even
a burgeoning Part 135. However, being a new entrant to the aviation industry,
their flight operations emergency plan was rudimentary and needed improve-
ment. Knowing the importance of a strong response plan, I made it one of my
first objectives to build a more robust ERP for each of our operation types.
Because we were in the process of becoming a Part 135 operator, we began
with writing a response plan for this type of operation.
Another lesson learned from my past is that having a plan is just the first
step. To be effective, an ERP must be practiced, critiqued, and improved regu-
larly. Those responsible for executing the plan need experience in performing
it BEFORE it’s truly needed. Practice began with the Part 135 operations team
and enabled us to identify and strengthen weak points. From there, we adapted
the ERP for use by the Flight Test Team, responsible for test-flying our proto-
type vehicles.
Approximately one week into working with the Flight Test Team on their
ERP, we had a significant emergency. Although it was just a draft, the Flight
Test ERP was used to guide our activation and response. Even though we had
only practiced the plan once, the ERP helped the company respond to a crisis
in a fashion much more organized than if it hadn’t existed. Immediate actions
which were taken as a result of the (draft) ERP put the company in a much
better position to support the official investigation and ultimately allowed for
smoother business continuity and an organized response.
Takeaways:
Have a plan
Practice the plan
Improve the plan
Policy
This is the same word as is used in the first component, but with a distinction. A care-
ful reading of the four components of guidance reveals that other important concepts
are included under the heading of policy. In addition to references to the importance
of clear policy guidelines in SMS, there is also a discussion of the process definition.
Policies and processes are two different concepts. And while there is an acknowledg-
ment of the necessity of record-keeping in the advisory circular, we will elevate this
topic to equal that of policy.
Policies are the shalls and shall-nots of the organization and tend to be more fixed
than process descriptions. Policies reflect the strategic vision and commitment to the
values of the organization. Policies also provide guidance for the creation of new
processes, and standards against which processes or process measures can be evalu-
ated. For example, an organization might (should!) have a policy stating, “All process
descriptions will identify one or more Key Process Indicators (KPIs) through which
the performance of the process can be evaluated.” Or perhaps a policy might specify
that “All processes will clearly define the roles of responsibility for and authority
over that process.”
Policy documentation is extremely important to SMS. Just as a quality manage-
ment system must have a quality policy manual, an SMS must have the equivalent of
a Safety Policy Manual. You can recognize an SMS by its documentation.
Process Descriptions
An organization with an SMS will understand what you mean when you ask to see
examples of its process descriptions. Process descriptions can be as simple as a set
of instructions for an employee to use to do his/her job, or as complex as a multi-
departmental process workflow diagram. The best process descriptions will follow
a standardized format so that no matter which one of the organization’s many pro-
cesses you examine, you can readily tell who is responsible, or how they measure the
success of the process, or which records must be kept. We assert that quality-based
process descriptions are the distinguishing feature of a mature SMS.
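As an illustration only, a standardized process description can be thought of as a record with a fixed set of fields. The sketch below is ours, not a format prescribed by any regulator: the class name, the field names, and the two example processes are all hypothetical. It simply checks a description against the two example policies quoted earlier (at least one KPI, and defined responsibility and authority).

```python
from dataclasses import dataclass, field


@dataclass
class ProcessDescription:
    """Hypothetical standardized format for one process description."""
    name: str
    responsible: str   # who is responsible for the process
    authority: str     # who has authority over the process
    kpis: list[str] = field(default_factory=list)     # Key Process Indicators
    records: list[str] = field(default_factory=list)  # records that must be kept

    def policy_violations(self) -> list[str]:
        """Check this description against the two example policies."""
        problems = []
        if not self.kpis:
            problems.append("no Key Process Indicator identified")
        if not self.responsible.strip() or not self.authority.strip():
            problems.append("responsibility or authority not defined")
        return problems


# Illustrative data only; these are not drawn from a real operator.
deicing = ProcessDescription(
    name="Aircraft deicing",
    responsible="Ramp supervisor",
    authority="Director of Ground Operations",
    kpis=["holdover time exceedances per 100 departures"],
    records=["deicing log", "fluid test results"],
)
draft = ProcessDescription(name="Fuel receipt", responsible="", authority="")

print(deicing.policy_violations())
print(draft.policy_violations())
```

Because every description carries the same fields, a reviewer can tell at a glance who is responsible, how success is measured, and which records must be kept, whichever process is examined.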
Process Measurements
Just as a quality organization knows that it must establish measures within its pro-
cesses to enable continuous monitoring of performance, an SMS must have measures
within processes to determine whether those processes are meeting their safety tar-
gets. In an SMS, measures are directed at those points within the process that are
most revealing of risk. An SMS does not collect data simply to collect data. The
SMS practitioner in that organization can readily answer why a particular process
measure has been established.
Record-Keeping
An organization with an SMS is good at keeping records and can readily answer
why it does so; keeping records to be prepared for audits is not the right answer! Of
course, there are regulatory reasons for keeping many records but, from an SMS
perspective, records are kept to facilitate management review. Those records include
the process measurements described above, but also include narratives submitted in
an employee self-reporting system, results of internal and external audits, and even
those parameters not directly associated with safety issues, such as routine opera-
tional performance numbers (flights per day, fuel logs, maintenance schedules, and
so on).
But perhaps most importantly, an SMS carefully records, and frequently refer-
ences, the decision-making processes involved in management review. For incidents
and events (reactive safety), this kind of record includes categories such as what
happened, why did it happen, what was the effect, how are we going to decrease the
risk of such an event, who is responsible for the action, and, critically, did it work?
For proactive and predictive safety efforts, the record includes what might happen,
why it might happen, etc., down to “how will we tell if our intervention is working?”
Management review requires the availability of good records, and a mature SMS
will allow those documents to be readily produced when needed.
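One way to picture such a record is as a small structure whose fields mirror the questions listed above for reactive safety. The following is a minimal sketch under our own assumptions: the class name, the field names, and the sample entries are hypothetical, and a real system would likely add dates, identifiers, and links to supporting evidence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReviewRecord:
    """One management-review entry for a reactive (incident or event) item."""
    what_happened: str
    why_it_happened: str
    effect: str
    risk_reduction: str                 # how we intend to decrease the risk
    responsible: str                    # who is responsible for the action
    did_it_work: Optional[bool] = None  # None means not yet verified


def needs_follow_up(records: list[ReviewRecord]) -> list[ReviewRecord]:
    """Return entries whose fix is unverified or was found not to work."""
    return [r for r in records if r.did_it_work is not True]


# Illustrative entries only.
records = [
    ReviewRecord(
        what_happened="Tug contacted wingtip during pushback",
        why_it_happened="No wing walker assigned",
        effect="Minor wingtip damage, flight cancelled",
        risk_reduction="Require wing walkers for all pushbacks",
        responsible="Ramp Manager",
        did_it_work=True,
    ),
    ReviewRecord(
        what_happened="Torque wrench found out of calibration",
        why_it_happened="Calibration due date not tracked",
        effect="Rework of three completed tasks",
        risk_reduction="Add tool calibration dates to the tracking system",
        responsible="Director of Maintenance",
    ),
]

for record in needs_follow_up(records):
    print(record.responsible, "-", record.what_happened)
```

Keeping "did it work?" as an explicit field, rather than an afterthought, makes unverified fixes easy to pull up at the next management review.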
Risk Assessment
An organization’s SMS practitioners will be able to immediately answer when asked
the question “How do you assess risk?” Their answer will reveal that they have a
process for that assessment, not just a guess. That is not to say that every SMS must
have software to run complex problems, such as Monte Carlo simulations, or use
probabilistic risk assessment, or stochastic modeling. Not every organization’s meth-
odology needs to be the same, though guidance is available upon which to build a
risk assessment process. What all SMSs will have in common is a considered, ratio-
nal, and thoughtful way to assess and prioritize risk.
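To make the idea concrete, here is a minimal sketch of one simple approach: scoring each hazard by likelihood and severity and ranking the results. This is an illustration, not a recommended methodology. The 1-to-5 scales, the multiplication of the two values, the thresholds, and the sample hazards are all assumptions we chose for the example; an organization would define its own in its risk assessment process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hazard:
    description: str
    likelihood: int  # assumed scale: 1 (extremely improbable) to 5 (frequent)
    severity: int    # assumed scale: 1 (negligible) to 5 (catastrophic)

    def __post_init__(self):
        for value in (self.likelihood, self.severity):
            if not 1 <= value <= 5:
                raise ValueError("likelihood and severity must be 1 to 5")

    @property
    def score(self) -> int:
        """Simple risk index: likelihood multiplied by severity."""
        return self.likelihood * self.severity


def risk_level(hazard: Hazard) -> str:
    """Map a score to a tolerability band (thresholds are illustrative)."""
    if hazard.score >= 15:
        return "unacceptable"
    if hazard.score >= 6:
        return "tolerable with mitigation"
    return "acceptable"


def prioritize(hazards: list[Hazard]) -> list[Hazard]:
    """Return hazards ordered from highest to lowest score."""
    return sorted(hazards, key=lambda h: h.score, reverse=True)


# Illustrative hazards only.
hazards = [
    Hazard("Bird strike on approach", likelihood=4, severity=3),
    Hazard("Unsecured galley cart", likelihood=2, severity=2),
    Hazard("Runway incursion at night", likelihood=3, severity=5),
]

for h in prioritize(hazards):
    print(f"{h.score:>2}  {risk_level(h):<26} {h.description}")
```

Even a scheme this simple gives a repeatable answer to "How do you assess risk?", which is the point: the process is defined and explainable, whatever its sophistication.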
department, but they could be any other work group employed by the organization.
Ask a few of them three questions, and from their answers you will begin to know
whether the organization has a mature and well-functioning SMS. Those questions are:
1. What are the three greatest areas of risk in your work, and what do you do
to mitigate that risk?
2. When was the last time that you or one of your fellow workers was asked
to help figure out how to make the job and the company’s products safer?
3. What happens to you when you make a mistake?
Everyone complains about their job now and then. Overlooking this fact of life, the
employees of an organization with a mature SMS will be able to point out where the
risk is in their work, because the organization has invested in ways to communicate
that information to each employee. And the organization is not communicating just
generic safety information, but also information relevant to that specific employee’s
work. Those in charge also know that the most accurate and informed sources of
risk information for every process within that organization are the employee groups
performing that process. Therefore, there will be mechanisms in place to tap this
vital source of intelligence.
This is one of the most important features of an SMS. In an SMS, safety must
begin at the top and permeate throughout the organization, including to those on the
“shop floor.”
Finally, employees of an organization with a mature SMS understand that they
are fully accountable for their actions, but not punished for unfortunate but natu-
ral human errors. To the question of what happens when they make a mistake, the
employees would answer that they would probably feel bad, maybe even ashamed
that they did it, but not enough to keep them from participating in the self-reporting
systems the company has created. They understand that they are not responsible for
being perfect but are responsible for striving for continuous improvement, and one
of the best ways to reach that goal is to submit a report.
The remainder of this book will immerse the reader in significantly more detail
about the history and components of, and the theory underlying, SMS. But once one
is familiar with the concepts, recognizing a vibrant SMS is similar to distinguishing
great art – you know it when you see it. Verification of the existence of an SMS is not
presently accomplished (nor probably should it ever be) by merely the achievement
of having 8 out of 10 boxes checked on the “Is There an SMS Here?” form. SMS is
far more organic and integral to the fabric of an organization, and there is no one-
size-fits-all SMS. But once you are an SMS practitioner yourself, spend a short time
visiting an organization with a mature program, and you’ll know, because safety
management is everywhere you look.
In closing this chapter, we remind the reader of something that the late Dr. Don
Arendt said. Don was a driving force in promoting SMS during his work as a Senior
Technical Specialist with the FAA. Somewhat facetiously, Don argued that the term
safety management system should be changed to simply Safety Management. His
rationale was that SMS isn’t something that you have; it’s not a program that sits on a
shelf collecting dust. Instead, it’s something that you are actively doing – something
that his proposed wording implies – you are managing safety.
REVIEW QUESTIONS
1. Explain the relative nature of the term “safe.” Is commercial aviation get-
ting more or less safe?
2. What is meant by the term forensic aviation safety management?
3. Why is it important that an SMS is “scalable”? What are the possible con-
sequences of SMS not being scalable?
4. What are the four components of SMS?
5. Why is it important that SMS be supported by top management?
6. How does management ensure all parts of an organization embrace and
practice SMS?
7. Explain what is meant by the term “dynamic risk management system.”
8. Who is responsible and accountable for safety in an organization?
9. What are some of the ways you can recognize SMS in an organization?
10. What is safety culture and why is it important?
NOTES
1. The International Civil Aviation Organization (ICAO) is a specialized agency of the
United Nations that was created with the signing in Chicago, on December 7, 1944, of
the Convention on International Civil Aviation. ICAO is the permanent body charged
with the administration of the principles laid out in the Convention. It sets the standards
for aviation safety, security, efficiency, and regularity, as well as aviation environmental
protection, and encourages their implementation. ICAO’s membership comprises 193
Signatory States. Its headquarters are in Montréal and it has regional offices in Bang-
kok, Cairo, Dakar, Lima, Mexico City, Nairobi, and Paris.
2. 14 CFR 121 refers to Title 14 of the Code of Federal Regulations, Part 121, which
covers the FAA’s regulations for scheduled air carriers. Likewise, 14 CFR 135
refers to the FAA’s regulations for Part 135 operators (charter operators). From this
part forward, we will simply refer to these types of operations as “Part 121 operators”
or “Part 135 operators” instead of calling out 14 CFR each time.
3. Founded on February 23, 1947, the International Organization for Standardization
(ISO) is a non-governmental, international standard-setting body composed of repre-
sentatives from national standards bodies. ISO sets worldwide industrial and commer-
cial standards which often become law through treaties or national standards.
4. The U.S. government requires that before a regulatory action is taken, such as issuing
a new regulation, the government agency proposing the new rule must first issue a pub-
lic Notice of Proposed Rulemaking (NPRM). The purpose of the NPRM is to solicit
public input before enacting a new regulation. In some cases, the agency may issue an
Advance Notice of Proposed Rulemaking (ANPRM) to seek input for what the NPRM
should consider.
REFERENCES
Federal Aviation Administration [FAA]. (2015). Safety Management Systems for Aviation
Service Providers. Advisory Circular 120-92B. Retrieved January 6, 2015, from http://
www.faa.gov/documentLibrary/media/Advisory_Circular/AC_120-92B.pdf
International Civil Aviation Organization [ICAO]. (2018). Safety Management Manual
(SMM), 4th ed. (Doc 9859). Montréal, Canada: ICAO. ISBN 978-92-9258-552-5.
International Organization for Standardization [ISO]. (2022). Quality Management
Principles. Retrieved March 10, 2022, from https://www.iso.org/files/live/sites/isoorg/
files/store/en/PUB100080.pdf