System Reliability and Availability

This document discusses techniques for calculating system availability based on the availability of its components. It explains that system availability is determined by modeling how components are connected in series or parallel. Components in series have availability equal to the product of their individual availabilities, making the system no more available than the least available component. Components in parallel have availability calculated as 1 minus the probability that all components fail, making parallel configurations more available than any single component. The document provides an example to calculate the availability of a signal processing system based on the reliabilities of its individual parts.

Uploaded by

Ariel Andres Amaya Sanabria

System Reliability and Availability

We have already discussed reliability and availability basics in a previous article. This
article will focus on techniques for calculating system availability from the availability
information for its components.

The following topics are discussed in detail:

• System Availability
  o Availability in Series
  o Availability in Parallel
  o Partial Operation Availability
• Availability Computation Example
  o Understanding the System
  o Reliability Modeling of the System
  o Calculating Availability of Individual Components
  o Calculating System Availability

System Availability
System Availability is calculated by modeling the system as an interconnection of parts in
series and parallel. The following rules are used to decide if components should be placed
in series or parallel:

• If failure of a part leads to the combination becoming inoperable, the two parts are
  considered to be operating in series.
• If failure of a part leads to the other part taking over the operations of the failed part,
  the two parts are considered to be operating in parallel.

Availability in Series

As stated above, two parts X and Y are considered to be operating in series if failure of
either of the parts results in failure of the combination. The combined system is operational
only if both Part X and Part Y are available. From this it follows that the combined
availability is a product of the availability of the two parts. The combined availability is
shown by the equation below:

$A = A_x \times A_y$

The implication of the above equation is that the combined availability of two
components in series is never higher than the availability of either component, and is
strictly lower whenever both components are less than 100% available.
Consider the system in the figure above. Parts X and Y are connected in series. The table
below shows the availability and downtime for the individual components and the series
combination.

| Component | Availability | Downtime |
|---|---|---|
| X | 99% (2-nines) | 3.65 days/year |
| Y | 99.99% (4-nines) | 52 minutes/year |
| X and Y Combined | 98.99% | 3.69 days/year |

From the above table it is clear that even though a very high availability Part Y was used,
the overall availability of the system was pulled down by the low availability of Part X.
This illustrates the saying that a chain is only as strong as its weakest link. More
specifically, a chain of parts in series is weaker than its weakest link.
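
The series rule can be checked numerically. The following is a minimal Python sketch, not part of the original article: the function names are illustrative, and a 365-day year is assumed for the downtime conversion. It recomputes the combined row of the table from the availabilities of Part X and Part Y.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def series_availability(availabilities):
    """Availability of parts in series: the product of the individual availabilities."""
    return prod(availabilities)


def downtime_minutes_per_year(availability):
    """Expected downtime per year, in minutes, for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


a_x = 0.99    # Part X, 2-nines
a_y = 0.9999  # Part Y, 4-nines
a_xy = series_availability([a_x, a_y])

print(f"combined availability: {a_xy:.4%}")
print(f"combined downtime: {downtime_minutes_per_year(a_xy) / (24 * 60):.2f} days/year")
```

The product is always bounded by its smallest factor, which is why adding a 4-nines part in series cannot rescue a 2-nines part.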

Availability in Parallel

As stated above, two parts are considered to be operating in parallel if the combination
fails only when both parts fail. The combined system is operational if either part is
available. From this it follows that the combined availability is 1 minus the probability
that both parts are unavailable. For two identical parts, the combined availability is shown
by the equation below:

$A = 1 - (1 - A_x)^2$

The implication of the above equation is that the combined availability of two
components in parallel is higher than the availability of either individual component,
often dramatically so. Consider the system in the figure above. Two instances of Part X are
connected in parallel. The table below shows the availability and downtime for the individual
component and the parallel combinations.

| Component | Availability | Downtime |
|---|---|---|
| X | 99% (2-nines) | 3.65 days/year |
| Two X components operating in parallel | 99.99% (4-nines) | 52 minutes/year |
| Three X components operating in parallel | 99.9999% (6-nines) | 31 seconds/year |

From the above table it is clear that even though a very low availability Part X was used,
the overall availability of the system is much higher. Thus parallel operation provides a
very powerful mechanism for building a highly reliable system from low-reliability
components. For this reason, mission-critical systems are typically designed with redundant
components. (Different redundancy techniques are discussed in the Hardware Fault Tolerance
article.)
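
As a rough illustration, the parallel rule generalizes from two copies to n identical copies as 1 - (1 - A)^n. The Python sketch below is not from the original article; the function name is illustrative, and it assumes that the copies fail independently of each other, which the formula itself also assumes.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def parallel_availability(availability, copies):
    """Availability of `copies` identical parts in parallel.

    Assumes failures are independent: the combination is down only when
    every copy is down at the same time.
    """
    return 1.0 - (1.0 - availability) ** copies


for n in (1, 2, 3):
    a = parallel_availability(0.99, n)
    downtime_s = (1.0 - a) * SECONDS_PER_YEAR
    print(f"{n} x Part X: availability {a:.6%}, downtime {downtime_s:,.0f} seconds/year")
```

Each extra copy of a 99% part multiplies the unavailability by 0.01, which is where the jump from 2-nines to 4-nines to 6-nines comes from. Real redundant systems rarely achieve full independence, so these figures are best read as an upper bound.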

Partial Operation Availability

Consider a system like the Xenon switching system. In Xenon, XEN cards handle the call
processing for digital trunks connected to the XEN cards. The system has been designed to
incrementally add XEN cards to handle subscriber load. Now consider the case of a Xenon
switch configured with 10 XEN cards. Should we consider the system to be unavailable
when one XEN card fails? This doesn't seem right, as 90% of subscribers are still being
served.

In such systems where failure of a component leads to some users losing service, system
availability has to be defined by considering the percentage of users affected by the failure.
For example, in Xenon the system might be considered unavailable if 30% of the
subscribers are affected. This translates to 3 XEN cards out of 10 failing. The availability
for this system can be computed by calculating A(p,q) as specified below:

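$A(p,q) = C(q,p) \times A^{q-p} \times (1-A)^p$

Here p is the number of failed units, q is the total number of units, A is the availability
of a single unit, and C(q,p) is the number of ways of choosing p units out of q. A(p,q) is
therefore the probability that exactly p of the q units are failed.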
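
One way to turn A(p,q) into a system availability figure is to add up A(p,q) for every number of failed units that still counts as "available". The Python sketch below takes that reading; it is an interpretation rather than something spelled out in the article. The function names are illustrative, and the 99% per-card availability is a hypothetical value chosen only for the example.

```python
from math import comb


def prob_exactly_failed(p, q, a):
    """A(p, q): probability that exactly p of q identical units are failed."""
    return comb(q, p) * a ** (q - p) * (1.0 - a) ** p


def partial_availability(q, max_failed, a):
    """Probability that at most `max_failed` of the q units are failed."""
    return sum(prob_exactly_failed(p, q, a) for p in range(max_failed + 1))


# Hypothetical numbers: 10 XEN cards, each assumed to be 99% available.
# The system counts as unavailable once 3 or more cards have failed,
# so it is available with 0, 1 or 2 failed cards.
CARD_AVAILABILITY = 0.99
system_a = partial_availability(q=10, max_failed=2, a=CARD_AVAILABILITY)
print(f"system availability: {system_a:.6%}")
```

Like the parallel formula, this assumes that the cards fail independently of each other.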

Availability Computation Example

In this section we will compute the availability of a simple signal processing system.

Understanding the System

As a first step, we prepare a detailed block diagram of the system. This system consists of
an input transducer which receives the signal and converts it to a data stream suitable for
the signal processor. This output is fed to a redundant pair of signal processors. The active
signal processor acts on the input, while the standby signal processor ignores the data from
the input transducer. Standby just monitors the sanity of the active signal processor. The
output from the two signal processor boards is combined and fed into the output transducer.
Again, the active signal processor drives the data lines. The standby keeps the data lines
tristated. The output transducer outputs the signal to the external world.

The input and output transducers are passive devices with no microprocessor control. The
signal processor cards run a real-time operating system and signal processing applications.

Also note that the system stays completely operational as long as at least one signal
processor is in operation. Failure of an input or output transducer leads to complete system
failure.

Reliability Modeling of the System

The second step is to prepare a reliability model of the system. At this stage we decide the
parallel and serial connectivity of the system. The complete reliability model of our
example system is shown below:

A few important points to note here are:

• The signal processor hardware and software have been modeled as two distinct
  entities. The software and the hardware operate in series, as the signal
  processor cannot function if either the hardware or the software is not operational.
• The hardware and software of each signal processor together form a signal
  processing complex. The two signal processing complexes are placed in parallel,
  as the system can still function when one of the signal processors fails.
• The input transducer, the parallel pair of signal processing complexes and the output
  transducer have been placed in series, as failure of any of the three parts will lead to
  complete failure of the system.
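
The model described by these points can be written down as a small tree of series and parallel blocks and evaluated recursively. The following Python sketch is one possible encoding, not the article's own notation: the class names (`Part`, `Series`, `Parallel`) and the helper `signal_processor` are hypothetical, and the availabilities are placeholders chosen only to exercise the model.

```python
from dataclasses import dataclass
from math import prod
from typing import Tuple, Union


@dataclass(frozen=True)
class Part:
    name: str
    availability: float


@dataclass(frozen=True)
class Series:
    blocks: Tuple["Block", ...]


@dataclass(frozen=True)
class Parallel:
    blocks: Tuple["Block", ...]


Block = Union[Part, Series, Parallel]


def availability(block: Block) -> float:
    """Evaluate a reliability model built from series and parallel blocks."""
    if isinstance(block, Part):
        return block.availability
    children = [availability(b) for b in block.blocks]
    if isinstance(block, Series):
        return prod(children)  # every child must be up
    return 1.0 - prod(1.0 - a for a in children)  # down only if all children are down


def signal_processor(index: int, hw_a: float, sw_a: float) -> Series:
    """Hardware and software of one signal processor, in series."""
    return Series((Part(f"SP{index} hardware", hw_a), Part(f"SP{index} software", sw_a)))


# Placeholder availabilities, not the values used later in the article.
system = Series((
    Part("input transducer", 0.999),
    Parallel((signal_processor(0, 0.99, 0.995), signal_processor(1, 0.99, 0.995))),
    Part("output transducer", 0.999),
))
print(f"system availability: {availability(system):.6f}")
```

Keeping the structure separate from the numbers makes it easy to re-evaluate the same model once real component availabilities are known.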

Calculating Availability of Individual Components

The third step involves computing the availability of individual components. MTBF (Mean
Time Between Failures) and MTTR (Mean Time To Repair) values are estimated for each
component (see the Reliability and Availability basics article for details). For hardware
components, MTBF information can be obtained from the hardware manufacturer's data sheets.
If the hardware has been developed in house, the hardware group would provide MTBF
information for the board. MTTR estimates for hardware are based on the degree to which
the system will be monitored by operators. Here we estimate the hardware MTTR to be
around 2 hours.

Once MTBF and MTTR are known, the availability of the component can be calculated
using the following formula:

$A = \frac{MTBF}{MTBF + MTTR}$

Estimating software MTBF is a tricky task. Software MTBF is really the time between
subsequent reboots of the software. This interval may be estimated from the defect rate of
the system. The estimate can also be based on previous experience with similar systems.
Here we estimate the software MTBF to be around 2190 hours, the value used in the table
below. The MTTR is the time taken to reboot the failed processor. Our processor supports
automatic reboot, so we estimate the software MTTR to be around 5 minutes. Five minutes
might seem to be on the higher side, but the MTTR should include the following:

• Time wasted in activities aborted due to the signal processor software crash
• Time taken to detect the signal processor failure
• Time taken by the failed processor to reboot and come back in service

| Component | MTBF | MTTR | Availability | Downtime |
|---|---|---|---|---|
| Input Transducer | 100,000 hours | 2 hours | 99.998% | 10.51 minutes/year |
| Signal Processor Hardware | 10,000 hours | 2 hours | 99.98% | 1.75 hours/year |
| Signal Processor Software | 2,190 hours | 5 minutes | 99.9962% | 20 minutes/year |
| Output Transducer | 100,000 hours | 2 hours | 99.998% | 10.51 minutes/year |

Things to note from the above table are:

• The availability of the software is higher than that of the signal processor hardware,
  even though the hardware MTBF is higher. The main reason is that the software has a much
  lower MTTR. In other words, the software fails more often but recovers quickly, thereby
  having less impact on system availability.
• The input and output transducers have fairly high availability, so fairly high
  availability can be achieved for them even without redundant components.
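
The figures in the table can be reproduced with the standard steady-state relation Availability = MTBF / (MTBF + MTTR). The Python sketch below is illustrative rather than the article's own calculation; the function name is hypothetical and a 365-day year is assumed when converting to downtime.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a 365-day year


def availability_from_mtbf(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time the component is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# (MTBF, MTTR) in hours, taken from the table; 5 minutes = 5/60 hours.
components = {
    "Input Transducer": (100_000, 2),
    "Signal Processor Hardware": (10_000, 2),
    "Signal Processor Software": (2_190, 5 / 60),
    "Output Transducer": (100_000, 2),
}

for name, (mtbf, mttr) in components.items():
    a = availability_from_mtbf(mtbf, mttr)
    downtime_min = (1.0 - a) * HOURS_PER_YEAR * 60
    print(f"{name}: availability {a:.4%}, downtime {downtime_min:.2f} minutes/year")
```

Because MTTR appears only in the denominator next to a much larger MTBF, unavailability is roughly MTTR / MTBF. This is why the software, with a 5-minute MTTR, comes out ahead of the hardware despite failing more often.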

Calculating System Availability

The last step involves computing the availability of the entire system. These calculations
have been based on serial and parallel availability calculation formulas.

| Component | Availability | Downtime |
|---|---|---|
| Signal Processing Complex (software + hardware) | 99.9762% | 2.08 hours/year |
| Combined availability of Signal Processing Complex 0 and 1 operating in parallel | 99.99999% | 3.15 seconds/year |
| Complete System | 99.9960% | 21.08 minutes/year |
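
The arithmetic behind these rows can be traced by applying the series and parallel rules in turn to the component availabilities from the previous table. The subscript names below are introduced here only for readability, and the intermediate values are rounded.

$$
\begin{aligned}
A_{\text{complex}} &= A_{\text{hw}} \times A_{\text{sw}} \approx 0.9998 \times 0.999962 \approx 0.999762 \\
A_{\text{pair}} &= 1 - (1 - A_{\text{complex}})^2 \approx 1 - (2.38 \times 10^{-4})^2 \approx 1 - 5.7 \times 10^{-8} \\
A_{\text{system}} &= A_{\text{in}} \times A_{\text{pair}} \times A_{\text{out}} \approx 0.99998 \times 0.99999994 \times 0.99998 \approx 0.99996
\end{aligned}
$$

The last line matches the 99.9960% listed for the complete system. The parallel pair works out to roughly 99.999994%, which the table lists as 99.99999%; the downtime figures in the table appear to follow from that rounded value, so a recalculation from unrounded numbers gives slightly smaller downtimes.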

http://www.eventhelix.com/realtimemantra/faulthandling/system_reliability_availability.htm#.UqzVp8t3vIU
