-Software Reliability Mathematics-

By Priya Singh (Assistant Professor, Dept of SE, DTU)


Background
• The quality of a product or service means fitness for use or, in other
words, meeting the customer's stated as well as implied requirements.
• This is a largely qualitative measure: we can say whether
each need of the customer is met or not.
• We cannot readily quantify the degree of customer satisfaction.
• After examining or testing, we arrive at a conclusion as to whether the product
meets the customer's requirements or not.
• In contrast, reliability is a quantitative measure.
• It can be expressed in hard numbers; for example, the reliability of our
system is 0.6 at the 100th hour of operation or 0.45 at 1000 hours of operation.
• Since reliability is the probability of survival of our product, it is essential
to understand statistics, probability and, in particular, queuing theory in order
to get a reasonable understanding of the subject.



Some Terms & Definitions
1. Population-
The sum total of all the units under discussion for a given purpose.
E.g. 1: Assume that we want to study the daily consumption of electricity by the households in
the city of Delhi.
The quantities of electricity consumed by all the households together form a population.
E.g. 2: Assume that we want to check the quality of PCs manufactured in a given organization.
Then, all the PCs manufactured by the organization are considered to be a population.

The population can be divided into a number of other populations, such as the PCs
manufactured each day, each week, each month and each year.



2. Sample-
Any subset of the population is called a sample.
A sample should be drawn at random to study the characteristics of a given population
without any bias.
3. Statistical Inference-
The field of statistics has evolved to draw inferences about a population by studying or
experimenting with a smaller number of items from the population, called a sample.
Statistics has gained wide confidence because, if properly
applied, it gives correct inferences about the population.
Statistical inference, such as reliability estimation, requires carrying out experiments on
the given sample.



4. Random experiment-
• An experiment is defined as any physical action or process that is observed and the result
noted.
• An experiment is called a random experiment if, when repeated under the same conditions,
its outcome cannot be predicted with certainty, but all possible outcomes
are known in advance.
• For instance, if we toss a fair coin once, we will not be able to predict the outcome with
certainty: it can be either a head or a tail. But we know all possible outcomes, so tossing a
fair coin can be considered a random experiment.
5. Trial-
• A single performance of a given experiment is called a trial.
• For instance, a single toss of a fair coin is called a trial.



6. Event or outcome-
The result of each trial is called an event or outcome.
For instance, getting a head or a tail on each toss of a fair coin is an event or outcome.



Random Variable
• A variable whose value is determined by the outcome of a random experiment
is called a random variable.
• There are two types of random variable-
i. Discrete random variable
ii. Continuous random variable
A discrete random variable is one whose set of possible values is countable, e.g. the
natural numbers 1, 2, 3, …, and a continuous random variable is one whose set
of possible values is uncountable, e.g. an interval of real numbers.



Continuous random variables
• A random variable is denoted by an uppercase letter, for instance T, and
t1, t2, t3 are then the values taken by the random variable T in a random
experiment.
• E.g.- If we are testing a software system, let T denote the random variable of
cumulative time to failure; then t1, t2, t3 represent the cumulative times at
which the first, second and third failures occurred.
• Since time is continuously monitored, T is an example of a continuous
random variable, and t1, t2, t3 are values of the random variable T in the random
experiment of testing software and looking for failures.
• Continuous variables are unlimited in their degree of precision, for instance a
failure occurring at 24.501 hours. In contrast, discrete variables are
limited to specific values such as the natural numbers, e.g. week 1, 2, 3.
Discrete Random variable



Example-



Expectation of a Random Variable

Example



Variance of a Random Variable

Example
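As a minimal sketch of the two definitions, the expectation E[X] = sum of x·P(X = x) and the variance Var(X) = E[(X - E[X])^2] of a discrete random variable can be computed directly from its probability mass function. The fair six-sided die and the function names below are assumptions chosen for illustration; they are not taken from the slides.

```python
from fractions import Fraction


def expectation(pmf):
    """E[X]: sum of x * P(X = x) over all values x in the pmf dict."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X): expected squared deviation of X from its mean."""
    mean = expectation(pmf)
    return sum((x - mean) ** 2 * p for x, p in pmf.items())


# Assumed example: a fair six-sided die, each face with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

die_mean = expectation(die)      # 7/2
die_variance = variance(die)     # 35/12
print(die_mean, die_variance)
```

Exact fractions are used so that the results can be compared with a hand calculation without rounding error.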



Distribution Function
1. Binomial probability distribution

Bernoulli Trials



Expectation of X and variance of Binomial Distribution

Example
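A short sketch, under assumed values n = 10 trials and success probability p = 0.1, of the binomial probability P(X = x) = C(n, x) p^x (1 - p)^(n - x). It checks numerically the standard results E[X] = np and Var(X) = np(1 - p); the function name is hypothetical.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for x successes in n independent Bernoulli trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Assumed illustration values, not taken from the slides.
n, p = 10, 0.1

pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
mean = sum(x * px for x, px in enumerate(pmf))
var = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(mean, n * p)            # both close to 1.0
print(var, n * p * (1 - p))   # both close to 0.9
```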



2. Poisson Probability Distribution



Example
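A minimal sketch of the Poisson probability P(X = x) = e^(-λ) λ^x / x!, assuming an average of λ = 2 failures per interval. Both the value of λ and the function name are illustrative assumptions.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) when events occur at an average of lam per interval."""
    return exp(-lam) * lam**x / factorial(x)


lam = 2.0  # assumed average number of failures per interval

p_zero = poisson_pmf(0, lam)                                # no failure
p_at_most_2 = sum(poisson_pmf(x, lam) for x in range(3))    # 0, 1 or 2 failures

print(round(p_zero, 4), round(p_at_most_2, 4))
```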



Example-



3. Exponential Distribution

a. Probability Density Function



a. Probability Density Function-

b. Cumulative Distribution Function
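A small sketch of the two functions, assuming the usual rate parameterization of the exponential distribution with failure rate λ: f(t) = λ e^(-λt) and F(t) = 1 - e^(-λt) for t >= 0. The function names and the value λ = 0.5 are illustrative assumptions.

```python
from math import exp


def exp_pdf(t, lam):
    """Probability density f(t) = lam * e^(-lam * t) for t >= 0."""
    return lam * exp(-lam * t) if t >= 0 else 0.0


def exp_cdf(t, lam):
    """Cumulative probability F(t) = 1 - e^(-lam * t) for t >= 0."""
    return 1.0 - exp(-lam * t) if t >= 0 else 0.0


lam = 0.5  # assumed failure rate (failures per hour)

# The hazard rate f(t) / (1 - F(t)) stays equal to lam at every time t.
for t in (0.5, 2.0, 10.0):
    print(t, exp_pdf(t, lam) / (1.0 - exp_cdf(t, lam)))
```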



Example-



Sol.



Example-



Bath-tub Curve
Each process (manufacturing, flight operation,
failure of hardware, failure of computer software)
has unique characteristics that must be captured when we model it.
We need to select probability distribution functions
that are suitable to model the process.
We noted that the exponential distribution has a
constant hazard rate.
The popular bath-tub curve depicts the failure
rate of a product in hardware systems like
automobiles, grinders etc. over its entire life;
it has three regions.



• The early part, known as the infant mortality period, has a decreasing hazard rate.
• The middle part describes the constant hazard rate region during the useful life period, which
is very long compared to the infant mortality period.
• The wear-out period has an increasing hazard rate.
• Three types of failure rate-
1. Decreasing with time,
2. Constant,
3. Increasing with time.
The exponential distribution can only model a constant hazard rate and is not flexible enough to
describe failure-rate situations like those of the bath-tub curve.



4. Exponentiated Weibull Distribution
• The exponential distribution provides simple solutions; its generalization
introduced by Weibull resulted in the Weibull family of distributions.
• The Weibull family is used for modelling systems with monotone failure rates, that is,
failure rates that continually increase or continually decrease; a single Weibull
distribution cannot reproduce the whole bathtub curve.
• Data in reliability analysis over the life cycle of a product can
involve a high initial failure rate (infant mortality) and an eventual high failure rate due
to aging and wear out, exhibiting bathtub-shaped failure rates. A generalization of the
Weibull family, called the Exponentiated Weibull Family, was therefore
proposed.



The cumulative distribution function (CDF) and
probability density function (PDF) of the
exponentiated Weibull distribution with two
shape parameters and one scale parameter are
given as-
F(t) = [1 - exp(-(t/σ)^α)]^θ
f(t) = (αθ/σ) (t/σ)^(α-1) exp(-(t/σ)^α) [1 - exp(-(t/σ)^α)]^(θ-1),  t > 0
• The Exponentiated Weibull Distribution
with two shape parameters α and θ
can accommodate a variety of failure rate
functions (when the scale parameter σ is a
constant).
• The Exponentiated Weibull distribution can be
used to model a variety of failure patterns and
therefore it is a potential distribution
function to represent the rate of occurrence of failures (ROCOF) in future
software reliability models.
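The sketch below illustrates how two shape parameters can produce a bathtub-shaped failure rate. It assumes the standard exponentiated Weibull form F(t) = [1 - exp(-(t/σ)^α)]^θ; the parameter values α = 5, θ = 0.1, σ = 1 and the function names are illustrative assumptions, chosen so that α > 1 and αθ < 1.

```python
from math import exp, expm1


def ew_cdf(t, alpha, theta, sigma=1.0):
    """Assumed form: F(t) = [1 - exp(-(t/sigma)^alpha)]^theta."""
    return (-expm1(-((t / sigma) ** alpha))) ** theta


def ew_pdf(t, alpha, theta, sigma=1.0):
    """Density obtained by differentiating the assumed CDF."""
    z = (t / sigma) ** alpha
    base = -expm1(-z)  # 1 - exp(-z), computed accurately for small z
    return (alpha * theta / sigma) * (t / sigma) ** (alpha - 1) \
        * exp(-z) * base ** (theta - 1)


def ew_hazard(t, alpha, theta, sigma=1.0):
    """Hazard rate h(t) = f(t) / (1 - F(t))."""
    return ew_pdf(t, alpha, theta, sigma) / (1.0 - ew_cdf(t, alpha, theta, sigma))


# Assumed shape values: high early hazard, low middle, rising late hazard.
for t in (0.01, 0.5, 1.5):
    print(t, round(ew_hazard(t, alpha=5.0, theta=0.1), 3))
```

With θ = 1 the same functions reduce to the ordinary two-parameter Weibull distribution.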



5. Generalized Exponential Distribution
• The generalized exponential (GE) distribution, also called the exponentiated exponential
distribution, is defined as a particular case of the exponentiated Weibull distribution
with α = 1.
• The equations for the CDF F(t) and PDF f(t) of the generalized exponential distribution,
written with shape parameter β and scale parameter θ, are given below-
F(t) = [1 - e^(-t/θ)]^β
f(t) = (β/θ) [1 - e^(-t/θ)]^(β-1) e^(-t/θ),  t > 0
• This form does not have a location parameter.
• The reliability R(t) and the hazard function h(t) of the 2-parameter GE distribution are given
below-
R(t) = 1 - F(t) = 1 - [1 - e^(-t/θ)]^β
h(t) = f(t) / R(t)



• Plot of the PDF of the GE distribution for θ = 1:
• The PDF takes two distinct shapes.
• When β <= 1, it is reverse-J shaped (log-convex).
• When β > 1, it is log-concave.
• Even at higher values of β, it does not become symmetric but remains right-skewed,
which is an interesting characteristic of the GE distribution.
• Plot of the hazard function for various values of β for θ = 1:
• The GE distribution reduces to the exponential distribution when β = 1. The
hazard rate is then constant and equal to 1/θ.
• When β > 1, the hazard rate increases from 0 to a finite number, namely 1/θ.
• When β < 1, the hazard rate decreases from infinity to a finite number, namely 1/θ, which is
one of the interesting features of the GE distribution.
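The limiting behaviour of the hazard rate can be checked numerically. This is a sketch assuming the GE form F(t) = [1 - e^(-t/θ)]^β with shape β and scale θ; the function names are hypothetical.

```python
from math import exp, expm1


def ge_cdf(t, beta, theta=1.0):
    """Assumed form: F(t) = [1 - exp(-t/theta)]^beta."""
    return (-expm1(-t / theta)) ** beta


def ge_pdf(t, beta, theta=1.0):
    """Density obtained by differentiating the assumed CDF."""
    return (beta / theta) * (-expm1(-t / theta)) ** (beta - 1) * exp(-t / theta)


def ge_hazard(t, beta, theta=1.0):
    """Hazard rate h(t) = f(t) / R(t) with R(t) = 1 - F(t)."""
    return ge_pdf(t, beta, theta) / (1.0 - ge_cdf(t, beta, theta))


# For theta = 1 the hazard approaches 1/theta = 1 from below when beta > 1,
# from above when beta < 1, and equals 1 throughout when beta = 1.
for beta in (0.5, 1.0, 2.0):
    print(beta, [round(ge_hazard(t, beta), 3) for t in (0.1, 1.0, 5.0)])
```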



Fig. PDF of GE distribution
Fig. Hazard function of GE distribution



Example-



Solution-



6. Weibull Distribution with three parameters
• The CDF, PDF, reliability function and hazard function of Weibull distribution function
with three parameters are given as follows



• The location parameter γ is also called the failure-free time.
• By assigning a value of zero to the location
parameter γ in the above equations, the equations
for the corresponding functions of a two-parameter
Weibull distribution can be obtained.
• E.g. the corresponding equation for the two-parameter
Weibull distribution is given below-

• The hazard function of the two-parameter Weibull
distribution for a scale parameter θ=1 is plotted as given-
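As a sketch of these functions, the code below assumes the standard three-parameter Weibull form F(t) = 1 - exp(-((t - γ)/θ)^β), with scale θ and location γ as named in the text. The shape symbol β and the function names are assumptions made here for illustration. Setting γ = 0 gives the two-parameter case.

```python
from math import exp


def weibull_cdf(t, beta, theta, gamma=0.0):
    """Assumed form: F(t) = 1 - exp(-((t - gamma)/theta)^beta) for t > gamma."""
    if t <= gamma:
        return 0.0  # no failures before the failure-free time gamma
    return 1.0 - exp(-(((t - gamma) / theta) ** beta))


def weibull_reliability(t, beta, theta, gamma=0.0):
    """R(t) = 1 - F(t)."""
    return 1.0 - weibull_cdf(t, beta, theta, gamma)


def weibull_hazard(t, beta, theta, gamma=0.0):
    """h(t) = (beta/theta) * ((t - gamma)/theta)^(beta - 1) for t > gamma."""
    if t <= gamma:
        return 0.0  # convention used in this sketch
    return (beta / theta) * ((t - gamma) / theta) ** (beta - 1)


# Two-parameter case (gamma = 0) with scale theta = 1:
# beta < 1 gives a decreasing hazard, beta = 1 constant, beta > 1 increasing.
for beta in (0.5, 1.0, 2.0):
    print(beta, [round(weibull_hazard(t, beta, 1.0), 3) for t in (0.5, 1.0, 2.0)])
```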



Example 1

Example 2



Solution 1

Solution 2



7. Importance of Reliability
• Reliability is one of the most important characteristics of a product whether it is hardware
or software.
• Reliability is a measure of the performance of the product over time.
• Product recalls typically happen only after the product has been in use for some time.
• E.g., in October 2006 one major multinational corporation recalled up to 9.6 million of its
personal computer batteries. Such recalls can cause a lot of embarrassment and
inconvenience to the manufacturer and the customers alike.
• Big corporate houses spend millions of dollars on warranty claims.
• The reliability of the product, therefore, assumes importance.
• Reliability can be defined as the probability that the given system will perform its required
function under specified conditions for a specified period of time.
• Reliability can also be assessed depending on the frequency of failures.



8. Non-repairable and repairable systems
Systems can be classified into-
a. Repairable, and
b. Non-repairable

Non-repairable systems are those that do not get repaired when they fail. Specifically, the
components of the system are not repaired or replaced when they fail.
Repairable systems are those that get repaired when they fail by repairing or replacing the
failed components in the system.
Example-
An automobile is an example of a repairable system: if the automobile is rendered
inoperative when a component or subsystem fails, the component is typically repaired
or replaced rather than a new automobile being purchased.
During the testing phase, a software system is repairable, and during the operational phase it
is not repairable.
9. Mean Time Between Failures
• Average time a system will run between failures is referred to as mean time between
failures (MTBF).
• It is usually expressed in hours.
• This metric is more easily understood by the user than the reliability measure.
• It is the average time between successive failures.
• It is used for repairable systems.
• If a product's life distribution, i.e. the distribution which describes the life of the product, is
exponential, then the failure rate λ is the reciprocal of the mean time between failures.
λ = 1/MTBF



10. Mean Time To Failure (MTTF)
• Mean Time To Failure (MTTF) is another indicator of reliability.
• It is the average time that elapses until a failure occurs for a one-shot or non-repairable
device.
• MTTF is used for non-repairable components or one-shot devices.



11. Exponential Failure Law
• Reliability of a system is often modeled using the exponential distribution because, if
failures are due to random events, the time to failure follows an exponential distribution.
• In the absence of information about the cause of failure, we assume a random
failure process.
• In such an event-
R(t) = e^(-λt)
where λ is the failure rate.
MTBF is the average time a system will run between failures and it is given by-
MTBF = 1/λ
In other words, the MTBF of a system is the reciprocal of the failure rate.
If λ is the number of failures per hour, the MTBF is expressed in hours.
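A minimal numerical sketch of the exponential failure law R(t) = e^(-λt) and MTBF = 1/λ. The failure rate of 0.001 failures per hour and the function names are assumed for illustration.

```python
from math import exp


def reliability(t, failure_rate):
    """Exponential failure law: probability of surviving to time t."""
    return exp(-failure_rate * t)


def mtbf(failure_rate):
    """Mean time between failures: reciprocal of the failure rate."""
    return 1.0 / failure_rate


failure_rate = 0.001  # assumed: failures per hour

mtbf_hours = mtbf(failure_rate)                     # about 1000 hours
r_at_mtbf = reliability(mtbf_hours, failure_rate)   # e^-1, about 0.368

print(mtbf_hours, round(r_at_mtbf, 3))
```

Note that the reliability at t = MTBF is only about 37%, so the MTBF should not be read as a guaranteed failure-free period.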



Example-



Solution-



12. System Reliability
• A system is a collection of components, subsystems and/or assemblies arranged according to a
specific design in order to achieve desired functions with acceptable performance and
reliability.
• The main task of system reliability estimation is the construction of a reliability model (life
distribution) that represents the time to failure of the entire system based on the life
distributions of the components, sub-assemblies and/or assemblies from which it is
composed.
• In the context of computer systems, we have to analyze and identify suitable probability
distribution functions for the failure of hardware, software etc. and make a reliability
block diagram.
• The types of components, their quantities, their failure rates and the manner in which they
are arranged within the system have a direct impact on the system reliability.



• Often the relationship between a system and its components is misunderstood or
oversimplified.
• E.g. the following statement is not valid-
"All of the components in a system have 90% reliability for a given mission time, thus the
reliability of the system is 90% for that time."
It is neither so simple nor so straightforward.
• We have to construct a reliability block diagram (RBD) to arrive at the reliability of the
system, since it depends on both the reliability
of the components used in the system and how they are connected.



13. Reliability Block Diagram
• The reliability block diagram (RBD) is a graphical representation of the components of the
system and how they are related or connected reliability-wise.
• Block diagrams are used to describe the interrelationship between the components and to
define the system.
• When used in this fashion, the block diagram is referred to as a reliability block diagram.
• E.g. the RBD of a simplified computer system with a redundant fan configuration is shown
below-



13.1 System Reliability Function
• The designer of a system first needs to study the reliability characteristics of each block or
component.
• After studying the properties of each block in a system, the blocks can then be connected
in a reliability-wise manner to create a reliability block diagram for the system.
• The RBD provides a visual representation of the way the blocks are arranged
reliability-wise.
• This diagram demonstrates the effect of the success or failure of each component on the
success or failure of the overall system.
• E.g. if all the components in a system must succeed in order for the system to succeed,
then the components are arranged reliability-wise in series.
• If at least one of several components must succeed in order for the system to succeed, those
components are arranged reliability-wise in parallel.



Series System Reliability
• A reliability-wise series system is given as below-

• In a series configuration, a failure of any
component results in failure of the entire system.
• In most cases, when a complete system is considered
at its basic subsystem level, it is found that the subsystems
are arranged reliability-wise in a series
configuration.
• E.g. a personal computer may consist of four
basic components- the motherboard, the hard
drive, the power supply and the processor. These
are reliability-wise in series and the failure of any
of the subsystems will cause a system failure.



• In other words, all of the units in a series system must succeed for the system to succeed.

• The reliability of a series system is the probability that all the units succeed.
• If the components or subsystems are independent and the reliabilities of the n subsystems
are R1, R2, ..., Rn, then the reliability of the series system is given as
Rs = R1 x R2 x ... x Rn
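A brief sketch of the series rule, in which the system reliability is the product of the independent component reliabilities. The four-component personal computer with 0.90 reliability per component is an assumed illustration; the function name is hypothetical.

```python
from math import prod


def series_reliability(reliabilities):
    """Reliability of independent components arranged in series."""
    return prod(reliabilities)


# Assumed component reliabilities for a given mission time.
pc = {
    "motherboard": 0.90,
    "hard drive": 0.90,
    "power supply": 0.90,
    "processor": 0.90,
}

system_r = series_reliability(pc.values())
print(round(system_r, 4))  # 0.9**4, well below the 0.90 of any single component
```

This makes concrete why "every component is 90% reliable" does not imply that the system is 90% reliable.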



Example-



Parallel System
• The components of a system can be
connected reliability-wise in parallel.
• In a simple parallel system, at least one of the
units must succeed for the system to succeed.
• The components in parallel are also referred
to as redundant units.
• Redundancy is a very important aspect of
system design and reliability, in that adding
redundancy is one of several methods of
improving system reliability.



The unreliability of a system consisting of n independent
components in parallel is given by
Qs = (1 - R1) x (1 - R2) x ... x (1 - Rn)
so that the system reliability is Rs = 1 - Qs.
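A short sketch of the parallel rule: the system fails only if every independent component fails, so its unreliability is the product of the component unreliabilities. The two redundant fans with reliability 0.90 each are an assumed illustration; the function name is hypothetical.

```python
from math import prod


def parallel_reliability(reliabilities):
    """Reliability of independent components arranged in parallel."""
    unreliability = prod(1.0 - r for r in reliabilities)
    return 1.0 - unreliability


# Assumed example: two redundant fans, each with reliability 0.90.
two_fans = parallel_reliability([0.90, 0.90])
print(round(two_fans, 4))  # 0.99, higher than either fan alone
```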



Some comments on Serial and Parallel System

• Serial System
The overall reliability of the serial system is limited by the lowest component reliability.
In a series configuration, the component
with the lowest reliability has the
biggest effect on the system's reliability.
As a result, the reliability of the series
system is never greater than the reliability
of the least reliable component.

• Parallel System
The component with the highest reliability in a
parallel configuration has the biggest effect on the
system's reliability, since the most reliable
component is the one most likely to fail last.



Example-



Solution-



Reliability of k-out-of-n independent and identical components

• By placing items in a parallel system, we can improve the overall reliability.


• The k-out-of-n configuration is a special case of parallel redundancy.
• This type of configuration requires that at least k components succeed out of the
total n parallel components for the system to succeed.
• E.g.: Consider an airplane that has four engines. Further, suppose that the design
of the aircraft is such that at least two engines are required to function for the
aircraft to remain airborne. This means that the engines are reliability-wise in a k-
out-of-n configuration where k=2 and n=4. (2-out-of-4 configuration)



• Suppose we need 900 MW of power for a utility.
If we install four 300 MW power sources, then it is a 3-out-of-4
configuration. This means that while 3 units would be sufficient, we deploy 4 identical units
for the same application to enhance reliability.
• The idea is that we connect n identical and independent components, all with the same
failure distribution, of which any k working components can satisfy the demand,
and whenever a failure occurs the remaining components are not affected.
• In this case the reliability of the system with such a configuration can be evaluated using
the Binomial Distribution as given in the following-
Rs(k, n, R) = sum over i = k to n of C(n, i) x R^i x (1 - R)^(n - i)
where R is the reliability of each component and C(n, i) is the number of ways of choosing
i components out of n.



The effect of the k-out-of-n system, namely deploying n units
where k units are sufficient to meet the demand (n >= k), can
be analyzed in two ways-
fixing n as a constant and increasing k from 1 to n, and vice
versa.
Consider the effect of increasing the number of units required (k) for
system success while the total number of units (n) remains
constant. When we deploy 6 units, each with a reliability of 0.85,
and one unit is able to satisfy the need, applying the
Binomial Distribution gives an overall reliability above 0.9999.
We can verify this by substituting n=6, k=1 and R = 0.85.
When we deploy 6 units and the need is two identical units of
reliability 0.85, we get an overall reliability of 0.9996, and so on.
The system configuration becomes a simple parallel
configuration when k=1, and when k=n=6 the system is a 6-unit series configuration.
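The figures quoted above can be reproduced with a few lines of code. This sketch sums the binomial probabilities of having at least k working units out of n identical, independent units of reliability R; the function name is hypothetical.

```python
from math import comb


def k_out_of_n_reliability(k, n, r):
    """Probability that at least k of n identical independent units work."""
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


n, r = 6, 0.85  # values used in the text

# k = 1 is a simple parallel system; k = n is a series system.
for k in range(1, n + 1):
    print(k, round(k_out_of_n_reliability(k, n, r), 6))
```

Fixing n and raising k lowers the system reliability step by step, from the parallel value 1 - (1 - R)^n at k = 1 down to the series value R^n at k = n.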



Example-



Solution-



Maintenance
• Maintenance is defined as any action that restores failed units to an operational condition.
• For repairable systems, maintenance plays a vital role in the life of a system.
• It affects the system’s overall reliability, availability, downtime, cost of operation etc.



Down time
• Maintenance actions, preventive or corrective, are not instantaneous.
• There is a time factor associated with each action, i.e. it takes some amount of time to
complete the action.
• This time is usually referred to as downtime.
• It is defined as the length of time an item is not operational.
• MDT (mean downtime)- the average downtime over n maintenance actions is known as
mean downtime.
• There are a number of different factors that can affect the downtime, such as physical
characteristics of the system, spare parts availability, repair crew availability, human
factors and environmental factors. The downtime can be classified into two categories
based on these factors- waiting downtime and active downtime.



• Waiting downtime is the time during which the equipment is inoperable but not
yet undergoing repair. This could be due to the time it takes for replacement parts
to be shipped, administrative processing time etc.
• Active downtime is the time during which the equipment is inoperable and
actually undergoing repair.
• In other words, the active downtime is the time it takes the repair personnel to
perform the repair or replacement work.
• The length of the active downtime is greatly dependent on human factors as well
as on the design of the equipment.
• E.g. the ease of accessibility of the components in a system has a direct effect on
the active downtime.



• Because a variety of different factors influence downtime,
the time it takes to repair or restore a specific item is generally not constant, i.e. the
time to repair is a random variable, much like the time to failure.
• The average time taken for repair over n observations is called the mean time to repair (MTTR).
• Distributions that describe the time to repair are called repair distributions (or
downtime distributions).
• E.g. when using a life distribution with failure data, i.e. the event modeled is
time-to-failure, the unreliability provides the probability that the event (failure)
will occur by a given time, while the reliability provides the probability that the event
(failure) will not occur by that time.
• By analogy, when the event modeled is repair, the probability of repairing the component
by a given time t is called the component's maintainability.



Maintainability
• Maintainability is defined as the probability of performing a
successful repair within a given time frame.
• E.g. if it is said that a particular component has 90% maintainability in
one hour, it means that there is a 90% probability that the component
will be repaired within an hour.
• In maintainability the random variable is time-to-repair, in the same
manner as time-to-failure is the random variable in reliability.
• E.g.,
Consider the maintainability equation for a system in which the repair
times are distributed exponentially. Its maintainability M(t) at a given
time t is given by the equation
M(t) = 1 - e^(-μt)



• Note the similarity between this equation and the equation for the CDF of
a system with exponentially distributed times to failure.
• Note that maintainability represents the probability of an event
occurring (repairing the system), while reliability represents the
probability of an event not occurring (failure); the maintainability
expression is therefore equivalent to the unreliability expression (1 - R).
• The single model parameter μ is referred to as the repair
rate, which is analogous to the failure rate λ used in reliability for
an exponential distribution.
• The mean time to repair (MTTR) is the reciprocal of the repair rate μ.
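A small sketch of exponential maintainability M(t) = 1 - e^(-μt) with MTTR = 1/μ. It works backwards from the "90% maintainability in one hour" example to the repair rate that would produce it; the function names are hypothetical.

```python
from math import exp, log


def maintainability(t, repair_rate):
    """Probability that a repair is completed within time t."""
    return 1.0 - exp(-repair_rate * t)


def repair_rate_for(target, t):
    """Repair rate mu such that M(t) equals the target probability."""
    return -log(1.0 - target) / t


# 90% maintainability in one hour, assuming exponential repair times.
mu = repair_rate_for(0.90, 1.0)   # about 2.30 repairs per hour
mttr = 1.0 / mu                   # about 0.43 hours

print(round(mu, 3), round(mttr, 3), round(maintainability(1.0, mu), 3))
```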



Availability
• If one considers both reliability (the probability that the item will not fail) and
maintainability (the probability that the item will be successfully restored after failure),
then an additional metric is needed for the probability that the component or
system is operational at a given time t, i.e. it has not failed or it has been restored after
failure. This metric is known as availability.
• Availability of a system is the probability that the system will be functioning
according to expectations at any time during its scheduled working period.

• Assuming that downtime is only due to repair, availability is used for repairable
systems.
• It is the probability that the system is operational at any random time t.
• It can also be specified as the proportion of time that the system is available for use in
a given time interval (0, t).



• Availability is a performance criterion for repairable systems that accounts
for both the reliability and maintainability properties of a component
or a system.
• It is defined as the probability that the system is operating properly
when it is requested for use, i.e. available.
• It is the probability that the system is not in a failed condition or
undergoing a repair action when it needs to be used.
• E.g. if a lamp has a 99.9% availability, there will be on average 1 time out of
1,000 that someone needs to use the lamp and finds that the lamp
is not operational, either because the lamp has burned out or because the
lamp is in the process of being replaced.
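The text defines availability only in words. One commonly used steady-state expression, assumed here and not taken from the slides, is A = MTBF / (MTBF + MTTR), which treats repair as the only cause of downtime. The lamp figures below (999 hours between failures, 1 hour to replace) are hypothetical values chosen to reproduce 99.9%.

```python
def steady_state_availability(mtbf, mttr):
    """Long-run fraction of time the system is operational.

    Assumes downtime is caused only by repair (MTTR) and that up-time
    between failures averages MTBF; both in the same time unit.
    """
    return mtbf / (mtbf + mttr)


# Hypothetical lamp: fails on average every 999 hours, replaced in 1 hour.
lamp_availability = steady_state_availability(mtbf=999.0, mttr=1.0)
print(lamp_availability)
```

Under this assumption availability improves either by increasing MTBF (better reliability) or by reducing MTTR (better maintainability).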



Conclusion
• All these concepts are applicable to hardware systems.
• In general, some of them may not be applicable to software systems.
• E.g., k-out-of-n parallel redundancy is not directly applicable to software.
• Deploying additional copies of the same software system as a redundancy
measure when only one is needed is not going to improve the overall
reliability.
• The reason is that, as the code is identical, the same failure will occur in all
copies of the given software system.
• In hardware, however, although all units (e.g. PCs) are manufactured by the same
manufacturer to a single design, the failure pattern of each unit will be random.
• Therefore, redundancy will improve the reliability of hardware systems.
• Hence the basic concepts of reliability, maintainability and availability are
to be applied carefully to software systems.
