Appendix G
Modeling the Repair Process
Probability of Repair
Modeling the repair process can be complex and error-prone. Fortunately,
several simplifying assumptions can be made without serious impact on the
results. However, one must understand the limits of these assumptions and
simplifications.
Modeling the repair process is difficult in many ways because it is quite
different from the failure process. Random failures arise from a stochastic
process, and most of our modeling techniques were created for such
processes. Some aspects of the repair process are deterministic; others are
stochastic. Fortunately, Markov models can approximate the repair process
more accurately than most other techniques.
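[Diagram: state 0 (OK) and state 1 (Fail); failure transition λ∆t from state 0 to state 1, repair transition µ∆t from state 1 to state 0]
Figure G-1. Markov Model for Repairable Component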
Goble05-AppG.fm Page 358 Thursday, March 31, 2005 11:34 PM
Assume that the repair probability is estimated to vary with time
(non-homogeneous). Table G-1 shows a set of example repair time
statistics. These statistics indicate that the repair times for a set of 64 repairs
varied from one to six hours. Six units were repaired within one hour.
Sixteen units were repaired within two hours. Other units took longer.
Based on these numbers, the average repair time is approximately 3 hours,
and the probability of repair in a particular hour is shown in Figure G-2.
[Plot for Figure G-2: Repair Probability (vertical axis, 0.00 to 0.40) versus Repair Hours (horizontal axis, 1 to 6)]
Using an average repair rate of 0.333 per hour (one repair per 3 hours), a
simple steady-state availability equation (Figure G-3) gives an answer of
0.9708. Most would agree that the simple approximation is good enough.
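[Diagram: state 0 (OK) and state 1 (Fail); failure probability 0.01 from state 0 to state 1, repair probability 0.333 from state 1 to state 0]
Figure G-3. Simple Approximate Markov Model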
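The 0.9708 figure can be checked with a short calculation. The sketch below assumes the standard steady-state availability formula for a two-state model, A = µ / (λ + µ), and takes λ = 0.01 and µ = 0.333 from Figure G-3; the function name is illustrative only.

```python
def steady_state_availability(failure_rate: float, repair_rate: float) -> float:
    """Steady-state availability of a two-state (OK/Fail) Markov model.

    Assumes constant failure and repair rates expressed per hour.
    """
    return repair_rate / (failure_rate + repair_rate)


FAILURE_RATE = 0.01   # per hour, as labeled in Figure G-3
REPAIR_RATE = 0.333   # per hour, about one repair every 3 hours

availability = steady_state_availability(FAILURE_RATE, REPAIR_RATE)
print(f"Steady-state availability: {availability:.4f}")
```

The result depends only on the ratio of the two rates, which is why a single average repair rate is enough for this approximation.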
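[Diagram: state 0 (OK) and Fail states labeled RT=1 (state 1), RT=2 (state 2) and RT=3 (state 3); failure rate λ = 0.01; transitions labeled 1 between Fail states]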
The probability of moving from state 2 to state 1 has been modeled in two
different ways, for various reasons. If the constant repair rate for one unit
is Mu (µ), then the probability of moving from state 1 to state 0 is equal to
Mu given a delta t of one hour. The probability of moving from state 2 to
state 1 has been modeled as either 2 Mu (often called the “two repairman
model”) or Mu (often called the “one repairman model”).
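[Diagram: state 0 (System OK, 2 Units OK), state 1 (System OK, 1 Unit OK, 1 Unit Fail) and state 2 (System Fail, 2 Units Fail); failure transitions 2P Failure from state 0 to state 1 and PFailure from state 1 to state 2; repair transitions PRepair10 from state 1 to state 0 and PRepair21 from state 2 to state 1]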
An alternative argument (see Ref. 2) has been made for using the value of
2 Mu. This argument recognizes that only a single repair crew is available.
However, the second failure occurs only from state 1, where one
component has already failed and is likely under repair. Since the repair of
the first component is on average half complete, that repair will take only
half the time to complete. When that repair is complete, the Markov model
is back in state 1. Therefore, the repair probability from state 2 to state 1
is twice the repair probability from state 1 to state 0. This is a reasonable
argument and justifies the use of 2 Mu in situations where all failures are
immediately detectable.
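To see how much the choice between Mu and 2 Mu matters, the two variants can be compared numerically. The following is a minimal sketch, not the author's calculation: it assumes a three-state model (0 = both units OK, 1 = one unit failed, 2 = both units failed), a one-hour time step, and illustrative values of 0.01 for the per-unit failure probability and 0.333 for Mu. The function names are hypothetical.

```python
def transition_matrix(p_fail, mu, repair_factor):
    """One-hour transition matrix for a two-unit repairable system.

    repair_factor = 1 gives the "one repairman model" (Mu from state 2),
    repair_factor = 2 gives the "two repairman model" (2 Mu from state 2).
    """
    return [
        [1 - 2 * p_fail, 2 * p_fail, 0.0],
        [mu, 1 - mu - p_fail, p_fail],
        [0.0, repair_factor * mu, 1 - repair_factor * mu],
    ]


def step(dist, matrix):
    """Advance a state probability row vector by one time step."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def steady_state(matrix, steps=5000):
    """Approximate the steady state by repeated multiplication."""
    dist = [1.0] + [0.0] * (len(matrix) - 1)  # start with both units OK
    for _ in range(steps):
        dist = step(dist, matrix)
    return dist


P_FAIL = 0.01  # assumed per-unit failure probability per hour
MU = 0.333     # assumed repair probability per hour for one unit

one_repairman = steady_state(transition_matrix(P_FAIL, MU, 1))
two_repairman = steady_state(transition_matrix(P_FAIL, MU, 2))

print(f"P(system fail), Mu from state 2:   {one_repairman[2]:.6f}")
print(f"P(system fail), 2 Mu from state 2: {two_repairman[2]:.6f}")
```

Under these assumed values, doubling the repair probability out of state 2 roughly halves the steady-state probability of system failure, so the choice of model is worth stating explicitly.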
This model can be solved using discrete time matrix multiplication for the
case where periodic inspection and test is done to detect failures in state 3.
The P matrix is normally:
When the time counter equals the end of the inspection, test and repair
period, the matrix is changed to represent the known probabilities of
failure. The P matrix (PTI matrix) used then is:
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
[Plot: Probability of Failure - State 3 (vertical axis, 0 to 0.000004) versus Operating Time Interval]
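[Diagram: state 0 (System OK), state 1 (Degraded), state 3 (Fail-Energize, System Fail) and state 4 (Fail-Energize); transitions labeled EλD into state 3 and (1−E)λD into state 4; repair rates µO and µP]
Figure G-9. Upgraded Markov Model accounting for Imperfect Periodic Proof Testing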
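\[
P = \begin{bmatrix}
1 - (2\lambda^{D} + 2\lambda^{S}) & 2\lambda^{D} & 2\lambda^{S} & 0 & 0 \\
\mu_{O} & 1 - (\mu_{O} + \lambda^{S} + \lambda^{D}) & \lambda^{S} & E\lambda^{D} & (1 - E)\lambda^{D} \\
\mu_{S} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\]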
When the time counter equals the end of the inspection, test and repair
period, the matrix is changed to represent the known probabilities of
failure. The P matrix (PTI matrix) used then is:
1 0 0 0 0
1 0 0 0 0
1 0 0 0 0
1 0 0 0 0
0 0 0 0 1
This matrix indicates that all failures are detected and repaired except
those in state 4, which remain failed. A plot of the PFD as a
function of operating time interval is shown in Figure G-10.
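The alternation between the normal P matrix and the PTI matrix can be sketched as a simple loop over one-hour time steps. This is an illustration under stated assumptions, not the calculation behind Figure G-10: every numeric value (failure rates, proof test effectiveness E, repair rates, test interval) is hypothetical, and the 1 − µS term in the state 2 row is an assumption added so that each row sums to one.

```python
# All numeric values below are hypothetical, chosen only for illustration.
LAMBDA_D = 1e-6       # dangerous failure rate per hour
LAMBDA_S = 1e-6       # safe failure rate per hour
E = 0.9               # fraction of dangerous failures found by the proof test
MU_O = 1 / 8          # repair rate out of the degraded state, per hour
MU_S = 1 / 24         # repair rate out of state 2, per hour
TEST_INTERVAL = 1000  # hours between periodic inspection, test and repair
TOTAL_HOURS = 3000    # length of the simulated operating time

# Normal one-hour transition matrix for the five-state model.
P = [
    [1 - (2 * LAMBDA_D + 2 * LAMBDA_S), 2 * LAMBDA_D, 2 * LAMBDA_S, 0.0, 0.0],
    [MU_O, 1 - (MU_O + LAMBDA_S + LAMBDA_D), LAMBDA_S,
     E * LAMBDA_D, (1 - E) * LAMBDA_D],
    [MU_S, 0.0, 1 - MU_S, 0.0, 0.0],  # diagonal term is an assumption
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]

# PTI matrix: states 0 to 3 return to state 0; state 4 remains failed.
P_TI = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]


def step(dist, matrix):
    """Advance a state probability row vector by one time step."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def simulate(total_hours, test_interval):
    """Return the state probabilities after each hour of operation."""
    dist = [1.0, 0.0, 0.0, 0.0, 0.0]  # start in state 0 (System OK)
    history = []
    for hour in range(1, total_hours + 1):
        matrix = P_TI if hour % test_interval == 0 else P
        dist = step(dist, matrix)
        history.append(dist)
    return history


history = simulate(TOTAL_HOURS, TEST_INTERVAL)
print(f"State 3 just before the first test: {history[TEST_INTERVAL - 2][3]:.3e}")
print(f"State 3 just after the first test:  {history[TEST_INTERVAL - 1][3]:.3e}")
print(f"State 4 at the end of the run:      {history[-1][4]:.3e}")
```

In this sketch the state 3 probability resets to zero at each test, while the state 4 probability never decreases, which is the effect of imperfect proof testing that the PTI matrix is meant to capture.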
[Plot for Figure G-10: Probability of Failure - State 3 (vertical axis, 0 to 0.0000045) versus Operating Time Interval, with the Test Interval marked]
0 0 0 1
0 0 0 1
0 0 0 1
0 0 0 1
0 0 0 1
[Plot: Probability of Failure - State 3 (vertical axis, 0 to 0.00001) versus Operating Time Interval, with the Test Interval marked and two points annotated "Value goes to 1.0"]
Figure G-11. Probability of Failure in State 3 with Imperfect Testing and Bypassed Testing
Overall, more sophisticated models could be used but the effect of repair is
well approximated when the modeling includes:
These variables are often not included in simplified equations, and this can
clearly result in probabilistic verification calculations that are optimistic,
leading to insufficient safety.