Reliability and Maintainability Assessment of Industrial Systems
Springer Series in Reliability Engineering
Series Editor
Hoang Pham, Department of Industrial and Systems Engineering, Rutgers
University, Piscataway, NJ, USA
More information about this series at https://link.springer.com/bookseries/6917
Mangey Ram · Hoang Pham
Editors

Reliability and Maintainability Assessment of Industrial Systems
Assessment of Advanced Engineering Problems
Editors

Mangey Ram
Department of Mathematics, Computer Science and Engineering, Graphic Era University, Dehradun, Uttarakhand, India
Institute of Advanced Manufacturing Technologies, Peter the Great St. Petersburg Polytechnic University, Saint Petersburg, Russia

Hoang Pham
Department of Industrial and Systems Engineering, Rutgers University, Piscataway, NJ, USA
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
Acknowledgements The editors acknowledge Springer for this opportunity and professional
support. Also, we would like to thank all the chapter authors and reviewers for their availability for
this work.
Editors and Contributors
Prof. Mangey Ram received the Ph.D. degree with major in Mathematics and minor in Computer Science from G. B. Pant University of Agriculture and Technology,
Pantnagar, India. He has been a faculty member for around thirteen years and has
taught several core courses in pure and applied mathematics at undergraduate, post-
graduate, and doctorate levels. He is currently Research Professor at Graphic Era
(Deemed to be University), Dehradun, India, and Visiting Professor at Peter the Great
St. Petersburg Polytechnic University, Saint Petersburg, Russia. Before joining Graphic Era, he was Deputy Manager (Probationary Officer) with Syndicate Bank for a short period. He is Editor-in-Chief of the International Journal of Mathematical,
Engineering and Management Sciences; Journal of Reliability and Statistical Studies;
Journal of Graphic Era University; Series Editor of six book series with Elsevier, CRC Press (Taylor & Francis Group), Walter de Gruyter (Germany) and River Publishers; and Guest Editor and Associate Editor with various journals. He
has published more than 250 publications (journal articles/books/book chapters/conference articles) with IEEE, Taylor & Francis, Springer Nature, Elsevier, Emerald, World Scientific and many other national and international journals and conferences. He has also published more than 50 books (authored/edited) with international publishers such as Elsevier, Springer Nature, CRC Press (Taylor & Francis Group), Walter de Gruyter (Germany) and River Publishers. His fields of research are reliability
theory and applied mathematics. He is Senior Member of the IEEE, Senior Life
Member of Operational Research Society of India, Society for Reliability Engi-
neering, Quality and Operations Management in India, Indian Society of Industrial
and Applied Mathematics. He has been a member of the organizing committee of a
number of international and national conferences, seminars, and workshops. He has
been conferred with “Young Scientist Award” by the Uttarakhand State Council for
Science and Technology, Dehradun, in 2009. He has been awarded the “Best Faculty
Award” in 2011; “Research Excellence Award” in 2015; “Outstanding Researcher
Award” in 2018 for his significant contributions to academics and research at Graphic Era Deemed to be University, Dehradun, India. Recently, he received the
“Excellence in Research of the Year-2021 Award” by the Honorable Chief Minister
of Uttarakhand State, India.
Contributors

Wenbin Zeng Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun, China
Dynamic Availability Analysis for the Flexible Manufacturing System Based on a Two-Step Stochastic Model
Abstract This chapter proposes a dynamic availability analysis approach for the flexible manufacturing system (FMS) under a stochastic environment, in which machine failures and the starvation or blockage of the production process occur randomly. Accurate knowledge of the availability of the FMS, which changes dynamically over time in a stochastic circumstance, can greatly benefit the improvement or re-design of the system. A two-step stochastic model is proposed that equivalently integrates the intermediate buffers into their associated workstations in terms of the relationships between upstream and downstream production rates. Calculation procedures for the relevant dynamic availability are established using the Lz-transform method, which overcomes the state-explosion problem common in FMS performance analysis. Meanwhile, the impacts of the intermediate buffers on the dynamic availability of the FMS are also revealed, which helps determine the appropriate buffer volumes to satisfy various FMS production demands. A numerical example is presented to illustrate the effectiveness and rationality of the approach.
W. Zeng
Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130022, China
e-mail: zengwb15@yeah.net
G. Shen
School of Mechanical and Aerospace Engineering, Jilin University, Changchun 130022, China
e-mail: shengx@jlu.edu.cn
I. Frenkel (B) · L. Khvatskin · A. Lisnianski
Center for Reliability and Risk Management, Shamoon College of Engineering, 84100 Beer
Sheva, Israel
e-mail: iliaf@frenkel-online.com
A. Lisnianski
e-mail: lisnianski@bezeqint.net
I. Bolvashenkov · J. Kammermann · H.-G. Herzog
Institute of Energy Conversion Technology, Technical University of Munich (TUM), 80333
Munich, Germany
e-mail: igor.bolvashenkov@tum.de
J. Kammermann
e-mail: joerg.kammermann@tum.de
H.-G. Herzog
e-mail: hg.herzog@tum.de
1 Introduction
Singholi et al. [33] evaluated the effects of varying levels of machine and routing flexibility on the makespan, average waiting time (AWT), and average utilization of an FMS. Dosdogru et al. [6] utilized an integrated genetic algorithm and Monte Carlo method to solve the stochastic flexible job shop scheduling problem and to measure the impact of routing flexibility on shop performance. Jain and Raj [13] applied three different approaches to analyze the intensity of performance variables in an organization and proposed an FMS performance index to capture the intensity of the factors that affect the FMS. Gyulai et al. [10] proposed a simulation-based optimization method that utilizes lower-level shop floor data to calculate robust production plans for the final assembly lines of a flexible, multi-stage production system. Rybicka
et al. [26] demonstrated how discrete event simulation (DES) can address complexity
in an FMS to optimize the production line performance. However, the reliability of
machines has a considerable impact on the practical performance of manufacturing
systems. The disturbances caused by these breakdowns lead to scheduling problems, which decrease the productivity of the entire manufacturing process. This issue makes a strong case for considering machine reliability in the performance evaluation of the FMS, especially in light of the increasing complexity of such systems in recent years. Consequently, many researchers have realized the importance of the reliability features of the FMS, and much work has been contributed.
Koulamas [15] developed a semi-Markov model to study the effects of tool failures on the performance of an FMC. Das et al. [5] proposed an approach that provides flexible routing, ensuring high overall performance of the cellular manufacturing system (CMS) by minimizing the impact of machine failure through the provision of alternative process routes. Elleuch et al. [8] proposed a Markov-based model for reducing the severity of breakdowns and improving the performance of CMSs with unreliable machines. Loganathan et al. [23] suggested a methodology for the availability evaluation of manufacturing systems using a semi-Markov model, which considers variable failure and repair rates. Zeng et al.
[38] developed a multi-state transition model for CNC machine tools that includes the setup of production, and introduced the Lz-transform method for the productivity and availability evaluation of the multi-state FMS. Savsar [28] developed stochastic models to determine the performance of an FMC under random operational conditions, including random failures of cell components in addition to random processing times, random machine loading and unloading times, and random pallet transfer times. Moreover, Savsar [30] developed a stochastic multi-state model to analyze the performance measures of a cell allowed to operate in degraded mode; the results can be useful for practitioners analyzing the performance of an FMC at the design or operational stage. Chen et al. [4] proposed a process performance evaluation chart to examine the manufacturing performance of bearing connectors, which can also be used for multiple evaluations of manufacturing performance in other manufacturing industries. Manocher and Hamid [25] developed a discrete-event simulation model to investigate the effectiveness of a reliability plan focusing on the most critical machines in a manufacturing cell. More studies on the performance analysis of manufacturing systems can be found in the papers of Anupma and Jayswal [2], Savsar [29], Li and Huang [18], and Alhourani [1].
Despite the studies above, only a few works include the influence of intermediate buffers on the performance measures of an FMS. Tan and Gershwin [34] proposed a tool for the performance evaluation of general Markovian continuous-material-flow production systems with two processing stages and one intermediate buffer. Liu et al. [22] modeled and analyzed the throughput of a two-stage manufacturing system with multiple independent unreliable machines at each stage and one finite-sized buffer between the stages. Duan et al. [7] suggested a reliability modeling and evaluation methodology for a repairable, non-series, multi-state FMS with finite buffers, using an improved vector universal generating function, to satisfy market demands on the capacity and capability of the system. However, these works calculate only steady-state FMS performance metrics, such as steady-state availability, theoretical production rates, utilization rates, efficient production rates and production loss. Therefore, a two-step stochastic model is presented in the current paper as an attempt to evaluate the dynamic performance of the FMS, especially its availability, and to meet various production demands through the selection of appropriate buffer volumes.
The remainder of the article is structured as follows. The description of the FMS and the corresponding hypotheses are given in the section “Problem Description”. The section “Two-Step Stochastic Model” then introduces the procedure for establishing the required model in detail. Subsequently, a “Numerical Example” demonstrates the effectiveness of the proposed approach, and remarks as well as future work are discussed in the “Conclusion”.
2 Problem Description
Fig. 1 Schematic of the FMS consisting of multiple FMCs and corresponding buffers: FMC is the abbreviation of the flexible manufacturing cell; M, R and B represent the processing machines, robots and intermediate buffers, respectively
Each FMC contains identical processing machines and a robot for automatic loading and unloading. Note that the machines process the same work independently, while each connects in series with the robot. The buffers are situated between every two FMCs.
Take the $i$th FMC, $i = 1, \ldots, m$, as an example to illustrate the multi-state characteristics of the devices. Machine $M_j^{(i)}$, $j = 1, \ldots, H$, is one machine of the $i$th FMC that has $K_j$ performance rates, represented by the vector $g_j^{(i)} = \left\{ g_{j1}^{(i)}, \ldots, g_{jK_j}^{(i)} \right\}$, where $g_{j1}^{(i)}$ and $r_0^{(i)}$ denote the worst issue (total failure) for the machines and the robot, while $g_{jK_j}^{(i)}$ and $r_1^{(i)}$ denote the best.

3 Two-Step Stochastic Model
Duan et al. [7] studied the relationships between the buffers and the productivities of the upstream and downstream stations in order to establish the steady-state transition equation and obtain the probability of blanks remaining in the buffer. However, the productivities of the workstations were set as constants, so the aging and performance degradation of the machines, which evidently influence the productivity of the system, are excluded from that equation. Consequently, the current paper attempts to bring such factors into the exploration of the dynamic performance of the FMS, summarized as a two-step stochastic model. The specific procedures are as follows.
The FMS steady-state performance models were established by Duan et al. [7]. That model cannot be directly used for systems with stochastic characteristics; however, its logic and ideas for analyzing the problem are still worth adopting. Zhou and Liu [39] proposed a method by which the FMS (shown in Fig. 1) can be decomposed into (m − 1) processing domains, each with two FMCs and one buffer, as shown in Fig. 2.
Suppose the volume of buffer $B^{(i)}$ is $b_i$, the production rate of the upstream $FMC^{(i)}$ is $PR^{(i)}(t)$, and that of the downstream $FMC^{(i+1)}$ is $PR^{(i+1)}(t)$. Then, in a very short period, at most one blank enters or leaves the buffer, and the intensities with which workpieces enter and leave the buffer area are determined by the productivities of the upstream and downstream stations. Therefore, the change process of the number of blanks in the buffer is shown in Fig. 3.

The change process of the blanks in buffer $B^{(i)}$ can be regarded as a nonlinear birth–death process, i.e., a discrete-state continuous-time Markov process [37]; the corresponding state transition intensity matrix $a^{(i)}$ is
$$a^{(i)} = \begin{bmatrix} -PR^{(i)}(t) & PR^{(i)}(t) & 0 & \cdots & 0 \\ PR^{(i+1)}(t) & -\left( PR^{(i)}(t) + PR^{(i+1)}(t) \right) & PR^{(i)}(t) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & PR^{(i+1)}(t) & -\left( PR^{(i)}(t) + PR^{(i+1)}(t) \right) & PR^{(i)}(t) \\ 0 & \cdots & 0 & PR^{(i+1)}(t) & -PR^{(i+1)}(t) \end{bmatrix}$$
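For concreteness, a minimal Python sketch of this generator matrix follows; it is my own illustration rather than code from the chapter, the function name and the sample rates are assumptions, and the time-dependent production rates are frozen at their values for a single instant $t$.

```python
# Sketch: generator matrix a^(i) of the buffer birth-death process. States
# 0..b count the blanks in buffer B^(i); the upstream rate pr_up fills the
# buffer (k -> k+1) and the downstream rate pr_down empties it (k -> k-1).
import numpy as np

def buffer_generator(pr_up: float, pr_down: float, b: int) -> np.ndarray:
    n = b + 1                      # states 0, 1, ..., b blanks
    a = np.zeros((n, n))
    for k in range(n):
        if k < b:
            a[k, k + 1] = pr_up    # a blank enters the buffer
        if k > 0:
            a[k, k - 1] = pr_down  # a blank leaves the buffer
        a[k, k] = -a[k].sum()      # generator rows sum to zero
    return a

print(buffer_generator(pr_up=12.0, pr_down=10.0, b=3))  # illustrative rates
```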
Moreover, Zhou and Liu [39] explained that the initial probability distribution has no effect on the dynamic probability of the above process; the dynamic probability of the number of blanks in $B^{(i)}$ at time $t$ can then be calculated. Subsequently, considering the influence of the front and rear buffers, the FMC has the following five independent states and equivalent probabilities:
(1) The equivalent probability that $FMC^{(i)}$ operates normally is $\bar{p}^{B}_{(i-1)0}(t)\, P_V^{(i)}(t)\, \bar{p}^{B}_{ib_i}(t)$;
(2) The equivalent probability that $FMC^{(i)}$ lacks blanks (input starved) while the output is normal is $p^{B}_{(i-1)0}(t)\, P_V^{(i)}(t)\, \bar{p}^{B}_{ib_i}(t)$;
(3) The equivalent probability that the input of $FMC^{(i)}$ is normal while the output is blocked is $\bar{p}^{B}_{(i-1)0}(t)\, P_V^{(i)}(t)\, p^{B}_{ib_i}(t)$;
(4) The equivalent probability that the input of $FMC^{(i)}$ is empty while the output is blocked is $p^{B}_{(i-1)0}(t)\, P_V^{(i)}(t)\, p^{B}_{ib_i}(t)$;
(5) The equivalent probability of equipment failure of $FMC^{(i)}$ is $\tilde{P}_V^{(i)}(t)$,

where $P_V^{(i)}(t)$ is the dynamic probability of normal operation of $FMC^{(i)}$ when the impact of the buffers is excluded; $\bar{p}^{B}_{(i-1)0}(t)$ and $p^{B}_{(i-1)0}(t)$ are the dynamic probabilities of non-starvation and starvation of $B^{(i-1)}$, respectively; and $\bar{p}^{B}_{ib_i}(t)$ and $p^{B}_{ib_i}(t)$ are the dynamic probabilities of non-blockage and blockage of $B^{(i)}$, respectively.
Therefore, the equivalent FMS can be established on the basis of these probabilistic relationships; the result is shown in Fig. 4.

According to Fig. 4 and the above analysis, it is necessary to determine the dynamic productivity of each FMC and the equivalent system performance model in order to complete the dynamic performance analysis. Hence, the Lz-transform method is introduced for this task, and the specific analysis procedure is as follows.
Machine $M_j^{(i)}$, $j = 1, \ldots, H$, in the $i$th FMC has $K_j$ states, where the transition intensities between states may be functions of time $t$, represented by a transition intensity matrix $a_j = \left( a_{juv}(t) \right)$, $j = 1, \ldots, H$, $u, v = 1, \ldots, K_j$. The machine can then be regarded as a discrete-state continuous-time (DSCT) Markov process [36] $G_j^{(i)}(t) \in g_j^{(i)} = \left\{ g_{j1}^{(i)}, \ldots, g_{jK_j}^{(i)} \right\}$, which is described by the following triplet notation:

$$G_j^{(i)}(t) = \left\{ g_j^{(i)},\; a_j,\; p_j(0) \right\}, \qquad (1)$$
where $p_j(0)$ is the initial probability distribution of the machine, which generally defines the state $K_j$ with the highest performance level as the initial state, as expressed by (2):

$$p_j(0) = \left[ p_{j1}(0) = \Pr\left\{ G_j^{(i)}(0) = g_{j1}^{(i)} \right\}, \ldots, p_{jK_j}(0) = \Pr\left\{ G_j^{(i)}(0) = g_{jK_j}^{(i)} \right\} \right] = [0, \ldots, 0, 1] \qquad (2)$$

The state probabilities then satisfy the Kolmogorov forward equations:

$$\frac{dp_{ju}(t)}{dt} = \sum_{\substack{v=1 \\ v \neq u}}^{K_j} p_{jv}(t)\, a_{jvu}(t) - p_{ju}(t) \sum_{\substack{v=1 \\ v \neq u}}^{K_j} a_{juv}(t), \quad u = 1, \ldots, K_j \qquad (3)$$
Introducing the row vector $p_j(t) = \left[ p_{j1}(t), p_{j2}(t), \ldots, p_{jK_j}(t) \right]$, Eq. (3) can be rewritten in matrix notation:

$$\frac{d p_j(t)}{dt} = p_j(t)\, a_j \qquad (4)$$
Consequently, the probabilities of the states at moment $t$ are found as the solution of system (4) under the initial conditions (2).
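As an illustration, the sketch below (an assumption of this edit, not the authors' code) integrates system (4) under the initial conditions (2) with SciPy; the helper name and the two-state example rates are hypothetical.

```python
# Sketch: numerically solving dp/dt = p(t) a_j(t) from the best initial state.
import numpy as np
from scipy.integrate import solve_ivp

def solve_state_probabilities(a_of_t, k: int, t_end: float):
    """a_of_t(t) returns the k x k transition-intensity matrix at time t."""
    p0 = np.zeros(k)
    p0[-1] = 1.0                         # Eq. (2): start in best state K_j
    rhs = lambda t, p: p @ a_of_t(t)     # row-vector form of Eq. (4)
    return solve_ivp(rhs, (0.0, t_end), p0, dense_output=True)

# Two-state machine: state 1 = failed (repair rate 0.5/h),
# state 2 = working (failure rate 0.01/h).
a = np.array([[-0.5, 0.5],
              [0.01, -0.01]])
sol = solve_state_probabilities(lambda t: a, k=2, t_end=100.0)
print(sol.sol(100.0))                    # state probabilities at t = 100 h
```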
When the machine state transitions are caused by failure and repair events, the corresponding transition intensities are expressed by the machine's failure and repair rates [20, 21]. Without loss of generality, the stochastic process of the machine contains minor and major failures and repairs. The state-transition process, with the corresponding state performance rates of the machine, is presented in Fig. 5.
Fig. 5 State-transition process for a repairable machine with minor and major failures and repairs
Once the dynamic probabilities of the machine's states are obtained, the Lz-transform of the discrete-state continuous-time (DSCT) Markov process (1) can be defined as:

$$L_z\left\{ G_j^{(i)}(t) \right\} = \sum_{u=1}^{K_j} p_{ju}(t) \cdot z^{g_{ju}^{(i)}} \qquad (5)$$
where $G_j^{(i)}(t)$ is the stochastic output performance process of the single machine, $p_{ju}(t)$ is the probability that the process is in state $u$ at instant $t \ge 0$ for the given initial state probability distribution $p_j(0)$, $g_{ju}^{(i)}$ is the performance in state $u$, and $z$ is, in the general case, a complex variable. The definition and proof of the Lz-transform can be found in [19–21].
The Lz-transform of all processing machines operating in parallel can be obtained by using the universal generating operator (UGO) $f_{par}$ [20, 21]:

$$L_z\left\{ G_M^{(i)}(t) \right\} = f_{par}\left( L_z\left\{ G_1^{(i)}(t) \right\}, \ldots, L_z\left\{ G_H^{(i)}(t) \right\} \right) = \sum_{U=1}^{K_M} P_{MU}(t) \cdot z^{g_{MU}^{(i)}} \qquad (6)$$
where $G_M^{(i)}(t)$ is the stochastic output performance process of the parallel-operated processing machines, $K_M$ is the number of possible states of the process, $P_{MU}(t)$ is the probability that the process is in state $U$, $U = 1, \ldots, K_M$, at instant $t \ge 0$, and $g_{MU}^{(i)}$ is the performance in state $U$.
The Lz-transform of the whole $i$th FMC is obtained by using the serial universal generating operator (UGO) $f_{ser}$, since the group of processing machines and the robot work in series:

$$L_z\left\{ G^{(i)}(t) \right\} = f_{ser}\left( L_z\left\{ G_M^{(i)}(t) \right\}, L_z\left\{ G_R^{(i)}(t) \right\} \right) = \sum_{V=1}^{K^{(i)}} P_V^{(i)}(t) \cdot z^{g_V^{(i)}} \qquad (7)$$
where $G^{(i)}(t)$ is the stochastic output performance process of the whole $FMC^{(i)}$, $G_R^{(i)}(t)$ is the output performance process of the robot of the FMC, $K^{(i)}$ is the number of possible states of the process, $P_V^{(i)}(t)$ is the probability that the process is in state $V$, $V = 1, \ldots, K^{(i)}$, at instant $t \ge 0$, and $g_V^{(i)}$ is the performance in state $V$.
The dynamic production rate of the $i$th FMC at instant $t \ge 0$ can be calculated by Eq. (8):

$$PR^{(i)}(t) = \sum_{V=1}^{K^{(i)}} g_V^{(i)} \cdot P_V^{(i)}(t), \quad i = 1, \ldots, m \qquad (8)$$
Therefore, the $PR^{(i)}(t)$ results are used to calculate the dynamic probability of the number of blanks in $B^{(i)}$. Then the equivalent probabilities of the FMCs and the equivalent FMS shown in Fig. 4 are obtained.
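The operators $f_{par}$ and $f_{ser}$ in Eqs. (6)–(8) can be mimicked at a fixed instant $t$ with a simple dictionary representation of the z-polynomial, as in the sketch below; this is my own illustration, and the probabilities in it are placeholders rather than data from the chapter. Parallel machines add their performances, while the series connection with the robot takes the minimum (the bottleneck).

```python
# Sketch: an Lz-transform at a fixed instant t as {performance g: probability p},
# i.e. the coefficients of sum_u p_u(t) * z**g_u.
from itertools import product
from collections import defaultdict

def compose(lz_a, lz_b, op):
    out = defaultdict(float)
    for (ga, pa), (gb, pb) in product(lz_a.items(), lz_b.items()):
        out[op(ga, gb)] += pa * pb       # independent processes multiply
    return dict(out)

f_par = lambda a, b: compose(a, b, lambda x, y: x + y)  # Eq. (6): capacities add
f_ser = lambda a, b: compose(a, b, min)                 # Eq. (7): series bottleneck

machine = {6.0: 0.90, 3.0: 0.07, 0.0: 0.03}  # placeholder p_u(t) values
robot = {30.0: 0.95, 0.0: 0.05}
fmc = f_ser(f_par(machine, machine), robot)  # two parallel machines + robot

pr = sum(g * p for g, p in fmc.items())      # Eq. (8): dynamic production rate
print(fmc, pr)
```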
Step 1 incorporates the influence of the intermediate buffers on the FMS into the FMCs to generate equivalent FMCs that are neither starved nor blocked; an equivalent FMS that is neither starved nor blocked is thereby also generated. Moreover, owing to the intermediate buffers, the rigid connection of the production system is transformed into a flexible connection [31]. When a station in a flexibly connected production line is shut down, the upstream and downstream stations can continue processing because the buffers store and supply blanks, which prevents the production line from being forced to shut down due to a station shutdown. From the perspective of reliability, the production line is transformed from a single-circuit series structure into a parallel redundant structure [32]. Therefore, the dynamic availability $A_{FMS}(t)$ of the equivalent FMS is:
$$A_{FMS}(t) = 1 - \tilde{A}_{FMS}(t) = 1 - \prod_{i=1}^{m} \left( 1 - \bar{A}^{(i)}_{FMC}(t) \right) \qquad (9)$$
where $\tilde{A}_{FMS}(t)$ is the unavailability of the equivalent FMS and $\bar{A}^{(i)}_{FMC}(t)$ is the availability of the equivalent FMC, with $\bar{A}^{(i)}_{FMC}(t) = \bar{p}^{B}_{(i-1)0}(t)\, A^{(i)}_{FMC}(t)\, \bar{p}^{B}_{ib_i}(t)$, where $A^{(i)}_{FMC}(t)$ is the availability of $FMC^{(i)}$ excluding the impact of the buffers. An FMC is required to process blanks at no less than the production demand; otherwise it should be regarded as unavailable (Eq. 10), where $d$ is the production demand, which can be constant or stochastic. Thus, the dynamic performance measurements of the FMS can be calculated based on Eqs. (9) and (10); the specific calculation indices and approach are presented within the numerical example.
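A minimal numerical sketch of Eq. (9) at a single instant follows; the availability values are placeholders, not results from the chapter's example.

```python
# Sketch: Eq. (9) -- the equivalent FMS behaves as a parallel redundant
# structure of the equivalent FMCs, so A_FMS = 1 - prod(1 - A_FMC_i).
import numpy as np

def fms_availability(a_fmc: np.ndarray) -> float:
    return 1.0 - np.prod(1.0 - a_fmc)

print(fms_availability(np.array([0.98, 0.95, 0.97, 0.96, 0.99])))
```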
4 Numerical Example
The failure processes of the machines follow Weibull distributions, while the repair times of the machines obey exponential distributions. The failure and repair times of the robots both follow exponential distributions. Note that the loading and unloading times of the robots are neglected compared to the processing times.

The parameters of each device are given in Table 1. The shape parameters $\beta^{(i)}$ and scale parameters $\alpha^{(i)}$ describe the Weibull processes of the machines' failures. The reciprocal of the mean time to repair (MTTR$^{-1}$) and the mean time to failure (MTTF) are given to describe the exponential distributions. In addition, two constant production demands, $d_1 = 12$ units/h and $d_2 = 24$ units/h, are included to further depict the impacts of the buffers on the FMS.
Therefore, taking constraint $d_1$ as the detailed example, the first step in evaluating the dynamic performance metrics of the FMS is to capture the dynamic production rates of the FMCs. Taking $FMC^{(2)}$ as an example, the differential system for the state probabilities of machine $M_j^{(2)}$, $j = 1, 2, 3, 4$, is obtained based on Eq. (3) and Fig. 7a. For simplicity, the index indicating the FMC is dropped in the following calculation:
$$\begin{cases} \dfrac{dp_1(t)}{dt} = p_3(t)\,\lambda_{3,1}(t) + p_2(t)\,\lambda_{2,1}(t) - p_1(t)\,\mu_{1,3}(t) \\[6pt] \dfrac{dp_2(t)}{dt} = p_3(t)\,\lambda_{3,2}(t) - p_2(t)\left( \lambda_{2,1}(t) + \mu_{2,3}(t) \right) \\[6pt] \dfrac{dp_3(t)}{dt} = p_1(t)\,\mu_{1,3}(t) + p_2(t)\,\mu_{2,3}(t) - p_3(t)\left( \lambda_{3,1}(t) + \lambda_{3,2}(t) \right) \end{cases} \qquad (11)$$
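For illustration, the sketch below (not the authors' code) integrates system (11) for a machine of $FMC^{(2)}$ with the Table 1 parameters, assuming that the Weibull pair $(\alpha, \beta)$ enters through the hazard function $\lambda(t) = (\beta/\alpha)(t/\alpha)^{\beta-1}$; that functional form is my assumption about how the shape and scale parameters are used.

```python
# Sketch: solving system (11) with time-varying Weibull failure intensities.
from scipy.integrate import solve_ivp

def hz(alpha, beta):                      # assumed Weibull hazard function
    return lambda t: (beta / alpha) * (t / alpha) ** (beta - 1) if t > 0 else 0.0

l32, l31, l21 = hz(1800, 1.8), hz(2300, 1.9), hz(1200, 1.6)  # M(2), Table 1
mu23, mu13 = 30.0, 60.0                   # constant repair rates (1/h)

def rhs(t, p):
    p1, p2, p3 = p
    return [p3 * l31(t) + p2 * l21(t) - p1 * mu13,
            p3 * l32(t) - p2 * (l21(t) + mu23),
            p1 * mu13 + p2 * mu23 - p3 * (l31(t) + l32(t))]

sol = solve_ivp(rhs, (0.0, 2000.0), [0.0, 0.0, 1.0], dense_output=True)
print(sol.sol(2000.0))                    # machine state probabilities at 2000 h
```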
The Lz-transform of the whole $FMC^{(2)}$ is obtained through Eqs. (6) and (7), and the dynamic production rate of $FMC^{(2)}$ is calculated through Eq. (8). The result for $FMC^{(2)}$ is depicted in Fig. 8. By similar procedures, the productivities of the remaining FMCs can be evaluated, as shown in Fig. 9.
Having determined the dynamic productivities of the FMCs, the next task is to calculate the dynamic models of the equivalent FMCs. Based on the state transition intensity matrix $a^{(i)}$ and the Chapman–Kolmogorov differential equations, the first state probability of the equivalent FMCs, i.e., the equivalent availability of $FMC^{(1)}$, is:
Table 1 Parameters of each device

| Device | Failure rates $(\alpha^{(i)}, \beta^{(i)})$: $\lambda_{3,2}(t)$, $\lambda_{3,1}(t)$, $\lambda_{2,1}(t)$ | Repair rates (MTTR$^{-1}$/h): $\mu_{2,3}$, $\mu_{1,3}$ | Performance (units/h): $g_3$, $g_2$ | Device | Failure rate (MTTF/h): $\lambda_{1,0}$ | Repair rate (MTTR$^{-1}$/h): $\mu_{0,1}$ | Performance (units/h): $g_1$ |
|---|---|---|---|---|---|---|---|
| M(1) | [3200, 2.2], [3500, 2.3], [2100, 1.9] | 12, 24 | 12, 6 | R(1) | 2500 | 10 | 30 |
| M(2) | [1800, 1.8], [2300, 1.9], [1200, 1.6] | 30, 60 | 6, 3 | R(2) | 2200 | 20 | 30 |
| M(3) | [2500, 2.7], [3100, 2.4], [1500, 1.8] | 25, 50 | 8, 4 | R(3) | 2800 | 15 | 30 |
| M(4) | [2800, 2.4], [3400, 2.0], [1400, 1.5] | 15, 30 | 12, 6 | R(4) | 3500 | 10 | 30 |
| M(5) | [2600, 3.2], [2900, 3.3], [1700, 2.3] | 10, 20 | 24, 12 | R(5) | 4000 | 8 | 30 |
Fig. 9 Dynamic production rates of $FMC^{(1)}$, $FMC^{(3)}$, $FMC^{(4)}$ and $FMC^{(5)}$ under constraint $d_1$
$$\bar{A}^{(1)}_{FMC}(t) = A^{(1)}_{FMC}(t)\, \bar{p}^{B}_{1 b_1}(t) \qquad (14)$$
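In code form (a sketch with placeholder values, not chapter data), the equivalent availability is the buffer-free FMC availability discounted by the upstream non-starvation and downstream non-blockage probabilities; for $FMC^{(1)}$, which has no upstream buffer, only the blockage term applies, as in Eq. (14).

```python
# Sketch: equivalent FMC availability A-bar_(i) = pbar_up * A_(i) * pbar_down.
def equivalent_availability(a_fmc: float,
                            p_not_starved: float = 1.0,
                            p_not_blocked: float = 1.0) -> float:
    return p_not_starved * a_fmc * p_not_blocked

print(equivalent_availability(0.97, p_not_blocked=0.995))  # Eq. (14) case
```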
The results in the figure show that as $b_1$ increases, the equivalent availability of $FMC^{(1)}$ increases but gradually reaches a plateau. Note that availability is chosen as the criterion for determining the appropriate buffer volume in the current paper; other criteria, such as the system reliability-associated cost (RAC), under which the optimal buffer volume could be determined, will be discussed in further research. Therefore, $b_1 = 40$ is selected as an appropriate volume for buffer $B^{(1)}$.
Consequently, the changes in the equivalent availability of $FMC^{(2)}$ with the set volume $b_1$ and a varying $b_2$ can be obtained by the same procedure, as shown in Fig. 11. Thus, an appropriate volume for buffer $B^{(2)}$ is confirmed as $b_2 = 30$. Similarly, the equivalent availabilities of $FMC^{(3)}$, $FMC^{(4)}$ and $FMC^{(5)}$ with varying volumes $b_3$ and $b_4$ are depicted in Fig. 12. In this way, a group of appropriate volumes for the FMS intermediate buffers is obtained, as shown in Table 2.
In accordance with the aforementioned analysis results and Eq. (9), the dynamic availability of the FMS under production demands $d_1$ and $d_2$ is plotted in Fig. 13. It can be concluded from Fig. 13 that the buffer volumes in Table 2 ensure that the FMS possesses high availability under $d_1$ but performs poorly under $d_2$, which shows that the production capacity requirements have a significant impact on FMS availability. The approach proposed in the current paper not only quantifies this impact but, more importantly, provides a new solution for evaluating and improving the dynamic availability of an FMS during its life cycle. Nevertheless, it is inadequate to determine the optimal buffer combination purely from the perspective of reliability, because production cost is an inevitable topic in the design, operation and optimization of an FMS.
Fig. 11 The equivalent availability of $FMC^{(2)}$ with the volume $b_1 = 40$ and varying $b_2$

Fig. 12 The equivalent availability of $FMC^{(3)}$, $FMC^{(4)}$ and $FMC^{(5)}$ with the changes of volumes $b_3$ and $b_4$, respectively
Therefore, incorporating cost factors into the proposed model to obtain a truly optimal buffer combination will be an interesting topic for future research.
5 Conclusion
References
1. Alhourani F (2016) Cellular manufacturing system design considering machines reliability and
parts alternative process routings. Int J Prod Res 54(3):846–863
2. Anupma Y, Jayswal SC (2019) Evaluation of batching and layout on the performance of flexible
manufacturing system. Int J Adv Manuf Technol 101:1435–1449
3. Browne J, Dubois D, Rathmill K et al (1984) Classification of flexible manufacturing systems.
FMS Mag 114–117
4. Chen KS, Yu CM, Hus TH et al (2019) A model for evaluating the performance of the bearing
manufacturing process. Appl Sci 9(15):3105
5. Das K, Lashkari R, Sengupta S (2007) Reliability consideration in the design and analysis of
cellular manufacturing systems. Int J Prod Econ 105(1):243–262
6. Dosdogru AT, Gocken M, Geyik F (2015) Integration of genetic algorithm and Monte Carlo to
analyze the effect of routing flexibility. Int J Adv Manuf Technol 81:1379–1389
7. Duan JG, Xie N, Li LH (2019) Modelling and evaluation of multi-state reliability of repairable
non-series manufacturing system with finite buffers. Adv Mech Eng 11(6):1–13
8. Elleuch M, Bacha HB, Masmoudi F et al (2008) Analysis of cellular manufacturing systems
in the presence of machine breakdowns. J Manuf Technol Manag 19(2):235–252
9. Groover MP (2007) Automated assembly system, automation, production systems, and
computer-integrated manufacturing. Prentice Hall Press, Upper Saddle River, NJ
10. Gyulai D, Pfeiffer A, Monostori L (2017) Robust production planning and control for multi-stage systems with flexible final assembly lines. Int J Prod Res 55(13):3657–3673
11. Halse LL, Jæger B (2019) Operationalizing industry 4.0: understanding barriers of industry
4.0 and circular economy. In: Ameri F, Stecke K, von Cieminski G, Kiritsis D (eds) Advances
in production management systems. Towards smart production management systems. APMS
2019. IFIP advances in information and communication technology, vol 567. Springer, Cham
12. He C, Zhang SY, Qiu LM et al (2019) Assembly tolerance design based on skin model shapes
considering processing feature degradation. Appl Sci 9(16):3216
13. Jain V, Raj T (2016) Modeling and analysis of FMS performance variables by ISM, SEM and
GTMA approach. Int J Prod Econ 171(1):84–96
14. Jin R, Liu K (2013) Multimode variation modeling and process monitoring for serial-parallel
multistage manufacturing processes. IIE Trans 45(6):617–629
15. Koulamas CP (1992) A stochastic model for a machining cell with tool failure and tool
replacement considerations. Comput Oper Res 19(8):717–729
16. Kumar N, Kumar J (2019) Efficiency 4.0 for industry 4.0. Human Technol 15(1):55–78
17. Lee J, Bagheri B, Kao HA (2015) A cyber-physical systems architecture for industry 4.0 based
manufacturing systems. Manuf Let 3:18–23
18. Li JS, Huang NJ (2007) Quality evaluation in flexible manufacturing systems: a Markovian approach. Math Probl Eng, Article ID 057128, 24 pages. https://doi.org/10.1155/2007/57128
19. Lisnianski A (2012) Lz-transform for a discrete-state continuous-time Markov process and its
application to multi-state system reliability. In: Lisnianski A, Frenkel I (eds) Recent advances
in system reliability. Springer-Verlag, London, pp 79–95
20. Lisnianski A, Frenkel I, Ding Y (2010) Multi-state system reliability analysis and optimization
for engineers and industrial managers. Springer, London
21. Lisnianski A, Frenkel I, Khvatskin L (2021) Modern dynamic reliability analysis for multi-state
systems. Springer series in reliability engineering. Springer, Cham
22. Liu JL, Yang S, Wu AG et al (2012) Multi-state throughput analysis of a two-stage manu-
facturing system with parallel unreliable machines and a finite buffer. Eur J Oper Res
219(2):296–304
23. Loganathan MK, Girish K, Gandhi OP (2016) Availability evaluation of manufacturing systems
using Semi-Markov model. Int J Comput Integ M 29(7):720–735
24. MacDougall W (2014) Industry 4.0: smart manufacturing for the future. Berlin, Germany,
GTAI
25. Manocher D, Hamid S (2019) Analysis of critical machine reliability in manufacturing cells.
J Ind Eng Manag 12(1):70–82
26. Rybicka J, Tiwari A, Enticott S (2016) Testing a flexible manufacturing system facility produc-
tion capacity through discrete event simulation: automotive case study. Int J Mech Aerosp Ind
Mechatron Manuf Eng 10(4):668–672
27. Sanghavi D, Parikh S, Raj SA (2019) Industry 4.0: tools and implementation. Manag Prod Eng
Rev 10(3):3–13
28. Savsar M (2000) Reliability analysis of a flexible manufacturing cell. Reliab Eng Syst Saf
67(2):147–152
29. Savsar M (2004) Performance analysis of an FMS operating under different failure rates and
maintenance policies. Int J Flex Manuf Sys 16:229–249
30. Savsar M (2011) Multi-state reliability modeling of a manufacturing cell. Int J Perform Eng
7(3):217–228
31. Shu S (1992) An analysis of the repairable computer integrated manufacturing system (CIMS)
with buffers and a study of the system reliability. Acta Automatica Sinica 18(1):15–22
32. Shu S, Zhang Y (1995) Reliability analysis of series production lines. Control Theory Appl
12(2):177–182
33. Singholi A, Ali M, Sharma C (2013) Evaluating the effect of machine and routing flexibility
on flexible manufacturing system performance. Int J Serv Oper Manag 16(2):240–261
34. Tan B, Gershwin SB (2009) Analysis of a general Markovian two-stage continuous-flow
production system with a finite buffer. Int J Prod Econ 120(2):327–339
35. Thames L, Schaefer D (2016) Software-defined cloud manufacturing for industry 4.0. Procedia
CIRP 52:12–17
36. Trivedi K (2019) Probability and statistics with reliability, queuing and computer science
applications. Wiley, New York
37. Zeng W, Chen P (2008) Volatility smile, relative deviation and trading strategies: a general
diffusion model for stock price movements based on nonlinear birth-death process. China
Econ Quart 7(4):1415–1436
38. Zeng W, Shen G, Chen B et al (2019) Lz-transform method and Markov reward approach for
flexible manufacturing system performance evaluation. Appl Sci 9(19):4153
39. Zhou J, Liu Z (2006) Relationship between machine utilization and buffer capacity. Tool Eng
40(9):24–26
Integrated Reliability and Risk Assessments of Nuclear Facilities
Abstract Reliability is the probability that a system will perform its intended func-
tion satisfactorily during its lifetime under specified environmental and operating
conditions. Risk can be measured by assessing the probability of an undesired event
(e.g., a system failure) and the magnitude of its consequences. Therefore, risk and
reliability are complementary variables. Licensing of nuclear facilities requires safety
and risk assessments. Probabilistic risk assessment implies carrying out a quantitative
assessment of the reliability of items important to safety (IISs). In addition, reliability
assessments are required during all lifetime phases for optimizing plant performance,
maintainability and safety. Thus, the outcomes of reliability and risk assessments are
interchangeable and complementary. This chapter proposes a framework for inte-
grating reliability and risk assessments of IISs of nuclear facilities using tools of
reliability engineering, such as Markov models, fault tree analysis, event tree analysis,
reliability block diagrams and life data analysis. Based on frequency-dose limits to
the public, risk acceptance criteria are also suggested in the scope of the frame-
work. A case study demonstrating the advantages of the integrated approach, applied in the preliminary design phase of a nuclear fuel fabrication plant to meet risk acceptance criteria and licensing requirements, is presented.
1 Introduction
Reliability is the probability that a system will perform its intended function satisfactorily for a specified period of time under operating conditions. Among the several risk concepts used in different scientific areas, risk can be defined as the potential of loss resulting from exposure to hazards. Sometimes, risk is measured by assessing the probability (or frequency) of occurrence of an undesired event (e.g., an accident) and the magnitude of its consequences (severity) [1]. Under this viewpoint, risk and reliability can be considered complementary variables. Thus, even if these analyses are not performed at the same time during the plant lifetime, or by the same team, their outcomes should be interchangeable and complementary. This chapter proposes a framework for integrating reliability and risk assessments applied to the design of items important to safety (IISs) of nuclear facilities (NFs). This kind of integration is a trend in advanced reactor designs within reliability assurance programs (RAPs), which exploit the optimization of performance and safety of such plants using well-established reliability, availability and maintainability (RAM) and probabilistic risk assessment (PRA) tools [2]. The application of this framework to nuclear facilities, which are simpler and less potentially hazardous than nuclear power plants (NPPs), can also bring benefits during all plant life phases, supporting design, operation and licensing processes.
Compared to NPPs, NFs differ in many aspects. These facilities use a wide variety of technologies and processes and, in addition to fissile material and waste, they often handle, process and store large quantities of toxic, corrosive, or combustible chemicals. Furthermore, a common characteristic of these facilities is continuous improvement and innovation, research and development, as well as a greater dependence on human factors, which broaden the range of potential hazards and the possibility of release of hazardous chemical or radioactive materials. Despite that, the methodologies used for reliability and risk assessments of NPPs can be adapted and integrated for use during the whole NF lifetime, in order to optimize the safety and performance of such facilities [3].
An overview of an integrated approach for reliability and risk assessments of NFs
is shown in Fig. 1. Supported by reliability engineering tools and risk acceptance
criteria, the IISs are analyzed in an integrated way throughout the facility lifetime. The
optimization of plant performance and safety is carried out using well-established
techniques of PRA and reliability engineering, such as life data analysis (LDA),
reliability block diagrams (RBD), failure mode and effects analysis (FMEA), event
tree analysis (ETA), fault tree analysis (FTA), Markov models and Monte Carlo
simulation, among others. Special emphasis should be given to aspects of human
reliability analysis (HRA), sensitivity analysis of IISs and uncertainty assessment of
reliability and risk quantitative results.
Nuclear-based electric power depends on heat generation within the reactor core, which is usually an assembly of uranium fuel rods (nuclear fuel), the uranium sometimes being enriched in the naturally present uranium-235 isotope (fissile atom). The
Fig. 1 Overview of integrated approach for reliability and risk assessments of nuclear facilities
heat results from a sustained and carefully controlled “chain reaction”, in which neutrons cause fissions of fissile atoms within the core. The fission process involves the splitting of the fissile atoms into lighter radioactive fission products and the production of further neutrons, sustaining the chain reaction and releasing a considerable amount of energy [4].
Spent nuclear fuel can be reprocessed to extract recyclable energy material, within the so-called “nuclear fuel cycle”. This cycle involves all industrial activities necessary to produce electricity from uranium in NPPs, including “front-end” activities (fuel fabrication), “service period” activities (electricity generation in NPPs) and “back-end” activities (reprocessing, reuse and disposal). The fuel cycle is referred to as a “closed fuel cycle” if the spent fuel is reprocessed; otherwise, it is referred to as an “open fuel cycle”. Nuclear fuel cycle facilities (NFCFs) can therefore include the following industrial operations:
• Processing of uranium ores;
• Uranium conversion and enrichment;
• Nuclear fuel fabrication;
• Reprocessing of spent nuclear fuel;
• Nuclear waste management and disposal;
• Separation of radionuclides from fissile products;
• Research and development to support NFCFs.
NFCFs are installations other than NPPs, research reactors and critical assemblies, which pose potential hazards to workers, the public and the environment [5]. The terms “nuclear fuel cycle facility” and “nuclear facility” and their initials are used interchangeably in this chapter. The variety of processes associated with NFCFs can result in a broad range of potential accidents. Some types of facilities, with low radioactive inventories and low criticality risk, or facilities handling natural uranium, have a low hazard potential. Other NFCFs (e.g., reprocessing facilities) may have a potential risk comparable with NPPs.
The following specific features need to be considered when performing a safety assessment of NFCFs [6]:
• The nature and variety of technologies result in a broad range of hazardous conditions;
• The presence of a wide diversity of radioactive, fissile, toxic, combustible, explosive or reactive materials;
• Radioactive and other hazardous materials are present and handled throughout different parts of the facility;
• The possibility of occurrence of nuclear criticality accidents;
• The need for frequent changes in equipment, processes, and operating procedures can pose additional hazards;
• Operations of NFCFs usually require more operator actions than NPPs, making them more vulnerable to human error;
• Many NFCFs worldwide are obsolete and lack adequate design documentation and information on operating experience, which requires additional efforts and specific approaches to safety assessment.
The loss of the main safety functions of NFCFs may lead to release of radiological
or hazardous materials. The most common safety functions required by NFCFs are
[7]:
• Prevention of criticality of fissile materials;
• Radiological protection;
• Confinement of radioactive or hazardous materials to prevent or mitigate unplanned
releases to the environment;
• Removal of decay heat.
This chapter presents a simplified, or graded, version of the complete probabilistic risk assessment (PRA) traditionally used for NPPs, which is compatible with the complexity and level of hazard presented by NFs. PRA is useful to complement the conventional deterministic safety assessment (DSA), whereby plant safety can be assessed and improved.

PRA provides not only a comprehensive and structured approach for identifying failure scenarios and facility damage but also a means of estimating numerical values for the risks. Likewise, it offers a systematic approach to verify whether the reliability and independence of safety systems are adequate for implementing defence-in-depth provisions and to assess whether the risks are “as low as reasonably achievable” (the ALARA criterion). The current trend in such analyses is to make less conservative assumptions and to consider best-estimate values [6].
Both PRA applications to NPPs and NFCFs are based on common principles.
Nevertheless, there are some significant differences, mainly due to plant design and
processes. Compared to NPPs, NFCFs are usually characterized by a larger diversity
of technologies and processes. Apart from processing radioactive or fissile materials,
Fig. 2 Integration of reliability and risk assessments during facility lifetime and licensing process
In relation to NFs, risk assessment is defined as “an assessment of the radiation risks and other risks associated with normal operation and possible accidents involving facilities and activities. This will normally include consequence assessment, together with some assessment of the probability of those consequences arising.” Throughout this chapter, safety assessment is used in the sense of “an assessment of all aspects of a practice that are relevant to protection and safety”, a scope that encompasses risk assessment [5].
The integration of reliability and risk assessments of items important to safety (IISs) can be carried out at all phases of the facility lifetime. At the beginning of the lifetime, these assessments are almost independent of each other, although they can be done together, at the same time, and by the same team in order to optimize resources. Throughout the facility lifetime, this integration becomes ever more necessary, not only for cost-effectiveness reasons but mainly for improving the safety and quality assurance of the outcomes [2, 8]. Figure 2 shows the integration of reliability and risk assessments during the facility lifetime and licensing process. The arrow width illustrates the need or potential for integration of these assessments, from the initial phases of the lifetime or licensing process to the decommissioning of the facility.
Looking at the details of this integration, a table was built listing the safety and risk assessments typically required in the licensing process of NFs, as well as the reliability, availability and maintainability (RAM) assessments required by design, operation and performance issues. Table 1 shows, for each phase in the lifetime of an NF, examples of typical documents and authorizations required during the licensing process, relating them to safety, risk and RAM assessments.
Before the start of the licensing process itself (pre-licensing), in the feasibility study and preliminary design phase, there is still no formal licensing process, but some decisions are taken about reference facilities, the strategic plan for licensing, and acceptable levels of risk, safety and reliability. At this point, the definition and categorization of a list of IISs, preliminary deterministic and probabilistic safety assessments, a qualitative analysis of hazards and accident scenarios, and a selection of design principles applied to safety are carried out. In parallel, the reliability characteristics and functional requirements of IISs, as well as automation and control philosophies, are defined.
During the siting and site evaluation phase, regulatory bodies usually require licensing documents such as site evaluation reports, environmental impact assessments, and the emergency preparedness and response plan. To support public inquiries, the identification of potential vulnerabilities and areas of importance to risk, the demonstration of facility safety taking into account specific site characteristics, and a long-term safety assessment are carried out. At the same time, the project team proceeds with availability and performance assessments that take into account specific site characteristics.
Both deterministic and probabilistic safety assessments should be completed before initiating the detailed design process, looking for any necessary changes in the basic design philosophy, because future changes could become excessively expensive. These assessments require the technical specification of IISs, as well as definitions regarding maintenance, inspections, tests, requirements, the man-machine interface and human reliability, among others.
As the detailed design evolves into the final design, the importance of an integrated assessment of safety, risk, maintainability and availability increases, not only for economic reasons but also because of quality assurance requirements, which impose high reliability and safety and lower risks. In addition to technical specifications and PRA, the elaboration of safety-related documents such as physical protection and fire protection plans is more effective when performed through integrated activities, using the same tools (e.g., FTA, ETA and RBD). Considerations about the influence of human factors on facility design and operation are also analyzed in this phase.
For granting a construction permit from a regulatory body, a preliminary safety analysis report (PSAR) is one of the required documents. Therefore, the assessment of the potential impact of construction on the neighborhood and the demonstration of compliance of IISs with safety requirements will need life data analysis and supplier assessment, as well as the elaboration of a pre-service inspection strategy. Integration of the safety assessment and the procurement process is carried out to specify reliability standards and purchase the most reliable components available.
Table 1 Examples of typical safety, risk, reliability, availability and maintainability assessments during facility lifetime and licensing process of nuclear facilities (NFs)

| Facility lifetime phase | Safety and risk assessments | RAM assessments | Licensing process |
|---|---|---|---|
| 1. Feasibility study and preliminary design | Definition and categorization of a list of IISs; preliminary deterministic and probabilistic safety assessments | Definition of reliability characteristics of IISs; definition of automation and control philosophies | Definition of reference facilities; acceptable levels of risk, safety and reliability |
| 2. Siting and site evaluation | Vulnerabilities and areas of importance to risk; demonstration of facility safety | Demonstration of facility availability taking into account specific site characteristics | Site approval; site evaluation report; emergency plan |
| 3. Design | Deterministic safety assessment (DSA); probabilistic risk assessment (PRA); specifications for safety functions | RAM assessment of IISs; maintenance, inspection and test requirements; human reliability analysis (HRA) | Technical specifications; fire protection plan; technical design documents |
| 4. Manufacturing and construction | Assessment of potential impact of construction on the neighborhood | Life data analysis; suppliers assessment | Construction permit; preliminary safety analysis report (PSAR) |
| 5. Commissioning | As-built safety assessment; safety assessment of commissioning tests; analysis of risks and opportunities | Assessment of maintenance, in-service inspection and testing strategies; unconformities analysis | Authorization for use of nuclear material; quality assurance program |
| 6. Operation | Data specialization of failure rates of IISs; living PRA; verification of operator fitness for duties related to safety; HRA for facility operation | Data specialization of reliability characteristics of IISs; analyses of facility performance using RBI and RCM | Authorization for operation; as-built final safety analysis report (FSAR); operating procedures |
| 7. Technical modifications and innovations | Safety review of risks and hazards; assessment of safety factors against current standards | Evaluation of reliability characteristics of IISs; evaluation of facility performance | Approval of technical modifications; periodic safety review (PSR) |
| 8. Ageing management | Categorization of IISs for ageing management; prioritization of actions for safety improvement | Understanding of ageing mechanisms; prioritization of corrective actions for RAM improvements | Ageing management plan and reports |
| 9. Decommissioning | Safety assessment of decommissioning; assessment of safety of workers, public and environment | Assessment of progressive shutdown and new configurations of IISs | Authorization for decommissioning; decommissioning plan and reports |
The authorization for use of nuclear material and commissioning requires, among others, the assessment of as-built safety, measures for controlling nuclear materials, and commissioning tests. These assessments complement or require the definition of in-service inspection and testing strategies (e.g., risk-based inspection, RBI) and maintenance strategies (e.g., reliability-centered maintenance, RCM). On the other hand, the definition of a quality assurance program requires an integrated unconformities analysis, the analysis of actions to address risks and opportunities, and the elaboration of training and qualification plans, to meet the current requirements of the ISO 9001 standard.
For granting the authorization for operation from a regulatory body, a final safety analysis report (FSAR), taking into account data specialization of the failure rates of IISs and operating experience, is one of the required documents. Sometimes, the safety assessment is complemented by a living PRA, helping to ascertain the compliance of NF plans and procedures with safety requirements (e.g., related to operation, radiation protection, fire protection, emergency preparedness and waste management). During operation, a continuous assessment of the effectiveness of preventive and corrective actions, the availability of IISs, facility performance, the use of RBI and RCM strategies, and the verification of operator fitness for duty should be integrated and carried out by the same team, if possible, for cost-effectiveness reasons.
During the facility lifetime, many technical modifications and innovations are necessary, which should be approved by a regulatory body, sometimes under periodic safety review plans (PSRs). These technical modifications require a safety review of risks and hazards, the assessment of safety factors against current standards, and the revaluation of facility safety, the reliability characteristics of IISs, and facility performance after modifications. These safety, risk and RAM assessments are complementary and would become expensive, with a high possibility of inconsistencies, if not carried out in an integrated way.
Reliability assessments of NFs are required, among others, by quality assurance (QA)
programs and technical specifications of IISs [2]. Safety and risk assessments of NFs
are licensing requirements [8, 9]. Two complementary methods (deterministic and
probabilistic) are usually required to assess and improve the safety during the lifetime
of these facilities.
There are many reliability characteristics that are important to facility safety and performance assessments. Table 2 summarizes some of these characteristics with their respective definitions. The selected reliability characteristics are probability of failure given time, B(X) life, mean life, reliability given time, availability given time, failure rate, and maintainability. These variables are interrelated, and their definitions are complemented by Eqs. 1–5 presented next [11].

Assuming an exponential life distribution for an item $i$ with a constant failure rate $\lambda_i$, the reliability of item $i$ as a function of time $t$, $R_i(t)$, is given by Eq. 1:

$$R_i(t) = e^{-\lambda_i t} \qquad (1)$$
Table 2 Selected reliability characteristics important to facility safety and performance assessments [11]

| Reliability characteristic | Definition |
|---|---|
| Probability of failure given time | The probability that an item will be failed at a particular point in time. The probability of failure is also known as “unreliability”, the complement of the reliability |
| B(X) life | The estimated time when the probability of failure will reach a specified point (X%) |
| Mean life | The average time that the items in the population are expected to operate before failure, often referred to as “mean time to failure” (MTTF) |
| Reliability given time | The probability that an item will operate successfully at a particular point in time and under specified operational conditions |
| Availability given time | The probability that an item is operational at a given time (i.e., it has not failed or it has been restored after failure) |
| Failure rate | The number of failures per unit time that can be expected to occur for the item |
| Maintainability | The probability that an item is successfully restored after failure; “mean time to repair” (MTTR) is often used to measure maintainability |
$$A_i(t) = \frac{\text{Item uptime}}{\text{Item uptime} + \text{Item downtime}} = \frac{MTTF}{MTTF + MTTR}. \qquad (5)$$
In the case of systems with items in series or parallel configurations, the reliability
characteristics are calculated using concepts of Boolean algebra and probability
theory. For more complex configurations, reliability engineering tools and computer
codes are usually required [11, 12].
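The relations in Table 2 and Eqs. (1) and (5), together with the series/parallel rules just mentioned, can be collected in a short sketch; the numerical values below are illustrative, not taken from the chapter.

```python
# Sketch: exponential-life reliability characteristics and simple
# series/parallel combinations via Boolean-algebra rules.
import numpy as np

lam, mttr = 1e-4, 8.0                  # illustrative failure rate (1/h), MTTR (h)
mttf = 1.0 / lam                       # mean life for the exponential model
R = lambda t: np.exp(-lam * t)         # Eq. (1): reliability given time
F = lambda t: 1.0 - R(t)               # probability of failure (unreliability)
b10 = mttf * np.log(1.0 / 0.9)         # B(10) life: time when F(t) = 10%
A = mttf / (mttf + mttr)               # Eq. (5): availability

r = np.full(3, R(1000.0))              # three identical items
print(r.prod(),                        # series: all must work
      1.0 - (1.0 - r).prod(),          # parallel: at least one works
      b10, A)
```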
There are many reliability engineering tools, usually supported by computer codes, developed to solve reliability-related problems. Some of the most important reliability engineering tools are briefly described next.
According to [5], an item important to safety (IIS) is an item whose failure could lead to radiation exposure of workers or the public; IISs include structures, systems, and components (SSCs) that prevent foreseeable operational events from resulting in accidents.
Table 3 Examples of design principles and criteria applied to reliability, safety and risk [9]
Design focuses Principles and criteria
Reliability Standby and redundancy, physical separation, functional independence,
diversity, safety factors, maintainability, availability
Safety Fail-safe design, double contingency, single failure design, ALARA (as low
as reasonably achievable), defence-in-depth, passive safety, fault tolerance,
inherently safe, licensing requirements
Risk Prevention principle, precautionary principle, protection principle,
radioprotection, limitation of risks to individuals and environment, design
basis accidents (DBAs), risk-based inspections (RBI), IAEA safety principles
Human reliability analysis (HRA) studies and models the interactions between humans and systems to estimate human error probabilities (HEPs), taking into account the work environment, opportunities to recover from errors, and their consequences. The best-known HRA techniques are the technique for human error rate prediction (THERP), the holistic decision tree (HDT), the cognitive reliability and error analysis method (CREAM), the human error assessment and reduction technique (HEART), and the standardized plant analysis risk-human reliability analysis (SPAR-H). Many of these techniques use so-called performance-shaping factors (PSFs) to help predict HEPs in specific work situations [15].
PSFs are defined as variables that may affect human performance in an HRA and
are used for adjusting basic human error probabilities (HEPs) assumed as nominal
under certain conditions. PSFs usually adopted in HRA methods include available
time, stress, complexity of systems and tasks, training, worker experience, opera-
tional procedures, human factors, and workload, among others.
Some HRA methods, such as SPAR-H, categorize human tasks into two basic types: action or diagnosis. Routine operation, starting equipment, calibration or maintenance, and other activities of following procedures are examples of action tasks. Diagnosis tasks, on the other hand, depend on knowledge and experience for analyzing current facility conditions, planning and prioritizing activities, and making decisions about alternative actions. SPAR-H assigns a figure of 10^{-2} to the nominal HEP for diagnosis tasks, which is ten times the nominal HEP for action tasks (10^{-3}).
Some PSF levels increase the HEP (negative influence), while other PSF levels decrease it (positive influence). Figure 3 illustrates this relationship for diagnosis and action tasks, including the 5th and 95th percentiles, representing the uncertainties involved in the estimates of HEPs. A lower bound of 10^{-5} is suggested [15].
HEP = NHEP \times \prod_{i=1}^{8} S_i, \quad (6)
where NHEP is the nominal HEP (10^{-2} for diagnosis and 10^{-3} for action), and S_i is the multiplier associated with the corresponding PSF_i level (Table 4). As an example, in an accident situation, when a diagnosis task (NHEP = 10^{-2}) of high complexity (S_3 = 5) is carried out under extreme stress (S_2 = 5), considering the other six levels as nominal (S_1 = S_4 = S_5 = S_6 = S_7 = S_8 = 1), the HEP value, according to Eq. 7, would be:
HEP = NHEP \times \prod_{i=1}^{8} S_i = 10^{-2} \times 1 \times 5 \times 5 \times 1 \times 1 \times 1 \times 1 \times 1 = 0.25. \quad (7)
In this theoretical example, a diagnosis task that under normal conditions would have
a human error probability of 1% becomes 25% under accident conditions!
When three or more negative PSF influences are present (i.e., three or more multipliers greater than 1 are selected) when applying Eq. 6, the SPAR-H method recommends the use of an adjustment factor, according to Eq. 8, for computing the HEP [16]. This helps to reduce double counting of interdependencies between PSF effects and prevents the HEP value from exceeding 1.
HEP = \frac{NHEP \times \prod_{i=1}^{8} S_i}{NHEP \times \left( \prod_{i=1}^{8} S_i - 1 \right) + 1}. \quad (8)
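As an illustrative sketch (not the SPAR-H tool itself), the following Python function combines Eqs. 6 and 8 and reproduces the worked example of Eq. 7:

def spar_h_hep(nhep, multipliers):
    # HEP from the nominal HEP and the eight PSF multipliers. Eq. 8 is
    # applied when three or more negative (>1) multipliers are present,
    # as recommended by the SPAR-H method [16]; otherwise Eq. 6 is used.
    product = 1.0
    for s in multipliers:
        product *= s
    if sum(1 for s in multipliers if s > 1.0) >= 3:
        return (nhep * product) / (nhep * (product - 1.0) + 1.0)  # Eq. 8
    return nhep * product  # Eq. 6

# Worked example of Eq. 7: diagnosis task (NHEP = 1e-2) of high
# complexity (S3 = 5) under extreme stress (S2 = 5), others nominal
print(spar_h_hep(1e-2, [1, 5, 5, 1, 1, 1, 1, 1]))  # 0.25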
Risk assessment is the combination and integration of the probabilities (or fre-
quencies) and the severities (or consequences) for identified hazards, taking into
account the effectiveness of controls and barriers (defence-in-depth levels). It pro-
vides an input to risk evaluation and decisions about risk management, through the
adoption of adequate risk acceptance criteria [18].
Table 4 Evaluation of PSFs for diagnosis and action portions of tasks according to the SPAR-H method [15]

PSFs                     PSF levels                                      Diagnosis multiplier   Action multiplier
1. Available time        Inadequate time                                 P(failure) = 1.0       P(failure) = 1.0
                         Barely adequate time (≈2/3 × nominal)           10                     10
                         Nominal time                                    1                      1
                         Extra time (between 1 and 2 × nominal
                           and >30 min)                                  0.1                    0.1
                         Expansive time (>2 × nominal and >30 min)       0.01                   0.01
                         Insufficient information                        1                      1
2. Stress/stressors      Extreme                                         5                      5
                         High                                            2                      2
                         Nominal                                         1                      1
                         Insufficient information                        1                      1
3. Complexity            Highly complex                                  5                      5
                         Moderately complex                              2                      2
                         Nominal                                         1                      1
                         Obvious diagnosis                               0.1                    Not applicable
                         Insufficient information                        1                      1
4. Experience/training   Low                                             10                     3
                         Nominal                                         1                      1
                         High                                            0.5                    0.5
                         Insufficient information                        1                      1
5. Procedures            Not available                                   50                     50
                         Incomplete                                      20                     20
                         Available, but poor                             5                      5
                         Nominal                                         1                      1
                         Diagnostic/symptom oriented                     0.5                    Not applicable
                         Insufficient information                        1                      1
6. Ergonomics/HMI        Missing/misleading                              50                     50
                         Poor                                            10                     10
                         Nominal                                         1                      1
                         Good                                            0.5                    0.5
                         Insufficient information                        1                      1
7. Fitness for duty      Unfit                                           P(failure) = 1.0       P(failure) = 1.0
                         Degraded fitness                                5                      5
                         Nominal                                         1                      1
                         Insufficient information                        1                      1
8. Work processes        Poor                                            2                      5
                         Nominal                                         1                      1
                         Good                                            0.8                    0.5
                         Insufficient information                        1                      1
The most common tools for modeling complex systems are ETA, FTA, and RBD, among others. HRA methods are used for modeling human factors affecting the behavior and performance of operators.
Step 4, “Data assessment and parameter estimation”, consists of gathering the information relevant for the quantification regarding the frequencies of occurrence of operational sequences and the magnitude of consequences. Parameters such as frequencies of IEs, component reliabilities, safety system unavailabilities, and human error probabilities are necessary for frequency estimation. For consequence assessment, specific parameters related to the amount, form, and transport of hazardous and radioactive materials during postulated accidents are required. General data related to effects on people and the environment are also required.
Step 5 “Sequence frequency” and step 6 “Consequence evaluation” consist of
quantification of scenarios using the models developed in step 3 and the data gathered
in step 4. These steps result in the assessment of the frequency of accident sequences
and the estimation of potential consequences, generally in terms of radiation doses.
There are many uncertainties related to both frequency and consequence assessments.
Assessment of uncertainties related to data, methods and models is of fundamental
importance to risk management. Uncertainty assessment usually involves the deter-
mination of the probability density function (pdf) of some parameters associated
with outcomes based on pdf of input data. For instance, uncertainty propagation of
failure probability in fault trees and event trees should be considered in PRAs, in
order to get the confidence bounds of the quantitative risk metrics.
Step 7, “Risk assessment”, involves the integration (or product) of the frequency and severity assessments using Eq. 9, and a comparison of the estimated levels of risk with risk acceptance criteria defined by regulators. For risks involving radioactive materials, frequency-dose relationships are usually obtained and compared with
acceptance criteria (frequency-dose curves). Uncertainty assessment is also impor-
tant in this step to rank risks and verify compliance with standards and guidelines.
Where appropriate for risk management, sensitivity studies should be made for the
main assumptions and parameters to get their relative importance and guide the
decision-making processes.
Step 8, “Documentation” of the PRA, as part of the QA program, should cover a compilation of assumptions, data, methods, detailed analyses, and interpretation of the results. This is essential for future verification, validation, auditing, updating, and improvement of the assessments.
Many risk assessment tools for estimating both the frequency of accidents and their consequences can be used to support the implementation of the PRA steps. Some commonly used tools are briefly described next [13, 19].
Failure mode and effect analysis (FMEA) is a qualitative technique for identifying potential failure modes, causes, effects, and control mechanisms of failures in processes or systems. Sometimes, a semi-quantitative analysis of risks is carried out for prioritizing corrective actions. This analysis is based on scores assigned to the severity of effects, the likelihood of occurrence, and the detection of failures. FMEA is used in the initial steps of PRAs for screening the postulated initiating events (IEs) and accident scenarios.
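As a hedged illustration of such a prioritization (the failure modes and scores below are hypothetical), corrective actions can be ranked in Python by the commonly used risk priority number, the product of the severity, occurrence, and detection scores:

# Hypothetical failure modes: (description, severity, occurrence,
# detection), each scored 1-10 as is common in semi-quantitative FMEA
failure_modes = [
    ("Valve fails to close", 9, 3, 4),
    ("Sensor drift", 5, 6, 7),
    ("Pump seal leak", 7, 4, 3),
]

# Rank by risk priority number (RPN) = severity x occurrence x detection
for name, s, o, d in sorted(failure_modes,
                            key=lambda fm: fm[1] * fm[2] * fm[3],
                            reverse=True):
    print(name, "RPN =", s * o * d)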
Event tree analysis (ETA) uses a binary decision tree for modeling accident sce-
narios. It starts with an IE and proceeds through a sequence of successes or failures
of defence-in-depth levels until the end-states are reached. Each end-state is an acci-
dent sequence whose consequence should be quantitatively assessed. Each accident
sequence has its own frequency of occurrence, depending on the frequency of the IE
and on the probability of success or failure of defence-in-depth levels. These probabilities can be estimated using logical tools such as RBD and FTA, or by analyzing available life data, using LDA or ALT, for example.
Monte Carlo simulation (MCS) is a stochastic method for modeling based on
direct simulation of systems or processes. Through the use of random sampling of
probability density functions of input parameters, the uncertainties of output param-
eters are obtained. It is simple in principle, but requires the use of computer programs
to be implemented because the high number of samples necessary to get accurate
results. In PRA, MCS is mostly used for uncertainty propagation in FTAs and ETAs,
as well as consequence estimates of accident sequences [10, 20].
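For illustration, a short Python sketch of MCS-based uncertainty propagation for a simple two-event fault tree, assuming hypothetical lognormal distributions for the basic event probabilities (a common PRA assumption), could be:

import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Basic event probabilities with lognormal uncertainty; medians and
# error factors (EF = 3) are hypothetical. For a lognormal, the 95th
# percentile over the median equals exp(1.645 * sigma), so
# sigma = ln(EF) / 1.645.
sigma = np.log(3) / 1.645
p_a = rng.lognormal(np.log(1e-3), sigma, n)
p_b = rng.lognormal(np.log(5e-3), sigma, n)

# Top event = A OR B, with A and B independent
p_top = 1.0 - (1.0 - p_a) * (1.0 - p_b)
print(np.percentile(p_top, 5), p_top.mean(), np.percentile(p_top, 95))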
Analytical models are used in safety assessment mainly through the determina-
tion of phenomenological, statistical or empirical mathematical equations to model,
for instance, accident consequences. They are mainly based on mass, energy and
momentum conservation principles. Analytical models can represent process condi-
tions, atmospheric dispersions, explosion shockwave, radiation heat, or radiological
doses, among others, at various points in time after an accident. The consequences
can be expressed as fatalities, injuries, radiation doses, damage in structures, or
environmental impact [10].
Other tools used in PRAs, such as RBD, FTA and Markov models, are also used
in RAM analyses and have already been described in Sect. 4.3 of this chapter.
Risk acceptance criteria are limits and conditions defined by regulatory bodies to ensure adequate safety levels during licensing processes. In the case of NPPs,
these criteria can be, for example, core damage frequency (CDF) and large early
release frequency (LERF). In the case of NFs, risk acceptance criteria are usually
defined in terms of frequency-consequence (e.g., frequency-dose) ranges for workers
or the public. On the other hand, these criteria can be more subjective, requiring, for instance, the demonstration by the licensee that IISs or other SSCs meet certain principles and criteria, such as the ALARA principle, the single failure criterion, fail-safe design, or the double contingency principle [21].
NUREG 1860 [22] presents a frequency-consequence (F-C) curve based on the
principle that event frequencies and doses are inversely related, which is broadly
consistent with ICRP 64 [23]. Table 5 shows these proposed numerical criteria in
terms of frequency per reactor-year and the corresponding total effective dose equiv-
alent (TEDE) ranges for the public. These frequency-dose ranges for the public are
based on the U.S. Code of Federal Regulations (CFR) Parts 20, 50, and 100, as well as on the Environmental Protection Agency (EPA) Protective Action Guidelines. While these regulatory requirements provide some dose points for deriving the dose ranges, considerable engineering judgment was necessary to assign the corresponding frequencies. Although these ranges were proposed for NPPs, their baselines and criteria can be applied to NFs, e.g., the ALARA principle, stochastic and deterministic effects of radiation, radiological protection, and emergency preparedness.
Figure 5 shows a graphical comparison of the safety criteria of Germany, Great
Britain, ICRP 64, NUREG 1860, and Switzerland. The values plotted in Fig. 5 are
based on the data available in [3, 22]. Note that some frequency-dose curves are only represented by step lines separating the acceptable and unacceptable regions (Germany, Switzerland, and NUREG 1860), while other criteria present ALARA regions between the acceptable and unacceptable regions (ICRP 64 and Great Britain).
Apart from risk acceptance criteria, there are reliability acceptance criteria to meet
the safety, performance, and QA requirements. These criteria are based primarily on
verifying the compliance of the reliability characteristics (Table 2) with the specified
values. For instance, the vendor must include documented bases for the estimated MTTF and MTTR of IISs. Specifying RAM levels for IISs depends not only on their safety significance, but also on many other factors, such as energy efficiency and economic evaluation (e.g., pricing, maintenance, and inspection costs).
Table 5 Proposed dose-frequency ranges for the public, according to NUREG 1860 [22] (all doses are total effective dose equivalent, TEDE)

Dose range (mSv)   Frequency (per reactor-year)   Comment
0.01–0.05          1.0            0.05 mSv/year is the ALARA dose in 10 CFR 50 App. I
0.05–1             10^{-2}        1 mSv/year is the public dose limit from licensed operation in 10 CFR 20
1–10               10^{-3}        10 mSv/event is the off-site trigger of the EPA Protective Action Guidelines
10–250             10^{-4}        250 mSv/event triggers abnormal occurrence reporting and is the limit in 10 CFR 50.34 and in 10 CFR 100 for siting
250–1000           10^{-5}        500 mSv is a trigger for deterministic effects (i.e., some early health effects are possible)
1000–3000          10^{-6}        In this range the threshold for early fatality is exceeded
3000–5000          5 × 10^{-7}    Above 3–4 Sv, early fatality is quite likely
>5000              10^{-7}        Above 5 Sv early fatality is very likely and the curve shall be capped
The acceptable levels of complex IISs depend on the evaluation of the whole
system and interdependencies. RAM criteria and principles, such as redundancy,
diversity, physical separation, fail-safe, single failure and defence-in-depth, among
others (Table 3), should be carefully evaluated. Reliability tools, as described in
Sect. 4.3, are useful for this verification, mainly taking into account common mode
failures.
Double contingency is a requirement of particular relevance for NFs. According
to this requirement, a “criticality accident cannot occur unless at least two unlikely,
independent and concurrent changes in process conditions have occurred” [5]. This
implies not only following the aforementioned criteria of Table 3, but also estimating the probability of occurrence of failures that can result in criticality accidents, including human errors.
the facility complies with the acceptable levels of risk. At this point, it is important
to consider the confidence bounds of quantitative assessments (represented by the
5th and 95th percentiles of pdf of the estimated risks), obtained through uncertainty
assessment, in order to assure an adequate level of conservatism.
In order to illustrate the use of the integrated approach to reliability and risk assess-
ments, a simple example involving analysis of a criticality accident in a nuclear fuel
fabrication plant (NFFP) was developed.
Nuclear facilities such as fuel fabrication plants handle fissile material and must
manage their processes and activities to ensure criticality safety. The nuclear subcrit-
icality depends on many parameters, such as mass, concentration, geometry, volume,
enrichment, and density of fissile material. It is also influenced by the presence of
other materials in the facility, such as moderators, absorbers, and reflectors. Subcriticality is then ensured through different defence-in-depth levels of protection for preventing failures, ensuring detection, and mitigating the consequences of criticality accidents. These levels can be classified as passive, active, and administrative protection levels [24].
Examples of passive defence-in-depth levels, which do not depend on control systems or human interventions, are the design of geometry and volume of pipes, vessels, and structures inherently safe against criticality, limiting of the uranium enrichment, the presence of neutron absorbing materials, as well as shielding for mitigating accidental radiation doses. Examples of active defence-in-depth levels are
automatic process control, neutron and gamma monitors, and computer systems for
controlling the movement of fissile material. Examples of administrative defence-in-
depth levels are operating procedures for avoiding dangerous situations, controlling
of the isotopic composition, mass, density, concentration, chemical composition,
degree of moderation, and spacing between fissile material systems. Other examples
are defining of areas authorized to contain significant quantities of fissile material
and access control. Active components that require human actions in response to
indicators or alarms are better classified as administrative defence-in-depth levels.
Let us consider a probabilistic risk assessment (PRA) of a nuclear fuel fabrication plant (NFFP) carried out at the preliminary facility design phase. The assumed design basis accident (DBA) was a criticality excursion producing an initial burst of 1.0 × 10^{18} fissions in 0.5 s, followed successively at 10-min intervals by 47 bursts of 1.9 × 10^{17} fissions, resulting in a total of 1.0 × 10^{19} fissions in 8 h. According to reports of this kind of accident, such an event produces little or no mechanical damage to the structures, systems and components (SSCs) of the NFFP [25]. It is also assumed
that the system for diagnosis of criticality accidents and mitigating their consequences
is moderately complex and there are only poorly elaborated checking procedures to
anticipate hazardous conditions.
The semi-empirical Eqs. 10 and 11 are recommended for estimating the prompt gamma and neutron doses, respectively, following the postulated accident [25]:
where:
D_γ = prompt gamma dose (mSv),
N = number of fissions,
d = distance from the source (km),
μ_c = dose reduction factor (μ_c = 2.5 for a concrete wall thickness of 8 in. and μ_c = 5 for a concrete wall thickness of 20 in.);
where:
Dn = prompt neutron dose (mSv),
N = number of fissions,
d = distance of source (km),
μc = dose reduction factor (μc = 2.3 for concrete wall thickness = 8 in. and
μc = 4.6 for concrete wall thickness = 20 in.).
To investigate design alternatives, define reliability and risk acceptance criteria, and make initial considerations about the influence of human factors on design and operation, a preliminary PRA can be used in an integrated way with reliability assessment techniques.
The generic event tree shown in Fig. 8 was developed to investigate the scenarios of a criticality accident in an NFFP, where λ is the frequency of the initiating event, and P1, P2 and P3 are the failure probabilities of the passive, active and administrative defence-in-depth levels, respectively.
An initiating event for this kind of accident could be an uncontrolled increase of the concentration of fissile material in the piping and vessels of the facility. A frequency of occurrence of such an event is conservatively assumed as λ = 10^{-1} per year [26], and P1, P2 and P3 as 10^{-2}, 10^{-1} and 10^{-1}, respectively [27]. These are very conservative figures, since they are one order of magnitude greater than the recommended upper limits for single passive and active defence-in-depth levels. The figure of 10^{-1} assumed for the administrative defence-in-depth level is also conservative and is recommended for rare unplanned administrative events. It is also compatible with SPAR-H estimates for moderately complex diagnosis tasks with poor procedures.
Assuming that the administrative defence-in-depth level serves only to mitigate the accident and not to prevent it, and that the defence-in-depth levels are independent, each one capable of avoiding the accident, the only sequences in which the criticality accident occurs are S1 and S2.
Fig. 8 Generic event tree for the criticality accident analysis in an NFFP
Doses and frequencies for sequences S1 and S2 were estimated for three alternative designs: the initially assumed basic design; the improvement of the active defence-in-depth level by adding redundancy; and additional improvements of the administrative level. Table 6 presents these estimated values.
For simplicity, the external doses from dispersion of fission products were
neglected, and only prompt gamma and neutron doses were considered in calcu-
lations for the public individuals of the critical group, assuming they are located at a distance of 0.1 km.
Fig. 9 Frequency-dose curve (based on NUREG 1860 [22]) and accident sequences S1 and S2, comparing different design alternatives
The total doses estimated for the sequence S1 , using Eqs. 10 and 11, were 241
mSv for 8 in. concrete wall thickness and 120 mSv for 20 in. concrete wall thickness,
respectively. These doses were estimated considering that the public individuals
remain throughout the course of the accident at the distance of 0.1 km. The doses
from sequence S2 , estimated supposing an evacuation time of one hour, were reduced
to 46.9 mSv for 8 in. concrete wall thickness and 23.1 mSv for 20 in. concrete wall
thickness, respectively.
The frequencies of occurrence of sequences S1 and S2, estimated according to the equations presented in Fig. 8, were 10^{-5} and 9 × 10^{-5} per year, respectively. The estimated risks are plotted on a frequency-dose curve, according to the NUREG 1860 criteria [22], taken as reference. Both sequences S1 and S2, for the initial basic design defence-in-depth levels, lie under the curve, in the acceptable region, as can be seen in Fig. 9. However, the dots are very close to the curve, and if the uncertainties in estimating frequencies and doses were taken into account, the dots would likely lie in an unacceptable region [20]. Therefore, further design alternatives should be investigated.
By adopting redundancy at the active defence-in-depth level, its reliability becomes 0.99, which re-evaluates P2 as 10^{-2}. By revising the design to simplify the diagnosis system and improving maintenance, checklists, and operating procedures, the nominal value of 10^{-3} can be adopted for P3, according to the SPAR-H method. In this way, new estimated values for the criticality risks are achieved, which lie in the acceptable region in Fig. 9, far from the limits of the frequency-dose curve and with adequate safety margins for the risks.
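The sequence frequencies reported above follow directly from the event tree logic of Fig. 8, as the short Python sketch below reproduces; the improved-design figures are our own illustration under the stated assumptions:

def sequence_frequencies(lam, p1, p2, p3):
    # Frequencies of sequences S1 and S2 from the event tree of Fig. 8.
    # S1: passive, active, and administrative levels all fail.
    # S2: passive and active levels fail; administrative mitigation succeeds.
    return lam * p1 * p2 * p3, lam * p1 * p2 * (1 - p3)

lam = 1e-1  # initiating event frequency (per year)

# Initial basic design: P1 = 1e-2, P2 = 1e-1, P3 = 1e-1
print(sequence_frequencies(lam, 1e-2, 1e-1, 1e-1))  # (1e-05, 9e-05)

# Improved design: redundant active level (P2 = 1e-2) and simplified
# diagnosis/procedures (P3 = 1e-3, the nominal value adopted above)
print(sequence_frequencies(lam, 1e-2, 1e-2, 1e-3))  # (1e-09, ~1e-06)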
8 Concluding Remarks
Acknowledgements The authors thank the following institutions: CNEN (Brazilian Commission
of Nuclear Energy), CDTN (Center of Nuclear Technology Development) and FINEP (Funding of
Science, Technology and Innovation).
References
1. Christensen F, Andersen O, Duijm J (2003) Risk terminology: a platform for common under-
standing and better communication. J Hazard Mater 103:181–203
2. International Atomic Energy Agency (2001) Reliability assurance programme guidebook for
advanced light water reactors. IAEA-TECDOC-1264. IAEA, Vienna
3. International Atomic Energy Agency (2002) Procedures for conducting probabilistic safety
assessment for non-reactor nuclear facilities. IAEA-TECDOC-1267. IAEA, Vienna
4. Winteringham F, Peter W (1992) Energy use and the environment. Lewis Publishers Inc, London
5. International Atomic Energy Agency (2018) IAEA safety glossary terminology used in nuclear
safety and radiation protection. 2018th edn. Vienna
6. International Atomic Energy Agency (2020) Safety analysis and licensing documentation for
nuclear fuel cycle facilities. Safety Reports Series No. 102 IAEA, Vienna
7. International Atomic Energy Agency (2017) Safety of nuclear fuel cycle facilities. IAEA Safety
Standards Series No. SSR-4, IAEA, Vienna
8. International Atomic Energy Agency (2010) Licensing process of nuclear installations. Specific
Safety Guide No. SSG-12 IAEA, Vienna
9. International Atomic Energy Agency (2016) Safety of nuclear power plants: design, specific
safety requirements. SSR-2/1 (Rev. 1), IAEA, Vienna
10. Vasconcelos V, Soares W, Costa A et al (2019) Deterministic and probabilistic safety analyses.
In: Advances in system reliability engineering. Elsevier, London
11. Reliasoft Corporation (2015) System analysis reference: reliability, availability and optimiza-
tion. ReliaSoft, Tucson
12. Reliasoft Corporation (2015) Life data analysis reference. ReliaSoft, Tucson
13. Stamatelatos M (2002) Probabilistic risk assessment procedures guide for NASA managers
and practitioners: Version 1.1. NASA, Washington DC
14. Fleming K (2004) Markov models for evaluating risk-informed in-service inspection strategies
for nuclear power plant piping systems. Reliab Eng Syst Saf 83:27–45
15. U.S. Nuclear Regulatory Commission (2004) The SPAR-H human reliability analysis method.
NUREG/CR 6883. USNRC, Washington DC
16. Park J, Jung W, Kim J (2020) Inter-relationships between performance shaping factors for
human reliability analysis of nuclear power plants. Nucl Eng Technol 52:87–100
17. U.S. Nuclear Regulatory Commission (1975) WASH-1400: reactor safety study, NUREG
75/014. USNRC, Washington DC
18. International Organization for Standardization (2009) Risk management: principles and guide-
lines. ISO 31000, 1st edn. ISO/IEC, Geneva
19. Calixto E (2013) Gas and oil reliability engineering, modeling and analysis. Elsevier, Amster-
dam
20. Vasconcelos V, Soares W, Costa A, Raso A (2019) Treatment of uncertainties in probabilistic
risk assessment. In: Reliability and maintenance-an overview of cases. IntechOpen, London.
https://www.intechopen.com/chapters/65179
21. International Atomic Energy Agency (2001) Applications of probabilistic safety assessment
(PSA) for nuclear power plants. TECDOC-1200, IAEA, Vienna
22. U.S. Nuclear Regulatory Commission (2007) Feasibility study for a risk-informed and per-
formance based regulatory structure for future plant licensing. NUREG 1860, v. 1, USNRC,
Washington DC
23. International Commission on Radiological Protection (1993) ICRP publication 64: protection
from potential exposure: a conceptual framework. Ann ICRP 23(1). Ottawa
24. International Atomic Energy Agency (2014) Criticality safety in the handling of fissile material.
Specific Safety Guide No. SSG-27, IAEA, Vienna
25. U.S. Nuclear Regulatory Commission (1979) Regulatory guide 3.34. Revision 1. Assumptions
used for evaluating the potential radiological consequences of accidental nuclear criticality in
a uranium fuel fabrication plant, Washington, DC
26. Los Alamos National Laboratory (2000) A review of criticality accidents, 2000 Revision. LA-13638. Los Alamos, New Mexico
27. Canadian Nuclear Safety Commission (2018) REGDOC-2.4.3. Nuclear criticality safety.
Ottawa, Ontario
Computational Tools of Media Analysis
for Corporate Policy Effectiveness
Evaluation: Models and Their Reliability
1 Introduction
into the simpler models and approaches such as SWOT analysis, scenario analysis,
and forecasting [2, 3, 36].
Nevertheless, efficiency is a concept that is universally used across all social
sciences—not only in management. In political science, efficiency is associated with
governments and states [34], in sociology—with various social institutions (such
as civil society) and the social environment [16]; in economics—with commerce
efficiency [39]; in psychology—with efficient behaviors of individuals [13]. This
multidimensional nature of efficiency has also led to the propagation of attenuating
concepts such as effectiveness [23] and efficacy [28].1
Therefore, to establish a reliable system of corporate efficiency evaluation that
is not limited to simpler models, one must take into consideration the increasing
complexity of decision-making in modern systems, including large corporations.
Outside the narrower field of management, this complexity is reflected in social
sciences, and mostly in policy science. Policy science is a newer interdisciplinary field studying policy-making, and it integrates contemporary developments in economics, management, sociology, and political science [20]. The context-specific,
problem-oriented, and evidence-based approach of policy science also allows policy
analysts, corporate consultants, and business intelligence agents to better inte-
grate data science methods into their decision-making process, thereby increasing
corporate policy effectiveness.
Changing the focus from corporate efficiency to corporate policy effectiveness
embeds the necessary complexity into evaluation of decision-making. The concept
of “policy” provides a broader context of the social environment, which serves as an ecological niche in which large corporations function and for which they carry certain responsibilities. Policy effectiveness focuses not only on the efficient results (any
outputs achieved with the minimum use of resources), but also on policy outcomes
(corporate policy effects that may be valuable for the general public and the society
at large).
Table 1 provides questions for corporate policy effectiveness evaluation, derived
from the stages of applied problem-solving approach, and the corresponding stages in
the policy cycle theory and practice. To increase the effectiveness, policy designers,
managers, practitioners, scientists, and social engineers usually have to answer
these questions using different methods and tools. Therefore, the table serves as
the summary of the commonly used methods and computational tools.
We argue that computational tools are not inferior, and in some cases, they may
be superior to the traditional methods of corporate policy effectiveness evaluation.
They could provide additional answers when decision-makers are dealing with big corporate data where human processing is impossible. Examples of such data are data
from old and new media (social networks), which lately became the source of many
important insights that affect corporate effectiveness.
1 Although differences between these concepts (efficiency, effectiveness, and efficacy) exist, their detailed analysis is beyond the scope of this chapter.
Second, we describe how we use three popular data science methods of topic mining
to tackle a real-life business problem. Third, we provide an analysis of the advan-
tages and disadvantages of the topic mining methods related to the case of the Russian
TNK. In conclusion, we provide recommendations for development of a system of
monitoring and evaluation of corporate policy effectiveness using computational
policy tools, topic mining procedures in particular.
Text data is one of the most important sources of information and knowledge. In fact,
about 95% of big data come in text or some other unstructured form.2 Therefore, the
ability to get ideas and understanding of such data is an excellent advantage for the
successful possession and control of information in business [18, 27]. Social media
plays a decisive role in propagating textual information, having a great influence on
what information reaches the public [10, 31]. Today’s coverage of information has
greatly increased the ability of both ordinary people and large corporations to publish
their content and expand their reach.
At the same time, as information and knowledge from these media continue to be
digitized and stored in the form of news articles, web pages, social media posts and the
like, analyzing all this information will help businesses manage it in time to respond
to various events [19, 29]. That is why learning the characteristics of social media
content becomes important for a number of tasks, such as recent news discovery,
personalized message recommendation, friend recommendation, sentiment analysis,
and others.
However, analyzing social media content presents certain challenges. Typical social media contains large amounts of data consisting of brief, heavily fragmented writings. Many documents are “off-topic” to the subject of interest, and excess highlights often complicate readability. Users face no limitations on the content they post, and their language is not sufficiently standardized, unlike news or scientific articles. The language may include urban dialects, digital forms, and abbreviations, making it difficult to recognize and understand. All of these issues create problems for data reduction when processing social texts.
The challenge of identifying common topics is to detect “hot” topics in social media data streams. In natural language processing, including topic modeling, there are multiple approaches to topic detection, labeling, and pattern recognition. Most of these
tasks are done by machine learning algorithms. In this study we demonstrate three
different approaches to topic mining and outline their benefits and drawbacks.
Topic modeling has become a popular analytical tool for evaluating text data. Researchers use topic modeling for different purposes: to find similarity of post content in social media [9, 15], explore environmental data [11], or recommend scientific articles [37]. It also has a special place in management research [14]. There are several
2 https://builtin.com/big-data.
definitions of “topic modeling;” in this study, we use the definition of [7], who define it as “an unsupervised machine learning method that learns the underlying themes in a large collection of otherwise unorganized documents” [7, p. 419].
Topic modeling makes it possible to capture the hidden semantic structures in a document. A typical document is composed of a mixture of topics, and topics are composed of sets of words [17, 32, 33]. Therefore, the methods used are usually aimed at separating these sets from each other.
One popular topic modeling method is Latent Dirichlet Allocation (LDA). Originally proposed for use with genetic data [30], the algorithm quickly found its way into text analysis and machine learning [5], where it remains popular to this day. LDA is a three-level hierarchical Bayesian model in which documents are represented as random mixtures over latent topics. Each topic is a probability distribution over individual units of analysis (e.g., individual words). Topics in LDA do not mean the same as “topics” or “discourses” in qualitative text analysis; rather, they are probability distributions over words that need additional interpretation. The model has one hyperparameter: the number of topics to estimate. While there are methods that allow for computational estimation of the number of topics in a particular corpus [26], using them for large corpora requires a large amount of computational effort. LDA for traditional text mining has made progress and achieved good results [5, 12]. Despite its wide popularity for complex text data, it is not very efficient at analyzing short texts from social media, which have their own unique properties, so special care is required when modeling social media data with LDA.
Another popular method is the Hierarchical Dirichlet Process (HDP). HDP is an extension of LDA, designed to be used when the number of “topics” inside the document is not known [40]. Using this method, researchers can effectively improve the efficiency of text segmentation.
Because both methods are so-called bag-of-words methods [8, 21], they can be used without manual intervention, since they are not concerned with text structure. This is somewhat of a limitation, since the topics obtained automatically by these methods can be correlated with each other, which can complicate their interpretation [6, 38].
There is another problem that complicates topic modeling. Sometimes, words in the training data are spelled the same but have different meanings and represent different topics. Words such as “elephant” or “donkey” belong to the “animal” topic, but these words are also related to the “American politics” topic. Both LDA and HDP are insensitive to the original topic of such words.
One way to address the above limitations is the “manual labeling” method. Experiments with real data sets have shown that the manual labeling method is efficient and flexible for solving problems related to brand image [1, 40]. In this study, we compare the manual labeling method with several others for use in corporate policy effectiveness evaluation.
3.2 Data
The data used for this project were collected via a media monitoring platform called
YouScan. This platform provides a tool that collects data from a variety of sources,
including social media websites, online news websites, web forums, blogs, and
posts made in public channels of messaging applications. YouScan also maintains
an archive of such data and provides access to it. The workflow for getting data
from YouScan is as follows: the user defines a “topic” that is denoted by several text
queries. The system searches the archive for posts matching the queries, and starts
monitoring for new data. The user can then export the data for offline processing via
a built-in API.
In total, we collected 3,118,412 publications in Russian (texts and metadata), ranging in time from August 30, 2017 to October 29, 2019. The publications we collected can be separated into multiple groups based on the role they played in the discussions that happened on the online platforms monitored by YouScan. The most common of these publication types is a “post” (1,270,434 publications). This type of publication describes a text that was published on an online platform by a user. The descriptions of all publication types and their counts are provided in Table 2.
YouScan collects data from a wide variety of sources, including, but not limited to,
social network websites, public Telegram channels, and a selection of news websites.
A more complete breakdown on data sources for our dataset is provided in Table 3.
As shown in Table 3, almost 80% of the data came from social network websites. The most common of these, and the most common overall, is VKontakte (VK), one of the most popular social network websites in the Russian-language segment of the Internet.
It is followed by Facebook, Odnoklassniki (ok.ru), another social network targeted at
the Russian-language segment of the Internet, then Twitter and Instagram. Another
large source of data is Telegram, a messaging application that allows its users to
create public channels to which they can post. Other users can then subscribe to
these channels and discuss the posts in them. Telegram is also popular in Russia,
with more than 25% penetration rate.3 Among the top-10 most common data sources
are blog websites such as LiveJournal (similar to Blogspot) and Pikabu (similar to
Reddit), which are also targeted at the Russian-speaking audience. Also among the
3 https://www.statista.com/statistics/867549/top-active-social-media-platforms-in-russia/.
most common sources of data are YouTube and a large Belarus-based web forum (talks.by) that is now defunct.

Table 4 Distribution of publications across different resource types

Resource type   Count       Percentage of total (%)
social          2,628,923   84
news            219,663     7
blog            98,944      3
messenger       92,528      3
forum           76,852      2
reviews         1,502       0
The distribution of publication counts across different types of media indexed by YouScan is presented in Table 4. As Table 4 shows, the most popular resource type is also social networks: social media accounts for 84% of all publications in the dataset.
YouScan collects a large amount of information on each of the publications. Of
prime interest for this study are the texts, but other information is also present.
YouScan collects data on the time and date the publication was made, the source
of the publication, information about the author of the publication (e.g. name, age,
gender of a person, and a URL and a name for a community) and the amount of
engagement (shares, likes, reposts) a publication has received. What information is
collected is dependent on the source of the publication. For example, it is not possible
to collect engagement rates from a web forum that does not allow its users to “like”
and “share” the discussions.
3.3 Preprocessing
Before doing any analysis, we applied multiple preprocessing steps to our data to
clean it and improve the interpretability of the results obtained later. Our first step
was text cleaning. Since the original corpus was in Russian, we removed English
words. We also removed emoji characters, hashtags, punctuation, and web URLs
from the texts in our database. The purpose of this step was to eliminate words that
could confuse the lemmatizer and to remove information that was not relevant to our
analysis.
Next, we performed the lemmatization procedure on all of the texts in the database. The Russian language has fusional morphology, which means that words take different forms according to their roles in a sentence. This can lead to problems with text
analysis, since bag-of-words based algorithms such as LDA may recognize different
grammatical forms of the same word as different words. Because of that, a lemma-
tization procedure that puts all words into their dictionary form is required when
performing bag-of-words based analysis.
For this study, we used Mystem—a closed-source, but free to use lemmatizer for
Russian and Ukrainian languages. It uses a hybrid approach combining a suffix tree
and a dictionary. This approach allows for both accurate lemmatization of known
words and guessing dictionary forms of words not originally present in the dictionary
[35]. It should be noted that in our case, Mystem failed to properly lemmatize the
name of a large Russian state enterprise that was very common in our dataset, so we
had to create a custom dictionary in order to perform this step in a proper manner.
As a final step, we removed stopwords from our lemmatized texts. Stopwords are
words that carry no meaning and can be detrimental to bag-of-words based analysis.
Examples of such words for the English language are the verb “to be”, words such
as “and,” “or,” and “maybe”. There are lists of such words for almost all languages,
and we have used one of these lists for this study. After performing all of these steps,
the texts were finally ready for further analysis.
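A minimal sketch of this preprocessing pipeline, assuming the pymystem3 wrapper for Mystem and the NLTK Russian stopword list (the authors' exact cleaning rules, custom dictionary, and stopword list are not specified), could be:

import re
from pymystem3 import Mystem          # Python wrapper around Mystem
from nltk.corpus import stopwords     # one possible stopword list

mystem = Mystem()
russian_stopwords = set(stopwords.words("russian"))
url_re = re.compile(r"https?://\S+")
non_russian_re = re.compile(r"[^а-яё\s]")  # crude cleaning rule

def preprocess(text):
    # Drop URLs, then any non-Cyrillic characters (a crude stand-in
    # for removing English words, emoji, hashtags, and punctuation)
    text = non_russian_re.sub(" ", url_re.sub(" ", text.lower()))
    lemmas = mystem.lemmatize(text)  # puts words into dictionary form
    return [w for w in lemmas if w.strip() and w not in russian_stopwords]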
The first algorithm that we used for this study was Latent Dirichlet Allocation (LDA).
This algorithm has readily available open-source implementations, which makes it
particularly easy to use.
For our application of LDA, we used the implementation provided by the Gensim Python library. The Gensim library is a collection of NLP algorithms and models. The library strives to provide implementations of various models that are able to process arbitrarily large corpora that might not fit into the RAM of the computer running them. This property was quite useful for this study, as our corpus of 3,118,412 publications indeed did not fit into RAM.
We did not use algorithmic detection of the number of topics due to the extreme
computational complexity of having to fit a large number of models needed to esti-
mate the value of the hyperparameter. For this study, we chose to train an LDA model
with 100 topics and use its results for further analysis. While this number may appear
somewhat arbitrary, methodological studies aimed at detecting the number of topics
for topic modeling usually stop at about 100 (e.g., [41]). Therefore, we started with
this large number to make sure we picked up most of the topics.
Our particular implementation of LDA posed a set of problems. First, when building the dictionary of words present in the corpus, Gensim may remove words that are not very common in the corpus (for example, words present in fewer than 10 publications). Such words, however, could still be important for our analysis. Second, Gensim may also remove words that are too common, for example, words present in more than 70% of the publications in the corpus. One way to solve this problem was to tweak the parameters used for this word removal. However, we decided to take an alternative approach and created a list of words that should not be removed from the dictionary even if they were rare. These words included, for example, words referring to events and people important for the industry under investigation.
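A sketch of this dictionary construction and model training with Gensim, where texts and important_words are illustrative placeholders rather than the authors' code, might look as follows:

from gensim.corpora import Dictionary
from gensim.models import LdaMulticore

dictionary = Dictionary(texts)  # texts: iterable of token lists
# Drop words appearing in fewer than 10 publications or in more than
# 70% of them, while protecting a curated list of important rare words
dictionary.filter_extremes(no_below=10, no_above=0.7,
                           keep_tokens=important_words)
corpus = [dictionary.doc2bow(text) for text in texts]

lda = LdaMulticore(corpus=corpus, id2word=dictionary, num_topics=100)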
A third problem was the presence of publications that were not relevant to our analysis. These publications (e.g. advertisements) decreased the performance of LDA: discovered topics often had words from the advertisements with very high probabilities, which made interpretation of the discovered topics much more difficult.
The fourth problem we faced was a large number of topics that were highly
correlated with one another, thus making the model interpretation much harder. This
was most likely due to the large number of topics in the model. Also, it could be due
to the way that the original search query was constructed, because certain words (that
were part of the query itself) were present in a very large number of posts. Again, this presented a problem for our analysis, as it made it much harder for the training algorithm to differentiate the topics present in the posts from one another.
Yet another problem that affected the model interpretability was the choice of the
procedure used to assign topics to individual publications. LDA assigns a probability
distribution of topics to each of the publications in the corpus. This means that
technically, all publications have all topics present within them with various degrees
of “affinity.” Determining which topics are present in which publication becomes
quite complicated as a result. An obvious choice would be to simply select the most
probable topic for each of the publications. However, doing so is based on an implicit
assumption that a publication can only have exactly one topic present within it. After
using this approach on our data, we have found that this assumption was not true for
many of the publications present in our dataset.
Thus, we decided to implement an alternative approach. We picked a threshold value and selected all the topics with a probability equal to or higher than this value. This approach allows multiple topics to be present in one document, but of course depends on determining an appropriate threshold value. For this study, we set the probability threshold to the standard 0.05, or 5%.
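A minimal sketch of this threshold-based topic assignment, using the get_document_topics method of a fitted Gensim LDA model:

THRESHOLD = 0.05  # minimum probability for a topic to count as present

def assign_topics(lda_model, bow):
    # All topics whose probability meets the threshold; a document may
    # therefore receive several topics, or none at all
    return [topic_id for topic_id, _ in lda_model.get_document_topics(
        bow, minimum_probability=THRESHOLD)]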
After building the models, we visualized them using another Python library called pyLDAvis. This library converts LDA topic models into interactive visualizations that can be used to analyze and interpret the meaning of topics discovered in the corpus algorithmically. The library also visualizes the relationships between different topics via what is called a “topic map.” PyLDAvis builds these maps by first computing a matrix of similarities between all possible pairs of topics and then embedding it into a 2D space using classic multidimensional scaling. An example of such a map for the LDA model we built on our corpus of 3,118,412 publications is presented in Fig. 1.
It can be seen from the map that topics 93, 17, 8, 42 and 24 are the most common in
the corpus. After in-depth analysis, we have determined that they represent important
discourses present in the media.
Topic 93 represents the discourse that is related to the Company as well as its
subsidiaries. Publications related to this topic describe corporate activities of the
Company, such as public events that it organizes. A sizable part of publications
related to this topic were also related to companies that were important not only to
the Company itself, but to the industry as a whole. For example, certain environmental
protection agencies, societies and the events they organize became included in our
sample.
Topic 17 represents the discourse that is related to the workers of the industry
that the Company operates in. The publications related to this topic had discussions
on the identity of typical industry workers and descriptions of their daily lives. This
discourse also contained the discussions of dangers that are inherent to the industry
for the workers themselves, the environment, and humanity as a whole.
Topic 8 represents the discourse related to one of the worst disasters to happen to
the industry in question, its short- and long-lasting effects on the lives of the affected
people, the environment, and the future of the industry. Another important aspect of
the discourse is the continued discussion of the ways the disaster was represented in
various media (e.g., TV shows and video games).
Topic 42 represents a discourse that is related to that of topic 8. The discourse for
this topic is also centered on the same disastrous event, but it centers more on the
media depictions of the event, the long-term effects on the environment surrounding
the epicenter of the event, and the exploration of this environment.
Another method that we used for topic modeling and topic discovery was the Hierarchical Dirichlet Process (HDP) topic model. While it is harder to use than LDA, the availability of a free and open-source implementation makes the use of this model relatively easy.
For our application of HDP, we have used the implementation provided by the
Gensim Python library. The implementation of HDP provided by Gensim is based
on an online algorithm that allows for analysis of arbitrarily large text corpora. As in
the case with LDA, this property was very useful due to the large size of our corpus.
For this model, we focused only on some types of publications in our corpus. We
removed comments from our analysis since they did not add as much substance to
the discussion of the Company and industry as posts and (extended) reposts. The
size of our final corpus used to train the model was 2,205,904 publications, and it
consisted of posts, reposts, and extended reposts.
Since HDP can infer the number of topics from the data, we did not employ
any algorithmic procedures for topic count estimation. For this study, we set the
maximum possible number of topics in the model to 150.
HDP and LDA are both based on the bag-of-words approach and as such share a
common set of problems. We followed the same procedure for building the dictionary
as outlined in the section on LDA. We used the same list of words that were important
for our analysis to prevent them from being removed from the dictionary based on
being too common or too rare. When building the dictionary, we used the same
parameters for word exclusion, removing words from the dictionary if they were
present in less than 10 publications, or more than in 70% of publications.
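A hedged sketch of such an HDP model in Gensim, reusing the dictionary and corpus built as in the LDA section and truncating the top level at 150 topics:

from gensim.models import HdpModel

# T caps the number of top-level topics the process may infer; the
# dictionary and corpus are built exactly as in the LDA section
hdp = HdpModel(corpus=corpus, id2word=dictionary, T=150)

# Topics actually inferred from the data (often far fewer than T)
inferred = hdp.show_topics(num_topics=-1, formatted=False)
print(len(inferred), "topics inferred")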
Another issue that was shared to a degree between HDP and LDA modeling
approaches was that the topics were highly correlated and contained similar sets of
probable words. In the case of HDP, this problem was alleviated by the fact that
the number of topics inferred from the text by the model was much lower than the
number we used for the part of our analysis based on LDA. Also relevant to this
algorithm were the issues caused by the presence of irrelevant publications in the
dataset.
Assigning topics to individual publications has also proved problematic. As with
the case of LDA, HDP assigns a probability distribution over all the topics to each
of the documents. However, this issue affects HDP much more than LDA, because
the implementation of HDP we used generates a distribution not over the inferred
number of topics, but over the maximum possible number of topics specified when
training the model. To remedy this, we decided to use an approach similar to what
we did for LDA: we set a probability threshold of 5%. As with LDA, we could assign
more than one topic to a document.
After training the model, we created a visualization procedure that allowed us
to see the relationships between different topics. While HDP models can be trans-
formed into LDA models, this transformation destroys a large portion of the available
data, such as the number of topics inferred by the model. This severely limits the
usefulness of converting HDP models to LDA models to aid in their visualization
and interpretation.
To visualize the model we used the following procedure. First, for each topic we
counted the number of documents related to the topic using the rules described above.
Next, for each of the topics inferred by the model, we obtained a sample of 1000 most
probable words with their probability scores. We then created a topic-word matrix
with rows representing topics and columns representing sampled words. If for some
topic a word was not in the 1000 most probable, we assigned to it a probability score
of 0. Next, we used t-SNE to embed this large matrix into a 2-dimensional space. An
example of a topic map created using this procedure is presented in Fig. 2. The size
of the bubble represents the number of documents related to each of the topics.
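A sketch of this visualization procedure, assuming scikit-learn's t-SNE implementation (the perplexity value and other details are our own illustrative choices):

import numpy as np
from sklearn.manifold import TSNE

# 1000 most probable words per topic, as inferred by the HDP model
topics = hdp.show_topics(num_topics=-1, num_words=1000, formatted=False)

# Topic-word matrix: rows are topics, columns the union of sampled
# words; a word absent from a topic's sample gets probability 0
vocab = sorted({w for _, words in topics for w, _ in words})
col = {w: j for j, w in enumerate(vocab)}
matrix = np.zeros((len(topics), len(vocab)))
for i, (_, words) in enumerate(topics):
    for word, prob in words:
        matrix[i, col[word]] = prob

# Embed the matrix into 2D to draw the topic map (perplexity must be
# smaller than the number of topics, here about 20)
coords = TSNE(n_components=2, perplexity=5,
              init="random").fit_transform(matrix)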
As Fig. 2 shows, the model has discovered 20 topics in our dataset of publication
texts. Topics 0, 1, 2, 3 and 6 have the largest number of documents related to them,
and further analysis suggests that they represent important discourses that exist in
social media.
Topics 0, 3 and 6 represent a discourse that is centered around one of the largest
disasters to happen to the industry under investigation. This discourse also refers
to the geographical and geopolitical area where the event took place, as well as the
lives of people involved in the containment measures and otherwise affected by the
tragedy as well as its long-term effects. Topic 0 is focused on the lives of people
affected by the tragedy, and by its effects on the area surrounding the site. Topic 3
is concerned with the strain that this tragedy put on the industry under investigation
in the country where it happened, as well as how this strain affects large economic
sectors of this country. Topic 6 is concerned with the general dangers of the continued
development of the industry in question, and their connection with one of the largest
disasters that happened in connection with that industry.
Topic 1 represents a discourse centered around investments that the Company
makes in its home country and how it helps develop the remote parts of the country.
This discourse also contains discussions around new large-scale projects undertaken
by the corporation and the jobs it creates.
Topic 2 represents a discourse centered around new technological developments
in the industry in question and the role of the Company in these developments.
This discourse also includes discussions on the investments and job creation by the
corporation.
The third method that we used for topic analysis for this study is what we called
“manual labeling.” While this method required human expert involvement, it allowed
for simultaneous topic discovery and interpretation. It also allowed us to filter out
publications that were detrimental for analysis, such as advertisements.
The manual labeling method for topic analysis is centered around assigning topics
to publications based on a dictionary that is created by a panel of experts. A topic
can be assigned to a document based on the text of the publication (for example, if
a word is present in the text or a regular expression matches) or based on the set of
other topics already assigned to the document.
For example, when analyzing a dataset related to computer hardware, one can
define a topic called “motherboards.” This topic will be assigned to a publication if
it contains the word “motherboard” or the word “mainboard.” Other topics might be
“graphics cards” (with the words being “graphics card” or “display adapter”) and
“keyboards” (with the word “keyboard”). It would then be possible to define two
other topics. One is “core components”, assigned to the publications that already
have topics “motherboards” or “graphics cards” assigned to them. Another is the
“peripherals,” assigned to publications that have the topic “keyboards” assigned to
them. The hierarchy of topics can be infinitely deep. It would also be possible to
define the topic “computers” that would be assigned only to those publications that
have both the “core components” and the “peripherals” topics assigned to them. This
topic hierarchy is represented in Fig. 3.
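A small sketch of such a dictionary and its assignment rule, using the hypothetical hardware topics above (the patterns and structure are illustrative only):

import re

DICTIONARY = {
    "motherboards":    {"patterns": [r"\bmother\s?board\b", r"\bmainboard\b"]},
    "graphics cards":  {"patterns": [r"\bgraphics card\b", r"\bdisplay adapter\b"]},
    "keyboards":       {"patterns": [r"\bkeyboard\b"]},
    "core components": {"topics": {"motherboards", "graphics cards"}, "mode": "any"},
    "peripherals":     {"topics": {"keyboards"}, "mode": "any"},
    "computers":       {"topics": {"core components", "peripherals"}, "mode": "all"},
}

def label(text):
    assigned, changed = set(), True
    while changed:                       # iterate until the topic hierarchy stabilizes
        changed = False
        for topic, rule in DICTIONARY.items():
            if topic in assigned:
                continue
            by_text = any(re.search(p, text, re.I) for p in rule.get("patterns", []))
            deps = rule.get("topics", set())
            by_topic = deps and (deps <= assigned if rule.get("mode") == "all"
                                 else deps & assigned)
            if by_text or by_topic:
                assigned.add(topic)
                changed = True
    return assigned

label("Review of a new motherboard bundled with a mechanical keyboard")
# -> {'motherboards', 'keyboards', 'core components', 'peripherals', 'computers'}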
For our analysis, we created our own implementation of this method using the
Python programming language. In our implementation, the definition for each topic
contained sets of regular expressions (used to search for substrings in the text), sets
of other topics, or a combination of both.
In order to build the topic dictionary using expert opinion, we devised the
following cyclic procedure. (The experts were people with subject-matter knowledge
of the industry.) Starting with an initially empty dictionary, we sampled a
set of documents that did not have any topics assigned to them from our database.
After that, we had a panel of experts analyze the sample of the documents in order
to make additions and modifications to the dictionary so that all the sampled docu-
ments had topics assigned to them. In the case of our study, the experts identified
various discourses that existed in our dataset. They also identified the connections
that existed between the discourses that were already discovered. In this study, we
repeated this procedure until all unassigned documents remaining were not relevant
to our analysis (e.g., unclassified advertisements).
One of the biggest challenges inherent to this type of analysis is that it is both
time-consuming and susceptible to the experts' subjectivity. While the first problem
remains unsolved, we have tried to remedy the second by using a panel of multiple
experts. Additionally, we used independent coding followed by expert discussion of
the coding results to reach consensus and make the coding as objective as possible.
Another problem related to this method is that it does not make it possible to
determine the level of affinity of a publication to a particular topic, as our
implementation only allows for binary topic assignment. Still, the method allows
arbitrarily many topics from the dictionary to be assigned to a single publication,
so it does not require the one-topic-per-document assumption to hold.
Another issue with using this method (that it shares with other bag-of-words text
analysis methods) is that it largely ignores the structure of the text of the publications.
What makes this issue important is that particular care should be taken when building
the dictionary, since ambiguous entries can greatly affect the quality of the extracted
topics. For example, if the dictionary is designed to use single words to assign topics
to documents, using this method can lead to assigning topics to documents that are
not relevant to the topic. A general recommendation in this case is to include as much
context as possible when adding text-based entries to the dictionary.
While it was possible, our implementation of this method did not make use of the
additional metainformation available for the publications in our dataset. Since the
analysis performed for this study was largely exploratory, we reserved the use of
metainformation for future work. However, after performing this analysis, we used
the metadata available for the publications for further analysis.
After performing all the procedures outlined above, we discovered 189 topics
in the texts of publications in our dataset. We visualized them using the following
procedure. First, for each of the topics in the dictionary, we created a sample of the
publications related to the topic. Next, we used a pre-trained word2vec model to
convert the text of each of the publications of our samples to a vector with 1536
components. Next, for each of the topics we computed the component average of
the vectors of all the documents related to the topic. Then, we used t-SNE to embed
the average vectors into a 2-dimensional space. This visualization (or topic map) is
presented in Fig. 4.
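A sketch of this averaging-and-embedding step, with a generic fixed-width text encoder encode standing in for the pre-trained word2vec model (names are illustrative):

import numpy as np
from sklearn.manifold import TSNE

def topic_vectors(samples_by_topic, encode):
    # samples_by_topic: {topic: [text, ...]}; encode: text -> fixed-width np.ndarray.
    topics = sorted(samples_by_topic)
    # Component-wise average of the vectors of all documents related to each topic.
    means = np.vstack([np.mean([encode(doc) for doc in samples_by_topic[t]], axis=0)
                       for t in topics])
    xy = TSNE(n_components=2, init="random").fit_transform(means)
    return dict(zip(topics, xy))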
The sizes of the bubbles represent the number of documents in a dataset that are
related to each of the topics. As Fig. 4 shows, some topics are more prevalent in the
dataset than others. We have determined that topics 159, 12, 17, 1 and 121 are
the most common in the full corpus.
Topic 159 is related to the corporation under investigation, its social media
accounts and publications, and social media discussions centered around its public
image. It should be noted that this topic was defined using a single word, namely the
name of the Company, so it is quite general and requires care when interpreted.
Topics 12, 17 and 121 are related to each other. They describe a discourse centered
around one of the largest disasters that happened in the industry under investigation.
Topic 12 is more focused on the geographical area where the accident happened,
and the effects of the incident on the surrounding nature and environment. It also
contains publications dedicated to lives of those affected by the disaster. Topic 17
describes a discourse centered around both technical and cultural implications of the
accident and its containment, as well as related media. Topic 121 is focused on a
discourse surrounding the actual site of the accident, the present efforts for continued
containment, modern explorations of the site, and the ways the site of the accident
is represented in media.
Topic 1 describes a more general discourse that is focused on other disasters and
accidents related to the industry, and their cultural, environmental, and media impact.
It should be noted that the dictionary was constructed in a way that minimized the
intersection between this topic and topics 12, 17, and 121.
Drawbacks and benefits of each method described above are summarized in Table 5.
This table can serve as a roadmap for building a system for monitoring and evaluating
corporate policy effectiveness using computational tools for mining old and new
media data.
It should be noted that there are many possible avenues for improvement of the
analysis presented in this study. The first is to use other, more modern methods for
automated topic discovery. The methods used for this study only made use of the text
data available for publications but not the metainformation. For example, the topic
model proposed by [25] can take location and time data into account when modeling
topics from the corpus, and both time and location data are available in our dataset.
Another topic model proposed by [22] is a variant of LDA that can take connections
between documents into account. These data are also available in our dataset in the
form of connections between posts and reposts, posts and comments, comments and
replies to comments. It is entirely possible that including metadata into the topic
discovery process may allow for better, less correlated, and easier to interpret topics
to be discovered in our dataset.
Another possible avenue for improvement is through modifications of the manual
labeling topic discovery process. While it is better than having the experts read
millions of publications one by one, the process is still laborious and time-consuming.
One possible way to improve this process is by using active learning. It may be
possible to start with an automatically generated grouping of publications. These
groupings can be generated by one of two methods. The first is via a clustering
algorithm that allows for detection of publications that do not belong to any of the
clusters (e.g., DBSCAN).
The second is via a topic modeling algorithm. After automatic groups are generated,
the expert panel can work on interpreting and improving the discovered document
groupings.
It is also possible to combine both automated topic modeling approaches with
the manual labeling approach we applied in this study. This combined approach can
help with removing posts that are irrelevant for analysis, such as advertisements. It
can also help with manual labeling. As a result, it can help improve the performance
of the automated topic discovery models.
Continuing with recommendations for applying the methods described in this
chapter to a future system for monitoring and evaluating corporate policy effec-
tiveness, we would like to suggest the following improvements to our methodology.
Our first recommendation is to take special care when creating the search query used
to obtain the initial dataset for analysis. Since the topic modeling methods we used
were affected by irrelevant publications returned by our query, a better, narrower
query has a large potential for improving the performance of all the presented
models. Another recommendation is to use an algorithm to determine the number
of topics for the LDA model. We did not use one because of the project's tight
deadline and limited computational resources. In the future, however, using an
optimal number of topics for the LDA model should improve the quality and
interpretability of the discovered topics.
Notes Funding: The article was prepared within the framework of the HSE University Basic
Research Program.
References
10. Drury G (2008) Opinion piece: social media: should marketers engage and how can it be done
effectively? J Direct Data Digit Mark Pract 9(3):274–277. https://doi.org/10.1057/palgrave.
dddmp.4350096
11. Girdhar Y, Giguère P, Dudek G (2013) Autonomous adaptive underwater exploration using
online topic modeling. In: Desai JP et al (eds) Experimental robotics: the 13th interna-
tional symposium on experimental robotics. Springer tracts in advanced robotics. Springer
International Publishing, Heidelberg, pp 789–802. https://doi.org/10.1007/978-3-319-00065-
7_53
12. Griffiths TL, Steyvers M (2004) Finding scientific topics. Proc Natl Acad Sci 101(suppl
1):5228–5235. https://doi.org/10.1073/pnas.0307752101
13. Haley KJ, Fessler DMT (2005) Nobody’s watching? Subtle cues affect generosity in an anony-
mous economic game. Evol Hum Behav 26(3):245–256. https://doi.org/10.1016/j.evolhumbe
hav.2005.01.002
14. Hannigan TR et al (2019) Topic modeling in management research: rendering new theory from
textual data. Acad Manag Ann 13(2):586–632. https://doi.org/10.5465/annals.2017.0099
15. Hong L, Davison BD (2010) Empirical study of topic modeling in Twitter. In: Proceedings of
the first workshop on social media analytics. Association for computing machinery (SOMA
’10), New York, NY, USA, pp 80–88. https://doi.org/10.1145/1964858.1964870
16. Hoxha G (2015) Limited efficiency of civil society during the democratization process of
Albania. Thesis. Epoka University. http://dspace.epoka.edu.al/handle/1/1762. Accessed 17
November 2020
17. Jelodar H et al (2019) Latent Dirichlet allocation (LDA) and topic modeling: models, appli-
cations, a survey. Multimed Tools Appl 78(11):15169–15211. https://doi.org/10.1007/s11042-
018-6894-4
18. Kaplan AM (2015) Social media, the digital revolution, and the business of media. Int J Media
Manag 17(4):197–199. https://doi.org/10.1080/14241277.2015.1120014
19. Kaplan AM, Haenlein M (2010) Users of the world, unite! The challenges and opportunities
of social media. Bus Horiz 53(1):59–68. https://doi.org/10.1016/j.bushor.2009.09.003
20. Lasswell HD (1970) The emerging conception of the policy sciences. Policy Sci 1(1):3–14
21. Lau JH et al (2012) Word sense induction for novel sense detection. In: Proceedings of the 13th
conference of the European chapter of the association for computational linguistics. EACL
2012. Association for computational linguistics, Avignon, France, pp 591–601. https://aclant
hology.org/E12-1060. Accessed 21 July 2021
22. Liu Y, Xu S (2017) A local context-aware LDA model for topic modeling in a document
network. J Am Soc Inf Sci 68(6):1429–1448. https://doi.org/10.1002/asi.23822
23. Magalhães PC (2014) Government effectiveness and support for democracy. Eur J Polit Res
53(1):77–97. https://doi.org/10.1111/1475-6765.12024
24. McCorkindale T, DiStaso MW, Carroll C (2013) The power of social media and its influence
on corporate reputation. In: The handbook of communication and corporate reputation, vol 9,
no. 1, pp 497–512
25. Mei Q, Zhai C (2006) A mixture model for contextual text mining. In: Proceedings of the 12th
ACM SIGKDD international conference on knowledge discovery and data mining—KDD ’06;
The 12th ACM SIGKDD international conference, Philadelphia, PA, USA. ACM Press, p 649.
https://doi.org/10.1145/1150402.1150482
26. Mimno D et al (2011) Optimizing semantic coherence in topic models. In: Proceedings of
the 2011 conference on empirical methods in natural language processing, EMNLP 2011.
Association for computational linguistics, Edinburgh, Scotland, UK, pp 262–272. https://acl
anthology.org/D11-1024. Accessed 13 July 2021
27. Mulhern F (2009) Integrated marketing communications: from media channels to digital
connectivity. J Mark Commun 15(2–3):85–101. https://doi.org/10.1080/13527260902757506
28. Niemi RG, Craig SC, Mattei F (1991) Measuring internal political efficacy in the 1988 national
election study. Am Polit Sci Rev 85(4):1407–1413. https://doi.org/10.2307/1963953
29. Pentina I, Tarafdar M (2014) From “information” to “knowing”: exploring the role of social
media in contemporary news consumption. Comput Hum Behav 35:211–223. https://doi.org/
10.1016/j.chb.2014.02.045
30. Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus
genotype data. Genetics 155(2):945–959
31. Qualman E (2012) Socialnomics: how social media transforms the way we live and do business.
Wiley
32. Ramamonjisoa D (2014) Topic modeling on users’s comments. In: 2014 third ICT international
student project conference (ICT-ISPC), pp 177–180. https://doi.org/10.1109/ICT-ISPC.2014.
6923245
33. Rani S, Kumar M (2021) Topic modeling and its applications in materials science and
engineering. Mater Today Proc 45:5591–5596. https://doi.org/10.1016/j.matpr.2021.02.313
34. Rayp G, Sijpe NVD (2007) Measuring and explaining government efficiency in developing
countries. J Dev Stud 43(2):360–381. https://doi.org/10.1080/00220380601125230
35. Segalovich I (2003) A fast morphological algorithm with unknown word guessing induced by
a dictionary for a web search engine. In: MLMTA, p 273
36. Tewolde MH, Gubán P (2010) The means of analysis and evaluation for corporate performances.
Ann Univ Apulensis Ser Oecon 1(12):738–749. https://doi.org/10.29302/oeconomica.2010.12.
1.43
37. Wang C, Blei DM (2011) Collaborative topic modeling for recommending scientific articles.
In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery
and data mining. Association for computing machinery (KDD’11), New York, NY, USA, pp
448–456. https://doi.org/10.1145/2020408.2020480
38. Wang H et al (2019) Optimization of topic recognition model for news texts based on LDA. J
Digital Inf Manag 17(5):257. https://doi.org/10.6025/jdim/2019/17/5/257-269
39. Wen HJ, Lim B, Lisa Huang H (2003) Measuring e-commerce efficiency: a data envelopment
analysis (DEA) approach. Ind Manag Data Syst 103(9):703–710. https://doi.org/10.1108/026
35570310506124
40. Yau C-K et al (2014) Clustering scientific documents with topic modeling. Scientometrics
100(3):767–786. https://doi.org/10.1007/s11192-014-1321-8
41. Zhao W, Chen JJ, Perkins R, Liu Z, Ge W, Ding Y, Zou W (2015) A heuristic approach to
determine an appropriate number of topics in topic modeling. BMC Bioinf 16(13):1–10
Optimal Design of Checkpoint Systems
with General Structures, Tasks and
Schemes
Abstract This chapter proposes some kinds of checkpoint systems with general
structures, tasks and schemes. We have already considered redundancy techniques
which are duplex and majority systems, and have applied them to two checkpoint
models in which their interval times are constant and random. Giving overheads for
checkpoints, we have obtained the mean execution times until the process succeeds,
and have derived optimal checkpoint times to minimize them. In this chapter, we first
introduce the standard checkpoint model, and propose general checkpoint models
which include parallel, series and bridge systems. Furthermore, we consider tandem
and bulk tasks, and apply them to two schemes and compare optimal policies the-
oretically and numerically. Finally, as examples of the above models, we give four
models, obtain their mean execution times analytically and discuss which scheme is
better numerically.
1 Introduction
It is of great importance to design computer systems with high reliability. In
particular, technologies have recently been advancing to build spacecraft bound for
Mars, the International Space Station, self-driving cars, and so on. All of them
mostly consist of computing units that need high reliability and high-speed pro-
cessing. Therefore, we have to design such computer systems in advance with high
quality. However, we cannot eliminate all computer errors, and to mask these
errors, we have to design highly reliable computers with multiple units and fault-
tolerant performance.
Some errors may occur due to space radiation, electromagnetic waves, low-quality
hardware, element overheating and overcurrent, and so on. These errors lead
to faults or failures and might cause serious damage to systems. To prevent such
faults, several types of fault-tolerant technologies, including system redundancies and
configurations, have been offered [1, 5, 13]. Using such techniques, we can achieve
high reliability and effective performance of the target computer systems in actual
fields.
Failures due to errors may be found, and computer systems then lose their consistency.
To protect against such critical incidents, several recovery methods are required to restore
a consistent state just before failure. The most standard method is to take copies
of normal states at suitable times, which are called checkpoint times. When errors of
the process occur and are detected, we execute a rollback operation to the nearest
checkpoint time and restore the consistent state of the process.
The simplest scheme for error detection and recovery techniques is as follows [6]: We
execute two independent modules and compare their two states at checkpoint times.
If the two states do not match each other, we go back to the newest checkpoint and
retry. Furthermore, we recently considered six modules with duplex
and majority schemes, which consist of three clusters with two modules and two
clusters with three modules, and compared them [9].
Several studies have been conducted to determine the optimal checkpoint fre-
quencies: The reliability and performance of a double modular system with one
spare module were evaluated [7, 11]. In addition, the performance of checkpoint
schemes with task duplication was evaluated [14, 15]. An optimal instruction retry
period that minimizes the possibility of dynamic failures using the triple modular
controller was derived [4]. Evaluation models with finite checkpoints and bounded
rollback were discussed [10].
In this chapter, we obtain the mean execution time to complete the computer
processing for the standard checkpoint model, using renewal processes, and
derive optimal periodic checkpoint intervals in Sect. 2. We make a generalization of
checkpoint models in Sect. 3: We consider a K-out-of-n structure in Sect. 3.1 and
obtain the reliability of general redundant systems which include parallel, series,
majority decision, and bridge systems in Sect. 3.2. We take up Tandem and Bulk tasks,
and compare which task is better numerically in Sect. 3.3. Section 3.4 introduces two
schemes of checkpoints such as compare-checkpoint and store-checkpoint. To apply
the above models to practical ones easily, we give four examples of random tasks in
Sect. 4: We consider a parallel structure with double modules in Sect. 4.1, 4 tasks with
3 schemes in Sect. 4.2, 6 tasks with 9 schemes with double modules in Sect. 4.3(1),
and triple majority modules in Sect. 4.3(2). We obtain the mean execution times of
each model, show their numerical examples, and discuss which model is better.
2 Standard Model
Suppose that S (0 < S < ∞) is the native execution time of the process and does not
include any checkpoint overheads. It is assumed that errors occur according to
a general distribution F(t) with density f(t), and the failure rate is h(t) ≡ f(t)/F̄(t),
where Φ̄(t) ≡ 1 − Φ(t) for any function Φ(t). To detect such
errors, we divide S equally into N (N = 1, 2, . . .) time intervals and place periodic
checkpoints at planned times kT (k = 1, 2, . . .), where T ≡ S/N. If we detect errors
at a checkpoint time, we roll back to the previous checkpoint and re-execute the process,
where it is assumed that errors occur according to an identical distribution F(t) in each
checkpoint interval ((k − 1)T, kT] and the failure rate h(t) increases with t.
Introduce the constant overhead C1 of one checkpoint. Then, the mean time L(N) to
complete the process with native execution time S is the total of the execution times
and the overheads C1 for the checkpoints. From the assumption that if some errors are
detected at a checkpoint time, the process is rolled back to the previous checkpoint,
the mean execution time for each checkpoint interval ((k − 1)T, kT] is given by the
renewal equation [8, p. 126]:

L(1) = (T + C1)F̄(T) + [T + C1 + L(1)]F(T).   (1)

Solving (1) with respect to L(1),

L(1) = (T + C1)/F̄(T).   (2)

Thus, the mean time to complete the process is

L(N) ≡ N L(1) = N(T + C1)/F̄(T) = (S + N C1)/F̄(S/N) = S(1 + C1/T)/F̄(T).   (3)
We find optimal N1∗ to minimize L(N). For this purpose, we derive optimal
T̃1 (0 < T̃1 ≤ S) to minimize

L̃(T) ≡ (1 + C1/T)/F̄(T).

Differentiating L̃(T) with respect to T and setting it equal to zero,

T(T + C1)h(T) − C1 = 0,   (4)

whose left-hand side increases strictly with T from −C1 to ∞. Thus, there exists a
finite T̃1 (0 < T̃1 < ∞) which satisfies (4).
Therefore, we have the following optimal number N1∗, using the partition method
[8, p. 42]:
(i) When T̃1 < S, we set N ≡ [S/T̃1] and calculate L(N) from (3). If L(N) ≤
L(N + 1), then N1∗ = N, and conversely, if L(N) > L(N + 1), then N1∗ = N + 1.
(ii) When T̃1 ≥ S, N1∗ = 1, i.e., we do not place any checkpoint, and the mean time
is (S + C1)/F̄(S).
Note that T̃1 given in (4) does not depend on S. Thus, if S would be very large, be
changed greatly, or be uncertain, then we may adopt T̃1 as an approximate checkpoint
time for T1∗. For example, when S is a random variable with a distribution L(t) and
mean l,

L(N) = l(1 + C1/T)/F̄(T).

When F(t) = 1 − e^{−λt}, (4) becomes

T² + C1T − C1/λ = 0.   (5)

Solving (5) with respect to T,

T̃1 = (C1/2)[√(1 + 4/(λC1)) − 1].   (6)
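As a numerical illustration, (6) and the partition method (i)–(ii) can be coded in a few lines of Python (a sketch for exponential errors; the parameter values are arbitrary):

import math

def optimal_checkpoints(S, C1, lam):
    L = lambda N: (S + N * C1) * math.exp(lam * S / N)  # Eq. (3) with F-bar(T) = e^{-lam T}
    T1 = (C1 / 2.0) * (math.sqrt(1.0 + 4.0 / (lam * C1)) - 1.0)  # Eq. (6)
    if T1 >= S:                                         # case (ii): no checkpoint
        return 1, (S + C1) * math.exp(lam * S)
    N = max(1, int(S // T1))                            # case (i): partition method
    return (N, L(N)) if L(N) <= L(N + 1) else (N + 1, L(N + 1))

optimal_checkpoints(S=1.0, C1=0.05, lam=0.15)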
We propose general structures with redundant modules to mask errors and detect them
at checkpoint times. As one such structure, we consider the following K-out-
of-n system, whose theoretical properties and practical applications were extensively
collected [12]: A K-out-of-n (1 ≤ K ≤ n) structure can operate if and only if at least
K modules of the total n units are operable, and their reliability characteristics were
investigated. When errors of each module occur according to an identical distribution
F(t), the reliability of the structure at time t is

R_K(t) = Σ_{j=K}^{n} C(n, j) F̄(t)^j F(t)^{n−j},   (7)

where C(n, j) denotes the binomial coefficient.
More generally,

R_P(t) = Σ_{k=0}^{n} p_k Σ_{j=k}^{n} C(n, j) F̄(t)^j F(t)^{n−j} = Σ_{k=1}^{n} P_k C(n, k) F̄(t)^k F(t)^{n−k}.   (8)

Note that P_k represents the probability that when k modules have no error, the struc-
ture is normal, i.e., the process executes correctly.
When P_k = 1 (k = 1, 2, . . . , n),

R_M(t) = Σ_{k=m+1}^{2m+1} C(2m+1, k) F̄(t)^k F(t)^{2m+1−k}.   (12)
In particular, when m = 1,

L_M(N) = (S + N C2)/(3e^{−2λS/N} − 2e^{−3λS/N}) = S(1 + C2/T)/(3e^{−2λT} − 2e^{−3λT}) ≡ L_M(T).   (15)
whose left-hand side increases strictly with T from 0 to ∞. Thus, there exists a finite
and unique T̃2 (0 < T̃2 < ∞) which satisfies (16), and using the partition method in
Sect. 2, we obtain optimal N2∗ and T2∗ which minimize L_M(N) in (15). Therefore,
for any structure with reliability R_P(t) in (8), we obtain the mean time in (3) and
discuss optimal policies to minimize it.
Table 1 presents optimal N2∗ and the mean execution time L M (N2∗ ) in (15) for λ
and C2 . For example, when S = 1.0, λ = 0.150, C2 = 0.05, optimal N2∗ is 3. Thus,
Table 1 Optimal N2∗ and mean time L_M(N2∗) when S = 1.0, G(t) = 1 − e^{−t}

λ     | C2 = 0.001: N2∗, L_M(N2∗) | C2 = 0.005: N2∗, L_M(N2∗) | C2 = 0.05: N2∗, L_M(N2∗)
0.150 | 5, 1.008                  | 3, 1.022                  | 3, 1.117
0.125 | 4, 1.007                  | 3, 1.020                  | 1, 1.092
0.100 | 4, 1.006                  | 2, 1.017                  | 1, 1.077
0.075 | 3, 1.005                  | 2, 1.014                  | 1, 1.066
0.050 | 2, 1.004                  | 2, 1.012                  | 1, 1.057
0.025 | 2, 1.002                  | 1, 1.007                  | 1, 1.052
0.001 | 1, 1.001                  | 1, 1.005                  | 1, 1.050
when we place 3 checkpoints until time S, it is optimal and the mean execution
time is 1.117. The table indicates that N2∗ increases with λ from 1 and decreases with C2,
and that L_M(N2∗) increases with λ and C2 from S = 1.0. This means that if the error rate λ
is small and the overhead C2 is large, we should not place any checkpoints until time S.
Suppose that the process executes N tasks (N = 1, 2, . . .), each of which has a
processing time Y j ( j = 1, 2, . . . , N ) with an identical distribution G(t) ≡ Pr{Y j ≤
t} with finite mean 1/θ and is executed successively, which is called Tandem task in
Fig. 2.
Supposing that T is a random variable with distribution G(t) in Sect. 2, the mean time to
complete the process is, from (1),

L_T(N) ≡ ∫₀^∞ {(t + N C1)F̄(t) + [t + N C1 + L_T(N)]F(t)} dG^{(N)}(t),   (17)

where G^{(N)}(t) is the N-fold Stieltjes convolution of G(t) with itself, and G^{(0)}(t) ≡ 1
for t ≥ 0. Solving (17) with respect to L_T(N), the mean time to complete the process
is

L_T(N) = N(1/θ + C1) / ∫₀^∞ F̄(t) dG^{(N)}(t).   (18)
Suppose that the process executes N tasks simultaneously, which is called a Bulk
task in Fig. 3. If some tasks with errors are detected, then the process returns to the
previous checkpoint and re-executes all N tasks, and it ends when all N tasks have
no errors. Then, letting C_N be the overhead for N tasks, the mean time to complete
the process is

L_B(N) = ∫₀^∞ {(t + C_N)F̄(t) + [t + C_N + L_B(N)]F(t)} dG(t)^N.   (19)

Thus, if C_N = N C1, then the Bulk task is better than the Tandem one, because we can
begin to execute all N tasks simultaneously in the Bulk case. However, the overhead
C_N might be larger than N C1.
Table 2 Overhead C̃2 and mean time L_B(2) when N = 2, L_T(2) = L_B(2), C1 = 1, G(t) = 1 − e^{−θt} and F(t) = 1 − e^{−λt}

λ      | θ = 1.00: C̃2, L_B(2) | θ = 0.75: C̃2, L_B(2) | θ = 0.50: C̃2, L_B(2) | θ = 0.25: C̃2, L_B(2)
0.1    | 2.690, 4.840 | 2.958, 5.994 | 3.545, 8.640 | 5.667, 19.600
0.05   | 2.598, 4.410 | 2.817, 5.310 | 3.286, 7.260 | 4.909, 14.400
0.01   | 2.520, 4.080 | 2.698, 4.792 | 3.059, 6.242 | 4.196, 10.816
0.005  | 2.510, 4.040 | 2.682, 4.729 | 3.030, 6.121 | 4.099, 10.404
0.001  | 2.502, 4.008 | 2.670, 4.679 | 3.006, 6.024 | 4.020, 10.080
0.0005 | 2.501, 4.004 | 2.668, 4.673 | 3.003, 6.012 | 4.010, 10.040
0.0001 | 2.500, 4.001 | 2.667, 4.668 | 3.001, 6.002 | 4.002, 10.008
When G(t) = 1 − e^{−θt} and F(t) = 1 − e^{−λt},

L_T(2) = 2(1/θ + C1) / [θ/(θ + λ)]²,    L_B(2) = [3/(2θ) + C2] / {2θ²/[(θ + λ)(2θ + λ)]}.

Thus, if

2θ(λ + 2θ)C2 − 8θ(θ + λ)C1 > 5λ + 2θ,

then the Tandem task is better than the Bulk one.
Table 2 presents the overhead C̃2 and mean time L_B(2) for λ and θ when N = 2,
L_T(2) = L_B(2) and C1 = 1. For example, when λ = 0.05 and θ = 1.00, C̃2 = 2.598
and L_B(2) = L_T(2) = 4.410. The values of C̃2 and L_B(2) increase with λ and 1/θ.
This indicates that if C2 > C̃2, then the Tandem task is better than the Bulk one.
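The break-even overhead C̃2 in Table 2 can be recomputed directly from the two closed forms above; a short Python check:

def break_even_C2(lam, theta, C1=1.0):
    LT2 = 2.0 * (1.0 / theta + C1) / (theta / (theta + lam)) ** 2
    denom = 2.0 * theta ** 2 / ((theta + lam) * (2.0 * theta + lam))
    C2 = LT2 * denom - 3.0 / (2.0 * theta)   # solve L_B(2) = L_T(2) for C2
    return C2, LT2

break_even_C2(lam=0.05, theta=1.0)   # approx (2.598, 4.410), as in Table 2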
For Scheme 1 with double modules,

L1(1) = (T + C2)/e^{−2λT} + Cs.

Thus, the mean time to complete the process is

L1(N) ≡ L1(T) ≡ N L1(1) = (S + N C2)/e^{−2λS/N} + N Cs = S[(1 + C2/T)/e^{−2λT} + Cs/T].   (22)
Table 3 Optimal N1∗, N2∗ and mean times L1(N1∗), L2(N2∗) when S = 1.0, Cs = 0.1, G(t) = 1 − e^{−t/μ}

λ     | Scheme 1, C2 = 0.001: N1∗, L1(N1∗) | Scheme 1, C2 = 0.005: N1∗, L1(N1∗) | Scheme 2, C2 = 0.001: N2∗, L2(N2∗) | Scheme 2, C2 = 0.005: N2∗, L2(N2∗)
0.200 | 2, 1.424 | 2, 1.434 | 14, 1.365 | 6, 1.409
0.150 | 2, 1.364 | 2, 1.373 | 12, 1.295 | 6, 1.331
0.125 | 2, 1.335 | 2, 1.344 | 11, 1.262 | 5, 1.294
0.100 | 2, 1.307 | 2, 1.316 | 10, 1.229 | 5, 1.258
0.075 | 1, 1.263 | 1, 1.268 | 9, 1.198  | 4, 1.221
0.050 | 1, 1.206 | 1, 1.211 | 7, 1.167  | 3, 1.185
0.025 | 1, 1.152 | 1, 1.157 | 5, 1.136  | 2, 1.149
0.001 | 1, 1.103 | 1, 1.107 | 1, 1.103  | 1, 1.107
We find optimal T1∗ and T2∗ to minimize L1(T) in (22) and L2(T) in (24).
Differentiating L1(T) with respect to T and setting it equal to zero yields (25);
thus, there exists a finite and unique T1∗ (0 < T1∗ < ∞) which satisfies (25).
Differentiating L2(T) with respect to T and setting it equal to zero,

(1/(2λ))(e^{2λT} − 1) − T = C2.   (26)

Thus, there exists a finite and unique T2∗ (0 < T2∗ < ∞) which satisfies (26). Using
the partition method in Sect. 2, we can get optimal N1∗ and N2∗ to minimize L1(N)
and L2(N), respectively.
Example 3 Table 3 presents optimal N1∗ and N2∗ and the mean times L1(N1∗) in (22)
and L2(N2∗) in (24) for λ and C2 when Cs = 0.1 and S = 1. For example, when
λ = 0.100 and C2 = 0.001, N1∗ = 2 and N2∗ = 10, i.e., we should place 1 CSCP in
Scheme 1 and 9 CCPs in Scheme 2. This indicates that both N1∗ and N2∗ increase
with λ and decrease with C2.
We can give four checkpoint models with random tasks by combining structures,
tasks and schemes.
4.1 Model 1
For Scheme 1,

L1(1) = (C2 + 1/θ)/G*(2λ) + Cs,   (28)

where G*(s) ≡ ∫₀^∞ e^{−st} dG(t) for Re(s) ≥ 0. Thus, the mean time to complete the
process is

L1(N) ≡ N L1(1) = N[(C2 + 1/θ)/G*(2λ) + Cs].   (29)
For Scheme 2,

l2(N) = C2 + 1/θ + ∫₀^∞ [Cs e^{−2λt} + l2(1)(1 − e^{−2λt})] dG(t).   (30)
whose left-hand side increases strictly with N to ∞. Thus, there exists a finite and
unique minimum N2∗ (1 ≤ N2∗ < ∞) which satisfies (33). If

[1 − G*(2λ)] / [G*(2λ)]² ≥ Cs/(C2 + 1/θ),

then N2∗ = 1.
Example 4 Table 4 presents optimal N2∗ and the mean times L2(N2∗) in (32) and
L1(N2∗) in (29). We compute N2∗ and compare L2(N2∗) and L1(N2∗) for λ and Cs when
C2 = 1 × 10⁻⁴, Cs = 0.1 and 1/θ = 1. For example, when λ = 0.005, C2 = 0.001,
optimal N2∗ is 4. Thus, when we make 3 CCPs and 1 CSCP for 4 tasks, it is optimal
and the mean execution time is 1.051. Clearly, Scheme 2 is better than Scheme 1
with 4 tasks.
4.2 Model 2
We consider 3 schemes with 4 tasks in Fig. 5 for double modules, and discuss which
schemes are better.
When G(t) = 1 − e^{−θt} and F(t) = 1 − e^{−λt}, the mean execution times of each
scheme are obtained as follows: Let lk (k = 1, 2, 3, 4) be the mean time from task k to
completion of the final task 4. For Scheme 1,

l1 = ∫₀^∞ (Cs + t + l2) e^{−2λt} dG(t) + ∫₀^∞ (C2 + t + l1)[1 − e^{−2λt}] dG(t),
l2 = ∫₀^∞ (Cs + t) e^{−2λt} dG^{(3)}(t) + ∫₀^∞ (C2 + t + l2)[1 − e^{−2λt}] dG^{(3)}(t).

For Scheme 2,

l1 = ∫₀^∞ (Cs + t + l3) e^{−2λt} dG^{(2)}(t) + ∫₀^∞ (C2 + t + l1)[1 − e^{−2λt}] dG^{(2)}(t),
l3 = ∫₀^∞ (Cs + t) e^{−2λt} dG^{(2)}(t) + ∫₀^∞ (C2 + t + l3)[1 − e^{−2λt}] dG^{(2)}(t).

For Scheme 3,

l1 = ∫₀^∞ (Cs + t + l4) e^{−2λt} dG^{(3)}(t) + ∫₀^∞ (C2 + t + l1)[1 − e^{−2λt}] dG^{(3)}(t),
l4 = ∫₀^∞ (Cs + t) e^{−2λt} dG(t) + ∫₀^∞ (C2 + t + l4)[1 − e^{−2λt}] dG(t).
This means that we should place the checkpoint at the middle point of the number
of tasks.
4.3 Model 3
The mean execution times L1 ∼ L9 of the 9 schemes with 6 tasks are:

L1 = Cs − C2 + (C2 + 6/θ)/G*(2λ)⁶,
L2 = 2[Cs − C2 + (C2 + 3/θ)/G*(2λ)³],
L3 = 3[Cs − C2 + (C2 + 2/θ)/G*(2λ)²],
L4 = 6[Cs − C2 + (C2 + 1/θ)/G*(2λ)],
L5 = Cs − C2 + (C2 + 3/θ)[1 + G*(2λ)³]/G*(2λ)⁶,
L6 = Cs − C2 + (C2 + 2/θ)[1 + G*(2λ)² + G*(2λ)⁴]/G*(2λ)⁶,
L7 = Cs − C2 + (C2 + 1/θ)[1 + G*(2λ) + G*(2λ)² + G*(2λ)³ + G*(2λ)⁴ + G*(2λ)⁵]/G*(2λ)⁶,
L8 = 2{Cs − C2 + (C2 + 1/θ)[1 + G*(2λ) + G*(2λ)²]/G*(2λ)³},
L9 = 3{Cs − C2 + (C2 + 1/θ)[1 + G*(2λ)]/G*(2λ)²}.
For Scheme 1,

l1 = ∫₀^∞ (Cs + t) e^{−2λt} dG^{(6)}(t) + ∫₀^∞ (C2 + t + l1)[1 − e^{−2λt}] dG^{(6)}(t)
   = (Cs − C2) G*(2λ)⁶ + C2 + 6/θ + l1[1 − G*(2λ)⁶].

Solving it with respect to l1,

L1 ≡ l1 = Cs − C2 + (C2 + 6/θ)/G*(2λ)⁶.
For Scheme 7,

lk = ∫₀^∞ (C2 + t + l_{k+1}) e^{−2λt} dG(t) + ∫₀^∞ (C2 + t + l1)[1 − e^{−2λt}] dG(t)   (k = 1, 2, 3, 4, 5),
l6 = ∫₀^∞ (Cs + t) e^{−2λt} dG(t) + ∫₀^∞ (C2 + t + l1)[1 − e^{−2λt}] dG(t).

Solving these with respect to l1,

L7 ≡ l1 = Cs − C2 + (C2 + 1/θ)[1 + G*(2λ) + G*(2λ)² + G*(2λ)³ + G*(2λ)⁴ + G*(2λ)⁵]/G*(2λ)⁶.
For a general structure with reliability R_P(t) in (8), G*(2λ) = ∫₀^∞ e^{−2λt} dG(t) in the
above equations is replaced by

Σ_{k=1}^{n} P_k C(n, k) ∫₀^∞ F̄(t)^k F(t)^{n−k} dG(t).   (34)
Table 7 Mean times L 1 ∼ L 9 when Cs = 0.1, C2 = 0.001 and G(t) = 1 − e−t for triple majority
modules
λ L1 L2 L3 L4 L5 L6 L7 L8 L9
0.1 7.692 6.949 6.790 6.840 7.272 7.136 7.004 6.697 6.668
0.05 6.545 6.419 6.445 6.672 6.433 6.397 6.362 6.349 6.412
0.01 6.121 6.210 6.307 6.603 6.116 6.116 6.117 6.211 6.308
0.005 6.105 6.203 6.302 6.601 6.105 6.106 6.108 6.206 6.304
0.001 6.100 6.200 6.300 6.600 6.101 6.102 6.105 6.204 6.303
0.0005 6.100 6.200 6.300 6.600 6.101 6.102 6.105 6.204 6.303
Table 8 Mean times L 1 ∼ L 9 when Cs = 0.1, λ = 0.05 and G(t) = 1 − e−t for triple majority
modules
C2 L1 L2 L3 L4 L5 L6 L7 L8 L9
0.05 6.549 6.422 6.448 6.676 6.487 6.502 6.620 6.552 6.564
0.01 6.546 6.419 6.446 6.673 6.443 6.416 6.410 6.387 6.440
0.005 6.546 6.419 6.445 6.672 6.437 6.405 6.383 6.366 6.424
0.001 6.545 6.419 6.445 6.672 6.433 6.397 6.362 6.349 6.412
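The entries of Tables 7 and 8 can be recomputed from L1 ∼ L9 by making the replacement (34): for triple majority modules with exponential G, G*(2λ) is replaced by 3G*(2λ) − 2G*(3λ), where G*(s) = θ/(θ + s). A short Python sketch:

def model3_means(lam, Cs=0.1, C2=0.001, theta=1.0):
    g = lambda s: theta / (theta + s)            # G*(s) for G(t) = 1 - e^{-theta t}
    q = 3.0 * g(2.0 * lam) - 2.0 * g(3.0 * lam)  # triple-majority factor from (34)
    L1 = Cs - C2 + (C2 + 6.0 / theta) / q ** 6
    L2 = 2.0 * (Cs - C2 + (C2 + 3.0 / theta) / q ** 3)
    L5 = Cs - C2 + (C2 + 3.0 / theta) * (1.0 + q ** 3) / q ** 6
    return round(L1, 3), round(L2, 3), round(L5, 3)

model3_means(lam=0.1)   # (7.692, 6.949, 7.272), matching the first row of Table 7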
5 Conclusion
We have proposed the optimal design of checkpoint systems with general structures,
tasks and schemes for random checkpoint models. In particular, we have considered
several useful checkpoint systems, obtained their mean execution times, and com-
pared them theoretically and numerically. This would be useful for system designers
who need highly reliable and high-speed computing systems. When system designers
choose an appropriate system from this chapter, they will be able to build optimal
checkpoint systems. Furthermore, if the system failure rates are known, they can
easily choose an optimal kind of checkpoint scheme and get its checkpoint times.
This study can be applied to several systems with high-reliability requirements,
such as spacecraft, stock exchange systems, aircraft, and drones, because such systems
need to provide high reliability and performance by implementing recovery methods
and redundancy techniques.
References
4. Kim H, Shin KG (1996) Design and analysis of an optimal instruction-retry policy for TMR
controller computers. IEEE Trans Comput 45(11):1217–1225
5. Lee PA, Anderson T (1990) Fault tolerance principles and practice. Dependable computing
and fault-tolerant systems. Springer, Wien
6. Nakagawa S, Fukumoto S, Ishii N (2003) Optimal checkpointing intervals of three error detec-
tion schemes by a double modular redundancy. Math Comput Model 38:1357–1363
7. Nakagawa S, Okuda Y, Yamada S (2003) Optimal checkpointing interval for task duplication
with spare processing. In: Ninth ISSAT international conference on reliability and quality in
design, Honolulu, Hawaii, vol 2003, pp 215–219
8. Nakagawa T (2008) Advanced reliability models and maintenance policies. Springer, London
9. Naruse K, Nakagawa T (2020) Optimal checkpoint intervals, schemes and structures for com-
puting modules. In: Pham H (ed) Reliability and statistical computing. Springer, pp 265–287
10. Ohara M, Suzuki R, Arai M, Fukumoto S, Iwasaki K (2006) Analytical model on hybrid state
saving with a limited number of checkpoints and bound rollbacks (reliability, maintainability
and safety analysis). IEICE Trans Fundam Electron Commun Comput Sci 89(9):2386–2395
11. Pradhan DK, Vaidya NH (1992) Rollforward checkpointing scheme: concurrent retry with
nondedicated spares. IEEE Computer Society Press, pp 166–174
12. Ram M, Dohi T (2019) Systems engineering: reliability analysis using k-out-of-n structures.
CRC Press
13. Siewiorek DP, Swarz RS (eds) (1982) The theory and practice of reliable system design. Digital
Press, Bedford, Massachusetts
14. Ziv A, Bruck J (1997) Performance optimization of checkpointing schemes with task duplica-
tion. IEEE Trans Comput 46:1381–1386
15. Ziv A, Bruck J (1998) Analysis of checkpointing schemes with task duplication. IEEE Trans
Comput 47:222–227
Series System Reliability: A Unified
Approach Using Fatal Shock and Stress
Models
Abstract This chapter introduces a new approach to estimating the reliability and
series-system reliability functions based on the structure of shock and stress mod-
els. Conceptually, a shock may refer to a deterioration process that destroys the entire
system. Conversely, stress can be related to two components kept working inde-
pendently, both having maintenance scheduled at a fixed time. In this context, we
introduce multivariate models derived from those structures to evaluate the reliability
function of an n-component series system.
1 Introduction
2 Methodology
where S(t, t) denotes the joint survival function (sf) and T = min{T1, T2}, that is,
R(t) = P(T > t) = P(T1 > t, T2 > t) = S(t, t).
shock sources in the environment where a 2-component system is placed. Following
[43], such a structure is defined as follows.
P(X > x, Y > y) = P(W1 > x)P(W2 > y)P(W3 > max{x, y}).
Definition 2 ([43]) For a 3-component system, following the same structure
as for a 2-component series system, we have:
1. W1(t, δ1), W2(t, δ2), and W3(t, δ3) represent the fatal shocks, respectively,
to components 1, 2, and 3;
2. W12(t, δ12), W13(t, δ13), and W23(t, δ23) represent the fatal shocks, respec-
tively, to the component pairs (1, 2), (1, 3), and (2, 3);
P(X > x, Y > y, Z > z) = P(W1 > x)P(W2 > y)P(W3 > z)
× P(W12 > max{x, y})
× P(W13 > max{x, z})
× P(W23 > max{y, z})
× P(W123 > max{x, y, z}).
For the structure of a stress model, also introduced by [12, 22], it is assumed that
a 2-component system is subject to individual and independent stresses, say U1 and
U2 . Besides, the authors suppose that the system has an overall stress source, U3 ,
which is transmitted to both components equally and independently. The stresses for
the components are X 1 = max{U1 , U3 } and X 2 = max{U2 , U3 }. Similar arguments
hold for the multivariate sf of the shock model. Here, the difference in the estimation
of the reliability, in the 2-component case, is that the reliability function of the
system (T = min{X1, X2}), under the dependence assumption, is given by
Without loss of generality, suppose a series system with two components. Naturally,
the procedure for an n-component system is analogous. In general, the existing liter-
ature on the reliability of 2-component series systems is based only on the failure
times and an indication of which component has failed. Specifically, we have the
complete lifetime for one component but only partial information for the other,
since it may not have failed yet. In this case, we have T = min{T1, T2}.
Now, given a random sample (T1, . . . , Tm) of size m from a series system with
two components, one can define the following indicator:

δi = 1 if T1i < T2i,  and  δi = 0 if T1i ≥ T2i.
Hence, the individual contributions of T1i and T2i to the likelihood function of
the vector β are given by

P(T1i = t1i, T2i > t2i) = −∂S(t1i, t2i)/∂t1i, if δi = 1,

and

P(T1i > t1i, T2i = t2i) = −∂S(t1i, t2i)/∂t2i, if δi = 0.

Therefore, the likelihood function is

L(β) = Π_{i=1}^{m} [−∂S(t1i, t2i)/∂t1i]^{δi} [−∂S(t1i, t2i)/∂t2i]^{1−δi}.   (1)
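As an illustration of (1), assume a Marshall–Olkin exponential shock structure; then T = min{T1, T2} is exponential with rate β1 + β2 + β3, a failure with δi = 1 is driven by W1 (rate β1), and one with δi = 0 by W2 or the common shock W3 (rate β2 + β3). Under this assumption, the log-likelihood reduces to the competing-risks sketch below (data are hypothetical):

import numpy as np

def loglik(beta, t, delta):
    b1, b2, b3 = beta
    rate = b1 + b2 + b3                         # hazard of T = min{T1, T2}
    cause = np.where(delta == 1, b1, b2 + b3)   # hazard of the observed cause
    return np.sum(np.log(cause) - rate * t)

t = np.array([3.2, 1.7, 5.9])
delta = np.array([1, 0, 1])
loglik((0.07, 0.05, 0.01), t, delta)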
Note that Eq. (1) could have no compact form depending on the probability dis-
tributions adopted for each component. This implies that the maximum likelihood
estimator and the Fisher information matrix of β could only be obtained using opti-
mization algorithms. Nevertheless, from a fully Bayesian perspective, one may use
a Gibbs Sampling (GS) algorithm, which is a Markov Chain Monte Carlo (MCMC)
method (see [45, 46]) for obtaining posterior inferences.
3 Probabilistic Models
Model | Variables | Joint distribution function | Marginal distribution functions
BR-II | W1 ∼ Ray(β1), W2 ∼ Ray(β2), W3 ∼ Ray(β3) | [1 − exp(−z²/(2β3²))] Π_{i=1}^{2} [1 − exp(−xi²/(2βi²))] | F_R(xi, βi) F_R(xi, β3)
BL-II | W1 ∼ Lin(β1), W2 ∼ Lin(β2), W3 ∼ Exp(β3) | (1 − e^{−β3 z}) Π_{i=1}^{2} [1 − (1 + βi xi/(1 + βi)) e^{−βi xi}] | F_L(xi, βi) F_E(xi, β3)
This section presents an application of the proposed methodology using data from a
2-component series system. For each component, we assume that the failure times can
be approximated by an Exponential distribution with parameters β = 0.07 for the first
component and β = 0.05 for the second one. The system’s reliability was obtained
for an initial operation time of 50 h using the classical methodology, the product-law
of reliabilities (PLR), that is, assuming independence between the lifetimes [13].
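For reference, the PLR computation under independence is a one-liner:

import math

def plr_reliability(t, rates=(0.07, 0.05)):
    # Product-law of reliabilities for independent exponential components.
    return math.prod(math.exp(-b * t) for b in rates)

plr_reliability(50.0)   # exp(-6) ≈ 0.0025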
Here, the main goal is to evaluate the system reliability using the proposed models
and the classical PLR methodology. In this way, we may verify if the proposed
models better accommodate the dependence structure related to a fatal shock or
overall stress that affects both components simultaneously. To perform a Bayesian
analysis, we have chosen noninformative Gamma prior distributions (αj = κj =
0.001) for βj (j = 1, 2) and an informative Gamma prior distribution with α3 = T̄
and κ3 = V(T), where T is the random variable related to the operation time for
the fatal shock models, T̄ is the sample mean and V(T) is the sample variance.
Moreover, we have adopted noninformative Uniform prior distributions, U(0, 10),
for βj (j = 1, 2, 3) for the stress models to prevent computational instability. To
evaluate the series systems reliability, we have considered the parameter estimates to
compute the full system’s reliability for an initial operation time of 50 h. The obtained
results are presented in Figs. 1, 2 and 3.
From Figs. 1, 2 and 3, one may notice that only fatal shock models provide accurate
estimates for the system reliability compared to the PLR methodology. This result
may be related to the fact that we have considered a series system, and the results
could be different considering a parallel one; that is, maybe only the stress models
could provide similar estimates for the system reliability due to the structure from
which the stress models are derived.
One can notice that the BL-I model and the PLR method provide approximate
results, which can be related to the flexibility of the Lindley distribution on reliability
analysis (see [49]). For the BE-I model, the obtained result is quite similar to the
Fig. 1 System reliability for an initial operation time of 50 h considering the BE-I (left-panel), the
BE-II (right-panel), and the PLR methodology
Fig. 2 System reliability for an initial operation time of 50 h considering the BR-I (left-panel), the
BR-II (right-panel), and the PLR methodology
Fig. 3 System reliability for an initial operation time of 50 h considering the BL-I (left-panel),
BL-II (right-panel), and the PLR methodology
5 Conclusion
In this chapter, we considered both stress and fatal shock structures. Based on the
obtained results, we can notice that the fatal shock model led to accurate inferential
results for the full system reliability, even using noninformative priors in the Bayesian
analysis. Also, as the real-data application showed, the proposed methodology could
be applied in many industrial situations where a series system fails due to a common
shock source that destroys both components. Based on the reliability estimates, it is
possible to conclude that using any of the proposed models could be an excellent
alternative to the PLR methodology under the independence assumption. Besides,
other probabilistic models could be assumed for the random variables Wi and Ui (i =
1, 2, 3), which would generate even more flexible models for accommodating other
information of the entire system.
References
1. Hall PL, Strutt JE (2003) Probabilistic physics-of-failure models for component reliabilities
using Monte Carlo simulation and Weibull analysis: a parametric study. Reliab Eng Syst Saf
80(3):233–242
2. Xie L, Zhou J, Hao C (2004) System-level load-strength interference based reliability modeling
of k-out-of-n system. Reliab Eng Syst Saf 84(3):311–317
3. Nandal J, Chauhan SK, Malik SC (2015) Reliability and MTSF evaluation of series and parallel
systems. Int J Stat Reliab Eng 2(1):74–80
4. Chauhan SK, Malik SC (2016) Reliability evaluation of series-parallel and parallel-series sys-
tems for arbitrary values of the parameters. Int J Stat Reliab Eng 3(1):10–19
5. Zhou W, Xiang W, Hong HP (2017) Sensitivity of system reliability of corroding pipelines to
modeling of stochastic growth of corrosion defects. Reliab Eng Syst Saf 167:428–438
6. Chi Z, Chen R, Huang S, Li YF, Bin Z, Zhang W (2020) Multi-state system modeling and
reliability assessment for groups of high-speed train wheels. Reliab Eng Syst Saf 107026
7. Okabe T, Otsuka Y (2020) Proposal of a validation method of failure mode analyses based on
the stress-strength model with a support vector machine. Reliab Eng Syst Saf 107247
8. Torres-Alves GA, Morales-Nápoles O (2020) Reliability analysis of flood defenses: the case
of the Nezahualcoyotl Dike in the Aztec City of Tenochtitlan. Reliab Eng Syst Saf 107057
9. Stefenon SF, Ribeiro MHDM, Nied A, Mariani VC, Coelho LS, Rocha DFM, Grebogi RB,
Ruano AEB (2020) Wavelet group method of data handling for fault prediction in electrical
power insulators. Int J Electr Power Energy Syst 123:106269
10. Ribeiro MHDM, Coelho LS (2020) Ensemble approach based on bagging, boosting and stack-
ing for short-term prediction in agribusiness time series. Appl Softw Comput 86:105837
11. Yousefi N, Coit DW, Song S (2020) Reliability analysis of systems considering clusters of
dependent degrading components. Reliab Eng Syst Saf 202:107005
12. Oliveira RP, Achcar JA, Mazucheli J, Bertoli W (2021) A new class of bivariate Lindley
distributions based on stress and shock models and some of their reliability properties. Reliab
Eng Syst Saf 211:107528
13. Aggarwal KK (2012) Reliability engineering, vol 3. Springer Science & Business Media
14. Jensen PA, Bard JF (2003) Operations research models and methods, vol 1. Wiley
15. Thoft-Christensen P, Murotsu Y (1986) Reliability of series systems. In: Application of struc-
tural systems reliability theory. Springer, Berlin, Heidelberg
16. Li J, Coit DW, Elsayed EA (2011) Reliability modeling of a series system with correlated
or dependent component degradation processes. In: 2011 international conference on quality,
reliability, risk, maintenance, and safety engineering. IEEE, pp 388–393
17. Hu L, Yue D, Li J (2012) Availability analysis and design optimization for a repairable series-
parallel system with failure dependencies. Int J Innov Comput Inf Control 8(10):6693–6705
18. Park JH (2017) Time-dependent reliability of wireless networks with dependent failures. Reliab
Eng Syst Saf 165:47–61
19. Jafary B, Mele A, Fiondella L (2020) Component-based system reliability subject to positive
and negative correlation. Reliab Eng Syst Saf 107058
20. Gumbel EJ (1960) Bivariate exponential distributions. J Am Stat Assoc 55(292):698–707
21. Freund JE (1961) A bivariate extension of the exponential distribution. J Am Stat Assoc
56(296):971–977
22. Marshall AW, Olkin I (1967) A generalized bivariate exponential distribution. J Appl Probab
4(02):291–302
23. Marshall AW, Olkin I (1967) A multivariate exponential distribution. J Am Stat Assoc
62(317):30–44
24. Downton F (1970) Bivariate exponential distributions in reliability theory. J R Stat Soc Ser B
(Methodol) 32(3):408–417
25. Hawkes AG (1972) A bivariate exponential distribution with applications to reliability. J R Stat
Soc Ser B (Methodol) 129–131
26. Block HW, Basu AP (1974) A continuous, bivariate exponential extension. J Am Stat Assoc
69(348):1031–1037
27. Sarkar SK (1987) A continuous bivariate exponential distribution. J Am Stat Assoc
82(398):667–675
28. Arnold BC, Strauss D (1988) Bivariate distributions with exponential conditionals. J Am Stat
Assoc 83(402):522–527
29. Burr IW (1968) On a general system of distributions. J Am Stat Assoc 63
30. Burr IW (1973) Parameters for a general system of distributions to match a grid of α3 and α4 .
Commun Stat Theory Methods 2
31. Singh C, Billinton R (1977) System reliability, modelling and evaluation, vol 769. Hutchinson
London
32. Blanchard BS, Fabrycky WJ, Fabrycky WJ (1990) Systems engineering and analysis, vol 4.
Prentice Hall Englewood Cliffs, NJ
33. Chao MT, Fu JC (1991) The reliability of a large series system under Markov structure. Adv
Appl Probab 23(4):894–908
34. Mori Y, Ellingwood BR (1993) Time-dependent system reliability analysis by adaptive impor-
tance sampling. Struct Saf 12(1):59–73
35. Hulting FL, Robinson JA (1994) The reliability of a series system of repairable subsystems: a
Bayesian approach. Naval Res Logist 41(4):483–506
36. Zhang T, Horigome M (2001) Availability and reliability of system with dependent components
and time-varying failure and repair rates. IEEE Trans Reliab 50(2):151–158
37. Rausand M, Arnljot HÃ (2004) System reliability theory: models, statistical methods, and
applications, vol 396. Wiley
38. Kołowrocki K (2008) Reliability of large systems. Wiley Online Library
39. Eryilmaz S, Tank F (2012) On reliability analysis of a two-dependent-unit series system with
a standby unit. Appl Math Comput 218(15):7792–7797
40. Zhen H, Mahadevan S (2015) Time-dependent system reliability analysis using random field
discretization. J Mech Des 137(10):101404
41. Oliveira RP, Achcar JA (2019) Use of Basu-Dhar bivariate geometric distribution in the analysis
of the reliability of component series systems. Int J Quality Reliab Manag
42. Chowdhury S, Kundu A (2017) Stochastic comparison of parallel systems with log-Lindley
distributed components. Oper Res Lett 45(3):199–205
43. de Oliveira RP (2019) Multivariate lifetime models to evaluate long-term survivors in medical
studies. PhD thesis, University of São Paulo
44. Lawless JF (1982) Statistical models and methods for lifetime data. Wiley Series in probability
and mathematical statistics. Wiley, New York
45. Gelfand AE, Smith AFM (1990) Sampling-based approaches to calculating marginal densities.
J Am Stat Assoc 85(410):398–409
46. Chib S, Greenberg E (1995) Understanding the Metropolis-Hastings algorithm. Am Stat
49(4):327–335
47. R Core Team (2020) R: a language and environment for statistical computing. R foundation
for statistical computing, Vienna, Austria
48. Su Y-S, Yajima M (2012) R2jags: a package for running jags from R. R package version
0.03-08. http://CRAN.R-project.org/package=R2jags
49. Ghitany ME, Atieh B, Nadarajah S (2008) Lindley distribution and its application. Math Com-
put Simul 78(4):493–506
The New Attempt of Network Reliability
in the Assessment of Business Activities
for Precision Management
Shin-Guang Chen
Abstract Network reliability has been applied to many real-world applications. The
popular areas are traffic planning, computer network planning, power transmission
planning, etc. However, in social applications, there are still very limited cases
reported. This chapter introduces a new attempt at applying network
reliability to the assessment of business activities. Such an attempt is a step toward
precision management in business administration. Based on the capability of con-
ducting precision management, a great deal of improvement in business practices
can take place. Precision performance evaluation for individuals or groups is
presented, and a case study is illustrated to show the approach proposed in this
chapter.
1 Introduction
Network reliability has been applied to many real-world applications. The popular
areas are traffic planning, computer network planning, power transmission planning,
etc. Since 1954, the problem of maximum flow has attracted widespread attention in
the world [22], and it has expanded to many other application areas [13]. The
reliability problem of binary-state networks without flow was first discussed by Aggarwal
et al. [3]. Lee [16] extended the problem to include cases with streams. The binary-
state cases of the network reliability problem were first solved with minimum paths
(MPs) by Aggarwal et al. [2]. The multi-state cases of the network reliability problem
were first solved by Xue [24]. The multi-state network (MSN) is a network whose
flow has random states. Lin et al. [17] used MPs or minimum cuts (MCs) [14] to illustrate
the reliability calculation of MSNs. They also set up three stages in these calculations:
(a) search for all MPs [10, 12, 23] / MCs [1, 8, 15, 28]; (b) search for all d-MPs
[19, 25] / d-MCs [26] from these MPs/MCs; (c) from these d-MPs [6, 7, 30] / d-MCs
[12, 18], calculate their joint probability. The staged approach (SA) was
streamlined by Lin [19], who proposed a simpler and better method to evaluate the
reliability of MSNs.
Now, SA has been extensively studied in the literature. A comprehensive discus-
sion can be found in [4], and a good algorithmic improvement in [11]. Based on this
progress, we can now apply network reliability to larger areas such as social
applications to show its impressive power.
The following sections are arranged as follows. The mathematical requirements
for SA are presented in Sect. 2. The details of SA [11] are given in Sect. 3. The per-
formance evaluation for individuals and groups in a process is given in Sect. 4. The
bonus dispatching in a process is given in Sect. 5. A case study is presented in Sect. 6.
Finally, we draw the conclusions of this chapter in Sect. 7.
2 Mathematical Preliminaries
Assume that the process is a MSN, i.e., let (E, V) be a MSN, where E is the set
of edges representing the precedence relationships between individuals in a process,
and V = {a_i | 1 ≤ i ≤ N} is the set of nodes representing the individuals in a process. Let
M = (m_1, m_2, . . . , m_N) be a vector with m_i (an integer) being the maximal capacity
of a_i. The process is assumed to satisfy the following assumptions:
1. The capacity of a_i is an integer-valued random variable which takes values from
the set {0, 1, 2, . . . , m_i} according to a given distribution μ_i.
2. The edges are perfect.
3. Flow in the process satisfies the flow-conservation law [13].
4. The states of the nodes are statistically independent from each other.
From the input to the output of the process, we call ρ_1, ρ_2, . . . , ρ_z the MPs. The process
is described by two variables at time t: the status variable S_t = (s_1t, s_2t, . . . , s_Nt) and
the stream variable F_t = (f_1t, f_2t, . . . , f_zt), where s_it denotes the current capacity
state of a_i at time t and f_jt denotes the current work flow on ρ_j at time t. Then, F_t
is feasible if and only if

Σ_{j=1}^{z} {f_jt | a_i ∈ ρ_j} ≤ m_i,  for i = 1, 2, . . . , N.   (1)

The above equation states that the total stream through a_i has a maximal limit. Then,
we define κ_Mt ≡ {F_t | F_t is feasible under M}.
Under the current state S_t, the flows must also satisfy

Σ_{j=1}^{z} {f_jt | a_i ∈ ρ_j} ≤ s_it,  for i = 1, 2, . . . , N.   (2)

Let F_t = {F_t | F_t ∈ κ_M and complies with Eq. (3)}. The below lemmas [19] hold:

s_it = Σ_{j=1}^{z} {f_jt | a_i ∈ ρ_j},  for each i = 1, 2, . . . , N.   (4)
Given a demand d_t at time t, the reliability ω_{dt} is defined as the probability that the maximal flow of the network is greater than or equal to d_t at time t, i.e., ω_{dt} ≡ Pr{S_t | S_t ≥ d_t}. To obtain ω_{dt}, we search for the minimal vectors in {S_t | S_t ≥ d_t}. A minimal vector S_t is a d-MP if and only if (a) S_t ≥ d_t and (b) W_t ≱ d_t for any W_t < S_t, where W_t ≤ S_t iff w_{jt} ≤ s_{jt} for each j = 1, 2, ..., N and W_t < S_t iff W_t ≤ S_t and w_{jt} < s_{jt} for at least one j. Suppose there are in total q d-MPs for d_t: S_{1t}, S_{2t}, ..., S_{qt}. The probability (i.e., the reliability) is derived as
\omega_{dt} = \Pr\Big\{ \bigcup_{k=1}^{q} \{S_t \mid S_t \ge S_{kt}\} \Big\}. \tag{5}
3 The SA Approach
The SA approach includes three stages of computation [4]. Stage one searches for all MPs in a network. Stage two searches for all d-MPs under the MPs. Stage three calculates the union probability (the reliability) of all d-MPs. The following subsections briefly introduce these stages. The most popular method for searching for MPs is that of [10], which is outlined in the following.
Lemma 3 ρ_j ∪ v creates a cycle if and only if ∃ v_k ∈ ρ_j with v_k = v.

Lemma 4 (Minimal path detection) ρ is an MP iff ρ has no cycles and the corresponding w = s ∪ v_1 ∪ v_2 ∪ ... ∪ v_k ∪ t, where v_k ∈ B − (S ∪ T), s ∈ S, and t ∈ T.
Since all undirected networks can be transformed into directed ones, the following efficient algorithm [10] for finding MPs in directed networks is introduced here; it is briefly described as follows.
Algorithm 1: // for minimal paths
1. Get L. // Get the network data.
2. While v ∈ L(0) = S and v ≠ φ, select a candidate.
3. R = v ∪ L − S. // Temporary list R (starting from 0).
4. Set j = 0, ρ = w = φ. // Initialize variables.
5. If j < 0, then ρ = ∪_{2≤l} {ρ(l)}; an MP is obtained, do backtrack.
6. Else begin v = R(j).
7. If w(k) = v holds, backtrack.
8. Else if j ∈ v and {j} ≠ φ, ρ = {j} ∪ ρ, w = {v} ∪ w, and run Step 6.
9. Else do backtrack.
End.
end While.
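Algorithm 1 works on the linked path structure of [10]; as a rough functional equivalent, the minimal sketch below enumerates all cycle-free s–t paths of a directed network by plain depth-first backtracking. It is an assumption-level illustration, not the algorithm of [10] itself, and the adjacency data are made up:

def all_minimal_paths(adj, s, t):
    # Enumerate all cycle-free s-t paths (the MPs) by DFS with backtracking.
    paths, stack = [], [s]
    def dfs(v):
        if v == t:
            paths.append(list(stack))
            return
        for w in adj.get(v, []):
            if w not in stack:        # per Lemma 3, revisiting a node would create a cycle
                stack.append(w)
                dfs(w)
                stack.pop()
    dfs(s)
    return paths

# Illustrative adjacency (not the chapter's ARPA data):
adj = {'s': ['a', 'b'], 'a': ['c', 'b'], 'b': ['a', 'd'],
       'c': ['d', 't'], 'd': ['c', 't']}
print(all_minimal_paths(adj, 's', 't'))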
The popular ARPA network [29] is given in Fig. 1, and Table 1 shows the results. The efficiency of the approach is very good.
Fig. 1 The ARPA network (nodes s, a, b, c, d, t; arcs 1–9)
An elegant method was reported in [9]; it gave rise to a more efficient approach [11]. This subsection briefly describes the approach in [11]. A non-negative linear Diophantine equation has the following form:

c_1 f_1 + c_2 f_2 + \cdots + c_z f_z = d, \quad c_i, f_i, d \in \mathbb{Z}^{+}. \tag{6}
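A minimal Python sketch of enumerating all non-negative solutions of Eq. (6) by backtracking is given below; it is a generic enumeration under the assumption of positive coefficients, not the specific method of [9]:

def diophantine_solutions(c, d):
    # All non-negative integer (f_1, ..., f_z) with c_1 f_1 + ... + c_z f_z = d.
    z, out = len(c), []
    def rec(j, rem, sol):
        if j == z - 1:                 # last variable is forced if it divides the remainder
            if rem % c[j] == 0:
                out.append(sol + [rem // c[j]])
            return
        for f in range(rem // c[j] + 1):
            rec(j + 1, rem - c[j] * f, sol + [f])
    if z:
        rec(0, d, [])
    return out

# e.g. f1 + f2 + f3 + f4 = 4 with unit coefficients:
print(len(diophantine_solutions([1, 1, 1, 1], 4)))   # 35 candidate flow vectors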
Lin and Chen [20] pointed out a very efficient method for d-MP verification, pioneered by Yeh [27]. This creative method can be combined with the exact enumeration method to give a better way of finding the d-MPs of the network. According to Lin and Chen [20], the following lemmas hold.
One can find all cycles or loops under Property 1 and Lemma 7. The following lemmas also hold [27]. Assume that ⊕ is a union operation for vectors, and let L be the network's linked path structure (LPS) [10]. Chen and Lin proposed the following algorithm.
Algorithm 2:
1. Let D = {1, 1, ..., 1}.
2. Set r_1 ⊕ r_2 ⊕ ... ⊕ r_q ≡ D.
3. {f_i | f_i = #(r_i) for all i}. // #(r_i) is the count of 1's.
4. If F meets the equations \sum_{j=1}^{z} \{f_j \mid a_i \in \rho_j\} \le m_i and \sum_{j=1}^{z} f_j = d, for 1 ≤ i ≤ N, then s_i = \sum_{j=1}^{z} \{f_j \mid a_i \in \rho_j\}, for 1 ≤ i ≤ N.
5. For 1 ≤ i ≤ N do
6.   If s_i > 0, then I = I ∪ {i}.
   endfor
7. For i ∈ I do
8.   λ = φ. // the path trace.
9.   If not (L_i ∈ λ), then keep tracing L_i until the sinks, then continue; else λ = λ ∪ L_i.
10.  Else backtrack to Step 2.
   endfor
11. min = min ∪ {S}.
12. Backtrack to Step 2.
13. Return. // Finish search.
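A minimal sketch of the exact-enumeration route that Algorithm 2 refines is shown below; it reuses diophantine_solutions from the earlier sketch, and the cycle checking via the LPS is omitted, so it is an illustration rather than the chapter's algorithm:

def d_mp_candidates(paths, M, d):
    # paths[j]: node indices on MP rho_j; M[i]: capacity m_i; d: demand
    cands = set()
    for F in diophantine_solutions([1] * len(paths), d):   # all F with sum f_j = d
        S = [0] * len(M)
        for f, p in zip(F, paths):
            for i in p:
                S[i] += f                                   # s_i: total flow through a_i
        if all(S[i] <= M[i] for i in range(len(M))):        # capacity check of Step 4
            cands.add(tuple(S))
    # keep only minimal vectors: drop any S dominated from below by another candidate
    return [s for s in cands
            if not any(w != s and all(a <= b for a, b in zip(w, s)) for w in cands)]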
A simple network, shown in Fig. 2, is presented to explain the proposed approach. Four MPs are found: ρ_1 = {a_1, a_3}, ρ_2 = {a_2, a_4}, ρ_3 = {a_1, a_5, a_4}, and ρ_4 = {a_2, a_6, a_3}. The corresponding information flows are F = {f_1, f_2, f_3, f_4}. The observed data transmission distributions of the links are listed in Table 2. The demand of the data sink is set to 6.
Fig. 2 A simple network with arcs a_1, ..., a_6 and a data sink
Table 2 The observed data transmission distributions of the links for Fig. 2
Arcs   Probability of each capacity state: 0      1      2      3      4      5      6
a1 0.000 0.000 0.001 0.008 0.069 0.316 0.606
a2 0.000 0.000 0.006 0.061 0.309 0.624 0.000
a3 0.000 0.000 0.000 0.006 0.055 0.292 0.647
a4 0.000 0.000 0.002 0.019 0.114 0.369 0.497
a5 0.000 0.000 0.008 0.073 0.328 0.590 0.000
a6 0.000 0.000 0.001 0.008 0.069 0.316 0.606
Deriving the union probability of the d-MPs is one way to obtain the network reliability. We usually employ the inclusion–exclusion principle (IEP) to do so; it can be traced back to the idea of Abraham de Moivre (1718) [21]. To explain this concept, Eqs. (7) and (8) show how the IEP obtains the probability of {A_1 ∪ A_2 ∪ A_3}.
\Pr\{A_1 \cup A_2 \cup A_3\} = \Pr\{A_1\} + \Pr\{A_2\} - \Pr\{A_1 \cap A_2\} + \Pr\{A_3\} - \Pr\{A_2 \cap A_3\} - \Pr\{A_1 \cap A_3\} + \Pr\{A_1 \cap A_2 \cap A_3\} \tag{7}

= \Pr\{A_1\} + \Pr\{A_2\} - \Pr\{A_1 \cap A_2\} + \Pr\{A_3\} - \Pr\{A_2 \cap A_3 \cup A_1 \cap A_3\} \tag{8}
Chen [6] reported a reduction approach to speed up the derivation of the union probability, emphasizing its suitability for monotonic cases such as network reliability problems, where the network reliability is exactly the union probability of the d-MPs. The reduced recursive form is
\pi(A_1) = \Pr\{A_1\}, \quad n = 1,

\pi(A_1, A_2, \ldots, A_n) = \sum_{i=1}^{n} \big( \Pr\{A_i\} - \pi(A_1 \cap A_i, A_2 \cap A_i, \ldots, A_{i-1} \cap A_i) \big), \quad n > 1. \tag{10}
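A minimal Python sketch of the recursion (10), applied to d-MP events A_k = {S ≥ X_k}, is given below. It assumes statistically independent nodes (so Pr{S ≥ x} factors over nodes) and represents intersections by the elementwise maximum ⊕; the names are illustrative:

def union_prob(dmps, tail):
    # dmps: list of d-MP vectors X_k; tail(i, v) = Pr{s_i >= v} for node a_i
    def pr(x):
        p = 1.0
        for i, xi in enumerate(x):
            p *= tail(i, xi)
        return p
    def pi(events):                    # recursion of Eq. (10)
        if not events:
            return 0.0
        total = 0.0
        for i, e in enumerate(events):
            merged = [tuple(max(a, b) for a, b in zip(prev, e)) for prev in events[:i]]
            total += pr(e) - pi(merged)
        return total
    return pi([tuple(x) for x in dmps])

# two-node toy: s_i uniform on {0, 1, 2}
tail = lambda i, v: max(0.0, (3 - v) / 3)
print(union_prob([[1, 0], [0, 2]], tail))   # 2/3 + 1/3 - 2/9 = 7/9 ~ 0.778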
Property 2 Let Q̄ = (q̄_1, q̄_2, ..., q̄_h) and W̄ = (w̄_1, w̄_2, ..., w̄_h) be state vectors. Q̄ and W̄ comply with the following rules:
1. Q̄ ≤ W̄ if and only if q̄_j ≤ w̄_j for each j = 1, 2, ..., h.
2. Q̄ < W̄ if and only if Q̄ ≤ W̄ and q̄_j < w̄_j for at least one j.
Let Q̄ ⊕ W̄ ≡ {ḡ_i | ḡ_i = max(q̄_i, w̄_i), q̄_i ∈ Q̄, w̄_i ∈ W̄, ∀i}. Then ω_{dt} can be derived by the following steps.
Algorithm 3: // for the union probability
1. dMP = {X_1, X_2, ..., X_h}.
2. ω = 0.
3. If h = 0, then return ω.
4. For 1 ≤ i ≤ |dMP| do
5.   ω_2 = Pr{X ≥ X_i}.
6.   For 1 ≤ j < i do
7.     X_{j,i} = X_j ⊕ X_i.
     End (for).
8.   Let E = {X_{j,i} | 1 ≤ j < i} and W_l, W_k ∈ E.
9.   For k ∉ J and 1 ≤ k ≤ ||E|| do // Reduction process.
10.    For l ∉ J and k < l ≤ ||E|| do
11.      If W_l ≤ W_k, then J = J ∪ {k} and go to Step 9; else, if W_l > W_k, then J = J ∪ {l}.
       End (for).
12.    dMP* = dMP* ∪ {W_k}.
     End (for).
13.  Run Step 1 employing dMP* and obtain ω_3.
14.  ω = ω + ω_2 − ω_3.
   End (for).
15. Return ω.
Since S_t = (s_{1t}, s_{2t}, ..., s_{Nt}) is the capacity vector of the individuals at time t, we can obtain the empirical throughput distributions μ_t = (μ_{1t}, μ_{2t}, ..., μ_{Nt}). The expected throughput at time t of individual a_i is then defined as

E_{it} = \sum_{j=1}^{\max\{s_{ik} \mid 1 \le k \le t\}} j\, \mu_{it}(j). \tag{12}
Further, we can define the individual importance in the process. This is done via the covering set of the individual a_i in the process. Let P_j = {ρ_j | a_i ∈ ρ_j} be the set of MPs involving a_i. With the covering set ζ_i of a_i, the importance of a_i is defined by

\Upsilon_i = \frac{\lVert \zeta_i \rVert}{N}, \tag{13}

where ||·|| is the number of members of a set. By the above definition, if Υ_i = 1.0, then a_i is a critical individual in the process, since it covers all the flows in the process; and Υ_i cannot be 0, because a_i must at least cover itself.
We can also define the individual contribution in the process. This is done via the survivability ω_{dt,i} of a_i in the process. By this definition, if C_i = 1.0, then a_i makes the greatest real contribution in the process, since at all times the flows in the process cannot exist without it. On the contrary, if C_i = 0.0, then a_i makes no real contribution, since at no time are the flows in the process influenced by its absence.
Let the empirical distribution of the process output during the period up to time t be μ_t. The throughput of the group during this period is then defined, analogously to Eq. (12), as the expected value

E_t = \sum_{j=1}^{j_{\max}} j\, \mu_t(j),

where j_max is the largest output level observed up to time t.
Along with the above definitions, we get the following assessments for individuals
as well as groups.
Let δ_{it} be the individual standard output of a_i at time t relative to the standard output d_t. Then the performance appraisal λ_{it} of individual a_i can be calculated by the formula

\lambda_{it} = \frac{E_{it}}{\frac{1}{t}\sum_{j=1}^{t} \delta_{ij}}. \tag{16}
Therefore, when λit > 1, the individual has over-standard performance; when
λit = 1, the individual has standard performance; when λit < 1, the individual has
below-standard performance.
The group performance assessment λ_t can be calculated using the formula

\lambda_t = \frac{E_t}{\frac{1}{t}\sum_{i=1}^{t} d_i}. \tag{17}
So when λt > 1, the group has over-standard performance; when λt = 1, the group
has standard performance; when λt < 1, the group has below-standard performance.
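A minimal sketch of the appraisal ratio of Eqs. (16)–(17) follows; the function name and the numbers are illustrative, not taken from the case study:

def appraisal(expected_throughput, standard_outputs):
    # lambda = E / ((1/t) * sum of standard outputs over the t periods)
    t = len(standard_outputs)
    return expected_throughput / (sum(standard_outputs) / t)

# individual with expected throughput 5.2 against standard outputs per period:
print(appraisal(5.2, [4, 5, 5]))   # about 1.11 > 1: over-standard performance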
The purpose of bonus dispatching is to distribute the profit of the process at time t fairly among all individuals in the process. There are two calculation methods.
1. Taking the individual contribution as the coefficient, the bonus coefficient b_{i1} of a_i is

b_{i1} = \lambda_{it} C_i. \tag{18}

This method is affected by the individual capacity; that is, if capacity is not properly allocated to an individual, bottlenecks will appear. This kind of allocation therefore reflects the bottleneck contribution.
2. Taking the individual importance as the coefficient, the bonus coefficient b_{i2} of a_i is

b_{i2} = \lambda_{it} \Upsilon_i. \tag{19}

This method is not affected by the individual capacity; it reflects only the importance of the position in the process, so the dynamic bottleneck contribution is not reflected.
Fig. 4 A distribution process network in a well-known company (a_1: order entry; a_2, a_3: salesmen 1 and 2; a_4: transporter; a_5: accountant)
The following features can be learned from the results of the case study on precision management.
1. The case in this chapter involves the random absence of individuals.
2. The individuals in this case have exceeded their standard performances, yet the probability of achieving the corporate goal is only 0.58. There is therefore a great risk of process shutdown; for example, if someone is absent for a long time, the process may be shut down.
3. The indicators of this case show that its risk is due to improper capacity planning for the individuals. The C_i index shows that everyone except a_4 makes the greatest substantial contribution, which is the reason for the high risk. According to Chen's optimal capacity planning scheme [5], the standard outputs of a_2, a_3, and a_4 should be 4, 4, and 1, respectively. Therefore, the two salesman activities must increase manpower, while the transporter only needs the least capable person; even abolishing a_4 would not affect the overall performance much.
7 Conclusion
From the perspective of network reliability, this chapter makes an attempt to assess process performance by network reliability for precision management. This is done by objectively evaluating the overall process performance, as well as by analyzing the different importance and contribution of each individual in the process. The results show that network reliability is very suitable for assessing social activities that contain processes.

Through the case study, we found that the improvement of a process can be facilitated by precise analysis of the performance of individuals as well as groups. This is the basis of the so-called precision management of business activities. Future research can inspect multiple commodities in a process to show the merits of the theories of network reliability.
Acknowledgements This paper was supported in part by the Ministry of Science and Technology,
Taiwan, Republic of China, under Grant No. MOST 107-2221-E-236-004-MY3.
References
1. Abel U, Bicker R (1982) Determination of all minimal cut-sets between a vertex pair in an
undirected graph. IEEE Trans Reliab R-31:167–171
2. Aggarwal KK, Chopra YC, Bajwa JS (1982) Capacity consideration in reliability analysis of communication systems. IEEE Trans Reliab 31:177–180
3. Aggarwal KK, Gupta JS, Misra KB (1975) A simple method for reliability evaluation of a
communication system. IEEE Trans Commun 23:563–565
4. Chen SG (2020) Computation in network reliability. In: Pham H (ed) Reliability and statistical
computing. Springer, pp 107–126
5. Chen SG (2012) An optimal capacity assignment for the robust design problem in capacitated
flow networks. Appl Math Model 36(11):5272–5282
6. Chen SG (2014) Reduced recursive inclusion-exclusion principle for the probability of union
events. In: 2014 IEEE international conference on industrial engineering and engineering man-
agement, Selangor, Malaysia, pp 1–3
7. Chen SG (2014) Reduced recursive sum of disjoint product in network reliability. In: 2014 the
20th ISSAT international conference on reliability and quality in design, Seattle, Washington,
USA, pp 170–173
8. Chen SG (2015) Search for all MCs with backtracking. Int J Reliab Qual Perform 6(2):101–106
9. Chen SG (2018) A novel approach to search for all solutions of a non-negative linear Diophan-
tine equation. In: 24th ISSAT international conference on reliability and quality in design, pp
55–57
10. Chen SG, Lin YK (2012) Search for all minimal paths in a general large flow network. IEEE
Trans Reliab 61(4):949–956
11. Chen SG, Lin YK (2020) A permutation-and-backtrack approach for reliability evaluation in
multistate information networks. Appl Math Comput 373:125024
12. Colbourn CJ (1987) The combinatorics of network reliability. Oxford University Press, UK
13. Ford LR, Fulkerson DR (1962) Flows in networks. Princeton University Press, NJ
14. Jane CC, Lin JS, Yuan J (1993) On reliability evaluation of a limited-flow network in terms of
minimal cutsets. IEEE Trans Reliab 42:354–361, 368
15. Jasmon GB, Foong KW (1987) A method for evaluating all the minimal cuts of a graph. IEEE
Trans Reliab R-36:538–545
16. Lee SH (1980) Reliability evaluation of a flow network. IEEE Trans Reliab 29:24–26
17. Lin JS, Jane CC, Yuan J (1995) On reliability evaluation of a capacitated-flow network in terms
of minimal pathsets. Networks 25:131–138
18. Lin YK (2001) On reliability evaluation of a stochastic-flow network in terms of minimal cuts.
J Chin Inst Ind Eng 18:49–54
19. Lin YK (2001) A simple algorithm for reliability evaluation of a stochastic-flow network with
node failure. Comput Oper Res 28(13):1277–1285
20. Lin YK, Chen SG (2019) An efficient searching method for minimal path vectors in multi-state
networks. Ann Oper Res 1–12
21. Roberts FS, Tesman B (2009) Applied combinatorics, 2nd edn. CRC Press
22. Schrijver A (2002) On the history of the transportation and maximum flow problems. Math
Program 91(3):437–445
23. Shen Y (1995) A new simple algorithm for enumerating all minimal paths and cuts of a graph.
Microelectron Reliab 35:973–976
24. Xue J (1985) On multistate system analysis. IEEE Trans Reliab 34(4):329–337
25. Yeh WC (2001) A simple algorithm to search for all d-MPs with unreliable nodes. Reliab Eng
Syst Saf 73:49–54
26. Yeh WC (2001) A simple approach to search for all d-MCs of a limited-flow network. Reliab
Eng Syst Saf 71:15–19
27. Yeh WC (2002) A simple method to verify all d-minimal path candidates of a limited-flow
network and its reliability. Int J Adv Manuf Technol 20(1):77–81
28. Yeh WC (2006) A simple algorithm to search for all MCs in networks. Eur J Oper Res 174:1694–
1705
29. Yeh WC (2007) A simple heuristic algorithm for generating all minimal paths. IEEE Trans
Reliab 56(3):488–494
30. Zuo MJ, Tian Z, Huang HZ (2007) An efficient method for reliability evaluation of multistate
networks given all minimal path vectors. IIE Trans 39:811–817
Optimal Design of Reliability Acceptance
Sampling Plans for Multi-stage
Production Process
M. Kumar
1 Introduction
Acceptance sampling is one of the oldest techniques in quality control; it helps to make a decision on the acceptance of a lot containing a finite or infinite number of units, occupying the middle ground between no inspection and 100% inspection. Complete sampling is often not desirable when the cost of testing is high compared to that of passing a defective unit, or when the testing is destructive. Sampling schemes protect both the producer and the consumer: they enable the producer to accept a good lot and the consumer to reject a bad lot. The use of sampling schemes minimizes nonconforming products and maximizes the profit of the industry.
M. Kumar (B)
Department of Mathematics, National Institute of Technology, Calicut, Kerala, India
e-mail: mahesh@nitc.ac.in
A variety of sampling schemes exists in the literature. Acceptance sampling plans which incorporate the lifetime information of a product are known as reliability sampling plans. They are time censored (Type I), failure censored (Type II), progressively censored, both time and failure censored (hybrid), or mixtures of these censoring schemes. Some of the earlier researchers studied the design and development of life test sampling plans. The first attempt at developing life test sampling plans was made by Epstein and Sobel [1], who derived the sampling plan for the exponential distribution under Type I and Type II censoring; the expected number of failures was obtained under Type I censoring and the expected testing time under Type II. Later, acceptance sampling plans were considered by many authors
for different distributions. For example, Epstein [2], Jun et al. [3], Bartholomew [4],
Aslam and Jun [5], Azam et al. [6], Tsai and Wu [7], Kantam et al. [8], Balamurali and Usha [9], Srinivasa Rao [10], Aslam [11], Aslam and Jun [12], Wu et al. [13], and Vellaisamy and Sankar [14]. All these works considered only the number of failures. Kumar and Ramyamol [15] considered a reliability accep-
tance sampling plan for exponential distribution in which the lifetime of failed units
is taken into account under repetitive group sampling. Ramyamol and Kumar [16]
have also designed acceptance sampling plans for mixture distributions. However,
the above-mentioned works concentrated on the single stage processes.
Asokan and Balamurali [17] considered multi-attribute acceptance sampling plan
for multiple quality characteristics based on a single sample. Using acceptance sam-
pling plan, Duffuaa et al. [18] developed process targeting model for a product with
two dependent quality characteristics. They determined the first quality character-
istic by setting the first process, whereas the determination of the second quality
characteristic depends on the setting of the two processes. The quality of the prod-
uct is controlled by a single sample inspection plan, where inspection is assumed
to be error free. Moskowitz et al. [19] proposed a multi-stage Bayesian acceptance
sampling model for a serial production process incorporating multiple-type defects
that may occur in each stage. Lee and Tagaras [20] derived an economic acceptance
sampling plan in a complex multi-stage production system. The first stage can be
viewed either as the first manufacturing stage or an incoming lot of components or
raw materials on which an operation is performed at the second stage. The planning
and scheduling of production in a multi-task or multi-stage batch manufacturing
process (for example in the case of industries such as chemical manufacturing, food
processing, and oil refining) was developed and studied in the work of Gaglioppa et al. [21]. In this chapter, a reliability acceptance sampling plan
is developed for a multi-stage production process, in which units pass sequentially
from Machine N to Machine 1. In real life, several products such as semiconductor
devices and television tubes are processed through a multi-stage production process,
and the literature has addressed acceptance sampling plans for these situations. By
exploring the literature, Aslam et al. [22] found that there is no study on the devel-
opment of an acceptance sampling plan for multiple quality characteristics from a
multi-stage process using the time-truncated life test. They considered the devel-
opment of this sampling plan for the product coming from a two-stage process by
assuming that the lifetime of the product follows the Weibull distribution with known
or unknown shape parameters. They have taken the number of failures as the criterion
for accepting the lot and ignored the failure time of units tested. The cost of testing is
also not addressed in their work. The "number of failures" criterion is not enough to judge whether a lot is good or bad. For example, a lot accepted on this basis (for instance, a lot of a certain brand of bulbs) may actually perform badly once the units are deployed in the market and thereby receive negative feedback from consumers. This would result in a huge loss to the industry producing such units. In the light of the above, we propose to develop a complete acceptance sampling
plan which overcomes the drawbacks in the existing sampling plan for multi-stage
process. We include the lifetime information of failed units, instead of just depending
upon “number of failures”, and make an attempt to develop a reliability acceptance
sampling plan for multi-stage process which minimizes the expected total testing
cost. The rest of the chapter is organized as follows: In Sect. 2, we define a single
sampling plan for multi-stage production process in which lifetime of units in each
stage follows the exponential distribution with different parameters. In Sect. 3, we
consider double sampling plan. Section 4 discusses the results obtained under single
and double sampling plans. In Sect. 5, comparisons of proposed single and double
acceptance sampling plans are made. Besides, we compare our double sampling plan
with the existing plan (see [22]). In Sect. 6, we have included sensitivity analysis of
expected total testing cost. Case studies are presented in Sect. 7, and in Sect. 8, some
conclusive remarks are presented.
where x_i is the value of the random variable X_i, which represents the lifetime of the i-th quality characteristic. The formulation of the acceptance sampling plan for the N-stage process is given in the following steps 1 and 2.
1. At the i-th stage, i = 1, 2, ..., N, select a random sample of size n_i from the submitted lot and put it on test. Continue the test until the total number of failures observed is r_i. Let X_{1,n_i}, X_{2,n_i}, ..., X_{r_i,n_i} be the lifetimes of the failed units in increasing order.
Let T_i denote the time to get r_i failures from a sample of size n_i at the i-th stage. Then T_i is a random variable with pdf

g_{r_i,n_i}(x) = \begin{cases} \dfrac{n_i!}{(r_i-1)!\,(n_i-r_i)!\,\theta_i}\, e^{-(n_i-r_i+1)x/\theta_i} \big[1-e^{-x/\theta_i}\big]^{r_i-1}, & x > 0,\ \theta_i > 0 \\ 0, & \text{otherwise,} \end{cases}

and

E(T_i) = \theta_i \sum_{j=1}^{r_i} \frac{1}{n_i - j + 1}
(see [2]). Next, we calculate the expected total testing time based on Bernoulli random variables. Let

Z_i = \begin{cases} 1, & \text{if the } i\text{-th quality is acceptable} \\ 0, & \text{otherwise.} \end{cases} \tag{2}

Hence the expected total testing time is

E(T) = \sum_{i=1}^{N} \Big( \prod_{k=0}^{i-1} E(Z_k) \Big) E(T_i) = \sum_{i=1}^{N} \Big( \prod_{k=0}^{i-1} P_k \Big)\, \theta_i \sum_{j=1}^{r_i} \frac{1}{n_i - j + 1}, \quad P_0 = 1,\ Z_0 = 1,

since the Z_k's and T_i's are independent.
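A minimal Python sketch of this E(T) formula follows; the argument names are illustrative, and P[i] stands for the stage-acceptance probabilities P_i:

def expected_total_testing_time(theta, r, n, P):
    # theta[i], r[i], n[i]: stage-i mean life, failures required, sample size
    total, gate = 0.0, 1.0            # gate = prod_{k<i} P_k, with P_0 = 1
    for i in range(len(theta)):
        e_ti = theta[i] * sum(1.0 / (n[i] - j + 1) for j in range(1, r[i] + 1))
        total += gate * e_ti          # stage i is reached with probability 'gate'
        gate *= P[i]
    return total

print(expected_total_testing_time([100, 300], [5, 6], [10, 10], [0.9, 0.85]))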
In the subsequent sections, we propose the design of an optimal acceptance sampling plan which minimizes the expected total testing cost. It is to be noted that the final total cost of conducting the proposed acceptance sampling plan will generally consist of the installation cost of all test units at the beginning of the test, the cost of the test units, and operational costs (salaries of operators, utilities, depreciation of test equipment, and so on), in addition to the expected total testing cost. The final total cost can be computed once the minimum expected total testing cost is obtained.
such that
P(Type I error at acceptable quality level) ≤ α (4)
such that

\prod_{i=1}^{N} P_i \ge 1 - \alpha, \quad \theta_i \ge \theta_i^{*},\ \forall\, 1 \le i \le N \tag{7}

\prod_{i=1}^{N} P_i \le \beta, \quad \theta_i \le \theta_i^{0},\ \text{for some } 1 \le i \le N. \tag{8}
such that

\min_{(\theta_1,\ldots,\theta_N)} \prod_{i=1}^{N} P_i \ge 1 - \alpha, \quad \theta_i \ge \theta_i^{*},\ \forall\, 1 \le i \le N \tag{10}

\max_{(\theta_1,\ldots,\theta_N)} \prod_{i=1}^{N} P_i \le \beta, \quad \theta_i \le \theta_i^{0},\ \text{for some } 1 \le i \le N, \tag{11}
where α and β are the producer's and consumer's risks, respectively. Note that for an N-stage process one has to find 2N parameters, (r_1, t_1), (r_2, t_2), ..., (r_N, t_N). The problem is intractable for N ≥ 3, so we solve it for the case N = 2, and Problem Q_3 can be written as Problem Q_4 as follows:
\min \sum_{i=1}^{2} \Big( \prod_{k=0}^{i-1} P_k \Big)\, \theta_i \sum_{j=1}^{r_i} \frac{1}{n_i - j + 1} \tag{12}
such that

\min_{(\theta_1, \theta_2)} \prod_{i=1}^{2} P_i \ge 1 - \alpha, \quad \theta_i \ge \theta_i^{*},\ \forall\, 1 \le i \le 2 \tag{13}

\max_{(\theta_1, \theta_2)} \prod_{i=1}^{2} P_i \le \beta, \quad \theta_i \le \theta_i^{0},\ \text{for some } 1 \le i \le 2. \tag{14}
Observe that \prod_{i=1}^{2} P_i on the left-hand side of inequalities (13) and (14) is increasing in each θ_i. The minimum of \prod_{i=1}^{2} P_i therefore occurs at (θ_1^*, θ_2^*). Consider the feasible region described in (14) and divide it into three parts as follows:
A_1 = {(θ_1, θ_2) : θ_1 ≤ θ_1^0 and θ_2 ≤ θ_2^0}, A_2 = {(θ_1, θ_2) : θ_1 ≥ θ_1^0 and θ_2 ≤ θ_2^0}, and A_3 = {(θ_1, θ_2) : θ_1 ≤ θ_1^0 and θ_2 ≥ θ_2^0}.
The regions A_2 and A_3 are unbounded, over which Problem Q_4 cannot be solved. Therefore, we consider only the bounded region A_1 and note that the point of maximum is (θ_1^0, θ_2^0). The final optimization problem thus becomes Problem Q_5:
\min \sum_{i=1}^{2} \Big( \prod_{k=0}^{i-1} P_k \Big)\, \theta_i \sum_{j=1}^{r_i} \frac{1}{n_i - j + 1}\, C_i \tag{15}

such that

\prod_{i=1}^{2} P_i \ge 1 - \alpha \ \text{ at } (\theta_1^{*}, \theta_2^{*}) \tag{16}

\prod_{i=1}^{2} P_i \le \beta \ \text{ at } (\theta_1^{0}, \theta_2^{0}). \tag{17}
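A minimal Python sketch of evaluating the constraints (16)–(17) follows. It assumes the stage-acceptance probability is P_i = P(θ̂_{r_i,n_i} ≥ t_i), with θ̂ following the gamma law given later in Eq. (18); this Poisson-tail form is consistent with the interval probability P_{i1} of Sect. 3, but it is an assumption, and the chapter's printed plan parameters may rest on details not reproduced here:

from math import exp, factorial

def accept_prob(r, t, theta):
    # P(theta_hat_{r,n} >= t): 2*r*theta_hat/theta is chi-square with 2r d.f.,
    # so the survival probability is a Poisson tail sum
    x = r * t / theta
    return sum(exp(-x) * x**m / factorial(m) for m in range(r))

# Example 1 parameters (Sect. 7), for illustration only:
p_at_aql = accept_prob(5, 0.11, 0.25) * accept_prob(6, 0.115, 0.25)  # compare with 1 - alpha
p_at_rql = accept_prob(5, 0.11, 0.10) * accept_prob(6, 0.115, 0.10)  # compare with beta
print(p_at_aql, p_at_rql)

A candidate (r_1, t_1, r_2, t_2) satisfying both constraints can then be searched over a grid, keeping the one with the smallest objective (15).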
A double sampling plan is, in general, more difficult to construct and hence not as easy to implement as a single sampling plan. However, it may give similar levels of the consumer's and producer's risks while requiring less sampling in the long run. We consider the double sampling plan for a multi-stage process, defined as follows; note that each stage can have two samples. The DSP algorithm is given as follows.
For 1 ≤ i ≤ N:
Here θ̂^{(1)}_{r_i,n_i} and θ̂^{(2)}_{r_i,n_i} are independent random variables. Denote θ̂^{(1)}_{r_i,n_i} = X and θ̂^{(2)}_{r_i,n_i} = Y. Then the pdf of X (see [1]) is given by

f(x) = \begin{cases} \dfrac{(2r_i/\theta_i)^{r_i}\, x^{r_i-1}}{2^{r_i}\, \Gamma(r_i)} \exp\!\Big(\dfrac{-r_i x}{\theta_i}\Big), & x > 0,\ \theta_i > 0 \\ 0, & \text{otherwise.} \end{cases} \tag{18}
Note that Y also has the same pdf. The joint pdf of X and Y is f (x) f (y) since X
and Y are independent.
Now, the independence of all stages gives the total acceptance probability P = \prod_{i=1}^{N} P_i. As in the single sampling plan, we find the plan parameters (r_i, t_{i1}, t_{i2}), i = 1, 2, ..., N, by formulating an optimization problem which minimizes the expected total testing cost at the acceptable quality level, subject to the Type I and Type II error constraints.
We have E(Z_i) = P(θ̂^{(1)}_{r_i,n_i} ≥ t_{i2}) + P\big( (t_{i1} ≤ θ̂^{(1)}_{r_i,n_i} ≤ t_{i2}) ∩ (θ̂^{(1)}_{r_i,n_i} + θ̂^{(2)}_{r_i,n_i} ≥ 2 t_{i2}) \big) = P_i, where Z_i is the Bernoulli random variable defined in Sect. 2.

Let T_i be the random testing time to get r_i failures from a sample of size n_i. Then the random testing time for the i-th stage, i = 1, 2, ..., N, in the DSP is (1 + P_{i1}) T_i, where P_{i1} = P(t_{i1} ≤ θ̂^{(1)}_{r_i,n_i} ≤ t_{i2}). Also E(T_i) = \theta_i \sum_{j=1}^{r_i} \frac{1}{n_i - j + 1}. Hence the total random testing time is T = (1 + P_{11}) T_1 + (1 + P_{21}) Z_1 T_2 + \cdots + (1 + P_{N1})(Z_{N-1} \cdots Z_1) T_N.

If C_i is the cost of testing a sample of size n_i per unit time at the i-th stage, then the expected total testing cost is

\sum_{i=1}^{N} \Big( \prod_{k=0}^{i-1} P_k \Big)\, \theta_i\, (1 + P_{i1}) \sum_{j=1}^{r_i} \frac{1}{n_i - j + 1}\, C_i,

where P_0 = 1 and

P_{i1} = \sum_{m=0}^{r_i - 1} \Bigg[ \frac{e^{-r_i t_{i1}/\theta_i} (r_i t_{i1}/\theta_i)^m}{m!} - \frac{e^{-r_i t_{i2}/\theta_i} (r_i t_{i2}/\theta_i)^m}{m!} \Bigg].
such that

\prod_{i=1}^{N} P_i \ge 1 - \alpha, \quad \theta_i \ge \theta_i^{*},\ \forall\, 1 \le i \le N \tag{20}

\prod_{i=1}^{N} P_i \le \beta, \quad \theta_i \le \theta_i^{0},\ \text{for some } 1 \le i \le N. \tag{21}
\min_{(\theta_1,\ldots,\theta_N)} \prod_{i=1}^{N} P_i \ge 1 - \alpha, \quad \theta_i \ge \theta_i^{*},\ \forall\, 1 \le i \le N \tag{22}

\max_{(\theta_1,\ldots,\theta_N)} \prod_{i=1}^{N} P_i \le \beta, \quad \theta_i \le \theta_i^{0},\ \text{for some } 1 \le i \le N. \tag{23}
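A minimal Python sketch of the DSP stage quantities above follows, reusing accept_prob from the earlier sketch. P_{i1} is computed exactly from its Poisson-tail form; the joint event in E(Z_i) involves the sum θ̂^{(1)} + θ̂^{(2)}, which is approximated here by Monte Carlo as an illustrative assumption:

import random

def dsp_stage_probs(r, t1, t2, theta, n_sim=100_000, seed=1):
    # P_i1 = P(t1 <= theta_hat^(1) <= t2), exactly
    p_i1 = accept_prob(r, t1, theta) - accept_prob(r, t2, theta)
    # P_i by simulation: theta_hat ~ Gamma(shape r, scale theta/r),
    # i.e., the sum of r exponentials with rate r/theta
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        th1 = sum(rng.expovariate(r / theta) for _ in range(r))
        th2 = sum(rng.expovariate(r / theta) for _ in range(r))
        if th1 >= t2 or (t1 <= th1 <= t2 and th1 + th2 >= 2 * t2):
            hits += 1
    return p_i1, hits / n_sim

print(dsp_stage_probs(2, 0.1, 0.2, 0.25))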
For an N-stage process, one has to find 3N parameters, namely (r_i, t_{i1}, t_{i2}), i = 1, 2, ..., N, to implement the DSP. The problem is hard to deal with for N ≥ 3; hence we solve it for the case N = 2, and the problem can be rewritten as Problem Q_7:
\min \sum_{i=1}^{2} \Big( \prod_{k=0}^{i-1} P_k \Big)\, \theta_i\, (1 + l_i) \sum_{j=1}^{r_i} \frac{1}{n_i - j + 1}\, C_i \tag{24}
such that

\min_{(\theta_1, \theta_2)} \prod_{i=1}^{2} P_i \ge 1 - \alpha, \quad \theta_i \ge \theta_i^{*},\ \forall\, 1 \le i \le 2 \tag{25}

\max_{(\theta_1, \theta_2)} \prod_{i=1}^{2} P_i \le \beta, \quad \theta_i \le \theta_i^{0},\ \text{for some } 1 \le i \le 2. \tag{26}
Following the same bounded-region argument as for Problem Q_5, the problem reduces to Problem Q_8, with the same objective subject to

\prod_{i=1}^{2} P_i \ge 1 - \alpha \ \text{ at } (\theta_1^{*}, \theta_2^{*}) \tag{28}

\prod_{i=1}^{2} P_i \le \beta \ \text{ at } (\theta_1^{0}, \theta_2^{0}). \tag{29}
The optimization problem can be solved using the genetic algorithm solver in Matlab. Examples are given in Table 2. It is observed from Tables 3 and 4 that a change in sample size affects the expected total testing time and cost.
In this section, we present some numerical results of single and double sampling
plans in Tables 1 and 2 respectively. Various comparisons of results based on SSP
(ri , ti ) and DSP (ri , ti1 , ti2 ) are shown in Tables 3 and 4.
First, we compare SSP (r_i, t_i) and DSP (r_i, t_{i1}, t_{i2}). Next, DSP (r_i, t_{i1}, t_{i2}) is compared with the existing plan in the literature [22]. One may observe that Aslam et al. [22] considered the quality ratio, and the minimum sample number is calculated for different quality ratios. The main drawback of that work is that the plan parameters are calculated using the quality ratio, which ignores the actual measured lifetimes of the product over a specified period. Since our DSP (r_i, t_{i1}, t_{i2}) is based on the lifetime of the product, one can choose the sample size, the units are tested for failure, and the testing cost changes according to the lifetimes of the products. For example, in Aslam et al. [22], when α = 0.05, β = 0.1, ratio = 2 and a = 0.5, their plan gives an average sample number of 41.66, and the lot is accepted or rejected only on the basis of the number of defectives obtained; the test plan thus ignores the actual lifetimes of the units in the lot, and the total testing time involved is 1200 units. Next, consider our DSP (r_i, t_{i1}, t_{i2}) problem: α = 0.05, β = 0.1, θ_1^* = 100, θ_1^0 = 30, θ_2^* = 300, θ_2^0 = 100, n_1 = 15, n_2 = 18 give an expected total testing cost of 114.2. Note that the actual testing cost may even be less than 114.2. The proposed DSP (r_i, t_{i1}, t_{i2}) thus has the advantage of savings in expected testing costs of about 50%.
In this section, we compare our single and double sampling plans, and compare both with the existing sampling plans in the literature (see [22]). Our plans perform better than the existing plans in terms of the testing costs incurred (see Tables 3, 4, 5 and 6).
Table 3 Comparison between single (SSP (r_i, t_i)) and double (DSP (r_i, t_{i1}, t_{i2})) sampling plans for various choices of (θ_1^*, θ_1^0), (θ_2^*, θ_2^0), (α, β) and C_1 = C_2 = 1

(α, β)      | (θ_1^*, θ_1^0) | (θ_2^*, θ_2^0) | (n_1, n_2) | ETC (single) | ETC (double)
0.1, 0.25   | 1500, 700      | 900, 500       | 20, 20     | 1711         | 978.3
0.05, 0.05  | 100, 30        | 300, 100       | 15, 18     | 256.49       | 114.2
0.05, 0.05  | 500, 100       | 300, 100       | 10, 10     | 636.4034     | 233.544
Table 4 Comparison between the double sampling plan DSP (r_i, t_{i1}, t_{i2}) and the existing plan of Aslam et al. [22] for various choices of (θ_1^*, θ_1^0), (θ_2^*, θ_2^0), (α, β) and C_1 = C_2 = 1; for the existing plan, a = 0.5 and ratio = 2

(α, β)      | (θ_1^*, θ_1^0) | (θ_2^*, θ_2^0) | (n_1, n_2) | ETC (DSP) | ETC (existing plan) | Average sample number (existing plan)
0.05, 0.25  | 1500, 700      | 900, 500       | 20, 20     | 978.3     | 1200                | 40.13
0.05, 0.05  | 100, 30        | 300, 100       | 15, 18     | 114.2     | 200                 | 65.66
Table 5 Effect of a change in sample number on cost in the single sampling plan (SSP (r_i, t_i)); values of ETC for various choices of (θ_1^*, θ_1^0), (θ_2^*, θ_2^0), (α, β) and C = 1

(α, β)     | (θ_1^*, θ_1^0) | (θ_2^*, θ_2^0) | (n_1, n_2) | t_1 | t_2 | (r_1, r_2) | ETC at (θ_1^*, θ_2^*)
0.1, 0.25  | 1500, 700      | 900, 500       | 15, 10     | 850 | 400 | 12, 1      | 3086.9
0.1, 0.25  | 1500, 700      | 900, 500       | 25, 10     | 850 | 400 | 12, 1      | 1392.6
0.1, 0.25  | 1500, 700      | 900, 500       | 15, 20     | 850 | 400 | 12, 1      | 2953.3
0.1, 0.05  | 1000, 200      | 300, 100       | 15, 25     | 550 | 200 | 9, 10      | 1272.8
0.1, 0.05  | 1000, 200      | 300, 100       | 25, 25     | 550 | 200 | 9, 10      | 711.6130
0.1, 0.05  | 1000, 200      | 300, 100       | 15, 15     | 550 | 200 | 9, 10      | 1474.9
0.05, 0.01 | 100, 30        | 300, 100       | 10, 12     | 50  | 100 | 7, 5       | 365.1710
0.05, 0.01 | 100, 30        | 300, 100       | 15, 20     | 50  | 100 | 7, 5       | 193.7672
0.05, 0.01 | 100, 30        | 300, 100       | 25, 20     | 50  | 100 | 7, 5       | 156.4667
Table 6 Effect of a change in sample number on cost in the double sampling plan DSP (r_i, t_{i1}, t_{i2}); values of ETC for various choices of (θ_1^*, θ_1^0), (θ_2^*, θ_2^0), (α, β) and C_1 = C_2 = 1

(α, β)     | (θ_1^*, θ_1^0) | (θ_2^*, θ_2^0) | (n_1, n_2) | (t_11, t_12) | (t_21, t_22) | (r_1, r_2) | ETC at (θ_1^*, θ_2^*)
0.1, 0.05  | 1500, 500      | 900, 300       | 20, 20     | 275, 650     | 65, 485      | 4, 3       | 802.96
0.1, 0.05  | 1500, 500      | 900, 300       | 10, 20     | 275, 650     | 65, 485      | 4, 3       | 1439.30
0.1, 0.05  | 1500, 500      | 900, 300       | 10, 10     | 275, 650     | 65, 485      | 4, 3       | 1709.41
0.1, 0.05  | 1000, 300      | 300, 50        | 15, 10     | 99, 387      | 18, 98       | 2, 2       | 409.699
0.1, 0.05  | 1000, 300      | 300, 50        | 25, 10     | 99, 387      | 18, 98       | 2, 2       | 287.9
0.1, 0.05  | 1000, 300      | 300, 50        | 5, 15      | 99, 387      | 18, 98       | 2, 2       | 1003.8
0.1, 0.25  | 300, 80        | 800, 350       | 10, 15     | 17, 96       | 71, 326      | 2, 2       | 412.442
0.1, 0.25  | 300, 80        | 800, 350       | 15, 15     | 17, 96       | 71, 326      | 2, 2       | 365.79
0.1, 0.25  | 300, 80        | 800, 350       | 15, 25     | 17, 96       | 71, 326      | 2, 2       | 256.06
0.1, 0.25  | 300, 80        | 300, 80        | 10, 15     | 14, 97       | 11, 57       | 2, 1       | 204.76
0.1, 0.25  | 300, 80        | 300, 80        | 5, 5       | 14, 97       | 11, 57       | 2, 1       | 471.02
0.1, 0.25  | 300, 80        | 300, 80        | 5, 25      | 14, 97       | 11, 57       | 2, 1       | 323.4
0.15, 0.15 | 700, 90        | 400, 100       | 10, 15     | 25, 154      | 11, 91       | 1, 1       | 267.05
0.15, 0.15 | 700, 90        | 400, 100       | 10, 5      | 25, 154      | 11, 91       | 1, 1       | 416.5
Fig. 1 Expected total testing cost as a function of α (θ_1^* = 150, θ_2^* = 90, θ_1^0 = 40, θ_2^0 = 30, β = 0.1)
We observe from the above sensitivity analysis that the expected total testing cost is a decreasing function of α (see Fig. 1). On the other hand, ETC shows fluctuating behavior with respect to changes in β: it is neither uniformly decreasing nor increasing. It can also be noted from Fig. 2 that ETC has an increasing trend in the interval (0.06, 0.085).

The sensitivity analysis of the expected total testing cost subject to changes in the two parameters n_1 and n_2 is given in Figs. 3 and 4. It is clear from the figures that ETC is a decreasing function of (n_1, n_2) when all other parameters of the sampling plan are kept fixed.

Fig. 2 Expected total testing cost as a function of β
Let us consider the following examples to understand the results obtained in the previous sections.
Example 1 We consider the example depicted in Aslam et al. [22], a two-stage inspection process of ball bearings. The compressive strength and the maximum stress of a ball bearing are the first and second quality characteristics, respectively.
Assume that the compressive strength and the stress of a ball bearing are independent and that each follows the exponential distribution. Let us consider the single
and double sampling plans for lot acceptance. Assume that α = 0.05 and β = 0.25. Let the specified mean failure times under the test compressive load and the stress load of a ball bearing be θ_1^* = θ_2^* = 0.25 and θ_1^0 = θ_2^0 = 0.1. Tables 7 and 8 give the sample values.
Single sampling plan: Choose the first sample at each stage with n_1 = n_2 = 10. The plan parameters, calculated using Problem Q_5, are t_1 = 0.11, t_2 = 0.115, r_1 = 5 and r_2 = 6. Consider sample 1 given in Table 7. We take the first five failure times, which give θ̂_{5,10} = 0.60554 > t_1 = 0.11. So we accept the lot from the first stage and move to the second stage. In the second stage, the first 6 failure times recorded in sample 1 of Table 8 are under consideration.
Fig. 3 Expected total testing cost as a function of (n_1, n_2) (θ_1^* = 100, θ_2^* = 300, θ_1^0 = 30, θ_2^0 = 100, α = 0.05, β = 0.01)
Table 8 Failure times of 20 ball bearings related to stress (stage II)
Sample I:  0.3337 0.0379 0.3248 0.2935 0.3901 0.3608 0.5570 0.6922 0.0971 0.3029
Sample II: 0.3042 0.2392 0.9623 0.4297 0.2629 0.6186 0.5351 0.6047 0.6976 0.1603
We then have θ̂_{6,10} = 0.5083 > t_2 = 0.115. Hence we accept the product from the second stage, which leads to the final acceptance of the entire lot.
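A minimal Python sketch of the estimator used in these examples follows; it assumes the standard Type II censored exponential MLE, an assumption consistent with the second-stage value in the double sampling plan below (r = 1, n = 10, first failure 0.0379 gives 0.379):

def theta_hat(failures, n):
    # Type II censored exponential MLE:
    # (sum of the first r order statistics + (n - r) * x_(r)) / r
    xs = sorted(failures)
    r = len(xs)
    return (sum(xs) + (n - r) * xs[-1]) / r

print(theta_hat([0.0379], 10))   # 0.379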
Double sampling plan: Let n_1 = n_2 = 10. The decision-making parameter values, obtained using Problem Q_8, are t_{11} = 0.1, t_{12} = 0.2, t_{21} = 0.1, t_{22} = 0.184, r_1 = 2 and r_2 = 1. Thus, if the average failure time θ̂^{(1)}_{2,10} is greater than 0.2, we accept the units from the first stage. In our example, from sample 1 given in Table 7, θ̂^{(1)}_{2,10} = 0.567 > 0.2, so we accept the units from the first stage. In the second stage, r_2 = 1; from sample 1 given in Table 8, the first failure time is 0.0379 and θ̂^{(1)}_{1,10} = 0.379 > t_{22} = 0.184. Hence we accept the units from the second stage as well, and thus finally accept the entire lot.
Fig. 4 Expected total testing cost as a function of (n_1, n_2) (θ_1^* = 1500, θ_2^* = 900, θ_1^0 = 500, θ_2^0 = 300, α = 0.1, β = 0.05)
Example 2 In this second example, we have generated two samples of size 10 from exponential distributions with means 1 and 5, respectively. The samples are given in Tables 9 and 10. We illustrate the SSP and DSP for this example in the following.
Single sampling plan: Set α = 0.1, β = 0.2, n_1 = 10, n_2 = 10, θ_1^* = 1, θ_1^0 = 0.5 and θ_2^* = 5, θ_2^0 = 2. The plan parameters are r_1 = r_2 = 5, t_1 = 0.395 and t_2 = 3. From Table 9, select the first 5 failure times, which give θ̂_{5,10} = 0.60368. Since 0.60368 > t_1 = 0.395, we accept the product from the first stage. In the second stage, r_2 = 5; from Table 10, θ̂_{5,10} = 1.5398 < t_2 = 3. Hence we reject the product from the second stage, which leads to the final rejection of the entire lot.
Double sampling plan: Set α = 0.1, β = 0.2, n_1 = 10, n_2 = 10, θ_1^* = 1, θ_1^0 = 0.5 and θ_2^* = 5, θ_2^0 = 2. Our plan DSP (r_i, t_{i1}, t_{i2}) gives r_1 = r_2 = 2, t_{11} = 0.35, t_{12} = 0.8, t_{21} = 2.12 and t_{22} = 4. From Table 9, θ̂^{(1)}_{2,10} = 0.375 > t_{11} = 0.35. Now we take another sample of size 10, given below:
Sample II: 1.4491 1.0408 0.1970 4.1732 3.1460 1.7779 0.4321 0.3124 0.4343 0.7965.
From this sample θ̂^{(2)}_{2,10} = 1.5043, and (θ̂^{(1)}_{2,10} + θ̂^{(2)}_{2,10})/2 = 0.93965 > 0.8. Hence we accept the units from the first stage. In the second stage, θ̂^{(1)}_{2,10} = 1.0052 < t_{21} = 2.12. Therefore, we reject the units from the second stage, which leads to the rejection of the entire lot.
8 Conclusions
In this chapter, we developed two sampling schemes, namely single and double sampling plans, for a multi-stage process based on information about the failure times of units obtained under Type II censoring. A multi-stage production process with independent stages was discussed by Aslam et al. [22], who took the number of failures as the criterion for accepting a lot, ignored the failure times of the units tested, and did not address the cost of testing. In this work, we have removed these drawbacks by considering the average life of the units in the sample. It is noticed that when the sample size increases, the expected total testing cost decreases, as is clear from the sensitivity analysis of ETC with respect to changes in the sample sizes (n_1, n_2). Also, as α increases, ETC decreases; however, ETC shows a fluctuating trend subject to changes in β. Our plan DSP (r_i, t_{i1}, t_{i2}) offers savings in expected total testing cost of about 50% compared to Aslam et al. [22], and the actual testing cost may even be less than that reported. As future work, one may consider dependent production processes with several stages.
Acknowledgements The author would like to express his gratitude to the editors for their con-
structive comments which improved the presentation of the chapter. The author would also like to
thank Dr. Ramyamol P C for her computational assistance.
References
5. Aslam M, Jun CH (2009) A group acceptance sampling plans for truncated life tests based on
the Inverse Rayleigh and Log-Logistic distributions. Pakistan J Stat 25:1–13
6. Azam M, Aslam M, Balamurali S, Javaid A (2015) Two stage group acceptance sampling plan
for half normal percentiles. J King Saud University - Sci 27:239–243
7. Tsai TR, Wu SJ (2006) Acceptance sampling based on truncated life test for generalized
Rayleigh distribution. J Appl Stat 33:595–600
8. Kantam RR, Rosaiah K, Srinivas Rao G (2001) Acceptance sampling plans based on life tests: log-logistic model. J Appl Stat 28:121–128
9. Balamurali S, Usha M (2013) Optimal design of variable chain sampling plan by minimizing
the average sample number. Int J Manuf Eng 3:1–10
10. Srinivasa Rao G (2009) A group acceptance sampling plans for lifetimes following a general-
ized exponential distribution. Econ Quality Control 24:75–85
11. Aslam M (2007) Double acceptance sampling based on truncated life-tests in Rayleigh distri-
bution. Eur J Sci Res 17:605–611
12. Aslam M, Jun CH (2009) A group acceptance sampling plan for truncated life test having
Weibull distribution. J Appl Stat 36:1021–1027
13. Chien-Wei W, Aslam M, Jun C-H (2012) Variables sampling inspection scheme for resubmitted
lots based on the process capability index Cpk. Eur J Oper Res 217:560–566
14. Vellaisamy P, Sankar S (2005) A unified approach for modeling and designing attribute sam-
pling plans for monitoring dependent production processes. Methodol Comput Appl Probab
7:307–323
15. Kumar M, Ramyamol PC (2016) Optimal reliability acceptance sampling plan for exponential
distribution. Econ Quality Control 31:23–36
16. Ramyamol PC, Kumar M (2019) Optimal design of variable acceptance sampling plans for
mixture distribution. J Appl Stat 46:2700–2721
17. Asokan MV, Balamurali S (2000) Multi attribute single sampling plans. Econ Quality Control
15:103–108
18. Duffuaa SO, Al-Turki UM, Kolus AA (2009) Process-targeting model for a product with two
dependent quality characteristics using acceptance sampling plans. Int J Prod Res 47:4041–
4046
19. Moskowitz H, Plante R, Tang K (1996) Multi stage multi attribute acceptance sampling in
serial production systems. IIE Trans 130–137
20. Lee HL, Tagaras G (1992) Economic acceptance sampling plans in complex multi-stage pro-
duction systems. Int J Prod Res 30:2615–2632
21. Gaglioppa F, Miller LA, Benjaafar S (2008) Multi-task and multi stage production planning
and scheduling for process industries. Oper Res 56:1010–1025
22. Aslam M, Azam M, Jun C-H (2015) Acceptance sampling plans for multi-stage process based
on time–truncated test for Weibull distribution. Int J Adv Manuf Technol 79:1779–1785
The Importance of Technical Diagnostics
for Ensuring the Reliability of Industrial
Systems
Abstract The success and sustainability of a business industrial system are largely determined by the effectiveness of the production system as its basic integral part. The reliability of the technical system, i.e., the probability of performing the projected goal function in the observed time period, is, along with readiness and functional convenience, a basic indicator of the effectiveness of the production system. Maintenance, as an integral part of the production system, has the function of providing the projected level of optimal reliability by implementing activities that ensure the required level of technical readiness of the parts of the technical system. One way to ensure the optimal level of reliability of the technical system is to apply condition based maintenance concepts, which allow monitoring and control of the technical parameters of the state of the elements of technical systems within the production processes. The basis for applying condition based maintenance is the introduction of procedures for monitoring and controlling condition parameters using technical diagnostic methods. Knowing the current state of the parameters that determine the degree of success of the projected goal function of the technical system makes it possible to respond in time to emerging malfunctions and to prevent the system from entering a failure state. The chapter presents the systematic provision of regular monitoring and control of the condition parameters of parts of the technical system of paper machines, specifically vibration levels on high-voltage motors, within an industrial system for the production of toilet paper, using technical diagnostic methods with portable vibration control devices. By timely response to the observed occurrence of increased vibration levels at a specific critical position with the application of technical diagnostic methods and
1 Introduction
The production of goods with new use value, as well as the provision of services of the required quality within the production process, is the basis of the economic development of social communities. The organization of production processes arises from the planned integration of human, material and energy resources and their transformation into the desired form within production industrial systems. Production systems are composed of a series of complex technical systems, i.e., integrated functional units by means of which the input quantities, i.e., material values, are transformed into semi-finished or finished products. An integral part of production processes are the activities that ensure the continuity of operation of the technical systems so that they fulfill the projected goal function. These activities are planned, organized and managed by a special organizational unit of the production system: maintenance. All maintenance activities within the production industrial system are closely related to the other units that participate in the production process and form part of the integrated system of support to production.
The need for maintenance in industrial systems has existed since the very beginning of the application of technical systems, during the first industrial revolution and the introduction of mechanization into production processes. A feature of any technical system is that, under the influence of external and internal disturbances during operation, it can reach a state of failure which interrupts its function and stops part of, or the entire, production process. To prevent the failure condition, or to deal with it if it occurs, an appropriate response from the maintenance service is required. Within the period of the first industrial revolution, there was no need for a systematic approach
Numerous scientific papers and studies attempt to describe the concept of maintenance in more detail. Papić and Milovanović [2] point out that maintenance ceases to be seen as a need of random character and becomes a joint action of a set of segments or elements aimed at ensuring the functional state of the technical system in accordance with the requirements and criteria. The process of maintaining assets, as one of the most important parts of the overall production process, has the task of preventing and eliminating system failures, primarily by streamlining and optimizing their use and increasing productivity and cost-effectiveness in the production and operation process.
With the construction and establishment of a production system comes the need to define procedures that eliminate the possible causes of failures in the operation of the system. The working condition and the failure condition are the main states of the production system. The permanent goal of the maintenance process is to minimize the time spent in failure and to provide working conditions in which the production system fulfills its planned goal function.
In contrast to the traditional understanding, in which care for the technical system was entrusted to individuals and maintenance was seen as a non-profit business entity, the basic characteristic of modern maintenance is the organization and implementation of team activities to prevent and eliminate failures at the lowest possible cost, which gives the maintenance service considerable influence on economic results and a large share in creating profit in the operation of the system. Table 1 presents the basic characteristics of the old/traditional and the new/modern understanding of maintenance.
Maintenance strategies can be divided in a large number of ways, but each division reflects the manner of performing maintenance procedures, which characterizes the basic division of maintenance strategies (Fig. 4). The most important characteristics of the basic maintenance strategies are:
• Corrective maintenance strategy, or the so-called "wait and see" strategy—maintenance activities are performed after the technical system fails. There are no planned interventions other than regular cleaning and lubrication operations. This strategy is justified for simple, cheap and less important elements of the technical system,
• a strategy of operation until failure of the goal function, which means working until the failure condition occurs, without the intention of repairing the failed element and returning it to functional condition. This strategy is acceptable in all cases in which it is economically unacceptable to engage one's own maintenance or the services of external companies, relative to the availability and price of a completely new part, subassembly or assembly. If the lifespan of such a part or assembly is long enough to ensure the write-off of the old part while achieving the planned productivity within the technological process, this strategy provides for the complete replacement of the failed part with a new, spare part or assembly.
The main difference between monitoring and inspection is that monitoring is used to assess changes in system parameters over time. Monitoring can be continuous, at time intervals, or after a certain number of operations, and it is usually realized under the operating conditions of the system.
The term system degradation is defined according to the standard EN 13306 [4] as:

An irreversible process of disrupting one or more characteristics of a system, either by time or by the action of external influences.

The basic determinant of the term degradation is the possibility of the system reaching a state of failure. The term degradation often also refers to wear. In terms of the previous definition, degradation can also be explained as a disruption of the system's operating conditions, because the basic task of the condition based maintenance strategy is to diagnose the disorder before it becomes an irreversible process.
From the financial aspect, it is very important to determine the economic feasibility of a particular maintenance strategy. In cases of high capital importance, criticality and safety of the complete system, safety of the production process, etc., condition based maintenance is the better solution, even though it requires a significant investment in diagnostic equipment. The basic precondition is that the observed system has measurable parameters of the work process. There are numerous analyses of the decision on which maintenance method and strategy to apply in each case; see Al-Najjar and Alsyouf [11].
The basis of monitoring is the existence of condition based maintenance, which can be defined as:

A system that uses the monitored state of the equipment to determine and plan predictive maintenance activities, independently or in interaction with other systems or humans.
Tavner et al. [12] emphasize the growing importance of the application of condition monitoring in automation and in the reduction of human oversight within power plants. They analyze the effects of monitoring the condition of electric rotating machines with techniques applicable while these machines are in operation, setting aside the many off-line inspection techniques. The results of condition monitoring relate to the electromagnetic and dynamic behavior of electrical machines (with special emphasis on control with modern power electronics) and the behavior of the insulation system of electrical machines [12].
Condition based maintenance has the function of converting measurable input quantities (temperature, vibration, noise, ...) into the desired form or effect (condition of elements, comparison with values allowed for safe operation, failure forecasts, etc.) in space and time. The condition based maintenance system uses the characteristics and parameters of the monitoring technique (vibrations, sound, heat, ...) to detect performance disturbances or changes in the operating characteristics of the observed system.
Based on the previous definitions, the techniques or activities of the monitoring system can be:
• subjective—noticed and observed by workers,
• objective—
  – use of portable (handheld) measuring devices,
  – stand-alone continuous systems for measuring predefined parameters.
It should be noted that monitoring activities are usually carried out during the normal operation of the system, which in many cases makes it possible to obtain relevant data on the state of the system. Monitoring can be planned, on request, or continuous.
Based on the monitored characteristics and parameters, the condition based maintenance system should determine the subsequent activities to be realized within predictive maintenance. Advanced maintenance systems, supported by computer programs, are able to plan maintenance activities autonomously. They provide input on the current state of the system and an estimate of where the observed system should be in the future, given the required operating conditions. After the state of the system is predicted, the role of the human is reduced to deciding on the schedule of maintenance activities. This integration of the human as an integral part of the system makes it possible to realize all the requirements set within condition based maintenance.
In the last few decades, the term technical diagnostics has penetrated all branches of technology, especially electrical and mechanical engineering. The term covers all measures used to assess the current condition, or to forecast the behavior, of machinery, devices, equipment and the like over a certain period of time, without disassembly or destruction. The basic task accomplished by technical diagnostic measures is the timely prediction of malfunctions of parts of the technical system. The implementation of technical diagnostic measures increases the overall reliability, availability and effectiveness of the complete technical system.

Technical diagnostics is the science of recognizing the condition of a technical system; thanks to systems for monitoring the condition quantities, the occurrence of failures can be predicted in a timely manner.
Technical diagnostics is a kind of tool for assessing the condition of the components of the technical system. The state of the system is the basic characteristic of the observed technical system; it is determined by a set of physical parameters whose monitoring and comparison over time can define the behavior of the system at a given moment and forecast the state of the system in the future, after the action of internal or external disturbances (Fig. 6).
The term system state can be described by the following features:
• working condition,
• failure condition,
• quality of functioning,
• safety and security, etc.
The existence of a technical system with measurable parameters is the first prereq-
uisite for establishing technical diagnostics. Technical diagnostics is a process
composed of certain activities or steps that include:
• installation of measuring sensors on the object (observed system),
• stabilization of the operating mode (systems and equipment) to obtain relevant
results,
• measurement of diagnostic signals,
• comparing the obtained values with the prescribed values from the norms,
• making a conclusion about the current state of the elements of the system under
observation,
• forecasts of the future state of the system, with the definition of recommendations
on the sustainability of that state (a minimal sketch of these steps follows this list).
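As a minimal Python sketch of the comparison-and-conclusion steps above (an illustrative addition; the parameter names and limit values are hypothetical placeholders, not values from any standard):

    # Compare measured diagnostic signals with prescribed limit values and
    # draw a condition conclusion. All limits below are hypothetical.
    PRESCRIBED_LIMITS = {
        "bearing_temperature_C": 55.0,     # echoes the shutdown limit used later
        "vibration_velocity_mm_s": 4.5,    # assumed allowable RMS velocity
    }

    def assess_condition(measurements):
        """Compare measured values with prescribed limits and report."""
        report = {}
        for parameter, value in measurements.items():
            limit = PRESCRIBED_LIMITS.get(parameter)
            if limit is None:
                report[parameter] = "no prescribed limit"
            elif value <= limit:
                report[parameter] = f"OK ({value} <= {limit})"
            else:
                report[parameter] = f"EXCEEDED ({value} > {limit}) - plan maintenance"
        return report

    print(assess_condition({"bearing_temperature_C": 48.0,
                            "vibration_velocity_mm_s": 6.2}))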
Defining critical places, i.e., system parts whose function can cause failures of
individual parts or of the complete system, narrows down the process of eliminating
potential risks that endanger the system. Each individual critical location should contain some
of the measurable physical quantities that change over time during operation. Sensors
for monitoring and measuring the characteristic values of the system state are placed
in critical places. The process involves measuring and converting these quantities
into values that can be displayed in a clear and concise manner. The measured values
are compared with the prescribed values from the standards, on the basis of which a
conclusion is drawn about the state of the operating characteristics of the system.
If we monitor the behavior of the observed part of the technical system by
collecting, measuring, converting and processing certain physical quantities that
describe the state of the technical system, we can talk about the diagnostic procedure
of the technical system, Fig. 7.
The processes of technical diagnostics rely on the monitoring of the state
parameters of the technical system by the workers/operators on machines/devices,
by maintenance workers trained to handle diagnostic equipment, or by the equipment
manufacturer through its technical support service.
When choosing a method for identifying the state of the technical system, it is
essential that the measurement procedures, i.e., the diagnostics or tests used to check
the technical condition of the elements or of the overall system, are realized without
destroying the elements and without dismantling the technical system, because each
disassembly and reassembly changes the geometry and the conditions of contact. This
leads to an increased intensity of wear due to the repetition of the running-in phase.
In general, all procedures or methods of diagnosing the state of technical systems
can be divided into: subjective and objective procedures.
Subjective methods of technical diagnostics are based primarily on the experience
of the responsible technical persons in charge of monitoring and forecasting
the current state of the technical system. Certainly, the level of education, together
with the necessary experience, is a key parameter of a successful diagnosis.
The main task when choosing an objective measurement method is to observe
the characteristics of the system and the physical quantities that can be monitored.
The use of objective methods to identify the condition can be based on:
• temperature recording (using thermometers, thermocouples, thermistors,
thermo-sensitive paint, infrared radiation and heat flow),
• vibration recording (accelerometers, shock pulse meters, displacement meters,
stethoscopes) and length measurement (mechanical, electrical, laser rangefinder,
etc.),
• registration of defects in the material (magnetism, penetrant dyes, eddy
currents, radiography, ultrasound, hardness, sound resonance, corona emission
and optical fiber),
• registration of deposition, corrosion and erosion (ultrasound, radiography,
cathodic potential and weighing),
• flow and pressure recording (using a halogen (Freon) leak detector, manometer and
micro turbine meter),
• registration of electrical parameters (cable fault meter, frequency meter, phase
angle meter, universal instruments for checking electrical circuits and transient
voltage meter),
• load recording (dynamometers: mechanical, hydraulic and electrical),
• registration of physical and chemical quantities (spectrograph, humidity meter,
O2 and CO2 gas meters, pH meter, viscosity meter and meter of the presence of
metal particles in the lubricant), etc.
Objective methods of identifying the state of the technical system cover a defined
segment of measurable quantities: determining their relationship to changes in the
system state parameters, characterizing how they change, diagnosing the state of
system parts in a certain time interval, and predicting the future state of performance.
In practice, there is no unambiguous criterion for the introduction of a single
diagnostic system that would be appropriate for all technical production systems.
The choice of the control-diagnostic system concept to be proposed and selected
depends on a number of factors.
The basic characteristic of technical systems that have moving parts is the creation of
mechanical forces that are the source of the excitation of oscillation of the complete
system or its individual parts. This oscillation, if it is continuous for a long period
of time and if it goes beyond the limits that disrupt normal and uninterrupted work,
can lead to:
• wear of system elements,
• changes in the geometry and position of system parts,
• failure of electronic components,
• damage to the insulation of cables in contact,
• disturbance of workload balance,
• disturbance of lubrication between moving parts,
• the appearance of noise, causing harm and health problems in humans,
• material fatigue, damage to parts and eventually fracture which causes the
complete system to fail.
Precisely for the purpose of recognizing and monitoring the state of the system
under conditions of vibration occurrence, and of avoiding harmful events, methods
of vibration monitoring, often abbreviated as vibrodiagnostics, have been
developed (Fig. 8).
The main characteristic of vibration systems is the intense interaction of forces
and vibrations, Fig. 9.
There are numerous reasons for the development and application of vibration
analysis methods in industrial practice. Randal [15] points out that vibration analysis
is by far the most common method for defining the state of a machine because it
has a number of advantages over other methods. The main advantage relates to the
immediate response to external change and can therefore be used for continuous or
intermittent monitoring. Modern signal processing techniques enable the detection
160 D. Lj. Branković et al.
and filtering of very weak, masking signals that can be indicators of damage to the
monitored elements of the technical system, [15].
The main benefits that can be expected after the introduction of vibrodiagnostics
can be described through:
• elimination or minimization of unplanned downtimes that cause the technical
system to fail,
• minimizing corrective maintenance and transitioning to preventive maintenance,
• reduction of the risk of material damage,
• increasing the operational reliability of the technical system,
• an increase in the mean time between failures,
In science, the term signal refers to a physical quantity that depends on time,
spatial coordinates or some other independent variable. A signal is the carrier of
a message created in the process of signal generation. Given the current level of
development of technology, and especially of electronics, a signal in most cases
means an electrical signal that is suitable for further processing and presentation.
There are two basic signal classes: stationary and non-stationary, Fig. 10.
If we can predict the value of the signal at any given time, then we can talk about
deterministic signals. Random (stochastic) signals are characterized by uncertainty,
i.e., their occurrence and value cannot be accurately predicted at some point in the
future. To represent such a signal using a time function, we can only use known
values from the past, and in that case we are talking about random functions.
Non-stationary continuous signals can be viewed as random signals or can be
divided into smaller intervals and viewed as transient. If the beginning and end of
the signal observation are tied to a constant value, most often zero, then we can talk
about transient signals.
By nature, signals can be divided into: continuous and discrete. Continuous signals
correspond to one of the continuous functions in time. Discrete signals appear as
arrays of separate elements and can have a finite number of different values.
Frequency domain signal analysis is one of the most widely accepted methods today. As
the signal is always recorded in the time domain, the recorded signal has to be
transformed into the frequency domain.
The theoretical basis of the previously mentioned transformation is the possibility
of decomposing a complex function of time into a set of sine functions:

\[x(t)=\sum_{i} A_i \sin(\omega_i t+\varphi_i) \tag{1}\]

where:
\(A_i\)—amplitude of the i-th sinusoidal function,
\(\omega_i\)—circular frequency of the i-th sinusoidal function,
\(\varphi_i\)—phase of the i-th sinusoidal function.
The decomposition of the time function x(t) into a continuous complex spectrum
in the frequency domain F(jω) is performed using the Fourier integral, which reads
(Fig. 11):

\[F(j\omega)=\int_{-\infty}^{\infty} x(t)\,e^{-j\omega t}\,dt \tag{2}\]

The Fourier integral is derived from the concept that a periodic function
expressed as a Fourier series can fully represent an aperiodic function if its period
tends to infinity. In expression (2) the function F(jω) is called the spectral density
of amplitudes.
In general, there are two ways to transform a signal from the time domain to the
frequency domain. The first is transformation using specialized devices, spectrum
analyzers, which at first glance resemble classical oscilloscopes. After digitizing the
signal, spectrum analyzers immediately perform the Fourier integration and plot the
result on a monitor that is an integral part of the device. The second way is
digitization with the help of analog-to-digital (A/D) converters built into a
computer.
Both approaches to transforming the signal from the time domain to the frequency
domain use Fast Fourier Transform (FFT) algorithms which, depending on the number
of measurement results, save up to 99% of the computation time (Fig. 12).
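As an illustration of this time-to-frequency transformation, the following short Python sketch (an addition for illustration, using NumPy's FFT routines) transforms a synthetic vibration signal containing a component at roughly 575 rpm (about 9.58 Hz), a frequency that reappears in the case study below:

    import numpy as np

    fs = 1000.0                         # sampling frequency, Hz
    t = np.arange(0, 2.0, 1.0 / fs)     # 2 s record in the time domain
    f_fault = 575.0 / 60.0              # 575 rpm expressed in Hz
    x = 0.5 * np.sin(2 * np.pi * f_fault * t) + 0.2 * np.random.randn(t.size)

    X = np.fft.rfft(x)                            # FFT of the real-valued signal
    freqs = np.fft.rfftfreq(t.size, 1.0 / fs)     # frequency axis, Hz
    amplitude = 2.0 * np.abs(X) / t.size          # amplitude spectrum

    print(f"Dominant component at {freqs[np.argmax(amplitude)]:.2f} Hz")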
There are numerous scientific papers that show the advantages of applying
advanced methods of frequency signal processing. Lee [16] describes the developed
concept of the Acceleration Enveloping method and the result of applying the
method to the early detection of rolling-element damage in a bearing. By proper
analysis of the waveform using Enveloping, it is possible to determine the cause of a
potential problem originating from the bearing cage, the outer or inner ring, or the
rolling element of the bearing itself. In this particular case, the detection of problems
on the outer ring of the bearing is described, showing how the observed frequency
could be isolated and differentiated in the total vibration spectrum [16]. Kazzaz and
Singh [17] cite an example of on-line digital signal processing in a system for
monitoring process state parameters, implemented in the C++ programming
language. The obtained signals are analyzed in detail through specially developed
modified algorithms in order to obtain the most accurate state of the technical systems.
By using quality techniques for conditioning the input signals that come into the
monitoring system, it is possible to warn operators in time about the possibility of
serious equipment failures [17].
Figure 13 shows the principle of frequency analysis in the case of monitoring the
operating parameters of the electric motor assembly and the actuator (reducer, mixer,
mill, etc.). Vibration level / amplitude measurement and frequency spectrum analysis
can provide an answer to the condition of the operating element and the forecast of
the behavior of operating parameters in the future.
All physical structures and machines that contain rotating components generate
vibration. Veltkamp [19] emphasizes that vibrations generated on machines have
become a well-established parameter for assessment within condition monitoring.
It is one of the most versatile techniques and can detect about 70% of common
mechanical faults associated with rotating machines [19].
The measured values are compared with the values prescribed by the standard for
each type of equipment or working element. An example of ISO 10816–3 defining
oscillation amplitude values and allowable limit values for rotating elements is shown
in Fig. 14.
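In software, such a comparison against standard limit values reduces to a table lookup. The sketch below is illustrative only; the zone boundaries are placeholders, since the actual ISO 10816-3 limits depend on machine group, power class and foundation type:

    # Illustrative severity zones; the boundary values are placeholders and
    # must be taken from ISO 10816-3 for the actual machine and foundation.
    ZONES = [(2.3, "A - newly commissioned machine"),
             (4.5, "B - unrestricted long-term operation"),
             (7.1, "C - restricted operation, plan remedy"),
             (float("inf"), "D - vibration severe enough to cause damage")]

    def classify(v_rms_mm_s):
        """Return the severity zone for an RMS vibration velocity reading."""
        for limit, zone in ZONES:
            if v_rms_mm_s <= limit:
                return zone

    print(classify(3.8))   # -> zone B under the assumed limits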
In addition to accelerometers, there are other ways to measure and analyze vibration
signals, such as ultrasonic defect detection. In their study, Kim et al. [21]
give the results of testing low-speed bearing vibration signals.
• machine part of the plant paper machines—mixing devices (mixing tubs, pumps,
fresh and regenerated water tanks, …), homogenization (equalizers, mills, concen-
tration meters,…), transport (pumps, pipelines and related instrumentation),
purification (vertical and centrifugal purifiers) and central pump for delivery of
prepared paper pulp to paper machine,
• paper machine—a technical system consisting of several functionally dependent
units, among which the following can be distinguished:
– wet part—subsystem for injecting paper pulp (flow), a system of rollers that
move the sieves through which water is separated from the pulp by pressing
and presses (rotary rollers for additional squeezing of water from the pulp by
pressure and vacuum),
– dry part—cylinder for drying paper with steam and creping with the help of a
system of knives, system for winding already formed paper tape created in the
process of creping,
– recuperation—a system related to the provision of conditions for drying paper,
which includes: steam–air tube exchangers, high-voltage motors for hot air
fans, hot air drying hood and all accompanying instrumentation (condensers,
air flow regulators, thermocouples, etc.)
The output of the described technological process is a semi-finished product—a
paper strip that is wound onto winding shafts and then either handed over directly or
cut on a cutting machine into the appropriate formats needed for processing in the
company's own paper converting plants or for sale as a final semi-finished product
with a defined specification of quality parameters.
Fig. 17 Structure of the on-line diagnostic system at paper machine press positions
In the case of the described technical system of the paper machine, one of the
regular preventive activities of the maintenance service is the contracted
vibrodiagnostic examination, which is realized by an external service from the Czech
Republic. Experts from a specialized external company come to the site once or
twice a month and inspect the previously defined measuring points of the technical
functional elements of the paper machine (drive and guide rollers, presses, drying
cylinder) and auxiliary units (high-voltage recuperation motors with couplings and
fan bearings, mill motors and vacuum plants). Examples of parts of the diagnostic
report with the values of the measured parameters are shown in Figs. 18, 19 and 20.
Portable vibration measuring devices such as the SKF Microlog CMVA65 and
Microlog CMVA55 are used by the external company as on-site vibration diagnostic
measuring equipment, Fig. 21.
These devices have the ability to measure, archive and instantly analyze the
vibration spectrum.
During the regular control and measurement of the vibration condition parameters
of the bearings at the position of the high-voltage motor driving the hot-air
recirculation fan for drying the paper strip on the dry side of the hood, on March
21, 2019, the measured values of bearing oscillation vibrations at position L2 were
found to be significantly higher than allowed, Fig. 22.
Two methods implemented in the SKF Microlog frequency analyzers were used
to evaluate the condition of the motor bearings: the "Enveloping" and HFD (High
Frequency Detection) methods (Fig. 23).
Both trend diagrams show a visible increase of the overall values, indicating a
worsening of the bearing condition. Figure 24 shows the frequency spectrum measured
with the Enveloping method, and Fig. 25 shows the time-domain record of the
measured signal.
In the time domain, an impact occurrence is visible with a frequency of approximately
575 rpm. This frequency is also clearly visible and dominant in the frequency
spectrum. The frequency of 575 rpm is 0.40 times the rotary speed of the motor, which
corresponds to the cage fault frequency of the bearing. The appearance of this frequency
indicates a problem with the cage of the L2 bearing.
A cage problem is a severe bearing problem because the cage separates the rolling
elements. If the cage breaks, the rolling elements change their positions and the
bearing is immediately blocked. The rolling elements then no longer roll but slide
on the bearing races, which leads to a rapid increase in temperature and, in most
cases, to damage of the rotor shaft. Physical contact between the rotor and the stator
is also possible, which leads to total destruction of the motor.
Comparing the obtained results with previous measurements (trend/measurement
history), it could be clearly seen that there was a sudden jump in vibration values
at the position of the front bearing L2 of the high-voltage motor (on the side toward
the motor fan), a dangerous situation that could at any time cause a serious problem,
not only on the bearing itself but also on the motor and the associated
motor-coupling-fan bearing assembly.
Fig. 22 Trend of vibration value measurement at the position of bearing L2 of a high voltage hot
air fan motor on recuperation with the Enveloping method
Figure 26 shows the working position of the high-voltage motor for driving the
fan for recirculation of hot air on the recuperation—“dry” side of the hood for drying
paper.
Taught by experience from previous years, when, due to sudden bearing failures,
brush problems or voltage problems, there were on several occasions more serious
failures and burnouts of both high-voltage motors (Figs. 27 and 28) and other motor
units (Fig. 29), a state of high risk for the occurrence of serious emergency situations
was declared.
Fig. 27 Breakdown of the winding of the high-voltage fan motor from 2017

Fig. 28 High vibrations and cracks of the high-voltage motor support, 2018

The conclusion of the expert diagnostic measurement finding is that the cage
problem of the bearing is a severe problem that can lead to immediate damage of the
bearing and, ultimately, to severe motor damage. Based on the expert opinion of the
specialist who performed the measurement, who recommended the replacement
of the L2 bearing of the high-voltage motor for hot air recirculation on recuperation
in the shortest possible time, the responsible maintenance persons adopted executive
measures and assessed the terms for specific preventive and corrective activities:
• a decision to extend operation in high-risk mode, with intensive half-hourly
temperature measurement at the critical place—the high-voltage motor bearing—by
the shift maintenance staff, with a clear instruction to immediately shut down the
entire production process if the temperature rises above 55 °C, because that would
mean immediate entry into the accident emergency condition. In addition to the
constant temperature control, a written order was issued for intensive lubrication
(greasing) of this position. The decision, and the risk taken, to continue operation
was made on the basis of a technical assessment of the situation, taking into
account the needs of the production cycle, the fulfillment of the production plan and
compliance with obligations to customers,
• preparation of emergency maintenance activities during the next day which were
related to:
– providing the required number of executors (locksmiths and electricians of
the maintenance service, auxiliary technological staff and support of logistics
workers),
– providing the necessary tools and devices—various types of manual chain
hoists, certified ropes, devices for dismantling bearings and couplings and
forklifts,
– providing a crane—the service of an external company, since the company does
not have its own means of lifting heavy loads. The crane had to be urgently
requested and delivered to the scene. The position of the high-voltage motor
requires a crane engaged from the outside, i.e., outside the production hall, so
that the manipulation (extraction of the motor from the working position) can
be realized in the shortest possible time,
• realization of high voltage motor replacement—emergency maintenance activity.
Maintenance activities are often very complicated and specific for certain types
of industrial production plants due to numerous external and internal factors that
affect the conditions and manner of their implementation. The concrete case is a
real example of very demanding preparation and realization with the appearance of
several unpredictable situations that needed to be solved during the elimination of
the observed problem. There are several specifics that were present in the preparation
of the implementation of maintenance activities:
• careful planning of the most favorable period of shutdown of the plant with regard
to the current production plan and compliance with the already defined deadlines
for delivery of semi-finished products to the customer,
• available working space at the working position of the high-voltage electric motor,
• taking advantage of the presence of the diagnostic measurement technician,
• the required number of workers for the intervention,
• crane engagement—external company and availability of equipment,
• preparation of the necessary tools (based on events in the history of replacement
of this engine), …
Realization of repair, i.e., replacement of the problematic high-voltage motor
and elimination of unforeseen problems that occurred during the performance of
maintenance activities is shown in Figs. 30, 31, 32 and 33.
The dismantling of the lightweight partition wall made of 120 mm thick panels
was necessary in order to facilitate the access of the crane and significantly shorten
the time needed to pull the high-voltage motor out of its working position. It takes an
average of up to 6 h to remove the motor from the outside and insert the spare motor,
while the same activity realized in the direction of the interior of the production hall
took 16 h due to numerous physical obstacles, such as the inner panel wall, the steam
and condensate piping, the impossibility of bringing the crane into the extraction
space, etc.
Fig. 30 Working and spare high-voltage electric motor of the recuperative fan
Fig. 32 Disassembly of the coupling from the working motor, finishing of the spacer ring and
installation of the coupling on the spare motor
Fig. 33 Setting the spare high-voltage motor to the working position using a truck crane
The most complex problem that arose during the replacement of the high-voltage
motor was the inappropriate position of the mounting openings on the spare motor
relative to the support points on the metal structure and the concrete base. The reason
for this problem was that the spare motor was not produced by the same manufacturer
and was intended as a replacement for two similar positions, with the structural
dimensions of the support points provided for the second position. After defining
the necessary measures for position correction, new corresponding holes were cut in
the metal support, Fig. 34. This necessary intervention required a significant amount
of time relative to the total time spent on the motor replacement.

Fig. 34 Correcting the location of the motor support on the concrete base

Fig. 35 Centering the connector and connecting the cables to the high voltage motor
By installing and centering the coupling and connecting the motor to the existing
cables, the basic maintenance activities were completed, Fig. 35 (Fig. 36).
After almost 14 h of continuous work on the replacement of the high-voltage motor,
with successfully completed testing of the motor at parameters corresponding to the
operating conditions, the conditions for the continuation of the operation of the plant
were created. After the resumption of production, the installation of the light panel
wall was completed, thus concluding the complete maintenance activity.
Vibration control measurement after engine replacement at this position showed
satisfactory values.
5 Discussion of Results
The application of technical diagnostic methods has a very positive impact on the
degree of realization of planned production activities of industrial systems. In the
case of the previously described industrial system for the production of toilet paper,
the following characteristic indicators can be highlighted:
1. Avoided production loss—for the prevented emergency downtime, estimated at
9 h, the potential lost production (P_o) is:

\[P_o = 9\,\mathrm{h}\cdot 4.8\,\tfrac{\mathrm{t}}{\mathrm{h}} = 43.2\,\mathrm{t} \tag{3}\]

where 4.8 t/h is the average production of the paper machine technical system for
one hour of operation.
If, given the diversity of the product range, an average gross margin (GM) of 290
€/t of paper produced is assumed, the direct potential production loss (T_p), i.e., the
realization of potential profit (P_r) by selling the realized production, can be
calculated from:

\[T_p = P_r = GM\cdot P_o = 290\,\tfrac{€}{\mathrm{t}}\cdot 43.2\,\mathrm{t} = 12{,}528\,€ \tag{4}\]
which is a significant financial amount, given that the estimate relates to only one
accident event in a system with a constant production cycle.
2. Maintenance cost savings—the estimate of maintenance cost savings can be
made by comparing the costs of a preventive planned repair with the costs of a
possible emergency situation. The costs common to both cases are the labor cost
of the maintenance workers during the downtime and the engagement to replace
the high-voltage motor, and the cost of the crane.
The additional costs in the case of an accident are:
• burnt engine winding cost (C_v):

\[C_v = P_{VM}\cdot C_{PVM} \tag{5}\]

where:
\(P_{VM}\)—high-voltage motor power (kW),
\(C_{PVM}\)—estimated cost of the engine winding per unit power.
• transport cost (C_t) to the nearest service center (in this case abroad) plus the value
of customs duties/taxes = €1,500 (estimate based on previous cases).
The total savings of currently identifiable maintenance costs (C_m) are then:

\[C_m = C_v + C_t \tag{6}\]
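As a hedged numerical sketch of this comparison (the motor power and per-kW winding cost below are assumptions for illustration; only the €1,500 transport/customs estimate comes from the text):

    P_VM_kW = 250.0          # assumed high-voltage motor power, kW
    C_PVM = 40.0             # assumed winding cost per unit power, EUR/kW

    C_v = P_VM_kW * C_PVM    # burnt winding cost, C_v = P_VM * C_PVM (Eq. 5)
    C_t = 1500.0             # transport and customs estimate from the text

    C_m = C_v + C_t          # total avoidable accident-only costs (Eq. 6)
    print(f"C_v = {C_v:.0f} EUR, C_t = {C_t:.0f} EUR, C_m = {C_m:.0f} EUR")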
Fig. 37 Electrical cabinet of the vibrodiagnostic system with VSE001 PLC measuring units
manufactured by IFM electronic
Fig. 38 Vibration sensors at measuring points of critical positions presses the paper machine
6 Conclusion
References
19. Veltkamp A (2001) Vibration introduction course: SKF condition monitoring, in computer-
ized maintenance management system and enterprise resource planning, Nigerian Society of
Engineers, Lagos, Nigeria, pp 1.1–4.5
20. ISO 10816–3 (1995) Displacement, International standard
21. Kim YH, Tan ACC, Mathew J, Yang BS (2006) Condition monitoring of low speed bearings:
a comparative study of the ultrasound technique versus vibration measurements. In: Mathew J,
Kennedy J, Ma L, Tan A, Anderson D (eds) Engineering asset management. Springer, London.
https://doi.org/10.1007/978-1-84628-814-2_21
22. Moubray J (2000) Maintenance management: a new paradigm, strategic technologies, Inc.,
Aladon Ltd, UK, pp 7–11. http://www.maintenanceresources.com/RCM/Maintparadigm
Reliability Assessment of Replaceable
Shuffle Exchange Network
with an Additional Stage Using
Interval-Valued Universal Generating
Function
Abstract In this paper, a Shuffle Exchange Network with an additional stage (SEN+)
is inspected, the probabilities of whose components are not known with accuracy.
To overcome this difficulty, we determine its reliability by using the method of the
Interval-valued universal generating function (IUGF), and thus the transient state
probabilities are obtained in intervals. The reliability has been analyzed in terms
of three parameters: Terminal, Broadcast, and Network reliability. Also, within the
observed network, if any Switching Element (SE) fails and the system stops working,
we replace that SE at a fixed replacement rate to continue the operation of the network.
A numerical example is also provided to offer a practical clarification of the proposed
technique.
1 Introduction
it becomes very vital to analyze the reliability of SEN+ incorporating this
uncertainty, which will prominently upgrade the credibility of the reliability analysis of
the network. The causes of the uncertainties in complex systems are:
(a) Temperature, strain, moisture, vibration, stress, and so forth are the environmental
factors that result in uncertainties within the device and its segments.
(b) When evaluating a complex system, in some cases it is very difficult to obtain
exact reliability data.
(c) If we routinely utilize a system, its overall performance deteriorates with time,
and consequently the state probabilities of its components shift with time.
Interval-valued universal generating function (IUGF) is a way to investigate the
reliability of a network having uncertainties. Li et al. [8] proposed a way to appraise
the reliability of a multi-state system (MSS) when the available data about the
components is not sufficient. In such cases, instead of considering the precise values of
the probabilities, the interval-valued probabilities of the components can be considered.
To acquire the interval-valued reliability of an MSS, an IUGF was built up, and
from the results it can be observed that the proposed method is efficient when the
state probabilities of the components are imprecise (or uncertain). Pan et al. [10] gave
an approach for the assessment of the interval-valued reliability of an MSS incorporating
uncertainty. They constructed the algorithm for the IUGF method and validated
it with examples. Kumar et al. [7] considered a 2-out-of-4 system comprising two
components organized in a series pattern and evaluated its interval-valued reliability
using the IUGF technique. Singh [9] appraised the reliability and MTTF of a
non-repairable MSS by using the IUGF approach, analyzing the system's reliability
by considering the uncertainties in the probabilities and the failure rates of the
components of the considered system.
Also, all previous researchers considered a SEN+ in which a failed component
cannot be replaced by a good one. In practice, however, we need a network in which
any broken component can be replaced so that the entire network does not stop
working.
Keeping all these facts in mind, we have considered a SEN+, the probabilities of
whose parts are not known with accuracy, and determined its reliability by using the
method of IUGF. Thus, the transient state probabilities are evaluated in intervals.
The reliability is examined with regard to three parameters: Terminal, Broadcast, and
Network reliability. Also, in the considered network, if any of the SEs fails and the
network stops operating, that SE can be replaced to continue the operation of the
network.
The proposed SEN+ is of size 8 × 8. The differential equations governing the
network's behavior have been obtained by applying the supplementary variable
technique. The probabilities of all the components are obtained by using the Laplace
transform, and hence the reliability of the network is computed. Also, the MTTF of
the SEN+ is analyzed. Finally, we demonstrate the model with a numerical example.
2 Assumptions
1. To begin with, the network is in good condition, i.e., all the nodes and links are
working properly.
2. The network considered is an 8 × 8 SEN+ wherein each SE is of size 2 × 2.
3. Each of the network's components can be in a working state or in a failed state.
4. Only if the network fails completely is it replaced; after replacement, the network
becomes as good as new.
5. The failure rates of different components are different, while the replacement
rates of all the components are assumed to be the same.
3 Notations
4 Definitions
The RBD for the terminal reliability of SEN+ is shown in Fig. 1 and the transition
state diagram for the considered SEN+ is shown in Fig. 2.
(a) Formulation of mathematical model:
Applying the supplementary variable technique to the proposed model, the set of
difference-differential equations for the lower-bound probabilities of the components
of the model is obtained as:

\[\Big(\tfrac{d}{dt}+\underline{D}\Big)\underline{P}_0(t)=\int_0^{\infty}\underline{P}_1(x,t)\eta(x)\,dx+\int_0^{\infty}\underline{P}_8(x,t)\eta(x)\,dx+\int_0^{\infty}\underline{P}_9(x,t)\eta(x)\,dx+\int_0^{\infty}\underline{P}_{10}(x,t)\eta(x)\,dx+\int_0^{\infty}\underline{P}_{11}(x,t)\eta(x)\,dx+\int_0^{\infty}\underline{P}_{12}(x,t)\eta(x)\,dx \tag{1}\]
\[\Big(\tfrac{\partial}{\partial x}+\tfrac{\partial}{\partial t}+\eta(x)\Big)\underline{P}_1(x,t)=0 \tag{2}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_4\Big)\underline{P}_2(t)=\underline{\lambda}_2\underline{P}_0(t) \tag{3}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_2\Big)\underline{P}_3(t)=\underline{\lambda}_4\underline{P}_0(t) \tag{4}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_5\Big)\underline{P}_4(t)=\underline{\lambda}_3\underline{P}_0(t) \tag{5}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_3\Big)\underline{P}_5(t)=\underline{\lambda}_5\underline{P}_0(t) \tag{6}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_2+\underline{\lambda}_4\Big)\underline{P}_6(t)=\underline{\lambda}_5\underline{P}_4(t)+\underline{\lambda}_3\underline{P}_5(t) \tag{7}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_3+\underline{\lambda}_5\Big)\underline{P}_7(t)=\underline{\lambda}_2\underline{P}_3(t)+\underline{\lambda}_4\underline{P}_2(t) \tag{8}\]
\[\Big(\tfrac{\partial}{\partial x}+\tfrac{\partial}{\partial t}+\eta(x)\Big)\underline{P}_i(x,t)=0,\quad i=8,\ldots,12 \tag{9--13}\]
Boundary conditions:
Initial conditions:
\(\underline{P}_0(t)=1\) at t = 0, and all other state probabilities are zero at t = 0.
(b) Solution of the model:
On taking Laplace transform of Eqs. (1) to (13) along with the boundary conditions
(14) to (19) and using the initial conditions, we get:
\[(s+\underline{D})\underline{P}_0(s)=1+\int_0^{\infty}\underline{P}_1(x,s)\eta(x)\,dx+\int_0^{\infty}\underline{P}_8(x,s)\eta(x)\,dx+\int_0^{\infty}\underline{P}_9(x,s)\eta(x)\,dx+\int_0^{\infty}\underline{P}_{10}(x,s)\eta(x)\,dx+\int_0^{\infty}\underline{P}_{11}(x,s)\eta(x)\,dx+\int_0^{\infty}\underline{P}_{12}(x,s)\eta(x)\,dx \tag{20}\]
\[(s+\underline{\lambda}_4)\underline{P}_2(s)=\underline{\lambda}_2\underline{P}_0(s) \tag{21}\]
\[(s+\underline{\lambda}_2)\underline{P}_3(s)=\underline{\lambda}_4\underline{P}_0(s) \tag{22}\]
\[(s+\underline{\lambda}_5)\underline{P}_4(s)=\underline{\lambda}_3\underline{P}_0(s) \tag{23}\]
\[(s+\underline{\lambda}_3)\underline{P}_5(s)=\underline{\lambda}_5\underline{P}_0(s) \tag{24}\]
\[(s+\underline{\lambda}_2+\underline{\lambda}_4)\underline{P}_6(s)=\underline{\lambda}_5\underline{P}_4(s)+\underline{\lambda}_3\underline{P}_5(s) \tag{25}\]
\[(s+\underline{\lambda}_3+\underline{\lambda}_5)\underline{P}_7(s)=\underline{\lambda}_4\underline{P}_2(s)+\underline{\lambda}_2\underline{P}_3(s) \tag{26}\]
\[\Big(s+\tfrac{\partial}{\partial x}+\eta(x)\Big)\underline{P}_i(s,x)=0,\quad i=8,\ldots,12 \tag{27--31}\]
Boundary conditions:
The equations from (20) to (37) lead to the following transition state probabilities of
the considered model:
\[\underline{P}_0(s)=\frac{1}{s+\underline{D}} \tag{38}\]
\[\underline{P}_1(s)=\underline{\lambda}_1\frac{1-S(s)}{s}\underline{P}_0(s) \tag{39}\]
\[\underline{P}_2(s)=\frac{\underline{\lambda}_2}{s+\underline{\lambda}_4}\underline{P}_0(s) \tag{40}\]
\[\underline{P}_3(s)=\frac{\underline{\lambda}_4}{s+\underline{\lambda}_2}\underline{P}_0(s) \tag{41}\]
\[\underline{P}_4(s)=\frac{\underline{\lambda}_3}{s+\underline{\lambda}_5}\underline{P}_0(s) \tag{42}\]
\[\underline{P}_5(s)=\frac{\underline{\lambda}_5}{s+\underline{\lambda}_3}\underline{P}_0(s) \tag{43}\]
\[\underline{P}_6(s)=\frac{\underline{\lambda}_5\underline{\lambda}_3}{s+\underline{\lambda}_2+\underline{\lambda}_4}\Big(\frac{1}{s+\underline{\lambda}_5}+\frac{1}{s+\underline{\lambda}_3}\Big)\underline{P}_0(s) \tag{44}\]
\[\underline{P}_7(s)=\frac{\underline{\lambda}_2\underline{\lambda}_4}{s+\underline{\lambda}_3+\underline{\lambda}_5}\Big(\frac{1}{s+\underline{\lambda}_2}+\frac{1}{s+\underline{\lambda}_4}\Big)\underline{P}_0(s) \tag{45}\]
\[\underline{P}_8(s)=\underline{\lambda}_1\frac{1-S(s)}{s}\underline{P}_6(s) \tag{46}\]
\[\underline{P}_9(s)=\underline{\lambda}_4\frac{1-S(s)}{s}\underline{P}_4(s) \tag{47}\]
\[\underline{P}_{10}(s)=\underline{\lambda}_3\frac{1-S(s)}{s}\underline{P}_3(s) \tag{48}\]
\[\underline{P}_{11}(s)=\underline{\lambda}_5\frac{1-S(s)}{s}\underline{P}_5(s) \tag{49}\]
\[\underline{P}_{12}(s)=\underline{\lambda}_6\frac{1-S(s)}{s}\underline{P}_0(s) \tag{50}\]

where \(\underline{D}=\underline{\lambda}_1+\underline{\lambda}_2+\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5+\underline{\lambda}_6\).
Similarly, we can find the expressions for the upper bounds of the transition state
probabilities of the network's components by replacing \(\underline{P}_i(s)\) by \(\overline{P}_i(s)\) (i = 0 to 12)
and \(\underline{\lambda}_j\) by \(\overline{\lambda}_j\) (j = 1 to 6) in Eqs. (38) to (50).
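The inverse Laplace step behind these expressions can be verified symbolically. The following SymPy sketch (added for illustration, not part of the original chapter) inverts the transforms (38) and (40); D and the λ symbols stand for the lower-bound rates:

    import sympy as sp

    s, t = sp.symbols("s t", positive=True)
    D, lam2, lam4 = sp.symbols("D lambda_2 lambda_4", positive=True)

    P0_s = 1 / (s + D)                      # Eq. (38)
    P2_s = lam2 / ((s + lam4) * (s + D))    # Eq. (40) with P0(s) inserted

    P0_t = sp.inverse_laplace_transform(P0_s, s, t)
    P2_t = sp.simplify(sp.inverse_laplace_transform(P2_s, s, t))
    print(P0_t)   # exp(-D*t) (times Heaviside(t))
    print(P2_t)   # lambda_2*(exp(-lambda_4*t) - exp(-D*t))/(D - lambda_4)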
(i) Interval-valued Reliability of the network:
The interval-valued terminal reliability of the replaceable SEN+, i.e., the sum of the
working-state probabilities, is given as:

\[R=\big[\underline{P}_0+\underline{P}_2+\underline{P}_3+\underline{P}_4+\underline{P}_5+\underline{P}_6+\underline{P}_7,\;\overline{P}_0+\overline{P}_2+\overline{P}_3+\overline{P}_4+\overline{P}_5+\overline{P}_6+\overline{P}_7\big] \tag{51}\]
The lower bound of the MTTF for the TR of SEN+ is obtained as:

\[(\mathrm{MTTF})_{\mathrm{TR}}=\frac{1}{\underline{D}}+\frac{\underline{\lambda}_2}{\underline{D}\,\underline{\lambda}_4}+\frac{\underline{\lambda}_4}{\underline{D}\,\underline{\lambda}_2}+\frac{\underline{\lambda}_3}{\underline{D}\,\underline{\lambda}_5}+\frac{\underline{\lambda}_5}{\underline{D}\,\underline{\lambda}_3}+\frac{\underline{\lambda}_3\underline{\lambda}_5}{\underline{D}(\underline{\lambda}_2+\underline{\lambda}_4)}\Big(\frac{1}{\underline{\lambda}_5}+\frac{1}{\underline{\lambda}_3}\Big)+\frac{\underline{\lambda}_2\underline{\lambda}_4}{\underline{D}(\underline{\lambda}_3+\underline{\lambda}_5)}\Big(\frac{1}{\underline{\lambda}_2}+\frac{1}{\underline{\lambda}_4}\Big) \tag{52}\]
The RBD for the BR of SEN+ is shown in Fig. 3 and the transition state diagram for
the broadcast reliability of SEN+ is shown in Fig. 4.
(a) Formulation of mathematical model:
By applying the supplementary variable technique to the proposed model, the
difference-differential equations for the lower-bound state probabilities of the network,
governing its behavior, are:

\[\Big(\tfrac{d}{dt}+\underline{E}\Big)\underline{P}_0(t)=\sum_{i\in\{3,10,11,12,13,14,17,20,21,22\}}\int_0^{\infty}\underline{P}_i(x,t)\,\eta(x)\,dx \tag{53}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5\Big)\underline{P}_1(t)=\underline{\lambda}_2\underline{P}_0(t) \tag{54}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7\Big)\underline{P}_2(t)=\underline{\lambda}_3\underline{P}_0(t) \tag{55}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_5\Big)\underline{P}_4(t)=\underline{\lambda}_4\underline{P}_1(t) \tag{56}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_4\Big)\underline{P}_5(t)=\underline{\lambda}_5\underline{P}_1(t) \tag{57}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_3+\underline{\lambda}_6+\underline{\lambda}_7\Big)\underline{P}_6(t)=\underline{\lambda}_4\underline{P}_5(t)+\underline{\lambda}_5\underline{P}_4(t) \tag{58}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_7\Big)\underline{P}_7(t)=\underline{\lambda}_6\underline{P}_2(t) \tag{59}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_6\Big)\underline{P}_8(t)=\underline{\lambda}_7\underline{P}_2(t) \tag{60}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_2+\underline{\lambda}_4+\underline{\lambda}_5\Big)\underline{P}_9(t)=\underline{\lambda}_7\underline{P}_7(t)+\underline{\lambda}_6\underline{P}_8(t) \tag{61}\]
\[\Big(\tfrac{\partial}{\partial x}+\tfrac{\partial}{\partial t}+\eta(x)\Big)\underline{P}_i(x,t)=0,\quad i=10,\ldots,14 \tag{62--66}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_7\Big)\underline{P}_{15}(t)=\underline{\lambda}_6\underline{P}_6(t) \tag{67}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_6\Big)\underline{P}_{16}(t)=\underline{\lambda}_7\underline{P}_6(t) \tag{68}\]
\[\Big(\tfrac{\partial}{\partial x}+\tfrac{\partial}{\partial t}+\eta(x)\Big)\underline{P}_{17}(x,t)=0 \tag{69}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_5\Big)\underline{P}_{18}(t)=\underline{\lambda}_4\underline{P}_9(t) \tag{70}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_4\Big)\underline{P}_{19}(t)=\underline{\lambda}_5\underline{P}_9(t) \tag{71}\]
\[\Big(\tfrac{\partial}{\partial x}+\tfrac{\partial}{\partial t}+\eta(x)\Big)\underline{P}_i(x,t)=0,\quad i=20,21,22 \tag{72--74}\]
Reliability Assessment of Replaceable Shuffle Exchange Network … 201
Boundary conditions:
Initial conditions:
\(\underline{P}_0(t)=1\) at t = 0, and all other state probabilities are zero at t = 0.
(b) Solution of the model:
By taking Laplace transform of the Eqs. (53) to (83) and using the initial conditions,
we get:
\[(s+\underline{E})\underline{P}_0(s)=1+\sum_{i\in\{3,10,11,12,13,14,17,20,21,22\}}\int_0^{\infty}\underline{P}_i(s,x)\,\eta(x)\,dx \tag{84}\]
\[(s+\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)\underline{P}_1(s)=\underline{\lambda}_2\underline{P}_0(s) \tag{85}\]
\[(s+\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)\underline{P}_2(s)=\underline{\lambda}_3\underline{P}_0(s) \tag{86}\]
\[\Big(s+\tfrac{\partial}{\partial x}+\eta(x)\Big)\underline{P}_3(s,x)=0 \tag{87}\]
\[(s+\underline{\lambda}_5)\underline{P}_4(s)=\underline{\lambda}_4\underline{P}_1(s) \tag{88}\]
\[(s+\underline{\lambda}_4)\underline{P}_5(s)=\underline{\lambda}_5\underline{P}_1(s) \tag{89}\]
\[(s+\underline{\lambda}_3+\underline{\lambda}_6+\underline{\lambda}_7)\underline{P}_6(s)=\frac{\underline{\lambda}_4\underline{\lambda}_5\underline{P}_1(s)}{s+\underline{\lambda}_4}+\frac{\underline{\lambda}_4\underline{\lambda}_5\underline{P}_1(s)}{s+\underline{\lambda}_5} \tag{90}\]
\[(s+\underline{\lambda}_7)\underline{P}_7(s)=\underline{\lambda}_6\underline{P}_2(s) \tag{91}\]
\[(s+\underline{\lambda}_6)\underline{P}_8(s)=\underline{\lambda}_7\underline{P}_2(s) \tag{92}\]
\[(s+\underline{\lambda}_6)\underline{P}_{16}(s)=\underline{\lambda}_7\underline{P}_6(s) \tag{103}\]
\[\Big(s+\tfrac{\partial}{\partial x}+\eta(x)\Big)\underline{P}_{17}(s,x)=0 \tag{104}\]
\[(s+\underline{\lambda}_5)\underline{P}_{18}(s)=\underline{\lambda}_4\underline{P}_9(s) \tag{105}\]
\[(s+\underline{\lambda}_4)\underline{P}_{19}(s)=\underline{\lambda}_5\underline{P}_9(s) \tag{106}\]
\[\Big(s+\tfrac{\partial}{\partial x}+\eta(x)\Big)\underline{P}_i(s,x)=0,\quad i=20,21,22 \tag{107--109}\]
Boundary conditions:
The above equations lead to the following transition state probabilities of the
considered model:

\[\underline{P}_0(s)=\frac{1}{s+\underline{E}} \tag{119}\]
\[\underline{P}_1(s)=\frac{\underline{\lambda}_2\underline{P}_0(s)}{s+\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5} \tag{120}\]
\[\underline{P}_2(s)=\frac{\underline{\lambda}_3\underline{P}_0(s)}{s+\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7} \tag{121}\]
\[\underline{P}_3(s)=\big(\underline{\lambda}_3\underline{P}_1(s)+\underline{\lambda}_2\underline{P}_2(s)\big)\frac{1-S(s)}{s} \tag{122}\]
\[\underline{P}_4(s)=\frac{\underline{\lambda}_2\underline{\lambda}_4\underline{P}_0(s)}{(s+\underline{\lambda}_5)(s+\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)} \tag{123}\]
\[\underline{P}_5(s)=\frac{\underline{\lambda}_2\underline{\lambda}_5\underline{P}_0(s)}{(s+\underline{\lambda}_4)(s+\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)} \tag{124}\]
\[\underline{P}_6(s)=\frac{\underline{\lambda}_2\underline{\lambda}_4\underline{\lambda}_5\underline{\lambda}_6\underline{P}_0(s)}{(s+\underline{\lambda}_4)(s+\underline{\lambda}_7)(s+\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)}+\frac{\underline{\lambda}_2\underline{\lambda}_4\underline{\lambda}_5\underline{\lambda}_6\underline{P}_0(s)}{(s+\underline{\lambda}_5)(s+\underline{\lambda}_7)(s+\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)} \tag{125}\]
\[\underline{P}_7(s)=\frac{\underline{\lambda}_3\underline{\lambda}_6\underline{P}_0(s)}{(s+\underline{\lambda}_7)(s+\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)} \tag{126}\]
\[\underline{P}_8(s)=\frac{\underline{\lambda}_3\underline{\lambda}_7\underline{P}_0(s)}{(s+\underline{\lambda}_6)(s+\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)} \tag{127}\]
\[\underline{P}_9(s)=\frac{\underline{\lambda}_3\underline{\lambda}_6\underline{\lambda}_7\underline{P}_0(s)}{(s+\underline{\lambda}_6)(s+\underline{\lambda}_2+\underline{\lambda}_4+\underline{\lambda}_5)(s+\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)}+\frac{\underline{\lambda}_3\underline{\lambda}_6\underline{\lambda}_7\underline{P}_0(s)}{(s+\underline{\lambda}_7)(s+\underline{\lambda}_2+\underline{\lambda}_4+\underline{\lambda}_5)(s+\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)} \tag{128}\]
\[\underline{P}_{10}(s)=\underline{\lambda}_1\frac{1-S(s)}{s}\underline{P}_0(s) \tag{129}\]
\[\underline{P}_{11}(s)=\underline{\lambda}_8\frac{1-S(s)}{s}\underline{P}_0(s) \tag{130}\]
\[\underline{P}_{12}(s)=\underline{\lambda}_9\frac{1-S(s)}{s}\underline{P}_0(s) \tag{131}\]
\[\underline{P}_{13}(s)=\underline{\lambda}_{10}\frac{1-S(s)}{s}\underline{P}_0(s) \tag{132}\]
\[\underline{P}_{14}(s)=\underline{\lambda}_{11}\frac{1-S(s)}{s}\underline{P}_0(s) \tag{133}\]
\[\underline{P}_{15}(s)=\frac{\underline{\lambda}_2\underline{\lambda}_4\underline{\lambda}_5\underline{\lambda}_6\underline{P}_0(s)}{(s+\underline{\lambda}_4)(s+\underline{\lambda}_7)(s+\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)}+\frac{\underline{\lambda}_2\underline{\lambda}_4\underline{\lambda}_5\underline{\lambda}_6\underline{P}_0(s)}{(s+\underline{\lambda}_5)(s+\underline{\lambda}_7)(s+\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)} \tag{134}\]
\[\underline{P}_{16}(s)=\frac{\underline{\lambda}_2\underline{\lambda}_4\underline{\lambda}_5\underline{\lambda}_7\underline{P}_0(s)}{(s+\underline{\lambda}_4)(s+\underline{\lambda}_6)(s+\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)}+\frac{\underline{\lambda}_2\underline{\lambda}_4\underline{\lambda}_5\underline{\lambda}_7\underline{P}_0(s)}{(s+\underline{\lambda}_5)(s+\underline{\lambda}_6)(s+\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)} \tag{135}\]
\[\underline{P}_{17}(s)=\big(\underline{\lambda}_7\underline{P}_{15}(s)+\underline{\lambda}_6\underline{P}_{16}(s)\big)\frac{1-S(s)}{s} \tag{136}\]
\[\underline{P}_{18}(s)=\frac{\underline{\lambda}_3\underline{\lambda}_4\underline{\lambda}_6\underline{\lambda}_7\underline{P}_0(s)}{(s+\underline{\lambda}_5)(s+\underline{\lambda}_7)(s+\underline{\lambda}_2+\underline{\lambda}_4+\underline{\lambda}_5)(s+\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)}+\frac{\underline{\lambda}_3\underline{\lambda}_4\underline{\lambda}_6\underline{\lambda}_7\underline{P}_0(s)}{(s+\underline{\lambda}_5)(s+\underline{\lambda}_6)(s+\underline{\lambda}_2+\underline{\lambda}_4+\underline{\lambda}_5)(s+\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)} \tag{137}\]
\[\underline{P}_{19}(s)=\frac{\underline{\lambda}_3\underline{\lambda}_5\underline{\lambda}_6\underline{\lambda}_7\underline{P}_0(s)}{(s+\underline{\lambda}_4)(s+\underline{\lambda}_7)(s+\underline{\lambda}_2+\underline{\lambda}_4+\underline{\lambda}_5)(s+\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)}+\frac{\underline{\lambda}_3\underline{\lambda}_5\underline{\lambda}_6\underline{\lambda}_7\underline{P}_0(s)}{(s+\underline{\lambda}_4)(s+\underline{\lambda}_6)(s+\underline{\lambda}_2+\underline{\lambda}_4+\underline{\lambda}_5)(s+\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)} \tag{138}\]
\[\underline{P}_{20}(s)=\big(\underline{\lambda}_5\underline{P}_{18}(s)+\underline{\lambda}_4\underline{P}_{19}(s)\big)\frac{1-S(s)}{s} \tag{139}\]
\[\underline{P}_{21}(s)=\underline{\lambda}_3\frac{1-S(s)}{s}\underline{P}_6(s) \tag{140}\]
\[\underline{P}_{22}(s)=\underline{\lambda}_2\frac{1-S(s)}{s}\underline{P}_9(s) \tag{141}\]
Similarly, we can find the expressions for the upper bounds of the transition state
probabilities of the network's components by replacing \(\underline{P}_i(s)\) by \(\overline{P}_i(s)\) (i = 0 to 22)
and \(\underline{\lambda}_j\) by \(\overline{\lambda}_j\) (j = 1 to 11) in Eqs. (119) to (141).
(i) Interval-valued reliability:
The interval-valued BR of the SEN+ is given as:

\[R=\big[\underline{P}_0+\underline{P}_1+\underline{P}_2+\underline{P}_4+\underline{P}_5+\underline{P}_6+\underline{P}_7+\underline{P}_8+\underline{P}_9+\underline{P}_{15}+\underline{P}_{16}+\underline{P}_{18}+\underline{P}_{19},\;\overline{P}_0+\overline{P}_1+\overline{P}_2+\overline{P}_4+\overline{P}_5+\overline{P}_6+\overline{P}_7+\overline{P}_8+\overline{P}_9+\overline{P}_{15}+\overline{P}_{16}+\overline{P}_{18}+\overline{P}_{19}\big] \tag{142}\]
The lower bound of the MTTF for the BR of SEN+ is given as:

\[\begin{aligned}(\mathrm{MTTF})_{\mathrm{BR}}={}&\frac{1}{\underline{E}}+\frac{\underline{\lambda}_2}{\underline{E}(\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)}+\frac{\underline{\lambda}_3}{\underline{E}(\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)}+\frac{\underline{\lambda}_2\underline{\lambda}_4}{\underline{E}\underline{\lambda}_5(\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)}+\frac{\underline{\lambda}_2\underline{\lambda}_5}{\underline{E}\underline{\lambda}_4(\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)}\\&+\frac{\underline{\lambda}_2\underline{\lambda}_4\underline{\lambda}_5}{\underline{E}\underline{\lambda}_4(\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)}+\frac{\underline{\lambda}_2\underline{\lambda}_4\underline{\lambda}_5}{\underline{E}\underline{\lambda}_5(\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)}+\frac{\underline{\lambda}_3\underline{\lambda}_6}{\underline{E}\underline{\lambda}_7(\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)}+\frac{\underline{\lambda}_3\underline{\lambda}_7}{\underline{E}\underline{\lambda}_6(\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)}\\&+\frac{\underline{\lambda}_3\underline{\lambda}_6\underline{\lambda}_7}{\underline{E}\underline{\lambda}_7(\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)(\underline{\lambda}_2+\underline{\lambda}_4+\underline{\lambda}_5)}+\frac{\underline{\lambda}_3\underline{\lambda}_6\underline{\lambda}_7}{\underline{E}\underline{\lambda}_6(\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)(\underline{\lambda}_2+\underline{\lambda}_4+\underline{\lambda}_5)}\\&+\frac{\underline{\lambda}_2\underline{\lambda}_4\underline{\lambda}_5\underline{\lambda}_6}{\underline{E}\underline{\lambda}_4\underline{\lambda}_7(\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)}+\frac{\underline{\lambda}_2\underline{\lambda}_4\underline{\lambda}_5\underline{\lambda}_6}{\underline{E}\underline{\lambda}_5\underline{\lambda}_7(\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)}+\frac{\underline{\lambda}_2\underline{\lambda}_4\underline{\lambda}_5\underline{\lambda}_7}{\underline{E}\underline{\lambda}_4\underline{\lambda}_6(\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)}+\frac{\underline{\lambda}_2\underline{\lambda}_4\underline{\lambda}_5\underline{\lambda}_7}{\underline{E}\underline{\lambda}_5\underline{\lambda}_6(\underline{\lambda}_3+\underline{\lambda}_4+\underline{\lambda}_5)}\\&+\frac{\underline{\lambda}_3\underline{\lambda}_4\underline{\lambda}_6\underline{\lambda}_7}{\underline{E}\underline{\lambda}_5\underline{\lambda}_7(\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)(\underline{\lambda}_2+\underline{\lambda}_4+\underline{\lambda}_5)}+\frac{\underline{\lambda}_3\underline{\lambda}_4\underline{\lambda}_6\underline{\lambda}_7}{\underline{E}\underline{\lambda}_5\underline{\lambda}_6(\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)(\underline{\lambda}_2+\underline{\lambda}_4+\underline{\lambda}_5)}\\&+\frac{\underline{\lambda}_3\underline{\lambda}_5\underline{\lambda}_6\underline{\lambda}_7}{\underline{E}\underline{\lambda}_4\underline{\lambda}_7(\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)(\underline{\lambda}_2+\underline{\lambda}_4+\underline{\lambda}_5)}+\frac{\underline{\lambda}_3\underline{\lambda}_5\underline{\lambda}_6\underline{\lambda}_7}{\underline{E}\underline{\lambda}_4\underline{\lambda}_6(\underline{\lambda}_2+\underline{\lambda}_6+\underline{\lambda}_7)(\underline{\lambda}_2+\underline{\lambda}_4+\underline{\lambda}_5)}\end{aligned} \tag{143}\]
We can determine the upper bound of the MTTF for the BR of the SEN+ by replacing
\(\underline{\lambda}_i\) by \(\overline{\lambda}_i\) (i = 1 to 11) in Eq. (143).
The network RBD for an 8 × 8 SEN+ is shown in Fig. 5 and the transition state
diagram for the network reliability of SEN+ is shown in Fig. 6.
(a) Formulation of mathematical model:
By applying the supplementary variable technique to the assumed model, the
difference-differential equations for the lower-bound state probabilities of the network,
governing its behavior, are:

\[\Big(\tfrac{d}{dt}+\underline{F}\Big)\underline{P}_0(t)=\sum_{i\in\{1,2,3,4,7,10,14,15,19,20\}}\int_0^{\infty}\underline{P}_i(x,t)\,\eta(x)\,dx \tag{144}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_6\Big)\underline{P}_5(t)=\underline{\lambda}_5\underline{P}_0(t) \tag{145}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_5\Big)\underline{P}_6(t)=\underline{\lambda}_6\underline{P}_0(t) \tag{146}\]
\[\Big(\tfrac{\partial}{\partial x}+\tfrac{\partial}{\partial t}+\eta(x)\Big)\underline{P}_7(x,t)=0 \tag{147}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_8\Big)\underline{P}_8(t)=\underline{\lambda}_7\underline{P}_0(t) \tag{148}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_7\Big)\underline{P}_9(t)=\underline{\lambda}_8\underline{P}_0(t) \tag{149}\]
\[\Big(\tfrac{\partial}{\partial x}+\tfrac{\partial}{\partial t}+\eta(x)\Big)\underline{P}_{10}(x,t)=0 \tag{150}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_{10}\Big)\underline{P}_{11}(t)=\underline{\lambda}_9\underline{P}_0(t) \tag{151}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_9\Big)\underline{P}_{12}(t)=\underline{\lambda}_{10}\underline{P}_0(t) \tag{152}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_{11}+\underline{\lambda}_{12}\Big)\underline{P}_{13}(t)=\underline{\lambda}_9\underline{P}_{12}(t)+\underline{\lambda}_{10}\underline{P}_{11}(t) \tag{153}\]
\[\Big(\tfrac{\partial}{\partial x}+\tfrac{\partial}{\partial t}+\eta(x)\Big)\underline{P}_i(x,t)=0,\quad i=14,15 \tag{154--155}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_{12}\Big)\underline{P}_{16}(t)=\underline{\lambda}_{11}\underline{P}_0(t) \tag{156}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_{11}\Big)\underline{P}_{17}(t)=\underline{\lambda}_{12}\underline{P}_0(t) \tag{157}\]
\[\Big(\tfrac{d}{dt}+\underline{\lambda}_9+\underline{\lambda}_{10}\Big)\underline{P}_{18}(t)=\underline{\lambda}_{12}\underline{P}_{16}(t)+\underline{\lambda}_{11}\underline{P}_{17}(t) \tag{158}\]
\[\Big(\tfrac{\partial}{\partial x}+\tfrac{\partial}{\partial t}+\eta(x)\Big)\underline{P}_i(x,t)=0,\quad i=19,\ldots,24 \tag{159--164}\]
Boundary conditions:
Initial conditions:
\(\underline{P}_0(t)=1\) at t = 0, and all other state probabilities are zero at t = 0.
(b) Solution of the model:
By taking Laplace transform of the Eqs. (144) to (178) and using the initial conditions,
we get:
\[(s+\underline{F})\underline{P}_0(s)=1+\sum_{i\in\{1,2,3,4,7,10,14,15,19,20\}}\int_0^{\infty}\underline{P}_i(s,x)\,\eta(x)\,dx \tag{179}\]
\[\Big(s+\tfrac{\partial}{\partial x}+\eta(x)\Big)\underline{P}_i(s,x)=0,\quad i=1,2,3,4 \tag{180--183}\]
\[(s+\underline{\lambda}_6)\underline{P}_5(s)=\underline{\lambda}_5\underline{P}_0(s) \tag{184}\]
\[(s+\underline{\lambda}_5)\underline{P}_6(s)=\underline{\lambda}_6\underline{P}_0(s) \tag{185}\]
\[\Big(s+\tfrac{\partial}{\partial x}+\eta(x)\Big)\underline{P}_7(s,x)=0 \tag{186}\]
\[(s+\underline{\lambda}_8)\underline{P}_8(s)=\underline{\lambda}_7\underline{P}_0(s) \tag{187}\]
\[(s+\underline{\lambda}_7)\underline{P}_9(s)=\underline{\lambda}_8\underline{P}_0(s) \tag{188}\]
\[\Big(s+\tfrac{\partial}{\partial x}+\eta(x)\Big)\underline{P}_{10}(s,x)=0 \tag{189}\]
\[(s+\underline{\lambda}_{10})\underline{P}_{11}(s)=\underline{\lambda}_9\underline{P}_0(s) \tag{190}\]
\[(s+\underline{\lambda}_9)\underline{P}_{12}(s)=\underline{\lambda}_{10}\underline{P}_0(s) \tag{191}\]
\[(s+\underline{\lambda}_{11}+\underline{\lambda}_{12})\underline{P}_{13}(s)=\underline{\lambda}_9\underline{P}_{12}(s)+\underline{\lambda}_{10}\underline{P}_{11}(s) \tag{192}\]
\[\Big(s+\tfrac{\partial}{\partial x}+\eta(x)\Big)\underline{P}_i(s,x)=0,\quad i=14,15 \tag{193--194}\]
\[(s+\underline{\lambda}_{12})\underline{P}_{16}(s)=\underline{\lambda}_{11}\underline{P}_0(s) \tag{195}\]
\[(s+\underline{\lambda}_{11})\underline{P}_{17}(s)=\underline{\lambda}_{12}\underline{P}_0(s) \tag{196}\]
\[(s+\underline{\lambda}_9+\underline{\lambda}_{10})\underline{P}_{18}(s)=\underline{\lambda}_{12}\underline{P}_{16}(s)+\underline{\lambda}_{11}\underline{P}_{17}(s) \tag{197}\]
\[\Big(s+\tfrac{\partial}{\partial x}+\eta(x)\Big)\underline{P}_i(s,x)=0,\quad i=19,\ldots,24 \tag{198--203}\]
Boundary conditions:
Initial conditions:
\(\underline{P}_0(t)=1\) at t = 0, and all other state probabilities are zero at t = 0.
The above equations lead to the following transition state probabilities of the model:
\[\underline{P}_0(s)=\frac{1}{s+\underline{F}} \tag{214}\]
\[\underline{P}_1(s)=\underline{\lambda}_1\frac{1-S(s)}{s}\underline{P}_0(s) \tag{215}\]
\[\underline{P}_2(s)=\underline{\lambda}_2\frac{1-S(s)}{s}\underline{P}_0(s) \tag{216}\]
\[\underline{P}_3(s)=\underline{\lambda}_3\frac{1-S(s)}{s}\underline{P}_0(s) \tag{217}\]
\[\underline{P}_4(s)=\underline{\lambda}_4\frac{1-S(s)}{s}\underline{P}_0(s) \tag{218}\]
\[\underline{P}_5(s)=\frac{\underline{\lambda}_5\underline{P}_0(s)}{s+\underline{\lambda}_6} \tag{219}\]
\[\underline{P}_6(s)=\frac{\underline{\lambda}_6\underline{P}_0(s)}{s+\underline{\lambda}_5} \tag{220}\]
\[\underline{P}_7(s)=\big(\underline{\lambda}_6\underline{P}_5(s)+\underline{\lambda}_5\underline{P}_6(s)\big)\frac{1-S(s)}{s} \tag{221}\]
\[\underline{P}_8(s)=\frac{\underline{\lambda}_7\underline{P}_0(s)}{s+\underline{\lambda}_8} \tag{222}\]
\[\underline{P}_9(s)=\frac{\underline{\lambda}_8\underline{P}_0(s)}{s+\underline{\lambda}_7} \tag{223}\]
\[\underline{P}_{10}(s)=\underline{\lambda}_8\frac{1-S(s)}{s}\underline{P}_8(s) \tag{224}\]
\[\underline{P}_{11}(s)=\frac{\underline{\lambda}_9\underline{P}_0(s)}{s+\underline{\lambda}_{10}} \tag{225}\]
\[\underline{P}_{12}(s)=\frac{\underline{\lambda}_{10}\underline{P}_0(s)}{s+\underline{\lambda}_9} \tag{226}\]
\[\underline{P}_{13}(s)=\frac{\underline{\lambda}_9\underline{\lambda}_{10}\underline{P}_0(s)}{(s+\underline{\lambda}_9)(s+\underline{\lambda}_{11}+\underline{\lambda}_{12})}+\frac{\underline{\lambda}_9\underline{\lambda}_{10}\underline{P}_0(s)}{(s+\underline{\lambda}_{10})(s+\underline{\lambda}_{11}+\underline{\lambda}_{12})} \tag{227}\]
\[\underline{P}_{14}(s)=\underline{\lambda}_{11}\frac{1-S(s)}{s}\underline{P}_{13}(s) \tag{228}\]
\[\underline{P}_{15}(s)=\underline{\lambda}_{12}\frac{1-S(s)}{s}\underline{P}_{13}(s) \tag{229}\]
\[\underline{P}_{16}(s)=\frac{\underline{\lambda}_{11}\underline{P}_0(s)}{s+\underline{\lambda}_{12}} \tag{230}\]
\[\underline{P}_{17}(s)=\frac{\underline{\lambda}_{12}\underline{P}_0(s)}{s+\underline{\lambda}_{11}} \tag{231}\]
\[\underline{P}_{18}(s)=\frac{\underline{\lambda}_{11}\underline{\lambda}_{12}\underline{P}_0(s)}{(s+\underline{\lambda}_{12})(s+\underline{\lambda}_9+\underline{\lambda}_{10})}+\frac{\underline{\lambda}_{11}\underline{\lambda}_{12}\underline{P}_0(s)}{(s+\underline{\lambda}_{11})(s+\underline{\lambda}_9+\underline{\lambda}_{10})} \tag{232}\]
\[\underline{P}_{19}(s)=\underline{\lambda}_9\frac{1-S(s)}{s}\underline{P}_{18}(s) \tag{233}\]
\[\underline{P}_{20}(s)=\underline{\lambda}_{10}\frac{1-S(s)}{s}\underline{P}_{18}(s) \tag{234}\]
\[\underline{P}_{21}(s)=\underline{\lambda}_{13}\frac{1-S(s)}{s}\underline{P}_0(s) \tag{235}\]
\[\underline{P}_{22}(s)=\underline{\lambda}_{14}\frac{1-S(s)}{s}\underline{P}_0(s) \tag{236}\]
\[\underline{P}_{23}(s)=\underline{\lambda}_{15}\frac{1-S(s)}{s}\underline{P}_0(s) \tag{237}\]
\[\underline{P}_{24}(s)=\underline{\lambda}_{16}\frac{1-S(s)}{s}\underline{P}_0(s) \tag{238}\]
The interval-valued network reliability of the SEN+ is given as:

\[R=\big[\underline{P}_0+\underline{P}_5+\underline{P}_6+\underline{P}_8+\underline{P}_9+\underline{P}_{11}+\underline{P}_{12}+\underline{P}_{13}+\underline{P}_{16}+\underline{P}_{18},\;\overline{P}_0+\overline{P}_5+\overline{P}_6+\overline{P}_8+\overline{P}_9+\overline{P}_{11}+\overline{P}_{12}+\overline{P}_{13}+\overline{P}_{16}+\overline{P}_{18}\big] \tag{239}\]
The lower bound of the MTTF for the NR of the considered SEN+ can be determined
by using Eq. (240) as:
\[\begin{aligned}(\mathrm{MTTF})_{\mathrm{NR}}={}&\frac{1}{\underline{F}}+\frac{\underline{\lambda}_5}{\underline{F}\underline{\lambda}_6}+\frac{\underline{\lambda}_6}{\underline{F}\underline{\lambda}_5}+\frac{\underline{\lambda}_7}{\underline{F}\underline{\lambda}_8}+\frac{\underline{\lambda}_8}{\underline{F}\underline{\lambda}_7}+\frac{\underline{\lambda}_9}{\underline{F}\underline{\lambda}_{10}}+\frac{\underline{\lambda}_{10}}{\underline{F}\underline{\lambda}_9}\\&+\frac{\underline{\lambda}_{10}}{\underline{F}(\underline{\lambda}_{11}+\underline{\lambda}_{12})}+\frac{\underline{\lambda}_9}{\underline{F}(\underline{\lambda}_{11}+\underline{\lambda}_{12})}+\frac{\underline{\lambda}_{11}}{\underline{F}\underline{\lambda}_{12}}+\frac{\underline{\lambda}_{12}}{\underline{F}\underline{\lambda}_{11}}+\frac{\underline{\lambda}_{12}}{\underline{F}(\underline{\lambda}_9+\underline{\lambda}_{10})}+\frac{\underline{\lambda}_{11}}{\underline{F}(\underline{\lambda}_9+\underline{\lambda}_{10})}\end{aligned} \tag{240}\]
One can determine the upper bound of the MTTF for the NR in a similar manner, by
replacing \(\underline{\lambda}_i\) by \(\overline{\lambda}_i\) (i = 1 to 16) in Eq. (240).
8 Numerical Illustration
Let the lower and upper bounds of the failure rates of the SEN+ be, respectively,
λ1 = 0.01, λ2 = 0.03, λ3 = 0.05, λ4 = 0.07, λ5 = 0.09, λ6 = 0.11 (lower bounds) and
λ1 = 0.02, λ2 = 0.04, λ3 = 0.06, λ4 = 0.08, λ5 = 0.1, λ6 = 0.13 (upper bounds).
On substituting the assumed values of the failure rates in Eqs. (38) to (50) and
taking their inverse Laplace transforms, we get the expressions for the various
transition state probabilities. Tables 1 and 2 show the changes in the transition state
probabilities of the operating states with respect to time.
From Eq. (51), we can determine the variation of the terminal reliability of the
replaceable SEN+ with time, which is presented in Table 3 and shown in Fig. 7.
By using Eq. (52), we can find the bounds of the MTTF by varying the bounds of the
failure rates, as presented in Tables 4, 5, 6 and 7; the changes in the bounds of the
MTTF with different failure rates are displayed in Figs. 8 and 9, where B stands for
the lower-bound rates λ1, λ2, λ3, λ4, λ5, λ6 in Fig. 8 and for the upper-bound rates
in Fig. 9.
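The MTTF computation reduces to evaluating expression (52) at the two rate sets. A minimal Python sketch of this evaluation (an illustrative addition that simply encodes Eq. (52)):

    def mttf_tr(l1, l2, l3, l4, l5, l6):
        """Evaluate expression (52) for (MTTF)_TR at one set of rates."""
        D = l1 + l2 + l3 + l4 + l5 + l6
        return (1/D + l2/(D*l4) + l4/(D*l2) + l3/(D*l5) + l5/(D*l3)
                + l3*l5/(D*(l2 + l4)) * (1/l5 + 1/l3)
                + l2*l4/(D*(l3 + l5)) * (1/l2 + 1/l4))

    lower = (0.01, 0.03, 0.05, 0.07, 0.09, 0.11)   # lower-bound rates
    upper = (0.02, 0.04, 0.06, 0.08, 0.10, 0.13)   # upper-bound rates

    # Evaluating (52) at each rate set gives the two ends of the MTTF interval.
    print(mttf_tr(*lower), mttf_tr(*upper))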
Fig. 7 Lower and upper bounds of the terminal reliability (TR) of the SEN+ versus time
Let the upper and lower bounds of the failure rates of the proposed SEN+ be,
respectively, λ1 = 0.02, λ2 = 0.04, λ3 = 0.06, λ4 = 0.08, λ5 = 0.1, λ6 = 0.12, λ7 = 0.14,
λ8 = 0.16, λ9 = 0.18, λ10 = 0.2, λ11 = 0.22, λ12 = 0.24 (upper bounds) and λ1 = 0.01,
λ2 = 0.03, λ3 = 0.05, λ4 = 0.07, λ5 = 0.09, λ6 = 0.11, λ7 = 0.13, λ8 = 0.15, λ9 = 0.17,
λ10 = 0.19, λ11 = 0.21, λ12 = 0.23 (lower bounds).
Substituting the values of the failure rates in Eqs. (119) to (141) and taking their
inverse Laplace transforms, we get the expressions for the various transition state
probabilities. Tables 8, 9 and 10 show the changes in the transition state probabilities
of the operating states of the SEN with variation in time.
Using Eq. (142), we can evaluate the variation of the reliability of the considered
replaceable SEN with respect to time, which is given in Table 11 and depicted in
Fig. 10.
Using Eq. (143), we can determine the upper and lower bounds of the MTTF of the
SEN+ corresponding to its BR. By varying the failure rates, the various bounds obtained
are presented in Tables 12, 13, 14, 15, 16, 17, 18 and 19, and their graphical versions
are given in Figs. 11 and 12, where C stands for the lower-bound failure rates
λ1, λ2, λ3, λ4, λ5, λ6, λ7, λ8, λ9, λ10, λ11 in Fig. 11 and for the upper-bound failure
rates in Fig. 12.
Let the lower and upper bounds of the failure rates of the proposed SEN+ be,
respectively, λ1 = 0.01, λ2 = 0.03, λ3 = 0.05, λ4 = 0.07, λ5 = 0.09, λ6 = 0.11,
λ7 = 0.13, λ8 = 0.15, λ9 = 0.17, λ10 = 0.19, λ11 = 0.21, λ12 = 0.23, λ13 = 0.25,
λ14 = 0.27, λ15 = 0.28, λ16 = 0.29 (lower bounds) and λ1 = 0.02, λ2 = 0.04, λ3 = 0.06,
λ4 = 0.08, λ5 = 0.1, λ6 = 0.12, λ7 = 0.14, λ8 = 0.16, λ9 = 0.18, λ10 = 0.2, λ11 = 0.22,
λ12 = 0.24, λ13 = 0.26, λ14 = 0.28, λ15 = 0.285, λ16 = 0.295 (upper bounds).
Putting the values of the failure rates in Eqs. (214) to (238) and taking their inverse
Laplace transforms, we get the expressions for the various transition state probabilities.
Tables 20, 21 and 22 show the changes in the transition state probabilities of the
operating states with respect to time.
Using Eq. (239), we can find the variation of the network reliability of the
replaceable SEN+ with time, which is given in Table 23 and shown in Fig. 13.
Fig. 10 Lower and upper bounds of the broadcast reliability (BR) of the SEN+ versus time
Table 15 MTTF w. r. t. λ10, λ11

Failure rate   MTTF w. r. t. λ10          MTTF w. r. t. λ11
0.1            [4.772304, 27.954032]      [4.897891, 27.954032]
0.2            [4.229997, 27.954032]      [4.328369, 27.954032]
0.3            [3.798364, 27.954032]      [3.877497, 27.954032]
0.4            [3.560932, 27.954032]      [3.511695, 27.954032]
0.5            [3.154574, 27.954032]      [3.208963, 27.954032]
0.6            [2.896932, 27.954032]      [2.934981, 27.954032]
0.7            [2.6557518, 27.954032]     [2.737057, 27.954032]
0.8            [2.402058, 27.954032]      [2.549587, 27.954032]
0.9            [2.355947, 27.954032]      [2.386152, 27.954032]
0.91           [2.34113, 27.954032]       [2.370953, 27.954032]
Table 16 MTTF w. r. t. λ1, λ2, λ3

Failure rate   MTTF w. r. t. λ1             MTTF w. r. t. λ2           MTTF w. r. t. λ3
0.1            [1.26434352, 3.09359758]     [1.26434352, 1.654956]     [1.26434352, 1.536492]
0.2            [1.26434352, 2.78423782]     [1.26434352, 1.761904]     [1.26434352, 2.390001]
0.3            [1.26434352, 2.531125]       [1.26434352, 2.009773]     [1.26434352, 3.199891]
0.4            [1.26434352, 2.320198]       [1.26434352, 2.298437]     [1.26434352, 3.961509]
0.5            [1.26434352, 2.1513129]      [1.26434352, 2.595035]     [1.26434352, 4.676214]
0.6            [1.26434352, 1.988741]       [1.26434352, 2.886925]     [1.26434352, 5.346969]
0.7            [1.26434352, 1.8307822]      [1.26434352, 3.168934]     [1.26434352, 5.977091]
0.8            [1.26434352, 1.740149]       [1.26434352, 3.43903]      [1.26434352, 6.569825]
0.9            [1.26434352, 1.637787]       [1.26434352, 3.696612]     [1.26434352, 7.128206]
0.91           [1.26434352, 1.628209]       [1.26434352, 3.721682]     [1.26434352, 7.182264]
Table 17 MTTF w. r. t. λ4, λ5, λ6

Failure rate   MTTF w. r. t. λ4             MTTF w. r. t. λ5           MTTF w. r. t. λ6
0.1            [1.26434352, 2.8468719]      [1.26434352, 2.78429]      [1.26434352, 3.85526]
0.2            [1.26434352, 2.4268227]      [1.26434352, 2.68398]      [1.26434352, 3.81627]
0.3            [1.26434352, 2.2214942]      [1.26434352, 2.2845]       [1.26434352, 3.757654]
0.4            [1.26434352, 2.10470249]     [1.26434352, 2.14352]      [1.26434352, 3.6225]
0.5            [1.26434352, 2.03287682]     [1.26434352, 2.04436]      [1.26434352, 3.516455]
0.6            [1.26434352, 1.9860925]      [1.26434352, 1.983529]     [1.26434352, 3.475505]
0.7            [1.26434352, 1.82286785]     [1.26434352, 1.897982]     [1.26434352, 3.388433]
0.8            [1.26434352, 1.488691]       [1.26434352, 1.769752]     [1.26434352, 3.287852]
0.9            [1.26434352, 1.27589]        [1.26434352, 1.668726]     [1.26434352, 3.161679]
0.91           [1.26434352, 1.255834]       [1.26434352, 1.6483829]    [1.26434352, 3.14492]
Table 18 MTTF w. r. t. λ7, λ8, λ9

Failure rate   MTTF w. r. t. λ7           MTTF w. r. t. λ8             MTTF w. r. t. λ9
0.1            [1.26434352, 3.92946]      [1.26434352, 3.66342118]     [1.26434352, 3.01864]
0.2            [1.26434352, 3.87847]      [1.26434352, 3.42744197]     [1.26434352, 2.887845]
0.3            [1.26434352, 3.7123]       [1.26434352, 3.30020844]     [1.26434352, 2.78429]
0.4            [1.26434352, 3.68763]      [1.26434352, 3.19098693]     [1.26434352, 2.67742]
0.5            [1.26434352, 3.55954]      [1.26434352, 3.0881076]      [1.26434352, 2.55070]
0.6            [1.26434352, 3.42497]      [1.26434352, 2.9854911]      [1.26434352, 2.44289]
0.7            [1.26434352, 3.383862]     [1.26434352, 2.84319]        [1.26434352, 2.376763]
0.8            [1.26434352, 3.292675]     [1.26434352, 2.74945]        [1.26434352, 2.209894]
0.9            [1.26434352, 3.15458]      [1.26434352, 2.686855]       [1.26434352, 2.16682]
0.91           [1.26434352, 3.126243]     [1.26434352, 2.643192]       [1.26434352, 2.13875]
Table 19 MTTF w. r. t. λ10, λ11

Failure rate   MTTF w. r. t. λ10           MTTF w. r. t. λ11
0.1            [1.26434352, 2.8634218]     [1.26434352, 2.209894]
0.2            [1.26434352, 2.74945]       [1.26434352, 2.198549]
0.3            [1.26434352, 2.642497]      [1.26434352, 2.142497]
0.4            [1.26434352, 2.559854]      [1.26434352, 2.024987]
0.5            [1.26434352, 2.320989]      [1.26434352, 1.985416]
0.6            [1.26434352, 2.242455]      [1.26434352, 1.820984]
0.7            [1.26434352, 2.17494]       [1.26434352, 1.74945]
0.8            [1.26434352, 2.0985413]     [1.26434352, 1.64247]
0.9            [1.26434352, 2.042497]      [1.26434352, 1.498519]
0.91           [1.26434352, 2.009894]      [1.26434352, 1.474945]
The bounds of the MTTF at various failure rates are given in Tables 24, 25, 26,
27, 28, 29, 30, 31, 32, 33, 34 and 35 and are represented graphically in Figs. 14 and
15, where λA stands for the lower-bound failure rates λ1, λ2, λ3, λ4, λ5, λ6, λ7, λ8,
λ9, λ10, λ11, λ12, λ13, λ14, λ15, λ16 in Fig. 14 and for the corresponding upper-bound
failure rates in Fig. 15.
Fig. 11 MTTF versus λ1, λ2, λ3, λ4, λ5, λ6, λ7, λ8, λ9, λ10, λ11
The proposed model studies how one can analyze the bounds of reliability of a
replaceable shuffle exchange network incorporating uncertainties by using the IUGF
technique. The reliability bounds of the proposed SEN+ have been determined based
on three reliability parameters: TR, BR, and NR. From the outcomes accomplished
it can be concluded that the reliability bounds of all the three reliability indices of the
SEN+ decrease with increasing time, which can be observed from Figs. 7, 10 and 13.
Thus, the performance of the network deteriorates with time. From the outcomes, it
was also observed that among the three reliability indices, the TR bounds are the
highest, followed by the BR bounds and then the NR bounds.
In addition to the bounds of reliability of the network, the bounds of the MTTF
of the network at different failure rates were also evaluated and the results obtained
can be summed up as:
Fig. 13 Lower and upper bounds of the network reliability (NR) of the SEN+ versus time
Table 31 MTTF w. r. t. λ4, λ5, λ6

Failure rate   MTTF w. r. t. λ4   MTTF w. r. t. λ5   MTTF w. r. t. λ6
0.1 [3.835971515, 4.34453] [3.835971515, 4.36641] [3.835971515, 4.40186]
0.2 [3.835971515, 4.1812] [3.835971515, 4.33703] [3.835971515, 4.48618]
0.3 [3.835971515, 4.02971] [3.835971515, 4.44362] [3.835971515, 4.6746]
0.4 [3.835971515, 3.88881] [3.835971515, 4.57498] [3.835971515, 4.87625]
0.5 [3.835971515, 3.75743] [3.835971515, 4.70988] [3.835971515, 5.00642]
0.6 [3.835971515, 3.65638] [3.835971515, 4.84193] [3.835971515, 5.26432]
0.7 [3.835971515, 3.51962] [3.835971515, 4.9689] [3.835971515, 5.44485]
0.8 [3.835971515, 3.42843] [3.835971515, 5.09006] [3.835971515, 5.67248]
0.9 [3.835971515, 3.31011] [3.835971515, 5.20528] [3.835971515, 5.77759]
0.91 [3.835971515, 3.30029] [3.835971515, 5.21647] [3.835971515, 5.72164]
(i) By observing Fig. 11, it can be seen that with increasing values of
λ1, λ4, λ5, λ6, λ7, λ8, λ9, λ10 and λ11 the lower bound of the MTTF decreases, while
it increases with increasing λ2 and λ3. The upper bound of the MTTF remains
constant with respect to all the parameters.
(ii) From Fig. 12, it can be observed that the upper bound of the MTTF
of the considered network decreases with increase in the values of
λ1, λ4, λ5, λ6, λ7, λ8, λ9, λ10, λ11 and increases with increasing values of λ2 and λ3. The
lower bound of the MTTF remains unchanged for all the mentioned parameters.
(c) MTTF with respect to Network Reliability:
(i) From Fig. 14 it can be observed that on increasing the values of the failure rates λ1, λ3, λ4, λ7, λ8, λ9, λ10, λ11, λ12, λ13, λ14, λ15 and λ16, the lower bound of the MTTF of the proposed SEN decreases, whereas with increasing λ6 the lower bound of the MTTF increases. Also, as λ2 and λ5 increase, the lower bound of the MTTF first decreases and then increases. Here also the upper bound remains constant.
(ii) From Fig. 15 it can be seen that the upper bound of the MTTF decreases with increasing values of the parameters λ1, λ2, λ3, λ4, λ7, λ8, λ9, λ10, λ11, λ12, λ13, λ14, λ15 and λ16, while it increases with increasing λ6. It can also be observed that if λ5 is increased, the upper bound of the MTTF first decreases and then increases. Also, the lower bound of the MTTF remains the same for all the failure rates.
One can observe from the results that the lower bound of the MTTF of the network is lowest with respect to λ1 and highest with respect to λ4, both corresponding to the NR of the network. The lowest and highest values of the MTTF are 0.1562452 and 20.39972, respectively. Also, the upper bound of the MTTF is lowest with respect to the parameter λ4, with value 1.255834, and highest with respect to λ2, with value 27.954032. The lowest and highest values of the upper bound of the MTTF are both obtained corresponding to the BR of the SEN+.
10 Conclusion
In this paper, we have considered a SEN+ whose component probabilities are not known with accuracy and determined its reliability using the IUGF method. The transient state probabilities are obtained as intervals. The reliability is evaluated with respect to three parameters: Terminal, Broadcast, and Network reliability. Also, in the considered network, if any SE fails and the network stops operating, that SE can be replaced to keep the network in operation. The proposed SEN+ is of size 8 × 8. The RBDs for the Terminal, Broadcast, and Network reliability of the SEN+ have been presented. The differential equations governing the network's behavior have been obtained with the aid of the supplementary variable technique. The probabilities of all the components are obtained using the Laplace transform method, and hence the reliability of the network is computed. Furthermore, the MTTF is examined, and finally the model is illustrated with a numerical example.
References
1. Bisht S, Singh SB (2020) Assessment of reliability and signature of Benes network using
universal generating function. Life Cycle Reliab Saf Eng
2. Bistouni F, Jahanshahi M (2014) Improved extra group network: a new fault-tolerant multistage interconnection network. J Supercomput 69(1):161–199
3. Bistouni F, Jahanshahi M (2014) Analyzing the reliability of shuffle-exchange networks using
reliability block diagrams. Reliab Eng Syst Saf 132:97–106
4. Bistouni F, Jahanshahi M (2018) Rearranging links: a cost-effective approach to improve
the reliability of multistage interconnection networks. Int J Internet Technol Secured Trans
8(3):336–373
5. Bistouni F, Jahanshahi M (2019) Determining the reliability importance of switching elements
in the shuffle-exchange networks. Int J Parallel Emergent Distrib Syst 34(4):448–476
6. Fard NS, Gunawan I (2005) Terminal reliability improvement of shuffle-exchange network
systems. Int J Reliab Qual Saf Eng 12(01):51–60
7. Kumar A, Singh SB, Ram M (2016) Interval-valued reliability assessment of 2-out-of-4 system.
In: 2016 international conference on emerging trends in communication technologies (ETCT).
IEEE, pp 1–4
8. Li CY, Chen X, Yi XS, Tao JY (2011) Interval-valued reliability analysis of multi-state systems.
IEEE Trans Reliab 60(1):323–330
9. Singh SB (2017) Reliability analysis of multi-state complex system having two multi-state
subsystems under uncertainty. J Reliab Stat Stud 10(1):161–177
10. Pan G, Shang CX, Liang YY, Cai JY, Li DY (2016) Analysis of interval-valued reliability of
multi-state system in consideration of epistemic uncertainty. In: International conference on
P2P, parallel, grid, cloud and internet computing. Springer, Cham, pp 69–80
11. Rajkumar S, Goyal NK (2016) Review of multistage interconnection networks reliability and
fault-tolerance. IETE Tech Rev 33(3):223–230
12. Sharma S, Kahlon KS, Bansal PK (2009) Reliability and path length analysis of irregular fault
tolerant multistage interconnection network. ACM SIGARCH Comput Architecture News
37(5):16–23
13. Yunus NAM, Othman M (2011) Shuffle exchange network in multistage interconnection
network: a review and challenges. Int J Comput Electr Eng 3(5):724
14. Yunus NAM, Othman M (2015) Reliability evaluation for shuffle exchange interconnection
network. Procedia Comput Sci 59:162–170
Modeling Software Vulnerability
Injection-Discovery Process
Incorporating Time-Delay and VIKOR
Based Ranking
M. Agarwal
Amity School of Business, Amity University Uttar Pradesh, Noida 201303, India
D. Aggrawal
University School of Management and Entrepreneurship, Delhi Technological University, Delhi,
India
e-mail: deeptiaggrawal@dtu.ac.in
S. Das · A. Anand (B)
Department of Operational Research, University of Delhi, Delhi 110007, India
N. Bhatt
Anil Surendra Modi School of Commerce, SVKM’s Narsee Monjee Institute of Management
Studies (Deemed to be University), Mumbai 400056, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment
of Industrial Systems, Springer Series in Reliability Engineering,
https://doi.org/10.1007/978-3-030-93623-5_10
1 Introduction
Software security is crucial to ensure data security and prevent the misuse of the software [28]. Software vulnerabilities are bugs/loopholes in a software system which make it prone to breaching [22]. The presence of a vulnerability is a potential risk to the software, whereas an exploit is the piece of code that takes advantage of a security flaw. The vulnerability discovery process acts as a security check for programmers/developers and assists in improving software quality. From the viewpoint of software security, it is very important to review whether the bugs that are discovered might be exploited by a hacker. Basically, when a software vulnerability is discovered, it follows a life cycle that consists of injection, discovery, disclosure, patching and exploitation [8]. Not all software vulnerabilities are equally harmful; they are analyzed based on their ease of detection, their exploitability and the damage they can cause.
In the arena of software engineering, fault removal phenomena and vulnerability handling have been given the utmost importance [10]. The most convenient way to deal with vulnerabilities is to provide software patches, i.e., the corrective code to overcome the loophole. Software firms tend to deploy software updates as soon as possible, and most vulnerability exploitations can be avoided by applying these patches. Common Vulnerabilities and Exposures (CVE), the vulnerability database, provides instances of publicly disclosed loopholes present in software, and many instances of vulnerabilities getting exploited can also be viewed there. As quoted by [32], the first half of the year 2019 saw more than 3800 cases of publicly disclosed software breaches, uncovering 4.1 billion compromised records. It is even more astonishing that 3.2 billion of these records were exposed by only eight breaches. Moreover, in 2017 the infamous WannaCry ransomware affected the software community on a massive scale [23]. Such breaches affect the software and its dependent processes as well, making it a prime responsibility to discover the loopholes before their exploitation by attackers, even though a huge amount of resources is required for finding, disclosing, and fixing the attack-prone areas in susceptible software. The discovery process is undertaken both internally and externally and comprises both testers (internal, part of the developing team) and users (external), jointly engaged in discovering the vulnerabilities throughout the lifespan of the software.
As described above, the life cycle of a vulnerability consists of injection, discovery, disclosure, patch release and exploitation [4], and researchers have extensively worked on each stage of the vulnerability life cycle, be it injection, discovery and disclosure, or patch scheduling policies. The present framework differs from some of the previous works in many aspects, which can be understood as below (Table 1).
The chapter has been arranged as follows: after the brief discussion on the importance of and introduction to vulnerabilities in Sect. 1, a brief literature review on the vulnerability discovery process is given in Sect. 2. Methodology and model development are discussed in Sect. 3, followed by data analysis and model validation in Sect. 4. Section 5 presents the conclusion, and lastly the list of references is provided.
2 Literature Review
3 Methodology
Infinite server queuing theory has been exhaustively utilized in various fields, such as software engineering [16] and managerial decision making [3], to name a few.
Let the counting processes {M(t), t ≥ 0} and {N(t), t ≥ 0} represent the number of vulnerabilities injected and discovered up to time t, with the process starting at initial time t = 0. Then the distribution of N(t) is given by [3]:

$$\Pr\{N(t) = n\} = \sum_{m=0}^{\infty} \Pr\{N(t) = n \mid M(t) = m\}\,\frac{V_i(t)^m e^{-V_i(t)}}{m!} \qquad (1)$$

where p(t) represents the probability that an arbitrary vulnerability gets discovered at time t, which can be defined using the Stieltjes convolution and the concept of the conditional distribution of arrival times, given as:

$$p(t) = \int_0^t F_d(t-u)\,\frac{dV_i(u)}{V_i(t)} \qquad (3)$$
Hence, the mean value function for vulnerability injection and the distribution pattern of vulnerability discovery can assist in determining the count of vulnerabilities present in the system at any instant of time. For analytical understanding, the following cases have been considered.
VDM-I: Assume the injection of vulnerabilities follows the constant pattern whereas the discovery process follows an exponential pattern, i.e., V_i(t) ∼ 1(t) and F_d(t) ∼ exp(λ). Substituting these functional forms in Eq. (5) leads to:

$$V_d(t) = v\left(1 - e^{-\lambda t}\right) \qquad (6)$$

The functional form given by Eq. (6) is similar to Rescorla's Exponential (RE) model [27].
VDM-II: Consider another case where the injection follows the constant pattern whereas the discovery follows an s-shaped pattern encompassing learning, i.e., V_i(t) ∼ 1(t) and F_d(t) ∼ logistic(λ, β). Putting these functional forms in Eq. (5) leads to:

$$V_d(t) = v\,\frac{1 - e^{-\lambda t}}{1 + \beta e^{-\lambda t}} \qquad (7)$$
The functional form of the VDM stated in Eq. (8) is equivalent to the model proposed by [15], where the researchers worked with the Weibull distribution for vulnerability discovery.
VDM-IV: If the injection and discovery distribution functions follow a one-stage Erlang distribution with the same rate, i.e., V_i(t) ∼ exp(λ) and F_d(t) ∼ exp(λ), substituting in Eq. (5) gives:

$$V_d(t) = v\left(1 - (1 + \lambda t)e^{-\lambda t}\right) \qquad (9)$$
VDM-V: Considering the case when vulnerability injection and discovery both follow exponential patterns with different rates, i.e., V_i(t) ∼ exp(λ_1) and F_d(t) ∼ exp(λ_2), using Eq. (5) we get:

$$V_d(t) = v\left[1 - \frac{1}{(\lambda_1 - \lambda_2)}\left(\lambda_1 e^{-\lambda_2 t} - \lambda_2 e^{-\lambda_1 t}\right)\right] \qquad (10)$$
VDM-VIII: Yet another case is based on a constant pattern for the injection intensity while the vulnerability discovery process follows a hump-shaped rate of detection, i.e., V_i(t) ∼ 1(t) and F_d(t) ∼ hump-shaped distribution (λ, β). Putting these functional forms in Eq. (5) leads to:

$$V_d(t) = v\left(1 - e^{-\lambda\left(\frac{1}{1+\beta e^{-\lambda t}} - \frac{1}{1+\beta}\right)}\right) \qquad (13)$$

Equation (13) defines the hump-shaped vulnerability discovery model dependent on a constant pattern of vulnerability injection, which is in line with [4].
Further, Eqs. (6)–(13) can assist in determining the count of discovered vulnerabilities based on the pattern of vulnerability injection, which can be a useful instrument for framing patching and update schedules.
4 Data Analysis and Model Validation

For parameter estimation, the statistical software SAS has been used, which is a comprehensive and flexible package for data management and analytics. Non-Linear Regression (NLR) modules have been used; NLR is an analytical method which classifies the variables as predictor and response variables. Here, we have considered time as the predictor and the vulnerability count as the response variable.
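The same NLLS fit can be reproduced outside SAS. The following is a minimal sketch using SciPy's curve_fit, with a hypothetical cumulative vulnerability series standing in for the CVE data sets, which are not reproduced here:

```python
import numpy as np
from scipy.optimize import curve_fit

def vdm1(t, v, lam):                 # Eq. (6): constant injection, exponential discovery
    return v * (1 - np.exp(-lam * t))

def vdm2(t, v, lam, beta):           # Eq. (7): constant injection, logistic discovery
    return v * (1 - np.exp(-lam * t)) / (1 + beta * np.exp(-lam * t))

# Hypothetical cumulative counts of discovered vulnerabilities per period.
t = np.arange(1, 11, dtype=float)
y = np.array([52, 110, 180, 240, 310, 355, 400, 430, 455, 470], dtype=float)

for name, f, p0 in [("VDM-I", vdm1, [600, 0.1]), ("VDM-II", vdm2, [600, 0.3, 5.0])]:
    params, _ = curve_fit(f, t, y, p0=p0, maxfev=20000)
    sse = float(np.sum((y - f(t, *params)) ** 2))
    print(f"{name}: parameters = {np.round(params, 4)}, SSE = {sse:.2f}")
```

The SSE printed here is one of the six criteria used below for model comparison.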
Figures 1 and 2 depict the goodness-of-fit curves, i.e., they display how the observed and the model-predicted vulnerabilities vary over time. Classification of models based on pictorial representation alone sometimes becomes ambiguous; hence, an approach for better classification is necessary.
VIKOR, an important multi-criteria decision making approach, was proposed by Serafim Opricovic [25, 26, 30] and is also known as the compromise ranking method. The technique helps choose the best option from various alternatives by considering conflicting criteria. Its utility lies in the fact that it facilitates decision making in situations where decision makers are unable to express their preferences. It makes use of an aggregate function expressing closeness to the ideal solution in order to rank the various alternatives. Along with the ranking, VIKOR also provides a compromise solution with an advantage rate, which maximizes the group utility for the majority and minimizes the individual regret for the opponent. The optimal solution is closest to the ideal solution and farthest from the nadir solution. The steps involved in its implementation are as follows:
Step-1: Establish a matrix of criteria and different alternatives.
Step-2: Normalize the decision matrix.
Step-3: Compute the utility measure S_i and the regret measure R_i:

$$S_i = \sum_{j=1}^{m} w_j\,\frac{x_j^* - x_{ij}}{x_j^* - x_j^-}, \qquad R_i = \max_{j}\left[w_j\,\frac{x_j^* - x_{ij}}{x_j^* - x_j^-}\right]$$

Step-4: Determine the ideal solution $x_j^*$ and the nadir solution $x_j^-$ (negative ideal solution).
Step-7: Rank the preference order. The alternative with the smallest value is the best solution.
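A compact sketch of the ranking procedure just described is given below; the decision matrix, equal weights, and the compromise parameter v = 0.5 are illustrative assumptions, not the chapter's Table 3 values:

```python
import numpy as np

def vikor(X, weights, benefit, v=0.5):
    """VIKOR ranking: rows of X are alternatives, columns are criteria."""
    X = np.asarray(X, dtype=float)
    benefit = np.asarray(benefit)
    f_star = np.where(benefit, X.max(axis=0), X.min(axis=0))   # ideal solution
    f_nadir = np.where(benefit, X.min(axis=0), X.max(axis=0))  # nadir solution
    D = weights * (f_star - X) / (f_star - f_nadir)            # weighted normalized gaps
    S, R = D.sum(axis=1), D.max(axis=1)                        # group utility, individual regret
    Q = v * (S - S.min()) / (S.max() - S.min()) \
        + (1 - v) * (R - R.min()) / (R.max() - R.min())
    return np.argsort(Q) + 1, S, R, Q                          # smallest Q gets rank 1

# Three hypothetical models scored on SSE, MSE and R^2 (first two to be minimized).
X = [[1200, 40, 0.91],
     [ 950, 32, 0.95],
     [1100, 37, 0.93]]
order, S, R, Q = vikor(X, weights=np.full(3, 1/3), benefit=[False, False, True])
print("preference order (model indices, best first):", order)
```

With six criteria per VDM, the same function ranks the eight models exactly as described in the examples below.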
The idea behind this presentation is to evaluate the appropriateness of the VIKOR method, so that an all-inclusive classification of the alternative models can be conducted by considering several attributes related to the models for a given dataset.
Example 1: Vulnerability discovery data for Mozilla Firefox (DS-I) has been collected from CVE Details [9] for assessment, optimal model selection and ranking of the models on the basis of six performance criteria: SSE, MSE, variance, root MSE, R² and adjusted R². The parameter values of the eight proposed models have been identified by applying the non-linear least squares (NLLS) regression technique, and the parameter estimates are provided in Table 2.
The values of the six performance criteria have been obtained; the value of each attribute for every VDM is given in Table 3.
Based on this comparison table, it is noticed that the selection of the best model becomes difficult. To evade this difficulty, we have applied the VIKOR method to classify
Table 2 Parameter estimates for VDS-I
Parameter VDM-I VDM-II VDM-III VDM-IV VDM-V VDM-VI VDM-VII VDM-VIII
v 9850 2670.246 4707.227 5761.476 6524.02 2870.237 4818.152 14,853.58
λ1 0.010351 0.210592 0.003688 0.067987 0.101144 0.199337 0.089582 0.210377
λ2 – – – – 0.040864 – – –
β – 14.67787 – – 4.21 0.23 16.25926
k – – 1.734185 –
the models by considering all six criteria taken together; the respective ranking of the models is shown in Table 4.
The ranking is based on the relative closeness values of the models, established by considering the six performance measures as contributing attributes in the VIKOR analysis. The model with the lowest value is given rank 1, that with the second lowest rank 2, and so on. As per the results obtained after applying the VIKOR method, VDM-IV is ranked first, followed by VDM-V. Hence, the model in which the injection and discovery distribution functions follow a one-stage Erlang distribution with the same rate attains the first rank.
Example 2: The vulnerability discovery data of Windows Server 2008 (DS-II) has been collected from CVE Details [9] for assessment, optimal model selection and ranking of the models on the basis of the same six performance criteria: SSE, MSE, variance, root MSE, R² and adjusted R². The parameter values of the eight proposed models have been identified by applying the non-linear least squares (NLLS) regression technique and are provided in Table 5.
The values of the six performance criteria have been obtained, and the value of each attribute for every VDM is given in Table 6.
Similarly, based on this table it is difficult to judge which model performs best. Thus, the VIKOR approach has been used to rank the models based on all six criteria taken together, as shown in Table 7.
The model with the lowest value is given rank 1, that with the second lowest rank 2, and so on. The results obtained show that VDM-II is ranked first, followed by VDM-VIII. The outcomes of the above examples show that not all scenarios can be best explained by all models, and their predictive capabilities are influenced accordingly. Also, VDM-I is ranked 8 on both data sets, implying that pure exponential growth in vulnerability discovery is least suited to capture its pattern.
5 Conclusion
This chapter has focused on modeling the impact of vulnerability injection on the discovery process. As quoted, "Vulnerability finding increases total software quality"; it is better to discover the vulnerabilities before they are discovered by exploiters. Here, we have formulated a unification strategy based on infinite server queuing theory for the modeling of eight different VDMs, which are based on different injection and discovery patterns. These models were tested for prediction capability on two different data sets collected from the CVE repository. Moreover, for in-depth understanding, the VIKOR ranking procedure has been used, and ranks are assigned based on six comparison criteria. It was observed that the predictive capabilities of the proposed models vary with the change in shape of the data. To the authors' knowledge, this is the first time that VDMs have been studied based on the injection pattern of vulnerabilities.
In future, we wish to extend the field of vulnerability discovery modeling to incorporate more flexibility in locating loopholes by means of a random lag function approach, thereby establishing the equivalence with the proposed approach.
References
1. Alhazmi OH, Malaiya YK, Ray I (2007) Measuring, analyzing and predicting security
vulnerabilities in software systems. Comput Secur 26(3):219–228
2. Alhazmi OH, Malaiya YK (2005) Modeling the vulnerability discovery process. In: Proceed-
ings of the 16th IEEE international symposium on software reliability engineering. IEEE,
Chicago, IL, pp 138–147
3. Anand A, Agarwal M, Aggrawal D, Singh O (2016) Unified approach for modeling innovation
adoption & optimal model selection for the diffusion process. J Adv Manage Res-An Emerald
Insight 13(2):154–178
4. Anand A, Bhatt N (2016) Vulnerability discovery modeling and weighted criteria based ranking.
J Indian Soc Probab Stat 1–10
5. Anderson R (2002) Security in open versus closed systems—the dance of Boltzmann, Coase
and Moore. Cambridge University, England, Technical report, pp 1–15
6. Arora A, Nandkumar A, Telang R (2006) Does information security attack frequency increase
with vulnerability disclosure? An empirical analysis. Inf Syst Front 8(5):350–362
7. Bhatt N, Anand A, Aggrawal D (2019) Improving system reliability by optimal allocation of
resources for discovering software vulnerabilities. Int J Qual Reliab Manage
8. Bhatt N, Anand A, Yadavalli VSS, Kumar V (2017) Modeling and characterizing software
vulnerabilities. Int J Math, Eng Manage Sci (IJMEMS) 2(4):288–299
9. CVE (2019) https://www.cvedetails.com/. Accessed 20 Jan 2020
10. Chatterjee S, Saha D, Sharma A (2021) Multi-upgradation software reliability growth model
with dependency of faults under change point and imperfect debugging. J Softw: Evol Process
e2344
11. Gao X, Zhong W, Mei S (2015) Security investment and information sharing under an alternative
security breach probability function. Inf Syst Front 17(2):423–438
12. Garg S, Singh RK, Mohapatra AK (2019) Analysis of software vulnerability classification
based on different technical parameters. Inf Sec J: A Glob Perspect 28(1–2):1–19
13. Hanebutte N, Oman PW (2005) Software vulnerability mitigation as a proper subset of software
maintenance. J Softw Maint Evol Res Pract 17(6):379–400
14. Inoue S, Yamada S (2002) A software reliability growth model based on infinite server queuing
theory. In: Proceedings 9th ISSAT international conference on reliability and quality in design.
Honolulu, HI, pp 305–309
15. Joh H, Kim J, Malaiya YK (2008) Vulnerability discovery modeling using Weibull distribution.
In: 2008 19th international symposium on software reliability engineering (ISSRE). IEEE, pp
299–300
16. Kapur PK, Pham H, Gupta A, Jha PC (2011) Software reliability assessment with OR
applications. Springer, London Limited
17. Kapur PK, Sachdeva N, Khatri SK (2015) Vulnerability discovery modeling. In: International
conference on quality, reliability, infocom technology and industrial technology management,
pp 34–54
18. Kaur J, Anand A, Singh O (2019) Modeling software vulnerability correction/fixation process incorporating time lag. In: Recent advancements in software reliability assurance. CRC Press, Boca Raton, FL, pp 39–58
19. Kudjo PK, Chen J, Brown SA, Mensah S (2019) The effect of weighted moving windows
on security vulnerability prediction. In: 2019 34th IEEE/ACM international conference on
automated software engineering workshop (ASEW). IEEE, pp 65–68
20. Kumar A, Ram M (2018) System reliability analysis based on Weibull distribution and hesitant
fuzzy set. Int J Math Eng Manag Sci 3(4):513–521. https://doi.org/10.33889/IJMEMS.2018.3.
4-037
21. Liu B, Shi L, Cai Z, Li M (2012) Software vulnerability discovery techniques: a survey. In: 2012
fourth international conference on multimedia information networking and security. IEEE, pp
152–156
22. Liu Q, Xing L (2021) Survivability and vulnerability analysis of cloud RAID systems under
disk faults and attacks. Int J Math Eng Manag Sci 6(1):15–29. https://doi.org/10.33889/IJM
EMS.2021.6.1.003
23. MSRC Team (2017) Customer Guidance for WannaCrypt attacks. Accessed 25th Jan 2020
24. Massacci F, Nguyen VH (2014) An empirical methodology to evaluate vulnerability discovery
models. IEEE Trans Softw Eng 40(12):1147–1162
25. Opricovic S (1998) Multicriteria optimization of civil engineering systems. Faculty Civ Eng,
Belgrade 2(1):5–21
26. Opricovic S, Tzeng GH (2004) Compromise solution by MCDM methods: a comparative
analysis of VIKOR and TOPSIS. Eur J Oper Res 156(2):445–455
27. Rescorla E (2005) Is finding security holes a good idea? IEEE Secur Priv 3(1):14–19
28. Ryan KT (2016) Software processes for a changing world. J Softw: Evol Process 28(4):236–240
29. Schatz D, Bashroush R (2017) Economic valuation for information security investment: a
systematic literature review. Inf Syst Front 19(5):1205–1228
30. Tong LI, Chen CC, Wang CH (2007) Optimization of multi-response processes using the
VIKOR method. The Int J Adv Manuf Technol 31(11–12):1049–1057
31. Verma R, Parihar RS, Das S (2018) Modeling software multi up-gradations with error genera-
tion and fault severity. Int J Math Eng Manag Sci 3(4):429–437. https://doi.org/10.33889/IJM
EMS.2018.3.4-030
32. Winder D (2019) https://www.forbes.com/sites/daveywinder/2019/08/20/data-breaches-exp
ose-41-billion-records-in-first-six-months-of-2019/#6e20808bd549, Accessed 25th Jan 2020
33. Woo SW, Joh H, Alhazmi OH, Malaiya YK (2011) Modeling vulnerability discovery process
in Apache and IIS HTTP servers. Comput Secur 30(1):50–62
34. Yang SS, Choi H, Joo H (2010) Vulnerability analysis of the grid data security authentication
system. Inf Secur J: A Glob Perspect 19(4):182–190
35. Younis A, Joh H, Malaiya Y (2011) Modeling learning less vulnerability discovery using a
folded distribution. In: Proceedings of SAM, vol 11, pp 617–623
Assessment of Reliability Function
and Signature of Energy Plant Complex
System
Monika Negi, Megha Shah, Akshay Kumar, Mangey Ram, and Seema Saini
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment
of Industrial Systems, Springer Series in Reliability Engineering,
https://doi.org/10.1007/978-3-030-93623-5_11
1 Introduction
reliability and signature analysis. Section 2 presents numerical formulas for computing the signature. Section 3 describes the model configuration. Section 4 provides numerical results based on Owen's method and the UGF method. Sections 5 and 6 present the results and discussion, and the conclusion, respectively.
Notation
n = total number of components in the multi-state system
p = working probability of the system
q = failure probability of the system
ϕ, ⊗ = shifting and multiplication (composition) operators
R = reliability of the system
E(X) = expected value of X, the number of failed components
E(T) = expected working time of units
S = signature of the system with l components
s = tail signature of the system with l components
C = cost of the system having i units
H = number of components associated with n units.
The signature and its measures of the various coherent systems with i.i.d. elements (see, for instance, the order statistics and reliability function methods [2, 28, 29]) can be evaluated in the following manner. The signature and its measures are defined as

$$S_l = \frac{1}{\binom{n}{n-l+1}} \sum_{\substack{H \subseteq [n] \\ |H| = n-l+1}} \phi(H) \;-\; \frac{1}{\binom{n}{n-l}} \sum_{\substack{H \subseteq [n] \\ |H| = n-l}} \phi(H) \qquad (1)$$

From the system's structure function, which is of polynomial form with i.i.d. components,

$$H(p) = \sum_{e=1}^{n} C_e \binom{n}{e} p^e q^{n-e}, \quad \text{where } C_e = \sum_{i=n-e+1}^{n} s_i,\; e = 1, 2, \ldots, n \qquad (2)$$
The tail signature of the system is

$$\bar{S}_l = \sum_{i=l+1}^{n} s_i = \frac{1}{\binom{n}{n-l}} \sum_{|H| = n-l} \phi(H)$$

with the $(n+1)$-tuple $\bar{S} = (\bar{S}_0, \ldots, \bar{S}_n)$; we have to find the tail signature along with the signature of the considered system. Using Taylor's expansion, the tail signature is

$$\bar{S}_l = \frac{(n-l)!}{n!}\, D^l P(1), \quad l = 0, 1, \ldots, n \qquad (3)$$

Furthermore,

$$E(T) = \mu \sum_{i=1}^{n} \frac{C_i}{i} \quad \text{and} \quad E(X) = \sum_{i=1}^{n} i \cdot s_i, \; i = 1, 2, \ldots, n$$

are defined to determine the cost and the anticipated lifespan of the system from the reliability function [6, 26], based on the minimal signature and the number of failed components of the system, with one as the mean value.
3 Model Description
The thermo power plant system is a generalization of the coal-fired power plant system. The thermo-plant model is a complex arrangement of apparatus which converts the chemical energy of fossil fuel into thermal energy, transfers it to water at high temperature and pressure to produce steam, and utilizes the steam for the generation of power in a turbine or steam engine. The water is transformed into steam with the assistance of the heat generated by burning coal in the boiler. Then, the exhaust steam is liquefied back to water as it passes through a condenser. The condensed steam (condensate) is driven into X4 (GS condenser) through X2 (CEP 1) and X3 (CEP 2) from X1 (condenser). Then it proceeds through X5 (drain cooler) and eventually to X9 (deaerator) after the elevation of its temperature in X6 (heater 1), X7 (heater 2) and X8 (heater 3). The complete working of the condensate system is shown in the diagram below (Fig. 1).
4 Numerical Example
From the composition operator, the u-function is obtained using the structure of the system as follows:

X10 = X1 X2 X3
X11 = X4 X5
X12 = max(X11, X6, X7, X8)
X13 = X10 X12
X = max(X9, X13)
If all the components are i.i.d., then for the structure function of the considered system p1 = p2 = ... = p9 = p. The required reliability function of the proposed system is

$$R = p + 3p^4 - 5p^5 + 5p^7 - 4p^8 + p^9 \qquad (7)$$
Substituting p = e^{-t} in Eq. (7) and integrating, the expected lifetime of the system is

$$E(T) = \int_0^{\infty}\left(e^{-t} + 3e^{-4t} - 5e^{-5t} + 5e^{-7t} - 4e^{-8t} + e^{-9t}\right)dt = 1.075 \qquad (8)$$
$$E(X) = \sum_{i=1}^{9} i \cdot s_i = 5.472 \qquad (9)$$
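The quantities in Eqs. (7)–(9) can be checked by brute-force enumeration of the structure function given above. The following sketch (assuming i.i.d. components with unit-mean exponential lifetimes, as the chapter does) reproduces the reliability polynomial, E(T) ≈ 1.075 and E(X) ≈ 5.472:

```python
from itertools import product
from math import comb
import sympy as sp

p, t = sp.symbols('p t', positive=True)
n = 9

def phi(x):
    """Structure function of the condensate system (Sect. 4)."""
    x10 = x[0] * x[1] * x[2]            # condenser, CEP 1, CEP 2 in series
    x11 = x[3] * x[4]                   # GS condenser, drain cooler in series
    x12 = max(x11, x[5], x[6], x[7])    # in parallel with heaters 1-3
    return max(x[8], x10 * x12)         # deaerator in parallel with the rest

states = list(product((0, 1), repeat=n))
# Reliability polynomial: sum of phi(x) * p^{|x|} (1-p)^{n-|x|} over all states.
R = sp.expand(sum(phi(x) * p**sum(x) * (1 - p)**(n - sum(x)) for x in states))
print(R)   # p**9 - 4*p**8 + 5*p**7 - 5*p**5 + 3*p**4 + p, i.e. Eq. (7)

# E(T) with unit-mean exponential components: substitute p = e^{-t}, Eq. (8).
print(float(sp.integrate(R.subs(p, sp.exp(-t)), (t, 0, sp.oo))))   # 1.0754...

# Tail signature via Boland's formula, then E(X) as the sum of the tail signature.
N = [sum(phi(x) for x in states if sum(x) == k) for k in range(n + 1)]
Sbar = [sp.Rational(N[n - l], comb(n, n - l)) for l in range(n + 1)]
sig = [Sbar[l - 1] - Sbar[l] for l in range(1, n + 1)]   # signature s_l
print(Sbar)                  # (1, 1, 11/12, 11/14, ...), cf. Sect. 6
print(float(sum(Sbar)))      # E(X) = 5.4722..., Eq. (9)
```

The tail-signature fractions printed here equal the values quoted in the conclusion after reduction (e.g., 66/72 = 11/12 and 396/504 = 11/14).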
5 Results and Discussion

In this work, the signature has been used for comparison of the system with respect to the working and failure probabilities of its elements. The system is in a good state according to its mean time to failure, cost analysis and the failure probability of the elements. Furthermore, most of the components are not in a substandard state, and they will execute effectively. These results are good enough for the system to perform excellently.
6 Conclusion
This chapter examines the efficacy of the various components of the thermo plant energy system, in which the UGF and the system structure have been used to calculate the reliability function. We evaluated both the signature (0, 6/72, 4752/36288, 229824/1524096, 762048/4572288, 1233792/9144576, 12192768/109734912, 73156608/658409472, 4032/36288) and the tail signature (1, 1, 66/72, 396/504, 1920/3024, 708/1512, 2016/6048, 4032/18144, 4032/36288, 0) of the system, whose values came out to be small, hence making our model more productive. From the outcomes of the case study we can say that the system's mean time to failure, 1.075, is high, while the expected cost of the system, 5.09, is moderate. If the water supply fails, the system collapses; that is one of the major limitations of the considered system.
References
8. Goyal N, Ram M (2017) Stochastic modelling of a wind electric generating power plant. Int J
Qual Reliab Manage 34(1):103–127
9. Gupta S (2019) Stochastic modelling and availability analysis of a critical engineering system.
Int J Qual Reliab Manage 36(5):782–796
10. Huang GH, Loucks DP (2000) An inexact two-stage stochastic programming model for water
resources management under uncertainty. Civil Eng Syst 17(2):95–118
11. Komal S (2019) Fuzzy reliability analysis of the compressor house unit (CHU) system in a coal
fired thermal power plant using TBGFLT technique. Int J Qual Reliab Manage 36(5):686–707
12. Kumar A, Ram M (2019) Computation interval-valued reliability of sliding window system.
Int J Math, Eng Manage Sci 4(1):108–115
13. Kumar A, Singh SB (2017) Computations of the signature reliability of the coherent system.
Int J Qual Reliab Manage 34(6):785–797
14. Kumar A, Singh SB (2018) Signature reliability of linear multi-state sliding window system.
Int J Qual Reliab Manage 35(10):2403–2413
15. Kumar A, Singh SB (2019) Signature A-within-B-from-D/G sliding window system. Int J Math,
Eng Manage Sci 4(1):95–107
16. Kumar A, Varshney A, Ram M (2015) Sensitivity analysis for casting process under stochastic
modelling. Int J Ind Eng Comput 6(3):419–432
17. Kumar P, Singh LK, Chaudhari N, Kumar C (2020) Availability analysis of safety-critical and
control systems of NPP using stochastic modeling. Ann Nuclear Energy 147:107657
18. Levitin G (2001) Redundancy optimization for multi-state system with fixed resource-
requirements and unreliable sources. IEEE Trans Reliab 50(1):52–59
19. Levitin G (2002) Optimal allocation of elements in a linear multi-state sliding window system.
Reliab Eng Syst Saf 76(3):245–254
20. Levitin G (2005) The universal generating function in reliability analysis and optimization, p
442. Springer, London. https://doi.org/10.1007/1-84628-245-4
21. Machida F, Xiang J, Tadano K, Maeno Y (2013) Composing hierarchical stochastic model
from Sys ML for system availability analysis. In: 2013 IEEE 24th international symposium on
software reliability engineering (ISSRE), pp 51–60
22. Malik SC, Munday VJ (2014) Stochastic modelling of a computer system with hardware
redundancy. Int J Comput Appl 89(7):26–30
23. Manatos A, Koutras VP, Platis AN (2016) Dependability and performance stochastic modelling
of a two-unit repairable production system with preventive maintenance. Int J Prod Res
54(21):6395–6415
24. Marichal JL, Mathonet P (2013) Computing system signatures through reliability functions.
Stat Probab Lett 83(3):710–717
25. Michael S, Mariappan V, Kamat V (2011) Stochastic modelling of failure interaction: Markov
model versus discrete event simulation. Int J Adv Oper Manage 3(1):1–18
26. Navarro J, Rubio R (2009) Computations of signatures of coherent systems with five
components. Commun Stat-Simul Comput 39(1):68–84
27. Navarro J, Rychlik T (2010) Comparisons and bounds for expected lifetimes of reliability
systems. Eur J Oper Res 207(1):309–317
28. Navarro J, Ruiz JM, Sandoval CJ (2007) Properties of coherent systems with dependent
components. Commun Stat—Theory Meth 36(1):175–191
29. Navarro J, Rychlik T, Shaked M (2007) Are the order statistics ordered? A survey of recent
results. Commun Stat—Theory Meth 36(7):1273–1290
30. Osaki S, Nakagawa T (1976) Bibliography for reliability and availability of stochastic systems.
IEEE Trans Reliab 25(4):284–287
31. Ram M, Goyal N (2016) Automated teller machine network inspection under stochastic
modelling. J Eng Sci Technol Rev 9(5):1–8
32. Rushdi AMA, Ghaleb FAM (2021) Reliability characterization of binary-imaged multi-state
coherent threshold systems. Int J Math, Eng Manage Sci 6(1):309–321
33. Sahner RA, Trivedi KS (1993) A software tool for learning about stochastic models. IEEE
Trans Educ 36(1):56–61
34. Samaniego FJ (2007) System signatures and their applications in engineering reliability, vol
110. Springer Science & Business Media. ISBN 978-0-387-71796-8
35. Samaniego FJ, Balakrishnan N, Navarro J (2009) Dynamic signatures and their use in
comparing the reliability of new and used systems. Naval Res Logistics (NRL) 56(6):577–591
36. Triantafyllou IS (2021) On the lifetime and signature of constrained (k, d)-out-of-n: F reliability
systems. Int J Math, Eng Manage Sci 6(1):66–78
37. Weng Q (2002) Land use change analysis in the Zhujiang Delta of China using satellite remote
sensing, GIS and stochastic modelling. J Environ Manage 64(3):273–284
38. Zacharof AI, Butler AP (2004) Stochastic modelling of landfill leachate and biogas production
incorporating waste heterogeneity. Model formulation and uncertainty analysis. Waste Manage
24(5):453–462
Reliability Evaluation and Cost
Optimization of Solar Air-Conditioner
Abstract Synergy files are a channel for aspiring engineers and technologists striving for a better and more sustainable world. During summers, when the temperature can soar to the mid-40s °C in many places around the world, air conditioning becomes more than just a luxury; however, it is costly to cool spaces, particularly in areas with a high level of humidity. One of the advantages of cooling over heating is that cooling is required most when there is more heat energy around to tap into, or in other words, when more energy is available; to be more specific, more solar energy is available to us. Here we are interested in the technology that uses heat from the sun to provide energy directly to the thermodynamic cycle of an air-conditioner. This research work is dedicated to evaluating the reliability measures of solar air-conditioners, which include availability, mean time to failure (MTTF), and sensitivity analysis, with their graphical representation, by using the Markov process. Along with the reliability assessment, the Particle Swarm Optimization (PSO) technique is applied with the objective of finding the minimum cost of the system while taking maximum reliability as a constraint.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment
of Industrial Systems, Springer Series in Reliability Engineering,
https://doi.org/10.1007/978-3-030-93623-5_12
1 Introduction
The need for air conditioning has increased rapidly in recent times due to global warming and climate change. The simultaneous processing of temperature, humidity, purification, and distribution of the air current in compliance with the requirements of the space needing air conditioning is defined as air conditioning [18]. In general, air conditioning is a system for controlling the humidity, ventilation, and temperature in a building or vehicle, typically to maintain a cool atmosphere in warm conditions. All these processes need a continuous energy supply to operate; indeed, air conditioning commonly takes up half of a building's electricity consumption [19]. The limited availability and environmental concerns of conventional sources of energy such as coal, petroleum, and natural gas are serious challenges in the twenty-first century, and some renewable sources of energy, like hydroelectric power plants and wind turbines, are not feasible to set up wherever required. Critical concerns of the present energy sector include the continuous increase in energy demand, the rapid depletion of conventional energy resources, and the impact on the climate [2, 3, 17, 20, 26–29]. By 2050 the demand for energy supply could double or even triple as the global population grows and developing nations expand their economies. This has already raised concerns over potential power supply difficulties, depletion of fuel sources, and the associated environmental effects.
The primary purpose of solar air conditioners is to reduce the use of grid electricity and to make greater use of a renewable, clean source of energy [7]. The expanding carbon footprint due to the growing use of air conditioners has increased interest in solar-based cooling. Solar air conditioners are widely used in commercial, residential, and industrial settings. Growing efforts to save power and increasing environmental concerns have further enlarged interest in solar air conditioners. Easy maintenance and cost effectiveness are some of their other benefits. After the installation of solar air conditioners at home, electricity consumption can drop by 50% or more. The solar panels on the rooftop do not require much care and maintenance; sometimes they might need some repair work, which is far less than for traditional air-cooling systems.
Solar air conditioners have the great advantage of the abundance of solar energy, as air conditioning is mostly used when the outside temperature is high. Solar panels work best in bright daylight and can also be used with lower sunlight, as they are equipped with batteries. These batteries can last for 12–24 h and can be recharged by sunlight. Therefore, using a solar AC can also be cost efficient.
A Markov model consisting of a state transition diagram of a solar-powered air conditioner has been designed to study reliability measures: availability, MTTF, cost, and sensitivity analysis. In this regard, Lameiro and Duff [14] presented a model approach to the generalized solar energy space heating performance analysis problem. The authors developed a stochastic model for the performance analysis of solar energy space heating systems and implemented that model in a working FORTRAN computer program. Gangloff [5] described intractable common mode failures, which were not easy to find with the regular Markov block diagram approach; to evaluate the common mode of failure, the author extended fault tree analysis techniques. In the domain of failure and repair of a system, several authors [8, 9, 12] (Cui and Li 2007) calculated the reliability of complex systems having different modes of failure and repair techniques. Tushar et al. [25] introduced a model that can capture the intertemporal dependency of solar irradiance using Markov chain segmentation. Various state transitions were created for different intervals of time for the used model to generate the solar states during daytime. Using a numerical example, the interdependency of solar states was established with proof that the solar state transition from state to state is time-dependent. They calculated a probability transition matrix using real solar data sets and discussed the improvement. Ram [16] used the copula technique to calculate reliability measures of three-state systems, and the findings of the work were significantly improved. Singh et al. [23] analyzed the cost for a machine that consisted of two subsystems; along with the cost analysis, the authors discussed various measures of reliability for the system and established the relationship between availability and time for a complex repairable system. Ram and Singh [21] studied the availability, MTTF and expected profit of a complex system; the authors implemented the Gumbel-Hougaard family copula technique in their work and concluded that in the long run the availability of the system becomes constant and the expected profit decreases rapidly.
Particle swarm optimization (PSO) is a population-based stochastic optimization technique developed by [13]. It shows some evolutionary computation properties: (1) it is initialized with a population of random solutions; (2) it searches for optima by updating generations; (3) updating generations depends on past generations. In PSO, the potential solutions, called particles, fly through the problem space by following the current optimal particles [10].
The updates of the particles are accomplished by the following conditions. Condition (A) calculates a new velocity for every particle based on its previous velocity, the particle's location at which the best fitness (p_id) has been achieved so far, and the best particle among the neighbors at which the best fitness (p_gd) has been achieved so far. Condition (B) updates every particle's position in the solution hyperspace:

$$v_{id} = w\,v_{id} + c_1\,\text{rand}()\,(p_{id} - x_{id}) + c_2\,\text{Rand}()\,(p_{gd} - x_{id}) \qquad (A)$$

$$x_{id} = x_{id} + v_{id} \qquad (B)$$

where d is the dimension, the positive constants c_1 and c_2 are learning factors, rand() and Rand() are random functions, and the inertia weight w has provided improved performance in several applications. PSO originated from
the simulation of a basic social model that describes connections between PSO and
genetic algorithms. The authors concluded PSO is a very simple and strong technique
for the optimization of several types of functions. Shi and Eberhart [22] experimen-
tally studied the efficiency of PSO. The authors showed the high rate of conver-
gence toward the optimal solution in PSO which makes the technique dependable.
The authors also suggested a new technique to make advancements in PSO. Some
authors [1, 4] also discussed the comparison of genetic algorithm and PSO tech-
niques and advancement in PSO as well. Liu et al. [15] introduced a hybrid particle
swarm optimization technique as up-gradation of PSO with the help of chaos, which
is a type of nonlinear system’s property. The authors concluded that the hybrid
PSO performs exploration as well as exploitation by implementing the chaotic local
searching performance. Jiang et al. [11] discussed an improvement in PSO by parti-
tioning the swarms into several sub-swarms, where each sub-swarm performs PSO
and after a period the whole population is shuffled, and new sub-swarm are created to
perform PSO again. This technique remarkably improved the ability to explore and
exploit the swarms. The authors concluded that the improved PSO greatly benefits
calculation accuracy and effective global optimization.
There are no destructive impacts or side-effects of solar air conditioners: they use water as a refrigerant rather than harmful coolants, and because they are solar powered, they are autonomous of the power grid and conserve energy by dispensing with the heat and friction created by the wires of a conventional air conditioner. These solar air systems work best in radiant, dry atmospheres, for example the South-western United States or Northern Africa, and since they can also work on battery power, they can perform well in rainy, cloudy weather. Figure 1 describes the block diagram of the solar air conditioner.
4 Mathematical Modelling
The following differential equations have been derived from the state transition diagram (Fig. 2) of the solar air conditioner.
Table 1 Notations
t Time
s Variable for Laplace transformation
x Supplementary variable
λ1 /λ2 /λ3 /λ4 Hazard rate of unit inverter/ battery/ solar panel/ charge controller
μ Repair rate of the system from the degraded state P1 (t) to P0 (t)
P0 (t) Probability of the initial (good) state
P1 (t) Probability of the degraded state, when the system has one broken-down solar panel unit
P2 (t) Completely failed state probability caused by battery failure
P3 (t) Completely failed state probability caused by inverter failure
P4 (t) Completely failed state probability caused by failure of both solar panels
P5 (t) Completely failed state probability caused by failure of charge controller
φ(x) Rate of repairing all the failed states
$$\left[\frac{d}{dt} + \lambda_1 + \lambda_2 + 2\lambda_3 + \lambda_4\right] P_0(t) = \mu\,P_1(t) + \int_0^{\infty} \phi(x)\left[P_2(x,t) + P_3(x,t) + P_4(x,t) + P_5(x,t)\right]dx \qquad (1)$$

$$\left[\frac{d}{dt} + \lambda_2 + \lambda_3 + \lambda_4 + \mu\right] P_1(t) = 2\lambda_3\,P_0(t) \qquad (2)$$

$$\left[\frac{\partial}{\partial x} + \frac{\partial}{\partial t} + \phi(x)\right] P_i(x,t) = 0, \quad i = 2, 3, 4, 5 \qquad (3)\text{–}(6)$$
Boundary conditions
Initial condition

$$P_0(0) = 1 \qquad (11)$$
By taking the Laplace transformation from Eqs. (1) to (10) and then using Eq. (11),
the solution obtained for the model is given below:
$$[s + \lambda_1 + \lambda_2 + 2\lambda_3 + \lambda_4]\,\bar{P}_0(s) = 1 + \mu\,\bar{P}_1(s) + \int_0^{\infty} \phi(x)\left[\bar{P}_2(x,s) + \bar{P}_3(x,s) + \bar{P}_4(x,s) + \bar{P}_5(x,s)\right]dx \qquad (12)$$

for i = 2, 3, 4, 5, with boundary conditions

$$\bar{P}_2(0,s) = \lambda_2\left[\bar{P}_0(s) + \bar{P}_1(s)\right] \qquad (19)$$

$$\bar{P}_3(0,s) = \lambda_1\left[\bar{P}_0(s) + \bar{P}_1(s)\right] \qquad (20)$$

$$\bar{P}_4(0,s) = \lambda_3\,\bar{P}_1(s) \qquad (21)$$

Solving, we obtain

$$\bar{P}_0(s) = \frac{1}{D(s)} \qquad (23)$$

where

$$D(s) = (s + \lambda_1 + \lambda_2 + 2\lambda_3 + \lambda_4) - \frac{2\lambda_3\left[\mu + (\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4)\,\bar{S}_\phi(s)\right]}{s + \lambda_2 + \lambda_3 + \lambda_4 + \mu} - (\lambda_1 + \lambda_2 + \lambda_4)\,\bar{S}_\phi(s) \qquad (24)$$

with $\bar{S}_\phi(s)$ denoting the Laplace transform of the repair-time density.
From (13),

$$\bar{P}_1(s) = \frac{2\lambda_3}{s + \lambda_2 + \lambda_3 + \lambda_4 + \mu}\,\bar{P}_0(s) \qquad (25)$$

$$\bar{P}_2(s) = \lambda_2\left[\bar{P}_0(s) + \bar{P}_1(s)\right]\frac{1 - \bar{S}_\phi(s)}{s} \qquad (26)$$

$$\bar{P}_3(s) = \lambda_1\left[\bar{P}_0(s) + \bar{P}_1(s)\right]\frac{1 - \bar{S}_\phi(s)}{s} \qquad (27)$$

$$\bar{P}_4(s) = \lambda_3\,\bar{P}_1(s)\,\frac{1 - \bar{S}_\phi(s)}{s} \qquad (28)$$

$$\bar{P}_5(s) = \lambda_4\left[\bar{P}_0(s) + \bar{P}_1(s)\right]\frac{1 - \bar{S}_\phi(s)}{s} \qquad (29)$$
It is noticed that

$$\bar{P}_{up}(s) + \bar{P}_{down}(s) = \frac{1}{s} \qquad (32)$$
4.3 Availability
[Fig. 3 Availability of the solar air conditioner versus time t (availability axis from 0.5 to 1.0, time axis from 0 to 15)]
Table 2 shows the availability of the solar air conditioner at different times, and Fig. 3 demonstrates the table graphically.
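The availability values can be cross-checked numerically by integrating the corresponding Markov equations. The sketch below uses SciPy under the simplifying assumption of constant repair rates (μ = φ = 1, an illustrative choice; the chapter allows a general repair density φ(x)) together with the failure rates used later in the chapter:

```python
import numpy as np
from scipy.integrate import solve_ivp

l1, l2, l3, l4 = 0.8, 0.03, 0.05, 0.09   # failure rates as in [6]
mu, phi = 1.0, 1.0                       # assumed constant repair rates

def markov(t, P):
    P0, P1, P2, P3, P4, P5 = P
    return [
        -(l1 + l2 + 2*l3 + l4)*P0 + mu*P1 + phi*(P2 + P3 + P4 + P5),
        2*l3*P0 - (l2 + l3 + l4 + mu)*P1,   # degraded state: one panel down
        l2*(P0 + P1) - phi*P2,              # battery failure
        l1*(P0 + P1) - phi*P3,              # inverter failure
        l3*P1 - phi*P4,                     # second panel fails from degraded state
        l4*(P0 + P1) - phi*P5,              # charge-controller failure
    ]

sol = solve_ivp(markov, (0, 15), [1, 0, 0, 0, 0, 0], t_eval=np.arange(16), rtol=1e-9)
for tv, a in zip(sol.t, sol.y[0] + sol.y[1]):   # P_up = P0 + P1
    print(f"t = {tv:4.1f}   availability = {a:.4f}")
# The long-run availability settles near 0.52 under these assumed repair rates,
# consistent with the value of about 0.5 visible in Fig. 3.
```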
The reliability of the solar air conditioner is one of its basic quality attributes; it governs the conduct of every component and can be characterized as the likelihood of the solar air conditioner performing adequately over a specified time stretch in a determined environment. With the unavailability of a repair facility, i.e., μ = 0, φ(x) = 0, and λ1 = 0.8, λ2 = 0.03, λ3 = 0.05, λ4 = 0.09 [6], the reliability in the Laplace domain is

$$\bar{R}(s) = \frac{1 + \dfrac{0.10}{s + 0.17}}{s + 1.02}$$

Now, by taking the inverse Laplace transform, the reliability of the model in terms of time t is

$$R(t) = \frac{15}{17}\,e^{-1.02\,t} + \frac{2}{17}\,e^{-0.17\,t}$$
Table 3 represents the reliability of the model for times t = 0 to t = 10, and Fig. 4 is the graphical representation of the reliability of the solar air conditioner with respect to time.
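The inversion can be verified symbolically; the following short sympy sketch confirms that the closed form stated above transforms back to the Laplace-domain expression:

```python
import sympy as sp

s, t = sp.symbols('s t', positive=True)
# Closed form obtained by partial fractions: R(t) = (15/17) e^{-1.02 t} + (2/17) e^{-0.17 t}
R_t = sp.Rational(15, 17)*sp.exp(-sp.Rational(51, 50)*t) \
    + sp.Rational(2, 17)*sp.exp(-sp.Rational(17, 100)*t)
R_bar = sp.laplace_transform(R_t, t, s, noconds=True)
target = (1 + sp.Rational(1, 10)/(s + sp.Rational(17, 100)))/(s + sp.Rational(51, 50))
print(sp.simplify(R_bar - target))                       # 0: the inversion is consistent
print([float(R_t.subs(t, tv)) for tv in range(0, 11, 2)])    # reliability over time (cf. Table 3)
print(float(sp.integrate(R_t, (t, 0, sp.oo))))           # MTTF = R_bar(0), about 1.557
```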
MTTF represents the average time until failure of the system's components. Table 4 and Fig. 5 describe the MTTF of the solar air conditioner with respect to the variation of the failure rates.
[Fig. 4 Reliability of the solar air conditioner versus time t (reliability axis from 0.0 to 1.0, time axis from 0 to 10)]
Sensitivity of the reliability of the solar air conditioner can be analyzed by taking partial derivatives of the reliability function with respect to its input parameters, after taking the inverse Laplace transformation. Fixing the values of the input parameters at λ1 = 0.8, λ2 = 0.03, λ3 = 0.05, λ4 = 0.09 [6] in the partial derivatives of the reliability expression, Table 6 contains the numerical values of the reliability sensitivity, which are demonstrated graphically in Fig. 7.
t   ∂R/∂λ1   ∂R/∂λ2   ∂R/∂λ3   ∂R/∂λ4
0 0 0 0 0
1 −0.3850328887 −0.4174266903 0.3341754247 −0.4174266903
2 −0.3099803012 −0.3969378011 0.6618856904 −0.3969378011
3 −0.2007384159 −0.3360541034 0.7658142659 −0.3360541034
4 −0.1274532735 −0.2980814046 0.7267230318 −0.2980814046
5 −0.08521134701 −0.2783179595 0.6278076002 −0.2783179595
6 −0.06124393423 −0.2661764893 0.5158654217 −0.2661764893
7 −0.04689343589 −0.2554315687 0.4116244354 −0.2554315687
8 −0.03750226945 −0.2435809364 0.3221518873 −0.2435809364
9 −0.03077464807 −0.2300916402 0.2483868571 −0.2300916402
10 −0.02560774555 −0.2152497664 0.1888986146 −0.2152497664
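These sensitivities follow directly from differentiating the repair-free reliability function derived above. A sympy sketch (using the partial-fraction form of R(t) stated earlier) reproduces the entries of Table 6:

```python
import sympy as sp

t, l1, l2, l3, l4 = sp.symbols('t lambda_1 lambda_2 lambda_3 lambda_4', positive=True)
a = l1 + l2 + 2*l3 + l4            # total exit rate of the good state P0
b = l2 + l3 + l4                   # exit rate of the degraded state P1 (mu = 0)
c = 2*l3/(l1 + l3)
R = (1 - c)*sp.exp(-a*t) + c*sp.exp(-b*t)   # repair-free reliability R(t)

vals = {l1: 0.8, l2: 0.03, l3: 0.05, l4: 0.09}
for lam in (l1, l2, l3, l4):
    dR = sp.diff(R, lam).subs(vals)
    print(str(lam), [round(float(dR.subs(t, tv)), 7) for tv in (1, 2, 3)])
# lambda_1: [-0.3850329, -0.3099803, -0.2007384]  (first column of Table 6)
# lambda_2 = lambda_4: [-0.4174267, -0.3969378, -0.3360541]
# lambda_3: [0.3341754, 0.6618857, 0.7658143]
```

Note that λ2 and λ4 enter R(t) only through the sums a and b, which is why their sensitivity columns in Table 6 coincide.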
The aim of using PSO is to minimize the cost of the system subject to the required reliability of the whole system as well as of its components. The following nonlinear programming problem is solved to obtain the results (see the sketch after the problem statement):

Minimize

$$C = K_1 R_1^{\alpha_1} + K_2 R_2^{\alpha_2} + K_3 R_3^{\alpha_3} + 2 K_4 R_4^{\alpha_4} \qquad (33)$$

subject to

$$R_2 R_3 R_4\left(2 R_1 - R_1^2\right) \ge 0.3$$
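A minimal PSO sketch for this program is given below. The chapter's K_i and α_i values are not reproduced here, so the cost coefficients and exponents are placeholders; the constraint is handled with a quadratic penalty, and the indexing (the doubled fourth cost term, the duplicated unit R1 in the reliability expression) follows Eq. (33) and the constraint as printed:

```python
import numpy as np

rng = np.random.default_rng(1)

K = np.array([100.0, 80.0, 120.0, 60.0])   # placeholder cost coefficients K_i
alpha = np.array([0.6, 0.6, 0.6, 0.6])     # placeholder exponents alpha_i
R_MIN = 0.3                                # reliability floor from the constraint above

def cost(R):                               # Eq. (33): the fourth term is doubled
    c = K * R ** alpha
    return c[0] + c[1] + c[2] + 2 * c[3]

def sys_rel(R):                            # R2 R3 R4 (2 R1 - R1^2): R1 duplicated in parallel
    return R[1] * R[2] * R[3] * (2 * R[0] - R[0] ** 2)

def fitness(R):                            # cost plus quadratic penalty for infeasibility
    return cost(R) + 1e4 * max(R_MIN - sys_rel(R), 0.0) ** 2

n_particles, iters = 30, 300
w, c1, c2 = 0.7, 1.5, 1.5                  # inertia weight and learning factors
x = rng.uniform(0.3, 0.99, (n_particles, 4))
v = np.zeros_like(x)
pbest = x.copy()
pbest_f = np.array([fitness(xi) for xi in x])
gbest = pbest[np.argmin(pbest_f)].copy()

for _ in range(iters):
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v = w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x)   # condition (A)
    x = np.clip(x + v, 0.01, 0.999)                   # condition (B), reliabilities kept in (0, 1)
    f = np.array([fitness(xi) for xi in x])
    better = f < pbest_f
    pbest[better], pbest_f[better] = x[better], f[better]
    gbest = pbest[np.argmin(pbest_f)].copy()

print("component reliabilities:", np.round(gbest, 4))
print("cost:", round(cost(gbest), 4), " system reliability:", round(sys_rel(gbest), 4))
```

With the chapter's actual coefficients, the same loop would yield the converged cost of 365.0355 reported in the results analysis.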
6 Results Analysis
In this research work, the authors have studied numerous reliability characteristics, such as availability, reliability, MTTF, and the sensitivity of reliability and MTTF of the system, under the consideration of four types of failures and employing the Markov process. Through the general investigation of the solar air conditioning system, the authors note the following observable facts. Figure 3 presents the availability of the solar air conditioner versus time. Initially, the availability of the system decreases smoothly with increasing time, but later it attains an approximately constant value. The general pattern shows that the availability of the system decreases with time and maintains a constant value (after a long time) of approximately 0.5 for the defined values of the failure rates. Figure 4 shows the pattern of the reliability of the solar air conditioner with respect to time; the reliability also decreases with time and tends to zero as time increases. The pattern of the MTTF of the solar air conditioner is illustrated in Fig. 5, where L1, L2, L3, and L4 represent the failure rates λ1, λ2, λ3 and λ4, respectively. Figures 6 and 7 are behavioral explanations of the MTTF sensitivity and the reliability sensitivity of the solar air conditioner. The MTTF sensitivity increases with variation in the failure rates λ1 and λ2, decreases with variation in λ4, and rapidly increases after a quick fall while varying the failure rate λ3. From Fig. 7, the sensitivity of the reliability of the solar air conditioner is observable; here also L1, L2, L3, and L4 represent the failure rates λ1, λ2, λ3 and λ4, respectively. Figure 8 is the PSO convergence curve for the cost of the solar air conditioner, which converges to 365.0355.
7 Conclusion
The current work proposes a Markov model of the solar air conditioner. The overall assessment suggests that with the regular upkeep of solar air conditioners, availability remains steady over time. It is worth noting that the solar air conditioner is quite sensitive to component failure; hence, it is important to manage the component failure rates in order to achieve a highly dependable solar air conditioner. The optimization technique is capable of determining the number of redundant components and their location for higher reliability and lower cost. The results of this study are very useful for technologists, engineers, and plant managers.
References
1. Angeline PJ (1998) Using selection to improve particle swarm optimization. In: IEEE
international conference on evolutionary computation. Anchorage, Alaska, May 4–9, 1998
2. Chai B, Yang Z (2014) Impacts of unreliable communication and modified regret matching
based anti-jamming approach in smart microgrid. Ad Hoc Netw 22:69–82
3. Chai B, Chen J, Yang Z, Zhang Y (2014) Demand response management with multiple utility
companies: a two-level game approach. IEEE Trans Smart Grid 5(2):722–731
4. Eberhart RC, Shi YH (1998) Comparison between genetic algorithms and particle swarm
optimization. In: 1998 annual conference on evolutionary programming. San Diego
5. Gangloff WC (1975) Common mode failure analysis. IEEE Trans Power Apparatus Syst
94(1):27–30
6. Goyal N, Ram M, Kaushik A (2017) Performability of solar thermal power plant under
reliability characteristics. Int J Syst Assurance Eng Manage 8(2):479–487
7. Gugulothu R, Somanchi NS, Banoth HB, Banothu K (2015) A review on solar powered air
conditioning system. Procedia Earth Planetary Sci 11:361–367
8. Gupta PP, Agarwal SC (1984) A parallel redundant complex system with two types of failure
under preemptive-repeat repair discipline. Microelectron Reliab 24(3):395–399
9. Gupta PP, Sharma MK (1993) Reliability and MTTF evaluation of a two duplex-unit standby
system with two types of repair. Microelectron Reliab 33(3):291–295
10. Hu X, Eberhart RC, Shi Y (2003) Engineering optimization with particle swarm. In: Proceed-
ings of the 2003 IEEE swarm intelligence symposium. SIS’03 (Cat. No. 03EX706). IEEE, pp
53–57
11. Jiang Y, Hu T, Huang C, Wu X (2007) An improved particle swarm optimization algorithm.
Appl Math Comput 193(1):231–239
12. Ke J-B, Lee W-C, Wang K-H (2007) Reliability and sensitivity analysis of a system with
multiple unreliable service stations and standby switching failures. Phys A 380:455–469
13. Kennedy J, Eberhart R (1995) Particle swarm optimization. In: Proceedings of ICNN’95-
international conference on neural networks, vol 4. IEEE, pp 1942–1948
14. Lameiro GF, Duff WS (1979) A Markov model of solar energy space and hot water heating
systems. Sol Energy 22(3):211–219
15. Liu B, Wang L, Jin YH, Tang F, Huang DX (2005) Improved particle swarm optimization
combined with chaos. Chaos, Solitons Fractals 25(5):1261–1271
16. Ram M (2010) Reliability measures of a three-state complex system: a copula approach. Appl
Appl Math: An Int J 5(10):1483–1492
17. Hassan NU, Pasha MA, Yuen C, Huang S, Wang X (2013) Impact of scheduling flexibility
on demand profile flatness and user inconvenience in residential smart grid system. Energies
6(12):6608–6635
18. Ochi M, Ohsumi K (1989) Fundamental of refrigeration and air conditioning. Ochi Engineering
Consultant Office
19. Pachauri RK, Reisinger A (2007) IPCC fourth assessment report. IPCC, Geneva
20. Khan RH, Brown J, Khan JY (2013) Pilot protection schemes over a multi-service WiMAX
network in the smart grid. In: 2013 IEEE international conference on communications
workshops (ICC). IEEE, pp 994–999
21. Ram M, Singh SB (2010) Availability, MTTF and cost analysis of complex system under
preemptive-repeat repair discipline using Gumbel-Hougaard family copula. Int J Qual Reliab
Manage
22. Shi Y, Eberhart RC (1999) Empirical study of particle swarm optimization. In: Proceedings of
the 1999 congress on evolutionary computation-CEC99 (Cat. No. 99TH8406), vol 3. IEEE, pp
1945–1950
23. Singh VV, Ram M, Rawal DK (2013) Cost analysis of an engineering system involving
subsystems in series configuration. IEEE Trans Autom Sci Eng 10(4):1124–1130
24. Tillman FA, Hwang CL, Fan LT, Lai KC (1970) Optimal reliability of a complex system. IEEE
Trans Reliab 19(3):95–100
25. Tushar W, Huang S, Yuen C, Zhang JA, Smith DB (2014) Synthetic generation of solar states
for smart grid: A multiple segment Markov chain approach. In: IEEE PES innovative smart
grid technologies. IEEE, Europe, pp 1–6
26. Tushar W, Chai B, Yuen C, Smith DB, Wood KL, Yang Z, Poor HV (2014) Three-party energy
management with distributed energy resources in smart grid. IEEE Trans Industr Electron
62(4):2487–2498
27. Tushar W, Yuen C, Chai B, Smith DB, Poor HV (2014) Feasibility of using discriminate pricing
schemes for energy trading in smart grid. In: 2014 IEEE global communications conference.
IEEE, pp 3138–3144
Reliability Evaluation and Cost Optimization … 289
28. Khalid YI, Hassan NU, Yuen C, Huang S (2014) Demand response management for power throt-
tling air conditioning loads in residential smart grids. In: 2014 IEEE international conference
on smart grid communications (SmartGridComm). IEEE, pp 650–655
29. Liu Y, Hassan NU, Huang S, Yuen C (2013) Electricity cost minimization for a residential
smart grid with distributed generation and bidirectional power transactions. In: 2013 IEEE
PES innovative smart grid technologies conference (ISGT). IEEE, pp 1–6
Analyzing Interrelationships Among
Software Vulnerabilities Using Fuzzy
DEMATEL Approach
M. Anjum
Amity Institute of Information Technology, Amity University, Noida, Uttar-Pradesh, India
P. K. Kapur
Amity Center for Inter-Disciplinary Research, Amity University, Noida, Uttar-Pradesh, India
V. Agarwal (B)
Amity International Business School, Amity University, Noida, Uttar-Pradesh, India
V. Kumar
Department of Operational Research, University of Delhi, Delhi, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 291
M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment
of Industrial Systems, Springer Series in Reliability Engineering,
https://doi.org/10.1007/978-3-030-93623-5_13
1 Introduction
Technological innovation has grown across the globe. With the rising popularity of electronic equipment, advanced sensors, and other kinds of networking technologies, such devices are generally grouped under the Internet of Things (IoT) umbrella [1]. A statistical forecast shows that the penetration of these intelligent devices in global infrastructure is anticipated to rise from 26 billion devices to over 75 billion by 2025 [2]. This rapid growth in the use of networked intelligent devices leads to a growing cybersecurity danger [3]. Computer users may have been the only individuals who needed to care about vulnerabilities in the past [4]; with the changing communication dynamics, anyone using a smartphone, smart watch, smart TV, or any other connected device is now vulnerable to the theft of their information [5]. Present-day organizations and companies are quite careful to safeguard their networks. However, even with sustained and accountable defense expenditure, they are still at risk [6]. This occurs because attackers can circumvent the security of organizations via unknown vulnerabilities that security personnel have not catalogued [7]. Even in a well-guarded network, a flaw can be uncovered by a hacker's repeated probing [8]. Intruders might also use network configuration flaws to breach the intended target [6].
Historical statistics demonstrate that computer threats of varying complexity and lethality exist in many forms. Computer vulnerabilities have also been widely documented, and group efforts have been made to compile vulnerability lists such as the National Vulnerability Database (NVD) and Common Vulnerabilities and Exposures (CVE) [9]. In recent years, the number of vulnerability reports filed with the NVD has continued to rise: 17,302 vulnerabilities were reported in 2019 [10], compared with 16,555 in 2018. IT management needs to detect and evaluate vulnerabilities on various hardware and software platforms in order to prevent vulnerability losses [11].
In the existing literature, many quantitative models have been proposed. Such models allow developers to allocate resources for the testing process, planning, and security patch creation [12–16]. In addition, developers can use Vulnerability Discovery Models (VDMs) to assess risk and estimate the redundancy needed in resources and procedures to deal with potential breaches [17]. Since repairing all vulnerabilities within the available time and budget is impractical, the security team must prioritize vulnerabilities and remedy the riskiest ones first [18]. Prioritization techniques are divided into two major groups: quantitative and qualitative. Qualitative systems offer a grading method for describing the severity of vulnerabilities in software, whereas quantitative scoring systems associate each vulnerability with a numerical result [19]. Multi-criteria decision-making (MCDM) approaches have recently become a prominent tool for assessing vulnerabilities [20]. MCDM is regarded as a sophisticated decision-making technique that incorporates both quantitative and qualitative elements [21].
The literature offers a broad range of MCDM procedures. The concepts and applications of fuzzy MCDM are continually evolving [22], as they help remove vagueness from the data. In recent years, several fuzzy MCDM techniques have been suggested and used frequently, such as fuzzy TOPSIS [23], fuzzy ELECTRE [24] and the fuzzy best-worst method (FBWM) [25]. In the context of MCDM for vulnerability prioritization, one recent study by Sibal et al. [26] provided a hybrid strategy based on Normalized Criteria Distance (NCD), the Analytic Hierarchy Process (AHP), and DEMATEL. Also, Narang et al. [27] proposed the DEMATEL method to show the interdependence among various vulnerabilities by suggesting a cause–effect theory. In conventional DEMATEL, however, assessments are based on crisp values that cannot capture ambiguity. From the preceding discussion, the following research targets can be identified:
• To identify the vulnerability types of a software system.
• To evaluate the cause-and-effect interrelationships among the identified software vulnerability types.
To address the aforementioned research questions, the manuscript proposes a framework for assessing these interrelationships among various software vulnerabilities by employing the "Fuzzy Decision-Making Trial and Evaluation Laboratory (F-DEMATEL)." DEMATEL helps reveal contextual links among the sorts of vulnerabilities, while fuzzy set theory incorporates the ambiguity of the decision-making process. The manuscript is organized in the following way. The research technique utilized in this study is described in Sect. 2; Sect. 2.1 presents the dataset and Sect. 2.2 discusses Fuzzy DEMATEL. Section 3 presents the data analysis. Section 4 concludes the study.
2 Research Methodology
The present section concentrates on the research methodology. This study first identifies the vulnerabilities and then uses the Fuzzy DEMATEL MCDM technique to discover interconnections between categories of vulnerability and to evaluate the influence one type has on another.
The data set for this study was collected from the National Vulnerability Database (NVD) [28] and CVE [29], which are legitimate databases for data gathering. The data collection comprises nine categories of vulnerabilities, shown with their notations in Table 1.
Table 1 Software vulnerability types

Notation   Vulnerability type
G1         SQL injection (SQLI)
G2         Cross site scripting (XSS)
G3         Buffer overflow (BO)
G4         Cross site request forgery (CSRF)
G5         File inclusion (FI)
G6         Code execution (CE)
G7         Information gain (IG)
G8         Gain of privileges (GP)
G9         Race condition (RC)
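In fuzzy DEMATEL, expert judgements on how strongly one vulnerability type influences another are usually collected on a linguistic scale and mapped to triangular fuzzy numbers before defuzzification. The chapter does not reproduce its scale, so the five-level scale and function names in the sketch below are assumptions, shown purely for illustration.

```python
# Hypothetical linguistic scale mapped to triangular fuzzy numbers (TFNs).
# The chapter does not list its scale; this five-level scale is assumed.
SCALE = {
    "No":       (0.00, 0.00, 0.25),
    "VeryLow":  (0.00, 0.25, 0.50),
    "Low":      (0.25, 0.50, 0.75),
    "High":     (0.50, 0.75, 1.00),
    "VeryHigh": (0.75, 1.00, 1.00),
}

def fuzzy_direct_matrix(ratings):
    """Map an n x n matrix of linguistic ratings to TFNs (xa, xb, xc)."""
    return [[SCALE[r] for r in row] for row in ratings]

# Toy 2 x 2 example for vulnerability types G1 and G2.
print(fuzzy_direct_matrix([["No", "High"], ["Low", "No"]]))
```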
Step 4.2: Compute the normalized left (lxa) and right (lxc) values:

$$lxa_{ij}^{l} = \frac{xb_{ij}^{l}}{1 + xb_{ij}^{l} - xa_{ij}^{l}} \tag{4}$$

$$lxc_{ij}^{l} = \frac{xc_{ij}^{l}}{1 + xc_{ij}^{l} - xb_{ij}^{l}} \tag{5}$$

$$k_{ij} = \frac{1}{l}\left(k_{ij}^{1} + k_{ij}^{2} + \cdots + k_{ij}^{l}\right) \tag{8}$$
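A minimal sketch of this defuzzification step follows, assuming the chapter tracks the standard CFCS method of Opricovic and Tzeng [31]: Eqs. (4) and (5) give the normalized left and right values, while the final crisp combination below is the standard CFCS formula rather than one quoted from the chapter.

```python
def cfcs_crisp(xa, xb, xc):
    """Defuzzify a normalized triangular fuzzy number (xa, xb, xc).
    Eqs. (4)-(5) give the normalized left/right values; the final
    combination is the standard CFCS step (Opricovic and Tzeng [31])."""
    lxa = xb / (1 + xb - xa)                     # Eq. (4)
    lxc = xc / (1 + xc - xb)                     # Eq. (5)
    # total normalized crisp value
    return (lxa * (1 - lxa) + lxc * lxc) / (1 - lxa + lxc)

def average_experts(ks):
    """Eq. (8): aggregate the l experts' crisp judgements by their mean."""
    return sum(ks) / len(ks)

print(cfcs_crisp(0.25, 0.50, 0.75))   # 0.5 for a symmetric TFN
print(average_experts([0.42, 0.55, 0.48]))
```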
3 Data Analysis

One measure's influence over the others is determined by defining a threshold level on the total relation matrix, which is 0.476 in this scenario. (R + C) indicates the strength of effects exercised or received, as shown in Table 6: the larger the (R + C) value, the greater the degree of interaction with the other attributes. If (R − C) is positive, the attribute affects other attributes; if it is negative, the attribute is being affected. Based on this information, G7 (Information Gain) is the most prominent attribute, with the greatest (R + C) value. Once the threshold value is calculated, Table 5 shows which vulnerability influences which: values greater than the threshold indicate an influence on the corresponding vulnerability type. Vulnerability G1 influences G3, G4, G7 and G8 with intensities of 0.476, 0.539, 0.610 and 0.602, respectively. Vulnerability type G2 influences G4, G5, G7 and G8. Type G3 influences G2, G4, G5 and G7. Vulnerabilities G2, G5 and G7 are influenced by G4. Likewise, vulnerability G6 influences all the identified vulnerability types except G5 and G6. G7, G8 and G9 influence vulnerabilities G2, G4, G5, G8 and G7, respectively. The values inside the table represent the intensity of the effect. The causal diagram in Fig. 1 is obtained by plotting the dataset (R + C, R − C) given in Table 6.
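The (R + C) and (R − C) indices can be reproduced from any crisp direct-relation matrix. The sketch below shows the standard DEMATEL computation; the threshold-as-mean rule is a common convention and an assumption here, since the chapter only reports its threshold of 0.476 without stating the rule.

```python
import numpy as np

def dematel(A):
    """Standard DEMATEL totals from a crisp direct-relation matrix A."""
    A = np.asarray(A, dtype=float)
    N = A / A.sum(axis=1).max()                   # normalize
    T = N @ np.linalg.inv(np.eye(len(A)) - N)     # total relation matrix
    R, C = T.sum(axis=1), T.sum(axis=0)           # row and column sums
    return T, R + C, R - C, T.mean()              # mean of T as threshold

# Toy 3 x 3 direct-relation matrix (illustrative values only).
T, prominence, relation, threshold = dematel([[0, 3, 2],
                                              [1, 0, 2],
                                              [1, 2, 0]])
print(np.round(T, 3), "threshold:", round(threshold, 3))
print("cause group:",  [i for i, v in enumerate(relation) if v > 0])
print("effect group:", [i for i, v in enumerate(relation) if v <= 0])
```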
On analyzing the diagram in Fig. 1, it is clear that the vulnerability metrics are visually divided into cause and effect groups. Vulnerabilities G1, G2, G5, G6 and G9 form the cause group, while vulnerability types G3, G4, G7 and G8 fall under the effect group.
4 Conclusion
In our everyday lives, software systems perform important and versatile functions. In almost every advanced and complicated system, software components are the heart and soul. Thus, many organizations widely use interdependent and networked software systems for their business decisions. However, organizations have lost large amounts of money and reputation owing to security violations and software vulnerabilities in contemporary systems. This work aims at identifying and assessing the contextual connections between the categories of software vulnerabilities, which assist the security team in minimizing risk, utilizing fuzzy theory and the DEMATEL methodology.

The nine selected vulnerabilities are classified into cause and effect groups. From Table 6 it is evident that SQLI, XSS, FI, CE and RC are causative vulnerabilities, while BO, CSRF, IG and GP are affected vulnerabilities. Based on the (R + C) values, the priority order of vulnerabilities is G7 > G5 > G2 > G4 > G8 > G3 > G1 > G9 > G6. The most critical vulnerability, with the greatest (R + C) value, turned out to be G7, which requires quick attention. Looking at the corresponding (R − C) value of G3, we find that it acts as an effect; its impact can therefore be reduced by working directly on its causes. Similarly, G4, G7 and G8 are also effects of other vulnerabilities. Security managers need to focus on the causative vulnerabilities so that the other vulnerabilities can be controlled and loss is minimized.
References
9. Dondo MG (2008) A vulnerability prioritization system using a fuzzy risk analysis approach.
In IFIP international information security conference. Springer, Boston, MA, pp 525–540
10. National Vulnerability Database, published on January 1, 2020. https://nvd.nist.gov/general/
news
11. Liu Q, Zhang Y, Kong Y, Wu Q (2012) Improving VRSS-based vulnerability prioritization
using analytic hierarchy process. J Syst Softw 85(8):1699–1708
12. Kimura M (2006) Software vulnerability: definition, modelling, and practical evaluation for
e-mail transfer software. Int J Press Vessels Pip 83(4):256–261
13. Okamura H, Tokuzane M, Dohi T (2013) Quantitative security evaluation for software system
from vulnerability database. J Softw Eng Appl 06:15
14. Kapur PK, Garg RB (1992) A software reliability growth model for an error-removal
phenomenon. Softw Eng J 7(4):291–294
15. Kansal Y, Kapur PK, Kumar U, Kumar D (2017) User-dependent vulnerability discovery model
and its interdisciplinary nature. Int J Life Cycle Reliab Saf Eng, Springer 6(1):23–29
16. Younis A, Joh H, Malaiya Y (2011) Modeling learningless vulnerability discovery using a
folded distribution. In: Proceedings of SAM, vol 11, pp 617–623
17. Arora A, Krishnan R, Nandkumar A, Telang R, Yang Y (2004) Impact of vulnerability disclosure and patch availability: an empirical analysis. In: Third workshop on the economics of information security, vol 24, pp 1268–1287
18. Anjum M, Agarwal V, Kapur PK, Khatri SK (2020) Two-phase methodology for prioritization
and utility assessment of software vulnerabilities. Int J Syst Assurance Eng Manage 11(2):289–
300
19. Liu Q, Zhang Y (2011) VRSS: a new system for rating and scoring vulnerabilities. Comput
Commun 34(3):264–273
20. Kazimieras Zavadskas E, Antucheviciene J, Chatterjee P (2019) Multiple-criteria decision-
making (MCDM) techniques for business processes information management
21. Govindan K, Rajendran S, Sarkis J, Murugesan P (2015) Multicriteria decision making
approaches for green supplier evaluation and selection: a literature review. J Clean Prod
98:66–83
22. Mardani A, Jusoh A, Zavadskas EK (2015) Fuzzy multiple criteria decision-making techniques
and applications–two decades review from 1994 to 2014. Expert Syst Appl 42(8):4126–4148
23. Zhang X, Xu Z (2015) Soft computing based on maximizing consensus and fuzzy TOPSIS
approach to interval-valued intuitionistic fuzzy group decision making. Appl Soft Comput
26:42–56
24. Chen N, Xu Z (2015) Hesitant fuzzy ELECTRE II approach: a new way to handle multi-criteria
decision-making problems. Inf Sci 2015(292):175–197
25. Anjum M, Kapur PK, Agarwal V, Khatri SK (2020) A framework for prioritizing software
vulnerabilities using fuzzy best-worst method. In: 2020 8th international conference on relia-
bility, infocom technologies and optimization (trends and future directions) (ICRITO). IEEE,
pp 311–316
26. Sibal R, Sharma R, Sabharwal S (2017) Prioritizing software vulnerability types using multi-
criteria decision-making techniques. Life Cycle Reliab Saf Eng 6(1):57–67
27. Narang S, Kapur PK, Damodaran D, Majumdar R (2018) Prioritizing types of vulnerability
on the basis of their severity in multi-version software systems using DEMATEL technique.
In: 2018 7th international conference on reliability, infocom technologies and optimization
(trends and future directions) (ICRITO). IEEE, pp 162–167
28. National Vulnerability Database, nvd.nist.gov/, 2020
29. CVE Details, The Ultimate Security Vulnerability Data source, www.cvedetails.com. 2020
30. Agarwal V, Govindan K, Darbari JD, Jha PC (2016) An optimization model for sustainable
solutions towards implementation of reverse logistics under collaborative framework. Int J Syst
Assurance Eng Manage 7(4):480–487
31. Opricovic S, Tzeng GH (2003) Defuzzification within a multicriteria decision model. Int J
Uncertainty, Fuzziness Knowl-Based Syst 11(05):635–652
Universal Generating Function Approach
for Evaluating Reliability and Signature
of All-Digital Protection Systems
1 Introduction
S. Bisht (B)
Department of Mathematics, Eternal University, Himachal Pradesh, Baru Sahib, India
S. B. Singh
Department of Mathematics, Statistics and Computer Science, G.B. Pant, University of
Agriculture and Technology, Pantnagar, India
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 301
M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment
of Industrial Systems, Springer Series in Reliability Engineering,
https://doi.org/10.1007/978-3-030-93623-5_14
control and monitor the power system. There are many extensive controlling systems in the future grid dealing with different aspects such as voltage, current, and energy distribution at transformers, substations, switching devices and smart meters [4, 21].
The latest developments in non-conventional devices and the widespread use of digital relays motivate the performance analysis of an ADPS. An IEC 61850 digital process bus connects the outputs of non-conventional devices to digital relays. IEC 61850 substations have been found to be more reliable than the traditional substations that have been in use for decades. Many of the features specified in IEC 61850, such as fibre optics, redundancy settings, and self-testing and monitoring, contribute to its increased reliability. Conventional protection systems are usually less reliable than digital protection systems, which contain more electronic devices, viz. MUs, Ethernet switches, time sources and PRs. The reliability of the system can be improved in two ways. One is replacing the copper wires with fiber optics, as proposed by the IEC 61850 process bus. The other is redundancy, viz. (i) adding redundant components to the selected MUs, named MU1 and MU2; (ii) a communication link to the protective relay (PR) and its redundant components, recognized as PR1 and PR2; and (iii) Ethernet communication media (EC) and their redundant components, named EC1, EC2, EC3, EC4, EC5, and EC6 [2].
In the past, many techniques have been used for the analysis of different reliability aspects of conventional protection systems; one of them is the Markov process. Singh and Patton [17] discussed a model of a system and its related protection system. The protection system recognizes faults and faulty components in order to prevent damage and to minimize the effect of the faulty component on the whole operation of the system. Schweitzer et al. described the reliability analysis of protective systems using fault tree analysis. Many techniques applied to the reliability analysis of conventional protection systems are also applied to digital protection systems. Schweitzer et al. first introduced the unconventional architectures of the ADPS and found their reliability with the help of the minimal path method; they also analyzed the component importance of all alternative architectures. Chauhan et al. [2] discussed the ranking of all components present in a digital protection system using a cost-based heuristic redundancy importance measure.

Reliability analysis with the help of the UGF is a novel approach in reliability engineering, especially for digital protection systems. Several authors have applied the UGF to systems such as series, parallel, l-out-of-m, and consecutive m-out-of-n configurations with identical and non-identical elements [7, 8, 18], analyzing the reliability of binary-state and multistate systems with different algorithms. Rausand and Høyland [13] studied the probabilistic and statistical reliability of different complex systems. Ushakov [19] evaluated the unreliability of a weighted (w − l + 1)-out-of-m system. Negi and Singh [12] computed the reliability, expected lifetime and Birnbaum measure of non-repairable complex systems with two subsystems A and B having weighted u-out-of-v: G and weighted m-out-of-n: G configurations, respectively, with the help of the UGF.
1.1 Definitions
1.1.1 Structure–function
A digital protection system is constructed from MUs, time sources, Ethernet switches, ECs and PRs.

The merging unit (MU) acquires AC current and voltage from conventional current and voltage transformers, digitizes the signals, and sends them over the Ethernet network in the form of sampled values.

The time source is an external synchronization source needed by the system for large-scale time synchronization; it is required to maintain the overall system reliability.

The Ethernet switch is used to develop a network connection between the attached components. It understands the packet addressing scheme and sends any data packet only to its destination ports.

The protective relay (PR) is a relay designed to trip a circuit breaker when a fault is identified. The first PRs were electromagnetic devices relying on coils to recognize irregular operating conditions such as overcurrent, overvoltage, and reverse flow of voltage and current. The purpose of a protective relay is to protect circuits and equipment from damage within a few seconds of a fault.
Zhang et al. described various alternative structure designs of the ADPS. The first structure design considers redundancy only in the PRs, whereas in the second design redundancies exist in the MUs and PRs, which are dependent on each other. The third structure design has two redundant and independent PRs, and in the fourth design the switches are cascaded so as to provide cross backup for the merging units. The fifth architecture adds redundancy in the Ethernet switches and EC (Fig. 1).

An RBD is a graphical representation of the components of a system and the associations between them, which can be used to establish the overall system reliability, even for a large-scale system. It can therefore be concluded that the RBD method is a precise method for the analysis of such systems. Many researchers have observed that the use of RBDs is one of the most important ways to analyze system reliability.
Here, we consider the five reliability block diagrams in Fig. 2, which correspond to the structure designs of the ADPS.

In reliability engineering, there are various methods for the reliability assessment of systems. The universal generating function is one of the most useful tools owing to its capability to reduce complexity and computation time. The UGF was first introduced by Ushakov [18].

The UGF of an independent discrete random variable X is expressed in polynomial form as:

$$U(z) = \sum_{m=1}^{M} p_m z^{l_m} \tag{1}$$
where the variable X has M possible values, p_m is the probability that X equals l_m, and z is a formal variable.

Consider r independent discrete random variables X_1, X_2, …, X_r. Let the UGFs of X_1, X_2, …, X_r be U_1(z), U_2(z), …, U_r(z), respectively, and let f(X_1, X_2, …, X_r) be an arbitrary function. The combination of the r UGFs is defined by the composition operator ⊗_f, whose properties strictly depend on the properties of f. The composition is defined as:

$$U(z) = \mathop{\otimes}_{f}\big(U_1(z), U_2(z), \ldots, U_r(z)\big)$$
The UGFs of a structure design containing two types of components are expressed as follows. For the series arrangement:

$$U_c(z) \mathop{\otimes}_{ser} U_d(z) = \sum_{m=1}^{M} k_{cm} z^{g_{cm}} \mathop{\otimes}_{ser} \sum_{n=1}^{N} k_{dn} z^{g_{dn}} = \sum_{m=1}^{M} \sum_{n=1}^{N} k_{cm} k_{dn} z^{ser(g_{cm},\, g_{dn})} \tag{2}$$

For the parallel arrangement:

$$U_c(z) \mathop{\otimes}_{par} U_d(z) = \sum_{m=1}^{M} k_{cm} z^{g_{cm}} \mathop{\otimes}_{par} \sum_{n=1}^{N} k_{dn} z^{g_{dn}} = \sum_{m=1}^{M} \sum_{n=1}^{N} k_{cm} k_{dn} z^{par(g_{cm},\, g_{dn})} \tag{3}$$
where ⊗_ser and ⊗_par are the composition operators over u-functions associated with series and parallel systems, respectively; ser(g_cm, g_dn) and par(g_cm, g_dn) yield the overall performance rates of the binary-state elements c and d at states m and n, respectively.
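These composition operators are straightforward to implement for binary-state elements. The sketch below is a minimal illustration, not code from the chapter; it reproduces the parallel composition of the two time sources used later, (0.9z^1 + 0.1z^0) ⊗max (0.95z^1 + 0.05z^0) = 0.995z^1 + 0.005z^0.

```python
from itertools import product

def ugf(p):
    """u-function of a binary-state element as {performance: probability}."""
    return {1: p, 0: 1.0 - p}

def compose(u1, u2, f):
    """Composition operator of Eqs. (2)-(3): probabilities multiply and
    performance rates combine through f (min for series, max for parallel)."""
    out = {}
    for (g1, p1), (g2, p2) in product(u1.items(), u2.items()):
        g = f(g1, g2)
        out[g] = out.get(g, 0.0) + p1 * p2
    return out

u_ts = compose(ugf(0.90), ugf(0.95), max)   # redundant time sources
print(u_ts)                                  # {1: 0.995, 0: 0.005} (approx.)
```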
The reliability evaluation of structure design 1 by the UGF can be done in the following two steps:
(i) Consider the structure design with non-identically distributed components.
(ii) Assume the architectures contain independent and identically distributed components.

The reliability expression of structure design 1 (R_SD1) can be obtained with the UGF as follows.
The UGFs of the different components present in the design can be expressed as:

$$u_i(z) = p_i z^1 + (1 - p_i) z^0 \tag{5}$$

where p_i is the working probability of the ith component, i = TS1, TS2, MU, EC, ES, PR1, PR2, EC1, EC2.

Now, apply the composition operator ⊗(u_TS1(z), u_TS2(z), u_MU(z), u_EC(z), u_ES(z), u_PR1(z), u_EC1(z), u_PR2(z), u_EC2(z)) to obtain the system structure function.
If X_1 is the state variable corresponding to the subsystem consisting of the redundant elements TS1 and TS2, its structure function is X_1(z) = max(TS1, TS2).

If X_2 is the state variable related to the subsystem containing the series elements MU, EC and ES, its structure function is X_2(z) = min(MU, EC, ES).

When X_5 is the state variable related to the subsystem containing the elements X_3 and X_4, the structure function is evaluated by:

X_5(z) = max(X_3, X_4)
Assuming values of p_i, i = TS1, TS2, ES, PR1, PR2, EC1, EC, and substituting them in Eq. (5), we have the UGFs of the components of the system, for example:

$$u_{PR2}(z) = 0.85z^1 + 0.15z^0, \quad u_{EC1}(z) = 0.85z^1 + 0.15z^0, \quad u_{EC}(z) = 0.8z^1 + 0.2z^0.$$

When the components are independent and identically distributed (p_i = p), the reliability of structure design 1 reduces to:

$$R_{SD1} = 2p^9 - 4p^8 - p^7 + 4p^6 \tag{6}$$

Taking into account different component reliabilities, one can get the reliability of structure design 1 as listed in Table 1 and depicted in Fig. 3.
Fig. 3 Reliability of SD 1 versus component reliability
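Equation (6) can be evaluated directly to regenerate the identical-component curve of Fig. 3; a quick illustrative sketch over the figure's x-axis range:

```python
def r_sd1(p):
    """Identical-component reliability of structure design 1, Eq. (6)."""
    return 2*p**9 - 4*p**8 - p**7 + 4*p**6

for p in (0.88, 0.90, 0.92, 0.94, 0.96, 0.98, 1.00):
    print(f"p = {p:.2f}   R_SD1 = {r_sd1(p):.6f}")   # R_SD1(1) = 1
```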
The reliability R_SD2 of structure design 2 is obtained with the UGF method as follows:

(a) Reliability of structure design 2 when the components in the system are non-identically distributed. In this case, the system structure functions are composed stepwise. To evaluate the reliability of structure design 2, first compute the following by applying composition operators to the UGFs of the components:

$$U_1(z) = u_{TS1}(z) \mathop{\otimes}_{max} u_{TS2}(z) = (0.9z^1 + 0.1z^0) \mathop{\otimes}_{max} (0.95z^1 + 0.05z^0) = 0.995z^1 + 0.005z^0$$
(b) When the components are independent and identically distributed (p_i = p), the reliability of structure design 2 is expressed as:

$$R_{SD2} = -p^{11} + 2p^{10} + 4p^9 - 8p^8 - 4p^7 + 8p^6 \tag{8}$$
Fig. 4 Reliability of SD 2 versus component reliability
The reliability of structure design 3 with the help of the UGF is given as follows:

(a) When all components in the system are non-identically distributed, the reliability of structure design 3 is computed from the subsystem structure functions.

If X_1 is the state variable corresponding to the subsystem containing the redundant elements TS1 and TS2, its structure function is X_1(z) = max(TS1, TS2).

If X_2 is the state variable related to the subsystem containing the series elements MU1, EC1, ES1, PR1 and EC3, its structure function is X_2(z) = min(MU1, EC1, ES1, PR1, EC3).

If X_4 is the state variable related to the subsystem containing the elements X_2 and X_3, then:

X_4(z) = max(X_2, X_3)

X(z) = max(X_1, X_4)

Substituting the assumed values of p_i into Eq. (5) yields the UGFs of the components.
(b) It is worth mentioning that when all components are identical (p_i = p), the reliability of structure design 3 becomes:

$$R_{SD3} = p^{12} - 2p^{11} - 2p^7 + 4p^6 \tag{10}$$
Fig. 5 Reliability of SD 3 versus component reliability
The system structure function of structure design 4, when all components in the system are non-identically distributed, is built from subsystem structure functions, including:

State variable   Subsystem structure
X_7(z)           max(PR2, EM4)
X_8(z)           max(X_6, X_7)
X(z)             min(X_1, X_4, X_5, X_8)
Again, the UGFs of the components for the considered values of p_i are obtained as before. To evaluate the system reliability of structure design 4, apply the composition operators:

$$U_1(z) = u_{TS1}(z) \mathop{\otimes}_{max} u_{TS2}(z) = (0.9z^1 + 0.1z^0) \mathop{\otimes}_{max} (0.95z^1 + 0.05z^0) = 0.995z^1 + 0.005z^0$$
(b) When the components are independent and identically distributed (p_i = p), the reliability of structure design 4 is given by:

$$R_{SD4} = -p^{12} + 2p^{11} + 4p^{10} - 8p^9 - 4p^8 + 8p^7 \tag{12}$$
Considering different component reliabilities, one can get the reliability of structure design 4 as listed in Table 4 and depicted in Fig. 6.

Fig. 6 Reliability of SD 4 versus component reliability
The reliability of structure design 5 with the help of the UGF is given as follows.

If X_2 is the state variable related to the subsystem containing the elements EC1, ES1 and EC3, the structure function is obtained as in the previous designs.

If X_4 is the state variable related to the subsystem containing the elements X_2 and X_3, then:

X_4(z) = max(X_2, X_3)

If X_7 is the state variable related to the subsystem containing the elements X_5 and X_6, then:

X_7(z) = max(X_5, X_6)

When X_8 is the state variable related to the subsystem containing the elements X_4 and X_7, then:

X_8(z) = max(X_4, X_7)

X(z) = max(X_1, X_8)
(b) The reliability of structure design 5, when the components are independent and identically distributed, is expressed as:

$$R_{SD5} = -p^{18} + 2p^{17} + 4p^{15} - 8p^{14} - 4p^{12} + 8p^{11} \tag{14}$$
Fig. 7 Reliability of SD 5 versus component reliability
The idea of the signature was first introduced by Samaniego [14] for systems whose components have continuous i.i.d. lifetimes. In recent decades, the signature has proved to be one of the most powerful tools for quantifying the reliability of coherent systems, and it is very effective in the optimal design and reliability economics of systems. Let G_1, G_2, …, G_v be i.i.d. components with a continuous distribution function, and let T be the lifetime of the system. Then the reliability at time t ≥ 0 is given by

$$P(T > t) = \sum_{u=1}^{v} s_u P(G_{u:v} > t) \tag{15}$$

where G_{1:v} ≤ G_{2:v} ≤ … ≤ G_{v:v} are the ordered component lifetimes and the signature of the uth component is s_u = P(T = G_{u:v}), u = 1, 2, …, v. The vector s = (s_1, s_2, …, s_v) is called the signature of the system.
where

$$C_a = \sum_{f=b-a+1}^{b} s_f, \quad a = 1, 2, \ldots, b$$

Step 2: Calculate the tail signature of the system $\bar{S} = (\bar{S}_0, \ldots, \bar{S}_m)$ using

$$\bar{S}_n = \sum_{e=n+1}^{m} s_e = \frac{1}{\binom{m}{m-n}} \sum_{|h| = m-n} \varphi(h) \tag{18}$$
Step 3: With the help of a Taylor expansion about x = 1, express the reliability function as the polynomial

$$p(x) = x^m h(1/x) \tag{19}$$

Step 4: Estimate the tail signature of the ADPS from the reliability function, using Eq. (18), by

$$\bar{S}_n = \frac{(m-n)!}{m!} \, D^n p(1), \quad n = 0, 1, \ldots, m \tag{20}$$

Step 5: Recover the signature from the tail signature as

$$s_n = \bar{S}_{n-1} - \bar{S}_n, \quad n = 1, 2, \ldots, m \tag{21}$$
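Steps 3–5 are mechanical once h(u) is known, so they can be checked with exact rational arithmetic. The sketch below is an illustration, not the chapter's code; it reproduces the structure design 1 results: tail signature (1, 2/3, 11/36, 1/21, 0, …) and signature (1/3, 13/36, 65/252, 1/21, 0, …), where 65/252 equals the 195/756 quoted below.

```python
from fractions import Fraction
from math import factorial

def tail_signature(h_coeffs, m):
    """Owen's method, Eqs. (19)-(20): h_coeffs maps exponent -> coefficient
    of h(p); m is the number of components. Builds p(x) = x^m h(1/x) and
    returns S_n = (m-n)!/m! * D^n p(1) for n = 0..m."""
    pc = {m - k: Fraction(c) for k, c in h_coeffs.items()}  # reversed coeffs
    S = []
    for n in range(m + 1):
        # n-th derivative of p at x = 1: sum of a * e!/(e-n)! over terms a*x^e
        dn = sum(a * (factorial(e) // factorial(e - n))
                 for e, a in pc.items() if e >= n)
        S.append(Fraction(factorial(m - n), factorial(m)) * dn)
    return S

# Structure design 1: h(u) = 2u^9 - 4u^8 - u^7 + 4u^6, m = 9
S = tail_signature({9: 2, 8: -4, 7: -1, 6: 4}, 9)
s = [S[n - 1] - S[n] for n in range(1, 10)]                 # Eq. (21)
print(S)   # [1, 2/3, 11/36, 1/21, 0, 0, 0, 0, 0, 0]
print(s)   # [1/3, 13/36, 65/252, 1/21, 0, 0, 0, 0, 0]
```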
Using Eqs. (6) and (17), one can get the structure function of structure design 1 as:

$$h(u) = 2u^9 - 4u^8 - u^7 + 4u^6 \tag{22}$$

Computing the tail signature and the signature of structure design 1 by Eqs. (20) and (21), respectively:

$$\bar{S}_{SD1} = \left(1, \tfrac{2}{3}, \tfrac{11}{36}, \tfrac{1}{21}, 0, 0, 0, 0, 0, 0\right)$$

$$s_{SD1} = \left(\tfrac{1}{3}, \tfrac{13}{36}, \tfrac{195}{756}, \tfrac{1}{21}, 0, 0, 0, 0, 0\right)$$
With the help of Eqs. (8) and (17), the structure function of structure design 2 is:

$$h(u) = -u^{11} + 2u^{10} + 4u^9 - 8u^8 - 4u^7 + 8u^6 \tag{23}$$

The tail signature and the signature of structure design 2, computed by Eqs. (20) and (21), respectively, are:

$$\bar{S}_{SD2} = \left(1, \tfrac{10}{11}, \tfrac{36}{55}, \tfrac{56}{165}, \tfrac{6}{55}, \tfrac{4}{231}, 0, 0, 0, 0, 0, 0\right)$$

$$s_{SD2} = \left(\tfrac{1}{11}, \tfrac{14}{55}, \tfrac{2860}{9075}, \tfrac{38}{165}, \tfrac{1166}{12705}, \tfrac{4}{231}, 0, 0, 0, 0, 0\right)$$
With the help of Eq. (10) and using Eq. (17), the structure function h(u) from the RBD of structure design 3 is:

$$h(u) = u^{12} - 2u^{11} - 2u^7 + 4u^6 \tag{24}$$

The tail signature and the signature of structure design 3, obtained by Eqs. (20) and (21), respectively, are:

$$\bar{S}_{SD3} = \left(1, 1, \tfrac{20}{33}, \tfrac{3}{11}, \tfrac{10}{99}, \tfrac{1}{36}, \tfrac{1}{231}, 0, 0, 0, 0, 0, 0\right)$$

$$s_{SD3} = \left(0, \tfrac{13}{33}, \tfrac{11}{33}, \tfrac{17}{99}, \tfrac{261}{3564}, \tfrac{195}{8316}, \tfrac{1}{231}, 0, 0, 0, 0, 0\right)$$
Using Eqs. (12) and (17), the structure function of structure design 4 is:

$$h(u) = -u^{12} + 2u^{11} + 4u^{10} - 8u^9 - 4u^8 + 8u^7 \tag{25}$$

The tail signature and the signature of structure design 4, obtained by Eqs. (20) and (21), respectively, are:

$$\bar{S}_{SD4} = \left(1, \tfrac{5}{6}, \tfrac{6}{11}, \tfrac{14}{55}, \tfrac{4}{55}, \tfrac{1}{99}, 0, 0, 0, 0, 0, 0, 0\right)$$

$$s_{SD4} = \left(\tfrac{1}{6}, \tfrac{19}{66}, \tfrac{16}{55}, \tfrac{10}{55}, \tfrac{341}{5445}, \tfrac{1}{99}, 0, 0, 0, 0, 0, 0\right)$$
With the help of Eqs. (14) and (17), the structure function of structure design 5 is:

$$h(u) = -u^{18} + 2u^{17} + 4u^{15} - 8u^{14} - 4u^{12} + 8u^{11} \tag{26}$$

The tail signature and the signature of structure design 5, obtained by Eqs. (20) and (21), respectively, are:

$$\bar{S}_{SD5} = \left(1, \tfrac{7}{9}, \tfrac{8}{17}, \tfrac{43}{204}, \tfrac{53}{765}, \tfrac{2}{119}, \tfrac{1}{357}, \tfrac{1}{3978}, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0\right)$$

$$s_{SD5} = \left(\tfrac{2}{9}, \tfrac{47}{153}, \tfrac{53}{204}, \tfrac{433}{3060}, \tfrac{281}{5355}, \tfrac{5}{357}, \tfrac{71}{27846}, \tfrac{1}{3978}, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0\right)$$
Step 1: Determine the MTTF of the ADPS, which has i.i.d. components with mean μ.

Step 2: Evaluate the minimal signature of the ADPS and the expected lifetime from the reliability function by using

$$H_T(w) = \sum_{r=1}^{n} C_r H_{1:r}(w) \tag{27}$$

$$E(T) = \mu \sum_{r=1}^{n} \frac{C_r}{r} \tag{28}$$
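Since H_{1:r}(w) is the reliability of a series of r components, for i.i.d. exponential lifetimes with mean μ each term integrates to μ/r, which is how Eq. (28) follows from Eq. (27). A small sketch, assuming unit-mean exponential lifetimes (which reproduces the chapter's numbers for designs 1 and 5):

```python
from fractions import Fraction

def mttf(h_coeffs, mu=1):
    """Eq. (28): E(T) = mu * sum_r C_r / r, where the C_r are the minimal-
    signature coefficients of h(p) = sum_r C_r p^r. Unit-mean exponential
    component lifetimes are assumed."""
    return mu * sum(Fraction(c, r) for r, c in h_coeffs.items())

# Design 1: h(u) = 2u^9 - 4u^8 - u^7 + 4u^6                     -> ~0.246032
print(float(mttf({9: 2, 8: -4, 7: -1, 6: 4})))
# Design 5: h(u) = -u^18 + 2u^17 + 4u^15 - 8u^14 - 4u^12 + 8u^11 -> ~0.151269
print(float(mttf({18: -1, 17: 2, 15: 4, 14: -8, 12: -4, 11: 8})))
```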
Using Eq. (22), the structure function of structure design 1 is:

$$h(u) = 2u^9 - 4u^8 - u^7 + 4u^6$$

so the minimal signature of structure design 1, read off as the coefficients in Eq. (27), is MS_SD1 = (0, 0, 0, 0, 0, 4, −1, −4, 2). Hence, the expected lifetime of structure design 1 is evaluated by Eq. (28) as:

$$E(T)_{SD1} = 0.246031$$
The structure function of structure design 2, from Eq. (23), is:

$$h(u) = -u^{11} + 2u^{10} + 4u^9 - 8u^8 - 4u^7 + 8u^6$$

with minimal signature MS_SD2 = (0, 0, 0, 0, 0, 8, −4, −8, 4, 2, −1). Hence, the expected lifetime of structure design 2 is estimated by Eq. (28) as:

$$E(T)_{SD2} = 0.252453$$
Using Eq. (24), the structure function of structure design 3 is:

$$h(u) = u^{12} - 2u^{11} - 2u^7 + 4u^6$$

Hence, the expected lifetime of structure design 3 is calculated by Eq. (28) as:

$$E(T)_{SD3} = 0.100649$$
Using Eq. (25), the structure function of structure design 4 is:

$$h(u) = -u^{12} + 2u^{11} + 4u^{10} - 8u^9 - 4u^8 + 8u^7$$

Hence, the expected lifetime of structure design 4 is evaluated by Eq. (28) as:

$$E(T)_{SD4} = 0.254310$$
With the help of Eq. (26), the structure function of structure design 5 is:

$$h(u) = -u^{18} + 2u^{17} + 4u^{15} - 8u^{14} - 4u^{12} + 8u^{11}$$

with minimal signature MS_SD5 = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, −4, 0, −8, 4, 0, 2, −1). Hence, the mean time to failure of structure design 5 is estimated by Eq. (28) as:

$$E(T)_{SD5} = 0.151268$$
2 Conclusion
This paper analyzed the reliability of an all-digital protection system having non-identically and identically distributed components. We computed the reliability of all the structure designs with the help of the UGF. Also, unlike past work, the signature reliability of all-digital protection systems has been studied here with the help of Owen's method. The expected lifetime of digital protection systems, obtained from the minimal signature, is also studied for the first time. In these systems, many redundant components were considered to increase the system reliability.

The study also reveals that when the systems have non-identical components, architecture 5 is the most reliable, with value 0.715833175, while structure design 2 is the second most reliable, with value 0.6559.

The signature analysis was carried out to examine the impact of component failure probabilities in the architectures, which will help engineers in system design. The MTTF of all architectures has also been obtained from the minimal signature for all five proposed designs. We also found that the MTTF of structure design 2 is the highest, while structure design 3 attains the lowest value.
References
1. Boland PJ, El Neweihi E, Proschan F (1988) Active redundancy allocation in coherent systems.
Probab Eng Inf Sci 2(3):343–353
2. Chauhan U, Singh V, Rani A, Pahuja GL (2015) Ranking of all digital protection system compo-
nents using cost-based heuristic redundancy importance measure. In: International conference
on recent developments in control, automation and power engineering (RDCAPE), pp 141–145
3. Da G, Zheng B, Hu T (2012) On computing signatures of coherent systems. J Multivariate
Anal 103(1):142–150
4. Djekic Z, Portillo L, Kezunovic M (2008) Compatibility and interoperability evaluation of
all-digital protection systems based on IEC 61850–9–2 communication standard. In: Power
and energy society general meeting-conversion and delivery of electrical energy in the 21st
century. IEEE, pp 1–5
5. Kumar A, Singh SB (2017) Computations of the signature reliability of the coherent system.
Int J Qual Reliab Manage 34(6):785–797
6. Kumar A, Singh SB (2017) Signature reliability of sliding window coherent system.
In: Mathematics applied to engineering, pp 83–95
7. Levitin G (2005) The universal generating function in reliability analysis and optimization, vol
6. Springer, London
8. Lisnianski A, Levitin G (2003) Multi-state system reliability: assessment, optimization, and
applications, vol 6. World Scientific Publishing Co Inc.
9. Marichal JL, Mathonet P (2013) Computing system signatures through reliability functions.
Stat Probab Lett 83(3):710–717
10. Navarro J, Rubio R (2009) Computations of signatures of coherent systems with five
components. Commun Stat-Simul Comput 39(1):68–84
11. Navarro J, Rychlik T (2007) Reliability and expectation bounds for coherent systems with
exchangeable components. J Multivariate Anal 98(1):102–113
12. Negi S, Singh SB (2015) Reliability analysis of the non-repairable complex system with
weighted subsystems connected in series. Appl Math Comput 262:79–89
13. Rausand M, Høyland A (2004) System reliability theory: models, statistical methods and applications, vol 396. Wiley
14. Samaniego FJ (2007) System signatures and their applications in engineering reliability, vol
110. Springer Science & Business Media
15. Samaniego FJ (1985) On closure of the IFR class under formation of coherent systems. IEEE
Trans Reliab 34(1):69–72
16. Scheer GW, Dolezilek DJ (2000) Comparing the reliability of Ethernet network topologies in
substation control and monitoring networks. In: Western power delivery automation conference.
Spokane, Washington
17. Singh C, Patton AD (1980) Protection system reliability modeling: unreadiness probability and
mean duration of undetected faults. IEEE Trans Reliab 29(4):339–340
18. Ushakov I (1986) Universal generating function. J Comput Sci Syst 24:118–129
19. Ushakov IA (1994) Handbook of reliability engineering. Wiley
20. Wu JS, Chen RJ (1994) An algorithm for computing the reliability of weighted-k-out-of-n
systems. IEEE Trans Reliab 43(2):327–328
21. Zhang P, Portillo L, Kezunovic M (2006a) Compatibility and interoperability evaluation for
all-digital protection system through automatic application test. In: Power engineering society
general meeting. IEEE
Reliability Analysis of 8 × 8 SEN- Using
UGF Method
1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 329
M. Ram and H. Pham (eds.), Reliability and Maintainability Assessment
of Industrial Systems, Springer Series in Reliability Engineering,
https://doi.org/10.1007/978-3-030-93623-5_15
2 Preliminaries
Over the recent ten years, we have gradually seen development in the field of interconnection networks, which is now being injected into the contemporary age of multiprocessor frameworks. This development is opening new avenues for development and application in numerous fields. The different components of a computer system are interconnected by communication networks, and these networks are referred to as interconnection networks. An interconnection network is simply the interconnection of various networks of connecting nodes. In an interconnection network, switching elements are used to connect the destination node to the source node. The reliability of interconnection networks depends on the reliability of the switching elements and their interconnection topology. Direct and indirect networks are the two types of interconnection networks. In a direct network, messages are sent across the edges via point-to-point connections between processing nodes; the router-based or static network is another name for the direct network. Direct networks include the star graph, hypercube, torus, mesh, and trees. Messages between any two separate nodes in an indirect network are routed through the network's switches. In parallel computing, indirect interconnection networks are widely employed for switching and routing among dynamic nodes. The communication of signals/messages in an indirect network is accomplished with the help of switches [4, 16]. Negi and Singh [15] assessed the reliability of non-repairable complex systems whose subsystems were connected in series.

The main purpose behind the construction of an interconnection network is that when a single processor is not able to handle a task involving a huge amount of data, the task is broken into different parallel tasks, which are performed simultaneously, reducing the processing time. Hence, interconnection networks play a very important part in constructing large parallel processing systems. They are widely used in many real-life applications such as telephone switches, industrial networks, supercomputers and many more.
Since the dynamic topology consists of numerous links that may be reconfigured by setting the movable switching components, multistage interconnection networks (MINs) play an essential role in dynamic network systems. MINs are made up of several stages of linked switching components organised in a predetermined architecture. Integrated circuits, telecommunication switches, multiprocessor systems and computer communications are all examples of uses for these networks. Since these networks are large and have a complicated topology, it is necessary to enhance their performance and hence to increase their reliability. It is important to note that the switching elements (SEs) are linked in a number of different stages. Blocking, non-blocking and rearrangeable non-blocking MINs are the three different types of MINs. In blocking networks, communication between a free source/destination pair may fail because the requested path conflicts with connections already existing in the network. In non-blocking networks, messages are communicated from every input node to every output node without affecting the network's pre-defined topology, resulting in multiple pathways between each source and destination node, which requires extra stages in the system. Gunawan [9] discussed the basics of reliability engineering, probability distributions and certain fundamentals of probability theory.
A multistage interconnection network (MIN) typically has m inputs and m outputs, as well as n (= log2 m) stages, and every stage has m/2 switching components, where m is the size of the network. The shuffle exchange network (SEN), SEN with extra stages, Benes network, Gamma interconnection network (GIN), extra-stage GIN, Clos network, Omega network, multistage cube network, and many more multistage interconnection networks are frequently used. It is important to note that network reliability is determined not only by the components in the network, but also by its topology. The failure of MINs is mostly due to inadequate selection of network architecture and insufficient routing methods. Many engineers and academics have proposed methods to enhance the performance of MINs and make them more dependable. Trivedi [17] used the continuous Markov chain technique to assess the dependability of MINs. Rajkumar and Goyal [16] attempted to connect and investigate different MIN topologies in terms of reliability, fault-tolerance, and cost-efficiency. Blake and Trivedi [5] investigated the dependability of single-path MINs and focused on fault-tolerant schemes for improving network resilience, deriving reliability expressions for 8 × 8 and 16 × 16 shuffle exchange networks. Bistouni and Jahanshahi [3] proposed a new technique to increase the reliability and fault-tolerance of the SEN by augmenting the switching stages; they found that the SEN with one extra stage (SEN + 1) was more reliable than the SEN or the SEN with two extra stages (SEN + 2). Fard and Gunawan [8] estimated the terminal reliability of a modified SEN with 2 × 2 SEs at the intermediate stages, 1 × 2 SEs at the source nodes, and 2 × 1 SEs at the terminal nodes, and compared it to the conventional shuffle exchange network. Chinnaiah [7] proposed a new MIN called the replicated SEN, which was compared to the SEN and Benes networks. Bisht and Singh [2] calculated the reliability of the 4 × 4 SEN, SEN + 1 and SEN + 2 by the UGF method and found that the 4 × 4 SEN is the most reliable and SEN + 2 the least, with the reliability of SEN + 1 lying between the two.
The UGF of a discrete random variable and the parallel composition operator are given by:

$$U(z) = \sum_{m=1}^{M} r_m z^{k_m} \tag{1}$$

$$u_1(z) \mathop{\otimes}_{par} u_2(z) = \sum_{k_1=1}^{K_1} p_{1k_1} z^{g_{1k_1}} \mathop{\otimes}_{par} \sum_{k_2=1}^{K_2} p_{2k_2} z^{g_{2k_2}} = \sum_{k_1=1}^{K_1} \sum_{k_2=1}^{K_2} p_{1k_1} p_{2k_2} z^{par(g_{1k_1},\, g_{2k_2})} \tag{3}$$
should transmit networks to all destinations at least by one of its paths. As a result, this network does not meet the requirements for MIN approval.

In the new 8 × 8 SEN- structure, MUXs are used at the source end and DEMUXs at the destination end, so that the signal from each input can be transmitted to each output node. For the sake of convenience, 2 × 1 MUXs and 1 × 2 DEMUXs are used to transmit the network from one input to all outputs [10]. Larger MUXs and DEMUXs can also be used to make the system more redundant. This novel 8 × 8 SEN- provides two pathways between each and every source and destination. The primary benefit of this strategy is that it creates totally independent paths between the source and destination, making the entire system fault resilient at both ends. There are m MUXs at the source and m DEMUXs at the destination in this 8 × 8 SEN-, for a total of log2 m − 1 switching stages and m/2 SEs per stage. Figure 2 depicts the new SEN of size 8 × 8.
With the aid of the UGF, the reliability of this SEN- may be assessed in terms of three measures: (1) terminal reliability (TR), (2) broadcast reliability (BR), and (3) network reliability (NR), under the usual assumptions of independent switching elements. The terminal reliability is:

$$R_{TR}(\text{SEN-}) = \max(\min(p_1, p_3, p_5, p_7), \min(p_2, p_4, p_6, p_8))$$

where p_1, p_2, …, p_8 are the probabilities of the switching elements existing in this 8 × 8 SEN-.
(1) If the SEN- elements are not the same and the probabilities of the network components are distinct, the UGFs of the different SEs are represented by:

$$u_{s_j}(z) = p_{s_j} z^1 + (1 - p_{s_j}) z^0$$

Fig. 3 TR of 8 × 8 SEN-
$$TR(\text{SEN-}) = 0.971747011$$

(2) When the SEN- elements are identical and all SEs have the same probability, the structure function is given by:

$$R_{TR}(\text{SEN-}) = 2p^3 - p^6$$

With the help of the suggested UGF technique, the TR of the 8 × 8 SEN- is assessed for distinct switching element reliabilities and compared with the 8 × 8 SEN, as shown in Table 1.
Table 1 TR of 8 × 8 SEN- compared with 8 × 8 SEN

Switching reliability   TR of 8 × 8 SEN- by UGF   TR of 8 × 8 SEN [16]
0.90                    0.926559                  0.72900
0.95                    0.979658                  0.85737
0.96                    0.986714                  0.88473
0.99                    0.999118                  0.97029
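The identical-SE column of Table 1 can be regenerated directly from the closed-form expression above. In the brief sketch below, the comparison column is consistent with TR = p^3 for the plain SEN; that form is an observation from the tabulated values rather than a formula stated in the chapter.

```python
def tr_sen_minus(p):
    """Terminal reliability of the 8 x 8 SEN- with identical SEs."""
    return 2*p**3 - p**6

def tr_sen(p):
    """TR of the plain 8 x 8 SEN; p**3 matches the [16] column of Table 1."""
    return p**3

for p in (0.90, 0.95, 0.96, 0.99):
    print(f"{p:.2f}   SEN-: {tr_sen_minus(p):.6f}   SEN: {tr_sen(p):.5f}")
```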
The likelihood of a network/message being sent from a single source to all destination nodes is called the broadcast reliability (BR). Figure 4 shows the block diagram for the broadcast reliability of the 8 × 8 SEN-.

The SEN- broadcast reliability may be computed using the UGF technique as follows, where p_1, p_2, …, p_18 are the probabilities of the switching elements in the network.

(1) If all the SEN- elements are distinct and the probabilities of the network components are not the same, the UGFs of the various SEs are given by:

$$u_{s_j}(z) = p_{s_j} z^1 + (1 - p_{s_j}) z^0$$

Fig. 4 BR of 8 × 8 SEN-
$$BR(\text{SEN-}) = 0.642475527$$

(2) When the SEN- components are alike and all SEs have the same probability, the structural function is:

$$R_{BR}(\text{SEN-}) = 2p^{13/2} - p^{13}$$

With the help of the suggested UGF technique, the BR of the 8 × 8 SEN- is assessed for distinct switching element reliabilities and compared with the 8 × 8 SEN, as shown in Table 2.
Fig. 5 NR of 8 × 8 SEN-
The likelihood of successful signal transmission from all source nodes to all sink nodes is called the network reliability (NR). Figure 5 shows the block diagram for the network reliability of the 8 × 8 SEN-.

The network reliability of the SEN- may be estimated using the UGF technique as follows, where p_1, p_2, …, p_24 are the switching element probabilities present in the network.

(1) When the SEN- elements are not the same and the probabilities of the network components are distinct, the UGFs for the different switching parts are given by:

$$u_{s_j}(z) = p_{s_j} z^1 + (1 - p_{s_j}) z^0$$

$$NR(\text{SEN-}) = 0.351211727$$

(2) When all the components of the 8 × 8 SEN- are the same and all switching elements have the same probability, the structural function is written as:

$$R_{NR}(\text{SEN-}) = 2p^8 - p^{16}$$
With the use of the suggested UGF technique, the NR of the 8 × 8 SEN- is assessed for distinct switching element reliabilities and compared to the NR of the 8 × 8 SEN, as shown in Table 3.
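For identical switching elements, all three measures share the same two-path form 2p^k − p^(2k), with k equal to the number of SEs that must work on one path; evaluating them side by side shows how quickly BR and NR fall relative to TR as more elements are involved. A small illustrative sketch:

```python
def two_path(p, k):
    """Reliability of two independent paths of k elements each: 2p^k - p^(2k)."""
    return 2*p**k - p**(2*k)

for p in (0.90, 0.95, 0.99):
    tr = two_path(p, 3)      # R_TR = 2p^3    - p^6
    br = two_path(p, 6.5)    # R_BR = 2p^13/2 - p^13
    nr = two_path(p, 8)      # R_NR = 2p^8    - p^16
    print(f"p = {p:.2f}   TR = {tr:.6f}   BR = {br:.6f}   NR = {nr:.6f}")
```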
5 Conclusion
Reliability is one of the major concerns for most networks, particularly in the realm of communication networks. This paper demonstrates how the proposed UGF may be used to assess the reliability of the 8 × 8 SEN-. The paper emphasizes that an 8 × 8 SEN- is not feasible on its own, and that the use of MUXs and DEMUXs at the source and destination makes this network possible. The study examines the reliability of the 8 × 8 SEN- from three perspectives: TR, BR, and NR. A critical analysis of the various reliabilities derived from the suggested technique, followed by a comparison, reveals that the terminal reliability (TR), broadcast reliability (BR) and network reliability (NR) of the 8 × 8 SEN- are considerably higher than those of the 8 × 8 SEN. Hence the 8 × 8 SEN- is more reliable than the 8 × 8 SEN.
References
1. Bisht S, Singh SB (2019) Signature reliability of binary state node in complex bridge networks
using universal generating function. Int J Qual Reliab Manage 36(2):186–201
2. Bisht V, Singh SB (2021) Reliability estimation of 4 × 4 SENs using UGF method. J Reliab
Stat Stud 173–198