
ARTICLE IN PRESS

Reliability Engineering and System Safety 93 (2008) 1434–1443

www.elsevier.com/locate/ress

Weibull extension model: A Bayes study using Markov chain Monte Carlo simulation

Ashutosh Gupta, Bhaswati Mukherjee, S.K. Upadhyay

Department of Statistics & DST Centre for Interdisciplinary Mathematical Sciences, Banaras Hindu University, Varanasi 221 005, India

Received 18 August 2007; received in revised form 11 October 2007; accepted 25 October 2007
Available online 6 November 2007

Abstract

Several generalizations of the two-parameter Weibull model have been proposed to model data sets that exhibit complex non-monotone shapes of the hazard rate function. The present paper focuses on one such generalization, referred to in the literature as the Weibull extension model. A complete Bayesian analysis of the model is provided using Markov chain Monte Carlo simulation. Finally, a thorough study is conducted to check the adequacy of the model for a given data set, using graphical and numerical methods based on predictive simulation ideas. A real data set is considered for illustration.
© 2007 Elsevier Ltd. All rights reserved.

Keywords: Bathtub hazard rate; Weibull extension model; Markov chain Monte Carlo; Hybrid algorithm; Model validation; Predictive simulation; Partial predictive p-value

1. Introduction

Weibull and its derivative models cover a major part of the modeling of reliability and survival data. Modeling of such data sets may be governed by specific characteristics inherent in the process that generates the data. One such characteristic is the hazard rate function, i.e., the instantaneous rate of failure of a unit under the stress that causes failure. The two-parameter Weibull model is one of the most frequently entertained models, as it can describe a large variety of data sets that generally accrue in life-testing experiments. The probability density function (pdf) of the two-parameter Weibull model is given by

    f(x | α, β) = αβ x^{β−1} exp(−α x^β),   α, β > 0.   (1)

It can be shown that the hazard rate function of the two-parameter Weibull model is increasing, decreasing or constant according as β > 1, β < 1 or β = 1. Hence, the model fits a variety of data sets, although it fails to describe other complex situations that might arise for certain data sets, for example, bathtub behaviour of the hazard rate. Practical scenarios where such shapes are advocated include the mortality behaviour of human beings, the failure rate of newly launched products, etc. The second example often helps manufacturers in reliability-related decision making, such as estimation of burn-in time, replacement time, etc. It is to be noted that the importance of burn-in time can be readily apprehended in reliability studies where the aim is to discard defective items before launching into the field. Similarly, estimation of replacement time helps in making decisions regarding replacement policies for the products. Readers are referred to a recent paper by Bebbington et al. [1] for a related discussion.

Several approaches have been proposed in the literature to model data sets that exhibit a bathtub-shaped hazard rate function. One such approach is to consider a combination of two or more Weibull models, each representing a particular region of the hazard rate (see, for example, the additive Weibull distribution proposed by Xie and Lai [2]). However, this idea results in too many parameters in the proposed model and, as such, the resulting inferences may be difficult to obtain. The situation worsens in reliability studies, where we often have a very small amount of data.

Corresponding author. E-mail addresses: ashu.stats@gmail.com (A. Gupta), skupadhyay@gmail.com (S.K. Upadhyay).

0951-8320/$ - see front matter © 2007 Elsevier Ltd. All rights reserved.
doi:10.1016/j.ress.2007.10.008
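The three monotone hazard regimes of the two-parameter Weibull model in Eq. (1) can be verified numerically. The following Python sketch is illustrative only; the parameter values are arbitrary choices, not taken from the paper:

```python
import math


def weibull_hazard(x, alpha, beta):
    """Hazard rate of the two-parameter Weibull model of Eq. (1):
    h(x) = alpha * beta * x**(beta - 1)."""
    return alpha * beta * x ** (beta - 1.0)


# Illustrative values: the hazard is increasing for beta > 1,
# decreasing for beta < 1, and constant for beta = 1.
xs = [0.5, 1.0, 2.0, 4.0]
inc = [weibull_hazard(x, 1.0, 2.0) for x in xs]   # beta > 1
dec = [weibull_hazard(x, 1.0, 0.5) for x in xs]   # beta < 1
con = [weibull_hazard(x, 1.0, 1.0) for x in xs]   # beta = 1

assert all(a < b for a, b in zip(inc, inc[1:]))
assert all(a > b for a, b in zip(dec, dec[1:]))
assert all(abs(c - con[0]) < 1e-12 for c in con)
```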

As an alternative, other generalized and extended families of the Weibull model have been proposed in the literature. The three-parameter exponentiated Weibull model proposed by Mudholkar and Srivastava [3] is an important member of this class of distributions. It is simply a generalization of the two-parameter Weibull model with an added shape parameter. Other examples include the IDB model proposed by Hjorth [4], the additive Burr type XII model proposed by Wang [5], etc. Some other recent proposals that accommodate various shapes of the hazard rate function, including the bathtub shape, can be found in [6–8]. The last reference describes a model, namely the flexible Weibull distribution, that accommodates a slightly different form of bathtub shape, termed the modified bathtub.

A two-parameter model recently proposed by Chen [9] also shows bathtub-shaped behaviour of the hazard rate function. This model seems appealing as, besides having only two parameters, it holds some nice properties on the classical inferential front, although it lacks a scale parameter, which makes it less flexible for analyzing a variety of data sets. To overcome this limitation, Xie et al. [10] proposed a model known as the Weibull extension model, which can be considered an extension of Chen's [9] model with an additional scale parameter. As a result, the model becomes more flexible and persuasive from the point of view of practitioners. The Weibull extension model can also be seen as a generalization of the two-parameter Weibull model, with pdf given by

    f(x | α, β, λ) = βλ α^{1−1/β} x^{β−1} exp(α x^β) exp{−λ α^{−1/β} [exp(α x^β) − 1]}.   (2)

Both β and λ represent the shape of the distribution, whereas α behaves like a scale parameter. It can be easily seen that the model approaches the two-parameter Weibull model when the scale parameter tends to zero (see [10]). The reliability function and the hazard rate for the model can be written as

    R(x; α, β, λ) = exp{−λ α^{−1/β} [exp(α x^β) − 1]},   (3)

    h(x) = βλ α^{1−1/β} x^{β−1} exp(α x^β).   (4)

Study of the hazard function of the Weibull extension model shows that it follows bathtub-shaped behaviour when 0 < β < 1, although the role of the parameter α is important in evaluating the position of the change point of the bathtub curve. The change point of the Weibull extension model is given by t* = [(1 − β)/(αβ)]^{1/β}, which clearly shows that for a fixed value of β the change point decreases as α increases.

A review of the past literature shows that the Weibull extension model lacks developments in both the classical and the Bayesian framework. Xie et al. [10] considered data sets that exhibit bathtub-shaped hazard behaviour and estimated the parameters of the model using both graphical and maximum likelihood methods of estimation. The authors also considered model comparison with some competitive models for these data sets. Tang et al. [11] also discussed maximum likelihood estimation of the model parameters for the same data set and compared the model with its sub-models. Nadarajah [12] is another recent work on the model, but the author was mainly confined to obtaining the moments of the distribution. Two other references worth mentioning are [13] and [7], where the authors proposed new approaches based on least-squares techniques and/or graphical methods. Although the focus of the last two references was on related distributions, the study is relevant to the Weibull extension model as well.

It is to be noted that most of the cited literature is confined to classical developments, and systematic development of Bayesian results is rarely seen. The importance of the Bayesian method is well known, both in the context of reliability studies and otherwise. Among several advantages, the most important is the fact that Bayesian methods are equally well applicable for small sample sizes and censored data problems, the two common features in reliability data analyses. Martz and Waller [14] is a significant contribution in Bayes reliability studies, where the authors have systematically pointed out a few advantages of Bayesian methods and disadvantages of other methods of analysis. Without going into any debatable issue on the comparison of various paradigms, let us remain confined to the Weibull extension model and the associated Bayesian developments. The reason behind the non-availability of any significant Bayesian work in the context of the Weibull extension model may be attributed primarily to the complicated form of the model and perhaps the involvement of high-dimensional integrals in posterior-based inferences that are difficult to solve analytically. Sophisticated techniques such as analytical approximation and/or numerical integration are certainly capable of resolving these complications, but they often require too much expertise on the part of users. Alternatively, sample-based approaches such as Markov chain Monte Carlo (MCMC) offer easily manageable solutions even in high-dimensional scenarios, and this fact has already been established in a number of references such as [15–18], etc. Upadhyay et al. [17] and Pang et al. [18] have considered MCMC methods to estimate the parameters of the three-parameter Weibull distribution, and they have also shown the flexibility of MCMC methods over other traditional approaches. Besides, these techniques offer user-friendly solutions once the sample-generating strategies are successfully worked out. Ideas such as importance sampling provide some additional features; for example, changes in the prior and/or likelihood will not cause any remarkable burden. Readers are referred to [19] and a recent book by Robert and Casella [20] for a thorough discussion of MCMC.

The present paper focuses on two different aspects.
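The model quantities in Eqs. (2)–(4) and the change point t* translate directly into code. The following Python sketch is illustrative only (the parameter values are arbitrary, not estimates from the paper); it checks numerically that the hazard rate is bathtub-shaped around t* and that, for fixed β, the change point decreases as α increases:

```python
import math


def reliability(x, a, b, lam):
    # R(x) = exp(-lam * a**(-1/b) * (exp(a * x**b) - 1)), Eq. (3)
    return math.exp(-lam * a ** (-1.0 / b) * (math.exp(a * x ** b) - 1.0))


def hazard(x, a, b, lam):
    # h(x) = b * lam * a**(1 - 1/b) * x**(b - 1) * exp(a * x**b), Eq. (4)
    return b * lam * a ** (1.0 - 1.0 / b) * x ** (b - 1.0) * math.exp(a * x ** b)


def pdf(x, a, b, lam):
    # f(x) = h(x) * R(x), Eq. (2)
    return hazard(x, a, b, lam) * reliability(x, a, b, lam)


def change_point(a, b):
    # t* = ((1 - b) / (a * b))**(1/b), defined for 0 < b < 1
    return ((1.0 - b) / (a * b)) ** (1.0 / b)


a, b, lam = 0.1, 0.5, 0.01          # illustrative values
t = change_point(a, b)              # bottom of the bathtub

# the hazard decreases before t* and increases after it
assert hazard(0.5 * t, a, b, lam) > hazard(t, a, b, lam)
assert hazard(2.0 * t, a, b, lam) > hazard(t, a, b, lam)
# for fixed b, the change point decreases as a increases
assert change_point(0.2, b) < change_point(0.1, b)
```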

The first part proposes to consider MCMC techniques for the Bayes analysis of the Weibull extension model. We have developed a hybrid strategy combining the Metropolis algorithm within the Gibbs sampler for obtaining samples from the posterior arising from the model. The details of the implementation are provided in Section 2.

The second part of the paper focuses on checking the appropriateness of the model for a given data set. It is to be noted that the earlier works have mainly focused on introducing the model and studying its various characteristics (see, for example, [10,11]) and, as such, the important aspect of the study of model compatibility has been completely missing. The study of model compatibility comes during the model-building phase, where interest is focused on answering the question whether the considered model appears to provide an adequate fit to the data in hand. One can also identify or evaluate specific features of the data not represented by the model during this model-assessment process. Therefore, the advantage of the study of model compatibility is twofold. First, it may work as a preliminary screening in choosing an appropriate model and, second, it may sometimes provide guidelines to improve the current model by identifying the features of the data not captured by the model under consideration.

Several approaches are available for the study of model compatibility in the Bayesian framework. Predictive simulation is the easiest and perhaps the most rational one. The basic idea of studying model compatibility through predictive simulation is to compare the observed data, or some function of it, with the data that would have been anticipated from the assumed model, called the predictive data (see [21]). If the two data sets compare favourably, the assumed model can be considered an appropriate choice for the data in hand. Different versions of p-values have also been discussed in the Bayesian literature to provide an easy quantification of predictive simulation ideas and a parallel treatment with the classical counterpart. Some of the important Bayesian versions of p-values are detailed in [22,23,17], etc.

The rest of the paper is organized as follows. The next section provides details of the Bayesian model formulation along with a brief but informal discussion of MCMC methods. Section 3 begins with a short description of the data set taken for the present study and then provides the posterior-based inferences for the model. Section 4 discusses some recent Bayesian tools available for model validation, and the suitability of the model for the present data set is examined by both graphical and numerical methods. Section 5 provides some important conclusions in brief.

2. Bayesian model formulation

Let x: x_1, x_2, …, x_n be the observed failure times of n items put on a life test and let the failure time distribution be given by (2). The corresponding likelihood function can be written as

    L(x; α, β, λ) = β^n λ^n α^{n−(n/β)} (∏_{i=1}^{n} x_i^{β−1}) exp(α ∑_{i=1}^{n} x_i^β) exp{−λ α^{−1/β} ∑_{i=1}^{n} [exp(α x_i^β) − 1]}.   (5)

Let us consider independent priors for the parameters α, β and λ as

    p_1(α) ∝ α^{η−1} exp(−α/k),   k, η > 0,
    p_2(β) = 1/M_1,   0 < β < M_1,
    p_3(λ) = 1/M_2,   0 < λ < M_2.   (6)

We propose that the hyperparameters of the priors be chosen such that the priors become diffuse and the inferences are data driven. The expression for the posterior can be obtained up to proportionality by multiplying the likelihood with the prior, and this can be written as

    p(α, β, λ | x) ∝ β^n λ^n α^{n−(n/β)+η−1} (∏_{i=1}^{n} x_i^{β−1}) exp(α ∑_{i=1}^{n} x_i^β) exp{−λ α^{−1/β} ∑_{i=1}^{n} [exp(α x_i^β) − 1]} exp(−α/k).   (7)

The posterior is obviously complicated and no closed-form inferences appear possible. We, therefore, propose to consider MCMC methods, namely the Gibbs sampler and the Metropolis algorithm (see, for example, [19,24]), to simulate samples from the posterior so that sample-based inferences can be easily drawn.

Before we finish this section, we shall provide a brief review of the two algorithms, the hybrid strategy that has been used for the present implementation and some related concepts useful for reliability practitioners. Let us first consider the Gibbs sampler. The Gibbs sampler is a cyclic process that starts with arbitrarily chosen initial values for the concerned variables. Each cycle consists of drawing a sample of each variable from the corresponding full conditional, using the most recent values of all other variables. For example, suppose θ = (θ_1, θ_2, …, θ_k) is a k-dimensional parameter vector and p(θ | x) is the corresponding posterior obtained up to proportionality as a product of the likelihood function and the prior. Then a complete cycle of the Gibbs algorithm can be described by the following generating scheme:

    θ_1^{(1)} ~ p(θ_1 | θ_2^{(0)}, …, θ_k^{(0)}, x),
    θ_2^{(1)} ~ p(θ_2 | θ_1^{(1)}, θ_3^{(0)}, …, θ_k^{(0)}, x),
    …
    θ_k^{(1)} ~ p(θ_k | θ_1^{(1)}, …, θ_{k−1}^{(1)}, x),

where θ^{(0)} = (θ_1^{(0)}, …, θ_k^{(0)}) is the initial starting value and p(θ_i | θ_1, …, θ_{i−1}, θ_{i+1}, …, θ_k, x) is the full conditional of the ith component, obtained up to proportionality from the joint posterior p(θ | x).

Repetition of the above algorithm produces a sequence θ^{(1)}, θ^{(2)}, …, θ^{(t)}, …, where θ^{(i)} = (θ_1^{(i)}, θ_2^{(i)}, …, θ_k^{(i)}).

The Metropolis algorithm is an alternative to the Gibbs sampler that does not require the availability of full conditionals. It instead chooses a Markov kernel for variate generation and incorporates a further randomization for the necessary drawings. Let q(θ, θ′) be a symmetric Markov kernel, i.e., q(θ, θ′) = q(θ′, θ), where θ is the current value of the chain and θ′ is the next realization from q(θ, θ′). According to the algorithm, we accept θ′ with probability min{p(θ′ | x)/p(θ | x), 1}; otherwise we retain θ as the next value of the chain. This is a simplified version of the algorithm; for more complicated scenarios, one can consider a non-symmetric kernel as well. Several choices of the candidate-generating density q(θ, θ′) have been suggested in the literature. Important among these are the properly centered and scaled multivariate normal, rectangular, t and split-t densities, etc. Normally, such densities should have the same dimension as the posterior distribution. The proper centering and scaling of the candidate-generating density should be decided (possibly) by an approximate study of the posterior density. If, however, the form is too difficult to guess, one can resort to the maximum likelihood estimates and the corresponding Hessian-based approximation. A scaling constant c is also often chosen, with values set between 0.5 and 1.0; this scaling constant helps in obtaining a smooth and efficient algorithm. Smith and Roberts [19] and Upadhyay and Smith [25] are good references that provide ample comment on the choice of suitable candidate-generating densities for the Metropolis algorithm.

Repetition of the Metropolis steps also provides a sequence θ^{(1)}, θ^{(2)}, …, θ^{(t)}, …. It can be shown that, after a large number of repetitions, the generated sequence converges in distribution to a random sample from the corresponding posterior distribution, with components from the corresponding marginal posteriors. Convergence monitoring can be done either through a single long-run chain or through multiple independent chains, followed by monitoring a few posterior characteristics of interest, say, for example, the ergodic averages. For details regarding MCMC, readers are referred to [19] and a recent reference by Robert and Casella [20].

It is important to mention here that samples of any linear or non-linear function of the parameters can be easily obtained once a random sample from the true posterior distribution has been made available. For example, a reliability practitioner may be interested in estimating reliability, hazard rate, median life, etc. The change point is another important function of the parameters in which the practitioner might be interested. Samples from these parametric functions can be routinely obtained by substitution using the generated values of the posterior variates (see, for example, [25]). These samples can then be used to find estimates or other features of these characteristics.

Finally, it is important to mention that MCMC methods are by no means exhaustive, and they can always be amalgamated to develop hybrid samplers. The hybrid strategy may be chosen either because of ease of implementation or perhaps because of considerations such as efficiency. For the present model, generating samples from all the full conditionals corresponding to posterior (7) is not easily manageable and, therefore, we considered mixing of Metropolis chains for those full conditionals. Alternatively, one could have done the entire run using the Metropolis algorithm, but at the cost of a stringent evaluation of the variance–covariance matrix to be used for initialization of the chain. Hence, the posterior distribution in the present case has been extracted using a hybrid sampler taking Metropolis within the Gibbs. In this hybrid sampler, the Metropolis step is used to extract samples from some of the full conditionals to complete a cycle of the Gibbs chain. For the various full conditionals and other implementation details, readers are referred to Appendix A of the paper.

3. Numerical illustration

3.1. The data

For numerical illustration, we considered the data set initially reported in Aarset [26] and later extensively reanalyzed by a number of authors (see, for example, [27,10,28], etc.). The data set is shown in Table 1. The data set is well known to exhibit bathtub-shaped behaviour of the hazard rate function, as pointed out by previous authors. Aarset [26] noted this behaviour by advocating the use of the total time on test (TTT) plot, although the TTT plot never provides the exact shape of the bathtub curve, its change point and curvature. Several nonparametric approaches have been given in the literature that can produce the shape of the estimated hazard rate inherent in the data. These approaches are useful in the sense that they can facilitate at least an initial assessment of the model for the data in hand by comparing the data-based estimated hazard rate with the hazard rate of the hypothesized model. We considered an important approach given by Martz and Waller [14]. This method considers a non-parametric estimate of the hazard rate at different data points and uses it to draw an estimated shape of the hazard rate function.

Table 1
Failure times of 50 items

0.1   0.2   1.0   1.0   1.0   1.0   1.0   2.0   3.0   6.0
7.0  11.0  12.0  18.0  18.0  18.0  18.0  18.0  21.0  32.0
36.0  40.0  45.0  45.0  47.0  50.0  55.0  60.0  63.0  63.0
67.0  67.0  67.0  67.0  72.0  75.0  79.0  82.0  82.0  83.0
84.0  84.0  84.0  85.0  85.0  85.0  85.0  85.0  86.0  86.0
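The hybrid scheme described above — one Metropolis step per full conditional within each Gibbs cycle — can be sketched generically. The Python code below is a simplified, illustrative stand-in: it uses a random-walk Metropolis update for each coordinate, and a bivariate standard normal log-target in place of the log of posterior (7); the function names, the scale, and the chain settings are all assumptions for demonstration:

```python
import math
import random


def log_target(theta):
    # stand-in target: independent N(0, 1) coordinates
    return -0.5 * sum(t * t for t in theta)


def metropolis_within_gibbs(log_target, init, n_iter, scale=2.4, seed=1):
    """One random-walk Metropolis step per coordinate per Gibbs cycle."""
    rng = random.Random(seed)
    theta = list(init)
    chain = []
    for _ in range(n_iter):
        for i in range(len(theta)):          # one complete Gibbs cycle
            prop = list(theta)
            prop[i] = theta[i] + rng.gauss(0.0, scale)
            # accept with probability min{1, p(prop)/p(theta)}
            if math.log(rng.random()) < log_target(prop) - log_target(theta):
                theta = prop
        chain.append(list(theta))
    return chain


chain = metropolis_within_gibbs(log_target, [5.0, -5.0], 20000)
draws = chain[2000:]                         # discard transient behaviour
m0 = sum(t[0] for t in draws) / len(draws)
v0 = sum((t[0] - m0) ** 2 for t in draws) / len(draws)
assert abs(m0) < 0.15 and abs(v0 - 1.0) < 0.3
```

In the paper's setting, `log_target` would be the log of posterior (7) in (α, β, λ), with maximum likelihood estimates as initial values, as described in Section 3.2.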

Fig. 1. Bathtub behaviour of the hazard rate estimated from the data.

Fig. 1 provides the estimated hazard rate function based on this procedure (see [14]). Obviously, the bathtub behaviour is depicted by the estimated hazard rate curve (see Fig. 1).

3.2. Posterior analysis

As pointed out earlier, the posterior analysis has been done based on a hybrid strategy combining Metropolis within the Gibbs chain. For the purpose of implementation, we considered the Metropolis algorithm separately for each single dimension, that is, by defining the Metropolis algorithm for generation from those full conditionals that are not available for direct sample generation (see Appendix A). The implementation phase considered a single long run of the chain, using the maximum likelihood estimates of the parameters as the initial values of the concerned variates. For the parameters of the candidate-generating densities in the Metropolis step, we considered, besides the maximum likelihood estimates, the corresponding variances as well.

A systematic pattern of convergence based on ergodic averages was assessed after an initial transient behaviour of the chain. The final posterior sample of size 1000 was taken by choosing equally spaced outcomes after 70,000 iterations, although convergence was achieved quite early, at about 30,000 iterations. We considered (2.0, 1.0) as the values of the hyperparameters (η, k) of the gamma prior. Several values were assigned to the hyperparameters M_1 and M_2, and it was seen that they do not have a considerable effect on the results. However, for the purpose of illustration, we took both M_1 and M_2 equal to 10.0.

Fig. 2 provides the kernel density estimates of α, β and λ. These estimates have been drawn using the R software, with the assumption of a Gaussian kernel and properly chosen values of the bandwidths. It can be seen that both β and λ are almost symmetric, whereas α shows slight positive skewness. Table 2 provides the corresponding values of the parameter estimates and some of the other important posterior characteristics based on the final sample of size 1000. Among the different conclusions that can be easily drawn, an important one concerns the estimated values of β. A value less than unity clearly advocates a bathtub-shaped hazard rate for the model. This observation is further strengthened by the 90% credible region (0.414, 0.597) of β shown in the last column of the table (see also the estimated density of β in Fig. 2).

Fig. 2. Kernel density estimates of α, β and λ.

Table 2
Posterior estimates for the Weibull extension model

Parameter   Mean    Mode    SD      5%      95%
α           0.365   0.303   0.143   0.177   0.627
β           0.500   0.487   0.055   0.414   0.597
λ           0.011   0.010   0.002   0.006   0.016

Table 3
Estimated posterior correlations based on MCMC output

Parameter   α       β       λ
α           1.000   0.842   0.021
β           0.842   1.000   0.399
λ           0.021   0.399   1.000

As a word of final remark, it is important to mention that once we have a random sample from the posterior, we can study perhaps any desired posterior characteristic. The estimates and the pictures shown in the paper are for the purpose of illustration only.

Table 3 provides the values of the estimated posterior correlations. It is clear that α and β are highly correlated a posteriori, whereas the pair (α, λ) is almost uncorrelated. The high posterior correlation between α and β may be an important reason behind the slow convergence of the MCMC chain.

4. Model compatibility

4.1. Bayesian tools

The study of model compatibility with the data in hand offers a significant step in the model-building process. A wrongly entertained model cannot provide the expected solutions to the underlying engineering or scientific problems under investigation and, therefore, it becomes pertinent to authenticate the model proposed in any scientific inquisition. The classical paradigm provides a number of toolkits for studying model compatibility, but the techniques mostly rely on asymptotic approximations and/or large-sample distributional properties (see, for example, [29] for a detailed documentation). Moreover, the situation worsens as the modeling assumptions become more and more difficult. The Bayesian paradigm offers several techniques to authenticate the considered model for the data in hand. The predictive distribution plays a vital role in this development, and the logic behind it is quite convincing too: we suspect a model if it predicts something very different from what we started with. The problem with this approach is often the non-availability of the predictive distribution in closed form, excluding situations where we assume some simple and standard form of the model. Modern Bayesian computational tools, however, provide straightforward solutions, as one can easily simulate predictive samples if MCMC outputs are available from the posterior corresponding to the assumed model. Most of the standard numerical and graphical methods based on the predictive distribution can then be easily implemented to study the compatibility of the model (see [17,30], etc.).

Comparison of empirical distribution function plots based on the observed and the predictive data may be considered an informal way to check discrepancies between the data and the model. The cumulative distribution function (cdf) of the model under consideration, evaluated at, say, the posterior mode, can also be added to this comparison to further verify any possible differences between the model and the data (for details, see [17]). A more appealing graphical method that can be used in the initial stages of the study of model compatibility is to look for discrepancies in the hazard rate functions obtained from both the data and the model. The hazard rate function plays a vital role in reliability studies and, most importantly, it is free from some of the defects, like smoothing, that may occur with the cdf, the cumulative hazard function, etc.

Numerical measures have also been proposed for the purpose of studying model compatibility, and the tail area probability, or the p-value, is the most frequently used measure in both the classical and the Bayesian paradigms. Both paradigms offer straightforward computation of the p-value when the parameter(s) are known. In situations where the parameters are unknown, the classical approach replaces the unknown parameters with their point estimates (usually the maximum likelihood estimates) and computes the tail area probability based on some discrepancy measure. Such a p-value is called the plug-in p-value. The main problem with the classical approach is that it fails to produce a proper comparison between the data and the model, as it compares the data with the best-fitted model (usually the maximized likelihood). Another difficulty with this approach is that classical goodness-of-fit tests are well defined and calculable only if the discrepancy measure is a pivotal quantity, a situation that we often do not find in practice (see [22] for a detailed discussion).

In a Bayesian framework, the idea is to determine the tail area probability based on a discrepancy measure, and in determining it the unknown parameter(s) can be eliminated by integrating with respect to a certain distribution of the same. The Bayesian p-value is also referred to as the predictive p-value.

Many versions of predictive p-values have been proposed in the literature. These are, for example, the prior predictive p-value (see [31]), the posterior predictive p-value and its generalized versions (see [32,21,22,17], etc.), and the conditional predictive and partial predictive p-values (see [23]), etc. Among these different versions, the prior predictive p-value is simple and perhaps the most appealing, provided suitable proper prior information is available. The posterior predictive p-value has been advocated in the case of non-informative priors, but it suffers from several defects, such as the apparent double use of the data and its non-Bayesian characteristics.

such as apparent double use of data and its non-Bayesian model under consideration. We begin with the estimated
characteristics. In fact, it becomes data driven when sample hazard rate obtained on the basis of the data and the
size is large. Conditional predictive and partial predictive model. Fig. 3 provides an extension of Fig. 1 with
p-values are defined in an attempt to consider important superimposed hazard rate estimates in the form of dashed
features of prior and posterior predictive p-values and to lines. These lines correspond to the estimates of the hazard
preclude their unimportant features. Bayarri and Berger rate corresponding to the fitted model evaluated on the
[23] have given a nice discussion on the different versions of Bayesian p-values.

The main problem with the last two measures, however, lies in their computation, coupled with the choice of suitable test statistics. The choice of test statistics may rather be made intuitively and depends upon what aspects or characteristics of the model are considered important for the problem under study (see [21]). The computational aspects are really daunting except for some standard problems; we, however, remain confined to the partial predictive p-value due to its relative ease of computation over the conditional predictive p-value.

The partial predictive p-value can be defined as

P = \int \Pr(T \geq t_{\mathrm{obs}}) \, p^{*}(\theta) \, \mathrm{d}\theta,   (8)

where p^{*}(\theta) is the partial posterior given by

p^{*}(\theta) \propto f(x_{\mathrm{obs}} \mid t_{\mathrm{obs}}; \theta) \, p(\theta) \propto \frac{f(x_{\mathrm{obs}} \mid \theta) \, p(\theta)}{f(t_{\mathrm{obs}} \mid \theta)}.   (9)

Obviously, the double use of the data has been avoided, since the contribution of t_obs to the posterior is removed before \theta is eliminated through integration (see also [23]). Computation of the partial predictive p-value involves generation of a sample from p*(\theta). This can be done according to the procedure described in [23] using the Metropolis algorithm. A simple alternative to the Metropolis algorithm suggested
by the authors involves an importance-sampling-based
estimate of partial predictive p-value that is comparatively
easier to evaluate (see also [30]).
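Once draws from the partial posterior p*(θ) are in hand, the integral in (8) reduces to a Monte Carlo average of pr(T ≥ t_obs | θ) over those draws. The sketch below illustrates this idea only; the exponential toy model, the stand-in draws and all numerical values are hypothetical and are not the paper's model or data.

```python
import numpy as np

def partial_predictive_pvalue(theta_draws, tail_prob):
    """Monte Carlo estimate of Eq. (8): average pr(T >= t_obs | theta)
    over draws from the partial posterior p*(theta)."""
    return float(np.mean([tail_prob(th) for th in theta_draws]))

# Toy stand-in (NOT the paper's model): exponential(rate=theta) lifetimes,
# with test statistic T = largest of n future observations, so that
# pr(T >= t_obs | theta) = 1 - (1 - exp(-theta * t_obs))**n.
n, t_obs = 50, 4.0
rng = np.random.default_rng(0)
theta_draws = rng.gamma(shape=100.0, scale=0.01, size=1000)  # stand-in for p*(theta) draws
p_val = partial_predictive_pvalue(
    theta_draws, lambda th: 1.0 - (1.0 - np.exp(-th * t_obs)) ** n
)
```

In practice `theta_draws` would come from a Metropolis chain targeting p*(θ) rather than from a convenient stand-in density.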
As a final remark, it can be said that numerical measures are good but suffer from some important drawbacks too. One such drawback may be their tendency to be narrowly focused on a particular aspect of the relationship between the model and the data; as such, they often try to squeeze that information into a single descriptive number. Graphical methods based on predictive data, on the other hand, may offer suitable tools for describing the compatibility of the model with the data in various respects. The predictive distribution of a measure such as, for example, the mean, the variance or any ordered future observation can be plotted together with the corresponding observed value to obtain a complete and clear idea. Such a distribution may not only provide a detailed picture of the compatibility of the model but also give an idea of several other aspects where the desirability of improving the model can be assessed.

4.2. Real data illustration

Continuing with the same data set considered in the previous section, let us examine the compatibility of the model on the basis of simulated posterior samples obtained using the MCMC technique. Fig. 3 compares the hazard rate estimated from the observed data with that of the fitted model. Obviously, the model can be considered for the data, but with reservation. A difference between the model and the data is quite evident in the upper region, where the two sets of curvatures show different tendencies. Thus, one may consider the model a true representative everywhere except in the upper region. For the upper region of the failure times one should, however, try to find a better candidate.

Fig. 3. Estimated hazard rate based on the observed data (solid line) and the fitted model (dashed lines).

We next considered computation of the partial predictive p-value detailed in the previous subsection (see also [23]). For the purpose of illustration, we first drew 1000 samples from the partial posterior p*(θ) using the Metropolis chain. We then took the smallest and largest ordered observations as test statistics for computing the partial predictive p-values. The reason behind choosing the smallest and largest order observations is simply that most reliability models provide a good fit to the data in the central region, and any discrepancy between the model and the data generally occurs in the tails. In no way do we claim that these are the only possible statistics; there can always be choices better than those we have considered, although we have paid attention to other neighbouring observations as well (see below). Table 4 provides the corresponding partial predictive p-values based on the smallest and largest ordered future observations, say Y1 and Yn. It is obvious that the model is strongly supported by the data in hand, especially when one relies on Y1, but becomes highly suspect when one considers the test statistic Yn. It is to be noted that a similar conclusion was drawn on the basis of the estimated hazard rate too.

Table 4
Partial posterior predictive p-values

Test statistic    Y1      Yn
p-value           0.47    0.99

To obtain further clarity on our conclusion from the study of model compatibility, we considered plotting density estimates of the three smallest and three largest replicated future observations from the model, with the corresponding observed data superimposed. For this purpose, 1000 samples were drawn from the posterior using the MCMC procedure, and predictive samples were then obtained from the model under consideration using each simulated posterior sample. The size of each predictive sample was the same as that of the observed data.

Density estimates based on the replicated future data sets are shown in Figs. 4 and 5. Fig. 4 represents the estimates corresponding to the three smallest predictive observations, whereas those for the three largest observations are shown in Fig. 5. The corresponding observed values are also shown by means of vertical lines. Although any such visual conclusion is bound to be somewhat vague, it is obvious from inspection of the figures that the lower ordered observations are highly probable values under the corresponding predictive density estimates, whereas the probabilities diminish rapidly for the higher ordered observations under their corresponding predictive density estimates. It therefore gives the impression that there is some scope for improving the model in the region of wear-out failures, a conclusion that was drawn earlier by alternative means of studying model compatibility.

Fig. 4. Density estimates of the three smallest order future observations; vertical lines represent the corresponding observed values (Y1, Y2 and Y3 in clockwise direction).

Fig. 5. Density estimates of the three largest order future observations; vertical lines represent the corresponding observed values (Y48, Y49 and Y50 in clockwise direction).

5. Conclusion

The paper successfully describes the scope of the Markov chain Monte Carlo (MCMC) technique for the Weibull extension model, which is capable of representing bathtub behaviour of the hazard rate. It has been further shown that the issue of model compatibility can also be routinely tackled using the output of MCMC. It was observed that the model can be considered quite adequate for the data, though it fails to provide a good fit to the higher ordered observations, which are responsible for the increasing portion of the hazard rate. In fact, it is often seen that for bathtub hazard rates a single mathematical formulation fails to capture the actual shape of the bathtub curve, the location of its two change points and the length of its constant region, a common finding that has been noticed to some extent in the present study as well.

Such procedures can be developed for other similar distributions as well. The only point is that one is required to develop a proper and efficient algorithm, the development of which very much depends on the likelihood forms and prior inputs. One such strategy, based on a hybrid algorithm, has been successfully evolved in the present paper.

Acknowledgement

Research work of Ashutosh Gupta is financially supported by the Council of Scientific and Industrial Research, New Delhi, India, in the form of a Senior Research Fellowship.

Appendix A

For implementing the Gibbs sampler algorithm, the full conditionals of the various parameters can be obtained from the posterior (7) as

f_{1}(\alpha \mid \beta, \lambda, x) \propto \alpha^{\,n-(n/\beta)+\eta-1} \exp\Big(\alpha \sum_{i=1}^{n} x_{i}^{\beta}\Big) \exp\Big(-\lambda \alpha^{-1/\beta} \sum_{i=1}^{n} \big(\exp(\alpha x_{i}^{\beta}) - 1\big)\Big) \exp\Big(-\frac{\alpha}{k}\Big),   (A.1)

f_{2}(\beta \mid \alpha, \lambda, x) \propto \beta^{\,n}\, \alpha^{\,n-(n/\beta)+\eta-1} \prod_{i=1}^{n} x_{i}^{\beta-1} \exp\Big(\alpha \sum_{i=1}^{n} x_{i}^{\beta}\Big) \exp\Big(-\lambda \alpha^{-1/\beta} \sum_{i=1}^{n} \big(\exp(\alpha x_{i}^{\beta}) - 1\big)\Big),   (A.2)

f_{3}(\lambda \mid \alpha, \beta, x) \propto \lambda^{\,n} \exp\Big(-\lambda \alpha^{-1/\beta} \sum_{i=1}^{n} \big(\exp(\alpha x_{i}^{\beta}) - 1\big)\Big).   (A.3)

It can be seen that (A.1) and (A.2) are quite complicated and do not belong to any well-known family of distributions. These two full conditionals are otherwise also difficult to simulate from, say using adaptive rejection or direction sampling schemes. Therefore, we can conclude that the Gibbs sampler scheme is not easy to implement for the prior–likelihood combination that we have considered for defining the posterior. The Metropolis algorithm, however, can be a good candidate for these full conditionals. We used univariate normal candidate-generating densities for implementing the Metropolis algorithm on the full conditionals (A.1) and (A.2). The full conditional (A.3) is simple, and it can be shown that it belongs to the gamma family with shape and scale parameters, respectively, n+1 and \big[\alpha^{-1/\beta} \sum_{i=1}^{n} (\exp(\alpha x_{i}^{\beta}) - 1)\big]^{-1}. Generation from (A.3) is, therefore, quite straightforward (see, for example, [33] for a gamma-generating routine).

References

[1] Bebbington M, Lai CD, Zitikis R. Optimum burn-in time for a bathtub shaped failure distribution. Methodol Comput Appl Probab 2007;9:1–20.
[2] Xie M, Lai CD. Reliability analysis using an additive Weibull model with bathtub-shaped failure rate function. Reliab Eng Syst Saf 1995;52:87–93.
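The replication scheme described here can be sketched as follows. The sketch assumes the Weibull extension survival function implied by the full conditionals (A.1)–(A.3), namely S(x) = exp(−λ α^{−1/β}(e^{α x^β} − 1)), and uses inverse-CDF sampling; the parameter values standing in for posterior draws are arbitrary illustrations, not the paper's estimates.

```python
import numpy as np

def rweibull_ext(alpha, beta, lam, size, rng):
    """Inverse-CDF sampling under the assumed survival function
    S(x) = exp(-lam * alpha**(-1/beta) * (exp(alpha * x**beta) - 1))."""
    u = rng.uniform(size=size)                       # S(X) ~ U(0,1)
    inner = 1.0 - alpha ** (1.0 / beta) * np.log(u) / lam
    return (np.log(inner) / alpha) ** (1.0 / beta)

def replicated_order_stats(posterior_draws, n, k, rng):
    """For each (alpha, beta, lam) draw, simulate one replicate data set of
    size n and keep its k smallest and k largest order statistics."""
    smallest, largest = [], []
    for alpha, beta, lam in posterior_draws:
        rep = np.sort(rweibull_ext(alpha, beta, lam, n, rng))
        smallest.append(rep[:k])
        largest.append(rep[-k:])
    return np.array(smallest), np.array(largest)

rng = np.random.default_rng(1)
draws = [(0.05, 0.6, 0.02)] * 200       # arbitrary stand-in for MCMC posterior draws
lo, hi = replicated_order_stats(draws, n=50, k=3, rng=rng)
```

Kernel density estimates of each column of `lo` and `hi` (e.g. via `scipy.stats.gaussian_kde`) reproduce the kind of plots shown in Figs. 4 and 5, with the observed order statistics overlaid as vertical lines.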
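A minimal sketch of the hybrid scheme built on these full conditionals — univariate normal Metropolis candidates for α and β based on (A.1) and (A.2), and a direct gamma draw for λ from (A.3) — might look as follows. The log densities follow the conditionals as reconstructed here; the prior values η and k, the proposal scales, the starting values and the data are illustrative assumptions only.

```python
import numpy as np

def log_f1(alpha, beta, lam, x, eta, k):
    # log of the full conditional (A.1) for alpha, up to an additive constant
    if alpha <= 0.0:
        return -np.inf
    n = len(x)
    s = np.sum(np.exp(alpha * x ** beta) - 1.0)
    return ((n - n / beta + eta - 1.0) * np.log(alpha)
            + alpha * np.sum(x ** beta)
            - lam * alpha ** (-1.0 / beta) * s
            - alpha / k)

def log_f2(alpha, beta, lam, x, eta):
    # log of the full conditional (A.2) for beta, up to an additive constant
    if beta <= 0.0:
        return -np.inf
    n = len(x)
    s = np.sum(np.exp(alpha * x ** beta) - 1.0)
    return (n * np.log(beta)
            + (n - n / beta + eta - 1.0) * np.log(alpha)
            + (beta - 1.0) * np.sum(np.log(x))
            + alpha * np.sum(x ** beta)
            - lam * alpha ** (-1.0 / beta) * s)

def hybrid_sampler(x, n_iter, eta, k, step=(0.05, 0.05), init=(0.1, 0.7, 0.05), seed=0):
    """Metropolis-within-Gibbs: normal random-walk candidates for alpha and
    beta, exact gamma draws for lambda from (A.3)."""
    rng = np.random.default_rng(seed)
    alpha, beta, lam = init
    out = np.empty((n_iter, 3))
    for t in range(n_iter):
        cand = rng.normal(alpha, step[0])          # Metropolis step for alpha
        if np.log(rng.uniform()) < log_f1(cand, beta, lam, x, eta, k) - log_f1(alpha, beta, lam, x, eta, k):
            alpha = cand
        cand = rng.normal(beta, step[1])           # Metropolis step for beta
        if np.log(rng.uniform()) < log_f2(alpha, cand, lam, x, eta) - log_f2(alpha, beta, lam, x, eta):
            beta = cand
        # Gibbs step for lambda: Gamma(n + 1, scale) per (A.3)
        scale = 1.0 / (alpha ** (-1.0 / beta) * np.sum(np.exp(alpha * x ** beta) - 1.0))
        lam = rng.gamma(shape=len(x) + 1.0, scale=scale)
        out[t] = alpha, beta, lam
    return out

x = np.array([0.5, 1.2, 2.3, 3.1, 4.0, 4.4, 5.2, 6.0])   # illustrative lifetimes
samples = hybrid_sampler(x, n_iter=200, eta=1.0, k=10.0)
```

A real application would discard a burn-in portion of the chain and tune the proposal scales by monitoring the acceptance rates of the two Metropolis steps.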
[3] Mudholkar GS, Srivastava DK. Exponentiated Weibull family for analyzing bathtub failure rate data. IEEE Trans Reliab 1993;42(2):299–302.
[4] Hjorth U. A reliability distribution with increasing, decreasing, constant and bathtub-shaped failure rates. Technometrics 1980;22(1):99–107.
[5] Wang FK. A new model with bathtub-shaped failure rate using an additive Burr XII distribution. Reliab Eng Syst Saf 2000;70:305–12.
[6] Dimitrakopoulou T, Adamidis K, Loukas S. A lifetime distribution with an upside-down bathtub-shaped hazard function. IEEE Trans Reliab 2007;56:308–11.
[7] Zhang TL, Xie M. Failure data analysis with extended Weibull distribution. Commun Stat Simul Comput 2007;36:579–92.
[8] Bebbington M, Lai CD, Zitikis R. A flexible Weibull extension. Reliab Eng Syst Saf 2007;92:719–26.
[9] Chen Z. A new two-parameter lifetime distribution with bathtub shape or increasing failure rate function. Stat Probab Lett 2000;49:155–61.
[10] Xie M, Tang Y, Goh TN. A modified Weibull extension with bathtub-shaped failure rate function. Reliab Eng Syst Saf 2002;76:279–85.
[11] Tang Y, Xie M, Goh TN. Statistical analysis of a Weibull extension model. Commun Stat Theory Meth 2003;32(5):913–28.
[12] Nadarajah S. On the moments of the modified Weibull distribution. Reliab Eng Syst Saf 2005;90:114–7.
[13] Zhang TL, Xie M, Tang LC. A study of two estimation approaches for parameters of Weibull distribution based on WPP. Reliab Eng Syst Saf 2007;92:360–8.
[14] Martz HF, Waller RA. Bayesian reliability analysis. New York: Wiley; 1982.
[15] Upadhyay SK, Smith AFM. Modeling complexities in reliability, and the role of simulation in Bayesian computation. Int J Cont Eng Educ 1994;4:93–104.
[16] Upadhyay SK, Agrawal R, Smith AFM. Bayesian analysis of inverse Gaussian and non linear regression by simulation. Sankhya, Series B 1996;58:363–78.
[17] Upadhyay SK, Vasishta N, Smith AFM. Bayes inference in life testing and reliability via Markov chain Monte Carlo simulation. Sankhya, Series A 2001;63(1):15–40.
[18] Pang WK, Hou SH, Yu WT. On a proper way to select population failure distribution and a stochastic optimization method in parameter estimation. Eur J Oper Res 2007;117:604–11.
[19] Smith AFM, Roberts GO. Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods. J R Stat Soc B 1993;55:2–23.
[20] Robert CP, Casella G. Monte Carlo statistical methods. New York: Springer; 2004.
[21] Rubin DB. Bayesianly justifiable and relevant frequency calculations for the applied statistician. Ann Stat 1984;12:1151–72.
[22] Gelman A, Meng XL, Stern HS. Posterior predictive assessment of model fitness via realized discrepancies. Stat Sin 1996;6:733–807.
[23] Bayarri MJ, Berger JO. Quantifying surprise in the data and model verification. In: Bernardo JM, Berger JO, Dawid AP, Smith AFM, editors. Bayesian statistics, vol. 6. London: Oxford University Press; 1998. p. 53–82.
[24] Gelfand AE, Smith AFM. Sampling based approaches to calculating marginal densities. J Am Stat Assoc 1990;85:398–409.
[25] Upadhyay SK, Smith AFM. Bayesian inference in life testing and reliability via Markov chain Monte Carlo simulation. Technical Report 93-19, Department of Mathematics, Imperial College, London, 1993.
[26] Aarset MV. How to identify bathtub hazard rate. IEEE Trans Reliab 1987;R-36(1):106–8.
[27] Mudholkar GS, Srivastava DK, Kollia GD. A generalization of the Weibull distribution with application to the analysis of survival data. J Am Stat Assoc 1996;91(436):1575–83.
[28] Lai CD, Xie M, Murthy DNP. A modified Weibull distribution. IEEE Trans Reliab 2003;52:33–7.
[29] Lawless JF. Statistical models and methods for lifetime data. New York: Wiley; 1982.
[30] Upadhyay SK, Gupta A. Bayesian analysis of modified Weibull distribution using Markov chain Monte Carlo simulation. Submitted for possible publication in J Stat Plann Infer, 2007.
[31] Box GEP. Sampling and Bayes' inferences in scientific modelling. J R Stat Soc, Series A 1980;143:383–430 (with discussions).
[32] Guttman I. The use of the concept of a future observation in goodness-of-fit problems. J R Stat Soc, Series B 1967;29:83–100.
[33] Devroye L. Non-uniform random variate generation. New York: Springer; 1986.
