1995 Control charts using robust estimators
Abstract A study is made of the Shewhart control charts when sampling from several non-normal
populations. The benefits of the use of robust estimators suggested by Andrews et al. (1972) in the
Princeton Study are evaluated for each of these distributions. All computations are done using the
Monte Carlo swindle techniques.
Introduction
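In the case of the random variable $X$ having the normal distribution with known mean $\mu$ and
variance $\sigma^2$, the 99.73% $\bar{X}$ control chart for the mean is given by the limits
$$\mathrm{UCL} = \mu + \frac{3\sigma}{\sqrt{n}}, \qquad \mathrm{CL} = \mu, \qquad \mathrm{LCL} = \mu - \frac{3\sigma}{\sqrt{n}}. \qquad (1)$$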
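For details, see Montgomery (1991). Unfortunately, it is hardly realistic to assume that $\mu$ and
$\sigma^2$ are known. When the parameters are unknown, it is suggested that $\mu$ be estimated by the
sample mean $\bar{X}$ and the standard deviation $\sigma$ be replaced by $s/c_4$, its unbiased estimator. Here
$c_4$ is a tabled constant, which in the case of the normal distribution is
$$c_4 = \sqrt{\frac{2}{\nu}}\;\frac{\Gamma\{(\nu + 1)/2\}}{\Gamma(\nu/2)},$$
where $\nu = n - 1$ and $n$ is the sample size. Thus, the control limits become
$$\mathrm{UCL} = \bar{X} + \frac{3s}{c_4\sqrt{n}}, \qquad \mathrm{CL} = \bar{X}, \qquad \mathrm{LCL} = \bar{X} - \frac{3s}{c_4\sqrt{n}}.$$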
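An alternative method of estimating the standard deviation $\sigma$ is by using the sample range $R$.
Then the control chart is defined by the control limits
$$\mathrm{UCL} = \bar{X} + \frac{3R}{d_2\sqrt{n}}, \qquad \mathrm{CL} = \bar{X}, \qquad \mathrm{LCL} = \bar{X} - \frac{3R}{d_2\sqrt{n}},$$
where $d_2$ is a tabled constant determined so that the ratio $R/d_2$ is an unbiased estimate of $\sigma$.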
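As a rough illustration of the standard-deviation version of the chart, the sketch below evaluates the normal-theory constant $c_4$ from its gamma-function form and then the three-sigma limits. The function names `c4` and `xbar_limits` are illustrative and do not come from the paper.

```python
import math


def c4(n):
    """Normal-theory constant such that E(s) = c4 * sigma for a sample of size n."""
    nu = n - 1
    # Work with log-gamma so the ratio stays stable for large n.
    log_ratio = math.lgamma((nu + 1) / 2.0) - math.lgamma(nu / 2.0)
    return math.sqrt(2.0 / nu) * math.exp(log_ratio)


def xbar_limits(xbar, s, n):
    """Return (LCL, CL, UCL) for the mean chart with sigma estimated by s / c4."""
    half_width = 3.0 * s / (c4(n) * math.sqrt(n))
    return xbar - half_width, xbar, xbar + half_width


if __name__ == "__main__":
    print(round(c4(5), 4))
    print(xbar_limits(10.0, 2.0, 5))
```

The range-based chart has the same shape, with `s / c4(n)` replaced by $R/d_2$ and $d_2$ read from a table.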
The question of interest here is to assess the effect of non-normality of the variable $X$ and
to consider the role of robust estimation procedures on the control charts for $\mu$. The robust
estimators examined in this connection are the ones presented in the Princeton Study. See
Andrews et al. (1972) for a detailed discussion; an application of this technique to confidence
intervals is presented in Gross (1976). In addition, Tiku's (1967) modified maximum
likelihood (MML) estimator is also included in this study. This estimator has been
considered by several authors as a strong contender for robust estimation procedures in the case
of symmetric thick-tailed distributions.
Robust estimation
A large number of robust procedures for estimating the location parameter of a symmetric
distribution are presented by Andrews et al. We shall discuss a few of these methods here. In
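what follows the order statistics for a random sample of size $n$ are represented by $x_1, x_2, \ldots, x_n$.
where
$$m = n - 2k + 2k\beta, \qquad A = n - 2k,$$
and the remaining quantities $C$, $\beta$ and $\alpha$ are defined through the sum $\sum_{i=k+1}^{n-k}$ over the
uncensored order statistics and the ratio $f(t)/q$.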
The value of $t$ is determined as a solution to the equation $F(t) = 1 - q$. Here $f(t)$ and $F(t)$
stand, respectively, for the pdf and cdf of the random variable $X$. The number of observations
censored from either end is fixed and denoted by $k$.
It has been shown by Tiku (1967) that, for large $n$, $\hat{\mu}$ is approximately $N[\mu, \sigma^2/m]$ and
$(A - 1)\hat{\sigma}^2/\sigma^2$ is an independent $\chi^2$ variate on $\nu = A - 1$ degrees of freedom.
where $k = [pn] + 1$ and $[h]$ stands for the greatest integer contained in $h$.
$T = \mathrm{median}(x_i)$
$S = \mathrm{MAD} = \mathrm{median}\{|x_i - T|\}$
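The median and MAD pair can be computed directly from these two definitions. The following is a minimal sketch; the function name `median_mad` is illustrative, and no rescaling constant is applied to the MAD.

```python
import statistics


def median_mad(xs):
    """Return (T, S): the sample median and the median absolute deviation about it."""
    t = statistics.median(xs)
    s = statistics.median([abs(x - t) for x in xs])
    return t, s


if __name__ == "__main__":
    # A single gross outlier barely moves either estimate.
    print(median_mad([1, 2, 3, 4, 100]))
```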
Let $T_0$ and $S_0$ represent the median and MAD as given above. With $z_i = (x_i - T_0)/(k^* S_0)$
where the summations are over values of the index $i$ for which $|z_i| < \pi$. We note that the
estimators are of the form given by Gross (1976). As pointed out by him, the estimators
without $\tan^{-1}$ are a one-step Newton-Raphson M-estimator with the $\psi$-function given by
$$\psi(z) = \sin z, \qquad |z| < \pi.$$
The $\tan^{-1}$ is introduced to speed up the convergence. $S^2$ is a finite sample evaluation of the
asymptotic variance formula for an M-estimator.
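To make the one-step idea concrete, here is a sketch of the version without $\tan^{-1}$. It uses the generic one-step Newton-Raphson update for an M-estimator, $T = T_0 + kS_0 \sum\psi(z_i)/\sum\psi'(z_i)$, with $\psi(z) = \sin z$ on $|z| < \pi$. The tuning constant `k_star = 2.4` is an assumed value chosen for illustration, not one taken from this paper, and the function name is hypothetical.

```python
import math
import statistics


def wave_one_step(xs, k_star=2.4):
    """One Newton-Raphson step from the median for the wave psi-function.

    k_star is an assumed tuning constant (illustrative only).
    """
    t0 = statistics.median(xs)
    s0 = statistics.median([abs(x - t0) for x in xs])
    if s0 == 0:
        return t0  # degenerate sample: no scale information, keep the median
    scale = k_star * s0
    z = [(x - t0) / scale for x in xs]
    # Only observations with |z| < pi contribute; psi is zero elsewhere.
    inside = [zi for zi in z if abs(zi) < math.pi]
    num = sum(math.sin(zi) for zi in inside)  # sum of psi(z)
    den = sum(math.cos(zi) for zi in inside)  # sum of psi'(z)
    if den == 0:
        return t0
    return t0 + scale * num / den


if __name__ == "__main__":
    print(wave_one_step([-2, -1, 0, 1, 2, 1000]))
```

In the usage line the observation 1000 falls outside $|z| < \pi$ and is ignored, so the estimate stays near the centre of the remaining five points.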
Hampel estimators
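$$\psi(z) = \begin{cases} z, & |z| < a \\ a\,\mathrm{sgn}(z), & a < |z| < b \\ \dfrac{a\,(c - |z|)}{c - b}\,\mathrm{sgn}(z), & b < |z| < c \\ 0, & |z| > c \end{cases}$$
with
$$\psi'(z) = \begin{cases} 1, & |z| < a \\ 0, & a < |z| < b \\ -\dfrac{a}{c - b}, & b < |z| < c \\ 0, & |z| > c. \end{cases}$$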
Then
$T$ = median + MAD
Here the triple $a$, $b$ and $c$ are taken to be 2.25, 3.75 and 15.0. The estimator $T$ is once again
82 K. KOCHERLAKOTA & S. KOCHERLAKOTA
Non-normal distributions
In this study, we consider a wide class of non-normal distributions defined by $X = Z/Y$, where
$Z$ is $N(0,1)$ and $Y$ is a random variable which imparts the non-normality to $X$. The random
variables $Z$ and $Y$ are independently distributed. The random variable $X$ has a symmetric
distribution with tails that are, in general, thicker than the tails of the standard normal. The
random variables $Y$ and the corresponding distributions of $X$ are shown in Table 1.
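The $Z/Y$ construction is easy to simulate. The sketch below draws a $t$ variate on $\nu$ degrees of freedom by taking $Y = \sqrt{\chi^2_\nu/\nu}$, as described later in the paper, and a slash variate by taking $Y$ uniform on $(0, 1]$. The uniform choice is the usual textbook definition of the slash distribution and is assumed here, since Table 1 is not reproduced in this text; the function names are illustrative.

```python
import math
import random


def draw_t(nu, rng):
    """X = Z / Y with Y = sqrt(chi2_nu / nu), giving a t variate on nu df."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square on nu df as Gamma(nu/2, scale 2)
    return z / math.sqrt(chi2 / nu)


def draw_slash(rng):
    """X = Z / Y with Y uniform on (0, 1] (assumed standard slash definition)."""
    z = rng.gauss(0.0, 1.0)
    u = 1.0 - rng.random()  # lies in (0, 1], so the division is always defined
    return z / u


if __name__ == "__main__":
    rng = random.Random(0)
    print([round(draw_t(3, rng), 3) for _ in range(5)])
    print([round(draw_slash(rng), 3) for _ in range(5)])
```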
Control limits
With a robust location estimator T and the corresponding scale estimator S, the control limits
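given in (1) become
$$\mathrm{UCL} = T + \frac{3S}{A\sqrt{n}}, \qquad \mathrm{CL} = T, \qquad \mathrm{LCL} = T - \frac{3S}{A\sqrt{n}}. \qquad (2)$$
In (2) the constant $A$ is determined so that $S/A$ is an unbiased estimator of the scale
parameter.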
As mentioned earlier the most commonly used control charts are (a) $\bar{X}$ charts using the
sample standard deviation and (b) $\bar{X}$ charts using the range. For $\bar{X}$ charts using the sample
standard deviation, the $T$ in (2) is the sample mean $\bar{X}$ and $S$ is the sample standard deviation
with $A = c_4$.
Shewhart suggested strengthening these control limits using rational subgroups (see
Montgomery, 1991, p. 113). In this modification, m rational subgroups each of size n are
taken. Following Shewhart's suggestion, these subgroups are formed so that the between-
groups variability is maximized while the within-group variation is minimized, from a
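practical point of view. Then
$$\bar{\bar{X}} = \frac{1}{m}\sum_{i=1}^{m} \bar{X}_i, \qquad \bar{S} = \frac{1}{m}\sum_{i=1}^{m} S_i,$$
where $\bar{X}_i$ and $S_i$ are the group mean and standard deviation. Each of these estimates is an
unbiased estimate of the corresponding parameter. Then the control limits using rational
subgroups are
$$\mathrm{UCL} = \bar{\bar{X}} + \frac{3\bar{S}}{c_4\sqrt{n}}, \qquad \mathrm{CL} = \bar{\bar{X}}, \qquad \mathrm{LCL} = \bar{\bar{X}} - \frac{3\bar{S}}{c_4\sqrt{n}}.$$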
In effect the control limits are the average of the control limits for the m subgroups.
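A small sketch of the rational-subgroup calculation may help: it averages the subgroup means and standard deviations and then forms three-sigma limits with the normal-theory constant $c_4$. The names `c4` and `subgroup_limits` are illustrative, and all subgroups are assumed to have the same size $n$.

```python
import math
import statistics


def c4(n):
    """Normal-theory constant such that E(s) = c4 * sigma for a sample of size n."""
    nu = n - 1
    return math.sqrt(2.0 / nu) * math.exp(
        math.lgamma((nu + 1) / 2.0) - math.lgamma(nu / 2.0)
    )


def subgroup_limits(groups):
    """Return (LCL, CL, UCL) from m equal-sized rational subgroups."""
    n = len(groups[0])
    grand_mean = statistics.fmean([statistics.fmean(g) for g in groups])
    s_bar = statistics.fmean([statistics.stdev(g) for g in groups])
    half_width = 3.0 * s_bar / (c4(n) * math.sqrt(n))
    return grand_mean - half_width, grand_mean, grand_mean + half_width


if __name__ == "__main__":
    print(subgroup_limits([[1, 2, 3], [2, 3, 4]]))
```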
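For $\bar{X}$ charts using the range the scale parameter is estimated by the average range found
by averaging over the $m$ subgroups:
$$\bar{R} = \frac{1}{m}\sum_{i=1}^{m} R_i.$$
When using $\bar{R}$, the constant $A$ is the tabled value $d_2$. Using $\bar{R}$ and $d_2$ the control limits are
$$\mathrm{UCL} = \bar{\bar{X}} + \frac{3\bar{R}}{d_2\sqrt{n}}, \qquad \mathrm{CL} = \bar{\bar{X}}, \qquad \mathrm{LCL} = \bar{\bar{X}} - \frac{3\bar{R}}{d_2\sqrt{n}}.$$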
To construct the control charts for the normal distribution using the robust estimators, we
need to determine the constant $A$ appropriate to the particular estimators. These constants
were obtained using computer simulations. In each case, a sample of size $n$ was taken from
$N(0,1)$ and $S$, the scale estimator, was determined. The constant $A$ was found by averaging
over 5000 repetitions. This gives $E(S/A) = \sigma$ for any $\sigma$. From Table 2, it can be seen that
when $S$ is the sample standard deviation (estimator 1) or the range (estimator 2) the
simulated values agree closely with the standard tabled values. In Table 2, estimates 3-8
stand for the estimators given in the section on robust estimation.
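The simulation of the constant $A$ can be sketched as follows: draw samples of size $n$ from $N(0,1)$, evaluate the scale estimator on each, and average. The function names, the seed and the use of Python's `random` module are illustrative choices, not the original IMSL-based implementation.

```python
import random
import statistics


def simulate_constant(scale_estimator, n, reps=5000, seed=0):
    """Estimate A = E(S) under N(0, 1), so that S / A is unbiased for sigma."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += scale_estimator(sample)
    return total / reps


def sample_range(xs):
    """Range of a sample: largest minus smallest observation."""
    return max(xs) - min(xs)


if __name__ == "__main__":
    # For n = 5 these should land near the tabled c4 and d2 values.
    print(simulate_constant(statistics.stdev, 5))
    print(simulate_constant(sample_range, 5))
```

Any other scale estimator, such as the MAD, can be passed in the same way to obtain its own constant.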
procedures, usually fewer than 1000 repetitions are needed. Since we have used the swindle
techniques extensively, it is pertinent to describe them.
An excellent discussion of the Monte Carlo swindle method has been given by Simon
(1976). As pointed out by him, there does not seem to be a single seminal work published on
this topic. Besides Simon's paper, the papers by Gross (1973, 1976) are useful for
understanding the underlying ideas.
The term 'swindle' is applied to any trick which either reduces the effort or improves the
precision, or both, of a computer simulation. The swindle technique can be applied in a
variety of situations. The only requirement is that the parameter being determined be
expressible as an expectation. This permits estimation by an averaging over the sample
generated by the simulation. Examples of such a representation are: $\theta = E(X)$,
$\theta = \mathrm{Var}(X) = E(X^2)$ if $E(X) = 0$, and $\theta = P\{X > y\} = E(I)$ where $I = 1$ if $\{X > y\}$ and
$I = 0$ if $\{X \le y\}$.
Let $x_i$, $i = 1, 2, \ldots, n$ be a random sample of size $n$ from a symmetric distribution of the
type $Z/Y$ discussed above. Let $T_n$ be a statistic with the property
$$T_n(a + b x_1, \ldots, a + b x_n) = a + b\,T_n(x_1, \ldots, x_n),$$
where $a$ and $b$ are constants. The expectation of $g(T)$ can be determined by decomposition:
$$E[g(T)] = E_Y E_{nr} E_{\hat{b}} E_{\hat{a}}\left[g\left(T\{(x_i - \hat{a})/\hat{b}\}\,\hat{b} + \hat{a}\right)\right],$$
where $\hat{a}$, $\hat{b}$ are any pair of location and scale estimators, and $nr$ stands for the sample of normalized
residuals $\{(x_i - \hat{a})/\hat{b}\}$. The expectations are taken successively starting with the innermost one:
$E_{\hat{a}}$: averaging over $\hat{a}$ (location estimator) conditionally on $\hat{b}$, the normalized residuals
and the observed denominators $\{y_i\}$.
$E_{\hat{b}}$: subsequently evaluated by averaging over $\hat{b}$ (scale estimator) conditionally on the
normalized residuals and the observed $\{y_i\}$.
$E_{nr}$: evaluated by averaging over the observed normalized residuals conditionally on
$\{y_i\}$.
$E_Y$: denotes averaging over the observed denominators $\{y_i\}$.
If $E_{\hat{a}}$ is evaluated analytically, then the procedure is referred to as 'location swindle' only. If,
however, both $E_{\hat{a}}$ and $E_{\hat{b}}$ are determined analytically, then we have a 'location and scale
swindle'. It should be noted that $T\{nr\}$ can be determined as $[T\{x_i\} - \hat{a}]/\hat{b}$ with $T\{x_i\}$ being
the value of the statistic $T$ found from the observations $x_1, x_2, \ldots, x_n$. Although the choice
of $\hat{a}$, $\hat{b}$ is entirely arbitrary, it is helpful to choose them as
$$\hat{a} = \frac{\sum_i x_i y_i^2}{\sum_i y_i^2}, \qquad \hat{b}^2 = \frac{1}{n - 1}\sum_i (x_i - \hat{a})^2 y_i^2. \qquad (3)$$
Such a choice makes the analytical evaluation of the first two expectations quite easy in most
instances.
To evaluate the percentage points, we have to find probabilities of the type $P\{T < k\}$, for
several values of $k$, starting with an initial value in the neighborhood of the true value. Using
the location swindle, we have
where the summation is over the simulation and $N$ is the simulation size. Here the quantities
entering the sum, including $\hat{a}_j$ and $T_j$, can be readily determined from the observed values of
$x_{ij}$, $y_{ij}$ and $z_{ij}$. This representation
follows from the conditional distribution of $\hat{a}$ given in (4). Here we are determining the
expectation over $\hat{a}$ analytically. Hence, it is a 'location swindle'.
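The location swindle can be illustrated in the simplest setting, where the data are exactly normal with mean 0 and variance 1. There the sample mean $\bar{X}$ is $N(0, 1/n)$ and independent of the residuals, so for a location-equivariant statistic $T$ the conditional probability $P\{T < k \mid \text{residuals}\}$ equals $\Phi(\sqrt{n}\,(k - (T - \bar{X})))$, and averaging this quantity replaces averaging a 0/1 indicator. The sketch below covers only this normal case; the function name, seed and repetition count are illustrative.

```python
import math
import random
import statistics

PHI = statistics.NormalDist().cdf  # standard normal cdf


def swindle_prob_below(statistic, k, n, reps=1000, seed=0):
    """Location-swindle estimate of P{T < k} for N(0, 1) samples of size n.

    Assumes `statistic` is location equivariant, so T - xbar depends on the
    sample only through the residuals x_i - xbar.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = statistics.fmean(x)
        d = statistic(x) - xbar
        # Expectation over xbar ~ N(0, 1/n) is taken analytically.
        total += PHI(math.sqrt(n) * (k - d))
    return total / reps


if __name__ == "__main__":
    print(swindle_prob_below(statistics.median, 0.0, 5))
    print(swindle_prob_below(statistics.fmean, 0.5, 5))
```

When the statistic is the sample mean itself, the residual part vanishes and the estimate reduces to $\Phi(\sqrt{n}\,k)$ with no simulation error, which gives a convenient sanity check.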
The symmetry of the distribution of T can be exploited to evaluate the probability in (5)
as the average
$$\tfrac{1}{2}\left[P\{T < k\} + P\{T > -k\}\right]$$
where
To answer these questions, a simulation experiment consisting of two parts was carried out:
first the control limits were determined and then, using a further independent set of
observations, the performance was evaluated.
To fix our ideas, let us consider the first question regarding the performance of the
robust estimators under normality. Here the simulation experiment consisted of the following
steps.
(1) A sample of size $n$ was taken using the IMSL subroutine GGNML. For this sample
the estimators $T_k$ and $S_k$ were evaluated using estimating procedure $k$, for $k = 1, 2, \ldots, 8$.
(2) To simulate data from m rational subgroups, the procedure in (1) was repeated m
times and the average of each of the estimators was determined.
(3) The simulated experiment consisted of averaging these values over $N$ repetitions.
The control limits were then formed:
$$\mathrm{UCL}_k = T_k + \frac{3S_k}{A_k\sqrt{n}}, \qquad \mathrm{CL}_k = T_k, \qquad \mathrm{LCL}_k = T_k - \frac{3S_k}{A_k\sqrt{n}}, \qquad k = 1, 2, \ldots, 8.$$
Table 3 has these simulated control limits for the eight estimators using the normal
distribution. The results are based on $n = 5$, 10 and 20 with $m = 25$ in each case.
The simulation size $N$ was taken as 1000.
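(4) To study the behavior and comparative performance of the robust control charts, the
tail probabilities
$$\gamma_k = P\{T_k > \mathrm{UCL}_k\} + P\{T_k < \mathrm{LCL}_k\} \qquad (6)$$
were calculated using the location swindle. From equation (3) we recall that, under normality,
$y_i = 1$ and $\hat{a} = \bar{X}$.
To compute $\gamma$, a further set of $N = 1000$ samples of size $n$ were generated from the
normal distribution. For the $j$th sample the values of the summary statistics
corresponding to the $k$th estimating procedure
$$T_{kj},\ \bar{X}_j \quad \text{for } k = 1, 2, \ldots, 8;\ j = 1, 2, \ldots, N, \qquad (7)$$
were generated and retained for determining the probabilities. In the special case of
the normal distribution, the formulae for calculation of the tail probabilities given in
equation (5) reduce to
$$P\{T_k \ge \mathrm{UCL}_k\} = \frac{1}{N}\sum_{j=1}^{N}\left[1 - \Phi\left(\sqrt{n}\,(\mathrm{UCL}_k - T_{kj} + \bar{X}_j)\right)\right],$$
$$P\{T_k \le \mathrm{LCL}_k\} = \frac{1}{N}\sum_{j=1}^{N}\Phi\left(\sqrt{n}\,(\mathrm{LCL}_k - T_{kj} + \bar{X}_j)\right). \qquad (8)$$
Using equation (6), a simulated value of $\gamma_k$ was computed for each of the eight
estimators.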
Now suppose that it is suspected that the observations may be non-normal. Then the focus
of the study is directed at the robustness of the estimators in the presence of non-normality.
The non-normal distributions that we have considered are discussed above. To examine the
effect of non-normality, the simulation experiment outlined above was repeated for each of
the non-normal models. Recall that the non-normal variables are generated as $X = Z/Y$ with
$Z \sim N(0,1)$ and $Y$ chosen appropriately. For example, to generate $X \sim t$ on $\nu$ df, $Y$ was
Table 3. Control limits for samples of size n, m = 25 subgroups, and tail probabilities determined using the swindle technique
with a simulation size of N = 1000

Estimator   n = 5: limits     tail prob.   n = 10: limits      tail prob.   n = 20: limits      tail prob.

N(0,1)
1           .339, .344        0.002697     -0.9530, 0.9528     0.002583     -0.6724, 0.6711     0.002662
2           .338, .343        0.002721     -0.0526, 0.9524     0.002595     -0.6729, 0.6716     0.002644
3           .354, .358        0.004626     -0.0533, 0.9540     0.003125     -0.6726, 0.6713     0.002878
4           .352, .357        0.004730     -0.9535, 0.9540     0.003253     -0.6726, 0.6713     0.002963
5           .361, .367        0.010627     -0.9541, 0.9518     0.009987     -0.6752, 0.6705     0.012655
6           .351, .356        0.008248     -0.9553, 0.9552     0.005641     -0.6732, 0.6714     0.005663
7           -0.346, .350      0.006281     -0.9548, 0.9545     0.003984     -0.6730, 0.6714     0.003582
8           .347, .351        0.005123     -0.9549, 0.9550     0.003702     -0.6226, 0.6712     0.003589

t-distribution with 3 df
1           -2.013, 2.002     0.021939     -1.487, 1.487       0.014790     -1.072, 1.768       0.008191
2           -2.028, 2.017     0.021475     -1.575, 1.575       0.012028     -1.232, 1.228       0.002774
3           .666, .661        0.011335     -1.224, .227        0.008898     -0.9094, 0.9054     0.007452
4           .667, .661        0.010697     -1.225, 1.225       0.006937     -0.9092, 0.9052     0.004244
5           1.719, .710       0.015454     -1.144, 1.142       0.009556     -0.7844, 0.7804     0.011674
6           1.738, .723       0.011186     -1.183, 1.184       0.007068     -0.8175, 0.8144     0.005083
7           1.770, 1.762      0.010653     -1.221, .223        0.007242     -0.8504, 0.8473     0.004287
8           .769, .762        0.010124     -1.215, .217        0.007566     -0.8421, 0.8383     0.004648

t-distribution with 9 df
1           1.497, 1.496      0.004852     -1.072, .074        0.003127     -0.7624, 0.7560     0.003294
2           1.504, 1.503      0.004682     -1.094, .095        0.002599     -0.7983, 0.7920     0.002123
3           1.431, .431       0.006599     -1.030, .031        0.004019     -0.7381, 0.7310     0.003830
4           1.445, .445       0.006273     -1.030, .030        0.003800     -0.7380, 0.7308     0.003427
5           1.464, 1.460      0.013061     -1.010, .009        0.010005     -0.7089, 0.7021     0.013353
6           1.469, 1.460      0.009365     -1.029, .028        0.005157     -0.7248, 0.7164     0.005039
7           1.464, 1.463      0.007694     -1.039, .038        0.003846     -0.7333, 0.7260     0.003772
8           1.463, .463       0.007104     -1.037, .037        0.003803     -0.7297, 0.7221     0.003845
generated as $\sqrt{\chi^2_\nu/\nu}$ using the IMSL subroutine GGCHS. To determine the control limits, the
first three steps were repeated for each of the non-normal models. Again the values of $A$ given
in Table 2 based on the normal distribution were used.
The fourth step had to be modified to incorporate the non-normality. In this case the
summary statistics for the $j$th repetition are given by
$k = 1, 2, \ldots, 8$; $j = 1, 2, \ldots, N$ (9)
$P\{T_k \ge \mathrm{UCL}_k\}$
The control limits and $\gamma$ for a selection of non-normal distributions are given in Table 3.
Results
In Table 3 the estimation procedures are compared for each of the distributions considered
in the paper: (a) $N(0,1)$; (b) $t$-distribution on three degrees of freedom (df); (c) $t$-distribution
on nine df; (d) outlier model with one observation from $N(0,9)$; (e) outlier model with one
observation from $N(0,16)$; (f) outlier model with one observation from $N(0,100)$; (g)
mixture with $\alpha = 0.1$; (h) mixture with $\alpha = 0.2$; (i) slash distribution.
As one would expect, in the case of the normal distribution the control limits using the
standard deviation and the range have the shortest width. The probability of the false alarm
with the process in control is equal to the one prescribed. The small deviations from the
target value of 0.0027 are due to simulation. The robust procedures, on the other hand, are
not good in terms of the width and the probability of false alarm. For all the six estimators,
these quantities are larger than those for the standard deviation and the range. However, as
the sample size increases to 20 the performance of estimators 3 and 4 improves, attaining
those achieved by the standard deviation and the range. Unfortunately, samples as large as
20 may be impractical in a statistical quality control situation.
In the non-normal situations the performance is varied under the different estimators
suggested here. We will look at their performance for each non-normal model in terms of the
two criteria used in the normal case: width of the interval and probability of a false alarm.
The $t$-distribution is closest to the normal distribution, especially when the degrees of
freedom are large. For $\nu = 3$ the estimators are all poor performers when $n = 5$, giving
relatively wide limits with a very large probability of false alarm. The procedure using the
range improves when the sample size increases to 20, yielding $\gamma$ close to the prescribed level
of 0.0027. Winsorized estimators along with wave and Hampel estimators are better than the
other three procedures. The median/MAD is by far the worst. If the degrees of freedom
increase to 9, the performance of all estimators improves when judged in terms of the width
of the intervals. However, the probabilities of false alarm, while being reduced to a large
extent, deviate considerably from the prescribed value of 0.0027 when $n$ is small. As $n$ goes
to 20 the width is close to the ones obtained for the normal distribution. Moreover, the
probabilities of false alarm are brought much closer to 0.0027. The procedure using the
standard deviation performs the best in this respect.
In the outlier model, there are three different types of disturbance included in the study,
with the variances of the single outlier being 9, 16 and 100, respectively. As one would
expect, the performance in the first case is the closest to the one without any outliers while
in the case when $\sigma^2 = 100$ the charts using the standard deviation and the range are adversely
affected. It is interesting to note that the robust procedures are indeed quite robust, providing
a protection of nearly 0.0027 for all sample sizes. The Winsorized method gives the best
results among all the methods examined, while the median/MAD is the worst even for large
sample sizes. This is reflected in the procedure using the range.
With the non-normality induced by the contamination model with proportion of the
contamination in the distribution being 0.1, none of the procedures seems to provide
acceptable levels of protection even if one is willing to deviate from the prescribed level of
0.0027. Sample size does not afford any improvement. The performance becomes worse
when the contamination proportion is increased to 0.2.
Finally, with the slash distribution the widths, as well as the probabilities of false alarm,
are very high. As n goes to 20 the range and the Winsorized charts provide adequate
protection against this type of problem.
References
ANDREWS, D.F., BICKEL, P.J., HAMPEL, F.R., HUBER, P.J., ROGERS, W.H. & TUKEY, J.W. (1972) Robust
Estimation of Location: Survey and Advances (Princeton, NJ, Princeton University Press).
GROSS, A.M. (1973) A Monte Carlo swindle for estimators of location. Applied Statistics, 22, pp. 347-353.
GROSS, A.M. (1976) Confidence interval robustness with long-tailed symmetric distributions. Journal of the
American Statistical Association, 71, pp. 409-416.
IMSL (1980) (Houston, Texas).
MONTGOMERY, D.C. (1991) Introduction to Statistical Quality Control, 2nd edn (New York, John Wiley and
Sons).
SIMON, G. (1976) Computer simulation swindles, with applications to estimates of location and dispersion.
Applied Statistics, 25, pp. 266-274.
TIKU, M.L. (1967) Estimating the mean and standard deviation from a censored normal sample. Biometrika,
54, pp. 155-165.
YUEN, K.K. (1971) A note on Winsorized t. Applied Statistics, 20, pp. 297-304.