Croston Method
Abstract
Intermittent demand appears at random, with many time periods having no demand. Manufacturers perceive the
forecasting of intermittent data to be an important problem. In practice, the standard method of forecasting
intermittent demand is single exponential smoothing, although some production management texts suggest the
lesser-known alternative of Croston’s method [Croston, J.D., 1972, Forecasting and stock control for intermittent
demands, Operational Research Quarterly, 23(3), 289-303]. We compared the two methods, using artificial data created to
violate Croston’s assumptions and real-world data from industrial sources. We conclude that Croston’s method is
robustly superior to exponential smoothing and could provide tangible benefits to manufacturers forecasting
intermittent demand.
Table 1
Description of industrial sources and datasets
Company  Product type          # Series  Description of data
A        Electrical equipment  26        36 months of shipments
B        Jet engine tools      16        910 days of shipments; most active items chosen
C        Veterinary health     6         28 months of shipments
D        Consumer food item    6         56 to 210 weekly shipments to 1 warehouse of 1 product in 6 package sizes (a)
(a) The difference in data count arose because some package sizes were new.
T.R. Willemain et al. / International Journal of Forecasting 10 (1994) 529-538
Table 2A
Intermittent shipment data: product type
Data features          Electrical   Jet engine   Veterinary   Consumer food
                       equipment    tools        health       (6 pack sizes)
# Items                4 (a)        16           6            6
# Observations/item    36           910          28           56 to 210 (b)
# Pairs/item (c)       5 to 19      13 to 17     4 to S       36 to 175
Time unit              Month        Day          Month        Week

Table 2B
Intermittent shipment data: summary statistics
Product type: Electrical equipment, Jet engine tools, Veterinary health, Consumer food (6 pack sizes)
High Low High Low High Low High Low
Intervals
Mean 1.8 4 45 449 3 7 i.* 1.72
Median 1 3 14 449 2.5 6 1 1
Maximum 4 8 169 694 6 16 4 10
Minimum 1 1 1 261 1 1 1
CV 61% 87% 21% 139% 60% 118% 43% 81%
Autocorrelation -0.43 0.25 -0.53 0.25 -0.17 0.25 0.07 0.39
Sizes (d)
Mean 2.1 13.6 1.2 4 6.8 7764 358 5991
Median 1 12 1 4 1 8000 320 3015
Maximum 1 26 1 9 10 10080 800 48160
Minimum 1 4 1 1 1 4975 34 80
CV 41% 180% 45% 106% 34% 153% 58% 184%
Autocorrelation -0.21 0.12 -0.33 0.41 -0.72 -0.19 -0.08 0.16
% Nonzero 25% 53% 0.2% 2% 14% 29% 58% 85%
Both
Crosscorrelation -0.42 -0.01 -0.33 0.33 -0.29 0.31 -0.22 0.05
(a) Only 4 of the 26 series had enough nonzero data values for statistical analysis.
(b) Some of the package sizes were new and so had fewer observations.
(c) Number of interval-size pairs used to estimate auto- and crosscorrelations.
(d) Summary statistics are based on nonzero shipment sizes.
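The statistics in Table 2B can be illustrated with a short sketch. It is not the authors' code: the function names are hypothetical, the first interval is assumed to be counted from the start of the series, the standard deviation is assumed to be the population form, and the autocorrelation is assumed to be at lag 1, since the text does not state these choices.

```python
from statistics import mean, pstdev


def interval_size_pairs(series):
    """Return one (interval, size) pair per nonzero demand.

    Assumption: the first interval is measured from the start of the series.
    """
    pairs, gap = [], 0
    for y in series:
        gap += 1
        if y != 0:
            pairs.append((gap, y))
            gap = 0
    return pairs


def cv_percent(values):
    """Coefficient of variation: standard deviation as a percentage of the mean."""
    return 100.0 * pstdev(values) / mean(values)


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def lag1_autocorrelation(values):
    """Correlation between each value and its successor (lag 1 is an assumption)."""
    return correlation(values[:-1], values[1:])


# Illustrative data only, not taken from the paper.
pairs = interval_size_pairs([0, 0, 5, 0, 3, 4, 0, 0, 0, 2])
intervals = [p[0] for p in pairs]
sizes = [p[1] for p in pairs]
print(cv_percent(intervals), correlation(intervals, sizes))
```

With only a handful of pairs per series, as in several of the industrial datasets, such correlation estimates are very noisy.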
length, which expresses the standard deviation as a percentage of the mean, provides a useful summary. Many series had intervals with coefficients of variation exceeding 100%. Shipment sizes were also highly variable, with coefficients of variation from 34% to 184%.

3.2. Autocorrelations and crosscorrelations

Earlier studies of intermittent demand looked at the magnitude and variability of demand intervals and sizes, but none reported auto- and crosscorrelations. Table 2 shows that some substantial positive and negative auto- and crosscorrelations arose in all four datasets. This suggests that the stochastic models of intermittent demand in the literature may be too simplistic in their assumptions that successive interval durations are independent, successive demand sizes are independent, and intervals and sizes are mutually independent.

Unfortunately, the sparseness of intermittent data makes it difficult to estimate correlations. As a result, only two correlations in our datasets reached statistical significance at the 0.05 level. Rather than interpreting this as justifying the independence assumptions in the literature, we consider the question still open. Indeed, the observation that all interval autocorrelations reported for the consumer food items were positive (p < 0.02 by sign test), all size autocorrelations for the veterinary health products were negative (p < 0.02), and all crosscorrelations for electrical equipment were negative (p < 0.07) casts doubt on the independence assumptions.

4. Forecasting methods

We compare single exponential smoothing with Croston’s method. The former is the statistical forecasting method used most often in practice; the latter is the variant advocated in leading production texts. First we explain the two methods. Then we describe a Monte Carlo analysis of the robustness of Croston’s method against complications. Finally, we report on comparisons using data from our industrial sources.

The classic work on intermittent demand forecasting is that of Croston (1972), as corrected by Rao (1973). Croston noted that exponential smoothing systems are in common use in large inventory control systems, but that exponential smoothing leads to stock levels and stock outages that are both needlessly high. Croston’s ingenious alternative first converts the sequence of intermittent demands into two constituent sequences: one for the intervals between demands, the other for the sizes of the demands. Exponential smoothing is then applied to each series separately, with updating occurring only at moments of demand. Croston assumed demand to occur as a Bernoulli process, making the intervals between demands iid (independent and identically distributed) geometric. He further assumed the demand size to be iid Normal. Our simulations tested the effect of violating these assumptions.

The following notation parallels that of Croston. Let

$x_t$ = binary indicator of demand at time $t$;
$z_t$ = size of demand;
$y_t = x_t z_t$ = demand for an item at time $t$;
$\mu$ = mean value of demand when nonzero;
$\sigma^2$ = variance of demand when nonzero;
$p$ = average number of time periods between demands;
$\alpha$ = smoothing parameter;
$y'_t$ = exponential smoothing estimate of mean demand per period;
$y^{*}_t$ = exponential smoothing estimate made immediately after a demand occurs;
$q$ = time interval since last demand;
$p''_t$ = Croston’s estimate of mean interval between demands;
$z''_t$ = Croston’s estimate of mean demand size;
$y''_t$ = Croston’s estimate of mean demand per period.

When demand is stable, the goal of the forecasting methods is to estimate the mean demand per period, $\mu / p$.
deviation (MAD). To save space, and because the relative results were the same for all four measures, we report only the MAPEs. We made pairwise comparisons of the two forecasting methods for each artificial data series, thereby removing the effect of inter-series variation and increasing the power of the comparison.

For each method, following Croston, we computed the performance measures using only the estimates of mean demand per period made immediately after a demand. The error in each period was the difference between the current estimate of demand per period and the known, constant value.

Each scenario had three or four subscenarios. In each subscenario, we generated five data series long enough to contain 1000 pairs of demand sizes and intervals. With series this long and the use of paired comparison, the sample size of five was sufficient.

Unlike Croston, who assumed a Normal distribution of demand size, we took the distribution to be lognormal. Our data analysis showed the log transformation to be, in general, useful at symmetrizing the distributions of demand sizes.

We set the mean interval between demands at three time periods. Johnston (1980) noted that the difficulties with using exponential smoothing on intermittent data become apparent when the mean interval between demands is greater than two periods. Longer mean intervals would have increased the relative advantage of Croston’s method.

The performance of both methods depended on the value of the exponential smoothing parameter $\alpha$. We replicated all simulations with three values of $\alpha$: 0.01, 0.1, and 0.5. Since the artificial data series were constructed to have no shifts in mean, lower values of $\alpha$ always provided better forecasts. In practice, mean demand per period could change, requiring the use of higher values of $\alpha$. To both conserve space and be consistent with Croston’s empirical finding that values of $\alpha$ between 0.1 and 0.2 worked well in practice, we report only results for $\alpha = 0.1$. The proportional advantage of Croston’s method was the same for the other values of $\alpha$.

5.2. Scenarios

We designed four scenarios corresponding to violations of Croston’s assumptions. Table 3 provides technical details of the scenarios and their subscenarios.

Scenario 1: Intervals and sizes uncorrelated. This scenario was essentially the one analyzed by Croston. Our scenario differed only in that the distribution of demand was lognormal rather than Normal.

Scenario 2: Intervals correlated with sizes. Croston assumed mutual independence between intervals and sizes. We investigated both positive and negative crosscorrelations.

Scenario 3: Sizes autocorrelated. Croston assumed demand sizes to be independent. We investigated both positive and negative autocorrelations.

Scenario 4: Intervals autocorrelated. Croston assumed the binary demand indicator variable $x_t$ to be a Bernoulli process. This implies that the intervals between demands are iid geometric. We investigated both positive and negative autocorrelations. To generate correlated geometric intervals, we first devised a new method for creating correlated uniform random deviates (Willemain and Desautels, 1993), then transformed the uniforms into geometrics.

Table 3
Technical details of Monte Carlo scenarios
Scenario #   Demand intervals    Demand sizes
1            iid Geometric (3)   iid Lognormal ($\mu$, $\sigma$)
                                 $\mu$: 2, 2, 10, 10
                                 $\sigma$: 0.25, 3, 0.25, 3
2            iid Geometric (3)   Lognormal ($\mu$, $\sigma$)
                                 $\mu = 5 \cdot \mathrm{Interval}^{\beta}$

5.3. Monte Carlo comparison results

In all cases, Croston’s method provided more accurate estimates of the true constant demand per period, using all four measures of accuracy. Table 4 reports MAPEs averaged over the five experimental data series in each subscenario. The MAPEs for Croston’s method were generally 10 to 20 points lower than for exponential smoothing. MAPE improvements of this size would be economically significant. All MAPE differences were statistically significant using the paired t test.

We remark now on details of the four individual scenarios.

Scenario 1: Intervals and sizes uncorrelated. The subscenario offering the greatest forecasting challenge had demand size coefficient of variation greater than 100% ($\mu = 2$, $\sigma = 3$). The advantage of Croston’s method was greatest in this case, reducing MAPE from 51% to 30%.

Scenario 2: Intervals correlated with sizes. The subscenarios with an inverse relationship between demand size and interval ($\beta = -0.5$) were the most challenging. They also produced the greatest accuracy improvements for Croston’s method.

Scenario 3: Sizes autocorrelated. Accuracy was quite insensitive to the autocorrelation of demand sizes. However, Croston’s method was uniformly superior.

Scenario 4: Intervals autocorrelated. The subscenario with high positive correlation ($\rho = 0.8$) between successive demand intervals produced the largest forecasting errors in the experiment. Even in this case, Croston’s method reduced MAPE by 10 points.

In all comparisons, Croston’s method was more accurate than single exponential smoothing. The MAPE differences were of both statistical and practical significance and reflected similar results for MdAPE, RMSE, and MAD. We conclude that Croston’s method is quite robust, with practical value beyond that claimed in Croston’s original paper.

6. Comparison using industrial data

In the Monte Carlo study, we established the superiority of Croston’s method over single exponential smoothing for several types of simulated data. We then compared the two methods using the real-world data summarized in Tables 1 and 2.

6.1. Background conditions

With the real-world data, we focused our attention on the MAPE for one-step-ahead forecasts, comparing forecasted values per period with actual values (both zero and nonzero). The choice of MAPE allows for scale-free comparisons across datasets and permits comparison with the Monte Carlo results. The choice of one-step-ahead errors corresponded to the common practice in industry of reviewing inventory levels every time period. We allowed the smoothing parameter $\alpha$ to take on different values for each method, using a grid search over the conventional range from 0.01 to 0.9.

We had to specify initialization procedures for the two methods, since these can be very influential in short series. We used two procedures. In the first, which we called ‘blind’ initialization, we
Table 4
Monte Carlo comparison of forecast accuracy
Scenario 1: Intervals and sizes uncorrelated
$\mu$   $\sigma$   Mean demand    MAPE (%)
                   per period     Croston   Expo smoothing
2       0.25       0.69           16        28
2       3          0.65           30        51
10      0.25       3.33           15        27
10      3          3.33           16        30
did not look ahead at the data and began with $y'_0 = 0$, $z''_0 = 0$, $p''_0 = 1$. In the second, we used the interval to the first demand $q_1$ and the size of the first demand $y_1$ to set $y'_1 = y_1 / q_1$, $z''_1 = y_1$, $p''_1 = q_1$.

6.2. Forecast results for industrial data

The comparison using industrial data also favored Croston’s method over exponential smoothing. Table 5 shows the reduction in MAPE when we initialized the calculations without reference to the data (i.e., blind initialization). On average, Croston’s method was more accurate for all four companies’ data. The average size of the reductions in MAPE ranged from 1% for Company B (daily data) to 14% for Company A. The down-side risk reflected in the worst results was relatively small (-3% to -1%), while the chances of Croston’s method being more accurate for a given series were excellent, ranging from 83% to 100% across companies. Furthermore, the up-side potential reflected in the best results was substantial (7% to 27%). Similar results, not reported here, held for the other initialization procedure, which
Table 5
Forecast error reduction with Croston’s method, by company
Company  Type of product       # Data   Absolute reduction in MAPE (a)     Percentage of series
                               series   Best case   Average   Worst case   with reduced MAPE
A        Electrical equipment  4        24%         14%       7%           100%
C        Veterinary health     6        27          10        1            100
D        Consumer food         6        7           3         -1           83
(a) MAPE (single exponential smoothing) - MAPE (Croston’s method) for one-step-ahead forecasts.
assumed some foreknowledge of the size and timing of demands.

To see the effects of temporal aggregation, we computed MAPE reductions for the data from Company B in both daily and weekly forms. Forecasting the same data in weekly form reduced MAPEs significantly and increased the improvement derived from using Croston’s method. Croston’s method showed the smallest improvement for daily data for Company B, which had the largest proportion of zero values (see Table 2), and also for Company D, which had the smallest proportion of zero values. This suggests that there may be some optimal degree of intermittency, from the point of view of switching from exponential smoothing to Croston’s method. Perhaps too many zero values make it essentially impossible to forecast well using any statistical method, while too few zero values make it unnecessary to abandon exponential smoothing.

7. Summary and conclusions

Intermittent demand appears at random, with many time periods having no demand. Manufacturers perceive forecasting intermittent demand to be an important problem. Especially when forecasting large numbers of items, practitioners tend to use single exponential smoothing, although leading production management texts suggest the lesser-known alternative of Croston’s method (1972).

Croston’s method makes separate exponential smoothing forecasts of demand sizes and the intervals between demands, then divides these to forecast average demand per period. Croston showed the advantages of his method, given assumptions of independence and Normality. However, we found correlations and distributions in real-world data that violated Croston’s assumptions. This prompted us to compare exponential smoothing and Croston’s method under less idealized conditions.

We first compared the two methods using artificial data that violated Croston’s assumptions. The kinds of data most troublesome for both Croston’s method and exponential smoothing had a highly skewed distribution of demand size and positive autocorrelation of intervals between demands. Croston’s method was superior in forecasting the artificial data series, generally reducing MAPE by 10 to 20 points.

We also compared the two methods using real-world data from four companies in different
industries. Here too, Croston’s method improved accuracy. The probability of improvement was high for all four companies. The average reduction in MAPE was modest for two of the companies and consistent with the Monte Carlo results for the other two.

Perhaps by default, most companies now use exponential smoothing to forecast intermittent demand. However, in our study, Croston’s method proved more accurate than exponential smoothing. Because Croston’s method involves fairly simple calculations appropriate to forecasting large numbers of low-volume items, production planners and inventory managers facing intermittent demand would benefit from switching to Croston’s method.

Acknowledgement

This research was supported by a Small Business Innovation Research grant given by the National Science Foundation to Smart Software, Inc. We appreciate the assistance of Dr. Nelson Hartunian in conducting the research and the help of the referees in sharpening the presentation. Preliminary results were presented at the Eleventh International Symposium on Forecasting, New York City, June 1991.

References

Buffa, E.S. and J.G. Miller, 1979, Production-Inventory Systems: Planning and Control, 3rd edn. (Irwin, Homewood, IL).
Croston, J.D., 1972, Forecasting and stock control for intermittent demands, Operational Research Quarterly, 23(3), 289-303.
Dunsmuir, W.T.M. and R.D. Snyder, 1989, Control of inventories with intermittent demand, European Journal of Operational Research, 40, 16-21.
Hax, A.C. and D. Candea, 1984, Production and Inventory Management (Prentice-Hall, Englewood Cliffs, NJ).
Johnston, F.R., 1980, An interactive stock control system with a strategic management role, Operational Research Quarterly, 31, 1069-1085.
Rao, A.V., 1973, A comment on “Forecasting and stock control for intermittent demands”, Operational Research Quarterly, 24(4), 639-640.
Schultz, C.R., 1987, Forecasting and inventory control for sporadic demand under periodic review, Journal of the Operational Research Society, 37, 303-308.
Silver, E.A., 1981, Operations research in inventory management: a review and critique, Operations Research, 29(4), 628-645.
Silver, E.A. and R. Peterson, 1985, Decision Systems for Inventory Management and Production Planning, 2nd edn. (Wiley, New York).
Swain, W. and B. Switzer, 1980, Data analysis and the design of automatic forecasting systems, Proceedings of the Business and Economic Statistics Section, American Statistical Association, 219-223.
Tavares, L.V. and L.T. Almeida, 1983, A binary decision model for the stock control of very slow moving items, Journal of the Operational Research Society, 34(3), 249-252.
Watson, R.B., 1987, The effects of demand-forecast fluctuations on customer service and inventory cost when demand is lumpy, Journal of the Operational Research Society, 38(1), 75-82.
Willemain, T.R. and P.A. Desautels, 1993, A method to generate autocorrelated uniform random numbers, Journal of Statistical Computation and Simulation, 45, 23-31.
Williams, T.M., 1982, Reorder levels for lumpy demand, Journal of the Operational Research Society, 33(2), 185-
Williams, T.M., 1984, Stock control with sporadic and slow-moving demand, Journal of the Operational Research Society, 35(10), 939-948.
Wright, D.J., 1986, Forecasting data published at irregular time intervals using an extension of Holt’s method, Management Science, 32, 499-510.

Biographies: Thomas R. WILLEMAIN is Senior Vice President of Smart Software, Inc. and Associate Professor in the Department of Decision Sciences and Engineering Systems at Rensselaer Polytechnic Institute. He holds a BSE from Princeton University and a SM and PhD from MIT.

Charles N. SMART is Chief Executive Officer of Smart Software, Inc. He holds BA and MA degrees in Applied Mathematics from Harvard University and a MBA degree from MIT.

Philip A. DeSAUTELS holds BS and MS degrees in Industrial and Management Engineering from Rensselaer Polytechnic Institute. He is now with the IBM Corporation.

Joseph H. SHOCKOR holds a BS degree from the United States Military Academy and a MS in Operations Research and Statistics from Rensselaer Polytechnic Institute. He is now with the National Aeronautics and Space Administration.