Applying Reliability Models More Effectively

MICHAEL R. LYU, University of Iowa
ALLEN NIKORA, Jet Propulsion Laboratory, Caltech

Combining the results of individual models may give more accurate predictions than using component models alone, providing a general reliability method across projects.

More than 40 software-reliability models have been created since the first one appeared in 1972.¹ As new projects arose, new reliability models were created to suit them. Now software engineers have a plethora of reliability models, none of which work optimally across projects.

Consequently, a major difficulty in software measurement is analyzing the context in which measurement is to take place to determine beforehand which model is likely to be trustworthy. Because software development and operation involve many intricate human activities and because software failure patterns are uncertain, such determinations are difficult, if not impossible. Also, project data varies considerably and often does not comply with a model's underlying assumptions. There is no way of knowing in advance which model is likely to produce the most trustworthy predictions.

Instead of developing more detailed, and potentially more complicated, models, we chose to focus on using existing models more effectively. To this end, we have developed a set of linear combination models that combine the results of single, or component, models. As measured by statistical methods for determining a model's applicability to a set of failure data, a combination model tends to have more accurate short-term and long-term predictions than a component model. After evaluating these models using both historical data sets and data from recent Jet Propulsion Laboratory projects, we have found that they are consistently satisfactory. To make it easier to apply reliability models and to form combination models, we are developing a tool to automate most reliability-measurement tasks.
WELL-KNOWN RELIABILITY MODELS

The traditional software-reliability model is a set of techniques that apply probability theory and statistical analysis to software reliability. A reliability model specifies the general form of the dependence of the failure process on the principal factors that affect it: fault introduction, fault manifestation, failure detection and recovery, fault removal, and the operational environment. The primary goal of these models is to assess current reliability and forecast future reliability.

We believe the component models from two reliability-measurement tools, Statistical Modeling and Estimation of Reliability Functions for Software and Software Reliability Modeling Programs, are the most frequently used. SMERFS is the only tool that allows multiple time-domain and interval-domain parameter-estimation procedures. It is available free from William Farr, Naval Surface Warfare Center, Code K-52, Dahlgren, VA 22448. SRMP is the only tool that offers methods to analyze prediction quality; it provides users with graphics (like plots of the bias and noise of the prediction errors) and is available from City University, London, UK.

The models these tools offer are:

+ Jelinski-Moranda (JM): One of the earliest models, it assumes failures occur purely randomly and that all faults contribute equally to total unreliability. When a failure occurs, it assumes the fix is perfect; thus, the program's failure rate improves by the same amount at each fix. (Z. Jelinski and P. Moranda, "Software Reliability Research," in Statistical Computer Performance Evaluation, W. Freiberger, ed., Academic Press, New York, 1972, pp. 465-484.) Available in SMERFS and SRMP.

+ Bayesian Jelinski-Moranda (BJM): Essentially the same as JM, this model uses a Bayesian inference scheme rather than maximum likelihood. (A. Abdel-Ghaly, P. Chan, and B. Littlewood, "Evaluation of Competing Software Reliability Predictions," IEEE Trans. Software Eng., Sept. 1986, pp. 950-967.) Available in SRMP.

+ Schneidewind (SM): Similar to JM, this model's philosophy is that the error-detection process changes as testing progresses and that recent error counts are usually more useful than earlier counts in predicting future counts. (N. Schneidewind, "Analysis of Error Processes in Computer Software," Sigplan Notices, June 1975, pp. 337-346.) Available in SMERFS.

+ Geometric (GM): A variation of JM, this model does not assume a fixed, finite number of program errors, nor does it assume that errors are equally likely to occur. (P. Moranda, "Event-Altered Rate Models for General Reliability Analysis," IEEE Trans. Reliability, Dec. 1979, pp. 376-381.) Available in SMERFS.

+ Generalized Poisson (PM): Similar to JM, except within the error-count framework. (R. Schafer et al., "Validation of Software Reliability Models," Tech. Report RADC-TR-79-147, Rome Air Development Ctr., Rome, N.Y., 1979.) Available in SMERFS.

+ Goel-Okumoto (GO): Similar to JM, except it assumes the failure rate improves continuously in time. (A. Goel and K. Okumoto, "Time-Dependent Error-Detection Rate Model for Software Reliability and Other Performance Measures," IEEE Trans. Reliability, Aug. 1979, pp. 206-211.) Available in SMERFS and SRMP.

+ Musa-Okumoto (MO): Similar to GO, except it attempts to consider that later fixes have less effect on program reliability than earlier ones. (J. Musa and K. Okumoto, "A Logarithmic Poisson Execution Time Model for Software Reliability Measurement," Proc. Int'l Conf. Software Eng., IEEE CS Press, Los Alamitos, Calif., 1984, pp. 230-238.) Available in SMERFS and SRMP.

+ Yamada Delayed S-Shape (YM): Similar to GO, except it accounts for the learning period that testers go through as they become familiar with the software at the start of testing. (S. Yamada, M. Ohba, and S. Osaki, "S-Shaped Reliability Growth Modeling for Software Error Detection," IEEE Trans. Reliability, Dec. 1983, pp. 475-478.) Available in SMERFS.

+ Littlewood (LM): Similar to JM, except it assumes that different faults have different sizes (contribute unequally to unreliability), which is more realistic. Larger faults tend to be removed earlier, causing a "law of diminishing returns" in debugging. (B. Littlewood, "Stochastic Reliability Growth: A Model for Fault Removal in Computer Programs and Hardware Designs," IEEE Trans. Reliability, Oct. 1981, pp. 313-320.) Available in SRMP.

+ Littlewood Nonhomogeneous Poisson Process (LNHPP): Similar to LM but assumes a continuous change in failure rate, rather than discrete jumps, when fixes take place. (D. Miller, "Exponential Order Statistic Models of Software Reliability Growth," IEEE Trans. Software Eng., Jan. 1986, pp. 12-24.) Available in SRMP.

+ Littlewood-Verrall (LV): Lets the size of the failure-rate improvement at a fix vary randomly, representing the uncertainty about fault size and the efficacy of the fix. (B. Littlewood and J. Verrall, "A Bayesian Reliability Growth Model for Computer Software," J. Royal Statistics Soc. C, Vol. 22, pp. 332-346.) Available in SMERFS and SRMP.

+ Keiller-Littlewood (KL): Similar to the LV model but has a different mathematical form for reliability growth. (P. Keiller et al., "Comparison of Software Reliability Predictions," Proc. IEEE Int'l Symp. Fault-Tolerant Computing, IEEE CS Press, Los Alamitos, Calif., 1983, pp. 128-134.) Available in SRMP.

+ Brooks and Motley (BM): The BM binomial and Poisson models attempt to consider that not all of a program is tested equally during a testing period and that only some portions of the program may be available for testing during its development. (W. Brooks and R. Motley, "Analysis of Discrete Software Reliability Models," Tech. Report RADC-TR-80-84, Rome Air Development Ctr., Rome, N.Y., 1980.) Available in SMERFS.

+ Duane (DU): Developed for hardware burn-in testing, in which defective system components are detected and replaced in the early days of use. Once again, the model assumes that the failure rate changes continuously in time. (L. Crow, "Confidence Interval Procedures for Reliability Growth Analysis," Tech. Report 197, US Army Materiel Systems Analysis Activity, Aberdeen, Md., 1977.) Available in SMERFS and SRMP.
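To make two of these model forms concrete, the following sketch (ours, not part of the original article; the parameter names and sample values are illustrative) shows the Jelinski-Moranda failure rate, which drops by the same amount at each perfect fix, and the Goel-Okumoto mean value function, whose failure rate improves continuously in time.

```python
import math

def jm_failure_rate(total_faults, per_fault_rate, faults_fixed):
    """Jelinski-Moranda: after i perfect fixes the failure rate is
    phi * (N - i), so every fix improves reliability by the same amount phi."""
    return per_fault_rate * (total_faults - faults_fixed)

def go_expected_failures(a, b, t):
    """Goel-Okumoto: expected cumulative failures by time t is a * (1 - e^(-b*t));
    the corresponding failure rate a * b * e^(-b*t) decreases continuously."""
    return a * (1.0 - math.exp(-b * t))

# Example: a program assumed to contain 100 faults, 20 of them already fixed.
print(jm_failure_rate(total_faults=100, per_fault_rate=0.02, faults_fixed=20))
print(go_expected_failures(a=100, b=0.01, t=150))
```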
LINEAR COMBINATION MODELS

+ Equally Weighted Linear Combination. In the ELC model, each component model's prediction is given a constant, equal weight. The arithmetic average of all component models' predictions is taken as the ELC model prediction:

ELC = (1/3)GO + (1/3)MO + (1/3)LV

These weightings remain constant throughout the predictions.

+ Median-Oriented Linear Combination. The MLC model takes the median of the component models' predictions, discounting a prediction that is far away from the others.

+ Unequally Weighted Linear Combination. The ULC model is similar to the MLC model except that optimistic and pessimistic predictions contribute to the final prediction; the prediction is not determined solely by the median value. Here we use weightings similar to those in the Program Evaluation and Review Technique:

ULC = (1/6)O + (4/6)M + (1/6)P

where O represents an optimistic prediction, P a pessimistic prediction, and M the median prediction.

+ Dynamically Weighted Linear Combination. In the DLC model, we assume that the weights given to the component models should change as prediction proceeds, reflecting how well each model has been predicting recently.

SELECTING MODELS FOR COMPARISON

To compare the combination models' performance, we selected a subset of the component models in SMERFS and SRMP that ranked the highest in the following criteria:

+ Model validity. We viewed this criterion as the most important because we can quantitatively define it; other measures tend to be more subjective. We adopted four measures to rate model validity.

1. Accuracy. We defined accuracy as the prequential likelihood measure, in which the observed data is a sequence of interfailure times and the measure is the product of each model's one-step-ahead predictive densities evaluated at those times:

PL = f_{s+1}(t_{s+1}) × f_{s+2}(t_{s+2}) × ... × f_{s+n}(t_{s+n})

Since this measure is usually very close to zero, we take its logarithmic value for comparison. The resulting number is always negative. Given several models that use the same data set, the model with the largest value gives the most accurate prediction. The other three measures are bias, trend, and noise.
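To make the arithmetic concrete, here is a small sketch (ours, not from the article; the names and sample values are made up) of the ELC, MLC, and ULC combinations and of the log prequential likelihood used as the accuracy measure. Because both extreme predictions receive the same 1/6 weight in ULC, it does not matter numerically which extreme is the optimistic one.

```python
import math
from statistics import median

def elc(predictions):
    """Equally Weighted Linear Combination: the arithmetic mean of the
    component models' predictions (here GO, MO, and LV)."""
    return sum(predictions) / len(predictions)

def mlc(predictions):
    """Median-Oriented Linear Combination: take the median prediction,
    so a prediction far away from the others is discounted."""
    return median(predictions)

def ulc(predictions):
    """Unequally Weighted Linear Combination: PERT-style weights of 1/6 on
    each extreme prediction and 4/6 on the median prediction."""
    low, mid, high = sorted(predictions)  # assumes three component models
    return (1 / 6) * low + (4 / 6) * mid + (1 / 6) * high

def log_prequential_likelihood(density_fns, observations):
    """Accuracy measure: log of the product of one-step-ahead predictive
    densities f_i(t_i); it is always negative, and larger means more accurate."""
    return sum(math.log(f(t)) for f, t in zip(density_fns, observations))

# Example with made-up next-step predictions from GO, MO, and LV.
preds = [12.3, 15.1, 9.8]
print(elc(preds), mlc(preds), ulc(preds))
```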
RECENT PROJECTS FROM THE JET PROPULSION LABORATORY

Voyager. The Voyager 1 and 2 spacecraft were developed during the mid-1970s and launched in mid-1977. Both spacecraft flew past Jupiter and Saturn. Voyager 2 continued exploring the outer solar system by flying past Uranus in 1986 and Neptune in 1989. The Voyagers were one of the first spacecraft in which software provided a large part of the functionality. This software, approximately 14,000 lines of uncommented assembly language, was divided among three real-time embedded subsystems: the Attitude and Articulation Control Subsystem, the Command and Control Subsystem, and the Flight Data Subsystem. The failure data we analyzed comes from spacecraft-system testing, at which point the AACS, CCS, and FDS had been integrated into the spacecraft. Among the items recorded on the problem/failure reports during system test are time of failure, failure type, and the subsystem in which the failure occurred. Roughly 9.5 faults per thousand lines of code were discovered during system test.

Galileo. Launched in 1989, Galileo was developed as a Jupiter orbiter carrying an atmospheric probe. As with the Voyagers, a large fraction of Galileo's functionality was provided by software. Galileo contains an AACS and a Command and Data Subsystem. Approximately 7,000 uncommented source lines of code were implemented for the AACS. As with the Voyagers, the failure data comes from spacecraft-system testing. An estimated 10.2 faults per thousand lines of code were detected.

Galileo CDS. Failure data for the Galileo Command and Data Subsystem during one phase of subsystem-level integration testing was available for analysis. Because one of us had been involved in this testing effort, we could reconstruct some elements of the testing profile. For example, we knew that the number of hours of testing per week was nearly constant throughout the two testing stages. In addition, the main functional areas of the software received roughly the same amount of testing every calendar week. This information made the failure data more accurate than that for other projects. About 15,000 source lines of assembly language were developed for the CDS. During integration testing, roughly 10.1 faults per thousand lines of code were discovered.

Magellan. A large portion of the on-board software for the Magellan Venus radar mapper is derived from Galileo's software. Like Galileo, Magellan has an AACS and a CDS; the number of uncommented source lines of code for each is roughly the same as that for Galileo. As with Galileo and the Voyagers, the failure data comes from the spacecraft-system test period. An estimated 8.0 faults per thousand lines of code were detected during testing.

Alaska SAR. The Alaska Synthetic Aperture Radar facility, installed on the Fairbanks campus of the University of Alaska, is a facility for tracking and acquiring data from Earth-resources satellites in high-inclination orbits. Totaling about 103,000 uncommented source lines of code, the software is written in a mixture of C, Fortran, Equel, and OSL. About 14,000 lines were reused from previous efforts. We obtained the failure data presented here from the development organization's anomaly reporting system during software integration and test. As with the other projects, we assumed the test time per unit interval of calendar time was relatively constant and the testing method remained constant, since this information was not systematically recorded. Largely because of this lack of information, we decided to model the reliability of the facility as a whole, rather than attempt to model the component reliabilities. For the part of system testing that we analyzed, about 3.6 faults per thousand lines of code were discovered.
Table 1. Results for Musa's data set 3: the accuracy, bias, trend, and noise measures (with each model's rank in parentheses) and the overall rank for the JM, GO, MO, DU, LM, LV, ELC, ULC, MLC, and DLC models.
+ Applicability. The model should be usable across structural, functional, and application domains.

+ Simplicity. Simplicity is generally a desirable feature for most mathematical models. The simpler the model, the easier it is to gather project-specific data, select generic parameters from a database, and understand and interpret the modeling results.

+ Insensitivity to noise. Reliability data generally contains information that is irrelevant to the modeling process. A model is appealing if it can make accurate measurements even when failure data is incomplete or contains uncertainties.

After applying the seven evaluation criteria, we found that JM, GO, MO, DU, LM, and LV ranked high enough to warrant further study. We used these six component models plus our four combination models to evaluate prediction validity.

EVALUATING MODEL PERFORMANCE

To assess the prediction validity of our combination models, we evaluated their performance, plus the performance of the six component models we selected, using first three data sets from John Musa's reliability data compiled in 1980⁶ and then data from recent projects at the Jet Propulsion Laboratory. We then evaluated the models in terms of all the data.

To compare models, we first determined each model's rank for each measure. We then equally weighted the ranks by summing them. The models with a lower overall sum were better than those with a higher sum. Of course, others might apply different weights to each measure, and there can always be a "wild" measure that might totally disqualify a model. Nevertheless, we used this simple ranking algorithm without expanding the details of each measure, since such elaborations might involve subjective judgments that could themselves be biased.

Musa data sets. Table 1 shows the results from Musa's data set 3, which contains 207 data points. We began predictions at data point 60 so that we would have a small but reasonable set of data points (1-59) for parameter estimation. The numbers in each row represent the computed measure under each criterion; the ranks are in parentheses. We arrived at the values in Rank (the last row) by summing the ranks. As the table shows, the combination models performed relatively well compared with the six component models. We obtained similar results for Musa's other data sets.

JPL data sets. We collected failure data from recent JPL projects, which are described in the accompanying box. The data we collected was based mainly on calendar times. The following information, which would have been useful, was not available because it was not routinely recorded:

+ Execution times between successive failures or comparable information, such as the total time spent testing during a calendar interval.

+ Operational profile information (like the functional area being tested), referenced to requirements or design documentation, the subsystem being tested, and the points at which the testing method may have changed.

In general, data based on calendar time tends to be noisy and might not comply with most of the reliability models' assumptions. We present it to show circumstances typical of actual practice.
Table 2. Results for the Galileo CDS flight software: model-validity measures and ranks for the JM, GO, MO, DU, LM, LV, ELC, ULC, MLC, and DLC models.
Table 2 shows comparisons when we applied the four measurements of the model-validity criterion to the Galileo Command and Data Subsystem's flight software, which is a representative data set. (Results from all the data sets would take too much space.) The data contains 358 points, and the starting point is 152. As the table shows, the ELC and ULC models ranked the highest.

Combined data sets. Tables 3 and 4 list the performance comparisons for all eight data sets we investigated. In Table 3, we used the four model-validity measures; in Table 4, we used only the accuracy measure, since we thought it was the most important and would give a more detailed breakdown of performance.

We considered a model satisfactory if and only if its ranking was 4 or better. We arrived at the values in Handicap by subtracting 4 (the par rank) from the rank of a model for each data set before we added up its rankings in the overall evaluation (another way is to subtract 32 from Sum of ranks). A negative handicap means that the model's overall performance was satisfactory for the eight data sets. (The sketch following the list below illustrates this bookkeeping.)

These tables illustrate several important points:

+ In general, the combination models perform better than the component models. In Table 3 (all criteria), the only acceptable models (those with a negative handicap) are the combination models. In Table 4 (accuracy criterion alone), the three acceptable models are also combination models. The handicap values of the combination models usually beat those of the component models by a significant margin.

+ When the predictions from GO, MO, and LV are weighted or averaged, the combination models are less sensitive to potential data noise than component models. This is true with data based on both execution and calendar time. Across all project data for the four accuracy criteria, the combination models sometimes outperform all their component models and never perform worse than the worst component model.

+ The DLC and ELC models perform more consistently than the other models. Most other models seem to perform well for a few data sets but poorly for other data sets, and the fluctuation is significant. The ELC model's performance is due to its equal weighting, which preserves GO's, MO's, and LV's good properties. The DLC model, on the other hand, is allowed to change its weightings dynamically, so it can adapt to the data as prediction proceeds.
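Here is a minimal sketch (ours; the ranks and scores below are invented, not taken from Tables 3 and 4) of the bookkeeping just described: models are ranked on a measure, ranks are summed across data sets, and the handicap subtracts the par rank of 4 for each data set.

```python
def rank_models(scores, higher_is_better=False):
    """Rank models on one measure (1 = best). `scores` maps model -> value;
    use higher_is_better=True for measures like the log prequential likelihood."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {model: rank for rank, model in enumerate(ordered, start=1)}

def handicap(overall_ranks_by_dataset, par=4):
    """Sum of (overall rank - par) across data sets for each model; a negative
    handicap means the model's overall performance was satisfactory."""
    models = overall_ranks_by_dataset[0].keys()
    return {m: sum(ranks[m] - par for ranks in overall_ranks_by_dataset) for m in models}

# Invented accuracy scores (log prequential likelihood) for one data set.
accuracy = {"ELC": -95.3, "GO": -97.1, "LV": -102.4}
print(rank_models(accuracy, higher_is_better=True))  # {'ELC': 1, 'GO': 2, 'LV': 3}

# Invented overall ranks of three models on two data sets.
overall = [{"ELC": 2, "GO": 5, "LV": 7}, {"ELC": 3, "GO": 6, "LV": 4}]
print(handicap(overall))                             # {'ELC': -3, 'GO': 3, 'LV': 3}
```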
Figure 1. The w_i, w_{i+1}, and w_{i+2} computations and their reference windows in the DLC models.
EXTENSIONS AND ALTERNATIVES

We have defined two forms of the DLC model, DLC/F and DLC/S. Figure 1 shows how the two models differ. In the DLC/F form, the weight assignments are based on a fixed reference window; the DLC/S form allows the observing window to advance dynamically as step-by-step prediction proceeds.

Figure 2. Summary of the DLC/F and DLC/S models for windows of up to 10 time frames.
We based the dynamic weighting on the prequential likelihood, but other accuracy measures, such as the Akaike information criterion (a criterion to denote how close a prediction is to the actual data⁷) or the mean square error, are also feasible. The main strength of the DLC models is that they combine component models in a way that lets the output be fed back for model adjustment.

The fundamental approach of the linear combination models is simple. However, by applying more complicated procedures, we risk losing the individual model's assumptions about the physical process. It then becomes harder to get insight into the process of reliability engineering. Most reliability models view software as a black box from which to observe failure data and make predictions. In that context, our combination models do not degrade any properties assumed in current reliability-modeling practices.
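The surviving text describes the dynamic weighting only at this level of detail, so the following is a rough sketch of one plausible scheme in the spirit of the DLC model, not the authors' exact algorithm: each component model's weight is made proportional to the exponential of its log prequential likelihood accumulated over a recent window. The function names, window length, and sample numbers are our own assumptions.

```python
import math

def dlc_weights(log_pl_history, window=5):
    """Assign each model a weight proportional to exp(log prequential
    likelihood) accumulated over the last `window` one-step predictions."""
    recent = {m: sum(h[-window:]) for m, h in log_pl_history.items()}
    # Subtract the maximum before exponentiating to keep the numbers well scaled.
    top = max(recent.values())
    raw = {m: math.exp(v - top) for m, v in recent.items()}
    total = sum(raw.values())
    return {m: w / total for m, w in raw.items()}

def dlc_prediction(predictions, weights):
    """Weighted sum of the component models' current predictions."""
    return sum(weights[m] * predictions[m] for m in predictions)

# Hypothetical per-step log-likelihood histories for GO, MO, and LV.
history = {"GO": [-2.1, -2.3, -1.9], "MO": [-2.0, -2.2, -2.4], "LV": [-2.6, -2.5, -2.7]}
w = dlc_weights(history, window=3)
print(dlc_prediction({"GO": 12.3, "MO": 15.1, "LV": 9.8}, w))
```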
LONG-TERM PREDICTIONS

Our results showed that the combination models also perform well when making long-term predictions. Figure 3 plots each model's predicted cumulative failures against cumulative test hours for JPL's Galileo Command and Data Subsystem. We used the first portion of the data, as indicated by the dashed line, to estimate each model's parameters. Immediately following this estimation stage is the prediction stage. For the Galileo CDS, these two stages follow the project's natural breakdown into two testing stages. For the DLC model, we computed model preferences and weights in the estimation phase and fixed the weight assignments in the prediction phase.

LV's prediction curve is too pessimistic, and GO's and MO's are too optimistic. In fact, all three curves for the component models fall outside the actual project data curve (the line labeled Actual). ELC and DLC, on the other hand, compensate for these extremes and make rather reasonable long-term predictions.

To show quantitative comparisons of long-term predictions, we use the mean square error instead of prequential likelihood. Prequential likelihood is more appropriate for comparing step-by-step predictions, while the mean square error provides a more widely understood measure of the distance between actual and predicted values. The mean square error is defined as

MSE = (1/N) Σ (ŷ_i − y_i)²

where N is the total number of predicted points in the prediction phase, and ŷ_i and y_i are the predicted and actual numbers of failures, respectively.
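A short sketch (ours; the failure counts are made up) of this long-term evaluation step follows.

```python
def mean_square_error(predicted, actual):
    """MSE = (1/N) * sum((y_hat_i - y_i)^2) over the N predicted points."""
    assert len(predicted) == len(actual)
    n = len(predicted)
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n

# Hypothetical cumulative failure counts over the prediction phase.
print(mean_square_error([310, 330, 355], [305, 340, 350]))
```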
Table 5 summarizes the long-term predictions (sums of mean square errors and of ranks) for GO, MO, LV, ELC, and DLC. The values under Sum of MSEs and Sum of ranks show that the ELC and DLC models generally perform better than the component models. Even though the component models make a better prediction than the ELC and DLC models on several occasions, they also perform significantly worse on others. The ELC and DLC models, on the other hand, never make the worst long-term predictions.
AUTOMATING RELIABILITY MEASUREMENT

Selecting component models to form combination models can be tedious and computation intensive. We are in Phase 1 of a three-phase effort to develop a tool, called Computer-Aided Software Reliability Estimation, which will automate most reliability-measurement tasks.

Figure 4 shows CASRE's architecture. You can find many of its functions in current reliability-measurement tools, but no other tool lets you combine the results of several models in addition to executing one model. Feedback from model evaluation helps you identify a model or combination of models best suited to the failure data being analyzed. Also, CASRE's I/O facility, the user interface, and the measurement procedures are greatly enhanced over those in existing tools.

Figure 4. Architecture of CASRE, a tool to automate the selection of component models to form a combination model. PL is prequential likelihood; AIC is Akaike Information Criterion.

Figures 5 and 6, two screen dumps from CASRE, show that you have many choices of models and evaluation criteria, yet the selection operation remains fairly simple. CASRE's major functions are:

+ Data modification. CASRE lets you create new failure-data files, modify existing files, and perform global operations on files. You can also select appropriate smoothing techniques or apply data transformations to the failure data being analyzed. You can plot the modified input data, use it as input to a reliability model, or write it to a new file for later use.

+ Failure-data analysis. You can display the failure data's summary statistics, including the data's mean, median, and variance and 25- and 75-percentile cutoffs.

+ Modeling and measurement. CASRE has two modeling functions. As Figure 4 shows, you can execute single component models on a data set, or you can execute several models and combine their results. Through model evaluation, you can determine how well a model applies to the data.

Figure 5. CASRE's initial display of failure data.
+ Results display. CASRE graphically displays model results of interfailure times, cumulative failures, failure intensities, and the reliability-growth curve. You can plot both actual and estimated quantities on the same figure. Plots also include user-specified confidence limits and control over the plotted range of data.

In a windowing environment, you can display multiple plots. You can then either print the plots, save them as a disk file, or feed them to other software, such as a spreadsheet. The plotting function also produces graphics from the model evaluation's output, which indicate the degree and direction of model bias and the way in which the bias changes over time.

Figure 6. Selecting the best models with CASRE. To specify the criteria by which you will judge a model to be the best, move the slide bars on the Selection Criteria panel (lower right corner) to set the relative weights of the four criteria.
The combination models we have proposed show promising results compared with traditional single models. Our approach is also flexible, letting you select models that best suit the failure data. CASRE automates significant portions of the work, making software-reliability measurement even simpler. For instance, CASRE will let users run the combination models we have described just by selecting them from a menu. Users can also form their own combination models, save them as part of the tool's configuration, and run them in current or subsequent sessions.

We recognize that much more work needs to be done to gain confidence that the combination models consistently outperform component models. We urge you to apply different data sets to these models and to compare the resulting predictions across a variety of projects.

We have not addressed how models can more accurately describe software development and testing, although we realize that this area is of increasing concern. Because the detailed information we would require for such an investigation is not available, we decided our work was better confined to evaluating how to use existing models more effectively. We hope someday to address how to develop models that can more accurately describe software development. ◆

ACKNOWLEDGMENTS
We thank William Farr of the Naval Surface Warfare Center for the use of SMERFS and Bev Littlewood of the City University of London for permission to use SRMP. The research described in this article was done at the University of Iowa under a faculty starting fund and at the Jet Propulsion Laboratory, California Institute of Technology, under a NASA contract through the director's discretionary fund. CASRE's implementation is being supported by the Air Force Operational Test and Evaluation Center under Task Order RE-182, Amendment 655, Proposal 80-3417.

REFERENCES
1. J. Musa, A. Iannino, and K. Okumoto, Software Reliability: Measurement, Prediction, Application, McGraw-Hill, New York, 1987.
2. M. Lyu, "Measuring Reliability of Embedded Software: An Empirical Study with JPL Project Data," Proc. Int'l Conf. Probabilistic Safety Assessment and Management, Elsevier, New York, Feb. 1991, pp. 493-500.
3. AIAA Working Group, "Recommended Practice for Software Reliability," American Inst. of Aeronautics and Astronautics, Washington, D.C. (to appear).
4. A. Dawid, "Statistical Theory: The Prequential Approach," J. Royal Statistics Soc. A, Vol. 147, pp. 278-292.
5. A. Abdel-Ghaly, P. Chan, and B. Littlewood, "Evaluation of Competing Software Reliability Predictions," IEEE Trans. Software Eng., Sept. 1986, pp. 950-967.
6. J. Musa, "Software Reliability Data," tech. report, Rome Air Development Center, Griffiss AFB, N.Y., 1980.
7. H. Akaike, "Prediction and Entropy," Tech. Report 2397, Mathematics Research Ctr., Univ. of Wisconsin, Madison, 1982.

Michael R. Lyu is an assistant professor of electrical and computer engineering at the University of Iowa. His research interests include software engineering, software reliability, fault-tolerant computing, and distributed-systems engineering. Lyu received a BS in electrical engineering from the National Taiwan University, an MS in electrical and computer engineering from the University of California at Santa Barbara, and a PhD in computer science from the University of California at Los Angeles. He is a vice chair of the subcommittee on software reliability engineering of the IEEE Computer Society's Technical Committee on Software Engineering.

Allen Nikora is a member of the Jet Propulsion Laboratory's software product assurance section. His research interests include software-reliability measurement, software safety, and software-system development methodologies. Nikora received a BS in engineering and applied science from the California Institute of Technology and an MS in computer science from the University of Southern California. He is pursuing a PhD in computer science at USC. He is a member of the IEEE Computer Society and the IEEE Reliability Society.

Address questions about this article to Lyu at ECE Dept., Univ. of Iowa, Iowa City, IA 52242, Internet lyu@eng.uiowa.edu, or to Nikora at JPL, M/S 125-233, 4800 Oak Grove Dr., Pasadena, CA 91109, Internet bignuke@spa1.jpl.nasa.gov.