
Copyright © 2010 IEEE. Reprinted from "2010 Reliability and Maintainability Symposium," San Jose, CA, USA, January 25-28, 2010.

This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of ReliaSoft Corporation's products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org.

By choosing to view this document, you agree to all provisions of the copyright laws protecting it.
2010 Annual RELIABILITY and MAINTAINABILITY Symposium

Design of Experiments and Data Analysis

Huairui Guo, Ph.D. & Adamantios Mettas

Huairui Guo, Ph.D., CRP
ReliaSoft Corporation
1450 S. Eastside Loop
Tucson, AZ 85710 USA
e-mail: Harry.Guo@ReliaSoft.com

Adamantios Mettas, CRP
ReliaSoft Corporation
1450 S. Eastside Loop
Tucson, AZ 85710 USA
e-mail: Adam.Mettas@ReliaSoft.com

Tutorial Notes © 2010 AR&MS


SUMMARY & PURPOSE
Design of Experiments (DOE) is one of the most useful statistical tools in product design and testing. While many
organizations benefit from designed experiments, others are getting data with little useful information and wasting resources
because of experiments that have not been carefully designed. Design of Experiments can be applied in many areas including
but not limited to: design comparisons, variable identification, design optimization, process control and product performance
prediction. Different design types in DOE have been developed for different purposes. Many engineers are confused or even
intimidated by so many options.
This tutorial will focus on how to plan experiments effectively and how to analyze data correctly. Practical and correct
methods for analyzing data from life testing will also be provided.

Huairui Guo, Ph.D., CRP


Huairui Guo is the Director of Theoretical Development at ReliaSoft Corporation. He received his Ph.D. in Systems and Industrial Engineering from the University of Arizona. He has published numerous papers in the areas of quality engineering, including SPC, ANOVA and DOE, and reliability engineering. His current research interests include repairable system modeling, accelerated life/degradation testing, warranty data analysis and robust optimization. Dr. Guo is a member of SRE, IIE and IEEE. He is a Certified Reliability Professional (CRP).

Adamantios Mettas, CRP


Mr. Mettas is the Vice President of Product Development at ReliaSoft Corporation. He fills a critical role in the advancement of ReliaSoft's theoretical research efforts and formulations in the subjects of Life Data Analysis, Accelerated Life Testing, and System Reliability and Maintainability. He has played a key role in the development of ReliaSoft's software, including Weibull++, ALTA and BlockSim, and has published numerous papers on various reliability methods. Mr. Mettas holds a B.S. degree in Mechanical Engineering and an M.S. degree in Reliability Engineering from the University of Arizona. He is a
Certified Reliability Professional (CRP).

Table of Contents
1. Introduction
2. Statistical Background
3. Two Level Factorial Design
4. Response Surface Methods (RSM)
5. DOE for Life Testing
6. Conclusions
7. References
8. Tutorial Visuals



1. INTRODUCTION

The most effective way to improve product quality and reliability is to integrate them in the design and manufacturing process. Design of Experiments (DOE) is a useful tool that can be integrated into the early stages of the development cycle. It has been successfully adopted by many industries, including automotive, semiconductor, medical devices, chemical products, etc. The application of DOE is not limited to engineering. Many success stories can be found in other areas. For example, it has been used to reduce administration costs, improve the efficiency of surgery processes, and establish better advertisement strategies.

1.1 Why DOE

DOE will make your life easier. For many engineers, applying DOE knowledge in their daily work will reduce a lot of trouble. Here are two examples of bad experiments that will cause trouble.

Example 1: Assume the reliability of a product is affected by voltage. The usage level voltage is 10. In order to predict the reliability at the usage level, fifty units are available for accelerated life testing. An engineer tested all fifty units at a voltage of 25. Is this a good test?

Example 2: Assume the reliability of a product is affected by temperature and humidity. The usage level is 40 degrees Celsius and 50% relative humidity. In order to predict the reliability at the usage level, fifty units are available for accelerated life testing. The design is conducted in the following way:

Number of Units   Temperature (Celsius)   Humidity (%)
25                120                     95
25                85                      85

Table 1 – Two Stress Accelerated Life Test

Will the engineer be able to predict the reliability at the usage level with the failure data from this test?

1.2 What DOE Can Do

DOE can help you design better tests than the above two examples. Based on the objectives of the experiments, DOE can be used for the following purposes [1, 2]:
1. Comparisons. When you have multiple design options, or several materials or suppliers are available, you can design an experiment to choose the best one. For example, in the comparison of six different suppliers that provide connectors, will the components have the same expected life? If they are different, how are they different and which is the best?
2. Variable Screening. If there are a large number of variables that can affect the performance of a product or a system, but only a relatively small number of them are important, a screening experiment can be conducted to identify the important variables. For example, the warranty return is abnormally high after a new product is launched. Variables that may affect the life are temperature, voltage, duty cycle, humidity and several other factors. DOE can be used to quickly identify the troublemakers, and a follow-up experiment can provide the guidelines for design modification to improve the reliability.
3. Transfer Function Exploration. Once a small number of variables have been identified as important, their effects on the system performance or response can be further explored. The relationship between the input variables and the output response is called the transfer function. DOE can be applied to design efficient experiments to study the linear and quadratic effects of the variables and some of the interactions between the variables.
4. System Optimization. The goal of system design is to improve the system performance, such as to improve the efficiency, quality, and reliability. If the transfer function between variables and responses has been identified, the transfer function can be used for design optimization. DOE provides an intelligent sequential strategy to quickly move the experiment to a region containing the optimum settings of the variables.
5. System Robustness. In addition to optimizing the response, it is important to make the system robust against "noise," such as environmental factors and uncontrolled factors. Robust design, one of the DOE techniques, can be used to achieve this goal.

1.3 Common Design Types

Different designs have been used for different experiment purposes. The following list gives the commonly used design types.
1. For comparison
• One factor design
2. For variable screening
• 2 level factorial design
• Taguchi orthogonal array
• Plackett-Burman design
3. For transfer function identification and optimization
• Central composite design
• Box-Behnken design
4. For system robustness
• Taguchi robust design
The designs used for transfer function identification and optimization are called Response Surface Method designs. In this tutorial, we will focus on 2 level factorial designs and response surface method designs. They are the two most popular and basic designs.

1.4 General Guidelines for Conducting DOE

DOE is not only a collection of statistical techniques that enable an engineer to conduct better experiments and analyze data efficiently; it is also a philosophy. In this section, general guidelines for planning efficient experiments will be given. The following seven-step procedure should be followed [1, 2].
1. Clarify and State Objective. The objective of the experiment should be clearly stated. It is helpful to prepare a list of specific problems that are to be addressed by the experiment.



2. Choose Responses. Responses are the experimental outcomes. An experiment may have multiple responses based on the stated objectives. The responses that have been chosen should be measurable.
3. Choose Factors and Levels. A factor is a variable that is going to be studied through the experiment in order to understand its effect on the responses. Once a factor has been selected, the value range of the factor that will be used in the experiment should be determined. Two or more values within the range need to be used. These values are referred to as levels or settings. Practical constraints of treatments must be considered, especially when safety is involved. A cause-and-effect diagram or a fishbone diagram can be utilized to help identify factors and determine factor levels.
4. Choose Experimental Design. According to the objective of the experiments, the analysts will need to select the number of factors, the number of levels of the factors, and an appropriate design type. For example, if the objective is to identify important factors from many potential factors, a screening design should be used. If the objective is to optimize the response, designs used to establish the factor-response function should be planned. In selecting design types, the available number of test samples should also be considered.
5. Perform the Experiment. A design matrix should be used as a guide for the experiment. This matrix describes the experiment in terms of the actual values of the factors and the test sequence of factor combinations. For a hard-to-set factor, its value should be set first. Within each of this factor's settings, the combinations of the other factors should be tested.
6. Analyze the Data. Statistical methods such as regression analysis and ANOVA (Analysis of Variance) are the tools for data analysis. Engineering knowledge should be integrated into the analysis process. Statistical methods cannot prove that a factor has a particular effect. They only provide guidelines for making decisions. Statistical techniques together with good engineering knowledge and common sense will usually lead to sound conclusions. Without common sense, pure statistical models may be misleading. For example, models created by smart Wall Street scientists did not avoid, and probably contributed to, the economic crisis in 2008.
7. Draw Conclusions and Make Recommendations. Once the data have been analyzed, practical conclusions and recommendations should be made. Graphical methods are often useful, particularly in presenting the results to others. Confirmation testing must be performed to validate the conclusions and recommendations.
The above seven steps are the general guidelines for performing an experiment. A successful experiment requires knowledge of the factors, the ranges of these factors and the appropriate number of levels to use. Generally, this information is not perfectly known before the experiment. Therefore, it is suggested to perform experiments iteratively and sequentially. It is usually a major mistake to design a single, large, comprehensive experiment at the start of a study. As a general rule, no more than 25 percent of the available resources should be invested in the first experiment.

2. STATISTICAL BACKGROUND

Linear regression and ANOVA are the statistical methods used in DOE data analysis. Knowing them will help you have a better understanding of DOE.

2.1 Linear Regression [2]

A general linear model, or multiple regression model, is:

Y = β0 + β1X1 + ... + βpXp + ε                                   (1)

Where: Y is the response, also called the output or dependent variable; Xi is the predictor, also called the input or independent variable; ε is the random error or noise, which is assumed to be normally distributed with mean 0 and variance σ², usually noted as ε ~ N(0, σ²). Because ε is normally distributed, for a given value of X, Y is also normally distributed and Var(Y) = σ².
From the model, it can be seen that the variation or difference of Y consists of two parts. One is the random part from ε. The other is the difference caused by the difference of the X values. For example, consider the data in Table 2:

Observation   X1    X2    Y     Y Mean
1             120   90    300   325
2             120   90    350   325
3             85    95    150   170
4             85    95    190   170
5             120   95    400   415
6             120   95    430   415

Table 2 – Example Data for Linear Regression

Table 2 has three different combinations of X1 and X2. For each combination, there are two observations. Because of the randomness caused by ε, these two observations are different although they have the same X values. This difference usually is called within-run variation. The mean values of Y at the three combinations are different too. This difference is caused by the difference of X1 and X2 and usually is called between-run variation.
If the between-run variation is significantly larger than the within-run variation, it means most of the variation of Y is caused by the difference of the X settings. In other words, the Xs significantly affect Y: the differences of Y caused by the Xs are much more significant than the differences caused by the noise.
From Table 2, we have the feeling that the between-run variation is larger than the within-run variation. To confirm this, statistical methods should be applied. The amount of the total variation of Y is defined by the sum of squares:

SST = Σ (Yi − Ȳ)²,  i = 1, ..., n                                (2)

Where Yi is the ith observed value and Ȳ is the mean of all the observations.
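To make the within-run/between-run decomposition concrete, the short Python sketch below (an illustration, not part of the original tutorial) fits the model of equation (1) to the Table 2 data by least squares and computes the total, error and regression sums of squares; the variable names simply mirror Table 2.

```python
import numpy as np

# Table 2 data: two observations at each of three (X1, X2) combinations
X1 = np.array([120, 120, 85, 85, 120, 120], dtype=float)
X2 = np.array([90, 90, 95, 95, 95, 95], dtype=float)
Y = np.array([300, 350, 150, 190, 400, 430], dtype=float)

# Design matrix with an intercept column; least-squares fit of equation (1)
X = np.column_stack([np.ones_like(X1), X1, X2])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)   # [beta0, beta1, beta2]
Y_hat = X @ beta

SST = np.sum((Y - Y.mean()) ** 2)       # total variation, equation (2)
SSE = np.sum((Y - Y_hat) ** 2)          # within-run variation (error)
SSR = np.sum((Y_hat - Y.mean()) ** 2)   # between-run variation (regression)

print(beta)             # roughly [-2135, 7, 18], matching the coefficients reported later
print(SST, SSR, SSE)    # roughly 63933, 61433, 2500
```

The large regression sum of squares relative to the error sum of squares is the numerical version of the statement that the between-run variation dominates the within-run variation.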



However, since SST is affected by the number of observations, to eliminate this effect, another metric called mean squares is used to measure the normalized variability of Y:

MST = SST / (n − 1) = [1/(n − 1)] Σ (Yi − Ȳ)²                    (3)

Equation (3) is also the unbiased estimator of Var(Y).
As mentioned before, the total sum of squares can be partitioned into two parts: within-run variation caused by random noise (called the sum of squares of error, SSE) and between-run variation caused by different values of the Xs (called the sum of squares of regression, SSR):

SST = SSR + SSE = Σ (Ŷi − Ȳ)² + Σ (Yi − Ŷi)²                     (4)

Where: Ŷi is the predicted value for the ith test. For tests with the same X values, the predicted values are the same.
Similar to equation (3), the mean squares of regression and the mean squares of error are calculated by:

MSR = SSR / p = (1/p) Σ (Ŷi − Ȳ)²                                (5)

MSE = SSE / (n − 1 − p) = [1/(n − 1 − p)] Σ (Yi − Ŷi)²           (6)

Where: p is the number of Xs.
When there is more than one input variable, SSR can be further divided into the variation caused by each variable, such as:

SSR = SSX1 + SSX2 + ... + SSXp                                   (7)

The mean squares of each input variable, MSXi, is compared with MSE to test if the effect of Xi is significantly greater than the effect of noise.
The mean squares of regression MSR is used to measure the between-run variance that is caused by the predictor Xs. The mean squares of error MSE represents the within-run variance caused by noise. By comparing these two values, we can find out if the variance contributed by the Xs is significantly greater than the variance caused by noise. ANOVA is the method used for the comparison in a statistical way.

2.2 ANOVA (Analysis of Variance)

The following ratio

F0 = MSR / MSE                                                   (8)

is used to test the following two hypotheses:
H0: There is no difference between the variance caused by the Xs and the variance caused by noise.
H1: The variance caused by the Xs is larger than the variance caused by noise.
Under the null hypothesis, the ratio follows the F distribution with degrees of freedom p and n − 1 − p. By applying ANOVA to the data for this example, we get the ANOVA table shown in Table 3.
The third column of Table 3 shows the values for the sum of squares. We can easily verify that:

SST = SSR + SSE = SSX1 + SSX2 + SSE                              (9)

The fifth column shows the F ratio of each source. All the values are much bigger than 1. The last column is the P value. The smaller the P value is, the larger the difference between the variance caused by the corresponding source and the variance caused by noise. Usually, a significance level α, such as 0.05 or 0.1, is used to compare with the P values. If a P value is less than α, the corresponding source is said to be significant at the significance level of α. From Table 3, we can see that both variables X1 and X2 are significant to the response Y at the significance level of 0.1.

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Squares   F Ratio   P Value
Model                 2                    6.14E+04         3.07E+04       36.86     0.0077
X1                    1                    6.00E+04         6.00E+04       72.03     0.0034
X2                    1                    8100             8100           9.72      0.0526
Residual              3                    2500             833.3333
Total                 5                    6.39E+04

Table 3 – ANOVA Table for the Linear Regression Example

Another way to test whether or not a variable is significant is to test whether or not its coefficient is 0 in the regression model. For this example, the linear regression model is:

Y = β0 + β1X1 + β2X2 + ε                                         (10)

If we want to test whether or not β1 = 0, we can use the following hypothesis:
H0: β1 = 0
Under this null hypothesis, the test statistic follows a t distribution:

T0 = β̂1 / se(β̂1)                                                (11)

se(β̂1) is the standard error of β̂1 that is estimated from the data. The t test results are given in Table 4.

Term        Coefficient   Standard Error   T Value   P Value
Intercept   -2135         588.7624         -3.6263   0.0361
X1          7             0.8248           8.487     0.0034
X2          18            5.7735           3.1177    0.0526

Table 4 – Coefficients for the Linear Regression Example

Table 3 and Table 4 give the same P values.
With linear regression and ANOVA in mind, we can start discussing DOE now.
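As an illustration of how the F ratio in equation (8) and the t statistic in equation (11) are converted into the P values of Tables 3 and 4, the hedged sketch below uses scipy's F and t distributions; the numerical inputs are simply the mean squares and coefficient estimates quoted in the tables.

```python
from scipy import stats

n, p = 6, 2                        # observations and predictors from Table 2
MS_R, MS_E = 3.07e4, 833.3333      # mean squares from Table 3

# F test of the overall model, equation (8)
F0 = MS_R / MS_E
p_model = stats.f.sf(F0, p, n - 1 - p)       # about 0.0077

# t test of H0: beta1 = 0, equation (11), with the Table 4 estimates
b1, se_b1 = 7.0, 0.8248
T0 = b1 / se_b1
p_b1 = 2 * stats.t.sf(abs(T0), n - 1 - p)    # about 0.0034

print(F0, p_model, T0, p_b1)
```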



3. TWO LEVEL FACTORIAL DESIGNS

Two level factorial designs are used for factor screening. In order to study the effect of a factor, at least two different settings for this factor are required. This also can be explained from the viewpoint of linear regression: to fit a line, two points are the minimal requirement. Therefore, the engineer who tested all the units at a voltage of 25 will not be able to predict the life at the usage level of 10 volts. With only one voltage value, the effect of voltage cannot be evaluated.

3.1 Two Level Full Factorial Design

When the number of factors is small and you have the resources, a full factorial design should be conducted. Here we will use a simple example to introduce some basic concepts in DOE.
For an experiment with two factors, the factors usually are called A and B. Uppercase letters are used for factors. The first level or the low level is represented by -1, while the second level or the high level is represented by 1. There are four combinations in a 2 level 2 factor design. Each combination is called a treatment. Treatments are represented by lowercase letters. The number of test units for each treatment is called the number of replicates. For example, if you test two samples at each treatment, the number of replicates is 2. Since the number of replicates for each factor combination is the same, this design is also balanced. A two level factorial design with k factors usually is written as a 2^k design and read as a "2 to the power of k design" or "2 to the k design." For a 2^2 design, the design matrix is:

Treatment   A    B    Response
(1)         -1   -1   20
a           1    -1   30
b           -1   1    25
ab          1    1    35

Table 5 – Treatments for 2 Level Factorial Design

This design is orthogonal. This is because the sum of the product of A and B is zero, which is (−1 × −1) + (1 × −1) + (−1 × 1) + (1 × 1) = 0. An orthogonal design will reduce the estimation uncertainty of the model coefficients.
The following linear regression model is used for the analysis:

Y = β0 + β1X1 + β2X2 + β12X1X2 + ε                               (12)

Where: X1 is for factor A; X2 is for factor B; and their interaction is represented by X1X2. The effects of A and B are called main effects. The effects of their interaction are called two-way interaction effects. These three effects are the three sources for the variation of Y. Since equation (12) is a linear regression model, the ANOVA method and the t-test given in Section 2 can be used to test whether or not one effect is significant.
For a balanced design, a simple way to calculate the effect of a factor is to calculate the difference of the mean values of the response at its high and low settings. For example, the effect of A can be calculated by:

Effect of A = Avg. at A high − Avg. at A low
            = (30 + 35)/2 − (20 + 25)/2 = 10                     (13)
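The effect calculation in equation (13) is easy to script. The following minimal Python sketch (an illustration, not part of the original tutorial) applies it to the Table 5 data for the main effects and the AB interaction.

```python
import numpy as np

# Coded design and responses from Table 5 (one replicate per treatment)
A = np.array([-1, 1, -1, 1])
B = np.array([-1, -1, 1, 1])
Y = np.array([20, 30, 25, 35], dtype=float)

def effect(column, response):
    """Average response at the high level minus average at the low level, equation (13)."""
    return response[column == 1].mean() - response[column == -1].mean()

print(effect(A, Y))       # 10.0, the effect of A
print(effect(B, Y))       # 5.0, the effect of B
print(effect(A * B, Y))   # 0.0, the AB interaction effect for these data
```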



3.2 Two Level Fractional Factorial Design

When you increase the number of factors, the number of test units will increase quickly. For example, to study 7 factors, 128 units are needed. In reality, responses are affected by a small number of main effects and lower order interactions. Higher order interactions are relatively unimportant. This statement is called the sparsity of effects principle. According to this principle, fractional factorial designs are developed. These designs use fewer samples to estimate main effects and lower order interactions, while the higher order interactions are considered to have negligible effects.
Consider a 2^3 design. 8 test units are required for a full factorial design. Assume only 4 test units are available because of the cost of the test. Which 4 of the 8 treatments should you choose? A full design matrix with all the effects for a 2^3 design is:

Order   A    B    AB   C    AC   BC   ABC
1       -1   -1   1    -1   1    1    -1
2       1    -1   -1   -1   -1   1    1
3       -1   1    -1   -1   1    -1   1
4       1    1    1    -1   -1   -1   -1
5       -1   -1   1    1    -1   -1   1
6       1    -1   -1   1    1    -1   -1
7       -1   1    -1   1    -1   1    -1
8       1    1    1    1    1    1    1

Table 6 – Design Matrix for a 2^3 Design

If the effect of ABC can be ignored, the following 4 treatments can be used in the experiment.

Order   A    B    AB   C    AC   BC   ABC
2       1    -1   -1   -1   -1   1    1
3       -1   1    -1   -1   1    -1   1
5       -1   -1   1    1    -1   -1   1
8       1    1    1    1    1    1    1

Table 7 – Fraction of the Design Matrix for a 2^3 Design

In Table 7, the effect of ABC cannot be estimated from the experiment because it is always at the same level of 1. Since Table 7 uses only half of the treatments from the full factorial design in Table 6, it is represented by 2^(3−1) and read as a "2 to the power of 3 minus 1 design" or "2 to the 3 minus 1 design."
From Table 7, you will also notice that some columns have the same values. For example, columns AB and C are the same. Using equation (13) to calculate the effect of AB and C, we will end up with the same procedure and result. Therefore, from this experiment, the effect of AB and the effect of C cannot be distinguished because they change with the same pattern. In DOE, effects that cannot be separated from an experiment are called confounded effects or aliased effects. A list of aliased effects is called the alias structure. For the design of Table 7, the alias structure is:

[I] = I + ABC; [A] = A + BC; [B] = B + AC; [C] = C + AB

Where: I is the effect of the intercept in the regression model, which represents the mean value of the response. The alias for I is called the defining relation. For this example, the defining relation is written as I = ABC. In a design, I may be aliased with several effects. The order of the shortest effect that is aliased with I is the "resolution" of the design.
From the alias structure, we can see that main effects are confounded with 2-way interactions. For example, the estimated effect for C in fact is the combined effect of C and AB.
Checking Table 7, we can see it is a full factorial design if we have only factors A and B. Therefore, A and B usually are called basic factors and the full factorial design for them is called the basic design. A fractional factorial design is generated from its basic design and basic factors. For this example, the values for factor C are generated from the values of the basic factors A and B using the relation C = AB. Usually AB is called the factor generator for C.
By now, it should be clear that the design given in Table 1 at the beginning of this tutorial is not a good design. If you check the design in terms of coded values, the answer is obvious. Table 8 shows the design again.

Number of Units   Temperature (Celsius)   Humidity (%)
25                120 (1)                 95 (1)
25                85 (-1)                 85 (-1)

Table 8 – Two Stress Accelerated Life Test (Coded Value)

In this design, temperature and humidity are confounded. In fact, to study the effect of two factors, at least three different settings are required. From the linear regression point of view, at least three unique settings are needed to solve for three parameters: the intercept, the effect of factor A and the effect of factor B. If their interaction is also to be estimated, four different settings should be used. Many DOE software packages can generate a design matrix for you according to the number of factors and the levels of the factors. This will help you avoid bad designs such as the one given in Table 8.
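Generating a fraction from its basic design and factor generators, and checking the resulting aliasing, is straightforward to do programmatically. The sketch below (illustrative only) rebuilds the 2^(3−1) design of Table 7 from the basic factors A and B with the generator C = AB and confirms that the C and AB columns are identical.

```python
from itertools import product

import numpy as np

# Full factorial in the basic factors A and B (the "basic design")
basic = np.array(list(product([-1, 1], repeat=2)))  # rows: (A, B) treatments
A, B = basic[:, 0], basic[:, 1]
C = A * B                                           # factor generator C = AB

design = np.column_stack([A, B, C])
print(design)                      # the four treatments of the 2^(3-1) design
print(np.array_equal(A * B, C))    # True: C is aliased (confounded) with AB
```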



3.3 An Example of a Fractional Factorial Design

Assume an engineer wants to identify the factors that affect the yield of a manufacturing process for integrated circuits. By following the DOE guidelines, five factors are brought up and a two level fractional factorial design is decided to be used [1]. The five factors and their levels are given in Table 9.

Factor   Name               Unit      Low (-1)   High (1)
A        Aperture Setting             small      large
B        Exposure Time      minutes   20         40
C        Develop Time       seconds   30         45
D        Mask Dimension               small      large
E        Etch Time          minutes   14.5       15.5

Table 9 – Factor Settings for a Five Factor Experiment

With five factors, the total number of runs required for a full factorial is 2^5 = 32. Running all of the 32 combinations is too expensive for the manufacturer. At the initial investigation, only main effects and two factor interactions are of interest, while higher order interactions are considered to be unimportant. It is decided to carry out the investigation using the 2^(5−1) design, which requires 16 runs. The defining relation is I = ABCDE, or in other words, the generator for factor E is E = ABCD. Table 10 gives the experiment data.

Run Order   A       B    C    D       E      Yield
1           Large   20   30   Large   15.5   10
2           Large   20   45   Large   14.5   21
3           Small   40   45   Small   15.5   45
4           Small   20   45   Small   14.5   16
5           Large   40   30   Small   15.5   52
6           Large   40   45   Small   14.5   60
7           Small   40   30   Large   15.5   30
8           Small   20   45   Large   15.5   15
9           Large   20   30   Small   14.5   9
10          Small   40   30   Small   14.5   34
11          Small   20   30   Small   15.5   8
12          Large   40   30   Large   14.5   50
13          Small   40   45   Large   14.5   44
14          Small   20   30   Large   14.5   6
15          Large   20   45   Small   15.5   22
16          Large   40   45   Large   15.5   63

Table 10 – Design Matrix and Results

Since the design has only 16 unique factor combinations, it can be used to estimate only 16 parameters in the linear regression model. If we include all main and 2-way interactions in the model, we get the ANOVA table shown in Table 11.

Source of Variation   DF   Sum of Squares   Mean Squares   F   P Value
Model                 15   5775.4375        385.0292       -   -
A                     1    495.0625         495.0625       -   -
B                     1    4590.0625        4590.0625      -   -
C                     1    473.0625         473.0625       -   -
D                     1    3.0625           3.0625         -   -
E                     1    1.5625           1.5625         -   -
AB                    1    189.0625         189.0625       -   -
AC                    1    0.5625           0.5625         -   -
AD                    1    5.0625           5.0625         -   -
AE                    1    5.0625           5.0625         -   -
BC                    1    1.5625           1.5625         -   -
BD                    1    0.0625           0.0625         -   -
BE                    1    0.0625           0.0625         -   -
CD                    1    3.0625           3.0625         -   -
CE                    1    0.5625           0.5625         -   -
DE                    1    7.5625           7.5625         -   -
Residual              0
Total                 15   5775.4375

Table 11 – ANOVA Table with All Effects

There are no F ratios or P values in the above table. This is because there are no replicates in this experiment when all the effects are considered. Therefore, there is no way to estimate the error term in the regression model. This is why the SSE (Sum of Squares of Error), labeled as Residual in Table 11, is 0. Without SSE, the estimation of the random error, how can we test whether or not an effect is significant compared to random error? Don't panic. Statisticians have already developed methods to deal with this situation. When there is no error term in a screening experiment, Lenth's method can be used to identify significant effects. Lenth's method assumes that all the effects should be normally distributed with a mean of 0, given the hypothesis that they are not significant. If any effects are significantly different from 0, they should be considered significant. So we can check the normal probability plot for effects.

[Figure 1 – Effect Probability Plot Using Lenth's Method (ReliaSoft DOE++ normal probability plot of effects; A: Aperture Setting, B: Exposure Time, C: Develop Time and AB are flagged as significant; Alpha = 0.1; Lenth's PSE = 0.9375)]

From Figure 1, the main effects A, B, C and the 2-way interaction AB are identified as significant at a significance level of 0.1. Since the rest of the effects are not significant, they can be treated as noise and used to estimate the sum of squares of error. In DOE, it is a common practice to pool non-significant effects into error. With only A, B, C and AB in the model, we get the ANOVA table shown in Table 12.

Source of Variation   DF   Sum of Squares   Mean Squares   F Ratio    P Value
Model                 4    5747.25          1436.8125      560.7073   1.25E-12
A                     1    495.0625         495.0625       193.1951   2.53E-08
B                     1    4590.0625        4590.0625      1791.244   1.56E-13
C                     1    473.0625         473.0625       184.6098   3.21E-08
AB                    1    189.0625         189.0625       73.7805    3.30E-06
Residual              11   28.1875          2.5625
Lack of Fit           3    9.6875           3.2292         1.3964     0.3128
Pure Error            8    18.5             2.3125
Total                 15   5775.4375

Table 12 – ANOVA Table with Significant Effects

From Table 12, we can see that effects A, B, C and AB are indeed significant because their P values are close to 0.
Once the important factors have been identified, follow-up experiments can be conducted to optimize the process. Response Surface Methods are developed for this purpose.
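Lenth's method is only summarized above. One common formulation of its pseudo standard error (PSE), sketched below in Python purely for illustration, takes the list of estimated effects from the saturated model and returns a robust noise estimate against which each effect can be compared (for example with a t critical value, which is what the probability plot in Figure 1 does graphically).

```python
import numpy as np

def lenth_pse(effects):
    """Lenth's pseudo standard error for a saturated two-level design.

    A common formulation: s0 = 1.5 * median(|effect|); the PSE is then
    1.5 * median of the |effect| values that are smaller than 2.5 * s0.
    """
    abs_eff = np.abs(np.asarray(effects, dtype=float))
    s0 = 1.5 * np.median(abs_eff)
    return 1.5 * np.median(abs_eff[abs_eff < 2.5 * s0])

# Effects whose magnitude is large relative to the PSE are flagged as significant.
```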



4. RESPONSE SURFACE METHODS (RSM)

Response surface methods (RSM) are used to estimate the transfer functions at the optimal region. The estimated function is then used to optimize the responses. The quadratic model is the model used in RSM. Similar to the factorial design, linear regression and ANOVA are the tools for data analysis in RSM. Let's use a simple example to illustrate this type of design.

4.1 Initial Investigation

Assume the yield from a chemical process has been found to be affected by two factors [1]:
• Reaction Temperature
• Reaction Time
The current operating conditions of 230 degrees Fahrenheit and 65 minutes give a yield of about 35%. The engineers decide to explore the current conditions in the range [L=225, H=235] Fahrenheit and [L=55, H=75] minutes to see how the temperature and time affect the yield. The design matrix is:

Std. Order   Point Type   A: Temperature   B: Reaction Time   Yield (%)
1            1            -1               -1                 33.95
2            1            1                -1                 36.35
3            1            -1               1                  35
4            1            1                1                  37.25
5            0            0                0                  35.45
6            0            0                0                  35.75
7            0            0                0                  36.05
8            0            0                0                  35.3
9            0            0                0                  35.9

Table 13 – Design Matrix for the Initial Experiment

Table 13 is the design matrix in coded values, where -1 is the lower level and 1 is the higher level. For a given actual value of a numerical factor, its corresponding coded value can be calculated by:

Coded Value = (Actual Value − Middle Value of High and Low) / (Half of the Range Between High and Low)

In Table 13, there are several replicated runs at the setting of (0, 0). These runs are called center points. Center points have the following two uses:
• To estimate random error.
• To check whether or not the curvature is significant.
At least five center points are suggested in an experiment.
When we analyze the data in Table 13, we get the ANOVA table below.

Source of Variation   DF   Sum of Squares   Mean Squares   F Ratio   P Value
Model                 4    6.368            1.592          16.4548   0.0095
A                     1    5.4056           5.4056         55.8721   0.0017
B                     1    0.9506           0.9506         9.8256    0.035
AB                    1    0.0056           0.0056         0.0581    0.8213
Curvature             1    0.0061           0.0061         0.0633    0.8137
Residual              4    0.387            0.0967
Total                 8    6.755

Table 14 – ANOVA Table for the Initial Experiment

From Table 14, we know curvature is not significant. Therefore, the linear model is sufficient in the current experiment space. The linear model that includes only the significant effects is:

Y = 35.6375 + 1.1625 X1 + 0.4875 X2                              (14)

Equation (14) is in terms of coded values. We can see both factors have a positive effect on the yield since their coefficients are positive. To improve the yield, we should explore the region with factor values larger than the current operating condition. There are many directions we can move in to increase the yield, but which one is the fastest lane to approach the optimal region? By checking the coefficient of each factor in equation (14), we know the direction should be (1.1625, 0.4875). This also can be seen from the contour plot in Figure 2.

[Figure 2 – Contour Plot of the Initial Experiment]

From Figure 2, we know the fastest lane to increase yield is to move along the direction that is perpendicular to the contour lines. This direction is (1.1625, 0.4875), or about (2.4, 1) in terms of normalized scale. Therefore, if 1 unit of X2 is increased, 2.39 units of X1 should be used in order to keep moving in the steepest ascent direction. To convert the coded values to actual values, we should use a step size of (12 degrees, 10 minutes) for factors A and B. Table 15 gives the results for the experiments conducted along the path of steepest ascent.

                     Coded         Actual
Step                 A      B      A      B      Yield (%)
Current Operation    0      0      230    65     35
1                    2.4    1      242    75     36.5
2                    4.8    2      254    85     39.35
3                    7.2    3      266    95     45.65
4                    9.6    4      278    105    49.55
5                    12     5      290    115    55.7
6                    14.4   6      302    125    64.25
7                    16.8   7      314    135    72.5
8                    19.2   8      326    145    80.6
9                    21.6   9      338    155    91.4
10                   24     10     350    165    95.45
11                   26.4   11     362    175    89.3
12                   28.8   12     374    185    87.65

Table 15 – Path of Steepest Ascent

From Table 15, it can be seen that at step 10 the factor setting is close to the optimal region. This is because the yield decreases on either side of this step. The region around the setting of (350, 165) requires further investigation. Therefore, the analysts will conduct a 2^2 factorial design with the center point of (350, 165) and the range of [L=345, H=355] for factor A (temperature) and [L=155, H=175] for factor B (reaction time). The design matrix is given in Table 16.



Std. Order   Point Type   A: Temperature (F)   B: Reaction Time (min)   Yield (%)
1            1            345                  155                      89.75
2            1            355                  155                      90.2
3            1            345                  175                      92
4            1            355                  175                      94.25
5            0            350                  165                      94.85
6            0            350                  165                      95.45
7            0            350                  165                      95
8            0            350                  165                      94.55
9            0            350                  165                      94.7

Table 16 – Factorial Design around the Optimal Region

The ANOVA table for this data is:

Source of Variation   DF   Sum of Squares   Mean Squares   F Ratio   P Value
Model                 4    37.643           9.4107         78.916    5E-04
A                     1    1.8225           1.8225         15.283    0.017
B                     1    9.9225           9.9225         83.208    8E-04
AB                    1    0.81             0.81           6.7925    0.06
Curvature             1    25.088           25.088         210.38    1E-04
Residual              4    0.477            0.1193
Total                 8    38.12

Table 17 – ANOVA for the Experiment at the Optimal Region

Table 17 shows curvature is significant in this experiment region. Therefore, the linear model is not enough for the relationship between factors and response. A quadratic model should be used instead. An experiment design that is good for the quadratic model should be used for further investigation. Central Composite Design (CCD) is one of these designs.

4.2 Optimization Using RSM

Table 16 is the 2-level factorial design at the optimal region. A CCD is built based on this factorial design and used to estimate the parameters of a quadratic model such as:

Y = β0 + β1X1 + β2X2 + β11X1² + β22X2² + β12X1X2 + ε             (15)

In fact, a CCD can be directly augmented from a regular factorial design. The augmentation process is illustrated in Figure 3.

[Figure 3 – Augment a Factorial Design to CCD: the factorial points (±1, ±1) and the center point (0, 0) are augmented with the axial points (−α, 0), (α, 0), (0, −α) and (0, α)]

Points outside the rectangle in Figure 3 are called axial points or star points. By adding several center points and axial points, a regular factorial design is augmented to a CCD. In Figure 3, we can see there are five different values for each factor, so a CCD can be used to estimate the quadratic model in equation (15).
Several methods have been developed to calculate α so that the CCD has special properties such that the designed experiment can better estimate model parameters or can better explore the optimal region. The commonly used method to calculate α is:

α = [2^(k−f) (nf) / ns]^(1/4)

Where: nf is the number of replicates of the runs in the original factorial design, ns is the number of replicates of the runs at the axial points, and 2^(k−f) represents the original factorial or fractional factorial design.
We use this equation to calculate α for our example, so α = 1.414. Since we already have five center points in the factorial design at the optimal region, we only need to add the star points to have a CCD. The complete design matrix for the CCD is shown in Table 18. The last 4 runs in the table are the ones added to the previous factorial design.

Std. Order   Point Type   A: Temperature (F)   B: Reaction Time (min)   Yield (%)
1            1            345                  155                      89.75
2            1            355                  155                      90.2
3            1            345                  175                      92
4            1            355                  175                      94.25
5            0            350                  165                      94.85
6            0            350                  165                      95.45
7            0            350                  165                      95
8            0            350                  165                      94.55
9            0            350                  165                      94.7
10           -1           342.93               165                      90.5
11           -1           357.07               165                      92.75
12           -1           350                  150.86                   88.4
13           -1           350                  179.14                   92.6

Table 18 – CCD around the Optimal Region

The fitted quadratic model is:

Y = 94.91 + 0.74 X1 + 1.53 X2 − 1.52 X1² − 2.08 X2² + 0.45 X1X2  (16)

The ANOVA table for the quadratic model is:

Source of Variation   DF   Sum of Squares   Mean Squares   F Ratio    P Value
Model                 5    65.0867          13.0173        91.3426    3.22E-06
A                     1    4.3247           4.3247         30.3465    0.0009
B                     1    18.7263          18.7263        131.4022   8.64E-06
AB                    1    0.81             0.81           5.6838     0.0486
AA                    1    16.0856          16.0856        112.8724   1.43E-05
BB                    1    30.1872          30.1872        211.8234   1.73E-06
Residual              7    0.9976           0.1425
Lack of Fit           3    0.5206           0.1735         1.4551     0.3527
Pure Error            4    0.477            0.1193
Total                 12   66.0842

Table 19 – ANOVA Table for CCD at the Optimal Region

As mentioned before, the model for the CCD is used to optimize the process. Therefore, the accuracy of the model is very important. From the Lack of Fit test in Table 19, we see the P value is relatively large. It means the model can fit the data well. The Lack of Fit residual is the estimation of the variation of the terms that are not included in the model. If its amount is close to the pure error, which is the within-run variation, it can be treated as part of the noise. Another way to check the model accuracy is to check the residual plots.
Through the above diagnostics, we found the model is adequate and we can use it to identify the optimal factor settings. The optimal settings can be found easily by taking the derivative of equation (16) with respect to each factor and setting the derivatives to 0. Many software packages can do optimization. The following results are from DOE++ from ReliaSoft [3].
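Before looking at the software output in Figure 4, the following illustrative sketch cross-checks the two calculations described above: the axial distance α, and the stationary point of the fitted quadratic model in equation (16), converted back to actual units.

```python
import numpy as np

# Axial distance for the CCD: alpha = [2^(k-f) * nf / ns]^(1/4)
k, f, nf, ns = 2, 0, 1, 1
alpha = (2 ** (k - f) * nf / ns) ** 0.25        # 1.414 for this example

# Fitted quadratic model, equation (16), in coded units
b = np.array([0.74, 1.53])                      # linear coefficients
H = np.array([[2 * -1.52, 0.45],                # Hessian of the quadratic part
              [0.45, 2 * -2.08]])

# Stationary point: solve H @ x + b = 0
x_opt = np.linalg.solve(H, -b)                  # about (0.30, 0.40) in coded units

# Convert to actual units: center (350 F, 165 min), half-ranges (5 F, 10 min)
temp = 350 + 5 * x_opt[0]                       # about 351.5 F
time = 165 + 10 * x_opt[1]                      # about 169 min
print(alpha, x_opt, temp, time)
```

These values agree with the optimal settings reported below.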



[Figure 4 – Optimal Solution for the Chemical Process (DOE++ optimal solution plot of yield versus A: Temperature and B: Reaction Time; maximum predicted yield Y = 95.3264 at Temperature = 351.5045 and Reaction Time = 168.9973)]

Figure 4 shows how the response changes with each factor. The red dashed line points to the optimal value of each factor. We can see that the optimal settings are 351.5 degrees Fahrenheit for temperature and 169 minutes for the reaction time. At this setting, the predicted yield is 95.3%, which is much better than the yield at the current condition (about 35%).

5. DOE FOR LIFE TESTING

When DOE is used for life testing, the response is the life or failure time. However, because of cost, time or other constraints, you may not have observed values of life for some test units. They are still functioning at the time when the test ends. The end time of the test is called the suspension time for the units that have not failed. Obviously, this time is not their "life." Should the suspension time be treated as the life time in order to analyze the data? In this section, we will discuss the correct statistical method for analyzing the data from life testing. First, let's explain some basic concepts in life data analysis.

5.1 Data Type for Life Test

When the response is life, there are two types of data:
• Complete Data
• Censored Data
  o Right Censored (Suspended)
  o Interval Censored
If a test unit fails during the test and the exact failure time is known, the failure time is called complete data.
If a test unit fails and you don't know the exact failure time -- instead, you know the failure occurred within a time range -- this time range is called interval data.
If a unit does not fail in the test, the end time of the test of the unit is called right censored data or suspension data.
Obviously, ignoring the censored data or treating them as failure times will underestimate the system reliability. However, in the use of linear regression and ANOVA, an exact value for each observation is required. Therefore, engineers have to tweak the censored data in order to use linear regression and ANOVA. A simple way to tweak the data is to use the center point of the interval data as the failure time, and treat the suspension units as failed.
Even with the modification of the original data, another issue may still exist. When using linear regression and ANOVA, the response is assumed to be normally distributed. The F and t tests are established based on this normal distribution assumption. However, life time usually is not normally distributed.
Given the above reasons, correct analysis methods for data from life testing are needed.

5.2 Maximum Likelihood Estimation and Likelihood Ratio Test [2]

Maximum Likelihood Estimation (MLE) can estimate model parameters to maximize the probability of the occurrence of an observed data set. It has been used successfully to handle different data types, such as complete data, suspensions and interval data. Therefore, we will use MLE to estimate the model parameters for life data from DOE.
Many distributions are used to describe lifetimes. The three most commonly used are [4]:
• Weibull distribution with probability density function (pdf):

f(t) = (β/η) (t/η)^(β−1) e^(−(t/η)^β)                            (17)

• Lognormal distribution with pdf:

f(t) = [1/(σt√(2π))] e^(−(1/2)[(ln(t) − µ)/σ]²)                  (18)

• Exponential distribution with pdf:

f(t) = (1/m) e^(−t/m)                                            (19)

Assume there is only one factor (in the language of DOE), or stress (in the language of accelerated life testing), that affects the lifetime of the product. The life distribution and factor relationship can be described using the following graph.

[Figure 5 – pdf at Different Stress/Factor Levels]

Figure 5 shows that life decreases when a factor is changed from the low level to the high level. The pdf curves have the same shape while only the scale of the curve changes. The scale of the pdf is compressed at the high level. It means the failure mode remains the same; only the time of occurrence decreases at the high level. Instead of considering the entire scale of the pdf, a life characteristic can be chosen to represent the curve and used to investigate the effect of potential factors on life. The life characteristics for the three commonly used distributions are:
• Weibull distribution: η
• Lognormal distribution: µ
• Exponential distribution: m
The life-factor relationship is studied to see how factors affect the life characteristic. For example, a linear model can be used as the initial investigation of the relationship:

µ' = β0 + β1X1 + β2X2 + ... + β12X1X2 + ...                      (20)

Where: µ' = ln(η) for Weibull; µ' = µ for lognormal; and µ' = ln(m) for exponential.
Please note that in equation (20) a logarithmic transformation is applied to the life characteristics of the Weibull and exponential distributions. One of the reasons is that η and m can take only positive values.
To test whether or not an effect in equation (20) is significant, the likelihood ratio test is used:

LR(effect k) = −2 ln [ L(effect k removed) / L(full model) ]     (21)

Where L( ) is the likelihood value. LR follows a chi-squared distribution if effect k is not significant.
If effect k is not significant, whether or not it is removed from the full model of equation (20) will not affect the likelihood value much. It means the value of LR will be close to 0. Otherwise, if the LR value is very large, it means effect k is significant.
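To show what an MLE-based analysis of such data can look like in code, here is a hedged sketch (not the tutorial's implementation; the data shown are hypothetical) of a Weibull log-likelihood with a one-factor life relationship ln(η) = b0 + b1·x, handling right-censored and interval observations and maximized with scipy.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical observations: (kind, value, x) with x the coded factor level.
# kind is "exact" (failure time), "right" (suspension time) or "interval" (t1, t2).
data = [
    ("interval", (14.0, 16.0), -1), ("right", 20.0, -1),
    ("interval", (12.0, 14.0), 1),  ("right", 20.0, 1),
]

def weibull_cdf(t, eta, beta):
    return 1.0 - np.exp(-(t / eta) ** beta)

def neg_log_likelihood(params):
    b0, b1, log_beta = params
    beta = np.exp(log_beta)               # keeps the shape parameter positive
    ll = 0.0
    for kind, val, x in data:
        eta = np.exp(b0 + b1 * x)         # life-factor relationship, equation (20)
        if kind == "exact":               # log of the Weibull pdf, equation (17)
            ll += np.log(beta / eta) + (beta - 1) * np.log(val / eta) - (val / eta) ** beta
        elif kind == "right":             # log of the survival probability at val
            ll += -(val / eta) ** beta
        else:                             # failure somewhere between t1 and t2
            t1, t2 = val
            ll += np.log(weibull_cdf(t2, eta, beta) - weibull_cdf(t1, eta, beta))
    return -ll

result = minimize(neg_log_likelihood, x0=[3.0, 0.0, 1.0], method="Nelder-Mead")
b0_hat, b1_hat, beta_hat = result.x[0], result.x[1], np.exp(result.x[2])
```

The same structure extends to several factors by adding terms to the ln(η) expression, which is how the factor-life relationship in the example below is estimated.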



5.3 Life Test Example

Consider an experiment to improve the reliability of fluorescent lights [2]. Five factors, A-E, are investigated in the experiment. A 2^(5−2) design with factor generators D=AC and E=BC is conducted. The objective is to identify the significant effects that affect the reliability. Two replicates are used for each treatment. The test ends at the 20th day. Inspections are conducted every two days. The experiment results are given in Table 20.

Run   A    B    C    D    E    Failure Time
1     -1   -1   -1   -1   -1   (14,16)   20+
2     -1   -1   1    1    1    (18,20)   20+
3     -1   1    -1   -1   1    (8,10)    (10,12)
4     -1   1    1    1    -1   (18,20)   20+
5     1    -1   -1   1    -1   20+       20+
6     1    -1   1    -1   1    (12,14)   20+
7     1    1    -1   1    1    (16,18)   20+
8     1    1    1    -1   -1   (12,14)   (14,16)

Table 20 – Data for the Life Test Example

20+ means that the test unit was still working at the end of the test, so this experiment has suspension data. (14, 16) means that the failure occurred at a time between the 14th and the 16th day, so this experiment also has interval data. The Weibull model is used as the distribution for the life of the fluorescent lights. The likelihood ratio test table is given below.

Model     Effect   DF   Ln(LKV)    LR        P Value
Reduced   A        1    -20.7181   3.1411    0.0763
          B        1    -24.6436   10.9922   0.0009
          C        1    -19.2794   0.2638    0.6076
          D        1    -25.7594   13.2237   0.0003
          E        1    -21.0727   3.8504    0.0497
Full               7    -19.1475

Table 21 – LR Test Table for the Life Test Example

Table 21 has a layout that is similar to the ANOVA table. This makes it easy to read for engineers who are familiar with ANOVA. From the P value column, we can see that factors A, B, D and E are important to the product life. The estimated factor-life relationship is:

ln(η) = 2.9959 + 0.1052A − 0.2256B − 0.0294C − 0.2477D + 0.1166E  (22)

The estimated shape parameter β for the Weibull distribution is 7.27.
For comparison, the data was also analyzed using traditional linear regression and the ANOVA method. To apply linear regression and ANOVA, the data set was modified by using the center points of the intervals as the failure times for interval data and treating the suspensions as failures. Results are given below.

Source of Variation   DF   Sum of Squares   Mean Squares   F Ratio   P Value
Model                 5    143.3125         28.6625        4.2384    0.025
A                     1    1.5625           1.5625         0.2311    0.6411
B                     1    33.0625          33.0625        4.8891    0.0515
C                     1    3.0625           3.0625         0.4529    0.5162
D                     1    95.0625          95.0625        14.0573   0.0038
E                     1    10.5625          10.5625        1.5619    0.2398
Residual              10   67.625           6.7625
Total                 15   210.9375

Table 23 – ANOVA Table for the Life Test Example

In Table 23, only effects B and D are shown to be significant. The estimated linear regression model is:

Y = 16.9375 + 0.3125A − 1.4375B + 0.4375C − 2.4375D + 0.8127E     (23)

Comparing the results in Tables 21 and 23, we can see that they are quite different. The linear regression and ANOVA method failed to identify A and E as important factors at the significance level of 0.1.
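The LR column and the P values in Table 21 follow directly from equation (21). The short sketch below (illustrative only) reproduces the calculation for effect A from the reported log-likelihood values.

```python
from scipy.stats import chi2

def lr_test(ln_l_reduced, ln_l_full, df=1):
    """Likelihood ratio statistic of equation (21) and its chi-squared P value."""
    lr = -2.0 * (ln_l_reduced - ln_l_full)
    return lr, chi2.sf(lr, df)

# Effect A in Table 21: reduced-model and full-model log-likelihood values
print(lr_test(-20.7181, -19.1475))   # about (3.14, 0.076)
```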



6. CONCLUSION

In this tutorial, simple examples were used to illustrate the basic concepts in DOE. Guidelines for conducting DOE were given. Three major topics were discussed in detail: 2-level factorial designs, RSM and DOE for life tests.
Linear regression and ANOVA are the important tools in DOE data analysis, so they are emphasized. For DOE involving censored data, the better method of MLE and the likelihood ratio test should be used.
DOE involves many different statistical methods. Many useful techniques, such as blocking and randomization, random and mixed effect models, model diagnostics, power and sample size, measurement system studies, RSM with multiple responses, D-optimal designs, Taguchi orthogonal arrays, Taguchi robust designs, mixture designs and so on, are not covered in this tutorial [1, 2, 5, 6]. However, with the basic knowledge of this tutorial, readers should be able to learn most of them easily.

REFERENCES

1. D. C. Montgomery, Design and Analysis of Experiments, 5th edition, John Wiley and Sons, Inc., New York, 2001.
2. C. F. J. Wu and M. Hamada, Experiments: Planning, Analysis, and Parameter Design Optimization, John Wiley and Sons, Inc., New York, 2000.
3. ReliaSoft, http://www.ReliaSoft.com/doe/index.htm
4. ReliaSoft, Life Data Analysis Reference, ReliaSoft Corporation, Tucson, 2007.
5. R. H. Myers and D. C. Montgomery, Response Surface Methodology: Process and Product Optimization Using Designed Experiments, 2nd edition, John Wiley and Sons, Inc., New York, 2002.
6. ReliaSoft, Experiment Design and Analysis Reference, ReliaSoft Corporation, Tucson, 2008.



Design of Experiments (DOE) and Data Analysis
Huairui (Harry) Guo, Ph.D.
Adamantios Mettas
San Jose, CA USA, January 25-28, 2010

Outline
• Introduction
  o Why DOE
  o What DOE Can Do
  o Common Design Types
  o General Guidelines for Conducting DOE
• Statistical Background
  o Linear Regression and Analysis of Variance (ANOVA)
• Two Level Factorial Design
• Response Surface Method
• Reliability DOE

Introduction
(Section divider: Introduction | Statistical Background | Two Level Factorial Design | Response Surface Method | Reliability DOE | Summary)

Why DOE
• DOE Makes Your Life Easier
• Example 1
  o Usage voltage: V = 10v
  o Test all the 50 units at V = 25v to predict the reliability at V = 10v
• Example 2
  o Usage temperature T = 40 (Celsius) and humidity H = 50%
  o Accelerated life testing to predict reliability at usage level

  Num of Units   Temperature (Celsius)   Humidity (%)
  25             120                     95
  25             85                      85

What DOE Can Do
• Comparisons
• Variable screening
• Transfer function exploration
• System optimization
• System robustness

Common Design Types
• For comparison
  o One factor design
• For variable screening
  o 2 level factorial design
  o Plackett-Burman design
  o Taguchi orthogonal array design
• For transfer function identification and optimization
  o Central composite design
  o Box-Behnken design
• For system robustness
  o Taguchi robust design



General Guidelines for Conducting DOE
• Clarify and state objective
• Choose responses
• Choose factors and levels
• Choose experimental design
• Perform the experiment
• Analyze the data
• Draw conclusions and make recommendations

Linear Regression (cont'd)
• Model: Y = β0 + β1X1 + ... + βpXp + ε
• Assumptions
  o ε: the random error or noise, ε ~ N(0, σ²); it is assumed to be normally distributed with mean of 0 and variance of σ²
  o Y: for a given model, the response is also normally distributed, and Var(Y) = σ²
• Parameter Estimation: Least Squares Estimation

Statistical Background
(Section divider: Introduction | Statistical Background | Two Level Factorial Design | Response Surface Method | Reliability DOE | Summary)

Statistical Background
• Linear regression
  o It is the foundation for DOE data analysis.
• ANOVA (analysis of variance)
  o ANOVA is a way to present the findings from the linear regression model.
  o ANOVA can tell you if there is a strong relationship between the independent variables, Xs, and the response variable, Y.
  o ANOVA can test whether or not an individual X can affect Y significantly.

Linear Regression
• Regression analysis is a statistical technique that attempts to explore and model the relationship between two or more variables.
• A linear regression model attempts to explain the relationship between two or more variables using a straight line.
[Figure: scatter plot of temperature anomaly (°F) versus year, 1880-2000, with a fitted straight line]



Example: Linear Regression
Model: Y = β0 + β1X1 + β2X2 + ε
• Within-run Variations
  o For the same values of X1 and X2, the observed Y values are different.
  o This difference of Y is caused by noise.
• Between-run Variations
  o For different values of X1 and X2, the observed mean values of Y are different.
  o This difference of Y is caused by the different values of the Xs.

Measure the Variation
• Sum of Squares: SST = Σ (Yi − Ȳ)²
  o n: number of observations (it is 6 for this example).
• Mean Squares: MST = SST/(n − 1) = [1/(n − 1)] Σ (Yi − Ȳ)²
  o Mean Squares is the normalized variation of a data set.
  o It is also the unbiased estimation of the variance of Y.

Partition the Total Variation
• The total variation of Y includes two parts:
  o Between-run variation
  o Within-run variation
  SST = SSR + SSE = Σ (Ŷi − Ȳ)² + Σ (Yi − Ŷi)²
  o SSR: Sum of Squares of Regression. The amount of variation explained by the Xs.
  o SSE: Sum of Squares of Error. The amount of variation caused by noise.
• Mean squares of regression and error: MSR = SSR/p, MSE = SSE/(n − 1 − p)
  o n: number of observations; p: number of Xs (it is 2 for the previous example).

ANOVA Table for the Example

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Squares   F Ratio   P Value
Model                 2                    6.14E+04         3.07E+04       36.86     0.0077
X1                    1                    6.00E+04         6.00E+04       72.03     0.0034
X2                    1                    8100             8100           9.72      0.0526
Residual              3                    2500             833.3333
Total                 5                    6.39E+04

• Residual is the estimated value for error or noise.
• P Value: the smaller the P value is, the more significant the corresponding source. The P value is compared with a given significance level, alpha. If a P value is less than the alpha, the corresponding source is significant at the significance level of alpha. The commonly used alpha values are 0.01, 0.05 and 0.1.
• The Model and X1 are significant at the level of 0.05.
• The Model, X1 and X2 are significant at the level of 0.1.

Partition the Sum of Squares of Regression


 The SSR can be divided into the variation caused by each
Analysis of Variance (ANOVA)
variable:
MS R
SS R = SS X1 + SS X 2 + ... + SS X p  F test for the regression model: F0 =
MS E
H0: There is no difference between the variation caused by Xs
 Each X has its own mean squares
and the variation caused by noise.
H1: The variation caused by Xs is larger than the variation
MS X 1 , MS X 2 ,..., MS X P caused by noise.
 The mean squares of each source is compared with MSE MS X i
 F test for each variable xi: F0 =
 If MSXi is significantly larger than MSE, Xi significantly affects Y.
MS E
 If it is not significantly larger than MSE, Xi’s effect is close to the H0: There is no difference between the variation caused by Xi and
effect of noise. Xi doesn’t significantly affect Y. the variation caused by noise.
H1: The variation caused by Xi is larger than the variation
2010 RAMS –Tutorial DOE – Guo and Mettas 15
caused by noise.

2010 RAMS –Tutorial DOE – Guo and Mettas 16
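To make the computations behind this kind of table concrete, the following illustrative Python sketch (not from the tutorial) builds a sequential ANOVA table for the two-variable model; SciPy's F distribution supplies the P values. The function name and layout are placeholders.

import numpy as np
from scipy import stats

def anova_table(x1, x2, y):
    """Sequential ANOVA for Y = b0 + b1*X1 + b2*X2 + error, mirroring the
    SST = SSR + SSE partition described above."""
    x1, x2, y = (np.asarray(v, dtype=float) for v in (x1, x2, y))

    def sse(*columns):
        X = np.column_stack([np.ones_like(y), *columns])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return float(np.sum((y - X @ beta) ** 2))

    n, p = len(y), 2
    sst = float(np.sum((y - y.mean()) ** 2))   # total variation
    sse_full = sse(x1, x2)                     # variation caused by noise
    ss_x1 = sst - sse(x1)                      # variation explained by X1
    ss_x2 = sse(x1) - sse_full                 # extra variation explained by X2
    df_err = n - 1 - p
    mse = sse_full / df_err
    for name, df, ss in [("Model", p, ss_x1 + ss_x2), ("X1", 1, ss_x1), ("X2", 1, ss_x2)]:
        ms = ss / df
        f0 = ms / mse
        print(name, df, round(ss, 2), round(ms, 2), round(f0, 2),
              round(stats.f.sf(f0, df, df_err), 4))
    print("Residual", df_err, round(sse_full, 2), round(mse, 2))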



T-test for the Coefficients
 T-test is an alternative to the F-test.
 T-test is used to test whether or not a coefficient is 0.
 For example,
   H0: β1 = 0;  H1: β1 ≠ 0
 The test statistic is:
   T0 = β̂1 / se(β̂1)
   β̂1 is the estimated value for β1; se(β̂1) is its standard error.
 Under the null hypothesis, T0 follows the t distribution with the degrees of freedom (df) equal to the df of the error term.

2010 RAMS –Tutorial DOE – Guo and Mettas 18

T-test Results

Term        Coefficient   Standard Error   T Value   P Value
Intercept   -2135         588.7624         -3.6263   0.0361
X1          7             0.8248           8.487     0.0034
X2          18            5.7735           3.1177    0.0526

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Squares   F Ratio   P Value
Model                 2                    6.14E+04         3.07E+04       36.86     0.0077
X1                    1                    6.00E+04         6.00E+04       72.03     0.0034
X2                    1                    8100             8100           9.72      0.0526
Residual              3                    2500             833.3333
Total                 5                    6.39E+04

Note: The P values from the T-test are the same as in the F-test.

2010 RAMS –Tutorial DOE – Guo and Mettas 19

Pareto Chart for the T-Values
[Figure: DOE++ Pareto chart of the T values for the terms X1 and X2, with Alpha = 0.1 and threshold (critical value) = 2.3534. Both terms exceed the threshold and are shown as significant.]

2010 RAMS –Tutorial DOE – Guo and Mettas 20

Tutorial: Design of Experiments (DOE) and Data Analysis
 Introduction
 Statistical Background
 Two Level Factorial Design
 Response Surface Method
 Reliability DOE
 Summary

Two Level Factorial Design

2010 RAMS –Tutorial DOE – Guo and Mettas 21

Some Basic Concepts in DOE
 Treatment: A unique combination of all the factors.
 Replicate: The number of test units under each treatment.
   Treatment = 4, Replicate = 1 (for the two-factor example shown on the slide)
 Design notation: 2^k, where k is the number of factors.

2010 RAMS –Tutorial DOE – Guo and Mettas 23
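For reference, the 2^k treatments can be enumerated with a few lines of Python (an illustrative sketch, not from the tutorial):

from itertools import product

def two_level_full_factorial(k):
    """All 2^k treatments of a two-level full factorial design,
    using the -1 / +1 coding of the low and high levels."""
    return [list(levels) for levels in product((-1, 1), repeat=k)]

# For k = 2 this returns the 4 treatments:
# [[-1, -1], [-1, 1], [1, -1], [1, 1]]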



Two Level Factorial Design
 Why two levels?
 To study the effect of a factor or a variable, at least two different settings are needed. High = 1, Low = -1.
 From a linear regression viewpoint, you need at least two points to fit a straight line.
 Answer to the first example
 Example 1
 Usage voltage: V = 10 V
 Test all the 50 units at V = 25 V to predict the reliability at V = 10 V.

2010 RAMS –Tutorial DOE – Guo and Mettas 22

Balanced and Orthogonal
 Balanced: The number of test units is the same for all the treatments.
 Orthogonal: The sum of the element-wise product of any two factor columns is 0.
   (−1 × −1) + (1 × −1) + (−1 × 1) + (1 × 1) = 0
 A balanced and orthogonal design can evaluate the effect of each factor more accurately.
 If you add one more sample for any one of the treatments, the design will be unbalanced and non-orthogonal.

2010 RAMS –Tutorial DOE – Guo and Mettas 24

Linear Model for a Two-Level Factorial Design
   Y = β0 + β1X1 + β2X2 + β12X1X2 + ε
   X1: Factor A, X2: Factor B
 Main effects: A (X1) and B (X2)
 Two-way interaction effect: AB (X1X2)
 The number of factors in an interaction effect is the order of the interaction.
 ANOVA is used to test whether or not an effect is significant.

2010 RAMS –Tutorial DOE – Guo and Mettas 25

A Simple Way to Estimate the Coefficients
 Least Squares Estimation is used for any design.
 Special case: only valid for a balanced design
   [Figure: a two-factor example with observed responses 20 and 25 at the low level of A, and 30 and 35 at the high level of A.]
   Effect of A = Avg. at A high − Avg. at A low = (30 + 35)/2 − (20 + 25)/2 = 10
   Effect of A = 2β1

2010 RAMS –Tutorial DOE – Guo and Mettas 26

Two Level Fractional Factorial Design
 Why fractional factorial design?
 Reduce the sample size. For a 7 factor design, you need 2^7 = 128 runs.
 Reduce cost and time by only using part of the full design.
 Why fractional factorial design works
 Sparsity of effects principle: Most of the time, responses are affected by only a small number of main effects and lower order interactions.
 Fractional factorial design: Focus on main effects and lower order interactions.

2010 RAMS –Tutorial DOE – Guo and Mettas 27

Half Fractional Factorial Design 2^(3-1)

Order   A    B    AB   C    AC   BC   ABC
2       1    -1   -1   -1   -1   1    1
3       -1   1    -1   -1   1    -1   1
5       -1   -1   1    1    -1   -1   1
8       1    1    1    1    1    1    1

 From this experiment, the effects of AB and C, AC and B, and BC and A cannot be distinguished. Each pair of columns (AB and C, A and BC, B and AC) has the same pattern.
 Effects that cannot be distinguished within a design are called Confounded Effects or Aliased Effects.
 The summary of the aliased effects for the design is called the Alias Structure:
   [I] = I + ABC;  [A] = A + BC;  [B] = B + AC;  [C] = C + AB
 I = ABC is called the Defining Relation.

2010 RAMS –Tutorial DOE – Guo and Mettas 29
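Referring back to the "A Simple Way to Estimate the Coefficients" slide above, here is a minimal sketch (illustrative, not from the tutorial) of that averaging shortcut for a balanced two-level design:

import numpy as np

def main_effect(column, y):
    """Main effect of a factor in a balanced two-level design:
    (average response at the high level) - (average response at the low level).
    The regression coefficient is half of the effect (Effect of A = 2*b1)."""
    column, y = np.asarray(column), np.asarray(y, dtype=float)
    effect = y[column == 1].mean() - y[column == -1].mean()
    return effect, effect / 2.0

# With responses 20, 25 at the low level of A and 30, 35 at the high level,
# main_effect([-1, -1, 1, 1], [20, 25, 30, 35]) returns (10.0, 5.0).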



Design Matrix for a 2^3 Design
[Table: the full 2^3 design matrix with all eight treatments and columns for A, B, C and their interactions.]
 If effect ABC can be ignored in the study, then we can select the treatments with ABC = 1 for a Fractional Factorial Design.

2010 RAMS –Tutorial DOE – Guo and Mettas 28

Generate a Fractional Factorial Design

Order   A    B    C    AB   AC   BC   ABC
2       1    -1   -1   -1   -1   1    1
3       -1   1    -1   -1   1    -1   1
5       -1   -1   1    1    -1   -1   1
8       1    1    1    1    1    1    1

 The design is a Full Factorial Design for A and B.
 A and B are called Basic Factors.
 The Full Factorial Design for A and B is called the Basic Design.
 Factor C can be generated from C = AB.
 C = AB is called the Factor Generator for C.

2010 RAMS –Tutorial DOE – Guo and Mettas 30

Review the Second Example
 Example 2
 Usage temperature T = 40 (C) and humidity H = 50%.
 Accelerated life test to predict reliability at the usage level.

Num of Units   Temperature (Celsius)   Humidity (%)
25             120 (+1)                95 (+1)
25             85 (-1)                 85 (-1)

 It is a bad design: Main effects Temperature and Humidity are Confounded.
 From the viewpoint of Linear Regression, at least 3 unique combinations are needed (why?):
   Y = β0 + β1X1 + β2X2 + ε

2010 RAMS –Tutorial DOE – Guo and Mettas 31

Fractional Factorial Design Example (cont'd)
 With five factors, the total number of runs required for a full factorial is 2^5 = 32. These runs are too expensive for the manufacturer.
 Only the main effects and the two factor interactions are assumed to be important.
 It is decided to carry out the investigation using the 2^(5-1) design (requiring 16 runs) with the factor generator E = ABCD.

2010 RAMS –Tutorial DOE – Guo and Mettas 33
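The 2^(5-1) design mentioned above can be constructed the same way a generated factor was introduced for C = AB: build the basic 2^4 design in A-D and generate E = ABCD. A short illustrative sketch (not from the tutorial):

from itertools import product

def build_2_5_1_design():
    """16-run half fraction: full factorial in the basic factors A-D,
    with the fifth factor generated as E = A*B*C*D (generator E = ABCD)."""
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):   # basic design in A, B, C, D
        runs.append((a, b, c, d, a * b * c * d))    # append the generated factor E
    return runs

print(len(build_2_5_1_design()))   # 16 runs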

ANOVA Table with All Effects

Source of Variation   DF   Sum of Squares   Mean Squares   F    P Value
Model                 15   5775.4375        385.0292       -    -
A                     1    495.0625         495.0625       -    -
B                     1    4590.0625        4590.0625      -    -
C                     1    473.0625         473.0625       -    -
D                     1    3.0625           3.0625         -    -
E                     1    1.5625           1.5625         -    -
AB                    1    189.0625         189.0625       -    -
AC                    1    0.5625           0.5625         -    -
AD                    1    5.0625           5.0625         -    -
AE                    1    5.0625           5.0625         -    -
BC                    1    1.5625           1.5625         -    -
BD                    1    0.0625           0.0625         -    -
BE                    1    0.0625           0.0625         -    -
CD                    1    3.0625           3.0625         -    -
CE                    1    0.5625           0.5625         -    -
DE                    1    7.5625           7.5625         -    -
Residual              0
Total                 15   5775.4375

Because the 16-run design is used to estimate 15 effects, no degrees of freedom are left for the residual, so the F ratios and P values cannot be computed from this saturated model.

2010 RAMS –Tutorial DOE – Guo and Mettas 35



Fractional Factorial Design Example
 The yield in the manufacture of integrated circuits is thought to be affected by the following five factors*:

Factor   Name               Unit      Low     High
A        Aperture Setting             small   large
B        Exposure Time      minutes   20      40
C        Develop Time       seconds   30      45
D        Mask Dimension               small   large
E        Etch Time          minutes   14.5    15.5

 The objective of the experimentation is to increase the yield by finding the significant effects.
* Montgomery, D. C., 2001

2010 RAMS –Tutorial DOE – Guo and Mettas 32

Pool Non-Significant Effects to Error
 Non-significant effects are treated as "noise" and are used to estimate the error.

2010 RAMS –Tutorial DOE – Guo and Mettas 37

Design Matrix and Observations

Run Order   A    B    C    D    E    Yield
1           1    -1   -1   1    1    10
2           1    -1   1    1    -1   21
3           -1   1    1    -1   1    45
4           -1   -1   1    -1   -1   16
5           1    1    -1   -1   1    52
6           1    1    1    -1   -1   60
7           -1   1    -1   1    1    30
8           -1   -1   1    1    1    15
9           1    -1   -1   -1   -1   9
10          -1   1    -1   -1   -1   34
11          -1   -1   -1   -1   1    8
12          1    1    -1   1    -1   50
13          -1   1    1    1    -1   44
14          -1   -1   -1   1    -1   6
15          1    -1   1    -1   1    22
16          1    1    1    1    1    63

 Run Order: The test sequence of the treatments. The experiment is conducted in a Random Sequence of the factor combinations.

2010 RAMS –Tutorial DOE – Guo and Mettas 34

Response Surface Methods (RSM)
 RSM is used to estimate the transfer function at the optimal region.
 The Quadratic model is used in RSM.
 The estimated function is used for optimization.
 Linear regression and ANOVA are the tools for data analysis.

2010 RAMS –Tutorial DOE – Guo and Mettas 39

Identify Significant Effects
[Figure: DOE++ normal probability plot of the effects for the yield experiment (Alpha = 0.1; Lenth's PSE = 0.9375). The labeled effects that fall away from the distribution line, flagged as significant, are B (Exposure Time), A (Aperture Setting), C (Develop Time) and AB; the remaining effects lie along the line as non-significant.]
 Lenth's method assumes all the effects should be normally distributed with a mean of 0, given the hypothesis that they are not significant.

2010 RAMS –Tutorial DOE – Guo and Mettas 36

Design Matrix for the Initial Investigation
 The initial design is a modified factorial design. The Design Matrix is:

Std. Order   Point Type   A: Temperature   B: Reaction Time   Yield (%)
1            1            -1               -1                 33.95
2            1            1                -1                 36.35
3            1            -1               1                  35
4            1            1                1                  37.25
5            0            0                0                  35.45
6            0            0                0                  35.75
7            0            0                0                  36.05
8            0            0                0                  35.3
9            0            0                0                  35.9

 The initial design at the current operation condition is used to search the direction for locating the optimal region for optimization.

2010 RAMS –Tutorial DOE – Guo and Mettas 41



Tutorial: Design of Experiments (DOE) and Data Analysis
 Introduction
 Statistical Background
 Two Level Factorial Design
 Response Surface Method
 Reliability DOE
 Summary

Response Surface Method

2010 RAMS –Tutorial DOE – Guo and Mettas 38

RSM: Example
 The yield from a chemical process has been found to be affected by two factors:
 Reaction Temperature
 Reaction Time
 The current operating conditions of 230 Fahrenheit and 65 minutes give a yield of about 35%.
 The engineer decides to explore the current conditions in the range (225, 235) Fahrenheit and (55, 75) minutes so that steps can be taken to achieve the maximum yield.

2010 RAMS –Tutorial DOE – Guo and Mettas 40

Coded Values and Center Points
 Coded value (-1, 0, 1)
   Coded value = (Actual Value − Middle Value of High and Low) / (Half of the Range Between High and Low)
 For example, the temperature range is (Low = 225, High = 235), so when Temperature = 230, the corresponding coded value is 0.
 Reasons for using Center Points
 Replicates at Center Points can be used to estimate the random error.
 Replicates at Center Points will keep the Orthogonality of the design.
 Can be used to check whether or not the Linear Model is sufficient or whether the curvature is significant.

2010 RAMS –Tutorial DOE – Guo and Mettas 42

Results for the Initial Experiment
 ANOVA Table
 * Curvature and AB are not significant.
 The final linear regression model (in coded values) with only significant terms:
   Y = 35.6375 + 1.1625 X1 + 0.4875 X2

2010 RAMS –Tutorial DOE – Guo and Mettas 43

Path of Steepest Ascent

                     Coded          Actual
Step                 A      B       A      B      Yield (%)
Current Operation    0      0       230    65     35
1                    2.4    1       242    75     36.5
2                    4.8    2       254    85     39.35
3                    7.2    3       266    95     45.65
4                    9.6    4       278    105    49.55
5                    12     5       290    115    55.7
6                    14.4   6       302    125    64.25
7                    16.8   7       314    135    72.5
8                    19.2   8       326    145    80.6
9                    21.6   9       338    155    91.4
10                   24     10      350    165    95.45
11                   26.4   11      362    175    89.3
12                   28.8   12      374    185    87.65

2010 RAMS –Tutorial DOE – Guo and Mettas 45
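A sketch of how such a path can be generated from the fitted first-order model (illustrative, not from the tutorial). The slide normalizes the direction (1.1625, 0.4875) to roughly (2.4, 1), and the coded-to-actual decoding uses the middle values 230 F / 65 min with half-ranges 5 F / 10 min:

import numpy as np

def steepest_ascent_path(b1, b2, steps, center=(230.0, 65.0), half_range=(5.0, 10.0)):
    """Settings along the path of steepest ascent for Y = b0 + b1*X1 + b2*X2,
    stepping so that X2 increases by one coded unit per step."""
    direction = np.array([b1 / b2, 1.0])              # e.g. (1.1625, 0.4875) -> (2.38, 1)
    for step in range(steps + 1):
        coded = step * direction
        actual = np.array(center) + coded * np.array(half_range)
        print(step, np.round(coded, 2), np.round(actual, 1))

steepest_ascent_path(1.1625, 0.4875, steps=12)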



Factorial Design Around the Optimal Region

Std. Order   Point Type   A: Temperature (F)   B: Reaction Time (min)   Yield (%)
1            1            345                  155                      89.75
2            1            355                  155                      90.2
3            1            345                  175                      92
4            1            355                  175                      94.25
5            0            350                  165                      94.85
6            0            350                  165                      95.45
7            0            350                  165                      95
8            0            350                  165                      94.55
9            0            350                  165                      94.7

 At the optimal region, the curvature is significant. The simple linear model is not enough any more.

2010 RAMS –Tutorial DOE – Guo and Mettas 47

Designs for Quadratic Model
 Quadratic model
   Y = β0 + β1X1 + β2X2 + β11X1² + β22X2² + β12X1X2 + ε
 Special designs are used for the Quadratic model. Central Composite Design (CCD) is one of them.
 CCD can be directly augmented from a regular factorial design.

2010 RAMS –Tutorial DOE – Guo and Mettas 48

Direction of Steepest Ascent
   Y = 35.6375 + 1.1625 X1 + 0.4875 X2
 The coefficients show that the direction is (1.1625, 0.4875), or it can be normalized to (2.4, 1).
 The direction of steepest ascent is perpendicular to the contour lines.

2010 RAMS –Tutorial DOE – Guo and Mettas 44

Augment a Factorial Design to a Central Composite Design
 Using a 2 factor factorial design as an example:
[Diagram: the four factorial points (-1,-1), (1,-1), (-1,1), (1,1) and the center point (0, 0) are augmented with four axial points (-α, 0), (α, 0), (0, -α), (0, α) to form the central composite design.]
 The points outside the rectangle are called axial points or star points.

2010 RAMS –Tutorial DOE – Guo and Mettas 49
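A small sketch of the augmentation (illustrative, not from the tutorial); the defaults mirror this example, where the axial distance is about 1.414 in coded units and five center runs are used:

from itertools import product

def central_composite_design(k=2, alpha=1.414, n_center=5):
    """Augment a 2^k factorial (coded -1/+1) with 2k axial (star) points
    at distance alpha and with n_center center runs."""
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    axial = []
    for i in range(k):
        for value in (-alpha, alpha):
            point = [0.0] * k
            point[i] = value
            axial.append(point)
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + center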

Further Investigation at the Optimal Region
 From the sequential tests, we found that the setting at step 10 (Temp = 350, Time = 165) is close to the optimal region.
 The region around the setting at step 10 (Temp = 350, Time = 165) should be investigated.
 A Factorial Design with a center point of (350, 165) was conducted.

2010 RAMS –Tutorial DOE – Guo and Mettas 46

The Complete CCD Matrix for the Example

Std. Order   Point Type   A: Temperature (F)   B: Reaction Time (min)   Yield (%)
1            1            345                  155                      89.75
2            1            355                  155                      90.2
3            1            345                  175                      92
4            1            355                  175                      94.25
5            0            350                  165                      94.85
6            0            350                  165                      95.45
7            0            350                  165                      95
8            0            350                  165                      94.55
9            0            350                  165                      94.7
10           -1           342.93               165                      90.5
11           -1           357.07               165                      92.75
12           -1           350                  150.86                   88.4
13           -1           350                  179.14                   92.6

 Four Star Points are added to the previous factorial design.

2010 RAMS –Tutorial DOE – Guo and Mettas 51



How to Decide α
 Several methods have been developed for deciding α.
 Special values of α can make the design have special properties, such as a design that can better estimate the model parameters, or that can better explore the optimal region.
 A commonly used method is:
   α = [ 2^(k−f) (nf) / ns ]^(1/4)
   nf is the number of replicates of the runs in the original factorial design.
   ns is the number of replicates of the runs at the axial points.
   2^(k−f) represents the original factorial or fractional factorial design.

2010 RAMS –Tutorial DOE – Guo and Mettas 50

Results from the Experiment
 ANOVA Table
 Model in terms of coded values:
   Y = 94.91 + 0.74 X1 + 1.53 X2 − 1.52 X1² − 2.08 X2² + 0.45 X1X2

2010 RAMS –Tutorial DOE – Guo and Mettas 52

Model Diagnostic
 Check the residual plots
[Figure: DOE++ normal probability plot of the residuals for the yield model; Anderson-Darling = 0.3202, p-value = 0.4906.]

2010 RAMS –Tutorial DOE – Guo and Mettas 53

Model Diagnostic (cont'd)
 Residuals against run order plot
[Figure: DOE++ plot of residuals vs. run order, with Alpha = 0.1; upper critical value = 0.6209, lower critical value = -0.6209.]

2010 RAMS –Tutorial DOE – Guo and Mettas 54

Optimization
   Y = 94.91 + 0.74 X1 + 1.53 X2 − 1.52 X1² − 2.08 X2² + 0.45 X1X2
[Figure: DOE++ optimal solution plot of factor value vs. response for the two factors; the response is maximized at Y = 95.3264 with A: Temperature = 351.5045 and B: Reaction Time = 168.9973.]
 Optimal Setting (Temperature = 351.5, Time = 169.0)
 Predicted Yield at the Optimal Setting is 95.3%

2010 RAMS –Tutorial DOE – Guo and Mettas 55
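The optimal setting can be reproduced from the fitted quadratic model with a short sketch (illustrative; DOE++ performs its own optimization). Using the coded model above and decoding with center 350 F / 165 min and half-ranges 5 F / 10 min:

from scipy.optimize import minimize

def predicted_yield(x):
    x1, x2 = x   # coded temperature and reaction time
    return 94.91 + 0.74*x1 + 1.53*x2 - 1.52*x1**2 - 2.08*x2**2 + 0.45*x1*x2

res = minimize(lambda x: -predicted_yield(x), x0=[0.0, 0.0])   # maximize the yield
x1_opt, x2_opt = res.x
print(350 + 5 * x1_opt, 165 + 10 * x2_opt, predicted_yield(res.x))
# approximately 351.5 F, 169.0 min, and a predicted yield of about 95.3%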



Tutorial: Design of Experiments (DOE) and Data Analysis
 Introduction
 Statistical Background
 Two Level Factorial Design
 Response Surface Method
 Reliability DOE
 Summary

Reliability DOE (R-DOE)

2010 RAMS –Tutorial DOE – Guo and Mettas 56

DOE For Life Testing
 The response is the failure time.
 The failure time usually is not normally distributed, so linear regression and ANOVA are not good for life data.
 If a unit is still running at the time when the test ends, you only have a suspension time, not a failure time, for this unit.
 The correct analysis method is needed for reliability DOE:
 A method that can handle lifetime distributions such as Weibull, lognormal and exponential.
 A method that can handle suspension times correctly.
 A method that can evaluate the significance of each effect, similar to ANOVA.

2010 RAMS –Tutorial DOE – Guo and Mettas 57

Data Types
 Complete data
 Censored data
 Right Censored
 Interval Censored

2010 RAMS –Tutorial DOE – Guo and Mettas 58

Right Censored (Suspended) Data: Example
 Imagine that we tested five units and three failed. In this scenario, our data set is composed of the times-to-failure of the three units that failed and the running time of the other two units that did not fail.

2010 RAMS –Tutorial DOE – Guo and Mettas 59

Interval Censored Data: Example
 Imagine that we are running a test on five units and inspecting them every 100 hours. If a unit failed between inspections, we do not know exactly when it failed, but rather that it failed between inspections. This is also called "inspection data."

2010 RAMS –Tutorial DOE – Guo and Mettas 60

Distributions Commonly Used in Reliability
 Weibull distribution pdf:
   f(t) = (β/η) (t/η)^(β−1) exp[ −(t/η)^β ]
 Lognormal distribution pdf:
   f(t) = [1 / (σ t √(2π))] exp[ −(1/2) ((ln(t) − µ)/σ)² ]
 Exponential distribution pdf:
   f(t) = (1/m) exp(−t/m)

2010 RAMS –Tutorial DOE – Guo and Mettas 61
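As an aside (not from the tutorial), these three pdfs correspond to the following SciPy parameterizations, which can be handy when reproducing the calculations:

import numpy as np
from scipy import stats

def life_distributions(beta, eta, mu, sigma, m):
    """Weibull (shape beta, scale eta), lognormal (log-mean mu, log-std sigma)
    and exponential (mean m) in SciPy's parameterizations."""
    return (stats.weibull_min(c=beta, scale=eta),
            stats.lognorm(s=sigma, scale=np.exp(mu)),
            stats.expon(scale=m))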



Combining Reliability and DOE: Life-Factor Relationship
 The graphic shows an example where life decreases when a factor is changed from the low level to the high level.
 It is seen that the pdf changes in scale only. The scale of the pdf is compressed at the high level.
 The failure mode remains the same. Only the time of occurrence decreases at the high level.

2010 RAMS –Tutorial DOE – Guo and Mettas 62

Life-Factor Relationship
 Instead of considering the entire scale of the pdf, the life characteristic can be chosen to investigate the effect of potential factors on life.
 The life characteristics for the three commonly used distributions are:
   Weibull: η     Lognormal: µ     Exponential: m

2010 RAMS –Tutorial DOE – Guo and Mettas 63

Simplify: Infer a Characteristic Life-Factor Relationship
 Using the life characteristic, the model to investigate the effect of factors on life can be expressed as:
   µ' = β0 + β1x1 + β2x2 + ... + β12x1x2 + ...
   where µ' = ln(η), or µ' = µ, or µ' = ln(m); xj is the jth factor value.
 Note that a logarithmic transformation is applied to the life characteristics of the Weibull and exponential distributions.
 This is because η and m can take only positive values.

2010 RAMS –Tutorial DOE – Guo and Mettas 64

The Likelihood Function Based on the Life-Factor Relationship
 Life-Factor Relationship: µ'i = β0 + β1xi1 + β2xi2 + ... + β12xi1xi2 + ...
 Failure Time Data:  Lf = ∏ f(Ti; µ'i, σ),  i = 1, ..., N
 Suspension Data:    Ls = ∏ R(Sj; µ'j, σ),  j = 1, ..., M
 Interval Data:      LI = ∏ [ F(IUl; µ'l, σ) − F(ILl; µ'l, σ) ],  l = 1, ..., P
 The total likelihood is L = Lf × Ls × LI.
 MLE is used to estimate the model parameters (β0, β1, β2, ... and σ for the lognormal).

2010 RAMS –Tutorial DOE – Guo and Mettas 65

Testing Effect Significance: Likelihood Ratio Test
 Life-factor relationship: µ'i = β0 + β1xi1 + β2xi2 + ... + β12xi1xi2 + ...
 Likelihood ratio test:
   LR(effect k) = −2 ln [ L(effect k removed) / L(full model) ]
 If LR(effect k) > χ²(1, α), then effect k is significant.

2010 RAMS –Tutorial DOE – Guo and Mettas 66
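A rough sketch of how this likelihood and the LR test could be coded with SciPy for a Weibull life distribution (illustrative only; this is not ReliaSoft's implementation, and the data layout below is invented for the example):

import numpy as np
from scipy import stats

def weibull_rdoe_negloglik(params, X, failures, suspensions, intervals):
    """Negative log-likelihood for ln(eta_i) = X_i . b with a common shape.
    params = [b0, b1, ..., ln(shape)]; X has a leading column of ones.
    failures    : list of (row, time)
    suspensions : list of (row, time)
    intervals   : list of (row, lower, upper)"""
    b, shape = params[:-1], np.exp(params[-1])
    eta = np.exp(X @ b)
    ll = 0.0
    for i, t in failures:                 # failure times: log pdf
        ll += stats.weibull_min.logpdf(t, shape, scale=eta[i])
    for i, t in suspensions:              # suspensions: log reliability
        ll += stats.weibull_min.logsf(t, shape, scale=eta[i])
    for i, lo, hi in intervals:           # interval data: log[F(hi) - F(lo)]
        ll += np.log(stats.weibull_min.cdf(hi, shape, scale=eta[i])
                     - stats.weibull_min.cdf(lo, shape, scale=eta[i]))
    return -ll

def likelihood_ratio(negll_reduced, negll_full, alpha=0.1):
    """LR = -2 ln(L_reduced / L_full); the effect is significant
    if LR exceeds the chi-square critical value with 1 df."""
    lr = 2.0 * (negll_reduced - negll_full)
    return lr, lr > stats.chi2.ppf(1.0 - alpha, df=1)

Minimizing the first function with scipy.optimize.minimize, once for the full model and once with an effect removed, gives the two log-likelihood values needed by the LR test.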



R-DOE: Example
 Consider an experiment to improve the reliability of fluorescent lights. Five factors A-E are investigated in the experiment. A 2^(5-2) design with factor generators D = AC and E = BC was conducted.*
 Objective: To identify the significant factors and adjust them to improve life.
* Taguchi, 1987, p. 930.

2010 RAMS –Tutorial DOE – Guo and Mettas 67

R-DOE: Example (cont'd)

A    B    C    D    E    Failure Time
-1   -1   -1   1    1    14~16    20+
-1   -1   1    -1   -1   18~20    20+
-1   1    -1   1    -1   8~10     10~12
-1   1    1    -1   1    18~20    20+
1    -1   -1   -1   1    20+      20+
1    -1   1    1    -1   12~14    20+
1    1    -1   -1   -1   16~18    20+
1    1    1    1    1    12~14    14~16

 Two replicates at each treatment.
 Inspections were conducted every two hours.
 Results have interval data and suspensions.

2010 RAMS –Tutorial DOE – Guo and Mettas 68

Traditional DOE Approach
 Assumes that the response (life) is normally distributed.
 Treats suspensions as failures.
 Uses the middle point of the interval data as the failure time.
 Problem: The above assumptions and adjustments are incorrect.

2010 RAMS –Tutorial DOE – Guo and Mettas 69

Fluorescent Lights R-DOE: Life-Factor Relationship
 For the ith observation:
   µ'i = β0 + β1Ai + β2Bi + β3Ci + β4Di + β5Ei + ...

A    B    C    D    E    Failure Time
-1   -1   -1   1    1    14~16    20+
-1   -1   1    -1   -1   18~20    20+
-1   1    -1   1    -1   8~10     10~12
-1   1    1    -1   1    18~20    20+
1    -1   -1   -1   1    20+      20+
1    -1   1    1    -1   12~14    20+
1    1    -1   -1   -1   16~18    20+
1    1    1    1    1    12~14    14~16

 For example, for the first run, the equation is:
   µ'1 = β0 + β1 × (−1) + β2 × (−1) + β3 × (−1) + β4 × (+1) + β5 × (+1)
   …assuming that the interactions are absent.

2010 RAMS –Tutorial DOE – Guo and Mettas 70

MLE and Likelihood Ratio Test
 Life-factor relationship (Weibull distribution):
   ln(η) = 2.9959 + 0.1052 A − 0.2256 B − 0.0294 C − 0.2477 D + 0.1166 E
 Likelihood Ratio (LR) Test Table

2010 RAMS –Tutorial DOE – Guo and Mettas 71

Z-test for the Coefficients
 Z-test is used to test whether or not a coefficient is 0.
 For example,
   H0: β1 = 0;  H1: β1 ≠ 0
 The test statistic is:
   Z0 = β̂1 / se(β̂1)
   β̂1 is the estimated value for β1; se(β̂1) is its standard error.
 Under the null hypothesis, Z0 is assumed to be standard normally distributed.

2010 RAMS –Tutorial DOE – Guo and Mettas 72
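Using the reported coefficients, the predicted characteristic life at any combination of coded factor settings follows directly (an illustrative sketch, not from the tutorial):

import numpy as np

coeffs = {"Intercept": 2.9959, "A": 0.1052, "B": -0.2256,
          "C": -0.0294, "D": -0.2477, "E": 0.1166}

def predicted_eta(settings):
    """Weibull characteristic life eta from the fitted relationship
    ln(eta) = b0 + bA*A + bB*B + bC*C + bD*D + bE*E (coded settings)."""
    log_eta = coeffs["Intercept"] + sum(coeffs[f] * x for f, x in settings.items())
    return np.exp(log_eta)

# Example: the first run of the experiment (A = B = C = -1, D = E = +1)
print(predicted_eta({"A": -1, "B": -1, "C": -1, "D": 1, "E": 1}))  # about 20.4 hours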



Z-test Results
 Z-values are the normalized coefficients.
 Note: The P values from the Z-test are slightly different from those of the Likelihood Ratio test.
 When the sample size is large, they are very close.
 When the sample size is small, the likelihood ratio test is more accurate than the Z-test.

2010 RAMS –Tutorial DOE – Guo and Mettas 73

Pareto Chart for the Z-values
[Figure: DOE++ Pareto chart of the standardized effects (Z values) for the terms D, B, E, A and C, with a critical value of 1.645. D, B, E and A exceed the critical value; C does not.]
 From both the LR test and the Z-test, A, B, D and E are significant.

2010 RAMS –Tutorial DOE – Guo and Mettas 74

Model Diagnostic
 When using the Weibull distribution for life, the residuals from the life-factor relationship should follow the extreme value distribution with a mean of zero.

2010 RAMS –Tutorial DOE – Guo and Mettas 75

Model Diagnostic (cont'd)
 Residuals against run order plot

2010 RAMS –Tutorial DOE – Guo and Mettas 76

Fluorescent Lights R-DOE: Interpreting the Results
 From the results, factors A, B, D and E are significant at the risk level of 0.10. Therefore, attention should be paid to these factors.
 In order to improve the life, factors A and E should be set to the high level, while factors B and D should be set to the low level.

MLE Information
Term   Coefficient
A:A    0.1052
B:B    -0.2256
C:C    -0.0294
D:D    -0.2477
E:E    0.1166

2010 RAMS –Tutorial DOE – Guo and Mettas 77

Fluorescent Lights Example: Traditional DOE Approach
 Suspensions are treated as failures.
 Mid-points are used as failure times for interval data.
 Life is assumed to follow the normal distribution.

2010 RAMS –Tutorial DOE – Guo and Mettas 78



Traditional DOE Approach: Fluorescent Lights Example - Results
 B and D come out to be significant using the traditional DOE approach.
 A, B, D and E were found to be significant using R-DOE.
 Traditional DOE fails to identify A and E as important factors at a significance level of 0.1.

2010 RAMS –Tutorial DOE – Guo and Mettas 79

More Information
 On-line textbook and examples
 http://www.weibull.com/doewebcontents.htm
 http://www.itl.nist.gov/div898/handbook/

2010 RAMS –Tutorial DOE – Guo and Mettas 83

Tutorial: Design of Experiments (DOE) and Data Analysis
 Introduction
 Statistical Background
 Two Level Factorial Design
 Response Surface Method
 Reliability DOE
 Summary

Summary

2010 RAMS –Tutorial DOE – Guo and Mettas 80

Summary: Topics Covered
 Why DOE, what DOE can do and common design types
 General guidelines for conducting DOE
 Linear regression and ANOVA
 2-level factorial and fractional factorial design
 Response surface method
 Reliability DOE

2010 RAMS –Tutorial DOE – Guo and Mettas 81

Topics Not Covered
 Blocking
 Power and sample size
 RSM with multiple responses
 RSM: Box-Behnken design
 D-optimal design
 Taguchi robust design
 Taguchi orthogonal array (OA)
 Mixture design
 Random and mixed effect model
 …more

2010 RAMS –Tutorial DOE – Guo and Mettas 82



The End

2010 RAMS –Tutorial DOE – Guo and Mettas 84

