
Statistics Foundation Slider Team Group#1


Six Sigma Training

Data / Probability / Process Capability


Six Sigma Training
• Data
• Probability
• Process Capability
Data
What is Data
What is Data in our manufacturing
(Example)
How to deal with Data

1. Planning
2. Data collection
3. Analysis
4. Presentation
Type of Data
Data can be qualitative or quantitative.
Qualitative data is descriptive information (it describes something).
Quantitative data is numerical information (numbers).

Qualitative:
He is brown and black
He has long hair
He has lots of energy

Quantitative:
- Discrete (Attribute):
Drive failed: 2 units
2 particles on a slider
- Continuous:
He weighs 25.5 kg
Slider resistance 3.21 Ohm
Test#1 Type of Data

How people describe the smell of a new perfume

Height

Weight

Petals on a flower

Customers in a shop
Quantitative: Discrete (Attribute)
Quantitative: Continuous

Continuous data is data that can be measured and broken down into smaller parts and still have meaning. Money, temperature and time are continuous. Volume (like volume of water or air) and size are continuous data.
To understand and describe
the POPULATION

1. We collect and measure SAMPLES.
2. In this process we generate DATA.
3. Data is the “raw material” for engineering studies and analyses.
Population

A population is a collection of people, items, or events about which you want to make inferences.
Sample

A sample is a subset of people, items, or events from a larger population that you collect and analyze to make inferences.
What is a 'Random Sample'

A simple random sample is a subset of a statistical population in which each member of the population has an equal probability of being chosen.

An example of a simple random sample would be the names of 25 employees being chosen out of a hat from a company of 250 employees. In this case, the population is all 250 employees, and the sample is random because each employee has an equal chance of being chosen.
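The "names out of a hat" draw can be imitated in a few lines of Python. This is only a sketch: the employee labels and the fixed seed are assumptions made for illustration, and `random.Random.sample` draws without replacement so that every employee has the same chance of being picked.

```python
import random

# Hypothetical population: 250 placeholder employee labels.
population = [f"employee_{i:03d}" for i in range(1, 251)]

# A fixed seed is used only so the draw can be repeated; omit it in practice.
rng = random.Random(42)

# Simple random sample of 25 employees, drawn without replacement.
sample = rng.sample(population, k=25)

print(len(sample), "of", len(population), "employees selected")
print(sample[:5])
```

Each run with a different seed gives a different, equally likely set of 25 names.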
Descriptive Statistics

Location:
Mean
Median
Mode

Dispersion:
Sample Variance
Sample Standard Deviation
Range
Mean:
A commonly used measure of
the center of a batch of
numbers. The mean is also
called the average. It is the sum
of all observations divided by
the number of (non-missing)
observations.
Highly influenced by outliers.
Median :
Equals the 50% point of the data: half the sample is above the
median, the other half below the median.

Robust to extreme values or outliers.


Basic steps:
Order all values from the smallest to the largest
Count the number of values in the data set (=n)
For odd n, the median is the (n+1)/2 ranked value;
for even n, the median is the average of the (n/2)th and (n/2+1)th values
Mode

Represents the most frequent value of the data set.
Test

After inspecting 10 random production lots, the following numbers of defects were found in those lots:

0, 4, 2, 1, 3, 2, 3, 1, 2, 2

Find the mean, median and mode of the data set.

What would happen if “4” is changed to “100” ?
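One way to check the exercise is with Python's standard `statistics` module. The sketch below is illustrative only; the helper name `location_summary` is an assumption introduced here, not part of the training material.

```python
import statistics

# Defects found in the 10 inspected lots (from the exercise).
defects = [0, 4, 2, 1, 3, 2, 3, 1, 2, 2]

# Same data with the value 4 replaced by 100 to mimic an outlier.
defects_with_outlier = [100 if x == 4 else x for x in defects]


def location_summary(values):
    """Return the three location measures for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }


original = location_summary(defects)
shifted = location_summary(defects_with_outlier)

print("original:", original)
print("with 100:", shifted)
```

The mean moves from 2 to 11.6 when 4 becomes 100, while the median and mode both stay at 2, which illustrates why the median is described as robust to outliers.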


Range

Difference between the largest and smallest values of the data set.

Advantages: easy to understand and calculate (with small data sets). Is a good estimator for small data sets only (n < 8).

Disadvantages: highly sensitive to extreme values or outliers. Only uses two values from the entire data set (the largest and smallest values).
Sample Variance
The variance measures how spread out the data are about their mean. The variance is equal to the standard deviation squared.

$$\hat{\sigma}^2 = s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Disadvantage: Variability gets measured in squared units, which can be confusing.
Sample Standard Deviation
The standard deviation is the most common measure of dispersion, or how spread out the data are about the mean. The sample standard deviation is equal to the square root of the sample variance.

$$\hat{\sigma} = s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Preserves the same units as the original data, which makes it easier to understand.
Uses all the data in its calculation.
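As an illustration, the dispersion measures can be computed for the defect counts used in the earlier test (0, 4, 2, 1, 3, 2, 3, 1, 2, 2). The sketch below computes the sample variance by hand with the n - 1 divisor and compares it with `statistics.variance`; reusing that data set here is a choice made for this example.

```python
import statistics

defects = [0, 4, 2, 1, 3, 2, 3, 1, 2, 2]

n = len(defects)
mean = sum(defects) / n

# Sample variance: sum of squared deviations divided by (n - 1).
manual_variance = sum((x - mean) ** 2 for x in defects) / (n - 1)

# The standard library uses the same n - 1 definition.
sample_variance = statistics.variance(defects)
sample_stdev = statistics.stdev(defects)

# Range: largest value minus smallest value.
data_range = max(defects) - min(defects)

print(f"variance = {sample_variance:.4f}")
print(f"std dev  = {sample_stdev:.4f}")
print(f"range    = {data_range}")
```

The standard deviation comes out in the same unit as the data (defects per lot), whereas the variance is in squared units.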
Graphs
Scatter plot

Use Scatterplot to investigate the relationship between a pair of continuous variables. A scatterplot displays ordered pairs of x and y variables in a coordinate plane.

For example, a medical researcher creates a scatterplot to show the positive relationship between Body Mass Index (BMI) and body fat percentage in adolescent girls.
Scatter plot (Type of relationship)
Determine which model relationship (for example, linear or curved) best fits the data.

Outliers, which are data values that are far away from other data values, can strongly affect your results.
Scatter plot (Type of relationship)

Groups with different slopes: When a group has a steeper slope, changes in x-values are associated with greater changes in y-values.

Groups with different locations: One group has consistently higher y-values for each specific value of x than the other group.
Matrix Plot
Use Matrix Plot to assess the relationships between several pairs of
variables at once. A matrix plot is an array of scatterplots. There are two
types of matrix plots: Matrix of plots and Each Y versus each X.

For example, a business analyst creates a matrix of plots to explore the relationships between several business metrics. The plot shows every combination of number of clients, rate of return, and number of years in business.
Histogram
Use Histogram to examine the shape and spread of your
data. A histogram divides sample values into many
intervals and represents the frequency of data values in
each interval with a bar.
For example, a quality engineer creates a
histogram to examine the distribution of
the amount of torque that is required to
remove the caps from a sample of
shampoo bottles.

A histogram works best when the sample size is at least 20. However, a sample size that is considerably greater
than 20 may better represent the distribution.
Histogram
Look for: peaks and spread (location of the mean and width of the spread), outliers, and multi-modal data.
Marginal Plot
Use Marginal Plot to assess the relationship between two variables and
examine their distributions. A marginal plot is a scatterplot that has
histograms, boxplots, or dotplots in the margins of the x- and y-axes.
Boxplot
Use Boxplot to assess and compare the shape, central
tendency, and variability of sample distributions, and
to look for outliers.

For example, a scientist creates a boxplot to compare the height of plants grown with two different fertilizers and a control group with no fertilizer.

A boxplot works best when the sample size is at least 20. By default, a boxplot shows the median, interquartile range, range, and outliers for each group.
Boxplot Analysis

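Reading the boxplot from top to bottom:
• Outlier (*): a point above the distribution maximum
• Distribution Maximum = Min[highest data point, Q3 + 1.5(Q3-Q1)]
• 75th Percentile (Third Quartile)
• Median (50th Percentile); the + symbol marks the mean
• 25th Percentile (First Quartile)
• Distribution Minimum = Max[lowest data point, Q1 - 1.5(Q3-Q1)]
• Outlier (*): a point below the distribution minimum
The box between the first and third quartiles contains 50% of the data.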
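The quantities behind a boxplot can be computed directly. The following is a minimal sketch with several assumptions: the function name `boxplot_summary` is hypothetical, the data are the earlier defect counts with 4 replaced by 100, quartiles use the "inclusive" method of `statistics.quantiles` (statistical packages differ in their quartile definitions, so values may not match other software exactly), and whiskers are drawn to the furthest data points that lie inside the 1.5 × (Q3 - Q1) limits, which is a common convention.

```python
import statistics


def boxplot_summary(values):
    """Quartiles, whisker ends and outliers using the 1.5 * IQR rule."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    # Points inside the limits define the whiskers; the rest are outliers.
    inside = [x for x in data if lower_limit <= x <= upper_limit]
    outliers = [x for x in data if x < lower_limit or x > upper_limit]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


summary = boxplot_summary([0, 100, 2, 1, 3, 2, 3, 1, 2, 2])
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these data the value 100 falls above the upper limit and is reported as an outlier, while the whiskers stop at 0 and 3.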
Interpret the key results for Boxplot
Look for: skewed data, outliers, centers, and spreads.

Sample size (n)
The sample size can affect the appearance of the graph. For example, although these boxplots seem quite different, both of them were created using randomly selected samples of data from the same population.

A boxplot works best when the sample size is at least 20. If the sample size is too small, the quartiles and outliers shown by the boxplot
may not be meaningful. If the sample size is less than 20, consider using an Individual value plot instead.
Pareto
Use Pareto Chart to identify the most frequent defects, the most
common causes of defects, or the most frequent causes of customer
complaints.
Pareto charts can help to focus improvement efforts on areas where
the largest gains can be made.

For example, a manager wants to investigate causes of customer dissatisfaction at a particular hotel. The manager investigates and records reasons for customer complaints.
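The table behind a Pareto chart is a frequency count sorted from most to least frequent, with a running cumulative percentage. The sketch below uses a made-up complaint log; the categories and counts are hypothetical and serve only to illustrate the calculation.

```python
from collections import Counter

# Hypothetical complaint log: every entry is one recorded complaint reason.
complaints = (
    ["room not clean"] * 45
    + ["slow check-in"] * 25
    + ["noise"] * 15
    + ["billing error"] * 10
    + ["other"] * 5
)

counts = Counter(complaints)
total = sum(counts.values())

# Build the Pareto table: (reason, count, cumulative percent).
table = []
running = 0
for reason, count in counts.most_common():
    running += count
    table.append((reason, count, 100 * running / total))

for reason, count, cumulative in table:
    print(f"{reason:<15} {count:>3} {cumulative:6.1f}%")
```

In this invented example the first two reasons account for 70% of complaints, which is the kind of concentration a Pareto chart is meant to reveal.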
Time Series Plot
Use Time Series Plot to look for patterns in your data over time, such as trends or seasonal patterns. A time series plot can help you choose a time series analysis to model your data.

The following time series plot shows the stock prices for two companies over time. The stock price for Company B appears to be growing in value faster than the stock price for Company A.
Interpret the key results for Time Series Plot
Look for trends, outliers, and sudden shifts.

Seasonal pattern: These data show a seasonal pattern. The pattern repeats every 12 months.

Cyclic movements: These data show cyclic movements. The cycles do not repeat at regular intervals and do not have the same shape.

Random variation: These data show random variation. There are no patterns or cycles.

Additive changes: In this example of additive seasonal changes, the data values tend to increase over time, but the magnitude of the seasonal change remains the same.

Multiplicative changes: In this example of multiplicative seasonal changes, the magnitude of the seasonal change increases over time as the data values increase.
Distribution Plot
[Figure: distribution plot, Binomial, n=10000, p=0.2; X vs. Probability; left tail at X = 1921 with probability 0.02443 and right tail at X = 2080 with probability 0.02384]

Probability Distribution
Experiments,
Sample Space & Events
In statistics, an experiment refers to any activity that generates a set of data.

A random experiment is one that can result in different outcomes, even though it is repeated in the same manner every time.

The set of all possible outcomes of a random experiment is called the sample space.

An event is a subset of the sample space and represents one or more of the outcomes.
Example 1: Tossing a Die
Consider the experiment of tossing a six-sided die.

• This experiment has six possible outcomes, i.e., the faces 1 to 6.

• The sample space of this experiment is S = {1, 2, 3, 4, 5, 6}

• The following events can be defined:
– the outcome is odd → E1 = {1, 3, 5}
– the outcome is even → E2 = {2, 4, 6}
– the outcome is less than 3 → E3 = {1, 2}
– the outcome is 6 → E4 = {6}
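The sample space and events of the die example map naturally onto Python sets. The sketch below assumes a fair die, so every outcome is equally likely and the probability of an event is the number of outcomes in it divided by the size of the sample space; the function name `probability` is introduced here for illustration.

```python
from fractions import Fraction

# Sample space of a six-sided die.
S = {1, 2, 3, 4, 5, 6}

# Events defined as subsets of the sample space.
E1 = {x for x in S if x % 2 == 1}  # outcome is odd
E2 = {x for x in S if x % 2 == 0}  # outcome is even
E3 = {x for x in S if x < 3}       # outcome is less than 3
E4 = {6}                           # outcome is 6


def probability(event, sample_space=S):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


for name, event in [("E1", E1), ("E2", E2), ("E3", E3), ("E4", E4)]:
    print(name, sorted(event), probability(event))
```

Using `Fraction` keeps the results exact (1/2, 1/2, 1/3 and 1/6) instead of rounded decimals.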
Probability
A probability is always a numerical value between 0 and 1.
Probability may be
deterministic (based on a mathematical model)
empirical (based on actual results of experiment)
subjective (based on experience)

Probability
Outcome       1       2       3       4       5       6       Total
Probability   1/6     1/6     1/6     1/6     1/6     1/6     = 6/6
Decimal       0.1667  0.1667  0.1667  0.1667  0.1667  0.1667  = 1
Discrete Probability Distributions
• Discrete Uniform Distribution
[Figure: distribution plot of probability by label, labels 1 to 10]

• Binomial Distribution
[Figure: distribution plot, Binomial, n=100, p=0.05; X vs. Probability]

• Poisson Distribution
[Figure: distribution plot, Poisson, Mean=0.5; X vs. Probability]
Binomial Distribution (Quantitative discrete )
A binomial distribution is a discrete distribution that models the number of events in a fixed number of trials. Each trial has two possible outcomes, and the event is the outcome of interest from a trial.

The number of events (X) in n trials follows a binomial distribution if the following conditions are met:

- The number of trials is fixed.
- Each trial is independent of the other trials.
- Each trial has one of two outcomes: event or nonevent.
- The probability of an event is the same for each trial.
Binomial Distribution (Quantitative discrete )
• The mean and variance of a Binomial Distribution are

$$\mu = np \quad \text{and} \quad \sigma^2 = np(1 - p)$$

• The location, dispersion and shape of a binomial distribution are affected by the sample size (n) and the probability of success (p).
Binomial Distribution (Quantitative discrete )
The probability mass function for a binomial random variable, X, having parameters n and p can be obtained from

$$P(X = x) = {}_{n}C_{x}\, p^{x} (1 - p)^{n - x}, \quad \text{where} \quad {}_{n}C_{x} = \binom{n}{x} = \frac{n!}{x!\,(n - x)!}$$

Commonly used in Acceptance Sampling, where
p is the probability of success (defective rate),
n is the number of trials (sample size),
x is the number of successes (defectives found).
Suitable for sampling with replacement, and sampling without replacement if the sample size is less than 10% of the lot size.

Example: A process yields a defective rate of 10%. For a sampling plan of 10 units, determine the probability distribution. What is the chance of finding zero defective units?
p = 0.1 / n = 10 / x = 0
Probability of finding 0 defective units in 10 units = 0.3487
Binomial Example
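The acceptance sampling example (defective rate 10%, sample of 10 units) can be reproduced from the binomial formula. This is a sketch using only the standard library; the function name `binomial_pmf` is introduced here for illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial distribution with n trials and event probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.1  # sample size and defective rate from the example

# Full probability distribution for x = 0, 1, ..., n defectives.
distribution = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(distribution):
    print(f"P(X = {x:2d}) = {prob:.4f}")
```

The first entry, P(X = 0), is about 0.3487, and the probabilities over all x sum to 1.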
Poisson Distribution (Quantitative discrete )
The Poisson distribution is characterized by the form “the number of occurrences per unit interval.” Examples:
• A man was able to complete 3 files a day on average.
• The average number of homes sold by the Acme Realty company is 2 homes per day.
• Number of particles found on a slider.

[Figure: Poisson distribution plots for Mean=1 to Mean=9; X vs. Probability]
Poisson Distribution (Quantitative discrete )
The probability mass function for a Poisson random variable, X, is defined by

$$P(X = x) = \frac{\lambda^{x} e^{-\lambda}}{x!} \quad \text{for } x = 0, 1, 2, \ldots$$

The mean and variance of a Poisson Distribution are

$$\mu = \lambda \quad \text{and} \quad \sigma^2 = \lambda$$

where λ is a positive constant representing the average number of occurrences per unit interval (e.g. defect rate).
Poisson Distribution (Quantitative discrete )
The location, dispersion and shape of a Poisson distribution are affected by the mean.

[Figure: Poisson distribution plots for Mean=0.5, 1, 2, 3, 4, 5, 6, 7 and 7.5; X vs. Probability]
Poisson Distribution (Quantitative discrete )

Example: The historical mean is 5 particles found per slider. What is the probability of having fewer than 3 particles on a slider?
Let λ = average number of particles = 5.

$$P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2) = \frac{5^{0} e^{-5}}{0!} + \frac{5^{1} e^{-5}}{1!} + \frac{5^{2} e^{-5}}{2!} \approx 0.125$$

using

$$P(X = x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$
Poisson Distribution (Quantitative discrete )
Graph > Probability Distribution Plot > View Probability

[Figure: distribution plot, Poisson, Mean=5; X vs. Probability; the shaded area for X ≤ 2 has probability 0.1247]
Poisson Example
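The particle example (mean of 5 particles per slider, probability of fewer than 3) can be checked numerically from the Poisson formula. The sketch below uses only the standard library; the function name `poisson_pmf` is introduced here for illustration.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean lam."""
    return lam**x * exp(-lam) / factorial(x)


lam = 5  # historical mean number of particles per slider

# P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2)
p_less_than_3 = sum(poisson_pmf(x, lam) for x in range(3))

print(f"P(X < 3) = {p_less_than_3:.4f}")
```

The result is about 0.1247, which rounds to the 0.125 obtained by hand.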
Normal Distribution(Continuous Probability Distributions)

The normal distribution is a continuous distribution that is specified by the mean (μ) and the standard deviation (σ). The mean is the peak or center of the bell-shaped curve. The standard deviation determines the spread of the distribution.

The normal distribution is the most common statistical distribution. Many statistical analyses assume
that the data come from approximately normally distributed populations.
https://www.mathsisfun.com/data/quincunx.html

Normal Distribution(Continuous Probability Distributions)


• Shaped like a “bell”
• The mode occurs at x = μ
• The curve is symmetric about μ
• Points of inflexion at x = μ ± σ
• Probability distribution:
μ ± 1σ @ 68.27%
μ ± 2σ @ 95.45%
μ ± 3σ @ 99.73%

[Figure: normal curve with the axis marked from –3σ to +3σ around μ and the areas 68.27%, 95.45% and 99.73% indicated]
Normal Distribution(Continuous Probability Distributions)
• The location and dispersion of a normal distribution are determined by its mean and variance.

[Figure: left, curves with the same mean and different sigma; right, curves with different means and the same sigma]
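The 68.27%, 95.45% and 99.73% figures for μ ± 1σ, ± 2σ and ± 3σ hold for any mean and sigma. A short check with `statistics.NormalDist` (Python 3.8 or later) is sketched below; the second parameter set (mean 50, sigma 10) is an arbitrary choice for illustration.

```python
from statistics import NormalDist


def coverage(mu, sigma, k):
    """Probability that a normal value falls within mu +/- k * sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


# Two different normal distributions give the same coverage for each k.
for mu, sigma in [(0, 1), (50, 10)]:
    for k in (1, 2, 3):
        print(f"mean={mu}, sigma={sigma}, +/-{k} sigma: {coverage(mu, sigma, k):.4%}")
```

Changing the mean only shifts the curve and changing sigma only stretches it, so the proportion inside k standard deviations stays the same.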
Normal Distribution(Continuous Probability Distributions)
Standard Normal Distribution
• A normal distribution with μ = 0 and σ² = 1 is called a standard normal distribution. A standard normal random variable is denoted as Z.

[Figure: Normal, μ = 0, σ² = 1; the probability distribution (relative frequency, 0.0 to 0.4) and the cumulative probability distribution (cumulative frequency, 0.00 to 1.00) for Z from -4 to 4]
Standard Normal Distribution
A value from any normal distribution can be transformed into its corresponding value on a standard normal distribution using the following formula:

$$Z = \frac{X - \mu}{\sigma}$$

where Z is the value on the standard normal distribution, X is the value on the original distribution, μ is the mean of the original distribution, and σ is the standard deviation of the original distribution.
Standard Normal Distribution
What portion of a normal distribution with a mean of 50 and a standard deviation of 10 is below 26?
Applying the formula, we obtain

$$Z = \frac{X - \mu}{\sigma} = \frac{26 - 50}{10} = -2.4$$

[Figure: two distribution plots (X vs. Density). Left: Normal, Mean=50, StDev=10, with the area below X = 26 shaded. Right: Normal, Mean=0, StDev=1, with the area below X = -2.4 shaded. Both shaded areas equal 0.008198.]
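The same result can be obtained numerically, both on the original scale and after converting to Z. This is a sketch using `statistics.NormalDist` from the Python standard library (3.8 or later).

```python
from statistics import NormalDist

mu, sigma, x = 50, 10, 26

# Standardize: Z = (X - mu) / sigma
z = (x - mu) / sigma

# Area below X = 26 on the original distribution.
p_direct = NormalDist(mu, sigma).cdf(x)

# Area below Z = -2.4 on the standard normal distribution.
p_standard = NormalDist().cdf(z)

print(f"Z = {z}")
print(f"P(X < 26) = {p_direct:.6f}")
print(f"P(Z < -2.4) = {p_standard:.6f}")
```

Both calls give about 0.008198, which is why a single standard normal table is enough for any normal distribution.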
Example 9
The reaction time of a driver to a visual stimulus is normally distributed with a mean of 0.4 second and a standard deviation of 0.05 second.
(a) What is the probability that a reaction requires more than 0.5 second?

(b) What is the probability that a reaction requires between 0.4 and 0.5 second?

(c) What is the reaction time that is exceeded 90% of the time?

https://www.mathsisfun.com/data/standard-normal-distribution-table.html
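One way to check answers to Example 9 is sketched below with `statistics.NormalDist`. Part (c) is interpreted here as the 10th percentile, that is, the time t for which P(X > t) = 0.90; that reading of the question is an assumption.

```python
from statistics import NormalDist

# Reaction time: mean 0.4 s, standard deviation 0.05 s.
reaction = NormalDist(mu=0.4, sigma=0.05)

# (a) P(X > 0.5)
p_a = 1 - reaction.cdf(0.5)

# (b) P(0.4 < X < 0.5)
p_b = reaction.cdf(0.5) - reaction.cdf(0.4)

# (c) Time exceeded 90% of the time: 10% of reactions are faster than t.
t_c = reaction.inv_cdf(0.10)

print(f"(a) P(X > 0.5)       = {p_a:.5f}")
print(f"(b) P(0.4 < X < 0.5) = {p_b:.5f}")
print(f"(c) t with P(X > t) = 0.90: {t_c:.4f} s")
```

Since 0.5 second is two standard deviations above the mean, (a) is about 0.0228 and (b) about 0.4772; (c) comes out at roughly 0.336 second.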
Process Capability
Process Capability
A process is capable when it is able, with its
natural variability, to meet the customer’s
specification (the process fits inside the spec. ).
Process Capability vs Spec Limits
[Figure: three process distributions, a), b) and c), drawn against the spec limits]
a) Process is highly capable
b) Process is marginally capable
c) Process is not capable
Process Capability vs Spec Limits
• Capability is often thought of in terms of the proportion of output that will be within product specification tolerances. The frequency of defectives produced (output outside the specs) may be measured in:
• a) percentage (yield)
• b) parts per million (ppm)
• c) Z-score
• d) Process Capability metrics (Cpk, Ppk)

[Figure: process distribution on a scale from 15 to 30, with the defectives (outside the specs) marked]
Understanding Performance
[Figure: daily process distributions for Monday to Friday plotted against LSL, Target and USL, with the overall distribution alongside]

• We have short term performance and long term performance.


• What’s the best way to characterize them?
• What’s the best capability metric to use?
Process Potential

Cp: Short Term
Pp: Long Term

$$C_p \text{ or } P_p = \frac{\text{distance between specs}}{\text{process spread}} = \frac{USL - LSL}{6\sigma}$$

Appropriately substitute σ_ST or σ_LT for σ.
Process Potential
The Cp index is the ratio of the difference between the specification limits to the process spread (6s):

$$C_p = \frac{USL - LSL}{6s}$$
Process Potential
a) Cp or Pp > 1: Process has high potential
b) Cp or Pp = 1: Process has marginal potential
c) Cp or Pp < 1: Process has low potential

Process Potential
The Cp or Pp index compares the allowable spread (USL-LSL) against the process spread (6σ). BUT, it fails to take into account whether the process is centered between the specification limits.

[Figure: a process that is centered and a process that is not centered, both with the SAME Cp or Pp]
Process Performance

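Cpk: Short Term
Ppk: Long Term

$$C_{pk} \text{ or } P_{pk} = \frac{\text{distance from mean to closest spec}}{\tfrac{1}{2}\,\text{process spread}} = \min(C_{pl},\ C_{pu}) = \min\left(\frac{\bar{X} - LSL}{3\sigma},\ \frac{USL - \bar{X}}{3\sigma}\right)$$

Appropriately substitute σ_ST or σ_LT for σ.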

Process Performance

The Cpk index is the distance between the process mean and the nearest specification limit, scaled by 3s:

$$C_{pk} = \min\left(\frac{\bar{X} - LSL}{3s},\ \frac{USL - \bar{X}}{3s}\right)$$

[Figure: process distribution with LSL, X-Bar and USL marked]


Process Performance
a) Cp = 2, Cpk = 2: Process is highly capable (Cpk > 1.5)
b) Cp = 2, Cpk = 1: Process is capable (Cpk = 1 to 1.5)
c) Cp = 2, Cpk < 1: Process is not capable (Cpk < 1)


Metrics Summary

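$$C_p = \frac{USL - LSL}{6\sigma_{ST}} \qquad C_{pk} = \min\left(\frac{\mu - LSL}{3\sigma_{ST}},\ \frac{USL - \mu}{3\sigma_{ST}}\right)$$

$$P_p = \frac{USL - LSL}{6\sigma_{LT}} \qquad P_{pk} = \min\left(\frac{\mu - LSL}{3\sigma_{LT}},\ \frac{USL - \mu}{3\sigma_{LT}}\right)$$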
Test
Specification Limits: 4 to 16 g

Machine   Mean   Std Dev
(a)       10     4
(b)       10     2
(c)       7      2
(d)       13     1

Determine the corresponding Cp and Cpk for each machine.
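A possible way to check the exercise is to code the two formulas from the Metrics Summary directly. The sketch below is illustrative; the function names `cp` and `cpk` and the dictionary layout are assumptions made for this example, and the given standard deviations are used as they are without distinguishing short term from long term.

```python
def cp(usl, lsl, sigma):
    """Process potential: allowable spread divided by process spread (6 sigma)."""
    return (usl - lsl) / (6 * sigma)


def cpk(usl, lsl, mean, sigma):
    """Process performance: distance from the mean to the nearest spec, over 3 sigma."""
    return min((mean - lsl) / (3 * sigma), (usl - mean) / (3 * sigma))


LSL, USL = 4, 16  # specification limits in grams

# Machine: (mean, standard deviation)
machines = {"a": (10, 4), "b": (10, 2), "c": (7, 2), "d": (13, 1)}

results = {
    name: (cp(USL, LSL, sd), cpk(USL, LSL, mean, sd))
    for name, (mean, sd) in machines.items()
}

for name, (cp_value, cpk_value) in results.items():
    print(f"Machine ({name}): Cp = {cp_value:.2f}, Cpk = {cpk_value:.2f}")
```

Machines (b) and (c) share the same Cp of 1.0, but the off-center mean of (c) halves its Cpk to 0.5, which is the point made about Cp ignoring centering.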


All the Metrics...
Process Capability and Performance Indicators
• “Snapshot” look, short-term (within) standard deviation: Cp (capability) and Cpk (performance)
• “Video” look, long-term (overall) standard deviation: Pp (capability) and Ppk (performance)
• Capability (Cp, Pp), centered: relates standard deviation to tolerance
• Performance (Cpk, Ppk), not centered: relates mean and standard deviation to spec
• Cp represents “entitlement”!
Aside: Motorola™ 6σ Z-Score
In traditional 6σ methodology, the Z-score is computed on the basis of the total amount outside specifications.

[Figure: distribution with PNCL shaded below the LSL and PNCU shaded above the USL]

PNCTot = PNCL + PNCU

Then the Z-score is derived by finding the Z value corresponding to PNCTot. Motorola™ then adds or subtracts the 1.5σ shift to this Z.

Note: “PNC” = Percent Non-Conforming
Aside: Motorola™ 6σ Z-Score
Graph > Probability Distribution Plot > View Probability
Normal > Mean=0, Sigma=1 > Tab ‘Shaded Area’ > Click ‘Probability’ > ‘Right Tail’ > Fill in 0.0027 > OK

[Figure: two distribution plots, Normal, Mean=0, StDev=1 (X vs. Density). Left: both tails beyond X = -3 and X = 3 shaded, 0.001350 each. Right: a right tail of 0.0027 shaded, starting at X = 2.782.]

Process Mean = 0 and SD = 1
USL = 3 & LSL = -3
Z.LSL = (Mean-LSL)/SD = 3
Z.USL = (USL-Mean)/SD = 3
PNC Total = 0.00135 + 0.00135 = 0.0027
Z-Score or Z-Bench = 2.782
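The Z-Bench value can also be computed without the plotting steps. The sketch below follows the description above: add the two tail areas, then find the Z value that leaves that total in a single right tail. The function name `z_bench` is hypothetical, a normal process is assumed, and the 1.5σ shift is not applied here.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def z_bench(mean, sd, lsl, usl):
    """Return (PNC total, Z value leaving PNC total in one right tail)."""
    pnc_lower = standard_normal.cdf((lsl - mean) / sd)      # area below LSL
    pnc_upper = 1 - standard_normal.cdf((usl - mean) / sd)  # area above USL
    pnc_total = pnc_lower + pnc_upper
    return pnc_total, standard_normal.inv_cdf(1 - pnc_total)


pnc_total, z = z_bench(mean=0, sd=1, lsl=-3, usl=3)
print(f"PNC total = {pnc_total:.4f}")
print(f"Z-Bench   = {z:.3f}")
```

For limits at ±3 standard deviations this gives a PNC total of about 0.0027 and a Z-Bench of about 2.782, matching the values read from the plots.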
Example (Process Capability)
Step 1.1 :
Visually examine the distribution fit
Compare the solid overall curve to the bars of the histogram to assess
whether your data are approximately normal. If the bars vary greatly from
the curve, your data may not be normal and the capability estimates may
not be reliable for your process. If your data appear to be nonnormal, use
Individual Distribution Identification to determine whether you need to
transform the data or fit a nonnormal distribution to perform capability
analysis.

Good fit Poor fit


Step 1.2
Compare the within and overall curves
Compare the solid overall curve and the dashed within curve to see how
closely they are aligned. A substantial difference between the curves may
indicate that the process is not stable or that there is a significant amount of
variation between the subgroups. Use a control chart to verify that your
process is stable before you perform a capability analysis.

Closely aligned Poorly aligned


Step 2: Examine the observed performance of the process
Use the capability histogram to visually examine the sample observations in relation to the process requirements.

In this histogram, the process spread is larger than the specification spread, which suggests poor capability. Although most of the data are within the specification limits, there are nonconforming items below the lower specification limit (LSL) and above the upper specification limit (USL).

Assess the center of the process
Evaluate whether the process is centered between the specification limits or at the target value.

In this histogram, although the sample observations fall inside of the specification limits, the peak of the distribution curve is not centered on the target. Most of the data exceed the target value and are close to the upper specification limit.
Step 3: Evaluate the capability of the process

Assess potential capability
Use Cpk to evaluate the potential capability of your process based on both the process location and the process spread.
Potential capability indicates the capability that could be achieved if process shifts and drifts were eliminated.

Assess overall capability
Use Ppk to evaluate the overall capability of your process based on both the process location and the process spread.
Overall capability indicates the actual performance of your process that your customer experiences over time.
Introduce Minitab Assistant
(Process Capability)
Capability Analysis for After
Diagnostic Report
Xbar-R Chart: Confirm that the process is stable.
[Figure: Xbar chart (Mean) and R chart (Range) plotted by subgroup]
Normality Plot: The points should be close to the line.
[Figure: normality plot]
Normality Test (Anderson-Darling)
Results: Pass
P-value: 0.539
Capability Analysis for After
Report Card
Check: Description
Stability: The process mean and variation are stable. No points are out of control.
Number of Subgroups: You have 100 subgroups. For a capability analysis, this is usually enough to capture the different sources of process variation when collected over a long enough period of time.
Normality: Your data passed the normality test. As long as you have enough data, the capability estimates should be reasonably accurate.
Amount of Data: The total number of observations is 100 or more. The capability estimates should be reasonably precise.
Appendix
Bubble Plot
• Use Bubble Plot to explore the relationships
among three variables on a single plot. Like a
scatterplot, a bubble plot plots a y-variable versus an
x-variable. However, the symbols (also called
bubbles) on the bubble plot vary in size. The area of
each bubble represents the value of a third variable.

• For example, a bank administrator creates a bubble plot to examine the relationship between income, savings, and debt for a group of loan applicants.
Marginal Plot

• Use Marginal Plot to assess the relationship between two variables and examine their distributions. A marginal plot is a scatterplot that has histograms, boxplots, or dotplots in the margins of the x- and y-axes.
• For example, a quality engineer for a camera
manufacturer creates a marginal plot to examine
the relationship between flash recovery time
(minimum time between flashes) and the voltage
remaining in a camera battery.
Dotplot

• Use Dotplot to assess and compare sample data distributions. A dotplot divides sample values into small intervals and represents each value or small group of values with a dot along a number line. A dotplot works best when the sample size is less than approximately 50.
• For example, a quality engineer creates a
dotplot to examine the distribution of the
amount of torque that is required to remove
the caps from a sample of shampoo bottles.
Contour Plot

• Use Contour Plot to examine the relationship between a response variable and two predictor variables. In a contour plot, the values for two predictor variables are represented on the x- and y-axes, and the values for the response variable are represented by shaded regions, called contours. A contour plot is like a topographical map in which x-, y-, and z-values are plotted instead of longitude, latitude, and altitude.
• For example, the following contour plot shows the
effect of temperature and time on the quality of
reheated frozen entrees.
3D Surface Plot

• Use 3D Surface Plot to examine the relationship between a response variable (Z) and two predictor variables (X and Y), by viewing a three-dimensional surface of the predicted response. You can choose to represent the predicted response as a smooth surface or a wireframe.
• For example, the following 3D surface
plots show the effect of temperature
and time on the quality of reheated
frozen entrees.
