Auto Encoder
ARTICLE INFO

Keywords: Anomaly detection; Defect inspection; Autoencoders; Machine vision

ABSTRACT

In this paper, the unsupervised autoencoder learning for automated defect detection in manufacturing is evaluated, where only the defect-free samples are required for the model training. The loss function of a Convolutional Autoencoder (CAE) model only aims at minimizing the reconstruction errors, and makes the representative features widely spread. The proposed CAE in this study incorporates a regularization that improves the feature distribution of defect-free samples within a tight range. It makes the representative feature vectors of all training samples as close as possible to the mean feature vector so that a defect sample in the evaluation stage can generate a distinct distance from the trained center of defect-free samples. The proposed CAE model with regularizations has been tested on a variety of material surfaces, including textural and patterned surfaces in images. The experimental results reveal that the proposed CAE with regularizations significantly outperforms the conventional CAE for defect detection applications in the industry.
1. Introduction

Machine vision is an effective non-contact technology for automated defect inspection in the manufacturing process. Most of the traditional machine vision techniques are based on texture analysis. A set of discriminative features are extracted from the spatial or the spectral domain of the test image. A high-level multiple dimensional classifier such as Support Vector Machine (SVM) or Random Forest is then applied to identify defect samples. The success of the classification highly relies on the human experts to extract and select representative features based on the local gray-level (or color) and structure variations of a defect in the test image.

In the manufacturing environment, it is quite easy to collect as many normal samples as required. However, it is difficult to collect a sufficient number of defective samples in a short period of time to train robust classification models for defect detection. The machine vision methods currently available need handcrafted features based on the characteristics of individual defect types of a specific product, where the defect samples may not be sufficient for the analysis.

In this paper, the deep learning technique is explored to tackle the defect detection task without defect samples for the training. The proposed method is image-wise defect detection, i.e. it classifies a test image as defective or defect-free. It is not used for pixel-wise defect segmentation. The unsupervised convolutional autoencoder (CAE) is applied to extract the representative features that can well describe the distribution from a set of normal samples. The loss function of the conventional CAE measures only the reconstruction errors. It could make the extracted feature values widely spread in the high-dimensional variable space. When the trained CAE is used for anomaly detection, the encoded features of a defect sample image may thus fall within the range of the normal samples’ variable space. It could make the extracted features indistinguishable between normality and abnormality. A regularization penalty is thus included in the original CAE loss function to limit the spread of the learned feature values for normal training samples. It is expected that the distances of feature vectors between the normal samples are close to each other and the unseen defect samples yield sufficiently large distances from the normal samples. The proposed CAE-based model, denoted by λ-CAE, is tested on various material surfaces for defect detection, including textural surfaces and patterned surfaces. The proposed method is also compared to autoencoder-variant models based on encoded features and image reconstruction error.

This paper is organized as follows. Section 2 reviews the related work on defect detection with traditional machine vision and deep learning techniques. Anomaly detection with autoencoders for various applications is also discussed. Section 3 presents the original CAE and the regularized λ-CAE models for defect detection. Section 4 discusses the experimental results on various material surfaces. The paper is concluded in Section 5.
* Corresponding author.
E-mail address: iedmtsai@saturn.yzu.edu.tw (D.-M. Tsai).
https://doi.org/10.1016/j.aei.2021.101272
Received 15 October 2020; Received in revised form 4 February 2021; Accepted 16 February 2021
Available online 5 March 2021
1474-0346/© 2021 Elsevier Ltd. All rights reserved.
D.-M. Tsai and P.-H. Jen Advanced Engineering Informatics 48 (2021) 101272
for the comparison between encoded features and image reconstruction errors.) In this study, regularized CAE models are thus proposed to extract representative features from a set of normal sample images. The proposed CAE models are trained so that the reconstruction error is minimized while the variation of the extracted feature vectors from all defect-free training samples is as small as possible. Simple thresholding based on SPC (Statistical Process Control) is applied to the distance between the test image’s feature vector and the mean feature vector of defect-free training images for defect-image identification. The proposed method does not require defective samples for the model training, and does not require handcrafted features for the discrimination. It is especially well suited for new product quality inspection in manufacturing, where the defect samples are not available or very rare to collect.

3. AE models for anomaly detection

3.1. CAE model

The autoencoder model used in this study is based on the Convolutional Autoencoder (CAE) for practical implementation in manufacturing. The autoencoder architecture for unsupervised learning is composed of two parts, an encoder and a decoder. The encoder takes the raw image as the input, and the abstract representation from the encoder is then the input to the decoder. The encoder involves a series of convolutional layers and downsampling to compress the original data in a high-dimensional space into an abstract representation in a lower-dimensional space. It is expected that the abstract representation can well describe the distribution of the defect-free training samples. The decoder is a generative model that involves a series of deconvolution and upsampling to reconstruct the image from the representative features.

Let F and G be the encoder and decoder, then

$$F:\mathbb{R}^n \rightarrow \mathbb{R}^m,\quad x \mapsto F(x) \tag{3}$$

$$G:\mathbb{R}^m \rightarrow \mathbb{R}^n,\quad F(x) \mapsto G(F(x)) \tag{4}$$

where

n is the dimension of the input x;
m is the dimension of the abstract representation, and m << n.

The autoencoder is trained so that the reconstruction error is minimized:

$$\min_{\varphi,\theta} L\big(x, G_\theta(F_\varphi(x))\big) \tag{5}$$

where φ and θ are the parameters to learn for the encoder and decoder, respectively. The CAE architecture used in this study for defect detection is depicted in Fig. 1. It consists of 5 convolutional layers in the encoder, and no pooling layers are applied. The decoder contains 6 deconvolutional/convolutional layers.

The loss function of the conventional CAE is measured by the mean square error (MSE), i.e.

$$\min_{\varphi,\theta} L(X) = \frac{1}{N}\sum_{i=1}^{N}\left\|x_i - \hat{x}_i\right\|^2 \tag{6}$$

where

$\hat{x}_i$ is the reconstructed image of the input image $x_i$, and $\hat{x}_i = G_\theta(F_\varphi(x_i))$;
$X = \{x_1, x_2, ..., x_N\}$, a set of defect-free samples for the model training.

The feature maps in the last convolutional layer of the encoder are used to form the feature vector that describes the inherent properties of defect-free sample images. Let $w_k$ be the k-th feature map of size r × c, and k = 1, 2, ..., K (i.e. a total of K feature maps). The 2D feature map $w_k$ is converted into a 1D vector $v_k$ of size 1 × r⋅c. The K hidden
vectors $v_k$ are concatenated into a single vector V of size 1 × r⋅c⋅K. The concatenated vector V contains the final representative features used for anomaly detection.

Let $\bar{V} = \frac{1}{N}\sum_{i} V_i$ be the mean feature vector of all defect-free training samples in X, and $V_i$ the feature vector of sample $x_i$. The distance of a test image I with representative feature vector $V_I$ is given by

$$d_I = \left\|V_I - \bar{V}\right\| \tag{7}$$

It is expected that a defect-free test image represented by the extracted feature vector $V_I$ is very close to the training samples, whereas a defective test image is far away from the training samples in the feature vector space.

Table 1
Detailed parameter setting of the CAE models: (a) encoder; (b) decoder.
(a) Encoder
Number   Layer                 Number of filters   Filter size   Activation function
1        Convolution (s = 1)   64                  3 × 3         ReLU
2        Convolution (s = 2)   32                  3 × 3         ReLU
3        Convolution (s = 1)   10                  3 × 3         ReLU
4        Convolution (s = 2)   10                  3 × 3         ReLU
5        Convolution (s = 1)   10                  3 × 3         Tanh
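As a minimal sketch of the feature-vector construction and the distance of Eq. (7), the following plain-Python example flattens K feature maps into one vector, averages the training vectors and measures the Euclidean distance of a test vector from that mean. The function names and the toy 2 × 2 maps are illustrative assumptions, not part of the original implementation, which was built with TensorFlow and Keras.

```python
import math


def flatten_feature_maps(feature_maps):
    """Concatenate K feature maps (each r x c, as nested lists) into one list of length r*c*K."""
    return [value for fmap in feature_maps for row in fmap for value in row]


def mean_vector(vectors):
    """Element-wise mean of a list of equally long vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def distance(vector, center):
    """Euclidean distance between a feature vector and the mean feature vector (Eq. (7))."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vector, center)))


# Toy data (hypothetical values): two training images, K = 2 maps of size 2 x 2 each.
train_maps = [
    [[[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]],
    [[[2.0, 2.0], [2.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]]],
]
train_vectors = [flatten_feature_maps(maps) for maps in train_maps]
v_mean = mean_vector(train_vectors)

# A test image whose second feature map deviates from the training samples.
test_vector = flatten_feature_maps([[[1.0, 1.0], [1.0, 1.0]], [[4.0, 1.0], [1.0, 5.0]]])
d_test = distance(test_vector, v_mean)
print(len(test_vector), d_test)
```

For the wood experiments described later, the same construction with K = 5 maps of size 8 × 8 would give a vector of length 320.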
3.2. Improved loss functions of CAE

In each epoch of the model training, the mean feature vector $\bar{V}$ is not available until all training samples in X are evaluated. The mean feature vector from the previous epoch is thus used as the center of all training samples. The proposed new loss function of CAE at epoch l is given by

$$L_V^{(l)}(X) = (1-\lambda)\frac{1}{N}\sum_{i=1}^{N}\left\|x_i - \hat{x}_i^{(l)}\right\|^2 + \lambda\frac{1}{N}\sum_{i=1}^{N}\left\|V_i^{(l)} - \bar{V}^{(l-1)}\right\|^2 \tag{9}$$

where

$\bar{V}^{(l-1)} = \frac{1}{N}\sum_{i=1}^{N} V_i^{(l-1)}$, the mean feature vector at epoch l-1, and $\bar{V}^{(0)} = 0$;
λ is the penalty factor, and 0 ≤ λ ≤ 1.

When λ = 0, it is the conventional CAE. When λ > 0, the regularization is put into action. The first term of the loss function $L_V$ minimizes the reconstruction error with the weight (1 − λ). The second term makes the feature vector $V_i$ of each training sample $x_i$ as close as possible to its center $\bar{V}$. The regularization term can be interpreted as minimizing the variance of the representative features.

In this study, a tight regularization is also proposed to further restrict the spread range of defect-free samples in the CAE model training. It is given by

$$L_d^{(l)}(X) = (1-\lambda)\frac{1}{N}\sum_{i=1}^{N}\left\|x_i - \hat{x}_i^{(l)}\right\|^2 + \lambda\frac{1}{N}\sum_{i=1}^{N}\left\{\max\left(d_i^{(l)} - \bar{d}^{(l-1)},\, 0\right)\right\}^2 \tag{10}$$

where

$d_i^{(l)} = \left\|V_i^{(l)} - \bar{V}^{(l-1)}\right\|$;
$\bar{d}^{(l-1)} = \frac{1}{N}\sum_{i=1}^{N} d_i^{(l-1)}$.

Because only the training samples whose distance exceeds the mean distance of the previous epoch contribute to the penalty in Eq. (10), the backpropagation updating can then focus more on the training samples with distances larger than the mean. This regularization can further reduce the upper bound of the distances and makes all training samples fall within a tight cluster in the distance domain. It can be expected that a defective sample will generate a significantly large distance from the cluster center of normal samples.

For the extracted features from CAEs, the SPC (Statistical Process Control) thresholding is applied as a one-class classifier to identify defective samples. The proposed regularization in the loss function tends to minimize the spread (i.e. variance) of the extracted features. The simple SPC is thus a natural selection to set up the discrimination threshold.

The SPC threshold of distance $d_I$ for a test image I is given by

$$T_d = \mu_d + C\cdot\sigma_d \tag{11}$$

where

$d_I = \left\|V_I - \bar{V}_X\right\|$;
$\mu_d = \frac{1}{N}\sum_{i=1}^{N} d_i = \frac{1}{N}\sum_{i=1}^{N}\left\|V_i - \bar{V}_X\right\|$;
$\sigma_d = \left\{\frac{1}{N}\sum_{i=1}^{N}\left(d_i - \mu_d\right)^2\right\}^{1/2}$;
$V_i$ is the feature vector of normal sample $x_i$ in the training set X;
$\bar{V}_X$ is the mean feature vector of the training set X;
C is a control limit constant.

The test image I is identified as defective if $d_I > T_d$. Otherwise, it is defect-free.
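The SPC rule of Eq. (11) can be sketched as follows, assuming the distances of the defect-free training samples from their mean feature vector have already been computed. The function names and the toy distances are hypothetical; the population standard deviation (1/N normalisation) is used to match the definition of σd.

```python
import statistics


def spc_threshold(train_distances, c=2.0):
    """Return T_d = mu_d + C * sigma_d from the training-sample distances."""
    mu_d = statistics.mean(train_distances)
    sigma_d = statistics.pstdev(train_distances)  # population form, 1/N
    return mu_d + c * sigma_d


def is_defective(d_test, threshold):
    """A test image is flagged as defective when its distance exceeds the threshold."""
    return d_test > threshold


# Hypothetical distances of four defect-free training samples.
train_distances = [1.0, 2.0, 3.0, 2.0]
t_d = spc_threshold(train_distances, c=2.0)

print(t_d, is_defective(5.0, t_d), is_defective(3.0, t_d))
```

A larger control constant C widens the acceptance band and trades false alarms for missed defects.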
Fig. 2. Natural wood examples for the evaluation: (a) 4 defect-free surface images; (b) contaminant; (c) oil-stain; (d) glue; (e) scratch.
4. Experimental results

In this section, the performance of the CAE models with and without regularization is evaluated for defect detection in various material surfaces. The proposed models and methods are implemented with TensorFlow and Keras. They are executed on a PC equipped with an Intel i7-7700 3.40 GHz CPU and one NVIDIA GTX 1080 Ti GPU. The CAE architecture used for feature extraction has been shown in Fig. 1. The detailed setting of each layer in the CAE models is presented in Table 1. The Adam optimizer with a learning rate = 0.001, batch size = 128 and epochs = 1000 has been applied to the conventional CAE and the regularized CAE models. The test sample images used in the experiments are provided by local manufacturers. They are available upon request from the authors. Public image datasets used for industrial optical inspection are also evaluated.
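To make the regularized losses of Eqs. (9) and (10) concrete, the sketch below evaluates them in plain Python for one epoch, given the images, their reconstructions, the encoded feature vectors and the statistics carried over from the previous epoch. It is not the authors' TensorFlow/Keras training code; all function and variable names are assumptions, and the short toy vectors stand in for flattened images and features.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def reconstruction_error(images, recons):
    """Mean squared reconstruction error over the training set (Eq. (6))."""
    return sum(squared_distance(x, xh) for x, xh in zip(images, recons)) / len(images)


def loss_v(images, recons, features, prev_mean, lam):
    """L_V: reconstruction error plus squared distance to the previous-epoch mean vector."""
    spread = sum(squared_distance(v, prev_mean) for v in features) / len(features)
    return (1 - lam) * reconstruction_error(images, recons) + lam * spread


def loss_d(images, recons, features, prev_mean, prev_mean_dist, lam):
    """L_d: only distances above the previous-epoch mean distance are penalized."""
    excess = sum(
        max(math.sqrt(squared_distance(v, prev_mean)) - prev_mean_dist, 0.0) ** 2
        for v in features
    ) / len(features)
    return (1 - lam) * reconstruction_error(images, recons) + lam * excess


# Toy values (hypothetical): two samples with 2-element "images" and features.
images = [[1.0, 0.0], [0.0, 1.0]]
recons = [[1.0, 1.0], [0.0, 1.0]]
features = [[3.0, 4.0], [0.0, 0.0]]
prev_mean = [0.0, 0.0]  # mean feature vector of the previous epoch

lv = loss_v(images, recons, features, prev_mean, lam=0.1)
ld = loss_d(images, recons, features, prev_mean, prev_mean_dist=2.0, lam=0.1)
print(lv, ld)
```

With λ = 0 both functions reduce to the mean squared reconstruction error of the conventional CAE.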
Fig. 2(a) presents four normal wood surface images that contain irregular structure-textures. Some involve knots that could be falsely identified as defects. Fig. 2(b)–(e) are defective test images that involve contaminant, oil-stain, glue and scratch. The wood images were taken by an IDS UI-3360CP camera from a working distance of 360 mm with coaxial lighting. The image resolution is 165 μm/pixel. The input image to the CAE models is of size 128 × 128 pixels, which corresponds to 21.1 × 21.1 mm².

A total of 1900 defect-free images, each of size 128 × 128 pixels, are used to train the standard CAE (Eq. (6)) and the proposed λ-CAE with loss function $L_V$ (Eq. (9)). The test images for the evaluation include 100 normal wood images and 200 defective wood images (50 samples for each defect type). Five feature maps, each of size 8 × 8, from the encoders are used to form a 320-dimensional feature vector V for the CAEs. The regularization factor λ is varied from 0, 0.05, 0.1, 0.2, 0.5 to 1. Note

varying λ values, where $V_{I_j}$ is the feature vector of the test sample $I_j$ and $\bar{V}_X$ is the mean feature vector of the training set X.

The box-plot shows the 50th percentile (median), 25th percentile and 75th percentile (and minimum and maximum) of the test data. The results reveal that the box-plots between defect-free and defective test samples can be best separated when λ = 0.1. It far outperforms the standard CAE (with λ = 0). When λ ≥ 0.5, the proposed λ-CAE with loss $L_V$ cannot distinguish the difference between defect-free and defective samples due to the blurred reconstruction. By using the SPC with control limit constant C = 2 to set up the threshold $T_d$ for the distance $d_{I_j}$, Table 2 summarizes the False Positive (FP) rate, False Negative (FN) rate and the overall detection accuracy (i.e. [TP + TN]/[(TP + FP)+(TN + FN)]) of the 300 wood test images from the λ-CAE model with varying λ values. It shows that the standard CAE (λ = 0) performs poorly with a low detection accuracy of 54%. The proposed λ-CAE with loss $L_V$ detects most of the true defects with a very low FN rate of 4%. The overall detection accuracy is increased to 90%. The experimental results indicate that the proposed λ-CAE model can indeed make the extracted feature vectors of defect-free training samples tightly concentrate around the mean feature vector. It causes the defective samples under test to fall far away from the cluster center of the defect-free training samples. Table 3 further presents the effect of various λ values for the λ-CAE with loss $L_d$. It also indicates that λ = 0.1 yields the best detection accuracy of 95%.

Table 2
Effect of various λ values in the loss LV for wood defect detection.
λ          0 (CAE)     0.05       0.1        0.2        0.5         1
FP rate    38% (38)    35% (35)   22% (22)   31% (31)   50% (50)    21% (21)
FN rate    50% (100)   34% (68)   4% (8)     47% (64)   63% (127)   66% (132)
Accuracy   54%         65%        90%        68%        41%         49%
*100 defect-free and 200 defective test images.

Table 3
Effect of various λ values in the loss Ld for wood defect detection.
λ          0 (CAE)     0.05       0.1        0.2        0.5         1

4.2. Effect of the number of filters (dimension of V)

The number of filters used in the last convolutional layer of the encoder determines the number of the corresponding feature maps and, thus, the dimension of the feature vector V. Let r × c be the feature-map size, and K the number of filters used for the convolution. The size of the feature vector is then r⋅c⋅K. The same wood images described above in Sec. 4.1 are also used for the evaluation. The number of filters is varied from 1, 2, 3, 5 to 10 with λ value fixed at 0.1. Table 4(a) and (b) present the detection statistics of the standard CAE and the proposed λ-CAE with loss $L_V$, respectively.

The SPC threshold with control constant C = 2 is used for the discrimination. The results show that the proposed λ-CAE performs far better than the standard CAE, regardless of the number of filters. For the proposed λ-CAE, an insufficient number of filters (such as 1, 2 or 3 filters for low-dimensional feature vectors) generates a low accuracy rate around 50%. When the number of representative feature maps is large enough, the detection accuracy can be boosted up to 90% with 5 filters. In the wood test examples, 10 convolutional filters yield a detection accuracy of 88%. Excessive feature maps may not generate better detection accuracy.

Table 4
Effect of various numbers of filters for feature extraction in defect detection: (a) standard CAE model; (b) λ-CAE model with loss LV. (Filter size: 8 × 8).
(a) CAE
No. of filters   1            2            3            5            10
FP rate          36% (36)     34% (34)     36% (36)     38% (38)     38% (38)
FN rate          64% (128)    69% (138)    60% (120)    50% (100)    60% (120)
Accuracy         45%          42%          48%          54%          47%
(b) λ-CAE-LV
No. of filters   1            2            3            5            10

4.3. Comparison of loss functions LV and Ld

In Section 3, two regularizations (loss functions $L_V$ and $L_d$) have been proposed to improve the CAE model for anomaly detection. The wood test examples described in Section 4.1 are used for the evaluation. The same CAE architecture with the same hyperparameter setting is used for the standard CAE and the λ-CAE models with loss $L_V$ and $L_d$. For the λ-CAE models, the regularization weight λ is fixed to 0.1. Fig. 4 presents the box-plots of the three comparative CAE models, where the distance d is shown in the y-axis.

Fig. 4. Box-plot of defect-free (x+) and defective (x−) wood test samples for the three CAE models, where the y-axis shows the distance $d_{I_j} = \left\|V_{I_j} - \bar{V}_X\right\|$.

The results show that the λ-CAE model with loss $L_d$ gives the best discrimination between defective and defect-free test images. The Fisher’s ratios (between-class variance/within-class variance) are 0.14, 1.54 and 2.90 for the standard CAE, λ-CAE with loss $L_V$, and λ-CAE with loss $L_d$, respectively. The discrimination power of the proposed λ-CAE models is significantly higher than the conventional CAE model. Fig. 5 further illustrates the ROC (Receiver Operating Characteristic) curves of the three comparative CAE models. The AUCs (Area Under Curves) are respectively given by 0.70, 0.89 and 0.96 for the standard CAE, λ-CAE with loss $L_V$, and λ-CAE with loss $L_d$. The λ-CAE model with loss $L_d$ generates an ROC with the AUC close to an ideal value of 1.

Fig. 5. ROC curves of the CAE, λ-CAE with LV and λ-CAE with Ld for the natural wood test samples.

In terms of optimization convergence of the CAE modeling training,
Fig. 6. Loss function over training epochs: (a) CAE; (b) λ-CAE with LV ; (c) λ-CAE with Ld trained by a fixed learning rate; (d) λ-CAE with Ld trained by adaptive
learning rate.
0.1 and 5 filters at the last layer of the encoder. The results show that the proposed CAE with loss $L_d$ yields a detection accuracy of 95%, which is far better than 54% given by the standard CAE.

Table 5
Performance comparison of the wood test samples for the three CAE models (based on 5 filters and λ = 0.1).
Model      FP rate    FN rate    Accuracy
Fig. 7. Liquid Crystal Display (LCD) test examples: (a) 4 defect-free panel surfaces; (b) defective samples (marked by dot-frames).
with two bar-shaped LED lights for the illumination. The image resolution is 16.6 μm/pixel.

The LCD panel is composed of orthogonal data lines and gate lines and results in homogeneously structured textures in the image. The defects could be particles, pinholes or contaminants in the LCD surfaces. The image patch size for the experiment is 28 × 28 pixels, which corresponds to 0.47 × 0.47 mm². The feature map is 7 × 7. The number of defect-free LCD sample images for training the CAE models is 400. The test image patches contain 100 defect-free and 33 defective samples. Table 6 lists the resulting FP and FN rates from the standard CAE and the proposed λ-CAE with loss $L_V$ and $L_d$. For the highly regular texture-surfaces, the standard CAE yields good detection results with FP rate = 11% and FN rate = 6%. The proposed λ-CAE model with loss $L_V$ or $L_d$ improves the performance to no false alarms and no missed detections for all defect-free and defective LCD test samples.

Table 6
Detection results on the LCD test samples by CAE, λ-CAE with loss LV, and λ-CAE with loss Ld.
Model       FP rate     FN rate    Accuracy
CAE         11% (11)    6% (2)     90%
λ-CAE-LV    0% (0)      0% (0)     100%
λ-CAE-Ld    0% (0)      0% (0)     100%
*100 defect-free and 33 defective test images.

B. PCB fiberglass substrates with less-regular background texture

Fig. 8. PCB fiberglass substrates: (a) 4 defect-free woven surfaces; (b) defects of various types on fabric weaves.

Fig. 8(a) and (b) show, respectively, defect-free and defective woven fiberglass substrates of printed circuit boards (PCBs). The PCB fabric weaves also present structural textures in the image. The surface patterns are relatively blurred with some structure variations, compared to the LCD patterns above. As observed closely from the demonstrative images, the test samples contain varying structural patterns. This is because the test samples were collected from different machine vision systems with different image resolutions in various production lines. The images were taken by a line-scan 8 K mono camera with line-bar lighting for illumination. The working distance was around 400 mm, depending
on the manufacturing sites. The test image is of size 100 × 100 pixels, which corresponds to 8 × 8 mm². The image resolution is 80 μm/pixel. The defects could be blob-shaped in random locations, or line-shaped along the warp or weft direction in the woven substrates. The input image size to the CAE models is 100 × 100 pixels. The feature map is of size 13 × 13 for feature extraction. A total of 500 defect-free samples are used for the model training. The test images for the evaluation contain 100 defect-free and 200 defective samples. Table 7 presents the resulting FP and FN rates from the standard CAE and the proposed λ-CAE with loss $L_V$ and $L_d$. The results show that the standard CAE yields a low FP rate of 7% with a poor FN rate of 44%. The proposed λ-CAE models also give a low FP rate of 4%, and improve the FN rate down to 21%. The detection accuracy is improved from 68% up to 85%.

Table 7
Detection results on the PCB fiberglass substrates by CAE, λ-CAE with loss LV, and λ-CAE with loss Ld.
Model       FP rate    FN rate     Accuracy
CAE         7% (7)     44% (88)    68%
λ-CAE-LV    4% (4)     25% (50)    82%
λ-CAE-Ld    4% (4)     21% (41)    85%
*100 defect-free and 200 defective test images.

The experiment has revealed that the proposed λ-CAE model can indeed improve the discrimination for anomaly detection, even when the defect-free training samples involve a variation in the texture background. It is believed that the CAE-based anomaly detection can be further improved if the CAE models are individually trained with the texture samples from the same imaging system.

C. Medical acne patches with non-textural patterns

The last demonstrative examples are medical acne patches. Fig. 9(a) and (b) show, respectively, defect-free and defective samples. The patterned acne patch is circular without textures on the surface. The defective patches could involve contaminants of various sizes at random locations. Some defects are around the edges of the circular pads. The non-textured pattern images with subtle small defects are highly challenging for anomaly detection when only defect-free samples are used for the training. The IDS UI-3360CP camera with a 25 mm lens at a working distance of 150 mm was used to take the image. Coaxial lighting was used for illumination. A 300 × 300 image corresponds to 10 × 10 mm² in physical size. The image resolution is 33 μm/pixel.

Fig. 9. Patterned acne patches: (a) 4 defect-free sample images; (b) defective sample images.

The acne patch image to the CAE models is of size 300 × 300 pixels. The feature-map size is 25 × 25 for feature extraction. A total of 400 defect-free samples are used for the model training. The test samples for the evaluation contain 100 defect-free and 500 defective images. Table 8 lists the resulting FP and FN rates from the standard CAE and the proposed λ-CAE with loss $L_V$ and $L_d$. It shows that the standard CAE and the proposed λ-CAE with loss $L_V$ yield the same FN rate of 20%. However, the λ-CAE model with loss $L_V$ improves the FP rate from 10% to a low 1%. The λ-CAE with loss $L_d$ further improves the detection accuracy from 81% to 86%.

Table 8
Detection results on the medical acne patches by CAE, λ-CAE with loss LV, and λ-CAE with loss Ld.
Model       FP rate     FN rate      Accuracy
CAE         10% (10)    20% (100)    81%
λ-CAE-LV    1% (1)      20% (100)    83%
λ-CAE-Ld    1% (1)      16% (79)     86%
*100 defect-free and 500 defective test images.

D. Public DAGM datasets

The DAGM competition datasets available on Kaggle for industrial optical inspection are also evaluated in the experiment. The DAGM datasets contain textured surfaces with tiny defects in 512 × 512 images. Two DAGM datasets that involve structural texture and random texture are analyzed. The representative DAGM textured images for the evaluation are demonstrated in Fig. 10. The full 512 × 512 image size is used for the experiments. Each textured pattern is trained with 900 defect-free samples, and is evaluated with 100 defect-free and 100 defective samples. The performance of the conventional CAE and the regularized CAE models for the 2 DAGM datasets is presented in Table 9. The proposed methods can detect small defects in a large textured image. The results show that the regularized CAE models can improve the detection accuracy by 5% to 10% over the conventional CAE model for the two DAGM datasets.

E. Comparison with other anomaly detection methods

The proposed methods (CAE with regularizations $L_V$ and $L_d$) are compared to Deep SVDD (Support Vector Data Description) (Refs.
Fig. 10. DAGM datasets for the evaluation: (a) structural textures (defect-free vs. defective); (b) random textures (defect-free vs. defective).
Table 9
Detection results on the DAGM datasets by CAE, λ-CAE with loss LV, and λ-CAE with loss Ld.

Table 10
Performance comparison of anomaly detection methods on various test datasets. (CAE models are based on 5 filters and λ = 0.1).
(e) MVTec carpet images (Fig. 11)
Model      FP rate    FN rate    Accuracy

Fig. 11. MVTec carpet dataset for the evaluation: (a) defect-free image; (b) cut-defect sample.
Fig. B1. ROC curves of CAE based on feature extraction (dx ) and reconstruction
error (Δεx ).
In this appendix, the performance of the CAE and VAE models for defect detection is evaluated. A VAE model also inherits the CAE architecture, but
learns the joint distribution over the input image and a set of latent variables by Bayesian inference, where the latent variables are restricted to be a
standard multivariate Gaussian distribution. The wood image examples described in Section 4.1 are used for the evaluation. The training and test
samples are the same as those described in the text. The feature vector is 320-dimensional for the CAE and, thus, the latent variables are set to 320 for
the VAE for a fair comparison. Fig. A1 shows the box-plots of defect-free and defective test samples for the CAE and VAE. The Fisher’s ratios are 0.14
and 0.09 for CAE and VAE, respectively.
Fig. A2 further presents the ROC curves of the CAE and VAE models. The ROC curve of CAE completely dominates that of VAE for defect detection
in natural wood surfaces. All evaluation measures indicate the representative features extracted from the CAE perform better than the latent variables
extracted from the VAE for defect detection applications when only defect-free samples are used for the model training.
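The Fisher's ratio quoted above can be illustrated with a small sketch. The exact normalisation used in the paper is not stated in this text, so the code assumes one common two-class form: the squared difference of the class means divided by the sum of the class variances. The function name and the toy distances are hypothetical.

```python
import statistics


def fisher_ratio(normal_distances, defect_distances):
    """Two-class Fisher ratio (assumed form): (m1 - m0)^2 / (var0 + var1)."""
    m0 = statistics.mean(normal_distances)
    m1 = statistics.mean(defect_distances)
    var0 = statistics.pvariance(normal_distances)
    var1 = statistics.pvariance(defect_distances)
    return (m1 - m0) ** 2 / (var0 + var1)


# Hypothetical distances: well separated classes versus overlapping classes.
separated = fisher_ratio([1.0, 2.0, 3.0], [6.0, 8.0, 10.0])
overlapping = fisher_ratio([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(separated, overlapping)
```

A larger ratio means the two box-plots sit further apart relative to their own spread, which is the sense in which the ratios of the compared models are ranked.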
This appendix evaluates the performance of feature extraction and reconstruction error for defect detection with CAE. The discriminant measure is given by the distance $d_{x_I} = \left\|V_{x_I} - \bar{V}\right\|$ for feature extraction, and $\Delta\varepsilon_{x_I} = \left\|x_I - \hat{x}_I\right\|$ for reconstruction error, where $x_I$ is the input test image, $\hat{x}_I$ the reconstructed image, and $V_{x_I}$ the extracted features from the trained CAE. The natural wood examples described in Section 4.1 are used for the evaluation. The SPC thresholdings are applied to $d_{x_I}$ and $\Delta\varepsilon_{x_I}$ for the discrimination. Fig. B1 illustrates the ROC curves based on the feature extraction and reconstruction error. The results indicate that the ROC curve of the feature-extraction measure completely dominates that of the reconstruction-error measure. For defect detection applications that involve small defect sizes, the feature-extraction measure outperforms the reconstruction-error measure.
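A ROC comparison of two discriminant measures can be summarised by the area under each curve. The sketch below computes the AUC directly from per-image scores as the fraction of (defective, defect-free) pairs that are ranked correctly, with ties counted as one half; this pairwise form is equivalent to the area under the empirical ROC curve. The function name and all scores are hypothetical and are not taken from the experiments.

```python
def roc_auc(normal_scores, defect_scores):
    """AUC as the probability that a defective sample scores higher than a defect-free one."""
    wins = 0.0
    for d in defect_scores:
        for n in normal_scores:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(defect_scores) * len(normal_scores))


# Hypothetical scores for the same four defect-free and four defective images.
normal = [1.0, 2.0, 3.0, 4.0]
auc_feature = roc_auc(normal, [5.0, 6.0, 3.5, 7.0])  # feature-distance measure
auc_recon = roc_auc(normal, [2.5, 1.5, 5.0, 3.0])    # reconstruction-error measure
print(auc_feature, auc_recon)
```

An AUC of 1 corresponds to perfect separation and 0.5 to chance-level ranking.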
References [11] J.B. Florindo, O.M. Bruno, Texture analysis by fractal descriptors over the wavelet
domain using a best basis decomposition, Physica A 444 (2016) 415–427.
[12] W.-C. Li, D.-M. Tsai, Wavelet-based defect detection in solar wafer images with
[1] X. Xie, A review of recent advances in surface defect detection using texture
inhomogeneous texture, Pattern Recogn. 45 (2012) 742–756.
analysis techniques, Electron. Lett. Comput. Vision Image Anal. 7 (2008) 1–22.
[13] M. Ahmed, A.N. Mahmood, J. Hu, A survey of network anomaly detection
[2] K. Hanbay, M.F. Talu, O.F. Ozguven, Fabric defect detection systems and methods -
D.-M. Tsai and P.-H. Jen Advanced Engineering Informatics 48 (2021) 101272