Monitoring of Casting Quality Using Principal Component Analysis and Self Organizing Map
https://doi.org/10.1007/s00170-022-08993-9
ORIGINAL ARTICLE
Received: 2 November 2021 / Accepted: 27 February 2022 / Published online: 7 March 2022
© The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2022
Abstract
Monitoring casting quality is essential for the safe operation of casting processes. To improve the
detection of casting defects, this paper presents a combined method based on principal component
analysis (PCA) and the self-organizing map (SOM). The proposed method reduces the dimensionality of
the original data by projecting them onto a smaller subspace through PCA, and uses Hotelling's T2 and
Q statistics as essential features for characterizing the process functionality. The SOM is used to
improve the separation between casting defects: it computes similarity measures based on metric
distances, using the T2 and Q (T2Q) statistics as input. A comparative study of the conventional SOM,
the SOM with reduced data, and the SOM with selected features is carried out. The proposed method is
used to identify the running conditions of the low pressure lost foam casting process. The monitoring
results indicate that the SOM with T2Q feature vectors separates the casting conditions more clearly
than the conventional SOM and the SOM based on reduced data.
Keywords Condition monitoring · Principal component analysis · Self-organizing map · Casting quality · Hotelling’s T2 · Q
statistic
3600 The International Journal of Advanced Manufacturing Technology (2022) 120:3599–3607
Table 2 Process variables

Variable  Description               Unit
T         Temperature               °C
P         Pressure                  bar
S         Rise (height of filling)  m
T1        Temperature 1             °C
T2        Temperature 2             °C
T3        Temperature 3             °C
T4        Temperature 4             °C
T5        Temperature 5             °C

The experimental measurements presented in this paper are entirely based on the casting data acquired
from the low pressure lost foam casting process. As shown in Fig. 1, the casting process utilizes a
resistance furnace capable of melting standard aluminum base alloys. Air pressure is applied to the
chamber containing the crucible to push liquid metal up into a flask containing the foam pattern and
unbonded sand. More details about the casting process were reported in [19, 20].

The low pressure lost foam casting process is utilized to create complex castings. However, filling
the mold with molten metal can produce an undesired casting. Generally, a casting defect is defined as
any observable and unplanned variation. If a defect occurs, measures must be adopted to control and
monitor the casting conditions. The data used in this study were obtained through experimental
measurements under normal (NR), or healthy, operating conditions and under abnormal operation of the
casting process with three defects: cracks (SC), holes (SH), and metal penetration (MP).

To acquire data, five thermocouples were installed in the process as temperature transducers. The
pressure and temperature inputs were wired to a National Instruments (NI) data acquisition board with
a USB interface. The foam pattern was supported in the flask and the thermocouples were wired to the
data acquisition unit. When the vessel is pressurized, liquid metal rises through a steel pipe into
the flask. All test parts were cast using AlSi12 alloy at temperatures between 725 and 750 °C
(Table 1). NI DASYLab software was used to collect and analyze the signals from the temperature
sensors. The measured variables are listed in Table 2 and presented in Fig. 2.

Ten days of experimental measurements were collected from 9 casting tests, including normal data and
faulty data. Each data set includes eight measurement variables and m observations or samples at
different times, which makes it possible to construct an m × 8 input matrix. The matrices obtained
from all measured data are then used to compute the PCA and SOM models to better control the quality
of castings.

3 Improvement of condition monitoring using the hybrid form PCA-SOM

3.1 Related theories

3.1.1 Principal component analysis

PCA [21] is a multivariate statistical analysis technique that reduces the original data space to a
lower-dimensional space while preserving the main information in the original data. Given a data
matrix $X \in \Re^{m \times n}$ composed of m observations or samples and n variables, normalized to
zero mean and unit variance, PCA is only interested in its variance and covariance. PCA relies on the
eigenvalue/eigenvector decomposition of the covariance or correlation matrix C given as follows:
[Fig. 2: measured process variables plotted against time (s). Panels: Temperature (°C), Pressure (bar), Rise (m), Temperature1 to Temperature5 (°C)]
$$C = \frac{1}{m-1} X^{T} X = P D P^{T} \qquad (1)$$

where $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ is a diagonal matrix with diagonal elements in
decreasing order of magnitude and P contains the eigenvectors.

PCA determines an optimal linear transformation of the data matrix X in terms of capturing the
variation in the data as follows:

$$T = X P \quad \text{and} \quad \hat{X} = T P^{T} \qquad (2)$$

where $T \in \Re^{m \times k}$ is the principal component matrix, $P \in \Re^{n \times k}$ contains
the principal vectors, which are the eigenvectors associated with the eigenvalues $\lambda_i$ of the
covariance matrix, and k denotes the number of principal components (PCs) of the PCA model. A key
issue in developing a PCA model is choosing an adequate number of PCs. A number of well-known
techniques have been proposed for selecting the number of PCs [22]. In this study, k is selected
according to the cumulative percent variance (CPV) principle [23]. CPV measures the percentage of the
total variance, such as 85%, captured by the first k PCs.

The difference between X and $\hat{X}$ is the residual matrix E. It can be calculated as follows:

$$E = X - \hat{X} = X (I - P P^{T}) \qquad (3)$$

where I is the identity matrix.

To perform process fault detection, a PCA model of the normal operating conditions must be built.
When new observation data are subject to faults, they can be compared to the PCA model. Changes in
the correlation of the new data are detected by Hotelling's T2 and Q statistics, the latter also
called the squared prediction error (SPE), as follows:

$$T^{2} = X^{T} P D^{-1} P^{T} X \qquad (4)$$

$$Q = E^{T} E = (X - \hat{X})^{T} (X - \hat{X}) \qquad (5)$$

The process is considered normal if Hotelling's T2 and Q statistics do not exceed their threshold
values. These statistics alone could not detect the faulty conditions. In this study, they are used
as input to the SOM algorithm to further improve the separation between casting conditions.

The computing steps using PCA method are summarized as follows:

[…]

Artificial neural networks (ANNs) are mathematical or computational models inspired by biological
nervous systems. ANNs comprise an interconnected group of artificial neurons, and they have long been
used for data-driven decision-making. Depending on their learning process (supervised or
unsupervised), ANNs are run on a computer to perform specific tasks such as optimization and pattern
recognition. They can help draw useful conclusions and interpretations from observed data.

ANNs using supervised learning, such as the multi-layer perceptron (MLP), probabilistic neural
networks (PNN), and radial basis functions (RBF), have proved to be advantageous in obtaining a good
model with accurate predictions. However, they require that the output vector be known for the
training phase. Unlike supervised learning, unsupervised learning does not require the output vector
to be known, i.e., the network does not use training pairs consisting of an input vector and a
desired output.

The SOM, invented by Kohonen [24], is a kind of ANN that uses unsupervised competitive learning to
map a high-dimensional input space (the data space) onto a low-dimensional output space, usually of
two dimensions. It is a technique to group data with similar characteristics. Each neuron or node
comprises a vector of weights of the same dimension as the input data vectors. The SOM is trained by
presenting the data repeatedly and updating the weights to learn the structure of the data.

The SOM consists of an input layer and a competitive or output layer, fully interconnected to each
other. The output layer consists of m neurons. Each neuron i (i = 1, 2, ..., m) is represented by an
n-dimensional weight vector $w_i = [w_{i1}, \ldots, w_{in}]$, where n is the dimension of the input
vector. In the output layer, the competitive process takes place and the connection weights are
updated to choose a winner neuron.

The first key step in the SOM learning process (training) is to determine, for each input vector, its
best matching unit (BMU). The BMU is the node that is most similar to the input vector. If we denote
by b the BMU of input vector x and by $w_b$ the weight vector of this BMU, the identification is
based on the minimum Euclidean distance, which is defined as follows:

$$\| x - w_b \| = \min_{i = 1, \ldots, m} \{ \| x - w_i \| \} \qquad (6)$$
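The BMU search of Eq. (6) can be illustrated with a minimal sketch. The function name `best_matching_unit`, the 3 × 3 map size, and the random weights are assumptions made for illustration only; they are not taken from the paper, and the weight update step of the training is not shown.

```python
import numpy as np

def best_matching_unit(x, weights):
    """Return (b, distance): index of the neuron closest to x, as in Eq. (6).

    weights has shape (m, n): one n-dimensional weight vector per neuron.
    """
    distances = np.linalg.norm(weights - x, axis=1)  # Euclidean distance to each neuron
    b = int(np.argmin(distances))                    # winner neuron index
    return b, float(distances[b])

# Hypothetical example: a 3 x 3 map (m = 9 neurons) with 2-dimensional inputs.
rng = np.random.default_rng(0)        # fixed seed, illustration only
weights = rng.random((9, 2))          # random initial weights (assumption)
x = np.array([0.2, 0.7])              # one input vector
b, dist = best_matching_unit(x, weights)
```

In a trained map, the weight vector `weights[b]` of the winner neuron is the quantity that the distance measures of the next section compare with the input vector.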
3.2 Similarity measure

After SOM training, the winner neuron is used to compute metric distances. Distances or similarity
measures are essential to solve many pattern recognition problems such as classification and
clustering. Various distance measures are applicable. In this work, we consider a similarity
evaluation based on the Euclidean distances [16, 17]. Different formulas are used to measure
distances between two data points characterizing two monitoring conditions. The following distances
can be applied:

• Distance between input and neuron data:

$$\mathrm{DIN}(x, w_b) = \sqrt{\sum_{j=1}^{n} (x_j - w_{bj})^{2}} \qquad (8)$$

• Mean or gravity center of the input data:

$$\mathrm{GCI} = (1/n) \sum_{j=1}^{n} x_j \qquad (9)$$

• Gravity center of the neuron data:

$$\mathrm{GCN} = \sum_{j=1}^{n} w_{bj} x_j \Big/ \sum_{j=1}^{n} w_{bj} \qquad (10)$$

[…] current operating data and process conditions. The differences between two running conditions i
and j (ΔDIN = DINi − DINj and ΔOGX = OGXi − OGXj) are used as indexes to distinguish the defective
and healthy casting conditions.

3.3 Description of the proposed method

In order to construct a successful casting defect identification system, a combination of PCA and
SOM is described in this work (Fig. 4). PCA is trained with an input matrix that contains the process
variables, where the goal is to establish the normal statistical correlation among the measured data
and to characterize the operating conditions of the casting process using Hotelling's T2 and Q
statistics. Sometimes, these statistics cannot exactly detect the defective conditions. They are
therefore proposed in this study as characteristic features of the measured casting data. The chosen
features are used as input to the SOM algorithm, which is used for casting quality evaluation. The
main objective is to compute the distances between the input vectors and the winning neuron to better
identify casting defects. A comparative study between the conventional SOM, the SOM with reduced data
matrix (RD), and the SOM with selected features (T2Q) is examined.
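The feature extraction stage described above can be sketched as follows, assuming a NumPy implementation. The function names (`fit_pca`, `t2q_features`, `din`), the synthetic data, and the stand-in weight vector `w_b` are hypothetical and serve only as an illustration; the paper does not give an implementation, and the sketch uses the correlation matrix of standardized data with the CPV rule at 85%.

```python
import numpy as np

def fit_pca(X_normal, cpv_target=0.85):
    """Fit a PCA model on normal-condition data (rows = samples, columns = variables)."""
    mean = X_normal.mean(axis=0)
    std = X_normal.std(axis=0, ddof=1)
    Z = (X_normal - mean) / std                    # zero mean, unit variance
    C = Z.T @ Z / (Z.shape[0] - 1)                 # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(C)           # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]              # reorder to decreasing magnitude
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cpv = np.cumsum(eigvals) / eigvals.sum()       # cumulative percent variance
    k = min(int(np.searchsorted(cpv, cpv_target)) + 1, len(eigvals))
    return {"mean": mean, "std": std, "P": eigvecs[:, :k], "lam": eigvals[:k], "k": k}

def t2q_features(X, model):
    """Return an (m, 2) array holding Hotelling's T2 and Q for each sample."""
    Z = (X - model["mean"]) / model["std"]
    T = Z @ model["P"]                             # scores, Eq. (2)
    t2 = np.sum(T**2 / model["lam"], axis=1)       # T2 per sample, Eq. (4)
    E = Z - T @ model["P"].T                       # residuals, Eq. (3)
    q = np.sum(E**2, axis=1)                       # Q (SPE) per sample, Eq. (5)
    return np.column_stack([t2, q])

def din(x, w_b):
    """Euclidean distance between an input vector and a neuron weight vector, Eq. (8)."""
    return float(np.sqrt(np.sum((x - w_b) ** 2)))

# Synthetic stand-in for an m x 8 matrix of process variables (not the paper's data).
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
X_normal = latent @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(200, 8))

model = fit_pca(X_normal)
features = t2q_features(X_normal, model)           # T2Q feature vectors for the SOM
w_b = features.mean(axis=0)                        # placeholder for a BMU weight vector
d = din(features[0], w_b)
```

Each row of `features` is a two-dimensional T2Q vector that would be presented to the SOM in place of the eight raw variables; in an actual run, `w_b` would come from the trained map rather than from the feature mean used here as a placeholder.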
Table 3 Eigenvalues and variances of PCs

PCs  Eigenvalues  Variances (%)
1    6.4363       80.45
2    1.0164       12.71
3    0.2960       3.70
4    0.1143       1.43
5    0.0700       0.87
6    0.0395       0.50
7    0.0199       0.24
8    0.0076       0.10

During this test period, three types of casting defects (SC, SH, MP) were recorded and the remainder
of the castings were obtained without defects (NR). In total, 8 variables and 930 data points at
different times, for each condition, are collected into a 930 × 8 matrix for computing the PCA and
SOM models. The obtained matrices are used to evaluate the performance of the SOM and PCA-SOM
techniques; the corresponding indexes are thus calculated and discussed.

Through PCA, the eigenvalues of the covariance matrix, which are the variances of the PCs, are listed
in Table 3. The first two PCs explain over 85% of the total variance of the data (93.16%
cumulatively). The PCA model is built from them, and the monitoring performance is then evaluated. As
shown in Fig. 5, all process variables are correctly estimated with this PCA model, although some
variables are estimated less accurately than others.

According to the SOM algorithm and PCA-SOM with RD as input (PCA-SOM/RD), casting conditions are
evaluated by the metric distances DIN and OGX given by Eqs. (8) and (11), respectively. The results
obtained are shown in Figs. 6 and 7. As can be seen, it is not easy to distinguish between normal and
abnormal conditions, or between false and missed alarms.

The proposed algorithm, PCA-SOM with T2Q as input (PCA-SOM/T2Q), is tested using the same computed
[Fig. 5: process variables plotted against time (s). Panels: Temperature (°C), Pressure (bar), Rise (m), Temperature1 to Temperature5 (°C)]
indexes. The isolation of the normal and abnormal casting conditions is clearer than with the SOM and
PCA-SOM/RD. The tested distances indicate that the casting defects are successfully identified
(Fig. 8). The output results can be used to describe the capability of fault isolation.

To evaluate the performance of the SOM, PCA-SOM/RD, and PCA-SOM/T2Q techniques, a comparative study
using the ΔDIN and ΔOGX distances is carried out. Differences between computed indexes of defective
and healthy casting conditions are listed in Table 4. From obtained
[Figs. 6 to 8: DIN index and OGX index plotted against sample number for the NR, SC, SH, and MP conditions]
18. Bendjama H, Mahdi D (2016) Computer code for materials diagnosis using Monte Carlo method and neural networks. J Fail Anal Prev 16(4):931–937
19. Lang L (1999) Development and testing of a low pressure lost foam casting machine and an examination of the process. PhD thesis, Freiberg University, Germany
20. Bast J, Aitsuradse M, Hahn T (2004) Advantages of the low pressure lost foam casting process. AFS Transactions 112:1131–1144
21. Anderson TW (2003) An introduction to multivariate statistical analysis. John Wiley & Sons, California
22. Tamura M, Tsujita S (2007) A study on the number of principal components and sensitivity of fault detection using PCA. Comput Chem Eng 31:1035–1046
23. Jackson JE (2003) A user's guide to principal components. John Wiley & Sons, New York
24. Kohonen T (2001) Self-organizing Maps. Springer-Verlag, Berlin

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.