1. Introduction
Sapphire crystal has excellent thermal, optical, and mechanical properties, which give it important scientific research value and broad market prospects [1, 2, 3], for example as a substrate material for light-emitting diodes. However, natural sapphire is rare and its purity is low, so it cannot meet the large demand from industry and other fields. Therefore, it is important to produce sapphire efficiently and in large quantities by artificial synthesis.
The growth of artificial crystals is usually carried out in a melting furnace through the processes of melting, seeding, crystallizing, and cooling [4], and the whole process depends on the temperature distribution of the melt surface. Because the melting point of sapphire crystal is as high as 2045 °C, tungsten rods are used to heat the crucible furnace, and the maximum electric current through the tungsten rods exceeds 7000 A when the equipment is working. Hence, it is difficult to measure the temperature in the furnace directly with sensors or to set up pyrometers to obtain the temperature distribution. It is therefore indispensable to develop technologies for observing the status of crystal growth, and measuring the temperature distribution by machine vision is a good approach [5]. With it, operators can judge the growth status of the crystal in the furnace through a human-machine interface rather than observing the melt surface with the naked eye, since prolonged observation of the melt surface is harmful to the eyes. A CCD image sensor, produced by Hitachi Corp. in Japan, is used here for image acquisition. Although the images obtained by this sensor are generally of good quality, in practice they are affected by a number of factors. Because the tungsten rods are arranged along the edge of the furnace, the heat radiation from them is not consistent in every direction, and the light in the furnace is strong. In this environment, the target and the background appear very similar. This special signal environment and the high temperature of more than 2000 °C lead to low contrast and low resolution in the CCD images. In addition, uneven and random interference, such as oil mist and noise from the heater, often obscures the texture of the melting interface. All of these factors make it difficult to form a correct description of the status of crystal growth.
The temperature distribution of the melting interface in the seeding step has an important influence on product quality [6]. The purpose of seeding is to remove impurities and dislocations [7]. For this reason, the key is to find the right growth temperature for the sapphire and to develop fine seeds. As shown in Figure 1, improper operation directly leads to crystal defects, including polycrystallinity, cracking, and air bubbles. Therefore, it is necessary to collect information on the melting interface with the CCD and to judge the growth status from the segmented images in real time [7]. This is also valuable for realizing automatic and large-scale production.
To obtain the growth status of sapphire correctly and quickly, image segmentation is an important topic, and it is the focus of this paper. Many techniques have been developed for image segmentation. Among them, the principle of similarity is widely used: it partitions an image according to the similarity among image objects under pre-defined criteria. Thresholding, edge detection, region splitting, and clustering are similarity-based methods [8, 9]. In recent years, increasing importance has been given to novel technologies such as the fractal-wavelet technique in image segmentation [10, 11].
1.1. Methods Based on Threshold Segmentation
Threshold segmentation is one of the most widely used techniques for image segmentation. It divides a digital image using one or several threshold values, and pixels whose gray values fall in the same interval are regarded as belonging to the same object [8]. The method is simple and convenient, and it does not need prior information about the image. However, lighting and noise affect the determination of the threshold, and image information may be lost if the gray value of the target is close to that of the background. Hence, it does not work well for an image whose histogram has no obvious peaks or has broad and flat valleys [12].
1.2. Methods Based on Edge Detection Segmentation
Edge detection realizes image segmentation by recognizing and extracting the features of the image boundary, and it can detect image contours. The image obtained after edge segmentation retains the morphological features of the original image well. At the edge of a region, the gray value changes significantly; in the interior of a region, by contrast, the gray value changes little [9]. Thus, the maximum of the first derivative or the zero crossing of the second derivative can be used to determine the boundary of the image. The first-order differential operators include the Roberts, Sobel, Prewitt, and Kirsch operators. The second-order differential operators include the Laplacian operator. An advantage of edge detection algorithms is that they reduce to a convolution between a template and the image, which greatly reduces the time complexity. Edge detection works well for images with good contrast between regions, but not for images in which the edges are ill-defined or too numerous [12]. To minimize the impact of noise on the segmentation, image smoothing is often required in these methods.
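As a minimal sketch of template-based edge detection, the following pure-Python function slides the two 3 × 3 Sobel templates over a gray image stored as a nested list. The function name, the zero-valued border, and the toy image are illustrative assumptions and are not taken from the method in this paper.

```python
def sobel_magnitude(img):
    """Return the Sobel gradient magnitude of a 2-D list of gray values.

    Border pixels are left at 0 for simplicity.
    """
    gx_kernel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    gy_kernel = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            # Apply both templates to the 3 x 3 neighbourhood of (r, c).
            for dr in range(3):
                for dc in range(3):
                    pixel = img[r + dr - 1][c + dc - 1]
                    gx += gx_kernel[dr][dc] * pixel
                    gy += gy_kernel[dr][dc] * pixel
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


# A vertical step edge: dark left half, bright right half.
image = [[10, 10, 10, 200, 200, 200] for _ in range(5)]
edges = sobel_magnitude(image)
```

The response is zero inside the flat areas and large only in the two columns next to the step, which is why a low-contrast image with weak steps gives weak, noise-sensitive edge responses.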
1.3. Methods Based on Region Segmentation
Region segmentation divides an image into different regions according to similarity criteria. Common region-based methods include region growing, region splitting and merging, and the watershed method. The key to region growing is to select the right seed points and determine the right growth rules [8]. The advantage of region segmentation is that it considers not only the similarity of pixel gray values but also the spatial adjacency of pixels. It works best when the region homogeneity criterion is easy to define. However, region growing depends inherently on the selection of the seed region and on the order in which pixels and regions are examined, and the segments resulting from region splitting appear too square because of the splitting scheme [12].
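Region growing can be illustrated with a short sketch. The growth rule used here (4-connectivity and a fixed tolerance around the seed value) is only one possible choice, and the names `region_grow`, `seed`, and `tol` are hypothetical.

```python
from collections import deque


def region_grow(img, seed, tol):
    """Grow a 4-connected region from `seed`, accepting pixels whose gray
    value differs from the seed value by at most `tol`."""
    rows, cols = len(img), len(img[0])
    seed_value = img[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(img[nr][nc] - seed_value) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


image = [
    [12, 14, 90, 95],
    [11, 13, 92, 94],
    [10, 15, 91, 13],
]
dark = region_grow(image, seed=(0, 0), tol=10)
```

The pixel at (2, 3) has a gray value close to the seed but is not connected to it, so it stays outside the region. This is the spatial adjacency constraint that pure thresholding lacks.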
1.4. Methods Based on Clustering Segmentation
The clustering method divides the sample data into different categories according to a similarity measure. There are two problems to solve in this method: one is how to measure the similarity between sample data, and the other is how to divide the data into categories according to that similarity [13]. The main advantage of this method is its strong adaptability to various images, while its disadvantages are that it can fall into local optima and has poor noise resistance.
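As an illustration of the two questions above, the sketch below uses k-means on gray values: similarity is the absolute gray-level difference, and each pixel is assigned to the nearest cluster center. K-means is chosen here only as a familiar example of clustering; the function name and the initial centers are assumptions.

```python
def kmeans_gray(values, centers, iterations=20):
    """One-dimensional k-means on gray values, starting from `centers`."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Similarity measure: absolute gray-level difference.
            nearest = min(range(len(centers)),
                          key=lambda k: abs(v - centers[k]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster received no pixels.
        new_centers = [sum(c) / len(c) if c else centers[k]
                       for k, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


pixels = [10, 12, 11, 13, 200, 205, 198, 202]
final_centers = kmeans_gray(pixels, centers=[0, 255])
```

Because the result depends on the initial centers, a poor initialization can leave the iteration in a local optimum, which is the weakness noted above.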
Among the existing segmentation methods above, thresholding holds the prime position in terms of robustness, simplicity, and accuracy [14]. Thresholding-based segmentation subdivides an image into smaller segments, using at least one gray-level value to define their boundaries. However, it is difficult to choose the threshold values because the gray-level histograms of real-world images are more complex than bimodal histograms.
Over the past few years, many thresholding-based segmentation techniques have been reported [15, 16, 17]. Among them, Otsu’s method, based on the principle of between-class variance, has been shown to be one of the best thresholding methods [15]. It selects the optimal threshold values by maximizing the between-class variance of the histogram. The conventional Otsu method is mostly applied to bi-level thresholding problems. When it is extended to multilevel thresholding, the exhaustive search suffers from exponential growth in computational complexity as the number of thresholds increases, so the extension is impractical. Several methods have been reported to improve efficiency and reduce complexity. Typically, recursive algorithms reduce the processing time needed to determine the optimum threshold values with the help of a lookup table [16, 17]. Even so, their computational time still increases with the number of thresholds.
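For reference, bi-level Otsu thresholding can be sketched in a few lines. The function below scans every candidate threshold once and keeps the one with the largest between-class variance; the function name and the 8-level toy histogram are illustrative assumptions.

```python
def otsu_threshold(hist):
    """Return the threshold t maximizing the between-class variance.

    `hist[i]` is the number of pixels with gray level i. Pixels with level
    <= t form class 0 and the remaining pixels form class 1.
    """
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # pixel count of class 0
    sum0 = 0    # gray-level sum of class 0
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        var_between = (w0 / total) * (w1 / total) * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


hist = [5, 5, 0, 0, 0, 0, 4, 6]   # two well-separated modes
t = otsu_threshold(hist)
```

With one threshold the scan is linear in the number of gray levels. An exhaustive search over m thresholds must examine every combination of m levels, which is the growth in cost that motivates the optimization-based methods discussed next.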
In order to overcome the above problems, some methods based on swarm intelligence have been proposed in recent years. These methods exploit the behavior of individuals with simple intelligence that cooperate and organize as a group, and they are naturally distributed and self-organizing. Research on swarm intelligence has been carried out for many years, and several important results have been achieved. Evolutionary and swarm algorithms such as genetic algorithms (GA) [18, 19], ant colony optimization [20], particle swarm optimization (PSO) [21, 22], and the bacterial foraging algorithm (BF) [23] are commonly used in image segmentation based on multilevel thresholding. Among these algorithms, GA, PSO, and BF are the most popular. PSO algorithms are inspired by the intelligent behavior of bird flocking and are used to find the optimal thresholds [24]. However, PSO has some problems, such as failing to find the global optimum and low convergence speed. Bacterial foraging is motivated by the foraging behavior of Escherichia coli present in the human intestine. The BF algorithm provides better performance than other existing multilevel thresholding methods in terms of solution quality and convergence speed. However, the efficiency and robustness of this algorithm depend on the chemotaxis step size. A large step size accelerates the search for the optimum position but does not ensure the global optimum. On the other hand, a small chemotaxis step size makes it more likely that the bacteria will find the global optimum, but it requires more searching time. Meanwhile, a few state-of-the-art methods have been developed for multilevel thresholding based on the bio-inspired computing paradigm, such as the Grey Wolf optimizer [25], Krill Herd Optimization [26], Whale Optimization, and the modified firefly algorithm [27]. In [28], a novel threshold extraction method based on variational mode decomposition (VMD) is used to non-recursively decompose a histogram into several sub-modes for minimizing Otsu’s objective function. Recently, a multilevel thresholding technique using adaptive wind driven optimization was proposed [29]. Though the above methods perform satisfactorily, achieving the best values of the objective function still requires high computational cost.
The image segmentation of the melting interface is essential for deciding the growth status and the correct seeding time in intelligent crystal growth equipment. However, strong heat radiation and turbulent flow fields in the high-temperature, high-luminance environment inevitably interfere with the CCD camera signal. As a result, the target image is blurred and the images of the melt interface have poor texture. Hence, it is difficult to obtain good performance with traditional methods. Motivated by the observations that the grey values at the boundaries between regions differ because of the uneven illumination, and that the grey values of the crystal image are continuously distributed, an improved Otsu algorithm based on dynamic particle swarm optimization (DPSO) is presented for image segmentation during crystal growth. Firstly, particles are distributed in different areas, and each area undergoes local optimization. Secondly, parallel processing in the proposed algorithm reduces the computing time and improves the performance. Thirdly, image enhancement is adopted to enlarge the grey-level difference between the crystal and the boundary. The method maximizes the objective function of the Otsu algorithm using DPSO to find the optimal threshold band. Once the threshold band has been obtained, the growth images are subdivided into small segments based on the thresholds.
The remainder of the paper is organized as follows: in Section 2, a brief review of the Otsu algorithm and the standard PSO is presented. Section 3 describes the proposed method in detail. Experimental analysis and results are discussed in Section 4. Finally, conclusions are drawn in Section 5.
3. Proposed Method
A graphical illustration of the proposed method is given in Figure 2. In the proposed method, DPSO is utilized to find the optimal thresholds for segmenting the image by maximizing the objective function of the improved Otsu algorithm. The step-by-step procedure of the proposed method is as follows:
Step I: Parameter initialization
- (a)
Initialize the maximum number of iterations;
- (b)
Initialize the velocity and position of the particles;
- (c)
Set the lower boundary and upper boundary of and ;
- (d)
Initialize the width of the threshold band.
Step II: Image enhancement
- (a)
Histogram calculation;
- (b)
Histogram equalization.
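The basic idea of histogram equalization is to transform the histogram of the original image into an approximately uniform distribution, which stretches the dynamic range of the pixel grayscale and thereby enhances the overall contrast of the image. The occurrence probability of the $k$-level grayscale of the original image can be denoted as
$$p_r(r_k) = \frac{n_k}{n}, \quad k = 0, 1, \ldots, L-1,$$
where $r_k$ is the independent variable (the $k$-th gray level), $n_k$ is the number of pixels with gray level $r_k$, $n$ is the total number of pixels, $L$ is the number of gray levels, and $p_r(r_k)$, regarded as a function of $r_k$, is the normalized histogram of the image.
The distribution of $r_k$ can be transformed into the uniform distribution of $s_k$ by using the cumulative distribution function:
$$s_k = T(r_k) = \sum_{j=0}^{k} p_r(r_j).$$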
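Histogram equalization as described in Step II can be sketched as follows. The code computes the histogram, accumulates it into a cumulative distribution function, and maps every gray level through that function. The function name `equalize`, the rescaling to the range 0 to `levels - 1`, and the rounding rule are assumptions made for this illustration.

```python
def equalize(img, levels=256):
    """Histogram-equalize a 2-D list of integer gray values in [0, levels)."""
    # Step II(a): histogram calculation.
    hist = [0] * levels
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    # Cumulative distribution function of the gray levels.
    cdf, running = [], 0
    for h in hist:
        running += h
        cdf.append(running / total)
    # Step II(b): map each level through the CDF and rescale to full range.
    lut = [int((levels - 1) * c + 0.5) for c in cdf]
    return [[lut[v] for v in row] for row in img]


low_contrast = [[100, 101], [102, 103]]
stretched = equalize(low_contrast)
```

In this toy image the four gray levels, originally only one level apart, are spread over the whole output range, which is the contrast gain that the later thresholding step relies on.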
Step III: Fitness function calculation
Calculate the fitness of each particle’s current position using the improved Otsu objective function.
Due to the low contrast of the image and the inconspicuous boundary line, a single threshold cannot separate the crystal image well. Therefore, a continuous threshold band is proposed for segmentation. Because the grey values of the crystal image are continuous, taking the threshold band as the boundary assigns a small part of the crystal to the boundary and makes the overall division clearer.
Equation (1) can be rewritten as:
The band is in the range of [
,
] and the image is divided into two categories,
and
, according to the threshold band.
and
are in the range of [0,
and [
,
], respectively. As shown in
Figure 3,
and
correspond to the background and target, and other areas represent noise and edges.
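The idea of a threshold band can be made concrete with a small sketch. It assumes that gray levels below the band form one class, gray levels above the band form the other class, levels inside the band are set aside, and the fitness is the usual Otsu between-class variance of the two remaining classes. This form, together with the names `band_between_class_variance`, `t_low`, and `t_high`, is an assumption for illustration and may differ from the exact objective function of the proposed method.

```python
def band_between_class_variance(hist, t_low, t_high):
    """Between-class variance when gray levels in [t_low, t_high] are set
    aside as a boundary band (an assumed form, not the paper's equation).

    Class 0 holds levels below t_low, class 1 holds levels above t_high.
    """
    total = sum(hist)
    n0 = sum(hist[:t_low])
    n1 = sum(hist[t_high + 1:])
    if n0 == 0 or n1 == 0:
        return 0.0  # one class is empty, so no separation is measured
    mu0 = sum(i * hist[i] for i in range(t_low)) / n0
    mu1 = sum(i * hist[i] for i in range(t_high + 1, len(hist))) / n1
    return (n0 / total) * (n1 / total) * (mu0 - mu1) ** 2


hist8 = [4, 4, 1, 1, 1, 1, 4, 4]   # two modes joined by a flat valley
value = band_between_class_variance(hist8, t_low=2, t_high=5)
```

In this toy histogram the flat valley between the two modes falls inside the band, so the ambiguous levels do not pull the class means toward each other.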
Step IV: Update position and velocity for each particle (using Equations (4) and (5))
The inertia coefficient is a key parameter for PSO performance. A fixed inertia coefficient can cause the algorithm to converge to a local optimum, while a linearly decreasing inertia coefficient can cause the algorithm to miss the global optimum. Hence, in the proposed method, the following inertia weight is used in Equation (4) [30]:
where
and
denote the maximum and minimum inertia coefficients, respectively;
is the current fitness value of particle
; and
is the current average fitness for all particles;
is the minimum fitness for all particles. When the fitness values of all particles converge or tend to converge, the inertia weight
is automatically increased. On the other hand, when the fitness values of all particles are scattered, the inertia weight is automatically reduced.
here is called the dynamic inertia weight because it changes dynamically with the fitness value. The advantages of the dynamic inertia weight are shown in [
30].
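A fitness-dependent inertia weight of this kind can be sketched as below. The sketch interpolates between the minimum and maximum inertia coefficients according to where a particle's fitness lies between the swarm minimum and the swarm average, and it falls back to the maximum when the swarm has collapsed to a single fitness value. The interpolation rule, the default bounds 0.9 and 0.4, and all names are assumptions for illustration; the exact expression of [30] is not reproduced here.

```python
def adaptive_inertia(f, f_avg, f_min, w_max=0.9, w_min=0.4):
    """Fitness-dependent inertia weight (assumed interpolation rule).

    f      fitness of the current particle
    f_avg  average fitness of all particles
    f_min  minimum fitness of all particles
    """
    # Above-average fitness, or a fully converged swarm: use the largest
    # weight so the particle keeps exploring.
    if f > f_avg or f_avg == f_min:
        return w_max
    # Otherwise interpolate between w_min and w_max.
    return w_min + (w_max - w_min) * (f - f_min) / (f_avg - f_min)


w_mid = adaptive_inertia(f=2.0, f_avg=3.0, f_min=1.0)
```

The point of the sketch is only that the weight is recomputed for every particle at every iteration from the current fitness statistics, rather than being fixed or following a preset schedule.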
Dynamic learning factors are also proposed in [30]. The choice of learning factors directly affects the results, so appropriate learning factors must be selected during optimization. They are calculated as
where
;
,
denotes the maximum number of iterations.
and
automatically become larger or smaller as the iterations proceed. Thus, at the initial stage of optimization, particles have strong self-learning ability and weak social-learning ability, which enhances the algorithm’s global search capability. In the later stages of optimization, the particles have weak self-learning ability and strong social-learning ability, so that convergence to the global optimal solution is accelerated.
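The qualitative behavior described above, a self-learning factor that shrinks and a social-learning factor that grows with the iteration count, can be sketched with a simple linear schedule. The linear form and the end values 2.5 and 0.5 are assumptions chosen for illustration and are not the formula of [30].

```python
def learning_factors(t, t_max, c_start=2.5, c_end=0.5):
    """Return (c1, c2) at iteration t of t_max under a linear schedule.

    c1 (self-learning) falls from c_start to c_end, while c2
    (social learning) rises from c_end to c_start.
    """
    frac = t / t_max
    c1 = c_start + (c_end - c_start) * frac
    c2 = c_end + (c_start - c_end) * frac
    return c1, c2


early = learning_factors(0, 100)
late = learning_factors(100, 100)
```

Early iterations therefore weight each particle's own best position more heavily, and late iterations weight the swarm's best position more heavily.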
- Step V:
Calculate fitness for each particle after updating.
- Step VI:
If the number of iterations reaches the maximum, go to Step VII; otherwise, go to Step IV.
- Step VII:
Select the threshold band associated with the overall best particle.
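Steps I to VII can be tied together in a compact sketch. To keep it short, the sketch makes several assumptions that are not part of the method as described: each particle encodes only the lower edge of a band of fixed width, the fitness is an assumed band form of the Otsu between-class variance, image enhancement is omitted, and the inertia weight and learning factors are held constant instead of being adapted. All names and parameter values are hypothetical.

```python
import random


def band_fitness(hist, t_low, width):
    """Assumed band objective: between-class variance of the levels below
    and above the band [t_low, t_low + width]."""
    t_high = t_low + width
    total = sum(hist)
    n0 = sum(hist[:t_low])
    n1 = sum(hist[t_high + 1:])
    if n0 == 0 or n1 == 0:
        return 0.0
    mu0 = sum(i * hist[i] for i in range(t_low)) / n0
    mu1 = sum(i * hist[i] for i in range(t_high + 1, len(hist))) / n1
    return (n0 / total) * (n1 / total) * (mu0 - mu1) ** 2


def pso_threshold_band(hist, width=2, n_particles=10, max_iter=30, seed=0):
    rng = random.Random(seed)
    # Step I: bounds, positions, velocities, band width.
    lo, hi = 1, len(hist) - width - 2
    v_max = (hi - lo) / 4
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [rng.uniform(-v_max, v_max) for _ in range(n_particles)]
    # Step III: initial fitness.
    fit = [band_fitness(hist, round(p), width) for p in pos]
    best_pos, best_fit = pos[:], fit[:]            # personal bests
    g = max(range(n_particles), key=lambda i: fit[i])
    g_pos, g_fit = pos[g], fit[g]                  # global best
    w, c1, c2 = 0.7, 1.5, 1.5                      # constant for brevity
    for _ in range(max_iter):                      # Step VI: stop rule
        for i in range(n_particles):
            # Step IV: velocity and position update, clamped to bounds.
            vel[i] = (w * vel[i]
                      + c1 * rng.random() * (best_pos[i] - pos[i])
                      + c2 * rng.random() * (g_pos - pos[i]))
            vel[i] = max(-v_max, min(v_max, vel[i]))
            pos[i] = max(lo, min(hi, pos[i] + vel[i]))
            # Step V: fitness after updating.
            f = band_fitness(hist, round(pos[i]), width)
            if f > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i], f
            if f > g_fit:
                g_pos, g_fit = pos[i], f
    # Step VII: band of the overall best particle.
    t_low = round(g_pos)
    return t_low, t_low + width


hist = [5, 8, 8, 5, 1, 0, 0, 0, 0, 0, 0, 1, 5, 8, 8, 5]
t_low, t_high = pso_threshold_band(hist)
```

Replacing the constants `w`, `c1`, and `c2` with values recomputed inside the loop is where a dynamic inertia weight and dynamic learning factors would enter.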
Besides the Otsu objective function, regional consistency is also adopted to judge the quality of the image segmentation. Pixels inside a region formed by segmentation should be similar, so the quality of the segmentation can be described by the degree of uniformity within each region. The $i$-th region is represented by $R_i$, $A_i$ stands for its area, and $f(x, y)$ represents the pixel value of the segmented image at coordinate $(x, y)$. Then, the internal uniformity of the segmented image can be expressed as:
$$U = 1 - \frac{1}{C} \sum_{i} \sum_{(x, y) \in R_i} \left[ f(x, y) - \frac{1}{A_i} \sum_{(x, y) \in R_i} f(x, y) \right]^2,$$
where $C$ is the total number of pixels in the image, and $U$ ranges in (0,1). The larger $U$ is, the better the uniformity within the regions is.
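A region uniformity score of this kind can be computed as in the sketch below. It assumes the common form in which the within-region squared deviations from the region mean are summed, divided by the total number of pixels, and subtracted from one, and it assumes gray values scaled to [0, 1] so that the score stays between 0 and 1. The function name and the label-map input format are hypothetical.

```python
def uniformity(img, labels):
    """Region uniformity of a segmentation (assumed standard form).

    `img` holds gray values scaled to [0, 1]; `labels` assigns each pixel a
    region index. Returns 1 - (sum of within-region squared deviations) / N.
    """
    regions = {}
    for img_row, label_row in zip(img, labels):
        for value, label in zip(img_row, label_row):
            regions.setdefault(label, []).append(value)
    n_pixels = sum(len(v) for v in regions.values())
    spread = 0.0
    for values in regions.values():
        mean = sum(values) / len(values)     # region mean gray value
        spread += sum((v - mean) ** 2 for v in values)
    return 1.0 - spread / n_pixels


img = [[0.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
good = [[0, 0, 1], [0, 1, 1]]   # regions follow the gray values
bad = [[0, 1, 0], [1, 0, 1]]    # regions mix dark and bright pixels
u_good = uniformity(img, good)
u_bad = uniformity(img, bad)
```

A segmentation whose regions follow the gray values scores higher than one that mixes dark and bright pixels in the same region, which is the sense in which a larger value indicates a better result.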