1. Introduction
Space infrared detectors are an essential part of earth observation and remote sensing systems. They play a vital role in early warning, missile interception, and related tasks, and are a research hotspot in the military field [1]. Infrared detection technology offers strong survivability, good portability, and the ability to cover radar blind areas [2]. The infrared and visible-light detectors and telescopes carried by satellites are used to detect and track targets such as aircraft and ships. The infrared radiation of these targets appears as dim small targets in satellite infrared images, so the performance of an infrared target detection algorithm is mainly reflected in its ability to detect dim small targets. With the continuous development of infrared imaging detection systems, many algorithms for small target detection and recognition have emerged in recent years [3,4,5,6,7,8,9,10,11]. However, the imaging band of infrared images contains many natural scenes with high radiation. Cirrus, for example, resembles the target in satellite infrared images and has a high gray level, so it may cause false alarms in the early warning system and interfere with small target detection, which makes it difficult to detect small targets directly. To address this problem, it is necessary to study the imaging characteristics and detection methods of cirrus, so as to improve the accuracy and response speed of the ground detection system.
Cirrus cloud detection is also a vital part of data processing and plays an essential role in ecological environment monitoring, weather forecasting, natural disaster prevention, and so on. Many cirrus cloud detection methods have been proposed. Some detect cirrus clouds with physical models such as infrared radiation and atmospheric attenuation [12,13], which require prior knowledge. Others use automatic screening based on time series [14,15], for which data acquisition is difficult. In recent years, with the development of artificial intelligence, many methods based on machine learning and neural networks have been proposed [16,17], but they rely on large sets of sample images and are therefore unsuitable in most cases. The method proposed in this paper works with small sample image data and detects the cirrus cloud from its visual features and sparse representation.
The classic principal component analysis (PCA) [18] model transforms high-dimensional data into low-dimensional data, retaining the main information through dimension reduction and removing the sparse, irrelevant information. PCA can therefore also be used to obtain sparse components, and hence sparse images containing cirrus clouds. PCA has long been a research hotspot and is widely used in signal processing [19,20], but it depends strongly on the data because it assumes that the noise is Gaussian. To overcome this limitation, Wright et al. proposed robust principal component analysis (RPCA) [21]. RPCA does not assume Gaussian noise. Its core idea is that the data matrix Y can be represented, under an optimization criterion, as the superposition of a low-rank matrix L and a sparse matrix S, that is, Y = L + S, as shown in Figure 1. In a physical sense, the rank of a matrix measures the correlation between its rows and between its columns. If the rows (or columns) of a matrix are linearly independent, the matrix has full rank. The rows of L are correlated with one another, so L is generally low-rank. A sparse matrix is one in which the number of zero elements is much larger than the number of non-zero elements and the zero elements are irregularly distributed. Typical practical applications of RPCA are face shadow removal [22], background estimation [23,24], and infrared dim target detection [25,26]. Face shadow removal and background estimation mainly analyze the low-rank components obtained by RPCA, because faces are low-rank relative to shadows and backgrounds are low-rank relative to moving objects. Infrared dim small target detection analyzes the sparse components obtained by RPCA, because dim small targets are sparse compared with the infrared background. Different parameter settings yield sparse components of different degrees. Because the false alarm source is sparse relative to the infrared background while the background is low-rank, RPCA can be used to obtain the sparse components of the infrared image, which include the false alarm source, noise, and clutter.
With the development of image algorithms, sparse representation and dictionary learning are increasingly applied to target detection [27], image reconstruction [28], image denoising [29], image compression [30], and other tasks. Sparse representation expresses most or all of the data matrix Y as a linear combination of a small number of basic signals: a dictionary matrix D and a coefficient matrix A are sought such that D*A restores Y as closely as possible while A is as sparse as possible. A is then the sparse representation of Y. Dictionary learning finds a suitable dictionary for samples that are ordinarily densely expressed and transforms the samples into sparse expressions, which simplifies the learning task and reduces the complexity of the model. The overall strategy is to optimize the dictionary D and the sparse representation A alternately. After initializing the dictionary D, two steps are repeated: (1) fix D and optimize A; (2) fix A and optimize D. The iteration yields the final dictionary D and the sparse representation A of Y, where each column di of D is a dictionary atom and each row αi of A holds the sparse coefficients corresponding to di. In recent years, dictionary construction in sparse representation has developed from orthogonal bases to over-complete dictionaries [31]. Compared with a complete orthogonal basis, an over-complete dictionary is redundant, that is, the number of atoms is larger than the dimension of the signal. Given an initial dictionary and a set of training signals, a dictionary learning algorithm repeatedly adjusts the dictionary atoms so that they describe the signals more accurately, and thereby constructs a redundant dictionary. K-clustering with singular value decomposition (K-SVD) [32] is a representative dictionary learning algorithm. It constructs a dictionary by minimizing the reconstruction error of the original samples and has achieved good results in compressed sensing of images [33] and image denoising [34,35].
In cirrus cloud detection based on sparse representation, the construction of the redundant dictionary is a difficult problem. Texture analysis of the false alarm source in the infrared image shows that it has fractal characteristics. Fractal characteristics mainly refer to the self-similarity of objects, which is quantified by the fractal dimension, fractal error, and multifractal index. In recent years, fractal features have been widely applied in texture analysis [36,37], image sampling [38,39], and the segmentation of medical signals (one-dimensional (1D), two-dimensional (2D), or three-dimensional (3D)) [40,41]. Studies show that most natural objects have strong fractal characteristics and are consistent with the fractal model. Signals of different types have different shapes and attributes and are weakly correlated, while signals of the same type are highly correlated; thus, a specific type of signal component in an infrared image can be efficiently represented by an over-complete dictionary of the same type [42]. The random fractal image constructed by the diamond-square method resembles a cloud image, so a redundant dictionary constructed from fractal images can effectively represent cirrus.
In this paper, a new method based on RPCA and fractal dictionary learning is proposed to detect the cirrus cloud. A study of the composition of infrared images shows that cirrus, noise, and clutter are sparse relative to the background, while the background is low-rank. An infrared image is thus composed of a low-rank image and a sparse image, as shown in Figure 1. To study the cirrus cloud more accurately, the sparse component of the infrared image is obtained by Robust Principal Component Analysis (RPCA), and only this sparse component is processed further. Because fractals describe cirrus well, an over-complete dictionary based on fractal structure can characterize cirrus well. The fractal dictionary is constructed from a random fractal image generated by the diamond-square algorithm [43]. Then, the fractal dictionary Ds is learned and the sparse component is sparsely coded by the k-clustering with singular value decomposition (KSVD) method to obtain the sparsely represented image. Finally, the sparsely represented image is segmented by thresholding to obtain the detection result for the cirrus false alarm source. Compared with the reference methods, the proposed method has a higher precision at the same recall rate and larger F-measure and Intersection-over-Union (IOU) values, indicating a better detection effect. For convenience, Table 1 lists the nomenclature of this paper.
2. Materials and Methods
In this paper, a new method based on fractal dictionary learning to detect cirrus was proposed. The key is to learn the constructed fractal dictionary to detect cirrus. In this section, we first introduce Robust Principal Component Analysis (RPCA), which is used to obtain sparse components of the original image. Then, the algorithm of generating random fractal image is introduced, and the fractal dictionary is constructed by fractal image. Finally, a dictionary learning algorithm based on KSVD is introduced to obtain sparse representation images of sparse components and detect cirrus.
2.1. Robust Principal Component Analysis
The composition of an infrared image shows that cirrus, noise, and clutter are sparse with respect to the background, while the background is low-rank. The original infrared image can therefore be written as the superposition of a sparse component and a low-rank component, that is, Y = L + S, where Y represents the original infrared image, L the low-rank background component, and S the sparse component containing cirrus, noise, and clutter. Here, noise refers to point-like salt-and-pepper noise, while clutter refers to coastlines, water ripples, and other elongated structures that interfere with cirrus detection. Because cirrus clouds have fractal features and a fractal-based dictionary can represent them sparsely and well, this clutter is not mistaken for cirrus. Robust Principal Component Analysis (RPCA), a currently popular model, is used in this paper to obtain the sparse component S of infrared images, as shown in Figure 2.
Principal component analysis (PCA) seeks a low-rank matrix L that minimizes the difference between L and Y. It assumes that Y is contaminated by Gaussian noise, in which case the optimal solution can be obtained by singular value decomposition (SVD). However, in the presence of cirrus, noise, and clutter, PCA performs poorly, and RPCA makes up for this shortcoming. RPCA drops the Gaussian assumption and only assumes that the noise is sparse. Therefore, RPCA can be used to obtain the sparse image containing the cirrus cloud. Recovering the sparse matrix S is a two-objective optimization problem:
$$\min_{L,S}\ \operatorname{rank}(L) + \lambda \|S\|_0 \quad \text{s.t.} \quad Y = L + S,$$
where rank(∙) is the rank of the matrix; $\|\cdot\|_0$ is the zero norm of the matrix, which counts its non-zero elements; and λ is the compromise factor, which controls the proportion of the low-rank image and the sparse image.
The optimal convex approximation is as follows:
$$\min_{L,S}\ \|L\|_* + \lambda \|S\|_1 \quad \text{s.t.} \quad Y = L + S,$$
where $\|\cdot\|_*$ is the nuclear norm (the sum of the singular values) and $\|\cdot\|_1$ is the sum of the absolute values of the elements.
RPCA is often used to remove image noise, and it can equally be used to obtain the sparse component containing cirrus. Different sparse images are obtained by changing the value of λ: the larger λ is, the less of the original image remains in the sparse image, as shown in Figure 2.
Many algorithms are available for solving RPCA, such as the dual method, Accelerated Proximal Gradient (APG) [44], Iterative Thresholding (IT) [45], Exact Augmented Lagrange Multipliers (EALM), and Inexact Augmented Lagrange Multipliers (IALM) [46]. IALM is an improvement of EALM that requires fewer SVD computations and has higher accuracy and convergence speed. Therefore, IALM is used to solve the RPCA problem in this paper.
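For orientation, the iteration usually associated with IALM can be written out as follows. This is a sketch of the standard textbook form, not necessarily the exact variant or parameter schedule used in this paper; the multiplier $\Lambda$, the penalty $\mu_k$, the growth factor $\rho > 1$, and the operators $\mathcal{S}_\tau$ and $\mathcal{D}_\tau$ are notation introduced here for illustration.

```latex
% Augmented Lagrangian of  min ||L||_* + lambda ||S||_1  s.t.  Y = L + S
\mathcal{L}(L, S, \Lambda, \mu) = \|L\|_* + \lambda \|S\|_1
    + \langle \Lambda,\, Y - L - S \rangle + \frac{\mu}{2}\, \|Y - L - S\|_F^2

% Element-wise soft thresholding and singular value thresholding
\mathcal{S}_\tau(x) = \operatorname{sign}(x)\, \max(|x| - \tau,\, 0), \qquad
\mathcal{D}_\tau(X) = U\, \mathcal{S}_\tau(\Sigma)\, V^{T} \quad \text{for } X = U \Sigma V^{T}

% One inexact ALM iteration: a single pass over L and S, then the multiplier
L_{k+1} = \mathcal{D}_{1/\mu_k}\!\left(Y - S_k + \mu_k^{-1} \Lambda_k\right)
S_{k+1} = \mathcal{S}_{\lambda/\mu_k}\!\left(Y - L_{k+1} + \mu_k^{-1} \Lambda_k\right)
\Lambda_{k+1} = \Lambda_k + \mu_k \left(Y - L_{k+1} - S_{k+1}\right), \qquad
\mu_{k+1} = \rho\, \mu_k
```

Each iteration needs only one SVD (inside $\mathcal{D}_\tau$), which is consistent with the remark that the inexact variant requires fewer SVD computations than the exact one.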
The sparse image obtained from RPCA mainly consists of the cirrus component YS and the noise and clutter component n. When the sparse component is obtained by RPCA, images of different sparseness can be produced by adjusting the parameter λ. If λ is chosen so that the cirrus clouds are retained more completely, much noise and clutter also remains. A specific type of signal in an infrared image can be efficiently represented by an over-complete dictionary of the same type of signal; thus, a dictionary constructed from random fractal images can be used to represent the cirrus clouds in the infrared image. The method based on fractal dictionary learning can then remove the noise and clutter that do not have fractal features.
2.2. Random Fractal
Natural objects such as coastlines, cirrus clouds, rivers, and snow-covered mountains share a common property: when a part of such an object is taken out and suitably enlarged, the resulting image is not identical to the original, but its degree of irregularity is similar. This self-similarity of natural landscapes is called random self-similarity, and fractals with random self-similarity are called random fractals. The random fractal images obtained by the Diamond-Square algorithm are similar to the texture of clouds, so a fractal dictionary constructed from such an image can efficiently represent cirrus. Next, the Diamond-Square algorithm [43] for generating random fractal images is introduced.
To generate a random fractal image, the number of iterations n is determined first, and the square ABCD is meshed to the resolution of the fractal image to be generated. The generation process is shown in Figure 3. In each iteration, a random value X is generated for the midpoint M of the square ABCD; X depends on the Hurst index H and on the current iteration number t. The gray value at the midpoint M is the average of the gray values of the four vertices A, B, C, and D plus the random value X.
The midpoints of edges AB, BC, CD, and DA are E, F, G, and H, respectively (the midpoint H is not to be confused with the Hurst index). The gray value of point E is calculated from the gray values of A, B, and M. Similarly, the gray value of F is calculated from those of B, C, and M, the gray value of G from those of C, D, and M, and the gray value of H from those of D, A, and M.
Next, the average m of the gray values of the four vertices of the small square EBFM is calculated, and the sum of m and the random value X is taken as the gray value of the midpoint of EBFM. The gray values of the midpoints of the small squares MFCG, HMGD, and AEMH are obtained in the same way. These steps are repeated until the iteration count t exceeds n, which yields the random fractal image shown in Figure 4.
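The generation procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the grid size 2^n + 1, the Gaussian random offset with standard deviation 0.5^(t·H), the random corner initialization, and the names `diamond_square`, `hurst`, and `seed` are all choices made here for the example.

```python
import random


def diamond_square(n, hurst=0.5, seed=0):
    """Return a (2**n + 1) x (2**n + 1) random fractal image as a list of rows."""
    rng = random.Random(seed)
    size = 2 ** n + 1
    img = [[0.0] * size for _ in range(size)]
    # Initialise the four corners A, B, C, D of the outer square (assumption).
    for r in (0, size - 1):
        for c in (0, size - 1):
            img[r][c] = rng.gauss(0.0, 1.0)
    step = size - 1
    for t in range(1, n + 1):
        half = step // 2
        sigma = 0.5 ** (t * hurst)  # assumed decay of the random value X
        # Diamond step: centre M of each square = mean of its 4 corners + X.
        for r in range(half, size, step):
            for c in range(half, size, step):
                m = (img[r - half][c - half] + img[r - half][c + half] +
                     img[r + half][c - half] + img[r + half][c + half]) / 4.0
                img[r][c] = m + rng.gauss(0.0, sigma)
        # Square step: edge midpoints E, F, G, H = mean of the in-bounds
        # neighbours + X (three neighbours on the image border, four inside).
        for r in range(0, size, half):
            start = half if (r // half) % 2 == 0 else 0
            for c in range(start, size, step):
                nbrs = [img[rr][cc]
                        for rr, cc in ((r - half, c), (r + half, c),
                                       (r, c - half), (r, c + half))
                        if 0 <= rr < size and 0 <= cc < size]
                img[r][c] = sum(nbrs) / len(nbrs) + rng.gauss(0.0, sigma)
        step = half
    return img
```

A call such as `diamond_square(7, hurst=0.5)` would give a 129 × 129 image; a smaller Hurst value makes the offsets decay more slowly and the texture rougher.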
Each pixel of the random fractal image of size M × N is taken in turn as a center, and an atomic sample block of fixed size around it is selected and converted into a column vector, which gives M × N sample atoms. These sample atoms form the over-complete dictionary D, that is, an original dictionary matrix with one row per block pixel and M × N columns. The construction process is shown in Figure 5.
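The block-to-column construction can be illustrated as follows. The sketch assumes a square block of side `s`, replicated borders so that edge pixels also receive a full block, and a top-left bias of the center for even `s`; none of these details is specified in the text, and the function name `patches_to_columns` is hypothetical.

```python
def patches_to_columns(img, s):
    """Stack the s x s block centred on every pixel as one column of a matrix.

    img is a list of rows; the result has s*s rows and len(img)*len(img[0]) columns.
    """
    rows, cols = len(img), len(img[0])
    before = (s - 1) // 2  # pixels taken above/left of the centre (assumption)

    def pixel(r, c):
        # Replicate the border so that edge pixels also get a full block.
        r = min(max(r, 0), rows - 1)
        c = min(max(c, 0), cols - 1)
        return img[r][c]

    columns = []
    for r in range(rows):
        for c in range(cols):
            columns.append([pixel(r - before + dr, c - before + dc)
                            for dr in range(s) for dc in range(s)])
    # Transpose the list of column vectors into a list of matrix rows.
    return [list(row) for row in zip(*columns)]
```

The same routine can be applied to the sparse component of the infrared image, so that image blocks and dictionary atoms have the same length.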
2.3. Sparse Representation and Dictionary Learning
The purpose of sparse representation is to represent a signal with as few atoms as possible from a given over-complete dictionary, so that the information in the signal is easier to extract and further processing, such as compression and encoding, is simplified. In the sparse representation model, the signal is expressed as the product of a sparse transform basis and a sparse coefficient matrix A. The key to sparse representation is the choice of the basis. At present, the most widely used choice is a redundant dictionary D, whose columns are the atoms of the dictionary. Dictionary learning starts from a fixed initial dictionary and iteratively updates it so as to obtain a better sparse representation.
Sparse representation and dictionary learning were first used to solve the signal processing problem in compressed sensing, but now they are increasingly used in image processing. By applying sparse representation and dictionary learning methods to image processing, noise in image can be separated simply and efficiently, and image quality can be improved.
2.3.1. Orthogonal Matching Pursuit Algorithm
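The sparse representation model can be transformed into the following form:
$$\min_{A}\ \|S - DA\|_F^2 \quad \text{s.t.} \quad \|\alpha_i\|_0 \le T_0 \ \ \forall i,$$
where S = $[s_1, s_2, \ldots, s_N]$, A = $[\alpha_1, \alpha_2, \ldots, \alpha_N]$, $\alpha_i$ represents the column with index i in the sparse coefficient matrix A, and $T_0$ represents the sparsity. The goal of sparse representation is to solve for A.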
Greedy algorithms based on a redundant dictionary are widely used because of their high efficiency and accuracy. Matching Pursuit (MP) and Orthogonal Matching Pursuit (OMP) [47] are the most common greedy algorithms.
The OMP algorithm improves on MP and converges faster. The improvement is that all selected atoms are orthogonalized at each decomposition step. In each iteration, OMP selects the atom that best matches the current residual and adds it to the atom set, projects the measured signal onto the space spanned by the atom set by solving a least-squares problem, which gives the optimal sparse approximation on the selected atoms, and updates the residual for the next iteration. After a number of iterations, the signal is represented as a linear combination of the selected atoms.
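The OMP iteration can be sketched in plain Python as follows. This is an illustrative version for a single signal, assuming a dictionary with unit-norm columns stored as a list of rows; the least-squares step uses the normal equations with a small Gaussian elimination, which is adequate for a demonstration but not the numerically preferred choice. The names `omp`, `solve`, `t0`, and `tol` are chosen here.

```python
def solve(G, b):
    """Solve the small linear system G x = b by Gaussian elimination."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(G, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))  # partial pivoting
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x


def omp(D, y, t0, tol=1e-10):
    """Sparse code of y over the columns of D with at most t0 non-zeros."""
    m, k = len(D), len(D[0])
    col = lambda j: [D[i][j] for i in range(m)]
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))

    residual, support, coef = list(y), [], []
    for _ in range(t0):
        if dot(residual, residual) <= tol:
            break
        # 1. Select the atom most correlated with the current residual.
        scores = [0.0 if j in support else abs(dot(col(j), residual))
                  for j in range(k)]
        best = max(range(k), key=scores.__getitem__)
        if scores[best] <= tol:
            break
        support.append(best)
        # 2. Least squares on all selected atoms (normal equations).
        atoms = [col(j) for j in support]
        G = [[dot(a, b) for b in atoms] for a in atoms]
        coef = solve(G, [dot(a, y) for a in atoms])
        # 3. Update the residual with the new approximation.
        approx = [sum(cf * a[i] for cf, a in zip(coef, atoms)) for i in range(m)]
        residual = [yi - ai for yi, ai in zip(y, approx)]
    alpha = [0.0] * k
    for j, cf in zip(support, coef):
        alpha[j] = cf
    return alpha
```

Because the coefficients of all selected atoms are re-solved in every iteration, the residual stays orthogonal to the chosen atoms, which is the difference from plain MP.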
2.3.2. Dictionary Learning Based on KSVD
The KSVD algorithm is divided into two stages: sparse coding and dictionary learning. Starting from a fixed initial dictionary D, the two stages are carried out alternately. The objective function of dictionary learning is:
$$\min_{D,A}\ \|S - DA\|_F^2 \quad \text{s.t.} \quad \|\alpha_i\|_0 \le T_0 \ \ \forall i,$$
where $S \in \mathbb{R}^{m \times n}$ is the matrix to be decomposed, $D \in \mathbb{R}^{m \times k}$ is the dictionary (when $k > m$, D is an over-complete dictionary), $A \in \mathbb{R}^{k \times n}$ is the sparse coefficient matrix, and $\alpha_i$ denotes the column with subscript i of the sparse coefficient matrix A.
In the first stage, the OMP algorithm is used to solve for the sparse coefficient matrix A.
In the second stage, the dictionary is updated on the basis of this sparse representation. With A fixed, the sparsity term can be ignored and the objective function becomes:
$$\min_{D}\ \|S - DA\|_F^2 .$$
The KSVD algorithm updates the dictionary and the sparse codes simultaneously. The dictionary is updated column by column. When column k is updated, the other atoms remain unchanged:
$$\|S - DA\|_F^2 = \Big\| S - \sum_{j \ne k} d_j \alpha^j - d_k \alpha^k \Big\|_F^2 = \|E_k - d_k \alpha^k\|_F^2,$$
where $E_k$ is the error estimation matrix and $\alpha^k$ is the sparse coefficient vector corresponding to the atom in the kth column to be updated, that is, the kth row of the sparse coefficient matrix.
This problem can be solved by Singular Value Decomposition (SVD). First, the zero elements of $\alpha^k$ are removed, together with the corresponding columns of $E_k$, which gives a new matrix $E_k'$ and a new vector $\alpha'^k$. The optimization problem can then be described as:
$$\min_{d_k,\, \alpha'^k}\ \|E_k' - d_k \alpha'^k\|_F^2 .$$
Singular value decomposition of $E_k'$ is performed to obtain $E_k' = U \Sigma V^T$. The first column vector of the left singular matrix U is taken as $d_k$, that is, $d_k = U(:,1)$. The product of the first row vector of the right singular matrix $V^T$ and the first singular value is taken as $\alpha'^k$, that is, $\alpha'^k = \Sigma(1,1)\, V^T(1,:)$, and the corresponding $\alpha^k$ is obtained from $\alpha'^k$.
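The column-by-column update can be sketched as below. To keep the example free of external libraries, the leading singular vectors are obtained by power iteration instead of a full SVD; for the rank-one update this yields the same atom and coefficients up to numerical precision, provided the starting vector is not orthogonal to the leading right singular vector. The function names and the list-of-rows matrix layout are assumptions made for this illustration.

```python
import math


def leading_singular_pair(E, iters=100):
    """Return (u, w): first left singular vector of E and sigma_1 * first right one."""
    m, p = len(E), len(E[0])
    v = [1.0 / (j + 1) for j in range(p)]  # arbitrary non-zero start vector
    u, w = [0.0] * m, [0.0] * p
    for _ in range(iters):
        u = [sum(E[i][j] * v[j] for j in range(p)) for i in range(m)]
        norm_u = math.sqrt(sum(x * x for x in u))
        if norm_u == 0.0:
            break
        u = [x / norm_u for x in u]
        w = [sum(E[i][j] * u[i] for i in range(m)) for j in range(p)]  # E^T u
        norm_w = math.sqrt(sum(x * x for x in w))
        v = [x / norm_w for x in w]
    return u, w


def update_atom(S, D, A, k):
    """Update column k of D and row k of A in place (one dictionary-update step)."""
    m, n = len(S), len(S[0])
    users = [j for j in range(n) if A[k][j] != 0.0]  # samples that use atom k
    if not users:
        return
    # Restricted error matrix: residual without atom k, only on the columns in users.
    E = [[S[i][j] - sum(D[i][q] * A[q][j] for q in range(len(A)) if q != k)
          for j in users] for i in range(m)]
    d, coeffs = leading_singular_pair(E)
    if all(x == 0.0 for x in d):
        return  # nothing left to explain; keep the old atom
    for i in range(m):
        D[i][k] = d[i]
    for j, value in zip(users, coeffs):
        A[k][j] = value
```

Restricting the update to the samples that already use the atom keeps the zero pattern of the coefficient row, so the sparsity obtained in the coding stage is preserved.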
With the fractal dictionary as the fixed initial dictionary, the dictionary is learned by the KSVD algorithm, as shown in Figure 6. Figure 6a shows the initial dictionary constructed from the random fractal image, Figure 6b shows the initial fractal dictionary displayed by converting each column (atom) into an image block, and Figure 6c shows the learned fractal dictionary.
2.4. Cirrus Detection by RPCA and Fractal Dictionary Learning
The algorithm flow is shown in Table 2. First, RPCA decomposition is applied to the infrared image Y to obtain the sparse component S, with the parameter λ set to 0.03. Next, S is vectorized block by block: with each pixel as a center point, an image block of size s × s is selected and converted into a column vector, and the resulting matrix is still named S. Because the sparse cirrus component has fractal characteristics, the diamond-square algorithm is used to generate a random fractal image, from which the over-complete dictionary Ds is constructed (Ds has k columns; if M × N is less than k, k is set to M × N). Then the sparse component S is sparsely represented by the KSVD algorithm, which gives the learned dictionary Dl and the sparse coefficient matrix A. The sparse representation of S is reconstructed from Dl and A, after which morphological filtering and threshold segmentation are performed. Morphological filtering applies opening and closing operations to selectively remove noise and irrelevant targets at specified scales while retaining other useful information. The opening operation removes the smaller points in the image, and the closing operation joins fractured structures into a whole. Finally, the detected cirrus image is obtained.
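The opening and closing operations mentioned above can be illustrated with a small sketch. It assumes a square structuring element of odd side `size` and windows clipped at the image border; the structuring element actually used in the paper is not stated, and the function names are chosen here. The same code works for binary and gray-level images, since erosion and dilation are a local minimum and maximum.

```python
def _filter(img, size, pick):
    """Apply pick (min or max) over a size x size window around each pixel."""
    rows, cols = len(img), len(img[0])
    h = size // 2
    return [[pick(img[rr][cc]
                  for rr in range(max(r - h, 0), min(r + h + 1, rows))
                  for cc in range(max(c - h, 0), min(c + h + 1, cols)))
             for c in range(cols)] for r in range(rows)]


def erode(img, size=3):
    return _filter(img, size, min)


def dilate(img, size=3):
    return _filter(img, size, max)


def opening(img, size=3):
    # Erosion followed by dilation: removes bright specks smaller than the window.
    return dilate(erode(img, size), size)


def closing(img, size=3):
    # Dilation followed by erosion: fills small gaps and joins broken structures.
    return erode(dilate(img, size), size)
```

For example, an isolated single-pixel response disappears under `opening`, while a one-pixel hole inside a cloud region is filled by `closing`.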
3. Results
To illustrate the performance of the infrared cirrus detection method based on fractal dictionary learning, nine representative cirrus infrared images are tested in this paper, as shown in Figure 7. The test data were taken from the near-infrared band of the Landsat 8 data set. The morphology and distribution of the cirrus in each image are as follows. The cirrus in test image (a) is slender and sparsely distributed, with sky and large clouds in the background. The image size is 320 × 256. The cirrus in test image (b) is filamentous and coiled and is densely distributed over the whole image. The image size is 230 × 162. The cirrus in test image (c) is point-shaped and densely distributed, with mountains and coastlines in the background. The image size is 232 × 162. In test image (d), strip-shaped and clustered cirrus clouds are randomly distributed over the coast and sea water. The image size is 247 × 156. Test images (e) and (f) are similar to each other; both contain clustered cirrus clouds with a sparse distribution. The image sizes are 349 × 265 and 255 × 171, respectively. The cirrus clouds in test images (g) and (h) are spot-shaped and densely distributed. The image sizes are both 2035 × 1291. In test image (g) the cirrus is dense in the lower left, while in test image (h) it is dense over the entire image. The cirrus clouds in test image (i) vary in size and are sparsely distributed over the entire image. The image size is 329 × 241. These nine test images cover the shapes and distributions of most cirrus images, which makes the experiments more convincing.
In order to objectively evaluate the method proposed in this paper, it is compared with the cirrus detection method based on extracting fractal features and the classical detection method. The objective evaluation methods include the receiver operating characteristic (ROC) curve, Precision-Recall (PR) curve, comprehensive evaluation index (F-Measure), and Intersection-over-Union (IOU). The software used is MATLAB R2018.
3.1. Parameter Settings
First, the proposed method performs block decomposition of the image. Specifically, with each pixel as a center point, an image block of size s × s is selected and converted into a column vector, and the column vectors are collected into a matrix. The key problem is to find an appropriate value of s. In this paper, s was set to 8, 15, 20, 30, 40, and 45 in turn to find the best value.
In order to objectively evaluate the value of s, the receiver operating characteristic (ROC) curve and Precision-Recall (PR) curve were used to evaluate six of the images.
The ROC curve describes the sensitivity of a detector. It is obtained by plotting the true positive rate (TPR) against the false positive rate (FPR): its abscissa is FPR and its ordinate is TPR. The ROC curve is also known as the relative operating characteristic curve, because it compares two operating characteristics (TPR and FPR). In addition, the Area Under Curve (AUC) can be used as a quantitative evaluation index of the ROC curve. Generally, the larger the AUC, the better the detection effect.
ROC and PR curves are supervised evaluations, which require a manually marked ground truth image, as shown in Figure 8a. The predicted image is shown in Figure 8b. The concepts of TP, FP, FN, and TN are illustrated by the confusion matrix in Table 3.
Here, I denotes the predicted image. TP represents the number of pixels whose value is 1 after threshold segmentation of I and whose value in the ground truth is also 1. FP represents the number of pixels whose value is 1 after threshold segmentation of I and whose value in the ground truth is 0. FN represents the number of pixels whose value is 0 after threshold segmentation of I and whose value in the ground truth is 1. TN represents the number of pixels whose value is 0 after threshold segmentation of I and whose value in the ground truth is also 0.
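The four counts and the rates derived from them can be computed as in the following sketch. It assumes that both images are binary and of equal size, stored as lists of rows; the function names and the convention of returning 0 when a denominator is empty are choices made here.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for two equally sized binary images (0/1 values)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def rates(tp, fp, fn, tn):
    """Return (TPR, FPR, precision); TPR is also the recall."""
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return tpr, fpr, precision
```

Sweeping the segmentation threshold and recording (FPR, TPR) at each value traces the ROC curve, while recording (recall, precision) traces the PR curve.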
In this paper, the ROC curves in Figure 9 show the detection effect for different images under different s values; the closer a curve is to the upper left corner, the better the detection effect. For the PR curves in Figure 10, the closer a curve is to the upper right corner, the better the detection effect. Table 4 shows the area under the ROC curves in Figure 9, and Table 5 shows the area under the PR curves in Figure 10. The closer the value is to 1, the better the detection effect.
The PR (precision-recall) curve compensates for a shortcoming of the ROC curve. Its abscissa is recall and its ordinate is precision. When every pixel of the output image is labeled as target, the recall rate equals 100% while the precision is very low. With ROC curves, however, the evaluation may still appear very good in such a situation, and this is where the PR curve plays a vital role.
According to the ROC curves in Figure 9 and the PR curves in Figure 10, the detection effect is best when s is 8 or 15. The running time is 19.620 s when s is 15 and 12.983 s when s is 8. Therefore, s was set to 8.
3.2. Experimental Results and Analysis
The experimental results of the proposed algorithm for the test images in Figure 7 are shown in Figure 11: (a) the sparse image obtained by RPCA decomposition; (b) the image represented by the coefficients obtained by updating and sparsely encoding the fractal dictionary with the KSVD algorithm; (c) the final result of threshold segmentation.
From Figure 11, it can be seen that robust principal component analysis removes the low-rank component of the infrared image and yields the sparse component containing cirrus. The sparse representation image is then obtained by updating the fractal dictionary and sparse coding with the KSVD algorithm. At this stage, most of the noise and clutter in the image has been removed, and some of the images have been normalized. Finally, threshold segmentation is carried out by the Otsu method to obtain the final detection result.
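The Otsu step chooses the threshold that maximizes the between-class variance of the gray-level histogram. A minimal sketch is given below; it assumes that the pixel values have been normalized to the range [0, 1] and flattened into a list, and the bin count of 256 and the function name `otsu_threshold` are choices made for the example.

```python
def otsu_threshold(values, bins=256):
    """Return the threshold in (0, 1] that maximises the between-class variance."""
    hist = [0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_bin, best_var = 0, -1.0
    w0 = sum0 = 0  # pixel count and weighted bin sum of the lower class
    for i in range(bins):
        w0 += hist[i]
        sum0 += i * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_bin, best_var = i, var
    # Values at or above the returned threshold form the upper (target) class.
    return (best_bin + 1) / bins
```

The binary detection result is then obtained by marking every pixel whose value is at least the returned threshold.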
3.3. Evaluation
To evaluate the performance of the algorithm objectively, the receiver operating characteristic (ROC) curve, Precision-Recall (PR) curve, comprehensive evaluation index (F-measure), and Intersection-over-Union (IOU) are used. The proposed method is compared with the fractaldim [40], DivisorstepTP [48], MaxMedian, EightpixelTP [49], singularityExponent [50], and areaMeasure [51] methods.
The fractal dictionary method proposed in this paper has a time advantage over other methods that extract fractal features. The time complexity of RPCA is O(CNlgN), where C is the number of iterations and N is the number of image pixels. The time complexity of the KSVD-based dictionary learning algorithm, which consists of a sparse coding stage and a dictionary learning stage, is O(t(n^2*m+m^2*n)), where t is the number of iterations and m*n is the size of the image. The total time complexity is O(CNlgN+t(n^2*m+m^2*n)). Table 6 shows the average running time of the different methods.
To present the experimental results more intuitively, the ROC curves in Figure 12 give an overall evaluation of the detection effect of the different algorithms on the different test images. Each point on a curve represents the false alarm rate and recall rate under one threshold. Panels (a–i) of Figure 12 correspond to images (a–i) in Figure 7.
Table 7 shows the area under the ROC curves in Figure 12. The closer the value is to 1, the better the detection effect. The bold numbers in the table represent the maximum values.
Figure 13 shows the PR curves of the different test images. Each point on a curve represents the recall and precision under one threshold. The nine PR plots show that the proposed algorithm has a higher precision at the same recall rate, which indicates a better detection effect. Panels (a–i) of Figure 13 correspond to images (a–i) in Figure 7.
Table 8 shows the area under the PR curves in
Figure 13. The closer the value is to 1, the better the detection effect.
Precision and recall may conflict, so they need to be considered together. The most common way to do so is the F-Measure (also known as F-Score), the weighted harmonic mean of precision and recall:
$$F_\beta = \frac{(1+\beta^2)\,\mathrm{Precision}\cdot\mathrm{Recall}}{\beta^2\,\mathrm{Precision} + \mathrm{Recall}}.$$
The value of $\beta^2$ is generally 0.3, which increases the weight of precision, that is, precision is considered more important than recall. The reason is that when the model marks every pixel of the output image as target, the recall rate equals 100% while the precision is very low.
Table 9 shows the F-Measure corresponding to the detection results of the above methods in nine test images. For each test image, the maximum value is shown in bold. It can be seen that the method proposed in this paper not only has better precision rate, but also has a good recall rate and better detection effect in the detection of false alarm source.
IOU stands for Intersection over Union. It is the ratio of the intersection to the union of the ground truth image and the result image obtained by threshold segmentation of the predicted image I.
Table 10 shows the IOU corresponding to the detection results of the above method in 9 test images. For each test image, the maximum value is shown in bold. It can be seen that the method proposed in this paper has better IOU on the cirrus detection and better detection effect.
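Both scores are simple to compute once precision, recall, and the binary masks are available. The sketch below assumes the common weighted form of the F-Measure with the squared weight passed as `beta2` (0.3 by default, following the value quoted in the text) and binary masks stored as lists of rows; the function names and the handling of empty denominators are choices made here.

```python
def f_measure(precision, recall, beta2=0.3):
    """Weighted harmonic mean of precision and recall; beta2 is the squared weight."""
    denom = beta2 * precision + recall
    return (1 + beta2) * precision * recall / denom if denom else 0.0


def iou(pred, truth):
    """Intersection over Union of two equally sized binary masks."""
    inter = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly, so their IOU is taken as 1 (convention).
    return inter / union if union else 1.0
```

With `beta2` below 1, a drop in precision lowers the score more than an equal drop in recall, which matches the stated preference for precision.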
4. Discussion
With the continuous development of infrared imaging detection systems, small target detection and recognition algorithms have continued to emerge in recent years, but few algorithms assist dim small target detection by detecting the false alarm source. In this paper, a new method for detecting the cirrus cloud false alarm source based on RPCA and fractal dictionary learning was proposed. Considering the sparsity of the cirrus cloud, the sparse component of the infrared image, which includes the cirrus cloud and noise, was obtained by RPCA. The noise in this component was then removed by the fractal dictionary learning method, which finally yielded the cirrus cloud image.
Fractal analysis is increasingly widely used, and new methods of extracting fractal features keep emerging. These include the box counting method (fractaldim), the step-by-step triangular prism method (DivisorstepTP), and the eight pixel triangular prism method (EightpixelTP), all of which extract the fractal dimension, as well as the multi-scale fractal area (areaMeausre) and the singular index of multifractal analysis (singularityExponent). Because the cirrus cloud has self-similarity, it can be detected by extracting fractal features. The fractal dictionary of this paper is also based on the fractal characteristics of the cirrus cloud, and the experiments verify that the fractal dictionary algorithm performs better than the other fractal algorithms.
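To make the idea of a fractal dimension concrete, the following is a minimal sketch of box counting on a binary mask. It is not the fractaldim implementation compared in the experiments, which is not described here. The function names, the box sizes, and the restriction to non-empty binary masks are assumptions for this example.

```python
import numpy as np

def box_count(mask, size):
    """Number of size x size boxes that contain at least one set pixel."""
    h, w = mask.shape
    h2, w2 = h - h % size, w - w % size  # trim so the boxes tile exactly
    blocks = mask[:h2, :w2].reshape(h2 // size, size, w2 // size, size)
    return int(blocks.any(axis=(1, 3)).sum())

def box_counting_dimension(mask, sizes=(1, 2, 4, 8)):
    """Slope of log N(s) against log(1/s) for a non-empty binary mask."""
    mask = np.asarray(mask, dtype=bool)
    counts = np.array([box_count(mask, s) for s in sizes], dtype=float)
    inv_sizes = 1.0 / np.array(sizes, dtype=float)
    slope, _ = np.polyfit(np.log(inv_sizes), np.log(counts), 1)
    return float(slope)

# Sanity checks on shapes with known dimension.
filled = np.ones((16, 16), dtype=bool)    # filled square
line = np.zeros((16, 16), dtype=bool)
line[0, :] = True                         # straight line

dim_filled = box_counting_dimension(filled)
dim_line = box_counting_dimension(line)
```

A filled square and a straight line give dimensions of about 2 and 1. A rough, self-similar region such as a cirrus cloud edge would be expected to fall between these values.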
The performance of the proposed algorithm was verified by experiments. In the ROC curves of Figure 12, the curve of the proposed algorithm is generally closer to the upper left corner, so its detection effect is better. Figure 13 shows the PR curves of the nine images. The curve of the proposed algorithm is closer to the upper right corner, which again indicates a better effect.
Table 7 and Table 8 give, respectively, the area under the ROC curve (AUC) in Figure 12 and the area under the PR curve (AUCpr) in Figure 13. Generally, the higher the AUC value, the better the detection effect. For some images, the ROC curve and AUC value of the proposed algorithm were lower than those of other algorithms. For example, in Figure 12g the Fractaldim, DivisorstepTP, and singularityExponent algorithms have ROC curves with a large AUC value. However, the points on those curves show a relatively high false alarm rate, and the PR curves in Figure 13g show that their precision was not high and their detection effect was not good. Overall, the ROC curves show that the proposed algorithm has a high recall rate, a low false alarm rate, and a larger AUC value, and that it gives better detection results. Because the ROC curve ignores precision, the ROC curve and the PR curve are used together to evaluate the algorithms more accurately.
Table 9 shows the comprehensive evaluation index (F-measure) of the nine test images. The bold values show that, once precision and recall are combined, the F-measure of the proposed method is higher than that of the other methods.
Table 10 shows the intersection over union (IOU). The bold values show that the IOU of the proposed method is higher than that of the other algorithms, indicating that its segmentation detection effect is better. In conclusion, the method based on RPCA and fractal dictionary learning proposed in this paper has good performance in detecting the cirrus false alarm source.
5. Conclusions
In this paper, a novel infrared cirrus detection method based on RPCA and fractal dictionary learning was proposed to suppress the false alarm sources in infrared detection systems. The algorithm focuses on the construction of a fractal dictionary for dictionary learning, in order to characterize the cirrus cloud more reasonably and completely. Cirrus clouds usually exhibit fractal characteristics, such as irregular shape, rough gray surface, complex texture, and self-similarity. Since signal components of a specific type in infrared images can be effectively represented by an over-complete dictionary of the same type of signals, fractal dictionaries based on random fractal construction can represent false alarm sources well. Compared with the traditional detection methods, the improved scheme has better detection performance and precision, and its quality indexes, such as the ROC curve, PR curve, AUC, IOU, and F-measure, also show better performance. As an auxiliary scheme, cirrus false alarm source detection and forecasting is an effective approach to improving the performance of photoelectric detection systems, especially in small target detection. The proposed method is suitable for infrared images with a single false alarm source. If several false alarm sources coexist in the imaging area, more complex algorithms need to be considered, such as hybrid modeling with multiple features for infrared imagery and a more complete adaptive dictionary learning scheme.