Feature purification fusion structure for fabric defect detection
https://doi.org/10.1007/s00371-023-03066-8
ORIGINAL ARTICLE
Abstract
Fabric defect detection is an important part of the textile industry to ensure product quality. To address problems such as the difficulty of detecting small defects and the coexistence of multi-scale defects, a fabric defect detection method based on a feature purification fusion structure is proposed in this paper. Specifically, we improve the feature extraction network to enhance its ability to focus on small defective features while effectively reducing the model parameters. Existing methods fuse multi-level feature maps directly, which leads to feature confusion. Therefore, we propose the feature purification fusion structure (FPF), which includes the semantic information supplement strategy (SIS) and the detail information supplement strategy (DIS). SIS extracts valid information from the deep feature map and supplements it to the shallow feature map, weakening feature information irrelevant to the shallow feature map's detection task. DIS adaptively supplements the feature information required by the deep feature map's detection task from the shallow feature map. FPF improves the network's ability to detect small defects and effectively mitigates the aliasing effect generated during feature fusion. The experimental results show that, compared to the baseline YOLOv5s algorithm, our model achieves a 6.8% improvement in detection accuracy with an average inference frame rate of 37.6 FPS, demonstrating better performance in fabric defect detection. Furthermore, extending the model to an aluminum profile defect dataset also demonstrates strong performance.
Keywords Fabric defect · Defect detection · Deep learning · Small target · Feature fusion
Fig. 1 Two major challenges of fabric defect detection a small defect image, b multi-scale defect image
With the improvement of computer computing power and the rapid development of software technology, deep learning methods in computer vision have reached a very advanced level in recent years. In fabric defect detection, detection methods based on convolutional neural networks (CNNs) [8] can complete the detection task in complex scenes more effectively and have gradually replaced traditional detection methods. To overcome the difficulty of detecting small defects, Qin et al. [9] proposed a fabric defect detection method based on a multi-branch residual network. Building on the original residual module, this method adds a new convolution branch, which dynamically adjusts the size of the receptive field with the number of network layers, and replaces the residual module in the down-sampling stage. The network can meet the requirement of accurate localization when detecting common defects, but the detection effect is greatly reduced when encountering small defects. Liu et al. [10] improved the YOLOv4 algorithm by replacing MaxPool with SoftPool and introducing a new SPP structure followed by convolutional layers. However, this approach still requires further enhancement in terms of accuracy for detecting small defects. Yue et al. [11] improved the YOLOv4 algorithm by adding new prediction layers in the yolo_head and replacing the CIOU loss function with the CEIOU loss function. The enhanced algorithm demonstrated an improvement in detection accuracy, but it still exhibited a higher missed detection rate for small defects. Moreover, shallow feature maps are primarily responsible for detecting small objects. When large and small objects coexist in shallow feature maps, the features of large objects tend to be more prominent. Detectors often prioritize attention to salient feature information, which hinders the effective detection of small objects.

In order to tackle the challenge of detecting defects at multiple scales, Jing et al. [12] proposed an improved YOLOv3 model for fabric defect detection. They employed the k-means algorithm to cluster target defects based on their sizes. Furthermore, they combined low-level features with high-level information and incorporated YOLO detection layers on feature maps of different sizes. While this approach achieves fast detection speed, it suffers from relatively low detection accuracy. Wang et al. [13] proposed a fabric defect detection algorithm based on an improved YOLOv5 model. They introduced the Adaptive Spatial Feature Fusion (ASFF) method to mitigate the adverse effects of the Path Aggregation Network (PANet) on multi-scale features. Additionally, an attention mechanism was incorporated to guide the network's focus toward relevant information. However, direct feature fusion between multi-level feature maps can result in information confusion among multi-scale objects, which negatively impacts the detection performance of the network.

To address these issues, we improve the feature extraction network by integrating Efficient Channel Attention (ECA) into the Ghost bottleneck and adopting a dense connectivity scheme, forming the Dense-Ghost-ECA (DGE) module. This module increases the network's attention to defects, enhances its ability to learn defect features, and enables the network to better detect subtle defects with less prominent features. We replace the traditional convolution in the backbone with Ghost convolution to reduce the
model parameters. We also propose the feature purification fusion structure (FPF), which consists of the semantic information supplement strategy (SIS) and the detail information supplement strategy (DIS). FPF not only pays attention to small defects but also weakens the saliency of large-defect feature information in the shallow feature map. FPF reduces feature confusion in the process of feature fusion and improves network performance.

Our contributions and innovations are summarized as follows.

1. We incorporate the DGE module into the feature extraction network and replace the traditional convolution with Ghost convolution to enhance the network's feature mining and reuse capability, which yields faster inference and higher accuracy while effectively reducing the computational effort.
2. We propose the FPF structure, in which the SIS strategy supplements the rich semantic information of the deep feature map downward to the shallow feature map and eliminates redundant information to ensure the purity of the feature information. The SIS strategy makes the network better at detecting small defects.
3. The DIS strategy in the FPF structure adaptively supplements the detail information missing from the deep feature map's detection task from the shallow feature map. The DIS strategy alleviates the confusion of target information caused by feature fusion.

The rest of the paper is organized as follows: Section 2 presents the work related to the proposed methodology. Section 3 elaborates on the proposed method. Section 4 presents the dataset and experimental setup used in this study, showcases the experimental results and analysis, and includes an extended experiment. Section 5 concludes the paper.

2 Related work

In this section, we introduce two lines of work related to our algorithm in detail: small target detection and feature fusion.

2.1 Small target detection

Small target detection has always been a very challenging problem in object detection, especially in fabric defect detection. Therefore, in order to improve the detection accuracy of small targets, detection algorithms dedicated to small targets have gradually appeared. Qu et al. [14] proposed combining dilated convolution and feature fusion to improve the detection of small targets to some extent by enhancing the semantic information of shallow features. However, there is still room to improve the detection accuracy of small targets. TPH-YOLOv5 [15] improves the detection performance of small objects by adding target detection layers, using transformer prediction heads, and integrating the CBAM attention module [16]. However, in situations where the targets are sparse, it is prone to missed detections. Sun et al. [17] added deformable convolution to the feature map after eightfold down-sampling and added an offset to the position of each sampling point in the convolution to expand the actual receptive field and thus improve the recognition of small targets. However, when the targets are occluded by the background, the false detection rate and missed detection rate are relatively high.

The above networks achieve better results for small target detection, but they share the same limitation: their improvements to small target detection focus only on small targets and ignore the interference of large targets in small target detection. Our approach aims not only to enhance the feature information of small objects but also to reduce the saliency of feature information from large objects, enabling the network to detect small objects better.

2.2 Feature fusion

Large variations in defect size are widespread in object detection, which also holds true for fabric defect detection. Many scholars have improved the feature fusion method to enable the detector to detect multi-scale targets better. For example, Guo et al. [18] designed a novel feature pyramid structure, AugFPN, to address the design defects of FPN that led to the under-utilization of multi-scale features and applied it in R-CNN, improving the average accuracy. Zhao et al. [19] proposed the top-down and bottom-up feature pyramid network (TDBU-FPN), which combines multi-scale features with anchor generation at multiple aspect ratios; the results show improved accuracy and speed on their dataset. Ghiasi et al. [20] explored a novel feature pyramid structure called NAS-FPN by conducting a neural architecture search in an expandable search space covering arbitrary cross-scale connections. NAS-FPN is capable of fusing features across different scales and has been validated successfully on the COCO dataset. However, feature fusion between multiple layers of feature maps often leads to severe feature scale confusion [21], which can even exacerbate false negatives and false positives. Our approach avoids significant feature scale confusion while performing feature fusion.
Fig. 2 Overall framework of the network. Firstly, the input image goes through the improved feature extraction network, resulting in multi-level feature maps P1, P2, and P3. P1 is obtained from the DGE*2 output in the backbone, P2 is obtained from the DGE*3 output in the backbone, and P3 is obtained from the SPPF output in the backbone. These multi-level feature maps are then processed by the FPF structure, which includes the Semantic Information Supplement (SIS) and Detail Information Supplement (DIS) strategies. SIS1 takes P1 and P2 (indicated by the green and red lines) as inputs and produces P1out as the output. SIS2 takes P2 and P3 (indicated by the red and blue lines) as inputs and produces Pm. DIS2 takes P1 and Pm (indicated by the green line and the Pm feature map) as inputs and produces P2out. DIS3 takes P1, P2, and P3 (indicated by the green, red, and blue lines) as inputs and produces P3out. It should be noted that P1 is the shallowest-level feature map, which contains rich detail information and does not require the DIS strategy for detail information supplementation. Similarly, P3 is the deepest-level feature map, which contains rich semantic information and does not require the SIS strategy for semantic information supplementation. Finally, the obtained P1out, P2out, and P3out are fed into the detection heads for regression and classification to obtain the final output results
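Read as pseudocode, the data flow in the Fig. 2 caption can be wired up as in the following minimal PyTorch sketch. The class and argument names are illustrative assumptions; SIS and DIS stand for the modules detailed in Sect. 3.2, and the backbone outputs P1, P2, and P3 are assumed to be given.

```python
import torch.nn as nn

class FPFNeck(nn.Module):
    """Wiring of the FPF structure as described in Fig. 2:
    SIS1(P1, P2) -> P1out, SIS2(P2, P3) -> Pm,
    DIS2(P1, Pm) -> P2out, DIS3(P1, P2, P3) -> P3out."""
    def __init__(self, sis1, sis2, dis2, dis3):
        super().__init__()
        self.sis1, self.sis2 = sis1, sis2
        self.dis2, self.dis3 = dis2, dis3

    def forward(self, p1, p2, p3):
        p1_out = self.sis1(p1, p2)      # semantic supplement for the shallowest map
        p_m = self.sis2(p2, p3)         # intermediate map, still short on detail
        p2_out = self.dis2(p1, p_m)     # detail supplement from P1
        p3_out = self.dis3(p1, p2, p3)  # detail supplement from P1 and P2
        return p1_out, p2_out, p3_out   # fed to the three detection heads
```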
The amounts of floating-point computation for traditional convolution and Ghost convolution are b1 and b2, respectively:

$$b_1 = n \cdot h \cdot w \cdot c \cdot k^2 \tag{4}$$

$$b_2 = \frac{n}{s} \cdot h \cdot w \cdot c \cdot k^2 + (s-1) \cdot \frac{n}{s} \cdot h \cdot w \cdot d^2 \tag{5}$$

The ratio of floating-point computation between the two is:

$$\frac{b_1}{b_2} = \frac{n \cdot h \cdot w \cdot c \cdot k^2}{\frac{n}{s} \cdot h \cdot w \cdot c \cdot k^2 + (s-1) \cdot \frac{n}{s} \cdot h \cdot w \cdot d^2} \approx \frac{s \cdot c}{s + c - 1} \approx s \tag{6}$$

where c represents the number of channels of the input feature map and s ≪ c, k² is the size of the convolution kernel of the traditional convolution operation, h and w respectively represent the height and width of the feature map generated by Ghost convolution, and d² is the size of the convolution kernel of the linear operation. From the ratio of the number of parameters and the ratio of floating-point computation, it can be seen that when k and d are equal in size, the parameters and computation used for feature extraction with Ghost convolution are about 1/s of those of traditional convolution.

3.1.2 Efficient channel attention (ECA)

ECA [23] is an implementation of the channel attention mechanism. SENet [24] performs channel compression on the input feature maps, but such compression and dimensionality reduction are detrimental to learning the dependencies between channels. ECA therefore proposes a local cross-channel interaction strategy without dimensionality reduction. This strategy allows a direct relationship between channels and weights, and appropriate cross-channel interaction can significantly reduce model complexity while maintaining performance. The module adds only a small number of parameters yet achieves significant performance gains.

ECA uses an adaptive method to select the size of the one-dimensional convolution kernel: the attention weights are obtained by a fast one-dimensional convolution of size K, where K represents the coverage of local cross-channel interaction and is proportional to the channel dimension. The structural comparison of SE and ECA is shown in Fig. 4.

3.1.3 DGE module

Ghost convolution is used to constitute a Ghost bottleneck structure, similar to ResNet [25], as shown in Fig. 5. In Fig. 5, the first Ghost convolution acts as an expansion layer that increases the number of feature channels, and the second Ghost convolution decreases the number of channels; the result is then output through a shortcut connection.

Ghost convolution reduces the parameter count and improves computational speed but may cause some loss of accuracy. Therefore, we introduce the ECA module into the Ghost bottleneck to enhance the network's ability to learn features. This module helps the model pay more attention to subtle defects and better focus on information relevant to the current detection task. Inspired by DenseNet [26], we adopt a dense connection approach in the DGE module. This ensures maximum information transfer between layers.
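As a concrete reference for the Ghost convolution of Sect. 3.1.1 and the ECA block of Sect. 3.1.2, here is a minimal PyTorch sketch. The SiLU activations, the depthwise "cheap" operation, and the adaptive kernel-size rule K = |log₂C/γ + b/γ| rounded to the nearest odd value (γ = 2, b = 1) follow the original GhostNet and ECA papers and are assumptions here, since this article does not spell them out.

```python
import math
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Ghost convolution: a primary conv generates n/s intrinsic maps and a cheap
    depthwise 'linear' operation generates the remaining ghost maps (Eqs. 4-6)."""
    def __init__(self, c_in, c_out, k=1, s=1, ratio=2, dw_k=3):
        super().__init__()
        c_prim = math.ceil(c_out / ratio)          # intrinsic channels (n/s)
        c_cheap = c_prim * (ratio - 1)             # ghost channels
        self.c_out = c_out
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_prim, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_prim), nn.SiLU())
        self.cheap = nn.Sequential(                # depthwise cheap operation
            nn.Conv2d(c_prim, c_cheap, dw_k, 1, dw_k // 2, groups=c_prim, bias=False),
            nn.BatchNorm2d(c_cheap), nn.SiLU())

    def forward(self, x):
        y = self.primary(x)
        out = torch.cat([y, self.cheap(y)], dim=1)
        return out[:, :self.c_out]                 # trim to the requested width

class ECA(nn.Module):
    """Efficient channel attention: GAP followed by a fast 1-D convolution of
    size K across channels, with no dimensionality reduction."""
    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        k = int(abs(math.log2(channels) / gamma + b / gamma))
        k = k if k % 2 else k + 1                  # K must be odd
        self.conv = nn.Conv1d(1, 1, k, padding=k // 2, bias=False)

    def forward(self, x):
        w = x.mean((2, 3))                                   # B x C, global average pooling
        w = self.conv(w.unsqueeze(1)).squeeze(1)             # 1-D conv across channels
        return x * torch.sigmoid(w)[:, :, None, None]        # channel re-weighting
```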
| Input | Operator | #exp | #out | ECA | Stride | N |
| 640² × 3 | GBS | – | 64 | – | 2 | 1 |
| 320² × 64 | GBS | – | 128 | – | 2 | 1 |
| 160² × 128 | DGE | 2 | 128 | 1 | 1 | 1 |
| 160² × 128 | GBS | – | 256 | – | 2 | 1 |
| 80² × 256 | DGE | 2 | 256 | 1 | 1 | 2 |
| 80² × 256 | GBS | – | 512 | – | 2 | 1 |
| 40² × 512 | DGE | 2 | 512 | 1 | 1 | 3 |
| 40² × 512 | GBS | – | 1024 | – | 2 | 1 |
| 20² × 1024 | DGE | 2 | 1024 | 1 | 1 | 1 |
| 20² × 1024 | SPPF | – | 1024 | – | 1 | 1 |

#exp represents the expansion size, #out represents the number of output channels, ECA indicates whether the ECA module is used, and N represents the number of times each module is used
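To relate the GBS and DGE entries in the table above to code, the sketch below builds on the GhostConv and ECA classes from the previous sketch. It assumes that GBS denotes a stride-2 Ghost convolution block and that DGE stacks ECA-augmented Ghost bottlenecks with DenseNet-style concatenations; the exact layer ordering and channel bookkeeping are assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
# GhostConv and ECA are the classes sketched for Sect. 3.1 above.

def GBS(c_in, c_out):
    """Assumed meaning of 'GBS' in the table: a stride-2 Ghost conv block for down-sampling."""
    return GhostConv(c_in, c_out, k=3, s=2)

class GhostBottleneckECA(nn.Module):
    """Ghost bottleneck (expand -> project) with an ECA block and a shortcut connection."""
    def __init__(self, channels, expansion=2):
        super().__init__()
        hidden = channels * expansion
        self.expand = GhostConv(channels, hidden, k=1)    # expansion layer (#exp)
        self.project = GhostConv(hidden, channels, k=1)   # reduce channels back
        self.eca = ECA(channels)

    def forward(self, x):
        return x + self.eca(self.project(self.expand(x)))

class DGE(nn.Module):
    """DGE-style block: N ECA-Ghost bottlenecks wired with dense concatenations,
    followed by a 1x1 Ghost convolution that restores the channel count."""
    def __init__(self, channels, n=1, expansion=2):
        super().__init__()
        self.stages = nn.ModuleList()
        for i in range(n):
            self.stages.append(nn.Sequential(
                GhostConv(channels * (i + 1), channels, k=1),  # squeeze the dense input
                GhostBottleneckECA(channels, expansion)))
        self.fuse = GhostConv(channels * (n + 1), channels, k=1)

    def forward(self, x):
        feats = [x]
        for stage in self.stages:                 # each stage sees all previous features
            feats.append(stage(torch.cat(feats, dim=1)))
        return self.fuse(torch.cat(feats, dim=1)) # dense connectivity, then fuse
```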
Fig. 8 Visualization of multi-level feature maps. a original image input to the network, b shallow feature map P1, c intermediate feature map P2, d deep feature map P3
detection of large defects. Different levels of feature maps contain different feature information. Shallow feature maps contain rich detail but lack semantic information, while deep feature maps contain rich semantic information but lack detail. Fusing information between different levels of feature maps usually leads to feature confusion. Introducing feature information of large defects into the shallow feature map, which is originally used to detect small defects, makes the features of small defects less pure and weakens the network's ability to detect small defects. Based on this, this paper proposes the FPF structure, comprising the SIS and DIS strategies.

3.2.1 SIS strategy

The original image I ∈ R^(W×H×3) is fed into the backbone feature extraction network to obtain the multi-level feature maps P = {p1, p2, ..., pn} (n = 3). We assume that X = {x1, x2, ..., xn} is the set of all multi-scale defect targets, where x1 represents the smallest defect and xn represents the largest defect. During training, pi (i = 1, 2, ..., n) is more inclined to detect defects similar in size to xi [28, 29]. Therefore, we propose the SIS strategy, as shown in Fig. 9.

We input the deeper feature map Pi+1 (i = 1, 2) into the reverse spatial attention module (RSAM). RSAM applies spatial attention to obtain the weight feature map Pw, and the reverse weight values are obtained by subtracting Pw from the full-value feature map Pf, whose weight values are all set to 1. To avoid interference from background weight values on the result, we introduce a threshold T: the reverse operation is only performed when the weight value of a point in Pw is greater than T. We analyze the threshold T in Sect. 4. RSAM makes the small defects in the feature map more significant while suppressing the feature information of large defects, making the feature information of small defects purer and improving the detection ability of the network. The RSAM operation at points that meet the threshold condition can be expressed as:

$$\mathrm{RSAM}(\cdot) = P_f - P_w = 1 - \sigma\left(\mathrm{Conv}_{7 \times 7}\left(C\left(\mathrm{Avg}(U(P_{i+1})),\ \mathrm{Max}(U(P_{i+1}))\right)\right)\right) \tag{7}$$

where the function U(Pi+1) upsamples Pi+1 to the same spatial resolution as Pi, Avg(·) and Max(·) denote average pooling and max pooling along the channel dimension, respectively, C(·) denotes the Concatenate operation, Conv7×7(·) refers to a standard 7 × 7 convolution, which compresses the channels of the concatenated feature map to ensure the uniqueness of the weight values, and σ(·) denotes the Sigmoid activation function.
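A minimal PyTorch sketch of RSAM as defined in Eq. (7), including the threshold T, is shown below. What happens at points whose weight does not exceed T is not spelled out in the text; this sketch keeps the original weight there, and the nearest-neighbour upsampling is likewise an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RSAM(nn.Module):
    """Reverse spatial attention (Eq. 7): reversed weights suppress salient
    (large-defect) regions; a threshold T keeps low-weight background points
    from being reversed."""
    def __init__(self, threshold=0.3):  # 0.3 is the value used in the extended experiments
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)
        self.threshold = threshold

    def forward(self, p_deep, p_shallow):
        # U(P_{i+1}): upsample the deeper map to the shallow map's resolution
        up = F.interpolate(p_deep, size=p_shallow.shape[2:], mode="nearest")
        # C(Avg(.), Max(.)): channel-wise average and max descriptors
        avg = up.mean(dim=1, keepdim=True)
        mx, _ = up.max(dim=1, keepdim=True)
        pw = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))  # weight map P_w
        # P_f - P_w with P_f = 1, applied only where P_w > T
        # (assumed behaviour elsewhere: keep P_w unchanged)
        return torch.where(pw > self.threshold, 1.0 - pw, pw)
```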
The obtained reverse weight values are used to guide the shallow feature map Pi. To avoid gradient vanishing during training, we introduce skip connections. Finally, the output feature map P1out (i = 1) of P1 and the intermediate feature map Pm (i = 2) of P2 are obtained. This can be expressed as:

$$P_{1out}/P_m = P_i \odot \mathrm{RSAM}(\cdot) + P_i \tag{8}$$

where ⊙ refers to the Hadamard product.

3.2.2 DIS strategy

The P1 feature map is supplemented with semantic information through SIS1 and outputs P1out, which is then passed to the detection head for regression and classification predictions. The P2 feature map undergoes SIS2 to obtain the intermediate feature map Pm, which receives a boost in semantic information but still lacks detail information. To address this, we employ the DIS strategy to supplement the feature maps with detail information and generate the output feature maps P2out and P3out for P2 and P3, respectively. The DIS strategy is illustrated in Fig. 10.

In the DIS strategy, DIS2 and DIS3 are slightly different. The reason is that Pm can only supplement detail information from the feature map P1, while P3 can supplement detail information from both P1 and P2.

In DIS2, Pm is passed through the spatial attention module (SAM) to obtain the weight value of each point, and the weight values are used to guide the P1 feature map to supplement Pm with the required detail information. We use residual blocks to ensure that all feature maps have a consistent spatial resolution. The DIS2 strategy can be expressed as follows:

$$\mathrm{SAM}(\cdot) = \sigma\left(\mathrm{Conv}_{7 \times 7}\left(C\left(\mathrm{Avg}(U(P_{i+1})),\ \mathrm{Max}(U(P_{i+1}))\right)\right)\right) \tag{9}$$

$$P_{2out} = RB\left(P_m + RB\left(P_1 \odot \mathrm{SAM}(P_m)\right)\right) \tag{10}$$

where RB(·) refers to the residual block.

In DIS3, P3 is passed through the spatial attention module (SAM) to obtain the weight value of each point, and the weight values are used to guide the P1 and P2 feature maps to supplement P3 with the required details. We adjust the number of channels of P3 by a 1 × 1 convolution to ensure that it can be fused with the feature maps obtained from the Concatenate operation. DIS3 can be expressed as:

$$P_{3out} = RB\left(\mathrm{Conv}_{1 \times 1}(P_3) + C\left(RB\left(P_1 \odot \mathrm{SAM}(P_3)\right),\ RB\left(P_2 \odot \mathrm{SAM}(P_3)\right)\right)\right) \tag{11}$$

where Conv1×1(·) refers to a standard 1 × 1 convolution.

The FPF structure enhances the network's ability to detect small defects and effectively alleviates the feature confusion caused by feature fusion. Specifically, the SIS strategy enriches and highlights the feature information of
small defects while suppressing other interfering information, enabling the network to detect small defects better and reduce false positives. The DIS strategy complements the detail information required by the deep feature map with information from the shallow feature map, compensating for the lack of detail in the deep feature map and mitigating the information confusion caused by feature fusion.

4 Experiments and discussions

In this section, we validate the effectiveness of the proposed model through a series of experiments. We first introduce the dataset and experimental setup, conduct a threshold analysis, and compare our proposed model with current state-of-the-art algorithms. Next, we perform ablation experiments to demonstrate the improvement in detection performance achieved by our proposed method. Finally, we conduct extended experiments to assess the generalization capability of our approach.

4.1 Experimental dataset

To train the model and verify the method's validity, we used a public fabric defect dataset (Smart Diagnosis of Cloth Flaw Dataset, SDCFD) [30]. The samples were taken from the production lines of textile mills. SDCFD contains 20 types of defects. In the experiments, we selected nine defects frequently encountered in practical production as our target objects: hole, stain, three silk, knot, flower pattern card skip, coarse filling, size stain, warper, and mark. We reannotated the dataset using LabelImg to ensure the model learns from accurate and dependable feature information. To augment the dataset and ensure an adequate number of defect images for training, we applied augmentation techniques such as random flipping, brightness transformation, and rotation scaling. As a result, we gathered a total of 4,293 images, which form our new dataset. The images have a resolution of 2446 × 1000 pixels. Among them, 3,435 images were randomly selected as the training set and 858 as the validation set. Sample images and corresponding annotation boxes for each category in the dataset are shown in Fig. 11.

4.2 Experimental configuration

The experiments were run on the Windows 10 operating system with an Intel(R) Core(TM) i7-9700 CPU and an NVIDIA GeForce GTX 1660 GPU; the software environment was Python 3.7, NVIDIA CUDA 10.2, and cuDNN 7.6.5, with PyTorch as the deep learning framework and PyCharm as the development tool. To capture objects of various sizes, we employed the K-means++ clustering algorithm to generate higher-quality anchor boxes. The generated anchor boxes were then assigned to the three feature layers as follows: P1: (5,10), (6,15), (19,6); P2: (4,27), (24,37), (21,216); P3: (69,79),
Fig. 11 Partially enlarged examples of SDCFD defects. Images in order are a hole, b stain, c three silk, d knot, e flower pattern card skip, f coarse filling, g size stain, h warper, i mark
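Returning to the anchor-box generation described in Sect. 4.2, the following is a minimal sketch of K-means++ clustering over ground-truth box sizes. The use of scikit-learn, Euclidean distance in (width, height) space, and the sorting-by-area assignment to P1-P3 are assumptions for illustration, not the authors' exact procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def generate_anchors(wh, n_anchors=9):
    """Cluster ground-truth (width, height) pairs, already scaled to the network
    input resolution, into n_anchors anchor boxes using K-means++ initialisation.
    Returns the anchors sorted by area, smallest first."""
    wh = np.asarray(wh, dtype=np.float32)                # shape (N, 2)
    km = KMeans(n_clusters=n_anchors, init="k-means++", n_init=10, random_state=0)
    km.fit(wh)
    anchors = km.cluster_centers_
    return anchors[np.argsort(anchors.prod(axis=1))].round().astype(int)

# Usage: split the nine sorted anchors into three groups of three and assign them
# to the P1, P2, and P3 detection layers, as in Sect. 4.2.
```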
| Method | Hole | Stain | Three silk | Knot | fpck | Coarse filling | Size stain | Warper | Mark | mAP | P | G |
| Faster R-CNN | 84.6 | 85.2 | 88.3 | 78.1 | 88.4 | 77.2 | 85.5 | 80.1 | 75.1 | 82.5 | 43.7 | 178.9 |
| Cascade R-CNN | 90.7 | 89.9 | 88.7 | 80.3 | 90.1 | 79.3 | 90.6 | 81.8 | 79.3 | 85.6 | 73.5 | 219.8 |
| SSD | 79.2 | 81.5 | 73.3 | 71.5 | 82.8 | 72.0 | 77.7 | 76.0 | 81.2 | 77.2 | 34.3 | 51.2 |
| YOLOv3 | 90.1 | 88.6 | 87.4 | 80.8 | 91.6 | 85.2 | 85.1 | 74.7 | 84.6 | 85.3 | 61.5 | 193.9 |
| YOLOv5s | 90.2 | 88.3 | 88.6 | 81.6 | 92.9 | 78.9 | 86.3 | 75.4 | 78.4 | 84.5 | 7.0 | 16.5 |
| YOLOv5m | 90.7 | 89.7 | 89.4 | 82.1 | 95.7 | 80.8 | 86.6 | 76.2 | 79.6 | 85.6 | 20.8 | 49.4 |
| YOLOX-s | 80.9 | 81.4 | 85.1 | 77.3 | 87.6 | 75.1 | 80.2 | 72.8 | 76.3 | 79.6 | 8.9 | 26.6 |
| YOLOR-CSP | 89.3 | 94.9 | 87.8 | 80.5 | 89.7 | 79.6 | 83.4 | 74.3 | 74.9 | 83.8 | 63.2 | 134.6 |
| Lin et al. | 95.6 | 95.2 | 93.6 | 90.5 | 92.1 | 82.4 | 90.8 | 77.6 | 84.2 | 89.1 | 54.7 | 188.5 |
| Zhou et al. | 93.4 | 91.1 | 90.5 | 91.4 | 89.3 | 81.7 | 88.9 | 78.1 | 82.5 | 87.4 | 11.4 | 25.6 |
| Luo et al. | 95.2 | 90.7 | 92.7 | 94.7 | 90.5 | 86.1 | 95.2 | 80.7 | 81.6 | 89.7 | 65.8 | 205.7 |
| Ours | 96.3 | 94.6 | 98.2 | 95.4 | 95.1 | 84.8 | 94.5 | 79.4 | 83.9 | 91.3 | 15.4 | 37.7 |

fpck: flower pattern card skip; P: Parameters (M); G: GFLOPs
we present the detection accuracy of several defects, including hole, three silk, knot, and size stain. The results of the ablation experiments are shown in Table 3.

In Table 3, we observe a significant increase in mAP after adding the DGE module, indicating that the DGE module enhances the network's ability to extract features, resulting in better detection of target objects. The addition of the Ghost + DGE module achieved the highest inference speed of 47.5 FPS, demonstrating the significant lightweight effect of the Ghost module. Furthermore, compared to the baseline model, replacing the original fusion structure with the FPF structure led to a 4.2% improvement in accuracy, indicating the strong performance of the FPF structure. We also observe that the model formed by adding the DGE module and the FPF structure achieved an mAP of 91.4, but its detection speed was only 33.5 FPS. The model formed by adding the Ghost + DGE module and the FPF structure saw a slight decrease in mAP of only 0.1% compared to the former, but its detection speed increased by 4.1 FPS. Considering all factors, we chose the model formed by adding the Ghost + DGE module and the FPF structure as our final model.

To demonstrate more specifically the impact of the FPF structure and its SIS and DIS strategies on detection performance, we conducted a separate set of ablation experiments. The SIS strategy, DIS strategy, and FPF structure were individually added to the baseline model. It is important to note that no modules were replaced when adding the SIS strategy, DIS strategy, and FPF structure. Instead, they were directly added between the CSPDarkNet53 and PAN modules of the baseline model. In this configuration, the input to
after adding the SIS strategy. This is because the SIS strategy enhances the saliency of small defects in shallow feature maps, enabling the network to focus more on these small defects and improve their detection accuracy. The DIS strategy supplements detailed information from shallow to deep layers, leading to a notable improvement in detecting large-scale defects. Due to the interdependence between the output of SIS2 and the input of DIS2, the overall performance improvement is more pronounced when the FPF structure is added as a whole, compared to individually adding the SIS and DIS strategies. The integration of the FPF structure leverages the correlated information between SIS and DIS, leading to a more significant enhancement in detection performance.
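To make the coupling between SIS2's output Pm and DIS2 discussed above concrete, the sketch below wires a spatial attention module (Eq. (9)) into DIS2 (Eq. (10)) and DIS3 (Eq. (11)). The residual block RB, the stride choices used to match resolutions, and the channel split for the concatenation are assumptions — the article only states that residual blocks keep the spatial resolutions consistent — so treat this as an illustration rather than the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def sam(x, out_size, conv7):
    """Spatial attention (Eq. 9) of a deeper map, upsampled to the target size."""
    up = F.interpolate(x, size=out_size, mode="nearest")
    avg = up.mean(dim=1, keepdim=True)
    mx, _ = up.max(dim=1, keepdim=True)
    return torch.sigmoid(conv7(torch.cat([avg, mx], dim=1)))

class RB(nn.Module):
    """Assumed residual block: 3x3 convs with an optional stride/1x1 projection
    so that outputs match the deeper map's resolution and channel count."""
    def __init__(self, c_in, c_out, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, stride, 1, bias=False),
            nn.BatchNorm2d(c_out), nn.SiLU(),
            nn.Conv2d(c_out, c_out, 3, 1, 1, bias=False),
            nn.BatchNorm2d(c_out))
        self.skip = (nn.Identity() if stride == 1 and c_in == c_out else
                     nn.Conv2d(c_in, c_out, 1, stride, bias=False))
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.body(x) + self.skip(x))

class DIS(nn.Module):
    """DIS2 (Eq. 10) and DIS3 (Eq. 11); c1/c2/c3 are the channel widths of P1/P2/P3."""
    def __init__(self, c1, c2, c3):
        super().__init__()
        self.conv7_m = nn.Conv2d(2, 1, 7, padding=3, bias=False)   # SAM conv for Pm
        self.conv7_3 = nn.Conv2d(2, 1, 7, padding=3, bias=False)   # SAM conv for P3
        self.rb_12 = RB(c1, c2, stride=2)        # bring guided P1 to P2's scale
        self.rb_out2 = RB(c2, c2)
        self.rb_13 = RB(c1, c3 // 2, stride=4)   # guided P1 -> P3 scale (stride assumed)
        self.rb_23 = RB(c2, c3 // 2, stride=2)   # guided P2 -> P3 scale
        self.proj3 = nn.Conv2d(c3, c3, 1)        # 1x1 conv so P3 matches the concat
        self.rb_out3 = RB(c3, c3)

    def forward(self, p1, p_m, p3):
        # DIS2: P2out = RB(Pm + RB(P1 * SAM(Pm)))
        w_m = sam(p_m, p1.shape[2:], self.conv7_m)
        p2_out = self.rb_out2(p_m + self.rb_12(p1 * w_m))
        # DIS3: P3out = RB(Conv1x1(P3) + C(RB(P1 * SAM(P3)), RB(P2 * SAM(P3))))
        g1 = self.rb_13(p1 * sam(p3, p1.shape[2:], self.conv7_3))
        g2 = self.rb_23(p_m * sam(p3, p_m.shape[2:], self.conv7_3))
        p3_out = self.rb_out3(self.proj3(p3) + torch.cat([g1, g2], dim=1))
        return p2_out, p3_out
```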
Table 3 Ablation experimental results (P: Parameters (M); G: GFLOPs)

| Baseline | DGE | Ghost + DGE | FPF | Hole | Three_silk | Knot | Size_stain | mAP | P | G | FPS |
| √ | | | | 90.2 | 88.6 | 81.6 | 86.3 | 84.5 | 7.0 | 16.5 | 42.2 |
| √ | √ | | | 92.0 | 91.6 | 85.3 | 88.7 | 86.6 | 6.2 | 14.4 | 46.8 |
| √ | | √ | | 91.8 | 91.4 | 85.3 | 88.6 | 86.4 | 5.7 | 13.3 | 47.5 |
| √ | | | √ | 94.5 | 95.1 | 90.7 | 91.6 | 88.7 | 17.3 | 41.5 | 32.8 |
| √ | √ | | √ | 96.3 | 98.1 | 95.6 | 94.5 | 91.4 | 15.9 | 38.9 | 33.5 |
| √ | | √ | √ | 96.3 | 98.2 | 95.4 | 94.5 | 91.3 | 15.4 | 37.7 | 37.6 |
Table 4 FPF structure ablation experimental results. P means Parameters (M), and G means GFLOPs

| Baseline | SIS | DIS | Hole | Three_silk | Knot | Size_stain | mAP | P | G | FPS |
| √ | | | 90.2 | 88.6 | 81.6 | 86.3 | 84.5 | 7.0 | 16.5 | 42.2 |
| √ | √ | | 92.4 | 94.3 | 90.7 | 87.2 | 87.3 | 13.2 | 30.9 | 36.6 |
| √ | | √ | 92.1 | 90.6 | 83.8 | 91.7 | 86.8 | 14.3 | 34.7 | 34.5 |
| √ | √ | √ | 96.5 | 98.2 | 95.8 | 95.4 | 91.8 | 20.7 | 50.2 | 29.7 |
Table 5 Extended experimental results on the aluminum profile defect dataset

| Method | NC | Scratch | CL | OP | CP | PC | DP | Leakage | mAP | P | G | FPS |
| Cascade R-CNN | 94.3 | 83.5 | 96.8 | 88.2 | 91.9 | 88.7 | 87.8 | 77.1 | 88.5 | 73.5 | 219.8 | 21.7 |
| YOLOv3 | 92.2 | 80.6 | 94.3 | 90.2 | 93.5 | 90.7 | 79.8 | 78.6 | 87.5 | 61.5 | 193.9 | 26.4 |
| YOLOv5s | 92.5 | 77.4 | 95.7 | 87.3 | 96.1 | 92.4 | 78.2 | 75.9 | 86.9 | 7.0 | 16.5 | 52.2 |
| YOLOv5m | 93.7 | 78.3 | 96.2 | 87.5 | 94.9 | 93.5 | 80.3 | 77.8 | 87.8 | 20.8 | 49.4 | 41.3 |
| YOLOR-CSP | 90.4 | 79.1 | 94.8 | 86.3 | 91.7 | 91.6 | 77.3 | 79.6 | 86.4 | 63.2 | 134.6 | 22.5 |
| Ours | 95.2 | 87.6 | 96.4 | 87.7 | 94.3 | 95.1 | 85.4 | 84.7 | 90.8 | 15.4 | 37.7 | 43.6 |

P: Parameters (M); G: GFLOPs
defect types, including Non Conductive (NC), Scratch, Corner Leakage (CL), Orange Peel (OP), Convex Powder (CP), Particle Color (PC), Dirty Point (DP), and Leakage. We applied augmentation techniques such as random flipping, brightness transformation, and rotation scaling to enhance the dataset. As a result, we collected a new dataset consisting of 7,140 images. Each image in the dataset has a size of 2560 × 1920 pixels. Among these, 5,712 images were randomly selected for training, while 1,428 images were used for validation. We followed the same experimental setup and evaluation metrics as in fabric defect detection, with the exception that, after multiple experiments, we set the values of T1 and T2 to 0.3. The results of the extended experiment are presented in Table 5, and the visualized detection results are shown in Fig. 16.

The bold numbers in the table represent the optimal values. The table shows that, compared to the baseline YOLOv5s, our proposed model achieved a 3.9% improvement in mAP while maintaining an average inference frame rate of 43.6 FPS, which still meets the real-time requirement. In the comparative experiments with mainstream models, our model achieved the highest mAP value and demonstrated good detection performance across various defect categories. This indicates that our model possesses excellent generalization ability, enabling it to handle different data types and address various problem scenarios.

5 Conclusion

This paper proposes a deep learning model for fabric defect detection to address the difficulties in current fabric defect detection tasks. Ghost convolution and DGE modules were embedded into the feature extraction network to extract more abundant defect feature information and improve the characterization ability of fabric defect features. Unlike the common
8. Tu, Z., Xie, W., Dauwels, J., Li, B., Yuan, J.: Semantic cues enhanced multimodality multistream CNN for action recognition. IEEE Trans. Circuits Syst. Video Technol. 29(5), 1423–1437 (2018)
9. Qin, R., Li, Y., Fan, Y.: Research on fabric defect detection based on multi-branch residual network. J. Phys. Conf. Ser. 1907(1), 012057 (2021)
10. Liu, Q., Wang, C., Li, Y., Gao, M., Li, J.: A fabric defect detection method based on deep learning. IEEE Access 10, 4284–4296 (2022)
11. Yue, X., Wang, Q., He, L., Li, Y., Tang, D.: Research on tiny target detection technology of fabric defects based on improved Yolo. Appl. Sci. 12(13), 6823 (2022)
12. Jing, J., Zhuo, D., Zhang, H., Liang, Y., Zheng, M.: Fabric defect detection using the improved YOLOv3 model. J. Eng. Fibers Fabr. 15, 1558925020908268 (2020)
13. Wang, Y., Hao, Z., Zuo, F., Pan, S.: A fabric defect detection system based improved yolov5 detector. J. Phys. Conf. Ser. 2010, 012191 (2021)
14. Qu, J., Su, C., Zhang, Z., Razi, A.: Dilated convolution and feature fusion SSD network for small object detection in remote sensing images. IEEE Access 8, 82832–82843 (2020)
15. Zhu, X., Lyu, S., Wang, X., Zhao, Q.: TPH-YOLOv5: improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 2778–2788 (2021)
16. Fu, H., Song, G., Wang, Y.: Improved YOLOv4 marine target detection combined with CBAM. Symmetry 13(4), 623 (2021)
17. Sun, P., Piao, J.C., Cui, X.: Object detection in urban aerial image based on advanced YOLO v3 algorithm. In: 2020 5th International Conference on Mechanical, Control and Computer Engineering (ICMCCE). pp. 2191–2196 (2020)
18. Guo, C., Fan, B., Zhang, Q., Xiang, S., Pan, C.: AugFPN: improving multi-scale feature learning for object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 12595–12604 (2020)
19. Baojun, Z., Boya, Z., Linbo, T., Wenzheng, W., Chen, W.: Multi-scale object detection by top-down and bottom-up feature pyramid network. J. Syst. Eng. Electron. 30(1), 1–12 (2019)
20. Ghiasi, G., Lin, T.Y., Le, Q.V.: NAS-FPN: learning scalable feature pyramid architecture for object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 7036–7045 (2019)
21. Li, Y., Pang, Y., Shen, J., Cao, J., Shao, L.: NETNet: neighbor erasing and transferring network for better single shot object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 13349–13358 (2020)
22. He, Z., He, D., Li, X., Qu, R.: Blind superresolution of satellite videos by ghost module-based convolutional networks. IEEE Trans. Geosci. Remote Sens. 61, 1–19 (2022)
23. Cao, Y., Chen, J., Zhang, Z.: A sheep dynamic counting scheme based on the fusion between an improved-sparrow-search YOLOv5x-ECA model and few-shot deepsort algorithm. Comput. Electron. Agric. 206, 107696 (2023)
24. Jia, Z., Wang, K., Li, Y., Liu, Z., Qin, J., Yang, Q.: High precision feature fast extraction strategy for aircraft attitude sensor fault based on RepVGG and SENet attention mechanism. Sensors 22(24), 9662 (2022)
25. Nawaz, M., Javed, A., Irtaza, A.: ResNet-Swish-Dense54: a deep learning approach for deepfakes detection. Vis. Comput. (2022). https://doi.org/10.1007/s00371-022-02732-7
26. Yang, M., Ma, T., Tian, Q., Tian, Y., Al-Dhelaan, A., Al-Dhelaan, M.: Aggregated squeeze-and-excitation transformations for densely connected convolutional networks. Vis. Comput. 38(8), 2661–2674 (2022)
27. Wang, C.Y., Liao, H.Y.M., Wu, Y.H., Chen, P.Y., Hsieh, J.W., Yeh, I.H.: CSPNet: a new backbone that can enhance learning capability of CNN. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. pp. 390–391 (2020)
28. Jiaxu, L., Taiyue, C., Xinbo, G., Yongtao, Y., Ye, W., Feng, G., Yue, W.: A comparative review of recent few-shot object detection algorithms. arXiv preprint arXiv:2111.00201 (2021)
29. Tian, Z., Shen, C., Chen, H., He, T.: FCOS: fully convolutional one-stage object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 9627–9636 (2019)
30. Tian, C.: Smart diagnosis of cloth flaw dataset. Available online: https://tianchi.aliyun.com/dataset/dataDetail?dataId=79336 (2020)
31. Glenn, J., et al.: NanoCode012. Available online: https://github.com/ultralytics/yolov5/ (2021)
32. Lin, G., Liu, K., Xia, X., Yan, R.: An efficient and intelligent detection method for fabric defects based on improved YOLOv5. Sensors 23(1), 97 (2022)
33. Zhou, S., Zhao, J., Shi, Y.S., Wang, Y.F., Mei, S.Q.: Research on improving YOLOv5s algorithm for fabric defect detection. Int. J. Cloth. Sci. Technol. 35(1), 88–106 (2023)
34. Luo, X., Cheng, Z., Ni, Q., Tao, R., Shi, Y.: Defect detection algorithm for fabric based on deformable convolutional network. Text. Res. J. 93(9–10), 2342–2354 (2023)

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Guohua Liu received his B.S. and M.S. degrees in mechatronic engineering from Hebei University of Technology, Tianjin, China, in 1992 and 1998, respectively. He is a professor at Tiangong University. He is a reviewer for The Visual Computer, IEEE Access, and Journal of Electronic Imaging. He is engaged in the research of computer vision at the Advanced Mechatronics Equipment Technology Major Laboratory, Tianjin, China. His research interests include computer vision for defect detection, pattern recognition, and scene understanding.