Article
LPDNet: A Lightweight Network for SAR Ship Detection Based on
Multi-Level Laplacian Denoising
Congxia Zhao 1, Xiongjun Fu 1,2,*, Jian Dong 1, Cheng Feng 1 and Hao Chang 1
Abstract: Intelligent ship detection based on synthetic aperture radar (SAR) is vital to maritime situational awareness. Deep learning methods have great advantages in SAR ship detection; however, existing methods do not strike a balance between light weight and accuracy. In this article, we propose an end-to-end lightweight SAR target detection algorithm, the multi-level Laplacian pyramid denoising network (LPDNet). Firstly, an intelligent denoising method based on the multi-level Laplacian transform is proposed. Through Convolutional Neural Network (CNN)-based threshold suppression, the denoising becomes adaptive to every SAR image via back-propagation, which makes the denoising processing supervised. Secondly, channel modeling is proposed to combine spatial-domain and frequency-domain information, and the multi-dimensional information enhances the detection effect. Thirdly, the Convolutional Block Attention Module (CBAM) is introduced into the feature fusion module of the basic framework (Yolox-tiny), so that different weights are given to each pixel of the feature map to highlight the effective features. Experiments on SSDD and AIR SARShip-1.0 demonstrate that the proposed method achieves 97.14% AP at 24.68 FPS and 92.19% AP at 23.42 FPS, respectively, with only 5.1 M parameters, which verifies the accuracy, efficiency, and light weight of the proposed method.
Keywords: synthetic aperture radar (SAR); Convolutional Neural Network (CNN); target detection;
trainable; Laplacian pyramid denoising
1. Introduction
Synthetic aperture radar (SAR) is an active microwave sensor with strong penetrability, which produces high-resolution images at all times and in all weather. It is widely used in disaster prevention, emergency rescue, land resource monitoring, urban management, etc. [1,2]. Ship detection based on SAR images effectively monitors ocean conditions and manages ocean resources [3].
Traditional SAR ship detection methods are mainly based on statistical features [4–6], scattering characteristics [7], and the transform domain [8], while the most typical method is the constant false alarm rate (CFAR) [9–12]. However, its feature extraction process is complex, and it is sensitive to speckle noise and complex backgrounds, so it cannot meet the requirements of high-precision, lightweight, and real-time SAR ship detection.
With the emergence and vigorous development of CNNs, object detection algorithms based on deep learning have become the mainstream methods of image processing. The algorithms fall into two main categories: one-stage [13–16] and two-stage [17–20] methods. The one-stage algorithms, including Yolo [13], SSD [14], RetinaNet [15], etc., produce the detection results directly from the input images. The two-stage algorithms, such as Faster R-CNN [18], Mask R-CNN [19], and Cascade R-CNN [20], divide target detection into two stages: region proposal generation and classification. In recent years, with the release of SAR datasets such as SSDD [21], HRSID [22], and OpenSARShip [23], SAR target
detection based on deep learning has developed towards high precision, high speed, and
lightweight. The two-stage methods generally possess higher accuracy. Y. Li et al. [24]
designed a feature relay amplification and multiscale feature jump connection structure.
They proposed a lightweight network based on Faster R-CNN [18] to improve the detection
speed while ensuring detection accuracy. Cui et al. [25] densely connected CBAM [26] to
each concatenated feature map from top to bottom of the pyramid network to obtain abun-
dant features containing resolution and semantic information and improve the detection
accuracy of multiscale ships. The one-stage detectors reduce the amount of calculation
through various methods to improve detection efficiency. H. Wan et al. [27] designed a
lightweight backbone network based on Yolox [28], highlighting SAR targets’ unique strong
scattering characteristics by integrating channels and spatial attention mechanisms and
improving detection accuracy and speed. Qi et al. [29] introduced weak segmentation and
attention mechanism [30] into a single-stage detector to obtain richer semantic informa-
tion. Mao et al. [31] proposed a simple detector based on U-Net [32], but the detection
performance was unsatisfactory. These methods are competitive in terms of accuracy
and light weight. However, they process SAR images as optical images and ignore the differences between them. SAR images are produced in microwave/millimeter-wave bands, unlike common three-channel (RGB or HSV) images acquired by sensors in the visible and partial infrared bands. These differences make CNN-based methods not fully applicable to SAR image processing. Moreover, because of the basic principle of coherent imaging, SAR images contain a large amount of speckle noise, which results in a low signal-to-noise ratio.
In SAR images, speckle noise can easily lead to false detection and target missing.
Traditional image denoising mainly includes spatial domain-based methods and frequency
domain-based methods. L. Liu et al. [33] used morphological filtering for SAR image
denoising to preprocess image change detection. R. Farhadiani et al. [34] used a maximum
a posteriori (MAP) estimator to denoise in the complex wavelet domain and utilized the
local pixel group filtering based on the non-local principal component analysis (LPG-PCA)
method to smooth the homogeneous areas and enhance the details. Through feature
extraction of CNNs, SAR image denoising methods based on deep learning extract the
effective features and obtain the noise and details in the image. J. Zhang et al. [35] performed
SAR image denoising by a multiconnection network that incorporates wavelet features
while maintaining the texture structure of the image. S. Liu et al. [36] extracted shallow
features from noisy images by designing kernels of different sizes to form multiscale
modules. Following this, the shallow features are mapped to the residual dense dual-
attention network to obtain the deep features of SAR images, and the final denoised images
are generated through global residual learning. However, most of the above methods treat SAR image denoising only as preprocessing and do not realize end-to-end image processing. Moreover, in traditional denoising methods, the noise suppression thresholds are often set empirically, which is vulnerable to subjective influence and prevents adaptation to each image, each network, and the target detection task.
This article proposes an integrated model of the transform domain denoising method
and target detection network, LPDNet, to solve the above problems. The main contributions
can be summarized as follows:
(1) To reduce speckle noise and improve detection performance, this article proposes a
novel intelligent denoising method based on Laplacian pyramid denoising and CNN.
Through CNN-based threshold suppression, the multi-level Laplacian denoising
becomes adaptive to every SAR image via back-propagation based on the subsequent
detection network, which makes the denoising processing supervised.
(2) This article proposes to perform channel modeling on feature maps composed of spatial sub-bands, denoised maps, and original maps. It strengthens the balance between the contributions of spatial and frequency features, targets, and noise, thereby effectively combining the multi-dimensional information (a rough illustration follows this list).
(3) In order to further improve the accuracy of the model, the CBAM is introduced to the
feature fusion module of the basic framework. The CBAM is flexible and lightweight,
which avoids a lot of computational overhead, and the accuracy is compensated.
(4) In this article, supervised thinking in deep learning is combined with traditional
image decomposition and enhancement methods. It provides a new supervised
denoising method for SAR target detection. The method can be embedded in SAR
image processing as a preprocessing process.
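As a rough illustration of contribution (2), and assuming that channel modeling amounts to stacking the original image, the denoised image, and a spatial (high-frequency) sub-band as input channels, a minimal PyTorch sketch might look as follows; the shapes and variable names are hypothetical, not taken from the authors' code:

```python
import torch

# Hypothetical inputs, all resampled to the detector's input resolution.
original = torch.rand(1, 1, 416, 416)    # original SAR image
denoised = torch.rand(1, 1, 416, 416)    # output of the Laplacian denoising module
high_freq = torch.rand(1, 1, 416, 416)   # a high-frequency (spatial) sub-band

# Stack along the channel axis so the detector sees spatial-domain and
# frequency-domain information side by side, like a three-channel image.
fused = torch.cat([original, denoised, high_freq], dim=1)
print(fused.shape)  # torch.Size([1, 3, 416, 416])
```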
The feasibility and effectiveness of LPDNet are verified on the SAR ship detection data
set (SSDD), AIR SARShip-1.0 [37], and HRSID. The method is compared with other CNN-
based algorithms and evaluated with indicators such as mean average precision (mAP), precision, recall, F1-score, frames per second (FPS), and the number of parameters. The experimental
results prove its accuracy and efficiency.
The remainder of this article is organized as follows: Section 2 introduces our proposed
target detection method, Section 3 presents experiments and results analysis, and Section 4
concludes the paper.
2. Methodology
In this section, the multi-level Laplacian pyramid denoising network is developed,
and details of the implementation procedures are presented. First, an overview of LPDNet
is introduced. Then, the structure of multi-level Laplacian pyramid denoising is developed.
Finally, CBAM is introduced into the model to enhance the representation ability.
Figure 1. The overall structure of LPDNet (LPD: Laplacian pyramid decomposition; TH: CNN-based threshold denoising; LPR: Laplacian pyramid reconstruction; C: CBAM; Concat: concatenation). Three levels of decomposition produce the sub-bands $I_{H1}$–$I_{H3}$ and $I_{L1}$–$I_{L3}$, the denoising thresholds are trained through back-propagation, and the fused features feed three Yolo heads.
$$O_i = \mathrm{MaxPool}(f_i * I_H + b_i) \qquad (1)$$
where $O_i$ is the $i$th feature channel of the convolutional layer output, $\mathrm{MaxPool}(\cdot)$ is the maximum pooling function, $f_i$ represents the convolution kernel corresponding to the $i$th input feature channel, $I_H$ is the high-frequency sub-band image input to the convolution layer, and $b_i$ represents the bias of the convolutional layer. The fully connected layer takes the output of the convolutional layer, $O_i$, as its input. The output $D$ is a tensor, $D \in \mathbb{R}^{1\times 1\times 1}$:
$$D = \mathrm{ReLU}(WO + b) \qquad (2)$$
where $\mathrm{ReLU}(\cdot)$ is the ReLU activation function, defined as follows:
$$\mathrm{ReLU}(x) = \begin{cases} x, & x > 0 \\ 0, & x \le 0 \end{cases} \qquad (3)$$
The threshold $D$ is applied to the high-frequency coefficients through a hard-threshold function, $\hat{I}_H(x,y) = I_H(x,y)$ if $|I_H(x,y)| \ge D$ and $\hat{I}_H(x,y) = 0$ otherwise, where $I_H(x,y)$ and $\hat{I}_H(x,y)$ represent the high-frequency sub-band coefficients before and after the threshold suppression, respectively.
Finally, the Laplacian pyramid reconstruction is encapsulated into an image reconstruc-
tion layer. The low-frequency sub-band image and the denoised high-frequency sub-band
image are put into the layer to obtain the denoised image.
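To make the data flow concrete, the following is a minimal PyTorch sketch of the single-level module under the assumptions above: a 1 × 1 convolution with global max pooling (Equation (1)), a fully connected layer with ReLU producing the scalar threshold $D$ (Equations (2) and (3)), and a hard threshold applied to the high-frequency sub-band. The class and parameter names are illustrative rather than the authors' implementation; note that a hard threshold passes no gradient to $D$, so a soft-threshold surrogate is a common trainable alternative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ThresholdDenoise(nn.Module):
    """Sketch of CNN-based threshold suppression for one high-frequency sub-band."""

    def __init__(self, channels: int = 1, hidden: int = 16):
        super().__init__()
        self.conv = nn.Conv2d(channels, hidden, kernel_size=1)  # Eq. (1): f_i * I_H + b_i
        self.fc = nn.Linear(hidden, 1)                          # Eq. (2): D = ReLU(WO + b)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        o = F.adaptive_max_pool2d(self.conv(x), 1).flatten(1)   # global MaxPool -> O
        d = F.relu(self.fc(o)).view(-1, 1, 1, 1)                # one threshold D per image
        # Hard threshold: keep coefficients with |x| >= D, zero the rest.
        # A differentiable alternative is soft thresholding:
        #   torch.sign(x) * F.relu(x.abs() - d)
        return torch.where(x.abs() >= d, x, torch.zeros_like(x))

sub_band = torch.randn(2, 1, 128, 128)      # toy batch of high-frequency sub-bands
print(ThresholdDenoise()(sub_band).shape)   # torch.Size([2, 1, 128, 128])
```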
Figure 2. The Laplacian pyramid denoising model (input image → LPD → 1 × 1 Conv → fully connected layer → hard threshold → denoised high-frequency sub-band).
The Laplacian denoising module based on the CNN can be embedded in any image processing network as a preprocessing step. The convolution layer and the fully connected layer can be trained by the loss function through back-propagation, and the denoising threshold can be calculated from the global characteristics of each image to achieve intelligent denoising.
2.2.2. Multi-Level Laplacian Pyramid Denoising Model
A multi-level Laplacian pyramid denoising model is designed based on the single-level denoising structure proposed in Section 2.2.1. The model contains three layers, as shown in the green box in Figure 1. Firstly, the high-frequency sub-band $I_{H1}$ and low-frequency sub-band $I_{L1}$ are obtained through the first layer of Laplacian pyramid decomposition. By decomposing the first and second low-frequency sub-bands, $I_{H2}$, $I_{L2}$ and $I_{H3}$, $I_{L3}$ can be obtained, respectively. Secondly, the CNN-based threshold suppression module is constructed for each level's high-frequency sub-bands to obtain the sub-bands after noise suppression. Then, the denoised images are obtained by Laplacian pyramid reconstruction, as shown in Figure 3. The images in the first row are the originals in the SSDD, and the ones in the second row are the corresponding denoised images. It is obvious that the speckle noise in the denoised images is weaker than that in the original images. However, the hard threshold function may cause image distortion, including the ringing effect and the pseudo-Gibbs phenomenon.
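For reference, below is a small PyTorch sketch of a three-level Laplacian pyramid decomposition and reconstruction under common conventions (5 × 5 Gaussian smoothing, downsampling by a factor of 2); it illustrates the transform itself and is not the authors' implementation. Each high-frequency sub-band returned by `laplacian_decompose` would pass through the threshold module before reconstruction.

```python
import torch
import torch.nn.functional as F

def _blur(img: torch.Tensor) -> torch.Tensor:
    """Depthwise 5x5 Gaussian smoothing, the kernel commonly used in Laplacian pyramids."""
    k1d = torch.tensor([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    k2d = torch.outer(k1d, k1d).expand(img.shape[1], 1, 5, 5)
    return F.conv2d(img, k2d.to(img), padding=2, groups=img.shape[1])

def laplacian_decompose(img: torch.Tensor, levels: int = 3):
    """Each level yields a high-frequency sub-band I_Hk and a low-frequency sub-band I_Lk."""
    highs, low = [], img
    for _ in range(levels):
        down = F.avg_pool2d(_blur(low), 2)  # next low-frequency sub-band
        up = F.interpolate(down, scale_factor=2, mode="bilinear", align_corners=False)
        highs.append(low - up)              # high-frequency residual at this level
        low = down
    return highs, low

def laplacian_reconstruct(highs, low):
    """Inverse transform: upsample and add back the (denoised) high-frequency sub-bands."""
    for high in reversed(highs):
        low = F.interpolate(low, scale_factor=2, mode="bilinear", align_corners=False) + high
    return low

x = torch.rand(1, 1, 256, 256)
hs, low = laplacian_decompose(x)
print(torch.allclose(laplacian_reconstruct(hs, low), x, atol=1e-5))  # True: lossless round trip
```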
2.2.3. CBAM Yolox-Tiny
The Yolox networks, which are more accurate and faster than other detection algorithms, are derived from Yolo v3. Considering the accuracy, speed, and lightweight requirements of ship detection, this article uses Yolox-tiny as the basic network.
Figure 3. Original images and denoised images in the SSDD: (a) original images; (b) denoised images.
Figure 4. Structure of CBAM.
By multiplying the attention maps with the feature map and adapting the features, the essential features are emphasized and unnecessary features are suppressed, so that the representation ability is enhanced.
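For reference, here is a minimal PyTorch sketch of a CBAM block consistent with the description above (channel attention from average- and max-pooled descriptors through a shared MLP, followed by 7 × 7 spatial attention), following Woo et al. [26]; the reduction ratio and kernel size are illustrative defaults:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CBAM(nn.Module):
    """Channel attention followed by spatial attention, as in Woo et al. [26]."""

    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        # Shared MLP applied to global average- and max-pooled channel descriptors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        # 7x7 convolution over the channel-wise mean and max maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = self.mlp(F.adaptive_avg_pool2d(x, 1))
        mx = self.mlp(F.adaptive_max_pool2d(x, 1))
        x = x * torch.sigmoid(avg + mx)                       # channel attention weights
        s = torch.cat([x.mean(1, keepdim=True), x.max(1, keepdim=True)[0]], dim=1)
        return x * torch.sigmoid(self.spatial(s))             # spatial attention weights

feat = torch.rand(1, 64, 52, 52)   # toy feature map from the fusion module
print(CBAM(64)(feat).shape)        # torch.Size([1, 64, 52, 52])
```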
CBAM is introduced into the feature fusion layer of the Yolox-tiny network, as shown in Figure 5. It is placed on the two branches of the network. Different weights are given to each channel and pixel of the feature map, which provides adequate information for feature extraction. At the same time, the invalid information is suppressed, thus the detection accuracy is further improved.
Figure 5. CBAM Yolox-tiny structure (CBS = Conv + BN + SiLU).
3. Experiments
In this section, the experiments are discussed and analyzed to verify the effectiveness of the proposed method through experiments on the SSDD, AIR SARShip-1.0, and HRSID. The experiments examine the effectiveness of each part of the network and compare it with other networks. All models are implemented using the Pytorch framework under a Linux system with NVIDIA Tesla K80 GPU support.
3.1. Dataset
To verify the model's effectiveness, experiments are carried out on SSDD, AIR SARShip-1.0, and HRSID. The SSDD dataset contains 1160 SAR images, each with a size of about 500 × 500. These images have resolutions of 1–15 m, with different polarization modes, including HH, HV, VH, and VV. There are 2358 ships with different scales and materials. The AIR SARShip-1.0 contains 31 large-scene images with a size of 3000 × 3000 and nearly a thousand ships of different types, with image resolutions of 1 m and 3 m and a single polarization mode. The HRSID includes 5604 images and 16,951 ships with different polarization modes and range resolutions from 1 m to 5 m. In the experiments, we randomly resize the images to 416 × 416, 416 × 416, and 800 × 800, respectively. All datasets contain sea surfaces with different sea conditions and complex scenes, including inshore and offshore. For the convenience of the experiments, the datasets are divided into a training set, a test set, and a validation set according to the ratio of 7:2:1.
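A minimal sketch of the 7:2:1 split using PyTorch utilities, with a toy tensor dataset standing in for the real SAR data (the image count mirrors SSDD; the seed and shapes are illustrative):

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Toy stand-in for a SAR dataset: 1160 single-channel 416x416 images.
dataset = TensorDataset(torch.rand(1160, 1, 416, 416))
n = len(dataset)
n_train, n_test = int(0.7 * n), int(0.2 * n)
train_set, test_set, val_set = random_split(
    dataset, [n_train, n_test, n - n_train - n_test],
    generator=torch.Generator().manual_seed(0),  # fixed seed for a reproducible split
)
print(len(train_set), len(test_set), len(val_set))  # 812 232 116
```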
$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (7)$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (8)$$
where TP denotes ships detected correctly, FP denotes false alarms, and FN denotes missed detections.
AP and F1 measure the balance between precision and recall:
$$AP = \int_0^1 P(R)\,\mathrm{d}R \qquad (9)$$
$$F1 = \frac{2 \cdot P \cdot R}{P + R} \qquad (10)$$
where $P$ represents precision and $R$ represents recall; $AP$ computes the average value of precision over the interval from recall = 0 to 1, and $F1$ is the harmonic mean of precision and recall. Both are comprehensive indicators combining precision and recall, and the higher the value, the better the detection effect.
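The following sketch evaluates these indicators numerically; `detection_metrics` follows Equations (7), (8), and (10) directly, and `average_precision` approximates the integral in Equation (9) by trapezoidal summation over sampled $(R, P)$ points (the sample values below are made up for illustration):

```python
import numpy as np

def detection_metrics(tp: int, fp: int, fn: int):
    """Precision, recall, and F1 from detection counts (Eqs. (7), (8), (10))."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def average_precision(recalls: np.ndarray, precisions: np.ndarray) -> float:
    """AP as the area under the precision-recall curve (Eq. (9)), via trapezoids."""
    order = np.argsort(recalls)
    r, p = recalls[order], precisions[order]
    return float(np.sum(np.diff(r) * (p[1:] + p[:-1]) / 2))

print(detection_metrics(tp=90, fp=10, fn=15))        # (0.900, 0.857..., 0.878...)
print(average_precision(np.array([0.2, 0.6, 0.9]),   # three sampled PR points
                        np.array([1.0, 0.9, 0.7])))  # 0.62
```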
the CBAM makes the network pay more attention to the target features. The detection of
densely arranged inshore ships and parked ships in the port is more accurate. The detection
results in Table 2 further demonstrate the effectiveness of each module.
Table 2. The results of ablation experiments.
3.4.2. Analysis of Detection Results
Figure 10. Precision-recall (PR) curves of different algorithms on the AIR SARShip-1.0.
3.6. Migration Ability
To test the migration ability of LPDNet, we performed an experiment on the HRSID. We divided the entire dataset according to the dataset division of the original paper. Table 5 shows the detection results on the HRSID. The proposed method can achieve state-of-the-art results, with AP, F1, recall, and precision values of 93.09%, 0.89, 85.99%, and 93%, respectively. The results show that the proposed method has a strong migration ability.
4. Conclusions
In this article, a new lightweight SAR ship detector, LPDNet, is proposed. Based on
the multiresolution characteristics of Laplacian pyramid transformation, a CNN-based
method is used to obtain the denoising threshold from the global characteristics of each
image in a supervised way to achieve intelligent denoising. At the same time, channel
modeling is adopted to make up for the loss of information during denoising, enhance
the contour features, and improve the detection effect. Moreover, CBAM is introduced
into the feature fusion module to enhance the ability of feature representation and further
improve the network performance. Experiments show that the proposed network achieves
a balance between lightweight and detection accuracy and has an excellent detection effect.
In the future, the generalization performance of LPDNet will be studied, including validation on large-scene datasets and the effects of image resolution, polarization mode, and other conditions on target detection. We will lighten the detector without sacrificing
accuracy and obtain better detection results. Further, we will explore the application of the
multi-level Laplacian pyramid denoising model in target recognition, segmentation, and
other fields. In addition, more intelligent denoising methods will be explored. We will focus
on combining traditional denoising methods and CNNs so that the traditional denoising
methods can develop intelligence and embeddability to serve in image interpretation.
Author Contributions: Methodology, C.Z. and J.D.; Data curation, C.Z.; Software, C.Z., H.C. and
C.F.; Writing original draft, C.Z. and J.D.; Validation, X.F. All authors have read and agreed to the
published version of the manuscript.
Funding: This work was supported by the 111 Project of China under Grant B14010.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Yang, R.; Hu, Z.; Liu, Y.; Xu, Z. A Novel Polarimetric SAR Classification Method Integrating Pixel-Based and Patch-Based
Classification. IEEE Geosci. Remote Sens. Lett. 2020, 17, 431–435. [CrossRef]
2. Qin, R.; Fu, X.; Lang, P. PolSAR Image Classification Based on Low-Frequency and Contour Subbands-Driven Polarimetric SENet.
IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 4760–4773. [CrossRef]
3. Brusch, S.; Lehner, S.; Fritz, T.; Soccorsi, M.; Soloviev, A.; van Schie, B. Ship Surveillance with TerraSAR-X. IEEE Trans. Geosci.
Remote Sens. 2011, 49, 1092–1103. [CrossRef]
4. Wei, X.; Wang, X.; Chong, J. Local region power spectrum-based unfocused ship detection method in synthetic aperture radar
images. J. Appl. Remote Sens. 2018, 12, 016026. [CrossRef]
5. Huo, W.; Huang, Y.; Pei, J.; Zhang, Q.; Gu, Q.; Yang, J. Ship detection from ocean SAR image based on local contrast variance
weighted information entropy. Sensors 2018, 18, 1196. [CrossRef]
6. Yang, M.; Guo, C. Ship Detection in SAR Images Based on Lognormal ρ-metric. IEEE Geosci. Remote Sens. Lett. 2018, 15, 1372–1376.
[CrossRef]
7. Eldhuset, K. An automatic ship and ship wake detection system for spaceborne SAR images in coastal regions. IEEE Trans. Geosci.
Remote Sens. 1996, 34, 1010–1019. [CrossRef]
8. Shi, H.; Zhang, Q.; Bian, M.; Wang, H.; Wang, Z.; Chen, L.; Yang, J. A novel ship detection method based on gradient and integral
feature for single-polarization synthetic aperture radar imagery. Sensors 2018, 18, 563. [CrossRef]
9. Schwegmann, C.P.; Kleynhans, W.; Salmon, B.P. Manifold adaptation for constant false alarm rate ship detection in South African
oceans. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 3329–3337. [CrossRef]
10. Pappas, O.; Achim, A.; Bull, D. Superpixel-Level CFAR Detectors for Ship Detection in SAR Imagery. IEEE Geosci. Remote Sens.
Lett. 2018, 15, 1397–1401. [CrossRef]
11. Li, T.; Liu, Z.; Xie, R.; Ran, L. An Improved Superpixel-Level CFAR Detection Method for Ship Targets in High-Resolution SAR
Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 184–194. [CrossRef]
12. Ao, W.; Xu, F.; Li, Y.; Wang, H. Detection and Discrimination of Ship Targets in Complex Background from Spaceborne ALOS-2
SAR Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 536–550. [CrossRef]
13. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
14. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of
the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37.
15. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell.
2020, 42, 318–327. [CrossRef]
16. Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: Fully Convolutional One-Stage Object Detection. In Proceedings of the 2019 IEEE/CVF
International Conference on Computer Vision, Seoul, Republic of Korea, 27–28 October 2019. [CrossRef]
17. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014;
pp. 580–587. [CrossRef]
18. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE
Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [CrossRef] [PubMed]
19. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 386–397. [CrossRef]
20. Cai, Z.; Vasconcelos, N. Cascade R-CNN: Delving into High Quality Object Detection. In Proceedings of the 2018 IEEE/CVF
Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6154–6162. [CrossRef]
21. Li, J.; Qu, C.; Shao, J. Ship detection in SAR images based on an improved faster R-CNN. In Proceedings of the 2017 SAR in Big
Data Era: Models, Methods and Applications (BIGSARDATA), Beijing, China, 13–14 November 2017; pp. 1–6. [CrossRef]
22. Wei, S.; Zeng, X.; Qu, Q.; Wang, M.; Su, H.; Shi, J. HRSID: A high-resolution SAR images dataset for ship detection and instance
segmentation. IEEE Access 2020, 8, 120234–120254. [CrossRef]
23. Huang, L.; Liu, B.; Li, B.; Guo, W.; Yu, W.; Zhang, Z.; Yu, W. OpenSARShip: A Dataset Dedicated to Sentinel-1 Ship Interpretation.
IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 195–208. [CrossRef]
24. Li, Y.; Zhang, S.; Wang, W.-Q. A Lightweight Faster R-CNN for Ship Detection in SAR Images. IEEE Geosci. Remote Sens. Lett.
2022, 19, 4006105. [CrossRef]
25. Cui, Z.; Li, Q.; Cao, Z.; Liu, N. Dense Attention Pyramid Networks for Multi-Scale Ship Detection in SAR Images. IEEE Trans.
Geosci. Remote Sens. 2019, 57, 8983–8997. [CrossRef]
26. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference
on Computer Vision, Munich, Germany, 8–14 September 2018. [CrossRef]
27. Wan, H.Y.; Chen, J.; Huang, Z.X.; Xia, R.F.; Wu, B.C.; Sun, L.; Yao, B.D.; Liu, X.P.; Xing, M.D. AFSar: An Anchor-Free SAR
Target Detection Algorithm Based on Multiscale Enhancement Representation Learning. IEEE Trans. Geosci. Remote Sens. 2022,
60, 5219514. [CrossRef]
28. Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO Series in 2021. arXiv 2021, arXiv:2107.08430v2.
29. Qi, X.; Lang, P.; Fu, X.; Qin, R.; Dong, J.; Liu, C. A Regional Attention-based Detector for SAR Ship Detection. Remote Sens. Lett.
2022, 13, 55–64. [CrossRef]
30. Shiqi, C.; Ronghui, Z.; Jun, Z. Regional Attention-based Single Shot Detector for SAR Ship Detection. J. Eng. 2019, 2019, 7381–7384.
[CrossRef]
31. Mao, Y.; Yang, Y.; Ma, Z.; Li, M.; Su, H.; Zhang, J. Efficient low-cost ship detection for SAR imagery based on simplified U-Net.
IEEE Access 2020, 8, 69742–69753. [CrossRef]
32. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. arXiv 2015,
arXiv:1505.04597.
33. Liu, L.; Jia, Z.; Yang, J.; Kasabov, N.K. SAR Image Change Detection Based on Mathematical Morphology and the K-Means
Clustering Algorithm. IEEE Access 2019, 7, 43970–43978. [CrossRef]
34. Farhadiani, R.; Homayouni, S.; Safari, A. Hybrid SAR Speckle Reduction Using Complex Wavelet Shrinkage and Non-Local
PCA-Based Filtering. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 1489–1496. [CrossRef]
35. Zhang, J.; Li, W.; Li, Y. SAR Image Despeckling Using Multiconnection Network Incorporating Wavelet Features. IEEE Geosci.
Remote Sens. Lett. 2020, 17, 1363–1367. [CrossRef]
36. Liu, S.; Lei, Y.; Zhang, L.; Li, B.; Hu, W.; Zhang, Y.-D. MRDDANet: A Multiscale Residual Dense Dual Attention Network for SAR
Image Denoising. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5214213. [CrossRef]
37. Sun, X.; Wang, Z.; Sun, Y.; Diao, W.; Zhang, Y.; Fu, K. AIR-SARShip-1.0: High-resolution SAR ship detection dataset. J. Radars
2019, 8, 852–862. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.