Multi-UAV Networks: Special Issue Reprint
Multi-UAV Networks
Edited by
Zhihong Liu, Shihao Yan, Yirui Cong and Kehao Wang
mdpi.com/journal/drones
Editors
Zhihong Liu
Shihao Yan
Yirui Cong
Kehao Wang
Kehao Wang
School of Information
Engineering, Wuhan
University of Technology
Wuhan, China
Editorial Office
MDPI
St. Alban-Anlage 66
4052 Basel, Switzerland
This is a reprint of articles from the Special Issue published online in the open access journal
Drones (ISSN 2504-446X) (available at: https://www.mdpi.com/journal/drones/special_issues/32K8N7DLNM).
For citation purposes, cite each article independently as indicated on the article page online and as
indicated below:
Lastname, A.A.; Lastname, B.B. Article Title. Journal Name Year, Volume Number, Page Range.
© 2024 by the authors. Articles in this book are Open Access and distributed under the Creative
Commons Attribution (CC BY) license. The book as a whole is distributed by MDPI under the terms
and conditions of the Creative Commons Attribution-NonCommercial-NoDerivs (CC BY-NC-ND)
license.
Contents
Xinkai Xu, Shuaihe Zhao, Cheng Xu, Zhuang Wang, Ying Zheng, Xu Qia, et al.
Intelligent Mining Road Object Detection Based on Multiscale Feature Fusion in Multi-UAV
Networks
Reprinted from: Drones 2023, 7, 250, doi:10.3390/drones7040250
Jianjun Gui, Tianyou Yu, Baosong Deng, Xiaozhou Zhu and Wen Yao
Decentralized Multi-UAV Cooperative Exploration Using Dynamic Centroid-Based Area
Partition
Reprinted from: Drones 2023, 7, 337, doi:10.3390/drones7060337
Yifan Li, Feng Shu, Jinsong Hu, Shihao Yan, Haiwei Song, Weiqiang Zhu, et al.
Machine Learning Methods for Inferring the Number of UAV Emitters via Massive MIMO
Receive Array
Reprinted from: Drones 2023, 7, 256, doi:10.3390/drones7040256
Liangbin Zhu, Cheng Ma, Jinglei Li, Yue Lu and Qinghai Yang
Connectivity-Maintenance UAV Formation Control in Complex Environment
Reprinted from: Drones 2023, 7, 229, doi:10.3390/drones7040229
Taiqi Wang, Shuaihe Zhao, Yuanqing Xia, Zhenhua Pan and Hanwen Tian
Consensus Control of Large-Scale UAV Swarm Based on Multi-Layer Graph
Reprinted from: Drones 2022, 6, 402, doi:10.3390/drones6120402
Tong Shen, Guiyang Xia, Jingjing Ye, Lichuan Gu, Xiaobo Zhou and Feng Shu
UAV Deployment Optimization for Secure Precise Wireless Transmission
Reprinted from: Drones 2023, 7, 224, doi:10.3390/drones7040224
Kehao Wang, Jiangwei Xu, Xiaobai Li, Pei Liu, Hui Cao, Kezhong Liu, et al.
Joint Trajectory Planning, Time and Power Allocation to Maximize Throughput in UAV
Network
Reprinted from: Drones 2023, 7, 68, doi:10.3390/drones7020068
Kehao Wang, Xun Zhang, Xuyang Qiao, Xiaobai Li, Wei Cheng, Yirui Cong, et al.
Adjustable Fully Adaptive Cross-Entropy Algorithms for Task Assignment of Multi-UAVs
Reprinted from: Drones 2023, 7, 204, doi:10.3390/drones7030204
Hanqiang Deng, Jian Huang, Quan Liu, Tuo Zhao, Cong Zhou and Jialong Gao
A Distributed Collaborative Allocation Method of Reconnaissance and Strike Tasks for
Heterogeneous UAVs
Reprinted from: Drones 2023, 7, 138, doi:10.3390/drones7020138
Yangang Wang, Hai Wang, Xianglin Wei, Kuang Zhao, Jianhua Fan, Juan Chen, et al.
Service Function Chain Scheduling in Heterogeneous Multi-UAV Edge Computing
Reprinted from: Drones 2023, 7, 132, doi:10.3390/drones7020132
About the Editors
Zhihong Liu
Zhihong Liu received a Ph.D. degree in computer science from the National University of
Defense Technology (NUDT), in 2016. He was a visiting student with the David R. Cheriton School
of Computer Science, University of Waterloo, Waterloo, ON, Canada, from 2013 to 2015. He is
currently an Associate Professor with the College of Intelligence Science and Technology, NUDT. He
has authored or co-authored more than 50 publications in peer-reviewed journals and international
conferences. His research interests include learning-based robotic control, UAV swarming, and
reinforcement learning. He is a regular reviewer for several prominent journals and conferences.
Shihao Yan
Shihao Yan received a Ph.D. degree in Electrical Engineering from the University of New South
Wales (UNSW), Sydney, Australia, in 2015. He received a B.S. in Communication Engineering and an
M.S. in Communication and Information Systems from Shandong University, Jinan, China, in 2009
and 2012, respectively. He was a Postdoctoral Research Fellow in the Australian National University,
a University Research Fellow in Macquarie University, and a Senior Research Associate in the School
of Electrical Engineering and Telecommunications, UNSW, Sydney, Australia. He is currently a
Senior Lecturer in the School of Science, Edith Cowan University (ECU), Perth, Australia. He is
also the Theme Lead of Emerging Technologies for Cybersecurity in the Security Research Institute
(SRI) at ECU. He was a Technical Co-Chair and Panel Member of a number of IEEE conferences
and workshops, including the IEEE GlobeCOM 2018 Workshop on Trusted Communications
with Physical Layer Security and IEEE VTC 2017 Spring Workshop on Positioning Solutions for
Cooperative ITS. He was also awarded the Endeavour Research Fellowship by the Department of
Education, Australia. His current research interests are in the areas of signal processing for wireless
communication security and privacy, including covert communications, covert sensing, location
spoofing detection, physical layer security, IRS-aided wireless communications, and UAV-aided
communications.
Yirui Cong
Yirui Cong is an associate professor with the National University of Defense Technology,
Changsha, China. He received a Ph.D. degree from the Australian National University in 2018. His
research interests include distributed control and filtering theory, multi-UAV cooperative localization,
set-membership filtering theory and applications, and networked control under communication
constraints. He has published over 20 papers in top journals such as IEEE Transactions on Automatic
Control and IEEE Transactions on Wireless Communications.
Kehao Wang
Kehao Wang is a professor with Wuhan University of Technology, Wuhan, China. He received a
Ph.D. degree from the Department of Computer Science, the University of Paris-Sud XI, Orsay, France,
in 2012. From February 2013 to August 2013, he was a postdoc with the Hong Kong Polytechnic
University. From December 2015 to December 2017, he was a visiting scholar in the Laboratory for
Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA, USA. His
research interests include stochastic optimization, operations research, scheduling, wireless network
communications, and embedded operating systems. He has published over 80 papers in top journals
such as IEEE Transactions on Signal Processing and IEEE Transactions on Communications.
Article
Object Detection in Drone Video with Temporal Attention
Gated Recurrent Unit Based on Transformer
Zihao Zhou 1 , Xianguo Yu 2, * and Xiangcheng Chen 3
Abstract: Unmanned aerial vehicle (UAV)-based object detection plays a pivotal role in civil and
military fields. Unfortunately, the problem is more challenging than general visual object detection
due to the significant appearance deterioration in images captured by drones. Considering that video
contains more abundant visual features and motion information, a better idea for UAV-based image
object detection is to enhance target appearance in the reference frame by aggregating the features of
neighboring frames. However, simple feature aggregation methods frequently introduce background
interference into targets. To solve this problem, we propose a more effective module,
termed Temporal Attention Gated Recurrent Unit (TA-GRU), to extract effective temporal information
based on recurrent neural networks and transformers. TA-GRU works as an add-on module that turns
existing static object detectors into high-performance video object detectors with negligible extra
computational cost. To validate the efficacy of our module, we selected YOLOv7 as the baseline and
carried out comprehensive experiments on the VisDrone2019-VID dataset. Our TA-GRU enabled
YOLOv7 not only to boost the detection accuracy by 5.86% in mean average precision (mAP) on
the challenging VisDrone dataset, but also to reach a running speed of 24 frames per second (fps).
Keywords: drone video object detection; deformable transformer; recurrent neural network;
feature aggregation
2. Related Work
Image-based Object Detection: Image-based detectors can be categorized broadly
into two groups: two-stage detectors and one-stage detectors. Two-stage detectors first
generate region proposals and then refine and classify them. Representative methods
in this category include R-CNN [9], Fast R-CNN [12], and Faster R-CNN [13]. While two-stage
detectors tend to be more accurate, they are also slower. One-stage detectors, such as
SSD [10] and RetinaNet [11], are usually faster but less accurate, as they predict
bounding boxes and classes directly from the feature map without a separate proposal stage.
The YOLO series, including YOLOv5 [5], YOLOX [14], and YOLOv7 [6], also belongs to this
group. In our work, we utilized YOLOv7 as the base detector and
extended its capabilities for video object detection.
Video Object Detection: Compared to image object detection, video object detection
provides more comprehensive information about targets, including motion and richer
appearance details. In recent years, researchers have tried to utilize neighboring frame
features to enhance reference frame features. However, the presence of varying offsets in
each frame poses a significant challenge to effectively utilizing these features. Previous
studies attempted to address this issue by aligning the neighboring frame features with the
reference frame features. Alternatively, some methods choose to overlook the offsets in each
frame and instead use specialized modules to extract temporal information from videos.
Feature Aggregation: To address the issue of significant degradation in the visual
quality of drone videos, various previous methods focus on feature aggregation. This
Drones 2023, 7, 466
technique involves enhancing the reference features by combining the features of adjacent
frames. For instance, FGFA [4] and THP [15] utilize the optical flow produced by
FlowNet [14] to model motion relations and align various frames. Alternatively, the
optical-flow-based framework [5] categorizes video images based on the background, acquires
the optical flow of the input sequence using FlowNet [14], and eventually aggregates the
optical flow to model motion relations. Nevertheless, flow-warping-based techniques have
some drawbacks. Firstly, drone videos frequently comprise numerous small objects, which
makes it challenging to extract optical flow accurately. Secondly, obtaining optical flow
demands a considerable amount of computational resources, which
can make real-time detection difficult. In contrast, some other approaches employ
deformable convolution to compute the offsets between different frames. This method
adaptively adjusts the sampling locations of the convolutional kernel using learned
offsets. For instance, STSN [8] stacks six deformable convolutional layers to
gradually aggregate the temporal contexts. TCE-Net [1] takes into account that the
contribution of neighboring frames to the reference frame may differ. To align frames, it uses
a single deformable convolutional layer and a temporal attention module, which assigns
weights to frames based on their respective contributions. However, drone
video object detection presents significant challenges, and with only a single
deformable convolutional layer it is difficult to accurately compute the offset between
neighboring frames and the reference frame. Simply increasing the number of deformable
convolutional layers is not an ideal solution either, because it introduces excessive computation.
Our approach instead uses the GRU module in our TA-GRU to transfer temporal
features and incorporates a temporal context enhanced aggregation module to obtain the
fused features that are then fed to the detection network. This allows us to avoid
aligning every neighboring frame with the reference frame and instead adopt a
frame-by-frame alignment strategy, which not only reduces computation but also enhances
alignment accuracy.
Some recent studies have utilized recurrent neural networks, such as long short-term
memory networks (LSTM), to propagate temporal features that contain previous video
features. STMN [6] and Association LSTM [7] attempt to model object association between
different frames by applying LSTM or its variants. However, the object association modeled
by these methods is often imprecise, particularly in drone videos. Conv-GRU, on the other hand,
replaces the linear operations of the GRU with convolutions, which poses significant
challenges for a module originally designed for processing sequences. TPN [16] adopts
an object-tracking approach that differs from general video object detection methods.
It links multiple frames of the same object to generate a
tube segment, which is then fed into an ED-LSTM network to capture temporal context.
However, this method introduces significant background noise that can compromise the
accuracy of the results. To address this issue, recent research has explored the use of
transformers for video object detection. TransVOD [3] demonstrated that incorporating
self-attention and cross-attention modules can improve the model’s focus on the target
regions. Building on this work, our TA-GRU method aggregates temporal features and
applies deformable attention instead of convolution to enhance performance. We elaborate
on the details of TA-GRU in Section 3.
3. Proposed Method
To enable both high accuracy and high efficiency for UAV-based image object detection,
we propose a new, highly effective video object detection framework termed TA-GRU
YOLOv7. In particular, we designed four modules: the Temporal Attention
Gated Recurrent Unit (TA-GRU) to enhance attention to target features in the current
frame and improve the accuracy of motion information extraction between frames; the Temporal
Deformable Transformer Layer (TDTL) to reduce additional computational overhead
and strengthen the target features; a new deformable alignment module (DeformAlign)
to extract motion information and align features from two frames; as well as a temporal
attention based fusion module (TA-Fusion) to integrate useful information from temporal
features into the current frame feature.
Figure 1. Architecture of TA-GRU. In TA-GRU, the input features $x_t$ interact with the temporal features
$h_{t-1}$ through a temporal processing module (temporal alignment and fusion) to obtain enhanced
features to feed to the detection head. Additionally, the temporal features $h_{t-1}$ are updated by the update
gate features $z_t$, where self_attn is the self-deformable transformer layer and cross_attn is the
cross-deformable transformer layer.
Figure 2. Architecture of Conv-GRU. It is constituted by the reset gate, update gate, and main body
of Conv-GRU.
where $\sigma$ is the Sigmoid activation function, $\tanh$ is the hyperbolic tangent activation function,
$\odot$ is element-wise multiplication, $*$ is convolution, $x_t$ is the input features extracted by the backbone, and
$W_{xz}$, $W_{hz}$, $W_{xr}$, $W_{hr}$, $W_x$, $W_h$ are the 2D convolutional kernels whose parameters are
optimized end-to-end.
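For reference, a commonly used Conv-GRU formulation that matches the symbols defined above is sketched below. It is the textbook form, given here as an assumption rather than a reproduction of the authors' Equation (1); the placement of the reset gate and the roles of $z_t$ and $1 - z_t$ vary between implementations. Here $z_t$ and $r_t$ denote the update and reset gates, $h_{t-1}$ the previous temporal features, $\tilde{h}_t$ the candidate state, $*$ convolution, and $\odot$ element-wise multiplication.

```latex
\begin{aligned}
z_t &= \sigma\left(W_{xz} * x_t + W_{hz} * h_{t-1}\right) \\
r_t &= \sigma\left(W_{xr} * x_t + W_{hr} * h_{t-1}\right) \\
\tilde{h}_t &= \tanh\left(W_x * x_t + W_h * (r_t \odot h_{t-1})\right) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```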
Temporal Attention Gated Recurrent Unit (TA-GRU). We modified the original Conv-GRU
to adapt it to drone video object detection. The overall
Temporal Attention Gated Recurrent Unit (TA-GRU) architecture is shown in Figure 1. We
used it to propagate temporal features so that temporal information is retained more effectively;
we replaced the original convolutional layer with a deformable transformer layer and
added the temporal processing module (temporal alignment and fusion) to aggregate
input and temporal features. The deformable transformer layer enables the model
to focus more effectively on target areas, and it handles temporal inputs better
than traditional convolutional layers, resulting in improved performance.
In the TA-GRU module, the temporal features are propagated
frame by frame between inputs to improve the appearance features of each frame. The final
outputs will be batch inputs. The specific formula is shown in Equation (2):
where $\sigma$ is the Sigmoid activation function, $\tanh$ is the hyperbolic tangent activation function,
$\odot$ is element-wise multiplication, self_attn is self-deformable attention, cross_attn is cross-deformable
attention, Tem_prc denotes temporal processing applied to the temporal features, $x_t$ is the input
features extracted by the backbone, and $W_{xz}$, $W_{hz}$, $W_{xr}$, $W_{hr}$, $W_x$ are the weight matrices of
deformable attention whose parameters are optimized end-to-end.
Temporal Deformable Transformer Layer (TDTL). Drone videos contain many
tiny objects, which introduces considerable background interference. Previous work [17]
has addressed this issue by adding a transformer layer at the neck of the model to enhance
the features extracted from the backbone. However, the general transformer layer [18]
introduces considerable computational overhead. The viewpoint in DETR [19] suggests that
the area most relevant to a target is often its immediate neighborhood. Furthermore, a video
object detector was built using a deformable transformer within TransVOD [3] and attained
satisfactory detection outcomes. Therefore, we utilized a deformable transformer layer to
build our Temporal Deformable Transformer Layer (TDTL). This module makes the
model pay more attention to target areas to improve the features. As shown in Figure 3,
the deformable transformer layer only assigns a small, fixed number of keys for each
query. Given an input feature map $x \in \mathbb{R}^{C \times H \times W}$, let $i$ index a 2D reference point $p_i$. The
deformable attention feature is calculated by Equation (3):
$$\mathrm{DeformAttn}(x, p_i) = \sum_{n=1}^{N} W_n \sum_{k=1}^{K} A_{nik} \, W_n \, x(p_i + \Delta p_{nik}) \qquad (3)$$
where $n$ indexes the attention head, $k$ indexes the sampled keys, $\Delta p_{nik}$ and $A_{nik}$ denote
the sampling offset and attention weight of the $k$th sampling point in the $n$th attention
head, respectively, and the scalar attention weight $A_{nik}$ lies in the range $[0, 1]$, normalized by
$\sum_{k=1}^{K} A_{nik} = 1$.
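The computation in Equation (3) can be illustrated for a single query with a minimal NumPy sketch. It is not the authors' implementation: the function names, the argument layout, and the use of a softmax to normalize the attention weights are assumptions made for illustration. The inner value projection (`w_value`) and the outer output projection (`w_out`) are kept as separate matrices, as in the Deformable DETR formulation, and fractional sampling locations are read with bilinear interpolation.

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Sample feat (C, H, W) at fractional (row y, col x); outside the map counts as zero."""
    C, H, W = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    out = np.zeros(C, dtype=float)
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < H and 0 <= xx < W:
                weight = (1.0 - abs(y - yy)) * (1.0 - abs(x - xx))
                out += weight * feat[:, yy, xx]
    return out

def deform_attn(feat, ref_point, offsets, logits, w_value, w_out):
    """Single-query deformable attention (illustrative).

    feat:      (C, H, W) input feature map x
    ref_point: (row, col) reference point p_i
    offsets:   (N, K, 2) sampling offsets, one per head n and key k
    logits:    (N, K) unnormalized attention scores
    w_value:   (N, Cv, C) per-head value projections (assumed layout)
    w_out:     (N, C_out, Cv) per-head output projections (assumed layout)
    """
    n_heads, n_keys, _ = offsets.shape
    # Softmax over the K keys of each head, so the weights lie in [0, 1] and sum to 1.
    attn = np.exp(logits - logits.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    out = np.zeros(w_out.shape[1], dtype=float)
    for n in range(n_heads):
        head = np.zeros(w_value.shape[1], dtype=float)
        for k in range(n_keys):
            y = ref_point[0] + offsets[n, k, 0]
            x = ref_point[1] + offsets[n, k, 1]
            head += attn[n, k] * (w_value[n] @ bilinear_sample(feat, y, x))
        out += w_out[n] @ head
    return out
```

Because each query touches only N × K sampled locations instead of all H × W positions, the cost per query does not grow with the size of the feature map.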
Figure 3. Architecture of Temporal Deformable Transformer Layer (TDTL). To avoid the heavy computational
overhead of a typical transformer layer, the deformable transformer layer only attends to a
small set of key sampling points around the reference point.
We chose self-deformable attention to improve the attention on target areas of the input
features, then used cross-deformable attention to complete the interaction with temporal
features. By implementing this approach, our model becomes more adept at emphasizing
the features of the current frame during the update of temporal features while also giving
due consideration to the previous temporal features when determining which information
should be preserved. This enhanced flexibility enables our network to focus more precisely
on the specific areas of interest.
DeformAlign. The features of the same object are usually not spatially aligned
across frames due to video motion. Without proper feature alignment before aggregation,
the object detector may generate numerous false recognitions and imprecise localizations.
Therefore, recent works [1,8] have utilized deformable convolution [20] to compute the offsets
caused by movement between frames and thereby align the features of different frames. The
architecture of the DeformAlign module is shown in Figure 4. Unlike the plain deformable
convolution module, we need to model motion across frames, so we used an extra
convolution layer to simply fuse the features of different frames. Then, we used two separate
convolutions to compute the offsets and corresponding weights from the fused
features and utilized the offsets and weights to align neighboring frame features to
the reference frame features. Given the prevalence of small targets in drone imagery, where
the inter-frame motion of these targets may not be substantial, we found that a single layer
of deformable convolution was sufficient to effectively capture their motion information.
Figure 4. Architecture of DeformAlign. We used an extra convolution layer to simply fuse different
features connected by channel. We used bilinear interpolation in a deformable convolution module
to align the features of neighboring frames to the reference frame.
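For context, a regular convolution evaluated at position $p_0$ can be written in the notation of Equation (5) as follows. This form is reconstructed from the symbol definitions that follow it and is an assumption, not a verbatim copy of the authors' Equation (4):

```latex
y(p_0) = \sum_{i=1}^{N} W_{p_i} \cdot x(p_0 + \Delta p_i)
```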
where $N$ is the kernel size, $\Delta p_i \in \{(-1, -1), (-1, 0), \ldots, (1, 1)\}$, and $W_{p_i}$ is the corresponding
weight at $p_0 + \Delta p_i$.
Deformable convolution introduces two additional convolutional layers to adaptively
calculate the offset $\Delta p_n$ and the weight $\Delta w_n$. We can compute the aligned pixel at $p_0$ by
Equation (5):
$$y_{align}(p_0) = \sum_{i=1}^{N} W_{p_i} \cdot x(p_0 + \Delta p_i + \Delta p_n) \cdot \Delta w_n \qquad (5)$$
Because the location $p_0 + \Delta p_i + \Delta p_n$ is generally fractional, the feature value there is obtained by bilinear interpolation.
Temporal Attention and Temporal Fusion Module (TA-Fusion). TCE-Net [1] observes
that different frames contribute differently to the reference frame features.
The goal of temporal attention is to compute frame similarity in an embedding space to
focus on ‘when’ it is important given neighboring frames. Intuitively, at location $p$, if the
aligned features $f_{align}$ are close to the reference features $f_t$, they should be assigned higher
weights. Here, the dot product is used as the similarity metric. Additionally,
temporal fusion is proposed to aggregate features from neighboring frames to model
temporal context. We used a 1 × 1 × C convolutional network to fuse the aligned temporal
features with the features of the current frame. During the training process, the parameters
of the fusion network were adaptively updated, enhancing the efficiency of feature fusion
in our model and improving overall performance.
The weights of temporal attention map are estimated by Equation (6):
where σ is Sigmoid activation function. The architecture of the temporal attention and
temporal fusion module is shown in Figure 5.
Figure 5. Architecture of Temporal Attention and Temporal Fusion module. We used the dot product to
measure the similarity of $f_{align}$ and $f_t$. Then, we used the similarity metric to assign different weights
for each frame. Finally, we chose a 1 × 1 × C convolution layer to aggregate features.
As shown in Figure 5, the temporal attention maps have the same spatial size as $f_t$
and are then multiplied in a pixel-wise manner with the original aligned features $f_{align}$.
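The attention and fusion steps described above can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not the authors' code: the embedding layers applied before the dot product are omitted, the weighted aligned features and the current-frame features are assumed to be concatenated along the channel axis before the 1 × 1 convolution, and the names `temporal_attention_fusion`, `w_fuse`, and `b_fuse` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def temporal_attention_fusion(f_align, f_t, w_fuse, b_fuse):
    """Weight aligned features by their similarity to the reference, then fuse.

    f_align: (C, H, W) aligned temporal features
    f_t:     (C, H, W) features of the current (reference) frame
    w_fuse:  (C, 2C) weights of a 1x1 convolution (assumed fusion layout)
    b_fuse:  (C,) bias of the 1x1 convolution
    """
    # Dot-product similarity over channels at every location -> (H, W).
    similarity = np.sum(f_align * f_t, axis=0)
    # Sigmoid maps the similarity to an attention map with the spatial size of f_t.
    attn = sigmoid(similarity)
    # Pixel-wise multiplication of the attention map with the aligned features.
    weighted = f_align * attn[None, :, :]
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    stacked = np.concatenate([weighted, f_t], axis=0)
    fused = np.einsum("oc,chw->ohw", w_fuse, stacked) + b_fuse[:, None, None]
    return fused, attn
```

Locations where the aligned features agree with the reference receive weights close to 1, while dissimilar locations are suppressed before fusion.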
4. Experiments
4.1. Training Dataset and Details
Training Dataset. We trained and tested on the VisDrone2019-VID dataset [21], which
includes 288 video clips taken by the UAV platform at different angles and heights. All
videos are fully annotated with object bounding boxes, object categories, and tracking IDs.
There are 10 object categories (‘pedestrian’, ‘people’, ‘bicycle’, ‘car’, ‘van’, ‘truck’, ‘tricycle’,
‘awning-tricycle’, ‘bus’, ‘motor’) consisting of 261,908 images, with 24,201 training images,
2846 validation images, and 6635 test images. Unlike other general video object
detection datasets, it contains many tiny objects and severe appearance deterioration.
Thus, we needed a video object detection method that could aggregate extensive tiny object
features to counter the appearance deterioration. Mean average precision (mAP), averaged over
10 IoU thresholds from 0.5 to 0.95, and AP50 were used as the evaluation metrics.
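As a small illustration of the metric, the sketch below computes the IoU of two boxes and the ten thresholds over which mAP is averaged. The evenly spaced step of 0.05 follows the common COCO-style convention and is an assumption here, as are the `(x1, y1, x2, y2)` box format and the function names.

```python
import numpy as np

def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

# Ten thresholds from 0.5 to 0.95 (assumed step of 0.05).
IOU_THRESHOLDS = np.linspace(0.5, 0.95, 10)

def matches_at_thresholds(pred_box, gt_box):
    """For each threshold, report whether the prediction counts as a true positive."""
    iou = box_iou(pred_box, gt_box)
    return [bool(iou >= t) for t in IOU_THRESHOLDS]

def mean_ap(ap_per_threshold):
    """mAP as the mean of the AP values computed at each IoU threshold."""
    return float(np.mean(ap_per_threshold))
```

AP50 corresponds to the AP at the first threshold only, so a detector with loose localization can score well on AP50 while scoring much lower on mAP.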
Implementation Details. We used one NVIDIA RTX3090 GPU for both
training and testing. Additionally, our experiments show that the diversity of the video clips
in VisDrone2019-VID is significantly lower than that of ImageNet-VID. Hence, it
was necessary for us to perform additional data processing on VisDrone2019-VID. Following
the method in TCE-Net [1], we used a temporal stride predictor to select which frames to
aggregate. This predictor takes
the differences between features t and features k, i.e., $(f_t - f_k)$, as input and predicts the
deviation score between frame t and frame k. The deviation score is formally defined as
the motion intersection-over-union (IoU). If IoU < 0.5, the temporal stride is set to 1. If
0.5 < IoU < 0.7, the temporal stride is set to 2. Furthermore, if IoU > 0.7, the temporal stride
is set to 4. Inspired by FGFA [4], we first used VisDrone2019-DET to pretrain our model
with batch_size = 1. We then resumed from the pretrained model weights
to continue training on VisDrone2019-VID. Because the VisDrone2019-VID training set is
relatively small, we only trained the model on the VisDrone2019-VID training set for 70 epochs, and
the first 2 epochs were used for warm-up. We used an SGD optimizer for training and
$5 \times 10^{-4}$ as the initial learning rate with the cosine learning rate schedule. By the last
epoch, the learning rate decays to 0.01 of the initial learning rate. Considering the small
objects in the drone images, we set the input image size to 1280 pixels. The important
parameters of the training process are listed in Table 1.
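The temporal stride rule from the implementation details can be written as a small function. The text does not say how the boundary values 0.5 and 0.7 are handled, so assigning them to the higher bracket is an assumption, and the function name is hypothetical.

```python
def temporal_stride(motion_iou):
    """Map the predicted motion IoU between two frames to a temporal stride.

    A low motion IoU means large motion, so neighboring frames are sampled densely;
    a high motion IoU means little motion, so frames further apart are used.
    """
    if motion_iou < 0.5:
        return 1
    if motion_iou < 0.7:  # boundary handling at exactly 0.5 and 0.7 is assumed
        return 2
    return 4
```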
Table 1. Parameters of the training process.
Parameter                 Setup
Epochs                    70
Batch Size                4
Image Size                1280 × 1280
Initial Learning Rate     $2 \times 10^{-4}$
Final Learning Rate       $2 \times 10^{-6}$
Momentum                  0.937
Weight-Decay              $5 \times 10^{-4}$
Image Scale               0.6
Image Flip Left-Right     0.5
Mosaic                    0
Image Translation         0.2
Image Rotation            0.2
Image Perspective         $2 \times 10^{-5}$
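The learning rate schedule can be sketched as follows, using the initial and final learning rates from the parameter table as defaults. The linear shape of the two warm-up epochs and the exact cosine formula are assumptions, since the text only names a warm-up phase and a cosine schedule that ends at 0.01 of the initial rate.

```python
import math

def learning_rate(epoch, total_epochs=70, warmup_epochs=2,
                  lr_init=2e-4, final_ratio=0.01):
    """Learning rate for a zero-based epoch index: linear warm-up, then cosine decay."""
    if epoch < warmup_epochs:
        # Assumed linear warm-up that reaches lr_init at the end of the warm-up phase.
        return lr_init * (epoch + 1) / warmup_epochs
    # Progress runs from 0 at the first cosine epoch to 1 at the last epoch.
    progress = (epoch - warmup_epochs) / max(1, total_epochs - 1 - warmup_epochs)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    lr_final = lr_init * final_ratio
    return lr_final + (lr_init - lr_final) * cosine
```

With these defaults the rate starts at $2 \times 10^{-4}$ after warm-up and reaches $2 \times 10^{-6}$ in the last epoch, matching the initial and final values in the table.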
Data Analysis. Based on our past experience, it is crucial to analyze the dataset
thoroughly before designing and training a model in order to construct an effective one.
Upon reviewing the VisDrone2019-VID dataset, we observed the presence of numerous
small objects, as well as some appearance deterioration such as partial occlusion, motion blur,
and video defocus. Therefore, there is an urgent need to develop a simple yet effective
VOD framework that can be fully end-to-end.
In Figure 6, there are numerous objects smaller than 4 pixels. While these objects
aid in training our temporal aggregation module, they should not be included in the
computation of the model loss function. Typical methods for handling the ignore regions
in the VisDrone2019-VID dataset replace them with gray squares. However, our
experiments show that this approach can result in a loss of image information, particularly
in UAV images, which is not conducive to training the temporal aggregation module. To
prevent our model from detecting ignore regions while retaining useful training information,
we chose to map the predicted bounding boxes back to the original images and apply an
intersection-over-union (IoU) threshold of 0.7 against the ground truth bounding boxes and ignore regions,
thus excluding the ignore regions from loss calculation. This method proved more
effective than simply replacing the ignore regions with gray squares, resulting in a 0.72%
increase in mean average precision (mAP).
Table 2 shows that TA-GRU YOLOv7 achieves a higher mean average precision
(mAP) than YOLOv7, with an improvement of 5.86% mAP. Moreover, the computational
overhead introduced by our method is small, which provides strong evidence for its
effectiveness. Compared with FGFA (18.33% mAP), TA-GRU YOLOv7 obtains 24.57% mAP,
outperforming it by 6.24%. Furthermore, TA-GRU YOLOv7 only aggregates one temporal
feature and the reference frame feature, whereas FGFA aggregates 21 frames. With a deformable
convolution detector and temporal post-processing, STSN obtains 18.52% mAP, which is
about 6.05% lower than the 24.57% mAP of TA-GRU YOLOv7. The detection
results for some scenes are shown in Figure 7.
Table 3, presented below, illustrates the detection outcomes of our model across
various categories in the VisDrone2019-VID dataset. Our model has achieved outstanding
detection performance across the vast majority of categories.
Table 4. Accuracy and runtime of different methods on VisDrone2019-VID validation. The runtime
contains data processing, which is measured on one NVIDIA RTX3090 GPU.
Method (a) is the single-frame baseline. It has a mAP of 18.71% using YOLOv7.
It outperforms the video detector, FGFA, by 0.38%. This indicates that our baseline is
competitive and serves as a valid reference for evaluation.
Method (b) is a naive feature aggregation approach and a degenerated variant of
TA-GRU YOLOv7, which uses Conv-GRU to aggregate temporal features. The variant
is also trained end-to-end in the same way as TA-GRU YOLOv7. The mAP decreases
to 16.55%, 2.16% shy of baseline (a). This indicates that using traditional feature fusion
networks to directly aggregate complex drone video features can potentially introduce
background interference.
Method (c) adds the DeformAlign module into (b) to align neighboring frame features
to the reference frame features. It obtains a mAP of 20.03%, 1.32% higher than that of
(a) and 3.48% higher than that of (b). This result suggests that when features are aligned to
the same spatial position, it enhances the fusion of effective features in the fusion network.
However, introducing noise remains inevitable.
Method (d) adds the temporal attention and temporal fusion module to (c). It increases
the mAP score from 20.03% to 23.96%. Figure 8 shows that images with distinct appearance
features are assigned varying weights depending on how similar they are to the features of
the reference frame. This also effectively eliminates the impact of noise information from
adjacent frames on the features of the reference frame.
Figure 8. Images with distinct appearance features are assigned varying weights. The weight is
determined by both the distance and similarity to the reference frame.
Method (e) is the proposed TA-GRU YOLOv7 method, which uses deformable attention
to replace the convolution layer in (d). It increases the mAP score from 23.96% to
24.57%. This suggests that deformable attention makes the model pay more attention to target
areas, effectively promoting the information from nearby frames in feature aggregation. The
proposed TA-GRU YOLOv7 method improves the overall mAP score by 5.86% compared
to the single-frame baseline (a).
Method (f) is a degenerated version of (e) without end-to-end training. It takes
the feature and the detection sub-networks from the single-frame baseline (a). During
training, these modules are frozen and only the embedded temporal extraction module is
learned. It is clearly worse than (e), which indicates the importance of end-to-end training in
TA-GRU YOLOv7.
5. Conclusions
This work presents an accurate, simple yet effective VOD framework in a fully
end-to-end manner. Because our approach focuses on improving feature quality, it would be
complementary to existing single-frame detection frameworks for better accuracy in video frames. Our
primary contribution is the integration of recurrent neural networks, transformer layers,
and feature alignment and fusion modules. Ablation experiments show the effectiveness
of our modules. Together, the proposed model not only achieves a 24.57% mAP score
on VisDrone2019-VID, but also reaches a running speed of 24 frames per second (fps).
However, more annotated data and more precise motion estimation may bring further
improvements. Indeed, our module currently lacks proficiency in handling long-term motion
information, and the degradation of appearance characteristics in various objects within
UAV images presents a challenge for our module’s ability to effectively learn temporal
information. Addressing this issue is a key objective for our next stage of development.
Author Contributions: Conceptualization, Z.Z.; methodology, Z.Z.; software, Z.Z.; validation, Z.Z.;
formal analysis, Z.Z. and X.Y.; investigation, Z.Z. and X.Y.; resources, X.Y.; data curation, Z.Z. and X.Y.;
writing—original draft preparation, Z.Z.; writing—review and editing, Z.Z. and X.Y.; visualization,
Z.Z.; supervision, X.C.; project administration, X.Y.; funding acquisition, X.Y. All authors have read
and agreed to the published version of the manuscript.
Funding: This work was supported by the National Natural Science Foundation of China under
Grant 61973309 and the Natural Science Foundation of Hunan Province under Grant 2021JJ20054.
Data Availability Statement: The data presented in this study are openly available in The Vision Meets Drone Object Detection in Video Challenge Results (VisDrone-VID2019) at https://github.com/VisDrone/VisDrone-Dataset, accessed on 29 May 2023.
Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design
of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or
in the decision to publish the results.
References
1. He, F.; Gao, N.; Li, Q.; Du, S.; Zhao, X.; Huang, K. TCE-Net. In Proceedings of the AAAI Conference on Artificial Intelligence,
Hilton, NY, USA, 7–12 February 2020; pp. 10941–10948. [CrossRef]
2. Shi, Y.; Wang, N.; Guo, X. YOLOV: Making Still Image Object Detectors Great at Video Object Detection. In Proceedings of the
AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 2254–2262.
3. Zhou, Q.; Li, X.; He, L.; Yang, Y.; Cheng, G.; Tong, Y.; Ma, L.; Tao, D. TransVOD: End-to-End Video Object Detection with
Spatial-Temporal Transformers. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 7853–7869. [CrossRef] [PubMed]
4. Zhu, X.; Wang, Y.; Dai, J.; Yuan, L.; Wei, Y. Flow-Guided Feature Aggregation for Video Object Detection. In Proceedings of the
2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017. [CrossRef]
5. Fan, L.; Zhang, T.; Du, W. Optical-Flow-Based Framework to Boost Video Object Detection Performance with Object Enhancement.
Expert Syst. Appl. 2020, 170, 114544. [CrossRef]
6. Xiao, F.; Lee, Y.J. Video Object Detection with an Aligned Spatial-Temporal Memory. In Proceedings of the Computer Vision—
ECCV 2018, Munich, Germany, 8–14 September 2018; Lecture Notes in Computer Science. Springer: Cham, Switzerland, 2018;
pp. 494–510. [CrossRef]
7. Lu, Y.; Lu, C.; Tang, C.-K. Online Video Object Detection Using Association LSTM. In Proceedings of the 2017 IEEE International
Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017. [CrossRef]
8. Bertasius, G.; Torresani, L.; Shi, J. Object detection in video with spatiotemporal sampling networks. In Proceedings of the
European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 331–346.
9. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation;
Cornell University: Ithaca, NY, USA, 2013.
10. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings
of the Computer Vision—ECCV 2016, Amsterdam, The Netherlands, 11–14 October 2016; Lecture Notes in Computer Science.
Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37. [CrossRef]
11. Lin, T.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. In Proceedings of the 2017 IEEE
International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017. [CrossRef]
12. Girshick, R. Fast R-CNN. arXiv 2015, arXiv:1504.08083.
13. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.
IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149. [CrossRef] [PubMed]
14. Dosovitskiy, A.; Fischer, P.; Ilg, E.; Hausser, P.; Hazirbas, C.; Golkov, V.; Van Der Smagt, P.; Cremers, D.; Brox, T. FlowNet: Learning
Optical Flow with Convolutional Networks. arXiv 2015, arXiv:1504.06852.
15. Zhu, X.; Dai, J.; Yuan, L.; Wei, Y. Towards high performance video object detection. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7210–7218.
16. Kang, K.; Li, H.; Xiao, T.; Ouyang, W.; Yan, J.; Liu, X.; Wang, X. Object Detection in Videos with Tubelet Proposal Networks. In Pro-
ceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
[CrossRef]
17. Zhu, X.; Lyu, S.; Wang, X.; Zhao, Q. TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object
Detection on Drone-Captured Scenarios. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision
Workshops (ICCVW), Montreal, BC, Canada, 11–17 October 2021. [CrossRef]
18. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using
Shifted Windows. arXiv 2021, arXiv:2103.14030.
19. Zhu, X.; Su, W.; Lu, L.; Li, B.; Wang, X.; Dai, J. Deformable DETR: Deformable Transformers for End-to-End Object Detection.
arXiv 2020, arXiv:2010.04159.
20. Dai, J.; Qi, H.; Xiong, Y.; Li, Y.; Zhang, G.; Hu, H.; Wei, Y. Deformable Convolutional Networks. In Proceedings of the 2017 IEEE
International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017. [CrossRef]
21. Zhu, P.; Du, D.; Wen, L.; Bian, X.; Ling, H.; Hu, Q.; Peng, T.; Zheng, J.; Wang, X.; Zhang, Y.; et al. VisDrone-VID2019: The Vision
Meets Drone Object Detection in Video Challenge Results. In Proceedings of the 2019 IEEE/CVF International Conference on
Computer Vision Workshop (ICCVW), Seoul, Republic of Korea, 27–28 October 2019. [CrossRef]
22. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object
Detectors. arXiv 2022, arXiv:2207.02696.
23. Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO Series in 2021. arXiv 2021, arXiv:2107.08430.
13
Drones 2023, 7, 466
24. Feichtenhofer, C.; Pinz, A.; Zisserman, A. Detect to Track and Track to Detect. In Proceedings of the International Conference on
Computer Vision, Venice, Italy, 22–29 October 2017; ICCV: Paris, France, 2017; pp. 3057–3065.
25. Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings
of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; CVPR:
New Orleans, LA, USA, 2017; pp. 936–944.
26. Law, H.; Deng, J. Cornernet: Detecting Objects as Paired Keypoints. In Proceedings of the 15th European Conference on Computer
Vision, Munich, German, 8–14 September 2018; ECCV: Aurora, CO, USA, 2018; pp. 765–781.
27. Zhou, X.; Wang, D.; Krähenbühl, P. Objects as points. arXiv 2019, arXiv:1904.07850.
28. Zhao, Q.; Sheng, T.; Wang, Y.; Ni, F.; Cai, L. Cfenet: An accurate and efficient single-shot object detector for autonomous driving.
arXiv 2018, arXiv:1806.09790.
Article
Intelligent Mining Road Object Detection Based on Multiscale
Feature Fusion in Multi-UAV Networks
Xinkai Xu 1,2, Shuaihe Zhao 3,4, Cheng Xu 2,*, Zhuang Wang 2, Ying Zheng 1, Xu Qian 1 and Hong Bao 2
1 School of Mechanical Electronic & Information Engineering, China University of Mining &
Technology-Beijing, Beijing 100083, China
2 Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China
3 The School of Automation, Beijing Institute of Technology, Beijing 100081, China
4 Aerospace Shenzhou Aerial Vehicle Ltd., Tianjin 300301, China
* Correspondence: xucheng@buu.edu.cn
Abstract: In complex mining environments, driverless mining trucks are required to cooperate
with multiple intelligent systems. They must perform obstacle avoidance based on factors such as
the site road width, obstacle type, vehicle body movement state, and ground concavity-convexity.
Targeting the open-pit mining area, this paper proposes an intelligent mining road object detection
(IMOD) model developed using a 5G-multi-UAV and a deep learning approach. The IMOD model
employs data sensors to monitor surface data in real time within a multisystem collaborative 5G
network. The model transmits data to various intelligent systems and edge devices in real time,
and the unmanned mining truck constructs the driving area on the fly. The IMOD model utilizes
a convolutional neural network to identify obstacles in front of driverless mining trucks in real
time, optimizing multisystem collaborative control and driverless mining truck scheduling based on
obstacle data. Multiple systems cooperate to maneuver around obstacles, including avoiding static
obstacles, such as standing and lying dummies, empty oil drums, and vehicles; continuously avoiding
multiple obstacles; and avoiding dynamic obstacles such as walking people and moving vehicles.
For this study, we independently collected and constructed an obstacle image dataset specific to
the mining area, and experimental tests and analyses reveal that the IMOD model maintains a
smooth route and stable vehicle movement attitude, ensuring the safety of driverless mining trucks
as well as of personnel and equipment in the mining area. The ablation and robustness experiments
demonstrate that the IMOD model outperforms the unmodified YOLOv5 model, with an average
improvement of approximately 9.4% across multiple performance measures. Additionally, compared
with other algorithms, this model shows significant performance improvements.
Keywords: multisystem collaboration; 5G-multi-UAV systems; multiscale feature fusion; pyramid model
Citation: Xu, X.; Zhao, S.; Xu, C.; Wang, Z.; Zheng, Y.; Qian, X.; Bao, H. Intelligent Mining Road Object Detection Based on Multiscale Feature Fusion in Multi-UAV Networks. Drones 2023, 7, 250. https://doi.org/10.3390/drones7040250
Academic Editors: Zhihong Liu, Shihao Yan, Yirui Cong and Kehao Wang
The contributions of this paper include: (1) proposing an IMOD automatic driving
model based on 5G-multi-UAV for enhancing safety in mining areas; (2) constructing an
obstacle image dataset through field collection and manual marking; and (3) improving
multiscale obstacle detection capabilities through cross-modal data fusion.
Section 2 presents related work, and Section 3 presents the IMOD autopilot model based on 5G-multi-UAV. Section 4 provides the experimental analysis results, followed by our conclusions in Section 5.
2. Related Work
2.1. Multisystem Collaboration Scenarios and Applications in Open-Pit Mines
Automated driving in open-pit mines continues to adhere to the standard production
workflow of drilling, blasting, mining, transportation, and discharging [5]. Considering
the process of mining transport operations and platooning, automatic driving scenarios
in mines can be classified into three categories: loading, transporting, and unloading.
Additionally, there are maintenance support scenarios, such as refueling and water replen-
ishment, that facilitate the aforementioned operational processes. To implement intelligent
networked automatic driving applications effectively, the realization of remote control
driving capabilities for mining trucks is required. Moreover, there is a demand for seamless
operation coordination between these trucks and other construction machinery as well as
accurate planning of their travel path to ensure safe autonomous operation.
In the loading scenario, an unmanned mining truck travels to a loading point, where it receives payloads from excavators, shovels, and other equipment. Throughout entry, loading, and transportation to the designated destination, the mining trucks, loading equipment, and cloud platforms communicate with one another, and the trucks report status updates such as position, speed, direction, and acceleration. The loading equipment also sends its position and direction so that operations can be coordinated efficiently.
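To make the status exchange concrete, the sketch below shows one possible shape for a truck status message. The class name, field names, units, and JSON encoding are all hypothetical illustrations. The text only states that position, speed, direction, and acceleration are reported.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class TruckStatus:
    """Hypothetical status update; fields mirror the quantities named in the text."""
    truck_id: str         # assumed identifier
    position: tuple       # assumed (x, y) in metres in a site-local frame
    speed: float          # assumed m/s
    direction: float      # assumed heading in degrees
    acceleration: float   # assumed m/s^2

def encode(status):
    """Serialize a status update for transmission (JSON chosen for illustration)."""
    return json.dumps(asdict(status), sort_keys=True)

message = encode(TruckStatus("T-01", (12.5, -3.0), 4.2, 90.0, 0.1))
```

A loading-equipment message could reuse the same structure with only the position and direction fields populated.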
Under abnormal conditions, an emergency brake is triggered and the truck switches to remote control mode, while alarms alert the excavators to the danger. Cloud-based systems provide troubleshooting assistance to resolve issues that arise during material transport.
For autonomous driving in mining trucks, a cloud platform is utilized for planning
paths while integrating environmental information. The vehicle interacts with other ve-
hicles/equipment/cloud platforms, ensuring safe driving via functions such as forward
collision warning and over-the-horizon perception while being capable of emergency
braking followed by remote takeover, if required.
The unloading process requires communication and cooperation among various pieces of equipment (bulldozers, loaders, and cloud platforms). The mining truck follows its planned tasks and paths with the aid of surrounding-environment perception, and it exchanges real-time status information with the unloading equipment so that the two cooperate efficiently. Again, the truck should be capable of emergency braking followed by a remote takeover in case of any abnormal situation.
Finally, for refueling, water replenishment, and maintenance, the cloud platform organizes maintenance or overhaul tasks when necessary. When the truck or the platform detects a fault or insufficient oil or water, the two coordinate to plan the execution of the support task; the truck then follows the planned path, periodically broadcasts its real-time status and task information, and remains available for remote takeover in case of abnormalities.
efficiency while also reducing computational complexity with high accuracy [10]. Another study [11] proposed an engineering vehicle detection algorithm based on Faster R-CNN, which adjusts the position of the ROI pooling layers and adds a convolution layer in the feature classification part, thereby enhancing model accuracy. Furthermore, ref. [12] introduces several differently sized RPNs into the traditional Faster R-CNN structure, allowing for the detection of larger vehicles, whilst [13], building upon Faster R-CNN, improved object feature extraction by combining multilayer feedforward features with the output of each context layer, improving robustness to smaller or occluded targets that may confound other models. The authors of [14] improve a domain-adaptive Fast R-CNN algorithm by refining its region proposal network (RPN) configuration; multiscale training helps mine difficult samples during secondary training, which expands small-target capability, albeit at a considerable computational expense.
Given the slow execution speed of the R-CNN algorithm, the one-stage detection method pioneered in [15] integrated candidate frame extraction and feature classification into a single step, and several versions have since been developed [16]. The you only look once (YOLO) target detection algorithm enables higher accuracy and faster detection. Another example [17] improved YOLOv4 by adjusting the size of the detection layer for smaller objects and replacing the backbone with CSPLocknet-19, achieving a good mean average precision (mAP) and frames per second (FPS) on low-cost edge hardware. In [18], an improved vehicle detection method using the YOLOv5 network under different traffic scenarios was proposed, utilizing flip-mosaic to enhance the perception of small targets, thereby increasing the accuracy of detection while reducing false positives. In another approach, an SSH module was added after YOLOv7's PAFPN structure to merge context information, which improved small object detection ability. In recent years, the one-stage target detection algorithm has gained popularity across various fields due to its strong generalization performance and fast processing.
The open-pit mining environment is complex and dynamic, necessitating the use
of a target detection algorithm based on convolutional neural networks. The selected
algorithm must satisfy the real-time and high-precision requirements for obstacle detection
by unmanned mining trucks. After analyzing the existing algorithms, we selected the
YOLOv5 network based on its suitability for detecting obstacles within an open-pit mining
area [19]. Obstacle detection using only two-dimensional image methods often produces
inaccurate distance information; multisensor fusion that combines stereo vision with laser
radar can provide more accurate results. Currently, the YOLOv5 and YOLOX algorithms demonstrate favorable obstacle detection performance. Notably, YOLOv5-s and YOLOX-s are lightweight models recommended for deployment on mobile devices but still require improvement in detecting occluded and small-scale targets.
Automatic driving solutions in the field of mining primarily comprise three modules: a central control system, automatic driving trucks, and coordination kits for other engineering vehicles. In cloud-based remote monitoring, the scheduling platform serves as the central control system, whereas the automatic driving trucks use their perception abilities to make decisions automatically. Coordination kits, such as those fitted to excavators, and other vehicle support tools improve safety between vehicles, which reduces operational costs while increasing transportation efficiency. Integrating these solutions into onsite practices, such as travel path planning and collaborative management across the sections of the mine site, has led to significantly safer operations at a lower overall cost.
typically simple pavements, frequently being traversed by heavy-duty trucks increases the
likelihood of pavement damage and deformation.
The one-stage target detection algorithm YOLOv5 utilizes mosaic image enhance-
ment and adaptive anchor frame calculation at its input. Its backbone network integrates
focus and CSP structures, whereas the neck module uses FPN structures to enhance se-
mantic information across different scales. The PAN structure fosters location aware-
ness across these scales, thereby improving multiscale target detection performance. The
lightweight YOLOv5-s and YOLOX-s target detection algorithms have strong performance
in detecting targets across multiple scales and are well-suited for deployment on resource-
constrained devices.
Based on the 5G-multi-UAV architecture, Figure 2 displays the IMOD collaborative
system for a mining scene. Using the YOLOv5-s network structure as a basic framework,
this paper presents an IMOD obstacle target detection model that adapts feature fusion to address high-density occlusion of targets, low detection accuracy, and a high miss rate when detecting small-scale targets within open-pit mining areas. The specific improvements are summarized as follows:
(1) To cope with the negative impact of adjacent scale feature fusion on models, we
propose utilizing a feature fusion factor and improving the calculation method. By
increasing effective samples post-fusion, this approach improves learning abilities
toward small and medium-sized scale targets.
(2) To enhance the detection accuracy of smaller targets in open-pit mining areas, reinforc-
ing shallow feature layer information extraction via added shallow detection layers
is crucial.
(3) Adaptively selecting appropriate receptive field features during model training can
help tackle insufficient feature information extraction in scenes containing vehicles
and pedestrians with significant scaling changes. Therefore, an adaptive receptive
field fusion module based on the concept of an RFB [21] network structure is proposed.
(4) For efficiently detecting dense small-scale targets with high occlusion, we introduce
StrongFocalLoss as a loss function while incorporating the CA attention mechanism to
alter model focus toward relevant features, resulting in improved algorithmic accuracy.
Figure 2. The IMOD mining scene collaborative system architecture based on 5G-multi-UAV.
pyramids and horizontal connection structures of PAFPN for different scale feature fusion.
However, some scales exhibit large response values across adjacent feature maps, leading
to the identification of only one layer during network learning based on rough response
value estimation, resulting in poor detection accuracy and convergence effectiveness.
The challenge in multisystem collaborative target detection stems from variations
in image scale, sparse distribution of targets, a high number of targets, and small target
size. Balancing the computational demand for processing high-resolution UAV images
with limited computing power presents additional difficulties. To address these problems,
the IMOD model uses three layers of feature maps that differ in size to detect objects at
different scales. The model employs YOLOv5 as its base feature network and leverages
inter-layer connections to extract more semantically informative features that facilitate
effective object recognition while minimizing interference information. The selective channel expansion used by the IMOD model does not excessively increase the model's parameter size and avoids unnecessary training operations, thereby maximizing detection accuracy while preserving the speed of the algorithm.
The effectiveness of the same sample can vary in characteristic maps of different scales.
Deep and shallow layers contribute differently to target information at various scales,
and their impact on other layers has both advantages and disadvantages. To alleviate the
negative effects of feature fusion, it is necessary to adjust the participation rate of deep
features in shallow feature learning by filtering out invalid samples during adjacent layer
transmission. This ensures more effective samples are available for learning on deep feature
maps, which improves detection performance for targets of different sizes. An improved
adjacent scale feature fusion strategy is proposed here to address these challenges. The
expression for the FPN’s feature fusion process is as follows:
Here, $C$ and $P$ signify feature maps prior to and after feature fusion, respectively, with the subscript giving the layer index (so $C_i$ belongs to layer $i$ and $P_{i+1}$ to layer $i+1$). The term $f_{\mathrm{lateral}}$ represents the 1 × 1 convolution of the horizontal (lateral) connection in the FPN, whereas $f_{\mathrm{upsample}}$ denotes an operation that increases the resolution twofold. In addition, $f_{\mathrm{conv}}$ indicates a convolution operation for processing features, whereas $\alpha_i^{i+1}$ signifies the fusion factor by which the layer $i+1$ feature maps are multiplied when they are transferred into those of layer $i$.
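For orientation, the symbol definitions above are consistent with the general form of top-down fusion written below. This is a sketch assembled from those definitions and is not the authors' displayed equation. It assumes the usual FPN indexing, in which $P_{i+1}$ is the already-fused map of the deeper layer and $P_i$ (an assumed name) is the fused output at layer $i$.

```latex
P_i = f_{\mathrm{conv}}\left( f_{\mathrm{lateral}}(C_i) + \alpha_i^{i+1} \, f_{\mathrm{upsample}}(P_{i+1}) \right)
```

Under this reading, setting $\alpha_i^{i+1} = 1$ would recover plain FPN fusion, and smaller values reduce how much deep-layer information reaches the shallower layer.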
This study derived its proposed feature fusion factors through statistical analysis with
calculated corresponding target numbers for each layer using this formula:
The proposed method in [22] utilizes an attention module for calculating the fusion
factor, incorporating the BAM attention mechanism from [23] and enhancing the efficiency
of feature fusion between adjacent scales, as illustrated in Figure 3. The feature fusion
factor is computed to alleviate the negative effects during the feature fusion process, with
its formula expressed as follows:
Here, the two inputs to the attention modules are the feature map obtained from $C_i$ after a 1 × 1 convolution operation and the feature map obtained from $P_{i+1}$ through a twofold upsampling operation. $M_s$ and $M_c$ denote the spatial and channel attention modules, respectively, used within the adjacent scale feature high-efficiency fusion (AFHF) module, as depicted in Figure 3.
The spatial attention module plays a crucial role in analyzing the differences between
adjacent feature maps at different layers of transmission and filtering out invalid samples
passed from deep to shallow layers. The Ms module is represented by the following formula:
$$M_s(C_i, P_{i+1}) = \sigma\left( f_5\left( \mathrm{Softmax}\left( f_1 f_3 f_3 f_1 (C_i) \right) - \mathrm{Softmax}\left( f_1 f_3 f_3 f_1 (P_{i+1}) \right) \right) \right) \quad (4)$$
Here, $\sigma$ represents the sigmoid activation function, and $f_1$, $f_3$, and $f_5$ specify convolution operations with varying kernel sizes and shared parameter information. Additionally, Softmax refers to an operation involving multiplication across the row and column dimensions of feature maps after a softmax operation has been applied. Overall, these techniques aid in accurately identifying invalid sample data within drone imagery datasets at various depths of analysis.
Each feature map channel contains a significant amount of information. The channel
attention module (CAM) focuses on the meaningful content within the feature map, and,
in conjunction with the spatial attention module (SAM), it can more effectively process
features at the channel level to reduce meaningless channels for improved performance.
The formula for the $M_c$ module is:
$$M_c(C_i, P_{i+1}) = \sigma\left( \mathrm{MLP}\left( \mathrm{GAP}(C_i); \mathrm{GAP}(P_{i+1}) \right) \right) \quad (5)$$
Here, GAP represents the global average pooling operation, while MLP represents a multilayer perceptron whose hidden layer consists of fully connected layers with ReLU activation; it reduces the channel dimension to 1/r of its original size before expanding it back again. The experiments use r = 16, and the sigmoid function $\sigma$ is used as the final activation.
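The channel attention computation can be sketched in plain Python as follows. The sketch makes several assumptions that the text does not state: the semicolon is read as concatenation of the two pooled vectors, the MLP has a single bias-free hidden layer of width 2C/r, the output has C entries, and the weights are random placeholders, not trained values.

```python
import math
import random

def global_average_pool(fmap):
    """fmap is a list of channels, each a list of rows; returns one mean per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def linear(weights, x):
    """Bias-free fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def channel_attention(c_i, p_next, w_reduce, w_expand):
    """sigmoid(MLP([GAP(C_i); GAP(P_{i+1})])) with a ReLU hidden layer."""
    pooled = global_average_pool(c_i) + global_average_pool(p_next)  # length 2C
    hidden = [max(0.0, h) for h in linear(w_reduce, pooled)]         # length 2C/r
    return [1.0 / (1.0 + math.exp(-z)) for z in linear(w_expand, hidden)]

def random_matrix(rows, cols, rng):
    """Placeholder weights in [-0.5, 0.5); a trained model would supply these."""
    return [[rng.random() - 0.5 for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
C, H, W, r = 16, 4, 4, 16
c_i = [[[rng.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
p_next = [[[rng.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w_reduce = random_matrix(2 * C // r, 2 * C, rng)
w_expand = random_matrix(C, 2 * C // r, rng)
attention = channel_attention(c_i, p_next, w_reduce, w_expand)
```

Each entry of `attention` lies strictly between 0 and 1 and could be used to rescale the corresponding channel, which is how low-value channels are suppressed.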
Therefore, this paper proposes using an adapted RFB-s network structure, which effec-
tively increases the receptive field area for adaptive fusion whilst addressing shortcomings
experienced with previous approaches. Figure 4 illustrates the improved methodology
employed in our study.
The proposed RFB-s module employs several techniques to optimize the structural design. First, the input feature map is subjected to a 1 × 1 convolution operation, which reduces both channel count and computational load. Asymmetric convolutional layers are then used to further reduce parameter size before applying 3 × 3 dilated convolutions that expand the receptive field at three dilation rates (1, 3, and 5). The branch outputs are concatenated and then fused by another 1 × 1 convolution, yielding a final output that fulfils the fusion requirements of each stage. Critically, our approach incorporates a shortcut connection [24], which merges the multiscale receptive field features with the original features; this not only accelerates training but also mitigates exploding or vanishing gradients.
To further enhance the receptive field, the improved module for object detection in drone imagery, referred to as SRFB-s (strong receptive field block), adopts the overall structure of multi-branch dilated convolution. It replaces 3 × 3 convolutions with more efficient 1 × 3 and 3 × 1 asymmetric convolutions to reduce parameter count and computational effort. The module also includes additional receptive field branches to provide a wider range of features, including contextual information. Additionally, it utilizes the ASFF network [25] to adaptively fuse feature maps and selects the optimal fusion method for targets of each scale during training, which prevents degradation from irrelevant background noise while enhancing detection capabilities under occlusion, large-scale changes, etc.
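Two of the design choices described above can be checked with simple arithmetic: how far a dilated 3 × 3 kernel reaches at each rate, and how many weights the asymmetric factorization saves. The sketch below uses the standard effective-kernel formula k + (k − 1)(d − 1). The channel count of 64 is an illustrative assumption, not a value taken from the paper.

```python
def effective_kernel(kernel, dilation):
    """Spatial extent covered by a dilated convolution kernel along one axis."""
    return kernel + (kernel - 1) * (dilation - 1)

def conv_params(k_h, k_w, c_in, c_out):
    """Weight count of a convolution layer, ignoring the bias."""
    return k_h * k_w * c_in * c_out

# Extent of a 3 x 3 kernel at the three dilation rates used by the module.
rates = (1, 3, 5)
extents = [effective_kernel(3, d) for d in rates]

# A full 3 x 3 convolution versus a 1 x 3 followed by a 3 x 1 convolution.
channels = 64  # assumed for illustration
full_3x3 = conv_params(3, 3, channels, channels)
asymmetric = conv_params(1, 3, channels, channels) + conv_params(3, 1, channels, channels)
saving = 1 - asymmetric / full_3x3
```

The three branches therefore cover 3, 7, and 11 pixels per axis with the same number of weights each, and the asymmetric pair needs one third fewer weights than a full 3 × 3 layer at equal channel width.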
that highlight crucial features while ignoring irrelevant data, thus improving overall model
performance. The coordinate attention (CA) module, introduced in [26], filters out invalid
details, instead emphasizing relevant ones by incorporating novel encoding methods along
two spatial directions, integrating coordinate information into generated attention maps
for lightweight networks.
The role of the loss function is essential to enhance object detection and localization
in open-pit mining scene models. The loss function comprises three critical components:
localization loss, confidence loss, and classification loss. The formula is described as follows:
$$\text{Loss} = \text{Localization Loss} + \text{Confidence Loss} + \text{Classification Loss} \quad (6)$$
where $\sigma$ represents the prediction, $y$ is a quality label ranging from 0 to 1, and $|y - \sigma|^{\alpha}$ ($\alpha \geq 0$) is a scaling factor based on the absolute value of the distance between the two. The hyperparameter $\alpha$ controls the down-weighting rate and can be tuned for optimal performance; the experiments in the recent study of Yuan et al. [27] suggest $\alpha = 2$.
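The role of the scaling factor can be seen in a small sketch. The function below follows the quality-focal-loss form, in which a cross-entropy term is multiplied by $|y - \sigma|^{\alpha}$. This form is an assumption chosen because it matches the description above; the exact SFL definition used by the authors is not shown in this passage, and the function name is hypothetical.

```python
import math

def quality_focal_term(sigma, y, alpha=2.0, eps=1e-12):
    """Cross-entropy between prediction sigma and soft label y,
    scaled by |y - sigma| ** alpha (assumed quality-focal-loss form)."""
    cross_entropy = -((1.0 - y) * math.log(1.0 - sigma + eps)
                      + y * math.log(sigma + eps))
    return abs(y - sigma) ** alpha * cross_entropy

# A prediction far from the label is penalized much more than a close one.
far = quality_focal_term(sigma=0.1, y=0.9)
near = quality_focal_term(sigma=0.7, y=0.9)
exact = quality_focal_term(sigma=0.8, y=0.8)
```

With $\alpha = 0$ the scaling factor equals 1 and the term reduces to plain cross-entropy; larger values of $\alpha$ shrink the contribution of predictions that are already close to their labels.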
In open-pit mining scenarios, targets are often occluded or densely distributed at small scales. The resulting overlaps between candidate frames reduce classification accuracy once non-maximum suppression post-processing is applied. Introducing SFL into the model compensates for these deficiencies, so that occluded targets and dense small-scale targets in open-pit mining scenes are detected more accurately than in earlier work.
4. Experimental Analysis
4.1. Network Model Ablation Study
To fully examine the efficacy of the three proposed improvement strategies in this paper, their impact on the performance of YOLOv5-s was investigated by conducting ablation and robustness experiments. The evaluation metrics were parameter count, weight size, computation volume (GFLOPS), mean average precision (mAP), and detection speed in frames per second (FPS). The mAP was computed at an intersection over union (IoU) threshold of 0.5 using the following formula:
$$\mathrm{mAP} = \frac{1}{C} \sum_{i=1}^{C} AP_i = \frac{1}{C} \sum_{i=1}^{C} \int_{0}^{1} P_i(R)\, dR \quad (10)$$
where $C$ is the number of classes, $P$ is the precision (accuracy rate), $R$ is the recall, TP is the number of true positive cases, FP is the number of false positive cases, and FN is the number of false negative cases.
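The quantities in the mAP definition can be computed with a few lines of Python. This is a minimal sketch that integrates the precision-recall curve with the trapezoidal rule; benchmark toolkits often use interpolated precision instead, so the numbers they report may differ. The function names are illustrative.

```python
def precision(tp, fp):
    """Fraction of detections that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    """Fraction of ground-truth objects that are found: TP / (TP + FN)."""
    return tp / (tp + fn) if tp + fn else 0.0

def average_precision(recalls, precisions):
    """Area under P(R), by trapezoidal integration over recall-sorted points."""
    points = sorted(zip(recalls, precisions))
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area

def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)

# Toy curve: precision stays at 1.0 until recall 0.5, then falls to 0 at recall 1.
ap_example = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.0])
map_example = mean_average_precision([ap_example, 0.25])
```

In practice each (recall, precision) point comes from sweeping the confidence threshold for one class, with a detection counted as a true positive when its IoU with a ground-truth box is at least 0.5.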
Figure 6 presents the training loss curves of localization, classification, and confidence
losses, indicating that the improved model converges faster than before. Furthermore,
Figure 7 reports an enhanced mAP achieved by the improved model.
The BUUISE-MO dataset has a picture resolution of 1920 × 1080, a training set of 7220,
and a test set of 2500, for a total of 9720 images, as shown in Figure 8. The dataset contains
15 categories including truck, forklift, car, excavator, person, signboard, and others, with
6041 (large), 9230 (medium), and 12,043 (small) labels. This dataset is appropriate for the
task of detecting small objects.
Table 1. Experimental results of the ablation study on the IMOD model using the BUUISE-MO dataset.
In this article, the IMOD model is proposed, which employs 5G-based multi-UAV and
deep learning methods to enhance unmanned mining vehicle behavior via multisystem
coordination in an open-pit mining environment. An autonomous mine obstacle image
dataset is constructed and experimentally analyzed to address difficulties with recognizing
small-scale targets amidst complex road scenes in mining areas. The IMOD model effec-
tively enhances safety for driverless vehicles and personnel/equipment within mining
zones. Future work should focus on improving the accuracy of small target recognition
under abnormal illumination conditions and addressing error correction resulting from
data desynchronization caused by multisystem coordinated power outages.
The BUUISE-MO dataset was chosen for quantitative and qualitative analysis to demon-
strate the degree of improvement achieved by the improved algorithm on various targets.
Table 2 displays the average accuracy of different enhanced algorithms for each target.
Upon comparison of Tables 1 and 2, it was determined that the inferior performance
of the model in open-pit mining environments can be attributed to an increase in occluded
targets and smaller targets. Although the SRFB-S module and AFHF module have been
demonstrated to improve the detection accuracy for trucks, signboards, excavators, persons,
forklifts, and cars within this scene, real-time detection is sacrificed as a result. The addition
of both CA attentional modules and SFL loss functions resulted in improvements of 0.3%
and 0.2%, respectively, without compromising other features. Implementing four detection
branches presents the potential for significantly enhancing small target detection accuracy
at the cost of increased model complexity.
Table 3. Experimental results showing improved model robustness on the BUUISE-MO dataset.
According to the results in Table 4, obtained by applying different models to the
BDD100K dataset, the enhanced YOLOv5 algorithm outperforms other prevalent models in
both detection accuracy and speed. In contrast to the YOLOv5-s and YOLOv5-m algorithms,
which are fast but less accurate, the enhanced YOLOv5 algorithm retains real-time
operation while delivering strong overall performance, resulting in improved object
recognition in intricate road scenarios. This result has practical significance and
demonstrates the applicability of the method.
5. Conclusions
We proposed an object detection algorithm for complex road scenes in open-pit mining
environments, aiming to address the problems of low detection accuracy, false detection,
and missed detection of road occlusion targets and small-scale targets. Our algorithm is
based on adaptive feature fusion using the YOLOv5-s algorithm as a starting point. We
introduce a feature fusion factor to reduce negative impacts caused by adjacent scale fusion
strategies, increase effective samples after feature fusion, and improve learning ability for
small- to medium-sized targets. Additionally, we propose an improved receptive field
module that extracts more target feature information from shallow feature layers. Finally,
we introduce a CA attention mechanism and StrongFocalLoss loss function to improve
model accuracy for dense occlusion targets and small-scale targets.
We autonomously collect and construct a mine obstacle image dataset to facilitate
experimental testing of our approach. Our results show that our approach effectively
addresses occlusion and small-scale target recognition in the complex road scenarios
found in mining areas, and use cases demonstrate that the IMOD model increases the safety
of unmanned vehicles while preserving the equipment fidelity required at industrial scales.
Future work will involve improving recognition accuracy under abnormal illumination
conditions as well as correcting errors due to data desynchronization caused by
multisystem shutdowns during network operations. Developing lightweight architectures
that facilitate deployment on mobile terminals while simultaneously enhancing overall
model accuracy is also essential.
Author Contributions: Conceptualization and methodology, X.X. and C.X.; software and validation,
C.X. and Z.W.; formal analysis, Y.Z.; investigation, S.Z.; writing—original draft preparation, X.X.;
writing—review and editing, C.X., X.X. and Z.W.; project administration, H.B. and X.Q. All authors
have read and agreed to the published version of the manuscript.
Funding: This work is supported in part by a key project of the National Natural Science Foundation of
China (Grant No. 61932012), in part by the National Natural Science Foundation of China (Grant No.
62102033), and in part by Support for high-level Innovative Teams of Beijing Municipal Institutions
(Grant No. BPHR20220121).
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Gao, Y.; Ai, Y.; Tian, B.; Chen, L.; Wang, J.; Cao, D.; Wang, F.Y. Parallel end-to-end autonomous mining: An IoT-oriented approach.
IEEE Internet Things J. 2019, 7, 1011–1023. [CrossRef]
2. Ko, Y.; Kim, J.; Duguma, D.G.; Astillo, P.V.; You, I.; Pau, G. Drone Secure Communication Protocol for Future Sensitive
Applications in Military Zone Number: 6. Sensors 2021, 21, 2057. [CrossRef] [PubMed]
3. Xu, C.; Wu, H.; Liu, H.; Gu, W.; Li, Y.; Cao, D. Blockchain-oriented privacy protection of sensitive data in the internet of vehicles.
IEEE Trans. Intell. Veh. 2022, 8, 1057–1067. [CrossRef]
4. Chen, S.; Hu, J.; Shi, Y.; Zhao, L.; Li, W. A vision of C-V2X: Technologies, field testing, and challenges with chinese development.
IEEE Internet Things J. 2020, 7, 3872–3881. [CrossRef]
5. Zhang, X.; Guo, A.; Ai, Y.; Tian, B.; Chen, L. Real-time scheduling of autonomous mining trucks via flow allocation-accelerated
tabu search. IEEE Trans. Intell. Veh. 2022, 7, 466–479. [CrossRef]
6. Ma, N.; Li, D.; He, W.; Deng, Y.; Li, J.; Gao, Y.; Bao, H.; Zhang, H.; Xu, X.; Liu, Y.; et al. Future vehicles: Interactive wheeled robots.
Sci. China Inf. Sci. 2021, 64, 1–3. [CrossRef]
7. Pan, Z.; Zhang, C.; Xia, Y.; Xiong, H.; Shao, X. An Improved Artificial Potential Field Method for Path Planning and Formation
Control of the Multi-UAV Systems. IEEE Trans. Circuits Syst. II Express Briefs 2022, 69, 1129–1133. [CrossRef]
8. Krichen, M.; Adoni, W.Y.H.; Mihoub, A.; Alzahrani, M.Y.; Nahhal, T. Security Challenges for Drone Communications: Possible
Threats, Attacks and Countermeasures. In Proceedings of the 2022 2nd International Conference of Smart Systems and Emerging
Technologies (SMARTTECH), Riyadh, Saudi Arabia, 9–11 May 2022; pp. 184–189.
9. Girshick, R.; Donahue, J. Trevor DARRELL a Jitendra MALIK. Rich feature hierarchies for accurate object detection and semantic
segmentation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Columbus,
OH, USA, 23–28 June 2014; pp. 580–587.
10. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural
Inf. Process. Syst. 2015, 28, 1497. [CrossRef] [PubMed]
11. Xiang, X.; Lv, N.; Guo, X.; Wang, S.; El Saddik, A. Engineering vehicles detection based on modified faster R-CNN for power grid
surveillance. Sensors 2018, 18, 2258. [CrossRef] [PubMed]
12. Ghosh, R. On-road vehicle detection in varying weather conditions using faster R-CNN with several region proposal networks.
Multimed. Tools Appl. 2021, 80, 25985–25999. [CrossRef]
13. Luo, J.Q.; Fang, H.S.; Shao, F.M.; Zhong, Y.; Hua, X. Multi-scale traffic vehicle detection based on faster R-CNN with NAS
optimization and feature enrichment. Def. Technol. 2021, 17, 1542–1554. [CrossRef]
14. Yin, G.; Yu, M.; Wang, M.; Hu, Y.; Zhang, Y. Research on highway vehicle detection based on faster R-CNN and domain
adaptation. Appl. Intell. 2022, 52, 3483–3498. [CrossRef]
15. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
16. Qiu, Z.; Bai, H.; Chen, T. Special Vehicle Detection from UAV Perspective via YOLO-GNS Based Deep Learning Network. Drones
2023, 7, 117. [CrossRef]
17. Koay, H.V.; Chuah, J.H.; Chow, C.O.; Chang, Y.L.; Yong, K.K. YOLO-RTUAV: Towards real-time vehicle detection through aerial
images with low-cost edge devices. Remote Sens. 2021, 13, 4196. [CrossRef]
30
Drones 2023, 7, 250
18. Zhang, Y.; Guo, Z.; Wu, J.; Tian, Y.; Tang, H.; Guo, X. Real-Time Vehicle Detection Based on Improved YOLO v5. Sustainability
2022, 14, 12274. [CrossRef]
19. Ultralytics. YOLOv5. Available online: https://github.com/ultralytics/yolov5 (accessed on 3 February 2023).
20. Lu, X.; Ai, Y.; Tian, B. Real-time mine road boundary detection and tracking for autonomous truck. Sensors 2020, 20, 1121.
[CrossRef]
21. Liu, S.; Huang, D. Receptive field block net for accurate and fast object detection. In Proceedings of the European Conference on
Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 404–419.
22. Wu, H.; Xu, C.; Liu, H. S-MAT: Semantic-Driven Masked Attention Transformer for Multi-Label Aerial Image Classification.
Sensors 2022, 22, 5433. [CrossRef]
23. Park, J.; Woo, S.; Lee, J.Y.; Kweon, I.S. Bam: Bottleneck attention module. arXiv 2018, arXiv:1807.06514.
24. Geirhos, R.; Jacobsen, J.; Michaelis, C.; Zemel, R.; Brendel, W.; Bethge, M.; Wichmann, F. Shortcut Learning in Deep Neural
Networks. Nat. Mach. Intell. 2020, 2, 665–673. [CrossRef]
25. Cheng, X.; Yu, J. RetinaNet with difference channel attention and adaptively spatial feature fusion for steel surface defect
detection. IEEE Trans. Instrum. Meas. 2020, 70, 1–11. [CrossRef]
26. Hou, Q.; Zhou, D.; Feng, J. Coordinate attention for efficient mobile network design. In Proceedings of the IEEE/CVF Conference
on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 13708–13717.
27. Yuan, J.; Wang, Z.; Xu, C.; Li, H.; Dai, S.; Liu, H. Multi-vehicle group-aware data protection model based on differential privacy
for autonomous sensor networks. IET Circuits Devices Syst. 2023, 17, 1–13. [CrossRef]
28. Li, M.; Zhang, H.; Xu, C.; Yan, C.; Liu, H.; Li, X. MFVC: Urban Traffic Scene Video Caption Based on Multimodal Fusion.
Electronics 2022, 11, 2999. [CrossRef]
29. Yu, F.; Chen, H.; Wang, X.; Xian, W.; Chen, Y.; Liu, F.; Madhavan, V.; Darrell, T. BDD100K: A Diverse Driving Dataset for
Heterogeneous Multitask Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition,
Seattle, WA, USA, 13–19 June 2020; pp. 2636–2645.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
Article
Decentralized Multi-UAV Cooperative Exploration Using
Dynamic Centroid-Based Area Partition
Jianjun Gui 1,† , Tianyou Yu 1,† , Baosong Deng 1, *, Xiaozhou Zhu 1 and Wen Yao 1,2
1 Defense Innovation Institute, Chinese Academy of Military Science, Beijing 100071, China
2 Intelligent Game and Decision Laboratory, Beijing 100071, China
* Correspondence: dbs@nudt.edu.cn
† These authors contributed equally to this work.
Abstract: Efficient exploration is a critical issue in swarm UAVs with substantial research interest due
to its applications in search and rescue missions. In this study, we propose a cooperative exploration
approach that uses multiple unmanned aerial vehicles (UAVs). Our approach allows UAVs to
explore separate areas dynamically, resulting in increased efficiency and decreased redundancy. We
use a novel dynamic centroid-based method to partition the 3D working area for each UAV, with
each UAV generating new targets in its partitioned area only using the onboard computational
resource. To ensure the cooperation and exploration of the unknown, we use a next-best-view (NBV)
method based on rapidly-exploring random tree (RRT), which generates a tree in the partitioned area
until a threshold is reached. We compare this approach with three classical methods using Gazebo
simulation, including a Voronoi-based area partition method, a coordination method for reducing
scanning repetition between UAVs, and a greedy method that works according to its exploration
planner without any interaction. We also conduct practical experiments to verify the effectiveness of
our proposed method.
Figure 1. System framework. The modules of localization, mapping, partition, and planning are run
independently in each UAV. All of the working UAVs connect to a 5G WiFi for information exchange.
Poses and platform weights are shared to dynamically adjust the partition areas. The details of
partition and weight calculation are discussed in Section 4.
2. Related Work
In recent years, significant progress has been made within the academic community
regarding collaborative exploration by multiple UAVs in unknown environments [15].
In light of our research focus, this paper will examine three areas related to collaborative
Drones 2023, 7, 337
The two mentioned categories were widely used in the exploration planning of a
single UAV. However, for multi-UAV exploration, a coordination module is required to
prevent collisions and redundancies. The NBV method [12] is commonly utilized in such
scenarios. This method iteratively selects viewpoints in free space to refresh candidates’
paths, ensuring a consistent update rate. Our proposed method follows this approach by
integrating the strengths of the sampling-based method. This enables frequent recollection
of viewpoints to avoid collisions and facilitate flexible collaboration between UAVs.
3. Problem Statement
The task of multi-UAV exploration in an unknown environment consists of exploring and
mapping iteratively. A 3D workspace $\mathcal{WS}$ of known size is given before the task
to establish the area of concern; all UAVs explore this workspace. The UAV team contains
$N$ identical UAVs, each with four degrees of freedom: the 3D position
$[x, y, z]^T \in \mathbb{R}^3$ and the yaw angle $\psi \in S^1$. The UAV state can be
described as $\mathbf{x} = [x, y, z, \psi]^T$. Each platform is equipped with a depth
camera that collects environment information within a certain field of view.
The environment is reconstructed as an Octomap $\mathcal{M}$, and the occupancy probability
of each grid $m \in \mathcal{M}$ is initialized as $P(m) = 0.5$. The posterior occupancy
probability $P(m \mid x_{1:t}, z_{1:t})$ is updated by the depth measurements $z_{1:t}$ and
the UAV states $x_{1:t}$ from the initial time to the current time $t$. The grids in the map
are gradually scanned by the sensor and identified as either free grids
$\mathcal{M}_f = \{m \mid P(m \mid x_{1:t}, z_{1:t}) < P_{free}, m \in \mathcal{M}\}$ or occupied
grids $\mathcal{M}_o = \{m \mid P(m \mid x_{1:t}, z_{1:t}) > P_{occ}, m \in \mathcal{M}\}$, where
$P_{free}$ and $P_{occ}$ are given thresholds. Given a map $\mathcal{M}$ at time $t$, the
receding horizon exploration planner decides an optimal path $\mathcal{T}^*$ in every period.
To find the $\mathcal{T}^*$ along which the UAV gathers measurements that reduce the unknown
space while maintaining coordination, a cost function is formulated to measure the
value of a candidate path, considering the uncertainty of the map $\mathcal{M}$, the UAV team
information $RT$, the location of waypoints in the path $\mathcal{T}$, and the time cost of
the path $c(\mathcal{T})$.
UAVs visit unknown spaces independently according to the outputs of the exploration
planner. We assume that the UAVs are equipped with an accurate localization system.
In the initial state, the UAVs are deployed with a connected network and known
relative positions, as in a practical task application.
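The occupancy bookkeeping described above can be sketched in a few lines. The example uses the log-odds form of the Bayesian occupancy update, which is one standard way to maintain $P(m \mid x_{1:t}, z_{1:t})$, and then labels a grid as free, occupied, or unknown. The threshold values and the per-measurement probabilities are assumptions chosen for illustration; the paper only states that thresholds are given.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def update_occupancy(p_prior, p_measurement):
    """Fuse one sensor reading into the occupancy probability (log-odds form)."""
    log_odds = logit(p_prior) + logit(p_measurement)
    return 1.0 / (1.0 + math.exp(-log_odds))


def classify_grid(p, p_free=0.3, p_occ=0.7):
    """Label a grid using thresholds; p_free and p_occ are assumed values."""
    if p < p_free:
        return "free"
    if p > p_occ:
        return "occupied"
    return "unknown"
```

Starting from the prior of 0.5, a single reading leaves the grid at the measurement probability itself, and repeated consistent readings push it past one of the thresholds.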
4. Method
This paper describes the implementation of a decentralized structure, as illustrated
in Figure 1. Each platform performs RRT-based planning, area partitioning, and mapping
independently. The core parts of our proposed modules are as follows.
which can lead to environmental observation and the space volume $V^{E}_{PA}$ of the
executable area are known, the weight $w$ can be calculated as:
$$w = N_e / V^{E}_{PA}. \qquad (2)$$
The executable space volume $V^{E}_{PA}$ in the PA can be approximated as Equation (3).
The $V^{E}_{PA}$ represents the volume of a space, which is free, and the RRT-based planner
in Section 4.3 can build a tree there.
$$V^{E}_{PA} \approx V_{\mathcal{WS}} \Big/ \frac{C_{loop}}{N_T} = V_{\mathcal{WS}} \times N_T / C_{loop}. \qquad (3)$$
The number of node sampling loops $C_{loop}$ is counted to calculate the size of the
executable exploration area. The larger the executable space, the smaller $C_{loop}$ is,
because a candidate is more likely to be generated successfully in the PA. The average
number of samples required to generate a node is $C_{loop}/N_T$.
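A small sketch of Equations (2) and (3) may make the bookkeeping concrete. The function and argument names are illustrative assumptions; `n_e` stands for the count $N_e$ used in Equation (2), `n_t` for the number of tree nodes $N_T$, and `c_loop` for the number of sampling loops $C_{loop}$.

```python
def executable_volume(v_ws, n_t, c_loop):
    """Equation (3): scale the workspace volume by the sampling success ratio."""
    if n_t <= 0 or c_loop <= 0:
        raise ValueError("n_t and c_loop must be positive")
    return v_ws * n_t / c_loop


def platform_weight(n_e, v_ws, n_t, c_loop):
    """Equation (2): weight w = N_e divided by the executable volume."""
    return n_e / executable_volume(v_ws, n_t, c_loop)
```

For example, with a workspace of 720 m³ (the 20 × 12 × 3 m³ indoor scenario), 50 tree nodes obtained from 200 sampling loops give an estimated executable volume of 180 m³.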
The proposed dynamic centroid-based area partition calculates a virtual centroid
$X_c = (x_c, y_c)$ from the platform weight $w_i$ of UAV$_i$ and its position
$X_i = (x_i, y_i)$, the 2D projection of its pose, as expressed in Equation (4). The UAVs
explore 3D space and build a 3D Octomap, but the two-dimensional coordinates of the UAVs
are used to partition the 3D area because this is the simplest form of calculation.
$$X_c = \frac{\sum_{i=1}^{N} w_i \cdot X_i}{\sum_{i=1}^{N} w_i}. \qquad (4)$$
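Equation (4) is a weighted mean of the projected UAV positions, which can be sketched as follows. The function name and the tuple-based data layout are assumptions made for illustration.

```python
def virtual_centroid(positions, weights):
    """Equation (4): weighted mean of 2D UAV positions.

    positions: list of (x, y) tuples, one per UAV
    weights:   list of platform weights w_i, same order as positions
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("the sum of weights must be positive")
    x_c = sum(w * x for w, (x, _) in zip(weights, positions)) / total
    y_c = sum(w * y for w, (_, y) in zip(weights, positions)) / total
    return x_c, y_c
```

With two UAVs at (0, 0) and (4, 0) and weights 1 and 3, the centroid lies at (3, 0), that is, closer to the UAV with the larger weight.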
As illustrated in Figure 2 and described by Equations (5) and (6), the partition rays $Pr$
for the area partition are generated according to the included angle between the $j$th and
$(j+1)$th UAV, as seen from the virtual centroid. $\theta_{Pr_j}$ is the angle from the $x$
axis to $Pr_j$ in the counterclockwise direction.
$$\theta_{Pr_{j1}} / \theta_{Pr_{j2}} \propto w_{j+1} / w_j. \qquad (6)$$
Figure 2. The partitioning process of UAV j . Each UAV computes a virtual centroid, and a 3D self-
responsible area is partitioned by the nearest two planes P , defined by partition rays Pr and the axis
of the centroid.
For UAV$_j$, its exploring space $PA_j$ is bounded by two vertical planes $\mathcal{P}_j$ and
$\mathcal{P}_{j-1}$, which are defined by the partition rays and the axis of the centroid, see
Equation (7). As illustrated in Figure 2, the pink area in the sector space marks the
two-dimensional plane projection of the exploration space allocated ahead. The partition
area is formulated by Equation (8). After the $x$ and $y$ coordinates of a three-dimensional
point are converted to the polar coordinate system, the angle should be between
$\theta_{Pr_{j-1}}$ and $\theta_{Pr_j}$.
$$\mathcal{P}_j = \left\{ p \;\middle|\; \overrightarrow{\left(\cos\left(\theta_{Pr_j} + \frac{\pi}{2}\right), \sin\left(\theta_{Pr_j} + \frac{\pi}{2}\right), 0\right)} \cdot p = 0 \right\}, \qquad (7)$$
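The membership test described above (convert the $x$ and $y$ coordinates to a polar angle about the centroid, then check that it lies between the two partition-ray angles) can be sketched as below. This is an illustrative reading rather than the authors' implementation: the function names are assumptions, the $z$ coordinate is ignored because the bounding planes are vertical, and the modulo arithmetic is added so that sectors crossing the positive $x$ axis are handled.

```python
import math

TWO_PI = 2.0 * math.pi


def polar_angle(point, centroid):
    """Counterclockwise angle of `point` about `centroid`, in [0, 2*pi)."""
    dx = point[0] - centroid[0]
    dy = point[1] - centroid[1]
    return math.atan2(dy, dx) % TWO_PI


def in_partition(point, centroid, theta_prev, theta_next):
    """True if the point's polar angle lies between theta_prev and theta_next.

    theta_prev and theta_next are the partition-ray angles of the sector,
    measured counterclockwise from the x axis.
    """
    span = (theta_next - theta_prev) % TWO_PI
    offset = (polar_angle(point, centroid) - theta_prev) % TWO_PI
    return offset <= span
```

A 3D point such as (1, 1, 5) falls inside the sector from 0 to π/2 regardless of its height, which matches the idea of partitioning 3D space by its 2D projection.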
Figure 3. Distributed ray-cast mapping. In (a), the three small axes represent the UAVs, and the
big axis represents the origin of a unified coordinate system. The depth measurement is visualized by
point clouds of various colours, where blue denotes objects that are far away and red indicates those that
are in close proximity. (b) shows the map that is built simultaneously as an Octomap while the UAVs wander.
Throughout the process, the planner generates a tree $T$ consisting of nodes in the
free space of an OctoMap $\mathcal{M}$. Each node $n$ corresponds to a potential viewpoint, with
its state denoted by $\xi = (x, y, z, \psi)^T$, reflecting position and yaw. The tree is
constrained to remain in the collision-free space to guarantee safe planning. For UAV$_j$,
the best node $n^{j}_{best}$ is selected based on Equation (9), which considers the feasibility
of the path. The function $G(n)$ reflects the gain of the parent node, and a novel information
value (with related parameters) can be expressed as $g(\mathcal{M}, n, \mathcal{P}_j, \mathcal{P}_{j-1})$
in Equation (10). This optimization aims to minimize Equation (1) considering $\mathcal{M}$ for
information measurements and path planning, nodes corresponding to the path, and
$\mathcal{P}_j$, $\mathcal{P}_{j-1}$ aggregating team information to enable coordination.
$$n^{j}_{best} = \arg\max_{n \in T_j} G(n), \qquad (9)$$
and
$$g(\mathcal{M}, n, \mathcal{P}_j, \mathcal{P}_{j-1}) = \sum_{v \in FOV(\xi) \cap \mathcal{M}} V(v) \times e^{-\lambda c(\sigma) \prod f}. \qquad (11)$$
In this context, the function $V(v)$ takes a value of 1 if the voxel $v$ is unexplored, and
0 otherwise. Our objective is to explore the unknown space, and the visible voxels within
the field of view of the sensor are accumulated to form a basic visibility score. The cost of a
path from the initial node $n_0$ to $n$ is determined by the RRT algorithm and denoted by $c(\sigma)$.
To enable collaboration among the UAVs, the product function $\prod f$ is utilized, as shown
in Figure 4. Each plane $\mathcal{P}$ exerts a repulsive factor of $f = 1 + 1/d$, where $d$ is the
distance between the candidate node and the plane $\mathcal{P}$. The term $e^{-\lambda c(\sigma)}$
causes the visibility score to decrease smoothly as the path cost increases, while
$e^{-\prod (1 + 1/d)}$ drives the UAVs away from $\mathcal{P}_j$ and $\mathcal{P}_{j-1}$.
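The scoring of a candidate node can be sketched as follows. Two points are assumptions made here for illustration. First, the sum of $V(v)$ over visible voxels is passed in as a precomputed count of unexplored voxels. Second, the product of repulsive factors is placed inside the exponent together with $\lambda c(\sigma)$, which is one reading of the flattened formula that is consistent with the stated repulsive behaviour. The value of `lam` is also an assumed placeholder.

```python
import math


def repulsive_factor(d):
    """Factor f = 1 + 1/d exerted by one partition plane at distance d > 0."""
    if d <= 0:
        raise ValueError("distance to the plane must be positive")
    return 1.0 + 1.0 / d


def information_gain(n_unexplored_visible, path_cost, plane_distances, lam=0.5):
    """Visibility score discounted by path cost and proximity to the planes.

    n_unexplored_visible: number of unexplored voxels in the field of view
    path_cost:            cost c(sigma) of the path from n_0 to the node
    plane_distances:      distances d_j and d_{j-1} to the two bounding planes
    lam:                  discount coefficient lambda (assumed value)
    """
    product = 1.0
    for d in plane_distances:
        product *= repulsive_factor(d)
    return n_unexplored_visible * math.exp(-lam * path_cost * product)
```

Under this reading, two candidates with the same visibility and path cost are ranked by their distance to the partition planes, so nodes near a neighbour's boundary score lower.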
Figure 4. The planning process of UAV j . The d j and d j−1 are calculated to formulate the factor f j
and f j−1 .
During the first iteration, the algorithm initializes the root node $n_0$ of the tree $T$
as the current state of the UAV. In later iterations, the initial tree is set to the previous
best branch. The BestBranch is defined as the branch from $n_0$ to $n_{best}$. To ensure that
sufficient environmental information is obtained while generating nodes, the tree $T$ must
have a minimum of $N_{init}$ nodes, and the sampling loops continue while $G(n_{best}) = 0$,
that is, until a valuable solution is obtained. To prevent the UAV from idling due to
unreasonable area partitioning, the partition constraint is disabled when the number of
tree nodes $N_T$ exceeds a certain threshold $N_{threshold}$. Once $N_T > N_{MAX}$, the
exploration is deemed to be completed. In each iteration, the first segment of the
BestBranch is taken as the planned path. The weight $w$ can be updated using Equation (12)
according to Equation (2). Since $\mathcal{WS}$ is assumed to be given and $V_{\mathcal{WS}}$ is
constant and known, it is omitted.
$$w = N_e / V^{E}_{PA} \approx N_e \times \mathit{Sampleloop} / (V_{\mathcal{WS}} \times N_T). \qquad (12)$$
The algorithm presented in this paper is intended to be executed on each UAV inde-
pendently. The planner and the area partition calculation are interdependent, and a smaller
partition area can make it more difficult for the planner to generate an effective trajectory,
resulting in a smaller weight. In turn, a smaller weight can lead to a larger partition being
provided to the UAV during the partition area calculation.
5. Evaluation in Simulation
To demonstrate the superiority of the proposed method, we compared it with three
representative algorithms. The first was the greedy method for group application, which
did not employ cooperative settings (referred to as “greedy” in this context). The second [11]
was the classic method that discounted the information gain based on the repeating area
(referred to as “coordination” in this context). In this method, the gain was reduced based
on the area within the sensor range of the current best nodes of other UAVs. The third
method [7] used dynamic Voronoi partitions to assign different target locations to individual
UAVs, minimizing duplicated exploration areas (referred to as “Voronoi” in this context).
All of the three algorithms were decentralized and used RRT to generate candidate points.
Simulation experiments were conducted using Gazebo. All of the methods were tested
with the same virtual UAVs and environmental settings. Each UAV was equipped with
a depth camera that had a field of view of [60, 90]° in the vertical and horizontal directions.
For the indoor scenario, the camera was mounted with a downward pitch angle of 15°.
For the outdoor scenario, it was mounted with a pitch angle of 35°. For all of the simulation
experiments, the maximum velocities were set as $\dot{\psi}_{max} = 0.5$ rad/s and
$v_{max} = 0.25$ m/s, while the size of the collision detection box was assumed to be
0.5 × 0.5 × 0.3 m³.
Figure 5. Simulation environments and their exploration results. (a) A 20 × 12 × 3 m³ indoor scenario,
a regular single-story space; (b) a 40 × 40 × 9 m³ outdoor scenario, a typical urban community.
The colored points on the floor represent the initial positions of UAVs, with red indicating 2 UAVs,
orange for 3 UAVs, yellow for 4 UAVs, and green for 5 UAVs. (c,d) are the exploration results
for both scenarios. (The following figures in this paper have the same meaning, where the spatial
structure and depth information are depicted using grids of different colours.)
[Figure 6 plots. (a) Exploration Time (s) versus Number of UAVs (2 to 5) for the Greedy, Coordination, Voronoi, and Proposed methods. (b) Exploration Completion (%) versus Time (s); legend: Proposed, Voronoi, Classic, Baseline.]
Figure 6. Numerical analysis of indoor simulation. (a) shows the exploration time of a team with 2, 3,
4, and 5 UAVs. (b) shows the team of 5 UAVs in one trial. Three other algorithms are compared.
[Figure 7 plots. (a) Horizontal axis: Number of UAVs (2 to 5); legend: Greedy, Coordination, Voronoi. (b) Horizontal axis: Time (s); legend: Proposed, Voronoi, Classic, Baseline.]
Figure 7. Numerical analysis of outdoor simulation. (a) shows the exploration time of the team with
2, 3, 4, and 5 UAVs. (b) shows the team of 5 UAVs in one trial. Three other algorithms are compared.
For each planner, 20 trials were conducted for teams consisting of 2, 3, 4, and 5 UAVs,
starting from the same initial positions, where the relative distances between UAVs were
less than 100 cm. As in the indoor scenario, the results (shown in Figure 7) indicate that
the proposed method outperforms the Voronoi, coordination, and greedy methods, with a
mean completion time of 376.7 s for a team of 5 UAVs. In contrast, the Voronoi,
coordination, and greedy methods gave mean completion times of 428.8 s, 657.8 s, and
771.5 s, respectively. The experiment also shows that as the scenario size increases,
the greedy method, which lacks cooperation, becomes increasingly random and
inefficient. The variance of the results is significantly larger in the outdoor scenario.
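To put the reported outdoor completion times in relative terms, the short calculation below derives the fractional reduction in mean completion time for the 5-UAV team. The four time values are taken from the text above; the helper name is an illustrative assumption.

```python
def relative_reduction(baseline, proposed):
    """Fraction by which `proposed` is shorter than `baseline`."""
    return (baseline - proposed) / baseline


# Mean completion times in seconds for a team of 5 UAVs (outdoor scenario).
proposed_time = 376.7
baseline_times = {"Voronoi": 428.8, "coordination": 657.8, "greedy": 771.5}

for name, t in baseline_times.items():
    pct = 100.0 * relative_reduction(t, proposed_time)
    print(f"{name}: proposed method is {pct:.1f}% faster")
```

The reductions come out at roughly 12%, 43%, and 51% relative to the Voronoi, coordination, and greedy methods, respectively.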
In both scenarios, the rate of exploration decreases over time. This trend can be
attributed to the fact that at the start of the exploration there are many unknown
areas, and the UAVs can find enough task points with fewer sampling loops.
Regardless of the efficiency of the planning algorithm, the UAVs can detect unknown
spaces that were not included in the plan, which leads to efficient exploration
at the beginning. As the environment is continuously explored, the unknown area
decreases, and the time required to calculate the targets becomes longer. With the
receding horizon method in particular, the UAVs often revisit previously optimal targets
because they are confined to local optima, which slows down the rate of exploration at
the later stage.
Figure 8. The practical experiments. (a) shows the initial status of three UAVs, which are placed on the
same side of a room. (b) shows the three UAVs performing exploration in one trial; a 10 × 8 × 3 m³
virtual boundary is set to bound the exploring space.
Figure 9. Robust coordination case. (a) shows that the UAV represented by the red arrow has stopped
exploring due to insufficient power at an early stage; (b) shows that the other UAVs continue to finish the
task. The yellow one helps the red one to explore the bottom right corner of this environment.
Figure 10. The mapping process of one practical experiment. The sampling times display the
complete process of three UAVs collaborating on exploration, with each UAV’s designated area being
continuously updated. The UAV represented by the yellow icon on the left gradually moves towards
the lower region, collaborating with the other UAVs to adjust the exploration area. In the final map,
gaps on the ground were influenced by the depth camera’s perception range. It is assumed that using
more powerful sensors such as 3D LiDAR [29] may mitigate this phenomenon, but this approach
necessitates further consideration of the experimental system’s applicability.
Author Contributions: Conceptualization, J.G. and B.D.; methodology, J.G. and T.Y.; software, T.Y.;
validation, T.Y., J.G. and X.Z.; formal analysis, T.Y. and J.G.; investigation, J.G.; resources, X.Z. and
W.Y.; data curation, T.Y. and J.G.; writing—original draft preparation, T.Y.; writing—review and
editing, J.G.; visualization, J.G.; supervision, B.D. and W.Y.; project administration, B.D.; funding
acquisition, B.D. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported by the National Natural Science Foundation of China (No. 61902423).
Data Availability Statement: Data available on request from the authors.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Cieslewski, T.; Kaufmann, E.; Scaramuzza, D. Rapid exploration with multi-rotors: A frontier selection method for high speed
flight. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC,
Canada, 24–28 September 2017; pp. 2135–2142. [CrossRef]
2. Wang, C.; Chi, W.; Sun, Y.; Meng, M.Q.H. Autonomous Robotic Exploration by Incremental Road Map Construction. IEEE Trans.
Autom. Sci. Eng. 2019, 16, 1720–1731. [CrossRef]
3. Xu, Z.; Deng, D.; Shimada, K. Autonomous UAV Exploration of Dynamic Environments Via Incremental Sampling and
Probabilistic Roadmap. IEEE Robot. Autom. Lett. 2021, 6, 2729–2736. [CrossRef]
4. Jung, S. Bridge Inspection Using Unmanned Aerial Vehicle Based on HG-SLAM: Hierarchical Graph-Based SLAM. Remote Sens.
2020, 12, 3022. [CrossRef]
5. Wang, J.; Wu, Y.X.; Chen, Y.Q.; Ju, S. Multi-UAVs collaborative tracking of moving target with maximized visibility in urban
environment. J. Frankl. Inst. 2022, 359, 5512–5532. [CrossRef]
6. Pan, T.; Gui, J.; Dong, H.; Deng, B.; Zhao, B. Vision-Based Moving-Target Geolocation Using Dual Unmanned Aerial Vehicles.
Remote Sens. 2023, 15, 389. [CrossRef]
7. Hu, J.; Niu, H.; Carrasco, J.; Lennox, B.; Arvin, F. Voronoi-Based Multi-Robot Autonomous Exploration in Unknown Environments
via Deep Reinforcement Learning. IEEE Trans. Veh. Technol. 2020, 69, 14413–14423. [CrossRef]
8. Simmons, R.; Apfelbaum, D.; Burgard, W.; Fox, D.; Moors, M.; Thrun, S.; Younes, H. Coordination for Multi-Robot Exploration and
Mapping; AAAI Press: Palo Alto, CA, USA 2020; pp. 852–858.
9. Yu, J.; Tong, J.; Xu, Y.; Xu, Z.; Dong, H.; Yang, T.; Wang, Y. SMMR-Explore: SubMap-based Multi-Robot Exploration System
with Multi-robot Multi-target Potential Field Exploration Method. In Proceedings of the 2021 IEEE International Conference on
Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 8779–8785. [CrossRef]
10. Masaba, K.; Li, A.Q. GVGExp: Communication-Constrained Multi-Robot Exploration System based on Generalized Voronoi
Graphs. In Proceedings of the 2021 International Symposium on Multi-Robot and Multi-Agent Systems (MRS), Cambridge, UK,
4–5 November 2021; pp. 146–154. [CrossRef]
11. Mannucci, A.; Nardi, S.; Pallottino, L. Autonomous 3D Exploration of Large Areas: A Cooperative Frontier-Based Approach. In
Proceedings of the Modelling and Simulation for Autonomous Systems, Rome, Italy, 24–26 October 2017; Mazal, J., Ed.; Springer
International Publishing: Cham, Switzerland, 2018; pp. 18–39. [CrossRef]
12. Bircher, A.; Kamel, M.; Alexis, K.; Oleynikova, H.; Siegwart, R. Receding Horizon “Next-Best-View” Planner for 3D Exploration.
In Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 16–21 May
2016; pp. 1462–1468. [CrossRef]
13. Lindqvist, B.; Agha-Mohammadi, A.A.; Nikolakopoulos, G. Exploration-RRT: A multi-objective Path Planning and Exploration
Framework for Unknown and Unstructured Environments. In Proceedings of the 2021 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; pp. 3429–3435. [CrossRef]
14. Hornung, A.; Wurm, K.M.; Bennewitz, M.; Stachniss, C.; Burgard, W. OctoMap: An efficient probabilistic 3D mapping framework
based on octrees. Auton. Robot. 2013, 34, 189–206. [CrossRef]
15. Yu, T.; Deng, B.; Gui, J.; Zhu, X.; Yao, W. Efficient Informative Path Planning via Normalized Utility in Unknown Environments
Exploration. Sensors 2022, 22, 8429. [CrossRef] [PubMed]
16. Ju, S.; Wang, J.; Dou, L. MPC-Based Cooperative Enclosing for Nonholonomic Mobile Agents Under Input Constraint and
Unknown Disturbance. IEEE Trans. Cybern. 2023, 53, 845–858. [CrossRef]
17. Fox, D.; Jonathan, K.O.; Konolige, K.; Limketkai, B.; Schulz, D.; Stewart, B. Distributed Multirobot Exploration and Mapping.
Proc. IEEE 2006, 94, 1325–1339. [CrossRef]
18. Tang, Y.; Chen, Y.; Zhou, D. Measuring uncertainty in the negation evidence for multi-source information fusion. Entropy 2022,
24, 1596. [CrossRef] [PubMed]
19. Hardouin, G.; Moras, J.; Morbidi, F.; Marzat, J.; Mouaddib, E.M. Next-Best-View planning for surface reconstruction of large-scale
3D environments with multiple UAVs. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and
Systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 1567–1574. [CrossRef]
20. Oleynikova, H.; Taylor, Z.; Fehr, M.; Siegwart, R.; Nieto, J. Voxblox: Incremental 3D Euclidean Signed Distance Fields for
on-board MAV planning. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 1366–1373. [CrossRef]
21. Li, H.; Tsukada, M.; Nashashibi, F.; Parent, M. Multivehicle Cooperative Local Mapping: A Methodology Based on Occupancy
Grid Map Merging. IEEE Trans. Intell. Transp. Syst. 2014, 15, 2089–2100. [CrossRef]
22. Corah, M.; O’Meadhra, C.; Goel, K.; Michael, N. Communication-Efficient Planning and Mapping for Multi-Robot Exploration in
Large Environments. IEEE Robot. Autom. Lett. 2019, 4, 1715–1721. [CrossRef]
23. Schmid, L.; Reijgwart, V.; Ott, L.; Nieto, J.; Siegwart, R.; Cadena, C. A Unified Approach for Autonomous Volumetric Exploration
of Large Scale Environments Under Severe Odometry Drift. IEEE Robot. Autom. Lett. 2021, 6, 4504–4511. [CrossRef]
24. Zhou, B.; Zhang, Y.; Chen, X.; Shen, S. FUEL: Fast UAV Exploration Using Incremental Frontier Structure and Hierarchical
Planning. IEEE Robot. Autom. Lett. 2021, 6, 779–786. [CrossRef]
25. Lee, E.M.; Choi, J.; Lim, H.; Myung, H. REAL: Rapid Exploration with Active Loop-Closing toward Large-Scale 3D Mapping
using UAVs. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague,
Czech Republic, 27 September–1 October 2021; pp. 4194–4198. [CrossRef]
26. Zhu, H.; Cao, C.; Xia, Y.; Scherer, S.; Zhang, J.; Wang, W. DSVP: Dual-Stage Viewpoint Planner for Rapid Exploration by Dynamic
Expansion. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague,
Czech Republic, 27 September–1 October 2021; pp. 7623–7630. [CrossRef]
27. Schmid, L.; Pantic, M.; Khanna, R.; Ott, L.; Siegwart, R.; Nieto, J. An Efficient Sampling-Based Method for Online Informative
Path Planning in Unknown Environments. IEEE Robot. Autom. Lett. 2020, 5, 1500–1507. [CrossRef]
28. Charrow, B.; Kahn, G.; Patil, S.; Liu, S.; Goldberg, K.; Abbeel, P.; Michael, N.; Kumar, V. Information-Theoretic Planning with
Trajectory Optimization for Dense 3D Mapping. In Proceedings of the Robotics: Science and Systems, Rome, Italy, 13–17 July
2015; Volume 11, pp. 3–12. [CrossRef]
29. Li, S.; Tian, B.; Zhu, X.; Gui, J.; Yao, W.; Li, G. InTEn-LOAM: Intensity and Temporal Enhanced LiDAR Odometry and Mapping.
Remote Sens. 2022, 15, 242. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
drones
Article
BCDAIoD: An Efficient Blockchain-Based Cross-Domain
Authentication Scheme for Internet of Drones
Gongzhe Qiao 1 , Yi Zhuang 1, *, Tong Ye 1 and Yuan Qiao 2
1 College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics,
Nanjing 211100, China; qgz@nuaa.edu.cn (G.Q.); yetong@nuaa.edu.cn (T.Y.)
2 Harbin Electric Power Bureau, STATE GRID Corporation of China, Harbin 150050, China;
qiaoyuan_qy@163.com
* Correspondence: zy16@nuaa.edu.cn
Abstract: During long-distance flight, unmanned aerial vehicles (UAVs) need to perform cross-
domain authentication to prove their identity and receive information from the ground control station
(GCS). However, the GCS needs to verify all drones arriving at the area it is responsible for, which
leads to the GCS being unable to complete authentication in time when facing cross-domain requests
from a large number of drones. Additionally, due to potential threats from attackers, drones and GCSs
are likely to be deceived. To improve the efficiency and security of cross-domain authentication, we
propose an efficient blockchain-based cross-domain authentication scheme for the Internet of Drones
(BCDAIoD). By using a consortium chain with a multi-chain architecture, the proposed method
can query and update different types of data efficiently. By mutual authentication before cross-
domain authentication, drones can compose drone groups to lighten the authentication workload
of domain management nodes. BCDAIoD uses the notification mechanism between domains to
enable path planning for drones in advance, which can further improve the efficiency of cross-domain
authentication. The performance of BCDAIoD was evaluated through experiments. The results
show that the cross-domain authentication time cost and computational overhead of BCDAIoD are
significantly lower than those of existing methods when the number of drones is large.
Keywords: blockchain; Internet of Drone; cross-domain authentication; security protocol;
multi-UAVs; group policy
Citation: Qiao, G.; Zhuang, Y.; Ye, T.; Qiao, Y. BCDAIoD: An Efficient Blockchain-Based Cross-Domain
Authentication Scheme for Internet of Drones. Drones 2023, 7, 302.
https://doi.org/10.3390/drones7050302
Academic Editors: Zhihong Liu, Shihao Yan, Yirui Cong and Kehao Wang
Received: 3 April 2023
Revised: 28 April 2023
Accepted: 3 May 2023
Published: 4 May 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
The increasing demand for unmanned aerial vehicles (UAVs) and Internet of Drones
(IoD) in civil and military applications has been noticed in the last few decades [1]. IoD
usually consists of a number of drones, ground control stations (GCS), and a communication
network for data exchange. Drones can obtain information through sensors and exchange
information through the network. They can also communicate with GCS and other drones
for proper navigation [2]. The available space of the air traffic network (ATN) is much
larger than that of the ground traffic network (GTN). Reasonable use of the ATN can reduce
the burden of the GTN. Furthermore, using ATN can also avoid the congestion of the GTN.
Therefore, many companies try to use drones as air transport tools for cargo transportation,
so as to promote logistics efficiency [3].
During long-distance flights, drones are likely to enter other IoD domains where
they need to pass identity authentication and obtain necessary information such as flight
routes to continue their tasks. In the IoD, wireless communication between drones is
easy to eavesdrop on, resulting in information leakage [4–7]. Furthermore, attackers can
conduct replay attacks or identity forgery to interrupt the IoD [8–10]. In addition, once
drone transportation forms a complete industry, the number of drones may be quite large.
The resource overhead of complex authentication mechanisms, such as computation and
storage, is a huge challenge for identity authentication servers [11–13]. At the same time, it
also increases the cross-domain waiting time for drones to obtain necessary information.
Centralized authentication techniques [14–16] are widely adopted in traditional in-
formation systems. However, in an IoD environment, if conventional centralized au-
thentication approaches are applied, the workload of the authentication service center
increases exponentially as the system scales up [17]. Therefore, traditional authentica-
tion technology is not suitable for the IoD environment. At present, there are also some
studies using public key infrastructure (PKI), trusted third party (TTP), or blockchain
technology to verify the identity of the drone and the authenticity of the task [18,19]. The
existing methods are shown in Figure 1a. However, the existing studies rarely consider the
efficiency of cross-domain authentication in the case of a large number of drones. Addition-
ally, when the number of drones is large, the time cost of cross-domain authentication is
also very high.
Figure 1. Comparison of existing methods and this paper. (a) Ideas of existing methods. (b) The idea
of the method proposed in this paper.
authentication delay time and increase the waiting time of drones. (3) The authentication
server needs to authenticate the UAV identity information, verify the transaction in the
blockchain, and upload the task information. A large number of drones to be authenticated
may cause server interruptions.
To address the above challenges, we propose an efficient cross-domain authentication
scheme based on blockchain. As shown in Figure 1b, the main idea is that drones should
authenticate each other before cross-domain authentication to form a mutually endorsed
group. To ensure that the identities and tasks of drones in each group are real and trusted,
we designed an authentication mechanism based on domain signature, encryption, and
domain private chain. In this way, drones with the same out-of-domain range (Rout)
compose a drone group. Considering the continuity of drone tasks in the process of cross-
domain tasks, we designed a notification mechanism between domains combined with
the concepts of token, permission, and authority. The main contributions of this paper
are as follows.
(1) We propose an efficient blockchain-based cross-domain authentication scheme
for the Internet of Drones (BCDAIoD). By using a consortium chain with a multi-chain
architecture, the proposed method can query and update different types of data efficiently,
which can also facilitate the domain management node to manage and control the drones.
Additionally, we describe the mission model of the drones.
(2) In order to improve the efficiency of the cross-domain authentication of drones, we
designed an establishment method of drone groups and a group cross-domain authentica-
tion method based on blockchain, encryption, and challenge–response game. By mutual
authentication before cross-domain authentication, drones can compose drone groups
to lighten the authentication workload of domain management nodes and improve the
efficiency of cross-domain authentication.
(3) We propose a notification mechanism between domains that can enable the man-
agement node of the next domain to know the task information of the drones in ad-
vance. The management node of the next domain can plan space resources reasonably
and plan the flight path for drones in advance, which can also ensure the continuity of the
tasks of drones.
The remainder of this article is organized as follows. The literature is reviewed in
Section 2. The framework of BCDAIoD, the consortium blockchain architecture, and the
mission model of the drones are presented in Section 3. In Section 4, we first propose the
single-drone cross-domain authentication method. Then, we propose the establishment
mechanism of drone groups, the drone group cross-domain authentication method, and the
notification mechanism between domains. Section 5 describes the simulation experiments
and shows the experimental evaluation results of the BCDAIoD method. In Section 6, we
analyze the security of BCDAIoD. Finally, Section 7 concludes this paper.
2. Related Works
(1) IoD management scheme
IoD has received widespread attention due to its potential application prospects.
Therefore, there are studies targeting the management scheme of IoD. In order to facilitate
the information acquisition of drones and users in IoD, Al-Hilo et al. [22] proposed a
collaborative and management framework between UAVs and roadside units. Arafeh
et al. [23] proposed a blockchain-based UAV management method that can verify the
authenticity of information in IoD networks. By using blockchain and trust policies, García-
Magariño et al. [24] proposed a UAV management approach which can also maintain
security in IoD by corroborating information about events from different sources. However,
the above UAV management frameworks mainly focus on data security and neglect
management efficiency when facing a large number of drones. Additionally, the blockchain-based
methods have not been able to reasonably partition the data storage on the chain,
leading to low query efficiency and data isolation.
3. Overview of BCDAIoD
3.1. The BCDAIoD Framework
To improve the efficiency of data query, BCDAIoD uses a multi-chain architecture to
reasonably partition the data storage on the consortium blockchain. Additionally, domain
private chains are used to ensure the anonymity of drones. The framework of the proposed
BCDAIoD scheme is shown in Figure 2. The BCDAIoD framework includes four layers: an
application layer, service support layer, data storage layer, and network layer.
In the network layer, a P2P network is used for communication between consortium
nodes (CNs) and UAVs. Additionally, CNs can transmit information through the P2P
network, such as mission information. Each CN manages a certain domain and maintains a
private chain. The UAVs can also communicate with each other through the P2P network.
In the data storage layer, the UAV device information, task information, address
information of CNs, smart contracts, and user registration information are stored in the
consortium blockchain (CBC) in a specified format. At the same time, the CNs store the
details of the above information in local databases. The architecture of the consortium
blockchain is described in Section 3.2. For identity authentication and UAV grouping, the
CNs also maintain a private chain to store the device ID on PBC (Pid) and the out-of-domain
range (Rout) information of UAVs.
The service support layer provides support for users, CNs, and UAVs to interact with
the data storage layer and the network layer. It mainly includes the consensus mechanism,
smart contract, identity authentication mechanism, access control strategy, path planning
algorithm, and communication protocol. The consensus mechanism can ensure that the
information of each block is consistent. The smart contract can conduct trusted transactions
in the form of commitment without a trusted third party. The identity authentication
mechanism and access control strategy can ensure that UAVs enter the correct domains and
obtain their own path information. By using the path planning algorithm, the CNs obtain
the next domain IDs and calculate the paths in their domains for UAVs. The communication
protocol supports the communication among users, UAVs, and CNs.
By using the functions of the application layer, users can submit registration appli-
cations to the CNs, release tasks, and query the status of tasks. The CNs can manage the
information of the UAV devices, tasks, and paths, as well as perform identity authentication
and path planning. The UAVs can query task information, view path information, and
submit cross-domain authentication requests to perform tasks. To facilitate the introduction
of subsequent methods, Table 1 lists the key symbols and their definitions.
Table 1. Cont.
The Mission chain mainly stores task information, and the data structure can be
expressed as Equation (1), where Mid is the current task ID; Did is the drone’s ID on the
CBC that can be provided to or queried by the CNs; CNidn is the ID of the CN that is
responsible for the next domain; CNidd is the ID of the CN that manages the destination
domain; CNidc is the CN ID of the current domain; Rout is the expected range out of the
current domain; and Tout represents the current expected time out of the domain.
The User chain mainly stores the necessary information of the registered users. The
stored information can be expressed as Equation (2), where Uid is the user ID for the
consortium authentication, CNidr is the CN ID of the registration place, and Fu is the
current user account balance. The other registration details of the user are directly stored in
the local database of the registration place’s CN.
The Address chain mainly stores the domain range information of the CN, which can
be expressed as Equation (3), where CNid represents the ID of a CN; PKcn represents the
public key of the CN; and RGcn represents the domain range of the CN.
The Devices chain mainly stores the necessary information of the UAVs registered in the
CBC, which can be expressed as Equation (4), where $PK_d$ is the public key of a UAV; $Mod$
is the model of the UAV; and $Pol$ is the execution strategy of the UAV. The other registration
details of the UAV can be stored directly in the local database of the registration
place's CN.
$$Devices = (Did,\ PK_d,\ CNid_r,\ Mod,\ Pol) \tag{4}$$
The Contract chain mainly stores the contract information, which can be expressed as
Equation (5), where Cid is the current contract ID, V is the current contract version, and
Cont is the current contract content.
The PBC chain mainly stores the necessary information for the intradomain authen-
tication of the UAVs, which can be expressed as Equation (6), where Pid represents the
temporary ID of a UAV in a domain.
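The record layouts described above can be sketched as plain data types. This is a minimal illustration, assuming the field order given in the prose; the Python names, the field types, and the in-memory `MultiChainLedger` class are hypothetical stand-ins rather than the authors' Hyperledger Fabric implementation. The PBC record fields other than Pid are taken from the PBC transaction used later in Algorithm 2.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MissionRecord:
    mid: str      # Mid: current task ID
    did: str      # Did: drone ID on the CBC
    cnid_n: str   # CNidn: CN responsible for the next domain
    cnid_d: str   # CNidd: CN managing the destination domain
    cnid_c: str   # CNidc: CN of the current domain
    r_out: str    # Rout: expected range out of the current domain (type assumed)
    t_out: float  # Tout: expected time out of the domain (type assumed)

@dataclass(frozen=True)
class UserRecord:
    uid: str      # Uid: user ID for consortium authentication
    cnid_r: str   # CNidr: CN of the registration place
    f_u: float    # Fu: current account balance

@dataclass(frozen=True)
class AddressRecord:
    cnid: str     # CNid: ID of a CN
    pk_cn: str    # PKcn: public key of the CN
    rg_cn: str    # RGcn: domain range of the CN (type assumed)

@dataclass(frozen=True)
class DeviceRecord:
    did: str      # Did
    pk_d: str     # PKd: public key of the UAV
    cnid_r: str   # CNidr
    mod: str      # Mod
    pol: str      # Pol: execution strategy

@dataclass(frozen=True)
class ContractRecord:
    cid: str      # Cid: contract ID
    v: int        # V: contract version
    cont: str     # Cont: contract content

@dataclass(frozen=True)
class PbcRecord:
    pid: str      # Pid: temporary ID of a UAV in a domain
    r_out: str    # Rout (assumed field, from the PBC transaction in Algorithm 2)
    pk_d: str     # PKd (assumed field, from the PBC transaction in Algorithm 2)

class MultiChainLedger:
    """In-memory stand-in for the multi-chain CBC: one append-only list per chain."""
    CHAINS = ("Mission", "User", "Address", "Devices", "Contract")

    def __init__(self):
        self.chains = {name: [] for name in self.CHAINS}

    def append(self, chain, record):
        self.chains[chain].append(record)

    def query(self, chain, **fields):
        # Only the named chain is scanned, which is the point of the partition.
        return [r for r in self.chains[chain]
                if all(getattr(r, k) == v for k, v in fields.items())]
```

Because each data type lives on its own chain, a lookup such as `ledger.query("Devices", did="D2")` never has to scan task or user records.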
denotes the detailed information of $d_j$, such as drone model, maximum sailing distance,
and payload.
Step 2. After receiving the registration request, $cn_i$ determines the acceptable task type
of $d_j$ ($Pol_{d_j}$), such as selecting tasks with reasonable distance according to
power consumption.
Step 3. Then, $cn_i$ sends the registration request $REG' = (Rrequest,\ mac_{d_j},\ CNid_i,\ Pol_{d_j})$
to the other nodes in the CBC.
Step 4. After the members of the CBC reach a consensus, the CNs assign a device ID
in the CBC ($Did_{d_j}$) to $d_j$ and update the Devices chain. Furthermore, $cn_i$ calculates the
public key ($PK_{d_j}$) and private key ($SK_{d_j}$) for $d_j$. Then, $cn_i$ sends $SK_{d_j}$ to $d_j$.
Step 5. $cn_i$ sends $Did_{d_j}$ and its public key $PK_{cn_i}$ to $d_j$. Additionally, $cn_i$ calculates
$Enc_{SK_{cn_i}}(Did_{d_j})$ and sends it to $d_j$. Then, $cn_i$ stores the registration time, model, and other
detailed information of $d_j$ in the local database.
| Symbol | Definition |
| Flight path | The flight path in the current domain. |
| $Pid_{d_k}$ | Device ID of $d_k$ on the PBC. |
| $SK_{d_k}$ | Private key of $d_k$. |
| $PK_{cn_i}$ | Public key of $cn_i$ for intradomain authentication between drones. |
| $PK_{cn_n}$ | Public key of $cn_n$ (the CN of the next domain) for cross-domain authentication. |
| $Enc_{SK_{cn_i}}(Did_{d_k})$ | The identification for cross-domain authentication. |
Step 6. Finally, $cn_i$ stores the detailed information in the local database and removes
the drone $d_k$ from the available drones of the local database.
Furthermore, we make the following assumptions:
(1) We consider that each CN has the public keys of the other CNs, and the private
key of each CN is not leaked; (2) The private key of the drone carrying out the task is not
leaked; (3) Attackers cannot deduce the private key from the public key, or it takes too
much time.
In this way, $cn_n$ can determine the identity and task information of $d_j$. Then, $cn_n$
sends the Token to $d_j$. The Token of $d_j$ can be expressed as Equation (8), where $Pid'_{d_j}$
represents the device ID of $d_j$ in $Domain_n$, $T_{stamp}$ represents the time stamp of the
Token, $P_{d_j}$ represents the permission that $d_j$ has for obtaining the necessary information,
and $hash(Pid'_{d_j} \| T_{stamp} \| P_{d_j})$ represents the hash value of the combination of $Pid'_{d_j}$,
$T_{stamp}$, and $P_{d_j}$.
$$Token = \left( hash(Pid'_{d_j} \| T_{stamp} \| P_{d_j}),\ Pid'_{d_j},\ T_{stamp},\ P_{d_j} \right) \tag{8}$$
Finally, $d_j$ obtains its $Pid'_{d_j}$ and uses the Token to obtain the flight path and other
necessary information from the CN of the next domain. At the same time, $cn_n$ updates the
Mission chain by using the method proposed in Section 4.4.
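The Token structure of Equation (8) can be illustrated with a short sketch. The source does not name the hash function, the field encoding, or a validity window, so SHA-256, the `"||"` string separator, and the `max_age_s` parameter below are assumptions made only for illustration.

```python
import hashlib
import hmac
import time

def _digest(pid, t_stamp, permission):
    # hash(Pid' || T_stamp || P): SHA-256 and the "||" separator are assumptions.
    payload = "||".join([pid, str(t_stamp), permission]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def make_token(pid, permission, timestamp=None):
    """Build Token = (hash(Pid' || T_stamp || P), Pid', T_stamp, P)."""
    t_stamp = int(time.time()) if timestamp is None else int(timestamp)
    return (_digest(pid, t_stamp, permission), pid, t_stamp, permission)

def verify_token(token, max_age_s, now=None):
    """Check that the hash matches the fields and the time stamp is still fresh."""
    digest, pid, t_stamp, permission = token
    now = time.time() if now is None else now
    fresh = 0 <= now - t_stamp <= max_age_s
    intact = hmac.compare_digest(digest, _digest(pid, t_stamp, permission))
    return fresh and intact
```

A bare hash only detects accidental or naive modification of the fields; in the scheme the Token is delivered over the authenticated exchange with $cn_n$, which is what ties it to the drone.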
In the UAV cargo transportation scenario, there are many UAVs flying to the same
next domain. Therefore, we propose a method of drone group cross-domain authentication
to improve the efficiency of UAV cross-domain authentication. The idea of this method
is that UAVs compose a group through mutual authentication before the cross-domain
authentication. In this way, the proposed method can lighten the authentication workload
of the CN and improve the speed of the UAV cross-domain authentication. The method is
mainly divided into two stages: (1) the formation of a UAV group (in Section 4.2), and (2)
the cross-domain authentication of a UAV group (in Section 4.3).
When drone $d_j$ and drone $d_n$ start the verification, $d_j$ first sends its verification
information $M_{d_j}$ to $d_n$. $M_{d_j}$ can be expressed as Equation (9), where $Taut$ represents the
two-way authentication request, $Enc_{SK_{cn_i}}(Pid_{d_j})$ represents the Pid of $d_j$ encrypted with
the private key of the current domain CN, and $PBH_{d_j}$ represents the PBC block height of
the block that includes the task information of $d_j$.
$$M_{d_j} = \left( Taut,\ Enc_{SK_{cn_i}}(Pid_{d_j}),\ PBH_{d_j} \right) \tag{9}$$
After receiving the verification information from $d_j$, $d_n$ uses $PK_{cn_i}$ to decrypt
$Enc_{SK_{cn_i}}(Pid_{d_j})$ and obtain $Pid_{d_j}$. Then, $d_n$ searches the local PBC according to $PBH_{d_j}$
and queries whether the corresponding information is there. If $PBH_{d_j}$ is bigger than the
block height of the local PBC, it updates the PBC from the CN. After that, $d_n$ obtains the
public key ($PK_{d_j}$) of $d_j$ from the PBC. Then, a random number $x$ is encrypted by $PK_{d_j}$ as a
challenge $ack$, and $d_n$ sends its $M_{d_n}$ and $ack$ to $d_j$.
After receiving $M_{d_n}$ and $ack$, $d_j$ decrypts $Enc_{SK_{cn_i}}(Pid_{d_n})$ to obtain $Pid_{d_n}$.
Additionally, $d_j$ searches the local PBC according to $PBH_{d_n}$ and queries whether the
corresponding information is there. If $PBH_{d_n}$ is bigger than the block height of the local PBC, it
updates the PBC from the CN. Then, $d_j$ obtains the public key ($PK_{d_n}$) of $d_n$ from the PBC.
Additionally, $d_j$ decrypts the $ack$ with $SK_{d_j}$ to obtain $x$, and sends back $x + 1$ and a random
number $y$ encrypted with $PK_{d_n}$ as a response, $rsp_j = Enc_{PK_{d_n}}(x + 1 \| y)$. According to
the notification mechanism between domains described in Section 4.4, local PBCs saved
by drones store task information for a period of time in the future, and the PBCs can be
updated by drones after mutual authentication. Therefore, in theory, drones do not need or
rarely need to update blocks through the CN, and they only need to update blocks through
the CN at most once during a two-way authentication period.
Next, $d_n$ decrypts $rsp_j$ with $SK_{d_n}$ and checks whether $x + 1$ is received within a certain
period of time to determine whether $d_j$ has the declared identity. Then, $d_n$ obtains $y$ and
sends back a response, $rsp_n = Enc_{PK_{d_j}}(y + 1)$, to $d_j$. After receiving $rsp_n$, $d_j$ decrypts
$rsp_n$ with $SK_{d_j}$ and checks the response. After successful identity authentication, the
session key between $d_j$ and $d_n$ can be generated by $ks = H(x \| y)$. Then, $d_j$ and $d_n$ can
communicate with each other and update their PBCs.
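The challenge–response exchange above can be traced end to end in a small sketch. It assumes toy textbook RSA built from fixed Mersenne primes as a stand-in for the paper's public-key scheme (the experiments use a pairing library), SHA-256 for $H$, and a 64-bit shift to pack $x + 1 \| y$; none of these choices come from the source, the toy keys are not secure, and the time-window check on the response is omitted.

```python
import hashlib
import secrets

def make_keypair(p, q, e=65537):
    """Toy textbook RSA key pair from two primes (illustration only, not secure)."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (n, e), (n, pow(e, -1, phi))   # (public key, private key)

def enc(key, m):
    """Modular exponentiation with the given (modulus, exponent) key."""
    n, exp = key
    return pow(m, exp, n)

dec = enc  # textbook RSA decryption is the same operation with the private exponent

def session_key(x, y):
    """ks = H(x || y), with SHA-256 assumed for H."""
    return hashlib.sha256(x.to_bytes(8, "big") + y.to_bytes(8, "big")).hexdigest()

def two_way_auth(pub_j, priv_j, pub_n, priv_n):
    """Run the exchange; return (ks at d_j, ks at d_n), or None if a check fails."""
    # d_n -> d_j: challenge ack = Enc_{PK_dj}(x)
    x = secrets.randbits(32)
    ack = enc(pub_j, x)
    # d_j -> d_n: rsp_j = Enc_{PK_dn}(x + 1 || y)
    x_seen = dec(priv_j, ack)
    y = secrets.randbits(32)
    rsp_j = enc(pub_n, ((x_seen + 1) << 64) | y)
    # d_n checks x + 1, then answers with rsp_n = Enc_{PK_dj}(y + 1)
    packed = dec(priv_n, rsp_j)
    if packed >> 64 != x + 1:
        return None
    y_seen = packed & (2**64 - 1)
    rsp_n = enc(pub_j, y_seen + 1)
    # d_j checks y + 1
    if dec(priv_j, rsp_n) != y + 1:
        return None
    return session_key(x_seen, y), session_key(x, y_seen)
```

With matching key pairs both sides derive the same session key; if $d_j$ does not hold the private key that matches the public key stored on the PBC, the $x + 1$ check fails and the function returns `None`.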
By using the proposed verification strategy, drones can confirm each other’s identity,
generate session keys, and update the PBCs. In the process of moving, drones try to join a
group or build a new one, as shown in Figure 7.
After receiving $GCM_{d_l}$ from $d_l$, $cn_n$ verifies the group leader through the single-
drone cross-domain authentication method proposed in Section 4.1. Then, $cn_n$ obtains
$GList_{d_l}$. If $d_l$ passes validation, $cn_n$ sends $d_l$ a Token. After receiving the Token, $d_l$ sends a
group cross-domain signal to the drone group. Then, the other drones in the group send
group cross-domain requests to $cn_n$. The group cross-domain request sent by $d_j$ ($GCM_{d_j}$)
can be expressed as Equation (12), where $Enc_{PK_{cn_n}}(Pid_{d_j}, x_j)$ represents the device ID on
the PBC and a random number $x_j$ encrypted with the public key of $cn_n$. Additionally,
$Enc_{PK_{cn_n}}(Pid_{d_j}, x_j)$ is generated by $d_j$ after $d_j$ joins or builds a group.
$$GCM_{d_j} = \left( GCrequest,\ Enc_{PK_{cn_n}}(Pid_{d_j}, x_j),\ Enc_{SK_{cn_i}}(Did_{d_j}) \right) \tag{12}$$
After receiving $GCM_{d_j}$, $cn_n$ decrypts $Enc_{SK_{cn_i}}(Did_{d_j})$ to obtain the Did of $d_j$.
Additionally, $cn_n$ checks whether the incomplete Mission chain transaction $trans^*$ of the
corresponding $Did_{d_j}$ is there. Then, $cn_n$ searches the Devices chain to obtain the public key
of $d_j$. Additionally, $cn_n$ decrypts $Enc_{PK_{cn_n}}(Pid_{d_j}, x_j)$ to obtain the Pid and $x_j$ of $d_j$. If
$Pid_{d_j}$ is in $GList_{d_l}$, $cn_n$ generates the Token for $d_j$ according to the equivalence between
Pid and Did. Next, $cn_n$ sends a response, $rsp^{j}_{cn} = Enc_{PK_{d_j}}(y_j)$, to $d_j$. After decrypting $rsp^{j}_{cn}$
and obtaining $y_j$, $d_j$ can generate a session key by $ks_j = H(x_j \| y_j)$.
In this way, cnn verifies the drones and distributes the Tokens to the drones in the
group. Finally, the drones in the group obtain their Pids and use their Tokens to obtain the
flight path and other necessary information from cnn . At the same time, cnn updates the
Mission chain by using the method proposed in Section 4.4.
When it has obtained the $List_{trans}$ and the relevant task information, $cn_i$ calls
Algorithm 2 to preprocess the tasks. For each transaction in $List_{trans}$, the algorithm
receives Mid, Did, $CNid_d$, Rout, and Tout from the transaction. Then, it calls the path
planning algorithm to plan a flight path for the drone and obtain the CN ID of the next
domain ($CNid'_n$), the range out of the current domain ($Rout'$), and the current expected
time cost out of the domain ($TC^*$). For drone cross-domain authentication, the algorithm
reads the Address chain and obtains the public key ($PK_{cn_n}$) of $CNid'_n$. Then, it reads the
Devices chain and obtains the $PK_d$ of the drone. Additionally, it generates the device ID
in this domain ($Pid'$) and the permission ($P$) for the drone. Then, the algorithm submits
the transaction ($Pid'$, $Rout'$, $PK_d$) to the PBC. In this way, the PBC of the drone currently
flying to the next domain has the information about the drones performing tasks in that
domain for a period of time in the future. Additionally, it generates an incomplete Mission
chain transaction, $trans^* = (Mid,\ Did,\ CNid'_n,\ CNid_d,\ CNid'_c,\ Rout',\ TC^*)$, and the $TC^*$
in $trans^*$ is updated when the drone arrives. At the same time, $cn_i$ packages the PBC
transactions and generates a new block on the PBC at certain intervals or when the number of
transactions meets the requirement.
7. Read Address chain and get the $PK_{cn_n}$ of $CNid'_n$.
8. Read Devices chain and get the $PK_d$ of the drone.
9. Create $Pid'$ and $P$ // Generate device ID in this domain and the permission for the drone.
10. submit($Pid'$, $Rout'$, $PK_d$) -> PBC // Submit the PBC transaction to the PBC.
11. $trans^* = (Mid,\ Did,\ CNid'_n,\ CNid_d,\ CNid'_c,\ Rout',\ TC^*)$ // Generate the Mission chain transaction.
12. end for
13. $cn_i$ packages PBC transactions and generates a new block on the PBC at certain intervals or
when the number of transactions meets the requirement.
14. return $trans^*$, $Pid'$, $P$, $PK_{cn_n}$.
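The preprocessing loop can be sketched as follows. Steps 1 to 6 of the listing are not available here, so the loop body is reconstructed from the prose description; the `plan_path` callback, the dictionary-based chains, the permission string, and the random Pid generator are hypothetical placeholders rather than the authors' implementation.

```python
import secrets
from typing import NamedTuple

class Plan(NamedTuple):
    cnid_next: str   # CNid'_n: CN ID of the next domain
    r_out: str       # Rout': range out of the current domain
    tc: float        # TC*: expected time cost out of the domain

def preprocess_tasks(list_trans, plan_path, address_chain, devices_chain,
                     cnid_current, pbc_pending):
    """Preprocess each task transaction as described for Algorithm 2.

    list_trans    -- iterable of dicts with keys Mid, Did, CNidd (Rout, Tout unused here)
    plan_path     -- hypothetical planner: (Did, CNidd) -> Plan
    address_chain -- dict CNid -> PKcn (stand-in for reading the Address chain)
    devices_chain -- dict Did -> PKd (stand-in for reading the Devices chain)
    pbc_pending   -- list collecting PBC transactions until the next block is packaged
    """
    results = []
    for tx in list_trans:
        plan = plan_path(tx["Did"], tx["CNidd"])
        pk_cn_next = address_chain[plan.cnid_next]    # step 7
        pk_d = devices_chain[tx["Did"]]               # step 8
        pid = secrets.token_hex(8)                    # step 9: Pid' (format assumed)
        permission = "flight-path:read"               # step 9: P (placeholder value)
        pbc_pending.append((pid, plan.r_out, pk_d))   # step 10
        trans_star = (tx["Mid"], tx["Did"], plan.cnid_next, tx["CNidd"],
                      cnid_current, plan.r_out, plan.tc)  # step 11
        results.append((trans_star, pid, permission, pk_cn_next))
    return results
```

Packaging `pbc_pending` into a block at an interval or size threshold (step 13) is left to the caller in this sketch.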
When a drone $d_j$ has passed the cross-domain authentication and entered the domain
$Domain_i$, $cn_i$ uses Algorithm 3 to publish a transaction for updating the task information of
$d_j$. First, $cn_i$ obtains the task start time ($TST_{d_j}$) of $d_j$ in the domain. Then, $cn_i$ calculates
the expected time out of the domain by $Tout'_{d_j} = TST_{d_j} + TC^*$. After that, the Mission
chain transaction including the task information of $d_j$ is published, which can be expressed
as $trans = (Mid,\ Did_{d_j},\ CNid'_n,\ CNid_d,\ CNid'_c,\ Rout'_{d_j},\ Tout'_{d_j})$. We consider that all the
CNs in the CBC are trusted. Therefore, this paper uses the Raft consensus mechanism to
package the transactions and generate new blocks. In addition, the $Pid'_{d_j}$, $P_{d_j}$, and $PK_{cn_n}$
generated in Algorithm 2 are sent to $d_j$ during the cross-domain authentication process.
4. $trans = (Mid,\ Did_{d_j},\ CNid'_n,\ CNid_d,\ CNid'_c,\ Rout'_{d_j},\ Tout'_{d_j})$ // Generate transaction
including task information of $d_j$.
5. Publish the trans to the Mission chain.
6. isUploaded = TRUE // Upload successfully.
7. return isUploaded.
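The update step amounts to replacing the time cost in the incomplete transaction with an absolute exit time. A minimal sketch, assuming the tuple layout of $trans^*$ given in the text and a plain list as a hypothetical stand-in for the Mission chain:

```python
def complete_mission_transaction(trans_star, task_start_time):
    """Turn trans* into trans by computing Tout' = TST + TC*."""
    mid, did, cnid_n, cnid_d, cnid_c, r_out, tc = trans_star
    t_out = task_start_time + tc
    return (mid, did, cnid_n, cnid_d, cnid_c, r_out, t_out)

def publish(mission_chain, trans):
    """Append the transaction to the (stand-in) Mission chain and report success."""
    mission_chain.append(trans)
    return True
```

For example, a drone that starts its task at time 1000.0 with an expected time cost of 120.0 is recorded with an expected exit time of 1120.0.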
5. Performance Evaluation
5.1. Experimental Settings
We analyzed the performance of the proposed scheme by conducting simulation
experiments. The performance of the method proposed in this paper was measured in terms
of computational overhead, communication overhead, and cross-domain authentication
time cost. The configuration of the PC for the experiments is: CPU: Intel Core i7-8550,
RAM: 8 GB, OS: Ubuntu 18.04, 64-bit. Hyperledger Fabric is an open source project from
the Linux Foundation. We used Hyperledger Fabric v1.4 to build the blockchain, and
the consensus on the consortium blockchain was reached through the Raft algorithm.
Additionally, we used the JPBC v2.0 bilinear pairing cryptography library from the GAS Lab in Italy
to generate the public and private keys, and to encrypt and decrypt messages and ciphertext,
respectively. The applied elliptic curve is a Type A elliptic curve with an order length of
160 bits ($y^2 = x^3 + x$). Raspberry Pi, an embedded single-board computer (SBC) from the UK
Raspberry Pi Foundation that is easy to use for coding and other implementations, is widely
used in existing studies. To further evaluate the feasibility of the proposed scheme, we
used Raspberry Pi 4B SBCs to simulate the drones. The configuration of the Raspberry
Pi 4B is: CPU: Quad-core Cortex-A72, RAM: 8 GB, OS: Ubuntu 18.04, 64-bit. We also
compared the proposed method with existing methods [25–27] that use a ground station
as a trusted third party for identity authentication, as well as existing methods [31–33] for
identity authentication through ground stations and blockchain architectures.
| | RG | TR | SC | TA | NG | JG | DGC | NMD |
| CN | DAT + GKP + BC + CAS + LD | PP + 3BC + CAS | 4CAS + BC + HO | - | - | - | 4CAS + BC + HO + (N − 1) × (3CAS + BC + HO) | 4BC + PP |
| $d_j$ | - | - | 3CAS + HO | 4CAS + BC + HO | (N − 1) × TA | 2TA + GL | CAS + HO | |
| $d_l$ | - | - | - | 4CAS + BC + HO | (N − 1) × TA + GL | TA + GL | 3CAS + HO | |
Note: “-” means no relevant operation.
The computational time consumption of the CN side and the UAV side in different
situations is shown in Figure 9. In the SC case, the time consumption of the CN side is
{2 × 3.61 + 2 × 2.43 + 0.17 + 0.03} = 12.28 ms, and the time consumption of the UAV side
is {2 × 9.31 + 7.24 + 0.32} = 26.18 ms. In the case of DGC, the computational time cost
needs to consider the scale of the drones. The time consumption of the UAV group leader is
{2 × 9.31 + 7.24 + 0.32} = 26.18 ms, and the time cost of an ordinary UAV in the group is
{9.31 + 0.32} = 9.63 ms. The minimum average time consumption of the CN side is the time
consumption when dealing with ordinary UAVs, i.e., {2 × 3.61 + 2.43 + 0.17 + 0.03} = 9.85 ms.
The maximum average time consumption on the CN side is when there are only two drones,
that is, {(12.28 + 9.85)/2} = 11.07 ms. Therefore, the average computational time consump-
tion interval of the CN side is (9.85 ms, 11.07 ms).
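The arithmetic behind these figures can be checked with a few lines. The sketch below reuses the per-operation timings exactly as they appear in the sums above; generalizing the two-drone average to a group of $n$ drones (one leader handled at the SC cost, $n - 1$ ordinary members) is an extrapolation made here to show why the average approaches 9.85 ms.

```python
# CN-side and UAV-side costs in ms, built from the sums quoted in the text.
CN_SC = 2 * 3.61 + 2 * 2.43 + 0.17 + 0.03    # CN side, single-drone case (12.28)
CN_MEMBER = 2 * 3.61 + 2.43 + 0.17 + 0.03    # CN side, ordinary group member (9.85)
UAV_LEADER = 2 * 9.31 + 7.24 + 0.32          # UAV side, group leader (26.18)
UAV_MEMBER = 9.31 + 0.32                     # UAV side, ordinary member (9.63)

def cn_average_ms(n):
    """Average CN-side cost per drone for a group of n drones (one leader, n - 1 members)."""
    if n < 2:
        raise ValueError("a drone group needs at least two drones")
    return (CN_SC + (n - 1) * CN_MEMBER) / n
```

The average is largest for two drones, (12.28 + 9.85)/2, and decreases toward the ordinary-member cost of 9.85 ms as the group grows, which matches the interval stated above.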
The computational time costs of the proposed method and existing methods are shown
in Figure 10. In the SC case, the computational time cost of our method is
{12.28 + 26.18} = 38.46 ms. The figure also shows the maximum average computational
time cost in the case of DGC, which is the average computational time cost required for
each drone to cross domains when two drones perform DGC. The time cost is calculated
by {(26.18 + 9.63 + 9.85 + 12.28)/2} = 28.97 ms. The existing methods for authenticating
identity through a ground station as a trusted third party, as reported by Wazid et al. [25],
Srinivas et al. [26], and Tanveer et al. [27], require 42.36 ms, 39.32 ms, and 38.12 ms, re-
spectively. The existing methods for identity authentication through ground stations and
blockchain architectures, as reported by Feng et al. [31], Shen et al. [32], and
Gauhar et al. [33], require 32.93 ms, 36.87 ms, and 34.52 ms, respectively. The
computational time cost of SC (38.46 ms) is therefore comparable to that of the existing
methods [25–27,31–33], whereas the average cost of DGC for two drones (28.97 ms) is lower
than that of every compared method. The DGC method can thus reduce the computational
time cost of UAV cross-domain authentication. Figure 11 shows the computational time cost of cross-domain
authentication when the number of drones increases. The computational time cost of DGC
can be expressed as {26.18 + 12.28 + (N − 1)(9.63 + 9.85)} = 19.48N + 18.98 ms. As shown in
Figure 11, the time cost of each method increases linearly as the number of drones increases.
Compared with the existing methods [25,31], the DGC method proposed in this paper has
significant advantages when the number of drones is large.
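The arithmetic in this subsection can be re-derived with a short Python sketch. It uses only the millisecond values that appear in the expressions above; the constant and function names are hypothetical and chosen for illustration, and the sketch makes no claim about which cryptographic operation each individual timing belongs to.

```python
# Side-specific computational costs (ms), copied from the expressions in the text.
CN_SC_MS = 2 * 3.61 + 2 * 2.43 + 0.17 + 0.03   # CN side, SC case
UAV_SC_MS = 2 * 9.31 + 7.24 + 0.32             # UAV side, SC case (also the DGC group leader)
UAV_MEMBER_MS = 9.31 + 0.32                    # ordinary UAV in a group
CN_MEMBER_MS = 2 * 3.61 + 2.43 + 0.17 + 0.03   # CN side when handling an ordinary UAV


def dgc_total_ms(n):
    """Total computational cost of DGC for a group of n drones (leader + n - 1 members)."""
    return UAV_SC_MS + CN_SC_MS + (n - 1) * (UAV_MEMBER_MS + CN_MEMBER_MS)


def dgc_average_ms(n):
    """Average computational cost per drone for a group of n drones."""
    return dgc_total_ms(n) / n


if __name__ == "__main__":
    print(f"SC total: {UAV_SC_MS + CN_SC_MS:.2f} ms")
    for n in (2, 5, 10):
        print(f"N={n}: total {dgc_total_ms(n):.2f} ms, average {dgc_average_ms(n):.2f} ms")
```

Because the leader's cost is paid once per group, the per-drone average decreases as the group grows and approaches 19.48 ms, the slope of the linear expression given above.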
Figure 10. The comparison of computational time cost of different methods [25–27,31–33].
Figure 11. The computational time cost with increasing number of drones [25,31].
$rsp_j$: $Enc_{PK_{cn_n}}(x + 1 \,\|\, y)$, and $rsp_{cn}$: $Enc_{PK_{d_j}}(y + 1)$. The lengths of the CRM, ack, $rsp_j$, and $rsp_{cn}$ are {64 + 4026} = 4090 bits, 1094 bits, 1094 bits, and 1094 bits, respectively. Thus, the total communication cost of the SC is 7372 bits.
In the two-way authentication (TA) case, the communicated messages are: $M_{d_j}$: $T_{aut}$, $Enc_{SK_{cn_i}}(Pid_{d_j})$, $PBH_{d_j}$; $M_{d_n}$: $T_{aut}$, $Enc_{SK_{cn_i}}(Pid_{d_n})$, $PBH_{d_n}$; ack: $Enc_{PK_{d_j}}(x)$; $rsp_j$: $Enc_{PK_{d_n}}(x + 1 \,\|\, y)$; and $rsp_n$: $Enc_{PK_{d_j}}(y + 1)$. The lengths of $M_{d_j}$, $M_{d_n}$, ack, $rsp_j$, and $rsp_n$ are {32 + 2350 + 128} = 2510 bits, 2510 bits, 1094 bits, 1094 bits, and 1094 bits, respectively. Thus, the total communication cost of the TA is 8302 bits.
The communicated messages in the DGC case can be divided into two parts: (a) the communicated messages in the group leader authentication process, and (b) the communicated messages in an ordinary group member authentication process. The communicated messages in the group leader authentication process are: $GCM_{d_l}$: GCrequest, $Enc_{SK_{cn_i}}(Did_{d_l})$, $GList_{d_l}$; ack: $Enc_{PK_{d_l}}(x)$; $rsp_j$: $Enc_{PK_{cn_n}}(x + 1 \,\|\, y)$; and $rsp_{cn}$: $Enc_{PK_{d_l}}(y + 1)$. The lengths of $GCM_{d_l}$, ack, $rsp_j$, and $rsp_{cn}$ are {64 + 4026 + N × 64} = 4090 + 64N bits, 1094 bits, 1094 bits, and 1094 bits, respectively. Thus, the total communication cost of (a) is 7372 + 64N bits, where N is the number of members in the drone group.
The communicated messages in an ordinary group member authentication process are: $GCM_{d_j}$: GCrequest, $Enc_{PK_{cn_n}}(Pid_{d_j}, x_j)$, $Enc_{SK_{cn_i}}(Did_{d_j})$; and $rsp_{cn}^{j} = Enc_{PK_{d_j}}(y_j)$. The lengths of $GCM_{d_j}$ and $rsp_{cn}^{j}$ are {32 + 2570 + 4026} = 6628 bits and 1094 bits, respectively. Thus, the total communication cost of (b) is 7722 bits. For a drone group with N members, the total communication cost is
(a) + (N − 1) × (b) = 7372 + 64N + 7722N − 7722 = 7786N − 350 bits. The average cost per drone
is 7786 − 350/N bits.
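The message-length bookkeeping above can be checked with a few lines of Python. This is a sketch that simply sums the bit lengths quoted in the text; the function names are hypothetical, and the grouping of fields inside each message follows the length expressions rather than any protocol specification.

```python
CHALLENGE_BITS = 1094  # length quoted for each ack / rsp message


def sc_bits():
    """Single-drone cross-domain (SC): CRM plus ack, rsp_j, rsp_cn."""
    return (64 + 4026) + 3 * CHALLENGE_BITS


def ta_bits():
    """Two-way authentication (TA): M_dj and M_dn plus ack, rsp_j, rsp_n."""
    return 2 * (32 + 2350 + 128) + 3 * CHALLENGE_BITS


def dgc_leader_bits(n):
    """Part (a): leader authentication for a group of n members."""
    return (64 + 4026 + 64 * n) + 3 * CHALLENGE_BITS


def dgc_member_bits():
    """Part (b): one ordinary member (GCM_dj plus the CN response)."""
    return (32 + 2570 + 4026) + CHALLENGE_BITS


def dgc_total_bits(n):
    """Total DGC cost: one leader plus n - 1 ordinary members."""
    return dgc_leader_bits(n) + (n - 1) * dgc_member_bits()


if __name__ == "__main__":
    print("SC:", sc_bits(), "bits; TA:", ta_bits(), "bits")
    for n in (2, 10, 50):
        total = dgc_total_bits(n)
        print(f"DGC N={n}: {total} bits, {total / n:.1f} bits per drone")
```

The closed form 7786N − 350 follows directly, and the per-drone average stays below 7786 bits for every group size.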
The total communication time cost mainly includes transmission delay and propagation
delay. The transmission delay can be expressed as $Size_{data}/Tr$, where $Size_{data}$
represents the size of the transmitted data and $Tr$ represents the transmission rate of the
channel. Depending on the transmission frequency and communication bandwidth,
$Tr$ varies from tens of Kbps to tens of Mbps. Figure 14 shows how the transmission
delay of the communicated messages changes as the transmission rate increases from 200 Kbps to
10 Mbps. We selected the methods with the minimum communication cost (Tanveer et al. [27]) and the
maximum communication cost (Shen et al. [32]) among the comparison methods to compare
with our method. When the transmission rate is 5 Mbps, the transmission time taken
by SC, DGC_max, Tanveer et al. [27], and Shen et al. [32] is 1.43 ms, 1.51 ms, 1.07 ms, and
1.8 ms, respectively. The propagation delay can be expressed as $Dis/Vel_{wave}$, where $Dis$ is
the distance between the two communicating parties and $Vel_{wave}$ is the propagation
speed of the wave in a vacuum (about $3 \times 10^5$ km/s). Generally, the range of a domain
is around 1 km. Therefore, the propagation delay can be ignored.
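As a rough check of the last claim, the propagation delay can be evaluated with the two values quoted above (a domain range of about 1 km and a wave speed of about $3 \times 10^5$ km/s). The symbol $t_{prop}$ is introduced here only for illustration:

$$t_{prop} = \frac{Dis}{Vel_{wave}} \approx \frac{1\ \text{km}}{3 \times 10^{5}\ \text{km/s}} \approx 3.3 \times 10^{-6}\ \text{s} = 3.3\ \mu\text{s}.$$

This is roughly three orders of magnitude below the millisecond-scale transmission delays reported above, which is why it is neglected in the total.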
Figure 14. The transmission delay with increasing transmission rate [27,32].
taken by the DGC proposed in this paper can be expressed as {19.48N + 18.98 + 1.55N − 0.07}
= 21.03N + 18.91 ms, where N is the number of members in the drone group. In the case
of cross-domain authentication for one drone, the SC method, Wazid et al. [25], Srinivas
et al. [26], Tanveer et al. [27], Feng et al. [31], Shen et al. [32], and Gauhar et al. [33] require
39.89 ms, 43.69 ms, 40.51 ms, 39.19 ms, 34.37 ms, 38.67 ms, and 36.06 ms, respectively. The
cross-domain authentication time cost of DGC in the figure (30.49 ms) is the average value per drone for two drones.
DGC therefore achieves better cross-domain authentication performance compared
with the other methods [25–27,31–33]. Figure 16 shows the cross-domain authentication
time cost when the number of drones increases. It can be seen that the DGC method
proposed in this article has significant advantages when the number of drones is large.
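To make the group-size effect concrete, the following Python sketch evaluates the DGC expression 21.03N + 18.91 ms and reports the smallest group for which the DGC average per drone falls below each per-drone figure quoted above. It assumes, for illustration only, that each compared method costs the same per drone regardless of how many drones cross; the function names are hypothetical.

```python
# Per-drone cross-domain authentication time (ms) quoted in this section.
PER_DRONE_MS = {
    "SC": 39.89,
    "Wazid et al. [25]": 43.69,
    "Srinivas et al. [26]": 40.51,
    "Tanveer et al. [27]": 39.19,
    "Feng et al. [31]": 34.37,
    "Shen et al. [32]": 38.67,
    "Gauhar et al. [33]": 36.06,
}


def dgc_total_ms(n):
    """Total DGC cross-domain authentication time for a group of n drones."""
    return 21.03 * n + 18.91


def dgc_average_ms(n):
    """Average DGC time per drone for a group of n drones."""
    return dgc_total_ms(n) / n


def smallest_group_beating(per_drone_ms, max_n=10_000):
    """Smallest n with a DGC average below per_drone_ms, or None if no n <= max_n works."""
    for n in range(1, max_n + 1):
        if dgc_average_ms(n) < per_drone_ms:
            return n
    return None


if __name__ == "__main__":
    for name, ms in PER_DRONE_MS.items():
        print(f"{name}: DGC average is lower from N = {smallest_group_beating(ms)}")
```

Under this assumption the DGC average approaches 21.03 ms per drone as N grows, so the gap to the compared methods widens with group size.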
Figure 15. The comparison of cross-domain authentication time cost of different methods [25–27,31–33].
Figure 16. The cross-domain authentication time cost with increasing number of drones [25–27,31–33].
6. Security Analysis
We adopted the widely used Dolev and Yao (DY) threat model [35] to evaluate the security
of the proposed method. In the DY threat model, a malicious attacker (MA) can inject,
delete, eavesdrop, forge, or modify the exchanged messages over a public channel [36].
In this way, an MA can perform various security attacks on drones or CNs. The possible
attacks and descriptions are as follows:
(1) Replay attack: An MA replays authentication messages to deceive the CN.
(2) Forgery attack: An MA generates an illegal or false ID to deceive the CN.
(3) Impersonation attack: An MA obtains authentication messages by impersonating
terminals or eavesdropping on a channel, and impersonates a legitimate device to
deceive the CN.
All the methods support mutual authentication and cross-domain authentication
well. At the same time, owing to the use of blockchain technology and temporary
intradomain IDs, our method also offers good decentralization and anonymity.
Drones have different temporary IDs in different domains, and their device IDs on the CBC
and all mission information can only be queried by the consortium nodes. Therefore, an
MA cannot obtain the complete flight path of the drone, namely, task path untraceability.
The notification mechanism between domains designed in this paper allows CNs to plan
their paths in advance, which can improve their perception of the overall network situation.
For possible attacks, we make the following analysis:
(1) Resilience to replay attacks: During the process of cross-domain authentication, the
CNs and drones use PKI and a challenge–response mechanism to perform identity
authentication and generate a session key. An MA cannot obtain useful information
through this attack.
(2) Resilience to forgery attacks: The CNs need to query the Devices chain and the
Mission chain transaction to confirm identity, and an MA cannot forge identity on the
consortium chain.
(3) Resilience to impersonation attacks: Unregistered drones cannot obtain a legal Did,
public key, and private key. In the process of the challenge–response game, an MA
cannot decrypt the ciphertext to complete the verification. Therefore, it is difficult to
implement an impersonation attack.
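The replay argument in item (1) rests on the freshness of each challenge. The Python sketch below illustrates only that idea and is not the scheme's protocol: an HMAC over a shared key stands in for the PKI encryption used in the paper (the Python standard library has no public-key primitives), and the names `Verifier`, `respond`, and `_tag` are hypothetical.

```python
import hashlib
import hmac
import secrets


def _tag(key, value):
    """Authenticate an integer with HMAC-SHA256 (stand-in for PKI encryption)."""
    return hmac.new(key, value.to_bytes(16, "big"), hashlib.sha256).digest()


def respond(key, challenge):
    """Prover side: answer challenge x with an authenticated x + 1."""
    return _tag(key, challenge + 1)


class Verifier:
    """Issues a fresh random challenge per session and accepts one answer to it."""

    def __init__(self, key):
        self._key = key
        self._pending = None

    def challenge(self):
        self._pending = secrets.randbits(64)
        return self._pending

    def verify(self, response):
        if self._pending is None:
            return False  # no outstanding challenge
        expected = _tag(self._key, self._pending + 1)
        self._pending = None  # each challenge is usable once
        return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    key = secrets.token_bytes(32)
    verifier = Verifier(key)
    first = respond(key, verifier.challenge())
    print("fresh response accepted:", verifier.verify(first))
    verifier.challenge()  # a new session draws a new challenge
    print("replayed response accepted:", verifier.verify(first))
```

A response captured in one session is bound to that session's challenge, so replaying it against a later challenge fails unless the two random values happen to coincide.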
7. Conclusions
During long-distance flights for cargo transportation, drones need to apply cross-
domain authentication mechanisms to enter the next domain. However, due to public
wireless communication channels, drones are vulnerable to various security attacks in
the process of cross-domain authentication. When facing a large number of cross-domain
requests from drones, a CN requires significant computational and time overhead, which
may lead to long waiting times for the cross-domain authentication of drones. To address
this problem, we proposed BCDAIoD, an efficient blockchain-based cross-domain authenti-
cation scheme for the Internet of Drones. The BCDAIoD method includes a single-drone
cross-domain authentication method, an establishment mechanism of drone groups, a
drone group cross-domain authentication method, and a notification mechanism between
domains. By taking advantage of blockchain, PKI, and the challenge–response game,
BCDAIoD can ensure the authenticity and integrity of data, and can effectively prevent
various attacks on drones and CNs. Furthermore, BCDAIoD uses the CBC and notification
mechanism between domains to enable CNs to plan paths for drones in advance, which
can further improve the efficiency of drone cross-domain authentication and task execution.
The main contribution of this article is that BCDAIoD can improve the efficiency
and security of the cross-domain authentication of drones. Experimental results show that
the cross-domain authentication time cost and computational overhead of BCDAIoD are
significantly lower than those of the existing state-of-the-art methods when facing a large
number of drones.
Nevertheless, there are still limitations when applying BCDAIoD. First, blockchain
brings additional communication and storage costs to the drone network. For example,
drones in the IoD communicate with each other and update their local blockchains. Second,
a small number of drones flying to the same destination or drones being far apart from
each other may lead to drone group establishment failure. Hence, to address the above
limitations, we seek to further simplify storage data in the block and design block pruning
algorithms for the PBC to reduce communication and storage costs in future extensions of
this work. At the same time, we will also attempt to design an optimization algorithm that
dynamically adjusts between single-drone and drone group cross-domain methods based
on the current state of IoD.
Author Contributions: Conceptualization, G.Q. and Y.Z.; methodology, G.Q. and T.Y.; software,
T.Y. and G.Q.; investigation, G.Q. and Y.Q.; validation, G.Q., Y.Z. and T.Y.; result analysis, T.Y.,
Y.Q.; writing—original draft preparation, G.Q.; writing—review and editing, Y.Z. and G.Q.; super-
vision, Y.Z.; funding acquisition, Y.Z. All authors have read and agreed to the published version of
the manuscript.
Funding: This research was funded by the National Natural Science Foundation of China (General
Program) under Grant No. 61572253.
Data Availability Statement: The data presented in this study are available on request from the
corresponding author.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Hassan, M.A.; Javed, A.R.; Hassan, T.; Band, S.S.; Sitharthan, R.; Rizwan, M. Reinforcing communication on the internet of aerial
vehicles. IEEE Trans. Green Commun. Netw. 2022, 6, 1288–1297. [CrossRef]
2. Salah, K.; Rehman, M.H.U.; Nizamuddin, N.; Al-Fuqaha, A. Blockchain for AI: Review and open research challenges. IEEE Access
2019, 7, 10127–10149. [CrossRef]
3. Farah, M.F.; Mrad, M.; Ramadan, Z.; Hamdane, H. Handle with Care: Adoption of Drone Delivery Services. In Proceedings of the
Advances in National Brand and Private Label Marketing: Seventh International Conference, Barcelona, Spain, 17–20 June 2020;
pp. 22–29.
4. Makhdoom, I.; Zhou, I.; Abolhasan, M.; Lipman, J.; Ni, W. PrivySharing: A blockchain-based framework for privacy-preserving
and secure data sharing in smart cities. Comput. Secur. 2020, 88, 101653. [CrossRef]
5. Li, X.; Wang, Y.; Vijayakumar, P.; He, D.; Kumar, N.; Ma, J. Blockchain-based mutual-healing group key distribution scheme in
unmanned aerial vehicles ad-hoc network. IEEE Trans. Veh. Technol. 2019, 68, 11309–11322. [CrossRef]
6. Qiu, J.; Grace, D.; Ding, G.; Yao, J.; Wu, Q. Blockchain-Based Secure Spectrum Trading for Unmanned-Aerial-Vehicle-Assisted
Cellular Networks: An Operator’s Perspective. IEEE Internet Things J. 2020, 7, 451–466. [CrossRef]
7. Bera, B.; Chattaraj, D.; Das, A.K. Designing secure blockchain-based access control scheme in IoT-enabled Internet of Drones
deployment. Comput. Commun. 2020, 153, 229–249. [CrossRef]
8. Yapıcı, Y.; Rupasinghe, N.; Güvenç, I.; Dai, H.; Bhuyan, A. Physical layer security for NOMA transmission in mmWave drone
networks. IEEE Trans. Veh. Technol. 2021, 70, 3568–3582. [CrossRef]
9. Asheralieva, A.; Niyato, D. Distributed dynamic resource management and pricing in the IoT systems with blockchain-as-a-service
and UAV-enabled mobile edge computing. IEEE Internet Things J. 2020, 7, 1974–1993. [CrossRef]
10. Li, T.; Liu, W.; Wang, T.; Ming, Z.; Li, X.; Ma, M. Trust data collections via vehicles joint with unmanned aerial vehicles in the
smart Internet of Things. Trans. Emerg. Telecommun. Technol. 2022, 33, e3956. [CrossRef]
11. Nakamura, S.; Enokido, T.; Takizawa, M. Information flow control based on the CapBAC (capability-based access control) model
in the IoT. Int. J. Mob. Comput. Multimed. Commun. 2019, 10, 13–25. [CrossRef]
12. Ali, G.; Ahmad, N.; Cao, Y.; Ali, Q.E.; Azim, F.; Cruickshank, H. BCON: Blockchain based access CONtrol across multiple conflict
of interest domains. J. Netw. Comput. Appl. 2019, 147, 102440. [CrossRef]
13. Wang, Y.; Wang, H.; Wei, X.; Zhao, K.; Fan, J.; Chen, J.; Jia, R. Service Function Chain Scheduling in Heterogeneous Multi-UAV
Edge Computing. Drones 2023, 7, 132. [CrossRef]
14. Jha, S.; Sural, S.; Atluri, V.; Vaidya, J. Specification and verification of separation of duty constraints in attribute-based access
control. IEEE Trans. Inf. Forensics Secur. 2017, 13, 897–911. [CrossRef]
15. Sandhu, R.S.; Coyne, E.J.; Feinstein, H.L.; Youman, C.E. Role-based access control models. Computer 1996, 29, 38–47. [CrossRef]
16. Xu, S.; Ning, J.; Li, Y.; Zhang, Y.; Xu, G.; Huang, X.; Deng, R.H. Match in my way: Fine-grained bilateral access control for secure
cloud-fog computing. IEEE Trans. Dependable Secur. Comput. 2022, 19, 1064–1077. [CrossRef]
17. Wang, K.; Zhang, X.; Qiao, X.; Li, X.; Cheng, W.; Cong, Y.; Liu, K. Adjustable Fully Adaptive Cross-Entropy Algorithms for Task
Assignment of Multi-UAVs. Drones 2023, 7, 204. [CrossRef]
18. Abdel-Malek, M.A.; Akkaya, K.; Bhuyan, A.; Ibrahim, A.S. A proxy Signature-Based swarm drone authentication with leader
selection in 5G networks. IEEE Access 2022, 10, 57485–57498. [CrossRef]
19. Fysarakis, K.; Soultatos, O.; Manifavas, C.; Papaefstathiou, I.; Askoxylakis, I. XSACd-Cross-domain resource sharing & access
control for smart environment. Future Gener. Comput. Syst. 2018, 80, 572–582.
20. Nakamoto, S. Bitcoin: A peer-to-peer electronic cash system. In Decentralized Business Review; Scholastica: Seoul, Korea, 2008;
p. 21260.
21. Mehta, P.; Gupta, R.; Tanwar, S. Blockchain envisioned drone networks: Challenges, solutions, and comparisons. Comput.
Commun. 2020, 151, 518–538. [CrossRef]
22. Al-Hilo, A.; Samir, M.; Assi, C.; Sharafeddine, S.; Ebrahimi, D. Cooperative content delivery in UAV-RSU assisted vehicular
networks. In Proceedings of the 2nd ACM MobiCom Workshop on Drone Assisted Wireless Communications for 5G and Beyond,
London, UK, 21–25 September 2020; pp. 73–78.
23. Arafeh, M.; El Barachi, M.; Mourad, A.; Belqasmi, F. A blockchain based architecture for the detection of fake sensing in mobile
crowdsensing. In Proceedings of the 2019 4th International Conference on Smart and Sustainable Technologies (SpliTech), Split,
Croatia, 18–21 June 2019; pp. 1–6.
24. García-Magariño, I.; Lacuesta, R.; Rajarajan, M.; Lloret, J. Security in networks of unmanned aerial vehicles for surveillance with
an agent-based approach inspired by the principles of blockchain. Ad Hoc Netw. 2019, 86, 72–82. [CrossRef]
25. Wazid, M.; Das, A.K.; Kumar, N.; Alazab, M. Designing authenticated key management scheme in 6G-enabled network in a box
deployed for industrial applications. IEEE Trans. Ind. Inform. 2020, 17, 7174–7184. [CrossRef]
26. Srinivas, J.; Das, A.K.; Wazid, M.; Vasilakos, A.V. Designing secure user authentication protocol for big data collection in IoT-based
intelligent transportation system. IEEE Internet Things J. 2020, 8, 7727–7744. [CrossRef]
27. Tanveer, M.; Alkhayyat, A.; Naushad, A.; Kumar, N.; Alharbi, A.G. RUAM-IoD: A robust user authentication mechanism for the
Internet of Drones. IEEE Access 2022, 10, 19836–19851. [CrossRef]
28. Jan, S.U.; Abbasi, I.A.; Algarni, F.; Khan, A.S. A verifiably secure ECC based authentication scheme for securing IoD using FANET.
IEEE Access 2022, 10, 95321–95343. [CrossRef]
29. Rajamanickam, S.; Vollala, S.; Ramasubramanian, N. EAPIOD: ECC based authentication protocol for insider attack protection in
IoD scenario. Secur. Priv. 2022, 5, e248. [CrossRef]
30. Ever, Y.K. A secure authentication scheme framework for mobile-sinks used in the internet of drones applications. Comput.
Commun. 2020, 155, 143–149. [CrossRef]
31. Feng, C.; Liu, B.; Guo, Z.; Yu, K.; Qin, Z.; Choo, K.K.R. Blockchain-based cross-domain authentication for intelligent 5G-enabled
internet of drones. IEEE Internet Things J. 2021, 9, 6224–6238. [CrossRef]
32. Shen, M.; Liu, H.; Zhu, L.; Xu, K.; Yu, H.; Du, X.; Guizani, M. Blockchain-assisted secure device authentication for cross-domain
industrial IoT. IEEE J. Sel. Areas Commun. 2020, 38, 942–954. [CrossRef]
33. Ali, G.; Ahmad, N.; Cao, Y.; Khan, S.; Cruickshank, H.; Qazi, E.A.; Ali, A. xDBAuth: Blockchain based cross domain authentication
and authorization framework for Internet of Things. IEEE Access 2020, 8, 58800–58816. [CrossRef]
34. Zhang, H.; Chen, X.; Lan, X.; Jin, H.; Cao, Q. BTCAS: A blockchain-based thoroughly cross-domain authentication scheme. J. Inf.
Secur. Appl. 2020, 55, 102538. [CrossRef]
35. Dolev, D.; Yao, A. On the security of public key protocols. IEEE Trans. Inf. Theory 1983, 29, 198–208. [CrossRef]
36. Yu, S.; Das, A.K.; Park, Y.; Lorenz, P. SLAP-IoD: Secure and lightweight authentication protocol using physical unclonable
functions for internet of drones in smart city environments. IEEE Trans. Veh. Technol. 2022, 71, 10374–10388. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
Article
Machine Learning Methods for Inferring the Number of UAV
Emitters via Massive MIMO Receive Array
Yifan Li 1 , Feng Shu 1,2, *, Jinsong Hu 3 , Shihao Yan 4 , Haiwei Song 5 , Weiqiang Zhu 5 , Da Tian 5 , Yaoliang Song 1
and Jiangzhou Wang 6
1 School of Electronic and Optical Engineering, Nanjing University of Science and Technology,
Nanjing 210094, China
2 School of Information and Communication Engineering, Hainan University, Haikou 570228, China
3 College of Physics and Information Engineering, Fuzhou University, Fuzhou 350116, China
4 School of Science and Security Research Institute, Edith Cowan University, Perth, WA 6027, Australia
5 8511 Research Institute, China Aerospace Science and Industry Corporation, Nanjing 210007, China
6 School of Engineering, University of Kent, Canterbury CT2 7NT, UK
* Correspondence: shufeng0101@163.com
Abstract: To provide important prior knowledge for the direction of arrival (DOA) estimation of UAV
emitters in future wireless networks, we present a complete DOA preprocessing system for inferring
the number of emitters via a massive multiple-input multiple-output (MIMO) receive array. Firstly,
in order to eliminate the noise signals, two high-precision signal detectors, the square root of the
maximum eigenvalue times the minimum eigenvalue (SR-MME) and the geometric mean (GM), are
proposed. Compared to other detectors, SR-MME and GM can achieve a high detection probability
while maintaining extremely low false alarm probability. Secondly, if the existence of emitters is
determined by detectors, we need to further confirm their number. Therefore, we perform feature
extraction on the eigenvalue sequence of a sample covariance matrix to construct a feature vector
and innovatively propose a multi-layer neural network (ML-NN). Additionally, the support vector
machine (SVM) and naive Bayesian classifier (NBC) are also designed. The simulation results show
that the machine learning-based methods can achieve good results in signal classification, especially
neural networks, which can always maintain the classification accuracy above 70% with the massive
MIMO receive array. Finally, we analyze the classical signal classification methods, Akaike information criterion (AIC)
and minimum description length (MDL). It is concluded that the two methods are not suitable for
scenarios with massive MIMO arrays, and they also have much worse performance than machine
learning-based classifiers.
Citation: Li, Y.; Shu, F.; Hu, J.; Yan, S.; Song, H.; Zhu, W.; Tian, D.; Song, Y.; Wang, J. Machine Learning Methods
for Inferring the Number of UAV Emitters via Massive MIMO Receive Array. Drones 2023, 7, 256.
https://doi.org/10.3390/drones7040256
Keywords: unmanned aerial vehicle (UAV); massive MIMO; threshold detection; emitter number
detection; machine learning; information criterion
Academic Editor: Emmanouel T.
Michailidis
MIMO arrays can greatly extend signal coverage [6], and experimental results
in [7] showed that massive MIMO works well with LoS mobile channels. So in view of the
problems that UAV communications face, it is natural to consider the combination of UAVs
and massive MIMO technology [8]. In [9], a nonstationary 3D geometry-based model was
proposed for UAV-to-ground massive MIMO channels; this model considered the realistic
scenarios and discussed the impact of some important UAV parameters such as altitude
and flight velocity, so it can give some inspiration for future research on 6G standard UAV
channel models. As UAVs often appear as clusters, the potential of massive MIMO ground
station communication with UAV swarms was explored in [10], and a realistic geometric
model was also developed.
Because of the high mobility of UAVs, it is necessary for ground base stations to
obtain direction-of-arrival (DOA) information of UAVs in a timely manner for channel
estimation and communication security. For most DOA estimation algorithms, such as
MUSIC and ESPRIT, the number of emitters is required prior knowledge, but the number
is usually unknown [11]. So inferring the number of emitters has been an active topic in
array processing for a few decades [12]. In recent years, the potential of massive MIMO
technology in array processing has also been gradually discovered [13], as the larger number
of antennas can decrease the beamwidth and then increase the angular resolution of the
arrays [14]. Therefore, considering the realistic needs of UAV communications and the
advantages of massive MIMO technology in array processing, we will study the methods
for inferring the number of UAV emitters via a massive MIMO receive array in this work.
In general, the solutions for inferring the number of emitters can be divided into two
main categories. The first is based on information-theoretic criteria and the other is
based on the analysis of the covariance matrices. Since detecting the number of signal
sources can be viewed as a typical model order selection problem, Akaike first proposed
a method focusing on finding the minimum Kullback–Leibler (KL) discrepancy between
the probability density function (PDF) of obtained data and that of models for selection [15],
and this method is now called AIC. Schwarz introduced Bayesian information criterion
(BIC) based on Akaike’s work [16], and Rissanen also derived a similar criterion called
MDL [17]. Ref. [18] provided a good summary of these classical information criteria. In
the last decade, Lu and Zoubir proposed the generalized Bayesian information criterion
(GBIC) [19] and flexible detection criterion (FDC) [20], which effectively improved the
performance on source enumeration. The other basic method for enumerating the number
of sources is performing analysis on the covariance matrices of signals received by arrays.
Williams and Johnson proposed a sphericity test for source enumeration in [21], which
was based on a hypothesis test for the covariance matrix. Ref. [22] gave a bootstrap-based
method to estimate the null distributions of the test statistics. Wax and Adler solved this
problem by performing signal subspace matching [23].
Signal detection is another technique adopted in this work. To reduce the
interference of noise with the detection of the number of signals, several methods have been
proposed, including classic signal detection algorithms such as energy detection [24],
matched-filter detection [25], and cyclostationarity-based detection [26]. On the basis of
these methods, Zeng and Liang proposed two eigenvalue-based algorithms in [27], Zhang
et al. used the generalized likelihood ratio test (GLRT) approach to improve detection
performance [28], and an eigenvalue-based LRT algorithm was also given in [29].
In recent years, machine learning (ML) has played an important role in the fields of
array signal processing [30] and UAV communications [31], and now the ML-based methods
used in 5G mainly include supervised learning, unsupervised learning, and reinforcement
learning [32]. Thilina et al. compared the performance of unsupervised learning approaches
and supervised learning approaches for cooperative spectrum sensing [33]. A machine
learning-based DOA measurement method was also proposed in [34], and ref. [35] used a
neural network for power allocation in a wireless communication network.
In this paper, we will combine the techniques mentioned above for inferring the
number of UAV emitters via massive MIMO receive array. First, the pure noise signals
Drones 2023, 7, 256
are separated by threshold detectors, and then the feature vectors are extracted from
the sample covariance matrices of the remaining signals. Finally, the ML-NN and other
machine learning methods are used to classify the signals for determining the number of
emitters. Therefore, our main contributions are summarized as follows:
1. A DOA preprocessing system is proposed for obtaining the number of UAV emitters
via a massive MIMO array. The main steps of this system include signal detection and
inferring the number of emitters. The received signals are first inputted into signal
detectors. If the detection result shows the presence of emitters, this signal is further
transmitted to signal classifiers to determine the number of emitters.
2. Two high-precision signal detectors, the square root of the maximum eigenvalue times
the minimum eigenvalue (SR-MME) and the geometric mean (GM), are proposed in
Section 3. Their thresholds and probability of detection are also derived with the aid
of random matrix theories. The simulation results show that SR-MME and GM have
significantly better detection performance than the MME detector
proposed in [27] and the M-MME detector proposed in [36], even when the SNR is
very low and the number of samples is small. The simulation results also show that
SR-MME and GM can maintain a low false alarm probability while achieving a high
detection probability.
3. Once the existence of emitters is established, we introduce machine learning-based
classifiers to infer their number, including multi-layer neural networks (ML-NNs),
support vector machine (SVM), and naive Bayesian classifier (NBC). Important
features that make up the feature vectors are extracted from the eigenvalue sequences
of the signals' sample covariance matrices. The results show that machine learning methods
are well suited to signal classification, especially neural networks,
which achieve a classification accuracy of 70% even under extreme conditions.
Finally, we evaluate the classification performance of AIC and MDL under
different SNRs and numbers of receive antennas. We show that, compared to machine
learning-based methods, they are inapplicable to scenarios with low SNR and massive
MIMO receive arrays.
The rest of the paper is organized as follows. In Section 2, we present a specific
system model and assumptions. Two high-precision signal detectors are given in Section 3.
Section 4 shows how to perform feature extraction on received signals and classify them by
machine learning methods. Then, the advantages of the proposed detectors and classifiers
are presented through simulation results in Section 5. Finally, Section 6 draws conclusions.
Notation: Matrices, vectors, and scalars are denoted by bold upper-case,
bold lower-case, and lower-case letters, respectively. The signs $(\cdot)^T$, $(\cdot)^*$, and $(\cdot)^H$ represent the transpose,
conjugate, and conjugate transpose. $\mathbf{I}_M$ denotes the $M \times M$ identity matrix, and $\mathrm{diag}\{\cdot\}$ stands
for a diagonal matrix.
2. System Model
As shown in Figure 1, we consider a scenario with $K$ far-field UAV emitters
and one massive MIMO receiver equipped with an $M$-element uniform linear array (ULA).
The signal transmitted by the $k$th UAV is denoted by $s_k(t)e^{j2\pi f_c t}$, where $s_k(t)$ is the
baseband signal and $f_c$ is the carrier frequency. Referring to [37], the received signal at the
$m$th antenna is given by

$$y_m(t) = \sum_{k=1}^{K} s_k(t) e^{j2\pi f_c t} e^{-j2\pi f_c \tau_{k,m}} + v_m(t), \tag{1}$$

where $v_m(t) \sim \mathcal{CN}(0, \sigma_v^2)$ represents the additive white Gaussian noise (AWGN) term, and
$\tau_{k,m}$ denotes the propagation delay from the $k$th UAV to the $m$th antenna, expressed by

$$\tau_{k,m} = \tau_0 - \frac{(m-1)d\sin\theta_k}{c}, \tag{2}$$

where $\tau_0$ is the propagation delay from the UAV to the reference point on the receive array,
$\theta_k$ is the angle of signal incidence from the $k$th UAV, $d = \lambda/2$ represents the spacing between
array elements, and $c$ denotes the speed of light. The received signals then pass through the ADC
and down converter, and we obtain

$$y_m(n) = \sum_{k=1}^{K} e^{-j2\pi (m-1) d \sin\theta_k/\lambda} s_k(n) + v_m(n), \tag{3}$$

$$\mathbf{y}(n) = \sum_{k=1}^{K} \mathbf{a}(\theta_k) s_k(n) + \mathbf{v}(n), \tag{4}$$

where $\mathbf{v}(n) = [v_1(n), \ldots, v_M(n)]^T$ denotes the noise vector and

$$\mathbf{Q}_y = \mathbf{A}\mathbf{Q}_s\mathbf{A}^H + \sigma_v^2 \mathbf{I}_M = \sum_{k=1}^{K} \sigma_{s,k}^2\, \mathbf{a}(\theta_k)\mathbf{a}^H(\theta_k) + \sigma_v^2 \mathbf{I}_M. \tag{7}$$

and

$$\lambda_m = \rho_m + \sigma_v^2, \tag{9}$$

where $\rho_1 \geq \ldots \geq \rho_K > \rho_{K+1} = \ldots = \rho_M = 0$ are the eigenvalues of $\mathbf{A}\mathbf{Q}_s\mathbf{A}^H$.
In practice, the covariance matrix of the received signal $\mathbf{y}$ cannot be obtained accurately,
so the sample covariance matrix of the received signal is usually used to approximate it:

$$\hat{\mathbf{Q}}_y = \frac{1}{N}\sum_{n=1}^{N} \mathbf{y}(n)\mathbf{y}^H(n) = \frac{1}{N}\mathbf{Y}\mathbf{Y}^H, \tag{10}$$

where

$$H_0: \mathbf{Y} = \mathbf{V}, \qquad H_1: \mathbf{Y} = \mathbf{A}\mathbf{S} + \mathbf{V}, \tag{11}$$

and $\mathbf{S} = [\mathbf{s}(1), \mathbf{s}(2), \ldots, \mathbf{s}(N)]$, $\mathbf{V} = [\mathbf{v}(1), \mathbf{v}(2), \ldots, \mathbf{v}(N)]$.
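The snapshot model in (3) and the sample covariance matrix in (10) can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the authors' simulation code: the baseband symbols are drawn as circular complex Gaussian variables (an assumption made here for simplicity), the steering phase follows the sign convention printed in (3) with d = λ/2, and all function names are hypothetical.

```python
import cmath
import math
import random


def steering_vector(theta, M, d_over_lambda=0.5):
    """ULA response for incidence angle theta (radians), phase convention of (3)."""
    return [cmath.exp(-2j * math.pi * m * d_over_lambda * math.sin(theta))
            for m in range(M)]


def complex_gaussian(rng, variance):
    """One sample of a circular complex Gaussian variable CN(0, variance)."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def simulate_snapshots(thetas, M, N, signal_var=1.0, noise_var=1.0, seed=0):
    """Return N snapshots y(n), each a list of M antenna samples.

    An empty `thetas` list gives the noise-only hypothesis H0 in (11).
    """
    rng = random.Random(seed)
    A = [steering_vector(t, M) for t in thetas]
    Y = []
    for _ in range(N):
        s = [complex_gaussian(rng, signal_var) for _ in thetas]
        Y.append([sum(a[m] * sk for a, sk in zip(A, s)) + complex_gaussian(rng, noise_var)
                  for m in range(M)])
    return Y


def sample_covariance(Y):
    """Sample covariance matrix (10): average of y(n) y(n)^H over the snapshots."""
    N, M = len(Y), len(Y[0])
    return [[sum(y[i] * y[j].conjugate() for y in Y) / N for j in range(M)]
            for i in range(M)]


if __name__ == "__main__":
    Q = sample_covariance(simulate_snapshots([0.3, -0.5], M=4, N=400))
    print("average received power per antenna:", sum(Q[i][i].real for i in range(4)) / 4)
```

With two unit-power emitters and unit noise variance, the diagonal of the sample covariance matrix should be close to 3, in line with the structure of (7).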
[Figure 1 flowchart labels: UAV ... UAV K; RF; ADC; Sample Covariance Matrix; Eigenvalue Decomposition; Signal Detection; Do UAVs exist? (No); Feature Extraction; Number Inferring; Results]
Figure 1. The procedure of proposed system for inferring the number of UAV emitters by massive
MIMO receive array.
3. Signal Detectors
As shown in Figure 1, after the sample covariance matrix of the received signal is
obtained, we take eigenvalue decomposition (EVD) on it. For the two situations in (11),
eigenvalues are represented by λ1 (Q̂y,H0 ) ≥ . . . ≥ λ M (Q̂y,H0 ) and λ1 (Q̂y,H1 ) ≥ . . . ≥
79
Drones 2023, 7, 256
λ M (Q̂y,H1 ), respectively. For convenience, we consider moving the constant 1/N to the
left-hand side of (10). Assuming σv2 = 1, we obtain
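$$\mathbf{R}_{H_0} = \mathbf{V}\mathbf{V}^H, \tag{12a}$$
$$\mathbf{R}_{H_1} = N\mathbf{A}\hat{\mathbf{Q}}_S\mathbf{A}^H + \mathbf{R}_{H_0}, \tag{12b}$$
where $\mathbf{R}_{H_0}$ is a Wishart matrix and $\hat{\mathbf{Q}}_S$ is the sample covariance matrix of $\mathbf{S}$. The eigenvalues of $\mathbf{R}_{H_0}$ and $\mathbf{R}_{H_1}$ can also be expressed as $\lambda_1(\mathbf{R}_{H_0}) \ge \dots \ge \lambda_M(\mathbf{R}_{H_0})$ and $\lambda_1(\mathbf{R}_{H_1}) \ge \dots \ge \lambda_M(\mathbf{R}_{H_1})$, where $\lambda_m(\mathbf{R}_{H_0}) = N\lambda_m(\hat{\mathbf{Q}}_{y,H_0})$ and $\lambda_m(\mathbf{R}_{H_1}) = N\lambda_m(\hat{\mathbf{Q}}_{y,H_1})$. Since $\mathbf{R}_{H_0}$ is a complex Gaussian Wishart matrix, its largest eigenvalue, after centering and scaling, follows the Tracy–Widom distribution of order 2 [39]:
$$\frac{\lambda_{\max}(\mathbf{R}_{H_0}) - \mu}{\nu} \xrightarrow{d} \mathcal{TW}_2, \tag{13}$$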
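where
$$\mu = \left(\sqrt{M} + \sqrt{N}\right)^2, \tag{14a}$$
$$\nu = \sqrt{\mu}\left(\frac{1}{\sqrt{M}} + \frac{1}{\sqrt{N}}\right)^{1/3}, \tag{14b}$$
are the centering and scaling parameters. Then the cumulative distribution function (CDF) of the largest eigenvalue, i.e., $F(x)$, can be approximated as
$$F(x) \approx F_2\left(\frac{x - \mu}{\nu}\right), \tag{15}$$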
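As a numerical illustration of the centering and scaling in (14a) and (14b), the sketch below normalizes the largest eigenvalue of simulated noise-only matrices. The function names, the seed, and the number of trials are assumptions made for this example; the Tracy–Widom CDF $F_2$ itself is not evaluated here, since it is not available in NumPy.

```python
import numpy as np

def tw_center_scale(M, N):
    """Centering mu and scaling nu from Equations (14a) and (14b)."""
    mu = (np.sqrt(M) + np.sqrt(N)) ** 2
    nu = np.sqrt(mu) * (1 / np.sqrt(M) + 1 / np.sqrt(N)) ** (1 / 3)
    return mu, nu

def normalized_largest_eigenvalue(V):
    """(lambda_max(V V^H) - mu) / nu for an M x N noise matrix V."""
    M, N = V.shape
    mu, nu = tw_center_scale(M, N)
    lam_max = np.linalg.eigvalsh(V @ V.conj().T)[-1]  # eigvalsh sorts ascending
    return (lam_max - mu) / nu

rng = np.random.default_rng(1)
M, N = 64, 100
samples = []
for _ in range(200):
    # Unit-variance circularly symmetric complex Gaussian noise (sigma_v^2 = 1).
    V = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
    samples.append(normalized_largest_eigenvalue(V))
mean_stat = float(np.mean(samples))
```

The normalized statistic concentrates at negative values of order one, which is consistent with the order-2 Tracy–Widom law having a negative mean.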
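where $\lambda_{\max}(\hat{\mathbf{Q}}_y)$ and $\lambda_{\min}(\hat{\mathbf{Q}}_y)$ are the maximum and minimum eigenvalues, respectively, of the sample covariance matrix $\hat{\mathbf{Q}}_y$, and $\gamma_1$ denotes the judgment threshold.
The judgment has four possible outcomes: true positive (TP), false positive (FP), true negative (TN), and false negative (FN). Since the two outcomes under each hypothesis are complementary, $P_{TP} + P_{FN} = 1$ and $P_{FP} + P_{TN} = 1$, where the probability of FP is also called the false alarm (FA) probability, so only the TP and FP cases need to be addressed. Therefore, $P_{FA}$ of the SR-MME detector is defined as
$$\begin{aligned}
P_{FA} &= P\left(\sqrt{\lambda_{\max}(\hat{\mathbf{Q}}_{y,H_0})\,\lambda_{\min}(\hat{\mathbf{Q}}_{y,H_0})} > \gamma_1\right) \\
&= P\left(\lambda_{\max}(\mathbf{R}_{H_0}) > \frac{(N\gamma_1)^2}{\lambda_{\min}(\mathbf{R}_{H_0})}\right) \\
&= P\left(\frac{\lambda_{\max}(\mathbf{R}_{H_0}) - \mu}{\nu} > \frac{\left(\frac{N\gamma_1}{\sqrt{N} - \sqrt{M}}\right)^2 - \mu}{\nu}\right) \\
&= 1 - F_2\left(\frac{\left(\frac{N\gamma_1}{\sqrt{N} - \sqrt{M}}\right)^2 - \mu}{\nu}\right).
\end{aligned} \tag{19}$$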
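When an implementation of $F_2$ is not at hand, the threshold can also be calibrated empirically. The sketch below is an assumption-laden alternative to the closed form in (19), not the authors' procedure: it takes the SR-MME statistic to be $\sqrt{\lambda_{\max}\lambda_{\min}}$ of the sample covariance matrix, as the first line of (19) suggests, and estimates $\gamma_1$ as a quantile of that statistic over simulated noise-only data. The function names and parameter values are hypothetical.

```python
import numpy as np

def sr_mme_statistic(Y):
    """sqrt(lambda_max * lambda_min) of the sample covariance of Y (M x N, N >= M)."""
    Q_hat = Y @ Y.conj().T / Y.shape[1]
    lam = np.linalg.eigvalsh(Q_hat)  # ascending order
    return float(np.sqrt(lam[-1] * lam[0]))

def noise_only(M, N, rng):
    """Unit-variance complex Gaussian noise matrix (hypothesis H0)."""
    return (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)

def empirical_threshold(M, N, p_fa, trials, rng):
    """Monte Carlo estimate of gamma_1: the (1 - p_fa) quantile under H0."""
    stats = np.array([sr_mme_statistic(noise_only(M, N, rng)) for _ in range(trials)])
    return float(np.quantile(stats, 1 - p_fa))

rng = np.random.default_rng(2)
gamma_1 = empirical_threshold(M=16, N=100, p_fa=0.05, trials=400, rng=rng)
```

A very small $P_{FA}$ such as $10^{-4}$ needs many more trials for a stable quantile, which is one practical reason to prefer the analytical expression in (19).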
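where $\lambda_m(\hat{\mathbf{Q}}_y)$ is the $m$th eigenvalue of the sample covariance matrix and $\gamma_2$ represents the judgment threshold of this detector. Similar to the SR-MME detector, the false alarm probability of the GM detector is given by
$$\begin{aligned}
P_{FA} &= P\left(\sqrt[M]{\prod_{m=1}^{M} \lambda_m(\hat{\mathbf{Q}}_{y,H_0})} > \gamma_2\right) \\
&= P\left(\lambda_{\max}(\mathbf{R}_{H_0}) > \gamma_2^M\, \frac{\lambda_{\max}(\mathbf{R}_{H_0})}{\det(\hat{\mathbf{Q}}_{y,H_0})}\right) \\
&= P\left(\frac{\lambda_{\max}(\mathbf{R}_{H_0}) - \mu}{\nu} > \frac{\gamma_2^M\, \frac{(\sqrt{N} + \sqrt{M})^2}{\det(\hat{\mathbf{Q}}_{y,H_0})} - \mu}{\nu}\right) \\
&= 1 - F_2\left(\frac{\gamma_2^M\, \frac{(\sqrt{N} + \sqrt{M})^2}{\det(\hat{\mathbf{Q}}_{y,H_0})} - \mu}{\nu}\right),
\end{aligned} \tag{24}$$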
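The first line of (24) suggests that the GM statistic is the geometric mean of the eigenvalues, i.e., $\det(\hat{\mathbf{Q}}_y)^{1/M}$. Under that assumption, a short sketch of how it might be computed is given below; the function name is hypothetical. The logarithmic form is used because a direct product of 64 or more eigenvalues can overflow or underflow in floating point.

```python
import numpy as np

def gm_statistic(Q_hat):
    """Geometric mean of the eigenvalues of a Hermitian positive definite matrix."""
    lam = np.linalg.eigvalsh(Q_hat)
    # exp(mean(log(lam))) equals (prod lam)^(1/M) = det(Q_hat)^(1/M).
    return float(np.exp(np.mean(np.log(lam))))

Q_example = np.diag([1.0, 4.0])
gm_example = gm_statistic(Q_example)
```

The decision rule then compares this value with the threshold $\gamma_2$.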
As the number of emitters grows, the features also increase. In order to enlarge the
discrimination between the different signals, we perform log operations on them. Then,
the feature vector of any received signal is given by
Since the signal received by the base station originates from different emitters, inferring their number is a typical multiclass problem, for which machine learning-based methods are well suited. Assuming there are at most K emitters in the coverage area of the base station, we can train a K-class classifier on the existing training data and then feed the signal to be detected into this classifier for classification. Next, we introduce several high-performance classification algorithms.
$$\alpha_{1,j_1} = \sum_{h=1}^{5} v_{h,j_1}\, x(h), \tag{29}$$
where $v_{h,j_1}$ is the connection coefficient between the $h$th neuron of the input layer and the $j_1$th neuron of hidden layer 1. Then, the output of this neuron is given by
where $\delta_{1,j_1}$ denotes the threshold of the $j_1$th neuron of hidden layer 1, and $f(\cdot)$ is the activation function. Usually a sigmoid function is adopted, which is defined as
$$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}. \tag{31}$$
We can deduce the input and output of the rest of the hidden layers from hidden layer
1, and the output from the js th neuron of hidden layer s is given as
where $u_{j_{s-1},j_s}$ represents the connection coefficient between the $j_{s-1}$th neuron of hidden layer $s-1$ and the $j_s$th neuron of hidden layer $s$. Since the output of the last hidden layer is transmitted to the output layer, the final output of this network is
$$\hat{g}_k = f(\beta_k - \varepsilon_k) = f\left(\sum_{j_s=1}^{q_s} w_{j_s,k}\, z^s_{j_s} - \varepsilon_k\right), \tag{33}$$
where $w_{j_s,k}$ is the connection coefficient between hidden layer $s$ and the output layer, and $\varepsilon_k$ is the threshold of the $k$th neuron of the output layer.
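The layer-by-layer computation leading to (33) can be sketched as a forward pass in which every neuron subtracts its threshold before the sigmoid activation. The layer sizes, random parameters, and names below are assumptions for illustration only, and taking the index of the largest output as the predicted class is likewise an assumed decision rule.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, weights, thresholds):
    """Propagate x through all layers: z <- f(z W - threshold)."""
    z = x
    for W, delta in zip(weights, thresholds):
        z = sigmoid(z @ W - delta)
    return z

rng = np.random.default_rng(3)
sizes = [5, 8, 6, 3]  # 5 input features, hidden layers q1 = 8 and q2 = 6, K = 3 outputs
weights = [rng.standard_normal((a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
thresholds = [rng.standard_normal(b) for b in sizes[1:]]

g_hat = forward(rng.standard_normal(5), weights, thresholds)
predicted_class = int(np.argmax(g_hat)) + 1  # classes numbered 1..K (assumed rule)

n_coefficients = sum(W.size for W in weights)   # 5*q1 + q1*q2 + q2*K
n_thresholds = sum(d.size for d in thresholds)  # q1 + q2 + K
```

For these sizes the network has 5·8 + 8·6 + 6·3 = 106 connection coefficients and 8 + 6 + 3 = 17 thresholds.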
When the input signal is $\mathbf{x}_i$, the ideal output is $\mathbf{g}_i$, whereas the actual output of this neural network is $\hat{\mathbf{g}}_i = [\hat{g}_{i,1}, \dots, \hat{g}_{i,k}, \dots, \hat{g}_{i,K}]$. The mean squared error (MSE) between the ideal output and the actual output is then
$$E_i = \frac{1}{K}\sum_{k=1}^{K} (\hat{g}_{i,k} - g_{i,k})^2. \tag{34}$$
Based on the classification error, we can update all the $(5q_1 + \sum_{t=1}^{s-1} q_t q_{t+1} + q_s K)$ connection coefficients and $(\sum_{t=1}^{s} q_t + K)$ thresholds of this neural network. Taking the $j_s$th neuron of hidden layer $s$ as an example, we obtain
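where $l$ represents the number of iterations. According to the gradient descent method, the update terms are defined as
$$\begin{aligned}
\Delta w^{l}_{j_s,k} &= -\eta \frac{\partial E_i}{\partial w^{l}_{j_s,k}} \\
&= -\eta\, \frac{\partial E_i}{\partial \hat{g}_{i,k}} \cdot \frac{\partial \hat{g}_{i,k}}{\partial \beta_k} \cdot \frac{\partial \beta_k}{\partial w^{l}_{j_s,k}} \\
&= -\frac{2\eta}{K}\, z^{s}_{j_s} \cdot G_{i,k},
\end{aligned} \tag{36}$$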
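and
$$\begin{aligned}
\Delta \delta^{l}_{s,j_s} &= -\eta \frac{\partial E_i}{\partial \delta^{l}_{s,j_s}} \\
&= -\eta \sum_{k=1}^{K} \frac{\partial E_i}{\partial \hat{g}_{i,k}} \cdot \frac{\partial \hat{g}_{i,k}}{\partial \beta_k} \cdot \frac{\partial \beta_k}{\partial z^{s}_{j_s}} \cdot \frac{\partial z^{s}_{j_s}}{\partial \delta^{l}_{s,j_s}} \\
&= -\frac{2\eta}{K}\, z^{s}_{j_s}\left(1 - z^{s}_{j_s}\right) \cdot \sum_{k=1}^{K} w^{l}_{j_s,k}\, G_{i,k},
\end{aligned} \tag{37}$$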
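The chain rule behind the weight update in (36) can be checked numerically. The sketch below assumes that $G_{i,k} = (\hat{g}_{i,k} - g_{i,k})\,\hat{g}_{i,k}(1 - \hat{g}_{i,k})$, which is what the sigmoid derivative gives for the MSE in (34); the source defines $G_{i,k}$ outside this excerpt, so this form, like all names and sizes here, is an assumption. The analytic gradient is compared with a central finite difference.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mse(W, eps, z, g):
    """E_i = (1/K) * sum_k (g_hat_k - g_k)^2 for the output layer."""
    g_hat = sigmoid(z @ W - eps)
    return float(np.mean((g_hat - g) ** 2))

rng = np.random.default_rng(4)
q_s, K = 4, 3
z = rng.uniform(size=q_s)          # outputs of the last hidden layer
W = rng.standard_normal((q_s, K))  # coefficients w_{j_s,k}
eps = rng.standard_normal(K)       # output-layer thresholds
g = np.array([0.0, 1.0, 0.0])      # ideal (one-hot) output

g_hat = sigmoid(z @ W - eps)
G = (g_hat - g) * g_hat * (1.0 - g_hat)     # assumed form of G_{i,k}
grad_analytic = (2.0 / K) * np.outer(z, G)  # dE_i / dw_{j_s,k}

# Central finite differences as an independent check of the chain rule.
grad_numeric = np.zeros_like(W)
h = 1e-6
for j in range(q_s):
    for k in range(K):
        W_plus, W_minus = W.copy(), W.copy()
        W_plus[j, k] += h
        W_minus[j, k] -= h
        grad_numeric[j, k] = (mse(W_plus, eps, z, g) - mse(W_minus, eps, z, g)) / (2 * h)

eta = 0.1
W_new = W - eta * grad_analytic  # Delta w = -(2 * eta / K) * z * G
```

One gradient step with a small learning rate lowers the error for this sample, as expected of gradient descent.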
All the parameters in the neural network are updated in each iteration until the
parameters change less than a certain threshold or a certain number of iterations is reached.
Therefore, the final classification result for signal i is given by
where Ci ∈ {1, 2, . . . , K }.
[Figure: structure of the multi-layer neural network, with an input layer, hidden layers up to hidden layer s, and an output layer.]
$$\mathbf{w}^T\mathbf{x} + b = 0, \tag{40}$$
where $\mathbf{w}$ is the normal vector, which determines the direction of the hyperplane, and $b$ denotes the bias, which determines the distance from the hyperplane to the origin. Therefore, the separating hyperplane can be denoted as $(\mathbf{w}, b)$.
Assuming the samples can be classified correctly by the hyperplane $(\mathbf{w}, b)$, we obtain $\mathbf{w}^T\mathbf{x}_i + b < 0$ if $g_i = -1$, and $\mathbf{w}^T\mathbf{x}_i + b > 0$ if $g_i = +1$. Then the following conditions should be satisfied:
$$\begin{cases}
\mathbf{w}^T\mathbf{x}_i + b \ge +1, & g_i = +1, \\
\mathbf{w}^T\mathbf{x}_i + b \le -1, & g_i = -1.
\end{cases} \tag{41}$$
The samples closest to the separating hyperplane make the equalities in (41) hold, and they are called support vectors. The sum of the distances from two support vectors of opposite classes to the hyperplane is called the margin, and it is defined as $\delta = \frac{2}{\|\mathbf{w}\|}$. To maximize the margin of the separating hyperplane, the optimization problem can be formulated as
$$\min_{\mathbf{w}, b}\ \frac{1}{2}\|\mathbf{w}\|^2 \tag{42a}$$
$$\text{s.t.}\ g_i(\mathbf{w}^T\mathbf{x}_i + b) \ge 1. \tag{42b}$$
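A small worked example may help connect the constraints in (41) and (42b) with the margin $2/\|\mathbf{w}\|$. The four toy samples and the hand-picked hyperplane below are assumptions for illustration, not data from the paper.

```python
import numpy as np

# Toy two-class data (hypothetical): two samples per class.
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]])
g = np.array([1, 1, -1, -1])

# Hand-picked separating hyperplane w^T x + b = 0.
w = np.array([0.5, 0.5])
b = -1.0

functional_margins = g * (X @ w + b)  # g_i (w^T x_i + b)
feasible = bool(np.all(functional_margins >= 1 - 1e-12))  # constraint (42b)
support_vectors = X[np.isclose(functional_margins, 1.0)]  # equality holds
margin = 2.0 / np.linalg.norm(w)
```

Here the support vectors are (2, 2) and (0, 0), and the margin equals the distance between them, so no other hyperplane can separate these samples with a larger margin.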
In practice, the training samples can hardly be linearly separated in the original sample space. First, we map the samples to a higher-dimensional feature space. Then the model of the separating hyperplane is modified as
Second, to avoid overfitting, we introduce the concept of a soft margin. This concept allows the SVM to make errors in the classification of some samples, i.e., these samples need not satisfy the constraint $g_i(\mathbf{w}^T\phi(\mathbf{x}_i) + b) \ge 1$. Consequently, the optimization problem (42) is transformed to maximize the margin while minimizing the classification error:
$$\min_{\mathbf{w}, b, \xi_i}\ \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_{i=1}^{s} \xi_i, \tag{44a}$$
$$\text{s.t.}\ g_i(\mathbf{w}^T\phi(\mathbf{x}_i) + b) \ge 1 - \xi_i, \tag{44b}$$
$$\xi_i \ge 0. \tag{44c}$$
Substituting them into Equation (45), the dual problem of (44) is derived as
$$\max_{\alpha_i}\ \sum_{i=1}^{s} \alpha_i - \frac{1}{2}\sum_{i=1}^{s}\sum_{j=1}^{s} \alpha_i \alpha_j g_i g_j\, \kappa(\mathbf{x}_i, \mathbf{x}_j), \tag{47a}$$
$$P(c_k \mid \mathbf{x}_i) = \frac{P(c_k)\, P(\mathbf{x}_i \mid c_k)}{P(\mathbf{x}_i)} = \frac{P(c_k)\, P(\mathbf{x}_i \mid c_k)}{\sum_{k'=1}^{K} P(\mathbf{x}_i \mid c_{k'})\, P(c_{k'})}, \tag{49}$$
where $c_k$, $k \in D = \{1, 2, \dots, K\}$, is the label for classification. Therefore, the NBC for our problem can be expressed as
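The training process uses the training set to estimate the class prior probability $P(c_k)$ and the conditional probability $P(\mathbf{x}_i \mid c_k)$. Since the features in (28) are continuous, we can suppose $P(\mathbf{x}_i \mid c_k) \sim \mathcal{N}(\boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$, where $\boldsymbol{\mu}_k$ and $\boldsymbol{\Sigma}_k$ are the mean and covariance matrix of the feature vectors of all training samples that belong to class $k$. Therefore, the conditional probability can be represented by its PDF as
$$P(\mathbf{x}_i \mid c_k) = \frac{1}{(\sqrt{2\pi})^5\, |\boldsymbol{\Sigma}_k|^{1/2}}\, e^{-\frac{1}{2}(\mathbf{x}_i - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}_k^{-1} (\mathbf{x}_i - \boldsymbol{\mu}_k)}. \tag{51}$$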
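The Gaussian class-conditional model in (51) leads to a simple decision rule: pick the class with the largest log prior plus log likelihood. The sketch below illustrates this on synthetic five-dimensional features; the data, the class means, and the function names are assumptions for the example and are not the features of (28).

```python
import numpy as np

def fit_gaussian_classes(X, labels):
    """Estimate (prior, mean, covariance) for every class from training data."""
    params = {}
    for c in np.unique(labels):
        Xc = X[labels == c]
        params[int(c)] = (len(Xc) / len(X), Xc.mean(axis=0), np.cov(Xc, rowvar=False))
    return params

def log_posterior_scores(x, params):
    """log P(c_k) + log P(x | c_k) for each class, using the PDF in (51)."""
    scores = {}
    for c, (prior, mu, Sigma) in params.items():
        diff = x - mu
        _, logdet = np.linalg.slogdet(Sigma)
        quad = diff @ np.linalg.solve(Sigma, diff)
        scores[c] = np.log(prior) - 0.5 * (len(x) * np.log(2 * np.pi) + logdet + quad)
    return scores

def classify(x, params):
    scores = log_posterior_scores(x, params)
    return max(scores, key=scores.get)

rng = np.random.default_rng(5)
# Synthetic stand-in for 5-dimensional feature vectors of two classes.
X = np.vstack([rng.normal(0.0, 1.0, (100, 5)), rng.normal(4.0, 1.0, (100, 5))])
labels = np.repeat([1, 2], 100)
params = fit_gaussian_classes(X, labels)
```

Working with logarithms avoids numerical underflow of the exponential term and turns the product of prior and likelihood into a sum.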
Then, we can compute the logarithm of (50). Finally, the NBC can be transformed as
5. Simulation Results
In this section, representative simulation results are given to show the high performance of the signal detectors and classifiers proposed in this paper. First, we compare the two proposed signal detectors with existing detectors.
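$$\text{MME}: \ \frac{\lambda_{\max}(\hat{\mathbf{Q}}_y)}{\lambda_{\min}(\hat{\mathbf{Q}}_y)} \underset{H_0}{\overset{H_1}{\gtrless}} \gamma_3. \tag{53b}$$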
Figure 3 plots the probability of detection versus SNR, where the probability of false alarm is $P_{FA} = 10^{-4}$, the number of receive antennas is $M = 64$, the number of snapshots is $N = 100$, and the results are obtained from 5000 Monte Carlo simulations. Among the four detectors, SR-MME has the best performance across all SNR values, especially in the low-SNR region. Under extremely poor communication conditions, i.e., SNR in the range from −30 dB to −20 dB, M-MME and MME can hardly detect the presence of the signal sources, whereas SR-MME keeps the detection probability above 85%, so SR-MME is the best signal detector for low-SNR situations. The detection probability of the GM detector is slightly lower than that of SR-MME in the low-SNR region, but it still improves considerably on the other two detectors.
[Plot: probability of detection versus SNR (dB), from −30 dB to −10 dB, for SR-MME, GM, M-MME, and MME.]
Figure 4 presents the detection probability of the four signal detectors versus the number of samples, where $M = 64$, $P_{FA} = 10^{-4}$, and SNR = −20 dB. The overall trend of the curves is similar to that in Figure 3: SR-MME remains the best-performing detector and achieves a detection probability of at least 93%. The detection performance of the GM detector also improves as the number of samples increases, especially when $N$ ranges between 100 and 200, and GM improves significantly on M-MME and MME. Therefore, the robust performance of SR-MME and GM with a small number of samples can save considerable time and spatial resources without sacrificing detection performance.
[Plot: probability of detection versus number of samples (100 to 1000) for SR-MME, GM, M-MME, and MME.]
Figure 4. Probability of detection versus number of samples, SNR = −20 dB, $P_{FA} = 10^{-4}$.
Figure 5 shows the most commonly used indicator in the field of threshold detection, the receiver operating characteristic (ROC) curve, which evaluates a detector in terms of both detection probability and false alarm probability. The parameters in this simulation are $M = 64$, $N = 200$, and SNR = −20 dB. The ROC curve of SR-MME lies above the other three curves, so it has the best overall performance; correspondingly, MME has the worst. Because the ROC curves of GM and M-MME cross, the area under the ROC curve (AUC) is introduced to compare their performance. The axes in this figure are logarithmic; after converting them to linear coordinates, the AUC of M-MME is larger than that of GM. From this perspective, M-MME performs better than GM. In practice, however, a relatively low false alarm probability is preferred, so GM is more useful, since it can guarantee a low false alarm probability while maintaining a high detection probability.
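The comparison of crossing ROC curves can be made concrete with a short AUC computation on linear axes. The two curves below are made-up toy numbers chosen only to show how a detector can have the larger full AUC yet the smaller area in the low-false-alarm region; they are not the simulated curves of Figure 5, and the function names are hypothetical.

```python
import numpy as np

def auc_linear(p_fa, p_d):
    """Area under the ROC curve on linear axes (trapezoidal rule)."""
    p_fa = np.asarray(p_fa, dtype=float)
    p_d = np.asarray(p_d, dtype=float)
    order = np.argsort(p_fa)
    x, y = p_fa[order], p_d[order]
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

def partial_auc(p_fa, p_d, max_p_fa):
    """Area under the ROC curve restricted to P_FA <= max_p_fa."""
    p_fa = np.asarray(p_fa, dtype=float)
    p_d = np.asarray(p_d, dtype=float)
    order = np.argsort(p_fa)
    x, y = p_fa[order], p_d[order]
    y_end = np.interp(max_p_fa, x, y)  # detection probability at the cut-off
    keep = x < max_p_fa
    return auc_linear(np.append(x[keep], max_p_fa), np.append(y[keep], y_end))

# Hypothetical ROC samples for two detectors whose curves cross.
p_fa = [0.0, 0.01, 0.1, 1.0]
curve_a = [0.0, 0.6, 0.7, 1.0]  # stronger at low P_FA
curve_b = [0.0, 0.2, 0.9, 1.0]  # stronger at high P_FA

full_a, full_b = auc_linear(p_fa, curve_a), auc_linear(p_fa, curve_b)
low_a, low_b = partial_auc(p_fa, curve_a, 0.01), partial_auc(p_fa, curve_b, 0.01)
```

In this toy case curve B wins on the full AUC while curve A wins on the partial AUC below $P_{FA} = 0.01$, mirroring the argument that a full-range AUC alone may not reflect the operating region of interest.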
[Plot: ROC curves on logarithmic axes, probability of detection versus probability of false alarm ($10^{-5}$ to $10^{-1}$), for SR-MME, GM, M-MME, and MME.]
Figure 5. ROC curve, SNR = −20 dB, N = 200.
Figure 6 plots the classification accuracy of the four classifiers versus SNR, where $M = 64$, $N = 200$, and $K = 3$ in the test. The figure shows that the ML-NNs perform much better than NBC in all SNR regions, and that the accuracy of SVM is clearly lower than that of the ML-NNs when SNR ≥ −18 dB. Because neural networks have strong learning ability, deeper networks would cause overfitting and reduce the classification accuracy, so we only consider 3L-NN and 4L-NN in this work.
Comparing the curves of the signal detectors and the signal classifiers versus SNR in Figures 3 and 6, we find that when SNR = −20 dB and $P_{FA} = 10^{-4}$, the $P_D$ of SR-MME reaches 95%. Since $P_{FA} + P_{TN} = 1$, SR-MME rejects almost all the noise while ensuring a high signal detection rate. However, even for the best neural network-based signal classifier, the classification accuracy at SNR = −20 dB is only about 70%; that is, if the noise is fed directly into the classification process, nearly 30% of the noise will be misclassified as signals. Therefore, we believe that adding the signal detection step is necessary. Moreover, one signal detection takes approximately 0.04 s, whereas the training time of the four-layer neural network after adding noise increases to about 1.02 s when the number of training samples is 10. Therefore, using the signal detectors can also save time.
[Plot: classification accuracy; the remaining axis and legend are not recoverable.]
[Plot: classification accuracy versus number of receive antennas (0 to 140) for the 4-layer NN, 3-layer NN, SVM, and NBC.]
Figure 7. Classification accuracy versus number of receive antennas, SNR = −15 dB.
$$\mathrm{MDL}(m) = -2\log L_m^{(M-m)N} + \frac{1}{2}\, m(2M - m)\log N. \tag{59}$$
MDL modifies the bias term of AIC, leading to improved classification performance. The classification result of MDL is
Previous papers only verified the performance of AIC and MDL with small receive arrays, such as arrays with around eight antennas. To find out whether these two methods can maintain good performance with a massive receive array, we plot their classification accuracy against the number of receive antennas. Unfortunately, as shown in Figure 8, AIC and MDL only achieve good performance when the number of receive antennas is between 8 and 36. Once the number of receive antennas exceeds 36, their classification accuracy drops sharply, and at 44 antennas the number of emitters can no longer be inferred at all. From the definitions of AIC and MDL, the number of receive antennas equals the number of possible classes, so the model complexity grows with the number of antennas. If the model is too complex, the values of AIC and MDL increase, which results in overfitting. Thus, we conclude that AIC and MDL are not applicable to scenarios using massive receive arrays.
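For readers who want to experiment with the criterion in (59), a minimal sketch is given below. The definition of $L_m$ lies outside this excerpt, so the code assumes the classical choice in which $L_m$ is the ratio of the geometric mean to the arithmetic mean of the $M - m$ smallest eigenvalues; the function names and the idealized eigenvalue spectrum are also assumptions.

```python
import numpy as np

def mdl_scores(eigvals, N):
    """MDL(m) of Equation (59) for m = 0, ..., M-1 (L_m assumed as GM/AM ratio)."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]  # descending
    M = len(lam)
    scores = np.empty(M)
    for m in range(M):
        tail = lam[m:]  # the M - m smallest eigenvalues
        log_L = np.mean(np.log(tail)) - np.log(np.mean(tail))  # log(GM/AM) <= 0
        scores[m] = -2.0 * (M - m) * N * log_L + 0.5 * m * (2 * M - m) * np.log(N)
    return scores

def mdl_estimate(eigvals, N):
    """Number of emitters that minimizes the MDL score."""
    return int(np.argmin(mdl_scores(eigvals, N)))

# Idealized spectrum: two signal eigenvalues above six equal noise eigenvalues.
eigvals = [10.0, 5.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
k_hat = mdl_estimate(eigvals, N=100)
```

With sample eigenvalues from a large array and few snapshots, the noise eigenvalues spread out considerably, which is one way to see why the criterion degrades as $M$ grows.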
[Plot: classification accuracy versus number of receive antennas (0 to 140) for AIC and MDL.]
Figure 8. Classification accuracy versus number of receive antennas for AIC and MDL, SNR = 0 dB.
[Plot: classification accuracy versus SNR (dB), from −20 dB to 10 dB, for the 4-layer NN, 3-layer NN, SVM, NBC, AIC, and MDL.]
6. Conclusions
To provide vital prior knowledge for DOA estimation, a DOA preprocessing system containing signal detectors and ML-based signal classifiers has been proposed for inferring the number of UAV emitters in a massive MIMO system. Two high-precision signal detectors, SR-MME and GM, can quickly and accurately judge the presence of signal emitters based on the statistical characteristics of the received signals and threshold detection theory. Simulation results showed that the proposed SR-MME and GM have much better detection performance than existing detectors such as MME and M-MME, especially in the low-SNR region and with a small number of samples. After the presence of signals is determined, the specific number of emitters can be further determined by ML-based classifiers including ML-NN, SVM, and NBC. Compared with traditional methods such as AIC and MDL, the proposed methods work well with a massive MIMO array and have higher accuracy when SNR is low. In conclusion, we believe that the proposed system and methods will be helpful for the future implementation of UAV massive MIMO communications.
Author Contributions: Conceptualization, Y.L.; Methodology, Y.L.; Software, Y.L.; Validation, Y.L.;
Investigation, J.H., S.Y., W.Z., D.T. and Y.S.; Resources, H.S.; Writing—review & editing, J.W.; Project
administration, F.S. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported in part by the National Natural Science Foundation of China
(Nos. U22A2002 and 62071234), the Major Science and Technology plan of Hainan Province under
Grant ZDKJ2021022, and the Scientific Research Fund Project of Hainan University under Grant
KYQD(ZR)-21008. This work was also supported in part by the National Natural Science Foundation
of China under Grant 62001116, the Natural Science Foundation of Fujian Province under Grant
2020J05106.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Zeng, Y.; Zhang, R.; Lim, T.J. Wireless communications with unmanned aerial vehicles: Opportunities and challenges. IEEE
Commun. Mag. 2016, 54, 36–42. [CrossRef]
2. Huang, Y.; Wu, Q.; Lu, R.; Peng, X.; Zhang, R. Massive MIMO for cellular-connected UAV: Challenges and promising solutions.
IEEE Commun. Mag. 2021, 59, 84–90. [CrossRef]
3. Wang, C.X.; Haider, F.; Gao, X.; You, X.H.; Yang, Y.; Yuan, D.; Aggoune, H.M.; Haas, H.; Fletcher, S.; Hepsaydir, E. Cellular
architecture and key technologies for 5G wireless communication networks. IEEE Commun. Mag. 2014, 52, 122–130. [CrossRef]
4. Saad, W.; Bennis, M.; Chen, M. A vision of 6G wireless systems: Applications, trends, technologies, and open research problems.
IEEE Netw. 2019, 34, 134–142. [CrossRef]
5. Zhang, Z.; Xiao, Y.; Ma, Z.; Xiao, M.; Ding, Z.; Lei, X.; Karagiannidis, G.K.; Fan, P. 6G wireless networks: Vision, requirements,
architecture, and key technologies. IEEE Veh. Technol. Mag. 2019, 14, 28–41. [CrossRef]
6. Chandhar, P.; Larsson, E.G. Massive MIMO for connectivity with drones: Case studies and future directions. IEEE Access 2019,
7, 94676–94691. [CrossRef]
7. Harris, P.; Malkowsky, S.; Vieira, J.; Bengtsson, E.; Tufvesson, F.; Hasan, W.B.; Liu, L.; Beach, M.; Armour, S.; Edfors, O.
Performance characterization of a real-time massive MIMO system with LOS mobile channels. IEEE J. Sel. Areas Commun. 2017,
35, 1244–1253. [CrossRef]
8. Geraci, G.; Garcia-Rodriguez, A.; Azari, M.M.; Lozano, A.; Mezzavilla, M.; Chatzinotas, S.; Chen, Y.; Rangan, S.; Di Renzo, M. What
will the future of UAV cellular communications be? A flight from 5G to 6G. IEEE Commun. Surv. Tuts. 2022, 24, 1304–1335. [CrossRef]
9. Bai, L.; Huang, Z.; Cheng, X. A Non-Stationary Model with Time-Space Consistency for 6G Massive MIMO mmWave UAV
Channels. IEEE Trans. Wireless Commun. 2022, 22, 2048–2064. [CrossRef]
10. Chandhar, P.; Danev, D.; Larsson, E.G. Massive MIMO for communications with drone swarms. IEEE Trans. Wireless Commun.
2017, 17, 1604–1629. [CrossRef]
11. Huang, L.; Qian, C.; So, H.C.; Fang, J. Source enumeration for large array using shrinkage-based detectors with small samples.
IEEE Trans. Aerosp. Electron. Syst. 2015, 51, 344–357. [CrossRef]
12. Krim, H.; Viberg, M. Two decades of array signal processing research: the parametric approach. IEEE Signal Process. Mag. 1996,
13, 67–94. [CrossRef]
13. Aquino, S.; Vairavel, G. A Review of Direction of Arrival Estimation Techniques in Massive MIMO 5G Wireless Communication
Systems. In Proceedings of the Fourth International Conference on Communication, Computing and Electronics Systems: ICCCES 2022;
Springer: Berlin/Heidelberg, Germany, 2023; pp. 15–34.
14. Björnson, E.; Sanguinetti, L.; Wymeersch, H.; Hoydis, J.; Marzetta, T.L. Massive MIMO is a reality—What is next? Five promising
research directions for antenna arrays. Digit. Signal Process. 2019, 94, 3–20. [CrossRef]
15. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723. [CrossRef]
16. Schwarz, G. Estimating the dimension of a model. Ann. Stat. 1978, 6, 461–464. [CrossRef]
17. Rissanen, J. Modeling by shortest data description. Automatica 1978, 14, 465–471. [CrossRef]
18. Stoica, P.; Selen, Y. Model-order selection: a review of information criterion rules. IEEE Signal Process. Mag. 2004, 21, 36–47. [CrossRef]
19. Lu, Z.; Zoubir, A.M. Generalized Bayesian information criterion for source enumeration in array processing. IEEE Trans. Signal
Process. 2012, 61, 1470–1480. [CrossRef]
20. Lu, Z.; Zoubir, A.M. Flexible detection criterion for source enumeration in array processing. IEEE Trans. Signal Process. 2012,
61, 1303–1314. [CrossRef]
21. Williams, D.B.; Johnson, D.H. Using the sphericity test for source detection with narrow-band passive arrays. IEEE Trans. Acoust.
Speech Signal Process. 1990, 38, 2008–2014. [CrossRef]
22. Brcich, R.F.; Zoubir, A.M.; Pelin, P. Detection of sources using bootstrap techniques. IEEE Trans. Signal Process. 2002, 50, 206–215.
[CrossRef]
23. Wax, M.; Adler, A. Detection of the Number of Signals by Signal Subspace Matching. IEEE Trans. Signal Process. 2021, 69, 973–985.
[CrossRef]
24. Cabric, D.; Mishra, S.M.; Brodersen, R.W. Implementation issues in spectrum sensing for cognitive radios. In Proceedings of the
Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, 7–10
November 2004; Volume 1, pp. 772–776.
25. Cabric, D.; Tkachenko, A.; Brodersen, R.W. Spectrum sensing measurements of pilot, energy, and collaborative detection. In
Proceedings of the Milcom 2006-2006 IEEE Military Communications Conference, Washington, DC, USA, 23–25 October 2006;
pp. 1–7.
26. Gardner, W.A. Exploitation of spectral redundancy in cyclostationary signals. IEEE Signal Process. Mag. 1991, 8, 14–36. [CrossRef]
27. Zeng, Y.; Liang, Y.C. Eigenvalue-based spectrum sensing algorithms for cognitive radio. IEEE Trans. Commun. 2009, 57, 1784–1793.
[CrossRef]
28. Zhang, R.; Lim, T.J.; Liang, Y.C.; Zeng, Y. Multi-antenna based spectrum sensing for cognitive radios: A GLRT approach. IEEE
Trans. Commun. 2010, 58, 84–88. [CrossRef]
29. Liu, C.; Li, H.; Wang, J.; Jin, M. Optimal eigenvalue weighting detection for multi-antenna cognitive radio networks. IEEE Trans.
Wirel. Commun. 2016, 16, 2083–2096. [CrossRef]
30. Yang, M.; Ai, B.; He, R.; Huang, C.; Ma, Z.; Zhong, Z.; Wang, J.; Pei, L.; Li, Y.; Li, J. Machine-learning-based fast angle-of-arrival
recognition for vehicular communications. IEEE Trans. Veh. Technol. 2021, 70, 1592–1605. [CrossRef]
31. Bithas, P.S.; Michailidis, E.T.; Nomikos, N.; Vouyioukas, D.; Kanatas, A.G. A survey on machine-learning techniques for
UAV-based communications. Sensors 2019, 19, 5170. [CrossRef]
32. Jiang, C.; Zhang, H.; Ren, Y.; Han, Z.; Chen, K.C.; Hanzo, L. Machine learning paradigms for next-generation wireless networks.
IEEE Wirel. Commun. 2016, 24, 98–105. [CrossRef]
33. Thilina, K.M.; Choi, K.W.; Saquib, N.; Hossain, E. Machine learning techniques for cooperative spectrum sensing in cognitive
radio networks. IEEE J. Sel. Areas Commun. 2013, 31, 2209–2221. [CrossRef]
34. Zhuang, Z.; Xu, L.; Li, J.; Hu, J.; Sun, L.; Shu, F.; Wang, J. Machine-learning-based high-resolution DOA measurement and robust
directional modulation for hybrid analog-digital massive MIMO transceiver. Sci. China Inf. Sci. 2020, 63, 1–18. [CrossRef]
35. Shu, F.; Liu, L.; Yang, L.; Jiang, X.; Xia, G.; Wu, Y.; Wang, X.; Jin, S.; Wang, J.; You, X. Spatial Modulation: an Attractive Secure
Solution to Future Wireless Network. arXiv 2021, arXiv:2103.04051.
36. Jie, Q.; Zhan, X.; Shu, F.; Ding, Y.; Shi, B.; Li, Y.; Wang, J. High-performance Passive Eigen-model-based Detectors of Single
Emitter Using Massive MIMO Receivers. arXiv 2021, arXiv:2108.02011.
37. Zhang, R.; Shim, B.; Wu, W. Direction-of-Arrival Estimation for Large Antenna Arrays With Hybrid Analog and Digital
Architectures. IEEE Trans. Signal Process. 2021, 70, 72–88. [CrossRef]
38. Chen, C.E.; Lorenzelli, F.; Hudson, R.E.; Yao, K. Stochastic maximum-likelihood DOA estimation in the presence of unknown
nonuniform noise. IEEE Trans. Signal Process. 2008, 56, 3038–3044. [CrossRef]
39. Chiani, M. Distribution of the largest eigenvalue for real Wishart and Gaussian random matrices and a simple approximation for
the Tracy–Widom distribution. J. Multivar. Anal. 2014, 129, 69–81. [CrossRef]
40. Wei, L.; Tirkkonen, O. Analysis of scaled largest eigenvalue based detection for spectrum sensing. In Proceedings of the 2011
IEEE International Conference on Communications (ICC), Kyoto, Japan, 5–9 June 2011; pp. 1–5.
41. Tracy, C.A.; Widom, H. The distributions of random matrix theory and their applications. In Proceedings of the New Trends in
Mathematical Physics: Selected Contributions of the XVth International Congress on Mathematical Physics; Springer: Berlin/Heidelberg,
Germany, 2009; pp. 753–765.
42. Fokas, A.S.; Its, A.R.; Novokshenov, V.Y.; Kapaev, A.A.; Kapaev, A.I.; Novokshenov, V.I. Painlevé Transcendents: The Riemann-Hilbert
Approach; Number 128; American Mathematical Society: Providence, RI, USA, 2006.
43. Perry, P.; Johnstone, I.; Ma, Z.; Shahram, M. Rmtstat: Distributions and Statistics from Random Matrix Theory. R Software
Package Version. 2009. Available online: https://cran.rstudio.com/web/packages/RMTstat/index.html (accessed on 6 April
2023).
44. Hagan, M.T.; Demuth, H.B.; Beale, M. Neural Network Design; PWS Publishing Co.: Boston, MA, USA, 1997.
45. Prähofer, M.; Spohn, H. Exact scaling functions for one-dimensional stationary KPZ growth. J. Stat. Phys. 2004, 115, 255–279.
[CrossRef]
46. Akaike, H. Information theory and an extension of the maximum likelihood principle. In Selected Papers of Hirotugu Akaike;
Springer: Berlin/Heidelberg, Germany, 1998; pp. 199–213.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
drones
Article
Joint Communication and Action Learning in Multi-Target
Tracking of UAV Swarms with Deep Reinforcement Learning
Wenhong Zhou 1 , Jie Li 2 and Qingjie Zhang 1, *
Abstract: Communication is the cornerstone that allows UAV swarms to transmit information and achieve cooperation. However, manually designed communication protocols usually rely on prior expert knowledge and lack flexibility and adaptability, which may limit the communication ability between UAVs and hinder swarm cooperation. This paper adopts a new data-driven approach
to study how reinforcement learning can be utilized to jointly learn the cooperative communication
and action policies for UAV swarms. Firstly, the communication policy of a UAV is defined, so that
the UAV can autonomously decide the content of the message sent out according to its real-time
status. Secondly, neural networks are designed to approximate the communication and action
policies of the UAV, and their policy gradient optimization procedures are deduced, respectively.
Then, a reinforcement learning algorithm is proposed to jointly learn the communication and
action policies of UAV swarms. Numerical simulation results verify that the policies learned by
the proposed algorithm are superior to the existing benchmark algorithms in terms of multi-target
tracking performance, scalability in different scenarios, and robustness under communication failures.
2. Related Works
As an emerging research hotspot, the MADRL-based communication learning research
in recent years can be classified into several categories, including communication protocol,
communication structure, communication object, and communication timing, etc.
Drones 2022, 6, 339
agent can generate a belief state about its local observation and store it in a shared memory, and all agents can access and update the memory to achieve message passing between agents. However, in complex and highly dynamic scenarios, the belief states generated by different agents may differ widely, which is not conducive to establishing a stable cooperative relationship between agents.
3. Preliminary
3.1. Decentralized Partially Observable Markov Decision Process (Dec-POMDP)
Dec-POMDP [24] is a Markov decision process (MDP) model for multiple agents in which each agent can only partially observe the environment and makes its action decision accordingly. For n agents, each indexed by i ∈ [1, n], the Dec-POMDP at every step (the subscript t is omitted for convenience) can be described as:
where $N$ is the set of all agents, $S$ is the global state space covering the configurations of all agents and the environment, and $s \in S$ denotes the current state. The joint action space of all agents is denoted as $A : A^1 \times \cdots \times A^n$, in which $a^i \in A^i$ is agent $i$'s action; $O : (O^1, \cdots, O^n)$ denotes the joint observation space of all agents; $Z : o^i = Z(s, i)$ denotes the individual observation model of agent $i$ given the global state $s$, and $o^i \in O^i$ is agent $i$'s local observation. $T : P(s' \mid s, a) \to [0, 1]$ denotes the probability of transitioning from $s$ to the new state $s'$ when the joint action $a : (a^1, \cdots, a^n)$ is executed; $R$ is the reward function; $\gamma \in [0, 1]$ is the constant discount factor.
In Dec-POMDP, each agent makes its action decision following its individual policy $\pi^i : O^i \to A^i$, and the joint policy is denoted as $\pi : (\pi^{(1)}, \cdots, \pi^{(n)})$. Then, all agents execute the joint action to update the environment. Given a specific joint observation $o$ and the joint policy $\pi$ of all agents, if each agent can access its private reward $r_t^i$ at every time step $t$, the state-value function of all agents is denoted as $V_\pi(o) = \mathbb{E}_\pi\left[\sum_{t=0}^{\infty}\sum_{i=1}^{N} \gamma^t r_t^i \mid o_{t=0} = o\right]$. Furthermore, when the joint action $a$ is executed, their action-value function is denoted as $Q_\pi(o, a) = \mathbb{E}_\pi\left[\sum_{t=0}^{\infty}\sum_{i=1}^{N} \gamma^t r_t^i \mid (o, a)_{t=0} = (o, a)\right]$.
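The quantity inside the expectations above is the discounted sum of all agents' rewards along one trajectory. A minimal sketch of that computation for a finite episode is shown below; the function name and the toy reward array are assumptions for illustration.

```python
import numpy as np

def discounted_team_return(rewards, gamma):
    """Sum over time steps t and agents i of gamma^t * r_t^i.

    rewards: array of shape (T, n), one row per time step, one column per agent.
    """
    rewards = np.asarray(rewards, dtype=float)
    discounts = gamma ** np.arange(rewards.shape[0])
    return float(discounts @ rewards.sum(axis=1))

# Two agents, three time steps, every private reward equal to 1.
episode_return = discounted_team_return(np.ones((3, 2)), gamma=0.5)
```

Averaging this return over many episodes that start from the same joint observation gives a Monte Carlo estimate of the state-value function.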
The value function in DDPG is updated with the frozen network trick, and in addition
to the two networks appearing in AC, the target-policy function and the target-value
function are used to improve training stability [26].
4. Problem Formulation
4.1. Problem Description
The focus of this paper is a method for jointly learning communication and action policies to achieve swarm cooperation. To reduce the learning difficulty, we make reasonable assumptions and simplifications in the models of both the UAV and the target. As shown in Figure 1, a large number of homogeneous small fixed-wing UAVs track an unknown number of moving targets on the ground. Each UAV can only perceive the targets below it and cannot distinguish the specific identities or indices of the tracked targets. It is assumed that the UAVs move at a constant speed in a two-dimensional plane and adjust their headings according to the local communication messages and observation information. However, since the targets are non-cooperative and there is no explicit target assignment, a single UAV may track multiple aggregated targets, or multiple UAVs may cooperatively track one or more targets simultaneously. Therefore, the UAVs should cooperate in a decentralized manner to keep targets within their field of view and track as many targets as possible. In addition, the UAVs should satisfy safety constraints, such as avoiding collisions and staying within the boundaries.
where the subscript $t$ denotes the current time, $\Delta t$ is the discrete time step, $\dot{\theta}_{\max}$ is the UAV's maximum heading angular rate, and $x_{\max}$ and $y_{\max}$ are the maximum boundaries.
Similarly, for any target $k$, $k \in [1, m]$, the kinematic model can also be described with the position $[x_T^k, y_T^k]$ and heading angle $\theta_T^k$; the difference is that the target's heading angular rate $\dot{\theta}_T^k$ is assumed to be a bounded random variable.
$$\dot{\theta}_{U,t} = \frac{2n_a - N_a - 1}{N_a - 1}\, \dot{\theta}_{\max}, \quad n_a \in [1, N_a], \tag{5}$$
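Equation (5) maps a discrete action index to a heading angular rate spread evenly over $[-\dot{\theta}_{\max}, \dot{\theta}_{\max}]$. The short sketch below evaluates it for a hypothetical action set; the function name and the values $N_a = 5$ and $\dot{\theta}_{\max} = 0.5$ are assumptions for illustration.

```python
def heading_rate(n_a, N_a, theta_dot_max):
    """Heading angular rate for discrete action index n_a in [1, N_a], Equation (5)."""
    if N_a < 2:
        raise ValueError("N_a must be at least 2")
    if not 1 <= n_a <= N_a:
        raise ValueError("n_a must lie in [1, N_a]")
    return (2 * n_a - N_a - 1) / (N_a - 1) * theta_dot_max

# Five discrete actions with a maximum rate of 0.5 (hypothetical units: rad/s).
rates = [heading_rate(n, 5, 0.5) for n in range(1, 6)]
```

The first and last indices give the full left and right turn rates, and with an odd $N_a$ the middle index corresponds to flying straight.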
When UAV $i$ tracks multiple targets, its target tracking reward is $r^i_{tar} = \sum_{k=1}^{m} r^{i,k}_{tar}$. Specifically, the constant bias 1 in Equation (6) encourages the UAV to track more targets rather than fixating on a single target. For example, when tracking two targets, $r^i_{tar} \ge 2$, but when tracking a single target, $r^i_{tar} < 2$.
(2) Repeated Observation Penalty: Repeated observation of a target by multiple UAVs
may not increase the number of tracked targets but may increase the risk of collision
due to the proximity of the UAVs. Therefore, to improve the observation efficiency
and track more targets, a penalty item is defined to guide UAVs $i$ and $j$, $j \neq i$,
to avoid repeated observations, that is:
$$r_{rt}^{i,j} = \begin{cases} -0.5 \times \exp\left(\dfrac{2 \times r_o - d_{i,j}}{2 \times r_o}\right), & d_{i,j} \le 2 \times r_o, \\ 0, & \text{else}, \end{cases} \qquad (7)$$
and $r_{rt}^{i,j} = r_{rt}^{j,i}$. In Equation (7), if $d_{i,j} > 2 \times r_o$, there is no observational overlap
between UAV $i$ and $j$, and UAV $i$'s repeated observation penalty is $r_{rt}^{i} = \sum_{j=1, j \neq i}^{n} r_{rt}^{i,j}$.
(3) Boundary Penalty: To effectively capture and track targets, UAV $i$'s observation area
should always be within the boundaries. When the observation range is outside
the boundaries, the outside part is invalid. To this end, the minimum distance
from UAV $i$ to all boundaries is $d_{bound}^{i}$, and the boundary penalty item is defined as:
$$r_{bound}^{i} = \begin{cases} -0.5 \times \dfrac{r_o - d_{bound}^{i}}{r_o}, & d_{bound}^{i} < d_o, \\ 0, & \text{else}. \end{cases} \qquad (8)$$
$$r^{i} = r_{tar}^{i} + r_{rt}^{i} + r_{bound}^{i}. \qquad (9)$$
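The penalty terms in Equations (7) and (8) and their sum in Equation (9) can be sketched as follows. This is an illustrative reading of the formulas, not the authors' code: the tracking reward `r_tar` is supplied by the caller because Equation (6) is not reproduced in this section, and the function and argument names are assumptions.

```python
import math

def repeat_penalty(d_ij: float, r_o: float) -> float:
    """Equation (7): pairwise repeated-observation penalty between UAV i and j."""
    if d_ij <= 2 * r_o:
        return -0.5 * math.exp((2 * r_o - d_ij) / (2 * r_o))
    return 0.0

def boundary_penalty(d_bound: float, r_o: float, d_o: float) -> float:
    """Equation (8): penalty when the nearest boundary is closer than d_o."""
    if d_bound < d_o:
        return -0.5 * (r_o - d_bound) / r_o
    return 0.0

def total_reward(r_tar, neighbor_distances, d_bound, r_o, d_o):
    """Equation (9): tracking reward plus the two penalty terms."""
    r_rt = sum(repeat_penalty(d, r_o) for d in neighbor_distances)
    return r_tar + r_rt + boundary_penalty(d_bound, r_o, d_o)
```

Note that the repeated-observation penalty lies between $-0.5e$ (coincident UAVs) and $-0.5$ (observation areas just touching), and vanishes once the UAVs are more than $2 r_o$ apart.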
5. Methods
5.1. Communication and Action Policies Modeling
Based on the above settings, the set of UAV $i$'s neighbors that can communicate locally
with it at time $t$ is denoted as $\mathcal{N}_t^i$. Its communication and action decision-making process
is shown in Figure 3. Specifically, $j \in \mathcal{N}_t^i$; $a_t^i$ is its heading angular rate $\dot{\theta}_{U,t}^{i}$; $m_t^i$ indicates
the continuous and deterministic message that is about to be published to the neighbors.
Here, UAV $i$ can receive the messages from itself and all neighbors in the last moment; the
received message set $c_t^i$ is denoted as $c_t^i = \{ m_{t-1}^{i}, m_{t-1}^{j} \mid \forall j \in \mathcal{N}_t^i \}$. UAV $i$ makes its action and communication
decisions based on its local observation and the messages received. Then, the action policy
is defined as:
$$a_t^i \sim \pi_a(a \mid o_t^i, c_t^i), \qquad (10)$$
[Figure 3. Communication and action decision-making processes of UAV i and UAV j (diagram blocks: communication policy, action policy, other UAVs).]
Thus, the communication and action policies of UAV $i$ can be approximated with
neural networks. Taking the communication policy as an example, an overview of its neural
network is shown in Figure 4, and the aggregation process of its communication messages
in the dashed box on the right is as follows:
(1) At time $t$, transform the communication messages with a function $F$ whose parameters
can be learned to obtain high-level features [28], and denote $F(m_{t-1}^{i})$ as the query,
which represents the prior knowledge of UAV $i$, while $\{ F(m_{t-1}^{j}) \mid \forall j \in \mathcal{N}_t^i \}$ is the set
of sources, each of which indicates a received message to be aggregated;
(2) The correlation coefficient from any adjacent UAV $j$, $j \in \mathcal{N}_t^i$, to the central UAV $i$ is
defined as:
$$e_{ij} = \left\langle F(m_{t-1}^{i}), F(m_{t-1}^{j}) \right\rangle, \quad j \in \mathcal{N}_t^i \qquad (12)$$
where the inner product is a parameter-free calculation that outputs a scalar
measuring the correlation;
(3) Use the softmax function to normalize the similarity set $\{ e_{ij} \mid \forall j \in \mathcal{N}_t^i \}$ to obtain
the weight set $\{ w_{ij} \mid \forall j \in \mathcal{N}_t^i \}$, in which
$$w_{ij} = \frac{\exp(e_{ij})}{\sum_{j \in \mathcal{N}_t^i} \exp(e_{ij})} \qquad (13)$$
(4) Weighted summation over the source set yields the aggregated message $\hat{c}_t^i$:
$$\hat{c}_t^i = \sum_{j \in \mathcal{N}_t^i} w_{ij} F(m_{t-1}^{j}) \qquad (14)$$
Then, $o_t^i$ and $\hat{c}_t^i$ are concatenated and input into the following hidden layers to calculate
the output message $m_t^i$, and Equation (11) is redefined as:
$$m_t^i = \pi_c\left(o_t^i, \mathrm{GAT}\left(\{ m_{t-1}^{i}, m_{t-1}^{j} \mid \forall j \in \mathcal{N}_t^i \}\right); \theta_c\right) \qquad (15)$$
where $\theta_c$ is the parameter of the communication neural network, and the GAT component
is a part of the network.
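The aggregation in Equations (12) to (14) can be illustrated with a short dependency-free sketch. It assumes that the learnable transform $F$ has already been applied, so `query` stands for $F(m_{t-1}^{i})$ and `sources` for the list of $F(m_{t-1}^{j})$; the function name and the max-subtraction for numerical stability are illustrative choices, not details taken from the paper.

```python
import math

def aggregate(query, sources):
    """Dot-product attention over neighbor messages (Equations (12)-(14)).

    Assumes a non-empty neighbor set and vectors of equal length.
    """
    # Equation (12): inner product between the query and each source.
    scores = [sum(q * s for q, s in zip(query, src)) for src in sources]
    # Equation (13): softmax; subtracting the maximum leaves the weights unchanged.
    peak = max(scores)
    exps = [math.exp(e - peak) for e in scores]
    total = sum(exps)
    weights = [v / total for v in exps]
    # Equation (14): weighted sum of the source vectors.
    dim = len(query)
    aggregated = [
        sum(w * src[d] for w, src in zip(weights, sources)) for d in range(dim)
    ]
    return weights, aggregated
```

In a trained network the aggregated vector would then be concatenated with the local observation and passed to the hidden layers, as described above.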
Similarly, the action policy can also be approximated by a neural network; only
the output layer needs to be modified accordingly. Then, the discrete actions of each UAV $i$
obey the distribution:
$$a_t^i \sim \pi_a\left(a \mid o_t^i, \mathrm{GAT}\left(\{ m_{t-1}^{i}, m_{t-1}^{j} \mid \forall j \in \mathcal{N}_t^i \}\right); \theta_a\right) \qquad (16)$$
[Figure 4. Overview of the communication policy neural network (diagram labels: query, source, dot product, softmax, matmul, GAT, aggregated message $\hat{c}_t^i$, hidden layers, output message $m_t^i$).]
$$L_Q^i(\phi_Q) = \mathbb{E}_{\pi_a}\left[\frac{1}{2}\left(y^i - Q(o^i, c^i, a^i; \phi_Q)\right)^2\right] \qquad (17)$$
where $y^i = r^i + \gamma Q({o'}^i, {c'}^i, {a'}^i; \phi_Q^-) \big|_{{a'}^i \sim \pi_a(a \mid {o'}^i, {c'}^i; \theta_a)}$, $\phi_Q^-$ is the parameter of the corresponding
target network, ${a'}^i$ is the next action, and ${o'}^i$ and ${c'}^i$ are the local observations and the set
of received messages at the next moment, respectively. The temporal-difference (TD) error
is denoted as $\delta^i = r^i + \gamma Q({o'}^i, {c'}^i, {a'}^i; \phi_Q^-) - Q(o^i, c^i, a^i; \phi_Q)$, and the gradient of this loss
function with respect to $\phi_Q$ used for gradient descent is:
Then, the action policy is updated via maximizing the action-value function:
policy $\pi_a$, the parameter of the action-value function $\phi_Q$, and UAV $i$'s current input variables
$(o_t^i, c_t^i)$, the communication objective is denoted as:
$$J_c^i(\theta_c) = \mathbb{E}_{\pi_c}\left[\frac{1}{|\mathcal{N}^i|}\sum_{j \in \mathcal{N}^i} Q\left(o_{t+1}^{j}, (c_{t+1}^{j/i}, m_t^i), a_{t+1}^{j}; \phi_Q\right) \,\Big|\, m_t^i = \pi_c(o_t^i, c_t^i; \theta_c),\ a_{t+1}^{j} \sim \pi_a(a \mid o_{t+1}^{j}, (c_{t+1}^{j/i}, m_t^i); \theta_a)\right], \qquad (21)$$
where $c_{t+1}^{j/i}$ is the set of UAV $j$'s received messages except $m_t^i$.
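Then, the communication policy gradient is derived according to the policy gradient
theorem and the chain rule as:
$$\begin{aligned} \nabla_{\theta_c} J_c^i(\theta_c) = \frac{1}{|\mathcal{N}^i|}\sum_{j \in \mathcal{N}^i} \mathbb{E}_{\pi_c}\Big[ & \nabla_{\theta_c} \pi_c(o_t^i, c_t^i; \theta_c) \cdot \nabla_{m_t^i} \log \pi_a(a \mid o_{t+1}^{j}, (c_{t+1}^{j/i}, m_t^i); \theta_a)\, Q(o_{t+1}^{j}, (c_{t+1}^{j/i}, m_t^i), a_{t+1}^{j}; \phi_Q) \\ & + \nabla_{\theta_c} \pi_c(o_t^i, c_t^i; \theta_c)\, \nabla_{m_t^i} Q(o_{t+1}^{j}, (c_{t+1}^{j/i}, m_t^i), a_{t+1}^{j}; \phi_Q) \Big]. \end{aligned} \qquad (22)$$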
For simplicity, the conditional term in Equation (21) is omitted. Thus, given the input
variables of UAV i at the current moment t and that of all adjacent UAVs N i at the next
moment t + 1, the communication policy gradient can be calculated via Equation (22).
Then, the policy can be updated accordingly.
Note that the objective functions in Equations (17), (19) and (21) are non-convex when
they are approximated with neural networks. Common optimizers, such
as MBSGD (mini-batch stochastic gradient descent) or Adam (adaptive moment estimation)
in PyTorch, are usually adopted to solve these optimization problems.
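$$L_Q(\phi_Q) = \frac{1}{2B_1}\sum_{k=1}^{B_1}\left[\left(y_k - Q(o_k, c_k, a_k; \phi_Q)\right)^2\right];$$
16: end if
17: Update parameter $\theta_a$ with gradient:
$$\nabla_{\theta_a} J_a(\theta_a) \approx \frac{1}{B_1}\sum_{k=1}^{B_1} \nabla_{\theta_a} \log \pi(a_k \mid o_k, c_k; \theta_a)\,\delta_k$$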
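The scalar quantities used in these mini-batch updates can be sketched without any deep learning library. The snippet below is illustrative only: it takes the current and target-network Q-values as plain lists (in practice they would come from the critic and its target network), and the function names are assumptions.

```python
def td_targets(rewards, next_q_values, gamma):
    """y_k = r_k + gamma * Q(o'_k, c'_k, a'_k) evaluated with the target network."""
    return [r + gamma * q_next for r, q_next in zip(rewards, next_q_values)]

def critic_loss(targets, q_values):
    """Mini-batch critic loss: 1 / (2 * B_1) * sum_k (y_k - Q_k)^2."""
    batch_size = len(targets)
    return sum((y - q) ** 2 for y, q in zip(targets, q_values)) / (2 * batch_size)

def td_errors(targets, q_values):
    """delta_k = y_k - Q_k, the weight on grad log pi(a_k | o_k, c_k) in step 17."""
    return [y - q for y, q in zip(targets, q_values)]
```

A positive TD error increases the log-probability of the sampled action in step 17, while a negative one decreases it.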
6. Experiments
6.1. Benchmark Algorithms
In this paper, the proposed Algorithm 1 is named Att-Message for simplicity. To the best
of our knowledge, the existing literature reports few techniques other than MADRL that
can solve for communication and action policies enabling large-scale UAV swarms
to cooperate. Thus, we select and adopt several benchmark algorithms that are commonly
used by researchers [14,19], together with a non-communication one, including:
(1) No-Comm. In No-Comm, each UAV receives only its local observation
and selfishly maximizes its individual reward. There is no explicit communication
channel between UAVs and naturally no explicit cooperation or competition. Thus,
the communication policy is $\pi_c = \text{Null}$, and the action policy is:
$$a^i \sim \pi_a(a \mid o^i; \theta_a). \qquad (23)$$
(2) Local-CommNet. In CommNet [14], it is assumed that each agent can receive all
agents' communication messages. Here, it is adapted to the local communication
configuration of UAV swarms in this paper and named Local-CommNet.
Specifically, each UAV publishes the hidden layer information $h$ of its action policy
network to its adjacent UAVs, i.e., $m_{t-1}^{j} = h_{t-1}^{j}$. Then, the messages received by UAV
$i$ are denoted as:
$$c_t^i = \{ h_{t-1}^{j} \mid \forall j \in \mathcal{N}_t^i \}. \qquad (24)$$
These messages are aggregated by average pooling:
$$\hat{c}_t^i = \frac{1}{|\mathcal{N}_t^i|}\sum_{j \in \mathcal{N}_t^i} h_{t-1}^{j}. \qquad (25)$$
(3) Att-Hidden. In addition to the average pooling method, the GAT can also be used
to aggregate $c_t^i$ [12,19]. Then:
$$\hat{c}_t^i = \mathrm{GAT}(\{ h_{t-1}^{i}, h_{t-1}^{j} \mid \forall j \in \mathcal{N}_t^i \}). \qquad (26)$$
The message of each UAV is its hidden layer information of the action policy network,
and there is no separate communication policy network. So GAT, as an encoder, could
be a component of the action policy network. The network can be updated according
to the input variable $(o_t^i, \{ h_{t-1}^{i}, h_{t-1}^{j} \mid \forall j \in \mathcal{N}_t^i \})$ following Equation (20).
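To make the difference between the benchmarks concrete, the sketch below shows the average pooling of Equation (25) and how the input of the action policy might be assembled with and without an aggregated message. Concatenating the observation with the aggregated vector is an assumption carried over from the description of Att-Message, and both function names are hypothetical.

```python
def mean_pool(hidden_states):
    """Equation (25): element-wise average of the neighbors' hidden vectors."""
    count = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(h[d] for h in hidden_states) / count for d in range(dim)]

def policy_input(observation, aggregated=None):
    """Assumed input layout: observation followed by the aggregated message.

    No-Comm passes aggregated=None; Local-CommNet passes mean_pool(...);
    Att-Hidden would pass an attention-weighted aggregate instead.
    """
    extra = list(aggregated) if aggregated is not None else []
    return list(observation) + extra
```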
6.2. Settings
In this section, the effectiveness of the proposed algorithm is verified by numerical
simulation experiments. According to the problem description (Section 4.1), the training
environmental parameters are set in Table 1. These parameters have been used in our
previous work [8,27], and the rationality has been verified. During testing, the environment
size and the numbers of UAVs and targets may change. The hyper-parameters of those
algorithms are configured in Table 2.
Hyper-Parameter                          Value
Iteration Episode                        2 × 10^3
Replay Buffer                            5 × 10^5
Max Step                                 200
Batch Size                               64
Target Network Update Interval           100
Action Policy Learning Rate              1 × 10^-4
Communication Policy Learning Rate       5 × 10^-5
Communication Policy Output Dimension    100
Discount Factor                          0.95
To evaluate the tracking performance of UAV swarms, the following metrics are defined:
(1) Average Reward:
$$\frac{1}{Tn}\sum_{t=1}^{T}\sum_{i=1}^{n} r_t^i, \qquad (27)$$
where $r_t^i$ has been defined in Equation (9), which comprehensively evaluates the
performance of UAV swarms from the aspects of target tracking, repeated observation,
safe flight, etc.
(2) Average Target:
$$\frac{1}{Tn}\sum_{t=1}^{T}\sum_{i=1}^{n}\sum_{k=1}^{m} \mathbf{1}(i,k), \quad \mathbf{1}(i,k) = \begin{cases} 1, & d_{(i,k)} \le d_o; \\ 0, & \text{else}. \end{cases} \qquad (28)$$
which evaluates the number of targets tracked from the perspective of each UAV.
(3) Collective Target:
$$\frac{1}{T}\sum_{t=1}^{T}\sum_{k=1}^{m} \mathbf{1}(k), \quad \mathbf{1}(k) = \begin{cases} 1, & \exists\, i \in [1,n], \text{ s.t. } d_{(i,k)} \le d_o; \\ 0, & \text{else}. \end{cases} \qquad (29)$$
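The two target-counting metrics in Equations (28) and (29) can be computed from recorded positions as sketched below. The trajectory layout (a list of per-step pairs of UAV and target positions) and the function names are assumptions made for illustration, as is the use of the planar Euclidean distance for $d_{(i,k)}$.

```python
import math

def tracked(uav, target, d_o):
    """Indicator 1(i, k): the target lies within distance d_o of the UAV."""
    return math.dist(uav, target) <= d_o

def average_target(trajectory, d_o):
    """Equation (28): tracked (UAV, target) pairs per UAV and time step."""
    total = 0
    for uavs, targets in trajectory:
        total += sum(tracked(u, k, d_o) for u in uavs for k in targets)
    num_uavs = len(trajectory[0][0])
    return total / (len(trajectory) * num_uavs)

def collective_target(trajectory, d_o):
    """Equation (29): targets seen by at least one UAV, averaged over time."""
    total = 0
    for uavs, targets in trajectory:
        total += sum(any(tracked(u, k, d_o) for u in uavs) for k in targets)
    return total / len(trajectory)
```

The example in the test below shows why the two metrics can diverge: two UAVs observing the same target raise the average target but leave the collective target unchanged.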
[Figure 5 panels: (a) Average reward curves; (b) Average target curves; (c) Collective target curves. Horizontal axis: episode (0 to 2000); legend: NoComm, LocalCommNet, AttHidden, AttMessage.]
Figure 5. The metric curves during the training process of the four algorithms.
Looking at specifics, (1) the three algorithms using explicit communication outperform
No-Comm, which indicates that communication can effectively
promote the cooperation between UAVs, thereby improving the tracking performance
of UAV swarms; (2) the comprehensive performance of Att-Hidden, which uses GAT, is better
than that of Local-CommNet, although the UAVs in both algorithms transmit the hidden layer
of the action policy network. The reason may be that GAT can better aggregate the received
messages and thus effectively improve the action policy of UAVs and the cooperation between
them; (3) furthermore, the comprehensive performance of Att-Message is superior to
that of Att-Hidden, indicating that, compared with the hidden layer of the action policy
neural network, the communication message can better capture the information that is
helpful for cooperation. This also suggests that the communication policy in Att-Message can
be optimized based on feedback from other UAVs to facilitate cooperation between UAVs.
Furthermore, Figure 6 visualizes the tracking process of the UAVs using
the four algorithms, and the snapshots are consistent with the previous conclusions.
In addition, when executing the policies learned by Att-Message, the UAVs
exhibit obvious cooperative behaviors. For example, when a target escapes the observation
range of a UAV, the adjacent UAVs can quickly track and recapture it.
The UAVs also tend to avoid getting too close to each other, which reduces
repeated tracking and improves the observation coverage so that more targets
are captured.
[Figure 6. Snapshots of the tracking process of the UAVs using the four algorithms (axes: X (m), Y (m)).]
Map Size   n      m      Metrics          No-Comm    Local-CommNet   Att-Hidden   Att-Message
1000       5      5      Average Reward   -1.3108    -0.8653         -0.2912      -0.3554
                         Average Target    1.0626     1.1393          1.0513       1.2109
                         Coverage          0.6555     0.7370          0.7626       0.7915
1000       10     10     Average Reward   -4.1760    -2.5640         -1.6756      -1.4816
                         Average Target    1.9919     1.9792          1.4915       1.6382
                         Coverage          0.7692     0.8278          0.8508       0.8722
2000       10     10     Average Reward   -0.8157    -0.0425          0.0645       0.0776
                         Average Target    0.7190     0.8166          0.7589       0.8777
                         Coverage          0.5357     0.6207          0.6275       0.6714
2000       20     20     Average Reward   -2.5680    -1.1612         -0.5765      -0.5166
                         Average Target    1.2339     1.3396          1.0992       1.2432
                         Coverage          0.6581     0.7523          0.7586       0.8026
2000       50     50     Average Reward   -6.9769    -6.1045         -3.3002      -3.0900
                         Average Target    2.5686     2.5803          1.9871       2.1014
                         Coverage          0.8562     0.8475          0.9100       0.9183
5000       100    100    Average Reward   -2.2617    -1.1111         -0.9921      -0.5119
                         Average Target    1.1118     1.1958          1.1712       1.0925
                         Coverage          0.6170     0.7174          0.7297       0.7542
5000       200    200    Average Reward   -4.5219    -3.7763         -2.2895      -2.0386
                         Average Target    1.8413     1.9586          1.4805       1.6496
                         Coverage          0.7743     0.8177          0.8392       0.8705
10,000     1000   1000   Average Reward   -6.0707    -5.3054         -3.4510      -3.1223
                         Average Target    2.2993     2.3369          1.7754       1.9648
                         Coverage          0.8242     0.8406          0.8814       0.9043
The statistical results generally indicate that the average reward and coverage of the three
algorithms that introduce explicit communication are significantly better than those of
No-Comm in different scenarios, which once again verifies the effectiveness of the
communication. Specifically, Att-Message performs better than the other algorithms in terms of
average reward and coverage in almost all scenarios, which directly reflects that the UAVs
adopting the action and communication policies learned with Att-Message can better cooperate
to track more targets in different scenarios. However, in the scenarios with dense UAVs and
targets, the average targets of No-Comm and Local-CommNet are higher, indicating that the
individual performance of a single UAV is excellent, while the cooperation between UAVs
is much lower. This also reveals the importance of cooperation for the emergence of
swarm intelligence.
Combined with the visualization in Figure 6, the numerical results verify that UAV
swarms can learn more efficient communication and action policies by using Att-Message
and can scale these policies to different scenarios and achieve better performance.
[Figure 7 panels: (a) The trend of average reward; (b) The trend of average target; (c) The trend of collective target. Horizontal axis: error probability (0 to 1); legend: NoComm, LocalCommNet, AttHidden, AttMessage.]
Figure 7. The variation trends of the metrics with communication error probability.
[Figure 8 panels: (a) The trend of average reward; (b) The trend of average target; (c) The trend of collective target. Horizontal axis: loss probability (0 to 1); legend: NoComm, LocalCommNet, AttHidden, AttMessage.]
Figure 8. The variation trends of the metrics with communication loss probability.
As expected, when the probability increases, the available messages gradually
dwindle, and the comprehensive performance of the three algorithms with
communication gradually deteriorates. The reason is that the reduction of useful messages leads
to increased conflicts between UAVs and a decrease in cooperation. Moreover, as the
probability increases, the average targets of these three algorithms gradually increase,
indicating that the UAVs shortsightedly maximize the number of targets tracked by
individuals. In addition, when the communication is paralyzed, each UAV makes a completely
independent decision. It can be seen that the comprehensive performance of Att-Message
is the best, which reveals that while learning the communication policy, the UAVs can
also learn a better action policy for tracking targets. Therefore, even if the communication
fails, the improvement of the individual MTT capability can still benefit the overall
capability of the swarm to a certain extent.
In summary, when there is a communication failure, such as message loss or error,
the comprehensive performance of the communication and action policies learned by
the proposed algorithm is affected to a certain extent, but it remains better than
that of the other three benchmark algorithms. The numerical results also demonstrate the
robustness of the learned policies.
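A robustness test of this kind can be emulated by perturbing the received message set before it reaches the policies. The failure model used in the experiments is not specified in this section, so the sketch below rests on stated assumptions: each message is lost independently with probability `p_loss`, and each surviving message is corrupted by additive Gaussian noise with probability `p_error`. The function name and all parameters are hypothetical.

```python
import random

def apply_failures(messages, p_loss=0.0, p_error=0.0, noise_std=1.0, rng=None):
    """Return the messages that survive an assumed lossy and noisy channel."""
    rng = rng or random.Random()
    received = []
    for msg in messages:
        if rng.random() < p_loss:
            continue  # assumed loss model: the message is dropped entirely
        if rng.random() < p_error:
            # assumed error model: additive Gaussian noise on every component
            msg = [v + rng.gauss(0.0, noise_std) for v in msg]
        received.append(list(msg))
    return received
```

Setting `p_loss=1.0` reproduces the paralyzed-communication case in which every UAV decides from its own observation alone.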
7. Discussion
As mentioned earlier, the research object of this paper is large-scale UAV swarms
in which each UAV can only communicate and interact locally with the adjacent ones
when making decisions. In the local topology, and ignoring other factors, the computational
complexity of the action and communication policies for processing (aggregating) a message
is assumed to be a unit, denoted as $O(1)$, and the average cardinality of the message
set is denoted as $|c|$.
Then, in the decentralized execution, the computational complexity of the action
and communication policies of each UAV is $O(|c|)$ according to Equations (16) and (15),
respectively. In centralized training, the computational complexity of updating the action
policy is also $O(|c|)$ according to Equation (20), and that of the communication policy is
$O(2|c|^2)$ since a message can influence the communication and action decisions at the next
step of all adjacent UAVs according to Equation (22).
Therefore, similar to most MADRL algorithms adopting the CTDE framework, the
training of the proposed algorithm requires more computational resources than execution,
which makes it suitable for offline implementation. The offline training in this paper was
deployed on a computer equipped with an Intel (R) Xeon E5 CPU (Manufacturer: Intel
Corporation, Santa Clara, CA, USA) and a GTX Titan X GPU (Manufacturer: ASUS, Taiwan);
the operating system was Ubuntu 16.04 LTS, and the algorithm was implemented in
PyTorch. The learned policies can then be executed online without retraining. The specific
requirement of computational resources should take constraints such as the computing
platform, neural network design and optimization, and decision frequency
into consideration.
Moreover, in the observation of a target, we only consider the simple numerical
information, such as its location and speed, but not the real-time image, and the communi-
cation policy can also compress and encode the high-dimensional information to realize
lightweight embedding interaction. These can further improve the feasibility of the algo-
rithm in real-world scenarios.
establish more accurate models to investigate how the physical aspects of both the UAVs
and targets would affect the MTT performance.
Author Contributions: Conceptualization, W.Z.; data curation, J.L.; formal analysis, W.Z.; funding
acquisition, J.L.; investigation, W.Z.; methodology, W.Z.; project administration, Q.Z.; software,
W.Z.; supervision, J.L.; validation, W.Z.; visualization, W.Z.; writing—original draft, W.Z.; writing—
review and editing, W.Z., J.L. and Q.Z. All authors have read and agreed to the published version of
the manuscript.
Funding: This research was funded by the Science and Technology Innovation 2030-Key Project
of “New Generation Artificial Intelligence” under Grant 2020AAA0108200.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Goldhoorn, A.; Garrell, A.; Alquezar, R.; Sanfeliu, A. Searching and tracking people in urban environments with static and
dynamic obstacles. Robot. Auton. Syst. 2017, 98, 147–157. [CrossRef]
2. Senanayake, M.; Senthooran, I.; Barca, J.C.; Chung, H.; Kamruzzaman, J.; Murshed, M. Search and tracking algorithms for swarms
of robots: A survey. Robot. Auton. Syst. 2016, 75, 422–434. [CrossRef]
3. Abdelkader, M.; Güler, S.; Jaleel, H.; Shamma, J.S. Aerial Swarms: Recent Applications and Challenges. Curr. Robot. Rep. 2021, 2,
309–320. [CrossRef] [PubMed]
4. Emami, Y.; Wei, B.; Li, K.; Ni, W.; Tovard, E. Joint Communication Scheduling and Velocity Control in Multi-UAV-Assisted Sensor
Networks: A Deep Reinforcement Learning Approach. IEEE Trans. Veh. Technol. 2021, 9545, 1–13. [CrossRef]
5. Maravall, D.; de Lope, J.; Domínguez, R. Coordination of Communication in Robot Teams by Reinforcement Learning.
Robot. Auton. Syst. 2013, 61, 661–666. [CrossRef]
6. Kriz, V.; Gabrlik, P. UranusLink—Communication Protocol for UAV with Small Overhead and Encryption Ability.
IFAC-PapersOnLine 2015, 48, 474–479. [CrossRef]
7. Khuwaja, A.A.; Chen, Y.; Zhao, N.; Alouini, M.S.; Dobbins, P. A Survey of Channel Modeling for UAV Communications.
IEEE Commun. Surv. Tutor. 2018, 20, 2804–2821. [CrossRef]
8. Zhou, W.; Li, J.; Liu, Z.; Shen, L. Improving multi-target cooperative tracking guidance for UAV swarms using multi-agent
reinforcement learning. Chin. J. Aeronaut. 2022, 35, 100–112. [CrossRef]
9. Bochmann, G.; Sunshine, C. Formal Methods in Communication Protocol Design. IEEE Trans. Commun. 1980, 28, 624–631.
[CrossRef]
10. Rashid, T.; Samvelyan, M.; Schroeder, C.; Farquhar, G.; Foerster, J.; Whiteson, S. QMIX: Monotonic value function factorisation
for deep multi-agent reinforcement learning. In Proceedings of the 35th International Conference on Machine Learning,
Stockholm, Sweden, 10–15 July 2018; Volume 80, pp. 4295–4304.
11. Son, K.; Kim, D.; Kang, W.J.; Hostallero, D.; Yi, Y. QTRAN: Learning to factorize with transformation for cooperative multi-agent
reinforcement learning. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA,
9–15 June 2019; pp. 10329–10346. Available online: https://arxiv.org/abs/1905.05408 (accessed on 20 September 2022).
12. Wu, S.; Pu, Z.; Qiu, T.; Yi, J.; Zhang, T. Deep Reinforcement Learning based Multi-target Coverage with Connectivity Guaranteed.
IEEE Trans. Ind. Inf. 2022, 3203, 1–12. [CrossRef]
13. Xia, Z.; Du, J.; Wang, J.; Jiang, C.; Ren, Y.; Li, G.; Han, Z. Multi-Agent Reinforcement Learning Aided Intelligent UAV Swarm for
Target Tracking. IEEE Trans. Veh. Technol. 2022, 71, 931–945. [CrossRef]
14. Sukhbaatar, S.; Szlam, A.; Fergus, R. Learning multiagent communication with backpropagation. In Proceedings
of the 30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, 5–10 December 2016.
[CrossRef]
15. Hausknecht, M.; Stone, P. Grounded semantic networks for learning shared communication protocols. In Proceedings of the
International Conference on Machine Learning (Workshop), New York City, NY, USA, 19–24 June 2016.
16. Foerster, J.; Assael, Y.M.; de Freitas, N.; Whiteson, S. Learning to Communicate with Deep Multi-Agent Reinforcement Learning.
Adv. Neural Inf. Process. Syst. 2016, 29, 2137–2145.
17. Pesce, E.; Montana, G. Improving coordination in small-scale multi-agent deep reinforcement learning through memory-driven
communication. Mach. Learn. 2020, 109, 1–21. [CrossRef]
18. Peng, P.; Wen, Y.; Yang, Y.; Yuan, Q.; Tang, Z.; Long, H.; Wang, J. Multiagent Bidirectionally-Coordinated Nets: Emergence
of Human-Level Coordination in Learning to Play StarCraft Combat Games. 2017. Available online:
https://arxiv.org/abs/1703.10069 (accessed on 20 September 2022).
19. Jiang, J.; Lu, Z. Learning attentional communication for multi-agent cooperation. In Proceedings of the 32nd International
Conference on Neural Information Processing Systems, Montréal, QC, Canada, 3–8 December 2018; Volume 18, pp. 7265–7275.
20. Liu, Y.; Wang, W.; Hu, Y.; Hao, J.; Chen, X.; Gao, Y. Multi-agent game abstraction via graph attention neural network.
In Proceedings of the AAAI 2020—34th AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 December 2020;
pp. 7211–7218. [CrossRef]
21. Ding, G.; Huang, T.; Lu, Z. Learning Individually Inferred Communication for Multi-Agent Cooperation. In Proceedings of
the 34th Conference on Neural Information Processing Systems (NeurIPS2020), Vancouver, BC, Canada, 6–12 December 2020;
Volume 33, pp. 22069–22079.
22. Das, A.; Gervet, T.; Romoff, J.; Batra, D.; Parikh, D.; Rabbat, M.; Pineau, J. TarMAC: Targeted multi-agent communication.
In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; Volume 97, pp.
1538–1546.
23. Singh, A.; Jain, T.; Sukhbaatar, S. Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks.
arXiv 2018, arXiv:1812.09755.
24. Dibangoye, J.S.; Amato, C.; Buffet, O.; Charpillet, F. Optimally Solving Dec-POMDPs as Continuous-State MDPs. J. Artif. Intell.
Res. 2016, 55, 443–497. [CrossRef]
25. Sutton, R.S.; Barto, A.G. Temporal-difference learning. In Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA,
USA, 1998; pp. 133–160.
26. Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep
reinforcement learning. arXiv 2015, arXiv:1509.02971.
27. Zhou, W.; Liu, Z.; Li, J.; Xu, X.; Shen, L. Multi-target tracking for unmanned aerial vehicle swarms using deep reinforcement
learning. Neurocomputing 2021, 466, 285–297. [CrossRef]
28. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
29. Lee, J.B.; Rossi, R.A.; Kim, S.; Ahmed, N.K.; Koh, E. Attention Models in Graphs: A Survey. ACM Trans. Knowl. Discov. Data 2019,
13, 1–25. [CrossRef]
30. Zhou, J.; Cui, G.; Hu, S.; Zhang, Z.; Yang, C.; Liu, Z.; Wang, L.; Li, C.; Sun, M. Graph neural networks: A review of methods and
applications. AI Open 2020, 1, 57–81. [CrossRef]
drones
Article
A Lightweight Authentication Protocol for UAVs Based on
ECC Scheme
Shuo Zhang, Yaping Liu *, Zhiyu Han and Zhikai Yang
Abstract: With the rapid development of unmanned aerial vehicles (UAVs), often referred to as drones,
their security issues are attracting more and more attention. Due to open-access communication
environments, UAVs may raise security concerns, including authentication threats as well as the
leakage of location and other sensitive data to unauthorized entities. Elliptic curve cryptography
(ECC) is widely favored in authentication protocol design due to its security and performance.
However, we found it still has the following two problems: inflexibility and a lack of backward
security. This paper proposes an ECC-based identity authentication protocol LAPEC for UAVs.
LAPEC can guarantee the backward secrecy of session keys and is more flexible to use. The time cost
of LAPEC was analyzed, and its overhead did not increase too much when compared with other
authentication methods.
1. Introduction
Unmanned aerial vehicles (UAVs) have experienced rapid developments in recent
years and have attracted the interest of researchers [1]. They have been deployed for
many applications and missions such as data transmission, surveillance, cellular service
provisioning, package delivery, firefighting, traffic monitoring, military operations,
agriculture, etc. [2,3]. Here, a common UAV scenario (target surveillance as an example) is shown
in Figure 1.
Citation: Zhang, S.; Liu, Y.; Han, Z.; Yang, Z. A Lightweight Authentication Protocol for UAVs
Based on ECC Scheme. Drones 2023, 7, 315. https://doi.org/10.3390/drones7050315
to the closest ground control station (GCS). Since the GCS connects to the data processing
center (DPC) through the network, it can send the data of the targets to the DPC. Finally,
the DPC utilizes the data from the GCS to analyze the behavior of the targets.
UAV communication relies on wireless channels, which makes UAVs vulnerable to
many attacks such as replay attacks, man-in-the-middle attacks, and masquerading attacks.
These attacks can have serious consequences, which can lead to commercial and non-
commercial losses. Attackers may also aim to exploit these UAVs to eavesdrop on sensitive
information, tamper with data, or cause malicious interference [4,5].
With the rapid development of the internet of drones (IoD), the security of the IoD is
becoming more and more important. Among many, authentication is one of the research
hotspots in the field of IoD security. Because most drones have shortcomings (such as
low computing power, small storage space, etc.), it is difficult to directly apply traditional
identity authentication and key agreement protocols within the IoD [2]. Thus, it is necessary
to design identity authentication and key agreement protocols suitable for the IoD [3]. The
traditional security provisioning applicable to distributed networks fails to give similar
results for UAVs [6]. The large-scale deployment of UAVs is hindered due to these many
security challenges [7,8].
To address the lack of a pre-registration process and of backward security of session
keys in the EDHOC (ephemeral Diffie–Hellman over COSE) protocol, this paper proposes an
ECC (elliptic curve cryptography)-based authentication protocol for the IoD, called LAPEC,
which can achieve high security with an acceptable time overhead. A formal security
proof is given for the proposed LAPEC protocol to demonstrate its security properties. At
the end of the paper, a time cost analysis and a comparison of LAPEC with other protocols
are carried out.
Ever [19] proposed an authentication framework for the IoD using elliptic curve
cryptosystems, but it still has some of the inherent issues of ECC. Tao [20] has proposed
a two-way identity authentication scheme based on the SM2 algorithm and adopted the
pre-shared secret information to improve the efficiency of authentication, but how to
securely pre-share the secret is also an issue. Lin [21] proposed a certificate signing based
on elliptic curve multiple authentication schemes, but it still has inherent issues with
certificate mechanisms.
Some related work can be found in [22–32], and we will compare ours with them
when analyzing the time cost in Section 5.
implementation, which will cause the UAV devices to face extreme inflexibility during
large-scale deployment and use. It may bring a lot of inconvenience.
• Session key backward security.
Backward security, also known as future security or post-compromise security (PCS),
was formally defined by Katriel et al. [35]. Backward security means that after the long-term
key or session key is leaked or compromised, the security of messages after the session can
still be guaranteed.
The scheme of EDHOC relies on the automatic update of the symmetric session key
after completing the authentication and key negotiation process. Therefore, EDHOC needs
to use the symmetric session key to secure the subsequent messages. Once the session key
is leaked or compromised, the subsequent messages will face significant security risks, that
is to say, backward secrecy cannot be guaranteed.
3. Proposed Scheme
3.1. Design Principles
Aiming at the problems of inflexibility and security of EDHOC, this section proposes
an enhanced elliptic-curve-based lightweight authentication protocol for IoD, which is
named LAPEC (lightweight authentication protocol over elliptic curve). The main design
ideas are as follows:
(1) To address the inflexibility of EDHOC, pre-registration steps are designed that
reduce both parties' reliance on public key certificates. Users do not need to configure
the public key in advance, which makes large-scale deployment and use more flexible.
(2) For backward security, a session key update mechanism is designed on top of the
non-interactive Schnorr zero-knowledge proof protocol to secure message communication.
Even if the session key is leaked, the attacker cannot complete the zero-knowledge proof,
so the key cannot be modified and session backward security is guaranteed.
In the LAPEC protocol, (1) a pre-registration process is added, which is before the
authentication process, and (2) a new session key update process is designed using the
zero-knowledge proof to increase the backward security of the session key.
Therefore, the LAPEC protocol consists of three phases: the pre-registration phase,
the authentication phase, and the session key update phase. Figure 3 shows the general
interaction flow of LAPEC messages.
Symbols                  Meaning
DEV, GWN                 The UAV device DEV and its ground control station (gateway) GWN
P_A, P_B                 Ephemeral public keys of devices A and B
P_D, P_G                 Authentication public keys of the device and the gateway
ID_CRED_D, ID_CRED_G     Public key identifiers of the device and the gateway
AEAD(K;(Plaintext))      Additional data are encrypted with authentication using a key K derived from the shared secret
Extract                  Pseudorandom key generation function
Expand                   Symmetric key generation function
MAC                      Message authentication code
tD                       The current timestamp of the device
tG                       The current timestamp of the gateway
Δt                       Maximum time interval allowed
H_m                      Hash of message data
H(*)                     Collision-resistant hash function
||                       Concatenation operation
⊕                        XOR operation
The proposed LAPEC scheme mainly includes the pre-registration phase, the authenti-
cation and key negotiation phase, and the key update phase. Figure 4 shows the interaction
messages during the scheme process.
[Figure 4. Interaction messages between Device and GWN. Pre-registration: P_D||ID_CRED_D||H(ID_CRED_D||P_D) and P_G||ID_CRED_G||H(ID_CRED_G||P_G). Authentication: ID_CRED_D||P_A||suite_D||tD (Measuring) and Auth_D||tD (Verify). Session key update: Enc(SK, ...).]
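[Figure 5. Authentication and key negotiation between Device and GWN, following pre-registration and preceding the session key update. Step 1: ID_CRED_D||P_A||suite_D||tD; Step 2: Measuring; Step 5: Auth_D||tD; Step 6: Verify.]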
(1) Step 1
Firstly, the UAV device generates the current timestamp tD1 to determine the freshness
of the message, selects a random number A, and calculates the ephemeral public key:
P_A = A × P.
Secondly, the device needs to determine the cipher suite suite_D. The function of the
suite parameter is to ensure that both parties use the same cipher algorithm in the next
protocol process, especially to determine the AEAD algorithm that both parties need to use
and the parameters required by the Extract and Expand functions to generate a key.
Finally, the device concatenates the above parameters and sends Message_1 (Step 1 shown
in Figure 5) to the ground station GWN via the open channel:
Message_1 = ID_CRED_D||P_A||suite_D||tD1
(2) Step 2
After the ground station GWN receives the first message, it first needs to extract and
verify the parameters (Step 2: Measuring shown in Figure 5). It mainly checks whether
the message arrived in time and whether GWN supports the cipher suite suite_D contained
in Message_1. For timeliness, it records the current timestamp tG1 and checks whether
|tG1 − tD1| < Δt.
If the above decoding or verification process fails, GWN must send back an authenti-
cation error message and abort the process. If GWN does not support the selected cipher
suite, it returns the parameter suite_G containing its own supported cipher suites.
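The two checks in Step 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Message1` container, the suite identifiers in `SUPPORTED_SUITES`, and the value of `DELTA_T` are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass

@dataclass
class Message1:
    id_cred_d: bytes   # public key identifier ID_CRED_D
    p_a: bytes         # encoded ephemeral public key P_A
    suite_d: int       # cipher suite chosen by the device (hypothetical numbering)
    t_d1: float        # device timestamp tD1, in seconds

SUPPORTED_SUITES = {0, 2}   # hypothetical suites supported by GWN
DELTA_T = 5.0               # hypothetical maximum allowed interval Δt, in seconds

def check_message_1(msg, t_g1, supported=SUPPORTED_SUITES, delta_t=DELTA_T):
    """Return ("ok", None), ("error", reason), or ("suite_G", supported suites)."""
    # Freshness: accept only if |tG1 - tD1| < Δt.
    if abs(t_g1 - msg.t_d1) >= delta_t:
        return ("error", "stale timestamp")
    # Suite negotiation: answer with GWN's own list if suite_D is unsupported.
    if msg.suite_d not in supported:
        return ("suite_G", sorted(supported))
    return ("ok", None)

msg = Message1(b"dev-1", b"\x02" + bytes(32), 0, 100.0)
fresh = check_message_1(msg, t_g1=102.0)
stale = check_message_1(msg, t_g1=105.0)
other = check_message_1(Message1(b"dev-1", b"\x02" + bytes(32), 7, 100.0), t_g1=101.0)
```

The comparison is strict, so a message that is exactly Δt old is rejected, matching the inequality in the text.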
(3) Step 3
After successfully decoding Message_1, the ground station GWN selects a random
number B as its ephemeral private key, calculates the ephemeral public key P_B = B × P,
and saves (B, P_B) as its own temporary public–private key pair.
In the process of identity authentication and key generation, corresponding crypto-
graphic algorithms are required to encrypt plaintext or decrypt ciphertext. The Extract and
Expand functions are used with a hash algorithm in the selected cipher suite to derive the
key. Extract is used to derive a uniform pseudorandom key (PRK) of fixed length from
the shared secret. Expand is used to derive other key material from PRK. The process of
generating the intermediate key PRK is as follows: PRK = Extract(salt, IKM), where salt is
the added salt value, and IKM represents the input key material. The Extract function is
specifically determined by the suite parameter in Step 1.
The keys used in LAPEC are derived from PRK using the Expand function, and the
process of generating the symmetric key K is as follows: K = Expand(PRK, H). Here,
PRK is the pseudo-random intermediate key generated by the above Extract function,
and H represents the text hash value of a certain message.
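The Extract and Expand pair described above can be sketched in a few lines. The text leaves the concrete functions to the suite parameter, so this sketch assumes they map to HKDF-Extract and HKDF-Expand (RFC 5869) over SHA-256; the input values and the 16-byte key length are hypothetical.

```python
import hashlib
import hmac

HASH = hashlib.sha256  # assumption: the negotiated suite uses SHA-256

def extract(salt: bytes, ikm: bytes) -> bytes:
    """PRK = Extract(salt, IKM), here HKDF-Extract: HMAC keyed with the salt."""
    return hmac.new(salt, ikm, HASH).digest()

def expand(prk: bytes, info: bytes, length: int = 16) -> bytes:
    """K = Expand(PRK, H), here HKDF-Expand with the text hash H as `info`."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]

# Hypothetical inputs for illustration only.
t_d1 = b"1700000000"                 # stands in for the timestamp tD1 used as salt
p_ab = bytes(32)                     # stands in for the encoded shared secret P_AB
h_1 = HASH(b"transcript").digest()   # stands in for the text hash H_1
prk_1 = extract(t_d1, p_ab)
k_1 = expand(prk_1, h_1)
```

Keeping Extract and Expand separate lets one PRK feed several keys: each key differs only in the text hash passed to Expand.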
The ground station GWN first needs to calculate the shared secret P_AB from
P_A and B: P_AB = B × P_A. GWN uses P_AB to calculate the first and the second
PRK: PRK_1 = Extract(tD1, P_AB), PRK_2 = Extract(PRK_1, P_GA). Here, P_GA is
the shared secret calculated from P_A and the gateway's authentication private key G:
P_GA = G × P_A.
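Both sides arrive at the same shared secrets because scalar multiplication on the curve commutes. A short derivation using the symbols above, under the assumption (implied by the symbol table but not stated in this passage) that the gateway's authentication public key is P_G = G × P:

```latex
\begin{aligned}
P_{AB} &= B \times P_A = B \times (A \times P) = A \times (B \times P) = A \times P_B,\\
P_{GA} &= G \times P_A = G \times (A \times P) = A \times (G \times P) = A \times P_G .
\end{aligned}
```

The left-hand forms are what GWN can compute from its private values B and G; the right-hand forms are what the UAV device can compute from its private value A.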
After generating the PRKs, the ground station GWN needs to generate the symmetric
key K used for authentication. GWN first generates K_1 using the Expand function
described above, the generated PRK_1, and the text hash H_1. The calculations of H_1
and K_1 are as follows:
external_aad_G = AEAD(H_1||P_G||tG1 )
(4) Step 4
After receiving Message_2, the UAV device should handle Message_2 (Step 4: Verify
shown in Figure 5) as follows:
1. Decode Message_2 and record the timestamp to determine the freshness of the message.
2. XOR the Auth_G with the key K_1 to decrypt the Auth_G field.
3. Verify MAC_2 using the algorithm in the selected cipher suite.
If the timestamp check fails or the AEAD algorithm fails to verify the authentication
packet MAC_2, an error message is returned and the protocol process is aborted.
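Item 2 above relies on XOR being its own inverse: applying the same key material twice restores the original bytes. A minimal sketch, assuming K_1 has been expanded to the same length as the Auth_G field; the `xor_bytes` helper and all values are hypothetical and only illustrate the masking step.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(data) != len(key):
        raise ValueError("data and key must have the same length")
    return bytes(d ^ k for d, k in zip(data, key))

# Hypothetical values for illustration only.
k_1 = bytes.fromhex("a1b2c3d4e5f60718")   # stands in for the derived key K_1
mac_2 = b"MAC2demo"                        # stands in for the protected field
auth_g = xor_bytes(mac_2, k_1)             # masking on the GWN side
recovered = xor_bytes(auth_g, k_1)         # unmasking on the UAV side
```

XOR masking alone gives no integrity protection, which is why the list continues with the MAC_2 verification in item 3.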
The UAV device also needs to calculate the shared secret P_AB according to P_B and
A. The calculation process is P_AB = A × P_B. Next, similar to Step 3, the UAV device also
uses P_AB to calculate PRK: PRK_1 = Extract(tD1 , P_AB), PRK_2 = Extract(PRK_1, P_GA),
PRK_3 = Extract(PRK_2, P_DB), where P_DB = D × P_B.
The UAV device also needs to generate K_1 :
After the key is generated, the verification process can be performed. The UAV device
first performs the following XOR decryption for the Auth_G part:
Then, it uses the generated K_2 as the key to decrypt the MAC_2 :
where AEAD_dec(K, M) is a decryption function that uses the key K to decrypt and verify
the encrypted message M. AEAD determines whether the key K_2 is correct by comparing
the decrypted auxiliary authentication data (written external_aad_G′ here) with the
locally computed value:
external_aad_G′ = external_aad_G?
(5) Step 5
After the UAV device finishes processing the authentication data packet from the
ground station and the authentication passes, it constructs Message_3. During the verification
process in Step 4, the UAV device has already calculated the pseudo-random keys
PRK_1, PRK_2, and PRK_3, as well as the keys K_1 and K_2 used for verification. To
construct the authentication data packet MAC_3, the UAV device first calculates the text hash
value H_2 as follows: H_2 = H(H_1||Auth_G||P_B||tG1). K_3 is constructed from H_2 and the
pseudo-random key PRK_3 as follows: K_3 = Expand(PRK_3, H_2).
Similar to Step 3, the additional authentication data of MAC_3 are constructed as follows:
external_aad_D = H_1||P_D||tG1
At this point, the UAV device can construct MAC_3 as:
Finally, the UAV device calculates the encryption key K_4 of Auth_D:
The UAV device concatenates the generated Auth_D with the timestamp to get the final
Message_3 (Step 5 shown in Figure 5) and sends it to GWN.
Message_3 = Auth_D||tD2
(6) Step 6
After the ground station GWN receives the corresponding Message_3, it first needs
to authenticate the device (Step 6: Verify shown in Figure 5). The intermediate pseudo-
random key has been calculated in Step 3. At this time, the gateway needs to calculate H_2,
K_3, and K_4:
H_2 = H(H_1||Auth_G||P_B||tG1 )
SK = Expand(PRK_3, H_3)
Both parties encrypt subsequent messages and communicate via SK.
[Figure 6. Session key update between Device and GWN (steps Change SK and Verify), followed by the session with the new SK.]
After successfully completing the mutual authentication and key negotiation process,
both parties communicate using the shared secret session key SK. If the session key
needs to be updated (e.g., because the session key has a limited validity period), either
party can initiate a key update request.
The entity that initiates the update of the session key (assumed here to be party D)
first selects a random number X and calculates the temporary public key
G_X = X × P, so that (X, G_X) forms a temporary public–private key pair. Party D then
calculates the challenge: c = H(P_D||G_X).
The UAV device constructs the following response based on challenge c: z = X + c × D,
where D is the authentication private key of the UAV device (with P_D = D × P), and c is
the challenge calculated by the above formula. The device constructs and sends a session key
change request message (step Change SK in Figure 6): Message_ChangeSK = Enc(SK, G_X||z||tD).
After GWN receives the session key change request, it performs the following checks (step
Verify in Figure 6):
1. Decode the message and check its freshness.
2. Calculate the random challenge c.
3. Calculate and check:
z × P = G_X + c × P_D?
If the check fails, the receiver aborts the session key update procedure and returns an
update failure error message. If it holds, GWN considers the identity of the requester for
updating the session key legitimate, and the receiver generates the updated session key
according to the following steps:
P_GX = X × P_G
Next, both parties calculate:
H_4 = H(Message_ChangeSK)
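The verification equation in the update phase holds for an honest device because z × P = (X + c × D) × P = X × P + c × (D × P) = G_X + c × P_D. The sketch below runs this Schnorr-style proof on a textbook toy curve (y² = x³ + 2x + 2 over F_17, base point P = (5, 1) of order 19). The curve, the private values, and the way points are encoded for hashing are all assumptions made for illustration; a real deployment would use the curve and encoding fixed by the cipher suite. The last two lines also show that X × P_G on the device side equals G × G_X on the gateway side, which is one way both parties could reach the same P_GX; the text only gives the device-side form.

```python
import hashlib

# Toy curve y^2 = x^3 + 2x + 2 over F_17; P has prime order 19.
# Illustration only: far too small to be secure.
FIELD, A_COEF, ORDER = 17, 2, 19
P = (5, 1)

def ec_add(p1, p2):
    """Add two points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % FIELD == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A_COEF) * pow(2 * y1 % FIELD, -1, FIELD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % FIELD, -1, FIELD)
    lam %= FIELD
    x3 = (lam * lam - x1 - x2) % FIELD
    y3 = (lam * (x1 - x3) - y1) % FIELD
    return (x3, y3)

def ec_mul(k, point):
    """Double-and-add scalar multiplication k × point."""
    result = None
    while k > 0:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result

def challenge(p_d, g_x):
    """c = H(P_D || G_X), reduced modulo the group order (hypothetical encoding)."""
    data = repr((p_d, g_x)).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER

def prove(d, x):
    """Device side: commitment G_X = X × P and response z = X + c × D."""
    g_x = ec_mul(x, P)
    c = challenge(ec_mul(d, P), g_x)
    return g_x, (x + c * d) % ORDER

def verify(p_d, g_x, z):
    """Gateway side: check z × P == G_X + c × P_D."""
    c = challenge(p_d, g_x)
    return ec_mul(z, P) == ec_add(g_x, ec_mul(c, p_d))

d, g, x = 7, 4, 11                  # hypothetical private keys D, G and nonce X
p_d, p_g = ec_mul(d, P), ec_mul(g, P)
g_x, z = prove(d, x)
accepted = verify(p_d, g_x, z)
rejected = verify(p_d, g_x, (z + 1) % ORDER)   # tampered response
shared_device = ec_mul(x, p_g)      # P_GX = X × P_G
shared_gateway = ec_mul(g, g_x)     # G × G_X
```

An attacker who only knows SK can decrypt and re-encrypt the request but cannot produce a valid z without the private key D, which is the property the update phase depends on.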
4. Security Analysis
4.1. Security Properties Analysis
In this section, the security properties of LAPEC are discussed. The LAPEC protocol
has five security attributes: backward security, anti-replay attack, forward security, anti-
masquerade attack, and session key confidentiality. The EDHOC protocol, by contrast, has
only four of them, as shown in Table 2.
Theorem 1. The LAPEC protocol can inherit the anti-replay attack, forward security, anti-masquerading
attack, and session key confidentiality of EDHOC in the authentication negotiation process.
Proof. Since the pre-registration process added to the LAPEC protocol does not change
the key calculation in the authentication protocol phase, the LAPEC protocol can inherit
the security properties of EDHOC during the authentication phase. According to the
formal analysis of EDHOC using Tamarin tools, LAPEC can at least inherit the following
security properties: forward security, session key independence, anti-replay attack, and
anti-masquerading attack.
Lemma 1. The LAPEC protocol has the security properties of anti-replay attack, anti-masquerading
attack, and session key confidentiality.
Proof. According to Theorem 1, the LAPEC protocol can inherit the relevant security
properties of EDHOC in the authentication negotiation process, so the LAPEC protocol
has the security properties of anti-replay attack, anti-masquerading attack, and session key
confidentiality.
Suppose that attacker A can launch different attacks by querying the oracles
shown in Table 3.
Oracle               Description
Create (D, r, G)     Create a new session oracle with peer G as D’s identity r
Send (D, i, M)       Execute and return the result at the ith session oracle of D
Corrupt (C)          Leak C’s long-term key
Test-session (s)     If b = 1, C outputs the current session key SK. If b = 0, C returns a random number. If no session key is generated, returns null.
Randomness (C, i)    Leak the random number in the ith session of C
Session-key (s)      Leak the session key SK
Hsm (C)              Hardware security module for C
Guess (b)            End game
Definition 1. After receiving the last expected message M3, C will generate a session key and enter
the accept state. All communication messages M1, M2, and M3 are concatenated in sequence to
form a session identifier.
Definition 2. D and G are defined as partners if they meet the following conditions: (1) D
and G are both in the accept state; (2) D and G authenticate each other and share the same
session ID.
Definition 4. Attacker A has the following equation for the ECDLP problem within time tA :
ε is the advantage of A for the semantic safety of the ECDLP problem within time tA .
Send queries and Execute queries $q_H$, $q_S$, $q_E$ times, and session-key queries, respectively.
Then, for A, we have:
$$\mathrm{Adv}^{PCS}_{C} \le \frac{2q_H}{2^{l_H}} + \frac{10q_S}{2^{l_r}} + 4q_S\,\mathrm{Adv}^{ECDLP}_{A}(t_A)$$
Proof. Game 0, Game 1, Game 2, Game 3, Game 4, and Game 5 are a defined set of games, and
$Succ_i$ is the probability of correctly guessing coin b in Game i.
Game 0: Assume that Game 0 is the same as the actual scheme in the random oracle, with:
Game 1: Query the oracle in Game 1. Since Game 0 and Game 1 are indistinguishable,
there are:
$$\Pr[Succ_0] = \Pr[Succ_1]$$
Game 2: Game 2 considers that the Hash function collides with the key update
message. According to the birthday paradox, the probability of Hash query collision is at
most $q_H/2^{l_H}$, so there are:
Game 3: The adversary tries to query the oracle machine to guess the random number
directly from the message. The probability of guessing the random number will not exceed
$2q_S/2^{l_r}$. Therefore, there are:
Thus:
$$\mathrm{Adv}^{PCS}_{C} \le \frac{2q_H}{2^{l_H}} + \frac{10q_S}{2^{l_r}} + 4q_S\,\mathrm{Adv}^{ECDLP}_{A}(t_A)$$
The theorem is proved.
and communication overhead are mainly considered [36–39]. The primitive operation and
time overhead of the authentication protocol based on ECC are shown in Table 4:
In the table, User represents the user of the UAV, while GWN and UAV represent the
ground control station (gateway) and the UAV, respectively. TSM represents the overhead
of the ECC scalar multiplication operation, TA the overhead of the point addition operation,
TH the overhead of the hash operation, TS the overhead of symmetric encryption/decryption,
and TEX the overhead of the exponentiation operation.
In terms of overhead, the LAPEC protocol performs the pre-registration interaction
only when it connects for the first time, and this interaction is quite small. The
pre-registration phase performs only 2TH, which costs almost 10% of the authentication
phase computation cost (6TSM + 12TH + 4TS). From the second connection onward, only
the overhead of the authentication phase and the session key update phase needs to
be considered.
• For the authentication phase:
To compare time costs independently of a hardware platform, we refer to the
experimental results of Roy et al. [32]: the overhead of a hash operation and of a
symmetric encryption or decryption operation is about 8% and 14%, respectively, of that of
an elliptic curve scalar multiplication. As shown in Table 4, LAPEC has a computational
overhead similar to most schemes in the authentication phase (for example, the schemes of
Lu [28], Bander [30], and Deebak [31]). The computation cost of LAPEC is slightly higher
than that of the scheme of Saeed [29], but LAPEC outperforms some other ECC-based schemes.
• For the session key update phase:
Since some schemes do not design corresponding key update steps, this paper uses
the default Diffie–Hellman key exchange for comparison.
As we can see, LAPEC needs to complete the zero-knowledge proof in the key update
phase, so one more scalar multiplication operation TSM is required. We perform a
zero-knowledge-proof session key update phase after five traditional update processes. With
this update method, the phase increases the computational overhead by only about 8% while
still maintaining backward security.
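The operation counts quoted in this section can be put on a common scale by expressing everything in units of one scalar multiplication. The sketch below uses only the ratios cited from Roy et al. (TH ≈ 0.08 TSM, TS ≈ 0.14 TSM) and the counts 6TSM + 12TH + 4TS and 2TH given above; the `cost` helper is hypothetical, and timings measured on real hardware may differ from these normalized figures.

```python
T_SM = 1.0            # one ECC scalar multiplication, used as the unit
T_H = 0.08 * T_SM     # hash, about 8% of T_SM (ratio quoted in the text)
T_S = 0.14 * T_SM     # symmetric encryption/decryption, about 14% of T_SM

def cost(n_sm=0, n_h=0, n_s=0):
    """Normalized cost of n_sm scalar multiplications, n_h hashes, n_s symmetric ops."""
    return n_sm * T_SM + n_h * T_H + n_s * T_S

auth = cost(n_sm=6, n_h=12, n_s=4)   # authentication phase: 6TSM + 12TH + 4TS
prereg = cost(n_h=2)                 # pre-registration phase: 2TH
share = prereg / auth                # pre-registration relative to authentication

print(f"authentication = {auth:.2f} T_SM, pre-registration = {prereg:.2f} T_SM, "
      f"share = {share:.1%}")
```

With these ratios the authentication phase comes to 7.52 TSM and the two pre-registration hashes to 0.16 TSM, roughly 2% of the authentication cost, which stays below the 10% figure reported in the text.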
From Table 5, LAPEC adds message overhead in the pre-registration phase, but this
only needs to be considered when connecting for the first time, and it is only a small part
of the overall connection process (in the experiment, less than 10%).
For the key update phase, it increases the message size by about 50%. However, it is
only about 14% compared to the authentication phase messages. Considering that most of
the actual overhead is the channel delay of message exchange, these increases are acceptable
as long as the number of message exchanges during the update phase is guaranteed to
be equal.
At the same time, it can be seen that in the process of protocol implementation, the
number of public key operations such as elliptic curve scalar multiplication between the two
parties should be minimized, and the number of message exchanges should be controlled.
6. Conclusions
This paper proposed an ECC-based identity authentication protocol, LAPEC, for UAVs.
We introduced the interaction process of the LAPEC protocol in detail and proved that
it has session key backward security. Finally, we compared the LAPEC protocol with
other authentication protocols and found that the time overhead of the LAPEC protocol
is small. Although the key update phase must additionally provide backward security,
the time overhead of the session key update phase increased by only about 8%.
Since the pre-registration phase is only required when connecting for the first time, the
extra overhead it adds was only about 10% of the entire authentication process.
In the future, we will continue to optimize the LAPEC protocol and apply it in multiple
scenarios, such as authentication in UAV–UAV communications.
Author Contributions: Conceptualization, S.Z. and Z.H.; methodology, S.Z.; validation, S.Z., Z.H.,
and Z.Y.; formal analysis, S.Z.; writing—original draft preparation, S.Z. and Z.H.; writing—review
and editing, S.Z.; supervision, Y.L. All authors have read and agreed to the published version of
the manuscript.
Funding: This research was supported by the Major Key Project of PCL (Grant No. PCL2022A03,
PCL2021A02, PCL2021A09) and the Key-Area Research and Development Program of Guangdong
Province (Grant No. 2019B010137005).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this paper will be made available on request via
the author’s email with appropriate justification.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Mozaffari, M.; Saad, W.; Bennis, M.; Nam, Y.-H.; Debbah, M. A tutorial on UAVs for wireless networks: Applications, challenges,
and open problems. IEEE Commun. Surv. Tutor. 2018, 21, 2334–2360. [CrossRef]
2. Hayat, S.; Yanmaz, E.; Muzaffar, R. Survey on Unmanned Aerial Vehicle Networks for Civil Applications: A Communications
Viewpoint. IEEE Commun. Surv. Tutor. 2016, 18, 2624–2661. [CrossRef]
3. Motlagh, N.H.; Taleb, T.; Arouk, O. Low-Altitude Unmanned Aerial Vehicles-Based Internet of Things Services: Comprehensive
Survey and Future Perspectives. IEEE Internet Things J. 2016, 3, 899–922. [CrossRef]
4. Jangirala, S.; Das, A.K.; Kumar, N.; Rodrigues, J. Tcalas: Temporal credential-based anonymous lightweight authentication
scheme for internet of drones environment. IEEE Trans. Veh. Technol. 2019, 68, 6903–6916.
5. Li, B.; Fei, Z.; Zhang, Y.; Guizani, M. Secure UAV Communication Networks over 5G. IEEE Wirel Commun. 2019, 26, 114–120.
[CrossRef]
6. Gaurang, B.; Naren, N.; Vinay, C.; Biplab, S. SHOTS: Scalable Secure Authentication-Attestation Protocol Using Optimal Trajectory
in UAV Swarms. IEEE Trans. Veh. Technol. 2022, 71, 5827–5836.
7. Kaufman, C.; Hoffman, P.; Nir, Y.; Eronen, P.; Kivinen, T. RFC 7296: Internet Key Exchange Protocol Version 2 (IKEv2); RFC Editor;
IETF: Fremont, CA, USA, 2014.
8. Rescorla, E. RFC 8446: The Transport Layer Security (TLS) Protocol Version 1.3; RFC Editor; IETF: Fremont, CA, USA, 2018.
9. Zhong, C.; Yao, J.; Xu, J. Secure uav communication with cooperative jamming and trajectory control. IEEE Commun. Lett. 2018,
23, 286–289. [CrossRef]
10. Zeng, Y.; Zhang, R. Energy-efficient uav communication with trajectory optimization. IEEE Trans. Wirel. Commun. 2017, 16, 3747–3760.
[CrossRef]
11. Grover, A.; Berghel, H. A survey of RFID deployment and security issues. Inf. Process. Syst. 2011, 7, 561–580. [CrossRef]
12. Gope, P.; Sikdar, B. An efficient privacy-preserving authenticated key agreement scheme for edge-assisted internet of drones.
IEEE Trans. Veh. Technol. 2020, 69, 13621–13630. [CrossRef]
13. Gope, P.; Millwood, O.; Saxena, N. A provably secure authentication scheme for RFID-enabled UAV applications. Comput.
Commun. 2021, 166, 19–25. [CrossRef]
14. Khattab, A.; Jeddi, Z.; Amini, E.; Bayoumi, M. RFID Security Threats and Basic Solutions; Springer International Publishing: Cham,
Switzerland, 2017; pp. 27–41.
15. Lopez, P.P.; Hernandez-Castro, J.C.; Estevez-Tapiador, J.M.; Ribagorda, A. RFID Systems: A Survey on Security Threats and Proposed
Solutions; Springer: Berlin/Heidelberg, Germany, 2006; pp. 159–170.
16. Suh, G.; Devadas, S. Physical unclonable functions for device authentication and secret key generation. In Proceedings of the
Design Automation Conference (DAC ’07), San Diego, CA, USA, 4–6 June 2007.
17. Sung, J.Y.; Ashok, K.D.; Youngho, P.; Pascal, L. SLAP-IoD: Secure and lightweight authentication protocol using physical
unclonable functions for internet of drones in smart city environments. IEEE Trans. Veh. Technol. 2022, 71, 10374–10388.
18. Bansal, G.; Sikdar, B. S-MAPS: Scalable Mutual Authentication Protocol for Dynamic UAV Swarms. IEEE Trans. Veh. Technol.
2021, 70, 12088–12100. [CrossRef]
19. Wazid, M.; Das, A.K.; Kumar, N.; Vasilakos, A.V.; Rodrigues, J.J. Design and analysis of secure lightweight remote user
authentication and key agreement scheme in internet of drones deployment. IEEE Internet Things J. 2018, 6, 3572–3584. [CrossRef]
20. Ever, Y.K. A secure authentication scheme framework for mobile-sinks used in the Internet of Drones applications. Comput.
Commun. 2020, 155, 143–149. [CrossRef]
21. Tao, X.; Jun, H. An Identity Authentication Scheme Based on SM2 Algorithm in UAV Communication Network. Wirel. Commun.
Mob. Comput. 2022, 4, 1–10.
22. Lin, L.; Xiao, F.L.; Yu, L.W.; Tan, L. CSECMAS: An Efficient and Secure Certificate Signing Based Elliptic Curve Multiple
Authentication Scheme for Drone Communication Networks. Appl. Sci. 2022, 12, 9203. [CrossRef]
23. Hankerson, D.; Vanstone, S.; Menezes, A.J. Guide to Elliptic Curve Cryptography; Springer Science & Business Media: Berlin/Heidelberg,
Germany, 2006.
24. Cohn-Gordon, K.; Cremers, C.; Garratt, L. On post-compromise security. In Proceedings of the 2016 IEEE 29th Computer Security
Foundations Symposium (CSF), Lisboa, Portugal, 27 June–1 July 2016.
25. He, Y.X.; Sun, F.J.; Li, Q.A.; He, J.; Wang, L.M. A survey on public key mechanism in wireless sensor networks. Jisuanji Xuebao/Chin.
J. Comput. 2020, 43, 381–408.
26. Huang, Z.; Wang, Q. A PUF-based unified identity verification framework for secure IoT hardware via device authentication.
World Wide Web 2020, 23, 1057–1088. [CrossRef]
27. Li, X.; Liu, J.; Ding, B.; Li, Z.; Wu, H.; Wang, T. A SDR-based verification platform for 802.11 PHY layer security authentication.
World Wide Web 2020, 23, 1011–1034. [CrossRef]
28. Shao, S.; Chen, F.; Xiao, X.; Gu, W.; Lu, Y.; Wang, S.; Tang, W.; Liu, S.; Wu, F.; He, J.; et al. IBE-BCIOT: An IBE based cross-chain
communication mechanism of blockchain in IoT. World Wide Web 2021, 24, 1665–1690. [CrossRef]
29. Xu, X.; Zhu, P.; Wen, Q.; Jin, Z.; Zhang, H.; He, L. A secure and efficient authentication and key agreement scheme based on ECC
for telecare medicine information systems. J. Med. Syst. 2014, 38, 1–7. [CrossRef]
30. Wu, F.; Li, X.; Sangaiah, A.K.; Xu, L.; Kumari, S.; Wu, L.; Shen, J. A lightweight and robust two-factor authentication scheme
for personalized healthcare systems using wireless medical sensor networks. Future Gener. Comput. Syst. 2018, 82, 727–737.
[CrossRef]
31. Jiang, Q.; Ma, J.; Wei, F.; Tian, Y.; Shen, J.; Yang, Y. An untraceable temporal-credential-based two-factor authentication scheme
using ECC for wireless sensor networks. J. Netw. Comput. Appl. 2016, 76, 37–48. [CrossRef]
32. Li, X.; Niu, J.; Bhuiyan, M.Z.A.; Wu, F.; Karuppiah, M.; Kumari, S. A robust ECC-based provable secure authentication protocol
with privacy preserving for industrial Internet of Things. IEEE Trans. Industr. Inform. 2018, 14, 3599–3609. [CrossRef]
33. Li, X.; Niu, J.; Kumari, S.; Wu, F.; Sangaiah, A.K.; Choo, K.-K.R. A three-factor anonymous authentication scheme for wireless
sensor networks in IoT environments. J. Netw. Comput. Appl. 2018, 103, 194–204. [CrossRef]
34. Chang, I.P.; Lee, T.F.; Lin, T.H.; Liu, C.M. Enhanced two-factor authentication and key agreement using dynamic identities in
wireless sensor networks. Sensors 2015, 15, 29841–29854. [CrossRef] [PubMed]
35. Lu, Y.R.; Xu, G.Q.; Li, L.X.; Yang, Y. Anonymous three-factor authenticated key agreement for wireless sensor networks. Wirel.
Netw. 2019, 25, 1461–1475. [CrossRef]
36. Chatterjee, S.; Roy, S.; Das, A.K.; Chattopadhyay, S.; Kumar, N.; Vasilakos, A.V. Secure biometric-based authentication scheme
using chebyshev chaotic map for multi-server environment. IEEE Trans. Dependable Secur. Comput. 2018, 15, 824–839. [CrossRef]
37. Saeed, U.J.; Irshad, A.A.; Fahad, A.; Adnan, S.K. A Verifiably Secure ECC Based Authentication Scheme for Securing IoD Using
FANET. IEEE Access 2022, 10, 95321–95343.
38. Bander, A.A.; Ahmed, B.; Shehzad, A.C. A Resource-Friendly Authentication Protocol for UAV-Based Massive Crowd Manage-
ment Systems. Secur. Commun. Netw. 2021, 2021, 3437373. [CrossRef]
39. Deebak, B.D.; Al-Turjman, F. A smart lightweight privacy preservation scheme for IoT-based UAV communication systems.
Comput. Commun. 2020, 162, 102–117. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
drones
Article
Connectivity-Maintenance UAV Formation Control in
Complex Environment
Liangbin Zhu 1 , Cheng Ma 2, *, Jinglei Li 2 , Yue Lu 2 and Qinghai Yang 2
1 School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China;
3420205018@bit.edu.cn
2 School of Telecommunications Engineering, Xidian University, Xi’an 710071, China
* Correspondence: chengma@stu.xidian.edu.cn
Abstract: Cooperative formation control is the research basis for various tasks in the multi-UAV
network. However, in a complex environment with different interference sources and obstacles,
it is difficult for multiple UAVs to maintain their connectivity while avoiding obstacles. In this
paper, a Connectivity-Maintenance UAV Formation Control (CMUFC) algorithm is proposed to
help multi-UAV networks maintain their communication connectivity by changing the formation
topology adaptively under interference and reconstructing the broken communication topology of
a multi-UAV network. Furthermore, through the speed-based artificial potential field (SAPF), this
algorithm helps the multi-UAV formation to avoid various obstacles. Simulation results verify that
the CMUFC algorithm is capable of forming, maintaining, and reconstructing multi-UAV formation
in complex environments.
Citation: Zhu, L.; Ma, C.; Li, J.; Lu, Y.; Yang, Q. Connectivity-Maintenance UAV Formation Control in Complex Environment. Drones 2023, 7, 229. https://doi.org/10.3390/drones7040229
Academic Editors: Zhihong Liu, Shihao Yan, Kehao Wang and Yirui Cong
Received: 28 February 2023
Revised: 20 March 2023
Accepted: 23 March 2023
Published: 26 March 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
With the rapid development of UAV technology, the multi-UAV network is widely
used in civil and military fields such as disaster rescue, air reconnaissance, etc. [1–5]. The
communication topology of the multi-UAV network affects its work efficiency, and its
connectivity can be maintained by controlling the topology of the multi-UAV formation.
Therefore, it is fundamental to carry out research on the formation topology control
of multi-UAVs. During the actual flight, the limited communication range of UAVs
and different environmental factors, such as obstacles and interference sources, will
affect the connectivity of the multi-UAV network. Thus, it is necessary to maintain
connectivity of the entire multi-UAV network by controlling multi-UAV formations in
complex environments [6].
Scholars at home and abroad have conducted many studies on connectivity maintenance
between UAVs. A UAV formation control law was proposed to generate a
leader–follower structure based on consistency under the balance of control constraints
and communication constraints, so as to avoid collisions and maintain connectivity
between UAVs [7]. The authors in [8] proposed using the graph coalition formation
game to model the cooperation between UAVs, which can quickly restore the required
connectivity between UAV networks. In [9], the connectivity methods were compared in
four application scenarios, mainly by increasing or decreasing the communication links
between UAVs to increase or decrease the connectivity of UAV clusters. A connectivity
tracking algorithm was proposed to track the connectivity distribution over time, and
the results were analyzed. The authors in [10] used the second-order integral characteristic
to solve the time-varying formation tracking control problem of multiple UAVs. We
consider the correspondence between multi-UAV connectivity and formation control
and maintain the connectivity of multi-UAV networks through formation control in
complex environments. These papers also consider the problem of UAV formation flight
in the case of limited communication. The authors in [11] studied the formation control
problem of multiple agents in a noisy environment and transformed the formation
control problem into the convergence problem of the infinite product of general random
matrix sequences. A flight strategy was proposed to improve the multi-UAV cooperative
search ability under the condition of limited resources. A multi-UAV cooperative search
model was established. The optimization function of the model considers communication
cost and formation benefit to ensure the effectiveness of multi-UAV human–machine
search [12]. A new adaptive formation control method was proposed for UAVs with
limited leader information and communication. The method was extended to replace
the leader with adjacent UAVs, where the leader can convey location and direction
information [13].
In addition, the formation obstacle avoidance problem of UAVs needs to be con-
sidered in the process of formation flight. The aim is the formation and maintenance
of a specific configuration to adapt to mission requirements and friendly aircrafts.
Currently widely used strategies include the leader–follower method [14], virtual
structure method [15], behavior-based control method [16], and the consensus algo-
rithm [17]. Among them, the algorithm based on consensus theory emphasizes the
synchronization, cooperation, and substitutability among individuals. This algorithm
meets the characteristics of decentralization, autonomy, and autonomy of UAVs; it thus
gradually became the main method and research direction to solve in the formation
control of UAVs. In addition, the obstacle avoidance problem of UAVs needs to be
considered in the process of formation flight. The artificial potential field (APF) al-
gorithm proposed by Khatib [18] in 1986 stands out among many obstacle avoidance
algorithms because of its simple structure, easy real-time control, and rapid response to
environmental changes. An observer-based memory consensus protocol was proposed
in [19] for achieving the consensus of nonlinear multi-agent systems with Markov
switching topologies. This approach was applicable for an observer-based nonlinear
multi-agent system which was described by switched undirected topologies. In [20],
the authors solved the consensus problem in multi-agent systems with Markov jump,
time-varying delay, and uncertainties. In [21], the authors developed a consensus-based
algorithm that decomposes the motion of the UAV into three directions, but the constraint
handling of commands during the convergence of the algorithm is too cumbersome,
which is not conducive to engineering implementation. The authors in [22] introduced
a particle swarm optimization algorithm to deal with static and dynamic obstacles.
They added UAV formation configuration requirements to the consensus algorithm.
An adaptive distributed control algorithm was proposed in [23] to achieve cooperative
formation of heterogeneous vertical take-off and landing UAVs under parameter
uncertainty. In [24], the authors developed a novel decentralized adaptive consensus
formation control method. Each UAV sets a coordinate and controls its relative position
with adjacent UAVs to obtain the desired formation. A multi-UAV formation system based
on the leader–follower model was proposed in [25]. The follower predicts the state of
the leader, maintains a relative position in the formation, and finally reaches a
consensus with the leader. A topology control algorithm was proposed in [26] to complete
the distributed communication maintenance and formation configuration of four quadrotor
UAVs. However, the safety requirements for the leader UAV in the cluster are very high.
In this paper, we design a formation control algorithm that addresses both the
connectivity maintenance of a multi-UAV network and the obstacle avoidance of the
multi-UAV formation. The main challenge is to ensure the connectivity and safety of
the multi-UAV network during actual flight, given the limited communication range of
UAVs and the presence of environmental factors such as obstacles and interference
sources. Specifically, the formation switching of the multi-UAV network or the failure
of some communication networks will cause
Drones 2023, 7, 229
2. System Model
As shown in Figure 1, this paper considers the formation control problem of multi-UAV
connectivity maintenance in complex environments where there are $K$ interference
sources of different interference powers and $O$ obstacles of different sizes. In the considered
multi-UAV scenario, we construct the model in a 3D Cartesian coordinate system. The
$M$ UAVs are modeled as discs with radius $l_{min}$. Let $p_i^{uav}(t) = [p_{ix}^{uav}(t), p_{iy}^{uav}(t), H]$,
$i \in [1, 2, \ldots, M]$, $t \in [1, 2, \ldots, T_i]$ denote the 3D position of UAV $i$, where $H$ is the altitude
of the UAV, which is assumed to be fixed; $T_i$ denotes the time for UAV $i$ to complete its
mission. The $o$-th obstacle is modeled as a disk with radius $r_o$, $o \in [1, 2, \ldots, O]$, and its
position is $p_o^{obs}(t) = [p_{ox}^{obs}(t), p_{oy}^{obs}(t)]$. The position of interference source $k$ is
$p_k^{int}(t) = [p_{kx}^{int}(t), p_{ky}^{int}(t)]$, and its transmission power is $P_k^{int}$, $k \in [1, 2, \ldots, K]$. The target location of
$$u_i = -\sum_{j=1}^{M} a_{ij}(t)\left[(p_i^{uav} - p_j^{uav}) + \gamma(t)(v_i - v_j)\right], \quad i = 1, 2, \ldots, M \qquad (2)$$
where $\gamma(t)$ is a positive number and $a_{ij}$ is the $(i, j)$-th term in the Laplacian matrix of an
undirected graph $G_M$. The consensus formation control algorithm of a double-integrator
dynamic system drives the relative positions between UAVs to their set values through the
control input $u_i$, so as to form the formation of multiple UAVs. In addition, the speed and
acceleration of each UAV must not exceed their maximum limits $v_{max}$ and $a_{max}$,
which are the maximum speed and maximum acceleration of the UAV, respectively.
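The consensus law of Equation (2) can be illustrated with a minimal Python sketch for planar motion. This is an illustration under stated assumptions, not the authors' implementation: the function names, the use of a plain link-weight matrix `adj` for $a_{ij}$, the norm-based clipping used for the speed and acceleration limits, and the explicit Euler step with step size `dt` are all hypothetical choices.

```python
import math

def consensus_input(i, pos, vel, adj, gamma):
    """Input u_i of Equation (2) for 2D positions/velocities given as (x, y) tuples."""
    ux, uy = 0.0, 0.0
    for j in range(len(pos)):
        if j == i or adj[i][j] == 0:
            continue  # no link between UAV i and UAV j
        ux -= adj[i][j] * ((pos[i][0] - pos[j][0]) + gamma * (vel[i][0] - vel[j][0]))
        uy -= adj[i][j] * ((pos[i][1] - pos[j][1]) + gamma * (vel[i][1] - vel[j][1]))
    return (ux, uy)

def clip_norm(vec, limit):
    """Scale vec down so its Euclidean norm does not exceed limit (assumed saturation rule)."""
    norm = math.hypot(vec[0], vec[1])
    if norm <= limit:
        return vec
    scale = limit / norm
    return (vec[0] * scale, vec[1] * scale)

def step(pos, vel, adj, gamma, v_max, a_max, dt):
    """One explicit Euler step of the double-integrator model with saturated input and speed."""
    new_pos, new_vel = [], []
    for i in range(len(pos)):
        u = clip_norm(consensus_input(i, pos, vel, adj, gamma), a_max)
        v = clip_norm((vel[i][0] + u[0] * dt, vel[i][1] + u[1] * dt), v_max)
        new_vel.append(v)
        new_pos.append((pos[i][0] + v[0] * dt, pos[i][1] + v[1] * dt))
    return new_pos, new_vel
```

Without a formation offset, this law simply pulls linked UAVs toward each other; a desired spacing enters only once an offset term is subtracted from the position difference.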
strategies. In order to reduce the impact of communication delay and other factors on
the multi-UAV network topology control, this paper considers two-way communication
between UAVs to transmit information such as position and speed. Thus, the topology
of the multi-UAV network is represented by an undirected graph $G_M \equiv (Q_M, E_M, W_M)$,
where $Q_M = \{1, 2, \ldots, M\}$ denotes a non-empty finite set of UAVs and $E_M \subseteq Q_M \times Q_M$ is
the edge set of the communication links connecting two UAVs. If there is a reachable
communication link between UAV $i$ and UAV $j$, there is an edge $E_{ij}$ in the
undirected graph $G_M$, and $Q_i$ can obtain the position and speed information
from $Q_j$. $W_M \subseteq Q_M \times Q_M$ represents the weight matrix of communication links between
UAVs in the network topology, and we assume that the communication between UAVs is
symmetric, i.e., $W_{ij} = W_{ji}$, $\forall E_{ij}$. Specifically, $W_M$ is described as the communication quality
matrix, where $W_{ij}$ represents the communication weight between UAV $i$ and UAV $j$, which
is related to the communication distance between the two UAVs. An undirected graph is
connected if there is an undirected path between any two different UAVs in the undirected
graph $G_M$.
Figure 2 shows the correspondence between the formation structure and the communication
topology of the multi-UAV network. By controlling the relative position between two
UAVs, the distance between them is kept within the communication range, e.g.,
$\|p_1 - p_2\| \le R_c$ gives $E_{12} = 1$. That is, the multi-UAV system maintains connectivity.
As shown in Figure 3, the network topologies considered in this paper include string
type, ring type, tree type, and star type. There is at least one undirected path between every
two UAVs in the multi-UAV network to ensure the connectivity of the system.
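The connectivity condition described above can be sketched in Python as follows. This is a minimal sketch, assuming a single common communication range `r_c` for all UAVs and a breadth-first search as the connectivity test; the function names are hypothetical and not taken from the paper.

```python
import math
from collections import deque

def build_links(positions, r_c):
    """Return the symmetric 0/1 link matrix E, with E[i][j] = 1 when UAVs i and j are within r_c."""
    m = len(positions)
    links = [[0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            if math.dist(positions[i], positions[j]) <= r_c:
                links[i][j] = links[j][i] = 1
    return links

def is_connected(links):
    """Breadth-first search: True if an undirected path exists between every pair of UAVs."""
    m = len(links)
    if m == 0:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(m):
            if links[i][j] and j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == m
```

For example, three UAVs spaced 50 m apart on a line form a connected string topology when `r_c` is 60 m, and the same positions become disconnected when `r_c` drops to 40 m.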
where $G_i$ is the transmitting antenna gain of UAV $i$, $G_j$ is the receiving antenna gain
of UAV $j$, $\lambda$ is the wavelength, $d_{ij}$ indicates the distance between UAV $i$ and UAV $j$,
$\alpha$ denotes the average path loss constant, $L_m$ is the loss factor, and $P_i$ is the signal
transmission power of UAV $i$.
where $d_{i,k}$ is the distance between UAV $i$ and interference source $k$, $\beta_0 = (\lambda/4\pi)^2$ is
the path loss at a reference distance of 1 m under LoS conditions, $\lambda$ is the carrier
wavelength, $\kappa < 1$ is the additional attenuation factor due to NLoS propagation, and $\alpha$ is the path loss
$$P_{LoS}(\theta) = \frac{1}{1 + a \cdot \exp(-b(\theta - a))} \qquad (6)$$
Among them, a and b are modeling parameters, and θ is the elevation angle from
interference source k to UAV i, namely
where H is the height of the UAV. The probability of an NLoS environment is given by
$$R_c = \frac{\lambda}{4\pi}\left(\frac{P_i}{\sigma^2 + P_{i,K} \cdot 10^{SINR_{th}/10}}\right)^{\frac{1}{\alpha}} \qquad (10)$$
where $\sigma^2$ is the average power of the noise in the wireless channel and $SINR_{th}$ is
the signal to interference plus noise ratio (SINR) threshold. In order to ensure the
connectivity of the multi-UAV network, there is at least one undirected link between
every two UAVs; the communication between adjacent UAVs in the undirected path
needs to meet its maximum transmission distance.
The interference experienced by a UAV is shown in Figure 5. When the multi-UAV
network is affected by an interference source, the maximum transmission distance of
the UAV signal is reduced. The closer the UAV is to the interference source, the smaller
its communication range. This is reflected in the actual UAV formation: the
distance between UAVs is scaled adaptively to maintain the connectivity of the system.
The SAPF algorithm establishes an attractive potential field for the target and a
repulsive potential field for the obstacle. The two potential fields are combined to avoid the
collision between the UAV and the obstacle in the process of flying to the target position.
The attractive and repulsive potential fields are expressed as
$$U_{att}(p) = \frac{1}{2} k_{att} \cdot l^2(p_{uav}, p_{tar}) \qquad (12)$$
$$U_{rep}(p) = \begin{cases} \frac{1}{2} k_{rep} \left( \frac{1}{l(p_{uav}, p_{obs})} - \frac{1}{l_o} \right)^2, & l(p_{uav}, p_{obs}) \le l_o \\ 0, & l(p_{uav}, p_{obs}) > l_o \end{cases} \qquad (13)$$
where $k_{att}$ is the attraction gain factor, $k_{rep}$ is the repulsive force gain coefficient, $l(p_{uav}, p_{tar})$
denotes the distance between the UAV and the target position, and $l(p_{uav}, p_{obs})$ is the
distance between the UAV and the obstacle, i.e., the Euclidean distance between two points.
$l_o$ is a constant that represents the maximum range over which the obstacle can affect the
UAV. The attractive and repulsive forces are the negative gradients of the attractive and
repulsive potential fields, respectively, and the attractive and repulsive force functions are
expressed as
$$F_{att}(p) = -\nabla(U_{att}(p)) = -k_{att} \cdot l(p_{uav}, p_{tar}) \qquad (14)$$
$$F_{rep}(p) = \begin{cases} k_{rep} \left( \frac{1}{l(p_{uav}, p_{obs})} - \frac{1}{l_o} \right) \cdot \frac{1}{l^2(p_{uav}, p_{obs})} \cdot \frac{\partial (l(p_{uav}, p_{obs}))}{\partial (p)}, & l(p_{uav}, p_{obs}) \le l_o \\ 0, & l(p_{uav}, p_{obs}) > l_o \end{cases} \qquad (15)$$
Then, a speed steering force is added to solve the local minimum problem. The speed
steering force is expressed as
$$F_{rep}^{v} = \begin{cases} k_{vrep} \left( \frac{1}{l(p_{uav}, p_{obs})} - \frac{1}{l_o} \right) \cdot v, & l(p_{uav}, p_{obs}) \le l_o \\ 0, & l(p_{uav}, p_{obs}) > l_o \end{cases} \qquad (16)$$
where $k_{vrep}$ is the speed repulsion force gain coefficient, $v$ is the speed of the UAV, and the
direction of $F_{rep}^{v}$ is perpendicular to $v$. Therefore, the resultant repulsive force is expressed as
$$F_{rep}^{sum} = F_{rep}(p) + F_{rep}^{v} \qquad (17)$$
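The attractive, repulsive, and speed steering forces of Equations (14)–(17) can be sketched in Python for the planar case. This is a minimal sketch under stated assumptions: the function names are hypothetical, the obstacle is treated as a point, and, because the text only states that the steering force is perpendicular to $v$, the sketch assumes that the perpendicular pointing away from the obstacle is chosen.

```python
import math

def attractive_force(p_uav, p_tar, k_att):
    """Equation (14): pull toward the target, proportional to the distance."""
    return (-k_att * (p_uav[0] - p_tar[0]), -k_att * (p_uav[1] - p_tar[1]))

def repulsive_force(p_uav, p_obs, k_rep, l_o):
    """Equation (15): push away from the obstacle inside its influence range l_o."""
    dx, dy = p_uav[0] - p_obs[0], p_uav[1] - p_obs[1]
    l = math.hypot(dx, dy)
    if l > l_o or l == 0.0:
        return (0.0, 0.0)
    mag = k_rep * (1.0 / l - 1.0 / l_o) / (l * l)
    return (mag * dx / l, mag * dy / l)  # (dx/l, dy/l) is the gradient of l w.r.t. p_uav

def steering_force(p_uav, p_obs, v, k_vrep, l_o):
    """Equation (16): force perpendicular to v, scaled by k_vrep * (1/l - 1/l_o)."""
    dx, dy = p_uav[0] - p_obs[0], p_uav[1] - p_obs[1]
    l = math.hypot(dx, dy)
    if l > l_o or l == 0.0:
        return (0.0, 0.0)
    gain = k_vrep * (1.0 / l - 1.0 / l_o)
    px, py = -v[1], v[0]            # v rotated by 90 degrees
    if px * dx + py * dy < 0.0:     # assumed rule: steer to the side facing away from the obstacle
        px, py = -px, -py
    return (gain * px, gain * py)

def resultant_repulsion(p_uav, p_obs, v, k_rep, k_vrep, l_o):
    """Equation (17): sum of the position and speed repulsive forces."""
    fr = repulsive_force(p_uav, p_obs, k_rep, l_o)
    fv = steering_force(p_uav, p_obs, v, k_vrep, l_o)
    return (fr[0] + fv[0], fr[1] + fv[1])
```

When the UAV flies straight at an obstacle, the position repulsion alone is collinear with the attraction and can cancel it; the perpendicular steering term is what moves the UAV sideways out of that local minimum.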
In addition, this paper adopts the formation control mode of the virtual pilot. Then,
the consensus algorithm for the double-integrator dynamic system shown in
Equation (2) is further expressed as
$$u_i = -\sum_{j=1}^{n} a_{ij}(t)\left(c_1 (p_i^{uav} - p_j^{uav} - \Delta h_{ij}) + c_2 (v_i - v_j)\right) - f_r, \quad i = 1, 2, \ldots, n \qquad (18)$$
Before the departure of the multi-UAV network, the UAVs are divided into layers
according to their number of communication links. If there is a communication link
between UAV $i$ and UAV $j$, then $E_{ij} = 1$. The UAV $i$ satisfying $\arg\max_i \sum_j E_{ij}$ is selected as the
first layer of the multi-UAV network. If several UAVs have the same number of links,
the UAV closest to the target position is selected. Then, the UAVs that have a communication link
with the UAVs on the first layer form the second layer, and the third and subsequent
layers are divided in the same way. Next, each UAV is numbered in
order from top to bottom and from left to right, and weights are assigned to the UAVs according to
the position difference of each UAV in the expected formation. Generally, the expected
multi-UAV formation is divided into three layers from top to bottom according to the
principle of hierarchical division, and the basic formation configuration is obtained. The
first UAV in the initial formation is generally the center UAV of the first layer. The
numbering of the second layer specifies the relative position of each UAV in order
from left to right. The numbering of the third and subsequent layers is the same
as that of the second layer. After layering, two control mechanisms, the hierarchical weight $\beta_q$
and the intra-layer position weight $\beta_p$, are established by setting the corresponding weight
coefficients to ensure the stability of the reconstructed UAV formation. The UAVs in the
first layer have the largest $\beta_q$, which decreases as the layer number increases;
the position weights $\beta_p$ within a layer decrease in order from left to right. For
V-shaped formations, $\beta_q$ is equal for all UAVs within the same layer, $\beta_p$ is not equal, and
$\beta_q \gg \beta_p$.
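The layer division described above can be sketched in Python as a breadth-first traversal of the link matrix. This is a minimal sketch, assuming that the link matrix `links` and the distances to the target `dist_to_target` are given; the function name is hypothetical, and the weights $\beta_q$ and $\beta_p$ are not computed because the text does not give their values.

```python
def assign_layers(links, dist_to_target):
    """Return the layer index (1-based) of every UAV, or None if it is unreachable.

    The first layer is the UAV with the most links; ties are broken by the
    smallest distance to the target. Each further layer collects the UAVs
    linked to the previous layer that have not been assigned yet.
    """
    m = len(links)
    degree = [sum(row) for row in links]
    root = min(range(m), key=lambda i: (-degree[i], dist_to_target[i]))
    layer = [None] * m
    layer[root] = 1
    frontier = [root]
    while frontier:
        next_frontier = []
        for i in frontier:
            for j in range(m):
                if links[i][j] and layer[j] is None:
                    layer[j] = layer[i] + 1
                    next_frontier.append(j)
        frontier = next_frontier
    return layer
```

For a five-UAV V-formation with links 0-1, 0-2, 1-3, and 2-4, UAVs 0, 1, and 2 all have two links, so the one closest to the target becomes the first layer and the result is a three-layer structure.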
When a UAV is damaged or forced to leave the system, a child UAV of the problem
UAV is used as the repair UAV. The multi-UAV formation is traversed downward along the
communication links until the entire UAV formation has been traversed. Then, the repair subnet is
established. If there are multiple child UAVs, the child UAV that can reach the expected
position of the problem UAV the fastest is chosen as the repair UAV according to the
position, speed, and acceleration of each child UAV at the current moment. If there are
multiple problem UAVs, the child UAV of the problem UAV with the larger
weight is selected to repair the missing position. The repair UAV first flies to the desired position of the
problem UAV, so as to establish connectivity with the other child UAVs of the original problem
UAV. The repair UAV stays within the maximum communication link range of the root UAV
of the problem UAV and moves in the direction in which the problem UAV left the team,
so as to restore the connection with the problem UAV as far as possible.
If the connection with the problem UAV cannot be restored, the child UAV of the repair UAV also
approaches the problem UAV to form a serial link that extends the communication range.
After the repair subnetwork is established, the weights of the child UAVs of the problem
UAVs are updated. First, each UAV recalculates its current weights according to the
formation in the repair subnetwork. It then sends the new weights to the UAVs through
the link, and the repair UAV sums the new weight and its own weight to
realize the weight update.
Algorithm 1 CMUFC
1: Initialize the physical parameters of M UAVs
2: Initialize the physical parameters of O obstacles and K interference sources
3: for t = 1, ..., T do
4:     for i = 1, ..., M do
5:         Calculate the communication distance of UAV i in Equation (10)
6:         Calculate the distance between UAV i and neighboring UAVs in the undirected path
7:         Calculate the resultant force of UAV i in Equation (20)
8:         Calculate the position of UAV i under the constraints at time t + 1
9:         if there is an undirected path in the multi-UAV network then
10:            Continue the cycle
11:        else
12:            Repair system connectivity
13:        end if
14:    end for
15: end for
4. Simulation Results
In this section, we simulate a V-formation multi-UAV network and analyze the simu-
lation results. The relevant parameters of the simulation are shown in Table 1.
Table 1. Simulation parameters.
Parameter                               Value
Number of UAVs                          M = 5
Transmitting power of UAVs              Puav = 36 dBm
Maximum speed of UAV                    vmax = 30 m/s
Maximum acceleration of UAV             amax = 30 m/s²
Number of interference sources          K = 3
Power of interference sources           Pint = 10–36 dBm
Number of obstacles                     O = 10
Obstacle size                           ro = 30–50 m
Position attractive force coefficient   katt = 0.1
Position repulsive force coefficient    krep = 1500
Speed repulsive force coefficient       kvrep = 100
Radius of influence of obstacles        lo = 100 m
Safe radius of the UAV                  lmin = 10 m
5. Conclusions
We investigated the problem of maintaining the connectivity of multi-UAV networks
in complex environments. For complex environments with obstacles and interference
sources, the CMUFC algorithm helps multi-UAV networks safely avoid obstacles and
maintain connectivity during flight. To address the risk that UAVs collide
with obstacles during fast flight, the traditional APF is improved, and SAPF is proposed
to help UAVs avoid obstacles more safely. In addition, for the case in which
UAVs leave the team and the communication topology is forced to change during
obstacle avoidance, the proposed method helps the multi-UAV network perform
formation reconstruction. The simulation results show that the CMUFC algorithm helps
multiple UAVs form, maintain, and reconstruct the formation during flight.
Author Contributions: Conceptualization, L.Z.; Methodology, C.M.; Software, J.L.; Validation, Y.L.;
Writing–review, Q.Y. All authors have read and agreed to the published version of the manuscript.
Funding: This research was supported by the Natural Science Basis Research Plan in Shaanxi Province
of China (2023JCYB555).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Wu, E.; Sun, Y.; Huang, J.; Zhang, C.; Li, Z. Multi UAV Cluster Control Method Based on Virtual Core in Improved Artificial
Potential Field. IEEE Access 2020, 8, 131647–131661. [CrossRef]
2. Yue, X.; Zhang, W. UAV Path Planning Based on K-Means Algorithm and Simulated Annealing Algorithm. In Proceedings of the
37th Chinese Control Conference (CCC), Wuhan, China, 25–27 July 2018; pp. 2290–2295.
3. Li, Z.; Han, R. Unmanned Aerial Vehicle Three-dimensional Trajectory Planning Based on Ant Colony Algorithm. In Proceedings
of the 37th Chinese Control Conference (CCC), Wuhan, China, 25–27 July 2018; pp. 9992–9995.
4. Lin, Y.; Saripalli, S. Sampling-Based Path Planning for UAV Collision Avoidance. IEEE Trans. Intell. Transp. Syst. 2017, 18,
3179–3192. [CrossRef]
5. Wei, Z.; Meng, Z.; Lai, M.; Wu, H.; Han, J.A.; Feng, Z. Anti-collision Technologies for Unmanned Aerial Vehicles: Recent Advances
and Future Trends. IEEE Internet Things J. 2022, 9, 7619–7638. [CrossRef]
6. Tegicho, B.E.; Bogale, T.E.; Eroglu, A.; Edmonson, W. Connectivity and Safety Analysis of Large Scale UAV Swarms: Based on
Flight Scheduling. In Proceedings of the 2021 IEEE 26th International Workshop on Computer Aided Modeling and Design of
Communication Links and Networks (CAMAD), Porto, Portugal, 25–2 October 2021; pp. 1–6.
7. Mukherjee, S.; Namuduri, K. Formation Control of UAVs for Connectivity Maintenance and Collision Avoidance. In Proceedings
of the IEEE National Aerospace and Electronics Conference (NAECON), Dayton, OH, USA, 15–19 July 2019; pp. 126–130.
8. Huang, Y.T.; Qi, N.; Huang, Z.Q.; Jia, L.L.; Wu, Q.H.; Yao, R.G.; Wang, W.J. Connectivity Guarantee Within UAV Cluster: A Graph
Coalition Formation Game Approach. IEEE Open J. Commun. Soc. 2023, 4, 79–90. [CrossRef]
9. Trimble, J.; Pack, D.; Ruble, Z. Connectivity Tracking Methods for a Network of Unmanned Aerial Vehicles. In Proceedings of the
IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 7–9 January 2019;
pp. 0440–0447.
10. Ma, Z.; Qi, J.; Wang, M.; Wu, C.; Guo, J.; Yuan, S. Time-Varying Formation Tracking Control for Multi-UAV Systems with Directed
Graph and Communication Delays. In Proceedings of the 40th Chinese Control Conference (CCC), Shanghai, China, 26–28 July
2021; pp. 5436–5441.
11. Li, Z.; Huang, T.; Tang, Y.; Zhang, W. Formation Control of Multiagent Systems With Communication Noise: A Convex Analysis
Approach. IEEE Trans. Cybern. 2021, 51, 2253–2264. [CrossRef] [PubMed]
12. Fei, B.; Bao, W.; Zhu, X.; Liu, D.; Men, T.; Xiao, Z. Autonomous Cooperative Search Model for Multi-UAV With Limited
Communication Network. IEEE Internet Things J. 2022, 9, 19346–19361. [CrossRef]
13. Baldi, S.; Sun, D.; Zhou, G.; Liu, D. Adaptation to Unknown Leader Velocity in Vector-Field UAV Formation. IEEE Trans. Aerosp.
Electron. Syst. 2022, 58, 473–484. [CrossRef]
14. Roldao, V.; Cunha, R.; Cabecinhas, D.; Silvestre, C.; Oliveira, P. A leader-following trajectory generator with application to
quadrotor formation flight. Robot. Auton. Syst. 2014, 62, 1597–1609. [CrossRef]
15. Davidi, A.; Berman, N.; Arogeti, S. Formation flight using multiple Integral Backstepping controllers. In Proceedings of the IEEE
5th International Conference on Cybernetics and Intelligent Systems (CIS), Qingdao, China, 17–19 September 2011; pp. 317–322.
16. Song, M.; Wei, R.X.; Hu, M.L. Unmanned aerial vehicle formation control for reconnaissance task based on virtual leader. Syst.
Eng. Electron. 2010, 32, 2412–2415.
17. Li, C.X.; Liu, Z.; Yin, H. Cooperative motions control method guided by virtual formations for multi-UAVs. Syst. Eng. Electron.
2012, 34, 1220–1224.
18. Khatib, O. Real-time obstacle avoidance for manipulators and mobile robots. In Proceedings of the 1985 IEEE International
Conference on Robotics and Automation, St. Louis, MO, USA, 25–28 March 1985.
19. Parivallal, A.; Sakthivel, R.; Amsaveni, R.; Alzahrani, F.; Saleh Alshomrani, A. Observer-based memory consensus for nonlinear
multi-agent systems with output quantization and Markov switching topologies. Phys. Stat. Mech. Appl. 2020, 551, 123949.
[CrossRef]
20. Parivallal, A.; Sakthivel, R.; Wang, C. Guaranteed cost leaderless consensus for uncertain Markov jumping multi-agent systems.
J. Exp. Theor. Artif. Intell. 2023, 35, 257–273. [CrossRef]
21. Wu, Y.; Liang, T.J. Improved consensus-based algorithm for unmanned aerial vehicle formation control. Acta Aeronaut. Astronaut.
Sin. 2020, 41, 172–190.
22. Wu, Y.; Guo, J.Z.; Hu, X.T.; Huang, Y.T. A new consensus theory-based method for formation control and obstacle avoidance of
UAVs. Aerosp. Sci. Technol. 2020, 107, 106332. [CrossRef]
23. Zou, Y.; Zhang, H.; He, W. Adaptive Coordinated Formation Control of Heterogeneous Vertical Takeoff and Landing UAVs
Subject to Parametric Uncertainties. IEEE Trans. Cybern. 2022, 52, 3184–3195. [CrossRef] [PubMed]
24. Tran, V.P.; Santoso, F.; Garratt, M.A.; Petersen, I.R. Distributed Formation Control Using Fuzzy Self-Tuning of Strictly Negative
Imaginary Consensus Controllers in Aerial Robotics. IEEE/ASME Trans. Mechatron. 2021, 26, 2306–2315. [CrossRef]
25. Wang, Y.; Cheng, Z.; Xiao, M. UAVs’ Formation Keeping Control Based on Multi-Agent System Consensus. IEEE Access 2020, 8,
49000–49012. [CrossRef]
26. Qin, T.; Yu, H.; Lv, Y.; Guo, Y. Artificial Potential Field Based Distributed Cooperative Collision Avoidance for UAV Formation.
In Proceedings of the 3rd International Conference on Unmanned Systems (ICUS), Harbin, China, 27–28 November 2020;
pp. 897–902.
27. Zeng, Y.; Wu, Q.; Zhang, R. Accessing from the sky: A tutorial on UAV communications for 5G and beyond. Proc. IEEE 2019, 107,
2327–2375. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
Article
Consensus Control of Large-Scale UAV Swarm Based on
Multi-Layer Graph
Taiqi Wang 1 , Shuaihe Zhao 1,2 , Yuanqing Xia 1, *, Zhenhua Pan 1 and Hanwen Tian 1
Abstract: An efficient control of large-scale unmanned aerial vehicle (UAV) swarm to establish a
complex formation is one of the most challenging tasks. This paper investigates a novel multi-layer
topology network and consensus control approach for a large-scale UAV swarm moving under a
stable configuration. The proposed topology can make the swarm remain robust in spite of the
large number of UAVs. Then a potential function-based controller is developed to control the UAVs
in realizing autonomous configuration swarming under the consideration of mutual collision, and
the stability of the controller from the individual UAV to the entire swarm system is analyzed by a
Lyapunov approach. Afterwards, a yaw angle adjustment approach for the UAVs to reach consensus
is developed for the multi-layer swarm, then the direction state of each UAV converges with a fast
rate. Finally, simulations are performed on the large-scale UAV swarm system to demonstrate the
effectiveness of the proposed scheme.
model predictive control method is applied for tracking trajectories. In [15], a multi-layer
formation control scheme and a layered distributed finite-time estimator are designed for
agents, which drive them to the desired positions and velocities according to the
information of agents in their prior layers. In practice, many issues need to be considered
in order to implement formation control approaches successfully, such as the avoidance of
obstacles and collisions.
The artificial potential field (APF) model provides an effective solution for practical
applications: it attracts the agent to the target and repulses it from obstacles, and it
can be executed quickly [16]. In [17], a rotating potential
field is introduced, which allows the UAVs to escape from oscillations and ensures
that the leader and followers maintain the desired angles and distances. Based on the APF
approach in [18], a novel motion planning and tracking framework for automated vehicles
is presented, and its effectiveness is validated in real experiments. In [19], an adaptive
synchronized tracking control based on a neural network is applied to boats by combining
APF and robust H∞ methods, where the artificial potential method is used to guarantee
that the boat maintains the desired distance from obstacles. In [20], different forms of potential
field functions are used for repulsion, velocity alignment, and interaction with walls and
obstacles, and the proposed model is validated on a self-organized swarm of 30 drones.
Consensus control of multi-agent systems is also an active research topic; consensus means
that all the agents in the system converge to the same state under a specific control law. In [21], a
distributed active anti-disturbance cooperative control method with a finite-time disturbance
observer is proposed to achieve consensus of the agents in finite time. In [22], the
consensus control problem is investigated under an event-triggered mean-square consensus
control law for a class of discrete time-varying stochastic multi-agent systems. Three
approaches are proposed in [23] for consensus control of multi-agent systems on
directed graphs, and related examples are presented to validate their effectiveness.
In [24], the cooperative trajectory tracking problem of UAV formations is investigated; both
position tracking to the desired position and attitude tracking to the commanded
attitude signal are achieved, supported by stability analysis and simulation validation.
The main challenges that impede solving the configuration and consensus
problem for the swarm are the large scale of the swarm and the chronological order
of configuration and consensus. Therefore, we have carried out the following research to
address these problems. To improve the scalability of the network topology for
large swarms, and based on the concept of [14], a multi-layer network
graph model is proposed for the large-scale UAV swarm, which makes the configuration
more adjustable and robust. After the configuration of the swarm is completed, a
multi-layer recursive consensus control concept is designed so that the UAVs in the
swarm reach an agreement, i.e., the yaw angles of the UAVs in each layer tend
to be consistent.
The remainder of the paper is organized as follows. Section 2 describes some pre-
liminaries and formulates the problem to be investigated in this paper. In Section 3,
the multi-layer UAVs swarm configuration control strategy and the consensus concept are
proposed. The effectiveness of the proposed methodologies is illustrated by numerical
analysis in Section 4. Finally, the results of our work are briefly summarized in Section 5.
Drones 2022, 6, 402
formation. For the undirected graph $\mathcal{G}$, the adjacency matrix is given by $A = (a_{ij}) \in \mathbb{R}^{N \times N}$
with $a_{ij} \neq 0 \Leftrightarrow (i, j) \in \varepsilon$, $a_{ij} = a_{ji}$. The neighboring set of an agent is denoted as in [25]:
Assumption 1. We assume that the large-scale UAV swarm consists of n UAVs with the same
dynamic characteristics flying at the same altitude. Therefore, the working environment of each
UAV can be considered a two-dimensional space.
Assumption 2. In the case of controlling a large-scale swarm, we treat each UAV as a point mass,
which means that the influence of the size and shape of each UAV can be ignored.
From a single agent to the multi-agent system, the dynamic protocol of the UAV swarm in
each layer is described as follows:
$$\left\{\begin{array}{lll}
\text{first layer:} & \dot{x}_i^1 = v_i^1, \quad \dot{v}_i^1 = u_i^1 = f_{sum}^1 - k_1 \dot{x}_i^1, & i \in \nu_1 \\
\text{second layer:} & \dot{x}_i^2 = v_i^2, \quad \dot{v}_i^2 = u_i^2 = f_{sum}^2 - k_1 \dot{x}_i^2, & i \in \nu_2 \\
& \vdots & \\
m\text{th layer:} & \dot{x}_i^m = v_i^m, \quad \dot{v}_i^m = u_i^m = f_{sum}^m - k_1 \dot{x}_i^m, & i \in \nu_m
\end{array}\right. \qquad (3)$$
where $x_i^1, v_i^1 \in \mathbb{R}^n$ are respectively the position and velocity of each UAV in the subgroup
of the first layer, $u_i^1 \in \mathbb{R}^n$ is the control input acting on it, and $f_{sum}^1$ is the resultant force,
which contains the obstacle avoidance force and the collision avoidance force between UAVs. For the second layer,
$x_i^2, v_i^2 \in \mathbb{R}^{k_2}$ and $u_i^2 \in \mathbb{R}^{k_2}$ are respectively the position, velocity, and control input of the
UAV in the subgroup of the second layer, where $k_2 = n/(N_o + 1)$ is the number of elements of
$\nu_2$, and $f_{sum}^2$ is the resultant force, which contains not only the mutual forces from each UAV but also
the potential field force from other subgroups in the second layer. Similarly, $x_i^m, v_i^m \in \mathbb{R}^{k_m}$
and $u_i^m \in \mathbb{R}^{k_m}$ are respectively the position, velocity, and control input of the UAV in
the subgroup of the $m$th layer, where $k_m = n/(N_o + 1)^{(m-1)}$ is the number of elements of $\nu_m$,
and the resultant force $f_{sum}^m$ contains the mutual forces from each UAV across the whole swarm
and the potential field forces from the second layer to the $m$th layer. Furthermore, $k_1$ is a
positive constant for damping action.
By analyzing the dynamic model of the UAV (2), we design the corresponding control
law to make the UAVs reach their desired configuration. Two forces are generated from
the designed potential functions to drive all the UAVs to the desired positions
and to avoid mutual collisions.
The mathematical expression of the potential function is as follows
$$V_{ij}(d_{ij}) = \begin{cases} -\xi \frac{d_{ij}}{r_0} \ln\left(\frac{d_{ij}}{r_0}\right) + \frac{d_{ij}}{r_0}, & x_i \in N_i^1 \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$
where $\xi$ is the positive control coefficient, $d_{ij} = \|x_i - x_j\|$ is the distance between agent $i$
and agent $j$, and $r_0$ is the desired distance between UAVs.
Differentiating (4) with respect to $d_{ij}$ yields a potential force as
$$f_{ij} = -\nabla V_{ij}(d_{ij}) = \begin{cases} \xi \ln\left(\frac{d_{ij}}{r_0}\right), & x_i \in N_i^1 \\ 0, & \text{otherwise} \end{cases} \qquad (5)$$
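The behavior of the potential force in Equation (5) can be illustrated with a short Python sketch. It is a sketch under stated assumptions: the function name is hypothetical, and the force is assumed to act on UAV $i$ along the line toward neighbor $j$, so that a positive value of $\xi \ln(d_{ij}/r_0)$ is an attraction and a negative value is a repulsion.

```python
import math

def potential_force(x_i, x_j, xi, r0):
    """Force on UAV i from neighbor j with magnitude xi * ln(d_ij / r0), as in Equation (5).

    Assumed direction convention: the force acts along the unit vector from
    i to j, so it attracts when d_ij > r0 and repels when d_ij < r0.
    """
    dx, dy = x_j[0] - x_i[0], x_j[1] - x_i[1]
    d = math.hypot(dx, dy)
    if d == 0.0:
        raise ValueError("coincident UAVs: the force direction is undefined")
    mag = xi * math.log(d / r0)
    return (mag * dx / d, mag * dy / d)
```

The force vanishes exactly at the desired distance $r_0$, which is why $r_0$ is the equilibrium spacing of neighboring UAVs.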
In another case, when UAV $i$ and UAV $j$ are not defined as neighbors, each can be
regarded as an obstacle to the other. Therefore, another potential function for obstacle
avoidance is needed, defined as follows
$$V_o(d_{io}) = \begin{cases} \eta (r_0 - d_{io}) \frac{x_i - x_j}{d_{io}}, & d_{io} < r_0 \\ 0, & d_{io} \ge r_0 \end{cases} \qquad (6)$$
where $\eta$ is the positive control gain, $x_o$ is the position of the obstacle $o$, and $d_{io} = \|x_i - x_o\|$ is
the distance between UAV $i$ and the obstacle.
Then we define the set of obstacles as
$$O_i = \{ j \notin N_i \mid d_{io} < r_0 \} \qquad (7)$$
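Based on the above two forces, the resultant force $f_{sum}^1$ for the first layer is expressed
as follows
$$f_{sum}^1 = \sum_{j \in N_i^1} f_{ij} + \sum_{o \in O_i} f_{io} \qquad (9)$$
The corresponding control law of the first layer is
$$\begin{aligned} u_i^1 &= \sum_{j \in N_i^1} f_{ij} + \sum_{o \in O_i} f_{io} - k_1 \dot{x}_i^1 \\ &= -\sum_{j \in N_i^1} \nabla V_{ij}(d_{ij}) - \sum_{o \in O_i} \nabla V_o(d_{io}) - k_1 \dot{x}_i^1 \end{aligned} \qquad (10)$$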
For the second layer, in addition to the mutual forces between individual UAVs, the
swarm is also affected by the potential field force between the subgroups. We define the
potential function of the second layer as follows
$$V_{ij}^2(d_{ij}^2) = \begin{cases} -\xi \frac{d_{ij}^2}{r_0^2} \ln\left(\frac{d_{ij}^2}{r_0^2}\right) + \frac{d_{ij}^2}{r_0^2}, & x_i^2 \in N_i^2 \\ 0, & \text{otherwise} \end{cases} \qquad (11)$$
where the superscript 2 is the layer index, $d_{ij}^2 = \|x_i^2 - x_j^2\|$, and $r_0^2$ is the desired distance of the second layer. Then the
corresponding potential force is expressed as follows
At the same time, each UAV in the swarm has gathered within a fixed area, so the
obstacle avoidance force vanishes. Therefore, the resultant force $f_{sum}^2$ is combined as follows
$$f_{sum}^2 = \sum_{j \in N_i^1} f_{ij} + \sum_{j \in N_i^2} f_{ij}^2 \qquad (13)$$
The control law $u_i^2$ of the second layer can be described as follows
For the $m$th layer, we assume it to be the last layer of the whole swarm, so each UAV in the
$m$th layer is subject to global forces. The potential function is described as follows
$$V_{ij}^m(d_{ij}^m) = \begin{cases} -\xi \frac{d_{ij}^m}{r_0^m} \ln\left(\frac{d_{ij}^m}{r_0^m}\right) + \frac{d_{ij}^m}{r_0^m}, & x_i^m \in N_i^m \\ 0, & \text{otherwise} \end{cases} \qquad (15)$$
where $d_{ij}^m = \|x_i^m - x_j^m\|$ and $r_0^m$ is the desired distance of the $m$th layer. Then the
corresponding potential force is expressed as follows
The control law of the entire UAV swarm are completed. Furthermore, the stability of
the configuration needs to be analyzed.
Theorem 1. Consider a subgroup of n UAVs with dynamics (2). Under the control law (10), each
UAV can stay at a desired position, and the forces and velocity finally converge to zero.
$$
V_1 = \sum_{j \in N_i^1} V_{ij}(d_{ij}) + \sum_{o \in O_i} V_o(d_{io}) + \frac{1}{2} (\dot{x}_i^1)^T \dot{x}_i^1
\tag{19}
$$
From the above conclusion, $V_1$ is non-negative. Differentiating (19) with respect
to time and combining with (2), (3) and (10), we have
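$$
\begin{aligned}
\dot{V}_1 &= (\dot{x}_i^1)^T \Big( \sum_{j \in N_i^1} \nabla V_{ij}(d_{ij}) + \sum_{o \in O_i} \nabla V_o(d_{io}) + \ddot{x}_i^1 \Big) \\
&= (\dot{x}_i^1)^T \big( -f^1_{sum} + u_i^1 \big) \\
&= -k_1 (\dot{x}_i^1)^T \dot{x}_i^1 \\
&\le 0
\end{aligned}
\tag{20}
$$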
Thus the energy of each UAV i (i = 1, 2, ..., n) is monotonically decreasing. From the
analysis we can conclude that the velocities of the UAVs eventually converge to the same value.
Theorem 2. For the entire swarm with n agents, under the global control law (18), all the UAVs
can arrive at the desired positions, and the potential forces from the first layer to the mth layer and
the velocity finally converge to zero.
Differentiating (21) with respect to time and combining with (3) and (18), we have
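$$
\begin{aligned}
\dot{V}_m &= \sum_{i=1}^{n} (\dot{x}_i^m)^T \Big( \sum_{j \in N_i^1} \nabla V_{ij}(d_{ij}) + \sum_{j \in N_i^2} \nabla V_{ij}^2(d_{ij}^2) \\
&\qquad + \ldots + \sum_{j \in N_i^{m-1}} \nabla V_{ij}^{m-1}(d_{ij}^{m-1}) + \sum_{j \in N_i^m} \nabla V_{ij}^m(d_{ij}^m) + \ddot{x}_i^m \Big) \\
&= \sum_{i=1}^{n} (\dot{x}_i^m)^T \big( -f^m_{sum} + u_i^m \big) \\
&= -k_1 \sum_{i=1}^{n} (\dot{x}_i^m)^T \dot{x}_i^m \\
&\le 0
\end{aligned}
\tag{22}
$$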
Therefore, the total potential energy approaches the minimum, and $\dot{x}_i^m \to 0$ as $t \to \infty$
for all the UAVs in the swarm, and so does $\ddot{x}_i^m$. As a result, the multi-layer configuration of
the swarm is constructed.
The attitude of the UAV i1 can be updated according to the attitude of all the UAVs in the
same subgroup. Therefore, the consensus of the first layer is achieved.
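For the second layer, the yaw angles of the UAVs are adjusted by the following approach
$$
\begin{aligned}
\psi_{i_2}(t+1) &= \arctan \frac{\sum_{j_2=1}^{N_o+1} \sin \psi_{j_2}(t)}{\sum_{j_2=1}^{N_o+1} \cos \psi_{j_2}(t)} \\
&= \arctan \frac{\sum_{j_2=1}^{N_o+1} \sin\!\Big( \frac{1}{N_o+1} \sum_{j_1=1}^{N_o+1} \psi_{j_1}(t) \Big)}{\sum_{j_2=1}^{N_o+1} \cos\!\Big( \frac{1}{N_o+1} \sum_{j_1=1}^{N_o+1} \psi_{j_1}(t) \Big)}
\end{aligned}
\tag{24}
$$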
Therefore, $\psi_{i_2}$ is obtained from the average of the yaw angles of the individual UAVs
in all the subgroups for the first layer.
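The update in Equation (24) can be illustrated with a short Python sketch. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and `math.atan2` is used in place of a plain arctangent of the ratio (an assumption made here) so that the quadrant is resolved and a zero denominator does not cause a failure.

```python
import math


def circular_mean(yaws):
    """Yaw obtained from the sine and cosine sums, as in the first line of Eq. (24)."""
    sin_sum = sum(math.sin(psi) for psi in yaws)
    cos_sum = sum(math.cos(psi) for psi in yaws)
    # Assumption: atan2 replaces arctan(sin_sum / cos_sum) to keep the quadrant.
    return math.atan2(sin_sum, cos_sum)


def second_layer_yaw(subgroups):
    """Second line of Eq. (24): average each first-layer subgroup arithmetically,
    then combine the subgroup averages through the sine and cosine sums."""
    subgroup_means = [sum(group) / len(group) for group in subgroups]
    return circular_mean(subgroup_means)


# Hypothetical yaw angles (radians) for three first-layer subgroups of three UAVs.
yaw_groups = [[0.0, 0.2, 0.4], [0.1, 0.2, 0.3], [0.2, 0.2, 0.2]]
layer2_yaw = second_layer_yaw(yaw_groups)
```

In this example every subgroup average is 0.2 rad, so the second-layer yaw is also 0.2 rad up to rounding.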
Based on the above strategy, the yaw angle adjustment strategy for the mth
layer is as follows
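$$
\begin{aligned}
\psi_{i_m}(t+1) &= \arctan \frac{\sum_{j_m=1}^{N_o+1} \sin \psi_{j_m}(t)}{\sum_{j_m=1}^{N_o+1} \cos \psi_{j_m}(t)} \\
&= \arctan \frac{\sum_{j_m=1}^{N_o+1} \sin\!\Big( \frac{1}{N_o+1} \sum_{j_{m-1}=1}^{N_o+1} \psi_{j_{m-1}}(t) \Big)}{\sum_{j_m=1}^{N_o+1} \cos\!\Big( \frac{1}{N_o+1} \sum_{j_{m-1}=1}^{N_o+1} \psi_{j_{m-1}}(t) \Big)}
\end{aligned}
\tag{25}
$$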
The above describes the consensus strategy between different layers, by which all
the UAVs in the swarm eventually achieve consensus. As a specific example of the
UAV swarm, shown in Figure 1, assume n = 9 and $N_o = 2$. There are nine UAVs labeled
$A_1, \ldots, A_9$, and the whole swarm is organized into three first-layer subgroups named
$G_1^1$, $G_1^2$, $G_1^3$, which constitute a second layer $G_2$. Furthermore, $G_1^1$ is composed of $A_1, A_2, A_3$,
while $G_1^2$ and $G_1^3$ consist of $A_4, A_5, A_6$ and $A_7, A_8, A_9$, respectively. We set the communication
so that $G_1^1$ and $G_1^2$ are connected by $A_2$ and $A_4$, $G_1^2$ and $G_1^3$ are connected by $A_6$ and $A_8$,
and $G_1^1$ and $G_1^3$ are connected by $A_3$ and $A_7$. Firstly, the UAV swarm achieves the desired
configuration through the forces between the individual UAVs and between the subgroups in the same
layer. Taking $A_2$ as an example, $A_2$ is subjected to the forces of $A_1$ and $A_3$, namely $f_{A_2 A_1}$
and $f_{A_2 A_3}$; $A_2$ is also subject to $f^2_{G_1^1 G_1^2}$ and $f^2_{G_1^1 G_1^3}$, which are the components between $G_1^1$
and $G_1^2$ and between $G_1^1$ and $G_1^3$, respectively. After the whole swarm reaches the desired
configuration, the resultant force on $A_2$ is zero. Furthermore, let $G_1^1$, $G_1^2$ and $G_1^3$ achieve
intra-group consensus through the yaw angle adjustment strategy (23) and reach the common
yaw angles $\psi_{A_1}$, $\psi_{A_4}$ and $\psi_{A_7}$, respectively. For the second layer, $G_2$ achieves the intra-group
consistent yaw angle from the average of $\psi_{A_1}$, $\psi_{A_4}$ and $\psi_{A_7}$. Based on this rule, the
entire swarm eventually achieves consensus.
Figure 1. Communication topology with n = 9, No = 2.
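The grouping in the Figure 1 example can be written down as a small Python sketch. It is only an illustration under stated assumptions: the function name `build_layers` is hypothetical, and the rule that consecutive labels form subgroups of $N_o + 1$ members is an assumption that reproduces the n = 9, $N_o = 2$ example; the paper's own assignment rule is not shown in this excerpt.

```python
def build_layers(agents, n_o):
    """Group items into subgroups of n_o + 1 members, layer by layer.

    Assumption: consecutive items are grouped together; this matches the
    Figure 1 example but is not claimed to be the paper's general rule.
    """
    if n_o < 1:
        raise ValueError("n_o must be at least 1")
    size = n_o + 1
    layers = []
    current = list(agents)
    while len(current) > 1:
        groups = [tuple(current[k:k + size]) for k in range(0, len(current), size)]
        layers.append(groups)
        current = groups  # the subgroups become the members of the next layer
    return layers


agents = [f"A{k}" for k in range(1, 10)]   # A1, ..., A9
layers = build_layers(agents, n_o=2)
first_layer, second_layer = layers
```

Here `first_layer` holds the three subgroups (A1, A2, A3), (A4, A5, A6), (A7, A8, A9), and `second_layer` holds the single group made of those three subgroups.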
4. Simulation Study
To illustrate the effectiveness of the proposed multi-layer topology and the consensus
algorithm, simulation results under different conditions are presented in
this section. For the multi-layer UAV swarm, we consider a group of networked UAVs
with n = 27 and $N_o = 2$, which contains two layers of subgroups. The control parameters are
chosen as $r_0 = 2$ m, $r_0^2 = 4$ m, $\xi = 20$, and $\eta = 5$.
[Figure: distance (m) versus step.]
[Figure: yaw angle (deg) versus step.]
5. Conclusions
The current paper proposed a multi-layer framework to deal with the multi-agent
problem with an arbitrary number of UAVs. The primary
contribution is that the designed multi-layer structure can be used to form the desired
configuration and keep consensus in a large-scale UAV swarm under
Assumption 1 and Assumption 2, rather than moving into random positions. A potential
function-based multi-layer controller is developed to drive all the UAVs to the
desired configuration precisely without collisions. All the UAVs then reach an agreement
through the consensus algorithm. The stability of the system is proved by the Lyapunov
approach. The simulation studies demonstrated the effectiveness of the proposed methods
for the UAV swarm. In our future work, the trajectory tracking and obstacle avoidance
of the large-scale UAV swarm will be investigated under the Active Disturbance Rejection
Control approach.
Abbreviations
The following abbreviations are used in this manuscript:
References
1. Nasir, M.H.; Khan, S.A.; Khan, M.M.; Fatima, M. Swarm Intelligence inspired Intrusion Detection Systems—A systematic
literature review. Comput. Netw. 2022, 205, 108708. [CrossRef]
2. Rosenberg, L.; Willcox, G.; Askay, D.; Metcalf, L.; Harris, E. Amplifying the social intelligence of teams through human swarming.
In Proceedings of the 2018 First International Conference on Artificial Intelligence for Industries (AI4I), San Francisco, CA, USA,
26–28 September 2018; pp. 23–26.
3. Kiebert, L.; Joordens, M. Autonomous robotic fish for a swarm environment. In Proceedings of the 2016 11th System of Systems
Engineering Conference (SoSE), Waurn Ponds, Australia, 12–16 June 2016; pp. 1–6.
4. Reynolds, C.W. Flocks, herds and schools: A distributed behavioral model. In Proceedings of the 14th Annual Conference on
Computer Graphics and Interactive Techniques, Los Angeles, CA, USA, 27–31 July 1987; pp. 25–34.
5. Cucker, F.; Smale, S. Emergent behavior in flocks. IEEE Trans. Autom. Control. 2007, 52, 852–862. [CrossRef]
6. Saska, M.; Vakula, J.; Přeućil, L. Swarms of micro aerial vehicles stabilized under a visual relative localization. In Proceedings of
the 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong Convention and Exhibition Center,
Hong Kong, China, 31 May–7 June 2014; pp. 3570–3575.
7. Olfati-Saber, R. Flocking for multi-agent dynamic systems: Algorithms and theory. IEEE Trans. Autom. Control. 2006, 51,
401–420. [CrossRef]
8. Su, H.; Wang, X.; Lin, Z. Flocking of multi-agents with a virtual leader. IEEE Trans. Autom. Control. 2009, 54, 293–307. [CrossRef]
9. Fu, X.; Pan, J.; Wang, H.; Gao, X. A formation maintenance and reconstruction method of UAV swarm based on distributed
control. Aerosp. Sci. Technol. 2020, 104, 1270–9638. [CrossRef]
10. Wu, Y.; Gou, J.; Hu, X.; Huang, Y. A new consensus theory-based method for formation control and obstacle avoidance of UAVs.
Aerosp. Sci. Technol. 2020, 107, 1270–9638. [CrossRef]
11. Soria, E.; Schiano, F.; Floreano, D. Predictive control of aerial swarms in cluttered environments. Nat. Mach. Intell. 2021, 3,
545–554. [CrossRef]
12. Haghighi, R.; Cheah, C.C. Multi-group coordination control for robot swarms. Automatica 2012, 48, 2526–2534. [CrossRef]
13. Yan, X.; Chen, J.; Sun, D. Multilevel-based topology design and shape control of robot swarms. Automatica 2012, 48,
3122–3127. [CrossRef]
14. Pan, Z.; Sun, Z.; Deng, H.; Li, D. A Multilayer Graph for Multiagent Formation and Trajectory Tracking Control Based on MPC
Algorithm. IEEE Trans. Cybern. 2021, 52, 13586–13597. [CrossRef]
15. Li, D.; Ge, S.S.; He, W.; Ma, G.; Xie, L. Multilayer formation control of multi-agent systems. IEEE Trans. Cybern. 2019,
109, 108558. [CrossRef]
16. Liu, X.; Ge, S.S.; Goh, C.H. Formation potential field for trajectory tracking control of multi-agents in constrained space. Int. J.
Control 2017, 90, 2137–2151. [CrossRef]
17. Pan, Z.; Zhang, C.; Xia, Y.; Xiong, H.; Shao, X. An Improved Artificial Potential Field Method for Path Planning and Formation
Control of the Multi-UAV Systems. IEEE Trans. Circuits Syst. II Express Briefs 2021, 69, 1129–1133. [CrossRef]
18. Huang, Y.; Ding, H.; Zhang, Y.; Wang, H.; Cao, D.; Xu, N.; Hu, C. A motion planning and tracking framework for au-
tonomous vehicles based on artificial potential field elaborated resistance network approach. IEEE Trans. Ind. Electron. 2019, 67,
1376–1386. [CrossRef]
19. Wen, G.; Ge, S. S.; Tu, F.; Choo, Y.S. Artificial Potential-Based Adaptive H∞ Synchronized Tracking Control for Accommodation
Vessel. IEEE Trans. Ind. Electron. 2017, 64, 5640–5647. [CrossRef]
20. Vásárhelyi, G.; Virágh, C.; Somorjai, G.; Nepusz, T.; Eiben, A.E.; Vicsek, T. Optimized flocking of autonomous drones in confined
environments. Sci. Robot. 2018, 3, eaat3536. [CrossRef]
21. Wang, X.; Li, S.; Yu, X.; Yang, J. Distributed active anti-disturbance consensus for leader-follower higher-order multi-agent
systems with mismatched disturbances. IEEE Trans. Autom. Control 2016, 62, 5795–5801. [CrossRef]
22. Ma, L.; Wang, Z.; Lam, H.K. Event-triggered mean-square consensus control for time-varying stochastic multi-agent system with
sensor saturations. IEEE Trans. Autom. Control 2016, 62, 3524–3531. [CrossRef]
23. Zhang, H.; Lewis, F.L.; Qu, Z. Lyapunov, adaptive, and optimal design techniques for cooperative systems on directed communi-
cation graphs. IEEE Trans. Ind. Electron. 2011, 59, 3026–3041. [CrossRef]
24. Zou, Y.; Meng, Z. Coordinated trajectory tracking of multiple vertical take-off and landing UAVs. IEEE Trans. Ind. Electron. 2019,
99, 33–40. [CrossRef]
25. Olfati-Saber, R.; Murray, R.M. Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans.
Autom. Control. 2004, 49, 1520–1533. [CrossRef]
26. Vicsek, T.; Czirók, A.; Ben-Jacob, E.; Cohen, I.; Shochet, O. Novel type of phase transition in a system of self-driven particles.
Phys. Rev. Lett. 1995, 75, 1226. [CrossRef] [PubMed]
drones
Article
Distributed Offloading for Multi-UAV Swarms in
MEC-Assisted 5G Heterogeneous Networks
Mingfang Ma and Zhengming Wang *
Abstract: Mobile edge computing (MEC) is a novel paradigm that offers numerous possibilities for
Internet of Things (IoT) applications. In typical use cases, unmanned aerial vehicles (UAVs) that can
be applied to monitoring and logistics have received wide attention. However, subject to their own
flexible maneuverability, limited computational capability, and battery energy, UAVs need to offload
computation-intensive tasks to ensure the quality of service. In this paper, we solve this problem for
UAV systems in a 5G heterogeneous network environment by proposing an innovative distributed
framework that jointly considers transmission assessment and task offloading. Specifically, we
devised a fuzzy logic-based offloading assessment mechanism at the UAV side, which can adaptively
avoid risky wireless links based on the motion state of a UAV and transmission performance metrics.
We introduce a multi-agent advantage actor–critic deep reinforcement learning (DRL) framework to
enable the UAVs to optimize the system utility by learning the best policies from the environment.
This requires decisions on computing modes as well as the choices of radio access technologies (RATs)
and MEC servers in the case of offloading. The results validate the convergence and applicability of
our scheme. Compared with the benchmarks, the proposed scheme is superior in many aspects, such
as reducing task completion delay and energy consumption.
Keywords: unmanned aerial vehicle; heterogeneous networks; computation offloading; fuzzy logic;
deep reinforcement learning
thus reducing the pressure on a single cellular network and enhancing the exploitation of
available network resources [10]. As a result, different UAV tasks will face many choices
when selecting the target network nodes to request services, and facilities in close proximity
are not always the best choice. Notably, although the UAV link selection in a network-sparse
environment is small and relatively fixed [11], a critical concern of this paper is how flying
UAVs adaptively evaluate and select transmission links in a distributed manner to achieve
flexible and stable offloading in hotspots with overlapped coverage of heterogeneous
networks.
For task offloading, previous works have mainly focused on developing strategies
under system certainty or used centralized approaches when faced with environmental
dynamics. In most cases, they fall short of settling the multi-UAV offloading problem in
unknown environments, particularly when multiple heterogeneous network nodes are de-
ployed and UAVs fly arbitrarily. In addition, the heuristics or dynamic programming meth-
ods commonly used to achieve optimal task-offloading solutions may be time-consuming
due to the large number of iterations required. As a result, these approaches may not be
suitable for real-time offloading decision-making in dynamic environments. Accordingly,
reinforcement learning (RL) has the potential to alleviate excessive computational demands,
so as to enable learning for the agents. Previous online schemes based on RL have coped
with system uncertainty to a certain extent, while offloading strategies are made centrally
by the system or independently by each agent.
In this paper, we propose a distributed task-offloading scheme for multi-UAVs in MEC-
assisted heterogeneous networks with the objective of maximizing the utilities of all UAVs
for processing tasks through multi-UAV collaboration. To prevent UAVs from offloading
via easily disconnected communication links and poorly performing service nodes, we
propose an offloading assessment mechanism for UAV swarms based on fuzzy logic. In the
framework, UAV velocity and transmission quality are jointly considered, and UAVs can
make assessments locally and efficiently based on the perceived information. Subsequently,
we designed an offloading algorithm by applying deep reinforcement learning (DRL),
which adopts multi-agent advantage actor–critic (A2C) policy optimization to automatically
and effectively work out the optimal solution, so as to reduce the task completion time and
energy consumption of a UAV swarm in a MEC environment.
The contributions of this paper are summarized as follows:
• We introduce a multi-agent task-offloading model in a heterogeneous network en-
vironment (which is different from the existing works that consider single-network
scenarios or independent devices). Moreover, the optimization problem is formulated
as a Markov decision process (MDP), which is beneficial for solving the sequential
offloading decision-making for UAV swarm in dynamic environments.
• To facilitate stable offloading of UAVs in any motion state, we devised a fuzzy logic-
based offloading assessment mechanism. The mechanism is executed in a decentral-
ized manner on the UAV with low complexity and can adaptively identify available
offloading nodes that are prone to disconnection or have undesirable transmission
quality.
• Based on the multi-agent DRL framework, we propose a distributed offloading scheme
named DOMUS. DOMUS effectively enables each UAV to learn the joint optimal policy,
such as determining the computing mode and selecting the RATs and MEC servers in
the offloading case.
• We performed different numerical simulations to verify the rationality and efficiency
of the DOMUS scheme. The evaluation results show that the proposed DOMUS
converges rapidly to a stable reward and outperforms four benchmark schemes
in energy consumption and delay.
The rest of the paper is structured as follows. Section 2 presents the related works
on task offloading. Section 3 illustrates the system model, gives the mathematical
formulation of the task computing model, formulates a utility model for the performance
Drones 2023, 7, 226
metrics of task computing, and defines the optimization problem. An offloading assessment
mechanism based on fuzzy logic is devised in Section 4. In Section 5, we propose a
distributed task-offloading algorithm by applying the multi-agent DRL framework. Finally,
Section 6 demonstrates and compares the efficiency of the proposed scheme and Section 7
summarizes this paper. For ease of reference, the definitions of the key symbols are listed
in Table 1.
Table 1. Definitions of the key symbols.
Symbol | Definition
$\mathcal{U} = \{1, ..., U\}$ | Set of UAVs
$\mathcal{M} = \{1, ..., M\}$ | Set of servers
$\kappa_u$ | Task of UAV $u \in \mathcal{U}$
$d_{u,\kappa}$ | Data size of $\kappa_u$
$c_{u,\kappa}$ | Computation resources required by task $\kappa_u$
$\alpha^1_{u,\kappa}$, $\alpha^2_{u,\kappa}$ | Offloading decisions
$\lambda_u$ | Computational capability of UAV $u \in \mathcal{U}$
$\lambda_m$ | Computational capability of server $m \in \mathcal{M}$
$l^{loc}_{u,\kappa}$ | Execution time in local computing
$e^{loc}_{u,\kappa}$ | Energy consumption in local computing
$\rho^e_u$ | Energy consumption coefficient per CPU cycle
$\xi^c_{u,\kappa}$, $\xi^w_{u,\kappa}$ | Transmission rate via cellular and Wi-Fi networks, respectively
$B^c_u$, $B^w_u$ | Allocated bandwidth to the UAV $u$ from cellular and Wi-Fi networks, respectively
$P^c_{u,\kappa}$, $P^w_{u,\kappa}$ | Transmission power of the UAV $u$ via cellular and Wi-Fi connectivities, respectively
$G^c_{u,\kappa}$, $G^w_{u,\kappa}$ | Channel gain over cellular and Wi-Fi networks, respectively
$(\sigma^c_{u,\kappa})^2$, $(\sigma^w_{u,\kappa})^2$ | Noise power of the channel over cellular and Wi-Fi networks, respectively
$d_{u,m}$ | Distance between the UAV $u$ and the server $m$
$l^{tr}_{u,\kappa,m}$ | Task transmission time in the MEC offloading
$l^{exe}_{u,\kappa,m}$ | Task execution time on the server
$e^{mec}_{u,\kappa}$ | Transmission energy consumption in the MEC offloading
$l^{mec}_{u,\kappa}$ | Total time in the MEC offloading
$\hat{e}_u$ | Maximum energy constraint of the UAV $u$
$\hat{l}_{u,\kappa}$, $\hat{b}_{u,\kappa}$, $\hat{p}_{u,\kappa}$ | Tolerable upper bound values for delay, BER, and PLR, respectively
$w_{d,\kappa}$, $w_{e,\kappa}$ | Balance factors for delay and energy consumption, respectively
$\hat{C}_m$ | Computation capacity of the server $m$
$F_u$ | Utility of the UAV $u$
$fuzzy(\cdot)$ | Fuzzy logic processor
$p^m_{u,\kappa}$ | Packet loss rate generated in the data transmission
$b^m_{u,\kappa}$ | Bit error rate generated in the data transmission
$\chi_m$ | Offloading probability
2. Related Work
Effective offloading of computation-intensive application tasks for smart devices, especially
UAVs, is becoming more critical. Accordingly, many studies related to task offloading
have been proposed. In this section, we briefly outline the related work.
Some studies consider centralized controllers to realize offloading decisions. Li et al. [12]
considered that tasks performed in maritime environments have strict delay requirements;
they designed a genetic-based offloading algorithm for energy-starved UAVs, which optimizes
energy consumption under the task delay constraint. Guo et al. [13] studied task offloading in
a MEC system and attempted to minimize the system overhead by expressing the offloading
as a mixed-integer non-linear programming problem, proposing a heuristic algorithm based
on a greedy policy. Zhang et al. [14] integrated latency and energy consumption to obtain
the offloading utility, which was combined with simulated annealing to make offloading
strategies in MEC, so as to enhance the utility. These efforts [12–14] have global coordination
but require UAVs to upload private information related to the tasks executed and real-time
status to enable centralized offloading decision-making. This will significantly increase the
burden on the centralized controller when the scale of the UAV swarm increases.
As network environments become larger and more complex, distributed frameworks
are becoming more popular in some computation-offloading efforts [15,16]. Dai et al. [1]
developed a vehicle-assisted offloading architecture for UAVs in smart cities, where vehicles
and UAVs are matched according to preferences and the offloading process of data is
modeled as part of a bargaining game to enhance the offloading efficiency and optimize the
system utility. Zhou et al. [17] modeled the interaction in offloading as part of a Stackelberg
game and maximized the utility of the system. To select a suitable service provider for
the offloaded task of UAV, Gu et al. [18] devised an evolutionary game-based offloading
approach to make a trade-off between latency, energy, and cost. However, these methods
require multiple interactions and iterations of all participants to reach a satisfactory optimal
solution, and they are not always suitable for making real-time decisions due to the fact that
UAVs have good maneuverability, which can lead to rapid changes in environmental states.
Swarm intelligence is a popular approach used in multi-UAV systems and can enable
global behavior to emerge from UAV clusters through operations such as interactions. As a
result, swarm intelligence algorithms have received more attention in the implementation
of UAV offloading. You et al. [19] introduced a computation-offloading scheme based
on particle swarm optimization, which can offload tasks to low-latency MEC servers and
balance the load on the servers. Li et al. [20] constructed an offloading model, which
aims to minimize the delay of whole UAVs under the constraint of consumed energy;
they applied the bat algorithm to solve the model. In [21], Asaamoning et al. researched
computing offloading in a networked control system consisting of UAVs and discussed
the application of swarm intelligence approaches, such as ant colony optimization and bee
colony optimization. In addition, these swarm intelligence approaches can help determine
the optimal positions of drone base stations, which can provide support for drones to act as
base stations in the next generation of the Internet of Things [22,23].
Some studies on computation offloading in MEC tend to leverage reinforcement
learning because of its strength in adapting to dynamic environments. Chen et al. [24]
constructed task-offloading architecture based on deep deterministic policy gradients to
optimize the offloading performance. Different from these schemes [24–26], our work
devises a distributed decision-making mechanism by leveraging the multi-agent DRL
framework, which can collaboratively deal with the optimization of offloading policies
for multi-UAVs with heterogeneous tasks. Although there are distributed approaches
for computation offloading decision-making that apply reinforcement learning, such
as Q-TOMEC [27], TORA [28], and a distributed offloading technique based on deep Q-
learning [29], these approaches utilize parallel deep neural networks instead of considering
collaboration among agents.
This paper considers the popular 5G heterogeneous network architecture rather than
a single network, as considered in many papers [8,9,30,31]. Correspondingly, it is impor-
tant to evaluate and choose the appropriate offloading link among many heterogeneous
networks for UAVs with good maneuverability. This is because an improper offloading
selection may lead to frequent service interruptions, network hand-offs, and transmission
link failures. However, existing offloading schemes are only concerned with task deadlines,
energy consumed, or a balance between the two. In order to make the offloading scheme
effective, we propose an offloading assessment mechanism that jointly considers the effects
of transmission quality and UAV mobility to ensure efficient data transmission. Further-
more, we designed the mechanism to be fully decentralized on the UAV side, so that the
mechanism has great scalability. To our knowledge, this is the first attempt to research the
link evaluation during the task offloading of a UAV swarm.
function to evaluate the critical attributes that can affect the decision-making of UAVs
(Section 3.3). Finally, according to the system model and task computing models, we define
the optimization problem to be solved in this paper (Section 3.4).
[Figure: system model with UAVs, Wi-Fi APs, and cellular BSs.]
Furthermore, each UAV $u \in \mathcal{U}$ has a task to process at a certain time. We use a tuple
$\kappa_u = \{d_{u,\kappa}, c_{u,\kappa}\}$ to express the task of UAV $u$, where the data size of the task $\kappa_u$ is indicated by $d_{u,\kappa}$
and the total computation resources required to complete $\kappa_u$ are denoted by $c_{u,\kappa}$. Moreover,
the application tasks place tight requirements on quality of service (QoS) attributes,
such as delay, BER, and PLR, when they are executed.
In view of the above, each UAV in the system can perform its task $\kappa_u$ either by computing
locally or by offloading to a MEC server through a cellular BS or a Wi-Fi AP. Correspondingly,
when each UAV $u$ performs its task $\kappa_u$, two binary variables ($\alpha^1_{u,\kappa}$ and $\alpha^2_{u,\kappa}$) are used to
characterize the decisions made by the UAV $u$; they are defined as follows.
$$
\alpha^1_{u,\kappa} =
\begin{cases}
0, & \text{local computing} \\
1, & \kappa_u \text{ is offloaded}
\end{cases}
\tag{1}
$$
$$
\alpha^2_{u,\kappa} =
\begin{cases}
0, & \kappa_u \text{ is offloaded to a BS server} \\
1, & \kappa_u \text{ is offloaded to a Wi-Fi server}
\end{cases}
\tag{2}
$$
in which $\alpha^1_{u,\kappa}$ expresses whether the task $\kappa_u$ is computed locally or offloaded. In the second decision,
$\alpha^2_{u,\kappa} = 0$ or 1 means that task $\kappa_u$ is offloaded to a MEC server at a cellular BS or
a Wi-Fi AP, respectively, which occurs only when $\alpha^1_{u,\kappa} = 1$. These different task computing modes
enable the UAVs to execute tasks efficiently and obtain good service performance.
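The two binary variables in Equations (1) and (2) jointly select one of three computing modes. A minimal Python sketch of that mapping follows; the function name and the mode labels are hypothetical and chosen only for illustration.

```python
def computing_mode(alpha1, alpha2=0):
    """Map the binary decisions of Eqs. (1) and (2) to a computing mode label."""
    if alpha1 not in (0, 1) or alpha2 not in (0, 1):
        raise ValueError("alpha1 and alpha2 must be 0 or 1")
    if alpha1 == 0:
        # alpha2 only takes effect when the task is offloaded (alpha1 = 1).
        return "local"
    return "wifi" if alpha2 == 1 else "cellular"


modes = [computing_mode(0), computing_mode(1, 0), computing_mode(1, 1)]
```

The list `modes` then contains the three possible outcomes in the order local, cellular, Wi-Fi.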
where λu indicates the computational capability of the UAV u and cu,κ denotes the needed
CPU amount to complete the task κ.
Let $e^{loc}_{u,\kappa}$ denote the energy consumed by local computing, which is represented by
$$
e^{loc}_{u,\kappa} = \rho^e_u\, c_{u,\kappa}
\tag{4}
$$
where $\rho^e_u$ is the local energy consumption coefficient per CPU cycle.
(2) MEC offloading model
In this model, we consider that more than one UAV may offload tasks to
the same MEC server in the same time period. In this case, if the UAV $u$ performs the task
$\kappa$ by MEC, the achieved data transmission rates via the cellular and Wi-Fi networks are
denoted by $\xi^c_{u,\kappa}$ and $\xi^w_{u,\kappa}$, which are given in Equations (5) and (6), respectively [32].
$$
\xi^c_{u,\kappa} = B^c_u \cdot \log_2\!\left(1 + \frac{P^c_{u,\kappa}\, G^c_{u,\kappa}}{(\sigma^c_{u,\kappa})^2 + \sum_{u' \neq u} P^c_{u',\kappa'}\, G^c_{u',\kappa'}}\right)
\tag{5}
$$
$$
\xi^w_{u,\kappa} = B^w_u \cdot \log_2\!\left(1 + \frac{P^w_{u,\kappa}\, G^w_{u,\kappa}}{(\sigma^w_{u,\kappa})^2 + \sum_{u' \neq u} P^w_{u',\kappa'}\, G^w_{u',\kappa'}}\right)
\tag{6}
$$
In Equation (5), $P^c_{u,\kappa}$ is the transmission power of the UAV $u$ for offloading the task
to the MEC server $m$ via cellular connectivity; $G^c_{u,\kappa} = d_{u,m}^{-\iota}$ is the channel gain due to the
path loss effect and shadowing, where $\iota$ denotes the path loss coefficient and $d_{u,m}$ is the distance
between the UAV $u$ and the server $m$, with $d_{u,m} = \sqrt{(dv_{u,m})^2 + (dh_{u,m})^2}$, where $dv_{u,m}$ and
$dh_{u,m}$ indicate the vertical and horizontal distances between the UAV $u$ and
the server $m$, respectively; $(\sigma^c_{u,\kappa})^2$ denotes the noise power of the channel; $u'$ denotes the other UAVs that
access the server $m$ to process their tasks $\kappa'$; and $B^c_u$ expresses the allocated bandwidth from
the cellular network. The variables in Equation (6) have the same meanings as
those in Equation (5), with the Wi-Fi network in place of the cellular network.
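The rate model of Equations (5) and (6) can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code: the function names and the representation of interferers as (power, gain) pairs are assumptions, and the distance is taken as the Euclidean combination of the vertical and horizontal distances.

```python
import math


def channel_gain(d_vertical, d_horizontal, path_loss_coeff):
    """Channel gain G = d^(-iota), with d built from vertical and horizontal distances."""
    distance = math.hypot(d_vertical, d_horizontal)
    if distance <= 0:
        raise ValueError("distance must be positive")
    return distance ** (-path_loss_coeff)


def transmission_rate(bandwidth, power, gain, noise_power, interferers=()):
    """Achievable rate in the form of Eq. (5) or (6).

    interferers: iterable of (power, gain) pairs for the other UAVs u' that
    use the same server (hypothetical data layout).
    """
    interference = sum(p * g for p, g in interferers)
    sinr = power * gain / (noise_power + interference)
    return bandwidth * math.log2(1.0 + sinr)


# Hypothetical numbers, for illustration only.
gain = channel_gain(d_vertical=3.0, d_horizontal=4.0, path_loss_coeff=2.0)
rate_alone = transmission_rate(1e6, power=1.0, gain=1.0, noise_power=1.0)
rate_shared = transmission_rate(1e6, power=1.0, gain=1.0, noise_power=1.0,
                                interferers=[(1.0, 2.0)])
```

Adding an interferer lowers the rate, which is why several UAVs sharing one server in the same period matters for the offloading decision.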
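Then the transmission time $l^{tr}_{u,\kappa,m}$ of the task data for the UAV $u$ can be represented as
$$
l^{tr}_{u,\kappa,m} =
\begin{cases}
l^c_{u,\kappa} = \dfrac{d_{u,\kappa}}{\xi^c_{u,\kappa}}, & \alpha^1_{u,\kappa} = 1,\ \alpha^2_{u,\kappa} = 0 \\[2ex]
l^w_{u,\kappa} = \dfrac{d_{u,\kappa}}{\xi^w_{u,\kappa}}, & \alpha^1_{u,\kappa} = 1,\ \alpha^2_{u,\kappa} = 1
\end{cases}
\tag{7}
$$
Here, $l^{tr}_{u,\kappa,m}$ is a general variable; it denotes the transmission time $l^c_{u,\kappa}$ or $l^w_{u,\kappa}$ incurred in
the data transmission through the cellular or Wi-Fi network, respectively.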
Accordingly, if the transmission power of the UAV $u$ is indicated by $P_u$, with
$P_u \in \{P^c_{u,\kappa}, P^w_{u,\kappa}\}$, the energy $e^{mec}_{u,\kappa}$ consumed by the UAV $u$ during data transmission can be
written as
$$
e^{mec}_{u,\kappa} = P_u\, l^{tr}_{u,\kappa,m}
\tag{8}
$$
After the task data are transmitted to the server $m$, similar to the local computing
model, the data processing time $l^{exe}_{u,\kappa,m}$ on the server $m$ can be represented by
$$
l^{exe}_{u,\kappa,m} = \frac{c_{u,\kappa}}{\lambda_m}
\tag{9}
$$
in which $\lambda_m$ denotes the computational capability of the server $m$. Thereby, the total time
consumed by the UAV $u$ during offloading is expressed as
$$
l^{mec}_{u,\kappa} = l^{tr}_{u,\kappa,m} + l^{exe}_{u,\kappa,m}
\tag{10}
$$
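The delay and energy terms above can be collected in a short Python sketch. The function names are hypothetical; the local delay is written as $c_{u,\kappa} / \lambda_u$, which is an assumption made by analogy with Equation (9), since the local delay equation is not visible in this excerpt.

```python
def local_cost(cycles, uav_capability, energy_per_cycle):
    """Return (delay, energy) for local computing."""
    delay = cycles / uav_capability        # assumed form, by analogy with Eq. (9)
    energy = energy_per_cycle * cycles     # Eq. (4)
    return delay, energy


def mec_cost(data_size, cycles, rate, tx_power, server_capability):
    """Return (total delay, transmission energy) for MEC offloading."""
    t_tr = data_size / rate                # Eq. (7): transmission time
    energy = tx_power * t_tr               # Eq. (8): transmission energy
    t_exe = cycles / server_capability     # Eq. (9): execution time on the server
    return t_tr + t_exe, energy            # Eq. (10): total time


# Hypothetical task: 8 Mbit of data, 1e9 CPU cycles.
local_delay, local_energy = local_cost(1e9, uav_capability=1e9, energy_per_cycle=1e-9)
mec_delay, mec_energy = mec_cost(8e6, 1e9, rate=4e6, tx_power=0.5, server_capability=5e9)
```

With these made-up numbers local computing is faster (1.0 s against 2.2 s), which shows that offloading is not automatically the better choice when the link is slow.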
in which wd,κ and we,κ characterize the balance factors between the consumed time and
energy; hence, wd,κ + we,κ = 1.
of the tasks performed. In Equation (13), $A_u$ denotes the decision set of each UAV. C1
indicates the offloading decision constraint, and $C_m(t) = \sum c_{u,\kappa}$ expresses the computation
resources of the server $m$ occupied by UAVs; therefore, C2 concerns whether server $m \in \mathcal{M}$
is selected to provide service: the computation resources used by UAVs cannot exceed
the computation capacity of the server $m$ at a certain time slot $t$. C3 indicates the battery
energy constraint of a UAV $u \in \mathcal{U}$; C4 denotes that the time consumed when executing
task $\kappa_u$ should be kept within the allowable delay threshold $\hat{l}_{u,\kappa}$. C5 and C6 require
that the PLR and BER occurring in the data transmission satisfy the tolerable
upper bound values for a certain task $\kappa_u$ if it is processed by the MEC.
The optimization problem is an integer-programming problem with $(M + 1)^U$ feasible
decisions for task computation; problems of this kind are commonly non-convex and NP-hard.
Conventional mathematical optimization approaches can work out the optimal
solution for the proposed problem in theory but are unable to do so in a short time.
The DRL approach is effective for decision-making problems with
high-dimensional solution spaces, especially given the increasing number of offloaded
tasks in future wireless networks. In view of the above, we will develop a multi-agent
A2C-based DRL scheme, which can find feasible offloading actions in polynomial time.
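The growth of the decision space can be made concrete with a few lines of Python. The sketch below simply enumerates joint decisions for a toy instance; the function name and the encoding (0 for local computing, m ≥ 1 for server m) are assumptions introduced here for illustration.

```python
from itertools import product


def joint_decisions(num_uavs, num_servers):
    """Yield every joint decision: each UAV picks local (0) or one of M servers."""
    return product(range(num_servers + 1), repeat=num_uavs)


# Toy instance: U = 3 UAVs and M = 2 servers.
count = sum(1 for _ in joint_decisions(num_uavs=3, num_servers=2))
```

Even this toy instance has 27 joint decisions, and the count multiplies by M + 1 with every additional UAV, which is why exhaustive search does not scale.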
In step 3 of Algorithm 1, each UAV senses nearby offloading nodes, observes its flying
velocity, and perceives the PLR and BER with respect to candidate offloading
targets, assuming that the task of UAV $u$ will be offloaded to the node.
The PLR occurring in data transmission can be commonly evaluated by
$$
p^m_{u,\kappa} = \varsigma \cdot d_{u,\kappa} \cdot \exp\!\left(-\vartheta \cdot \frac{P_{rss}}{\sigma^2 \cdot d_{u,m}^2}\right)
\tag{14}
$$
In particular, in step 4 of the algorithm, the obtained PLR and BER are normalized
as $p^m_{u,\kappa} / \hat{p}_{u,\kappa}$ and $b^m_{u,\kappa} / \hat{b}_{u,\kappa}$ so as to eliminate the unit difference before they are
input into the fuzzy logic system. In step 5, the designed fuzzy logic system for offloading assessment
maps the sensed data, including velocity, PLR, and BER, into fuzzy sets according to the
membership functions (MFs) of each input; this process is called fuzzification. Afterward,
the fuzzy inference procedure processes the fuzzified inputs and produces a fuzzy output based
on multiple IF-AND-THEN rules, which are designed empirically.
Based on the triggered fuzzy rules, the fuzzy logic system then proceeds to the
defuzzification stage, which calculates and outputs a scalar value $\chi_m$ for the node $m \in \mathcal{M}$
by applying the centroid defuzzifier method [33]. Here $\chi_m \in [0, 1]$ characterizes
the fitness of the offloading node for the task $\kappa_u$ of the UAV $u$; the higher the $\chi_m$, the better
the fitness. Finally, in steps 6 and 7, the obtained $\chi_m$ is compared with its permitted upper
bound $\hat{\chi}_m$; if the condition is satisfied, the node $m \in \mathcal{M}$ will be selected by the UAV $u$ as
an available offloading target and be included in $\tilde{\mathcal{M}}$.
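The fuzzification, inference, and centroid defuzzification stages described above can be sketched in plain Python. This is a toy Mamdani-style system under loudly stated assumptions, not the paper's design: the two linear membership functions, the two rules, the normalization of velocity to [0, 1], and the function names are all hypothetical, since the paper's membership functions and rule base are not shown in this excerpt.

```python
def clamp01(x):
    return max(0.0, min(1.0, x))


def low(x):
    """Membership of x in the fuzzy set 'low' (assumed linear MF)."""
    return 1.0 - clamp01(x)


def high(x):
    """Membership of x in the fuzzy set 'high' (assumed linear MF)."""
    return clamp01(x)


def assess_node(velocity_ratio, plr_ratio, ber_ratio, samples=101):
    """Return a fitness score in [0, 1] for one candidate offloading node.

    Inputs are assumed to be normalized ratios, e.g. PLR divided by its bound.
    """
    # Fuzzification and inference with two hypothetical rules:
    # R1: IF velocity low AND PLR low AND BER low THEN fitness high.
    # R2: IF velocity high OR PLR high OR BER high THEN fitness low.
    fire_r1 = min(low(velocity_ratio), low(plr_ratio), low(ber_ratio))
    fire_r2 = max(high(velocity_ratio), high(plr_ratio), high(ber_ratio))

    # Centroid defuzzification over a uniform grid of output values.
    numerator = 0.0
    denominator = 0.0
    for k in range(samples):
        y = k / (samples - 1)
        mu = max(min(fire_r1, high(y)), min(fire_r2, low(y)))
        numerator += y * mu
        denominator += mu
    return numerator / denominator if denominator > 0 else 0.5


good_link = assess_node(0.0, 0.0, 0.0)   # slow UAV, clean link
bad_link = assess_node(1.0, 1.0, 1.0)    # fast UAV, PLR and BER at their bounds
```

With these assumed rules a slow UAV on a clean link scores higher than a fast UAV on a lossy link, which is the qualitative behavior the assessment mechanism is meant to provide.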
task; (4) SR = { Iu,1 , . . . , Iu,M }: the signal-to-noise ratio vector between the UAV u and its
available offloading nodes in M̃; (5) Dist = {du,1 , . . . , du,M }: the distance vector between
the UAV u and its available offloading nodes in M̃.
(2) Action space $A_u$. The action taken in the time slot $t$ for each UAV $u$ is to decide
whether the task should be performed locally or offloaded to a MEC server, and if offloaded,
which server will be selected. Thus, according to the definitions of the task computing
decisions in Equations (1) and (2) in the system model, the action set for each UAV can be
represented as $A_u = \{a_u^1, a_u^2, a_u^3\}$, in which $a_u^1$ indicates $\alpha^1_{u,\kappa} = 0$, $a_u^2$ denotes $\alpha^1_{u,\kappa} = 1$ and
$\alpha^2_{u,\kappa} = 0$, and $a_u^3$ expresses $\alpha^1_{u,\kappa} = 1$ and $\alpha^2_{u,\kappa} = 1$.
(3) Reward function $R_u$. At state $s^t$, each UAV chooses an action and receives an
instant reward $r_u^t$ from the environment. The purpose of each agent is to
maximize its utility by improving its task computing policy. For this reason, we
define the reward $r_u^t$ according to the performance improvement between the two utility values obtained
by the UAV in two consecutive time slots; $r_u^t$ is written as
$$r_u^t = \begin{cases} \varepsilon_1, & F_u^t - F_u^{t-1} > \beta \\ \varepsilon_2, & F_u^t - F_u^{t-1} < -\beta \\ 0, & \text{otherwise,} \end{cases} \tag{17}$$
where $F_u^t$ refers to the utility of the UAV u for processing tasks at time slot t, $\varepsilon_1 > 0$,
and $\varepsilon_2 < 0$; $\varepsilon_1$ and $\varepsilon_2$ denote the instant rewards obtained in the two
situations $F_u^t - F_u^{t-1} > \beta$ and $F_u^t - F_u^{t-1} < -\beta$, respectively. Moreover, $\beta > 0$ denotes the
sensitivity of the UAV to utility changes in the MDP. Therefore, the reward $r_u^t$ effectively
characterizes the direction of change between the utilities of two consecutive time
slots. Furthermore, the reward function $R_u$ can be presented as $R_u(s, a) = \mathbb{E}[r_u^{t+1} \mid s^t, a^t]$,
which is the expected value of the instant reward.
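As a small illustration of the thresholded reward in Equation (17), the following Python sketch maps two consecutive utility values to an instant reward. The function name and the default values of β, ε1, and ε2 are placeholders chosen here, not values taken from the paper.

```python
def instant_reward(f_now, f_prev, beta=0.05, eps_pos=1.0, eps_neg=-1.0):
    """Instant reward of Eq. (17); beta, eps_pos, and eps_neg are assumed values."""
    diff = f_now - f_prev
    if diff > beta:       # utility improved by more than the sensitivity beta
        return eps_pos
    if diff < -beta:      # utility dropped by more than beta
        return eps_neg
    return 0.0            # change too small to count in either direction
```

With these placeholder settings, a utility change smaller than β in magnitude yields a zero reward, so small fluctuations do not push the policy in either direction.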
Therefore, in the multi-agent MDP model, at the current time slot t, if the state is $s^t \in \mathcal{S}$
and the joint action of the agents in the system is denoted as $a^t = \{a_1, \ldots, a_U\} \in \mathcal{A}$,
each agent $u \in \mathcal{U}$ can obtain a reward $r_u^{t+1}$. Then the state transitions to a new
state $s^{t+1} \in \mathcal{S}$ according to the transition probability $P(s^{t+1} \mid s^t, a^t)$. Additionally, the policy
of agent u is defined as the probability that the agent selects an action in a given state,
which can be expressed as $\pi_u(s, a_u)$. Then the joint policy of all agents can be formulated
as $\pi(s, a) = \prod_{u=1}^{U} \pi_u(s, a_u)$, and $\pi(s, a)$ is written as $\pi$ for simplicity.
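The factorized joint policy can be checked numerically with a short Python sketch. Here each agent's policy at a fixed state is given as a list of probabilities over its three actions, and the indices 0, 1, 2 are an assumed encoding of the three actions in the action set; the function name is hypothetical.

```python
import math
from itertools import product


def joint_policy_prob(policies, joint_action):
    """Probability of a joint action as the product of per-agent probabilities."""
    return math.prod(p[a] for p, a in zip(policies, joint_action))


# Two agents at the same state, three actions each (illustrative numbers only).
policies = [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3]]
total = sum(joint_policy_prob(policies, a) for a in product(range(3), repeat=2))
```

Because each per-agent policy sums to one, the joint probabilities over all joint actions also sum to one, which is what `total` verifies up to rounding.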
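$$\begin{aligned} J(\pi) &= \lim_{T \to \infty} \mathbb{E}\Big[\frac{1}{T} \sum_{t=1}^{T} \frac{1}{U} \sum_{u \in \mathcal{U}} r_u^{t+1}\Big] \\ &= \sum_{s \in \mathcal{S}} \eta_{\pi}(s) \sum_{a \in \mathcal{A}} \pi(s, a)\, \frac{1}{U} \sum_{u \in \mathcal{U}} R_u(s, a) \end{aligned} \tag{19}$$
where $\eta_{\pi}(s) = \lim_{t \to \infty} \Pr(s^t = s \mid \pi)$ denotes the stationary probability distribution in the
Markov chain when the policy π is given.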
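In practice, the long-run average reward can only be estimated from a finite trajectory. The Python sketch below computes such an empirical estimate by averaging over agents within each slot and then over slots; the function name and the sample rewards are illustrative assumptions, and the estimate approaches the limit above only under the usual ergodicity assumption.

```python
def average_reward(reward_history):
    """Empirical average reward; reward_history[t][u] is agent u's reward at slot t."""
    per_slot = [sum(slot) / len(slot) for slot in reward_history]  # (1/U) * sum over agents
    return sum(per_slot) / len(per_slot)                           # (1/T) * sum over slots


# Four time slots, two agents (illustrative rewards only).
history = [[1.0, -1.0], [1.0, 1.0], [0.0, 0.0], [1.0, 0.0]]
estimate = average_reward(history)
```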
Furthermore, our optimization problem aims to work out
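$$\max J(\theta) = \sum_{s \in \mathcal{S}} \eta_{\theta}(s) \sum_{a \in \mathcal{A}} \pi(a \mid s; \theta)\, \frac{1}{U} \sum_{u \in \mathcal{U}} R_u(s, a) \tag{20}$$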
where θ will be learned by the policy gradient method [34]. Moreover, based on the
objective function, the gradients for θ can be calculated as
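$$\nabla_{\theta_u} J(\theta) = \mathbb{E}\big[\nabla_{\theta_u} \log \pi_u^{\theta_u}\, \zeta_u^{\pi^{\theta}}(s, a)\big] \tag{21}$$
in which $\zeta_u^{\pi^{\theta}}(s, a)$ is the advantage function, represented by
$$\zeta_u^{\pi^{\theta}}(s, a) = Q^{\pi^{\theta}}(s, a) - V_u^{\pi^{\theta}}(s, a_{-u}) \tag{22}$$
where we use $a_{-u}$ to represent the actions adopted by the other agents, except for agent u; $Q^{\pi^{\theta}}$
indicates the action value function under the policy $\pi^{\theta}$ for a given state–action pair (s, a),
while $V_u^{\pi^{\theta}}$ is the state value function. They are given as follows:
$$Q^{\pi^{\theta}}(s, a) = \sum_{t} \mathbb{E}\Big[\frac{1}{U} \sum_{u \in \mathcal{U}} r_u^{t+1} - J(\theta) \,\Big|\, s^0 = s,\ a^0 = a,\ \pi^{\theta}\Big] \tag{23}$$
$$V_u^{\pi^{\theta}}(s, a_{-u}) = \sum_{a_u \in \mathcal{A}_u} \pi_u(a_u \mid s; \theta_u)\, Q^{\pi^{\theta}}(s, a_u, a_{-u}) \tag{24}$$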
A2C takes the temporal difference (TD) error as an unbiased estimate of the
advantage function, which reduces the complexity of the parameter update and improves
the stability of the algorithm. In this case, the advantage is approximated as
$$\zeta(s^t, a^t) \approx \frac{1}{U} r^{t+1} + \gamma V(s^{t+1} \mid s^t, a^t) - V(s^t) = \delta(s^t) \tag{25}$$
in which γ indicates the discount factor.
The critic network estimates $Q(s, a)$ with $Q^{\pi^{\theta}}(s, a)$ and generates a TD error to express
whether the action taken by the agent is good or not; it also updates its DNN parameter
$\theta^c$ with the gradient descent method. Additionally, each UAV u can share estimations from
the critic network with other UAVs nearby to effectively evaluate the actions. The
output of the critic network is then used to update the parameter $\theta^a$ of the actor network
of the UAV agent u, which aims at increasing the probabilities of actions that perform
relatively well. In particular, the updates to $\theta^a$ and $\theta^c$ can be presented as
$$\theta^a \leftarrow \theta^a + \frac{\partial \log \pi(a^t \mid s^t; \theta^a)}{\partial \theta^a}\, \delta^t(s^t; \theta^c) \tag{26}$$
$$\theta^c \leftarrow \theta^c + \delta^t(s^t; \theta^c)\, \frac{\partial V(s^t; \theta^c)}{\partial \theta^c} \tag{27}$$
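To make the TD-error-driven updates concrete, the following Python sketch applies one step in the form of Equations (26) and (27) to a deliberately simple model: a linear critic V(s) = θ^c · s and a softmax actor with one weight row per action. These model choices, the function names, and the explicit step sizes `lr_a` and `lr_c` (which the equations as printed do not show) are assumptions made for illustration; the paper uses DNNs for both networks.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def a2c_step(theta_a, theta_c, s, a, r_mean, s_next, gamma=0.9, lr_a=0.01, lr_c=0.01):
    """One actor and critic update driven by the TD error delta."""
    # TD error: averaged reward + discounted next value - current value.
    delta = r_mean + gamma * dot(theta_c, s_next) - dot(theta_c, s)
    probs = softmax([dot(row, s) for row in theta_a])
    # For a softmax actor, d log pi(a|s) / d theta_k = (1[k == a] - pi_k) * s.
    new_theta_a = [
        [w + lr_a * delta * ((1.0 if k == a else 0.0) - probs[k]) * x
         for w, x in zip(row, s)]
        for k, row in enumerate(theta_a)
    ]
    # For a linear critic, dV(s) / d theta_c = s.
    new_theta_c = [w + lr_c * delta * x for w, x in zip(theta_c, s)]
    return new_theta_a, new_theta_c, delta
```

A positive TD error raises the probability of the action that was taken and a negative one lowers it, which matches the stated aim of favoring actions that perform relatively well.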
At the initial stage, we give the related parameters, including the sets of UAVs and
MEC servers, i.e., U and M, the learning rates $lr_a$ and $lr_c$ of the actor and critic networks, the maximum
number of training episodes $Ep_{max}$, the step size $Ep_i$ of one episode, the update interval
Δt, and the discount factor γ. For each UAV u ∈ U , we initialize the actor parameter
$\theta_u^a$ and the critic parameter $\theta_u^c$. Afterward, at the start of each episode in the training stage,
the system state is randomly initialized, including the locations of UAVs, the relevant
information of tasks, and the situation of the MEC environment; each UAV executes
Algorithm 1 to obtain the available offloading node set M̃, and then the initial state $s^0$ is
obtained (steps 5 to 8).
Without loss of generality, one training episode is divided into $Ep_i$ time slots. At time
slot t, each UAV adopts an action according to the policy $\pi_u(a_u^t \mid s^t; \theta_u^a)$ in the actor, then
performs computation offloading according to the adopted action $a_u \in a^t$, and obtains the
instant reward $r_u \in r^t$; next, the state is updated to $s^{t+1}$ (steps 11 to 15). Finally, once
every Δt, the algorithm updates the parameters of the actor and critic networks by
sampling only $(s^{t+1}, a^t, s^t)$ (steps 16 to 19). In order for the average reward to
converge to a stable value and for the optimal policy to be learned, the iterative training lasts for
$Ep_{max}$ episodes. After convergence, the algorithm only needs to save the actor network to
make offloading decisions for UAVs.
The computational cost of Algorithm 1 comes from each UAV exploring its available offloading
nodes, and the complexity of the designed fuzzy logic module is constant;
thus, the complexity of Algorithm 1 is O( M) in the worst case. In Algorithm 2, at the
training stage, each UAV agent evaluates the Q-value with the critic network by inputting
the joint actions of the UAVs and the environment state; thus, the input and output sizes of the
critic network in a UAV are U |S| and 1, respectively. Moreover, each UAV selects an action
by feeding the current state into the actor network; thereby, the input and output sizes of
the actor network in a UAV are |S| and 1, respectively. After training is finished, the action
for each UAV can be obtained from its actor network alone, with an input size of |S| and an
output size of 1. The computational complexity is proportional to the input and output sizes;
thus, the overall complexity of the proposed DOMUS is O( M + U |S|) .
6. Performance Evaluation
In this section, we perform a series of numerical simulations and evaluate the proposed
task-offloading scheme for the UAV swarm in MEC-assisted heterogeneous networks.
[Figure: two panels, (a) and (b); legend entries: UAVs; x-axis: Episode.]
[Figure 4: (a) overall energy consumption (J) versus average data size (MB); (b) overall delay (s) versus average data size (MB); one curve per pair of weighting factors (we, wd).]
Figure 4. Energy consumption and delay comparison in DOMUS under different weighting factors (a,b).
From Figure 4b, we can see that the delay incurred by performing tasks shows a
linearly increasing trend as the average data size increases. However, as the delay
weighting factor grows larger, less delay is required to complete the UAV tasks under the
same average data size. Therefore, the comparisons depicted in Figure 4a,b are consistent
with the theoretical expectation that both energy consumption and delay show noticeable
differences under different weighting factors when processing UAV tasks.
Distance-dependent offloading scheme (DDO); (4) Smart ant colony optimization task-
offloading algorithm (SACO) [31].
(1) Impact of the number of UAVs. First, we set the weighting factors as wd,κ = we,κ =
0.5 and evaluated the performance of the proposed DOMUS under different numbers of
UAVs. Figure 5a compares the overall energy consumption of all algorithms as
the number of UAVs increases. As shown, DOMUS achieves lower energy consumption
than the STCO and SACO schemes. By relying on the multi-agent DRL model,
DOMUS can learn the distribution of computation tasks and enable multiple UAVs to
almost always select proper offloading targets as the number of UAVs increases.
STCO ignores future offloading decisions, resulting in decisions that are suboptimal from a
long-term perspective. The SACO algorithm may become trapped in local optima
due to the pheromone feedback from suboptimal solutions obtained in early iterations. In
Figure 5b, we plot the overall delay under different numbers of UAVs, which shows the same
trend as Figure 5a; the delay curves of the other offloading schemes increase significantly
compared with that of DOMUS. Because DOMUS can select actions that do not yield the minimum
delay for the current task of the UAV, it optimizes long-term performance. To verify this,
as shown in Figure 5c, we also recorded the average utility of UAVs as the number of
UAVs increased. Combined with Figure 5a,b, our proposed DOMUS achieves lower
energy consumption and lower delay than the STCO offloading approach, with
minimum improvements of 8.29% and 7.75%, respectively. Accordingly, the average utility
achieved by DOMUS is the highest among the different algorithms, with an improvement
of up to 12.82%.
[Figure 5: (a) overall energy consumption (J), (b) overall delay (s), and (c) average utility of UAVs, each versus the number of UAVs; schemes: DOMUS, STCO, IWPSO, DDO, SACO.]
Figure 5. Energy consumption, delay, and utility comparison under different UAV numbers (a–c).
(2) Impact of data size. We set the number of UAVs to six and then investigated the
energy consumption, delay, and average utility required to complete tasks for UAVs with
different average task data sizes.
Figure 6 shows the impact of the data size on energy consumption. Transmitting
larger amounts of data requires more communication resources, leading to increased
communication delays. In this case, as the data size increases, UAVs consume more energy
for data transmission. Moreover, SACO’s energy consumption performance deteriorates as
the data size increases, primarily due to the gradually growing tabu lists in the SACO
algorithm, which restrict the UAV selection. Our proposed DOMUS explores policy learning
effectively, resulting in reduced energy consumption. In general, DOMUS
reduces energy consumption by up to 13.13% compared to other schemes.
[Figure 6: overall energy consumption (J) versus average data size (MB); schemes: DOMUS, STCO, IWPSO, DDO, SACO.]
Figure 7 compares the impact of average data size on task completion delay for UAVs.
As the data size increases, the delay to complete tasks also increases; when the data size of
the task is large, computing the task locally becomes necessary, which reduces the transmission
delay but correspondingly increases the execution delay. Since the DNN can offer accurate
regression, in the proposed DOMUS, DNNs are used in both the actor and the critic to
approximate the offloading policy and the value function interactively, which enables each
UAV agent to select appropriate task-processing strategies. In contrast, when adopting
the STCO scheme, optimal task-processing strategies cannot always be derived. This is
because STCO may not effectively take into account the optimization of the subsequent
execution of tasks. In summary, DOMUS reduces the delay by up to 6.77%
compared to other schemes.
[Figure 7: overall delay (s) versus average data size (MB); schemes: DOMUS, STCO, IWPSO, DDO, SACO.]
Figure 8 shows the average utility of UAVs for task processing with varying data sizes.
The figure shows that our DOMUS mechanism outperforms other schemes in terms of
utility, especially for large data sizes. This is because our proposed DOMUS maximizes the
utility of task processing for each UAV agent by learning the offloading policy over long
training episodes. STCO and SACO obtain similar utility, while the IWPSO method only
optimizes the utility from a single UAV perspective, resulting in poor performance, and the
DDO method presents the worst utility. Finally, the average utility of UAVs is improved by
at least 11.39% compared to the benchmark schemes.
[Figure 8: average utility of UAVs versus average data size (MB); schemes: DOMUS, STCO, IWPSO, DDO, SACO.]
[Figure 9: overall energy consumption (J) versus average computational capability of servers (Gcycles/s), grouped by bandwidth B; schemes: DOMUS, STCO, IWPSO, DDO, SACO.]
Figure 9. Energy consumption comparison under different computational capabilities and network
bandwidths.
Figure 10 shows the task completion delay for UAVs with the variation in computational
capability and bandwidth. It is observed that the delay performance changes
similarly to the energy consumption. When more computational resources
and bandwidth are allocated to UAVs, they are attracted to offload tasks, resulting in
reduced delay not only on transmission links but also on servers. However, there still
exists a performance gap between our proposed scheme and the other four benchmarks.
This is because our proposed DOMUS evaluates the quality of the communication link
and offloading nodes for each UAV using the fuzzy logic-based offloading assessment
mechanism, and further uses the A2C model to make offloading decisions by continuously
updating the parameters of the DNNs to enhance the prediction ability of the critic network for
actions. This efficiently facilitates the optimization of offloading decisions. In general, the
overall delay in task processing was reduced by at least 4.89% compared to other offloading
approaches.
[Figure 10: overall delay (s) versus average computational capability of servers (Gcycles/s), grouped by bandwidth B; schemes: DOMUS, STCO, IWPSO, DDO, SACO.]
Figure 10. Delay comparison under different computational capabilities and network bandwidths.
Additionally, Figure 11 shows the variation of the average utility of UAVs as computational
capability and network bandwidth vary. We can see from the figure that the utilities
of the different offloading schemes increase as the computational capability and bandwidth
increase. In particular, our proposed approach outperforms the other approaches and improves
the utility of UAVs by up to 8.14%. This indicates that the proposed DOMUS
can effectively enable multiple UAVs to explore the joint optimal task computing policy under
the guidance of the devised A2C-based DRL offloading framework in a dynamic network
environment.
[Figure 11: average utility of UAVs versus average computational capability of servers (Gcycles/s), grouped by bandwidth B; schemes: DOMUS, STCO, IWPSO, DDO, SACO.]
Figure 11. Average utility comparison under different computational capabilities and network bandwidths.
(4) Impact of transmission power. To further demonstrate the scalability of the proposed
DOMUS scheme, we investigated the impact of varying the transmission power of UAVs
on the overall delay and energy consumption to complete tasks, as shown in Figure 12. The
number of UAVs was set to 6. Generally, increasing the transmission power results in a
higher data transmission rate, which helps to reduce the data transmission delay. Accordingly,
we observe from Figure 12a that all delay curves decrease as the power gradually
increases. Nonetheless, as depicted in Figure 12b, a higher transmission power leads to
higher energy consumption when UAVs transmit task data, due to the linear relationship
between them. Moreover, the proposed DOMUS achieves the lowest delay and energy
consumption among the five offloading approaches, reducing the two metrics by at
least 4.66% and 9.26%, respectively.
[Figure 12: (a) overall delay (s) and (b) overall energy consumption (J), each versus transmission power (W); schemes: DOMUS, STCO, IWPSO, DDO, SACO.]
Figure 12. Delay and energy comparison under different transmission power (a,b).
7. Conclusions
This paper addresses the task offloading of a UAV swarm in MEC-assisted 5G
heterogeneous networks. The objective is to optimize the utility of the multi-UAV system
for task processing and to prevent UAVs from offloading via easily disconnected wireless links
and poorly performing service nodes. We first devise an assessment mechanism to evaluate
the candidate offloading nodes by utilizing fuzzy logic theory. Afterward, considering the
unknown environmental dynamics in heterogeneous networks, we model the optimization
problem as a multi-agent MDP and propose a decentralized task-offloading scheme called
DOMUS using the model-free DRL framework based on multi-agent A2C. In particular,
the simulation results reveal that the proposed DOMUS can achieve effective convergence
as well as reduce the delay and energy consumption under various settings for completing
UAV tasks.
In future work, we will integrate the swarm intelligence approach into the proposed
learning framework to enhance the services of drone base stations with multiple UAVs.
By providing drone base station-enabled MEC architecture and realizing reasonable resource
utilization with more advanced approaches, the much stricter requirements of
next-generation Internet of Things applications on reliable and efficient service performance
will be further satisfied.
Author Contributions: Conceptualization, M.M. and Z.W.; methodology and writing—original draft,
M.M.; writing—review and editing, Z.W. All authors have read and agreed to the published version
of the manuscript.
Funding: This research was funded by the National Key R&D Program of China No. 2020YFA0713504.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Dai, M.; Su, Z.; Xu, Q.; Zhang, N. Vehicle Assisted Computing Offloading for Unmanned Aerial Vehicles in Smart City. IEEE
Trans. Intell. Transp. Syst. 2021, 22, 1932–1944. [CrossRef]
2. Liu, Z.; Wang, X.; Shen, L.; Zhao, S.; Cong, Y.; Li, J.; Yin, D.; Jia, S.; Xiang, X. Mission-Oriented Miniature Fixed-Wing UAV
Swarms: A Multilayered and Distributed Architecture. IEEE Trans. Syst. Man Cybern. Syst. 2022, 52, 1588–1602. [CrossRef]
3. Sigala, A.; Langhals, B. Applications of Unmanned Aerial Systems (UAS): A Delphi Study projecting future UAS missions and
relevant challenges. Drones 2020, 4, 8. [CrossRef]
4. Yan, S.; Hanly, S.V.; Collings, I.B. Optimal Transmit Power and Flying Location for UAV Covert Wireless Communications. IEEE
J. Sel. Areas Commun. 2021, 39, 3321–3333. [CrossRef]
5. Hu, P.; Zhang, R.; Yang, J.; Chen, L. Development Status and Key Technologies of Plant Protection UAVs in China: A Review.
Drones 2022, 6, 354. [CrossRef]
6. Yazid, Y.; Ez-Zazi, I.; Guerrero-González, A.; El Oualkadi, A.; Arioua, M. UAV-enabled mobile edge-computing for IoT based on
AI: A comprehensive review. Drones 2021, 5, 148. [CrossRef]
7. Ma, M.; Zhu, A.; Guo, S.; Yang, Y. Intelligent Network Selection Algorithm for Multiservice Users in 5G Heterogeneous Network
System: Nash Q-Learning Method. IEEE Internet Things J. 2021, 8, 11877–11890. [CrossRef]
8. Zhou, H.; Jiang, K.; Liu, X.; Li, X.; Leung, V.C.M. Deep Reinforcement Learning for Energy-Efficient Computation Offloading in
Mobile-Edge Computing. IEEE Internet Things J. 2022, 9, 1517–1530. [CrossRef]
9. Chinchali, S.; Sharma, A.; Harrison, J.; Elhafsi, A.; Kang, D.; Pergament, E.; Cidon, E.; Katti, S.; Pavone, M. Network offloading
policies for cloud robotics: A learning-based approach. Auton. Robot. 2021, 45, 997–1012. [CrossRef]
10. Zhu, A.; Ma, M.; Guo, S.; Yu, S.; Yi, L. Adaptive Multi-Access Algorithm for Multi-Service Edge Users in 5G Ultra-Dense
Heterogeneous Networks. IEEE Trans. Veh. Technol. 2021, 70, 2807–2821. [CrossRef]
11. Zhang, X.; Cao, Y. Mobile data offloading efficiency: A stochastic analytical view. In Proceedings of the 2018 IEEE International
Conference on Communications Workshops (ICC Workshops), Kansas City, MO, USA, 20–24 May 2018; pp. 1–6.
12. Li, H.; Wu, S.; Jiao, J.; Lin, X.H.; Zhang, N.; Zhang, Q. Energy-Efficient Task Offloading of Edge-Aided Maritime UAV Systems.
IEEE Trans. Veh. Technol. 2023, 72, 1116–1126. [CrossRef]
13. Guo, M.; Huang, X.; Wang, W.; Liang, B.; Yang, Y.; Zhang, L.; Chen, L. Hagp: A heuristic algorithm based on greedy policy for
task offloading with reliability of mds in mec of the industrial internet. Sensors 2021, 21, 3513. [CrossRef]
14. Zhang, D.; Li, X.; Zhang, J.; Zhang, T.; Gong, C. New Method of Task Offloading in Mobile Edge Computing for Vehicles Based
on Simulated Annealing Mechanism. J. Electron. Inf. Technol. 2022, 44, 3220–3230.
15. Huang, J.; Wang, M.; Wu, Y.; Chen, Y.; Shen, X. Distributed Offloading in Overlapping Areas of Mobile-Edge Computing for
Internet of Things. IEEE Internet Things J. 2022, 9, 13837–13847. [CrossRef]
16. Xia, S.; Yao, Z.; Li, Y.; Mao, S. Online Distributed Offloading and Computing Resource Management With Energy Harvesting for
Heterogeneous MEC-Enabled IoT. IEEE Trans. Wirel. Commun. 2021, 20, 6743–6757. [CrossRef]
17. Zhou, H.; Wang, Z.; Cheng, N.; Zeng, D.; Fan, P. Stackelberg-Game-Based Computation Offloading Method in Cloud-Edge
Computing Networks. IEEE Internet Things J. 2022, 9, 16510–16520. [CrossRef]
18. Gu, Q.; Shen, B. An Evolutionary Game Based Computation Offloading for an UAV Network in MEC. In Wireless Algorithms,
Systems, and Applications: Proceedings of the 17th International Conference, WASA 2022, Dalian, China, 24–26 November 2022; Springer:
Cham, Switzerland, 2022; pp. 586–597.
19. You, Q.; Tang, B. Efficient task offloading using particle swarm optimization algorithm in edge computing for industrial internet
of things. J. Cloud Comput. 2021, 10, 41. [CrossRef]
20. Li, F.; He, S.; Liu, M.; Li, N.; Fang, C. Intelligent Computation Offloading Mechanism of UAV in Edge Computing. In Proceedings
of the 2022 2nd International Conference on Frontiers of Electronics, Information and Computation Technologies (ICFEICT),
Wuhan, China, 19–21 August 2022; pp. 451–456.
21. Asaamoning, G.; Mendes, P.; Rosário, D.; Cerqueira, E. Drone swarms as networked control systems by integration of networking
and computing. Sensors 2021, 21, 2642. [CrossRef]
22. Pliatsios, D.; Goudos, S.K.; Lagkas, T.; Argyriou, V.; Boulogeorgos, A.A.A.; Sarigiannidis, P. Drone-base-station for next-generation
internet-of-things: A comparison of swarm intelligence approaches. IEEE Open J. Antennas Propag. 2021, 3, 32–47. [CrossRef]
23. Amponis, G.; Lagkas, T.; Zevgara, M.; Katsikas, G.; Xirofotos, T.; Moscholios, I.; Sarigiannidis, P. Drones in B5G/6G networks as
flying base stations. Drones 2022, 6, 39. [CrossRef]
24. Chen, M.; Wang, T.; Zhang, S.; Liu, A. Deep reinforcement learning for computation offloading in mobile edge computing
environment. Comput. Commun. 2021, 175, 1–12. [CrossRef]
25. Zhang, D.; Cao, L.; Zhu, H.; Zhang, T.; Du, J.; Jiang, K. Task offloading method of edge computing in internet of vehicles based on
deep reinforcement learning. Clust. Comput. 2022, 25, 1175–1187. [CrossRef]
26. Xu, J.; Li, D.; Gu, W.; Chen, Y. Uav-assisted task offloading for iot in smart buildings and environment via deep reinforcement
learning. Build. Environ. 2022, 222, 109218. [CrossRef]
27. Vhora, F.; Gandhi, J.; Gandhi, A. Q-TOMEC: Q-Learning-Based Task Offloading in Mobile Edge Computing. In Proceedings of
the Futuristic Trends in Networks and Computing Technologies: Select Proceedings of Fourth International Conference on FTNCT 2021;
Springer: Singapore, 2022; pp. 39–53.
28. Zhu, D.; Li, T.; Tian, H.; Yang, Y.; Liu, Y.; Liu, H.; Geng, L.; Sun, J. Speed-aware and customized task offloading and resource
allocation in mobile edge computing. IEEE Commun. Lett. 2021, 25, 2683–2687. [CrossRef]
29. Ma, L.; Wang, P.; Du, C.; Li, Y. Energy-Efficient Edge Caching and Task Deployment Algorithm Enabled by Deep Q-Learning for
MEC. Electronics 2022, 11, 4121. [CrossRef]
30. Naouri, A.; Wu, H.; Nouri, N.A.; Dhelim, S.; Ning, H. A novel framework for mobile-edge computing by optimizing task
offloading. IEEE Internet Things J. 2021, 8, 13065–13076. [CrossRef]
31. Kishor, A.; Chakarbarty, C. Task offloading in fog computing for using smart ant colony optimization. Wirel. Pers. Commun. 2022,
127, 1683–1704. [CrossRef]
32. Guo, H.; Liu, J. Collaborative computation offloading for multiaccess edge computing over fiber–wireless networks. IEEE Trans.
Veh. Technol. 2018, 67, 4514–4526. [CrossRef]
33. Pekaslan, D.; Wagner, C.; Garibaldi, J.M. ADONiS-Adaptive Online Nonsingleton Fuzzy Logic Systems. IEEE Trans. Fuzzy Syst.
2020, 28, 2302–2312. [CrossRef]
34. Zhou, W.; Jiang, X.; Luo, Q.; Guo, B.; Sun, X.; Sun, F.; Meng, L. AQROM: A quality of service aware routing optimization
mechanism based on asynchronous advantage actor-critic in software-defined networks. Digit. Commun. Netw. 2022. [CrossRef]
35. Athanasiadou, G.E.; Fytampanis, P.; Zarbouti, D.A.; Tsoulos, G.V.; Gkonis, P.K.; Kaklamani, D.I. Radio network planning towards
5G mmWave standalone small-cell architectures. Electronics 2020, 9, 339. [CrossRef]
36. Garroppo, R.G.; Volpi, M.; Nencioni, G.; Wadatkar, P.V. Experimental Evaluation of Handover Strategies in 5G-MEC Scenario
by using AdvantEDGE. In Proceedings of the 2022 IEEE International Mediterranean Conference on Communications and
Networking (MeditCom), Athens, Greece, 5–8 September 2022; pp. 286–291.
37. Liu, Y.; Dai, H.N.; Wang, Q.; Imran, M.; Guizani, N. Wireless powering Internet of Things with UAVs: Challenges and
opportunities. IEEE Netw. 2022, 36, 146–152. [CrossRef]
38. Feng, W.; Liu, H.; Yao, Y.; Cao, D.; Zhao, M. Latency-aware offloading for mobile edge computing networks. IEEE Commun. Lett.
2021, 25, 2673–2677. [CrossRef]
39. Zhou, H.; Wu, T.; Chen, X.; He, S.; Guo, D.; Wu, J. Reverse auction-based computation offloading and resource allocation in
mobile cloud-edge computing. IEEE Trans. Mob. Comput. 2022, 1–5. [CrossRef]
40. Huang, S.; Zhang, J.; Wu, Y. Altitude Optimization and Task Allocation of UAV-Assisted MEC Communication System. Sensors
2022, 22, 8061. [CrossRef]
41. Zhang, K.; Gui, X.; Ren, D.; Li, D. Energy-Latency Tradeoff for Computation Offloading in UAV-Assisted Multiaccess Edge
Computing System. IEEE Internet Things J. 2021, 8, 6709–6719. [CrossRef]
42. Deng, X.; Sun, Z.; Li, D.; Luo, J.; Wan, S. User-centric computation offloading for edge computing. IEEE Internet Things J. 2021, 8,
12559–12568. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
drones
Article
UAV Deployment Optimization for Secure Precise
Wireless Transmission
Tong Shen 1, *, Guiyang Xia 1 , Jingjing Ye 2 , Lichuan Gu 1 , Xiaobo Zhou 1 and Feng Shu 3
1 School of Information and Computer, Anhui Agricultural University, Hefei 230036, China
2 Mingguang Meteorological Bureau, Chuzhou 239400, China
3 School of Information and Communication Engineering, Hainan University, Haikou 570228, China
* Correspondence: shentong@ahau.edu.cn
Abstract: This paper develops an unmanned aerial vehicle (UAV) deployment scheme in the context
of directional modulation-based secure precise wireless transmission (SPWT) to achieve more
secure and more energy-efficient transmission, where the optimal UAV position for the SPWT is
derived to maximize the secrecy rate (SR) without a frequency diverse array (FDA) and without injecting any
artificial noise (AN) signaling. To be specific, the proposed scheme reveals that, to maximize the
SR performance, the UAV has to be placed in the null space of Eve's channel, which
drives the received energy of the confidential message at the unintended receiver to
zero, whilst maximizing the received energy at the intended receiver. Moreover, the
high-cost FDA structure is eliminated and the transmit power is entirely allocated to transmitting the useful
message, which demonstrates its energy efficiency. Finally, simulation results verify the optimality of our
proposed scheme in terms of the achievable SR performance.
Keywords: three-dimensional UAV deployment; precise wireless transmission; physical layer security;
secrecy rate
1. Introduction
1.1. Background and Related Works
circuit budget with low computational complexity and comparable secrecy performance, it
significantly increases the practicability of the SPWT system. The wireless propagation
channel makes the security problem more formidable because it is accessible
to diverse devices. Taking [14] as an example, the authors clarify that the energy of the main
lobe is always formed around the desired receiver, but a number of non-negligible side
lobes still carry comparatively appreciable power. It is indicated that the position of
the transmitter affects the distribution of those lobes, thereby leading to a security risk. To
elaborate, when the eavesdropper is located on a side-lobe peak, the achievable secrecy
rate (SR) performance will be gravely degraded. Therefore, deploying the transmitter
at an appropriate position is important for the secrecy performance.
As a matter of fact, most existing works regarding the SPWT focus on static scenarios,
which severely limits its application. Meanwhile, the unmanned aerial vehicle (UAV)
has been widely utilized in wireless communications due to its extensive benefits
(e.g., high-probability air-to-ground channel [15] and controllable mobility [16]). Moreover,
UAV transmission technology has been researched widely. In [17], the authors presented a
UAV autonomous landing scheme with a model predictive controlled moving platform. The
authors in [18] proposed a novel approach for the Drone-BS in 5G communication systems,
using a meta-heuristic algorithm. Ref. [19] considered UAV-enabled networks under
the probabilistic line-of-sight channel model in complex city environments and jointly
optimized the communication connection, the three-dimensional (3D) UAV trajectory, and
the transmit power of the UAV to increase the average secrecy rate. In [20], the authors
considered UAV networks for collecting data securely and covertly from ground users, where a
full-duplex UAV intends to gather confidential information from a desired user through
wireless communication and generates artificial noise (AN) with random transmit power in
order to ensure a negligible probability of the desired user’s transmission being detected by
the undesired users. The authors in [21] proposed a detection strategy based on multiple
antennas with beam sweeping to detect the potential transmission of a UAV in wireless
networks. In [22], a novel framework was established by jointly utilizing multiple measurements
of received signal strength from multiple base stations and multiple points on the
trajectory to improve the localization precision of the UAV. In [23], the authors considered the
region constraint and proposed a received-signal-strength-based optimal scheme for drone
swarm passive location measurement. Thus, in this work we consider SPWT in the context
of UAV networks, which not only extends the static scenarios to dynamic situations but
also matches the stringent requirement of SPWT for a line-of-sight (LoS) communication link
from the transmitter to the receiver.
1.2. Motivation
Note that in the previous works (e.g., [9–11]), the authors consider the SPWT system
model only in two-dimensional scenarios, which limits the practical application scenarios.
At the same time, the FDA technique is generally employed to determine a specific position,
while this work dispenses with the high-cost FDA scheme and instead achieves this goal
by fully taking advantage of the angle information in the three-dimensional (3D) space.
Against this background, this paper considers an SPWT scheme with the aid of a UAV
to improve the system’s security level. For a movable UAV transmitter, an analytical
solution to the optimal UAV position is derived to reduce the computational complexity.
Finally, simulation results show the efficiency of the proposed UAV deployment scheme in
terms of the achievable SR performance.
1.3. Contributions
The main contributions of this paper are as follows.
1. We propose a novel SPWT framework with UAV secure communications, which improves transmission security by changing the UAV's position.
Drones 2023, 7, 224
2. The proposed UAV SPWT scheme is based on DM rather than FDA, which reduces the cost of the radio frequency chains. Meanwhile, the computational complexity is significantly reduced.
3. Conventional SPWT improves security performance with the aid of AN, whereas our proposed scheme deploys the transmitter (e.g., a UAV) in the zero-SINR space of Eve, so that the power originally allocated to artificial noise can be used to transmit useful information. This greatly improves Bob's signal-to-interference-and-noise ratio (SINR) without affecting Eve's SINR, thus enhancing the security performance.
The remainder of this paper is organized as follows. Section 2 describes the system model of the proposed UAV SPWT and then analyzes the secrecy capacity performance based on the UAV SPWT structure and the proposed UAV deployment scheme. Section 3 presents the simulation and the analysis. Finally, the conclusion is drawn in Section 4.
Notations: In this paper, scalar variables are denoted by italic symbols, while vectors and matrices are denoted by bold lower-case and bold upper-case letters, respectively. $(\cdot)^T$, $(\cdot)^*$, $\mathrm{tr}(\cdot)$, and $(\cdot)^H$ denote transpose, conjugation, trace, and conjugate transpose, respectively. $\|\cdot\|$ and $|\cdot|$ denote the norm and modulus, respectively. $E[\cdot]$ denotes the expectation operation.
2. Method
2.1. System Model
As shown in Figure 1, the considered SPWT system is composed of a UAV, a desired user (Bob), and an eavesdropper (Eve). Both Bob and Eve are equipped with a single antenna, while the UAV is equipped with an $M \times N$ rectangular antenna array in which the distance between any two adjacent antenna elements is identical. Note that the antenna array could be any planar array, such as a rectangular or circular array, that can form 3D beams depending on both angle and distance; in this paper, without loss of generality, we assume that the transmit antennas form a rectangular array. For convenience of expression, we set Bob as the origin and define the ray from Bob to Eve as the positive direction of the X-axis. Moreover, Bob and Eve are ground users, i.e., the Z-axis coordinates of both Bob and Eve are 0. We assume that the UAV flies at a predetermined height $g$, parallel to the ground. The channels between the UAV and the users (Bob or Eve) are assumed to be LoS channels, which have been widely used in UAV communication scenarios [24]. More importantly, SPWT is difficult to apply in non-LoS (NLoS) channels, for the following reasons. Firstly, as the NLoS channel is independent of $\theta$ and $R$, the designed beamforming vectors may transmit the confidential signal to any location, so the security performance might be seriously degraded. Secondly, as NLoS channels change with time, the designed beamforming vectors are valid only for a specific time; with beamforming vectors designed for an NLoS channel, the confidential message may therefore be transmitted to any location as time goes on. Lastly, in our proposed scheme, the invoked artificial noise (AN) disturbs the signal received at Eve while having a negligible effect on Bob. In NLoS channels, however, the AN accumulates at Bob, resulting in a serious degradation of the secrecy rate performance. Thus, the assumption of LoS channels in our proposed UAV SPWT system is reasonable.
Since the UAV serves the ground users, the relative positions of Bob and Eve are assumed to be known by the UAV. This assumption is justified as follows. For Bob, it is reasonable that the UAV knows its target user's location. For Eve, we consider a large number of users in our system, and the locations of all users (including desired users, undesired users, and eavesdroppers) are known to Alice (the UAV). Some of the undesired users are legitimate users in other time periods and will not eavesdrop on the confidential message. The others are eavesdroppers, who are illegitimate users at all times. However, the eavesdroppers are hidden among all the users, so it cannot be determined whether a given user is an undesired user or an eavesdropper. We therefore cannot determine which user is an eavesdropper, but we can consider the user who is most likely to be one. This approach is more reasonable and practical; thus, Eve in our proposed system model is the most likely eavesdropper, and the proposed scheme aims to prevent this user from eavesdropping. In conclusion, the assumption that the relative positions of Bob and Eve are known to the UAV is reasonable.
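[Figure 1: diagram of the system model, showing Alice (the UAV), the desired user, and the eavesdropper relative to the X and Y axes, with the distance d and the angles θ_A, θ_B, θ_E, ϕ_B, and ϕ_E labeled.]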
\[ \mathbf{h}(\theta, \phi) = \frac{1}{\sqrt{MN}} \left[ e^{j2\pi\psi_{1,1}}, \ldots, e^{j2\pi\psi_{m,n}}, \ldots, e^{j2\pi\psi_{M,N}} \right]^{T}, \tag{1} \]
where $\theta$ and $\phi$ denote the receiver's azimuth angle and pitch angle relative to the UAV, respectively, and $\psi_{m,n}$ is
\[ \psi_{m,n} = -\frac{f_c}{c}\left[(m-1)d\cos\theta + (n-1)d\sin\theta\right]\cos\phi. \tag{2} \]
Herein, $f_c$ is the central carrier frequency, $d = c/(2f_c)$ is the element spacing in the transmit antenna array, and $c$ is the speed of light. Substituting $(\theta_B, \phi_B)$ and $(\theta_E, \phi_E)$ into (1) and (2), respectively, we obtain the steering vectors $\mathbf{h}(\theta_B, \phi_B)$ and $\mathbf{h}(\theta_E, \phi_E)$.
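As an illustration of (1) and (2), the steering vector can be sketched in a few lines of NumPy. This is a minimal sketch rather than the authors' simulation code: the function name `steering_vector` and the element ordering (index m varying slowest) are assumptions made here, and the half-wavelength spacing d = c/(2 f_c) is used to replace the factor f_c d / c by 1/2.

```python
import numpy as np

def steering_vector(theta, phi, M=4, N=4):
    """Normalized steering vector of an M x N rectangular array, per (1)-(2).

    Hypothetical helper: the name and the element ordering are assumptions.
    """
    m = np.arange(M).reshape(-1, 1)   # (m - 1) for m = 1..M
    n = np.arange(N).reshape(1, -1)   # (n - 1) for n = 1..N
    # With d = c / (2 f_c), the factor f_c * d / c in (2) equals 1/2.
    psi = -0.5 * (m * np.cos(theta) + n * np.sin(theta)) * np.cos(phi)
    return np.exp(1j * 2 * np.pi * psi).reshape(-1) / np.sqrt(M * N)

h = steering_vector(theta=np.pi / 3, phi=np.pi / 6)
print(h.shape, np.vdot(h, h).real)   # MN entries with (numerically) unit norm
```

Because every entry has modulus 1/√(MN), the vector has unit norm for any pair of angles, which is what makes h_B^H h_B = 1 in the received-signal expressions.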
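At the baseband, the transmit signal can be expressed as
\[ \mathbf{s} = \sqrt{\alpha P_s}\,\mathbf{v}x + \sqrt{(1-\alpha)P_s}\,\mathbf{w}, \tag{3} \]
where $\alpha$, $P_s$, and $x$ refer to the power allocation factor, the total transmit power, and a complex symbol satisfying $E[|x|^2] = 1$, respectively. In addition, $\mathbf{v} \in \mathbb{C}^{MN \times 1}$ and $\mathbf{w} \in \mathbb{C}^{MN \times 1}$ denote the beamforming and AN vectors, respectively.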
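To achieve precise transmission, $\mathbf{v} = \mathbf{h}(\theta_B, \phi_B)$ is set to maximize the received power of the confidential signal at Bob, while $\mathbf{w} = [\mathbf{I}_{MN} - \mathbf{h}(\theta_B, \phi_B)\mathbf{h}^H(\theta_B, \phi_B)]\mathbf{z}$ projects the AN into the null space of Bob, where $\mathbf{z}$ is an AN vector consisting of $MN$ complex Gaussian variables with normalized power, i.e., $\mathbf{z} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{MN})$. Notably, a UAV is usually an aerial platform serving terrestrial nodes; hence, all the communication channels follow the line-of-sight (LoS) model. Then, the received signal at Bob can be expressed as
\[ y_B = \sqrt{\alpha P_s}\,\mathbf{h}_B^H\mathbf{v}x + \sqrt{(1-\alpha)P_s}\,\mathbf{h}_B^H\mathbf{w} + n_B = \sqrt{\alpha P_s}\,x + n_B, \tag{4} \]
and, similarly, the received signal at Eve can be expressed as
\[ y_E = \sqrt{\alpha P_s}\,\mathbf{h}_E^H\mathbf{v}x + \sqrt{(1-\alpha)P_s}\,\mathbf{h}_E^H\mathbf{w} + n_E. \tag{5} \]
In (4) and (5), $\mathbf{h}_B$ and $\mathbf{h}_E$ are the abbreviations of $\mathbf{h}(\theta_B, \phi_B)$ and $\mathbf{h}(\theta_E, \phi_E)$, respectively. Moreover, $n_B$ and $n_E$ are the additive white Gaussian noises (AWGNs) at Bob and Eve, satisfying $n_B \sim \mathcal{CN}(0, \sigma_B^2)$ and $n_E \sim \mathcal{CN}(0, \sigma_E^2)$.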
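The effect of the null-space projection can be checked numerically. The sketch below is illustrative only: the helper `steering_vector`, the example angles, and the random seed are assumptions, not values from the paper. It confirms that the beamformer gives unit gain at Bob and that the projected AN vanishes at Bob while generally leaking toward another direction.

```python
import numpy as np

def steering_vector(theta, phi, M, N):
    """Normalized steering vector per (1)-(2) with d = c / (2 f_c) (hypothetical helper)."""
    m = np.arange(M).reshape(-1, 1)
    n = np.arange(N).reshape(1, -1)
    psi = -0.5 * (m * np.cos(theta) + n * np.sin(theta)) * np.cos(phi)
    return np.exp(1j * 2 * np.pi * psi).reshape(-1) / np.sqrt(M * N)

rng = np.random.default_rng(0)
M = N = 4
h_B = steering_vector(0.4, 0.9, M, N)   # example angles for Bob (assumed)
h_E = steering_vector(1.7, 0.6, M, N)   # example angles for Eve (assumed)

v = h_B                                  # beamforming vector
# z ~ CN(0, I): independent real and imaginary parts, each with variance 1/2
z = (rng.standard_normal(M * N) + 1j * rng.standard_normal(M * N)) / np.sqrt(2)
# Projector onto the null space of h_B^H
P_null = np.eye(M * N) - np.outer(h_B, h_B.conj())
w = P_null @ z

gain_B = np.vdot(h_B, v)   # h_B^H v (np.vdot conjugates its first argument)
leak_B = np.vdot(h_B, w)   # h_B^H w, AN seen by Bob
leak_E = np.vdot(h_E, w)   # h_E^H w, AN seen by Eve
print(abs(gain_B), abs(leak_B), abs(leak_E))
```

The first two quantities reproduce the simplification on the right-hand side of (4); the third shows why the AN term survives in Eve's received signal.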
In this subsection, we presented the system model and the received signal expressions. The received signal at Bob is independent of the UAV position, while the received signal at Eve depends on $\mathbf{h}_E$, which is a function of the UAV position through $(\theta_E, \phi_E)$. This shows that it is feasible to reduce the useful signal energy at Eve by moving the UAV without reducing the useful signal energy at Bob, thereby increasing the security of the UAV-based SPWT system.
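\[ \min_{\theta_E, \phi_E}\ \mathrm{SINR}_E = \frac{\alpha P_s |\mathbf{h}_E^H \mathbf{h}_B|^2}{(1-\alpha) P_s \|\mathbf{h}_E^H \mathbf{w}\|^2 + \sigma_E^2} \quad \text{s.t.}\ 0 \le \phi_E \le \frac{\pi}{2},\ 0 \le \theta_E \le 2\pi. \tag{6} \]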
Since $\phi_E$ is the angle between the line from the UAV to Eve and the horizontal plane, it cannot be larger than $\pi/2$, as stated in the constraint of (6). Because $\mathrm{SINR}_E$ is non-negative, it attains its minimum when the numerator of the objective function of (6) equals 0. Accordingly, we expand the term $\mathbf{h}_E^H \mathbf{h}_B$ as
\[ \mathbf{h}_E^H \mathbf{h}_B = \frac{1}{MN} \sum_{m,n}^{M,N} e^{j\pi[(m-1)\cos\theta_E + (n-1)\sin\theta_E]\cos\phi_E} \times e^{-j\pi[(m-1)\cos\theta_B + (n-1)\sin\theta_B]\cos\phi_B}. \tag{7} \]
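\[ \begin{aligned} \mathbf{h}_E^H \mathbf{h}_B &= \frac{1}{MN} \sum_{m=1}^{M} e^{j\pi(m-1)(\cos\theta_E - \cos\theta_B)\cos\phi_E} \times \sum_{n=1}^{N} e^{j\pi(n-1)(\sin\theta_E - \sin\theta_B)\cos\phi_E}, \\ &= \frac{1}{MN} \cdot \frac{e^{jM\pi(\cos\theta_E - \cos\theta_B)\cos\phi_E} - 1}{e^{j\pi(\cos\theta_E - \cos\theta_B)\cos\phi_E} - 1} \times \frac{e^{jN\pi(\sin\theta_E - \sin\theta_B)\cos\phi_E} - 1}{e^{j\pi(\sin\theta_E - \sin\theta_B)\cos\phi_E} - 1}. \end{aligned} \tag{8} \]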
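Setting either numerator in (8) to zero gives
\[ M\pi(\cos\theta_E - \cos\theta_B)\cos\phi_E = \pm 2k\pi, \tag{9} \]
or
\[ N\pi(\sin\theta_E - \sin\theta_B)\cos\phi_E = \pm 2k\pi, \tag{10} \]
where $k$ has to ensure that $k \neq k'M$ and $k \neq k'N$; herein $k'$ is an integer (i.e., $k' \in \mathbb{Z}$). Then, the relationship between the optimal $\theta_E$ and the optimal $\theta_B$ can be obtained as
\[ \cos\theta_E - \cos\theta_B = \frac{\pm 2k}{M\cos\phi_E}, \tag{11} \]
or
\[ \sin\theta_E - \sin\theta_B = \frac{\pm 2k}{N\cos\phi_E}. \tag{12} \]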
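Taking $\phi_B = \phi_E$ and the constant flight height of the UAV into account, the X-coordinate of the UAV satisfies $X_A = X_E/2$ and $\theta'_B = \pi - \theta'_E$, which can be readily verified according to our system model. Here, $\theta'_B$ and $\theta'_E$ denote the azimuth angles before the flight angle $\theta_A$ of the UAV is taken into account. Substituting $\theta'_B = \pi - \theta'_E$, $\theta_B = \theta'_B - \theta_A$, and $\theta_E = \theta'_E - \theta_A$ into (11) and (12), we obtain the correspondingly modified expressions regarding $\theta'_B$, shown as
\[ \cos\theta'_B = \frac{\pm k}{M\cos\phi_E\cos\theta_A}, \tag{13} \]
or
\[ \cos\theta'_B = \frac{\pm k}{N\cos\phi_E\sin\theta_A}. \tag{14} \]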
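The step from (11) and (12) to (13) and (14) follows from the sum-to-product identities. The worked sketch below spells it out; the primes marking the azimuth angles before the flight angle $\theta_A$ is subtracted are notation assumed here for readability.

```latex
\begin{aligned}
\cos\theta_E - \cos\theta_B
  &= \cos(\theta'_E - \theta_A) - \cos(\pi - \theta'_E - \theta_A) \\
  &= \cos(\theta'_E - \theta_A) + \cos(\theta'_E + \theta_A)
   = 2\cos\theta'_E\cos\theta_A
   = -2\cos\theta'_B\cos\theta_A, \\
\sin\theta_E - \sin\theta_B
  &= \sin(\theta'_E - \theta_A) - \sin(\pi - \theta'_E - \theta_A) \\
  &= \sin(\theta'_E - \theta_A) - \sin(\theta'_E + \theta_A)
   = -2\cos\theta'_E\sin\theta_A
   = 2\cos\theta'_B\sin\theta_A .
\end{aligned}
```

Inserting the first line into (11) and the second into (12), and absorbing the overall sign into the $\pm$, yields (13) and (14), respectively.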
Remark 1. Considering that $(\cos\theta_E - \cos\theta_B) \in [-2, 2]$ and $\cos\phi_E \in [0, 1]$, we have $(\cos\theta_E - \cos\theta_B)\cos\phi_E \in [-2, 2]$. When $(\cos\theta_E - \cos\theta_B) \neq 0$, the denominator of the first term in (8) regarding $\theta_B$ is unequal to 0. Naturally, $(\sin\theta_E - \sin\theta_B) \neq 0$ must hold as well. In a nutshell, our analysis of the two denominators of (8) requires that $\sin\theta_E - \sin\theta_B \neq 0$ and $\cos\theta_E - \cos\theta_B \neq 0$. Taking into account the relationship among $\theta_A$, $\theta_B$, $\theta'_B$, $\theta_E$, and $\theta'_E$, the flight angle of the UAV follows $\cos\theta'_B\cos\theta_A \neq 0$ and $\cos\theta'_B\sin\theta_A \neq 0$, which further yields $\theta_A \neq p\pi/2$ $(p \in \mathbb{Z})$; thus, the denominators of (8) are unequal to 0. As a result, our analysis and derivation of the solution $\theta'_B$ to expression (8) are of physical significance.
Since $\phi_E \in [0, \pi/2]$ increases as $Y_A$ increases in the domain $(-\infty, 0)$, $\cos\phi_E$ decreases as $Y_A \in (-\infty, 0)$ increases. Similarly, $\cos\phi_E$ increases as $Y_A \in (0, +\infty)$ increases. With those in mind, we then have $\frac{1}{N\cos\phi_E} \in (1/N, 1/(N\cos\phi_E^*)]$, where $\phi_E^*$ refers to the optimal pitch angle for maximizing $\frac{1}{N\cos\phi_E}$. As a matter of fact, $\phi_E^*$ is attained in the case of $Y_A = 0$, which further yields
\[ \cos\phi_E^* = \frac{X_E/2}{\sqrt{(X_E/2)^2 + g^2}}. \tag{15} \]
Furthermore, we note that the term $\frac{\pm k}{\sin\theta_A}$ of (14) satisfies $\left|\frac{\pm k}{\sin\theta_A}\right| \in [1, +\infty)$, where the left extremum is attained for $k = 1$ and $\sin\theta_A = 1$. Hence, we conclude that $\theta'_B$ has at least one solution when $|1/(N\cos\phi_E^*)| \le 1$.
With the above conclusion, we further derive the optimal solution $\theta'^{*}_B$ to characterize $Y_A$, where $Y_A$ can be determined once $\theta'^{*}_B$ is optimized. Upon substituting the original definitions $\cos\theta'_B = \frac{X_A}{\sqrt{X_A^2 + Y_A^2}}$ and $\cos\phi_E = \frac{\sqrt{(X_A - X_E)^2 + Y_A^2}}{\sqrt{(X_A - X_E)^2 + Y_A^2 + g^2}}$ into the derivation of (13), we have
\[ \frac{X_A}{\sqrt{X_A^2 + Y_A^2}} \cdot \frac{\sqrt{(X_A - X_E)^2 + Y_A^2}}{\sqrt{(X_A - X_E)^2 + Y_A^2 + g^2}} = \frac{\pm k}{M\cos\theta_A}. \tag{16} \]
the parameters $\theta_A$ (the yaw angle of the UAV) and $g$ (the height of the UAV) can be strategically regulated through the UAV's attitude; hence, at least one feasible solution to (17) or (18) can be obtained. Because the solution is analytical, the computational complexity is significantly reduced compared with directly addressing problem (6). For a given $X_A$, once $Y_A$ is optimized by (17) or (18), the UAV deployment problem is solved. At such an optimal coordinate $(X_A^*, Y_A^*, g)$, $\mathrm{SINR}_E$ degrades to zero while $\mathrm{SINR}_B$ achieves its maximum value $\alpha P_s/\sigma_B^2$, thus ensuring that the maximum SR performance can be achieved.
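The nulling condition can be checked numerically. The sketch below is illustrative and is not the authors' simulation code. It assumes X_A = X_E/2, in which case (X_A − X_E)² = X_A² and the two Y_A-dependent square roots in (16) cancel, leaving X_A/√(X_A² + Y_A² + g²) = k/(M cos θ_A); this reduction, the helper names, the choice k = 2, and the mapping from coordinates to array-relative angles are all assumptions of this sketch.

```python
import numpy as np

def steering_vector(theta, phi, M, N):
    """Normalized steering vector per (1)-(2) with d = c / (2 f_c) (hypothetical helper)."""
    m = np.arange(M).reshape(-1, 1)
    n = np.arange(N).reshape(1, -1)
    psi = -0.5 * (m * np.cos(theta) + n * np.sin(theta)) * np.cos(phi)
    return np.exp(1j * 2 * np.pi * psi).reshape(-1) / np.sqrt(M * N)

def azimuth_scheme_position(X_E, g, theta_A, M, k):
    """Solve the reduced form of (16) for |Y_A| when X_A = X_E / 2 (assumed reduction)."""
    X_A = X_E / 2.0
    y_sq = (X_A * M * np.cos(theta_A) / k) ** 2 - X_A ** 2 - g ** 2
    if y_sq < 0:
        raise ValueError("no real Y_A for this k")
    return X_A, np.sqrt(y_sq)

M = N = 4
X_E, g, theta_A = 500.0, 200.0, np.pi / 4
k = 2                                        # k must not be a multiple of M
X_A, Y_A = azimuth_scheme_position(X_E, g, theta_A, M, k)

ground = np.hypot(X_A, Y_A)                  # horizontal distance (same to Bob and Eve)
phi = np.arccos(ground / np.hypot(ground, g))  # common pitch angle, phi_B = phi_E
theta_B_geo = np.arccos(X_A / ground)        # azimuth before subtracting theta_A
theta_E_geo = np.pi - theta_B_geo
h_B = steering_vector(theta_B_geo - theta_A, phi, M, N)
h_E = steering_vector(theta_E_geo - theta_A, phi, M, N)

leak = abs(np.vdot(h_E, h_B)) ** 2           # |h_E^H h_B|^2, numerator of SINR_E
print(Y_A, leak)
```

Under these assumptions the product (cos θ_E − cos θ_B) cos ϕ_E equals −1, so the first geometric sum in (8) alternates between +1 and −1 over four elements and vanishes, which is why the leaked signal power at Eve is zero up to rounding.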
Remark 2. By optimizing the UAV deployment, the maximum SR performance remains achievable even though no power is allocated to the AN signaling. The benefits are threefold: (1) from the UAV's perspective, the hardware structure becomes simpler because the AN generator module is removed, which cuts down the expenditure and helps lighten the UAV; (2) from the perspective of energy efficiency, all the transmit power is used to convey the confidential message, which helps Bob receive a high-quality signal; and (3) in terms of computational complexity, our derived analytical solution to $Y_A$, based on the 3D spatial relation, noticeably reduces the computational burden of the UAV deployment.
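\[ \begin{aligned} \mathbf{h}_E^H \mathbf{h}_B &= \frac{1}{MN} \sum_{m=1}^{M} e^{j\pi(m-1)\cos\theta_E(\cos\phi_E - \cos\phi_B)} \cdot \sum_{n=1}^{N} e^{j\pi(n-1)\sin\theta_E(\cos\phi_E - \cos\phi_B)}, \\ &= \frac{1}{MN} \cdot \frac{e^{jM\pi\cos\theta_E(\cos\phi_E - \cos\phi_B)} - 1}{e^{j\pi\cos\theta_E(\cos\phi_E - \cos\phi_B)} - 1} \cdot \frac{e^{jN\pi\sin\theta_E(\cos\phi_E - \cos\phi_B)} - 1}{e^{j\pi\sin\theta_E(\cos\phi_E - \cos\phi_B)} - 1}. \end{aligned} \tag{19} \]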
or
\[ \cos\phi_E - \cos\phi_B = \frac{\pm 2l}{M\cos\theta_E}, \quad l \neq l'M\ (l' = 1, 2, 3, \ldots), \tag{22} \]
or
\[ \cos\phi_E - \cos\phi_B = \frac{\pm 2l}{N\sin\theta_E}, \quad l \neq l'N\ (l' = 1, 2, 3, \ldots). \tag{23} \]
Note that the azimuth angles satisfy the constraint $\theta_B = \theta_E$. According to the system model, the Y-coordinate of Alice must satisfy $Y_A = 0$. Since the azimuth angle can be adjusted with the flight angle of Alice, i.e., the flight direction, which we defined as $\theta_A$, its value range is $(0, 2\pi)$. The X-coordinate range of Alice is $(-\infty, 0) \cup (X_E, +\infty)$, while $\cos\phi_E - \cos\phi_B$ is a monotone decreasing function of $X_A$. Let $l = 1$; it is clear that when $\frac{2}{M\cos\theta_E} < 0$ or $\frac{2}{N\sin\theta_E} < 0$, the X-coordinate satisfies $X_A \in (-\infty, 0)$. Conversely, when $\frac{2}{M\cos\theta_E} > 0$ or $\frac{2}{N\sin\theta_E} > 0$, the X-coordinate satisfies $X_A \in (X_E, +\infty)$. Thus, a value of $X_A$ exists and can be easily obtained, in a manner similar to the azimuth angle scheme or by the dichotomy (bisection) method.
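The dichotomy (bisection) search mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the UAV is placed at (X_A, 0, g) with X_A > X_E, the array-relative azimuth θ_E is treated as a given constant (3π/4 is an arbitrary example), and the helper names `pitch_gap` and `bisect` are hypothetical. The search relies only on the pitch-angle gap being monotone in X_A on the bracketing interval.

```python
import cmath
import math

def pitch_gap(X_A, X_E, g):
    """cos(phi_E) - cos(phi_B) for a UAV at (X_A, 0, g) with X_A > X_E (assumed geometry)."""
    cos_phi_B = X_A / math.hypot(X_A, g)
    cos_phi_E = (X_A - X_E) / math.hypot(X_A - X_E, g)
    return cos_phi_E - cos_phi_B

def bisect(f, lo, hi, tol=1e-10, max_iter=200):
    """Plain bisection on a bracketing interval [lo, hi]."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # root lies in [mid, hi]
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

M, X_E, g, l = 4, 500.0, 200.0, 1
theta_E = 3 * math.pi / 4                    # assumed array-relative azimuth
target = 2 * l / (M * math.cos(theta_E))     # right-hand side of (22), "+" sign
X_A = bisect(lambda x: pitch_gap(x, X_E, g) - target, X_E, 1e6)

# First geometric sum of (19) evaluated at the solution; it should vanish.
gap = pitch_gap(X_A, X_E, g)
array_sum = sum(cmath.exp(1j * math.pi * m * math.cos(theta_E) * gap) for m in range(M))
print(X_A, abs(array_sum))
```

A closed-form solution avoids the iteration altogether, but bisection is a convenient fallback whenever the pitch-angle equation is awkward to invert by hand.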
In this subsection, we proposed UAV deployment strategies from the azimuth angle dimension and the pitch angle dimension, respectively. According to the derived equations, under both proposed schemes the received SINR at Eve reaches zero while the received SINR at Bob reaches its maximum value. Thus, the secrecy rate is improved.
| Parameter | Value |
|---|---|
| The number of transmitter antennas (M × N) | 4 × 4 |
| Total signal bandwidth (B) | 5 MHz |
| Total transmit power (P) | 1 W |
| The height of UAV (g) | 200 m |
| The eavesdropper's position (X_E, Y_E) | (500 m, 0) |
| The flight angle of UAV (θ_A) | π/4 |
| Central carrier frequency (f_c) | 3 GHz |
Figure 2 shows the attainable SR performance of our proposed azimuth angle scheme versus the signal-to-noise ratio (SNR), where $\mathrm{SNR} = 10\log_{10}(P_s/\sigma^2)$. For comparison, we consider the beamforming scheme proposed in [14], with three conventional random deployments invoked to validate the efficiency of our proposed UAV deployment. Firstly, our proposed UAV deployment scheme is clearly superior to the three random schemes in terms of SR performance, at the cost of a negligible increase in computational complexity. Moreover, the SR performance gap between the proposed scheme and any other scheme widens as the SNR increases. Therefore, the main benefits of our UAV deployment scheme stem not only from assuring precise transmission but also from improving the security performance. On the other hand, we note from Figure 2 that the theoretical SR performance, i.e., the maximum performance, coincides with that of our proposed scheme at every SNR. This result further verifies the validity of our derived solution to the UAV deployment in the context of SPWT.
Figure 3 shows the attainable SR performance of our proposed pitch angle scheme versus the SNR. Similarly, the three conventional random deployments are invoked for comparison. The results show that this proposed scheme is also superior to the conventional random deployments, which supports the basic ideas of both proposed methods. From Figures 2 and 3, it is clear that, whether the azimuth angle scheme or the pitch angle scheme is adopted, as long as the deployed UAV satisfies the constraint $\mathbf{h}_E^H \mathbf{h}_B = 0$, the optimal secrecy rate can be achieved by the optimal beamforming $\mathbf{v} = \mathbf{h}(\theta_B, \phi_B)$.
To illustrate the efficiency of our proposed UAV-assisted SPWT scheme in improving security, Figure 4 shows how the achievable SR performance varies as $\alpha$ increases. The SR performances of the three conventional comparison schemes never reach the optimal SR performance, even with the aid of power allocation. In fact, the SR performance of the conventional scheme degrades seriously when the UAV is randomly deployed. Hence, properly positioning the UAV has a significant impact on SPWT. Our proposed UAV deployment, which eliminates the AN signaling, skips the power split operation but still attains the maximum SR performance, which further corroborates its potential value for SPWT. Another interesting conclusion is that the SR of our proposed scheme stays at its maximum value and remains unchanged, which means that our proposed scheme does not need artificial noise. This is because $\mathbf{h}_E^H \mathbf{h}_B = 0$: regardless of the power of the artificial noise, the received SINR at Eve is always 0. Thus, artificial noise is unnecessary, which brings the benefit of a budget reduction.
[Figure 2 plot: secrecy rate (bits/s/Hz) versus SNR (10log10(Ps/σ²), dB), 0 to 30 dB. Curves: proposed azimuth angle scheme at the position of (250, 76); theoretical SR of proposed scheme; conventional schemes at the positions of (300, 200), (150, 200), and (300, −100).]
Figure 2. The achievable SR performance versus SNR for our proposed azimuth angles scheme,
where three typically random deployments are invoked as the baselines.
[Figure 3 plot: secrecy rate (bits/s/Hz) versus SNR (10log10(Ps/σ²), dB), 0 to 30 dB. Curves: proposed pitch angle scheme at the position of (527, 0); theoretical SR of proposed scheme; conventional schemes at the positions of (150, 200), (300, 200), and (300, −100).]
Figure 3. The achievable SR performance versus SNR for our proposed pitch angle scheme, where
three typically random deployments are invoked as the baselines.
[Figure 4 plot: vertical axis is the secrecy rate (bits/s/Hz).]
Figure 4. The achievable SR performance versus parameter α for the different schemes when the SNR is 15 dB.
In summary, simulation results show that our proposed UAV deployment schemes achieve SR-enhanced SPWT compared with a randomly deployed UAV. Moreover, the UAV transmit power is concentrated on transmitting the confidential message rather than being partly allocated to AN. Thus, our proposed UAV deployment SPWT schemes are more secure and more energy efficient.
4. Conclusions
In this paper, we proposed a UAV deployment scheme in the context of SPWT from the perspectives of the azimuth angle and the pitch angle. In this scheme, we abandoned FDA for the first time and instead combined DM with a 3D scenario, which reduces the system budget significantly. Compared with the conventional method, the proposed scheme is superior in terms of the attainable SR performance. Moreover, our proposed UAV deployment algorithm gives analytical solutions with very low computational complexity, which is another important benefit of the proposed scheme. Interestingly, although we introduced AN in this hybrid SPWT system, the mathematical analysis shows that the performance is optimal when the power allocated to AN is zero. This means that our proposed scheme does not need AN assistance to achieve the optimal SR performance, which saves precious energy resources.
Author Contributions: Conceptualization, T.S.; methodology, T.S.; software, L.G.; validation, X.Z.,
G.X. and F.S.; formal analysis, T.S.; investigation, T.S.; resources, G.X.; data curation, G.X.;
writing—original draft preparation, T.S.; writing—review and editing, G.X. and J.Y.; visualization,
F.S.; supervision, X.Z.; project administration, F.S.; funding acquisition, F.S. All authors have read and
agreed to the published version of the manuscript.
Funding: This work was supported in part by the Scientific Research Fund Project of Anhui Agricultural University under Grant rc482106, Grant rc482103 and Grant rc482108, in part by National Natural Science Foundation of China under Grant 61901121, Grant 62001225, Grant 62071234, Grant 62071289, and Grant 61972093, in part by the Natural Science Research Project of Education Department of Anhui Province of China under Grant KJ2021A0183 and K2248004, in part by the Key Research and Development Project of Anhui Province in 2022 (202204c06020022), in part by the Natural Science Foundation of Jiangsu Province under Grant BK20190454, in part by the open research fund of National Mobile Communications Research Laboratory, Southeast University under Grant 2022D07, in part by the Scientific Research Fund Project of Hainan University under Grant KYQD(ZR)-21007 and Grant KYQD(ZR)-21008, in part by the Hainan Major Projects under Grant ZDKJ20211022, and in part by the National Key R and D Program of China under Grant 2018YFB1801102.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
References
1. Wu, Y.; Khisti, A.; Xiao, C.; Caire, G.; Wong, K.-K.; Gao, X. A survey of physical layer security techniques for 5G wireless networks
and challenges ahead. IEEE J. Sel. Areas Commun. 2018, 36, 679–695. [CrossRef]
2. Chen, X.; Ng, D.W.K.; Gerstacker, W.H.; Chen, H.-H. A survey on multiple-antenna techniques for physical layer security. IEEE
Commun. Surv. Tutor. 2017, 19, 1027–1053. [CrossRef]
3. Bai, J.; Wang, H.-M.; Liu, P. Robust irs-aided secrecy transmission with location optimization. IEEE Trans. Commun. 2022,
70, 6149–6163. [CrossRef]
4. Rao, H.; Xiao, S.; Yan, S.; Wang, J.; Tang, W. Optimal geometric solutions to uav-enabled covert communications in line-of-sight
scenarios. IEEE Trans. Wirel. Commun. 2022, 21, 10633–1064. [CrossRef]
5. Daly, M.P.; Bernhard, J.T. Directional modulation technique for phased arrays. IEEE Trans. Antennas Propag. 2009, 57, 2633–2640.
[CrossRef]
6. Yuan, D.; Fusco, V.F. Orthogonal vector approach for synthesis of multi-beam directional modulation transmitters. IEEE Trans.
Antennas Wirel. Propagat. Lett. 2015, 14, 1330–1333.
7. Jinsong, H.; Feng, S.; Jun, L. Robust synthesis method for secure directional modulation with imperfect direction angle. IEEE
Commun. Lett. 2016, 20, 1084–1087.
8. Sammartino, P.F.; Baker, C.J.; Griffiths, H.D. Frequency diverse MIMO techniques for radar. IEEE Trans. Aerosp. Electron. Syst.
2013, 49, 201–222. [CrossRef]
9. Nusenu, S.Y.; Huaizong, S.; Ye, P. Power allocation and equivalent transmit fda beamspace for 5G mmwave noma networks:
Meta-heuristic optimization approach. IEEE Trans. Veh. Technol. 2022, 71, 9635–9646. [CrossRef]
10. Hu, Y.Q.; Chen, H.; Ji, S.-L.; Wang, W.-Q.; Chen, H. Adaptive detector for fda-based ambient backscatter communications. IEEE
Trans. Wirel. Commun. 2022, 21, 10381–10392. [CrossRef]
11. Wang, L.; Wang, W.-Q.; So, H.C. Covariance matrix estimation for fda-mimo adaptive transmit power allocation. IEEE Trans.
Signal Process. 2022, 70, 3386–3399. [CrossRef]
12. Shen, T.; Zhang, S.; Chen, R.; Wang, J.; Hu, J.; Shu, F.; Wang, J. Two Practical Random-Subcarrier-Selection Methods for Secure
Precise Wireless Transmissions. IEEE Trans. Veh. Technol. 2019, 68, 9018–9028. [CrossRef]
13. Shen, T.; Lin, Y.; Zou, J.; Wu, Y.; Shu, F.; Wang, J. Low-Complexity Leakage-Based Secure Precise Wireless Transmission with
Hybrid Beamforming. IEEE Wirel. Commun. Lett. 2020, 9, 1687–1691. [CrossRef]
14. Shu, F.; Wu, X.; Hu, J.; Li, J.; Chen, R.; Wang, J. Secure and precise wireless transmission for random-subcarrier-selection-based
directional modulation transmit antenna array. IEEE J. Sel. Areas Commun. 2018, 36, 890–904. [CrossRef]
15. Zhou, X.; Yan, S.; Hu, J.; Sun, J.; Li, J.; Shu, F. Joint optimization of a uav’s trajectory and transmit power for covert communications.
IEEE Trans. Signal Process. 2019, 67, 4276–4290. [CrossRef]
16. Wu, Q.; Zeng, Y.; Zhang, R. Joint trajectory and communication design for multi-uav enabled wireless networks. IEEE Trans.
Wirel. Commun. 2018, 17, 2109–2121. [CrossRef]
17. Yi, F.; Zhang, C.; Baek, S.; Rawashdeh, S.; Mohammadi, A. Autonomous Landing of a UAV on a Moving Platform Using Model
Predictive Control. Drones 2018, 2, 34. [CrossRef]
18. Ouamri, M.A.; Oteşteanu, M.-E.; Barb, G.; Gueguen, C. Coverage Analysis and Efficient Placement of Drone-BSs in 5G Networks.
Eng. Proc. 2022, 14, 18. [CrossRef]
19. Shen, A.; Luo, J.; Ning, J.; Li, Y.; Wang, Z.; Duo, B. Safeguarding UAV Networks against Active Eavesdropping: An Elevation
Angle-Distance Trade-Off for Secrecy Enhancement. Drones 2023, 7, 109. [CrossRef]
20. Zhou, X.; Yan, S.; Shu, F.; Chen, R.; Li, J. UAV-Enabled Covert Wireless Data Collection. IEEE J. Sel. Areas Commun. 2021,
39, 3348–3362. [CrossRef]
21. Hu, J.; Wu, Y.; Chen, R.; Shu, F.; Wang, J. Optimal Detection of UAV’s Transmission with Beam Sweeping in Covert Wireless
Networks. IEEE Trans. Veh. Technol. 2020, 69, 1080–1085. [CrossRef]
22. Li, Y.; Shu, F.; Shi, B.; Cheng, X.; Song, Y.; Wang, J. Enhanced RSS-based UAV localization via trajectory and multi-base stations.
IEEE Commun. Lett. 2021, 25, 1881–1885. [CrossRef]
23. Cheng, X.; Shu, F.; Li, Y.; Zhuang, Z.; Wu, D.; Wang, J. Optimal Measurement of Drone Swarm in RSS-Based Passive Localization
with Region Constraints. IEEE Open J. Veh. Technol. 2023, 4, 1–11. [CrossRef]
24. Zhou, X.; Li, J.; Shu, F.; Wu, Q.; Wu, Y.; Chen, W.; Hanzo, L. Secure SWIPT for Directional Modulation-Aided AF Relaying
Networks. IEEE J. Sel. Areas Commun. 2021, 37, 253–268. [CrossRef]
25. Kraus, J.D.; Marhefka, R.J. Antennas For All Applications; McGraw-Hill, Inc.: New York, NY, USA, 2002.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
drones
Article
Joint Trajectory Planning, Time and Power Allocation to
Maximize Throughput in UAV Network
Kehao Wang 1, *, Jiangwei Xu 1 , Xiaobai Li 2 , Pei Liu 1,3 , Hui Cao 1 and Kezhong Liu 4
Abstract: Due to the advantages of strong mobility, flexible deployment, and low cost, unmanned aerial vehicles (UAVs) are widely used in various industries. As a flying relay, UAVs can establish line-of-sight (LOS) links for different scenarios, effectively improving communication quality. In this paper, considering the limited energy budget of UAVs and the existence of multiple jammers, we introduce a simultaneous wireless information and power transfer (SWIPT) technology and study the problems of joint-trajectory planning, time, and power allocation to increase communication performance. Specifically, the network includes multiple UAVs, source nodes (SNs), destination nodes (DNs), and jammers. We assume that the UAVs need to communicate with DNs, the SNs use the SWIPT technology to transmit wireless energy and information to UAVs, and the jammers can interfere with the channel from UAVs to DNs. In this network, our target was to maximize the throughput of DNs by optimizing the UAV's trajectory, time, and power allocation under the constraints of jammers and the actual motion of UAVs (including UAV energy budget, maximum speed, and anti-collision constraints). Since the formulated problem was non-convex and difficult to solve directly, we first decomposed the original problem into three subproblems. We then solved the subproblems by applying a successive convex optimization technology and a slack variables method. Finally, an efficient joint optimization algorithm was proposed to obtain a sub-optimal solution by using a block coordinate descent method. Simulation results indicated that the proposed algorithm has better performance than the four baseline schemes.
Keywords: UAV network; trajectory planning; power allocation; time allocation
Citation: Wang, K.; Xu, J.; Li, X.; Liu, P.; Cao, H.; Liu, K.; Cong, Y. Joint Trajectory Planning, Time and Power Allocation to Maximize Throughput in UAV Network. Drones 2023, 7, 68. https://doi.org/10.3390/drones7020068
Academic Editor: Andrey V. Savkin
Received: 28 December 2022
Revised: 14 January 2023
Accepted: 16 January 2023
Published: 18 January 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
In recent years, due to high mobility, flexible deployment, and low cost, UAVs have been widely used in many scenarios, such as intelligent transportation systems [1–3], disaster relief, military activities, emergency communications, and so on [4–6]. In particular, with the development of 5G technology, non-terrestrial networks have become the next hot spot [7]. For example, Ref. [8] studied the extreme performance of a cognitive uplink fixed satellite service and a fixed terrestrial service in the Ka-band 27.5–29.5 GHz frequency range. Ref. [9] studied the physical layer security problem of a satellite network.
Compared with traditional communication methods, most of the wireless communication channels in UAV networks are dominated by line-of-sight (LOS) links [10], which can reduce the obstruction of mountains and buildings so as to obtain better data transmission effects. For example, for natural disaster scenes and ground network communication failures, Ref. [11] used UAVs to provide support for ground base stations and proposed an adaptive UAV deployment scheme to solve the communication network coverage problem. Ref. [12] proposed a task-driven routing strategy for emergency UAV networks to enhance rescue efficiency. Aiming at the post-disaster areas where infrastructure has been destroyed, Ref. [13] proposed an efficient data transmission scheme based on a particle swarm algorithm to ensure communication quality. Ref. [14] studied a UAV deployment strategy for disaster-affected areas to maximize the number of communication-coverage nodes.
Despite the widespread adoption of UAV-assisted communication technologies, there
are still some challenges in UAV networks. First, due to the openness of a UAV network
interface and the broadcast nature of electromagnetic waves, a UAV network is susceptible
to radio interference. For instance, the existence of malicious jammers could reduce the
communication quality between the nodes in a UAV network. Second, the battery capacity
of the UAV is limited, which greatly shortens total mission time [15–17].
In view of the problem that a UAV network is susceptible to radio interference, the high mobility of UAVs could be used to improve a system's performance through the reasonable optimization of network resources [18–21]. However, in a UAV network, the resources are limited and coupled with each other, and the formulated problems are usually non-convex and difficult to solve. Therefore, it is necessary to design a feasible optimization algorithm to obtain a solution to the original problem. To this end, successive convex approximation (SCA) [22] can be used as an effective method to solve the original non-convex problem.
To address the problem that the battery capacity of the UAV is limited, traditional
energy harvesting mechanisms represented by solar and wind energy have been extensively
studied [23,24]. For example, Ref. [23] studied the energy efficiency problem in
solar-powered UAV systems by optimizing the speed, acceleration, heading angle, and
transmission power of a UAV. Ref. [24] proposed an energy harvesting model based on
hybrid solar and wind power for a UAV system and obtained a solution to the signal-to-
noise ratio (SNR) outage minimization problem. Unfortunately, due to the limitations of
hardware technology, the traditional energy harvesting scheme will significantly increase
the take-off weight and lead to the degradation of the system’s performance. Based on
that, radio frequency (RF)-based SWIPT technology combined with UAV network resource
allocation optimization could provide an effective solution [25].
To be specific, SWIPT integrates wireless power transfer (WPT) and wireless information
transfer (WIT). Power and information can be transferred at the same time because an RF
signal carries both [26]. Time switching (TS) and power splitting (PS) are the two common
protocols for implementing SWIPT [27]. The former relies on time-slot allocation: part of
each time slot is used for energy transmission, and the rest is used for information
transmission and processing. The latter relies on power allocation: part of the power is
used for information transmission and processing, and the rest is used for energy
harvesting (EH).
Drones 2023, 7, 68
1.2. Motivation
Although the related works above have made great progress, several problems remain
unresolved. First, most existing studies did not consider the existence of multiple
jammers, even though jammers have a significant impact on the legitimate communication of
the system. Second, most existing works considered a single UAV or a single ground node,
because a multi-UAV system must satisfy additional requirements, such as anti-collision
constraints and mission planning, which increase both the design difficulty and the
complexity of the algorithm. Third, most existing works on SWIPT networks focused on
optimizing power or trajectory alone rather than jointly optimizing multiple domains,
including time, power, and trajectory. Most importantly, the flight time of UAVs is limited
by the capacity of onboard batteries, and this energy constraint greatly restricts further
applications of UAVs. Therefore, how to improve the communication quality of a UAV network
remains a difficult and actively studied problem.
Inspired by the discussion above, we study a multi-UAV-assisted multi-user network
system in which the SN sends information and energy to the power-limited UAV, and the UAV
uses the collected energy to communicate with the DN. Multiple jammers in the network
block legitimate communication. Different from existing networks, we introduce multiple
UAVs based on SWIPT technology and fully consider the existence of jammers and the energy
consumption of the UAVs. Because of the complexity of the network, solving the resulting
joint optimization problem is a considerable challenge. We therefore introduce multiple
slack variables and use the SCA method to make the original problem satisfy the disciplined
convex programming (DCP) rules, so that the reformulated problem can be solved with CVX.
1.3. Contributions
To solve the problems given above, we propose a multi-domain optimization algorithm based
on the PS protocol that combines trajectory planning, time allocation, and power splitting.
We aim to maximize the throughput subject to all constraints. The main contributions of
this paper are as follows:
• We investigate a multi-UAV-assisted multi-user relay network in which the SNs use
SWIPT technology to transmit wireless energy and information to UAVs. The UAVs
use the collected energy to transmit information to the DNs, with the jammers inter-
fering with legitimate channel communications.
• Our goal is to jointly optimize UAV trajectories, time allocation, and power-splitting
factors to mitigate interference and maximize the system throughput. Given that the
original problem is non-convex and difficult to solve directly, we decompose it into
three subproblems and, based on successive convex approximation, block coordinate
descent, and a slack variables method, present an efficient joint optimization
algorithm to obtain a suboptimal solution.
• Simulation results indicate that the proposed scheme performs better than the four
benchmark schemes. In addition, we discuss the impact of the number of jammers and the
energy budget on system performance and illustrate the effectiveness of joint
trajectory planning, time, and power allocation in mitigating interference.
The rest of the paper is organized as follows. In Section 2, the system model is
introduced. In Section 3, we propose a joint optimization algorithm to solve the original
problem. In Section 4, we provide simulation results and some necessary discussions.
Finally, Section 5 concludes this paper.
2. System Model
We consider a multi-UAV-enabled wireless communication network, as shown in Figure 1,
which includes K1 quadcopter UAVs, K2 pairs of source nodes (SNs) and destination nodes
(DNs), and multiple jammers. Since each pair of ground nodes (SN and DN) is served by one
dedicated UAV, K1 = K2 = K. In this system, we assume that the UAVs need to communicate
with the DNs, and the SNs use SWIPT technology to transmit wireless energy and information
to the UAVs. Specifically, the K SNs store information and energy. First, all SNs
simultaneously send information and energy to the UAV relays; then, the UAV relays use the
collected energy to forward the information to the DNs in decode-and-forward (DF) mode. It
is assumed that the SNs, the UAVs, and the DNs are each equipped with a single antenna, the
jammers are equipped with K antennas, and the jammers' antennas are aimed at the signal
transmission direction of the UAVs [53]. Thus, the jammers, which are far away from the SNs
and closer to the UAVs, interfere with the channel from the UAVs to the DNs.
In order to describe the model in mathematical terms, we introduce a 3D Cartesian
coordinate system. Suppose the locations of SN k and DN k are $w_{S_k} = (x_{S_k}, y_{S_k}, 0)$
and $w_{D_k} = (x_{D_k}, y_{D_k}, 0)$, respectively, $k \in \mathcal{K} = \{1, 2, ..., K\}$. The
system contains multiple jammers, denoted as $j \in \mathcal{J} = \{1, 2, ..., J\}$, and the
location of the j-th jammer is $w_j = (x_j, y_j, 0)$. At the same time, we discretize the UAVs'
mission period T into N time slots of equal length δ, i.e., $\delta = T/N$. Therefore, the
position of UAV k flying at a height Z in any time slot $n \in \mathcal{N} = \{1, 2, ..., N\}$ is
denoted as $q_k[n] = (x_k[n], y_k[n])$. Moreover, we assume that the maximum flight speed of
UAV k is $V_{\max}$. Thus, we have the following:
$$\|q_k[n] - q_k[n-1]\| \le V_{\max}\,\delta, \tag{1}$$
which means that the UAV's speed between two adjacent time slots cannot exceed the
maximum speed, where $\|\cdot\|$ represents the Euclidean norm. In addition, the distance
between any two UAVs needs to be no smaller than a minimum safe distance $D_{\min}$ to avoid
collision and ensure safety. Thus,
$$\|q_k[n] - q_l[n]\|^2 \ge D_{\min}^2, \quad k \ne l. \tag{2}$$
[Figure 1: system model. The diagram shows the UAVs and their trajectories, the source
nodes, the destination nodes, the jammers, and the source-UAV, UAV-destination, and
jammer-UAV channels in x-y-z coordinates.]
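The mobility and collision-avoidance constraints described above can be checked numerically for a candidate set of trajectories. The following is a minimal sketch, not the authors' implementation: the function name `check_trajectories`, the array layout `(K, N, 2)`, and the tolerance `tol` are assumptions made for illustration, and the collision check uses the horizontal distance because all UAVs are assumed to fly at the same height Z.

```python
import numpy as np


def check_trajectories(q, delta, v_max, d_min, tol=1e-9):
    """Check the speed and collision-avoidance constraints for K UAVs.

    q has shape (K, N, 2): horizontal position of UAV k in time slot n.
    Returns (speed_ok, safe_ok).
    """
    q = np.asarray(q, dtype=float)
    # Displacement between adjacent slots must not exceed v_max * delta.
    step = np.linalg.norm(q[:, 1:, :] - q[:, :-1, :], axis=-1)
    speed_ok = bool(np.all(step <= v_max * delta + tol))
    # Distance between two different UAVs in the same slot must be >= d_min.
    diff = q[:, None, :, :] - q[None, :, :, :]
    dist = np.linalg.norm(diff, axis=-1)          # shape (K, K, N)
    different = ~np.eye(q.shape[0], dtype=bool)   # mask out the pairs k == l
    safe_ok = bool(np.all(dist[different] >= d_min - tol))
    return speed_ok, safe_ok
```

For example, two UAVs flying in parallel 10 m apart at 1 m per 1 s slot satisfy both constraints for a 2 m/s speed limit and a 5 m safe distance, and violate the safety constraint if the safe distance is raised to 20 m.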
of energy harvesting efficiency and signal fading, we consider the harvested energy to be
only used for signal processing and transmission. The energy consumed by the motor is
determined by its own battery capacity, which we will discuss later.
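[Diagram: each time slot of length δ is split into a phase of length τδ, in which the
SN→UAV signal is divided by the power-splitting factor into an energy-harvesting part (α)
and a signal reception and processing part (1 − α), and a phase of length (1 − τ)δ, in
which the UAV→DN communication is established.]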
Figure 2. PS protocol.
Based on the above analysis, the UAV-received signal used for information processing
during a time slot can be expressed as follows:
$$y_k^{IN} = \sqrt{(1-\alpha)P_k}\, h_{S_kU_k}\, x + n_{su}, \tag{6}$$
where x denotes a signal sent by the SN, and $n_{su}$ is additive white Gaussian noise (AWGN)
with mean 0 and variance $\sigma_{su}^2$, i.e., $n_{su} \sim \mathcal{CN}(0, \sigma_{su}^2)$.
Therefore, the signal-to-noise ratio is
$$\mathrm{SNR}_k^{IN} = \frac{(1-\alpha)P_k h_{S_kU_k}^2}{\sigma_{su}^2}. \tag{7}$$
It should be noted that in the actual network, the SNR needs to be greater than a
threshold γth1 ; otherwise, the information transmission will be interrupted. Thus, we have
the following:
$$\mathrm{SNR}_k^{IN} \ge \gamma_{th1}, \quad \forall k, \forall n. \tag{9}$$
According to the PS protocol, the energy harvesting time during each time slot is τδ.
Therefore, the energy collected by the UAV in a slot can be expressed as follows:
where η is the energy collection efficiency. It should be noted that the collected energy
cannot exceed the maximum capacity of the supercapacitor. Thus, we have the following:
$$E_k^{EH} \le E_k^{cap}, \tag{11}$$
where $E_k^{cap}$ is the maximum capacity of the supercapacitor. Thus, the UAV's transmission
power during the interval $(1-\tau)\delta$ can be expressed as follows:
$$\mathrm{SINR}_{D_k} = \frac{P_{U_k}[n]\, h_{U_kD_k}^2[n]}{\sum_{j=1}^{J} P_j h_{jU_k}^2[n] + \sigma_{ud}^2} \ge \gamma_{th2}, \quad \forall n, \forall k, \tag{13}$$
where $\sigma_{ud}^2$ is the noise power, and $P_j$ is the interference power. Thus, the
achievable rate from the UAV to the DN is as follows:
where v means the UAV’s speed, PB and PI are the blade profile and induced powers,
respectively, when the UAV is hovering. vtip represents the tip speed of the rotor blade,
and v0 is the mean rotor-induced velocity. In addition, d0 is the fuselage drag ratio, ρ is the
air density, s is rotor solidity, and A0 is the rotor disc area. Therefore, we can get the sum of
the energy consumption of the UAV in a mission period T by
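$$E_{UAV}(v) = \int_0^T P_I\left(\sqrt{1+\frac{v^4}{4v_0^4}} - \frac{v^2}{2v_0^2}\right)^{1/2} dt + \int_0^T \left[P_B\left(1+\frac{3v^2}{v_{tip}^2}\right) + \frac{1}{2}d_0\rho s A_0 v^3\right] dt. \tag{17}$$
From the definition of time slot δ, we define the UAV's speed as
$v = \|q_k[n] - q_k[n-1]\|/\delta$. Thus, we can rewrite $E_{UAV}$ as
$$E_{UAV}(\Delta q) = \sum_{n=2}^{N} P_I\left(\sqrt{\delta^4+\frac{\Delta q^4}{4v_0^4}} - \frac{\Delta q^2}{2v_0^2}\right)^{1/2} + \sum_{n=2}^{N}\left[P_B\left(\delta+\frac{3\Delta q^2}{\delta v_{tip}^2}\right) + \frac{1}{2}d_0\rho s A_0\frac{\Delta q^3}{\delta^2}\right], \tag{18}$$
where $\Delta q = \|q_k[n] - q_k[n-1]\|$. In summary, we get the energy consumption expression
of the rotary-wing UAV.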
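As an illustration of the discretized energy expression (18), the sketch below evaluates the propulsion energy of one UAV along a given trajectory. It is a minimal example under stated assumptions: the function name `propulsion_energy` and the argument layout are hypothetical, and the second term under the root is taken as Δq²/(2v0²), which is the form consistent with constraint (40).

```python
import numpy as np


def propulsion_energy(q, delta, P_B, P_I, v_tip, v_0, d_0, rho, s, A_0):
    """Propulsion energy of one rotary-wing UAV over slots n = 2..N.

    q has shape (N, 2): horizontal position of the UAV in each time slot.
    """
    q = np.asarray(q, dtype=float)
    dq = np.linalg.norm(q[1:] - q[:-1], axis=1)   # displacement per slot
    # Induced term: P_I * (sqrt(delta^4 + dq^4/(4 v0^4)) - dq^2/(2 v0^2))^(1/2).
    inner = np.sqrt(delta**4 + dq**4 / (4 * v_0**4)) - dq**2 / (2 * v_0**2)
    induced = P_I * np.sqrt(np.maximum(inner, 0.0))  # guard against round-off
    # Blade-profile term: P_B * (delta + 3 dq^2 / (delta v_tip^2)).
    profile = P_B * (delta + 3 * dq**2 / (delta * v_tip**2))
    # Parasite (fuselage drag) term: 0.5 d0 rho s A0 dq^3 / delta^2.
    parasite = 0.5 * d_0 * rho * s * A_0 * dq**3 / delta**2
    return float(np.sum(induced + profile + parasite))
```

A quick sanity check is the hovering case: with Δq = 0 in every slot, each slot costs (P_B + P_I)δ, which matches the statement that P_B and P_I are the blade profile and induced powers when the UAV is hovering.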
$$R_{D_k} = \sum_{n=1}^{N} (1-\tau)\log_2\left(1+\mathrm{SINR}_{D_k}\right). \tag{19}$$
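The throughput in (19) can be evaluated for a candidate trajectory as sketched below. This is an illustrative sketch rather than the paper's code: the function name `dn_throughput` is hypothetical, the channel gains are assumed to follow h² = β/d² with d the horizontal distance (the form that appears in (22d)–(22f)), and the UAV transmit power is assumed to be ηαP_k h²_{S_kU_k} τ/(1 − τ), as implied by (22c)–(22d) and (50). Positions must not coincide with a node location, since the distance would then be zero.

```python
import numpy as np


def dn_throughput(q, w_s, w_d, w_j, P_k, P_j, alpha, tau, eta,
                  beta_su, beta_ud, beta_ju, sigma2_ud):
    """Sum over slots of (1 - tau) * log2(1 + SINR) for one UAV and its DN.

    q: (N, 2) UAV positions; w_s, w_d: (2,) SN and DN positions;
    w_j: (J, 2) jammer positions; P_j: (J,) jammer powers.
    alpha and tau may be scalars or arrays of length N.
    """
    q = np.asarray(q, dtype=float)
    w_s = np.asarray(w_s, dtype=float)
    w_d = np.asarray(w_d, dtype=float)
    w_j = np.asarray(w_j, dtype=float).reshape(-1, 2)
    P_j = np.asarray(P_j, dtype=float)
    # Assumed channel model: gain = beta / squared distance.
    h2_su = beta_su / np.sum((q - w_s) ** 2, axis=1)
    h2_ud = beta_ud / np.sum((q - w_d) ** 2, axis=1)
    h2_ju = beta_ju / np.sum((q[:, None, :] - w_j[None, :, :]) ** 2, axis=2)
    # Assumed transmit power: energy harvested in tau*delta, spent in (1-tau)*delta.
    p_u = eta * alpha * P_k * h2_su * tau / (1 - tau)
    sinr = p_u * h2_ud / (h2_ju @ P_j + sigma2_ud)
    return float(np.sum((1 - tau) * np.log2(1 + sinr)))
```

With a single slot, unit β values, and a noise power chosen so that the SINR equals 1 without jamming, the function returns (1 − τ)·log2(2) = 0.5 for τ = 0.5, and switching the jammer on lowers the value.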
Let μ denote the minimum throughput of the DNs, i.e., $\mu = \min_{k \in \mathcal{K}} R_{D_k}$,
and define $Q = \{q_k[n], \forall k, \forall n\}$, $\tau = \{\tau_k[n], \forall k, \forall n\}$,
$\alpha = \{\alpha_k[n], \forall k, \forall n\}$, $E_{th}$ as the UAV's energy budget, and
$q_{start}$ and $q_{end}$ as the start and end points of the UAV. The joint trajectory
planning, time, and power allocation optimization problem can be formulated as
$$(\mathrm{P1}):\ \max_{\{Q,\tau,\alpha,\mu\}} \mu \tag{20a}$$
Note that problem P1 is difficult to solve directly since (2), (11), (13), (15), (20c), and
(20d) are non-convex. In the next section, we propose an efficient iterative algorithm to
obtain a feasible solution to the original problem.
3. Joint Optimization
Since P1 is a non-convex problem, it is difficult to solve directly. In this section, we
divide P1 into three subproblems and obtain suboptimal solutions by applying a successive
convex approximation and a slack variables method. Then, we develop an overall iterative
algorithm based on the block coordinate descent technique to get a locally optimal solution.
The specific flow chart is shown in Figure 3.
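[Figure 3: flow chart of the overall algorithm. Start; initialize the parameters and set
an initial feasible point (Q^i, α^i, and τ^i); decompose the original problem into three
subproblems; apply block coordinate descent; if not converged, repeat; otherwise, end.]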
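The overall structure of the iteration (update one block of variables while the others are fixed, then check convergence of the objective) can be sketched on a toy problem. The example below is only an illustration of block coordinate updates, not the paper's algorithm: it maximizes a concave quadratic whose blocks have closed-form maximizers, whereas the subproblems for Q, α, and τ in this paper are convexified by SCA and solved with a convex solver. The function name and the quadratic objective are assumptions.

```python
import numpy as np


def block_coordinate_ascent(H, b, x0, eps=1e-8, max_iter=1000):
    """Maximize f(x) = b^T x - 0.5 x^T H x (H symmetric positive definite)
    by exactly maximizing over one coordinate block at a time."""
    H = np.asarray(H, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.array(x0, dtype=float)

    def objective(v):
        return float(b @ v - 0.5 * v @ H @ v)

    history = [objective(x)]
    for _ in range(max_iter):
        for i in range(len(x)):
            # Maximizer over block i with all other blocks held fixed.
            others = H[i] @ x - H[i, i] * x[i]
            x[i] = (b[i] - others) / H[i, i]
        history.append(objective(x))
        # Stop when one full sweep no longer improves the objective.
        if history[-1] - history[-2] <= eps:
            break
    return x, history
```

Because every block update is an exact maximization, the objective values in `history` never decrease, which mirrors the monotonicity argument used later for the convergence of the overall algorithm.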
$$(\mathrm{P2}):\ \max_{\{Q,\mu\}} \mu \tag{21a}$$
$$\text{s.t. } (1), (2), (9), (11), (13), (15), (20c), (20d), (20g), (20h). \tag{21b}$$
Note that the problem P2 is intractable due to the non-convexity of (2), (11), (13), (15),
(20c), and (20d). To tackle this issue, we first introduce slack variables {A[n], B[n], C[n]}.
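$$(\mathrm{P3}):\ \max_{\{q[n],\mu,A[n],\cdots,D[n]\}} \mu \tag{22a}$$
$$\text{s.t. } (1), (2), (9), (11), (15), (20c), (20g), (20h), \tag{22b}$$
$$\mu \le \sum_{n=2}^{N} (1-\tau_k[n]) \log_2\left(1+\frac{1}{A_k[n]B_k[n]C_k[n]}\right), \tag{22c}$$
$$A_k[n] \ge \left(\frac{\eta\alpha P_k\beta_{su}\tau_k[n]}{1-\tau_k[n]}\right)^{-1}\|q_k[n]-w_{S_k}\|^2, \quad \forall k, \forall n, \tag{22d}$$
$$B_k[n] \ge \beta_{ud}^{-1}\|q_k[n]-w_{D_k}\|^2, \quad \forall k, \forall n, \tag{22e}$$
$$C_k[n] \ge \sum_{j=1}^{J} P_j\beta_{Ju}\|q_k[n]-w_j\|^{-2} + \sigma_{ud}^2, \quad \forall k, \forall n, \tag{22f}$$
$$\frac{1}{A_k[n]B_k[n]C_k[n]} \ge \gamma_{th2}, \quad \forall k, \forall n. \tag{22g}$$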
Proof. The theorem can be proved by contradiction. Specifically, if (22d)–(22f) hold with
equality, problem P3 is equivalent to P2. Otherwise, by adjusting the slack variables, the
value of the objective can always be further improved.
However, P3 is still difficult to solve because (15), (20c), and (22f) are non-convex
constraints, and the left-hand sides (LHS) of (2) and (22g) and the right-hand side (RHS)
of (22c) are convex. Recall that any convex function is globally lower-bounded by its
first-order Taylor expansion at any point [55]. Therefore, taking the Taylor expansion as a
lower bound, we can obtain the following:
$$\log_2\left(1+\frac{1}{A[n]B[n]C[n]}\right) \ge \log_2\left(1+\frac{1}{A^iB^iC^i}\right) - \frac{A[n]-A^i}{A^i\left(1+A^iB^iC^i\right)\ln 2} - \frac{B[n]-B^i}{B^i\left(1+A^iB^iC^i\right)\ln 2} - \frac{C[n]-C^i}{C^i\left(1+A^iB^iC^i\right)\ln 2} \tag{23}$$
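The lower bound in (23) can be verified numerically. The sketch below, with hypothetical function names, evaluates the rate term and its first-order Taylor expansion at an expansion point (A^i, B^i, C^i); all slack values are assumed to be positive.

```python
import numpy as np


def rate_term(A, B, C):
    """log2(1 + 1/(A*B*C)) for positive slack values A, B, C."""
    return np.log2(1.0 + 1.0 / (A * B * C))


def rate_lower_bound(A, B, C, Ai, Bi, Ci):
    """First-order Taylor expansion of rate_term at the point (Ai, Bi, Ci)."""
    denom = (1.0 + Ai * Bi * Ci) * np.log(2.0)
    return (rate_term(Ai, Bi, Ci)
            - (A - Ai) / (Ai * denom)
            - (B - Bi) / (Bi * denom)
            - (C - Ci) / (Ci * denom))
```

The bound is tight at the expansion point and never exceeds the true value elsewhere, which is what allows the SCA step to replace the RHS of (22c) with a concave (here affine) surrogate.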
However, we notice that the RHS of (26b) is convex with respect to trajectory Q, thus,
(26b) is still a non-convex constraint. Relying on the first-order Taylor expansion, we have
the lower bound as
$$\|q_k[n]-w_j\|^2 \ge \|q_k^i[n]-w_j\|^2 + 2\left(q_k^i[n]-w_j\right)^T\left(q_k[n]-q_k^i[n]\right) = E_{k,j}[n], \quad \forall k, \forall n, \forall j. \tag{27}$$
For the non-convex constraint (22g), since the LHS of (22g) is convex, we apply the
first-order Taylor expansion to get the lower bound at the i-th iteration point:
$$\frac{1}{A[n]B[n]C[n]} \ge \frac{1}{A^iB^iC^i} - \frac{A[n]-A^i}{(A^i)^2B^iC^i} - \frac{B[n]-B^i}{A^i(B^i)^2C^i} - \frac{C[n]-C^i}{A^iB^i(C^i)^2} \ge \gamma_{th2}. \tag{29}$$
For the LHS of (2), we can obtain the lower bound according to the first-order Taylor
expansion as
$$\|q_k[n]-q_l[n]\|^2 \ge -\|q_k^i[n]-q_l^i[n]\|^2 + 2\left(q_k^i[n]-q_l^i[n]\right)^T\left(q_k[n]-q_l[n]\right). \tag{30}$$
For the information-causality constraint (15), by introducing slack variables {F, G, H, I},
we have the following:
$$\sum_{t=1}^{n} \tau \log_2\left(1+\frac{\eta\alpha P_k\tau}{(1-\tau)F_k[t]G_k[t]H_k[t]}\right) \le \sum_{t=1}^{n} (1-\tau)\log_2\left(1+\frac{(1-\alpha)P_k}{\sigma^2 I_k[t]}\right), \quad n = 1, ..., N, \ \forall k. \tag{32a}$$
$$F_k[n] \le \sum_{j=1}^{J} P_j h_{jU_k}^2[n] + \sigma_{ud}^2, \quad n = 1, ..., N. \tag{32b}$$
$$G_k[n] \le \beta_{su}^{-1}\|q_k[n]-w_{S_k}\|^2, \quad n = 1, ..., N. \tag{32c}$$
$$H_k[n] \le \beta_{ud}^{-1}\|q_k[n]-w_{D_k}\|^2, \quad n = 1, ..., N. \tag{32d}$$
$$I_k[n] \ge \beta_{su}^{-1}\|q_k[n]-w_{S_k}\|^2, \quad n = 1, ..., N. \tag{32e}$$
Since the RHS of (32a) is convex with respect to I, relying on the first-order Taylor
expansion, we have the following:
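$$\log_2\left(1+\frac{(1-\alpha)P_k}{\sigma^2 I_k}\right) \ge \log_2\left(1+\frac{(1-\alpha)P_k}{\sigma^2 I_k^i}\right) - \frac{(1-\alpha)P_k\left(I_k-I_k^i\right)}{\left(\sigma^2 (I_k^i)^2 + (1-\alpha)P_k I_k^i\right)\ln 2} = L \tag{33}$$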
$$F_k[n] \le \sum_{j=1}^{J} P_j M_{k,j}[n] + \sigma_{ud}^2, \quad \forall k, \forall n. \tag{35a}$$
$$\frac{1}{M_{k,j}[n]} \ge \beta_{Ju}^{-1}\|q_k[n]-w_j\|^2, \quad \forall k, \forall n, \forall j. \tag{35b}$$
Similar to (27), for the non-convex constraints (32c) and (32d), we have:
$$\beta_{su}^{-1}\|q_k^i[n]-w_{S_k}\|^2 + 2\beta_{su}^{-1}\left(q_k^i[n]-w_{S_k}\right)^T\left(q_k[n]-q_k^i[n]\right) \ge G_k[n], \quad \forall k, \forall n. \tag{37a}$$
$$\beta_{ud}^{-1}\|q_k^i[n]-w_{D_k}\|^2 + 2\beta_{ud}^{-1}\left(q_k^i[n]-w_{D_k}\right)^T\left(q_k[n]-q_k^i[n]\right) \ge H_k[n], \quad \forall k, \forall n. \tag{37b}$$
$$\frac{\eta\alpha P_k\beta_{su}\tau\delta}{E_k^{cap}} \le \|q_k^i[n]-w_{S_k}\|^2 + 2\left(q_k^i[n]-w_{S_k}\right)^T\left(q_k[n]-q_k^i[n]\right), \quad \forall k, \forall n. \tag{38}$$
Since the energy constraint expression (20c) is very complex and difficult to solve
directly, in order to facilitate the analysis, we introduce a slack variable O as follows:
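$$O_k[n] \ge \left(\sqrt{\delta^4+\frac{\Delta q_k^4}{4v_0^4}} - \frac{\Delta q_k^2}{2v_0^2}\right)^{1/2}, \tag{39}$$
which is equivalent to
$$O_k^2[n] + \frac{\Delta q_k^2}{v_0^2} \ge \frac{\delta^4}{O_k^2[n]}, \quad n = 2, ..., N, \ \forall k. \tag{40}$$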
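For completeness, the step from (39) to (40) can be written out as follows. This is a short worked derivation under the assumption that $O_k[n] > 0$ and that the second term under the outer root in (39) is $\Delta q_k^2/(2v_0^2)$, which is the form that yields (40); the shorthand $a$ is introduced here only for readability.

```latex
\begin{align*}
a := \frac{\Delta q_k^2}{2v_0^2}, \qquad
O_k[n] \ge \left(\sqrt{\delta^4 + a^2} - a\right)^{1/2}
  &\iff O_k^2[n] + a \ge \sqrt{\delta^4 + a^2} \\
  &\iff O_k^4[n] + 2a\,O_k^2[n] \ge \delta^4 \\
  &\iff O_k^2[n] + \frac{\Delta q_k^2}{v_0^2} \ge \frac{\delta^4}{O_k^2[n]}.
\end{align*}
```

Each step squares or divides by a positive quantity, so the direction of the inequality is preserved.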
Therefore, the energy consumption of the UAV k can be equivalently expressed as follows:
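$$E_{th} \ge E_k(\Delta q_k, O_k[n]) = \sum_{n=2}^{N}\left[P_B\left(\delta+\frac{3\Delta q^2}{\delta v_{tip}^2}\right) + \frac{1}{2}d_0\rho s A_0\frac{\Delta q^3}{\delta^2}\right] + \sum_{n=2}^{N} P_I O_k[n] \ge E_{UAV}(\Delta q_k), \quad \forall k. \tag{41}$$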
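$$(\mathrm{P4}):\ \max_{\{Q,A,\ldots,O,\mu\}} \mu \tag{43a}$$
$$\text{s.t. } (1), (9), (20g), (20h), (22d), (22e), (25), (26a), (28), (29), (31), (32e), (34), (35a), (36), (37a), (37b), (38), (41), (42). \tag{43b}$$
$$(\mathrm{P5}):\ \max_{\{\alpha,\mu\}} \mu \tag{44a}$$
$$(\mathrm{P6}):\ \max_{\{\alpha,\mu\}} \mu \tag{47a}$$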
P6 is also a convex optimization problem that can be solved like P4. Additionally,
the optimal objective value obtained from P6 usually serves as a lower bound of P5.
$$(\mathrm{P7}):\ \max_{\{\tau,\mu\}} \mu \tag{48a}$$
$$(\mathrm{P8}):\ \max_{\{\tau,\mu\}} \mu \tag{49a}$$
where
$$R_k[n] = \frac{\alpha_k[n]\,\eta P_k h_{S_kU_k}^2[n]\, h_{U_kD_k}^2[n]}{\sum_{j=1}^{J} P_j h_{jU_k}^2[n] + \sigma_{ud}^2}. \tag{50}$$
Since (15) and (49c) are non-convex, P8 is difficult to solve directly. To this end, we
introduce slack variables to solve this problem.
$$(\mathrm{P9}):\ \max_{\{\tau,\mu,U,\ldots,X\}} \mu \tag{51a}$$
$$UV = \frac{1}{2}(U+V)^2 - \frac{1}{2}\left(U^2+V^2\right) \tag{54}$$
Since the first term in the RHS of (54) is convex, we can obtain a lower bound for (54)
by using the first-order Taylor expansion, that is,
$$\frac{1}{2}(U+V)^2 - \frac{1}{2}\left(U^2+V^2\right) \ge \left(U^i+V^i\right)(U+V) - \frac{1}{2}\left(U^i+V^i\right)^2 - \frac{1}{2}\left(U^2+V^2\right) = Y. \tag{55}$$
Thus, (51c) can be rewritten as
$$\mu \le \sum_{n=2}^{N} Y. \tag{56}$$
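The surrogate Y in (55) can be checked with a few lines of code. This is a minimal sketch with a hypothetical function name; it evaluates the lower bound of the bilinear term UV at the expansion point (U^i, V^i).

```python
def bilinear_lower_bound(U, V, Ui, Vi):
    """Concave lower bound Y of the product U*V, expanded at (Ui, Vi).

    The convex part 0.5*(U+V)^2 is linearized at Ui + Vi, while the
    concave part -0.5*(U^2 + V^2) is kept as it is.
    """
    s_i = Ui + Vi
    return s_i * (U + V) - 0.5 * s_i ** 2 - 0.5 * (U ** 2 + V ** 2)
```

The gap between UV and Y equals ½((U + V) − (U^i + V^i))², so the bound is exact at the expansion point and loosens quadratically as the iterate moves away from it.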
Furthermore, the RHS of (57b) is convex on the domain (τ ∈ [0, 1]). Thus, we have
the following:
$$\frac{\tau_k[n]}{1-\tau_k[n]} \ge \frac{1}{1-\tau_k^i[n]} - 1 + \frac{\tau_k[n]-\tau_k^i[n]}{\left(1-\tau_k^i[n]\right)^2} = \Gamma. \tag{58}$$
Thus, (57b) can be rewritten as
$$0 \le Z \le \Gamma. \tag{59}$$
Similar to the procedure of handling (51e), for the non-convex constraint (51g), by
introducing the slack variable Λ, we have
$$X_k[n] \ge \log_2\left(1+\frac{R_k[n]}{\Lambda_k[n]}\right), \tag{60a}$$
$$0 \le \Lambda_k[n] \le \frac{1-\tau_k[n]}{\tau_k[n]}. \tag{60b}$$
$$\text{s.t. } (11), (13), (20e), (51d), (51f), (56), (57a), (59), (60a), (61), (62). \tag{63b}$$
P10 is also a convex optimization problem that can be solved like P6. In addition,
the optimal objective value obtained from P10 usually serves as a lower bound of P9.
point is the Taylor expansion point within the feasible region. Then, the convergence is
proved as follows:
where (a) holds because in Algorithm 1, problem P4 is solved to obtain the optimal solution
$Q^{i+1}$ with given $\alpha^i$ and $\tau^i$ at step 3; (b) holds because problem P6 is solved to
obtain the optimal solution $\alpha^{i+1}$ with given $Q^{i+1}$ and $\tau^i$ at step 4; (c) holds
because problem P10 is solved to obtain the optimal solution $\tau^{i+1}$ with given $Q^{i+1}$
and $\alpha^{i+1}$ at step 5; and (d) holds because the optimal objective values of P4, P6, and
P10 are upper bounded by that of the original problem P1. Hence, convergence is guaranteed.
Finally, we briefly analyze the overall complexity of the algorithm. According to
Algorithm 1, the complexity of the algorithm is mainly dominated by steps 3, 4, and 5,
and the number of optimization variables scales with K, J, and N. Hence, the total
computational complexity is $O((KJN)^{3.5}\log(1/\varepsilon))$, where K is the number of
UAVs, J is the number of jammers, N is the number of time slots, and ε is the convergence
accuracy. In addition, it should be noted that the proposed scheme is an offline algorithm:
path planning and resource allocation are computed at a ground station (for example, one
running QGroundControl on Linux) before the mission is executed, so the algorithm does not
need to run on the UAVs.
4. Simulation Results
In this section, simulation results and some detailed discussions are provided. We
first present the simulation settings and then analyze the effect of different energy budgets
and the number of jammers on the experimental results. Finally, we compare the proposed
algorithm with four baseline schemes to further illustrate the superiority of the joint
trajectory planning, time, and power allocation scheme.
[Figure: trajectories of UAVs 1–4 for energy budgets E_th = 10,000 J, 15,000 J, and
20,000 J, with sources 1–4, destinations 1–4, and jammers 1 and 2; axes: x (m), y (m).]
[Figure: speed (m/s) of UAVs 1–4 versus time (s) for E_th = 10,000 J, 15,000 J, and
20,000 J.]
Figures 6 and 7 illustrate the variation of the power-splitting factor α and the
time-allocation factor τ. It can be seen that α increased with time, and τ first increased
and then decreased with time. This is because the UAVs moved away from the jammers
and approached optimal points over time, at which point the SNs needed to consume less
energy to ensure the SNR threshold constraint. As for τ, in the process of approaching
the optimal points, τ first increased to ensure that enough energy was collected. When
returning to the endpoints, in order to ensure the communication quality of the DNs, τ
decreased to improve the throughput of the DNs.
Figure 8 presents the achievable throughput over every time slot. It is shown that the
throughput increased first and then decreased. This was because the UAVs were initially far
away from the jammers and closer to the optimal points, thereby collecting enough energy
to increase the throughput. When returning to the endpoints, the throughput dropped as
the UAVs moved away from the optimal points and closer to the jammers. Moreover, we
noticed that the larger the UAV’s energy budget, the greater the achievable throughput,
which was in line with expectations.
[Figure 6: power-splitting factor α of UAVs 1–4 versus time (s) for E_th = 10,000 J,
15,000 J, and 20,000 J.]
[Figure 7: time-allocation factor τ of UAVs 1–4 versus time (s) for E_th = 10,000 J,
15,000 J, and 20,000 J.]
[Figure 8: achievable throughput (bps/Hz) of UAVs 1–4 versus time (s) for E_th = 10,000 J,
15,000 J, and 20,000 J.]
[Figure: achievable throughput (bps/Hz) versus time (s) for K = 2, 4, 6, and 8 and
E_th = 10,000 J, 15,000 J, and 20,000 J.]
[Figure: trajectories of UAVs 1–4 for J = 2, 3, and 4 jammers, with sources, destinations,
and jammers marked; axes: x (m), y (m).]
[Figure: achievable throughput of 4 UAVs (bits/Hz) versus number of jammers, shown for
UAVs 1–4.]
[Figure: achievable throughput (bits/Hz) versus number of jammers for K = 2, 4, 6, and 8.]
• Scheme 1: Our proposed joint trajectory planning, time, and power allocation scheme.
• Scheme 2: Optimizing the power-splitting factor α and UAV’s trajectory Q under the
fixed time-allocation factor τ.
• Scheme 3: Optimizing the time-allocation factor τ and UAV’s trajectory Q under the
fixed power-splitting factor α.
• Scheme 4: Optimizing the UAV’s trajectory Q under the fixed time-allocation factor τ
and power-splitting factor α.
• Scheme 5: Optimizing the power-splitting factor α and time-allocation factor τ under
circular trajectory.
We evaluated the average throughput of the two UAVs, as shown in Figure 13. For dy-
namic schemes 1–5, the throughput of the system first increased over time and then de-
creased as the UAVs moved away from the optimal positions and returned to the endpoints.
Also, we noticed that scheme 5 had the worst performance since the circular trajectory
had been set in advance. Moreover, at the best time slot, the average throughput of the
proposed scheme 1 was two times higher than schemes 2 and 3.
[Figure 13: average throughput (bps/Hz) versus time (s) for schemes 1–5.]
Figure 14 shows the average throughput with different energy budgets. It can be
seen from Figure 14 that the average throughput of schemes 1–4 increased with the energy
budget. For scheme 5, since the flight trajectory had been set in advance, increasing the
energy budget did not bring about an improvement in average throughput. In addition, we
observed that the proposed scheme 1 had the best performance, and the average throughput
was increased by 40%, 50%, 150%, and 550% compared with schemes 2–5, respectively.
Figure 15 shows the average throughput of the system with different numbers of jammers.
Consistent with our expectations, the average throughput of all schemes decreased
as the number of jammers increased. However, in comparison, the proposed scheme 1 had
the best performance. Even in the extreme case with 6 jammers, the throughput of scheme
1 was still improved by 26%, 33%, 160%, and 500% compared with schemes 2–5.
[Figure 14: average throughput of schemes 1–5 versus UAV energy budget (×10^4 J).]
[Figure 15: average throughput of 2 UAVs (bps/Hz) versus number of jammers for
schemes 1–5.]
5. Conclusions
This paper investigated joint trajectory planning, time, and power resource allocation
to maximize the throughput in UAV networks. Considering the limited energy budget
of UAVs and the existence of multiple jammers, we introduced SWIPT technology to
improve channel quality. Our goal was to maximize the throughput of the DNs. Since
the original problem is non-convex, taking into account the actual flight constraints of the
UAVs, we proposed an efficient joint optimization algorithm based on successive convex
approximations, a block coordinate descent, and the slack variables method to obtain
a suboptimal solution. Simulation results corroborated that the proposed scheme can
significantly improve the channel throughput and illustrated the effectiveness of joint
trajectory planning, time, and power allocation in mitigating interference. Finally, we
compared the proposed scheme with four benchmark schemes to highlight the superiority
of our study. In future work, we will consider UAV scenarios with mobile nodes, more
complex channel models, and/or scheduling schemes such as multi-UAV coordination and
multi-point access.
Author Contributions: Conceptualisation, K.W.; methodology, J.X.; formal analysis, P.L.; investiga-
tion, H.C.; writing—original draft preparation, J.X.; writing—review and editing, K.W.; validation,
X.L.; supervision, K.L. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported in part by the National Natural Science Foundation of China under
Grant 62172313, in part by the Natural Science Foundation of Hunan Province under Grant 2021JJ20054, in
part by the National Key Research and Development Program of China under Grant 2021YFB3901503,
in part by the National Natural Science Foundation of China under Grant 62001336, and in part by
the open research fund of Integrated Computing and Chip Security Sichuan Collaborative Innovation
Center of Chengdu University of Information Technology under Grant CXPAQ202204.
Data Availability Statement: Not Applicable.
Conflicts of Interest: The authors declare no conflict of interest.
Notations
The following notations are used in this manuscript:
Notation        Definition
w_{S_k}         Location of the SN
w_{D_k}         Location of the DN
w_j             Location of the jammer
Z               Height of the UAV
q_k             Location of the UAV
T               Total task time
N               Number of time slots
δ               Duration of each time slot
V_max           The maximum speed of the UAV
D_min           The minimum safe distance
h_{S_kU_k}      The channel-power gain between the SN and the UAV
h_{jU_k}        The channel-power gain between a jammer and a UAV
h_{U_kD_k}      The channel-power gain between a UAV and the DN
P_k             The transmit power of the SN
n_su            Additive white Gaussian noise
α               Power-splitting factor
τ               Time-allocation factor
η               Energy collection efficiency
P_B             The blade profile power
P_I             The induced power
v_tip           The tip speed of the rotor blade
v_0             The mean rotor-induced velocity
d_0             The fuselage drag ratio
ρ               The air density
s               The rotor solidity
A_0             The rotor disc area
References
1. Xiang, C.; Li, Y.; Zhou, Y.; He, S.; Qu, Y.; Li, Z.; Gong, L.; Chen, C. A Comparative Approach to Resurrecting the Market of MOD
Vehicular Crowdsensing. In Proceedings of the IEEE INFOCOM 2022—IEEE Conference on Computer Communications, London,
UK, 2–5 May 2022; pp. 1479–1488.
2. Xiang, C.; Zhang, Z.; Qu, Y.; Lu, D.; Fan, X.; Yang, P.; Wu, F. Edge computing-empowered large-scale traffic data recovery
leveraging low-rank theory. IEEE Trans. Netw. Sci. Eng. 2020, 7, 2205–2218. [CrossRef]
3. Ma, B.; Ren, Z.; Cheng, W. Traffic Routing-Based Computation Offloading in Cybertwin-Driven Internet of Vehicles for V2X
Applications. IEEE Trans. Veh. Technol. 2021, 71, 4551–4560. [CrossRef]
4. Zhao, N.; Lu, W.; Sheng, M.; Chen, Y.; Tang, J.; Yu, F.R.; Wong, K.K. UAV-assisted emergency networks in disasters. IEEE Wirel.
Commun. 2019, 26, 45–51. [CrossRef]
5. Jiang, X.; Sheng, M.; Zhao, N.; Xing, C.; Lu, W.; Wang, X. Green UAV communications for 6G: A survey. Chin. J. Aeronaut. 2022,
35, 19–34. [CrossRef]
6. Tran, D.H.; Vu, T.X.; Chatzinotas, S.; ShahbazPanahi, S.; Ottersten, B. Coarse trajectory design for energy minimization in
UAV-enabled. IEEE Trans. Veh. Technol. 2020, 69, 9483–9496. [CrossRef]
7. Miao, J.; Wang, P. Power Control for Multi-UAV Location-aware Wireless Powered Communication Networks. In Proceedings of
the 2020 IEEE/CIC International Conference on Communications in China (ICCC), Xiamen, China, 28–30 July 2020; pp. 225–230.
8. An, K.; Liang, T.; Zheng, G.; Yan, X.; Li, Y.; Chatzinotas, S. Performance limits of cognitive-uplink FSS and terrestrial FS for
Ka-band. IEEE Trans. Aerosp. Electron. Syst. 2018, 55, 2604–2611. [CrossRef]
9. An, K.; Lin, M.; Ouyang, J.; Zhu, W.-P. Secure Transmission in Cognitive Satellite Terrestrial Networks. IEEE J. Sel. Areas Commun.
2016, 34, 3025–3037. [CrossRef]
10. Zeng, Y.; Zhang, R. Energy-Efficient UAV Communication with Trajectory Optimization. IEEE Trans. Wirel. Commun. 2017, 16,
3747–3760. [CrossRef]
11. Lin, N.; Liu, Y.; Zhao, L.; Wu, D.O.; Wang, Y. An Adaptive UAV Deployment Scheme for Emergency Networking. IEEE Trans.
Wirel. Commun. 2022, 21, 2383–2398. [CrossRef]
12. Ma, B.; Ren, Z.; Cheng, W. Credibility Computation Offloading Based Task-Driven Routing Strategy for Emergency UAVs
Network. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Rio de Janeiro, Brazil, 4–8
December 2021; pp. 1–6.
13. Fu, Y.; Li, D.; Tang, Q.; Zhou, S. Joint Speed and Bandwidth Optimized Strategy of UAV-Assisted Data Collection in Post-Disaster
Areas. In Proceedings of the 2022 20th Mediterranean Communication and Computer Networking Conference (MedComNet),
Pafos, Cyprus, 1–3 June 2022; pp. 39–42.
14. Peer, M.; Bohara, V.A.; Srivastava, A. Multi-UAV Placement Strategy for Disaster-Resilient Communication Network. In
Proceedings of the 2020 IEEE 92nd Vehicular Technology Conference (VTC2020-Fall), Virtual, 18 November–16 December 2020;
pp. 1–7.
15. Kim, Y.H.; Chowdhury, I.A.; Song, I. Design and Analysis of UAV-Assisted Relaying with Simultaneous Wireless Information and
Power Transfer. IEEE Access 2020, 8, 27874–27886. [CrossRef]
16. Zhan, C.; Hu, H.; Wang, Z.; Fan, R.; Niyato, D. Unmanned Aircraft System Aided Adaptive Video Streaming: A Joint Optimization
Approach. IEEE Trans. Multimed. 2020, 22, 795–807. [CrossRef]
17. Sun, Z.; Yang, D.; Xiao, L.; Cuthbert, L.; Wu, F.; Zhu, Y. Joint Energy and Trajectory Optimization for UAV-Enabled Relaying
Network with Multi-Pair Users. IEEE Trans. Cogn. Commun. Netw. 2021, 7, 939–954. [CrossRef]
18. Gao, Y.; Wu, Y.; Cui, Z.; Yang, W.; Hu, G.; Xu, S. Robust trajectory and communication design for angle-constrained multi-UAV
communications in the presence of jammers. China Commun. 2022, 19, 131–147. [CrossRef]
19. Feng, Z.; Ren, G.; Chen, J.; Zhang, X.; Luo, Y.; Wang, M.; Xu, Y. Power control in relay-assisted anti-jamming systems: A Bayesian
three-layer Stackelberg game approach. IEEE Access 2019, 7, 14623–14636. [CrossRef]
20. Xu, Y.; Ren, G.; Chen, J.; Zhang, X.; Jia, L.; Feng, Z.; Xu, Y. Joint Power and Trajectory Optimization in UAV Anti-Jamming
Communication Networks. In Proceedings of the ICC 2019—2019 IEEE International Conference on Communications (ICC),
Shanghai, China, 20–24 May 2019; pp. 1–5.
21. Wu, Y.; Yang, W.; Guan, X.; Wu, Q. UAV-Enabled Relay Communication Under Malicious Jamming: Joint Trajectory and Transmit
Power Optimization. IEEE Trans. Veh. Technol. 2021, 70, 8275–8279. [CrossRef]
22. Lin, Z.; An, K.; Niu, H.; Hu, Y.; Chatzinotas, S.; Zheng, G.; Wang, J. SLNR-based Secure Energy Efficient Beamforming in
Multibeam Satellite Systems. IEEE Trans. Aerosp. Electron. Syst. 2022. [CrossRef]
23. Song, X.; Chang, Z.; Guo, X.; Wu, P.; Hämäläinen, T. Energy Efficient Optimization for Solar-Powered UAV Communications
System. In Proceedings of the 2021 IEEE International Conference on Communications Workshops (ICC Workshops), Montreal,
QC, Canada, 14–23 June 2021; pp. 1–6.
24. Sekander, S.; Tabassum, H.; Hossain, E. Statistical Performance Modeling of Solar and Wind-Powered UAV Communications.
IEEE Trans. Mob. Comput. 2021, 20, 2686–2700. [CrossRef]
25. Mamaghani, M.T.; Hong, Y. Improving PHY-Security of UAV-Enabled Transmission with Wireless Energy Harvesting: Robust
Trajectory Design and Communications Resource Allocation. IEEE Trans. Veh. Technol. 2020, 69, 8586–8600. [CrossRef]
26. Lu, W.; Fang, S.; Gong, Y.; Qian, L.; Liu, X.; Hua, J. Resource Allocation for OFDM Relaying Wireless Power Transfer Based Energy-
Constrained UAV Communication Network. In Proceedings of the 2018 IEEE International Conference on Communications
Workshops (ICC), Kansas City, MO, USA, 20–24 May 2018; pp. 1–6.
27. Ramzan, M.R.; Naeem, M.; Altaf, M.; Ejaz, W. Multi-Criterion Resource Management in Energy Harvested Cooperative UAV-
enabled IoT Networks. IEEE Internet Things J. 2021, 9, 2944–2959. [CrossRef]
28. Wu, Y.; Yang, W.; Guan, X. UAV-UAV Communication Under Malicious Jamming: Trajectory Optimization with Turning Angle
Constraint. In Proceedings of the 2020 International Conference on Wireless Communications and Signal Processing (WCSP),
Nanjing, China, 21–23 October 2020; pp. 26–31.
29. Wang, X.; Gursoy, M.C.; Erpek, T.; Sagduyu, Y.E. Jamming-Resilient Path Planning for Multiple UAVs via Deep Reinforcement
Learning. In Proceedings of the 2021 IEEE International Conference on Communications Workshops (ICC Workshops), Montreal,
QC, Canada, 14–23 June 2021; pp. 1–6.
30. Zhou, L.; Zhao, X.; Guan, X.; Song, E.; Zeng, X.; Shi, Q. Robust trajectory planning for UAV communication systems in the
presence of jammers. Chin. J. Aeronaut. 2022, 35, 265–274. [CrossRef]
31. Wu, Y.; Yang, W.; Guan, X.; Wu, Q. Energy-Efficient Trajectory Design for UAV-Enabled Communication Under Malicious
Jamming. IEEE Wirel. Commun. Lett. 2021, 10, 206–210. [CrossRef]
32. Duo, B.; Luo, J.; Li, Y.; Hu, H.; Wang, Z. Joint trajectory and power optimization for securing UAV communications against active
eavesdropping. China Commun. 2021, 18, 88–99. [CrossRef]
33. Li, X.; Xu, J. Positioning Optimization for Sum-Rate Maximization in UAV-Enabled Interference Channel. IEEE Signal Process.
Lett. 2019, 26, 1466–1470. [CrossRef]
34. Wu, Y.; Fan, W.; Yang, W.; Sun, X.; Guan, X. Robust Trajectory and Communication Design for Multi-UAV Enabled Wireless
Networks in the Presence of Jammers. IEEE Access 2020, 8, 2893–2905. [CrossRef]
35. Xiang, C.; Zhou, Y.; Dai, H.; Qu, Y.; He, S.; Chen, C.; Yang, P. Reusing delivery drones for urban crowdsensing. IEEE Trans. Mob.
Comput. 2021. [CrossRef]
36. Li, J.; Tian, Y.; Zhang, Y. Destination-Based Cooperative Jamming in Security UAV Relay System with SWIPT. In Proceedings of
the 2021 13th International Conference on Communication Software and Networks (ICCSN), Chongqing, China, 4–7 June 2021;
pp. 160–167.
37. Singh, R.; Rawat, M.; Jaiswal, A. On the Physical Layer Security of Mixed FSO-RF SWIPT System with Non-Ideal Power Amplifier.
IEEE Photonics J. 2021, 13, 1–17. [CrossRef]
38. Wang, W.; Li, X.; Zhang, M.; Cumanan, K.; Ng, D.W.K.; Zhang, G.; Tang, J.; Dobre, O.A. Energy-constrained UAV-assisted secure
communications with position optimization and cooperative jamming. IEEE Trans. Commun. 2020, 68, 4476–4489. [CrossRef]
39. Ji, B.; Li, Y.; Zhou, B.; Li, C.; Song, K.; Wen, H. Performance Analysis of UAV Relay Assisted IoT Communication Network
Enhanced with Energy Harvesting. IEEE Access 2019, 7, 38738–38747. [CrossRef]
40. Park, J.C.; Kang, K.-M.; Choi, J. Low-Complexity Algorithm for Outage Optimal Resource Allocation in Energy Harvesting-Based
UAV Identification Networks. IEEE Commun. Lett. 2021, 25, 3639–3643. [CrossRef]
41. Kumar, D.; Singya, P.K.; Bhatia, V. Performance Analysis of SWIPT Enabled Decode-and-Forward based Cooperative Network.
In Proceedings of the 2022 IEEE 11th International Conference on Communication Systems and Network Technologies (CSNT),
Indore, India, 23–24 April 2022; pp. 476–481.
42. Hu, T.; Ma, F.; Shang, Y.; Cheng, Y. Physical Layer Security of Untrusted UAV-enabled Relaying NOMA Network Using SWIPT
and the Cooperative Jamming. In Proceedings of the 2021 IEEE 94th Vehicular Technology Conference (VTC2021-Fall), Virtual, 27
September–28 October 2021; pp. 1–6.
43. Najmeddin, S.; Bayat, A.; Aïssa, S.; Tahar, S. Energy-Efficient Resource Allocation for UAV-Enabled Wireless Powered Communi-
cations. In Proceedings of the 2019 IEEE Wireless Communications and Networking Conference (WCNC), Marrakesh, Morocco,
15–18 April 2019; pp. 1–6.
44. Su, Z.; Tang, J.; Feng, W.; Chen, Z.; Fu, Y.; Wong, K.-K. Energy Efficiency Optimization for D2D communications in UAV-assisted
Networks with SWIPT. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, 7–11
December 2021; pp. 1–7.
45. Yang, Y.; Xiao, K. Energy Efficiency Optimization of Multi-user Distributed Antenna Systems with SWIPT Technique. In
Proceedings of the 2021 International Conference on Communications, Information System and Computer Engineering (CISCE),
Beijing, China, 14–16 May 2021; pp. 118–122.
46. Yang, H.; Ye, Y.; Chu, X.; Dong, M. Resource and Power Allocation in SWIPT-Enabled Device-to-Device Communications Based
on a Nonlinear Energy Harvesting Model. IEEE Internet Things J. 2020, 7, 10813–10825. [CrossRef]
47. Zargari, S.; Hakimi, A.; Tellambura, C.; Herath, S. User Scheduling and Trajectory Optimization for Energy-Efficient IRS-UAV
Networks with SWIPT. IEEE Trans. Veh. Technol. 2022. [CrossRef]
48. Liu, Y.; Han, F.; Zhao, S. Flexible and Reliable Multiuser SWIPT IoT Network Enhanced by UAV-Mounted Intelligent Reflecting
Surface. IEEE Trans. Reliab. 2022, 71, 1092–1103. [CrossRef]
49. Niu, H.; Chu, Z.; Zhu, Z.; Zhou, F. Aerial intelligent reflecting surface for secure wireless networks: Secrecy capacity and optimal
trajectory strategy. Intell. Converg. Netw. 2020, 3, 119–133. [CrossRef]
50. Lin, Y.; Wang, T.; Wang, S. Trajectory Planning for Multi-UAV Assisted Wireless Networks in Post-Disaster Scenario. In
Proceedings of the 2019 IEEE Global Communications Conference (GLOBECOM), Waikoloa, HI, USA, 9–13 December 2019;
pp. 1–6.
51. Savkin, A.V.; Huang, C.; Ni, W. Joint Multi-UAV Path Planning and LoS Communication for Mobile Edge Computing in IoT
Networks with RISs. IEEE Internet Things J. 2022. [CrossRef]
52. Wang, L.; Wang, K.; Pan, C.; Xu, W.; Aslam, N.; Hanzo, L. Multi-Agent Deep Reinforcement Learning-Based Trajectory Planning
for Multi-UAV Assisted Mobile Edge Computing. IEEE Trans. Cogn. Commun. Netw. 2021, 7, 73–84. [CrossRef]
53. Li, S.; He, C.; Liu, M.; Wan, Y.; Gu, Y.; Xie, J.; Fu, S.; Lu, K. Design and implementation of aerial communication using directional
antennas: Learning control in unknown communication environments. IET Control. Theory Appl. 2019, 13, 2906–2916. [CrossRef]
54. Jayakody, D.N.K.; Perera, T.D.P.; Nathan, M.C.; Hasna, M. Self-energized Full-Duplex UAV-assisted Cooperative Communication
Systems. In Proceedings of the 2019 IEEE International Black Sea Conference on Communications and Networking (BlackSeaCom),
Sochi, Russia, 3–6 June 2019; pp. 1–6.
55. Boyd, S.; Boyd, S.P.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
Article
Research on the Cooperative Passive Location of Moving
Targets Based on Improved Particle Swarm Optimization
Li Hao 1 , Fan Xiangyu 2, * and Shi Manhong 3
1 Department of Intelligence, Air Force Early Warning Academy, Wuhan 430010, China
2 Department of Bomber and Transport Aircraft Pilots Conversion, Air Force Harbin Flying College,
Harbin 150088, China
3 Department of Information Countermeasures, Air Force Early Warning Academy, Wuhan 430010, China
* Correspondence: panda0077@163.com; Tel.: +86-177-0360-5050
Abstract: To address the cooperative passive location of moving targets by a UAV swarm, this paper
constructs a passive location and tracking algorithm for a moving target based on the A optimization
criterion and an improved particle swarm optimization (PSO) algorithm. First, the cluster cooperative
passive localization method is selected and the measurement model is constructed. Then, the problem
of improving passive location accuracy is transformed into the problem of obtaining more target
information. From the perspective of information theory, using the A criterion as the optimization
target, the passive localization process for static targets is derived. The Recursive Neural Network
(RNN) is used to predict the probability distribution of the target’s location at the next moment,
which adapts the localization method to moving targets. The particle swarm algorithm is improved
with a grouping strategy and a time period strategy, and the algorithm flow for moving target location
is constructed. Finally, simulation verification and algorithm comparison demonstrate the advantages
of the proposed algorithm.
Keywords: passive location; UAV swarm; moving target; A optimization criterion; particle swarm
optimization; recursive neural network
Citation: Hao, L.; Xiangyu, F.; Manhong, S. Research on the Cooperative Passive Location of
Moving Targets Based on Improved Particle Swarm Optimization. Drones 2023, 7, 264.
https://doi.org/10.3390/drones7040264
Academic Editors: Zhihong Liu, Shihao Yan, Yirui Cong and Kehao Wang
Received: 26 February 2023
Revised: 3 April 2023
Accepted: 10 April 2023
Published: 12 April 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
As electromagnetic space has become the fifth-dimensional battlefield after “land, sea, air, and
sky”, the importance that various countries attach to electromagnetic space, and their research
efforts in it, have increased considerably. Any platform that uses and radiates electromagnetic
waves exposes its position, and passive location has emerged in response [1–5]. However, the
accuracy of passive location decreases significantly as the distance from the target increases,
and the location efficiency is highly related to the spatial distribution of the location points.
With the rapid development of UAV technology, UAVs have gradually become a new type of combat
force on the future battlefield owing to their unique advantages. Utilizing the distributed
characteristics of UAV swarms to optimize their spatial distribution and trajectory has become a
new way to improve the ability to passively locate targets.
The current research on passive location can be divided into two main directions. The first is to
study and improve location algorithms, such as those based on time of arrival (TOA) [6], time
difference of arrival (TDOA) [7], received signal strength (RSS) [8], and angle of arrival (AOA)
[9,10]. Since this article does not involve the improvement of the location algorithm, this
direction is not discussed further here.
The other major direction is to optimize the spatial location of passive location points to improve
location performance. It mainly covers two topics: optimizing the time-series spatial position of a
single station and optimizing the spatial distribution of multiple stations. For single-station
location, [11] deduced the factors affecting the location error based on the AOA-based airborne
platform location method and constructed a method to reduce the single-station error. The authors
of [12,13] extended the passive motion location of a single station to multiple stations, optimized
the corresponding location mode, and designed a new objective function.
In studying the optimal configuration of a multi-station location, the general paradigm
is to first select or design a certain location index as the objective function. Then, through
theoretical derivation or numerical calculation, the aircraft coordinate parameters under
the optimal objective function are obtained, which is the optimal configuration of passive
location.
In [14,15], geometric dilution of precision (GDOP) is used as the objective function
of location, and the corresponding optimization function is designed to further improve
the accuracy of a passive location. The authors of [16] took the AOA location system as
the research object and deduced the conditions of the optimal passive location configu-
ration with the minimum circular error probable (CEP) as the criterion. In [17,18], the
Fisher information matrix (FIM) was considered as the objective function to study the
optimal multi-aircraft passive location configuration when FIM is the largest. In [19,20],
the value of the Cramer–Rao lower bound (CRLB) determinant was used as the objective
function to study the optimal location configuration of multiple stations under the TDOA
location system.
Table 1 compares the main work of this article with selected articles from the above-mentioned
literature that have conducted in-depth research in this field.
Functions Implemented | Algorithm in This Paper | Article [8] | Article [13] | Article [20]
Improved location algorithm | Yes | Yes | Yes | No
Location using multiple stations | Yes | No | No | Yes
Real-time optimization trajectory | Yes | No | No | No
Positioning by target’s motion characteristics | Yes | No | No | No
It can be seen from the above-mentioned literature and Table 1 that research on passive location at
this stage has mainly focused on improving the passive location method and on static station
deployment. That is, various criteria have been designed to improve the accuracy of passive location
algorithms, or the station layout has been optimized for different location systems. However, there
is little research on the cooperative passive location of moving targets. Moreover, static station
placement methods cannot be directly applied to the cooperative passive location of moving targets,
because the passive location of stationary targets places no constraints on the target point,
whereas the location of moving targets is a sequential decision-making problem. That is, the
optimization result at the next moment is subject to the constraints of the position at the present
moment and the performance parameters of the platform, and the subsequent location performance is
also affected by the location accuracy of the previous steps. Although the localization of
stationary targets cannot be directly used to solve the localization of moving targets, the two are
not unrelated. The research ideas and methods of stationary target location can be combined with the
characteristics of moving target location. Thus, we aim to improve the location method and extend
its scope of application.
The above-mentioned literature also focuses on obtaining the optimal spatial configuration. For the
static layout of stations, this research has strong practical significance. However, for a moving
platform such as an unmanned aerial vehicle cluster, these methods obtain the optimal configuration
directly while ignoring the process of forming it, which requires a lot of time and computing
resources. Therefore, it is necessary to optimize the spatial location of the UAVs in real time and
to approach the global optimum gradually.
Based on the perspective of information theory, this paper optimizes the spatial
trajectory of each UAV in the UAV swarm to improve the location efficiency. The main
contributions are as follows:
1. Real-time trajectory planning for the passive location of the UAV cluster is implemented
based on the RSS model.
2. An improved deep learning network is used to correct the target location probability
parameters in the positioning algorithm, which achieves more accurate positioning of the
moving target.
3. The deep network can identify the target movement trend in complex mixed noise,
which provides a method for solving the problem of recognition in complex noise.
4. Particle grouping and a time period strategy are designed to improve the performance of
the particle swarm optimization algorithm.
The article is organized as follows. Section 2 presents the passive location principle of the
cluster and constructs the corresponding measurement model. Section 3 analyzes the optimization
process for static and dynamic target location, and constructs and derives the optimization
objective function for the passive location of a moving target. Section 4 addresses the
shortcomings of particle swarm optimization by improving it with grouping and time period
strategies. Section 5 constructs the optimization function and corresponding constraints for moving
target localization and presents the passive location optimization process based on the improved
particle swarm optimization algorithm. Section 6 performs simulation verification and algorithm
comparison to highlight the advantages of the method. The discussion and final conclusion are
presented in Sections 7 and 8, respectively.
[Figure 1: diagram of the target at [xt, yt]^T and the receiving platforms Rx1, Rx2, and Rx3 at
[x, y]^T, with their ranges r and angles φ; image not reproduced.]
In Figure 1, the target radiates electromagnetic signals and its coordinates are $R_t = [x_t, y_t]^T$.
The three platforms Rx1, Rx2, and Rx3 receive radiation signals. Combined with the
constructed signal attenuation model, the distance $r_i$ between the target to be located and
each detection platform can be obtained. The RSS location equation is:
$$
\begin{cases}
\sqrt{(x_t - x_1)^2 + (y_t - y_1)^2} = r_1 \\
\sqrt{(x_t - x_2)^2 + (y_t - y_2)^2} = r_2 \\
\sqrt{(x_t - x_3)^2 + (y_t - y_3)^2} = r_3
\end{cases}
\tag{1}
$$
By solving Formula (1), the RSS envelope of each receiving platform in Figure 1 can be
obtained. The place where the three circles overlap each other in Figure 1 is the area where
the target is located.
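The intersection of the three range circles in Formula (1) can be computed in closed form. The
following is a minimal sketch rather than the authors' implementation: it assumes noise-free ranges,
treats each r_i as the distance from platform i to the target, and subtracts the first squared range
equation from the other two to obtain a 2 × 2 linear system. The function name `locate_from_ranges`
is hypothetical.

```python
def locate_from_ranges(platforms, ranges):
    """Solve the three range equations of Formula (1) for the target (xt, yt)."""
    (x1, y1), (x2, y2), (x3, y3) = platforms
    r1, r2, r3 = ranges
    # Subtracting the first squared circle equation from the other two
    # cancels the quadratic terms and leaves a linear system A [xt, yt]^T = b.
    a11, a12 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a21, a22 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("platforms are collinear; the position is not unique")
    # Cramer's rule for the 2 x 2 system.
    xt = (b1 * a22 - a12 * b2) / det
    yt = (a11 * b2 - a21 * b1) / det
    return xt, yt
```

With noisy ranges the circles generally do not meet in a single point, so this closed form should be
read only as the idealized case; a least-squares fit over more than three platforms would be the
natural extension.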
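[Figure: geometry of the UAV swarm and the target, showing UAV positions R = [x, y]^T through
RM = [xM, yM]^T, the target at [xt, yt]^T, ranges r, angles φ to the x-axis, and angles ϕ between
UAVs; image not reproduced.]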
The location of the target is $R_t = [x_t, y_t]^T$. The position and velocity of the i-th UAV are
$R_i = [x_i, y_i]^T$ and $R_{vi} = [v_{xi}, v_{yi}]^T$, i = 1, 2, . . . , M, respectively. The line
connecting the UAV and the target makes an angle $\phi_i$ with the x-axis. The distance from the
target is $r_i = \|R_i - R_t\|_2$, and the angle formed by any two UAVs and the target is
$\varphi_{ij}$, j = 1, 2, . . . , M.
The attenuation model of the signal in the atmosphere is:
where $p_o$ is the equivalent radiated power of the target-radiated signal, that is, the product
of the target-radiated power and the antenna gain. As these two parameters are not the focus of
this paper, they are not introduced in detail here. $\gamma_i$ is the attenuation factor of the
electromagnetic wave, and $d_i$ is the length of the signal propagation path. This paper assumes
that the signal is not refracted, so $d_i$ equals the distance $r_i$ between the UAV and the
target [25]. The signal strength $p_s$ reaching the UAV receiving end can then be calculated by
Formula (2).
Due to the existence of electromagnetic interference and clutter in the atmosphere and
the thermal noise of the system in the signal receiver, the actual signal pir (k) received by the
receiver of the i-th UAV at time k can be expressed as:
Among them, n(k) represents the measurement error, which obeys a Gaussian distribution, that is,
$n(k) \sim N(0, \sigma_i^2(d_i))$. The error is related to the distance $d_i$ between the UAV and
the target, satisfying:
$$\sigma_i^2(d_i) = d_i^{\alpha} \sigma_0^2 \tag{4}$$
where $\sigma_0^2$ is a constant and is the basic unit of measure for variance, and α is the path
attenuation factor. According to Formulas (2)–(4) and the signal $p_i^r(k)$ received by each UAV at
time k, the matrix of the received signal strength distribution of the UAV swarm can be obtained
as $P_r(k)$. The covariance matrix of $P_r(k)$ is
$\sigma_p = \mathrm{diag}(\sigma_1^2(k), \sigma_2^2(k), \ldots, \sigma_M^2(k))$. Then, the signal
received by the UAV swarm can be denoted as $P_r(k) \sim N(P_s(k), \sigma_p)$, where $P_s(k)$
represents the estimated target position using the pure signal that reaches the UAV.
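The measurement model above can be illustrated with a short simulation. This is a sketch under a
stated assumption: Formula (2) is not reproduced in this text, so a log-distance attenuation law in
dB, p_s = p_o − 10 γ log10(d), is assumed here purely for illustration. Only the variance law of
Formula (4) is taken from the text. The function names are hypothetical.

```python
import math
import random

def rss_measurement(p_o, gamma, d, alpha, sigma0_sq, rng=random):
    """Draw one noisy received-strength sample at distance d."""
    # Assumed noise-free attenuation model (log-distance, in dB); not taken from the source.
    p_s = p_o - 10.0 * gamma * math.log10(d)
    # Formula (4): the noise variance grows with distance as d**alpha * sigma0_sq.
    variance = (d ** alpha) * sigma0_sq
    return p_s + rng.gauss(0.0, math.sqrt(variance))

def range_from_rss(p_r, p_o, gamma):
    """Invert the assumed attenuation model to turn a received strength into a range."""
    return 10.0 ** ((p_o - p_r) / (10.0 * gamma))
```

Because the variance in Formula (4) increases with distance, range estimates obtained this way
degrade for distant UAVs, which is the motivation for optimizing the swarm geometry.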
After acquiring the signal energy at each point, the distance r from the target to the
sensor can be estimated according to the signal attenuation model. Since the positions of the
UAVs themselves are known, the multiple circles shown in Figure 1 can then be obtained
using Formula (1). The area where the circles overlap is the target position.
Positioning accuracy therefore depends on the accuracy of the signal attenuation model. If the
attenuation characteristics and the corresponding parameters of the model are accurate, the
distance between the UAV and the target can be estimated well; otherwise, the error is large.
Scholars have conducted in-depth research on this and constructed a variety of attenuation models
to further ensure the accuracy of distance estimation.
where J represents the FIM of the measurement matrix, and the superscript −1 denotes the inverse of
this matrix. Then, $J^{-1}$ is the CRLB.
The elements of the i-th row and the j-th column of the four matrices in Formula (6)
can be expressed as:
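$$
\begin{cases}
J_{xx}^{i,j} = E\left[\dfrac{\partial}{\partial x_{ti}} \ln f(P_r; R_t)\, \dfrac{\partial}{\partial x_{tj}} \ln f(P_r; R_t)\right] \\
J_{xy}^{i,j} = E\left[\dfrac{\partial}{\partial x_{ti}} \ln f(P_r; R_t)\, \dfrac{\partial}{\partial y_{tj}} \ln f(P_r; R_t)\right] \\
J_{yx}^{i,j} = E\left[\dfrac{\partial}{\partial y_{ti}} \ln f(P_r; R_t)\, \dfrac{\partial}{\partial x_{tj}} \ln f(P_r; R_t)\right] \\
J_{yy}^{i,j} = E\left[\dfrac{\partial}{\partial y_{ti}} \ln f(P_r; R_t)\, \dfrac{\partial}{\partial y_{tj}} \ln f(P_r; R_t)\right]
\end{cases}
\tag{7}
$$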
Similarly, since the horizontal and vertical coordinates of the target are relatively
independent, the processes of obtaining Jxx and Jyy are independent of each other, and the
calculation process is similar. This section analyzes Jxx .
Substituting Formula (8) into Formula (7), we obtained [26]:
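The right side of Formula (10) can be regarded as the sum of two parts, which can be expressed as:
$$
\begin{cases}
J_{xx}^{i,j} = J_{xx1}^{i,j} + J_{xx2}^{i,j} \\
J_{xx1}^{i,j} = \nabla_{R_{tx}} P_r^{T}\, \sigma_p^{-1}\, \nabla_{R_{tx}} P_r \\
J_{xx2}^{i,j} = \dfrac{1}{2} \mathrm{Tr}\left(\sigma_p^{-1} \dfrac{\partial \sigma_p}{\partial x_i} \sigma_p^{-1} \dfrac{\partial \sigma_p}{\partial x_j}\right)
\end{cases}
\tag{11}
$$
Among them, $\nabla_{R_{tx}} P_r^{T}$ is the Jacobian matrix obtained by differentiating the
measured value $P_r^{T}$ with respect to the target abscissa $R_{tx}$, which is expressed as: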
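In Formula (11), Tr represents the trace of the matrix. Then, the two partial derivatives are:
$$
\frac{\partial \sigma_p}{\partial x_i} = \sigma_0^2
\begin{bmatrix}
d_1^{\alpha-1} \cos\phi_1 & 0 & \cdots & 0 \\
0 & d_2^{\alpha-1} \cos\phi_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & d_M^{\alpha-1} \cos\phi_M
\end{bmatrix}
\tag{15}
$$
$$
\frac{\partial \sigma_p}{\partial x_j} = \sigma_0^2
\begin{bmatrix}
d_1^{\alpha-1} \sin\phi_1 & 0 & \cdots & 0 \\
0 & d_2^{\alpha-1} \sin\phi_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & d_M^{\alpha-1} \sin\phi_M
\end{bmatrix}
\tag{16}
$$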
To further simplify Formula (11), let:
$$
\beta_i = \frac{\alpha^2}{d_i^2} + \frac{25 \gamma_i^2}{(\ln(10))^2 \sigma_0^2 d_i^{\alpha+2}}
\tag{17}
$$
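Then, according to the A optimization criterion, the objective function can be expressed as:
$$
F_x^{opt} = \arg\min \mathrm{tr}\left(J_{xx}^{-1}\right)
= \arg\min \frac{8 \sum_{i=1}^{M} \beta_i}{\sum_{i=1}^{M} \sum_{j=i+1}^{M} \beta_i \beta_j \left(1 - \cos\left(2\phi_i - 2\phi_j\right)\right)}
\tag{21}
$$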
Combining Formulas (21) and (17), it can be seen that the location accuracy of the
target abscissa xt is related to the distance di between each UAV and the target. It is also
related to the angle difference φi − φj between any two drones.
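The dependence on range and bearing separation can be checked numerically. The sketch below
evaluates the quantity minimized in Formula (21) with the weights of Formula (17), both as
transcribed in this text. The function names and all parameter values are illustrative assumptions,
not values from the paper.

```python
import math

def beta(d, alpha, gamma, sigma0_sq):
    """Per-UAV weight of Formula (17) for a UAV at distance d from the target."""
    return (alpha**2 / d**2
            + 25.0 * gamma**2 / (math.log(10)**2 * sigma0_sq * d**(alpha + 2)))

def a_criterion(distances, angles, alpha, gamma, sigma0_sq):
    """Value minimized in Formula (21); a smaller value indicates a better geometry."""
    b = [beta(d, alpha, gamma, sigma0_sq) for d in distances]
    numerator = 8.0 * sum(b)
    denominator = 0.0
    for i in range(len(b)):
        for j in range(i + 1, len(b)):
            denominator += b[i] * b[j] * (1.0 - math.cos(2.0 * angles[i] - 2.0 * angles[j]))
    if denominator <= 0.0:
        return math.inf  # every pairwise term vanished, e.g. all UAVs on the same bearing
    return numerator / denominator

# Two UAVs at equal range: bearings 90 degrees apart versus 10 degrees apart.
well_spread = a_criterion([100.0, 100.0], [0.0, math.pi / 2], 2.0, 2.0, 1e-4)
bunched = a_criterion([100.0, 100.0], [0.0, math.pi / 18], 2.0, 2.0, 1e-4)
```

In this toy case `well_spread` is smaller than `bunched`, which agrees with the observation that the
angle difference between UAVs affects the location accuracy.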
Formula (21) only involves the estimation of the target abscissa xt . The estimation of
the target ordinate yt is the same as xt ; thus, Formula (10) is modified as:
The subsequent derivation is entirely analogous to that of $J_{xx}$ above and is not repeated here.
Since the horizontal and vertical coordinates of the targets are independent of each
other, the effects of directly calculating J as well as Jxx and Jyy are equivalent. Therefore, the
optimization objective function for the passive location of stationary targets is:
$$F^{opt} = \arg\min\left(F_x^{opt} + F_y^{opt}\right) \tag{23}$$
3.2. The Main Difference between the Location of Moving Objects and Stationary Objects
The key difference between the location of moving targets and stationary targets is
$f(P_r; R_t)$ in Formula (8); that is, the probability density function of the target
position evolves differently as the location proceeds.
In the process of locating a stationary target, since there is no prior information as
a support, the target initially obeys a uniform distribution on the x-axis and y-axis. That is,
$f(P_r; R_t)$ obeys an equal probability distribution on the abscissa and ordinate axes. Then, as
the location progresses, it obeys a Gaussian distribution.
In the process of locating a moving target, as the location continues, the coordinates
of the target at the next moment do not obey a uniform distribution on the entire coordinate
axis. Instead, the $f(P_r; R_t)$ of the target position at the next moment should be derived
by combining the existing location results and the target movement trend.
That is, the main difference between the location of moving objects and stationary
objects is that, in the process of locating moving objects, the probability density $f(P_r; R_t)$
of the spatial distribution of the objects should be adjusted in real time.
Through the above process, the CNN and RNN network training can be achieved.
Among them, Step 4 trains the corresponding network parameters according to the
different motion states of the target, which can improve the applicability of the network
and further improve the prediction accuracy.
The specific process of online application in Figure 4 can be described as:
Step 1: Use the RSS passive location method to obtain the trajectory parameters of the
target. Input it into the CNN to identify the motion state of the target.
Step 2: According to the identified motion state, select and load the corresponding
RNN network parameters.
Step 3: Input the trajectory parameters of the target into the RNN to obtain the
predicted trajectory points of the target.
At this point, a single prediction of the target trajectory using the deep combination
network has been completed.
The core purpose of constructing a combined network is not to accurately predict the
position of the target, but to obtain $f(P_r; R_t)$ in Formula (8). When used online, Step 3 is
repeatedly executed to obtain multiple sets of predicted target positions. The frequency of these
predictions is then used in place of probability as the $f(P_r; R_t)$ of the target at the next
moment.
This way of obtaining $f(P_r; R_t)$ is not restricted to a particular form of probability density
function or its parameters. At the same time, it does not require extensive professional knowledge
or mathematical skill to derive the probability density function of the target at the next moment.
This method is easy to operate and the results are more accurate.
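The idea of using frequency in place of probability can be sketched as follows. This is an
illustration under assumptions, not the authors' code. The trained RNN is replaced by a hypothetical
stand-in, `toy_predictor`, which extrapolates at constant velocity and adds Gaussian jitter. The
text does not specify how the repeated predictions come to differ, so the jitter is an assumption,
as are the grid cell size and the number of runs.

```python
import random
from collections import Counter

def empirical_position_pdf(predict_next, track, n_runs=500, cell=1.0):
    """Approximate the next-step position density by the relative frequency of predictions.

    predict_next is a hypothetical callable standing in for the trained network;
    each call must return one predicted (x, y) position.
    """
    counts = Counter()
    for _ in range(n_runs):
        x, y = predict_next(track)
        # Map the prediction to the index of the grid cell that contains it.
        counts[(int(x // cell), int(y // cell))] += 1
    return {c: n / n_runs for c, n in counts.items()}

def toy_predictor(track, rng=random.Random(0)):
    """Stand-in predictor: constant-velocity extrapolation plus Gaussian jitter."""
    (x0, y0), (x1, y1) = track[-2], track[-1]
    return (2 * x1 - x0 + rng.gauss(0.0, 0.5), 2 * y1 - y0 + rng.gauss(0.0, 0.5))

pdf = empirical_position_pdf(toy_predictor, [(0.0, 0.0), (1.25, 1.25)])
```

The resulting dictionary maps grid cells to relative frequencies that sum to one, so it can be used
wherever a discretized density over the next target position is needed, without committing to a
parametric distribution.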
This strategy has another advantage. In practical situations, environmental noise is generally a
mixture of several noise types with different distributions and parameters, and it is time-varying.
However, it is impossible to obtain the type and corresponding parameters of each component of this
mixed noise. As a result, actual noise is much more complex than theoretical noise, a theoretical
model of the environmental noise cannot be built, and subsequent quantitative analysis and formula
derivation cannot be carried out. The CNN in this paper can construct the noise distribution from
actual measured parameters, since it can be trained using previously measured target and noise
measurements. This can greatly improve the accuracy of trajectory recognition in complex noise
backgrounds.
Although deep learning can be used to predict the position of the target, it is still
necessary to combine the FIM to optimize the spatial position of the UAV and improve
the passive location accuracy. Therefore, its essence is still an NP-hard problem, and it is
difficult to obtain an analytical solution.
Therefore, this paper improved the particle swarm algorithm and optimized the
spatial position and trajectory of the UAV to improve the accuracy of the passive location
of moving targets. There are two main reasons for using the PSO algorithm in this article.
The first reason is that it is difficult to obtain the expression of the parameter f (Pr ; Rt )
through theoretical derivation. Due to such constraints, even if f (Pr ; Rt ) is set, deriving
an analytical solution is extremely difficult and not universal. Therefore, this article used
intelligent optimization algorithms to solve it.
The second reason is that, compared to many other intelligent optimization algorithms,
the PSO algorithm is generally recognized as one of the fastest. PSO has also been researched
extensively, which supports its effectiveness and gives the algorithm good stability.
$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1} \tag{25}$$
where $v_i^t = (v_{i1}^t, v_{i2}^t, \cdots, v_{iD}^t)$ represents the set of velocities of the i-th
particle in each dimension during the t-th iteration; $x_i^t$ represents the set of particle
positions, i = 1, 2, . . . , N, d = 1, 2, . . . , D, t = 1, 2, . . . , T; ω is the inertia
coefficient; $c_1$ and $c_2$ are learning factors; and $r_1$ and $r_2$ are random numbers uniformly
distributed in [0, 1]. $p_{ibest}^t$ and $p_{gbest}^t$ are the best positions in individual history
and population history, respectively.
Then, the fitness function corresponding to the particle position is calculated. The
better the fitness, the better the position of the particle. All particles adjust their speed
direction and move towards a better position by comparing their fitness functions with that
of other particles.
The above is the core formula and basic principle of the PSO algorithm. It can be seen
that the PSO algorithm only needs to adjust the flying speed of the particles to achieve
optimization.
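The basic principle above can be summarized in a compact sketch. The velocity update uses the usual
inertia term plus attraction to the individual and population bests, which is assumed here to
correspond to Formula (24) since that formula is not reproduced in this text. The position update
is Formula (25). Lower fitness is treated as better, and all parameter values and the function name
are illustrative assumptions.

```python
import random

def pso_minimise(fitness, dim, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0, seed=0):
    """Minimal standard PSO returning the best position found and its fitness."""
    rng = random.Random(seed)
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]                 # best position in each particle's history
    p_best_f = [fitness(xi) for xi in x]
    g = min(range(n_particles), key=lambda i: p_best_f[i])
    g_best, g_best_f = p_best[g][:], p_best_f[g]  # best position in the population's history
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (p_best[i][d] - x[i][d])
                           + c2 * r2 * (g_best[d] - x[i][d]))
                x[i][d] += v[i][d]               # position update, Formula (25)
            f = fitness(x[i])
            if f < p_best_f[i]:
                p_best[i], p_best_f[i] = x[i][:], f
                if f < g_best_f:
                    g_best, g_best_f = x[i][:], f
    return g_best, g_best_f

best, best_f = pso_minimise(lambda p: sum(c * c for c in p), dim=2)
```

The usage line minimizes a simple sphere function; every particle is pulled toward the same
population best, which is the behaviour that the improvement strategy described next is designed to
temper.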
Although PSO can easily reach a local optimal solution, its search efficiency is limited,
especially for typical multimodal functions. This is because particles are easily influenced by
other particles. Some particles are attracted by better particles before they have searched their
own area completely, and all particles move towards the position of the currently optimal
particle, resulting in premature convergence.
If each area can be searched completely and thoroughly, a global comparison can be established.
Alternatively, a detailed search of the area along the movement track can be performed during the
movement. Either approach can reduce the possibility of falling into a local optimum. Therefore,
this paper constructs a time-period-based hierarchical PSO improvement strategy to improve the
search performance of PSO.
[Figure 5: layered particle structure showing the high layer (H), the middle layer (M), and the
bottom layer (B); image not reproduced.]
The core idea of layering is to construct three groups according to the distribution of
particles: bottom layer, middle layer, and high layer. The bottom layer is explored in real
time, and after interaction, the fitness function is compared to obtain the middle and high
layers. The bottom layer of each group only interacts with the group, which ensures that
an area is fully searched. At the same time, the best bottom layer data in this group are
used as the middle layer. Then, the middle layer interacts occasionally, which balances
the contradiction between the global and local searches. Afterwards, the middle and high
layers guide the work of the lower layers, and the upper layers of different ethnic groups
occasionally interact, thereby changing the movement pattern.
In the above scheme, how the particles are grouped and how often the middle- and
high-layer particles exchange information strongly affect the algorithm
performance. Both are therefore described in detail later.
After that, the particles start to be optimized. In the initial stage, the fitness function
corresponding to each particle position is calculated. Then, a comparison within each
group is performed to obtain the optimal particle within the group. That is, $p_{Mbest}^{t}$ is
the best particle of the bottom layer, and it also becomes the particle of the middle layer.
Afterwards, the middle-layer particles are compared with one another to obtain the position
$p_{Hbest}^{t}$ of the best among them, that is, the high-level particle in Figure 5. Then,
Formula (24) can be modified as:
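$$v_{id}^{t+1} = \omega v_{id}^{t} + c_1 r_1 \left(p_{ibest}^{t} - x_{id}^{t}\right) + c_2 r_2 \left(p_{Mbest}^{t} - x_{id}^{t}\right) + c_3 r_3 \left(p_{Hbest}^{t} - x_{id}^{t}\right) \tag{26}$$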
The parameter definitions in Formula (26) are the same as those in Formula (24), and
are therefore not repeated here.
It can be seen from Formula (26) that the improved PSO is less affected by the global
optimal solution. At the same time, each group searches for the optimal solution
within its own territory as much as possible, which enables the adequate exploration of
multiple regions. Occasional high-level interactions between groups ensure that each
group moves toward the optimal solution within the group. Ultimately, the possibility of
premature convergence of the PSO algorithm is reduced.
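As an illustration only, the layered update of Formula (26) might be sketched in Python as follows. The function names, the coefficient values, the scalar (one-dimensional) positions, and the convention that lower fitness is better are all assumptions made for this sketch and are not taken from the paper.

```python
import random

def layer_bests(positions, fitness, groups):
    """Return ({group: best position in that group}, best position overall).

    Assumption: lower fitness is better. The group bests play the role of
    the middle layer, and the best of those plays the role of the high layer.
    """
    mid = {}
    for pos, fit, g in zip(positions, fitness, groups):
        if g not in mid or fit < mid[g][0]:
            mid[g] = (fit, pos)
    high = min(mid.values())[1]  # best middle-layer particle
    return {g: pos for g, (fit, pos) in mid.items()}, high

def layered_velocity(v, x, p_ibest, p_mbest, p_hbest,
                     omega=0.7, c1=1.5, c2=1.5, c3=1.5, rng=random):
    """One-dimensional velocity update with the three pulls of Formula (26).

    omega, c1, c2 and c3 are placeholder values chosen for illustration.
    """
    r1, r2, r3 = rng.random(), rng.random(), rng.random()
    return (omega * v
            + c1 * r1 * (p_ibest - x)   # pull toward the particle's own best
            + c2 * r2 * (p_mbest - x)   # pull toward the group (middle-layer) best
            + c3 * r3 * (p_hbest - x))  # pull toward the high-layer best

# Usage: four particles in two groups.
mid, high = layer_bests([0.0, 1.0, 2.0, 3.0], [3.0, 1.0, 2.0, 0.0], [0, 0, 1, 1])
v_new = layered_velocity(v=0.2, x=0.0, p_ibest=0.0, p_mbest=mid[0], p_hbest=high)
```

When a particle already sits on all three best positions, the three pulls vanish and only the inertia term remains.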
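$$v_{id}^{t+1} = \omega v_{id}^{t} + c_1 r_1 \left(p_{ibest}^{t} - x_{id}^{t}\right) + \frac{\mathrm{mod}(t, t_M)}{t_M}\, r_2 \left(p_{Mbest}^{t} - x_{id}^{t}\right) + \frac{\mathrm{mod}(t, t_H)}{t_H}\, r_3 \left(p_{Hbest}^{t} - x_{id}^{t}\right) \tag{27}$$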
where mod(a,b) is the remainder operation, that is, the remainder obtained after dividing a
by b.
In Formula (27), a small mod result means that the corresponding optimal value has just
been updated. At this time, the emphasis is on letting the particles search in their
respective areas to obtain a better $p_{ibest}^{t}$ for subsequent updates. As
the search progresses, the mod results gradually increase, and the particles move closer to
the local optimum. This ensures that, before the next update of the local optimal value, each
particle performs a more comprehensive search of the region where it is located, thereby
reducing the possibility of falling into a local optimum.
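The behavior of the time-dependent weights in Formula (27) can be illustrated with a short Python sketch. The period values below are hypothetical and chosen only to make the sawtooth pattern visible; the paper's actual settings are not given in this section.

```python
def period_weight(t, period):
    """Weight mod(t, period) / period from Formula (27).

    It is 0 right after a period boundary and grows toward 1 until the
    next boundary, where it resets.
    """
    return (t % period) / period

# Hypothetical periods: the middle layer cycles faster than the high layer.
t_M, t_H = 5, 20
schedule = [(t, period_weight(t, t_M), period_weight(t, t_H)) for t in range(0, 21)]
for t, w_mid, w_high in schedule:
    print(f"t={t:2d}  middle weight={w_mid:.2f}  high weight={w_high:.2f}")
```

Under these assumed periods, the middle-layer pull resets four times for every reset of the high-layer pull, which mirrors the fast and slow hands of the clock analogy.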
However, as the search progresses, the particles of a group do not always remain in the
same group. Instead, they are regrouped after multiple searches, which ensures a comprehensive
search of the area. Therefore, in this paper, after every $t_G$ searches, all particles are regrouped
according to the hierarchical clustering method in the previous section to improve the
search efficiency.
The idea of the time period is borrowed from the clock model. That is, important
parameters, like the hour hand, should be updated slowly, whereas exploratory particles,
like the minute and second hands, should be updated faster. In this way, an effective search
over the full dimension is better achieved, and the possibility of falling into a local optimum
is reduced.
To sum up, this section improves the PSO algorithm by designing the particle grouping
architecture and building the time period.
5. Passive Location Algorithm Flow of the Moving Target Based on Improved PSO
5.1. Objective Function
Using UAV swarms to locate moving targets is an asymptotically optimal process.
Therefore, not only the location effect in the present moment, but also the subsequent
impact of the decision in the present moment, should be considered. In this way, the best
location effect can be achieved at a faster speed.
Assuming a time k, the subsequent motion state of the UAV swarm and the target is
shown in Figure 7.
The coordinate of the i-th UAV in our UAV swarm is $x_i(k)$ and the target coordinate is
$R_t(k)$.
At this time, the model predictive control (MPC) method was adopted. That is, the
optimization method of predicting H steps and executing one step was adopted. On the
basis of Formula (23), the objective function is adjusted as:
$$F_{H}^{opt} = \arg\min \sum_{i=0}^{H-1} \gamma^{i} F^{opt}(k+i) \tag{28}$$
where γ is the decay factor. The MPC method used in Formula (28) is relatively mature,
and is not repeated in this paper.
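The "predict H steps, execute one step" idea behind Formula (28) can be sketched in Python as below. This is a minimal illustration under stated assumptions: the exhaustive enumeration of action sequences stands in for the paper's improved PSO, and the names `plan_first_action`, `step`, and `cost`, as well as the toy one-dimensional example, are hypothetical.

```python
from itertools import product

def plan_first_action(state, step, cost, action_set, horizon, gamma=0.9):
    """Return (first action of the best H-step plan, its discounted cost).

    step(state, action) -> next state; cost(state) -> per-step cost.
    The i-th predicted step is weighted by gamma**i, as in Formula (28).
    """
    best_seq, best_val = None, float("inf")
    for seq in product(action_set, repeat=horizon):
        s, total = state, 0.0
        for i, a in enumerate(seq):
            s = step(s, a)
            total += gamma ** i * cost(s)
        if total < best_val:
            best_seq, best_val = seq, total
    return best_seq[0], best_val

# Toy usage: a point on a line moves by -1, 0 or +1 toward a target at 3.
toy_step = lambda s, a: s + a
toy_cost = lambda s: abs(s - 3)
action, value = plan_first_action(0, toy_step, toy_cost, (-1, 0, 1), horizon=3)
```

Only the first action of the best plan is executed; at the next moment the plan is recomputed from the new state.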
5.2. Constraints
Constraints mainly include individual motion constraints and obstacle avoidance con-
straints, as well as cluster communication constraints and collision avoidance constraints.
The motion states of the UAV at time k and at the next moment k + Δk are shown in Figure 8.
The position and speed of the m-th UAV at time k are $P_k^m = [x_k^m, y_k^m]$ and $v_k^m = [v_k^{xm}, v_k^{ym}]$.
Taking these as the initial condition, the optimization yields the position and velocity at
the next moment as $P_{k+\Delta k}^m = [x_{k+\Delta k}^m, y_{k+\Delta k}^m]$ and $v_{k+\Delta k}^m = [v_{k+\Delta k}^{xm}, v_{k+\Delta k}^{ym}]$, respectively. The
where $\Delta v_k^m$ is the value of the velocity change, which should satisfy:
$$\left\|\Delta v_k^m\right\|_2 \le \Delta v_{max}, \qquad v_{min} \le \left\|v_{k+\Delta k}^m\right\|_2 \le v_{max} \tag{31}$$
That is, the speed and the amount of speed change cannot exceed their allowable limits.
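A feasibility check for the two conditions in Formula (31) might look like the following Python sketch. It assumes that the velocity change is the difference between the next and the current velocity vectors; the function name and the numeric limits in the usage lines are hypothetical.

```python
import math

def satisfies_speed_limits(v_now, v_next, dv_max, v_min, v_max):
    """Check both conditions of Formula (31) for 2-D velocity vectors.

    Assumption: the velocity change is v_next - v_now.
    """
    dv = math.hypot(v_next[0] - v_now[0], v_next[1] - v_now[1])
    speed = math.hypot(v_next[0], v_next[1])
    return dv <= dv_max and v_min <= speed <= v_max

# Usage with made-up limits.
ok = satisfies_speed_limits((3.0, 4.0), (3.0, 4.5), dv_max=1.0, v_min=1.0, v_max=10.0)
too_abrupt = satisfies_speed_limits((3.0, 4.0), (6.0, 8.0), dv_max=1.0, v_min=1.0, v_max=20.0)
```

A candidate velocity that fails either condition would be rejected or repaired before it is used in the optimization.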
Similarly, the change amount $\Delta\theta_k^m$ of the UAV direction can be calculated according to
Figure 9. Flowchart of the passive location algorithm for the moving target by UAV swarm.
[Figure 10. Legend: correct state, identified state. Horizontal axis: time/min.]
Figure 10 shows that the CNN makes three recognition errors and IMM-EKF makes six, so the
CNN is more accurate than IMM-EKF. This is because the basic function of a CNN is recognition,
and its recognition accuracy improves as the amount of training data increases. In contrast, the
recognition accuracy of IMM-EKF is affected by noise and does not change
with the amount of training data. Therefore, the CNN is more suitable for target motion state
recognition.
Each training sample for the CNN-RNN composite network had 60 input points, that
is, the position of the target and the motion state identified by the CNN at each second. The
output was 60 track points predicting the target position over the next minute. A total of 360,000 sets
of data were used for training and an additional 3600 sets for testing. Training
ended once the set number of iterations was reached.
Because the length of the data used for training was only 60, the RNN was set to
8 layers, of which 6 were hidden layers, with 8 neurons per layer, a learning rate of
0.3, and 50,000 iterations. The comparison of the algorithm
with the classical RNN is shown in Figure 11.
[Figure 11. Legend: prediction error of CNN-RNN, prediction error of RNN. Horizontal axis: time/min.]
As can be seen from Figure 11, the prediction results of the CNN-RNN network are
generally better than those of the RNN. This is because CNN-RNN actually contains three sets
of RNN networks with different parameters. That is, after the CNN identifies the
target motion state, the RNN loads the corresponding parameters to perform the prediction.
With more targeted networks, the results are expected to be more accurate. However, whenever
the CNN misidentifies the state, the prediction error spikes, as shown in Figure 11.
Figure 12. Comparison of algorithm optimization results (vertical axis of each panel: y/km).
(a) Optimization results of the algorithm in this paper; (b) IMM-EKF optimization results;
and (c) the results of the location method in [30].
Comparing the three sets of results in Figure 12 shows that the location
points of the algorithm in this paper coincide more closely with the target trajectory.
Figure 12b shows that the IMM-EKF method achieves relatively good localization
accuracy. However, when the motion state of the target changes, IMM-EKF cannot
quickly identify the change. Moreover, after
identification, it is difficult to quickly establish a new tracking equation, which results in a
significant decrease in the location efficiency at that time.
Reference [30] uses the Doppler rate to improve the accuracy of moving target positioning
based on time delay and Doppler shift. It establishes a set of pseudolinear
equations by introducing additional variables, gives an analytic solution for moving
target positioning, and derives the positioning CRLB. However, comparing
Figure 12a,c shows that the positioning method in [30] is less accurate than that in
this paper. There are two main reasons.
First, as can be seen from Figure 12c, the method in [30] has a consistently
large error. This is because it does not consider the
sequential nature of target motion and treats each localization as independent.
As a result, its positioning performance does not improve as positioning progresses.
Second, the method in [30] does not plan the trajectories of the unmanned aerial
vehicles in real time, but rather provides the ultimate ideal
distribution of location points. Real-time optimization is not achieved, and motion
conditions such as platform motion are not considered. This results in poor performance
during the positioning process.
The core reason why the algorithm in this paper is superior to the other algorithms is
that it constructs a model of cooperative passive location from the perspective of
the cluster. It does not split the five UAVs into a “2 + 3” model, but optimizes
the five UAVs as a whole. It can be seen from Figure 12a that, among the five UAVs, three
fly towards the target, which reduces the relative distance between the cluster
and the target. The other two fly towards a wide area, which increases the spread of observation
angles among the UAVs. This also conforms to Formula (21); that is, the UAV swarm adjusts the
distance and angle factors that affect the location accuracy.
The algorithm in this paper obtains better location performance by adjusting the
distance between the cluster and the target and forming different observation angles at the
same time.
To further quantify and compare the location performance, 30 Monte Carlo
simulations are carried out for each algorithm under unchanged simulation
conditions. The errors at each moment are averaged to obtain the comparison
chart shown in Figure 13.
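The per-moment averaging over Monte Carlo runs can be written in a few lines of Python. This is only a sketch of the bookkeeping; the function name and the sample numbers are hypothetical and do not reproduce the paper's data.

```python
def mean_error_per_moment(runs):
    """Average the error at each moment across Monte Carlo runs.

    runs: list of equal-length error sequences, one sequence per run.
    """
    n = len(runs)
    return [sum(errors_at_t) / n for errors_at_t in zip(*runs)]

# Usage with two made-up runs of three moments each.
curve = mean_error_per_moment([[4.0, 2.0, 1.0], [6.0, 4.0, 1.0]])
```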
[Figure 13. Legend: method in this paper, IMM-EKF, method in [30]. Horizontal axis: search moment.]
As can be seen from Figure 13, the algorithm in this paper has two obvious advantages
over the other algorithms. First, because MPC is involved in the algorithm in this paper, its
error decreases significantly faster than that of the other algorithms.
Second, the algorithm in this paper is more stable. When the
motion state of the target changes, it is difficult for any algorithm to detect the change
immediately, so there are sudden changes in error in Figure 13. By comparison,
because the algorithm in this paper uses a combined network, its error
is less affected, and it also stabilizes faster.
To further compare the effectiveness of the positioning methods, this section records
the positioning time of the 30 Monte Carlo experiments for the above three methods. The results
are shown in Table 2.
As can be seen from Table 2, the algorithm in this paper is superior to the other two
algorithms in terms of efficiency. This is because, when using IMM-EKF to determine the
motion state of a target, it is necessary to calculate the probability of the target’s motion
state in the next moment based on its previous motion trajectory. The algorithm in this
paper only needs to input the trajectory into the trained network, and can directly predict
the position of the target in the next moment, which is faster.
The method in [30] provides an analytical solution, which makes the relationship
between the factors affecting the target’s positioning accuracy explicit and quantifiable.
However, the solution process of [30] involves inverting a
large number of matrices, which seriously affects the speed of the algorithm. Therefore, it
takes a long time.
[Figure 14. Legend: IPSO in this paper, IPSO in [31], HPSO in [32]. Horizontal axis: search moment.]
It can be seen from Figure 14 that the performance of the algorithm in this paper is
more stable, because it can perform a more global search and
thereby improve the algorithm efficiency.
The method in [31] focuses on enabling PSO particles to escape local
optima with maximum probability, thereby achieving a global search. To achieve this
goal, Formulas (5)–(7) in [31] define a method for generating approximately random search
directions. This setting can reduce the possibility of falling into a local optimum, but the
near-random approach has no significant effect on improving search performance.
The improvement in this article was inspired by [32], which groups particles for the search.
One disadvantage of [32] is that its particle search strategy, i.e., the update equation of
the particle state, is adjusted manually. In the iterative process of [32], the first 80% of searches
and the last 20% of searches use different update equations. However, [32] only shows through a
simple comparative experiment that the ratio of 80% to 20% performs better, without indicating
whether it is optimal. This ratio may well vary with the problem.
There is another reason why the method in this article is superior to
the above two methods. This article aims to solve a sequential decision-making
problem, in which the optimization result of the previous moment affects the next moment. If the
positioning accuracy of the previous moment is good, it provides a good initial condition
for the next moment, and the positioning accuracy of the next moment will not be poor. If
the positioning at the previous moment is poor, it also degrades the positioning at
the next moment. Therefore, over time, the advantage of the method in this article over
the other two methods grows.
To further compare the performance of the algorithms, the running time of the three
optimization algorithms is also recorded, and the results are shown in Table 3.
The comparison shows that the algorithm in this article is slower
than IPSO [31], but faster than HPSO [32].
The improvement of the IPSO algorithm to the search direction of the particles is
still based on a random mode. Compared with the PSO algorithm, this search mode adds
almost no computation. Therefore, IPSO maintains its high solution speed. The algorithm
in this paper involves further information interaction between groups and particles, with
a significant increase in computational complexity. Therefore, it is slower
than IPSO.
Both this algorithm and HPSO [32] involve particle grouping and information interaction.
However, in each iteration of the HPSO algorithm, the particle parameters at every
level are updated. In this article, by designing a clock cycle, particles at different levels
are updated according to their cycles, which reduces the amount of computation. This
also allows different particle populations to conduct more detailed searches of their regions.
Under the main premise of ensuring positioning accuracy, the effectiveness of the
algorithm in this paper is even higher.
7. Discussion
This section mainly discusses the main contributions, application scenarios, algorithm
deficiencies, and follow-up work of this article.
This paper built an RSS-based passive location method for moving targets for UAV
clusters. The target probability distribution network was designed to predict the subsequent
location of the target more clearly and easily. Thus, the mature static target positioning
method was extended to moving target positioning. At the same time, the PSO algorithm was
improved in this paper. In the simulation comparison, the improved method
performed well.
The research results can be applied in many ways, mainly in using a UAV cluster to
locate a target and to navigate without a GPS signal. UAV clusters can also be used
to search for and rescue people carrying mobile phones. Sound and electromagnetic information
can be collected to build digital maps. A cluster can also locate ships at sea, or discover and
locate concealed radar.
Although this paper has conducted some research work, there are still some limitations.
Firstly, the positioning model does not take altitude into account, so
this study is still far from accurate practical application. Secondly, although the
network can suppress complex noise, its effect is limited. Finally, the real-time performance
of the algorithm needs further design. The speed of the PSO algorithm itself cannot be increased
further, but because a UAV cluster comprises multiple platforms, parallel computing can be
considered. It is feasible to trade computing resources for optimization time.
To overcome these shortcomings, future research will build a passive positioning model of
UAVs in a three-dimensional scene and improve the network's
ability to recognize the target state under strong background noise. Additionally, a framework
of parallel computing will be designed to test and improve the algorithm.
8. Conclusions
In this paper, the problem of improving passive location accuracy was transformed
into the problem of obtaining more target information. Based on RSS and the A criterion,
a passive location method for moving targets was constructed. Firstly, the measurement
model of cluster passive location was constructed. After that, the relationship between
the UAV spatial positions and the static target localization effectiveness was derived.
Then, the difference between stationary target and moving target location was
analyzed. To expand the scope of application of the algorithm, the prediction
of the target position was realized by designing a deep combined network. Thereby, the
probability density distribution function required in the passive location process of the
moving target was obtained. Considering that trajectory optimization is an NP-hard
problem and that the PSO algorithm easily falls into a local
optimum, a layered improvement strategy based on time period was designed to improve
PSO performance. Then, a passive location algorithm flow based on the improved PSO
was constructed. Simulation verification and algorithm comparison demonstrated the feasibility
and performance advantages of the algorithm in this paper.
Author Contributions: Conceptualization, L.H. and F.X.; methodology, F.X.; software, S.M.; val-
idation, L.H. and F.X.; formal analysis, F.X.; writing—original draft preparation, F.X.; writing—
review and editing, L.H. and F.X. All authors have read and agreed to the published version of
the manuscript.
Funding: This research was funded by the National Natural Science Foundation of China, grant numbers
61502522 and 61502523.
Data Availability Statement: The data can be found at https://pan.baidu.com/s/1lf1zE2eMLW7mCsQi42NrkQ,
and the extraction code is nu82.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Zou, Y.; Wan, Q. Asynchronous Time-of-Arrival-Based Source Localization with Sensor Position Uncertainties. IEEE Commun.
Lett. 2016, 20, 1860–1863. [CrossRef]
2. Dang, L.; Yang, H.; Teng, B. Application of Time-Difference-of-Arrival Localization Method in Impulse System Radar and the
Prospect of Application of Impulse System Radar in the Internet of Things. IEEE Access 2018, 6, 44846–44857. [CrossRef]
3. Chitte, S.D.; Dasgupta, S.; Ding, Z. Distance Estimation from Received Signal Strength under Log-Normal Shadowing: Bias and
Variance. IEEE Signal Process. Lett. 2009, 16, 216–218. [CrossRef]
4. Cheng, Y.; Lin, Y. A new received signal strength based location estimation scheme for wireless sensor network. IEEE Trans.
Consum. Electron. 2009, 55, 1295–1299. [CrossRef]
5. Zhao, Y.; Hu, D.; Liu, Z.; Zhao, Y. Calibrating the Transmitter and Receiver Location Errors for Moving Target Localization in
Multistatic Passive Radar. IEEE Access 2019, 7, 118173–118187. [CrossRef]
6. Li, B.; Zhao, K.; Shen, X. Dilution of Precision in Location Systems Using both Angle of Arrival and Time of Arrival Measurements.
IEEE Access 2020, 8, 192506–192516. [CrossRef]
7. Wei, H.; Ye, S. Comments on “A Linear Closed-Form Algorithm for Source Localization from Time-Differences of Arrival”. IEEE
Signal Process. Lett. 2008, 15, 895. [CrossRef]
8. Abeywickrama, S.; Samarasinghe, T.; Ho, C.K.; Yuen, C. Wireless Energy Beamforming Using Received Signal Strength Indicator
Feedback. IEEE Trans. Signal Process. 2018, 66, 224–235. [CrossRef]
9. Maric, A.; Kaljic, E.; Njemcevic, P.; Lipovac, V. Projective Approach in Determining Homogeneous Hyperspherical Geometrically-
Based Stochastic Channel Model’s Statistics: Angle of Departure, Angle of Arrival and Time of Arrival. IEEE Trans. Wirel.
Commun. 2020, 19, 7864–7880. [CrossRef]
10. Xu, H.; Zhang, Y.; Ba, B.; Wang, D.; Li, X. Fast Joint Estimation of Time of Arrival and Angle of Arrival in Complex Multipath
Environment Using OFDM. IEEE Access 2018, 6, 60613–60621. [CrossRef]
11. Schau, H.; Robinson, A. Passive source localization employing intersecting spherical surfaces from time-of-arrival differences.
IEEE Trans. Acoust. Speech Signal Process. 1987, 35, 1223–1225. [CrossRef]
12. Wang, X.; Huang, Z.; Zhou, Y. Underdetermined DOA estimation and blind separation of non-disjoint sources in time-frequency
domain based on sparse representation method. J. Syst. Eng. Electron. 2014, 25, 17–25. [CrossRef]
13. Zhang, M.; Guo, F.; Zhou, Y.; Yao, S. A Single Moving Observer Direct Position Determination Method Using Interferometer
Phase Difference. Acta Aeronaut. Astronaut. Sin. 2013, 34, 2185–2193.
14. Liu, Y.; Guo, F.; Yang, L.; Jiang, W. Source localization using a moving receiver and noisy TOA measurements. Signal Process.
2016, 119, 185–189. [CrossRef]
15. Xi, L.; Guo, F.; Le, Y.; Min, Z. Improved solution for geolocating a known altitude source using TDOA and FDOA under random
sensor location errors. Electron. Lett. 2018, 54, 597–599.
16. Wang, G.H.; Bai, J.; He, Y.; Xiu, J.J. Optimal deployment of multiple passive sensors in the sense of minimum concentration
ellipse. IET Radar Sonar Navig. 2009, 3, 8–17. [CrossRef]
17. Bishop, A.N.; Fidan, B.; Anderson, B.D.O. Optimality Analysis of Sensor-Target Geometries in Passive Location: Part 2—Time-
of-Arrival Based Localization. In Proceedings of the 3rd International Conference on Intelligent Sensors Sensor Networks and
Information Processing, Melbourne, VIC, Australia, 3–6 December 2007.
18. Bishop, A.N.; Fidan, B.; Anderson, B.D.O. Optimality Analysis of Sensor-Target Geometries in Passive Location: Part 1—Bearing-
Only Localization. In Proceedings of the 3rd International Conference on Intelligent Sensors Sensor Networks and Information
Processing, Melbourne, VIC, Australia, 3–6 December 2007; pp. 7–12.
19. Lui, K.W.K.; So, H.C. A Study of Two-Dimensional Sensor Placement Using Time-Difference-of-Arrival Measurements. Digit.
Signal Process. 2009, 19, 650–659. [CrossRef]
20. Yang, B.; Scheuing, J. A Theoretical Analysis of 2D Sensor Arrays for TDOA Based Localization; ICASSP: Toulouse, France, 2006; pp.
901–904.
21. Wu, Z.; Fu, K.; Jedari, E.; Shuvra, S.R.; Rashidzadeh, R.; Saif, M. A Fast and Resource Efficient Method for Indoor Location Using
Received Signal Strength. IEEE Trans. Veh. Technol. 2016, 65, 9747–9758. [CrossRef]
22. Xu, S.; Ou, Y.; Zheng, W. Optimal Sensor-Target Geometries for 3-D Static Target Localization Using Received-Signal-Strength
Measurements. IEEE Signal Process. Lett. 2019, 26, 966–970. [CrossRef]
23. Vaghefi, R.M.; Gholami, M.R.; Buehrer, R.M.; Strom, E.G. Cooperative Received Signal Strength-Based Sensor Localization with
Unknown Transmit Powers. IEEE Trans. Signal Process. 2013, 61, 1389–1403. [CrossRef]
24. Hemavathi, N.; Meenalochani, M.; Sudha, S. Influence of Received Signal Strength on Prediction of Cluster Head and Number of
Rounds. IEEE Trans. Instrum. Meas. 2020, 69, 3739–3749. [CrossRef]
25. So, H.C.; Lin, L. Linear Least Squares Approach for Accurate Received Signal Strength Based Source Localization. IEEE Trans.
Signal Process. 2011, 59, 4035–4040. [CrossRef]
26. Kay, S.M. Fundamentals of Statistical Signal Processing: Estimation Theory; Prentice Hall PTR: Upper Saddle River, NJ, USA, 1993; pp.
47, 73–76.
27. Zhang, B.; Zhuang, L.; Gao, L.; Luo, W.; Ran, Q.; Du, Q. PSO-EM: A Hyperspectral Unmixing Algorithm Based on Normal
Compositional Model. IEEE Trans. Geosci. Remote Sens. 2014, 52, 7782–7792. [CrossRef]
28. Sun, T.; Tang, C.; Tien, F. Post-Slicing Inspection of Silicon Wafers Using the HJ-PSO Algorithm under Machine Vision. IEEE Trans.
Semicond. Manuf. 2011, 24, 80–88. [CrossRef]
29. Hernandez, M.; Farina, A. PCRB and IMM for Target Tracking in the Presence of Specular Multipath. IEEE Trans. Aerosp. Electron.
Syst. 2020, 56, 2437–2449. [CrossRef]
30. Yongsheng, Z.; Dexiu, H.; Yongjun, Z.; Zhixin, L. Moving target localization for multistatic passive radar using delay, Doppler
and Doppler rate measurements. J. Syst. Eng. Electron. 2020, 31, 939–949. [CrossRef]
31. Mistry, K.; Zhang, L.; Neoh, S.C.; Lim, C.P.; Fielding, B. A Micro-GA Embedded PSO Feature Selection Approach to Intelligent
Facial Emotion Recognition. IEEE Trans. Cybern. 2017, 47, 1496–1509. [CrossRef]
32. Roshanzamir, M.; Balafar, M.A.; Razavi, S.N. Empowering particle swarm optimization algorithm using multi agents’ capability:
A holonic approach. Knowl. Based Syst. 2017, 136, 58–74. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
drones
Article
Adjustable Fully Adaptive Cross-Entropy Algorithms for Task
Assignment of Multi-UAVs
Kehao Wang 1 , Xun Zhang 1 , Xuyang Qiao 1 , Xiaobai Li 2, *, Wei Cheng 2 , Yirui Cong 3 and Kezhong Liu 4
Abstract: This paper investigates the multiple unmanned aerial vehicle (multi-UAV) cooperative task
assignment problem. Specifically, we assign different types of UAVs to accomplish the classification,
attack, and verification tasks of targets under resource, precedence, and timing constraints. Due to
complex coupling among these tasks, we decompose the considered problem into two subproblems:
one with continuous and independent tasks and another with continuous and correlative tasks. To
solve them, we first present an adjustable, fully adaptive cross-entropy (AFACE) algorithm based on
the cross-entropy (CE) method, which serves as a stepping stone for developing other algorithms.
Secondly, to overcome task precedence in the first subproblem, we propose a mutually independent
AFACE (MIAFACE) algorithm, which converges faster than the CE method when obtaining the
optimal scheme vectors of these continuous and independent tasks. Thirdly, to deal with task
coupling in the second subproblem, we present a mutually correlative AFACE (MCAFACE) algorithm
to find the optimal scheme vectors of these continuous and correlative tasks, while its computational
complexity is inferior to that of the MIAFACE algorithm. Finally, numerical simulations demonstrate
that the proposed MIAFACE (MCAFACE, respectively) algorithm consumes less time than the existing
algorithms for the continuous and independent (correlative, respectively) task assignment problem.
e.g., particle swarm optimization (PSO) [8], ant colony optimization (ACO) [9], and ge-
netic algorithm (GA) [10], when solving the task assignment problem, they had a fast
convergence speed and could effectively obtain optimal assignment schemes, but there is a
possibility of falling into local optimum. Moreover, the auction algorithm, game theory, and
reinforcement learning have also been applied to the multi-UAV task assignment problem.
Duan et al. [11] presented a novel hybrid “two-stage” auction algorithm that combines the
structural advantages of the centralized and distributed auction algorithms, which greatly
facilitates the performance of UAVs in dynamic task assignments. Chen et al. [12] stud-
ied the cooperative reconnaissance and spectrum access (CRSA) problem for task-driven
heterogeneous coalition-based UAV networks, and proposed a joint bandwidth allocation
and coalition formation (JBACF) algorithm to solve the task assignment and bandwidth
allocation. Qie et al. [13] proposed an artificial intelligence method called simultaneous
target assignment and path planning (STAPP) to solve the multi-UAV target assignment
and path planning problem, and the effectiveness of the algorithm was experimentally
verified. In addition, references [14–21] provide a variety of alternative algorithms for the
solution of analogous problems.
Similarly, some novel works on task assignment, e.g., UAV-assisted task assignment,
have been presented. Liu et al. [22] studied a UAV-assisted IoT system while present-
ing a nonconvex age-of-information (AoI) minimization problem, which was solved by
jointly optimizing task assignment, interaction point selection (IPT), and UAV trajectories.
Zhu et al. [23] considered the problem of task loss rate (TLR) fairness among IoTs and
equal energy consumption (EC) fairness among UAVs, and proposed a multiagent deep
deterministic policy gradient (MA-DDPG) method by which to assign UAVs to accomplish
tasks and guarantee the balance between IoT TLR and UAV EC. Seid et al. [24] considered
the assignment of UAVs to perform aerial base station tasks based on a multi-UAV-assisted
IoT network framework, while presenting a joint optimization problem for computational
offloading with energy harvesting (EH) and resource price, and the resource demands
and pricing strategies between IoT devices and UAVs were continuously adjusted by the
Stackelberg game. Hu et al. [25] considered the aging of cache refreshing, computation
offloading, and state updates in UAV-assisted vehicle task awareness, and formulated a
task-assignment energy-minimization problem that was solved by a deep deterministic
policy gradient (DDPG) method. Zhou et al. [26] studied UAV-assisted mobile crowd
sensing (MCS) scenarios and proposed a UAV-assisted multitasking assignment (UMA)
method, while demonstrating the effectiveness of UMA. In addition, the difference between
UAV-assisted task assignment and UAV task assignment is that UAVs play a
secondary role in the former while serving as the primary reconnaissance and attack objects
in the latter. Furthermore, the simulation scenarios in this paper are not consistent with those
of the existing works (e.g., references [22–26]).
In the complex stochastic network, the cross-entropy (CE) method [27], a relatively
new technique for dealing with combinatorial optimization problems, was initially utilized
to estimate rare event probabilities. Then, references [28,29] discussed and analysed its
convergence. Additionally, the cross-entropy (CE) method was proved by the authors
in [30] to be particularly meaningful for handling combinatorial optimization problems.
Since then, it has also been proven by many scholars to be a simple and effective tool for
different fields, e.g., vehicle routing [31], buffer allocation [31], and machine learning [32].
In addition, researchers have also considered applying the cross-entropy (CE) method to
the UAV task assignment [33–35]. However, the authors of these papers did not consider
the specific precedence and timing constraints among these tasks.
When it comes to task-assignment schemes in the field of UAVs, researchers
often assume that each UAV is assigned to only one target, and they rarely consider the
execution sequence and the time constraints among tasks. However, multiple UAVs
are sometimes needed to perform complex combinatorial tasks, such as classifying
the target, attacking it, and then verifying the target's damage level within a reasonable time
on the battlefield. In addition, deterministic approaches may not be able to find the
Drones 2023, 7, 204
optimal solution in a reasonable time for large-scale task assignment problems. Under these
circumstances, we present an adjustable fully adaptive cross-entropy (AFACE) algorithm
based on the CE method.
Therefore, the purpose of this paper is to study the AFACE algorithm for the multi-
UAV cooperative task assignment problem under resource, precedence, and timing con-
straints. The main contributions are summed up as follows.
• We consider the multi-UAV cooperative task assignment problem in which different
types of UAVs are assigned to perform classification, attack, and verification tasks on
targets under resource, precedence, and timing constraints. Considering the complex
coupling among these tasks, we decompose the considered problem into two subproblems:
one with continuous and independent tasks and another with continuous and
correlative tasks.
• We propose an AFACE algorithm, which changes the random sample and the quantile
at each iteration and adds a parameter to adjust the maximum sample based on the
CE method. Meanwhile, the algorithm serves as a stepping stone for developing other
algorithms.
• To overcome task precedence and task coupling existing in these two problems, re-
spectively, we present a mutually independent AFACE (MIAFACE) algorithm and a
mutually correlative AFACE (MCAFACE) algorithm with polynomial time complexity.
The former algorithm converges faster than the CE method, while the computational
complexity of the latter algorithm is inferior to that of the former algorithm.
• Simulation results demonstrate that both MIAFACE and MCAFACE algorithms con-
sume less time than other existing optimization algorithms for solving the correspond-
ing problem.
The rest of this paper is organized as follows. In Section 2, we introduce the related works
on the CE method and other algorithms for UAV task assignment. Section 3 describes the
multi-UAV cooperative task assignment problem with its mathematical formulation. In Section 4, we
decompose the considered problem into two subproblems, propose an AFACE algorithm,
an MIAFACE algorithm, and an MCAFACE algorithm, and apply the latter two algorithms to
solve the corresponding problems. Section 5 conducts several simulations and comparisons to
verify the feasibility and effectiveness of the proposed algorithms. This paper is concluded in
Section 6.
2. Related Work
This section reviews the related works on CE method and other algorithms used for
UAV task assignment.
formed the Branch and Bound algorithm in solving the above problem, especially on a
large scale.
Referring to [33,34], the authors in [35] described the multitype UAV task assignment
problem. In this problem, different types of UAVs, or the same type of UAV as well as
resource constraints, were considered. The authors then formulated the problem and
provided a score function under resource constraints. Then, the CE method was used to
determine the optimal scheme of this problem by assigning multitype UAVs to complete
tasks. Finally, numerical simulations of the CE method for task assignment, as well as
comparisons with the exhaustive search method, were conducted to verify its merits in solving
the considered problem.
In [36], the authors first analyzed the CE method, then redefined its construct and
applied it to UAV swarms. Subsequently, due to the robustness of this method, it could be
used as an effective measure to control UAV swarms in the face of obstacles and unforeseen
problems. Finally, it was validated to support UAV swarms in achieving mission objectives.
The authors of [37] considered the multi-UAV task assignment problem under resource
constraint and precedence constraint. The fully adaptive cross-entropy (FACE) algorithm
based on the CE method was then applied to solve the considered problem. Then, simu-
lation results verified that the FACE algorithm was better than the CE method and PSO
algorithm in terms of convergence speed.
3. Problem Description
On the battlefield, multiple UAVs are deployed to perform different tasks, for example,
to classify targets before attacking them, and then to verify whether these
tasks have been accomplished. The problem considered in this paper is the selection of a
mix of the same type of UAV or different types of UAVs from their bases to perform the
classification, attack, and verification tasks of targets. As shown in Figure 1, there are $N_b$
types of UAVs with the same speed, and the related components of this problem can be
defined as a 5-tuple $\{\mathcal{A}, \mathcal{B}, \mathcal{G}, \mathcal{K}, \mathcal{T}\}$. In the 5-tuple, $\mathcal{A} := \{1, 2, \ldots, N_m\}$ denotes the set of
task indices of targets, $\mathcal{B} := \{1, 2, \ldots, N_b\}$ represents the set of $N_b$ bases, $\mathcal{G} := \{1, 2, \ldots, N_t\}$
denotes the set of $N_t$ targets with known positions, $\mathcal{K} := \{K_1, K_2, \ldots, K_{N_m}\}$ represents the
set of $N_m$ tasks of targets, and $\mathcal{T} := \{T_1, T_2, \ldots, T_{N_m}\}$ denotes the set of the execution times
of the $N_m$ tasks of targets. Note that the time required to allocate tasks is ignored.
Variables Explanation
Nb The number of bases
Nt The number of targets
j The target index
K The set of tasks of targets
Nm The number of tasks of targets
m The task index of targets
X The set of all possible UAV deployment schemes
L The number of X for each task
z The maximum number of UAVs in each scheme of X
Z The set of all possible UAV deployment scheme indexes
k A UAV deployment scheme index or a UAV formation index
x(m) A feasible UAV deployment scheme vector of task m
x j (m) A feasible UAV deployment scheme or a UAV formation of task m of target j
g ( x j ( m ); k ) A 0–1 decision variable
Y The set of all feasible x(m)
Ω( x (m)) The performance of task m
Ω The performance vector of Ω( x (m))
ρ The total objective function
ψ( x j (2)) The reward benefit of the attack task of target j
ϕ( x j (m)) The cost of assigning UAV formation x j (m) to accomplish task m of target j
$p_k^j$ The probability of killing target j
$p_s^j$ The UAV survival probability of accomplishing tasks of target j
w1 , w2 and w3 Weight coefficients
Pc and V The target identification certainty and the constant velocity of each UAV
y j and s j The value and the threat level of target j
d j (m) The farthest distance from the base corresponding to UAV formation x j (m) to target j
Dmax The maximum flying distance
Tm The execution time of task m
Moreover, let x(m) = [ x1 (m), x2 (m), . . . , x Nt (m)] T denote a feasible UAV deployment
scheme vector, and define Y as the set of all feasible x(m). Let Z := {1, 2, . . . , L} be the set
of all possible UAV deployment scheme indices. Thus, x(m) satisfies
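$$g(x_j(m); k) = \begin{cases} 1, & \text{if } \varphi(x_j(m)) = k, \\ 0, & \text{if } \varphi(x_j(m)) \neq k, \end{cases} \qquad j \in \mathcal{G},\ m \in \mathcal{A},\ k \in \mathcal{Z},\ x_j(m) \in \mathcal{X}, \tag{1}$$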
where j denotes the target index, m designates the task index, x j (m) is a feasible UAV
deployment scheme or a UAV formation of task m of target j, k represents a UAV deploy-
ment scheme index or a UAV formation index, φ( x j (m)) is an index function that serves to
output the subscript corresponding to x j (m) in X , and g( x j (m); k ) is a 0–1 decision variable,
i.e., the kth UAV formation is assigned to accomplish task m of target j.
Then, the total objective function ρ based on x(m) is defined as
$$\rho = \sum_{m=1}^{N_m} \Omega(x(m)) = \sum_{j=1}^{N_t} \psi(x_j(2)) - \sum_{m=1}^{N_m} \sum_{j=1}^{N_t} \phi(x_j(m)), \tag{2}$$
where Ω( x (m)) is the subobjective function of task m, ψ( x j (2)) and ϕ( x j (m)) denote the
reward benefit of the attack task and the cost of assigning x j (m) to accomplish tasks of
target j, respectively, which are
$$\psi(x_j(2)) = w_1 \times P_c \times p_k^j \times y_j \tag{3}$$
$$\phi(x_j(m)) = w_2 \times p_s^j \times s_j + w_3 \times (V \times T_m + d_j(m)), \tag{4}$$
where $P_c$ is the target identification certainty, $y_j$ represents the value of target j, $s_j$ denotes
the threat level of target j, V is the constant velocity of each UAV, $T_m$ represents the execution
time of task m, $d_j(m)$ denotes the farthest distance from the bases corresponding to UAV
formation $x_j(m)$ to target j, $w_1$, $w_2$, and $w_3$ represent weight coefficients indicating the
relative importance of each subobjective, $p_k^j$ denotes the probability
of killing target j, and $p_s^j$ is the UAV survival probability of accomplishing task m of target j.
In addition, $p_k^j$ and $p_s^j$ are defined as
$$p_k^j = \prod_{a \in x_j(2)} p_{aj} \tag{5}$$
$$p_s^j = 1 - \prod_{b \in x_j(m)} p_{bj}, \tag{6}$$
where a stands for a UAV in UAV formation x j (2); p aj is the probability of killing target
j with UAV a, b represents a UAV in UAV formation x j (m), and pbj is the UAV survival
probability of accomplishing task m of target j with UAV b.
According to Equations (2)–(6), ρ is rewritten as
$$\rho = \sum_{j=1}^{N_t} w_1 \times P_c \times p_k^j \times y_j - \sum_{m=1}^{N_m} \sum_{j=1}^{N_t} \left[ w_2 \times p_s^j \times s_j + w_3 \times (V \times T_m + d_j(m)) \right]. \tag{7}$$
Then, our objective is to maximize ρ, and the considered problem can be formulated as
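$$\mathcal{P}: \max_{x(m) \in \mathcal{Y}} \rho = \sum_{m=1}^{N_m} \Omega(x(m)) \tag{8}$$
$$\text{s.t.} \quad w_1 + w_2 + w_3 = 1, \quad 0 \le w_1, w_2, w_3 \le 1 \tag{9}$$
$$d_j(m) + V \times T_m \le D_{\max} \quad \forall j, m \tag{10}$$
$$K_1^j \prec K_2^j \prec K_3^j \quad \forall j. \tag{11}$$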
Constraint (9) represents the range of $w_1$, $w_2$, and $w_3$. Constraint (10) requires that, for target
j, the sum of $d_j(m)$ and the distance $V \times T_m$ flown by the UAV formation $x_j(m)$ while executing
task m does not exceed the maximum flying distance $D_{\max}$. Constraint (11) means that $K_1^j$, $K_2^j$, and
$K_3^j$ are the classification, attack, and verification tasks of target j, which are executed in
a specific order, and $\prec$ denotes the precedence relation.
According to Equation (11), the specific precedence and timing constraints are equivalent to
$$\begin{cases} t_{s1}^j \ge s_1, & e_1 \ge t_{s1}^j + T_1 \\ t_{s2}^j \ge s_2, & e_2 \ge t_{s2}^j + T_2 \\ t_{s3}^j \ge s_3, & e_3 \ge t_{s3}^j + T_3 \end{cases}, \tag{12}$$
where $[s_1, e_1]$, $[s_2, e_2]$, and $[s_3, e_3]$ represent the classification, attack, and verification time
windows, and $t_{s1}^j$, $t_{s2}^j$, and $t_{s3}^j$ denote the start times of the classification, attack, and verification
tasks of target j, respectively.
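The timing part of Equation (12) can be illustrated with a minimal Python sketch. It assumes that the three intervals listed in the simulation parameter table are the windows [s, e] in that order and uses the listed execution times; the function names and the dictionary layout are hypothetical and are not part of the original method. Only the window conditions of Equation (12) are checked; no further precedence logic is modeled.

```python
# Time windows [s_m, e_m] and execution times T_m for tasks K1..K3.
# Values are taken from the simulation parameter table; reading each
# interval as [start, end] is an assumption of this sketch.
WINDOWS = {1: (3, 10), 2: (8, 20), 3: (18, 26)}
DURATIONS = {1: 5, 2: 10, 3: 5}


def feasible_start_interval(window, duration):
    """Return (earliest, latest) admissible start time, or None if the task cannot fit."""
    start, end = window
    latest = end - duration  # from e_m >= ts_m + T_m
    return (start, latest) if latest >= start else None


def is_schedule_feasible(start_times):
    """Check ts_m >= s_m and e_m >= ts_m + T_m for every task m in start_times."""
    for m, ts in start_times.items():
        s, e = WINDOWS[m]
        if not (ts >= s and e >= ts + DURATIONS[m]):
            return False
    return True
```

For instance, under these assumed windows the classification task may start anywhere in [3, 5], so a start time of 6 would violate its window.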
Moreover, we set a certain value γ, which ensures that the optimal scheme vector
$x^*(m)$ satisfies $\Omega(x^*(m)) \ge \gamma$. The maximum $\rho^*$ is then written as
$$\rho^* = \sum_{m=1}^{N_m} \Omega(x^*(m)) \ge N_m \gamma. \tag{13}$$
4. Algorithm Analysis
In this section, an AFACE algorithm is introduced for the considered problem.
It differs from the cross-entropy (CE) method in that it
changes the random sample size $N_d^t$ and the quantile $\theta_t$ at each iteration t, and adds a
parameter to adjust the maximum sample size $N^{\max}$. For details, please refer to the analysis of
the algorithm below.
where γ∗ is the maximum of Ω( x (m)) on Y ; that is, the optimal scheme vector is x∗ (m).
After that, we transform this problem into a probability estimation problem, which can
be explained by the probability density function (PDF) f(·; u) with respect to u, and the
problem can be written as
$$P_u(\Omega(x(m)) \ge \gamma) = E_u I_{\{\Omega(x(m)) \ge \gamma\}},$$
where γ denotes a value close to γ∗ , Pu represents the probability measure under which
the random vector x(m) has the PDF f (·; u), Eu is the corresponding expectation operator,
and I ( x(m); γ), i.e., I{Ω( x(m))≥γ} , denotes the indicator function, which is
$$I(\cdot; \gamma) = \begin{cases} 1, & \text{if } \Omega(x(m)) \ge \gamma \\ 0, & \text{if } \Omega(x(m)) < \gamma \end{cases}. \tag{16}$$
where $\Omega_{t,i}$ ($i = 1, 2, \ldots, N_d^t$) denotes the ith sample performance, and $\Omega(x_i(m))$ and $\Omega_{t,N_d^t}$
are denoted by $\Omega_{t,i}$ and $\Omega_t^*$ for convenience. Meanwhile, the AFACE algorithm parameters $N_d^t$
and $\theta_t$ satisfy
$$\begin{cases} N^{\min} \le N_d^t \le N^{\max} \\ \theta_t = \beta_m / N_d^t \end{cases}, \tag{18}$$
where $N_d^t$ denotes the random sample size of the tth iteration, varying between $N^{\min}$ and
$N^{\max}$ ($N^{\min} = N$, $N^{\max} = hN$, $h \in \{2, 3, 4, 5\}$), and $\theta_t$ represents the quantile of the tth
iteration. The parameter h is introduced because, by adjusting the size of $N^{\max}$, we can obtain the
$N^{\max}$ that best matches the combat scenario, as examined in the
simulations in Section 5.
For the AFACE algorithm, the main idea is to update $N_d^t$ and $\theta_t$ based on the elite
sample $\beta_m$ ($\beta_m = c_m N$), where $c_m$ and N are the elite sample influence coefficient of task m
(usually $0.01 \le c_m \le 0.1$) and the fixed random sample size, respectively. Therefore, the set of
elite samples $\varepsilon_t$ ($\varepsilon_t \in \mathcal{Y}$) comprises the $\beta_m$ samples in $\{x_1(m), x_2(m), \ldots, x_{N_d^t}(m)\}$
with the highest performances among $\Omega_{t,1}, \Omega_{t,2}, \ldots, \Omega_{t,N_d^t}$.
Next, referring to the formulas for solving $\hat{\gamma}_t$ and $\hat{v}_t$ in the CE method [30], they are
modified as
$$\hat{\gamma}_t = \Omega_{((1-\theta_t) N_d^t)} \tag{19}$$
where $x_i(m)$ is generated from f(·; u), f(·; v) denotes another PDF with respect to v on $\mathcal{Y}$
obtained via minimizing the Kullback–Leibler distance, $\hat{\gamma}_t$ is equal to the worst sample performance
among the elite performances, while $\Omega_t^*$ is the best sample performance among the elite
performances, and $\hat{v}_t$ converges to the probability density when $\Omega_t^*$ occurs.
Then, we devise a sampling scheme for each iteration t, ensuring with high probability that
$$\begin{cases} \Omega_t^* > \Omega_{t-1}^* \\ \hat{\gamma}_t > \hat{\gamma}_{t-1} \end{cases}. \tag{21}$$
Algorithm 1 Adaptive updating of $\hat{\gamma}_t$ and $\hat{v}_t$.
Adaptive updating of $\hat{\gamma}_t$:
1: Given a fixed $\hat{v}_{t-1}$ at the tth iteration;
2: Let $\gamma_t$ be a $(1 - \theta_t)$-quantile of $\Omega(x(m))$ under $\hat{v}_{t-1}$; then $\gamma_t$ satisfies $P_{\hat{v}_{t-1}}(\Omega(x(m)) \le \gamma_t) \ge 1 - \theta_t$, where $x(m) \sim f(\cdot; \hat{v}_{t-1})$;
3: Obtain a simple estimator $\hat{\gamma}_t$ of $\gamma_t$ by drawing $N_d^t$ random samples $x_1(m), x_2(m), \ldots, x_{N_d^t}(m)$ from $f(\cdot; \hat{v}_{t-1})$;
4: Calculate and order all performances of $\Omega(x(m))$ from smallest to largest: $\Omega_{t,1} \le \cdots \le \Omega_{t,N_d^t}$;
5: Compute $\hat{\gamma}_t$ according to Equation (19);
Adaptive updating of $\hat{v}_t$:
6: Given fixed $\hat{\gamma}_t$ and $\hat{v}_{t-1}$ at the tth iteration, derive $\hat{v}_t$ according to Equation (20).
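Steps 4 and 5 of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, assuming that the elite count β_m is an integer, so that the order-statistic index (1 − θ_t)N_d^t of Equation (19) equals N_d^t − β_m; the function name `elite_threshold` is hypothetical.

```python
def elite_threshold(performances, beta_m):
    """Return (gamma_hat, elites) for one iteration.

    performances: list of sample performances Omega_{t,i}
    beta_m:       integer elite count (assumed), so theta_t = beta_m / N_d^t
    """
    n_dt = len(performances)
    ordered = sorted(performances)           # step 4: ascending order
    # (1 - theta_t) * N_d^t = N_d^t - beta_m, used as a 1-based index.
    idx = min(max(n_dt - beta_m, 1), n_dt)
    gamma_hat = ordered[idx - 1]             # step 5: Equation (19)
    # Elite performances: everything at or above the threshold.
    elites = [p for p in ordered if p >= gamma_hat]
    return gamma_hat, elites
```

With this indexing the threshold sample itself is kept among the elites, which mirrors how the elite set is written later as starting from the order statistic at (1 − θ_t)N_d^t.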
Theorem 1. Assume that z ≥ 1 and $N_b = 3$, and let L be the number of available schemes for each task
of targets. Then, according to the mathematical formulas of permutation and combination,
we can obtain
$$L = 3z + \frac{z(z-1)(z+7)}{6}, \quad z \ge 1. \tag{22}$$
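Equation (22) can be cross-checked by brute force. The sketch below assumes that a scheme is an unordered formation of between 1 and z UAVs drawn from three types with repetition allowed (which matches the scheme list used later in the simulations); the function names are hypothetical.

```python
from itertools import combinations_with_replacement


def count_schemes(z, types="ABC"):
    """Count unordered formations of 1..z UAVs drawn from the given types."""
    return sum(
        1
        for size in range(1, z + 1)
        for _ in combinations_with_replacement(types, size)
    )


def closed_form(z):
    """Equation (22): L = 3z + z(z - 1)(z + 7) / 6 (the product is divisible by 6)."""
    return 3 * z + z * (z - 1) * (z + 7) // 6
```

For z = 3 both functions give 19, the number of schemes used in the simulations.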
task has no effect on the choice of the scheme for the next task, indicating that the available
schemes among these tasks are independent. Thus, the problem P 1 is rewritten as
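$$\mathcal{P}1: \begin{aligned} &\max_{x(m) \in \mathcal{Y}} \rho = \sum_{m=1}^{N_m} \Omega(x(m)) \\ &\text{s.t.} \quad (9)\text{–}(12), \\ &\phantom{\text{s.t.}} \quad l_m^1 = L \quad \forall m \end{aligned}, \tag{23}$$
where $l_m^1$ is the number of available schemes when performing the mth task.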
Considering the time sequence and independence of the available schemes among these tasks,
we present an MIAFACE algorithm, which is a combination of $N_m$ AFACE algorithms. For the MIAFACE
algorithm, we first introduce the probability matrix vector $P = [P(1), P(2), \ldots, P(N_m)]^T$
and the performance vector $\Omega = [\Omega(x(1)), \Omega(x(2)), \ldots, \Omega(x(N_m))]^T$, where $P(m)$ and
$\Omega(x(m))$ are the probability matrix and the performance of task m, respectively. Then,
$P(m)$ is defined as
$$P(m) = \begin{pmatrix} p(1|1,m) & p(2|1,m) & \cdots & p(l_m^1|1,m) \\ p(1|2,m) & p(2|2,m) & \cdots & p(l_m^1|2,m) \\ \vdots & \vdots & \ddots & \vdots \\ p(1|N_t,m) & p(2|N_t,m) & \cdots & p(l_m^1|N_t,m) \end{pmatrix}_{N_t \times l_m^1},$$
where $p(k|j,m)$ represents the probability of assigning the kth UAV formation to accomplish
task m of target j, and $P(m)$ is subject to $\sum_{k=1}^{l_m^1} p(k|j,m) = 1$.
Then, for the mth task, we initialize $P_0(m) = (p_0(k|j,m))_{N_t \times l_m^1}$ with a uniform distribution.
Let $n_{jm}^1$ be the number of feasible schemes of target j, and define $p_0(k|j,m) := 1/n_{jm}^1$
as the element of $P_0(m)$. After that, we set $\hat{v}_0 = P_0(m)$.
At the tth iteration, we assume that the samples $x_1(m), x_2(m), \ldots, x_{N_{d1}^t}(m)$ are drawn
from $f(x(m); \hat{v}_{t-1}(m))$. In addition, we calculate the performances $\Omega_{t,i}$ ($i = 1, 2, \ldots, N_{d1}^t$)
and order them from smallest to largest: $\Omega_{t,1} \le \Omega_{t,2} \le \cdots \le \Omega_{t,N_{d1}^t}$. Note that $\beta_m^1$
is calculated by $\beta_m^1 = c_m^1 N$, and $\hat{\gamma}_t$ is updated by Equation (19). After that, we compare
$\Omega_{t,i}$ with $\hat{\gamma}_t$, obtain all eligible performances greater than $\hat{\gamma}_t$, and merge them into a
set $S_1 := \{\Omega_{(t,(1-\theta_t)N_{d1}^t)}, \Omega_{(t,(1-\theta_t)N_{d1}^t+1)}, \ldots, \Omega_{t,N_{d1}^t}\}$, where $\beta_m^1$ is the number of
elements of $S_1$, and $\Omega_t^*$ is the maximum element of $S_1$. Then, $p_t(k|j,m)$ is calculated, and the
specific derivation process can be seen in Theorem 2. Thus, $P_t(m)$ is the probability matrix
composed of $p_t(k|j,m)$, and $\hat{v}_t$ is equal to $P_t(m)$.
Theorem 2. Assume that there are $N_m$ continuous and mutually independent tasks for each target.
The $N_m$ tasks correspond to $N_m$ AFACE algorithms, each of which has an elite sample of $\beta_m^1 = c_m^1 N$.
In the MIAFACE algorithm, $c_1$ is a combined vector of $c_m^1$, e.g., $c_1 = [c_1^1, c_2^1, \ldots, c_{N_m}^1]^T$. Thus,
when performing the mth task, we can obtain the updating formula of $P(m)$ as follows:
$$p(k|j,m) = \frac{\sum_{n=1}^{c_m^1 N} g(x_j^n(m); k)}{c_m^1 N}, \qquad k \in \{1, \ldots, L\},\ n \in \{1, \ldots, c_m^1 N\},\ c_m^1 \in c_1. \tag{24}$$
Through the iterative updating of $P(m)$, the optimal probability matrix vector $P^*$ and
the maximum performance vector $\Omega^*$ are obtained. Then, the main steps of the MIAFACE
algorithm applied to solving problem $\mathcal{P}1$ are described in Algorithm 3, and the convergence
of the MIAFACE algorithm is similar to that of the CE method in [40].
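The update in Equation (24) amounts to counting how often each scheme index appears for each target among the elite samples. A minimal Python sketch, assuming each elite sample is a list holding one 1-based scheme index per target (a hypothetical data layout), could read:

```python
def update_probabilities(elite_samples, num_targets, num_schemes):
    """Frequency-based update of P(m) in the spirit of Equation (24).

    elite_samples: list of scheme vectors; sample[j] is the 1-based scheme
                   index assigned to target j (0-based position j).
    Returns an N_t x L matrix whose rows sum to 1.
    """
    n_elite = len(elite_samples)
    counts = [[0] * num_schemes for _ in range(num_targets)]
    for sample in elite_samples:
        for j, k in enumerate(sample):
            counts[j][k - 1] += 1            # g(x_j^n(m); k) = 1 for this k
    # Divide by the elite count c_m N to obtain probabilities.
    return [[c / n_elite for c in row] for row in counts]
```

For example, four elite samples [[1, 2], [1, 3], [2, 3], [1, 3]] over two targets and three schemes give the rows [0.75, 0.25, 0.0] and [0.0, 0.25, 0.75].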
Output: $P^*$, $\Omega^*$.
1: Set $N^{\min} = N$ and $N^{\max} = hN$;
2: for m = 1; m ≤ $N_m$; m++ do
3:   Initialize $P_0(m)$ with a uniform distribution and define $\hat{v}_0 = P_0(m)$, then set t = 1;
4:   while at the tth iteration (t ≥ 1) do
5:     if t = 1 then
6:       Generate $N_{d1}^t$ ($N_{d1}^t = N^{\min}$) random samples $x_1(m), x_2(m), \ldots, x_{N_{d1}^t}(m)$ from $f(\cdot; \hat{v}_0)$;
7:     else
8:       Draw $N_{d1}^t$ ($N^{\min} \le N_{d1}^t \le N^{\max}$) random samples $x_1(m), x_2(m), \ldots, x_{N_{d1}^t}(m)$ from $f(\cdot; \hat{v}_{t-1})$;
9:     end if
10:    Update $\hat{\gamma}_t$ according to Equation (19) and calculate $\Omega_t^*$;
11:    Calculate $p_t(k|j,m)$ by Equation (A14) in Appendix B;
12:    if $\sum_{k=1}^{l_m^1} p_t(k|j,m) = 1$ and $p_t(k|j,m) \in \{0, 1\}$ then
13:      Stop, obtain $P_t^*(m)$ and $\Omega_t^*$, then $P^*(m) \leftarrow P_t^*(m)$ and $\Omega(x^*(m)) \leftarrow \Omega_t^*$;
14:    else
15:      Calculate $P_t(m)$ and update $\hat{v}_t$ by $\hat{v}_t = P_t(m)$;
16:      Set t = t + 1, take a random integer $N_{d1}^t$ in $[N^{\min}, N^{\max}]$, then go to step 4;
17:    end if
18:  end while
19: end for
20: Return $P^*$ and $\Omega^*$.
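The inner loop of the listing above (steps 3 to 18, for a single task m) can be sketched in Python. This is an illustrative toy under stated assumptions, not the authors' implementation: the score function `toy_score` and table `TOY_REWARD` are hypothetical stand-ins for Ω(x(m)), scheme indices are 0-based, every scheme is treated as feasible, and an iteration cap `max_iter` is added as a safeguard.

```python
import random

# Hypothetical per-target reward of each scheme (3 targets, 4 schemes).
TOY_REWARD = [[1, 5, 2, 0], [3, 1, 4, 2], [0, 2, 1, 6]]


def toy_score(x):
    """Toy separable performance: sum of the rewards of the chosen schemes."""
    return sum(TOY_REWARD[j][k] for j, k in enumerate(x))


def aface_single_task(score, num_targets, num_schemes, N=200, c_m=0.1, h=2,
                      max_iter=50, seed=0):
    rng = random.Random(seed)
    n_min, n_max = N, h * N                       # step 1
    beta = max(1, round(c_m * N))                 # elite count beta_m = c_m * N
    # Step 3: uniform initial probability matrix (one row per target).
    P = [[1.0 / num_schemes] * num_schemes for _ in range(num_targets)]
    best_x, best_val = None, float("-inf")
    n_dt = n_min                                  # first iteration uses N^min
    for _ in range(max_iter):
        # Steps 6/8: draw n_dt scheme vectors, one scheme per target.
        samples = [[rng.choices(range(num_schemes), weights=row)[0] for row in P]
                   for _ in range(n_dt)]
        samples.sort(key=score)                   # ascending performance
        elites = samples[-beta:]                  # the beta best samples
        if score(elites[-1]) > best_val:
            best_x, best_val = elites[-1], score(elites[-1])
        # Step 11: frequency update of the probability matrix.
        counts = [[0] * num_schemes for _ in range(num_targets)]
        for x in elites:
            for j, k in enumerate(x):
                counts[j][k] += 1
        P = [[c / beta for c in row] for row in counts]
        # Step 12: stop once every row is degenerate (entries in {0, 1}).
        if all(max(row) == 1.0 for row in P):
            break
        n_dt = rng.randint(n_min, n_max)          # step 16: new sample size
    return best_x, best_val, P
```

A call such as `aface_single_task(toy_score, num_targets=3, num_schemes=4)` returns the best scheme vector found, its score, and the final probability matrix; running it once per task would mimic the outer loop over m.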
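$$\mathcal{P}2: \begin{aligned} &\max_{x(m) \in \mathcal{Y}} \rho = \sum_{m=1}^{N_m} \Omega(x(m)) \\ &\text{s.t.} \quad (9)\text{–}(12), \\ &\phantom{\text{s.t.}} \quad l_m^2 = L - m + 1 \quad \forall m \end{aligned}, \tag{25}$$
where $l_m^2$ is the number of remaining schemes when performing the mth task.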
Considering the time sequence and relevance of the available schemes among these
tasks, we present an MCAFACE algorithm, which is also a combination of $N_m$ AFACE algorithms.
For the MCAFACE algorithm, we first introduce the probability matrix vector
$Q = [Q(1), Q(2), \ldots, Q(N_m)]^T$ and the performance vector $\Omega = [\Omega(x(1)), \Omega(x(2)), \ldots, \Omega(x(N_m))]^T$,
where $Q(m)$ and $\Omega(x(m))$ are the probability matrix and the performance of
task m, respectively. Then, $Q(m)$ is defined as
$$Q(m) = \begin{pmatrix} q(1|1,m) & q(2|1,m) & \cdots & q(l_m^2|1,m) \\ q(1|2,m) & q(2|2,m) & \cdots & q(l_m^2|2,m) \\ \vdots & \vdots & \ddots & \vdots \\ q(1|N_t,m) & q(2|N_t,m) & \cdots & q(l_m^2|N_t,m) \end{pmatrix}_{N_t \times l_m^2},$$
where $q(k|j,m)$ represents the probability of assigning the kth UAV formation to accomplish
task m of target j, and $Q(m)$ is subject to $\sum_{k=1}^{l_m^2} q(k|j,m) = 1$.
Then, for the mth task, we initialize $Q_0(m) = (q_0(k|j,m))_{N_t \times l_m^2}$ with a uniform distribution.
Let $n_{jm}^2$ be the number of feasible schemes of target j and define $q_0(k|j,m) := 1/n_{jm}^2$
as the element of $Q_0(m)$. After that, we set $\hat{v}_0 = Q_0(m)$.
At the tth iteration, we assume that the samples $x_1(m), x_2(m), \ldots, x_{N_{d2}^t}(m)$ are drawn
from $f(x(m); \hat{v}_{t-1}(m))$. In addition, we calculate the performances $\Omega_{t,i}$ ($i = 1, 2, \ldots, N_{d2}^t$)
and order them from smallest to largest: $\Omega_{t,1} \le \Omega_{t,2} \le \cdots \le \Omega_{t,N_{d2}^t}$. Note that $\beta_m^2$
is calculated by $\beta_m^2 = c_m^2 N$, and $\hat{\gamma}_t$ is updated by Equation (19). After that, we compare $\Omega_{t,i}$ with
$\hat{\gamma}_t$, obtain all eligible performances greater than $\hat{\gamma}_t$, and merge them into a set
$S_2 := \{\Omega_{(t,(1-\theta_t)N_{d2}^t)}, \Omega_{(t,(1-\theta_t)N_{d2}^t+1)}, \ldots, \Omega_{t,N_{d2}^t}\}$, where $\beta_m^2$ is the number of elements of
$S_2$ and $\Omega_t^*$ is the maximum element of $S_2$. Then, $q_t(k|j,m)$ is calculated, and the specific
derivation process can be found in Theorem 3. Thus, $Q_t(m)$ is the probability matrix
composed of $q_t(k|j,m)$, and $\hat{v}_t$ is equal to $Q_t(m)$.
Theorem 3. Assume that there are $N_m$ continuous and mutually correlative tasks for each target,
and that the selected scheme is deleted after each task is accomplished. The other
settings are the same as in Theorem 2. Thus, when performing the mth task, we can obtain the updating
formula of $Q(m)$ as follows:
$$q(k|j,m) = \frac{\sum_{n=1}^{c_m^2 N} g(x_j^n(m); k)}{c_m^2 N}, \qquad k \in \{1, \ldots, L - m + 1\},\ n \in \{1, \ldots, c_m^2 N\},\ c_m^2 \in c_2. \tag{26}$$
Through the iterative updating of $Q(m)$, the optimal probability matrix vector $Q^*$ and
the maximum performance vector $\Omega^*$ are obtained. Then, the main steps of the MCAFACE
algorithm for dealing with problem $\mathcal{P}2$ are explained in Algorithm 4, and the convergence
of the MCAFACE algorithm is also close to that of the CE method in [40].
Output: $Q^*$, $\Omega^*$.
1: Set $N^{\min} = N$ and $N^{\max} = hN$;
2: for m = 1; m ≤ $N_m$; m++ do
3:   Initialize $Q_0(m)$ with a uniform distribution and define $\hat{v}_0 = Q_0(m)$, then set t = 1;
4:   while at the tth iteration (t ≥ 1) do
5:     if t = 1 then
6:       Generate $N_{d2}^t$ ($N_{d2}^t = N^{\min}$) random samples $x_1(m), x_2(m), \ldots, x_{N_{d2}^t}(m)$ from $f(\cdot; \hat{v}_0)$;
7:     else
8:       Draw $N_{d2}^t$ ($N^{\min} \le N_{d2}^t \le N^{\max}$) random samples $x_1(m), x_2(m), \ldots, x_{N_{d2}^t}(m)$ from $f(\cdot; \hat{v}_{t-1})$;
9:     end if
10:    Update $\hat{\gamma}_t$ according to Equation (19) and calculate $\Omega_t^*$;
11:    Calculate $q_t(k|j,m)$ by Equation (A15) in Appendix C;
12:    if $\sum_{k=1}^{l_m^2} q_t(k|j,m) = 1$ and $q_t(k|j,m) \in \{0, 1\}$ then
13:      Stop, obtain $Q_t^*(m)$ and $\Omega_t^*$, then $Q^*(m) \leftarrow Q_t^*(m)$ and $\Omega(x^*(m)) \leftarrow \Omega_t^*$;
14:    else
15:      Calculate $Q_t(m)$ and update $\hat{v}_t$ by $\hat{v}_t = Q_t(m)$;
16:      Set t = t + 1, take a random integer $N_{d2}^t$ in $[N^{\min}, N^{\max}]$, then go to step 4;
17:    end if
18:  end while
19: end for
20: Return $Q^*$ and $\Omega^*$.
4.3. Complexity Analysis of the MIAFACE Algorithm and the MCAFACE Algorithm
Let $N_m$ represent the number of tasks, $n_d$ denote the random sample size used to perform each
task, $n_f$ represent the iteration number of the AFACE algorithm to perform each task, $N_e$ denote
the elite sample, $N_t$ represent the number of targets, and L denote the number of all possible
UAV deployment schemes. The computational complexity of the AFACE algorithm is divided
into four parts: initialization $C_1$, sampling $C_2$, sorting $C_3$, and updating $C_4$. These parts
can be defined as
$$C_1 = N_t \times L \tag{27}$$
$$C_2 = n_f \times n_d \tag{28}$$
$$C_3 = n_f \times n_d \log n_d \tag{29}$$
$$C_4 = n_f \times (n_d - N_e). \tag{30}$$
$$C_f = C_1 + C_2 + C_3 + C_4 = N_t \times L + n_f \times (n_d + n_d \log n_d + n_d - N_e). \tag{31}$$
$$C_f = N_t \times L \times (1 + n_d + n_d \log n_d + n_d - N_e) = N_t \times L \times (n_d \log n_d + 2 n_d - N_e + 1), \tag{32}$$
where $n_d \log n_d$ is greater than the other terms in the bracket on the right side of the
equation. Thus, the time complexity of the AFACE algorithm can be computed as $O(N_t \times L \times n_d \log n_d)$.
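To give a feel for the relative size of the four parts, Equations (27)–(31) can be evaluated numerically. The sketch below is only illustrative: the base of the logarithm is not stated in the text, so base 2 is assumed here, and the function name `aface_cost` is hypothetical.

```python
import math


def aface_cost(n_t, L, n_f, n_d, n_e):
    """Return (C1, C2, C3, C4, Cf) following Equations (27)-(31).

    Assumption: log is taken to base 2 (the text does not fix the base).
    """
    c1 = n_t * L                         # initialization, Eq. (27)
    c2 = n_f * n_d                       # sampling, Eq. (28)
    c3 = n_f * n_d * math.log2(n_d)      # sorting, Eq. (29)
    c4 = n_f * (n_d - n_e)               # updating, Eq. (30)
    return c1, c2, c3, c4, c1 + c2 + c3 + c4
```

For example, with 10 targets, 19 schemes, 4 iterations, 1000 samples, and 100 elite samples, the sorting part is roughly ten times larger than the sampling part, which is in line with the sorting term dominating the bound.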
$$C_{mi} = N_m \times C_f. \tag{33}$$
UAV      Base   Uresource a (units)   Uresource b (units)   p_k   p_s
Type A   B1     1                     2                     0.9   0.7
Type B   B2     2                     2                     0.8   0.8
Type C   B3     3                     3                     0.7   0.9
Target Position Tresource K1 ([a,b]) Tresource K2 ([a,b]) Tresource K3 ([a,b]) y s
Target 1 (23,85) [2,3] [2,3] [2,3] 30 2
Target 2 (35,90) [3,3] [3,3] [3,3] 70 6
Target 3 (48,95) [2,3] [2,3] [2,3] 50 4
Target 4 (92,35) [3,2] [3,2] [3,2] 100 10
Target 5 (95,28) [2,2] [2,2] [2,2] 120 8
Target 6 (100,32) [3,2] [3,2] [3,2] 40 5
Target 7 (45,105) [3,3] [3,3] [3,3] 65 3
Target 8 (90,30) [2,2] [2,2] [2,2] 78 7
Target 9 (88,40) [2,2] [2,2] [2,2] 35 9
Target 10 (50,100) [3,2] [3,2] [3,2] 63 5
Target 11 (160,170) [2,3] [5,3] [4,3] 30 2
Target 12 (165,178) [3,3] [5,3] [5,3] 70 6
Target 13 (132,155) [3,3] [5,3] [6,3] 50 4
Target 14 (90,150) [3,3] [3,5] [4,4] 100 10
Target 15 (162,175) [2,2] [4,5] [4,4] 120 8
Target 16 (140,155) [2,3] [6,3] [5,3] 40 5
Target 17 (82,134) [3,3] [4,3] [4,3] 65 3
Target 18 (148,152) [2,3] [4,2] [5,2] 78 7
Target 19 (145,160) [3,2] [3,4] [4,3] 35 9
Target 20 (95,160) [2,2] [3,4] [4,4] 63 5
Referring to Theorem 1, we note that when z exceeds 3, these simulations become
complicated. Thus, z is set to 3, i.e., no more than 3 UAVs are needed to accomplish the three
tasks of targets in a specific order, and the total number of each type of UAV is
unrestricted. Each target in the following cases therefore has 19 possible schemes, i.e., A, B,
C, AA, AB, AC, BB, BC, CC, AAA, AAB, AAC, ABB, ACC, BBB, BBC, BCC, CCC, and
ABC, which correspond to the numbers 1 to 19, respectively. After that, we
can use a matching approach to quickly find the feasible schemes. The resources needed
to accomplish the three tasks of targets are randomly generated and satisfy the maximum
cooperative number of UAVs.
In the following simulations, the notations used in the tables and the figures are
displayed as
• Uresource represents the initial resources consumed by three types of UAVs;
• Tresource represents the resources consumed by three tasks; and
• Time is the CPU time in seconds for each case, averaged over 100 runs of each
algorithm.
The parameters of the CE method, MIAFACE algorithm, MCAFACE algorithm, PSO
algorithm, ACO algorithm, and GA are listed in Table 4, where the
settings of the speed and maximum flying distance of the UAV follow [35] and
have no effect on the simulation results. For more detailed theory and parameter settings
of CE, PSO, ACO, and GA, see [8–10,30,35,41]. For the targets in Table 3, there are two
scenarios in the multi-UAV cooperative task assignment problem.
(1) In scenario 1, we consider the first 10 targets or more similar targets. When
performing the three tasks of each target, we obtain an identical optimal scheme vector
for each task. Therefore, the situation in which each target has different tasks but
each task has the same optimal scheme is called the problem with continuous and
independent tasks.
(2) In scenario 2, the last 10 targets or more similar targets are considered. When
performing the three tasks of each target, we obtain a different optimal scheme vector for
each task. Thus, the situation in which each target has different tasks and each task
does not have the same optimal scheme is called the problem with continuous and
correlative tasks.
Parameter Value
The target identification Pc = 1
Weight coefficients w1 = 0.8, w2 = 0.18, w3 = 0.02
The UAV’s speed V = 40 m/s
The maximum flying distance Dmax = 1000 m
Time window of task K1 (s) [s1 , e1 ] = [3, 10]
Time window of task K2 (s) [s2 , e2 ] = [8, 20]
Time window of task K3 (s) [s3 , e3 ] = [18, 26]
Consumption time of task K1 T1 = 5 s
Consumption time of task K2 T2 = 10 s
Consumption time of task K3 T3 = 5 s
The number of targets Nt ∈ [5, 20]
The fixed random samples N = 1000
The quantile in CE θ = 0.1
Inertial weight in PSO w = 0.75
Learning factors in PSO η1 = η2 = 0.5
The number of ants in ACO Na = 200
Pheromone evaporation coefficient in ACO ε = 0.9
Transfer probability in ACO Pa = 0.2
Mating probability in GA P1 = 0.8
Mutation probability in GA P2 = 0.01
5.1. Scenario 1
In case 1, we used the first 10 targets in Table 3 to perform continuous and independent
tasks of problem P 1, and the results are shown in Table 5.
According to Table 5, we note that the optimal scheme vector and the total result
of CE and MIAFACE are identical, while the result of MCAFACE is inferior to those of the other
two algorithms. Moreover, we can make some observations. (i) For CE, the number of
iterations is 4 and the optimal scheme vector is [3,3,3,2,2,2,3,2,2,3];
the results of the three tasks are −79.50, 274.90, and −79.50, and their sum
is 115.9. The situation of MIAFACE is similar to that of CE, except that the number
of iterations is 3. (ii) For MCAFACE, the numbers of iterations are 3, 2, and 1, and the optimal scheme
vectors are [3,3,3,2,2,2,3,2,2,3], [9,9,9,7,7,7,9,7,7,9], and [18,18,18,15,15,15,18,15,15,18],
respectively; the results of the three tasks are −79.50, 179.0, and −82.14, and their sum
is 17.36. (iii) The total times of CE, MIAFACE, and MCAFACE are
3.36, 3.29, and 2.17, respectively.
In case 2, we tested the MIAFACE algorithm and MCAFACE algorithm under different values of h and
c1 , and their times change with Nt as shown in Figures 3a–c and 4a–c, respectively.
From Figures 3 and 4, the curves of MIAFACE and MCAFACE both show an increasing
trend as Nt grows, and their times increase with c1 and h. Meanwhile,
the time differences between the curves gradually increase with Nt in each
figure. In Figure 3a, the curve with h = 2 is the lowest of the four curves, while the
curve with h = 5 is the highest. The remaining two curves lie in between,
with the curve for h = 4 above that for h = 3. Moreover,
the times of the four curves all lie approximately in [1,12]. In Figure 3b,c, the
situations are similar to Figure 3a, and the time ranges are [1,14] and [1,15],
respectively. In Figure 4a, the order of the four curves is similar to that in Figure 3a, and
their times all lie roughly in [0.3,10]. In Figure 4b,c, the situations are analogous
to Figure 4a, and the time ranges are [0.3,10] and [0.3,12], respectively.
In case 3, the CE method, PSO algorithm, ACO algorithm, and GA are
each run three times in succession, once for each of the three tasks. We compared them with the MIAFACE
algorithm on the basis of obtaining the same optimal score under h = 2 and c1 , and their times change
with Nt as shown in Figure 5a–c. Since the MCAFACE algorithm obtains suboptimal results in scenario
1, it is not compared with the other algorithms.
From Figure 5, we note that the curves of CE and MIAFACE grow linearly, while
the curves of PSO, ACO, and GA increase exponentially. In addition, their times increase
gradually with c1 and Nt . In Figure 5a, when Nt is in [5,20], the time
of MIAFACE is less than that of CE, and the time difference between the two algorithms
grows as Nt increases. Meanwhile, when Nt is below 8, the times of PSO, ACO, and GA are
lower than those of CE and MIAFACE, but when Nt exceeds 8, the situation is reversed.
In addition, the times of CE and MIAFACE both lie approximately in [1,10], while
the times of the other algorithms exceed 20 when Nt is larger than 10. In Figure 5b,c,
the situations are similar to Figure 5a, except that the time difference between CE and
MIAFACE in Figure 5b is lower than that in Figure 5a, and the time difference in Figure 5c
first decreases gradually until the curves intersect at Nt = 10, and then increases slowly with
Nt .
5.2. Scenario 2
In case 4, we utilized the last 10 targets in Table 3 to perform continuous and correlative
tasks of problem P 2, and the results are shown in Table 6.
According to Table 6, we note that the optimal scheme vectors and the total results of
CE, MIAFACE, and MCAFACE are the same. The reason is that, for the
three tasks of the same 10 targets, the three algorithms eventually obtain identical
optimal scheme vectors, which leads to the same score of the total objective
function; however, the time consumed by the different algorithms varies. Moreover,
some observations are available. First, for CE, the number of iterations is 5 and the sum of
the task results is 218.72; the optimal solution vectors are [3,3,3,3,3,3,3,3,3,3],
[9,9,9,7,7,7,9,7,7,9], and [18,18,18,15,15,15,18,15,15,18], and the results of the three tasks are −298.65,
819.85, and −302.48, respectively. Secondly, the situations of MIAFACE and MCAFACE
are similar to that of CE, except that the number of iterations in MIAFACE is 4
and the numbers of iterations in MCAFACE are 4, 4, and 3. Finally, the total times of
CE, MIAFACE, and MCAFACE are 7.33, 7.11, and 6.9, respectively.
In case 5, we tested the MIAFACE algorithm and MCAFACE algorithm under h and c2, and their
times change with Nt as shown in Figures 6a–c and 7a–c, respectively.
From Figures 6 and 7, the variations of the curves, the times, and the time differences are
all similar to those in Figures 3 and 4, except that in Figures 6a and 7a the time grows
rapidly when Nt is over 10. The reason is that the results in these two figures are suboptimal
compared with the others. In Figure 6a, the order of the curves is the same as that in each
panel of Figures 3 and 4. In addition, the time ranges of these four curves are all
approximately in [1,50]. The situations in Figure 6b,c are similar to that of Figure 6a, with
time ranges in [2,30] and [2,32], respectively, except that their results are optimal.
In Figure 7, each panel is roughly similar to the corresponding panel in Figure 6, apart from
the fact that the time range is lower than that in Figure 6.
From Figure 8, we note that for CE, MIAFACE, MCAFACE, PSO, ACO, and GA, the variations of
the curves and the times are similar to the case in Figure 5. In Figure 8a, when Nt is below
11, the times of MIAFACE and MCAFACE are relatively close and less than that of CE; however,
when Nt is over 11, the times of MIAFACE and MCAFACE grow quickly and exceed that of CE
because suboptimal results are obtained. Moreover, the time ranges of CE, MIAFACE, and
MCAFACE are all approximately in [1,30]. Meanwhile, the times of PSO, ACO, and GA are much
higher than those of CE, MIAFACE, and MCAFACE, and they exceed 30 when Nt is more than 6.
In Figure 8b,c, the situations of PSO, ACO, and GA are similar to Figure 8a. In Figure 8b,
the time differences between CE, MIAFACE, and MCAFACE grow as Nt increases. In addition,
the time of MCAFACE is lower than that of CE and MIAFACE; the curves of CE and MIAFACE
intersect at Nt = 8, with the time of CE lower than that of MIAFACE when Nt is below 8 and
the situation reversed after Nt exceeds 8. Moreover, the time ranges of CE, MIAFACE, and
MCAFACE are all in [2,22]. In Figure 8c, the time differences between CE, MIAFACE, and
MCAFACE first decrease and then increase as Nt grows. Furthermore, the curves of CE, MIAFACE,
and MCAFACE intersect at Nt = 9; the time of CE is lower than that of MIAFACE and MCAFACE
when Nt is below 9, and the situation is reversed after Nt exceeds 9.
5.3. Analysis
Analysing the results of case 1 and case 4, we note that the optimal scheme vectors found by
the MIAFACE and MCAFACE algorithms in problems P 1 and P 2, respectively, are obtained by
initializing and updating the probability matrices P (m) and Q(m), which conforms to
Algorithms 3 and 4 described in Section 4.2. In addition, the result of MCAFACE in case 1 is
suboptimal compared with that of the other algorithms because the corresponding optimal
solution is deleted after the end of each task.
Considering case 2 and case 5 together, we note that the times of CE, MIAFACE, and MCAFACE
increase as Nt, h, and c increase, and that the time complexity of MCAFACE is lower than that
of MIAFACE; these observations agree with the complexity analysis of MIAFACE and MCAFACE in
Section 4.3. In addition, the time of case 5 is superior to that of case 2 because there are
more available solutions for each target in case 5 than in case 2 after each iteration.
Meanwhile, in case 5, MIAFACE and MCAFACE easily fall into a local optimum when c is below a
certain vector, e.g., c = [0.01, 0.02, 0.03]. The reason is that when all elements in c are
small and more solutions exist after each iteration, the optimal scheme may not be selected
during one of the iterations of MIAFACE and MCAFACE, leading to a suboptimal result.
Comparing case 3 and case 6, we note that the times of PSO, ACO, and GA depend only on the
growth of Nt. Meanwhile, CE, MIAFACE, and MCAFACE are superior to PSO, ACO, and GA for
large-scale allocation problems, e.g., more than 8 targets of case 3 and 5 targets of case 6.
Moreover, CE is inferior to MIAFACE in scenario 1 (e.g., Figure 5) when Nt is over 10, and
MCAFACE is superior to MIAFACE and CE in scenario 2 (e.g., Figure 8c) when Nt is over 9.
6. Conclusions
In this paper, the multi-UAV cooperative task assignment problem was described
and formulated, and three types of UAVs were considered, cooperatively accomplishing
the classification, attack, and verification tasks of targets under resource, precedence, and
timing constraints. After that, considering complex coupling among these three tasks,
we decomposed the considered problem into two subproblems. In order to solve them,
we proposed an AFACE algorithm, a MIAFACE algorithm, and a MCAFACE algorithm.
Finally, simulation results verified that both MIAFACE and MCAFACE consume less time
than other intelligent algorithms for solving the corresponding problem.
Nevertheless, challenges remain when applying the MIAFACE and MCAFACE algorithms to
optimization problems, e.g., choosing appropriate parameter settings and falling into a local
optimum when the elements in c are small. In future work, it will be meaningful to
concentrate on improving these two algorithms on problems where they are vulnerable to local
optima when the number of samples is limited, and on task assignment problems in complex
dynamic scenarios.
Author Contributions: Conceptualization, K.W., X.Z. and X.L.; methodology, X.Z.; software, X.Z.;
validation, K.W., X.Z., X.L. and W.C.; formal analysis, K.W. and X.Z.; investigation, X.Z.; resources,
X.Q. and Y.C.; data curation, X.Q., Y.C. and K.L.; writing—original draft preparation, X.Z.; writing—
review and editing, X.Z.; visualization, X.Z.; supervision, K.W.; project administration, Y.C. and K.L.
All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded in part by the National Natural Science Foundation of China
under Grant 62172313 and 52031009, in part by the Natural Science Foundation of Hunan Province
under Grant 2021JJ20054.
Data Availability Statement: Data sharing is not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
Appendix A
If one of each type of UAVs is selected, i.e., z = 1, the possible schemes are written as
When z = 2, we can choose no more than three types of UAVs, and then
Appendix B
Inserting P (m) and Equation (1) into f ( x(m); u), we define the problem P 1 as
$$
f(x(m); P(m)) = \prod_{j=1}^{N_t} p(x_j(m) \mid j, m)
= \prod_{j=1}^{N_t} \prod_{k=1}^{l_m^1} p(k \mid j, m)^{g(x_j(m); k)}, \tag{A5}
$$
where $p(k \mid j, m)$ represents the coefficient in column $k$ and row $j$ of $P(m)$, and
$g(x_j(m); k)$ is 1 if $\varphi(x_j(m))$ equals $k$ and 0 otherwise, according to Equation (1).
After that, at the tth iteration, we assume that the samples x1 (m), x2 (m), . . . , x N t (m) are
d1
drawn from f (x(m); v 9t−1 (m)). In addition, we calculate the performances Ωt,i , and order them
from smallest to largest: Ωt,1 ≤ Ωt,2 ≤ · · · ≤ Ωt,Nt , and then define γ9t (m) = Ω(Nt −β1m ) .
d1 d1
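is equal to
$$
\max_{P(m)} \sum_{n=1}^{\beta_{1m}} \ln f(x(m); P(m)). \tag{A7}
$$
$$
\begin{aligned}
\max_{P(m)} \sum_{n=1}^{\beta_{1m}} \ln f(x(m); P(m))
&= \max_{p(k \mid j, m)} \sum_{n=1}^{\beta_{1m}} \ln \left( \prod_{j=1}^{N_t} \prod_{k=1}^{l_m^1} p(k \mid j, m)^{g(x_j^n(m); k)} \right) \\
&= \max_{p(k \mid j, m)} \sum_{n=1}^{\beta_{1m}} \sum_{j=1}^{N_t} \sum_{k=1}^{l_m^1} g(x_j^n(m); k) \ln(p(k \mid j, m)).
\end{aligned} \tag{A8}
$$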
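Then, we assume that $r_{kj}(m) = p(k \mid j, m)$ and $a^n_{kj}(m) = g(x_j^n(m); k)$, and
Equation (A8) is modeled as
$$
\begin{aligned}
\mathcal{P}11:\ & \min_{r_{kj}(m)} \left( -\sum_{n=1}^{\beta_{1m}} \sum_{j=1}^{N_t} \sum_{k=1}^{l_m^1} a^n_{kj}(m) \ln r_{kj}(m) \right) \\
\text{s.t.}\ & \sum_{k=1}^{l_m^1} r_{kj}(m) = 1 \quad \forall j, m \\
& r_{kj}(m) \ge 0 \quad \forall j, k, m \\
& l_m^1 = L \quad \forall m.
\end{aligned} \tag{A9}
$$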
Considering P 11 as a convex problem and denoting the convex function by $f(r_{kj}(m))$,
we can obtain the Lagrangian function
$$
O(r_{kj}(m), \lambda_j(m), \mu_{kj}(m)) = f(r_{kj}(m))
+ \sum_{j=1}^{N_t} \lambda_j(m) \left( \sum_{k=1}^{L} r_{kj}(m) - 1 \right)
+ \sum_{j=1}^{N_t} \sum_{k=1}^{L} \mu_{kj}(m) \left( -r_{kj}(m) \right), \tag{A10}
$$
where $\lambda_j(m)$ and $\mu_{kj}(m)$ are the relevant restraint coefficients.
Generally, for a convex optimization problem, the Karush–Kuhn–Tucker (KKT) conditions are
necessary and sufficient [42]. Thus, considering the KKT conditions of the problem
in Equation (A10), we have
$$
\begin{cases}
-\dfrac{a^n_{kj}(m)}{r_{kj}(m)} + \lambda_j(m) - \mu_{kj}(m) = 0 \\
\lambda_j(m) \left( \sum_{k=1}^{L} r_{kj}(m) - 1 \right) = 0 \\
\mu_{kj}(m)\, r_{kj}(m) = 0 \\
\lambda_j(m) > 0 \\
\mu_{kj}(m) \ge 0 \\
r_{kj}(m) \ge 0
\end{cases} \tag{A11}
$$
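As a reading aid, the step from the KKT system to a closed-form update can be sketched as follows. This is a reconstruction under stated assumptions rather than the authors' own derivation: the stationarity condition is written with the sum over the elite samples $n = 1, \ldots, \beta_{1m}$ made explicit, and $r_{kj}(m) > 0$ is assumed so that complementary slackness gives $\mu_{kj}(m) = 0$.

$$
\begin{aligned}
-\frac{\sum_{n=1}^{\beta_{1m}} a^n_{kj}(m)}{r_{kj}(m)} + \lambda_j(m) = 0
\;&\Longrightarrow\; r_{kj}(m) = \frac{1}{\lambda_j(m)} \sum_{n=1}^{\beta_{1m}} a^n_{kj}(m), \\
\sum_{k=1}^{L} r_{kj}(m) = 1
\;&\Longrightarrow\; \lambda_j(m) = \sum_{k=1}^{L} \sum_{n=1}^{\beta_{1m}} a^n_{kj}(m) = \beta_{1m}, \\
&\Longrightarrow\; r_{kj}(m) = \frac{1}{\beta_{1m}} \sum_{n=1}^{\beta_{1m}} a^n_{kj}(m).
\end{aligned}
$$

The second line uses the fact that each sample assigns exactly one scheme to target $j$, so $\sum_{k} a^n_{kj}(m) = 1$ for every $n$. The last line has the same form as the update in Equation (A14).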
Comparing $\lambda_j(m)$ and $r_{kj}(m)$, we acquire the relationship between $r_{kj}(m)$ and
$a^n_{kj}(m)$, i.e.,
$$
r_{kj}(m) = \frac{a^n_{kj}(m)}{\sum_{k=1}^{L} a^n_{kj}(m)}. \tag{A13}
$$
$$
p(k \mid j, m) = \frac{\sum_{n=1}^{\beta_{1m}} g(x_j^n(m); k)}{\beta_{1m}}
= \frac{\sum_{n=1}^{c_{1m} N} g(x_j^n(m); k)}{c_{1m} N}. \tag{A14}
$$
Appendix C
Here we consider the updating formulas when Nm tasks are performed continuously and
correlatively. For the mth task, if m = 1, its optimal solution is taken from the L schemes,
and if 1 < m ≤ Nm, its optimal scheme is taken only from the remaining L − m + 1 schemes,
since the m − 1 schemes selected before performing the mth task have been deleted.
Thus, referring to the proof process of Theorem 2, the updating formula of Q(m) in
problem P 2 is
$$
q(k \mid j, m) = \frac{\sum_{n=1}^{\beta_{2m}} g(x_j^n(m); k)}{\beta_{2m}}
= \frac{\sum_{n=1}^{c_{2m} N} g(x_j^n(m); k)}{c_{2m} N} \tag{A15}
$$
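As an illustration, the counting rule in Equations (A14) and (A15) can be sketched in Python. This is a minimal sketch, not the authors' implementation: the function name `update_probabilities`, the array layout, and the convention that a larger score is better are assumptions made for the example. Each row of `samples` is one sampled scheme vector, and the elite set consists of the β = cN samples with the best scores.

```python
import numpy as np

def update_probabilities(samples, scores, elite_frac, n_schemes):
    """Estimate p(k | j) as the share of elite samples that pick scheme k for target j.

    samples    : (N, N_t) integer array; samples[n, j] is the scheme index of target j
    scores     : length-N sequence of performances (assumed: larger is better)
    elite_frac : the proportion c, so that beta = c * N samples are kept
    n_schemes  : number of candidate schemes L
    """
    samples = np.asarray(samples)
    n_samples, n_targets = samples.shape
    beta = max(1, int(round(elite_frac * n_samples)))
    # Indices of the beta best samples (argsort is ascending, so take the tail).
    elite = samples[np.argsort(scores)[-beta:]]
    probs = np.zeros((n_targets, n_schemes))
    for k in range(n_schemes):
        # g(x_j^n; k) is 1 when sample n assigns scheme k to target j.
        probs[:, k] = (elite == k).sum(axis=0) / beta
    return probs

# Four hypothetical samples over two targets and three schemes.
samples = [[0, 1], [0, 2], [1, 1], [2, 0]]
scores = [4.0, 3.0, 2.0, 1.0]
probs = update_probabilities(samples, scores, elite_frac=0.5, n_schemes=3)
print(probs)
```

Each row of the result sums to one because every elite sample assigns exactly one scheme to each target. For Q(m) in problem P 2, the same count would be taken over samples drawn only from the schemes that have not yet been deleted; the sketch does not enforce that restriction.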
References
1. Singh, H.; Sharma, M. Electronic Warfare System Using Anti-Radar UAV. In Proceedings of the 2021 8th International Conference
on Signal Processing and Integrated Networks, Noida, India, 26–27 August 2021; pp. 102–107.
2. Deng, Z.; Gao, Y.; Hu, A.; Zhang, Y. A Mobile Phone Uplink CPDP-DTDOA Positioning Method Using UAVs for Search and
Rescue. IEEE Sens. J. 2022, 22, 18170–18179. [CrossRef]
3. Fan, B.; Jiang, L.; Chen, Y.; Zhang, Y.; Wu, Y. UAV Assisted Traffic Offloading in Air Ground Integrated Networks With Mixed
User Traffic. IEEE T. Intell. Transp. 2022, 23, 12601–12611. [CrossRef]
4. D’Arcy, S.; Gonzalez, F. Design and Flight Testing of a Rocket-Launched Folding UAV for Earth and Planetary Exploration
Applications. In Proceedings of the 2022 IEEE Aerospace Conference, Big Sky, MT, USA, 5–12 March 2022; pp. 1–15.
5. Chen, X.; Liu, Y.; Yin, L.; Qi, Y. Cooperative Task Assignment and Track Planning For Multi-UAV Attack Mobile Targets. J. Intell.
Robot. Syst. 2020, 100, 1383–1400.
6. Sabo, C.; Kingston, D.; Cohen, K. A Formulation and Heuristic Approach to Task Allocation and Routing of UAVs under Limited
Communication. Unmanned Syst. 2014, 2, 1–17. [CrossRef]
7. Wang, J.; Zhang, Y.F.; Geng, L.; Fuh, J.Y.H.; Teo, S.H. A Heuristic Mission Planning Algorithm for Heterogeneous Tasks with
Heterogeneous UAVs. Unmanned Syst. 2015, 3, 205–219. [CrossRef]
8. Gou, Q.; Li, Q. Task assignment based on PSO algorithm based on Logistic function inertia weight adaptive adjustment. In
Proceedings of the 2020 3rd International Conference on Unmanned Systems, Harbin, China, 1–4 September 2020; pp. 825–829.
9. Li, Y.; Zhang, S.; Chen, J.; Jiang, T.; Ye, F. Multi-UAV Cooperative Mission Assignment Algorithm Based on ACO method. In
Proceedings of the 2020 International Conference on Computing, Networking and Communications, Big Island, HI, USA, 17–20
February 2020; pp. 304–308.
10. Ma, Y.; Zhang, H.; Zhang, Y.; Gao, R.; Xu, Z.; Yang, J. Coordinated Optimization Algorithm Combining GA with Cluster for
Multi-UAVs to Multi-tasks Task Assignment and Path Planning. In Proceedings of the 2019 IEEE 15th International Conference
on Control and Automation, Edinburgh, UK, 22–26 August 2019; pp. 1026–1031.
11. Duan, X.; Liu, H.; Tang, H.; Cai, Q.; Zhang, F.; Han, X. A Novel Hybrid Auction Algorithm for Multi-UAVs Dynamic Task
Assignment. IEEE Access 2020, 8, 86207–86222. [CrossRef]
12. Chen, J.; Wu, Q.; Xu, Y.; Qi, N.; Guan, X.; Zhang, Y.; Xue, Z. Joint Task Assignment and Spectrum Allocation in Heterogeneous
UAV Communication Networks: A Coalition Formation Game-Theoretic Approach. IEEE Trans. Wirel. Commun. 2021, 20,
440–452. [CrossRef]
13. Qie, H.; Shi, D.; Shen, T.; Xu, X.; Li, Y.; Wang, L. Joint Optimization of Multi-UAV Target Assignment and Path Planning Based on
Multi-Agent Reinforcement Learning. IEEE Access 2019, 7, 146264–146272. [CrossRef]
14. Tang, J.; Chen, X.; Zhu, X.; Zhu, F. Dynamic Reallocation Model of Multiple Unmanned Aerial Vehicle Tasks in Emergent
Adjustment Scenarios. IEEE Trans. Aerosp. Electron. Syst. 2022, 1–43. [CrossRef]
15. Qie, H.; Shi, D.; Shen, T.; Xu, X.; Li, Y.; Wang, L. Distributed Cooperative Search Algorithm With Task Assignment and Receding
Horizon Predictive Control for Multiple Unmanned Aerial Vehicles. IEEE Access 2021, 9, 6122–6136.
16. Fu, X.; Feng, P.; Gao, X. Swarm UAVs Task and Resource Dynamic Assignment Algorithm Based on Task Sequence Mechanism.
IEEE Access 2019, 7, 41090–41100. [CrossRef]
17. Chen, Y.; Yang, D.; Yu, J. Multi-UAV Task Assignment With Parameter and Time-Sensitive Uncertainties Using Modified Two-Part
Wolf Pack Search Algorithm. IEEE Trans. Aerosp. Electron. Syst. 2018, 54, 2853–2872. [CrossRef]
18. Zhu, F.; Wu, F.; Chen, C.F.; Li, D.; Guo, Y.; Zhang, J.G.; Zhao, X. A coordinated assignment method for multi-UAV area search
tasks. In Proceedings of the CSAA/IET International Conference on Aircraft Utility Systems, Nanchang, China, 17–20 August
2022; pp. 751–756.
19. Chen, Y.; Chen, J.; Du, C. Allocation of Multi-UAVs Timing-dependent Tasks based on Completion Time. In Proceedings of the
2022 WRC Symposium on Advanced Robotics and Automation, Beijing, China, 20 August 2022; pp. 71–76.
20. Yan, S.; Xu, J.; Song, L.; Pan, F. Heterogeneous UAV collaborative task assignment based on extended CBBA algorithm. In
Proceedings of the 2022 7th International Conference on Computer and Communication Systems, Wuhan, China, 22–25 April
2022; pp. 825–829.
21. Yan, S.; Pan, F.; Zhang, D.; Xu, J. Research on Task Reassignment Method of Heterogeneous UAV in Dynamic Environment.
In Proceedings of the 2022 6th International Conference on Robotics and Automation Sciences, Wuhan, China, 9–11 June 2022;
pp. 57–61.
22. Liu, C.; Guo, Y.; Li, N.; Song, X. AoI-Minimal Task Assignment and Trajectory Optimization in Multi-UAV-Assisted IoT Networks.
IEEE Internet Things J. 2022, 9, 21777–21791. [CrossRef]
23. Zhu, C.; Zhang, G.; Yang, K. Fairness-Aware Task Loss Rate Minimization for Multi-UAV Enabled Mobile Edge Computing. IEEE
Wirel. Commun. Lett. 2023, 12, 94–98. [CrossRef]
24. Seid, A.M.; Lu, J.; Abishu, H.N.; Ayall, T.A. Blockchain-Enabled Task Offloading with Energy Harvesting in Multi-UAV-assisted
IoT Networks: A Multi-agent DRL Approach. IEEE J. Sel. Areas Commun. 2022, 40, 3517–3532. [CrossRef]
25. Hu, N.; Qin, X.; Ma, N.; Liu, Y.; Yao, Y.; Zhang, P. Energy-efficient Caching and Task offloading for Timely Status Updates in
UAV-assisted VANETs. In Proceedings of the 2022 IEEE/CIC International Conference on Communications in China, Sanshui,
Foshan, China, 11–13 August 2022; pp. 1032–1037.
26. Gao, H.; Feng, J.; Xiao, Y.; Zhang, B.; Wang, W. A UAV-assisted Multi-task Allocation Method for Mobile Crowd Sensing. IEEE
Trans. Mob. Comput. 2022. [CrossRef]
27. Rubinstein, R.Y. Optimization of computer simulation models with rare events. Eur. J. Oper. Res. 1997, 99, 89–112. [CrossRef]
28. Rubinstein, R.Y. The cross-entropy method for combinatorial and continuous optimization. Methodol. Comput. Appl. Probab. 1999,
1, 127–190. [CrossRef]
29. Rubinstein, R.Y. Combinatorial optimization, cross-entropy, ants and rare events. In Stochastic Optimization: Algorithms and
Applications; Springer: Boston, MA, USA, 2001; pp. 303–363.
30. De Boer, P.-T.; Kroese, D.P.; Mannor, S.; Rubinstein, R.Y. A tutorial on the cross-entropy method. Ann. Oper. Res. 2005, 134, 19–67.
[CrossRef]
31. Chepuri, K.; Homem-de-Mello, T. Solving the vehicle routing problem with stochastic demands using the cross-entropy method.
Ann. Oper. Res. 2005, 134, 153–181. [CrossRef]
32. Rubinstein, R.Y.; Kroese, D.P. The cross-entropy method: A unified approach to combinatorial optimization, Monte-Carlo
simulation and machine learning. Technometrics 2006, 48, 147–148.
33. Undurti, A.; How, J. A Cross-Entropy Based Approach for UAV Task Allocation with Nonlinear Reward. In Proceedings of the
AIAA Guidance, Navigation, and Control Conference, Toronto, ON, Canada, 2–5 August 2010; pp. 1–16.
34. Le Thi, H.A.; Nguyen, D.M.; Dinh, T.P. Globally solving a nonlinear UAV task assignment problem by stochastic and deterministic
optimization approaches. Optim. Lett. 2012, 6, 315–329. [CrossRef]
35. Huang, L.; Qu, H.; Zuo, L. Multi-Type UAVs Cooperative Task Allocation Under Resource Constraints. IEEE Access 2018, 6,
17841–17850. [CrossRef]
36. Cofta, P.; Ledziński, D.; Śmigiel, S.; Gackowska, M. Cross-Entropy as a Metric for the Robustness of Drone Swarms. Entropy 2020,
22, 597. [CrossRef] [PubMed]
37. Zhang, X.; Wang, K.; Dai, W. Multi-UAVs Task Assignment Based on Fully Adaptive Cross-Entropy Algorithm. In Proceedings of
the 2021 11th International Conference on Information Science and Technology, Chengdu, China, 7–10 May 2021; pp. 286–291.
38. Wei, Y.; Wang, B.; Liu, W.; Zhang, L. Hierarchical Task Assignment of Multiple UAVs with Improved Firefly Algorithm Based on
Simulated Annealing Mechanism. In Proceedings of the 2021 40th Chinese Control Conference, Shanghai, China, 26–28 July 2021;
pp. 1943–1948.
39. Wang, Q.; Liu, L.; Tian, W. Cooperative Task Assignment of Multi-UAV in Road-network Reconnaissance Using Customized
Genetic Algorithm. In Proceedings of the 2021 IEEE 4th Advanced Information Management, Communicates, Electronic and
Automation Control Conference, Chongqing, China, 18–20 June 2021; pp. 803–809.
40. Costa, A.; Jones, O.D.; Kroese, D. Convergence properties of the cross-entropy method for discrete optimization. Oper. Res. Lett.
2007, 35, 573–580. [CrossRef]
41. Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95—International Conference on Neural
Networks, Perth, WA, Australia, 27 November–1 December 1995; pp. 1942–1948.
42. Luo, Z.-Q.; Yu, W. An introduction to convex optimization for communications and signal processing. IEEE J. Sel. Areas Commun.
2006, 24, 1426–1438.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
Article
A Distributed Collaborative Allocation Method of
Reconnaissance and Strike Tasks for Heterogeneous UAVs
Hanqiang Deng, Jian Huang *, Quan Liu, Tuo Zhao, Cong Zhou and Jialong Gao
Abstract: Unmanned aerial vehicles (UAVs) are becoming more and more widely used in battlefield
reconnaissance and target strikes because of their high cost-effectiveness, but task planning for
large-scale UAV swarms is a problem that needs to be solved. To solve the high-risk problem caused
by incomplete information for the combat area and the potential coordination between targets when
a heterogeneous UAV swarm performs reconnaissance and strike missions, this paper proposes
a distributed task-allocation algorithm. The method prioritizes tasks by evaluating the swarm’s
capability superiority to tasks to reduce the search space, uses the time coordination mechanism and
deterrent maneuver strategy to reduce the risk of reconnaissance missions, and uses the distributed
negotiation mechanism to allocate reconnaissance tasks and coordinated strike tasks. The simulation
results under the distributed framework verify the effectiveness of the distributed negotiation mecha-
nism, and the comparative experiments under different strategies show that the time coordination
mechanism and the deterrent maneuver strategy can effectively reduce the mission risk when the
target is unknown. The comparison with the centralized global optimization algorithm verifies the
efficiency and effectiveness of the proposed method when applied to large-scale UAV swarms. Since
the distributed negotiation task-allocation architecture avoids dependence on the highly reliable
network and the central node, it can further improve the reliability and scalability of the swarm, and
make it applicable to more complex combat environments.
Keywords: heterogeneous UAV swarm; reconnaissance and strike; distributed negotiate; time
coordination; deterrent maneuver
Citation: Deng, H.; Huang, J.; Liu, Q.; Zhao, T.; Zhou, C.; Gao, J. A Distributed Collaborative
Allocation Method of Reconnaissance and Strike Tasks for Heterogeneous UAVs. Drones 2023, 7, 138.
https://doi.org/10.3390/drones7020138
Academic Editor: Oleg Yakimenko
Received: 13 January 2023
Revised: 10 February 2023
Accepted: 14 February 2023
Published: 15 February 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open
access article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
The popularity of UAVs in civil aerial photography, agriculture, surveillance, and
mapping [1] has made people see its application prospects in more fields. As a low-cost,
low-risk, and cost-effective weapon or carrier, UAVs have frequently appeared on the
battlefield. It has become the focus of researchers to endow decentralized, heterogeneous,
and low-cost UAV swarms with autonomous coordination capabilities to complete more
complex tasks, because this is an important way to improve the flexibility and reliability of
small UAV swarms to perform combat tasks [2,3].
Since the battlefield is a highly confrontational environment, distributed collaboration
architecture is an important way to achieve large-scale swarm collaboration [4]. The
centralized architecture that has been widely researched and applied has the advantage of a
simpler algorithm design, but it also has the problem of high requirements on the network
and central computing nodes. In contrast, each UAV node in a distributed architecture
communicates and cooperates with other UAVs as an independent entity. Since there are
no critical nodes in the network, the architecture is highly scalable and reliable.
According to the analysis of relevant researchers, the commonly used methods of
distributed task collaborative assignment can be divided into heuristic optimization
algorithms [5,6], market-based methods [7–10] and alliance-based methods [2,11] and so on [12].
Heuristic optimization algorithms are widely used because they do not require gradient
information and do not rely on problem models with good mathematical properties.
For example, an improved pigeon-inspired optimization algorithm is proposed in [13] to solve
the optimization problem of cooperative target searches, although it adopts a centralized
control architecture. For multi-UAV cooperative execution of reconnaissance missions,
ref. [5] proposed an intelligent self-organized algorithm (ISOA) mission-planning method.
UAVs exchange status and planning information with each other, locally optimize their route
planning using an improved distributed ant colony algorithm, and repeat the process until the
task is completed. However, the article assumes that all UAVs are homogeneous and that the
targets are find-and-destroy elements. Another paper [14] implements a distributed task
assignment method for UAV swarm reconnaissance missions based on the wolf pack algorithm,
including a cooperative search algorithm based on wolf reconnaissance behavior and a
cooperative attack task assignment method used after the target is discovered. The algorithm
has good scalability, but it does not consider the risk of searching the unknown environment
when optimizing the scheme.
In the market-based method, the bidders estimate the benefits of completing different tasks
and broadcast their bids to each other, and the best bid wins; the bidders re-evaluate after
the environment or allocation plan is updated until there is no conflict. Aiming at the task
assignment problem of heterogeneous cooperative UAVs, a paper [7] proposed a task assignment
algorithm based on improved CBGA (improved consensus-based grouping algorithm, derived from
CBBA [15]). The algorithm has a simple structure, but it gives little consideration to
factors such as the cooperative relationship between UAVs. To deal with real-time task
allocation in resource-constrained wireless-sensor networks, the authors of [16] proposed a
reverse auction-based scheme using an adaptive algorithm for each node (bidder) to locally
calculate its best bid response with a non-smooth and concave payoff function.
Alliance formation divides a large-scale UAV swarm into several small UAV alliances through
strategies such as cooperative games [11]. This architecture first distributes tasks among
the alliances and then redistributes the received tasks within each alliance, which
effectively reduces the dimension of the problem. The authors of [2] use a layered extended
contract network protocol to realize the collaborative control of UAV swarms, which has an
advantage in solving speed when the swarm scale is large. However, this work ignores the
influence of how the UAV subsets are divided on swarm behavior. For example, two UAVs that
should have cooperated may be divided into different alliances, resulting in a decrease in
the quality of the solution.
Researchers have also tried to combine the advantages of different architectures.
When the problem has complex constraints, it is difficult to converge to a good result
by directly applying CBBA and other methods, and repeated negotiations cause high
communication costs. Therefore, some studies combine heuristic algorithms with
market-based methods. For example, the paper [17] considers task-time constraints and
obstacle constraints, uses intelligent optimization algorithms locally to optimize the scheme,
and then negotiates with other UAVs. Similarly, ref. [18] regards the minimum distance
sum and the minimum maximum completion time as the optimization goals, first uses
the genetic algorithm (GA) to optimize locally, and then uses a CBAA-derived algorithm
to reach a consensus among nodes.
The factors considered in research on UAV swarm mission collaboration mainly include UAV
maneuvering distance [8,19], area coverage [5,15], route planning [5,20], avoidance of no-fly
zones [2], etc., while the threat of cooperation between enemy platforms is rarely considered.
Aiming at the scenario where there may be a potential cooperative relationship be-
tween enemy targets, this paper proposes a distributed collaborative optimization method
for heterogeneous UAVs based on a negotiation mechanism and GA.
The main contributions of this paper include the following aspects:
• The priority of tasks is evaluated by the swarm’s capability superiority over the tasks
to reduce the search space. The capability superiority is represented by the spatial
density and the capability availability of the tasks, and an attention mechanism is
combined to suppress distant tasks when evaluating the task priority;
• The time coordination mechanism and deterrent maneuver strategy are used to reduce
the risk of reconnaissance missions. Because the task information is incomplete,
multiple UAVs are used to reconnoiter dense tasks synchronously, and the
UAVs with strike capabilities are deployed with a deterrent maneuver strategy to reduce
the risk of reconnaissance missions;
• A distributed task-assignment negotiation mechanism is designed so that UAVs
can run in a completely distributed manner. Compared with the centralized GA,
the proposed method can reduce the problem search space, improve the optimization
speed and the quality of the solution, and the distributed framework can also improve
the scalability and reliability of the swarm.
The remainder of this paper is organized as follows: the problem is defined and
described in Section 2. The distributed collaborative allocation method for heterogeneous
UAVs is described in Section 3. In Section 4, a distributed simulation environment is built,
and the proposed method is verified in this environment. Finally, we conclude the paper in
Section 5.
2. Problem Description
Assuming that there are several suspicious areas on the battlefield, a heterogeneous
UAV swarm with different reconnaissance and strike capabilities needs to be dispatched to
perform the reconnaissance and strike mission, and the UAV nodes communicate with each
other through a multi-hop ad hoc network. UAVs autonomously negotiate task-allocation
schemes for reconnaissance and strike targets. Since there may be a synergistic relationship
between enemy targets, the mission risk and mission completion time should be minimized
during mission execution.
The problem can be formalized as the problem of $N_U$ UAVs $U = \{u_i \mid i = 1, 2, \cdots, N_U\}$
completing $N_T$ tasks $T = \{t_j \mid j = 1, 2, \cdots, N_T\}$.
The state of UAV $u_i$ is denoted as $u_i = \langle p_i, \upsilon_i, a_i, T_i^p, T_i^b, \bar{T}_i^b, U_i \rangle$,
where $p_i$ is the current position; $\upsilon_i$ is the maximum flight speed; $a_i$ is the load
capacity matrix of $u_i$, as shown in Table 1 (the capacity between loads can be added, but a
single load cannot be split); $T_i^p$ is the task set that has been perceived but whose
assignment has not been decided; $T_i^b$ and $\bar{T}_i^b$ are the task queue that $u_i$ will
participate in and the task set that it will not participate in; and $U_i$ is the UAV swarm
status perceived by $u_i$, which can be updated through communication with other UAVs.
The state of the task can be expressed as $t_j = \langle p_j, s_j, a_j \rangle$, where $p_j$ is
the position of the task, $s_j$ is the area of the suspicious area where the task is located,
and $a_j$ is the strike capability required by the task. In the process of reconnaissance of
suspicious areas by UAVs with reconnaissance capabilities, existing targets can be found and
$a_j$ can be obtained; but when there is no target in the area, this conclusion can only be
drawn after the UAV has scouted the entire area, and in this case $a_j = 0$.
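The two state tuples can be mirrored by simple records. The following is a minimal sketch only: the field names, the use of 2D positions, the representation of capabilities as plain lists, and the helper `record_scout_result` are assumptions made for illustration and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class Task:
    position: Tuple[float, float]              # p_j (2D is an assumption)
    area: float                                # s_j
    strike_need: Optional[List[float]] = None  # a_j, unknown until the area is scouted

@dataclass
class UAV:
    position: Tuple[float, float]              # p_i
    max_speed: float                           # maximum flight speed
    capacity: List[float]                      # a_i, flattened here for simplicity
    undecided: List[int] = field(default_factory=list)  # perceived, not yet assigned
    accepted: List[int] = field(default_factory=list)   # task queue the UAV joins
    declined: List[int] = field(default_factory=list)   # tasks the UAV will not join
    swarm_view: Dict[int, "UAV"] = field(default_factory=dict)  # perceived swarm status

def record_scout_result(task, found_need, n_capabilities):
    """Store a_j after scouting; an empty area yields a_j = 0 (all zeros here)."""
    task.strike_need = list(found_need) if found_need else [0.0] * n_capabilities
    return task

# A fully scouted area with no target ends up with a zero requirement.
empty = record_scout_result(Task(position=(10.0, 5.0), area=4.0), None, n_capabilities=2)
print(empty.strike_need)  # [0.0, 0.0]
```

Using `default_factory` gives every UAV its own task lists, which matters because each node maintains its own local view in a distributed setting.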
The connection relationship between UAVs is expressed as an adjacency matrix
$L = [l_{im}]_{i,m \in [1, N_U]}$, where $l_{im} = 1$ when the distance $d_{im} \le d_\delta$ and
$l_{im} = 0$ otherwise, and $d_\delta$ is the maximum distance for single-hop communication.
When the reconnaissance node completes the reconnaissance task, it broadcasts the
reconnaissance result (that is, whether there is a target in the area and the required strike
capability) to the swarm by using the ad hoc network. Each UAV utilizes the perceived task
status and the status of other UAVs to optimize the distribution of reconnaissance and strike
tasks by negotiating with neighboring UAVs.
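The adjacency matrix defined above can be computed directly from the UAV positions. The snippet below is a minimal sketch, assuming Euclidean distance and positions given as coordinate tuples; the function name `adjacency_matrix` is hypothetical.

```python
import numpy as np

def adjacency_matrix(positions, d_delta):
    """Return L with l_im = 1 when the distance d_im <= d_delta, else 0.

    positions : sequence of N_U coordinate tuples (Euclidean distance is assumed)
    d_delta   : maximum single-hop communication distance
    """
    pos = np.asarray(positions, dtype=float)
    # Pairwise differences via broadcasting, then the norm over the coordinate axis.
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    # Following the definition literally, the diagonal is 1 because d_ii = 0.
    return (dist <= d_delta).astype(int)

L = adjacency_matrix([(0, 0), (3, 4), (10, 0)], d_delta=5.0)
print(L)
```

In this example the first two UAVs are exactly 5 units apart and therefore linked, while the third is out of single-hop range of both and could only be reached if intermediate nodes relayed the message.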
[Figure 1: flowchart of the algorithm deployed on each UAV. Heterogeneous UAVs in the swarm communicate over a wireless network; on each UAV, the scout task assignment, deterrence decision-making, and strike task assignment modules exchange negotiation, scout-confirm, state-sync, refuse, and strike-invitation messages over a data transfer bus.]
Figure 1. The framework of the collaborative allocation algorithm for reconnaissance and strike tasks.
This method consists of three main modules: negotiation-based scout task assignment, strike task assignment, and deterrence decision-making. When negotiating scout tasks, the method first evaluates the task risk according to the degree of superiority of the UAVs over the enemy, and assigns tasks with the goal of minimizing both the task risk and the task completion time. Based on the perceived environmental information and the historical status information obtained by communicating with neighbor nodes, each UAV analyzes the priority of each task, uses the GA to generate a local task-allocation and time-coordination plan, and negotiates with its neighbors to resolve conflicts. After a reconnaissance node discovers an enemy target, it optimizes the strike plan locally and requests the relevant nodes to coordinate execution. If the request is rejected, it re-optimizes the strike plan until the strike mission is successfully assigned. When nodes with strike capability are idle, they fly to the reconnaissance nodes with weak strike capability to enhance their deterrence against the enemy and to shorten the time from target discovery to strike.
Definition 1. S-Sig function. To make each UAV pay more attention to its local environment, the function $f_{ssig}(x)$ is defined with reference to the sigmoid function as:
$$f_{ssig}(x) = \frac{1}{1 + \exp(4x - 4)} \qquad (1)$$
When $0 < x < 0.5$, $f_{ssig}(x)$ decays slowly. The decay speed increases with $x$ and reaches its maximum at $x = 1$. When $x > 1$, the decay speed decreases and $\lim_{x \to +\infty} f_{ssig}(x) = 0$. In task-priority evaluation, this function can be used to smoothly suppress the priority of tasks that are far away, while the priority of nodes that are close to the reference node is almost unaffected by distance.
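As a quick numerical illustration of Definition 1, the following Python sketch evaluates the S-Sig function at a few points. The function name `f_ssig` mirrors the symbol in Equation (1); the sample points are chosen here only for illustration.

```python
import math

def f_ssig(x):
    """S-Sig function of Equation (1): 1 / (1 + exp(4x - 4))."""
    return 1.0 / (1.0 + math.exp(4.0 * x - 4.0))

if __name__ == "__main__":
    # Slow decay near 0, steepest slope at x = 1, near zero beyond x = 2.
    for x in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(f"f_ssig({x}) = {f_ssig(x):.4f}")
```

The value at $x = 1$ is exactly one half, which is why normalizing distances by a reference radius places the transition from "close" to "far" at that radius.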
Definition 2. Spatial density of tasks. For convenience of analysis, the typical influence radius of a single UAV is set to $\phi$ according to the cruising speed and combat radius of the UAV. Following the concept of kernel density estimation (KDE) in [21], we let the mutual influence between targets attenuate as distance increases, and assume that two targets within radius $\phi$ of each other are likely to cooperate. Therefore, Equation (1) is used as the kernel function to calculate the task spatial density, and for any task $t_j \in T$, its spatial density $\rho_j$ is defined as:
$$\rho_j = \sum_{t_n \in T,\, n \neq j} f_{ssig}\left(d_{jn}/\phi\right) \qquad (2)$$
where $d_{jn}$ is the Euclidean distance between tasks $t_j$ and $t_n$. It can be inferred that $\rho_j$ focuses on the region within radius $2\phi$, because when $d_{jn} > 2\phi$, $f_{ssig}(d_{jn}/\phi) < 0.018$.
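The spatial density of Definition 2 can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function name `spatial_density`, the 2-D task positions, and the choice to skip the task itself in the sum are assumptions.

```python
import math

def f_ssig(x):
    """S-Sig kernel of Equation (1)."""
    return 1.0 / (1.0 + math.exp(4.0 * x - 4.0))

def spatial_density(task_positions, j, phi):
    """rho_j of Equation (2): kernel-weighted count of the other tasks near t_j.

    task_positions: list of (x, y) task coordinates (hypothetical format).
    j:              index of the task being evaluated.
    phi:            typical influence radius of a single UAV.
    Assumption: the task itself (n == j) is excluded from the sum.
    """
    return sum(
        f_ssig(math.dist(task_positions[j], pos) / phi)
        for n, pos in enumerate(task_positions)
        if n != j
    )

if __name__ == "__main__":
    tasks = [(0.0, 0.0), (1.0, 0.0), (100.0, 0.0)]
    for j in range(len(tasks)):
        print(j, spatial_density(tasks, j, phi=1.0))
```

A task with one neighbor at distance $\phi$ and another far away receives a density of about 0.5, while an isolated task receives a density close to zero.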
Definition 3. Capability availability estimation of UAV. UAV $u_i$ estimates the capability availability $f^C_{im}(\tau)$ of neighboring UAV $u_m$ according to its internal perception state at time $\tau$, which can be expressed as:
$$f^C_{im}(\tau) = \prod_{c \in C} f_{imc}(\tau)^{1/|C|} \qquad (3)$$
$$f_{imc}(\tau) = \sum_{u_k \in U_i} f_{ssig}(d_{mk}/\phi)\, \lambda^{\tau - \tau_{ik}}\, \varsigma_{kc} \qquad (4)$$
where $C$ is the set of capability types involved in the problem; $f_{imc}(\tau)$ is the availability of capabilities of type $c$; $U_i$ is the set of UAVs perceived by $u_i$; $d_{mk}$ is the distance between $u_m$ and $u_k$; $\tau_{ik}$ is the timestamp at which $u_i$ received the status of $u_k$; $\lambda$ is the coefficient by which the weight of information from a neighboring UAV decays with time, that is, the longer ago the status of a UAV was updated, the lower its weight; and $\varsigma_{kc}$ is the capability value of type $c$ possessed by $u_k$.
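Equations (3) and (4) amount to a geometric mean, over capability types, of distance- and age-weighted capability sums. The Python sketch below illustrates this under stated assumptions: the function name, the tuple layout of the perceived states, and the explicit `cap_types` argument are hypothetical choices made for the example.

```python
import math

def f_ssig(x):
    """S-Sig kernel of Equation (1)."""
    return 1.0 / (1.0 + math.exp(4.0 * x - 4.0))

def capability_availability(m_pos, perceived, cap_types, tau, phi, lam):
    """f^C_im(tau): geometric mean over capability types (Equations (3)-(4)).

    m_pos:     (x, y) position of the UAV u_m being evaluated.
    perceived: list of (position, timestamp, capabilities) for each u_k in U_i,
               where capabilities maps a type name to its value (assumed format).
    cap_types: the capability types C involved in the problem.
    lam:       time-decay coefficient, 0 < lam <= 1.
    """
    result = 1.0
    for c in cap_types:
        # Equation (4): nearby and recently updated UAVs contribute the most.
        f_imc = sum(
            f_ssig(math.dist(m_pos, pos) / phi) * lam ** (tau - t_ik) * caps.get(c, 0.0)
            for pos, t_ik, caps in perceived
        )
        # Equation (3): one factor of the geometric mean.
        result *= f_imc ** (1.0 / len(cap_types))
    return result

if __name__ == "__main__":
    states = [((0.0, 0.0), 10.0, {"scout": 1.0, "strike": 4.0})]
    print(capability_availability((0.0, 0.0), states, ["scout", "strike"], 10.0, 1.0, 0.9))
    print(capability_availability((0.0, 0.0), states, ["scout", "strike"], 12.0, 1.0, 0.9))
```

Because of the geometric mean, the availability drops to zero as soon as one capability type is entirely missing around $u_m$, and stale status information lowers the estimate through the $\lambda$ factor.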
Definition 5. Task prioritization assessment. Define $\eta_{imj}(\tau)$ as the priority of task $t_j$ to $u_m$ as evaluated by $u_i$. Then $\eta_{imj}(\tau)$ can be expressed by the capability coverage of $u_m$ at $t_j$, namely:
$$\eta_{imj}(\tau) = \frac{1}{\rho_j}\, f^C_{im}(\tau) \cdot f_{ssig}\!\left(\frac{d_{mj} + \alpha_1 \cdot \max(0,\, d_{ij} - \phi)}{\phi}\right) \qquad (7)$$
where the function $\max(\cdot)$ takes the maximum value and $\alpha_1$ is the weighting coefficient of the extra distance. The term $d_{mj} + \alpha_1 \cdot \max(0, d_{ij} - \phi)$ indicates that the distance between $u_i$ and $t_j$ should also be considered when evaluating the capability coverage of $u_m$ at $t_j$: the greater this distance, the more the priority of task $t_j$ is suppressed.
It can be inferred from Equation (7) that the closer the task $t_j$ is to $u_i$ and $u_m$, the lower its spatial density, and the more sufficient the UAV capability that can cover it, the higher the priority that $u_i$ thinks $u_m$ will give to $t_j$. This formula can be used by $u_i$ to measure the superiority of our UAVs with respect to different tasks, which is consistent with the heuristic rules.
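A small Python sketch of Equation (7) makes the effect of the extra-distance term visible. The function name `task_priority` and the idea of passing the precomputed spatial density and capability availability as plain numbers are assumptions made to keep the example short.

```python
import math

def f_ssig(x):
    """S-Sig kernel of Equation (1)."""
    return 1.0 / (1.0 + math.exp(4.0 * x - 4.0))

def task_priority(rho_j, f_c_im, d_mj, d_ij, phi, alpha1):
    """eta_imj of Equation (7).

    rho_j:  spatial density of task t_j (must be positive).
    f_c_im: capability availability of u_m as estimated by u_i.
    d_mj:   distance between u_m and t_j.
    d_ij:   distance between the evaluator u_i and t_j.
    alpha1: weighting coefficient of the extra distance.
    """
    extra = alpha1 * max(0.0, d_ij - phi)  # only penalize evaluators beyond phi
    return f_c_im * f_ssig((d_mj + extra) / phi) / rho_j

if __name__ == "__main__":
    near = task_priority(1.0, 1.0, d_mj=1.0, d_ij=0.5, phi=1.0, alpha1=0.5)
    far = task_priority(1.0, 1.0, d_mj=1.0, d_ij=3.0, phi=1.0, alpha1=0.5)
    print(near, far)
```

With all other inputs fixed, an evaluator located well beyond $\phi$ from the task assigns it a much lower priority than an evaluator located within $\phi$.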
where
$$J(\beta^{sc}) = \sum_{(u_i, t_j, \tau_j^s, g_j) \in \beta^{sc}} g_j(\beta^{sc}) \qquad (9)$$
$$g_j(\beta^{sc}) = r_j(\beta^{sc}) \cdot e^{-\alpha_r (\tau_j^s - \tau)} \qquad (10)$$
$$r_j(\beta^{sc}) = r_{max} - \alpha_t \cdot \sum_{(u_m, t_n, \tau_n^s, g_n) \in \beta^{sc}} [\![ \tau_j^s < \tau_n^s ]\!] \cdot f_{ssig}(d_{jn}/\phi) \qquad (11)$$
where the quadruple $(u_i, t_j, \tau_j^s, g_j)$ indicates that $u_i$ will start reconnaissance task $t_j$ at time $\tau_j^s$ with expected benefit $g_j$; $\alpha_r$ is the time-discount coefficient of rewards; the reconnaissance task plan $\beta^{sc}$ is a collection of task-assignment quadruples; $\tau_j^s$ and $\tau_n^s$ are the start times of tasks $t_j$ and $t_n$, respectively; $r_j(\beta^{sc})$ is the expected reward for $\beta^{sc}$ to complete $t_j$, which is composed of the maximum reward $r_{max}$ and the estimated risk of completing the task; $\alpha_t$ is the weight of risk; and $[\![ P ]\!]$ is an Iverson bracket, equal to 1 if $P$ is true and 0 otherwise, so that if the start time $\tau_n^s$ is later than $\tau_j^s$ in the scheme, $t_n$ will pose a threat to $t_j$.
It can be inferred from Equation (9) that if adjacent tasks can be scouted at the same
time, the threat to each other can be reduced and the reward can be increased. However,
starting reconnaissance at the same time also means that some tasks need to be deliberately
postponed, leading to a decline in overall reward. Therefore, it is necessary to optimize the
time synergy of each plan.
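The trade-off described above can be reproduced numerically. The following Python sketch evaluates the plan fitness of Equations (9) to (11) for a staggered and a simultaneous schedule; the function name, the `(uav, task, start_time)` tuple layout, and all parameter values are assumptions chosen for illustration.

```python
import math

def f_ssig(x):
    """S-Sig kernel of Equation (1)."""
    return 1.0 / (1.0 + math.exp(4.0 * x - 4.0))

def plan_fitness(plan, task_pos, tau, phi, r_max, alpha_t, alpha_r):
    """J(beta_sc) of Equation (9) for a plan given as (uav, task, start_time) tuples."""
    total = 0.0
    for _, t_j, start_j in plan:
        # Equation (11): tasks that start later than t_j threaten t_j.
        threat = sum(
            f_ssig(math.dist(task_pos[t_j], task_pos[t_n]) / phi)
            for _, t_n, start_n in plan
            if start_j < start_n
        )
        r_j = r_max - alpha_t * threat
        # Equation (10): discount the reward by how late the task starts.
        total += r_j * math.exp(-alpha_r * (start_j - tau))
    return total

if __name__ == "__main__":
    pos = {"A": (0.0, 0.0), "B": (1.0, 0.0)}
    staggered = [("u1", "A", 10.0), ("u2", "B", 20.0)]
    aligned = [("u1", "A", 20.0), ("u2", "B", 20.0)]
    for alpha_r in (0.01, 0.1):
        print(alpha_r,
              plan_fitness(staggered, pos, 0.0, 1.0, 3.0, 1.0, alpha_r),
              plan_fitness(aligned, pos, 0.0, 1.0, 3.0, 1.0, alpha_r))
```

With a mild discount the aligned schedule scores higher because the mutual threat disappears; with a strong discount the delay costs more than the removed threat is worth, so the staggered schedule wins.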
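$$\begin{aligned}
&= e^{-\alpha_r (\tau_j^s - \tau)} \left[ r_j(\tilde{\beta}^{sc})\, e^{-\alpha_r \cdot \Delta\tau_{jn}^s} - r_j(\beta^{sc}) \right] \\
&= e^{-\alpha_r (\tau_j^s - \tau)} \left[ \left( r_j(\beta^{sc}) + \alpha_t \cdot f_{ssig}(d_{jn}/\phi) \right) e^{-\alpha_r \cdot \Delta\tau_{jn}^s} - r_j(\beta^{sc}) \right] \\
&= e^{-\alpha_r (\tau_j^s - \tau)} \left[ r_j(\beta^{sc}) \left( e^{-\alpha_r \cdot \Delta\tau_{jn}^s} - 1 \right) + \alpha_t \cdot f_{ssig}(d_{jn}/\phi)\, e^{-\alpha_r \cdot \Delta\tau_{jn}^s} \right]
\end{aligned} \qquad (12)$$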
Since the time collaboration between any two assignments in a plan will depend on
the recalculation of Equation (8), and the time collaborative optimization is an underlying
algorithm that will be called repeatedly, we define Algorithm 1 based on Equation (12) to
quickly optimize the time collaboration of the given plan.
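The closed form in the last line of Equation (12) can be checked against a direct before/after comparison of the discounted reward of a single task. The Python sketch below does this; the function name `alignment_gain` and the numeric values are assumptions, and the sketch covers only the pairwise gain, not the full loop of Algorithm 1.

```python
import math

def alignment_gain(r_j, threat_jn, start_j, delta, tau, alpha_t, alpha_r):
    """Reward gain of postponing t_j by delta to start together with t_n (Equation (12)).

    r_j:       current expected reward r_j(beta_sc) of task t_j.
    threat_jn: f_ssig(d_jn / phi), the threat that t_n currently poses to t_j.
    delta:     time shift needed for t_j to start at the same time as t_n.
    """
    discount = math.exp(-alpha_r * (start_j - tau))
    shift = math.exp(-alpha_r * delta)
    return discount * (r_j * (shift - 1.0) + alpha_t * threat_jn * shift)

if __name__ == "__main__":
    # Direct computation: discounted reward after alignment minus before alignment.
    r_j, threat, start_j, delta, alpha_t = 2.5, 0.5, 10.0, 10.0, 1.0
    for alpha_r in (0.01, 0.1):
        direct = ((r_j + alpha_t * threat) * math.exp(-alpha_r * (start_j + delta))
                  - r_j * math.exp(-alpha_r * start_j))
        print(alpha_r, alignment_gain(r_j, threat, start_j, delta, 0.0, alpha_t, alpha_r), direct)
```

A positive gain means the alignment is worth the delay; a negative gain means the discount on the postponed start outweighs the removed threat, so the two tasks should not be aligned.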
        ▷ Only the task assignments of the UAVs belonging to $U_i^{sc}$ are retained
10: end for
11: if $\hat{\beta}_i^{sc}$ inconsistent with $\beta_i^{sc}$ then    ▷ If the task assignment changes
12:     Broadcast $\hat{\beta}_i^{sc}$    ▷ Broadcast the updated plan to neighbors within $n_c$ hops
13: end if
14: $\beta_i^{sc} \leftarrow \hat{\beta}_i^{sc}$    ▷ Replace the local plan with the new plan
15: $B_i^{sc} \leftarrow \emptyset$    ▷ Clear the set of received scout plans
16: return the updated $\beta_i^{sc}$
In Algorithm 3, the received plans are conflict-resolved against the local plan one by one. For each received plan $\beta_m^{sc}$, the part that is consistent with $\beta_i^{sc}$ is locked, the UAVs and tasks involved in the inconsistent part are extracted, and Algorithm 2 is used to re-optimize them. The plans are not all merged at the same time because the more plans are received, the lower the probability of obtaining assignments that contain consistent parts; each iteration would then amount to an almost full re-assignment, leading to low convergence efficiency.
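The lock-and-extract step described above can be sketched as follows. This is a simplified illustration rather than Algorithm 3 itself: the dictionary representation of a plan (UAV id mapped to a `(task, start_time)` pair), the function name `split_plans`, and the sample plans are all hypothetical, and the re-optimization of the extracted part is left out.

```python
def split_plans(local, received):
    """Separate the consistent and inconsistent parts of two scout plans.

    local, received: dicts mapping a UAV id to its (task, start_time) assignment.
    Returns (locked, uavs, tasks): the assignments that agree in both plans,
    and the UAVs and tasks involved in the parts that disagree.
    """
    # Assignments that are identical in both plans are locked.
    locked = {u: a for u, a in local.items() if received.get(u) == a}
    # Every other UAV mentioned in either plan needs re-optimization.
    uavs = {u for u in set(local) | set(received) if u not in locked}
    # Collect the tasks those UAVs were assigned in either plan.
    tasks = {plan[u][0] for plan in (local, received) for u in uavs if u in plan}
    return locked, uavs, tasks

if __name__ == "__main__":
    local = {7: (15, 32.7), 5: (28, 32.7), 9: (14, 22.0)}
    received = {5: (28, 32.7), 9: (14, 22.0), 7: (3, 21.2), 19: (15, 37.7)}
    locked, uavs, tasks = split_plans(local, received)
    print(sorted(locked), sorted(uavs), sorted(tasks))
```

Only the extracted UAVs and tasks would then be passed to the local optimizer, which keeps each negotiation step small when most of the two plans already agree.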
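where
$$\begin{aligned}
J(\beta^{st}) ={}& \sum_{(U_j, t_j, \tau_j^s) \in \beta^{st}} [\![ t_j \in T_i^{st} ]\!] \cdot \left( f_{us}(t_j, U_j) \cdot \Xi + \alpha_c \cdot \Delta\varsigma(t_j, U_j) + \alpha_\tau \cdot \tau_{max}(t_j, U_j) \right) \\
& - \alpha_{th} \cdot \min_{\substack{(U_j, t_j, \tau_j^s) \in \beta^{st} \\ t_j \in T_i^{sc}}} \left( \sum_{u_m \in U_j} \varsigma_m \right) \Bigg/ \left( \frac{1}{|U_j|} \sum_{u_m \in U_j} \frac{d_{mj}}{\nu_m} \right)
\end{aligned} \qquad (14)$$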
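$$f_{us}(t_j, U_j) = \begin{cases} 0, & a_j \le \sum_{u_m \in U_j} \varsigma_m \\ 1, & \text{else} \end{cases} \qquad (15)$$
$$\Delta\varsigma(t_j, U_j) = \sum_{c \in C} \left( -a_{jc} + \sum_{u_m \in U_j} \varsigma_{mc} \right) \qquad (16)$$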
$\beta^{st}$ is the strike plan, composed of strike capability assignment triplets $(U_j, t_j, \tau_j^{st})$; each triplet indicates that the set of UAVs $U_j$ needs to arrive at and strike $t_j$ at time $\tau_j^{st}$. $f_{us}(t_j, U_j)$ judges whether the capability requirements of task $t_j$ can be met by the strike plan; if not, a large constant $\Xi$ is added to the objective function so that the algorithm gives priority to meeting the capability requirements of $t_j$. $\Delta\varsigma(t_j, U_j)$ represents the redundancy of the strike capability assigned to $t_j$, and $\tau_{max}(t_j, U_j)$ is the latest arrival time of the strike capability assigned to $t_j$; both are minimized by the algorithm once the capability requirements are met. $\alpha_c$ and $\alpha_\tau$ are weight coefficients. The part weighted by $\alpha_{th}$ is intended to maximize the minimum deterrent degree of reconnaissance tasks.
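The per-task part of the strike objective can be illustrated with a short Python sketch. It covers only the penalty, redundancy, and arrival-time terms for one triplet and omits the deterrence term weighted by $\alpha_{th}$; the function name `strike_cost`, the dictionary representation of capability vectors, and the sample values are assumptions.

```python
def strike_cost(required, group, arrival_times, xi, alpha_c, alpha_tau):
    """Cost of assigning one UAV group to one strike task (deterrence term omitted).

    required:      dict mapping capability type to the amount a_jc the task needs.
    group:         list of dicts, one per UAV in U_j, mapping type to capability.
    arrival_times: arrival time of each UAV in the group.
    xi:            large penalty constant applied when requirements are unmet.
    """
    supplied = {c: sum(caps.get(c, 0.0) for caps in group) for c in required}
    # f_us: 1 when any required capability is not covered, otherwise 0.
    unmet = any(supplied[c] < required[c] for c in required)
    # Redundancy: assigned capability minus required capability, summed over types.
    redundancy = sum(supplied[c] - required[c] for c in required)
    return (xi if unmet else 0.0) + alpha_c * redundancy + alpha_tau * max(arrival_times)

if __name__ == "__main__":
    need = {"strike": 3.0}
    print(strike_cost(need, [{"strike": 2.0}, {"strike": 2.0}], [10.0, 14.0], 1e6, 1.0, 0.1))
    print(strike_cost(need, [{"strike": 2.0}], [10.0], 1e6, 1.0, 0.1))
```

The large constant dominates whenever the requirement is not covered, so any feasible group is preferred over any infeasible one, and among feasible groups the cost favors low redundancy and early arrival.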
When a UAV discovers a target during reconnaissance, it triggers Algorithm 4 for strike task allocation, which uses the GA to minimize the objective function in Equation (13). After optimization, the number of strike loads required from each UAV is calculated in detail, and invitations are sent to the UAVs that, according to the plan, participate in the strike on the target discovered by $u_i$.
11: Send invitation to $u_m \in U_{u_i}$ with the occupied loads and the strike time $\tau^{st}_{u_i}$
13: else if $n^{st} < n^{st}_{max}$ then    ▷ Expand the request range for strike UAVs until $n^{st}_{max}$
14:     $n^{st} \leftarrow n^{st} + 1$
15:     Recursively optimize the strike plan using Algorithm 4
16: else
17:     return Failed    ▷ There are not enough strike UAVs to execute this task
18: end if
However, if any UAV rejects the invitation, it will be excluded and the scheme will be
optimized again until the strike task is successfully assigned.
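where
$$J(\beta^{th}) = \min_{(U_j, t_j) \in \beta^{th}} \left( \sum_{u_m \in U_j} \varsigma_m \right) \Bigg/ \left( \frac{1}{|U_j|} \sum_{u_m \in U_j} \frac{d_{mj}}{\nu_m} \right) \qquad (19)$$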
[Figure 2 diagram. Each simulated UAV model contains message send and receive, message-based state sync, mirror management of tasks and other UAV states, scout task distributed negotiation, strike task optimization and invitation, and deterrence maneuver optimization. A simulation control module provides the graphical user interface, scene generation, simulation progress control, UAV model scheduling, message exchange between UAVs, and interactive result determination, and exchanges task state updates and interactive result feedback with the UAV and task models.]
Figure 2. Distributed simulation environment for heterogeneous UAV reconnaissance and strike tasks.
[Figure legend. UAV types: mini scouter, mini striker, mini SC&ST, medium SC&ST. Task states: unscouted task, confirmed target, finished task. Other elements: UAV wireless network; task indicator line; U: UAV has strike load; T: task.]
To simulate the fog of war and the dynamic characteristics of the mission, five types of tasks are set up in the experiment, each with a differently sized suspicious area and a different required capability vector, as shown in Table 3. The fake target indicates that there is no actual target in the region; before reconnaissance is completed, the specific information of any target is unknown. Therefore, UAV reconnaissance and strike forces need to cooperate more flexibly to reduce mission risk and the time interval from discovery to strike.
Table 5. The priority order of tasks for each UAV, as evaluated by different UAVs at time = 1.0. The bold numbers are the tasks selected to participate in this round of assignment.
Evaluator | Task Priority Order for Each UAV
U0 | U0: (5, 23, 6, 20, 12, 10, 25, ...)
U0 | U6: (5, 23, 6, 20, 12, 10, 25, ...)
U0 | U13: (23, 5, 6, 12, 10, 25, 20, ...)
U0 | U21: (5, 6, 23, 20, 12, 27, 10, ...)
U4 | U4: (24, 21, 29, 11, 4, 0, 13, ...)
U4 | U7: (4, 11, 29, 24, 21, 22, 13, ...)
U4 | U24: (24, 21, 29, 11, 4, 0, 13, ...)
U5 | U5: (28, 14, 15, 3, 1, 26, 4, ...)
U5 | U7: (15, 28, 14, 3, 1, 4, 26, ...)
U5 | U9: (14, 28, 15, 3, 26, 1, 2, ...)
U5 | U14: (28, 15, 14, 3, 1, 26, 4, ...)
U5 | U19: (28, 14, 15, 3, 1, 26, 4, ...)
U6 | U6: (6, 20, 5, 23, 27, 8, 12, ...)
U6 | U0: (5, 23, 6, 20, 27, 12, 10, ...)
U6 | U13: (6, 23, 5, 20, 12, 27, 10, ...)
U6 | U21: (6, 20, 5, 27, 23, 8, 12, ...)
U7 | U7: (15, 28, 3, 4, 11, 29, 24, ...)
U7 | U4: (4, 11, 29, 24, 3, 15, 28, ...)
U7 | U5: (28, 15, 3, 14, 4, 11, 1, ...)
U7 | U9: (28, 3, 15, 14, 4, 11, 2, ...)
U7 | U14: (15, 28, 3, 4, 11, 29, 14, ...)
U7 | U15: (7, 15, 28, 16, 4, 29, 9, ...)
U7 | U19: (15, 28, 3, 4, 14, 11, 2, ...)
U7 | U24: (4, 11, 24, 29, 3, 15, 28, ...)
U24 | U24: (21, 24, 0, 29, 11, 4, 12, ...)
U24 | U4: (24, 21, 11, 29, 0, 4, 13, ...)
U24 | U7: (24, 4, 11, 29, 21, 13, 22, ...)
U14 | U14: (15, 28, 3, 1, 14, 4, 7, ...)
U14 | U5: (28, 15, 14, 3, 1, 4, 11, ...)
U14 | U7: (15, 28, 3, 4, 14, 1, 11, ...)
U14 | U9: (28, 15, 14, 3, 4, 1, 11, ...)
U14 | U15: (1, 7, 15, 28, 9, 4, 3, ...)
U14 | U19: (28, 15, 3, 14, 4, 1, 11, ...)
U15 | U15: (9, 19, 7, 16, 1, 18, 22, ...)
U15 | U7: (7, 9, 16, 19, 1, 18, 15, ...)
U15 | U14: (7, 9, 1, 19, 16, 18, 15, ...)
U15 | U29: (19, 9, 7, 1, 16, 18, 22, ...)
U19 | U19: (14, 28, 15, 3, 2, 26, 4, ...)
U19 | U5: (28, 14, 15, 3, 26, 2, 4, ...)
U19 | U7: (15, 3, 28, 14, 4, 11, 2, ...)
U19 | U9: (14, 28, 3, 15, 26, 2, 4, ...)
U19 | U14: (15, 28, 3, 14, 4, 2, 11, ...)
U21 | U21: (6, 20, 27, 8, 5, 23, 2, ...)
U21 | U0: (5, 23, 6, 20, 27, 8, 12, ...)
U21 | U6: (6, 20, 5, 27, 23, 8, 12, ...)
U21 | U9: (8, 27, 2, 20, 6, 3, 5, ...)
U21 | U13: (6, 5, 23, 20, 27, 8, 12, ...)
U9 | U9: (14, 28, 26, 3, 15, 2, 8, ...)
U9 | U5: (14, 28, 15, 3, 26, 2, 8, ...)
U9 | U7: (28, 15, 3, 14, 26, 2, 4, ...)
U9 | U14: (28, 15, 14, 3, 26, 2, 4, ...)
U9 | U19: (14, 28, 3, 15, 26, 2, 8, ...)
U9 | U21: (3, 14, 26, 2, 28, 8, 15, ...)
U29 | U29: (19, 9, 1, 7, 16, 18, 15, ...)
U29 | U15: (19, 9, 7, 1, 16, 18, 15, ...)
[Figure diagram. Conflict resolution of a UAV's local plan: the plans received from negotiating neighbors are resolved against the local plan with the conflict-resolution algorithm, yielding the updated local plan. Each plan is tabulated by UAV, task, start time $\tau^s$, and expected benefit $g$.]
When the conflict is resolved, the assignment of U7 in Figure 4 is consistent with the
updated assignments of other nodes, so this assignment is finally adopted and implemented.
The allocation results of the first round of reconnaissance tasks are shown as the green
target lines of each reconnaissance node in Figure 5a, and the specific allocation information
is shown in Table 6. The topology after a period of execution is shown in Figure 5b.
From the allocation results, it can be seen that the distributed allocation algorithm generally follows the principle of minimizing the completion time, but it also reflects the algorithm's aim of enhancing superiority over the enemy and reducing mission risks. For example, U13 chose T6 instead of T12 as its first mission, because it is easier for the UAV swarm to form superiority over the enemy at T6. In contrast, the area around T12 is too dense and risky, and T12 should be executed after more UAVs are concentrated there. The mission groups (T5, T23) and (T15, T28) are also relatively dense, but since the UAV swarm has a capability advantage there, the strategy of coordinating in time is adopted to reduce the mission risk. In the time-coordinated formation, the UAVs that could have arrived earlier reduce their flight speed so that the formation reaches the targets at the same time, thereby avoiding a coordinated strike by the enemy due to individual exposure.
When each UAV adopts the distributed task assignment algorithm in this paper, the task scheduling Gantt chart is as shown in Figure 6, in which the blue boxes represent reconnaissance behavior and the red boxes represent strike behavior. The Gantt chart shows that UAVs with both reconnaissance and strike capabilities, such as U0 and U5, can strike immediately when their own capabilities meet the target capability requirements. For example, U0's strike on T10 and U5's strike on T11 are instant. Targets with strong defense capabilities require a coordinated strike by multiple UAVs. For example, the strikes on T0 and T26 are each completed by the cooperation of four UAVs. Since the nodes with strike capability maneuver to the reconnaissance nodes for deterrence when they are idle, reconnaissance nodes that do not have strike capability can also have their targets struck quickly after discovery. For example, U25 can launch a strike on T9 (discovered by U15) within 8 s after confirming the strike mission.
Figure 5. Topology diagram for different simulation times. (a) Each reconnaissance UAV has confirmed its reconnaissance task before time = 8. (b) Each reconnaissance node executes its reconnaissance task, and the strike nodes perform deterrence maneuvers at time = 30. The meanings of elements are consistent with those in Figure 3.
Table 6. Allocation information and time coordination relationship of the first round of reconnaissance tasks.
Figure 6. Gantt chart for reconnaissance and strike tasks. Blue boxes represent reconnaissance and
red boxes represent strikes, and the triple elements represent mission confirmation time, duration of
maneuver and task execution, and the concatenation of task type and task ID respectively.
takes the most time, and then decreases gradually. This is because before the first decision,
there is no communication between nodes, and each node makes decisions independently.
The second decision is made after exchanging the results of the first round, and at this
time, each node is performing conflict resolution on multiple collected plans, so it takes
a lot of time. After the second step, the conflict between plans is gradually resolved, so
the decision-making time is also shortened and the final task-allocation plan is formed.
The optimization results are shown in Table 6, and from the global perspective, its fitness is
27.48 using Equation (9).
[Figure 7 plot. Axis labels: time consumed by each decision of each node (s); simulation time step.]
Figure 7. Time consumption of each UAV in the proposed method.
In the same scenario, we further use the centralized GA (CGA) to optimize the reconnaissance task-allocation problem from a global perspective; the resulting fitness curves are shown as the two CGA curves in Figure 8. In our proposed algorithm, the population number $\pi_n$ and iteration number $\pi_i$ of the GA are automatically adjusted according to the number of permutations of the assignment problem. When this strategy is applied to the centralized method, the obtained GA parameters are $\pi_n = 32$, $\pi_i = 30$, and $\pi_p = 0.1$. However, the search space for the global optimization problem of assigning 30 tasks to 13 UAVs is too large, so it is difficult for the GA to converge to a good result under these parameters. To expand the search of the global GA and obtain a better result, we adjust the parameters to $\pi_n = 400$, $\pi_i = 200$, and $\pi_p = 0.3$. After 27 rounds of iteration, which take about 4.9 s, the centralized GA obtains a result with fitness close to that of the proposed method.
[Figure 8 plot. Axis labels: fitness; the number of iterations of the centralized GA. Curves: proposed method and two CGA runs with different settings of $\pi_n$, $\pi_i$, and $\pi_p$.]
Comparing the time consumption of the two methods, it can be seen that the proposed method effectively reduces the computing load of each single UAV through the idea of divide and conquer, and can therefore be applied to more types of small UAVs. By observing the change of fitness, we found that centralized global optimization easily falls into a local optimum when the problem space is large. If the swarm size is further increased, it will be difficult for the centralized global optimization method to obtain good results, while distributed collaborative optimization has better scalability.
Table 7. The scout risk and strike capability coverage compared with no time coordination or
deterrence maneuver. Each result is the mean and standard deviation of 10 simulation scenarios.
The statistics of the results show that time coordination can reduce the scout risk by
about 23%, and deterrence maneuver can improve the strike capability coverage by about
30%, which verifies the effectiveness of time coordination and maneuver deterrence.
4.7. Discussion
4.7.1. Computational Complexity Analysis
In the proposed method, the optimization of scout-task assignment needs to optimize
task allocation and time coordination between UAVs, which is the part with high com-
putational complexity of our proposed method. Therefore, analyzing the computational
complexity of this part will aid further improvement.
In Algorithm 2, the most time-consuming process is line 24, which optimizes the plan using the GA. The computational complexity of the GA can be expressed as $O(\pi_n \times \pi_i)$, where $\pi_n$ is the population size and $\pi_i$ is the number of iterations. For each plan generated by the GA, its fitness is calculated through lines 17–21 of Algorithm 2. Among them, line 20 uses Algorithm 1 for time collaborative optimization. Let the number of UAVs in the plan be $N$. In the worst case, the while loop of Algorithm 1 iterates $N - 1$ times, and each iteration needs to calculate $N \times (N - 1)$ time-alignment reward gains according to Equation (12), so the complexity of Algorithm 1 is about $O(N^3)$. Then line 21 of Algorithm 2 uses Equation (9) to calculate the fitness of the plan, in which the start times of any two UAVs need to be compared to evaluate the task threat, and thus the complexity is about $O(N^2)$.
Therefore, the overall computational complexity of Algorithm 2 is about $O(\pi_n \times \pi_i \times (N^3 + N^2)) \approx O(\pi_n \times \pi_i \times N^3)$. The settings of $\pi_n$ and $\pi_i$ affect not only the calculation cost but also the quality of the optimization results. In this paper, these two parameters are simply linearly mapped from the number of allocation combinations; before deployment, the setting strategy of these parameters needs further study to balance the calculation cost against the optimization quality.
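To give a feel for the cubic term, the following Python sketch counts the worst-case number of time-alignment gain evaluations implied by the analysis above: population size times number of iterations times $(N - 1) \times N \times (N - 1)$. The function name is hypothetical, and the count is only an upper-bound estimate derived from the stated loop sizes, not a measured cost.

```python
def worst_case_gain_evaluations(population, iterations, n_uavs):
    """Upper bound on Equation (12) evaluations for one GA run.

    Each candidate plan triggers at most (N - 1) passes of the alignment loop,
    and each pass evaluates N * (N - 1) pairwise gains.
    """
    per_plan = (n_uavs - 1) * n_uavs * (n_uavs - 1)
    return population * iterations * per_plan

if __name__ == "__main__":
    # Two GA settings for a plan involving 13 UAVs, then a smaller local plan.
    print(worst_case_gain_evaluations(32, 30, 13))
    print(worst_case_gain_evaluations(400, 200, 13))
    print(worst_case_gain_evaluations(32, 30, 5))
```

Because the bound grows with $N^3$, restricting each optimization to a handful of neighboring UAVs reduces the per-node cost far more than a proportional reduction in the GA parameters would.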
5. Conclusions
Due to the high risk of UAV clusters in executing reconnaissance and strike tasks
under the condition of insufficient enemy information and potential synergy between
targets, a distributed task-collaborative allocation method for heterogeneous UAV swarms
is proposed. This method establishes a distributed task-allocation framework composed of a
reconnaissance task-allocation method based on a negotiation mechanism and a strike task-
allocation method based on an invitation mechanism. The reconnaissance task-allocation
algorithm evaluates the task priority according to the superiority of the UAVs against
the tasks to reduce the complexity of the optimization problem. Reconnaissance UAVs
adopt a time-coordination strategy for reconnaissance, and UAVs with strike capabilities
perform deterrent maneuvers when they are idle to reduce mission risks during mission
execution. This method enables the UAV swarm to negotiate the allocation of tasks in a
distributed framework, and at the same time, the evaluation of the capability advantage
over the enemy, time coordination, and deterrence maneuver mechanism effectively reduce
the risk of unknown targets to UAVs. The distributed framework not only improves the
scalability of the swarm, but also enhances its reliability in the battlefield with a more
complex electromagnetic environment.
Further research should include a more efficient algorithm that takes the negotiation
mechanism and the network state of the swarm as prior information to replace the GA
for the local optimization of UAVs, and should aim to obtain the best operating efficiency
under different network connectivity. The proposed method should be further combined
with a centralized or hierarchical task-allocation framework.
Author Contributions: Conceptualization, H.D. and J.H.; methodology and writing—original draft,
H.D.; writing—review and editing, Q.L., C.Z., T.Z. and J.G. All authors have read and agreed to the
published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Shakhatreh, H.; Sawalmeh, A.H.; Al-Fuqaha, A.; Dou, Z.; Almaita, E.; Khalil, I.; Othman, N.S.; Khreishah, A.; Guizani,
M. Unmanned Aerial Vehicles (UAVs): A Survey on Civil Applications and Key Research Challenges. IEEE Access 2019, 7,
48572–48634.
2. Qin, B.; Zhang, D.; Tang, S.; Wang, M. Distributed Grouping Cooperative Dynamic Task Assignment Method of UAV Swarm.
Appl. Sci. 2022, 12, 2865. [CrossRef]
3. Zhang, J.; Xing, J. Cooperative task assignment of multi-UAV system. Chin. J. Aeronaut. 2020, 33, 2825–2827. [CrossRef]
4. Jiang, X.; Zeng, X.; Sun, J.; Chen, J. Research status and prospect of distributed optimization for multiple aircraft. Acta Astronaut.
2021, 42, 524551. (In Chinese) [CrossRef]
5. Zhen, Z.; Xing, D.; Gao, C. Cooperative search-attack mission planning for multi-UAV based on intelligent self-organized
algorithm. Aerosp. Sci. Technol. 2018, 76, 402–411. [CrossRef]
6. Duan, H.; Zhao, J.; Deng, Y.; Shi, Y.; Ding, X. Dynamic Discrete Pigeon-Inspired Optimization for Multi-UAV Cooperative
Search-Attack Mission Planning. IEEE Trans. Aerosp. Electron. Syst. 2021, 57, 706–720. [CrossRef]
7. Ma, Y.; Zhao, Y.; Bai, S.; Yang, J.; Zhang, Y. Collaborative task allocation of heterogeneous multi-UAV based on improved CBGA
algorithm. In Proceedings of the 16th International Conference on Control, Automation, Robotics and Vision, Shenzhen, China,
13–15 December 2020; pp. 795–800.
8. Dai, W.; Lu, H.; Xiao, J.; Zeng, Z.; Zheng, Z. Multi-Robot Dynamic Task Allocation for Exploration and Destruction. J. Intell. Robot.
Syst. 2019, 98, 455–479. [CrossRef]
9. Sheng, W.; Yang, Q.; Tan, J.; Xi, N. Distributed multi-robot coordination in area exploration. Robot. Auton. Syst. 2006, 54, 945–955.
[CrossRef]
10. Ye, F.; Chen, J.; Sun, Q.; Tian, Y.; Jiang, T. Decentralized task allocation for heterogeneous multi-UAV system with task coupling
constraints. J. Supercomput. 2020, 77, 111–132. [CrossRef]
11. Chen, J.; Wu, Q.; Xu, Y.; Qi, N.; Guan, X.; Zhang, Y.; Xue, Z. Joint Task Assignment and Spectrum Allocation in Heterogeneous
UAV Communication Networks: A Coalition Formation Game-Theoretic Approach. IEEE Trans. Wirel. Commun. 2021, 20,
440–452. [CrossRef]
12. Jiang, Y. A Survey of Task Allocation and Load Balancing in Distributed Systems. IEEE Trans. Parallel Distrib. Syst. 2016, 27,
585–599. [CrossRef]
13. Li, L.; Xu, S.; Nie, H.; Mao, Y.; Yu, S. Collaborative Target Search Algorithm for UAV Based on Chaotic Disturbance Pigeon-Inspired
Optimization. Appl. Sci. 2021, 11, 7358. [CrossRef]
14. Hu, J.; Wu, H.; Zhan, R.; Menassel, R.; Zhou, X. Self-organized search-attack mission planning for UAV swarm based on wolf
pack hunting behavior. J. Syst. Eng. Electron. 2021, 32, 1463–1476.
15. Choi, H.; Brunet, L.; How, J.P. Consensus-Based Decentralized Auctions for Robust Task Allocation. IEEE Trans. Robot. 2009, 25,
912–926. [CrossRef]
16. Edalat, N.; Tham, C.; Xiao, W. An auction-based strategy for distributed task allocation in wireless sensor networks. Comput.
Commun. 2012, 35, 916–928. [CrossRef]
17. Choi, H.; Kim, Y.; Kim, H.J. Genetic algorithm based decentralized task assignment for multiple unmanned aerial vehicles in
dynamic environments. Int. J. Aeronaut. Space Sci. 2011, 163–174. [CrossRef]
18. Patel, R.; Rudnick-Cohen, E.; Azarm, S.; Otte, M.; Xu, H.; Herrmann, J.W. Decentralized Task Allocation in Multi-Agent Systems
Using a Decentralized Genetic Algorithm. In Proceedings of the IEEE International Conference on Robotics and Automation
(ICRA), Paris, France, 31 May–31 August 2020; pp. 3770–3776.
19. Wu, H.; Li, H.; Xiao, R.; Liu, J. Modeling and simulation of dynamic ant colony’s labor division for task allocation of UAV swarm.
Physica A 2018, 491, 127–141. [CrossRef]
20. Cao, Y.; Wei, W.; Bai, Y.; Qiao, H. Multi-base multi-UAV cooperative reconnaissance path planning with genetic algorithm. Clust.
Comput. 2019, 22, 5175–5184. [CrossRef]
21. Yu, W.; Ai, T.; Shao, S. The analysis and delimitation of Central Business District using network kernel density estimation. J.
Transp. Geogr. 2015, 45, 32–47. [CrossRef]
22. Khan, M.A.; Safi, A.; Qureshi, I.M.; Khan, I.U. Flying ad-hoc networks (FANETs): A review of communication architectures,
and routing protocols. In Proceedings of the 2017 First International Conference on Latest trends in Electrical Engineering and
Computing Technologies, Karachi, Pakistan, 15–16 November 2017; pp. 1–9.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
drones
Article
Service Function Chain Scheduling in Heterogeneous
Multi-UAV Edge Computing
Yangang Wang 1,2 , Hai Wang 1, *, Xianglin Wei 2, *, Kuang Zhao 2 , Jianhua Fan 2 , Juan Chen 1 , Yongyang Hu 2
and Runa Jia 1,2
1 College of Communication Engineering, Army Engineering University of PLA, Nanjing 210007, China
2 The Sixty-Third Research Institute, National University of Defense Technology, Nanjing 210007, China
* Correspondence: hai_wang@aeu.edu.cn (H.W.); weixianglin@nudt.edu.cn (X.W.)
Keywords: edge computing; unmanned aerial vehicle; artificial intelligence; service function chain
order. For instance, each SF could be a modular in the DL-involved application, such as
data pre-processing functions, deep neural network components, and target tracking, etc.
An SF is a static software template that can derive instances on demand based on the virtual
machine (VM) or docker technology [13–15]. A corresponding SF instance (SI) has to be
created whenever the UAV decides to process a task. Once all the SIs of SFs contained in an
SFC are successfully created, a task can pass through each instance sequentially to obtain
its required services. In addition, in order to be able to run SFCs with AI algorithms, it
is necessary to equip UAVs with custom-made AI processors. Compared with graphics
processing unit (GPU), field programmable gate array (FPGA) has obvious characteristics
of low power consumption and small size [16,17], which has been treated as a promising
solution on the UAV platforms without violating size, weight, and power constraints
inherent to UAV design. Considering that the UAV has limited computation resources
and storage resources, the SFs contained in one SFC and their corresponding SIs can be
distributed on multiple UAVs. When a large number of tasks are offloaded onto the UAV
network at the same time stage, massive SIs corresponding to the required SFs will have
to be created. In order to make full use of the limited computation and communication
resources of UAVs, an efficient SFC scheduling strategy is indispensable. However, it also
faces many challenges: (1) There is a complex matching relationship between tasks, SIs, and
SFCs due to the heterogeneity of UAVs and the resource requirements of the tasks. (2) There
is a complex trade-off between the communication and computation resource scheduling.
(3) It is hard to achieve a long-term multi-objective optimization for the scenario with con-
tinuous task arrival and unknown SFC requirements. Tremendous efforts have been made
in designing task scheduling algorithms in multi-UAV edge computing paradigms [18–34].
However, they often assume that each task is served by only one UAV and less attention
has been paid to SFC scheduling in a multi-UAV edge computing scenario. A detailed
analysis of existing efforts is presented in Section 2. Against this backdrop, this paper first
formulates the SFC scheduling problem as a 0–1 nonlinear integer programming problem.
Then, a two-stage heuristic algorithm is put forward to derive a sub-optimal solution of the
problem. The main contributions of this paper are threefold:
(1) The SFC scheduling problem in a heterogeneous “CPU + FPGA” computation
architecture is formulated as a 0–1 nonlinear integer programming problem. The overall
revenue of the system and the completion time sum of tasks are optimized under
various resource constraints. To the best of our knowledge, this is the first paper to
study the SFC scheduling problem considering FPGA resources in the multi-UAV
edge computing network;
(2) To solve the NP-hard problem with coupling variables, a two-stage heuristic algorithm
called ToRu is put forward. At the first stage, i.e., when the resources are abundant,
the SFCs of all tasks are deployed to UAV edge servers in parallel based on our
proposed pairing principle between SFCs and UAVs for minimizing the sum of all
tasks’ completion time; at the second stage, i.e., when the resources are insufficient,
a revenue maximization heuristic method is adopted to deploy the arrived SFCs
in a serial service mode. In order to obtain the long-term optimization, a time-slot
partitioning protocol is designed, based on which ToRu can operate repeatedly in
each time-slot;
(3) A series of experiments is conducted to evaluate the performance of our proposal.
Experimental results show that our proposed ToRu algorithm outperforms other
benchmark algorithms in the sum of all tasks’ completion time, the overall revenue,
and the task execution success ratio.
The remainder of this paper is organized as follows. Section 2 summarizes related
work. The model and problem formulation of the proposed system are introduced in
Section 3. Section 4 describes the details of our proposed ToRu algorithm. The experiments
and analysis of the results are presented in Section 5. Finally, Section 6 concludes this paper.
Drones 2023, 7, 132
2. Related Work
In the multi-UAV edge computing system, according to the granularity of services
provided to users, the existing work can be roughly divided into two categories: task
scheduling and SFC scheduling. The difference between the two is mainly reflected in
the matching process between offloaded tasks and UAVs. The former only considers
whether the UAV’s hardware computation resources (such as CPU and RAM) meet the
task requirements; the latter further considers whether the SFs deployed on UAVs match
the requirements of the offloaded tasks, and is closer to the real situation.
optimization algorithm. However, the application considered in [32] only contains one SF,
so that SFC scheduling is not involved.
Wang et al. [33] proposed a reconfigurable service provisioning framework based
on SFCs for Space–Air–Ground-Integrated Networks (SAGINs), where the computation
and communication resource consumption is balanced. Li et al. [34] investigated the
online mapping and scheduling of dynamic virtual network functions (VNFs) in SAGINs,
in which an Internet of Vehicles (IoV) service can be represented by an SFC formed with a
set of chained VNFs. However, in the proposed SFC construction process, only the capacity
constraints of CPU and buffer on NFV nodes are considered, and the channel resource
constraints between NFV nodes are ignored. In fact, the channel resources between NFV
nodes are very scarce in the wireless environment and often become the bottleneck
affecting the SFC construction. In addition, the SFC scheduling mentioned in [33,34] aims
to establish an end-to-end route for servicing data, where the data source and destination
nodes are separately located in different geographical regions. However, in the edge
computing scenario considered in this paper, the SFC scheduling aims to provide services
for requesting users, where the input of raw data and the output of computing results are
all located in the same access node. Moreover, dedicated hardware resources (such as GPU,
FPGA, etc.) supporting AI algorithm operation are also not considered in [33,34].
In summary, it is difficult to solve the SFC scheduling problem involved in this paper
with existing algorithms. To the best of our knowledge, compared with existing works,
this is the first paper that considers the SFC scheduling problem in a multi-UAV edge
computing network with FPGA resources, which can provide services for complicated
intelligent application-oriented tasks in weak-infrastructure areas.
[Figure 1 legend: the required SFC received by each source UAV; the SFC path of processing the
task offloaded to a given UAV; the interaction of control messages between UAVs; the ground
device area covered by one UAV.]
Figure 1. A heterogeneous Multi-UAV edge computing scenario. Each UAV deploys several SFs and
can receive the SFC required from the devices on the ground. The master UAV controls the resources
of the edge computing network, which can instantiate the required SFCs.
$$ r_{m,m'} = B \log_2\left(1 + \frac{p_m \beta_0}{N_0 B d_{m,m'}^2}\right), \quad 1 \le m, m' \le M,\ m \ne m'. \tag{1} $$
where $\beta_0$ is the transmit power gain at a reference distance of one meter. $p_m$ is the
transmission power of $U_m$. $N_0$ is the noise power spectrum density. $B$ is the bandwidth of a
sub-channel. $d_{m,m'}$ is the distance between $U_m$ and $U_{m'}$. Note that when $m = m'$,
$r_{m,m'} = \infty$. In other words, this paper ignores the data exchanging time between two SF
instances at the same UAV. For ease of description, the main notations used in this paper are
listed in Table 1.
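As a quick numerical illustration of Equation (1), the following minimal sketch evaluates the per-sub-channel rate between two UAVs. The function name `link_rate` and all parameter values are illustrative assumptions, not settings taken from this paper; a distance of zero is used here to model two SF instances on the same UAV.

```python
import math

def link_rate(p_m, beta_0, n_0, bandwidth, distance):
    """Per-sub-channel rate of Equation (1), in bit/s.

    p_m: transmission power of U_m (W); beta_0: power gain at 1 m;
    n_0: noise power spectrum density (W/Hz); bandwidth: B (Hz);
    distance: d_{m,m'} (m). distance == 0 models the case m = m'.
    """
    if distance == 0:
        return math.inf  # same UAV: the data exchanging time is ignored
    snr = p_m * beta_0 / (n_0 * bandwidth * distance ** 2)
    return bandwidth * math.log2(1.0 + snr)

# Illustrative values only (hypothetical, not from the experiments).
rate = link_rate(p_m=0.5, beta_0=1e-5, n_0=1e-17, bandwidth=1e6, distance=200.0)
print(f"rate per sub-channel: {rate / 1e6:.2f} Mbit/s")
```

Because the rate falls with the squared distance inside the logarithm, a far-away UAV needs more sub-channels to meet the same rate requirement, which is why link resources appear as a constraint later on.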
Table 1. Notations.

| Notation | Definition |
| --- | --- |
| $U_n$ | The n-th UAV in $U$ |
| SF | Service Function |
| SI | SF instance |
| SFC | Service Function Chain |
| $SF_n$ | The n-th SF |
| $\tilde{S}_1$ | The set of FPGA-independent SFs |
| $\tilde{S}_2$ | The set of FPGA-dependent SFs |
| $Q_1$ | The number of FPGA-independent SFs in $\tilde{S}_1$ |
| $Q_2$ | The number of FPGA-dependent SFs in $\tilde{S}_2$ |
| $S_m$ | The set of SFs deployed on $U_m$ |
| $SF^*_{m,q}$ | The q-th SF deployed on $U_m$ |
| $T$ | The set of tasks currently requesting services |
| $T_n$ | The n-th requesting task |
| $N_m^{cpu}$ | The number of CPU cores on $U_m$ |
| $N_m^{fpga}$ | The number of FPGAs on $U_m$ |
| $N_{m,a}^{cpu}$ | The number of currently idle CPU cores on $U_m$ |
| $N_{m,a}^{fpga}$ | The number of currently idle FPGAs on $U_m$ |
| $n^{si}_{m,q}$ | The number of SIs currently created corresponding to $SF^*_{m,q}$ |
| $f_{m,q}$ | The frequency of the CPU core on $U_m$ |
| $a_{m,q}$ | The processing speed of the FPGA when running the SI of $SF^*_{m,q}$ |
| $L_{m,m'}$ | The wireless link from $U_m$ to $U_{m'}$ |
| $r_{m,m'}$ | The data transmission rate on one sub-channel of the link $L_{m,m'}$ |
| $N_m^c$ | The number of currently idle sub-channels on the link $L_{m,m'}$ |
| $o_n$ | The source UAV that receives $T_n$ |
| $s_n$ | The required SFC of $T_n$ |
| $s_n^k$ | The k-th SF contained in the SFC required by $T_n$ |
| $l_n$ | The properties of $T_n$ before it enters an SFC |
| $l_n^k$ | The properties of $T_n$ when arriving at the instance of $s_n^k$ in its SFC |
| $l_n^{k,0}$ | The length of $T_n$ when arriving at $s_n^k$ |
| $l_n^{k,1}$ | The number of CPU cycles required for processing one bit of the task on $s_n^k$ |
| $l_n^{k,2}$ | The number of AI accelerator operations required for processing one bit of the task on $s_n^k$ |
| $r_n$ | The minimum requirement set for the transmission rate between the instances of the SFC required by $T_n$ |
| $r_n^{k,k+1}$ | The minimum transmission rate requirement from the instance of $s_n^k$ to $s_n^{k+1}$ |
| $c_n$ | The minimum computation resource requirement set of $T_n$ for the instances of the SFC |
| $c_n^{k,0}$ | The minimum CPU processing speed requirement of $T_n$ on the instance of $s_n^k$ |
| $c_n^{k,1}$ | The minimum AI accelerator processing speed requirement of $T_n$ on the instance of $s_n^k$ |
| $v_n$ | The revenue obtained through completing $T_n$ |
| $T_n^c$ | The completion time of $T_n$ |
| $X^m_{n,k}$ | The decision variable of creating the instance of $s_n^k$ |
is expressed as a two-tuple: $\{f_{m,q}, a_{m,q}\}$. $f_{m,q}$ is the frequency of the CPU core on
$U_m$, measured in GHz; $a_{m,q}$ is the processing speed of the FPGA when running the SI of
$SF^*_{m,q}$, measured in GOP/s [38]. Note that $a_{m,q} = 0$ when $SF^*_{m,q}$ is an
FPGA-independent SF.
[Figure 2 legend: the SIs created on $U_1$, each labelled as an SI of an SF; CPUs connected to
FPGAs via PCIe; $U_1$ loads the software components of an SF to CPU cores and FPGAs, or to CPU
cores only.]
Figure 2. The creation process of SIs on $U_1$. The SI of an FPGA-dependent SF (i.e., $SF^*_{1,1}$,
$SF^*_{1,2}$, and $SF^*_{1,3}$) occupies a CPU core and an FPGA. Moreover, the SI of an
FPGA-independent SF (i.e., $SF^*_{1,4}$) only occupies a CPU core.
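$$ n^{si}_{m,q} \le N_m^{cpu}, \quad SF^*_{m,q} \in \tilde{S}_1,\ 1 \le m \le M. \tag{2} $$
$$ n^{si}_{m,q} \le N_m^{fpga}, \quad SF^*_{m,q} \in \tilde{S}_2,\ 1 \le m \le M. \tag{3} $$
$$ \sum_{SF^*_{m,q} \in (S_m \cap \tilde{S}_2)} n^{si}_{m,q} \le N_m^{fpga}, \quad 1 \le m \le M. \tag{4} $$
$$ \sum_{SF^*_{m,q} \in S_m} n^{si}_{m,q} \le N_m^{cpu}, \quad 1 \le m \le M. \tag{5} $$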
Equation (2) ensures that the number of currently created SIs corresponding to any
FPGA-independent SF on Um does not exceed the number of CPU cores on it. Equation (3)
means that the number of currently created SIs corresponding to any FPGA-dependent SF
on Um does not exceed the number of FPGAs on it. Equation (4) guarantees that the total
number of currently created SIs corresponding to all FPGA-dependent SFs on Um does
not exceed the number of FPGAs on it. Equation (5) restricts the total number of currently
created SIs on Um to not exceed the number of CPU cores on it.
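The four capacity constraints can be checked mechanically for one UAV. The sketch below is a minimal illustration under assumed data structures (a dictionary of SI counts and a set of FPGA-dependent SF names); the function name `si_counts_feasible` and the example numbers are hypothetical and only loosely mirror the situation drawn in Figure 2.

```python
def si_counts_feasible(n_si, fpga_dependent, n_cpu, n_fpga):
    """Check Equations (2)-(5) for a single UAV U_m.

    n_si: dict mapping each deployed SF name to its number of created SIs.
    fpga_dependent: set of SF names on this UAV that need an FPGA.
    n_cpu, n_fpga: number of CPU cores and FPGAs on the UAV.
    """
    for sf, count in n_si.items():
        # Per-SF bound: Equation (3) for FPGA-dependent SFs, else Equation (2).
        cap = n_fpga if sf in fpga_dependent else n_cpu
        if count > cap:
            return False
    # Equation (4): all FPGA-dependent SIs together fit on the FPGAs.
    fpga_total = sum(c for sf, c in n_si.items() if sf in fpga_dependent)
    # Equation (5): every SI occupies one CPU core.
    cpu_total = sum(n_si.values())
    return fpga_total <= n_fpga and cpu_total <= n_cpu

# Hypothetical UAV with 4 CPU cores and 3 FPGAs.
counts = {"SF1": 1, "SF2": 1, "SF3": 1, "SF4": 1}
print(si_counts_feasible(counts, {"SF1", "SF2", "SF3"}, n_cpu=4, n_fpga=3))
```

Note that Equations (4) and (5) are the binding ones in this sketch: an SI of an FPGA-dependent SF consumes both an FPGA and a CPU core, so it is counted in both totals.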
computing result from $s_n^{N_n^s}$ and transmits it to the ground, which also has to be
instantiated at the source UAV. $N_n^s$ represents the total number of SFs contained in the
required SFC. When $1 \le k \le N_n^s$, $s_n^k \in S$ represents the k-th SF contained in the
required SFC, which can be instantiated on any slave UAV that satisfies its resource
requirements. $l_n$ means the current properties of $T_n$ when it arrives at each SF contained
in $s_n$, denoted as $l_n = \{l_n^0, l_n^1, l_n^2, \ldots, l_n^k, \ldots, l_n^{N_n^s}, l_n^{N_n^s+1}\}$,
$0 \le k \le N_n^s + 1$. $l_n^k$ is the property of $T_n$ when it arrives at $s_n^k$, denoted as
$l_n^k = \{l_n^{k,0}, l_n^{k,1}, l_n^{k,2}\}$. $l_n^{k,0}$ is the current length of $T_n$; $l_n^{k,1}$
is the number of CPU cycles required for processing one bit of the task; $l_n^{k,2}$ is the number
of AI accelerator operations required for processing one bit of the task. The computation
resources consumed by $s_n^0$ and $s_n^{N_n^s+1}$ are considered to be negligible in this paper,
since their instances require far less computation resources than other AI instances. Therefore,
$l_n^{0,0}$ and $l_n^{N_n^s+1,0}$ represent the initial length of $T_n$ and its computing result,
respectively; $l_n^{0,1}$, $l_n^{0,2}$, $l_n^{N_n^s+1,1}$ and $l_n^{N_n^s+1,2}$ are equal to 0.
$r_n$ represents the minimum transmission rate requirement between the SFs contained in the
required SFC, denoted as $r_n = \{r_n^{0,1}, r_n^{1,2}, \ldots, r_n^{N_n^s-1,N_n^s}, r_n^{N_n^s,N_n^s+1}\}$.
$r_n^{k,k+1}$ ($0 \le k \le N_n^s$) indicates the minimum transmission rate requirement from
$s_n^k$ to $s_n^{k+1}$. $c_n$ represents the minimum computation resource requirement when
creating the instances of SFs contained in the required SFC, denoted as
$c_n = \{c_n^0, c_n^1, c_n^2, \ldots, c_n^k, \ldots, c_n^{N_n^s}, c_n^{N_n^s+1}\}$,
$0 \le k \le N_n^s + 1$. $c_n^k$ is the minimum computation resource requirement of initiating
$s_n^k$, denoted as $c_n^k = \{c_n^{k,0}, c_n^{k,1}\}$. $c_n^{k,0}$ is the minimum processing speed
requirement for the CPU, measured in GHz; $c_n^{k,1}$ is the minimum processing speed
requirement for the AI accelerator, measured in GOP/s. For $c_n^0$ and $c_n^{N_n^s+1}$, their
values are 0. $v_n$ represents the revenue obtained through completing $T_n$.
As shown in Figure 1, an SFC is unidirectional, and the task goes through each SF
sequentially. Furthermore, UAVs adopt the communication mode of orthogonal frequency
division multiple access. Therefore, the communication resources consumed between
adjacent SFs belong to the UAV instantiating the upstream SF. To sum up, for task $T_n$, it
is more reasonable to create the SF instances in the reverse order of the required SFC, i.e.,
the instance of $s_n^{N_n^s+1}$ is created first, and the instance of $s_n^0$ is created last.
Assume that the instance of $s_n^{k+1}$ ($0 \le k \le N_n^s$) has been created on $U_{m'}$. If we
want to continue creating the instance of $s_n^k$ on $U_m$, the following conditions have to be
satisfied at the same time:
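$$ s_n^k = SF^*_{m,q}, \quad 1 \le m \le M,\ 1 \le q \le Q_m \tag{6} $$
$$ c_n^{k,0} \le f_{m,q}, \quad 1 \le m \le M,\ 1 \le q \le Q_m \tag{7} $$
$$ c_n^{k,1} \le a_{m,q}, \quad 1 \le m \le M,\ 1 \le q \le Q_m \tag{8} $$
$$ \frac{r_n^{k,k+1}}{r_{m,m'}} \le N_m^c, \quad m \ne m'. \tag{9} $$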
Equation (6) means that $U_m$ has to deploy the SF matching $s_n^k$. Equations (7) and (8)
ensure that the SI of $s_n^k$ on $U_m$ can satisfy the computing requirement of $T_n$. Equation (9)
guarantees that $U_m$ has enough idle sub-channels for transmitting $T_n$ to $U_{m'}$ at a rate no
smaller than the required one. Moreover, define $t_n^k$ as the stay time of $T_n$ on the SI of
$s_n^k$, which includes the executing time $t_n^{k,1}$ and the transmitting time $t_n^{k,2}$.
$t_n^{k,1}$ can be expressed as:
$$ t_n^{k,1} = \begin{cases} \dfrac{l_n^{k,0}\, l_n^{k,1}}{f_{m,q}} + \dfrac{l_n^{k,0}\, l_n^{k,2}}{a_{m,q}}, & 1 \le k \le N_n^s, \\ 0, & k = 0,\ k = N_n^s + 1. \end{cases} \tag{10} $$
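$$ t_n^{k,2} = \begin{cases} \dfrac{l_n^{k+1}}{\left\lceil \dfrac{r_n^{k,k+1}}{r_{m,m'}} \right\rceil r_{m,m'}}, & m \ne m',\ 0 \le k \le N_n^s, \\ 0, & m = m',\ 0 \le k \le N_n^s. \end{cases} \tag{11} $$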
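To make the placement conditions and the two components of the stay time concrete, the following sketch evaluates Equations (6) to (11) for one SF instance. It is a minimal illustration, not the authors' implementation: the names `SfOnUav`, `is_candidate`, `exec_time` and `trans_time` are hypothetical, speeds are expressed in cycles/s and operations/s so that units stay consistent, the number of occupied sub-channels is assumed to be the required rate divided by the per-sub-channel rate, rounded up, and the FPGA term is skipped when the SF needs no accelerator operations.

```python
import math
from dataclasses import dataclass

@dataclass
class SfOnUav:
    """One SF deployed on a UAV, i.e. SF*_{m,q} (hypothetical container)."""
    name: str
    f: float  # CPU speed f_{m,q}, in cycles/s
    a: float  # FPGA speed a_{m,q}, in operations/s (0 if FPGA-independent)

def is_candidate(sf, required_sf, c_cpu, c_acc, r_req, r_link, idle_channels):
    """Equations (6)-(9): can this UAV host the instance of s_n^k?"""
    same_uav = math.isinf(r_link)  # m = m': no sub-channel is needed
    channels_ok = same_uav or math.ceil(r_req / r_link) <= idle_channels
    return (sf.name == required_sf   # Equation (6)
            and c_cpu <= sf.f        # Equation (7)
            and c_acc <= sf.a        # Equation (8)
            and channels_ok)         # Equation (9)

def exec_time(length, cycles_per_bit, ops_per_bit, sf):
    """Executing time of Equation (10) for 1 <= k <= N_n^s."""
    t = length * cycles_per_bit / sf.f
    if ops_per_bit > 0:  # assumption: no FPGA term when no FPGA work exists
        t += length * ops_per_bit / sf.a
    return t

def trans_time(length_next, r_req, r_link):
    """Transmitting time of Equation (11); zero on the same UAV."""
    if math.isinf(r_link):
        return 0.0
    return length_next / (math.ceil(r_req / r_link) * r_link)

detector = SfOnUav("detect", f=2e9, a=1e11)  # illustrative numbers
print(exec_time(1e6, 100, 1000, detector), trans_time(8e5, 3e6, 2e6))
```

The stay time of a task on one instance is then simply `exec_time(...) + trans_time(...)`, which is the quantity the selection principles in Section 4 rank candidate UAVs by.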
We define the binary variable $X^m_{n,k}$ to represent the decision of creating the SI of $s_n^k$,
which can be expressed as:
$$ X^m_{n,k} = \begin{cases} 1, & \text{if the SI of } s_n^k \text{ is created on } U_m \text{ while satisfying the constraints of Equations (6) to (9)}, \\ 0, & \text{otherwise.} \end{cases} \tag{12} $$
Considering that the SI of each SF contained in the required SFC is created at most one
time, the following constraints must be satisfied:
$$ \sum_{m=1}^{M} X^m_{n,k} \le 1 \tag{13} $$
Note that $X^{o_n}_{n,0} = 1$ and $X^{o_n}_{n,N_n^s+1} = 1$ always hold, since the SIs of $s_n^0$
and $s_n^{N_n^s+1}$ have to be created on the source UAV. $T_n$ can be successfully executed only
if the SIs of all SFs contained in its required SFC have been successfully created. At this time,
the following formula holds.
$$ \prod_{k=0}^{N_n^s+1} \sum_{m=1}^{M} X^m_{n,k} = 1. \tag{14} $$
$$ T_n^c = \sum_{m=1}^{M} \sum_{k=0}^{N_n^s+1} X^m_{n,k}\,(t_n^{k,1} + t_n^{k,2}), \quad 1 \le n \le N \tag{15} $$
s.t.: Equations (1) to (14).
$n \le N$, $1 \le m \le M$, $0 \le k \le N_n^s + 1\}$.
$$ F_1 = \sum_{n=1}^{N} T_n^c \prod_{k=0}^{N_n^s+1} \sum_{m=1}^{M} X^m_{n,k} \tag{16} $$
$$ F_2 = \sum_{n=1}^{N} v_n \prod_{k=0}^{N_n^s+1} \sum_{m=1}^{M} X^m_{n,k} \tag{17} $$
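$$ \begin{aligned} \mathbf{P1}:\ & \min_{\{X\}}\ (F_1, -F_2) \\ \text{s.t.}\ & C_1: 1 \le n \le N,\ 1 \le m \le M,\ 1 \le q \le Q_m,\ 0 \le k \le N_n^s + 1 \\ & C_2: s_n^k = SF^*_{m,q},\ \forall X^m_{n,k} = 1 \\ & C_3: X^{o_n}_{n,0} = 1,\ X^{o_n}_{n,N_n^s+1} = 1 \end{aligned} \tag{18} $$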
Constraint (C1) specifies the valid ranges of the variables involved in Constraints
(C2)–(C4). Constraint (C2) guarantees that a UAV instantiating a required SF has
deployed this SF in advance. Constraint (C3) requires that the SIs of the first SF and last SF of
an SFC be created on the source UAV. Constraint (C4) includes several constraints
related to the resource requirements and the resource capacities, which are described in
detail after Equations (1)–(15).
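For a fixed placement, the two objectives can be evaluated directly from Equations (14) to (17). The sketch below is a minimal illustration under assumed data structures: the placement is stored as a dictionary from (task, k) to a UAV index, so Equation (13) holds by construction, and `stay_time` is a hypothetical callback returning $t_n^{k,1} + t_n^{k,2}$. A task contributes to $F_1$ and $F_2$ only when every SF of its SFC, including the two endpoint SFs, has been placed.

```python
def objectives(tasks, placement, stay_time):
    """Evaluate F1 (Equation (16)) and F2 (Equation (17)).

    tasks: dict task id -> {"n_sf": N_n^s, "revenue": v_n}
    placement: dict (task id, k) -> UAV index, i.e. the entries with X = 1
    stay_time: function (task id, k, uav) -> t^{k,1} + t^{k,2}
    """
    f1 = 0.0
    f2 = 0.0
    for n, task in tasks.items():
        ks = range(task["n_sf"] + 2)  # k = 0 .. N_n^s + 1
        if not all((n, k) in placement for k in ks):
            continue  # the product in Equation (14) is 0 for this task
        # Equation (15): completion time is the sum of the stay times.
        f1 += sum(stay_time(n, k, placement[(n, k)]) for k in ks)
        f2 += task["revenue"]
    return f1, f2

# Hypothetical example: task 1 is fully placed, task 2 misses its last SF.
tasks = {1: {"n_sf": 2, "revenue": 5.0}, 2: {"n_sf": 1, "revenue": 3.0}}
placement = {(1, 0): 0, (1, 1): 1, (1, 2): 2, (1, 3): 0, (2, 0): 1, (2, 1): 2}
print(objectives(tasks, placement, lambda n, k, m: 2.0))
```

The example also shows why the two objectives pull against each other: rejecting a task lowers $F_1$ but forfeits its revenue in $F_2$.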
4. Proposed Approach
In P1, minimizing F1 is a 0–1 nonlinear integer programming problem. Minimizing
−F2 is equivalent to maximizing F2, which is also a 0–1 nonlinear integer programming
problem. At the same time, F1 and F2 are closely coupled.
Furthermore, the task properties considered in this paper are not known in advance.
Therefore, it is difficult to solve P1 effectively online with traditional optimization
algorithms. To tackle this problem, we propose an efficient
online two-stage heuristic algorithm named ToRu with much lower complexity. In ToRu,
the minimization of the completion time sum of tasks is pursued in the case of abundant
resources (i.e., the first stage); on the contrary, when the resources are insufficient (i.e., the
second stage), the maximization of the UAV network revenue is pursued.
[Figure 3 legend: each time-slot contains a set of requesting tasks; the θ symbols represent the
tasks with different service requirements in a given time-slot.]
Figure 3. The time-slot partitioning protocol. The master UAV starts the ToRu algorithm at
the beginning of each time-slot, and the algorithm ends at the end of that time-slot. It is
executed repeatedly, once per time-slot, to provide services for the tasks that requested service
in the previous time-slot. For example, ToRu is executed in time-slot #2 to provide services only
for the tasks that requested service in time-slot #1.
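The batching behaviour described in the caption can be sketched in a few lines. This is a minimal illustration that assumes equal-length time-slots numbered from 1; `run_time_slots` and the `schedule` callback are hypothetical names standing in for the master UAV's loop and for one ToRu run.

```python
from collections import defaultdict

def run_time_slots(arrivals, slot_len, schedule):
    """Serve every task in the slot that follows its arrival slot.

    arrivals: list of (arrival_time, task)
    slot_len: length of one time-slot, same unit as arrival_time
    schedule: callable applied to the batch of tasks of one slot
    Returns a dict: serving slot number -> result of schedule(batch).
    """
    buffered = defaultdict(list)
    for t, task in arrivals:
        arrival_slot = int(t // slot_len) + 1  # slots are numbered from 1
        buffered[arrival_slot].append(task)
    served = {}
    for slot in sorted(buffered):
        # Tasks that arrived in slot #s are scheduled at the start of slot #s+1.
        served[slot + 1] = schedule(buffered[slot])
    return served

demo = run_time_slots([(0.2, "a"), (0.9, "b"), (1.5, "c")], 1.0, sorted)
print(demo)
```

A consequence of this protocol is that a task waits at most one slot length before it is considered, so the slot length bounds the queueing delay added on top of the completion time.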
7: Compute the candidate UAVs of the SF $s_n^k$ for the task $T_n$ according to C2–C4;
8: if there are no candidate UAVs then
9:     $N_{fail}$++;
10:     Release the resources previously allocated to the task $T_n$, and clear the corresponding
        decision variables in X;
11:     Continue;
12: end if
13: Select an optimal candidate UAV $U_{m^*}$ based on the principle proposed in Section 4.3 for
[Figure 4 residue: for each task, the SFC is shown with its uninstantiated SFs, the SFs being
instantiated, and the SFs having been instantiated. For each SF being instantiated, a table lists
its candidate UAVs with the stay time (ms), the number of idle channels, and the number of
channels required, grouped into rich candidate UAVs and poor candidate UAVs.]
candidate UAV, so it is instantiated first. When facing multiple such upstream SFs
(like $s_1^3$), randomly select one of them for instantiation;
2. When all upstream SFs have multiple candidate UAVs, select the upstream SF that has
a candidate UAV on which no sub-channel is occupied, and instantiate it on this
candidate UAV. As shown in Figure 4, $s_2^2$ contained in $T_2$ has multiple candidate UAVs,
and if it is instantiated on the candidate $U_8$, no sub-channel is occupied. Therefore, $s_2^2$
should be instantiated first in the current situation. When facing multiple such upstream
SFs (like $s_2^2$), randomly select one of them for instantiation;
3. If there is no upstream SF that satisfies the above principle 1 or principle 2, the
candidate UAVs of each upstream SF are divided into two categories based on the
number of idle channels: the candidate UAVs with more than $N_e$ idle channels
are called “rich candidate UAVs”, and the remaining UAVs are called
“poor candidate UAVs”. Considering the shortage of UAV wireless link resources, the
upstream SFs should be instantiated preferentially on candidate UAVs with abundant
link resources, i.e., “rich candidate UAVs”, which is beneficial to maximizing F2.
Therefore, we first select an optimal UAV for the upstream SFs with “rich candidate
UAVs”. The specific principles are as follows:
(a) The upstream SFs with only one “rich candidate UAV” are instantiated first. As
shown in Figure 4, $s_3^3$ contained in $T_3$ has only one “rich candidate UAV”, so it is
instantiated first. When facing multiple such upstream SFs (e.g., $s_3^3$), randomly
select one of them for instantiation;
(b) When the remaining upstream SFs have multiple “rich candidate UAVs”, we rank
their candidate UAVs according to the stay time of a task executed on them, and
the candidate UAV with a shorter stay time is ranked higher. The upstream SF with
the largest gap in the stay time between its first-ranked candidate UAV and its
second-ranked one is instantiated first, on its first-ranked candidate UAV,
which is beneficial to minimizing F1. As shown in Figure 4, both $s_4^4$ contained
in $T_4$ and $s_5^3$ contained in $T_5$ have two “rich candidate UAVs”. The gap in the
stay time of $s_4^4$ ($s_5^3$) on its different candidate UAVs is 10 ms (20 ms), so $s_5^3$ is
instantiated first on the candidate $U_9$. When the gap is the same, select one of them at
random for instantiation.
In addition, when all upstream SFs with “rich candidate UAVs” have been instantiated
and there are still uninstantiated upstream SFs, i.e., the upstream SFs with only “poor
candidate UAVs”, we regard the “poor candidate UAVs” as “rich candidate UAVs” and
select the optimal UAVs for the uninstantiated SFs according to the above principles (a)
and (b). As shown in Figure 4, both $s_6^2$ contained in $T_6$ and $s_7^4$ contained in $T_7$ have only
“poor candidate UAVs”, so they are instantiated last according to the above principles (a)
and (b). Note that once an SF is successfully instantiated, all candidate UAVs belonging
to uninstantiated SFs must be updated immediately before starting to select the optimal
UAV for the next uninstantiated SF (i.e., steps 16–25 in Algorithm 2). The above
operations are repeated until the SFCs of all tasks are instantiated; the pseudocode is shown in
Algorithm 2.
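The rich/poor partition and the gap-based ranking of principle 3 can be sketched as follows. This is only an illustration of principles 3(a) and 3(b) together with the fallback to poor candidates, not a reproduction of Algorithm 2: principles 1 and 2 are omitted, the function name `pick_next` and the tuple layout are hypothetical, and the stay times in the example are invented except for the 10 ms and 20 ms gaps quoted above.

```python
import random

def pick_next(upstream, n_e, rng=random):
    """Choose which upstream SF to instantiate next, and on which UAV.

    upstream: dict SF id -> list of (uav, stay_ms, idle_channels)
    n_e: threshold on idle channels separating rich from poor candidates
    Returns (sf, uav), or None when no SF has any candidate.
    """
    rich = {sf: [c for c in cands if c[2] > n_e] for sf, cands in upstream.items()}
    rich = {sf: cands for sf, cands in rich.items() if cands}
    # Fall back to the poor candidates once no SF has a rich one left.
    pool = rich if rich else {sf: c for sf, c in upstream.items() if c}
    if not pool:
        return None
    single = [sf for sf, cands in pool.items() if len(cands) == 1]
    if single:  # principle (a): SFs with exactly one candidate go first
        sf = rng.choice(single)
        return sf, pool[sf][0][0]

    def gap(cands):  # principle (b): gap between the two shortest stay times
        first, second = sorted(c[1] for c in cands)[:2]
        return second - first

    best = max(gap(cands) for cands in pool.values())
    sf = rng.choice([s for s, cands in pool.items() if gap(cands) == best])
    return sf, min(pool[sf], key=lambda c: c[1])[0]

example = {
    "s4": [("U3", 30, 5), ("U6", 40, 5)],  # gap of 10 ms
    "s5": [("U9", 20, 5), ("U2", 40, 5)],  # gap of 20 ms
}
print(pick_next(example, n_e=2))
```

After each returned choice, the caller would instantiate the SF, update the idle channels of the affected UAVs, recompute the candidate lists, and call the function again, which matches the update rule stated above.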
2. An SF is preferentially instantiated on a candidate UAV on which no sub-channel is
occupied;
3. If there is no task that satisfies the above principle 2, the candidate UAVs of each task
are divided into “rich candidate UAVs” and “poor candidate UAVs” according to the
principle in Section 4.2. Then, we do the following:
(a) When the number of “rich candidate UAVs” is greater than 0, the candidate UAV
with the lowest performance is selected, and the high-performance UAVs are left
for subsequent tasks with higher computation requirements, which is beneficial
to maximizing F2;
(b) When the number of “rich candidate UAVs” is equal to 0, the above operations
are performed among “poor candidate UAVs”.
The above operations are repeated until the required SFCs of all tasks are instantiated; the
pseudocode is shown in Algorithm 3.
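A compact way to read principles 2 and 3 of the second stage is as a three-level preference. The sketch below is a hedged illustration rather than Algorithm 3 itself: `pick_uav_second_stage` is a hypothetical name, "performance" is assumed to be a single scalar score per candidate, and the first principle of this stage is not modelled.

```python
def pick_uav_second_stage(cands, n_e):
    """Select the UAV for one SF in the resource-constrained stage.

    cands: list of (uav, performance, idle_channels, channels_needed)
    n_e: threshold on idle channels separating rich from poor candidates
    """
    if not cands:
        return None
    # Principle 2: prefer a candidate that occupies no sub-channel.
    free = [c for c in cands if c[3] == 0]
    if free:
        return free[0][0]
    # Principle 3: lowest-performance rich candidate, else lowest-performance poor one.
    rich = [c for c in cands if c[2] > n_e]
    pool = rich if rich else cands
    return min(pool, key=lambda c: c[1])[0]

cands = [("U1", 3.0, 5, 1), ("U2", 1.5, 5, 2), ("U3", 1.0, 1, 1)]
print(pick_uav_second_stage(cands, n_e=2))
```

Choosing the weakest sufficient UAV is a best-fit style decision: it keeps the fastest CPU and FPGA resources free for later tasks whose minimum speed requirements only those UAVs can meet.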
Experimental process. When the number of input tasks is given, the geographic
location of each task is generated randomly, and then the attributes of each task (such as
task length, required SFC, task complexity, etc.) are generated randomly. Finally, each task
randomly requests service from a UAV that covers it. In addition, for a given number of
input tasks, we simulate 100 times and take the average of the simulation results as the
final value.
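The averaging procedure can be sketched as a small Monte Carlo loop. Everything below is illustrative: `random_task`, `average_metric`, the attribute names, and the value ranges are placeholders chosen for the example and are not the simulation settings of this paper; `evaluate` stands for any scheduler run that returns one scalar metric.

```python
import random
import statistics

def random_task(rng, n_uavs, sf_pool):
    """Draw one task; all ranges are placeholders, not paper settings."""
    return {
        "source_uav": rng.randrange(n_uavs),
        "length_bits": rng.uniform(1e5, 1e6),
        "sfc": rng.sample(sf_pool, k=rng.randint(1, len(sf_pool))),
        "revenue": rng.uniform(1.0, 10.0),
    }

def average_metric(evaluate, n_tasks, n_uavs, sf_pool, runs=100, seed=0):
    """Average a scalar metric over independently generated task sets."""
    rng = random.Random(seed)  # fixed seed makes the average reproducible
    results = []
    for _ in range(runs):
        tasks = [random_task(rng, n_uavs, sf_pool) for _ in range(n_tasks)]
        results.append(evaluate(tasks))
    return statistics.mean(results)

def total_revenue(tasks):
    """Toy metric: total revenue if every task were served."""
    return sum(t["revenue"] for t in tasks)

print(average_metric(total_revenue, n_tasks=20, n_uavs=10, sf_pool=["a", "b", "c"]))
```

Fixing the seed per experiment point keeps the comparison between algorithms fair, since every algorithm is then evaluated on the same 100 random task sets.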
Comparison Benchmarks. To validate the necessity of each design component, we adopt a
step-by-step evaluation philosophy in the experimental
design. Each benchmark algorithm consists of two steps: determining the order in which the SFCs
are instantiated, and applying a principle for instantiating the SIs contained in an SFC. For the
first step, similar to the greedy algorithm [20], two sorting strategies are chosen as comparison
benchmarks: (1) Revenue: the task with the highest payment is served first; (2) Length: the
task with the shortest SFC length is served first. For the second step, three strategies
are chosen as comparison benchmarks: (1) Random: an SF is instantiated on a random
candidate UAV, similar to the random algorithm [32]; (2) Greedy: an SF is instantiated on the
candidate UAV with the best performance, similar to the greedy algorithm [20]; (3) Local:
an SF is only instantiated on a local UAV, similar to the local algorithm [32]. Similar to these
step-wise algorithms [20,32], we have six combined algorithms for comparison, marked
as “Revenue + Random”, “Revenue + Greedy”, “Revenue + Local”, “Length + Random”,
“Length + Greedy”, and “Length + Local”.
1. “Revenue + Random”: this algorithm first selects the task with the highest payment and
performs SFC scheduling for it; next, when instantiating one SF contained in an SFC, it always
randomly selects one from the candidate UAVs to instantiate this SF.
2. “Revenue + Greedy”: this algorithm first selects the task with the highest payment
and performs SFC scheduling for it; next, when instantiating one SF contained in an
SFC, it always selects the best performance one from the candidate UAVs to instantiate
this SF.
3. “Revenue + Local”: this algorithm first selects the task with the highest payment and
performs SFC scheduling for it; next, when instantiating one SF contained in an SFC,
it always selects the local one from the candidate UAVs to instantiate this SF.
4. “Length + Random”: this algorithm first selects the task with the shortest SFC length
and performs SFC scheduling for it; next, when instantiating one SF contained in an
SFC, it always randomly selects one from the candidate UAVs to instantiate this SF.
5. “Length + Greedy”: this algorithm first selects the task with the shortest SFC length
and performs SFC scheduling for it; next, when instantiating one SF contained in an
SFC, it always selects the best performance one from the candidate UAVs to instantiate
this SF.
6. “Length + Local”: this algorithm first selects the task with the shortest SFC length and
performs SFC scheduling for it; next, when instantiating one SF contained in an SFC,
it always selects the local one from the candidate UAVs to instantiate this SF.
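Since the six benchmarks are the cross product of two ordering rules and three placement rules, they can be expressed compactly. The sketch below is an assumed reading of the descriptions above, not the code used in the experiments: the dictionary keys `revenue` and `sfc`, the scalar `performance` score, and the helper `place` are hypothetical.

```python
import random

# Step 1: the order in which tasks (and hence their SFCs) are served.
ORDERINGS = {
    "Revenue": lambda tasks: sorted(tasks, key=lambda t: -t["revenue"]),
    "Length": lambda tasks: sorted(tasks, key=lambda t: len(t["sfc"])),
}
PLACEMENTS = ("Random", "Greedy", "Local")

def place(strategy, cands, local_uav, rng=random):
    """Step 2: pick a UAV for one SF. cands is a list of (uav, performance)."""
    if not cands:
        return None
    if strategy == "Random":
        return rng.choice(cands)[0]
    if strategy == "Greedy":
        return max(cands, key=lambda c: c[1])[0]  # best-performing candidate
    if strategy == "Local":
        # Only the local UAV is acceptable; fail if it is not a candidate.
        return local_uav if any(u == local_uav for u, _ in cands) else None
    raise ValueError(f"unknown strategy: {strategy}")

BENCHMARKS = [f"{o} + {p}" for o in ORDERINGS for p in PLACEMENTS]
print(BENCHMARKS)
```

Separating the two steps this way makes it easy to attribute a performance difference to either the task ordering or the placement rule when reading the result figures.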
of the number of tasks, the completion time sum of tasks also increases. This is because
the length and type of newly added tasks are random, so UAV computation resources are
used more fully and more tasks are executed successfully. Furthermore, as the number of
tasks continues to increase, the completion time sum of tasks in all algorithms no longer
increases and even starts to decrease (e.g., “Length + Greedy”). The reason is that tasks
with less execution time are prioritized. To sum up, when the number of tasks exceeds
the critical point (i.e., when not all tasks can be executed), it is meaningless to evaluate the
completion time sum of tasks.
[Plot residue: the total completion time (ms, ×10^4) of ToRu and the six benchmark algorithms,
with the critical point marked.]
[Figure 6 plot: the total completion time (ms, ×10^4) versus the number of tasks (10 to 50).]
Figure 6. The completion time sum with different algorithms in the stage of abundant resources.
[Figure 7 plot: the average execution success ratio versus the number of tasks (10 to 190) for
ToRu and the six benchmark algorithms.]
Figure 7. The task execution success ratio with different algorithms.
[Plot residue: the overall revenue of ToRu and the six benchmark algorithms.]
[Figure 9 plot: the horizontal axis is the number of tasks (10 to 190).]
Figure 9. The channel utilization with different algorithms.
[Figure 10 plot: the computation resource utilization versus the number of tasks (10 to 190) for
ToRu and the six benchmark algorithms.]
Figure 10. The computation resource utilization with different algorithms.
shown in Table 3. However, as the number of tasks requesting service continues to increase,
the shortage of resources becomes more obvious and can be detected within a few
loop iterations of Algorithm 2. Therefore, the operation time of Algorithm 2 can be
ignored, and the operation time of ToRu only includes that of Algorithm 3. As shown in
Table 3, the operation time of the ToRu algorithm decreases sharply once the number of
tasks requesting service exceeds 110, and then increases slowly and steadily. This is
consistent with our complexity analysis results in Section 4.4, indicating that our proposed
algorithm has good execution efficiency and scalability.
6. Conclusions
This paper formulates the SFC scheduling problem as a 0–1 nonlinear integer program-
ming problem in the multi-UAV edge computing network with CPU + FPGA computation
architecture. A two-stage heuristic algorithm named ToRu is put forward to derive a
sub-optimal solution of the problem. At the first stage, the SFCs of all tasks are scheduled
to UAV edge servers in parallel based on our proposed pairing principle between SFCs
and UAVs for minimizing the completion time sum of tasks; at the second stage, a revenue
maximization heuristic is adopted to schedule the arrived SFCs in a serial service mode.
A series of experiments were conducted to evaluate the performance of our proposal. The
results show that our algorithm outperforms other benchmark algorithms in the completion
time sum of tasks, the overall revenue, and the task execution success ratio.
The main limitation of the ToRu algorithm lies in the fact that it is designed to realize
online long-term SFC scheduling based on a stable UAV network topology. In other
words, it cannot be applied directly in the scenario where UAVs frequently join and exit.
In our future work, we plan to design a supplemental algorithm with network topology
prediction capability, which can help ToRu adapt to the dynamic scenario.
Author Contributions: Conceptualization, Y.W. and H.W.; methodology, Y.W. and X.W.; software,
K.Z.; validation, J.F. and J.C.; formal analysis, Y.H.; investigation, R.J.; resources, K.Z.; data curation,
K.Z.; writing—original draft preparation, Y.W.; writing—review and editing, X.W.; visualization, J.C.;
supervision, H.W.; project administration, Y.H. All authors have read and agreed to the published
version of the manuscript.
Funding: This research was supported by the Natural Science Foundation of China under Grant
No. 62171465.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Acknowledgments: The authors would like to thank all coordinators and supervisors involved
and the anonymous reviewers for their detailed comments that helped to improve the quality of
this article.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Chen, W.; He, R.; Wang, G.; Zhang, J.; Wang, F.; Xiong, K.; Ai, B.; Zhong, Z. Ai assisted phy in future wireless systems: Recent
developments and challenges. China Commun. 2021, 18, 285–297. [CrossRef]
2. Sarikhani, R.; Keynia, F. Cooperative spectrum sensing meets machine learning: Deep reinforcement learning approach. IEEE
Commun. Lett. 2020, 24, 1459–1462. [CrossRef]
3. Zheng, S.; Chen, S.; Qi, P.; Zhou, H.; Yang, X. Spectrum sensing based on deep learning classification for cognitive radios. China
Commun. 2020, 17, 138–148. [CrossRef]
4. Xie, J.; Liu, C.; Liang, Y.-C.; Fang, J. Activity pattern aware spectrum sensing: A cnn-based deep learning approach. IEEE Commun.
Lett. 2019, 23, 1025–1028. [CrossRef]
5. Deng, S.; Zhao, H.; Fang, W.; Yin, J.; Dustdar, S.; Zomaya, A.Y. Edge intelligence: The confluence of edge computing and artificial
intelligence. IEEE Internet Things J. 2020, 7, 7457–7469. [CrossRef]
6. Wang, J.; Wei, X.; Fan, J.; Duan, Q.; Liu, J.; Wang, Y. Request pattern change-based cache pollution attack detection and defense in
edge computing. Digit. Commun. Netw. 2022. [CrossRef]
7. Liu, Z.; Cao, Y.; Gao, P.; Hua, X.; Zhang, D.; Jiang, T. Multi-uav network assisted intelligent edge computing: Challenges and
opportunities. China Commun. 2022, 19, 258–278. [CrossRef]
8. Wu, W.; Zhou, F.; Wang, B.; Wu, Q.; Dong, C.; Hu, R.Q. Unmanned Aerial Vehicle Swarm-Enabled Edge Computing: Potentials,
Promising Technologies, and Challenges. IEEE Wirel. Commun. 2022, 29, 78–85. [CrossRef]
9. Zhao, N.; Lu, W.; Sheng, M.; Chen, Y.; Tang, J.; Yu, F.R.; Wong, K.K. Uav-assisted emergency networks in disasters. IEEE Wirel.
Commun. 2019, 26, 45–51. [CrossRef]
10. Cao, B.; Li, M.; Liu, X.; Zhao, J.; Cao, W.; Lv, Z. Many-Objective Deployment Optimization for a Drone-Assisted Camera Network.
IEEE Trans. Netw. Sci. Eng. 2021, 8, 2756–2764. [CrossRef]
11. Wang, X.; Han, Y.; Leung, V.C.M.; Niyato, D.; Yan, X.; Chen, X. Convergence of edge computing and deep learning: A
comprehensive survey. IEEE Commun. Surv. Tutor. 2020, 22, 869–904. [CrossRef]
12. Dong, C.; Shen, Y.; Qu, Y.; Wang, K.; Zheng, J.; Wu, Q.; Wu, F. UAVs as an intelligent service: Boosting edge intelligence for
air-ground integrated networks. IEEE Netw. 2021, 35, 167–175. [CrossRef]
13. Behravesh, R.; Harutyunyan, D.; Coronado, E.; Riggio, R. Time-sensitive mobile user association and SFC placement in MEC-enabled
5G networks. IEEE Trans. Netw. Serv. Manag. 2021, 18, 3006–3020. [CrossRef]
14. Xu, Z.; Gong, W.; Xia, Q.; Liang, W.; Rana, O.F.; Wu, G. NFV-enabled IoT service provisioning in mobile edge clouds. IEEE Trans.
Mob. Comput. 2021, 20, 1892–1906. [CrossRef]
15. Wang, Y.; Wei, X.; Wang, H.; Fan, J.; Chen, J.; Zhao, K.; Hu, Y. Joint UAV deployment, SF placement, and collaborative task
scheduling in heterogeneous multi-UAV-empowered edge intelligence. IET Commun. Early Access Artic. 2023. [CrossRef]
16. Xu, C.; Jiang, S.; Luo, G.; Sun, G.; An, N.; Huang, G.; Liu, X. The case for FPGA-based edge computing. IEEE Trans. Mob. Comput.
2022, 21, 2610–2619. [CrossRef]
17. Li, J.; Un, K.-F.; Yu, W.-H.; Mak, P.-I.; Martins, R.P. An FPGA-based energy-efficient reconfigurable convolutional neural network
accelerator for object recognition applications. IEEE Trans. Circuits Syst. II Express Briefs 2021, 68, 3143–3147. [CrossRef]
18. Yu, X.; Niu, W.; Zhu, Y.; Zhu, H. UAV-assisted cooperative offloading energy efficiency system for mobile edge computing. Digit.
Commun. Netw. 2022. [CrossRef]
19. Zhang, J.; Zhou, L.; Zhou, F.; Seet, B.-C.; Zhang, H.; Cai, Z.; Wei, J. Computation-efficient offloading and trajectory scheduling for
multi-UAV assisted mobile edge computing. IEEE Trans. Veh. Technol. 2020, 69, 2114–2125. [CrossRef]
20. Luo, Y.; Ding, W.; Zhang, B. Optimization of task scheduling and dynamic service strategy for multi-UAV-enabled mobile-edge
computing system. IEEE Trans. Cogn. Commun. Netw. 2021, 7, 970–984. [CrossRef]
21. Wang, Y.; Ru, Z.-Y.; Wang, K.; Huang, P.-Q. Joint deployment and task scheduling optimization for large-scale mobile users in
multi-UAV-enabled mobile edge computing. IEEE Trans. Cybern. 2020, 50, 3984–3997. [CrossRef]
22. Chang, H.; Chen, Y.; Zhang, B.; Doermann, D. Multi-UAV mobile edge computing and path planning platform based on
reinforcement learning. IEEE Trans. Emerg. Top. Comput. Intell. 2022, 6, 489–498. [CrossRef]
23. Ren, T.; Niu, J.; Dai, B.; Liu, X.; Hu, Z.; Xu, M.; Guizani, M. Enabling efficient scheduling in large-scale UAV-assisted mobile-edge
computing via hierarchical reinforcement learning. IEEE Internet Things J. 2022, 9, 7095–7109. [CrossRef]
24. Xue, J.; Wu, Q.; Zhang, H. Cost optimization of UAV-MEC network calculation offloading: A multi-agent reinforcement learning
method. Ad Hoc Netw. 2022, 136, 102981. [CrossRef]
25. Seid, A.M.; Boateng, G.O.; Mareri, B.; Sun, G.; Jiang, W. Multi-Agent DRL for Task Offloading and Resource Allocation in
Multi-UAV Enabled IoT Edge Network. IEEE Trans. Netw. Serv. Manag. 2021, 18, 4531–4547. [CrossRef]
26. Wu, Z.; Yang, Z.; Yang, C.; Lin, J.; Liu, Y.; Chen, X. Joint deployment and trajectory optimization in UAV-assisted vehicular edge
computing networks. J. Commun. Netw. 2022, 24, 47–58. [CrossRef]
27. Moura, J.; Hutchison, D. Game Theory for Multi-Access Edge Computing: Survey, Use Cases, and Future Trends. IEEE Commun.
Surv. Tutor. 2019, 21, 260–288. [CrossRef]
28. Liu, L.; Zhang, S.; Zhang, L.; Pan, G.; Yu, J. Multi-UUV Maneuvering Counter-Game for Dynamic Target Scenario Based on
Fractional-Order Recurrent Neural Network. IEEE Trans. Cybern. Early Access Artic. 2022. [CrossRef]
29. Asheralieva, A.; Niyato, D. Hierarchical game-theoretic and reinforcement learning framework for computational offloading
in UAV-enabled mobile edge computing networks with multiple service providers. IEEE Internet Things J. 2019, 6, 8753–8769.
[CrossRef]
30. Wu, Q.; Chen, J.; Xu, Y.; Qi, N.; Fang, T.; Sun, Y.; Jia, L. Joint Computation Offloading, Role, and Location Selection in Hierarchical
Multicoalition UAV MEC Networks: A Stackelberg Game Learning Approach. IEEE Internet Things J. 2022, 9, 18293–18304.
[CrossRef]
31. Zhou, H.; Wang, Z.; Min, G.; Zhang, H. UAV-aided Computation Offloading in Mobile Edge Computing Networks: A Stackelberg
Game Approach. IEEE Internet Things J. Early Access Artic. 2022. [CrossRef]
32. Qu, Y.; Dai, H.; Wang, H.; Dong, C.; Wu, F.; Guo, S.; Wu, Q. Service provisioning for UAV-enabled mobile edge computing. IEEE J.
Sel. Areas Commun. 2021, 39, 3287–3305. [CrossRef]
33. Wang, G.; Zhou, S.; Zhang, S.; Niu, Z.; Shen, X. SFC-Based Service Provisioning for Reconfigurable Space-Air-Ground Integrated
Networks. IEEE J. Sel. Areas Commun. 2020, 38, 1478–1489. [CrossRef]
34. Li, J.; Shi, W.; Wu, H.; Zhang, S.; Shen, X. Cost-Aware Dynamic SFC Mapping and Scheduling in SDN/NFV-Enabled
Space–Air–Ground-Integrated Networks for Internet of Vehicles. IEEE Internet Things J. 2022, 9, 5824–5838. [CrossRef]
35. Xia, J.; Wang, P.; Li, B.; Fei, Z. Intelligent task offloading and collaborative computation in multi-UAV-enabled mobile edge
computing. China Commun. 2022, 19, 244–256. [CrossRef]
36. Liu, S.; Yang, T. Delay aware scheduling in UAV-enabled OFDMA mobile edge computing system. IET Commun. 2020, 14, 3203–3211.
[CrossRef]
37. Tan, G.; Shui, C.; Wang, Y.; Yu, X.; Yan, Y. Optimizing the Linpack algorithm for large-scale PCIe-based CPU-GPU heterogeneous
systems. IEEE Trans. Parallel Distrib. Syst. 2021, 32, 2367–2380. [CrossRef]
38. Kowsalya, T. Area and power efficient pipelined hybrid merged adders for customized deep learning framework for FPGA
implementation. Microprocess. Microsyst. 2020, 72, 102906. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.