
Latency Overhead of ROS2 for Modular Time-Critical Systems
Tobias Kronauer∗, Joshwa Pohlmann∗, Maximilian Matthé∗, Till Smejkal† and Gerhard Fettweis∗
∗ Barkhausen Institute, Dresden, Germany, firstname.lastname@barkhauseninstitut.org
† Operating Systems Group, TU Dresden, Dresden, Germany, till.smejkal@tu-dresden.de

Abstract—The Robot Operating System 2 (ROS2) targets distributed real-time systems. Especially in tight real-time control loops, latency in data processing and communication can lead to instabilities. As ROS2 encourages splitting of the data-processing pipelines into several modules, it is important to understand the latency implications of such modularization.
In this paper, we investigate the end-to-end latency of ROS2 data-processing pipelines with different Data Distribution Service (DDS) middlewares. In addition, we profile the ROS2 stack and point out latency bottlenecks. Our findings indicate that end-to-end latency strongly depends on the used DDS middleware. Moreover, we show that ROS2 can lead to 50 % latency overhead compared to using low-level DDS communications. Our results imply guidelines for designing modular ROS2 architectures and indicate possibilities for reducing the ROS2 overhead.
Index Terms—distributed systems, mobile robotics, latency, profiling, ROS2

arXiv:2101.02074v1 [cs.RO] 6 Jan 2021

Fig. 1: API layout from ROS2, taken from [10].
This research was co-financed by public funding of the state of Saxony, Germany. We thank RTI for the license and support.

I. INTRODUCTION
Through the advent of robotic applications, such as autonomous driving or household robotics, the Robot Operating System (ROS) emerged as one of the most widely used software development frameworks. With tens of thousands of users [1], it is widely used in academia as well as in industry [2].
Shortcomings of ROS1, but also its popularity, led to the development of its successor, ROS2, which is developed by the Open Source Robotics Foundation (OSRF) with many industrial contributors [3], [4]. Although under heavy development, it is already used by notable companies [5] and the open-source community [6]. Reasons for the development of ROS2 are new use cases, e.g. multiple robots, small embedded platforms, and real-time capability. Further, new technologies are available, such as the Data Distribution Service (DDS) [7], [8].
ROS2 and its predecessor share the same core concept. Therefore, we refer to software architectures in ROS1 or ROS2 as ROS systems. Among others, a ROS system comprises nodes, messages, and topics. A node is a software module performing computation. As ROS features the notion of modularity, a ROS system is usually composed of several nodes. These nodes exchange information by passing messages. These messages only contain typed data structures that are standard primitive data types. A message is published by a node to a given topic, which is described by a string. Nodes that are interested in the data contained in this message subscribe to that topic. Each node can have multiple subscribers and publishers [9].
Although still using the publish/subscribe mechanism of ROS1, ROS2 builds its transport layer on a new middleware. The middleware is an implementation of the DDS standard [7] that is widely used for distributed, real-time systems. Without changing much of the user code of ROS1, the goal was to hide the DDS middleware and its API from the ROS2 user as shown in Fig. 1. For that purpose, middleware interface modules (rmw, short for ROS middleware) were introduced. The most recent release of ROS2, Foxy Fitzroy, supports eProsima FastRTPS, Eclipse Cyclone DDS, and RTI Connext as DDS middleware [10], [11].
Since it is not the main focus of the paper, we will only briefly summarize DDS. The summary is based on the most recent DDS specification 1.4 [7]. It describes a Data-Centric Publish-Subscribe model (DCPS) for distributed applications enabling reliable, configurable, and real-time capable information exchange between information points. Similar to ROS1 and ROS2, there are publishers that publish information of different data types. Each publisher has a DataWriter that can be regarded as a source of information providing data of predefined type. The subscriber, on the other hand, is responsible for receiving the data sent by the publisher. A DataReader is responsible for reading the information obtained by the subscriber. As in ROS, a topic fulfills the task to connect the publisher with one or more subscribers. A distinguishing feature of DDS is the use of quality of service (QoS) policies. Each entity has an assigned QoS policy determining the behavior of each entity concerning
communication and discovery. There is a huge variety of QoS parameters that can be set.
The described concept advocates the use of a distributed, modularized system. Robotics applications that are based on a ROS system often use a software architecture of many nodes with well-defined interfaces [12]. Complex systems like autonomous cars can benefit from this approach by being easier to evolve and to adapt. However, this adds a certain latency to the system. Since real-time capability is one of the main feature claims of ROS2, the question arises how much latency is entailed by a system of many nodes compared to a system that performs the same computation in a single node. This leads to the central question of this paper: How does a ROS2 system cope with scalability? Latency is a critical factor for new technologies, e.g. autonomous cars, edge cloud systems, and plenty of other applications enabled by 5G [13]–[15].
The paper is structured as follows: In Sec. II, we summarize available publications that analyze ROS2 in terms of latency, followed by the description of the contribution of our work. In Sec. III, we briefly explain our methodology for the latency evaluation. Results are presented in Sec. IV. This work is concluded with a future outlook in Sec. V.

II. RELATED WORK
At the early stage of development, ROS2 was evaluated by [16], who used an alpha version and compared it with ROS1. Messages with sizes between 256 B and 4 MB were published at a 10 Hz rate. The authors also varied the DDS middleware used by ROS2, i.e. RTI Connext, OpenSplice, and FastRTPS. The number of nodes subscribing to a topic that another node publishes to was varied as well. Latency, throughput via Ethernet, memory consumption, and threads per node were chosen as measurement metrics. Additionally, the overhead entailed by ROS2 to the latency is evaluated. They emphasize the importance of the message size for end-to-end latencies via Ethernet with a constant latency and ROS2 overhead until 64 KB with the QoS policy being the major part. Further, they recommend a fragment size of 64 KB that is used by the DDS middleware UDP protocol.
The results obtained were confirmed by [17], who evaluated ROS2 for autonomous cars on two KIA Niros with respect to real-time capability. They found that the results depend on the Linux kernel. Their software architecture is composed of 14 nodes on an Intel i7 NUC. Messages were sent at a frequency of 33 Hz and statistics of the time difference between two communicating nodes were calculated. By using a real-time kernel for Linux, the jitter of the time difference could be significantly reduced. Using this information, the overall car behavior was analyzed under different scenarios.
The first official ROS2 release in December 2017, Ardent Apalone, was evaluated by [18] using a real-time Linux kernel over Ethernet. By varying system loads, e.g. CPU load and concurrent traffic, round-trip latency was measured. One node was launched on a normal computer, another node was launched on an embedded device. A message was sent from the normal computer to the embedded device that sends the message back. The authors point out how this round-trip latency is influenced by the network and OS stack.
The previous papers emphasize the influence of various QoS parameters and DDS middlewares on the latency and throughput. [19] propose an architecture that dynamically binds different DDS middlewares to ROS2 nodes to exploit performance benefits of the DDS middlewares under varying circumstances.
[20] focused on improving the Inter Process Communication and compared it to the ROS2 implementation, obtaining a lower latency and higher throughput. [21] evaluated latency and throughput in combination with encryption.

A. Contribution
To the best of the authors' knowledge, [16] did the most comprehensive evaluation from a pure ROS2 perspective. Yet, the authors used a ROS2 alpha version and there have been a couple of releases in the meantime. Although [17] did evaluations on realistic hardware, they only provided a brief, very high-level evaluation of their software architecture. The evaluations were mostly constrained to two or three nodes, i.e. it was not evaluated if the results can be scaled to a node system comprised of many nodes.
In this paper, we answer that question and provide the user with some guidelines to follow if the latency of a ROS system is to be decreased. For that, we perform an intra-layer profiling to concisely evaluate the bottlenecks in the communication in order to show possible implementation improvements for future ROS2 releases. Parameter values, such as QoS parameters, publisher frequency, etc. will be derived from a use case of a modular control loop. We do not focus on the DDS middlewares as there are benchmarks on the homepage of the vendors. In addition, configuration strongly depends on the use case.

III. METHODOLOGY
At first, we define our use case in Sec. III-A, which is followed by the description of the parameter space in Sec. III-B. Afterwards, we introduce measurement metrics and present our evaluation framework in Sec. III-C and Sec. III-D.

A. Use Case
We focus on the use case of a robot platform that is modular in terms of hardware, i.e. different sensors, and software, i.e. different data processing nodes. As pointed out in [9], the ROS system is meant to be composed of multiple nodes. Therefore, each sensor is represented by a node that accesses the sensor hardware and publishes a message containing the sensor readings. We assume subsequent nodes that post-process the data and afterwards pass it to an estimator, followed by a control loop, and afterwards to the actuator. To generalize such a system, we define a setup that passes a message from a starting node, in this case the sensor, through a couple of nodes, to the end node, namely the actuator. Due to the modularity, we assume different sensors, e.g. an IMU, a LiDAR, a camera, etc. with different sampling rates and data sizes. Fig. 2 visualizes the setup.
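The chain-of-nodes setup described for the use case can be illustrated with a minimal sketch. It is not the evaluation framework used in the paper and does not involve ROS2 or DDS: plain Python threads and queues stand in for nodes and topics, and the names `run_pipeline` and `median_latency_us` are hypothetical helpers introduced only for this illustration. The start node stamps each message at publication, relay nodes forward it unchanged, and the end node records the end-to-end latency on arrival.

```python
import queue
import statistics
import threading
import time


def run_pipeline(num_nodes, num_messages, period_s):
    """Send timestamps through a chain of relay threads; return latencies in ns.

    Assumption: queues stand in for topics, threads for nodes (num_nodes >= 2).
    """
    # One queue per hop: node i writes to links[i], node i + 1 reads from it.
    links = [queue.Queue() for _ in range(num_nodes - 1)]
    latencies_ns = []

    def relay(inbox, outbox):
        # Intermediate node: forward each message unchanged to the next hop.
        while True:
            stamp = inbox.get()
            outbox.put(stamp)
            if stamp is None:  # sentinel: propagate shutdown and stop
                return

    def end_node(inbox):
        # End node: latency is arrival time minus the publication timestamp.
        while True:
            stamp = inbox.get()
            if stamp is None:
                return
            latencies_ns.append(time.perf_counter_ns() - stamp)

    threads = [
        threading.Thread(target=relay, args=(links[i], links[i + 1]))
        for i in range(num_nodes - 2)
    ]
    threads.append(threading.Thread(target=end_node, args=(links[-1],)))
    for thread in threads:
        thread.start()

    # Start node: publish a timestamp at the configured publisher period.
    for _ in range(num_messages):
        links[0].put(time.perf_counter_ns())
        time.sleep(period_s)
    links[0].put(None)
    for thread in threads:
        thread.join()
    return latencies_ns


def median_latency_us(latencies_ns, warmup=10):
    """Median latency in microseconds after dropping the first `warmup` samples."""
    return statistics.median(latencies_ns[warmup:]) / 1000.0


if __name__ == "__main__":
    samples = run_pipeline(num_nodes=3, num_messages=50, period_s=0.01)
    print(f"median latency: {median_latency_us(samples):.1f} us")
```

The absolute numbers from such a sketch say nothing about ROS2; it only shows the measurement principle of stamping a message at the start node and evaluating it at the end node.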

B. Parameter Space
The sampling frequency of the sensor is a variable parameter. Assuming that the sensor reading node only reads the sensor data and immediately publishes the message, this corresponds to the publisher frequency. The data size of the sensor reading is variable as well. In the ROS system, the payload of the message represents this quantity. The number of nodes, which includes the start and end node in a data-processing pipeline, can be modified as well. Moreover, we can change the Quality of Service settings. The aforementioned parameters remain constant throughout the data-processing pipeline.

Fig. 2: Visualization of evaluation setup. Parameters are highlighted in italic.

C. Measurement Metrics
Latency is our main measurement metric. In accordance with the previous papers mentioned in Sec. II, the latency is defined as the time between publishing a message and receiving the message, i.e. the call of the callback of the subscriber.
The focus of the evaluation is the latency entailed by the ROS system and not by the DDS middleware. Therefore, we want to profile the call stack between publishing and subscriber callback for evaluating the overhead of ROS2 core and the middleware interfaces. As statistical quantity, we choose the median as it is also used by [16] and is resilient to outliers.

D. Evaluation Framework
We found two ROS2 evaluation frameworks available that are actively maintained: ros2-performance by irobot-ros [22] and performance_test by ApexAI [23]. performance_test benchmarks only ROS2 Dashing Diademata by default whereas ros2-performance supports later ROS2 versions as well. Further, it is based on performance_test and was also used for evaluating the intra-process communication of ROS2 [24]. The focus is more on defining a node system and evaluating the ROS2 node system from a ROS2 perspective as opposed to performance_test that aims at fine-tuning QoS parameters and the DDS middleware. Therefore, we forked ros2-performance. All repositories related to our evaluation can be found on GitHub [25].
Fig. 3 depicts the implementation of an intra-layer profiling for ROS2 messages. The implementation is bundled within a docker container containing the custom ROS2 build. The latency overhead of docker is negligible [26] provided that it is run in the host network. For the implementation, we added a ros2profiling package providing functions to generate timestamps since the Epoch in nanoseconds. It patches the ROS2 source tree to include timestamps into processed messages. Therefore, a minimum message size of 100 B needs to be ensured. The timestamps correspond to the position of the notes in Fig. 3 in the ROS2 callstack. Timestamps are recorded when entering and leaving the method of the layer. Arrow annotations denote the called function and asterisks indicate placeholders for the actual function name, as this depends on the middleware.

Fig. 3: Sequence diagram of necessary rmw adaptions to enable profiling. Notes indicate the layers. Colored boxes highlight the profiling categories.
(Diagram "Publish / Subscriber Sequence" over the layers App, rclcpp, rcl, rmw_*, DDS. Publisher: Publisher::publish [RCLCPP_INTERPROCESS_PUBLISH], rcl_publish [RCL_PUBLISH], rmw_publish [RMW_PUBLISH], dds_write* [DDS_WRITE], to network. Subscriber: spin, loop [forever] of Executor::wait_for_work, rcl_wait, rmw_wait, dds_wait**; from network [DDS_ON_DATA]; take_and_do_error_handling, take_type_erased [RCLCPP_TAKE_ENTER], rcl_take [RCL_TAKE_ENTER], take_with_info [RMW_TAKE_ENTER], dds_take*** [DDS_TAKE_ENTER]; DDS_TAKE_LEAVE, RMW_TAKE_LEAVE, RCL_TAKE_LEAVE, RCLCPP_TAKE_LEAVE; handle_message [RCLCPP_HANDLE], listener.)

Timestamps were categorized into the following categories:
• DDS: This category only contains latency entailed by the DDS transport via network and by the function call to actually receive the message.
• Subscriber rmw and Publisher rmw: Latency attributed to the rmw layer. Middleware-specific conversions from ROS2 messages to DDS messages that might require DDS utility functions but are not used for the transport of the message itself are contained in this category.
• Publisher and Subscriber ROS2 Common: Overhead entailed by ROS2 that is independent of the middleware.

• Benchmarking: This latency is entailed by code used for benchmarking the latency.
• Rclcpp Notification Delay: The time difference between the notification of DDS to ROS2 that new data is available and triggering of its actual retrieval.
Latency is measured on one machine, as we are interested in the ROS2 overhead and not in the over-the-network performance, which is dictated by DDS itself. Nodes of the data-processing pipeline are created in the same process but in different threads. Each node is associated with its own StaticSingleThreadedExecutor spinning in its own thread.

IV. EVALUATION
We compare two different hardware settings, namely a desktop PC with an Intel i7-8700 CPU @ 3.2 GHz x 6 and 32 GB RAM, and a Raspberry Pi 4 Model B Rev 1.1 with 4 GB RAM. Kernel versions are 5.4.0-42-generic for the desktop PC and 5.4.0-1015-raspi for the Raspberry Pi. Foxy Fitzroy 20200807 is used as ROS2 version for our evaluation. We set the network device as localhost with UDP as transport layer. The kernels were updated to the most recent versions. At the time of writing, the Connext middleware was not available for the Raspberry Pi. Therefore, results can only be obtained for FastRTPS and CycloneDDS. Connext version is 5.3.1, FastRTPS is 2.0.1, and CycloneDDS is 0.6.0.
The variable parameters are listed in Tab. I. We assume a publisher frequency of maximum 100 Hz, which is realistic for e.g. time-of-flight sensors. GPS sensors operate at an update rate of 1 Hz up to 10 Hz. We assume an update rate for cameras up to 100 Hz. Therefore, the publisher frequency is in a range from 1 Hz to 100 Hz with a step size of 10 Hz. As variable QoS policies, we modify the reliability from RELIABLE to BEST_EFFORT, since they are frequently used. The number of nodes in the data-processing pipeline varies between three and 23 nodes with a step size of two. These values are arbitrarily chosen but inspired by [17] who used 14 nodes for their autonomous car setup.

TABLE I: Variable parameter values.
Parameter                  Values
Publisher frequency [Hz]   1, 10, ..., 90, 100
Payload                    100 B, 1 KB, 10 KB, 100 KB, 500 KB
Number of nodes            3, 5, ..., 21, 23
DDS backend                Connext, FastRTPS, CycloneDDS
Reliability                reliable, best effort

In all cases, we use a time duration for one configuration run of 60 seconds. Depending on the publisher frequency, this results in a varying number of samples, i.e. the higher the publisher frequency, the higher the number of samples. We discard the first ten samples to mitigate initialization effects.
The default settings of the Linux kernel are highly adaptive to the load and other external factors. In order to inhibit idle and energy saving mechanisms, which add additional noise to the latency measured, we change the kernel parameters: the scaling governor is set to userspace and core frequencies are set to their maximum value. We deactivate the CPU idle by passing cpuidle.off=1 as a kernel parameter. The option no_hz disables the scheduler tick if no work is to be done on that CPU, which is why we deactivate it. For the desktop PC, we turn off hyperthreading.

A. Evaluation of Publisher Frequency
We use three nodes with a publisher frequency between 1 Hz and 100 Hz as depicted in Fig. 4. This corresponds to a ping-pong scenario. While sticking to the desktop PC as hardware and the QoS reliability to BEST_EFFORT, we use all possible payload sizes.

Fig. 4: Investigating the influence of publisher frequency on latency with three nodes. Evaluation is performed on the desktop PC, QoS reliability is set to BEST_EFFORT. Note the different scaling of the y-axis for Connext.
(Panels: FastRTPS, CycloneDDS, Connext; x-axis: frequency [Hz]; y-axis: median latency [us]; curves: payloads of 100 B, 1 KB, 10 KB, 100 KB, 500 KB.)

For all DDS middlewares, we can see that the median latency decreases with increased frequency. This might be due to (unknown) energy saving features in other hardware devices or reprioritized thread scheduling. Further, we can see that the obtained latencies for 100 B, 1 KB, and 10 KB are equal, but increase for 100 KB and 500 KB. This can be explained by the maximum UDP packet size of 64 KB, which incurs fragmentation for large payloads. This additional overhead differs with the employed middleware. FastRTPS is slightly slower than CycloneDDS, whereas Connext entails the highest latency. Similar results were obtained for the Raspberry Pi.
In discussions with maintainers of the Connext middleware, we found out that the rmw is not maintained by RTI itself. We were told that the implementation is highly suboptimal, i.e. lots of calculations and copy instructions need to be performed if a ROS2 message is converted to a DDS message. Further, the node to participant mapping, which results in a high resource consumption if nodes are created in the same process, was only optimized for the rmw of FastRTPS and CycloneDDS [27].
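The fragmentation argument for large payloads can be made concrete with a small calculation. The following is a rough sketch under stated assumptions, not the behavior of any particular DDS implementation: it reads "64 KB" as 65,536 bytes and 1 KB as 1,024 bytes, ignores protocol headers and serialization overhead, and uses a hypothetical helper `fragments_needed`.

```python
import math

# Assumption: "64 KB" is read as 64 * 1024 bytes; headers are ignored.
FRAGMENT_SIZE_BYTES = 64 * 1024

# Payload sizes from Tab. I, with 1 KB assumed to be 1024 bytes.
PAYLOADS_BYTES = {
    "100 B": 100,
    "1 KB": 1 * 1024,
    "10 KB": 10 * 1024,
    "100 KB": 100 * 1024,
    "500 KB": 500 * 1024,
}


def fragments_needed(payload_bytes, fragment_size=FRAGMENT_SIZE_BYTES):
    """Number of fragments required to carry a payload of the given size."""
    return max(1, math.ceil(payload_bytes / fragment_size))


if __name__ == "__main__":
    for label, size in PAYLOADS_BYTES.items():
        print(f"{label:>6}: {fragments_needed(size)} fragment(s)")
```

Under these assumptions, the payloads of 100 B, 1 KB, and 10 KB fit into a single fragment, whereas 100 KB needs two and 500 KB needs eight, which matches the observation that only the two largest payloads show increased latency.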
Fig. 5: Investigation of the scalability of a node system. We used a payload of 100 B (upper row) and of 500 KB (lower row). The DDS middleware was varied. For visualization purposes, only a few frequencies were picked. Evaluation was performed on the desktop PC with QoS reliability BEST_EFFORT. Note the different y-axis scaling for Connext.
(Panels: FastRTPS, CycloneDDS, Connext; x-axis: nodes; y-axis: median latency [us]; curves: 1 Hz, 40 Hz, 80 Hz, 100 Hz.)

Fig. 6: Categorization of intra-process profiling of different latencies. Evaluation was performed on the desktop PC with QoS reliability BEST_EFFORT. Note the different scaling of the y-axis for Connext. The frequency is 100 Hz.
(Panels: FastRTPS, CycloneDDS, Connext at 100 B and at 500 KB; Raspberry Pi with FastRTPS and CycloneDDS at 100 B; x-axis: nodes; y-axis: median latency [us]; categories: DDS, Benchmarking, Publisher ROS2 Common, Publisher rmw, Subscriber ROS2 Common, Subscriber rmw, Rclcpp Notification Delay.)

Therefore, higher latencies were already expected prior to the evaluation. We were told that significant improvements are to be expected for the upcoming Connext rmw release, which is in early release testing.

B. Evaluation of Scalability
We evaluate the median latency from start node to end node for data-processing pipelines ranging from 3 to 23 nodes. As payload, we use 100 B and 500 KB. For visualization purposes, we restrict ourselves to a subset of the evaluated frequencies. Results are shown in Fig. 5.
We can observe the same pattern as in Fig. 4: the latency decreases if the frequency is increased. This effect is intensified with the length of the data-processing pipeline. For a payload of 100 B, the results indicate a linear relationship between the number of nodes and the median latency for FastRTPS and CycloneDDS. The outliers may occur due to too few samples, which is the case for lower frequencies. It needs to be further investigated whether the relationship becomes more linear if the number of samples is larger. Similar results were observed for the Raspberry Pi. However, the graph for CycloneDDS is more linear. A possible reason is the simpler hardware, which is not as adaptively controlled to load as the desktop PC.
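One way to quantify how linear the median latency is in the number of nodes is an ordinary least-squares fit together with the coefficient of determination. The sketch below is not part of the paper's evaluation; `fit_line` is a hypothetical helper, and the latency values are synthetic placeholders generated from a straight line, not measurements.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept; returns (slope, intercept, r2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 compares the residual scatter with the total scatter around the mean.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2


# Node counts as in Tab. I: 3, 5, ..., 23.
nodes = list(range(3, 24, 2))
# Synthetic placeholder latencies in microseconds (NOT measured values).
latencies_us = [120.0 * n + 50.0 for n in nodes]

if __name__ == "__main__":
    slope, intercept, r2 = fit_line(nodes, latencies_us)
    print(f"slope={slope:.1f} us/node, intercept={intercept:.1f} us, R^2={r2:.3f}")
```

With measured medians in place of the placeholders, an R^2 close to one would support the linear relationship, while the slope would give the latency added per additional node.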

In the case of Connext, a nonlinear relationship for smaller frequencies is visible. Furthermore, Connext only yields results up to 15 nodes, as for higher node numbers the Connext rmw raises exceptions. This is independent of the parameter set.
For 500 KB, we can observe a more erratic behavior in the case of FastRTPS. A nonlinear increase of the median latency can be suspected. In order to further investigate this assumption, the number of nodes needs to be increased in future work. In the case of CycloneDDS, there seems to be a saturation for 80 Hz and 100 Hz, which also needs to be verified in the future. Connext yields a strong nonlinear relationship for frequencies of 80 Hz and 100 Hz. Last but not least, the lowest median latency was measured for CycloneDDS.

C. Profiling
For a payload of 100 B, the largest amount of latency can be attributed to the categories DDS and Rclcpp Notification Delay as seen in Fig. 6. Especially for Connext, the largest portion of the latency can be attributed to the DDS middleware itself. However, the overhead of ROS2 compared to raw DDS amounts to up to 50 % for small messages.
For a payload of 500 KB, one can clearly see that for all middlewares, the major part of the median latency is due to the DDS middleware. In the case of FastRTPS, the latency seems to increase nonlinearly. In addition, we can observe that the median latency entailed by Rclcpp Notification Delay is higher than in the case of 100 B, i.e. it is payload-dependent. The erratic behavior of Connext can be mainly attributed to the DDS middleware, but also to the categories Subscriber rmw and Publisher rmw.
As we focus on possible overhead reductions of ROS2 in this paper, we assume that the DDS middlewares cannot be changed. Thus, we can observe that major performance improvements can be obtained for the category Rclcpp Notification Delay. Similar results could be obtained for the Raspberry Pi with a higher latency, cf. Fig. 6.

D. Influence of QoS Reliability
In the last sections, the QoS reliability policy was set to BEST_EFFORT and kept as a constant parameter. Because of the use of localhost as network device, the network is not lossy, i.e. the influence of the policy will not be immediately visible. Therefore, we pick the highest possible throughput and calculate the relative deviation between the median latencies obtained with the QoS policy BEST_EFFORT and RELIABLE. As can be seen in Fig. 7, no trend is actually visible, i.e. we cannot simulate a lossy network with the available parameter sets. Similar results were obtained for the Raspberry Pi.

Fig. 7: Evaluation of influence of QoS reliability settings on median latency. For FastRTPS and CycloneDDS, we choose 500 KB and 100 Hz. In the case of Connext, we have a payload of 500 KB and a publisher frequency of 40 Hz.
(Plot "Relative deviation of median between BEST_EFFORT and RELIABLE"; x-axis: nodes; y-axis: relative deviation [%]; curves: FastRTPS, CycloneDDS, Connext.)

V. CONCLUSION AND FUTURE OUTLOOK
The goal of this paper was to provide the reader with simple guidelines if ROS2 is used for time-critical systems. Given our parameter set, we discovered the following rules of thumb:
• With a payload higher than the fragmentation size of UDP (here, 64 KB), latency increases with the payload size.
• The higher the frequency, the lower the latency.
• CycloneDDS yields the lowest latency.
• The DDS middleware and the delay between message notification and message retrieval by ROS2 contribute the biggest portions to the overall latency.
• The Connext rmw is highly suboptimal. In later releases, this will most likely change.
• Latency is larger on the Raspberry Pi; however, the qualitative results are the same. Fluctuation in latency is lower than on the desktop PC.
During our evaluation, we observed that latency highly depends on energy saving features of the OS and the hardware. Our main focus was on the CPU. However, energy saving features of other components, e.g. the NIC, might play an important role. This needs to be taken carefully into consideration if latency is to be mitigated for real-time critical applications.
For the evaluation of network-independent ROS2 overhead, we created the nodes in one process on the same machine. This use case is unrealistic as Intra-Process Communication would normally be used, since this approach is much more efficient. As we use separate executors per node, there should not be much difference between creating the nodes in one process as opposed to creating nodes in separate processes. However, as pointed out by [27], the node to participant mapping is highly suboptimal in the Connext rmw. This might be the reason for the bad performance. Additionally, the Connext rmw is highly suboptimal in general as explained in Sec. IV-A. This will be fixed in future releases as discussed with RTI.
In future work, one might consider an evaluation in a more realistic setup, i.e. with distributed systems. Network effects could be better evaluated. An effect of the QoS reliability policy should be observable. Aside from the median, one could evaluate other statistical quantities or verify if the messages follow a certain distribution. The obtained information could then be used as an additional uncertainty for state estimation and incorporated into the Kalman Filter.
REFERENCES
[1] OSRF, Community Metrics Report, 2019 (accessed August 17, 2020). [Online]. Available: http://download.ros.org/downloads/metrics/metrics-report-2019-07.pdf
[2] ——, ROS Robots, 2020 (accessed August 17, 2020). [Online]. Available: https://robots.ros.org/
[3] ——, Project Governance, 2020 (accessed August 17, 2020). [Online]. Available: https://index.ros.org/doc/ros2/Governance/#governance
[4] D. Casini, T. Blaß, I. Lütkebohle, and B. B. Brandenburg, “Response-Time Analysis of ROS 2 Processing Chains Under Reservation-Based Scheduling,” in 31st Euromicro Conference on Real-Time Systems (ECRTS 2019), ser. Leibniz International Proceedings in Informatics (LIPIcs), S. Quinton, Ed., vol. 133. Dagstuhl, Germany: Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2019, pp. 6:1–6:23. [Online]. Available: http://drops.dagstuhl.de/opus/volltexte/2019/10743
[5] LGSVL, LGSVL Simulator, 2020 (accessed August 17, 2020). [Online]. Available: https://www.lgsvlsimulator.com/
[6] gazebo_ros2_control, 2020 (accessed August 17, 2020). [Online]. Available: https://github.com/gazebo_ros2_control
[7] OMG, Data Distribution Service, 2015 (accessed August 17, 2020). [Online]. Available: https://www.omg.org/spec/DDS/
[8] OSRF, Why ROS 2?, 2020 (accessed August 17, 2020). [Online]. Available: https://design.ros2.org/articles/why_ros2.html
[9] M. Quigley, K. Conley, B. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Y. Ng, “ROS: an open-source Robot Operating System,” in ICRA workshop on open source software, vol. 3, no. 3.2. Kobe, Japan, 2009, p. 5.
[10] OSRF, ROS on DDS, 2019 (accessed August 17, 2020). [Online]. Available: https://design.ros2.org/articles/ros_on_dds.html
[11] ——, About different ROS 2 DDS/RTPS vendors, 2020 (accessed August 17, 2020). [Online]. Available: https://index.ros.org/doc/ros2/Concepts/DDS-and-ROS-middleware-implementations/
[12] M. Naumann, F. Poggenhans, M. Lauer, and C. Stiller, “CoInCar-Sim: An Open-Source Simulation Framework for Cooperatively Interacting Automobiles,” in 2018 IEEE Intelligent Vehicles Symposium (IV), 2018, pp. 1–6.
[13] M. A. Lema, A. Laya, T. Mahmoodi, M. Cuevas, J. Sachs, J. Markendahl, and M. Dohler, “Business case and technology analysis for 5G low latency applications,” IEEE Access, vol. 5, pp. 5917–5935, 2017.
[14] S. Maheshwari, D. Raychaudhuri, I. Seskar, and F. Bronzino, “Scalability and Performance Evaluation of Edge Cloud Systems for Latency Constrained Applications,” in 2018 IEEE/ACM Symposium on Edge Computing (SEC), 2018, pp. 286–299.
[15] F. Voigtländer, A. Ramadan, J. Eichinger, C. Lenz, D. Pensky, and A. Knoll, “5G for Robotics: Ultra-Low Latency Control of Distributed Robotic Systems,” in 2017 International Symposium on Computer Science and Intelligent Controls (ISCSIC), 2017, pp. 69–72.
[16] Y. Maruyama, S. Kato, and T. Azumi, “Exploring the performance of ROS2,” in Proceedings of the 13th International Conference on Embedded Software - EMSOFT ’16. Pittsburgh, Pennsylvania: ACM Press, 2016, pp. 1–10.
[17] M. Reke, D. Peter, J. Schulte-Tigges, S. Schiffer, A. Ferrein, T. Walter, and D. Matheis, “A Self-Driving Car Architecture in ROS2,” in 2020 International SAUPEC/RobMech/PRASA Conference, Jan. 2020, pp. 1–6.
[18] C. S. V. Gutiérrez, L. U. S. Juan, I. Z. Ugarte, and V. M. Vilches, “Towards a distributed and real-time framework for robots: Evaluation of ROS 2.0 communications for real-time robotic applications,” arXiv:1809.02595 [cs], Sep. 2018. [Online]. Available: http://arxiv.org/abs/1809.02595
[19] R. Morita and K. Matsubara, “Dynamic Binding a Proper DDS Implementation for Optimizing Inter-Node Communication in ROS2,” in 2018 IEEE 24th International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), Aug. 2018, pp. 246–247, ISSN: 2325-1301.
[20] Y.-P. Wang, W. Tan, X.-Q. Hu, D. Manocha, and S.-M. Hu, “TZC: Efficient Inter-Process Communication for Robotics Middleware with Partial Serialization,” in 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Nov. 2019, pp. 7805–7812, ISSN: 2153-0866.
[21] J. Kim, J. M. Smereka, C. Cheung, S. Nepal, and M. Grobler, “Security and Performance Considerations in ROS 2: A Balancing Act,” arXiv:1809.09566 [cs], Sep. 2018. [Online]. Available: http://arxiv.org/abs/1809.09566
[22] iRobot, ros2-performance, 2020 (accessed October 10, 2020). [Online]. Available: https://github.com/irobot-ros/ros2-performance
[23] ApexAI, performance_test, 2020 (accessed October 10, 2020). [Online]. Available: https://gitlab.com/ApexAI/performance_test/
[24] O. Robotics, Intra-process Communications in ROS 2, 2020 (accessed August 17, 2020). [Online]. Available: http://design.ros2.org/articles/intraprocess_communications.html
[25] Barkhausen-Institut, Benchmarking, 2020 (accessed October 10, 2020). [Online]. Available: https://github.com/Barkhausen-Institut/projects
[26] W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, “An updated performance comparison of virtual machines and Linux containers,” in 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2015, pp. 171–172.
[27] OSRF, Node to Participant mapping, 2020 (accessed August 17, 2020). [Online]. Available: http://design.ros2.org/articles/Node_to_Participant_mapping.html
