Neural-Swarm2: Planning and Control of
Heterogeneous Multirotor Swarms using
Learned Interactions
Guanya Shi, Wolfgang Hönig, Xichen Shi, Yisong Yue, and Soon-Jo Chung
Abstract: We present Neural-Swarm2, a learning-based method for motion
planning and control that allows heterogeneous multirotors in a swarm to
safely fly in close proximity. Such operation for drones is challenging
due to complex aerodynamic interaction forces, such as downwash generated
by nearby drones and ground effect. Conventional planning and control
methods neglect these interaction forces, resulting in sparse swarm
configurations during flight. Our approach combines a physics-based
nominal dynamics model with learned Deep Neural Networks (DNNs) with
strong Lipschitz properties. We extend two techniques to accurately
predict the aerodynamic interactions between heterogeneous multirotors:
i) spectral normalization for stability and generalization guarantees on
unseen data and ii) heterogeneous deep sets for supporting any number of
heterogeneous neighbors in a permutation-invariant manner without
reducing expressiveness. The learned residual dynamics benefit both the
proposed interaction-aware multi-robot motion planning and the nonlinear
tracking control designs because the learned interaction forces reduce
the modelling errors. Experimental results demonstrate that Neural-Swarm2
is able to generalize to larger swarms beyond training cases and
significantly outperforms a baseline nonlinear tracking controller, with
up to a threefold reduction in worst-case tracking errors. Video is
available at https://youtu.be/Y02juH6BDxo.
I. INTRODUCTION
The ongoing commoditization of unmanned aerial vehicles (UAVs) requires
robots to fly in much closer proximity to each other than before, which
necessitates advanced
planning and control methods for large aerial swarms [1, 2].
For example, consider a search-and-rescue mission where an
aerial swarm must enter and search a collapsed building. In
such scenarios, close-proximity flight enables the swarm to
navigate the building much faster compared to swarms that
must maintain large distances from each other. Other important
applications of close-proximity flight include manipulation,
search, surveillance, and mapping. In many scenarios, hetero-
geneous teams with robots of different sizes and sensing or
manipulation capabilities are beneficial due to their signifi-
cantly higher adaptability. For example, in a search-and-rescue
mission larger UAVs can be used for manipulation tasks or
to transport goods, while smaller ones are more suited for
exploration and navigation.
The authors are with California Institute of Technology, USA.
{gshi, whoenig, xshi, yyue, sjchung}@caltech.edu.
The work is funded in part by Caltech’s Center for Autonomous Systems
and Technologies (CAST) and the Raytheon Company.

[Figure 1 labels: Large Robot; Small Robot; Ground Effect; Environment; Inputs; Heterogeneous Deep Sets; Learned Interaction Dynamics; Nominal Dynamics; Motion Planner; Nonlinear Stable Controller; Minimum Vertical Distance: 24 cm; panels (a), (b), (c).]
Fig. 1. We learn complex interactions between multirotors using heterogeneous
deep sets and design an interaction-aware nonlinear stable controller and a
multi-robot motion planner (a). Our approach enables close-proximity flight
(minimum vertical distance 24 cm) of heterogeneous aerial teams (16 robots)
with significantly lower tracking error compared to solutions that do not consider
the interaction forces (b,c).

A major challenge of close-proximity control and planning
is that small distances between UAVs create complex aerodynamic
interactions. For instance, one multirotor flying above
another causes the so-called downwash effect on the lower one,
which is difficult to model using conventional model-based
approaches [3]. Without accurate downwash interaction mod-
eling, a large safety distance between vehicles is necessary,
thereby preventing a compact 3-D formation shape, e.g., 60 cm
for the small Crazyflie 2.0 quadrotor (9 cm rotor-to-rotor) [4].
Often, a formation is restricted to 2-D planar motions [5].
For heterogeneous teams, even larger and asymmetric safety
distances are required [6]. However, the downwash for two
small Crazyflie quadrotors hovering 30 cm on top of each other
is only −9 g, which is well within their thrust capabilities,
and suggests that proper modeling of downwash and other
interaction effects can lead to more precise motion planning
and dense formation control.
In this paper, we present a learning-based approach, Neural-Swarm2,
which increases the precision, safety, and density of close-proximity
motion planning and control of heterogeneous multirotor swarms. In the
example shown in Fig. 1, we safely operate with vertical proximities
less than half of those in prior work [4] using the same robots. In
particular, we train deep neural networks (DNNs) to predict the residual
interaction forces that are not captured by the nominal models of
free-space aerodynamics. To the best of our knowledge, this is the first
model for aerodynamic interactions between two or more multirotors in
flight. Our DNN architecture supports heterogeneous inputs in a
permutation-invariant manner without reducing the expressiveness. The
DNN only requires relative positions and velocities of neighboring
multirotors as inputs, similar to the existing collision-avoidance
techniques [7], which enables fully-decentralized computation. We use the
predicted interaction forces to augment the nominal dynamics and derive
novel methods to directly consider them during motion planning and as
part of the multirotors’ controller.
arXiv:2012.05457v1 [cs.RO] 10 Dec 2020
From a learning perspective, we leverage and extend two
state-of-the-art tools to derive effective DNN models. First,
we extend deep sets [8] to the heterogeneous case and prove
its representation power. Our novel encoding is used to model
interactions between heterogeneous vehicle types in an index-
free or permutation-invariant manner, enabling better general-
ization to new formations and a varying number of vehicles.
The second is spectral normalization [9], which ensures the
DNN is Lipschitz continuous and helps the DNN generalize
well on test examples that lie outside the training set. We
demonstrate that the interaction forces can be learned accurately and
at low computational cost, such that a small 32-bit
microcontroller can predict them in real time.
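To make the Lipschitz claim concrete, the following is a minimal NumPy sketch of spectral normalization for a bias-free ReLU network. It assumes the common form in which each weight matrix is divided by its largest singular value and rescaled so that the product of the layer norms equals a chosen bound; the names `spectral_normalize`, `relu_mlp`, and `gamma` are illustrative and are not taken from the paper, and the training procedure itself is not shown.

```python
import numpy as np

def spectral_normalize(weights, gamma):
    """Rescale each layer so its spectral norm is gamma**(1/L), L = number of layers.

    Assumption: weights are plain 2-D arrays; gamma is a hypothetical
    user-chosen bound on the Lipschitz constant of the whole network.
    """
    per_layer = gamma ** (1.0 / len(weights))
    normalized = []
    for W in weights:
        sigma = np.linalg.norm(W, ord=2)  # largest singular value of W
        normalized.append(W * (per_layer / sigma))
    return normalized

def relu_mlp(weights, x):
    """Bias-free ReLU network: ReLU after every layer except the last."""
    for W in weights[:-1]:
        x = np.maximum(W @ x, 0.0)
    return weights[-1] @ x

# Illustrative use with random weights (not learned parameters).
rng = np.random.default_rng(0)
raw = [rng.normal(size=(8, 6)), rng.normal(size=(8, 8)), rng.normal(size=(3, 8))]
bounded = spectral_normalize(raw, gamma=2.0)
```

Because ReLU is 1-Lipschitz, the output of `relu_mlp(bounded, .)` changes by at most `gamma` times the change in its input, which is the property the stability and generalization arguments rely on.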
From a planning and control perspective, we derive novel
methods that directly consider the predicted interaction forces.
For motion planning we use a two-stage approach. In the
first stage, we extend an existing kinodynamic sampling-based
planner for a single robot to the interaction-aware multi-robot
case. In the second stage, we adopt an optimization-based
planner to refine the solutions of the first stage. Empirically,
we demonstrate that our interaction-aware motion planner both
avoids dangerous robot configurations that would saturate the
multirotors’ motors and reduces the tracking error signifi-
cantly. For the nonlinear control we leverage the Lipschitz
continuity of our learned interaction forces to derive stability
guarantees similar to our prior work [10, 11]. The controller
can be used to reduce the tracking error of arbitrary desired
trajectories, including ones that were not planned with an
interaction-aware planner.
We validate our approach on different tasks using two
to sixteen quadrotors of two different sizes, and we also
integrate ground effect and other unmodeled dynamics into
our model, by viewing the physical environment as a special
“robot”. To our knowledge, our approach is the first that
models interactions between two or more multirotor vehicles
and demonstrates how to use such a model effectively and
efficiently for motion planning and control of aerial teams.
II. RELATED WORK
The aerodynamic interaction force applied to a single UAV
flying near the ground (ground effect) has been modeled
analytically [12–14]. In many cases, the ground effect is not
considered in typical nonlinear multirotor controllers and thus
increases the tracking error of a multirotor when operating
close to the ground. However, it is possible to use ground
effect prediction in real-time to reduce the tracking error [10,
14].
The interaction between two rotor blades of a single mul-
tirotor has been studied in a lab setting to optimize the
placement of rotors on the vehicle [15]. However, it remains
an open question how this influences the flight of two or
more multirotors in close proximity. Interactions between two
multirotors can be estimated using a propeller velocity field
model [3]. Unfortunately, this method is hard to generalize to
the multi-robot or heterogeneous case, and it only considers the
stationary case, which is inaccurate for real flights.
The use of DNNs to learn higher-order residual dynamics
or control actions is gaining attention across a range of control
and reinforcement learning settings [10, 16–21]. For swarms,
a common encoding approach is to discretize the whole space
and employ convolutional neural networks (CNNs), which
yields a permutation-invariant encoding. Another common
encoding for robot swarms is a Graph Neural Network
(GNN) [22, 23]. GNNs have been extended to heterogeneous
graphs [24], but it remains an open research question how
such a structure would apply to heterogeneous robot teams. We
extend a different architecture, which is less frequently used
in robotics applications, called deep sets [8]. Deep sets enable
distributed computation without communication requirements.
Compared to CNNs, our approach: i) requires less training
data and computation; ii) is not restricted to a pre-determined
resolution and input domain; and iii) directly supports the het-
erogeneous swarm. Compared to GNNs, we do not require any
direct communication between robots. Deep sets have been
used in robotics for homogeneous [11] and heterogeneous [25]
teams. Compared to the latter, our heterogeneous deep set
extension has a more compact encoding and we prove its
representation power.
For motion planning, empirical models have been used
to avoid harmful interactions [2, 4, 6, 26, 27]. Typical safe
boundaries along multi-vehicle motions form ellipsoids [4]
or cylinders [6] along the motion trajectories. Estimating
such shapes experimentally would potentially lead to many
collisions and dangerous flight tests and those collision-free
regions are in general conservative. In contrast, we use deep
learning to estimate the interaction forces accurately in hetero-
geneous multi-robot teams. This model allows us to directly
control the magnitude of the interaction forces to accurately
and explicitly control the risk, removing the necessity of
conservative collision shapes.
We generalize and extend our prior conference paper [11]
significantly: i) we develop heterogeneous deep sets to handle
the heterogeneous case, which also unifies the approach
with our prior work that considers the ground effect
for improved multirotor landing [10], ii) we introduce a novel
method to use the learned interaction forces for multi-robot
motion planning, and iii) we explicitly compensate for the
delay in motor speed commands in our position and attitude
controllers, resulting in stronger experimental results for both
our baseline and Neural-Swarm2.
III. PROBLEM STATEMENT
Neural-Swarm2 can generally be applied to any robotic system;
in this paper, we focus on multirotors. We first
present single multirotor dynamics including interaction forces
modeled as disturbances. Then, we generalize these dynamics
for a swarm of multirotors. Finally, we formulate our objective
as a variant of an optimal control problem and introduce our
performance metric.
A. Single Multirotor Dynamics
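A single multirotor’s state comprises the global position
$\mathbf{p} \in \mathbb{R}^3$, global velocity $\mathbf{v} \in \mathbb{R}^3$,
attitude rotation matrix $R \in \mathrm{SO}(3)$, and body angular velocity
$\boldsymbol{\omega} \in \mathbb{R}^3$. Its dynamics are:
$$\dot{\mathbf{p}} = \mathbf{v}, \qquad m\dot{\mathbf{v}} = m\mathbf{g} + R\mathbf{f}_u + \mathbf{f}_a, \tag{1a}$$
$$\dot{R} = R S(\boldsymbol{\omega}), \qquad J\dot{\boldsymbol{\omega}} = J\boldsymbol{\omega} \times \boldsymbol{\omega} + \boldsymbol{\tau}_u + \boldsymbol{\tau}_a, \tag{1b}$$
$$\boldsymbol{\eta} = B_0\mathbf{u}, \qquad \dot{\mathbf{u}} = -\lambda\mathbf{u} + \lambda\mathbf{u}_c, \tag{1c}$$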
where $m$ and $J$ denote the mass and inertia matrix of
the system, respectively; $S(\cdot)$ is a skew-symmetric mapping;
$\mathbf{g} = [0; 0; -g]$ is the gravity vector; and
$\mathbf{f}_u = [0; 0; T]$ and $\boldsymbol{\tau}_u = [\tau_x; \tau_y; \tau_z]$
denote the total thrust and body torques from
the rotors, respectively. The output wrench
$\boldsymbol{\eta} = [T; \tau_x; \tau_y; \tau_z]$
is linearly related to the control input, $\boldsymbol{\eta} = B_0\mathbf{u}$, where
$\mathbf{u} = [n_1^2; n_2^2; \ldots; n_M^2]$ is the vector of squared motor speeds for a vehicle
with $M$ rotors and $B_0$ is the actuation matrix. A multirotor is
subject to additional disturbance force
$\mathbf{f}_a = [f_{a,x}; f_{a,y}; f_{a,z}]$
and disturbance torque
$\boldsymbol{\tau}_a = [\tau_{a,x}; \tau_{a,y}; \tau_{a,z}]$. We also consider
a first order delay model in (1c), where $\mathbf{u}_c$ is the actual
command signal we can directly control, and $\lambda$ is the scalar
time constant of the delay model.
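As a reading aid, the right-hand sides of (1a)-(1c) can be sketched in NumPy as below. This is a minimal sketch under stated assumptions rather than the authors' implementation: the function names, the `params` dictionary, and the numerical values in the example (mass, inertia, and especially the actuation matrix `B0`) are hypothetical placeholders, and the world frame is assumed to have its z-axis pointing up.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix S(w), so that S(w) @ a equals np.cross(w, a)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def multirotor_derivatives(v, R, omega, u, u_c, params, f_a=None, tau_a=None):
    """Time derivatives of (p, v, R, omega, u) following (1a)-(1c)."""
    f_a = np.zeros(3) if f_a is None else f_a
    tau_a = np.zeros(3) if tau_a is None else tau_a
    m, J, B0, lam, g = (params[k] for k in ("m", "J", "B0", "lam", "g"))
    eta = B0 @ u                              # wrench [T; tau_x; tau_y; tau_z]
    f_u = np.array([0.0, 0.0, eta[0]])        # thrust along the body z-axis
    tau_u = eta[1:]
    p_dot = v                                                  # (1a), left
    v_dot = np.array([0.0, 0.0, -g]) + (R @ f_u + f_a) / m     # (1a), right
    R_dot = R @ skew(omega)                                    # (1b), left
    omega_dot = np.linalg.solve(
        J, np.cross(J @ omega, omega) + tau_u + tau_a)         # (1b), right
    u_dot = -lam * u + lam * u_c                               # (1c) delay
    return p_dot, v_dot, R_dot, omega_dot, u_dot

# Hypothetical parameters for illustration only (not values from the paper).
d, c = 0.03, 0.006
example_params = {
    "m": 0.034, "g": 9.81, "lam": 16.0,
    "J": np.diag([1.7e-5, 1.7e-5, 3.0e-5]),
    "B0": np.array([[1.0, 1.0, 1.0, 1.0],
                    [-d, -d, d, d],
                    [-d, d, d, -d],
                    [-c, c, -c, c]]),
}
```

With these placeholder values, setting every entry of `u` to `m * g / 4` makes the total thrust balance gravity, so a level vehicle at rest has zero translational and angular acceleration; a nonzero `f_a` then appears directly as an acceleration error of `f_a / m`.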
Our model creates additional challenges compared to other
existing multirotor dynamics models (e.g., [27]). The first
challenge stems from the effect of delay in (1c). The second
challenge stems from the disturbance forces $\mathbf{f}_a$ in (1a) and
disturbance torques $\boldsymbol{\tau}_a$ in (1b), generated by interactions
with other multirotors and the environment.
B. Heterogeneous Swarm Dynamics
We now consider $N$ multirotor robots. We use
$\mathbf{x}^{(i)} = [\mathbf{p}^{(i)}; \mathbf{v}^{(i)}; R^{(i)}; \boldsymbol{\omega}^{(i)}]$
to denote the state of the $i$th multirotor.
We use $\mathbf{x}^{(ij)}$ to denote the relative state component between
robot $i$ and $j$, e.g.,
$\mathbf{x}^{(ij)} = [\mathbf{p}^{(j)} - \mathbf{p}^{(i)}; \mathbf{v}^{(j)} - \mathbf{v}^{(i)}; R^{(i)} R^{(j)\top}]$.
We use $\mathcal{I}(i)$ to denote the type of the $i$th robot, where
robots with identical physical parameters such as $m$, $J$, and
$B_0$ are considered to be of the same type. We assume there
are $K \le N$ types of robots, i.e., $\mathcal{I}(\cdot)$ is a surjective mapping
from $\{1, \cdots, N\}$ to $\{\mathrm{type}_1, \cdots, \mathrm{type}_K\}$. Let
$\mathbf{r}^{(i)}_{\mathrm{type}_k}$ be the
set of the relative states of the $\mathrm{type}_k$ neighbors of robot $i$:
$$\mathbf{r}^{(i)}_{\mathrm{type}_k} = \{\mathbf{x}^{(ij)} \mid j \in \mathrm{neighbor}(i) \text{ and } \mathcal{I}(j) = \mathrm{type}_k\}. \tag{2}$$
The ordered sequence of all relative states grouped by robot
type is
$$\mathbf{r}^{(i)}_{\mathcal{I}} = \left(\mathbf{r}^{(i)}_{\mathrm{type}_1}, \mathbf{r}^{(i)}_{\mathrm{type}_2}, \cdots, \mathbf{r}^{(i)}_{\mathrm{type}_K}\right). \tag{3}$$
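The construction in (2) and (3) amounts to bucketing a robot's neighbors by type while keeping the order of the types fixed. A minimal Python sketch follows; the dictionary-based data layout and the names `relative_state` and `grouped_relative_states` are assumptions made for illustration, and how the neighbor set is chosen (for example by a distance threshold) is left outside the sketch.

```python
import numpy as np

def relative_state(x_i, x_j):
    """Relative state x^(ij): (p_j - p_i, v_j - v_i, R_i R_j^T)."""
    return (x_j["p"] - x_i["p"],
            x_j["v"] - x_i["v"],
            x_i["R"] @ x_j["R"].T)

def grouped_relative_states(i, states, types, neighbors, type_order):
    """Build r_I^(i): one collection of relative states per type, as in (2)-(3).

    states:     dict robot id -> {"p": ..., "v": ..., "R": ...}
    types:      dict robot id -> type label, playing the role of I(.)
    neighbors:  dict robot id -> iterable of neighbor ids
    type_order: fixed ordering (type_1, ..., type_K)
    """
    groups = {k: [] for k in type_order}
    for j in neighbors[i]:
        groups[types[j]].append(relative_state(states[i], states[j]))
    # Order across types is fixed; order within a type carries no meaning.
    return tuple(groups[k] for k in type_order)
```

A type with no neighbors simply contributes an empty collection, so the output always has exactly K entries regardless of how many robots are nearby.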
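The dynamics of the $i$th multirotor can be written in
compact form:
$$\dot{\mathbf{x}}^{(i)} = \Phi^{(i)}(\mathbf{x}^{(i)}, \mathbf{u}^{(i)}) + \begin{bmatrix} \mathbf{0} \\ \mathbf{f}_a^{(i)}(\mathbf{r}^{(i)}_{\mathcal{I}}) \\ \mathbf{0} \\ \boldsymbol{\tau}_a^{(i)}(\mathbf{r}^{(i)}_{\mathcal{I}}) \end{bmatrix}, \tag{4}$$
where $\Phi^{(i)}(\mathbf{x}^{(i)}, \mathbf{u}^{(i)})$ denotes the nominal dynamics of robot
$i$, and $\mathbf{f}_a^{(i)}(\cdot)$ and $\boldsymbol{\tau}_a^{(i)}(\cdot)$ are the unmodeled force and torque of
the $i$th robot that are caused by interactions with neighboring
robots or the environment (e.g., ground effect and air drag).
Robots with the same type have the same nominal dynamics
and unmodeled force and torque:
$$\Phi^{(i)}(\cdot) = \Phi^{\mathcal{I}(i)}(\cdot), \quad \mathbf{f}_a^{(i)}(\cdot) = \mathbf{f}_a^{\mathcal{I}(i)}(\cdot), \quad \boldsymbol{\tau}_a^{(i)}(\cdot) = \boldsymbol{\tau}_a^{\mathcal{I}(i)}(\cdot) \quad \forall i. \tag{5}$$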
Note that the homogeneous case covered in our prior work [11]
is a special case where $K = 1$, i.e.,
$\Phi^{(i)}(\cdot) = \Phi(\cdot)$, $\mathbf{f}_a^{(i)}(\cdot) = \mathbf{f}_a(\cdot)$, and
$\boldsymbol{\tau}_a^{(i)}(\cdot) = \boldsymbol{\tau}_a(\cdot)$ $\forall i$.
Our system is heterogeneous in three ways: i) different
robot types have heterogeneous nominal dynamics $\Phi^{\mathcal{I}(i)}$; ii)
different robot types have different unmodeled $\mathbf{f}_a^{\mathcal{I}(i)}$ and
$\boldsymbol{\tau}_a^{\mathcal{I}(i)}$;
and iii) the neighbors of each robot belong to $K$ different sets.
We highlight that our heterogeneous model not only captures
different types of robots, but also different types of
environmental interactions, e.g., ground effect [10] and air
drag. This is achieved in a straightforward manner by viewing
the physical environment as a special robot type. We illustrate
this generalization in the following example.
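Example 1 (small and large robots, and the environment).
We consider a heterogeneous system as depicted in Fig. 1(a).
Robot 3 (large robot) has three neighbors: robot 1 (small),
robot 2 (small) and environment 4. Therefore, for robot 3, we
have
$$\mathbf{f}_a^{(3)} = \mathbf{f}_a^{\mathrm{large}}(\mathbf{r}^{(3)}_{\mathcal{I}}) = \mathbf{f}_a^{\mathrm{large}}(\mathbf{r}^{(3)}_{\mathrm{small}}, \mathbf{r}^{(3)}_{\mathrm{large}}, \mathbf{r}^{(3)}_{\mathrm{env}}),$$
$$\mathbf{r}^{(3)}_{\mathrm{small}} = \{\mathbf{x}^{(31)}, \mathbf{x}^{(32)}\}, \quad \mathbf{r}^{(3)}_{\mathrm{large}} = \emptyset, \quad \mathbf{r}^{(3)}_{\mathrm{env}} = \{\mathbf{x}^{(34)}\},$$
and a similar expression for $\boldsymbol{\tau}_a^{(3)}$.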
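The structure of Example 1 can be illustrated with a small permutation-invariant function over typed neighbor sets. The sketch below assumes the sum-decomposition form used by deep sets [8], with one encoder per neighbor type and a shared decoder; this architecture has not been specified at this point in the paper, so the form, the 6-dimensional inputs (relative position and velocity), the layer sizes, and all names (`make_mlp`, `hetero_deep_set`, `phi`, `rho`) are assumptions for illustration, and the weights are random rather than learned.

```python
import numpy as np

def make_mlp(rng, sizes):
    """Return a small bias-free ReLU network with random weights."""
    Ws = [rng.normal(scale=0.5, size=(n_out, n_in))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]
    def mlp(x):
        for W in Ws[:-1]:
            x = np.maximum(W @ x, 0.0)
        return Ws[-1] @ x
    return mlp

def hetero_deep_set(phi, rho, groups, type_order, hidden_dim):
    """rho( sum over types k, sum over x in r_k of phi_k(x) )."""
    total = np.zeros(hidden_dim)
    for k in type_order:
        for x in groups[k]:          # an empty group contributes nothing
            total += phi[k](x)
    return rho(total)

# Robot 3 from Example 1: two small neighbors, no large neighbor, and the
# environment treated as a special neighbor type.
rng = np.random.default_rng(0)
type_order = ("small", "large", "env")
phi = {k: make_mlp(rng, [6, 16, 8]) for k in type_order}
rho = make_mlp(rng, [8, 16, 3])
x31, x32, x34 = rng.normal(size=6), rng.normal(size=6), rng.normal(size=6)
f_a3 = hetero_deep_set(
    phi, rho, {"small": [x31, x32], "large": [], "env": [x34]}, type_order, 8)
```

Swapping `x31` and `x32` leaves `f_a3` unchanged because the encodings within a type are summed, while swapping a small neighbor with the environment entry would generally change the output because each type has its own encoder.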
C. Interaction-Aware Motion Planning & Control
Our goal is to move the heterogeneous team of robots from
their start states to goal states, which can be framed as the
following optimal control problem:
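$$\min_{\mathbf{u}^{(i)},\, \mathbf{x}^{(i)},\, t_f} \; \sum_{i=1}^{N} \int_0^{t_f} \|\mathbf{u}^{(i)}(t)\| \, dt \tag{6}$$
$$\begin{aligned}
\text{s.t.} \quad & \text{robot dynamics (4)} && i \in [1, N] \\
& \mathbf{u}^{(i)}(t) \in \mathcal{U}^{\mathcal{I}(i)}; \; \mathbf{x}^{(i)}(t) \in \mathcal{X}^{\mathcal{I}(i)} && i \in [1, N] \\
& \|\mathbf{p}^{(ij)}\| \ge r^{(\mathcal{I}(i)\mathcal{I}(j))} && i < j, \; j \in [2, N] \\
& \|\mathbf{f}_a^{(i)}\| \le f_{a,\max}^{\mathcal{I}(i)}; \; \|\boldsymbol{\tau}_a^{(i)}\| \le \tau_{a,\max}^{\mathcal{I}(i)} && i \in [1, N] \\
& \mathbf{x}^{(i)}(0) = \mathbf{x}_s^{(i)}; \; \mathbf{x}^{(i)}(t_f) = \mathbf{x}_f^{(i)} && i \in [1, N]
\end{aligned}$$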
where $\mathcal{U}^{(k)}$ is the control space for $\mathrm{type}_k$ robots,
$\mathcal{X}^{(k)}$ is the free space for $\mathrm{type}_k$ robots,
$r^{(lk)}$ is the minimum safety distance between $\mathrm{type}_l$ and $\mathrm{type}_k$ robots,
$f_{a,\max}^{(k)}$ is the