Meta-Adaptive Nonlinear Control: Theory and Algorithms

Guanya Shi†, Kamyar Azizzadenesheli‡, Michael O'Connell†, Soon-Jo Chung†, Yisong Yue†
†Caltech, ‡Purdue University
{gshi,moc,sjchung,yyue}@caltech.edu, kamyar@purdue.edu
Abstract

We present an online multi-task learning approach for adaptive nonlinear control, which we call Online Meta-Adaptive Control (OMAC). The goal is to control a nonlinear system subject to adversarial disturbance and unknown environment-dependent nonlinear dynamics, under the assumption that the environment-dependent dynamics can be well captured with some shared representation. Our approach is motivated by robot control, where a robotic system encounters a sequence of new environmental conditions that it must quickly adapt to. A key emphasis is to integrate online representation learning with established methods from control theory, in order to arrive at a unified framework that yields both control-theoretic and learning-theoretic guarantees. We provide instantiations of our approach under varying conditions, leading to the first non-asymptotic end-to-end convergence guarantee for multi-task nonlinear control. OMAC can also be integrated with deep representation learning. Experiments show that OMAC significantly outperforms conventional adaptive control approaches which do not learn the shared representation, in inverted pendulum and 6-DoF drone control tasks under varying wind conditions (code and video: https://github.com/GuanyaShi/Online-Meta-Adaptive-Control).
1 Introduction

One important goal in autonomy and artificial intelligence is to enable autonomous robots to learn from prior experience to quickly adapt to new tasks and environments. Examples abound in robotics, such as a drone flying in different wind conditions [1], a manipulator throwing varying objects [2], or a quadruped walking over changing terrains [3]. Though those examples provide encouraging empirical evidence, when designing such adaptive systems, two important theoretical challenges arise, as discussed below.
First, from a learning perspective, the system should be able to learn an "efficient" representation from prior tasks, thereby permitting faster future adaptation, which falls into the category of representation learning or meta-learning. Recently, a line of work has shown theoretically that learning representations (in the standard supervised setting) can significantly reduce sample complexity on new tasks [4–6]. Empirically, deep representation learning or meta-learning has achieved success in many applications [7], including control, in the context of meta-reinforcement learning [8–10]. However, the theoretical benefits (in the end-to-end sense) of representation learning or meta-learning for adaptive control remain unclear.
Second, from a control perspective, the agent should be able to handle parametric model uncertainties with control-theoretic guarantees such as stability and tracking error convergence, which is a common adaptive control problem [11, 12]. For classic adaptive control algorithms, theoretical analysis often involves the use of Lyapunov stability and asymptotic convergence [11, 12]. Moreover, many recent
studies aim to integrate ideas from learning, optimization, and control theory to design and analyze adaptive controllers using learning-theoretic metrics. Typical results guarantee non-asymptotic convergence in finite time horizons, such as regret [13–17] and dynamic regret [18–20]. However, these results focus on a single environment or task. A multi-task extension, especially whether and how prior experience could benefit adaptation in new tasks, remains an open problem.
Main contributions. In this paper, we address both learning and control challenges in a unified framework and provide end-to-end guarantees. We derive a new method of Online Meta-Adaptive Control (OMAC) that controls uncertain nonlinear systems under a sequence of new environmental conditions. The underlying assumption is that the environment-dependent unknown dynamics can be well captured by a shared representation, which OMAC learns using a meta-adapter. OMAC then performs environment-specific updates using an inner-adapter.
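To make the meta-adapter / inner-adapter structure concrete, below is a minimal, hypothetical Python sketch of the outer/inner adaptation loop, assuming a generic environment interface; the function names (controller, inner_update, meta_update) and the environment API are illustrative placeholders, not the paper's implementation.

```python
# Hypothetical sketch of the two-level OMAC loop described above; all names and
# the environment interface are illustrative placeholders, not the paper's code.
import numpy as np

def omac_sketch(envs, T, dim_meta, dim_inner, controller, inner_update, meta_update):
    """Run the meta-/inner-adaptation loop over a sequence of environments."""
    Theta = np.zeros(dim_meta)              # shared representation, carried across environments
    for env in envs:                        # outer iteration: a new environmental condition
        theta = np.zeros(dim_inner)         # environment-specific parameters, reset per environment
        x = env.reset()
        for t in range(T):                  # inner iteration: time steps within this environment
            u = controller(x, Theta, theta)             # control input from current estimates
            x_next, residual = env.step(u)              # observed state and unknown-dynamics residual
            theta = inner_update(theta, Theta, x, residual)  # fast, environment-specific adaptation
            Theta = meta_update(Theta, theta, x, residual)   # slower update of the shared representation
            x = x_next
    return Theta
```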
We provide different instantiations of OMAC under varying assumptions and conditions. In the
jointly and element-wise convex cases, we show sublinear cumulative control error bounds, which
to our knowledge is the first non-asymptotic convergence result for multi-task nonlinear control.
Compared to standard adaptive control approaches that do not have a meta-adapter, we show that OMAC achieves both stronger guarantees and better empirical performance. We finally show how to
integrate OMAC with deep representation learning, which further improves empirical performance.
2 Problem statement
We consider the setting where a controller encounters a sequence of N environments, with each environment lasting T time steps. We use outer iteration to refer to iterating over the N environments, and inner iteration to refer to the T time steps within an environment.
Notations: We use superscripts (e.g., $(i)$ in $x_t^{(i)}$) to denote the index of the outer iteration, where $1 \le i \le N$, and subscripts (e.g., $t$ in $x_t^{(i)}$) to denote the time index of the inner iteration, where $1 \le t \le T$. We use step $(i, t)$ to refer to the inner time step $t$ at the $i$-th outer iteration.
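As a small illustration of this indexing convention (a hypothetical helper, not part of the paper), the cumulative step count over all $NT$ steps relates to step $(i, t)$ via $(i-1)T + t$:

```python
# Illustrative helpers (not from the paper) relating the step (i, t) notation,
# with 1 <= i <= N and 1 <= t <= T, to a single cumulative step count.
def flat_step(i: int, t: int, T: int) -> int:
    """Cumulative index of step (i, t), counting from 1: (i - 1) * T + t."""
    assert i >= 1 and 1 <= t <= T
    return (i - 1) * T + t

def outer_inner(k: int, T: int) -> tuple[int, int]:
    """Inverse map: recover (i, t) from the cumulative step index k."""
    assert k >= 1
    i, t = divmod(k - 1, T)
    return i + 1, t + 1

# Example: with T = 500 inner steps per environment, step (3, 5) is step 1005 overall.
assert flat_step(3, 5, T=500) == 1005
assert outer_inner(1005, T=500) == (3, 5)
```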
$\|\cdot\|$ denotes the 2-norm of a vector or the spectral norm of a matrix. $\|\cdot\|_F$ denotes the Frobenius norm of a matrix and