Safe Control for Nonlinear Systems with Stochastic Uncertainty via
Risk Control Barrier Functions
Andrew Singletary, Mohamadreza Ahmadi, and Aaron D. Ames
Abstract
— Guaranteeing safety for robotic and autonomous
systems in real-world environments is a challenging task that
requires the mitigation of stochastic uncertainties. Control
barrier functions have, in recent years, been widely used for
enforcing safety related set-theoretic properties, such as forward
invariance and reachability, of nonlinear dynamical systems. In
this paper, we extend this rich framework to nonlinear discrete-
time systems subject to stochastic uncertainty and propose
a framework for assuring risk-sensitive safety in terms of
coherent risk measures. To this end, we introduce risk control
barrier functions (RCBFs), which are compositions of barrier
functions and dynamic, coherent risk measures. We show that
the existence of such barrier functions implies invariance in
a coherent risk sense. Furthermore, we formulate conditions
based on finite-time RCBFs to guarantee finite-time reachability
to a desired set in the coherent risk. Conditions for risk-sensitive
safety and finite-time reachability of sets composed of Boolean
compositions of multiple RCBFs are also formulated. We show
the efficacy of the proposed method through its application to
a cart-pole system in a safety-critical scenario.
I. INTRODUCTION
Autonomous robotic systems are being increasingly de-
ployed in real-world settings where safety is critical. With
this transition to practice, the associated risk that stems from
unknown and unforeseen circumstances is correspondingly
on the rise [1]. In the context of safety-critical scenarios, such
as those found in aerospace and human-robot applications,
it is essential that decision making accounts for risk. These
risks are often associated with uncertainty due to extremely
intricate nonlinear dynamics, e.g. bipedal robots [2], and/or
extreme unstructured environments, e.g. subterranean or ex-
traterrestrial exploration [3].
Mathematically speaking, risk can be quantified in numer-
ous ways, such as chance constraints [4], [5], exponential
utility functions [6], and distributional robustness [7]. How-
ever, applications in autonomy and robotics require more
“nuanced assessments of risk” [8]. Artzner et al. [9] characterized
a set of natural properties that are desirable for a risk
measure, called a coherent risk measure; such measures have obtained
widespread acceptance in finance and operations research,
among other fields. An important example of a coherent
risk measure is the conditional value-at-risk (CVaR) that
has received significant attention in decision making prob-
lems, such as Markov decision processes (MDPs) [10]–[13].
For stochastic discrete-time dynamical systems, a model
predictive control technique with coherent risk objectives
was proposed in [14], wherein the authors also proposed a
Lyapunov condition for risk-sensitive exponential stability.
Moreover, a method based on stochastic reachability analysis
was proposed in [15] to estimate a CVaR-safe set of initial
conditions via the solution to an MDP.
The authors are with the Center for Autonomous Systems and Technologies (CAST) at the California Institute of Technology, 1200 E. California Blvd., MC 104-44, Pasadena, CA 91125, e-mail: {asinglet, mrahmadi, ames}@caltech.edu.
Fig. 1: The value of the safe-set $h(x_t)$ is known at time $t$, but stochastic uncertainty makes $h(x_{t+1})$ a random variable. We must pick $u_t$ such that $h(x_{t+1})$ is safe subject to a risk measure taken over the worst $\beta$ probability.
Our approach to risk-sensitive safety is based on a special
class of control barrier functions. Control barrier functions
were proposed in [16] and have been used for designing safe
controllers (in the absence of a legacy controller, i.e., a de-
sired controller that may be unsafe) and safety filters (in the
presence of a legacy controller) for continuous-time dynami-
cal systems, such as bipedal robots [17] and trucks [18], with
guaranteed robustness [19], [20] (see the survey [21] and
references therein). For discrete-time systems, discrete-time
barrier functions were formulated in [22], [23] and applied
to the multi-robot coordination problem [24]. Recently, for
a class of stochastic (Ito) differential equations, safety in
probability and statistical mean was studied in [25], [26] via
stochastic barrier functions.
arXiv:2203.15892v1 [eess.SY] 29 Mar 2022
This paper goes beyond the conventional notions of safety
in probability and statistical mean through the use of coherent
risk measures (as motivated in Section II). To this end, in
Section III, for discrete-time systems subject to stochastic
uncertainty, we define safety and finite-time reachability in
the risk-sensitive sense, i.e., in the context of the worst
possible realizations, via coherent risk measures. We then
propose risk control barrier functions (RCBFs) in Section
IV, together with finite-time RCBFs, as a tool to enforce
risk-sensitive safety and reachability, respectively. The main result
of this paper establishes that RCBFs ensure safety in a
risk-sensitive fashion. Finite-time RCBFs extend this result
to risk-sensitive reachability. Furthermore, for
safe and goal sets defined as Boolean compositions of multiple
function level-sets, we propose conditions that ensure
safety and reachability of these sets based on RCBFs and
their finite-time counterparts. Importantly, in all cases, the
risk-sensitive controllers are designed to be minimally invasive
with respect to a given legacy controller. We show
the efficacy of our approach in Section V through simulation
on a nonlinear cart-pole system (see Figure 1).
Notation: We denote by $\mathbb{R}^n$ the $n$-dimensional Euclidean space and by $\mathbb{N}_{\ge 0}$ the set of non-negative integers. For a finite set $A$, we denote by $|A|$ the number of elements of $A$. For a probability space $(\mathcal{X}, \mathcal{F}, \mathbb{P})$ and a constant $p \in [1, \infty)$, $L_p(\mathcal{X}, \mathcal{F}, \mathbb{P})$ denotes the vector space of real-valued random variables $X$ for which $\mathbb{E}|X|^p < \infty$. The Boolean operators are denoted by $\neg$ (negation), $\wedge$ (conjunction), and $\vee$ (disjunction). For a risk measure or a function $\rho$, we denote by $\rho^t$ the composition of $\rho$ with itself $t$ times.
II. COHERENT RISK MEASURES
The goal of this section is to introduce conditional risk measures with a view toward defining risk control barrier functions in subsequent sections. In this context, consider a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, a filtration $\mathcal{F}_0 \subset \cdots \subset \mathcal{F}_N \subset \mathcal{F}$, and an adapted sequence of random variables $h_t$, $t = 0, \ldots, N$, where $N \in \mathbb{N}_{\ge 0} \cup \{\infty\}$. For $t = 0, \ldots, N$, we further define the spaces $\mathcal{H}_t = L_p(\Omega, \mathcal{F}_t, \mathbb{P})$, $p \in [1, \infty)$, $\mathcal{H}_{t:N} = \mathcal{H}_t \times \cdots \times \mathcal{H}_N$, and $\mathcal{H} = \mathcal{H}_0 \times \mathcal{H}_1 \times \cdots$. We assume that the sequence $h \in \mathcal{H}$ is almost surely bounded (with exceptions having probability zero), i.e., $\operatorname{ess\,sup}_t |h_t(\omega)| < \infty$.
In order to describe how one can evaluate the risk of the sub-sequence $h_t, \ldots, h_N$ from the perspective of stage $t$, we require the following definitions.
Definition 1 (Conditional Risk Measure). A mapping $\rho_{t:N} : \mathcal{H}_{t:N} \to \mathcal{H}_t$, where $0 \le t \le N$, is called a conditional risk measure if it has the following monotonicity property:
$$\rho_{t:N}(h) \le \rho_{t:N}(h'), \quad \forall h, \forall h' \in \mathcal{H}_{t:N} \text{ such that } h \preceq h'.$$
A dynamic risk measure is a sequence of conditional risk measures $\rho_{t:N} : \mathcal{H}_{t:N} \to \mathcal{H}_t$, $t = 0, \ldots, N$.
One fundamental property of dynamic risk measures is their consistency over time [27, Definition 3]. That is, if $h$ will be as good as $h'$ from the perspective of some future time $\theta$, and they are identical between times $\tau$ and $\theta$, then $h$ should not be worse than $h'$ from the perspective at time $\tau$. If a risk measure is time-consistent, we can define the one-step conditional risk measure $\rho_t : \mathcal{H}_t \to \mathcal{H}_{t-1}$, $t = 0, \ldots, N-1$, as follows:
$$\rho_t(h_t) = \rho_{t-1,t}(0, h_t), \tag{1}$$
and for all $t = 1, \ldots, N$, we obtain:
$$\rho_{t,N}(h_t, \ldots, h_N) = \rho_t\Big(h_t + \rho_{t+1}\big(h_{t+1} + \rho_{t+2}(h_{t+2} + \cdots + \rho_{N-1}(h_{N-1} + \rho_N(h_N)) \cdots)\big)\Big). \tag{2}$$
Note that the time-consistent risk measure is completely defined by the one-step conditional risk measures $\rho_t$, $t = 0, \ldots, N-1$, and, in particular, for $t = 0$, (2) defines a risk measure of the entire sequence $h \in \mathcal{H}_{0:N}$. This leads to the notion of a coherent risk measure.
Definition 2 (Coherent Risk Measure). We call the one-step conditional risk measures $\rho_t : \mathcal{H}_{t+1} \to \mathcal{H}_t$, $t = 1, \ldots, N-1$, as in (2) a coherent risk measure if they satisfy the following conditions:
• Convexity: $\rho_t(\lambda h + (1-\lambda) h') \le \lambda \rho_t(h) + (1-\lambda) \rho_t(h')$, for all $\lambda \in (0, 1)$ and all $h, h' \in \mathcal{H}_t$;
• Monotonicity: If $h \le h'$ then $\rho_t(h) \le \rho_t(h')$ for all $h, h' \in \mathcal{H}_t$;
• Translational Invariance: $\rho_t(h + h') = h + \rho_t(h')$ for all $h \in \mathcal{H}_{t-1}$ and $h' \in \mathcal{H}_t$;
• Positive Homogeneity: $\rho_t(\beta h) = \beta \rho_t(h)$ for all $h \in \mathcal{H}_t$ and $\beta \ge 0$.
All risk measures studied in this paper are time-consistent
coherent risk measures. Concretely, we briefly review two
examples of coherent risk measures.
Total Conditional Expectation: The simplest risk measure is the total conditional expectation given by
$$\rho_t(h_t) = \mathbb{E}\left[h_t \mid \mathcal{F}_{t-1}\right]. \tag{3}$$
It is easy to see that total conditional expectation satisfies the properties of a coherent risk measure as outlined in Definition 2. Unfortunately, total conditional expectation is agnostic to realization fluctuations of the stochastic variable $h$ and is only concerned with the mean value of $h$ over a large number of realizations. Thus, it is a risk-neutral measure of performance.
Conditional Value-at-Risk: Let $h \in \mathcal{H}$ be a stochastic variable for which higher values are of interest$^{1}$. For a given confidence level $\beta \in (0, 1)$, value-at-risk ($\mathrm{VaR}_\beta$) denotes the $\beta$-quantile value of a stochastic variable $h \in \mathcal{H}$ described as
$$\mathrm{VaR}_\beta(h) = \sup_{\zeta \in \mathbb{R}} \{\zeta \mid \mathbb{P}(h \le \zeta) \le \beta\}.$$
Unfortunately, working with VaR for non-normal stochastic variables is numerically unstable, optimizing models involving VaR is intractable in high dimensions, and VaR ignores the values of $h$ with probability less than $\beta$ [28].
$^{1}$ For example, greater values of $h$ indicate safer performance, as will be discussed in the next section.
In contrast, CVaR overcomes the shortcomings of VaR. CVaR with confidence level $\beta \in (0, 1)$, denoted $\mathrm{CVaR}_\beta$, measures the expected loss in the $\beta$-tail given that the particular threshold $\mathrm{VaR}_\beta$ has been crossed, i.e., $\mathrm{CVaR}_\beta(h) = \mathbb{E}[h \mid h \le \mathrm{VaR}_\beta(h)]$. An optimization formulation for CVaR was proposed in [28] that we use in this paper. That is, $\mathrm{CVaR}_\beta$ is given by
$$\mathrm{CVaR}_\beta(h) := -\inf_{\zeta \in \mathbb{R}} \mathbb{E}\left[\zeta + \frac{(-h - \zeta)_+}{\beta}\right]. \tag{4}$$
Note that the above formulation of CVaR is concerned with the left-tail of distributions (higher values of $h$ are preferred). A value of $\beta \to 1$ corresponds to a risk-neutral case, i.e., $\mathrm{CVaR}_1(h) = \mathbb{E}(h)$; whereas, a value of $\beta \to 0$ is rather a risk-averse case, i.e., $\mathrm{CVaR}_0(h) = \mathrm{VaR}_0(h) = \operatorname{ess\,inf}(h)$ [29]. Figure 1 illustrates these notions for an example $h$ variable with distribution $p(h)$.
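As a minimal sketch of how (4) can be evaluated when $h$ takes finitely many values, the following Python snippet computes the left-tail CVaR of a discrete distribution. The function name `cvar_left_tail` and the example numbers are illustrative assumptions and do not come from the paper. The sketch relies on the fact that the objective in (4) is piecewise linear and convex in $\zeta$, so for a finite distribution its infimum is attained at one of the kinks $\zeta = -h_i$.

```python
import numpy as np

def cvar_left_tail(values, probs, beta):
    """Left-tail CVaR of a finite distribution, evaluated via formula (4).

    values: possible outcomes h_i; probs: their probabilities (summing to 1);
    beta: confidence level in (0, 1].
    """
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    # Candidate minimizers: the kinks zeta = -h_i of the convex objective.
    zetas = -values
    # excess[j, i] = max(-h_i - zeta_j, 0)
    excess = np.maximum(-values[None, :] - zetas[:, None], 0.0)
    objective = zetas + excess @ probs / beta
    return -objective.min()

# Illustrative distribution: uniform over four outcomes.
h_values = [-1.0, 0.0, 1.0, 2.0]
h_probs = [0.25, 0.25, 0.25, 0.25]
for beta in (1.0, 0.5, 0.25):
    print(beta, cvar_left_tail(h_values, h_probs, beta))
```

For this uniform example, $\beta = 1$ recovers the mean $0.5$, $\beta = 0.5$ gives the average of the two worst outcomes, $-0.5$, and $\beta = 0.25$ gives the single worst outcome, $-1$, in line with the risk-neutral and risk-averse limits described above.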
III. RISK-SENSITIVE SAFETY AND REACHABILITY
We assume the robot dynamics of interest is described by a discrete-time stochastic system given by
$$x_{t+1} = f(x_t, u_t, w_t), \quad x_0 = x_0, \tag{5}$$
where $t \in \mathbb{N}_{\ge 0}$ denotes the time index, $x \in \mathcal{X} \subset \mathbb{R}^n$ is the state, $u \in \mathcal{U} \subset \mathbb{R}^m$ is the control input, $w \in \mathcal{W}$ is the stochastic uncertainty/disturbance, and the function $f : \mathbb{R}^n \times \mathcal{U} \times \mathcal{W} \to \mathbb{R}^n$. We assume that the initial condition $x_0$ is deterministic and that $|\mathcal{W}|$ is finite, i.e., $\mathcal{W} = \{w_1, \ldots, w_{|\mathcal{W}|}\}$. At every time-step $t$, for a state-control pair $(x_t, u_t)$, the process disturbance $w_t$ is drawn from the set $\mathcal{W}$ according to the probability mass function $p(w) = [p(w_1), \ldots, p(w_{|\mathcal{W}|})]^T$, where $p(w_i) := \mathbb{P}(w_t = w_i)$, $i = 1, 2, \ldots, |\mathcal{W}|$. Note that the probability mass function for the process disturbance is time-invariant, and that the process disturbance is independent of the process history and of the state-control pair $(x_t, u_t)$.
Note that, in particular, system (5) can capture stochastic hybrid systems, such as Markovian Jump Systems [30].
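We are interested in studying the properties of the solutions to (5) with respect to the compact set $\mathcal{S}$ described by:
$$\begin{aligned} \mathcal{S} &:= \{x \in \mathcal{X} \mid h(x) \ge 0\}, \\ \operatorname{Int}(\mathcal{S}) &:= \{x \in \mathcal{X} \mid h(x) > 0\}, \\ \partial\mathcal{S} &:= \{x \in \mathcal{X} \mid h(x) = 0\}, \end{aligned} \tag{6}$$
where $h : \mathcal{X} \to \mathbb{R}$ is a continuous function.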
In the presence of stochastic uncertainty $w$, assuring almost sure (with probability one) invariance or safety may not
be feasible. Moreover, enforcing safety in expectation is only
meaningful if the law of large numbers can be invoked and
we are interested in the long term performance, independent
of the realization fluctuations. In this work, instead, we
propose safety in the dynamic coherent risk measure sense
with conditional expectation as a special case.
Definition 3 ($\rho$-Safety). Given a safe set $\mathcal{S}$ as given in (6) and a time-consistent, dynamic coherent risk measure $\rho_{0:t}$ as described in (2), we call the solutions to (5), starting at $x_0 \in \mathcal{S}$, $\rho$-safe if and only if
$$\rho_{0,t}(0, 0, \ldots, h(x_t)) \ge 0, \quad \forall t \in \mathbb{N}_{\ge 0}. \tag{7}$$
In order to understand (7), consider the case where $\rho$ is the conventional total expectation. Then, (7) implies safety in expectation. As mentioned earlier, the definition of safety for general coherent risk measures goes beyond the traditional total expectation.
Another interesting property we study in this paper arises when $x_0 \in \mathcal{X} \setminus \mathcal{S}$. That is, when instead of safety, we are interested in reaching a set of interest in finite time.
Definition 4 ($\rho$-Reachability). Consider system (5) with initial condition $x_0 \in \mathcal{X} \setminus \mathcal{S}$. Given a set $\mathcal{S}$ as given in (6) and a time-consistent, dynamic coherent risk measure $\rho_{0:t}$ as described in (2), we call the set $\mathcal{S}$ $\rho$-reachable if and only if there exists a constant $t^*$ such that
$$\rho_{0,t^*}(0, 0, \ldots, h(x_{t^*})) \ge 0. \tag{8}$$
IV. RISK CONTROL BARRIER FUNCTIONS
In order to check and enforce risk-sensitive safety, i.e., $\rho$-safety, we introduce risk control barrier functions. We then extend these to a finite-time variation, which allows us to establish risk-sensitive reachability, i.e., $\rho$-reachability.
A. Risk Sensitive Safety with RCBFs
Definition 5 (Risk Control Barrier Function). For the discrete-time system (5) and a dynamic coherent risk measure $\rho$, the continuous function $h : \mathbb{R}^n \to \mathbb{R}$ is a risk control barrier function for the set $\mathcal{S}$ as defined in (6), if there exists a convex $\alpha \in \mathcal{K}$ satisfying $\alpha(r) < r$ for all $r > 0$ such that
$$\rho(h(x_{t+1})) \ge \alpha(h(x_t)), \quad \forall x_t \in \mathcal{X}. \tag{9}$$
Note that a simple choice for the function $\alpha$ is the linear function $\alpha(r) = \alpha_0 r$, where $\alpha_0 \in (0, 1)$ is a constant.
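To make condition (9) concrete, the sketch below checks it for a hypothetical scalar system with a finite disturbance set, and then picks the input closest to a desired (legacy) input among a grid of candidates that satisfy it. Everything here is an illustrative assumption rather than the setup used in the paper: the toy dynamics `f`, the barrier `h`, the disturbance pmf, the linear choice $\alpha(r) = \alpha_0 r$, the helper names, and the use of a grid search in place of a nonlinear programming solver.

```python
import numpy as np

# Hypothetical toy problem (not from the paper): scalar state, additive disturbance.
W = np.array([-0.1, 0.0, 0.1])   # finite disturbance set
P = np.array([0.25, 0.5, 0.25])  # pmf p(w_i)

def f(x, u, w):
    """Toy dynamics x_{t+1} = x_t + u_t + w_t."""
    return x + u + w

def h(x):
    """Toy barrier: the safe set is {x : 1 - x >= 0}."""
    return 1.0 - x

def cvar_left_tail(values, probs, beta):
    """Left-tail CVaR of a finite distribution via formula (4)."""
    values = np.asarray(values, dtype=float)
    zetas = -values  # kinks of the piecewise-linear objective
    excess = np.maximum(-values[None, :] - zetas[:, None], 0.0)
    return -np.min(zetas + excess @ np.asarray(probs, dtype=float) / beta)

def rcbf_holds(x, u, beta, alpha0):
    """Check rho(h(x_{t+1})) >= alpha0 * h(x_t) with rho = CVaR_beta."""
    next_h = h(f(x, u, W))  # h(x_{t+1}) for every disturbance value
    return cvar_left_tail(next_h, P, beta) >= alpha0 * h(x)

def filter_input(x, u_des, beta, alpha0, grid=None):
    """Return the grid input closest to u_des that satisfies (9), or None."""
    if grid is None:
        grid = np.linspace(-1.0, 1.0, 201)
    feasible = [u for u in grid if rcbf_holds(x, u, beta, alpha0)]
    if not feasible:
        return None
    return float(min(feasible, key=lambda u: abs(u - u_des)))

# Near the boundary, a more risk-averse beta yields a more conservative input.
print(filter_input(0.805, u_des=0.5, beta=1.0, alpha0=0.5))
print(filter_input(0.805, u_des=0.5, beta=0.25, alpha0=0.5))
```

Setting `beta=1.0` reduces the check to the expectation-based condition, while smaller values of `beta` only admit inputs whose worst-case tail of $h(x_{t+1})$ still satisfies the bound.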
In the first main contribution of the paper, we demonstrate
that the existence of an RCBF implies invariance/safety in
the coherent risk measure.
Theorem 6. Consider the discrete-time system (5) and the set $\mathcal{S}$ as described in (6). Let $\rho$ be a given coherent risk measure. Then, $\mathcal{S}$ is $\rho$-safe if there exists an RCBF as defined in Definition 5.
Proof. The proof is carried out by induction and using the properties of a coherent risk measure as outlined in Definition 2. If (9) holds, for $t = 0$, we have
$$\rho(h(x_1)) \ge \alpha(h(x_0)). \tag{10}$$
Similarly, for $t = 1$, we have
$$\rho(h(x_2)) \ge \alpha(h(x_1)). \tag{11}$$
Since $\rho$ is monotone, composing both sides of (11) with $\rho$ does not change the inequality and we obtain
$$\rho \circ \rho(h(x_2)) \ge \rho(\alpha(h(x_1))). \tag{12}$$
Since $\alpha$ is a convex function, from Theorem 3 in [31] (Jensen's Inequality for coherent risk measures), we obtain$^{2}$
$$\rho \circ \rho(h(x_2)) \ge \rho(\alpha(h(x_1))) \ge \alpha(\rho(h(x_1))).$$
Then, using inequality (10), we have
$$\rho \circ \rho(h(x_2)) \ge \alpha(\rho(h(x_1))) \ge \alpha \circ \alpha(h(x_0)).$$
Therefore, by induction, at time $t$, we can show that $\rho^t(h(x_t)) \ge \alpha^t(h(x_0))$. The left-hand side of the above inequality is equal to $\rho_{0,t}(0, \ldots, h(x_t))$. Hence,
$$\rho_{0,t}(0, \ldots, h(x_t)) \ge \alpha^t(h(x_0)). \tag{13}$$
If $x_0 \in \mathcal{S}$, from the definition of the set $\mathcal{S}$, we have $h(x_0) \ge 0$. Since $\alpha \in \mathcal{K}$, we can infer that (7) holds. Thus, the system is $\rho$-safe.
$^{2}$ In particular, if $\alpha \in (0, 1)$ is a constant, from the positive homogeneity property of $\rho$, we have $\rho \circ \rho(h(x_2)) \ge \rho(\alpha h(x_1)) = \alpha \rho(h(x_1))$.
Note that, in the case when $x_0 \in \mathcal{X} \setminus \mathcal{S}$, the existence of an RCBF implies asymptotic convergence to the set $\mathcal{S}$ in the coherent risk measure $\rho$. This can be inferred from (13). In fact, if $\alpha(r) < r$, then there exists a constant $\delta \in (0, 1)$ such that $\alpha(r) \le \delta r$ and hence
$$\alpha^t(r) \le \delta^t r, \quad t \in \mathbb{N}_{\ge 0}. \tag{14}$$
If $x_0 \in \mathcal{X} \setminus \mathcal{S}$, then $h(x_0) < 0$. However, from (14), as $t \to \infty$, $\alpha \circ \cdots \circ \alpha(r) \to 0$, since the composition of class $\mathcal{K}$ functions is also class $\mathcal{K}$ (hence non-negative). We then obtain $\rho_{0,t}(0, \ldots, h(x_t)) \ge 0$, which implies that the solutions become $\rho$-safe.
B. Risk Sensitive Safety with Finite-time RCBFs
In practice, we are often interested in satisfying system
specifications characterized by the set
S
in finite time. To
this end, we define finite-time RCBFs.
Definition 7 (Finite-Time RCBF). For the discrete-time system (5) and a dynamic coherent risk measure $\rho$, the continuous function $h : \mathcal{X} \to \mathbb{R}$ is a finite-time RCBF for the set $\mathcal{S}$ as defined in (6), if there exist constants $0 < \gamma < 1$ and $\varepsilon > 0$ such that
$$\rho(h(x_{t+1})) - \gamma h(x_t) \ge \varepsilon(1 - \gamma), \quad \forall x_t \in \mathcal{X}. \tag{15}$$
In the second key contribution of the paper, we show that the existence of a finite-time RCBF implies $\rho$-reachability.
Theorem 8. Consider the discrete-time system (5) and a dynamic coherent risk measure $\rho$. Let $\mathcal{S} \subset \mathcal{X}$ be as described in (6). If there exists a finite-time RCBF $h : \mathcal{X} \to \mathbb{R}$ as in Definition 7, then for all $x_0 \in \mathcal{X} \setminus \mathcal{S}$, there exists a $t^* \in \mathbb{N}_{\ge 0}$ such that $\mathcal{S}$ is $\rho$-reachable, i.e., inequality (8) holds. Furthermore,
$$t^* \le \log\left(\frac{\varepsilon - h(x_0)}{\varepsilon}\right) \Big/ \log\left(\frac{1}{\gamma}\right), \tag{16}$$
where the constants $\gamma$ and $\varepsilon$ are as defined in Definition 7.
Proof. Similar to the proof of Theorem 6, we proceed by induction and use the properties of coherent risk measures. From (15), we have
$$\rho(h(x_{t+1})) - \varepsilon \ge \gamma h(x_t) - \gamma\varepsilon = \gamma(h(x_t) - \varepsilon).$$
Hence, for $t = 0$, we have
$$\rho(h(x_1)) - \varepsilon \ge \gamma(h(x_0) - \varepsilon). \tag{17}$$
For $t = 1$, we have
$$\rho(h(x_2)) - \varepsilon \ge \gamma(h(x_1) - \varepsilon). \tag{18}$$
Since $\rho$ is monotone, composing both sides of the above inequality with $\rho$ does not change the inequality and we obtain
$$\rho \circ \rho(h(x_2) - \varepsilon) \ge \rho(\gamma(h(x_1) - \varepsilon)) = \gamma\rho(h(x_1) - \varepsilon),$$
where in the last equality we used the positive homogeneity property of $\rho$ since $\gamma \in (0, 1)$. Since $\varepsilon > 0$ is a constant, the translational invariance property of $\rho$ yields
$$\rho \circ \rho(h(x_2)) - \varepsilon \ge \gamma(\rho(h(x_1)) - \varepsilon).$$
Moreover, from inequality (17), we infer
$$\rho \circ \rho(h(x_2)) - \varepsilon \ge \gamma(\rho(h(x_1)) - \varepsilon) \ge \gamma^2(h(x_0) - \varepsilon).$$
Thus, by induction, we see that at time step $t$, the following inequality holds
$$\rho^t(h(x_t)) - \varepsilon \ge \gamma^t(h(x_0) - \varepsilon).$$
Taking $\varepsilon$ to the right-hand side and noting that the left-hand side of the above inequality is equal to $\rho_{0,t}(0, \ldots, h(x_t))$, we have the following inequality
$$\rho_{0,t}(0, \ldots, h(x_t)) \ge \gamma^t(h(x_0) - \varepsilon) + \varepsilon. \tag{19}$$
Since $0 < \gamma < 1$ and $x_0 \in \mathcal{X} \setminus \mathcal{S}$, i.e., $h(x_0) < 0$, as $t$ increases $x_t$ approaches $\mathcal{S}$ in the dynamic risk measure $\rho_{0,t}$, because by definition $h(x_t) \ge 0$ implies $x_t \in \mathcal{S}$. Hence, $\mathcal{S}$ is $\rho$-reachable in finite time. By definition, $x_t$ reaches $\mathcal{S}$ at least at the boundary by $t^*$ when $h(x_t) = 0$. Substituting $h(x_t) = 0$ in (19) yields
$$0 \ge \gamma^{t^*}(h(x_0) - \varepsilon) + \varepsilon, \tag{20}$$
where we used the fact that $\rho_{0,t}(0, \ldots, h(x_{t^*})) = \rho_{0,t}(0, \ldots, 0) = 0$. Rearranging the terms and noting that $h(x_0) \le 0$ and therefore $h(x_0) - \varepsilon \le 0$, we obtain
$$\frac{\varepsilon}{\varepsilon - h(x_0)} \le \gamma^{t^*}.$$
Taking the logarithm of both sides of the above inequality gives
$$\log\left(\frac{\varepsilon}{\varepsilon - h(x_0)}\right) \le t^* \log(\gamma),$$
or equivalently
$$-\log\left(\frac{\varepsilon - h(x_0)}{\varepsilon}\right) \le -t^* \log\left(\frac{1}{\gamma}\right).$$
Since $0 < \gamma < 1$, $\log(\frac{1}{\gamma})$ is a positive number. Dividing both sides of the inequality above by the negative number $-\log(\frac{1}{\gamma})$ gives
$$t^* \le \log\left(\frac{\varepsilon - h(x_0)}{\varepsilon}\right) \Big/ \log\left(\frac{1}{\gamma}\right).$$
The upper bound described by inequality (16) in Theorem 8 depends on the two parameters $\gamma$ and $\varepsilon$. In our experiments, we often fix $0 < \gamma < 1$ and carry out a line search over $\varepsilon$ until the finite-time RCBF condition (15) no longer holds. Then, we pick the corresponding $t^*$ as the upper bound on the earliest time the solutions can enter the goal set $\mathcal{S}$.
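The bound (16) is simple to evaluate numerically. The short sketch below does so; the function name `reach_time_bound` and the convention of returning zero when $h(x_0) \ge 0$ are assumptions made for illustration only.

```python
import math

def reach_time_bound(h0, gamma, eps):
    """Upper bound (16) on t* for a finite-time RCBF with constants gamma, eps.

    h0 is the initial barrier value h(x_0); the bound is meant for h0 < 0.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie in (0, 1)")
    if eps <= 0.0:
        raise ValueError("eps must be positive")
    if h0 >= 0.0:
        # Assumed convention: the initial state is already in the set.
        return 0.0
    return math.log((eps - h0) / eps) / math.log(1.0 / gamma)

# Values reported in the simulation section: gamma = 0.05, eps = 0.1, h(x_0) = -0.2.
bound = reach_time_bound(-0.2, 0.05, 0.1)
print(round(bound, 4))
```

With $\gamma = 0.05$, $\varepsilon = 0.1$, and $h(x_0) = -0.2$, this evaluates to $\log 3 / \log 20 \approx 0.3667$. At this value of $t$ the right-hand side of (19), $\gamma^{t}(h(x_0) - \varepsilon) + \varepsilon$, is exactly zero.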
Fig. 2: Simulation results for the cart-pole system with no RCBF filter, and with standard RCBF (top) and finite-time RCBF
(bottom) filters using total conditional expectation and CVaR.
C. Boolean Compositions of RCBFs
We have proposed RCBFs and finite-time RCBFs as means to verify $\rho$-safety and $\rho$-reachability, respectively. We now propose conditions for verifying $\rho$-safety and $\rho$-reachability for Boolean compositions of several control barrier functions [24], [32], [33]. We omit proofs due to space constraints.
Proposition 1. Let $\mathcal{S}_i = \{x \in \mathbb{R}^n \mid h_i(x) \ge 0\}$, $i = 1, \ldots, k$, denote a family of safe sets with the boundaries and interior defined analogously to $\mathcal{S}$ in (6), and let $\rho$ be a given dynamic coherent risk measure. Consider the discrete-time system (5). If there exists an $\alpha \in (0, 1)$ such that
$$\rho\left(\min_{i=1,\ldots,k} h_i(x_{t+1})\right) \ge \alpha \min_{i=1,\ldots,k} h_i(x_t), \tag{21}$$
then the set $\{x \in \mathbb{R}^n \mid \wedge_{i=1,\ldots,k}(h_i(x) \ge 0)\}$ is $\rho$-safe. Similarly, if there exists an $\alpha \in (0, 1)$ such that
$$\rho\left(\max_{i=1,\ldots,k} h_i(x_{t+1})\right) \ge \alpha \max_{i=1,\ldots,k} h_i(x_t), \tag{22}$$
then the set $\{x \in \mathbb{R}^n \mid \vee_{i=1,\ldots,k}(h_i(x) \ge 0)\}$ is $\rho$-safe.
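The min and max compositions in (21) and (22) can be sketched in a few lines. In the Python snippet below, the helper names, the two example barrier families, the toy dynamics, and the use of plain expectation as the risk measure (the $\beta = 1$ case) are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

def compose_and(barriers):
    """Conjunction: min_i h_i(x) >= 0 exactly when every h_i(x) >= 0."""
    return lambda x: min(h(x) for h in barriers)

def compose_or(barriers):
    """Disjunction: max_i h_i(x) >= 0 exactly when some h_i(x) >= 0."""
    return lambda x: max(h(x) for h in barriers)

# Hypothetical scalar examples.
h_box = compose_and([lambda x: x + 1.0, lambda x: 1.0 - x])  # -1 <= x <= 1
h_out = compose_or([lambda x: x - 2.0, lambda x: -2.0 - x])  # x >= 2 or x <= -2

def condition_21(h_comp, x, u, f, W, P, alpha):
    """Check (21) for a composed barrier, with rho taken as the expectation."""
    next_vals = np.array([h_comp(f(x, u, w)) for w in W])
    return float(np.dot(P, next_vals)) >= alpha * h_comp(x)

# Toy dynamics x_{t+1} = x + u + w with two equally likely disturbances.
f = lambda x, u, w: x + u + w
W, P = [-0.1, 0.1], [0.5, 0.5]
print(condition_21(h_box, 0.0, 0.0, f, W, P, alpha=0.5))
print(condition_21(h_box, 0.0, 0.8, f, W, P, alpha=0.5))
```

Because the composed function is again a scalar-valued barrier, the same one-step check used for a single RCBF applies unchanged; only the function passed in differs.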
We next propose conditions for risk-sensitive finite-time reachability of sets composed of Boolean compositions of several functions $h$ as described in (6).
Proposition 2. Let $\mathcal{S}_i = \{x \in \mathbb{R}^n \mid h_i(x) \ge 0\}$, $i = 1, \ldots, k$, denote a family of sets with the boundaries and interior defined analogously to $\mathcal{S}$ in (6), and let $\rho$ be a given dynamic coherent risk measure. Consider the discrete-time system (5). If there exist constants $0 < \gamma < 1$ and $\varepsilon > 0$ such that
$$\rho\left(\min_{i=1,\ldots,k} h_i(x_{t+1})\right) - \gamma \min_{i=1,\ldots,k} h_i(x_t) \ge \varepsilon(1 - \gamma), \tag{23}$$
then the set $\{x \in \mathbb{R}^n \mid \wedge_{i=1,\ldots,k}(h_i(x) \ge 0)\}$ is $\rho$-reachable. Then, there exists a constant $t^*$ satisfying
$$t^* \le \log\left(\frac{\varepsilon - \min_{i=1,\ldots,k} h_i(x_0)}{\varepsilon}\right) \Big/ \log\left(\frac{1}{\gamma}\right), \tag{24}$$
such that if $x_0 \in \mathcal{X} \setminus \cup_{i=1,\ldots,k} \mathcal{S}_i$ then $x_{t^*} \in \cap_{i=1,\ldots,k} \mathcal{S}_i$. Similarly, the disjunction case follows by replacing $\min$ with $\max$ in (23) and (24).
V. SIMULATION RESULTS
In order to illustrate the results of these risk-aware guar-
antees, we apply our method in the case of the cart-pole,
modeled as a nonlinear, control-affine discrete-time system.
$$x_{t+1} = x_t + \begin{bmatrix} v_x \\ \dot{\theta} \\ \dfrac{u_t + m_p s_\theta (l \dot{\theta}^2 + g c_\theta)}{m_c + m_p s_\theta^2} \\ \dfrac{-u_t c_\theta - m_p l \dot{\theta}^2 c_\theta s_\theta - (m_c + m_p) g s_\theta}{l (m_c + m_p s_\theta^2)} \end{bmatrix} \Delta t + w_t \tag{25}$$
The disturbance $w_t \in \mathcal{W}$ enters the system linearly, and is described by a pmf over the states. This could include the modeling error from this Euler-approximated discrete-time model, but in this case, it is a simple pmf normally distributed around $0$ with standard deviation $\sigma = \{0.05, 0.05, 0.2, 0.2\}$ for the four states $x = [p_x, \theta, v_x, \dot{\theta}]$.
The safety set is described by
$$h(x_t) = -2 a_{\max} p_x^t - (v_x^t)^2 \operatorname{sgn}(v_x^t), \tag{26}$$
where $a_{\max} > 0$ is a tuneable parameter that designates the maximum linear acceleration at any point. This function is positive when $p_x < 0$, but allows $h(x_t) > 0$ when $p_x > 0$ if $v_x$ is sufficiently negative.
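For readers who want to experiment with the model, the following Python sketch implements one step of (25) and the barrier (26). It reads $s_\theta$ and $c_\theta$ as $\sin\theta$ and $\cos\theta$, which is an assumption, and every numerical value (masses, pole length, time step, $a_{\max}$) is a placeholder chosen for illustration, since the text does not list the parameters used in the experiments.

```python
import numpy as np

def cartpole_step(x, u, w, dt=0.01, m_c=1.0, m_p=0.1, l=0.5, g=9.81):
    """One Euler step of (25); state ordering x = [p_x, theta, v_x, theta_dot].

    All default parameter values are assumed placeholders.
    """
    _, theta, v, omega = x
    s, c = np.sin(theta), np.cos(theta)
    denom = m_c + m_p * s**2
    v_dot = (u + m_p * s * (l * omega**2 + g * c)) / denom
    omega_dot = (-u * c - m_p * l * omega**2 * c * s
                 - (m_c + m_p) * g * s) / (l * denom)
    drift = np.array([v, omega, v_dot, omega_dot])
    return np.asarray(x, dtype=float) + drift * dt + np.asarray(w, dtype=float)

def barrier(x, a_max=1.0):
    """Barrier (26): h = -2 a_max p_x - v_x^2 sgn(v_x)."""
    p, _, v, _ = x
    return -2.0 * a_max * p - v**2 * np.sign(v)

x0 = np.array([-1.0, 0.0, 0.0, 0.0])
x1 = cartpole_step(x0, u=1.0, w=np.zeros(4))
print(x1, barrier(x0), barrier(x1))
```

Evaluating `barrier` on the successor state for every disturbance value in $\mathcal{W}$ gives the samples of $h(x_{t+1})$ over which the risk measure in (9) or (15) is taken.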
While this safety set is nonlinear in the control inputs,
the one-step nature of this optimization problem results in
no issues solving such a program in real-time, using modern
solvers such as IPOPT or NLOPT. In future work, we plan
to show how nonlinear CBFs can be linearized to result in
an affine RCBF constraint, with the error included in the
stochastic uncertainty to result in formal safety guarantees.
The RCBF program was solved using PAGMO's integrated SLSQP solver from NLOPT. Each solution took roughly 0.7 ms to compute on a modern laptop, resulting in a maximum control frequency of 1428 Hz. Three trajectories are shown in Figure 2. The desired trajectory shows the trajectory with only the nominal controller, which clearly leaves the safe set at $x = 0$. The trajectory corresponding to $\mathbb{E}[h]$ was filtered subject to the total conditional expectation coherent risk measure, which also corresponds to CVaR with $\beta = 1$. While this filter guarantees safety in expectation, safety is frequently violated due to the stochastic uncertainty. Finally, the trajectory corresponding to CVaR with $\beta = 0.01$ results in safety over the entire trajectory.
Similarly, Figure 2 also demonstrates the same three trajectories with the finite-time reachability RCBF. Specifically, we utilize constants $\gamma = 0.05$ and $\varepsilon = 0.1$, with an initial safety violation of $h(x_0) = -0.2$. From (16), this suggests a $t^* \le 0.3667$ s. While this is not reflected in the plot, which only shows $p_x^t$ rather than $h(x_t)$, we find that $h(x_{t^*}) > 0$ at $t^* = 0.08$ s, well below the theoretical guarantee.
VI. CONCLUSIONS
In this paper, we propose Risk Control Barrier Functions
(RCBFs) as a means to enforce safety in the presence
of stochastic uncertainty. We demonstrate theoretically that
these RCBFs guarantee safety with respect to dynamic coher-
ent risk measures, which serve as a computationally efficient
means to assess risk. Moreover, we proved that finite-time
RCBFs can be utilized to guarantee convergence to a set in
finite time, resulting in a practical safety filter that works both
inside and outside of the safe set. We also demonstrated how
multiple safe sets can be enforced simultaneously utilizing
Boolean compositions. Finally, we demonstrated the efficacy
of this framework on the nonlinear cart-pole system under
stochastic uncertainty.
REFERENCES
[1] S. Thrun, W. Burgard, and D. Fox,
Probabilistic robotics
. Cambridge,
Mass.: MIT Press, 2005.
[2] J. Reher and A. D. Ames, “Dynamic walking: Toward agile and
efficient bipedal robots,”
Annual Reviews
, 2020.
[3] T. Rouček, M. Pecka, P. Čížek, T. Petříček, J. Bayer, V. Šalanský,
D. Heřt, M. Petrlík, T. Báča, V. Spurný, et al., “Darpa subterranean
challenge: Multi-robotic exploration of underground environments,”
in International Conference on Modelling and Simulation for
Autonomous Systems, pp. 274–290, Springer, 2019.
[4] M. Ono, M. Pavone, Y. Kuwata, and J. Balaram, “Chance-constrained
dynamic programming with application to risk-aware robotic space
exploration,”
Autonomous Robots
, vol. 39, no. 4, pp. 555–571, 2015.
[5] A. Wang, A. M. Jasour, and B. Williams, “Non-gaussian chance-
constrained trajectory planning for autonomous vehicles under agent
uncertainty,”
IEEE Robotics and Automation Letters
, 2020.
[6] S. Koenig and R. G. Simmons, “Risk-sensitive planning with proba-
bilistic decision graphs,” in
Principles of Knowledge Representation
and Reasoning
, pp. 363–373, Elsevier, 1994.
[7] H. Xu and S. Mannor, “Distributionally robust Markov decision
processes,” in
Advances in Neural Information Processing Systems
,
pp. 2505–2513, 2010.
[8] A. Majumdar and M. Pavone, “How should a robot assess risk?
towards an axiomatic theory of risk in robotics,” in
Robotics Research
,
pp. 75–84, Springer, 2020.
[9] P. Artzner, F. Delbaen, J. Eber, and D. Heath, “Coherent measures of
risk,”
Mathematical finance
, vol. 9, no. 3, pp. 203–228, 1999.
[10] Y. Chow, A. Tamar, S. Mannor, and M. Pavone, “Risk-sensitive and
robust decision-making: a cvar optimization approach,” in
Advances
in Neural Information Processing Systems
, pp. 1522–1530, 2015.
[11] Y. Chow and M. Ghavamzadeh, “Algorithms for cvar optimization
in mdps,” in
Advances in neural information processing systems
,
pp. 3509–3517, 2014.
[12] L. Prashanth, “Policy gradients for cvar-constrained mdps,” in
Inter-
national Conference on Algorithmic Learning Theory
, pp. 155–169,
Springer, 2014.
[13] N. Bäuerle and J. Ott, “Markov decision processes with
average-value-at-risk criteria,” Mathematical Methods of Operations Research,
vol. 74, no. 3, pp. 361–379, 2011.
[14] S. Singh, Y. Chow, A. Majumdar, and M. Pavone, “A framework for
time-consistent, risk-sensitive model predictive control: Theory and
algorithms,”
IEEE Transactions on Automatic Control
, 2018.
[15] M. P. Chapman, J. Lacotte, A. Tamar, D. Lee, K. M. Smith, V. Cheng,
J. F. Fisac, S. Jha, M. Pavone, and C. J. Tomlin, “A risk-sensitive finite-
time reachability approach for safety of stochastic dynamic systems,”
in
2019 American Control Conference (ACC)
, pp. 2958–2963, IEEE,
2019.
[16] A. D. Ames, X. Xu, J. W. Grizzle, and P. Tabuada, “Control barrier
function based quadratic programs for safety critical systems,”
IEEE
Transactions on Automatic Control
, vol. 62, no. 8, pp. 3861–3876,
2016.
[17] Q. Nguyen, A. Hereid, J. W. Grizzle, A. D. Ames, and K. Sreenath,
“3D dynamic walking on stepping stones with control barrier func-
tions,” in
Decision and Control (CDC), 2016 IEEE 55th Conference
on
, pp. 827–834, IEEE, 2016.
[18] Y. Chen, A. Hereid, H. Peng, and J. Grizzle, “Enhancing the perfor-
mance of a safe controller via supervised learning for truck lateral
control,”
Journal of Dynamic Systems, Measurement, and Control
,
vol. 141, no. 10, 2019.
[19] X. Xu, P. Tabuada, J. W. Grizzle, and A. D. Ames, “Robustness of con-
trol barrier functions for safety critical control,”
IFAC-PapersOnLine
,
vol. 48, no. 27, pp. 54–61, 2015.
[20] S. Kolathaya and A. D. Ames, “Input-to-state safety with control
barrier functions,”
IEEE control systems letters
, vol. 3, no. 1, pp. 108–
113, 2018.
[21] A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath,
and P. Tabuada, “Control barrier functions: Theory and applications,”
in
2019 18th European Control Conference (ECC)
, pp. 3420–3431,
IEEE, 2019.
[22] M. Ahmadi, A. Singletary, J. W. Burdick, and A. D. Ames, “Safe
policy synthesis in multi-agent pomdps via discrete-time barrier func-
tions,” in
2019 IEEE 58th Conference on Decision and Control (CDC)
,
pp. 4797–4803, IEEE, 2019.
[23] A. Agrawal and K. Sreenath, “Discrete control barrier functions for
safety-critical control of discrete systems with application to bipedal
robot navigation.,” in
Robotics: Science and Systems
, 2017.
[24] M. Ahmadi, A. Singletary, J. W. Burdick, and A. D. Ames, “Barrier
functions for multiagent-pomdps with dtl specifications,” in
The 59th
IEEE Conference on Decision and Control
, 2020.
[25] A. Clark, “Control barrier functions for complete and incomplete
information stochastic systems,” in
2019 American Control Conference
(ACC)
, pp. 2928–2935, IEEE, 2019.
[26] C. Santoyo, M. Dutreix, and S. Coogan, “A barrier function approach
to finite-time stochastic system verification and control,”
arXiv preprint
arXiv:1909.05109
, 2019.
[27] A. Ruszczyński, “Risk-averse dynamic programming for markov
decision processes,” Mathematical programming, vol. 125, no. 2,
pp. 235–261, 2010.
[28] R. T. Rockafellar, S. Uryasev,
et al.
, “Optimization of conditional
value-at-risk,”
Journal of risk
, vol. 2, pp. 21–42, 2000.
[29] R. T. Rockafellar and S. Uryasev, “Conditional value-at-risk for
general loss distributions,”
Journal of banking & finance
, vol. 26, no. 7,
pp. 1443–1471, 2002.
[30] P. Zhao, Y. Kang, and Y.-B. Zhao, “A brief tutorial and survey on
markovian jump systems: Stability and control,”
IEEE Systems, Man,
and Cybernetics Magazine
, vol. 5, no. 2, pp. 37–C3, 2019.
[31] Z. Chen, K. He, R. Kulperger,
et al.
, “Risk measures and nonlinear
expectations,”
Journal of Mathematical Finance
, vol. 3, no. 03, p. 383,
2013.
[32] P. Glotfelter, J. Cortés, and M. Egerstedt, “Nonsmooth barrier
functions with applications to multi-robot systems,” IEEE control systems
letters, vol. 1, no. 2, pp. 310–315, 2017.
[33] M. Ahmadi, A. Israel, and U. Topcu, “Safe controller synthesis for
data-driven differential inclusions,”
IEEE Transactions on Automatic
Control
, 2020.