
Safe Control for Nonlinear Systems with Stochastic Uncertainty via Risk Control Barrier Functions

Andrew Singletary, Mohamadreza Ahmadi, and Aaron D. Ames

Abstract: Guaranteeing safety for robotic and autonomous systems in real-world environments is a challenging task that requires the mitigation of stochastic uncertainties. Control barrier functions have, in recent years, been widely used for enforcing safety-related set-theoretic properties, such as forward invariance and reachability, of nonlinear dynamical systems. In this paper, we extend this rich framework to nonlinear discrete-time systems subject to stochastic uncertainty and propose a framework for assuring risk-sensitive safety in terms of coherent risk measures. To this end, we introduce risk control barrier functions (RCBFs), which are compositions of barrier functions and dynamic, coherent risk measures. We show that the existence of such barrier functions implies invariance in a coherent risk sense. Furthermore, we formulate conditions based on finite-time RCBFs to guarantee finite-time reachability to a desired set in the coherent risk sense. Conditions for risk-sensitive safety and finite-time reachability of sets composed of Boolean compositions of multiple RCBFs are also formulated. We show the efficacy of the proposed method through its application to a cart-pole system in a safety-critical scenario.

I. INTRODUCTION

Autonomous robotic systems are being increasingly deployed in real-world settings where safety is critical. With this transition to practice, the associated risk that stems from unknown and unforeseen circumstances is correspondingly on the rise [1]. In the context of safety-critical scenarios, such as those found in aerospace and human-robot applications, it is essential that decision making accounts for risk. These risks are often associated with uncertainty due to extremely intricate nonlinear dynamics, e.g., bipedal robots [2], and/or extreme unstructured environments, e.g., subterranean or extraterrestrial exploration [3].

Mathematically speaking, risk can be quantified in numerous ways, such as chance constraints [4], [5], exponential utility functions [6], and distributional robustness [7]. However, applications in autonomy and robotics require more “nuanced assessments of risk” [8]. Artzner et al. [9] characterized a set of natural properties that are desirable for a risk measure; a measure satisfying them is called a coherent risk measure, and such measures have obtained widespread acceptance in finance and operations research, among other fields. An important example of a coherent risk measure is the conditional value-at-risk (CVaR), which has received significant attention in decision-making problems, such as Markov decision processes (MDPs) [10]–[13]. For stochastic discrete-time dynamical systems, a model predictive control technique with coherent risk objectives was proposed in [14], wherein the authors also proposed a Lyapunov condition for risk-sensitive exponential stability. Moreover, a method based on stochastic reachability analysis was proposed in [15] to estimate a CVaR-safe set of initial conditions via the solution to an MDP.

The authors are with the Center for Autonomous Systems and Technologies (CAST) at the California Institute of Technology, 1200 E. California Blvd., MC 104-44, Pasadena, CA 91125, e-mail: {asinglet, mrahmadi, ames}@caltech.edu.

Fig. 1: The value of the safe-set function $h(x_t)$ is known at time $t$, but stochastic uncertainty makes $h(x_{t+1})$ a random variable. We must pick $u_t$ such that $h(x_{t+1})$ is safe subject to a risk measure taken over the worst $\beta$ probability.

Our approach to risk-sensitive safety is based on a special class of control barrier functions. Control barrier functions were proposed in [16] and have been used for designing safe controllers (in the absence of a legacy controller, i.e., a desired controller that may be unsafe) and safety filters (in the presence of a legacy controller) for continuous-time dynamical systems, such as bipedal robots [17] and trucks [18], with guaranteed robustness [19], [20] (see the survey [21] and references therein). For discrete-time systems, discrete-time barrier functions were formulated in [22], [23] and applied to the multi-robot coordination problem [24]. Recently, for a class of stochastic (Itô) differential equations, safety in probability and statistical mean was studied in [25], [26] via stochastic barrier functions.

This paper goes beyond the conventional notions of safety in probability and statistical mean through the use of coherent risk measures (as motivated in Section II). To this end, in Section III, for discrete-time systems subject to stochastic uncertainty, we define safety and finite-time reachability in the risk-sensitive sense, i.e., in the context of the worst possible realizations, via coherent risk measures. We then propose risk control barrier functions (RCBFs) in Section IV, together with finite-time RCBFs, as a tool to enforce risk-sensitive safety and reachability, respectively. The main result of this paper establishes that RCBFs ensure safety in a risk-sensitive fashion. Finite-time RCBFs allow for the extension of this result to risk-sensitive reachability. Furthermore, for safe and goal sets defined as Boolean compositions of multiple function level-sets, we propose conditions that ensure safety and reachability of these sets based on RCBFs and their finite-time counterparts. Importantly, in all cases, the risk-sensitive controllers are designed to be minimally invasive with respect to a given system legacy controller. We show the efficacy of our approach in Section V through simulation on a nonlinear cart-pole system (see Figure 1).

arXiv:2203.15892v1 [eess.SY] 29 Mar 2022

Notation: We denote by $\mathbb{R}^n$ the $n$-dimensional Euclidean space and by $\mathbb{N}_{\ge 0}$ the set of non-negative integers. For a finite set $A$, we denote by $|A|$ the number of elements of $A$. For a probability space $(\mathcal{X}, \mathcal{F}, \mathbb{P})$ and a constant $p \in [1, \infty)$, $\mathcal{L}_p(\mathcal{X}, \mathcal{F}, \mathbb{P})$ denotes the vector space of real-valued random variables $X$ for which $\mathbb{E}|X|^p < \infty$. The Boolean operators are denoted by $\neg$ (negation), $\vee$ (disjunction), and $\wedge$ (conjunction). For a risk measure or a function $\rho$, we write $\rho^t$ for the function composition of $\rho$ with itself $t$ times.

II. COHERENT RISK MEASURES

The goal of this section is to introduce conditional risk measures with a view toward defining risk control barrier functions in subsequent sections. In this context, consider a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, a filtration $\mathcal{F}_0 \subset \cdots \subset \mathcal{F}_N \subset \mathcal{F}$, and an adapted sequence of random variables $h_t$, $t = 0,\ldots,N$, where $N \in \mathbb{N}_{\ge 0} \cup \{\infty\}$. For $t = 0,\ldots,N$, we further define the spaces $\mathcal{H}_t = \mathcal{L}_p(\Omega, \mathcal{F}_t, \mathbb{P})$, $p \in [1, \infty)$, $\mathcal{H}_{t:N} = \mathcal{H}_t \times \cdots \times \mathcal{H}_N$ and $\mathcal{H} = \mathcal{H}_0 \times \mathcal{H}_1 \times \cdots$. We assume that the sequence $h \in \mathcal{H}$ is almost surely bounded (with exceptions having probability zero), i.e., $\operatorname{ess\,sup}_t |h_t(\omega)| < \infty$.

In order to describe how one can evaluate the risk of the sub-sequence $h_t,\ldots,h_N$ from the perspective of stage $t$, we require the following definitions.

Definition 1 (Conditional Risk Measure). A mapping $\rho_{t:N} : \mathcal{H}_{t:N} \to \mathcal{H}_t$, where $0 \le t \le N$, is called a conditional risk measure if it has the following monotonicity property:

$$\rho_{t:N}(h) \le \rho_{t:N}(h'), \quad \forall h, \forall h' \in \mathcal{H}_{t:N} \text{ such that } h \preceq h'.$$

A dynamic risk measure is a sequence of conditional risk measures $\rho_{t:N} : \mathcal{H}_{t:N} \to \mathcal{H}_t$, $t = 0,\ldots,N$.

One fundamental property of dynamic risk measures is their consistency over time [27, Definition 3]. That is, if $h$ will be as good as $h'$ from the perspective of some future time $\theta$, and they are identical between times $\tau$ and $\theta$, then $h$ should not be worse than $h'$ from the perspective at time $\tau$. If a risk measure is time-consistent, we can define the one-step conditional risk measure $\rho_t : \mathcal{H}_t \to \mathcal{H}_{t-1}$, $t = 0,\ldots,N-1$, as follows:

$$\rho_t(h_t) = \rho_{t-1,t}(0, h_t), \tag{1}$$

and for all $t = 1,\ldots,N$, we obtain:

$$\rho_{t,N}(h_t,\ldots,h_N) = \rho_t\Big(h_t + \rho_{t+1}\big(h_{t+1} + \rho_{t+2}(h_{t+2} + \cdots + \rho_{N-1}\big(h_{N-1} + \rho_N(h_N)\big)\cdots)\big)\Big). \tag{2}$$

Note that the time-consistent risk measure is completely defined by the one-step conditional risk measures $\rho_t$, $t = 0,\ldots,N-1$, and, in particular, for $t = 0$, (2) defines a risk measure of the entire sequence $h \in \mathcal{H}_{0:N}$. This leads to the notion of a coherent risk measure.

Definition 2 (Coherent Risk Measure). We call the one-step conditional risk measures $\rho_t : \mathcal{H}_{t+1} \to \mathcal{H}_t$, $t = 1,\ldots,N-1$, as in (2) a coherent risk measure if it satisfies the following conditions:

• Convexity: $\rho_t(\lambda h + (1-\lambda) h') \le \lambda \rho_t(h) + (1-\lambda) \rho_t(h')$, for all $\lambda \in (0,1)$ and all $h, h' \in \mathcal{H}_t$;

• Monotonicity: If $h \le h'$ then $\rho_t(h) \le \rho_t(h')$ for all $h, h' \in \mathcal{H}_t$;

• Translational Invariance: $\rho_t(h + h') = h + \rho_t(h')$ for all $h \in \mathcal{H}_{t-1}$ and $h' \in \mathcal{H}_t$;

• Positive Homogeneity: $\rho_t(\beta h) = \beta \rho_t(h)$ for all $h \in \mathcal{H}_t$ and $\beta \ge 0$.

All risk measures studied in this paper are time-consistent coherent risk measures. Concretely, we briefly review two examples of coherent risk measures.

Total Conditional Expectation: The simplest risk measure is the total conditional expectation given by

$$\rho_t(h_t) = \mathbb{E}\left[h_t \mid \mathcal{F}_{t-1}\right]. \tag{3}$$

It is easy to see that total conditional expectation satisfies the properties of a coherent risk measure as outlined in Definition 2. Unfortunately, total conditional expectation is agnostic to realization fluctuations of the stochastic variable $h$ and is only concerned with the mean value of $h$ over a large number of realizations. Thus, it is a risk-neutral measure of performance.

Conditional Value-at-Risk: Let $h \in \mathcal{H}$ be a stochastic variable for which higher values are of interest¹. For a given confidence level $\beta \in (0,1)$, value-at-risk ($\mathrm{VaR}_\beta$) denotes the $\beta$-quantile value of a stochastic variable $h \in \mathcal{H}$ described as $\mathrm{VaR}_\beta(h) = \sup_{\zeta \in \mathbb{R}} \{\zeta \mid \mathbb{P}(h \le \zeta) \le \beta\}$. Unfortunately, working with VaR for non-normal stochastic variables is numerically unstable, optimizing models involving VaR is intractable in high dimensions, and VaR ignores the values of $h$ with probability less than $\beta$ [28].

¹ For example, greater values of $h$ indicate safer performance, as will be discussed in the next section.

In contrast, CVaR overcomes the shortcomings of VaR. CVaR with confidence level $\beta \in (0,1)$, denoted $\mathrm{CVaR}_\beta$, measures the expected loss in the $\beta$-tail given that the particular threshold $\mathrm{VaR}_\beta$ has been crossed, i.e., $\mathrm{CVaR}_\beta(h) = \mathbb{E}[h \mid h \le \mathrm{VaR}_\beta(h)]$. An optimization formulation for CVaR was proposed in [28] that we use in this paper. That is, $\mathrm{CVaR}_\beta$ is given by

$$\mathrm{CVaR}_\beta(h) := -\inf_{\zeta \in \mathbb{R}} \mathbb{E}\left[\zeta + \frac{(-h-\zeta)_+}{\beta}\right]. \tag{4}$$

Note that the above formulation of CVaR is concerned with the left tail of distributions (higher values of $h$ are preferred). A value of $\beta \to 1$ corresponds to a risk-neutral case, i.e., $\mathrm{CVaR}_1(h) = \mathbb{E}(h)$; whereas a value of $\beta \to 0$ is rather a risk-averse case, i.e., $\mathrm{CVaR}_0(h) = \mathrm{VaR}_0(h) = \operatorname{ess\,inf}(h)$ [29]. Figure 1 illustrates these notions for an example variable $h$ with distribution $p(h)$.
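As a small numerical illustration of (4), the following sketch evaluates the left-tail CVaR of a finite distribution in two ways: directly as the mean of the worst $\beta$ fraction of outcomes, and through the optimization form. The sample values and probabilities are made up for illustration, and the function names are not from the paper. The second function relies on the fact that, for a finite distribution, the objective in (4) is piecewise linear in $\zeta$, so its infimum is attained at one of the points $\zeta = -h_i$.

```python
def cvar_tail(values, probs, beta):
    """Left-tail CVaR: mean of the worst (smallest) beta mass of outcomes."""
    remaining, total = beta, 0.0
    for h, p in sorted(zip(values, probs)):
        mass = min(p, remaining)   # split the atom that straddles the beta level
        total += mass * h
        remaining -= mass
    return total / beta


def cvar_ru(values, probs, beta):
    """CVaR via the optimization form (4), searching zeta over {-h_i}."""
    def objective(zeta):
        tail = sum(p * max(-h - zeta, 0.0) for h, p in zip(values, probs))
        return zeta + tail / beta
    return -min(objective(-h) for h in values)


# Illustrative distribution (assumed, not from the paper).
values = [-1.0, 0.0, 1.0, 2.0]
probs = [0.1, 0.2, 0.3, 0.4]
risk_averse = cvar_ru(values, probs, beta=0.2)    # mean of the worst 20%
risk_neutral = cvar_ru(values, probs, beta=1.0)   # reduces to the expectation
```

With these numbers the worst 20% of the mass consists of the outcome $-1$ (probability 0.1) and half of the outcome $0$, so the risk-averse value is $-0.5$, while $\beta = 1$ recovers the mean, $1.0$.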

III. RISK-SENSITIVE SAFETY AND REACHABILITY

We assume the robot dynamics of interest is described by a discrete-time stochastic system given by

$$x_{t+1} = f(x_t, u_t, w_t), \quad x_0 = x^0, \tag{5}$$

where $t \in \mathbb{N}_{\ge 0}$ denotes the time index, $x \in \mathcal{X} \subset \mathbb{R}^n$ is the state, $u \in \mathcal{U} \subset \mathbb{R}^m$ is the control input, $w \in \mathcal{W}$ is the stochastic uncertainty/disturbance, and $f : \mathbb{R}^n \times \mathcal{U} \times \mathcal{W} \to \mathbb{R}^n$. We assume that the initial condition $x_0$ is deterministic and that $|\mathcal{W}|$ is finite, i.e., $\mathcal{W} = \{w_1,\ldots,w_{|\mathcal{W}|}\}$. At every time-step $t$, for a state-control pair $(x_t, u_t)$, the process disturbance $w_t$ is drawn from the set $\mathcal{W}$ according to the probability mass function $p(w) = [p(w_1),\ldots,p(w_{|\mathcal{W}|})]^T$, where $p(w_i) := \mathbb{P}(w_t = w_i)$, $i = 1,2,\ldots,|\mathcal{W}|$. Note that the probability mass function for the process disturbance is time-invariant, and that the process disturbance is independent of the process history and of the state-control pair $(x_t, u_t)$. Note that, in particular, system (5) can capture stochastic hybrid systems, such as Markovian Jump Systems [30].
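To make the sampling model concrete, here is a minimal simulation sketch of a system of the form (5) with a finite disturbance set. The scalar dynamics `f`, the set `W`, and the pmf `PMF` are hypothetical placeholders chosen for illustration; only the structure (a time-invariant pmf, with $w_t$ drawn independently of the state-control pair) follows the text.

```python
import random

W = [-0.1, 0.0, 0.1]       # finite disturbance set (illustrative values)
PMF = [0.25, 0.5, 0.25]    # time-invariant pmf p(w) (illustrative values)


def f(x, u, w):
    """Hypothetical scalar dynamics standing in for f in (5)."""
    return 0.9 * x + u + w


def rollout(x0, policy, steps, rng):
    """Simulate (5): w_t is drawn i.i.d. from W, independently of (x_t, u_t)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        w = rng.choices(W, weights=PMF, k=1)[0]
        xs.append(f(x, policy(x), w))
    return xs


# Deterministic initial condition, zero input, fixed seed for repeatability.
trajectory = rollout(x0=1.0, policy=lambda x: 0.0, steps=5, rng=random.Random(0))
```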

We are interested in studying the properties of the solutions to (5) with respect to the compact set $\mathcal{S}$ described by:

$$\begin{aligned} \mathcal{S} &:= \{x \in \mathcal{X} \mid h(x) \ge 0\}, \\ \mathrm{Int}(\mathcal{S}) &:= \{x \in \mathcal{X} \mid h(x) > 0\}, \\ \partial\mathcal{S} &:= \{x \in \mathcal{X} \mid h(x) = 0\}, \end{aligned} \tag{6}$$

where $h : \mathcal{X} \to \mathbb{R}$ is a continuous function.

In the presence of stochastic uncertainty $w$, assuring almost sure (with probability one) invariance or safety may not be feasible. Moreover, enforcing safety in expectation is only meaningful if the law of large numbers can be invoked and we are interested in the long-term performance, independent of the realization fluctuations. In this work, instead, we propose safety in the dynamic coherent risk measure sense, with conditional expectation as a special case.

Definition 3 ($\rho$-Safety). Given a safe set $\mathcal{S}$ as given in (6) and a time-consistent, dynamic coherent risk measure $\rho_{0:t}$ as described in (2), we call the solutions to (5), starting at $x_0 \in \mathcal{S}$, $\rho$-safe if and only if

$$\rho_{0,t}(0, 0, \ldots, h(x_t)) \ge 0, \quad \forall t \in \mathbb{N}_{\ge 0}. \tag{7}$$

In order to understand (7), consider the case where $\rho$ is the conventional total expectation. Then, (7) implies safety in expectation. As mentioned earlier, the definition of safety for general coherent risk measures goes beyond the traditional total expectation.
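The expectation case can be written out as a short worked step. This is a sketch under two assumptions: each one-step risk measure is the conditional expectation (3), and, because $x_0$ is deterministic, $\mathcal{F}_0$ is taken to be trivial. The nested composition applied to the sequence $(0,\ldots,0,h(x_t))$ then collapses by the tower property of conditional expectation:

```latex
\rho_{0,t}\bigl(0,\ldots,0,h(x_t)\bigr)
  = \mathbb{E}\Bigl[\,\mathbb{E}\bigl[\cdots\,
      \mathbb{E}\bigl[h(x_t)\mid\mathcal{F}_{t-1}\bigr]
      \cdots\bigm|\mathcal{F}_{1}\bigr]\Bigm|\mathcal{F}_{0}\Bigr]
  = \mathbb{E}\bigl[h(x_t)\bigr] \;\ge\; 0 .
```

In this special case, condition (7) only asks that the mean of $h(x_t)$ be non-negative at every $t$, which says nothing about how often individual realizations leave $\mathcal{S}$.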

Another interesting property we study in this paper arises when $x_0 \in \mathcal{X} \setminus \mathcal{S}$. That is, when instead of safety, we are interested in reaching a set of interest in finite time.

Definition 4 ($\rho$-Reachability). Consider system (5) with initial condition $x_0 \in \mathcal{X} \setminus \mathcal{S}$. Given a set $\mathcal{S}$ as given in (6) and a time-consistent, dynamic coherent risk measure $\rho_{0:t}$ as described in (2), we call the set $\mathcal{S}$ $\rho$-reachable if and only if there exists a constant $t^*$ such that

$$\rho_{0,t^*}(0, 0, \ldots, h(x_{t^*})) \ge 0. \tag{8}$$

IV. RISK CONTROL BARRIER FUNCTIONS

In order to check and enforce risk-sensitive safety, i.e., $\rho$-safety, we introduce risk control barrier functions. We then extend these to a finite-time variation, which allows us to establish risk-sensitive reachability, i.e., $\rho$-reachability.

A. Risk Sensitive Safety with RCBFs

Definition 5 (Risk Control Barrier Function). For the discrete-time system (5) and a dynamic coherent risk measure $\rho$, the continuous function $h : \mathbb{R}^n \to \mathbb{R}$ is a risk control barrier function for the set $\mathcal{S}$ as defined in (6), if there exists a convex $\alpha \in \mathcal{K}$ satisfying $\alpha(r) < r$ for all $r > 0$ such that

$$\rho(h(x_{t+1})) \ge \alpha(h(x_t)), \quad \forall x_t \in \mathcal{X}. \tag{9}$$

Note that a simple choice for the function $\alpha$ is the linear map $\alpha(r) = \alpha_0 r$, where $\alpha_0 \in (0,1)$ is a constant.
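The following sketch shows how condition (9) might be checked at a single state-input pair when $\rho$ is CVaR and $\alpha(r) = \alpha_0 r$. Everything concrete here is an assumption made for illustration: the scalar dynamics, the barrier $h(x) = -x$, the disturbance set, and the value of $\alpha_0$ are not taken from the paper.

```python
W = [-0.1, 0.0, 0.1]       # illustrative disturbance values
PMF = [0.25, 0.5, 0.25]    # illustrative pmf
ALPHA0 = 0.5               # linear choice alpha(r) = ALPHA0 * r


def cvar(values, probs, beta):
    """Left-tail CVaR of a finite distribution (mean of the worst beta mass)."""
    remaining, total = beta, 0.0
    for v, p in sorted(zip(values, probs)):
        mass = min(p, remaining)
        total += mass * v
        remaining -= mass
    return total / beta


def f(x, u, w):
    return x + u + w       # hypothetical scalar dynamics


def h(x):
    return -x              # safe set S = {x : x <= 0}


def rcbf_condition_holds(x, u, beta):
    """Check rho(h(x_{t+1})) >= alpha(h(x_t)) with rho = CVaR_beta."""
    next_h = [h(f(x, u, w)) for w in W]
    return cvar(next_h, PMF, beta) >= ALPHA0 * h(x)


# The same input passes the risk-neutral test but fails the risk-averse one.
risk_neutral_ok = rcbf_condition_holds(x=-1.0, u=0.45, beta=1.0)
risk_averse_ok = rcbf_condition_holds(x=-1.0, u=0.45, beta=0.25)
```

At $x = -1$ and $u = 0.45$ the next-step values of $h$ are $0.65$, $0.55$, and $0.45$. Their mean, $0.55$, clears the threshold $\alpha_0 h(x) = 0.5$, but the worst 25% outcome, $0.45$, does not, so a smaller input is needed under the risk-averse measure.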

In the first main contribution of the paper, we demonstrate

that the existence of an RCBF implies invariance/safety in

the coherent risk measure.

Theorem 6. Consider the discrete-time system (5) and the set $\mathcal{S}$ as described in (6). Let $\rho$ be a given coherent risk measure. Then, $\mathcal{S}$ is $\rho$-safe if there exists an RCBF as defined in Definition 5.

Proof. The proof is carried out by induction and using the properties of a coherent risk measure as outlined in Definition 2. If (9) holds, for $t = 0$, we have

$$\rho(h(x_1)) \ge \alpha(h(x_0)). \tag{10}$$

Similarly, for $t = 1$, we have

$$\rho(h(x_2)) \ge \alpha(h(x_1)). \tag{11}$$

Since $\rho$ is monotone, composing both sides of (11) with $\rho$ does not change the inequality and we obtain

$$\rho \circ \rho(h(x_2)) \ge \rho(\alpha(h(x_1))). \tag{12}$$

Since $\alpha$ is a convex function, from Theorem 3 in [31] (Jensen's Inequality for coherent risk measures), we obtain²

$$\rho \circ \rho(h(x_2)) \ge \rho(\alpha(h(x_1))) \ge \alpha(\rho(h(x_1))).$$

² In particular, if $\alpha \in (0,1)$ is a constant, from the positive homogeneity property of $\rho$, we have $\rho \circ \rho(h(x_2)) \ge \rho(\alpha h(x_1)) = \alpha \rho(h(x_1))$.

Then, using inequality (10) and the fact that $\alpha \in \mathcal{K}$ is increasing, we have

$$\rho \circ \rho(h(x_2)) \ge \alpha(\rho(h(x_1))) \ge \alpha \circ \alpha(h(x_0)).$$

Therefore, by induction, at time $t$, we can show that $\rho^t(h(x_t)) \ge \alpha^t(h(x_0))$. The left-hand side of the above inequality is equal to $\rho_{0,t}(0,\ldots,h(x_t))$. Hence,

$$\rho_{0,t}(0,\ldots,h(x_t)) \ge \alpha^t(h(x_0)). \tag{13}$$

If $x_0 \in \mathcal{S}$, from the definition of the set $\mathcal{S}$, we have $h(x_0) \ge 0$. Since $\alpha \in \mathcal{K}$, we can infer that (7) holds. Thus, the system is $\rho$-safe.

Note that, in the case when $x_0 \in \mathcal{X} \setminus \mathcal{S}$, the existence of an RCBF implies asymptotic convergence to the set $\mathcal{S}$ in the coherent risk measure $\rho$. This can be inferred from (13). In fact, if $\alpha(r) < r$, then there exists a constant $\delta \in (0,1)$ such that $\alpha(r) \le \delta r$ and hence

$$\alpha^t(r) \le \delta^t r, \quad t \in \mathbb{N}_{\ge 0}. \tag{14}$$

If $x_0 \in \mathcal{X} \setminus \mathcal{S}$, then $h(x_0) < 0$. However, from (14), as $t \to \infty$, $\alpha \circ \cdots \circ \alpha(r) \to 0$, since the composition of class $\mathcal{K}$ functions is also class $\mathcal{K}$ (hence non-negative). We then obtain $\rho_{0,t}(0,\ldots,h(x_t)) \ge 0$, which implies that the solutions become $\rho$-safe.

B. Risk Sensitive Safety with Finite-time RCBFs

In practice, we are often interested in satisfying system specifications characterized by the set $\mathcal{S}$ in finite time. To this end, we define finite-time RCBFs.

Definition 7 (Finite-Time RCBF). For the discrete-time system (5) and a dynamic coherent risk measure $\rho$, the continuous function $h : \mathcal{X} \to \mathbb{R}$ is a finite-time RCBF for the set $\mathcal{S}$ as defined in (6), if there exist constants $0 < \gamma < 1$ and $\varepsilon > 0$ such that

$$\rho(h(x_{t+1})) - \gamma h(x_t) \ge \varepsilon(1-\gamma), \quad \forall x_t \in \mathcal{X}. \tag{15}$$

In the second key contribution of the paper, we show that the existence of a finite-time RCBF implies $\rho$-reachability.

Theorem 8. Consider the discrete-time system (5) and a dynamic coherent risk measure $\rho$. Let $\mathcal{S} \subset \mathcal{X}$ be as described in (6). If there exists a finite-time RCBF $h : \mathcal{X} \to \mathbb{R}$ as in Definition 7, then for all $x_0 \in \mathcal{X} \setminus \mathcal{S}$, there exists a $t^* \in \mathbb{N}_{\ge 0}$ such that $\mathcal{S}$ is $\rho$-reachable, i.e., inequality (8) holds. Furthermore,

$$t^* \le \log\left(\frac{\varepsilon - h(x_0)}{\varepsilon}\right) \Big/ \log\left(\frac{1}{\gamma}\right), \tag{16}$$

where the constants $\gamma$ and $\varepsilon$ are as defined in Definition 7.

Proof. Similar to the proof of Theorem 6, we proceed by induction and use the properties of coherent risk measures. From (15), we have

$$\rho(h(x_{t+1})) - \varepsilon \ge \gamma h(x_t) - \gamma\varepsilon = \gamma(h(x_t) - \varepsilon).$$

Hence, for $t = 0$, we have

$$\rho(h(x_1)) - \varepsilon \ge \gamma(h(x_0) - \varepsilon). \tag{17}$$

For $t = 1$, we have

$$\rho(h(x_2)) - \varepsilon \ge \gamma(h(x_1) - \varepsilon). \tag{18}$$

Since $\rho$ is monotone, composing both sides of the above inequality with $\rho$ does not change the inequality and we obtain

$$\rho \circ \rho(h(x_2) - \varepsilon) \ge \rho(\gamma(h(x_1) - \varepsilon)) = \gamma\rho(h(x_1) - \varepsilon),$$

where in the last equality we used the positive homogeneity property of $\rho$ since $\gamma \in (0,1)$. Since $\varepsilon > 0$ is a constant, the translational invariance property of $\rho$ yields

$$\rho \circ \rho(h(x_2)) - \varepsilon \ge \gamma(\rho(h(x_1)) - \varepsilon).$$

Moreover, from inequality (17), we infer

$$\rho \circ \rho(h(x_2)) - \varepsilon \ge \gamma(\rho(h(x_1)) - \varepsilon) \ge \gamma^2(h(x_0) - \varepsilon).$$

Thus, by induction, we see that at time step $t$, the following inequality holds:

$$\rho^t(h(x_t)) - \varepsilon \ge \gamma^t(h(x_0) - \varepsilon).$$

Taking $\varepsilon$ to the right-hand side and noting that $\rho^t(h(x_t))$ is equal to $\rho_{0,t}(0,\ldots,h(x_t))$, we have the following inequality:

$$\rho_{0,t}(0,\ldots,h(x_t)) \ge \gamma^t(h(x_0) - \varepsilon) + \varepsilon. \tag{19}$$

Since $0 < \gamma < 1$ and $x_0 \in \mathcal{X} \setminus \mathcal{S}$, i.e., $h(x_0) < 0$, as $t$ increases $x_t$ approaches $\mathcal{S}$ in the dynamic risk measure $\rho_{0,t}$, because by definition $h(x_t) \ge 0$ implies $x_t \in \mathcal{S}$. Hence, $\mathcal{S}$ is $\rho$-reachable in finite time. By definition, $x_t$ reaches $\mathcal{S}$ at least at the boundary by $t^*$, when $h(x_{t^*}) = 0$. Substituting $h(x_{t^*}) = 0$ in (19) yields

$$0 \ge \gamma^{t^*}(h(x_0) - \varepsilon) + \varepsilon, \tag{20}$$

where we used the fact that $\rho_{0,t^*}(0,\ldots,h(x_{t^*})) = \rho_{0,t^*}(0,\ldots,0) = 0$. Re-arranging the terms and noting that $h(x_0) \le 0$ and therefore $h(x_0) - \varepsilon \le 0$, we obtain

$$\frac{\varepsilon}{\varepsilon - h(x_0)} \le \gamma^{t^*}.$$

Taking the logarithm of both sides of the above inequality gives

$$\log\left(\frac{\varepsilon}{\varepsilon - h(x_0)}\right) \le t^* \log(\gamma),$$

or equivalently

$$-\log\left(\frac{\varepsilon - h(x_0)}{\varepsilon}\right) \le -t^* \log\left(\frac{1}{\gamma}\right).$$

Since $0 < \gamma < 1$, $\log(\frac{1}{\gamma})$ is a positive number. Dividing both sides of the inequality above by the negative number $-\log(\frac{1}{\gamma})$, which reverses the inequality, gives

$$t^* \le \log\left(\frac{\varepsilon - h(x_0)}{\varepsilon}\right) \Big/ \log\left(\frac{1}{\gamma}\right).$$
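The bound (16) is easy to evaluate numerically. The sketch below does so for the constants reported in Section V ($\gamma = 0.05$, $\varepsilon = 0.1$, $h(x_0) = -0.2$), and also evaluates the right-hand side of (19). The function names are illustrative, and returning 0 when $h(x_0) \ge 0$ is a convention chosen here rather than something stated in the paper.

```python
import math


def reach_time_bound(h0, gamma, eps):
    """Right-hand side of (16): log((eps - h0) / eps) / log(1 / gamma)."""
    if not (0.0 < gamma < 1.0) or eps <= 0.0:
        raise ValueError("require 0 < gamma < 1 and eps > 0")
    if h0 >= 0.0:
        return 0.0  # convention of this sketch: already inside S
    return math.log((eps - h0) / eps) / math.log(1.0 / gamma)


def lower_bound_19(h0, gamma, eps, t):
    """Right-hand side of (19): gamma**t * (h0 - eps) + eps."""
    return gamma ** t * (h0 - eps) + eps


bound = reach_time_bound(h0=-0.2, gamma=0.05, eps=0.1)
```

For these constants the function returns approximately 0.3667, matching the figure quoted in Section V, and the right-hand side of (19) crosses zero exactly at that value.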

The upper bound described by inequality (16) in Theorem 8 depends on the two parameters $\gamma$ and $\varepsilon$. In our experiments, we often fix $0 < \gamma < 1$ and carry out a line search over $\varepsilon$ until the finite-time RCBF condition (15) no longer holds. Then, we pick the corresponding $t^*$ as the upper bound on the earliest time the solutions can enter the goal set $\mathcal{S}$.
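One way such a line search could be organized is sketched below. The paper does not say how condition (15) is verified for a given $\varepsilon$, so that check is passed in as a user-supplied predicate `condition_holds`. The predicate, the step size, and the search range are all hypothetical.

```python
def line_search_eps(condition_holds, eps_step=0.01, eps_max=1.0):
    """Increase eps in fixed steps until the supplied check of (15) first
    fails; return the last value that passed, or None if none passed."""
    best = None
    k = 1
    while k * eps_step <= eps_max:
        eps = k * eps_step
        if not condition_holds(eps):
            break
        best = eps
        k += 1
    return best


# Hypothetical stand-in: pretend (15) can be verified only for eps <= 0.1.
eps_star = line_search_eps(lambda eps: eps <= 0.1 + 1e-12)
```

The returned value can then be substituted into (16), together with the fixed $\gamma$ and the initial value $h(x_0)$, to obtain the corresponding $t^*$.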

Fig. 2: Simulation results for the cart-pole system with no RCBF filter, and with standard RCBF (top) and finite-time RCBF

(bottom) filters using total conditional expectation and CVaR.

C. Boolean Compositions of RCBFs

We have proposed RCBFs and finite-time RCBFs as means to verify $\rho$-safety and $\rho$-reachability, respectively. We now propose conditions for verifying $\rho$-safety and $\rho$-reachability for Boolean compositions of several control barrier functions [24], [32], [33]. We omit proofs due to space constraints.

Proposition 1. Let $\mathcal{S}_i = \{x \in \mathbb{R}^n \mid h_i(x) \ge 0\}$, $i = 1,\ldots,k$, denote a family of safe sets with the boundaries and interior defined analogously to $\mathcal{S}$ in (6), and let $\rho$ be a given dynamic coherent risk measure. Consider the discrete-time system (5). If there exists an $\alpha \in (0,1)$ such that

$$\rho\left(\min_{i=1,\ldots,k} h_i(x_{t+1})\right) \ge \alpha \min_{i=1,\ldots,k} h_i(x_t) \tag{21}$$

then the set $\{x \in \mathbb{R}^n \mid \wedge_{i=1,\ldots,k}(h_i(x) \ge 0)\}$ is $\rho$-safe. Similarly, if there exists an $\alpha \in (0,1)$ such that

$$\rho\left(\max_{i=1,\ldots,k} h_i(x_{t+1})\right) \ge \alpha \max_{i=1,\ldots,k} h_i(x_t) \tag{22}$$

then the set $\{x \in \mathbb{R}^n \mid \vee_{i=1,\ldots,k}(h_i(x) \ge 0)\}$ is $\rho$-safe.
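The min and max compositions appearing in (21) and (22) can be sketched in a few lines. The two scalar barrier functions below are made up for illustration. The point is only that the pointwise minimum is non-negative exactly on the intersection of the sets $\mathcal{S}_i$, and the pointwise maximum exactly on their union.

```python
def conjunction(barriers):
    """h(x) = min_i h_i(x): non-negative iff every h_i(x) >= 0."""
    return lambda x: min(h(x) for h in barriers)


def disjunction(barriers):
    """h(x) = max_i h_i(x): non-negative iff at least one h_i(x) >= 0."""
    return lambda x: max(h(x) for h in barriers)


# Illustrative scalar example: S_1 = {x >= 0}, S_2 = {x <= 1}.
h1 = lambda x: x
h2 = lambda x: 1.0 - x
h_and = conjunction([h1, h2])   # non-negative on the interval [0, 1]
h_or = disjunction([h1, h2])    # non-negative on the whole real line
```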

We next propose conditions for risk-sensitive finite-time reachability of sets composed of Boolean compositions of several functions $h$ as described in (6).

Proposition 2. Let $\mathcal{S}_i = \{x \in \mathbb{R}^n \mid h_i(x) \ge 0\}$, $i = 1,\ldots,k$, denote a family of sets with the boundaries and interior defined analogously to $\mathcal{S}$ in (6), and let $\rho$ be a given dynamic coherent risk measure. Consider the discrete-time system (5). If there exist constants $0 < \gamma < 1$ and $\varepsilon > 0$ such that

$$\rho\left(\min_{i=1,\ldots,k} h_i(x_{t+1})\right) - \gamma \min_{i=1,\ldots,k} h_i(x_t) \ge \varepsilon(1-\gamma) \tag{23}$$

then the set $\{x \in \mathbb{R}^n \mid \wedge_{i=1,\ldots,k}(h_i(x) \ge 0)\}$ is $\rho$-reachable. Moreover, there exists a constant $t^*$ satisfying

$$t^* \le \log\left(\frac{\varepsilon - \min_{i=1,\ldots,k} h_i(x_0)}{\varepsilon}\right) \Big/ \log\left(\frac{1}{\gamma}\right), \tag{24}$$

such that if $x_0 \in \mathcal{X} \setminus \cup_{i=1,\ldots,k} \mathcal{S}_i$ then $x_{t^*} \in \cap_{i=1,\ldots,k} \mathcal{S}_i$. Similarly, the disjunction case follows by replacing $\min$ with $\max$ in (23) and (24).

V. SIMULATION RESULTS

In order to illustrate the results of these risk-aware guarantees, we apply our method in the case of the cart-pole, modeled as a nonlinear, control-affine discrete-time system:

$$x_{t+1} = x_t + \begin{bmatrix} v_x \\ \dot{\theta} \\ \dfrac{u_t + m_p s_\theta (l\dot{\theta}^2 + g c_\theta)}{m_c + m_p s_\theta^2} \\ \dfrac{-u_t c_\theta - m_p l \dot{\theta}^2 c_\theta s_\theta - (m_c + m_p) g s_\theta}{l(m_c + m_p s_\theta^2)} \end{bmatrix} \Delta t + w_t, \tag{25}$$

where $s_\theta$ and $c_\theta$ abbreviate $\sin\theta$ and $\cos\theta$. The disturbance $w_t \in \mathcal{W}$ enters the system linearly, and is described by a pmf over the states. This could include the modeling error from this Euler-approximated discrete-time model, but in this case, it is a simple pmf normally distributed around $0$ with standard deviation $\sigma = \{0.05, 0.05, 0.2, 0.2\}$ for the four states $x = [p_x, \theta, v_x, \dot{\theta}]$.
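A direct transcription of the update (25) into code might look as follows. The numerical values of the cart mass, pole mass, pole length, gravity, and time step are hypothetical, since the paper does not list them, and the disturbance is passed in explicitly rather than sampled.

```python
import math

# Hypothetical parameter values; the paper does not report them.
M_CART, M_POLE, LENGTH, GRAVITY, DT = 1.0, 0.1, 1.0, 9.81, 0.01


def cartpole_step(x, u, w=(0.0, 0.0, 0.0, 0.0)):
    """One Euler step of (25) with state x = (p_x, theta, v_x, theta_dot)."""
    p_x, theta, v_x, theta_dot = x
    s, c = math.sin(theta), math.cos(theta)
    denom = M_CART + M_POLE * s * s
    v_x_dot = (u + M_POLE * s * (LENGTH * theta_dot ** 2 + GRAVITY * c)) / denom
    theta_ddot = (-u * c - M_POLE * LENGTH * theta_dot ** 2 * c * s
                  - (M_CART + M_POLE) * GRAVITY * s) / (LENGTH * denom)
    drift = (v_x, theta_dot, v_x_dot, theta_ddot)
    # The additive disturbance enters each state linearly, as in (25).
    return tuple(xi + di * DT + wi for xi, di, wi in zip(x, drift, w))


x_next = cartpole_step((0.0, 0.0, 0.0, 0.0), u=1.0)
```

Starting from rest at $\theta = 0$ with a unit input and no disturbance, only the two velocity components change after one step.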

The safety set is described by

$$h(x_t) = -2 a_{\max}\, p_x^t - (v_x^t)^2\, \mathrm{sgn}(v_x^t), \tag{26}$$

where $a_{\max} > 0$ is a tuneable parameter that designates the maximum linear acceleration at any point. This function is positive when $p_x < 0$, but allows $h(x_t) > 0$ when $p_x > 0$ if $v_x$ is sufficiently negative.
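The barrier (26) can be evaluated with a few lines of code, shown here with an illustrative value of $a_{\max}$. One reading of the formula, offered as an interpretation rather than the authors' statement, is that $v_x^2 / (2 a_{\max})$ is the stopping distance under constant deceleration $a_{\max}$, so $h \ge 0$ for $v_x > 0$ means the cart can still stop before reaching $p_x = 0$.

```python
A_MAX = 1.0  # illustrative value for the tuneable parameter a_max


def sgn(v):
    """Sign function returning -1, 0, or 1."""
    return (v > 0) - (v < 0)


def h_cart(p_x, v_x, a_max=A_MAX):
    """Barrier (26): -2 * a_max * p_x - v_x**2 * sgn(v_x)."""
    return -2.0 * a_max * p_x - v_x ** 2 * sgn(v_x)


can_stop = h_cart(-1.0, 1.0)      # 1.0: enough room left to brake
too_fast = h_cart(-0.25, 1.0)     # -0.5: cannot stop before p_x = 0
returning = h_cart(0.5, -2.0)     # 3.0: past p_x = 0 but moving back fast
```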

While this safety set is nonlinear in the control inputs,

the one-step nature of this optimization problem results in

no issues solving such a program in real-time, using modern

solvers such as IPOPT or NLOPT. In future work, we plan

to show how nonlinear CBFs can be linearized to result in

an affine RCBF constraint, with the error included in the

stochastic uncertainty to result in formal safety guarantees.

The RCBF optimization problem was solved using PAGMO's integrated SLSQP solver from NLOPT. Each solution took roughly 0.7 ms to compute on a modern laptop, resulting in a maximum control frequency of 1428 Hz. Three trajectories are shown in Figure 2. The desired trajectory shows the trajectory with only the nominal controller, which clearly leaves the safe set at $x = 0$. The trajectory corresponding to $\mathbb{E}[h]$ was filtered subject to the total conditional expectation coherent risk measure, which also corresponds to CVaR with $\beta = 1$. While this filter guarantees safety in expectation, safety is frequently violated due to the stochastic uncertainty. Finally, the trajectory corresponding to CVaR with $\beta = 0.01$ results in safety over the entire trajectory.
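The structure of such a one-step safety filter can be sketched as follows. This is not the authors' implementation. A grid search over candidate inputs stands in for the SLSQP solver, and a simplified double-integrator cart stands in for the cart-pole dynamics (25). All parameter values are hypothetical, and the barrier has the same form as (26). The filter returns the admissible input closest to the desired one, with admissibility given by the RCBF condition (9) under $\mathrm{CVaR}_\beta$ and $\alpha(r) = \alpha_0 r$.

```python
DT, A_MAX, U_MAX = 0.05, 2.0, 2.0        # hypothetical parameters
W = [-0.02, 0.0, 0.02]                   # velocity disturbance values
PMF = [0.25, 0.5, 0.25]
ALPHA0 = 0.9                             # alpha(r) = ALPHA0 * r
U_GRID = [-U_MAX + 0.1 * k for k in range(41)]


def cvar(values, probs, beta):
    """Left-tail CVaR of a finite distribution."""
    remaining, total = beta, 0.0
    for v, p in sorted(zip(values, probs)):
        mass = min(p, remaining)
        total += mass * v
        remaining -= mass
    return total / beta


def h(state):
    p, v = state
    return -2.0 * A_MAX * p - v * abs(v)     # same form as (26)


def step(state, u, w):
    p, v = state
    return (p + v * DT, v + u * DT + w)      # simplified cart, not (25)


def risk_of_next_h(state, u, beta):
    return cvar([h(step(state, u, w)) for w in W], PMF, beta)


def safety_filter(state, u_des, beta):
    """Closest grid input to u_des satisfying the RCBF condition (9);
    falls back to the least risky input if no grid point is feasible."""
    feasible = [u for u in U_GRID
                if risk_of_next_h(state, u, beta) >= ALPHA0 * h(state)]
    if feasible:
        return min(feasible, key=lambda u: abs(u - u_des))
    return max(U_GRID, key=lambda u: risk_of_next_h(state, u, beta))


u_cvar = safety_filter((-0.5, 1.0), u_des=2.0, beta=0.25)   # risk-averse
u_mean = safety_filter((-0.5, 1.0), u_des=2.0, beta=1.0)    # risk-neutral
u_far = safety_filter((-5.0, 0.0), u_des=1.0, beta=0.25)    # far from boundary
```

In this toy setting the risk-averse filter brakes harder (about $-1.5$) than the risk-neutral one (about $-1.1$) near the boundary, while far from the boundary the desired input passes through unchanged. This is qualitatively consistent with the minimally invasive behavior described in the text.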

Similarly, Figure 2 also demonstrates the same three trajectories with the finite-time reachability RCBF. Specifically, we utilize constants $\gamma = 0.05$ and $\varepsilon = 0.1$, with an initial safety violation of $h(x_0) = -0.2$. From (16), this suggests a $t^* \le 0.3667$ s. While this is not reflected in the plot, which only shows $p_x^t$ rather than $h(x_t)$, we find that $h(x_{t^*}) > 0$ at $t^* = 0.08$ s, well below the theoretical guarantee.

VI. CONCLUSIONS

In this paper, we proposed Risk Control Barrier Functions (RCBFs) as a means to enforce safety in the presence of stochastic uncertainty. We demonstrated theoretically that these RCBFs guarantee safety with respect to dynamic coherent risk measures, which serve as a computationally efficient means to assess risk. Moreover, we proved that finite-time RCBFs can be utilized to guarantee convergence to a set in finite time, resulting in a practical safety filter that works both inside and outside of the safe set. We also demonstrated how multiple safe sets can be enforced simultaneously utilizing Boolean compositions. Finally, we demonstrated the efficacy of this framework on the nonlinear cart-pole system under stochastic uncertainty.

REFERENCES

[1] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics. Cambridge, Mass.: MIT Press, 2005.

[2] J. Reher and A. D. Ames, “Dynamic walking: Toward agile and efficient bipedal robots,” Annual Reviews, 2020.

[3] T. Rouček, M. Pecka, P. Čížek, T. Petříček, J. Bayer, V. Šalanský, D. Heřt, M. Petrlík, T. Báča, V. Spurný, et al., “DARPA subterranean challenge: Multi-robotic exploration of underground environments,” in International Conference on Modelling and Simulation for Autonomous Systems, pp. 274–290, Springer, 2019.

[4] M. Ono, M. Pavone, Y. Kuwata, and J. Balaram, “Chance-constrained dynamic programming with application to risk-aware robotic space exploration,” Autonomous Robots, vol. 39, no. 4, pp. 555–571, 2015.

[5] A. Wang, A. M. Jasour, and B. Williams, “Non-Gaussian chance-constrained trajectory planning for autonomous vehicles under agent uncertainty,” IEEE Robotics and Automation Letters, 2020.

[6] S. Koenig and R. G. Simmons, “Risk-sensitive planning with probabilistic decision graphs,” in Principles of Knowledge Representation and Reasoning, pp. 363–373, Elsevier, 1994.

[7] H. Xu and S. Mannor, “Distributionally robust Markov decision processes,” in Advances in Neural Information Processing Systems, pp. 2505–2513, 2010.

[8] A. Majumdar and M. Pavone, “How should a robot assess risk? Towards an axiomatic theory of risk in robotics,” in Robotics Research, pp. 75–84, Springer, 2020.

[9] P. Artzner, F. Delbaen, J. Eber, and D. Heath, “Coherent measures of risk,” Mathematical Finance, vol. 9, no. 3, pp. 203–228, 1999.

[10] Y. Chow, A. Tamar, S. Mannor, and M. Pavone, “Risk-sensitive and robust decision-making: a CVaR optimization approach,” in Advances in Neural Information Processing Systems, pp. 1522–1530, 2015.

[11] Y. Chow and M. Ghavamzadeh, “Algorithms for CVaR optimization in MDPs,” in Advances in Neural Information Processing Systems, pp. 3509–3517, 2014.

[12] L. Prashanth, “Policy gradients for CVaR-constrained MDPs,” in International Conference on Algorithmic Learning Theory, pp. 155–169, Springer, 2014.

[13] N. Bäuerle and J. Ott, “Markov decision processes with average-value-at-risk criteria,” Mathematical Methods of Operations Research, vol. 74, no. 3, pp. 361–379, 2011.

[14] S. Singh, Y. Chow, A. Majumdar, and M. Pavone, “A framework for time-consistent, risk-sensitive model predictive control: Theory and algorithms,” IEEE Transactions on Automatic Control, 2018.

[15] M. P. Chapman, J. Lacotte, A. Tamar, D. Lee, K. M. Smith, V. Cheng, J. F. Fisac, S. Jha, M. Pavone, and C. J. Tomlin, “A risk-sensitive finite-time reachability approach for safety of stochastic dynamic systems,” in 2019 American Control Conference (ACC), pp. 2958–2963, IEEE, 2019.

[16] A. D. Ames, X. Xu, J. W. Grizzle, and P. Tabuada, “Control barrier function based quadratic programs for safety critical systems,” IEEE Transactions on Automatic Control, vol. 62, no. 8, pp. 3861–3876, 2016.

[17] Q. Nguyen, A. Hereid, J. W. Grizzle, A. D. Ames, and K. Sreenath, “3D dynamic walking on stepping stones with control barrier functions,” in Decision and Control (CDC), 2016 IEEE 55th Conference on, pp. 827–834, IEEE, 2016.

[18] Y. Chen, A. Hereid, H. Peng, and J. Grizzle, “Enhancing the performance of a safe controller via supervised learning for truck lateral control,” Journal of Dynamic Systems, Measurement, and Control, vol. 141, no. 10, 2019.

[19] X. Xu, P. Tabuada, J. W. Grizzle, and A. D. Ames, “Robustness of control barrier functions for safety critical control,” IFAC-PapersOnLine, vol. 48, no. 27, pp. 54–61, 2015.

[20] S. Kolathaya and A. D. Ames, “Input-to-state safety with control barrier functions,” IEEE Control Systems Letters, vol. 3, no. 1, pp. 108–113, 2018.

[21] A. D. Ames, S. Coogan, M. Egerstedt, G. Notomista, K. Sreenath, and P. Tabuada, “Control barrier functions: Theory and applications,” in 2019 18th European Control Conference (ECC), pp. 3420–3431, IEEE, 2019.

[22] M. Ahmadi, A. Singletary, J. W. Burdick, and A. D. Ames, “Safe policy synthesis in multi-agent POMDPs via discrete-time barrier functions,” in 2019 IEEE 58th Conference on Decision and Control (CDC), pp. 4797–4803, IEEE, 2019.

[23] A. Agrawal and K. Sreenath, “Discrete control barrier functions for safety-critical control of discrete systems with application to bipedal robot navigation,” in Robotics: Science and Systems, 2017.

[24] M. Ahmadi, A. Singletary, J. W. Burdick, and A. D. Ames, “Barrier functions for multiagent-POMDPs with DTL specifications,” in The 59th IEEE Conference on Decision and Control, 2020.

[25] A. Clark, “Control barrier functions for complete and incomplete information stochastic systems,” in 2019 American Control Conference (ACC), pp. 2928–2935, IEEE, 2019.

[26] C. Santoyo, M. Dutreix, and S. Coogan, “A barrier function approach to finite-time stochastic system verification and control,” arXiv preprint arXiv:1909.05109, 2019.

[27] A. Ruszczyński, “Risk-averse dynamic programming for Markov decision processes,” Mathematical Programming, vol. 125, no. 2, pp. 235–261, 2010.

[28] R. T. Rockafellar, S. Uryasev, et al., “Optimization of conditional value-at-risk,” Journal of Risk, vol. 2, pp. 21–42, 2000.

[29] R. T. Rockafellar and S. Uryasev, “Conditional value-at-risk for general loss distributions,” Journal of Banking & Finance, vol. 26, no. 7, pp. 1443–1471, 2002.

[30] P. Zhao, Y. Kang, and Y.-B. Zhao, “A brief tutorial and survey on Markovian jump systems: Stability and control,” IEEE Systems, Man, and Cybernetics Magazine, vol. 5, no. 2, pp. 37–C3, 2019.

[31] Z. Chen, K. He, R. Kulperger, et al., “Risk measures and nonlinear expectations,” Journal of Mathematical Finance, vol. 3, no. 03, p. 383, 2013.

[32] P. Glotfelter, J. Cortés, and M. Egerstedt, “Nonsmooth barrier functions with applications to multi-robot systems,” IEEE Control Systems Letters, vol. 1, no. 2, pp. 310–315, 2017.

[33] M. Ahmadi, A. Israel, and U. Topcu, “Safe controller synthesis for data-driven differential inclusions,” IEEE Transactions on Automatic Control, 2020.