Synthesizing Reactive Test Environments for Autonomous Systems:
Testing Reach-Avoid Specifications with Multi-Commodity Flows
Apurva Badithela*
1
, Josefine B. Graebener*
2
, Wyatt Ubellacker
1
, Eric V. Mazumdar
1
,
Aaron D. Ames
1
, Richard M. Murray
1
Abstract
— We study automated test generation for verifying
discrete decision-making modules in autonomous systems. We
utilize linear temporal logic to encode the requirements on the
system under test in the system specification and the behavior
that we want to observe during the test is given as the test
specification which is unknown to the system. First, we use the
specifications and their corresponding non-deterministic B
̈
uchi
automata to generate the specification product automaton.
Second, a virtual product graph representing the high-level
interaction between the system and the test environment is
constructed modeling the product automaton encoding the
system, the test environment, and specifications. The main result
of this paper is an optimization problem, framed as a multi-
commodity network flow problem, that solves for constraints
on the virtual product graph which can then be projected to
the test environment. Therefore, the result of the optimization
problem is reactive test synthesis that ensures that the system
meets the test specifications along with satisfying the system
specifications. This framework is illustrated in simulation on
grid world examples, and demonstrated on hardware with the
Unitree A1 quadruped, wherein dynamic locomotion behaviors
are verified in the context of reactive test environments.
I. I
NTRODUCTION
Operational testing of autonomous systems at various
levels of abstraction— from low-level continuous dynamics
to high-level discrete decision-making— is essential for
verification and validation. In formal methods, testing refers
to simulation-based falsification, where inputs to a model
of the system are found which result in system violating
its requirements [1], [2], [3], [4], [5], [6], [7]. Falsification
methods typically minimize a robustness metric associated
with the formal specifications of the system to find inputs
that result in falsifying traces [8], [9], [10]. However, another
approach to testing is to have test engineers hand-design test
scenarios as seen in the qualification tests of the DARPA
Urban Challenge [11], [12]. In this work, we bridge these
two approaches by leveraging test engineer expertise at
the specification level and then automating the construction
of the test environment for testing discrete, long-horizon
decision-making in robotic systems. In the last decade, the
control synthesis community has demonstrated the effective-
ness of using temporal logic to specify formal requirements
* The authors contributed equally. Corresponding authors: A. Badithela,
J.B. Graebener
{
apurva,jgraeben
}
@caltech.edu
We acknowledge funding from AFOSR Test and Evaluation Pro-
gram, grant FA9550-19-1-0302, National Science Foundation award CNS-
1932091, and Dow (#227027AT).
1
Department of Computing and Mathematical Sciences, California Insti-
tute of Technology, Pasadena, CA 91125, USA
2
Graduate Aerospace Laboratories, California Institute of Technology,
Pasadena, CA 91125, USA
Fig. 1: Overview of the test environment synthesis frame-
work and the hardware demonstration.
for robotic systems [13], [14], [15], [16]. Furthermore, we
assume that via the use of rulebooks and industry standard
manuals [17], [18], [19], a test engineer can provide these
high-level descriptions on the desired test outcomes using
temporal logic.
Our notion of testing in this work is also complementary
to falsification — we seek to construct a test environment to
observe a desired high-level behavior, on which falsification
can then be applied to determine the worst-case scenario
which can provide confidence that the system will behave
correctly in operational environments. Furthermore, falsifi-
cation algorithms oftentimes search over continuous inputs
to find a failure case [20], [21], [22], [23]. We envision that
falsification could be used to refine the test environments
generated by our approach.
Our approach to test generation shares similarities with
existing methods, but has key differences. Similar to [24],
we characterize the mission requirements on the system as
a system specification, and characterize the desired behavior
observed during the test via a test specification, which is
unknown to the system. However, unlike [24], we seek to
construct the test environment, by constraining actions that
the system can take, such that: a) the system can still satisfy
its requirements, and b) the test specification is satisfied
in a successful test execution (1). Additionally, we seek to
arXiv:2210.10304v1 [cs.RO] 19 Oct 2022
synthesize tests in which the system is not too restricted in
its decision-making (2).
Building upon the results of [25], the key contributions of
this paper are (i) framing the problem of synthesizing test en-
vironments for reach-avoid specifications in linear temporal
logic (LTL) as a multi-commodity network flow problem, (ii)
presenting an efficiently solvable convex-concave min-max
optimization-based relaxation that results in a constrained
test, (iii) demonstrating the approach by executing the re-
sulting test strategy to reactively test dynamic locomotion
behaviors of the Unitree A1 quadruped. A key advantage
of our method is that the synthesized test is reactive — the
constraints visible to the system under test are reactive to the
system state and depend on the system’s strategy, which is
not known to the tester
a priori
.
II. B
ACKGROUND
A. Temporal Logic, Transition Systems, and Automata
Linear temporal logic (LTL) can describe temporal prop-
erties on a trace of propositional formulas [26]. The syntax
of LTL is comprised of both logical (
∧
and
,
∨
or
, and
¬
negation
) and temporal operators (
©
next
,
always
,
♦
even-
tually
, and
U
until
) operators. LTL can specify requirements
on high-level decision-making in autonomous systems such
as
safety
(
φ
s
sys
)
,
progress
♦
(
φ
p
sys
)
, and
fairness
♦
(
φ
f
sys
)
.
A nondeterministic B
̈
uchi automaton (NBA) [27] is a tuple
B
= (
Q,
2
AP
,δ,Q
0
,F
)
, where
Q
represents the states,
AP
is the set of atomic propositions,
δ
represents the transition
function,
Q
0
⊆
Q
represents the initial states, and
F
⊆
Q
is the set of acceptance states. A transition system is a tuple
T
= (
S,A,E,I,AP,L
)
where
S
is a set of states,
A
is the
set of actions,
E
:
S
×
A
→
S
is the transition relation,
I
⊆
S
is the set of initial states,
AP
is the set of atomic
propositions, and
L
:
S
→
2
AP
is a labeling function that
indicates the set of atomic propositions that evaluate to
true
at a particular state.
Definition 1
(Product Automaton)
.
A
product automaton
is the synchronous product of a transition system
T
=
(
S,A,E,I,AP,L
)
and a NBA
B
= (
Q,
2
AP
,δ,Q
0
,F
)
, is
the tuple
T ⊗B
= (
S
′
,A,E
′
,I
′
,AP
′
,L
′
)
, where:
•
S
′
=
S
×
Q
,
•
∀
s,t
∈
S
,
∀
q,p
∈
Q
such that
s
a
→
t
and
δ
(
q,L
(
t
)) =
p
, then,
(
s,q
)
a
→
′
(
t,p
)
,
•
I
′
=
{
(
s
0
,q
) :
s
0
∈
I,
∃
q
0
∈
Q
0
s.t.
q
0
L
(
s
0
)
→
q
}
,
•
AP
′
=
Q
, and
•
L
′
:
S
×
Q
→
2
Q
such that
L
′
((
s,q
)) =
{
q
}
.
Definition 2
(Asynchronous Product Automaton)
.
An
asynchronous product automaton
of finite state au-
tomata
A
1
,...,
A
n
is the finite state automaton
A
Π
=
(
S
Π
,s
0
,
Π
,L
Π
,F
Π
,T
Π
)
, where:
•
S
Π
:=
S
1
×
...
×
S
n
, the Cartesian product of the states
of the individual product automata,
•
s
0
,
Π
= (
s
01
,...,s
0
n
)
, the
n
−
tuple representing initial
conditions,
•
L
Π
:=
L
1
∪
...
∪
L
n
, where
L
i
= 2
AP
i
for all
i
∈
{
1
,...,n
}
,
•
T
Π
:= ((
u
1
,...,u
n
)
,l,
(
v
1
,...,v
n
))
such that
∃
i
and
1
≤
i
≤
n
,
(
u
i
,l,v
i
)
∈
T
i
and
∀
j
6
=
i
,
u
j
=
v
j
,
•
F
Π
:= (
s
1
,...,s
n
)
∈
S
such that
∃
i
such that
s
i
∈
F
i
.
B. System and Test Environment
We utilize the notion of a system specification and a test
specification, which represent requirements on the system
under test and the test environment, respectively [24]. The
system specification is assumed to be given, while the
system does not necessarily have knowledge of the entire
test specification. In this work we will frame the system
specification and the test specification as reach-avoid type
specifications defined as
φ
sys
=
φ
s
sys
∧
♦
φ
p
sys
, φ
test
=
(
φ
s
test
)
∧
∧
i
♦
(
φ
p
test
)
i
,
(1)
which capture the safety and progress requirements on the
system, and the tester respectively. We show that it is possible
to model the set of feasible test executions using network
flows on an automaton.
Definition 3
(Network Flow)
.
A
network flow
is a tuple
N
=
〈
V,E,c,s,t
〉
where
V
is a set of vertices,
E
is a set of
directed edges,
E
⊆
V
×
V
,
c
is a capacity function for the
amount of flow that each edge can transfer, and
s
∈
V
are
the source vertices and
t
∈
V
are the target sink vertices. In
a single network flow, each edge is associated with a single
flow, while in a multi-commodity setting, multiple flows are
associated with each edge. An extension of network flows
to the game setting is known as a flow game [28]. While
our work involves solving a Stackleberg game over a multi-
commodity flow network, it differs from [28] in that the tester
is not completely adversarial.
III. S
YNTHESIZING
T
EST
E
NVIRONMENTS
This section sets up the test generation problem statement
and introduces a running example to illustrate the approach
we take in this paper.
A. Problem Statement
The system and test specifications are written at the same
level of abstraction as the model of the system characterized
by the transition system
T
. We require that the sub-formulas
of the test specifications,
φ
s
test
and
φ
p
test
in equation 1, be high-
level descriptions of desired test scenarios provided by a test
engineer. In this way, the task of describing the behavior of
the system to be observed during the test is left to the test
engineer, but the process of synthesizing a corresponding test
environment can be automated.
Problem 1.
Given a discrete abstraction of a system model
T
= (
S,A,E,I,AP,L
)
, and system and test specifications,
φ
sys
and
φ
test
, defined over the set
AP
, find the set of
transitions of the system
E
cut
⊂
E
that need to be constrained
such that all traces
σ
|
=
φ
sys
of the constrained system on
T
= (
S,A,E
cut
,I,AP,L
)
satisfy the following property:
σ
|
=
φ
sys
=
⇒
σ
|
=
φ
test
.
(2)
In other words, a trace of the constrained system that
satisfies the system specification must also satisfy the test
init
goal
T
T
(a) System B
̈
uchi automaton
B
sys
.
init
(b) Tester B
̈
uchi automaton
B
test
.
init
(c) Specification product automaton
B
π
.
(d) Resulting Test Execution.
(e) Virtual game graph
G
=
T ⊗B
π
. The
initial state corresponds to
S
, (partially)
yellow states to
T
, and exclusively blue
states to
I
.
Fig. 2: B
̈
uchi automata for the system specification, the test specification and the product automaton
B
π
=
B
test
×B
sys
, and
the virtual product graph
G
for the corridor navigation example. The accepting states of the system are shaded in yellow and
the acceptance states of the tester are shaded in blue, with nodes shaded in both yellow and blue representing acceptance
states of both tester and system. Transition labels and self-loops have been omitted in
B
test
,
B
π
, and
G
for clarity.
specification, and the constraints are synthesized such that
there always exists a trace satisfying the system specification.
In addition to determining constraints that result in test
executions of the system abiding by equation (2), we do
not want the system to be so constrained that it does not
have much freedom in decision-making during the test. In
this work, we use maximum network flow as a proxy for the
maximum freedom a robot has to achieve its specifications,
and we present an algorithmic framework that addresses both
these problems on examples in both simulation and hardware.
B. Running Example: Robot in a corridor
Consider a corridor in a grid world shown in figure 2d.
The system under test is starting in the middle of the corridor
with the goal of reaching either end. The dynamics are
simple grid world dynamics enabling horizontal transitions
to neighboring grid cells. The test behavior that we want
to observe is that the system passes through the two blue
grid cells
φ
test
=
♦
key
1
∧
♦
key
2
. The system specification is
given as
φ
sys
=
♦
goal, which corresponds to the yellow grid
cells. We then constrain the system transitions according to
our algorithm by placing obstacles on the grid cells, which
we will show in the following sections.
C. Constructing Product Automata
We draw from automata theory to define the specification
product automaton and virtual product graph. The product
automaton of the system transition
T
and the B
̈
uchi automa-
ton corresponding to the specification
φ
sys
,
T ⊗B
sys
, is de-
noted by the tuple
S
= (
S
sys
,A,E
sys
,I
sys
,AP
sys
,L
sys
)
. The
asynchronous product is used to construct the specification
product automaton of the system and test B
̈
uchi automata.
The product automaton of the transition system,
T
, and the
specification product automaton is denoted as the virtual
product graph
G
. It is for this product automaton
G
, that
we will be synthesizing constraints.
Definition 4
(Specification Product Automaton)
.
The
spec-
ification product automaton
B
Π
=
B
sys
×B
test
is the asyn-
chronous product of the B
̈
uchi automata of the system and
the test specification. In particular,
B
Π
.F
=
{
(
q
sys
,q
test
)
∈
B
Π
.Q
|
q
sys
∈B
sys
.F
}∪{
(
q
sys
,q
test
)
∈B
Π
.Q
|
q
test
∈B
test
.F
}
.
Definition 5
(Virtual Product Graph)
.
The synchronous
product automaton of system transition
T
and the NBA
B
Π
is the
virtual product graph
G
=
T ⊗B
Π
. In tuple form, we
denote
G
= (
S
′
,A,E
′
,I
′
,AP
′
,L
′
)
.
Definition 6
(Source, Intermediate and Target Nodes)
.
The
source (
S
), intermediate (
I
) and target (
T
) are the set of nodes
on the virtual product graph
G
with the following properties:
S
=
{
(
s
0
,q
0
)
∈
S
′
|
q
0
∈B
Π
.Q
}
I
=
{
(
s,
(
q
sys
,q
test
))
∈
S
′
|
q
test
∈B
test
.F, q
sys
/
∈B
sys
.F
}
T
=
{
(
s,
(
q
sys
,q
test
))
∈
S
′
|
q
sys
∈B
sys
.F
}
(3)
The source nodes
S
represent the initial conditions of
the test, the intermediate nodes
I
represent the acceptance
states corresponding to the test specification, and the target
nodes
T
represent the acceptance states corresponding to the
system specification. For the running example, the automata
corresponding to
B
sys
,
B
test
,
B
Π
are illustrated in Figure 2a-
2c. the virtual product graph
G
and the corresponding source,
intermediate and target nodes are illustrated in Figure 2e.
Problem 2.
Given the setting in Problem 1, synthesize the
set of edge constraints
E
′
cut
on the virtual product graph
G
such that flows from
S
to
T
is maximized, and all possible
traces on
G
from
S
to
T
comprise of a node in
I
.
D. Multi-Commodity Flows and Bilevel Optimization
To synthesize constraints
E
′
cut
on the virtual product graph
G
, we use multi-commodity flows on
G
and present a bilevel
optimization to find cuts. These constraints are such that a)
there exists a system controller that can satisfy the system
specification (
σ
|
=
φ
sys
), and that for every satisfying trace,
the test specification is also satisfied (equation (2)), and b)
the set of cuts
E
′
cut
on
G
result in maximum flow from
S
to
T
.
We then map these synthesized constraints
E
′
cut
to constraints
E
cut
on system transitions
T
.
Given a graph
G
with
S
,
I
, and
T
, a brute force approach
to solving Problems 1 and 2 is not viable, as it would involve
a) finding a set of paths
P
S
→
I
realizing max-flow from
S
to
I
, and b) finding a set of paths
P
I
→
T
realizing max flow from
the
I
to
T
, such that
P
S
→
I
and
P
I
→
T
are disjoint except for
the intermediate
I
. Finding such a feasible pair of
P
S
→
I
and
P
I
→
T
would take exponential time because enumerating all
paths is exponential in the size of the graph [29].
To address this combinatorial problem, we formulate a
bilevel optimization that relaxes edge cuts to take fractional
values. These relaxed constraints, in addition to our choice
of the objective function, makes the optimization tractable.
While the cut values of some edges take on fractional values
due to the relaxation, we find empirically that these fractional
cuts are not relevant to constraining the flow. The system
and tester are players that optimize for different flows on the
same virtual product graph
G
. The system player maximizes
flow
f
S
→
T
, defined from
S
to
T
, with the flow into and out of
the intermediate (
I
) constrained to zero. These represent be-
haviors of the system satisfying
φ
sys
without satisfying
φ
test
.
The tester player: i) maximizes flow
f
S
→
I
(defined from
S
to
I
), ii) maximizes flow
f
I
→
T
(defined from
I
to
T
), and iii)
minimizes flow
f
S
→
T
that bypasses the intermediate
I
. We
use the multi-commodity network flows to simultaneously
reason about several flows on
G
— maximizing flows
f
S
→
I
and
f
I
→
T
, and cutting the flow
f
S
→
T
. However, unlike the
canonical multi-commodity flow framework [30], our flows
do not compete for edge capacities. Instead, the flows are
coupled by the placement of cuts by constraining flow along
the edges. Therefore, the tester does not directly set the
flow
f
S
→
T
, but indirectly constrains it by placing cuts on
system transitions. This multi-commodity flow-based bilevel
optimization is given in (8). The variables of the optimization
are normalized by the total flow
F
on
G
, which is defined
as follows,
F
= min
{
∑
v
:(
S
,v
)
∈
E
′
f
(
S
,v
)
S
→
I
,
∑
v
:(
I
,v
)
∈
E
′
f
(
I
,v
)
I
→
T
}
.
(4)
We require the auxiliary variable
t
:= 1
/F
to re-write
network flow constraints in the normalized form. For every
edge
e
∈
E
′
, let
d
e
represent the constraint on the edge —
d
e
=
t
means that the edge
e
is cut or fully constrained and
d
e
= 0
means that the edge
e
is unconstrained. Similarly, for
every
e
∈
E
′
,
f
e
S
→
I
,
f
e
I
→
T
, and
f
e
S
→
T
are the respective flow
values on edge
e
in
G
. As the outer (min) player, the tester
variables are the flows
f
e
S
→
I
and
f
e
I
→
T
, edge cuts
d
e
, and the
auxiliary variable
t
. The objective function corresponds to the
tester synthesizing constraints
d
e
such that total flow
F
is
maximized while max-flow of
f
S
→
T
is minimized. Likewise,
the system player gets to maximize flow
f
S
→
T
. Next, the
constraints of the bilevel optimization are detailed. Capacity
constraints determine the maximum flow allowed on an edge.
The capacity constraints for normalized variables in this
optimization are given as follows,
∀
e
∈
E
′
,
0
≤
d
e
≤
t,
0
≤
f
e
S
→
I
≤
t,
0
≤
f
e
S
→
T
≤
t,
0
≤
f
e
I
→
T
≤
t.
(c1)
Cut constraints correspond to the cut variable and flow
variable of an edge competing for its capacity. For all
k
∈
{
S
→
I
,
I
→
T
,
S
→
T
}
, the cut constraints are as follows,
∀
e
∈
E
′
, d
e
+
f
e
k
≤
t.
(c2)
Flow conservation constraints ensure that the total flow
entering entering a node is equal to the total flow leaving
the node (unless the node is a source or a target). For
k
∈ {
S
→
I
,
I
→
T
,
S
→
T
}
, the conservation constraints
are as follows,
∀
v
∈
S
′
∑
u
:(
u,v
)
∈
E
′
f
(
u,v
)
k
=
∑
u
:(
v,u
)
∈
E
′
f
(
v,u
)
k
.
(c3)
Finally, equation (4) can be re-written as the following
constraint after normalizing the variables,
1
≤
∑
v
:(
S
,v
)
∈
E
f
(
S
,v
)
S
→
I
,
1
≤
∑
v
:(
I
,v
)
∈
E
f
(
I
,v
)
I
→
T
.
(c4)
Our framework can synthesize test environments by con-
straining system actions, not by enforcing them. When the
bilevel optimization finds constraints on the virtual product
G
, it takes into account tester behavior unknown to the
system in the form of the test specification. The framework
guarantees that under the synthesized constraints, there al-
ways exists a system trace
σ
=
s
′
0
s
′
1
...
on
G
satisfying the
system specification. However, this could return constraints
for which at some temporal instances in the test execution,
there is no path for satisfying trace on the system product
automaton
S
. During a test execution, these constraints on
the system could potentially translate to the system, at that
temporal instance, not being able to re-plan to satisfying
its requirements. If possible, our framework should return
constraints for which at every temporal instance of the test
execution, the system should find a feasible path to satisfying
its requirements. We add the following constraints to ensure
that for the state
σ
t
=
s
′
t
during the test, for which the state
of the system corresponds to
s
t
, there exists a path to the
acceptance states of
φ
sys
on
S
.
It is not necessary that all of these constraints on the virtual
product graph
G
be visible at any given system state
s
t
. This
feature allows for the tester to place constraints reactively to
the system state
s
t
. To reason about constraints that would
be visible to the system at
s
t
, we define mappings between
the product automata. The first mapping,
M
B
Π
→G
, from the
specification product automaton
B
Π
to
G
defines the states
of
G
that are active for a state
q
∈
B
Π
.Q
:
M
B
Π
→G
(
q
) =
{
(
s,
(
q
sys
,q
test
))
∈
S
′
|
q
= (
q
sys
,q
test
)
}
.
(5)
At
s
′
t
= (
s
t
,q
t
)
during the test execution, state
q
t
is active in
the specification product automaton
B
Π
. The outgoing edges
of nodes in
M
B
Π
→G
(
q
t
)
represent the set of constraints that
can be mapped to
S
at
q
t
. Let
C
G
(
q
t
)
denote these edges:
C
G
(
q
t
) =
{
(
e,d
e
)
|
e
= (
u,v
)
∈
E
′
s.t.
u
∈
M
B
Π
→G
(
q
t
)
}
.
(6)
By construction, every edge-constraint pair
(
e,d
e
)
in
C
G
(
q
t
)
maps to an edge-constraint pair in
S
. Let
V
G→S
map the
nodes from
G
to
S
. If
v
= (
s,
(
q
sys
,q
test
)
∈
S
′
, then
V
G→S
(
v
) = (
s,q
sys
)
∈
S
sys
. Therefore, for every
e
=
((
u,v
)
,d
e
)
∈
C
G
(
q
t
)
, the corresponding edge-constraint pair
on
S
is
((
V
G→S
(
u
)
,V
G→S
(
v
))
,d
e
)
. We denote this map as
M
G→S
, and the mapping is formally stated as follows,
M
G→S
((
u,v
)
,d
e
) = (((
V
G→S
(
u
)
,V
G→S
(
v
))
,d
e
)
.
(7)
Note that the constraint found on
G
is mapped one-to-one
to the constraint on
S
. This node mapping also provides the
initial and acceptance states in
S
that are relevant denoted
V
G→S
(
S
)
and
V
G→S
(
T
)
, respectively. Thus, the constraints
that are active on
S
at
q
∈B
Π
.Q
are as follows,
C
S
(
q
t
) =
{
M
G→S
(
e,d
e
)
|
(
e,d
e
)
∈
C
G
(
q
)
}
.
(c5)
Remark 1.
C
S
(
q
t
)
represents the largest set of constraints
that could be visible to the system at
s
t
. Note that we say
largest possible because the constraints visible to the system
at
s
t
are the constraints on
S
projected onto system transition
T
at that instant. However, not all constraints in
C
S
(
q
t
)
might apply to the system, which is in state
s
t
.
For every edge
e
∈
E
sys
and for every
q
∈B
Π
.Q
, let
f
e
S
(
q
)
denote the flow from source
V
G→S
(
S
)
to target
V
G→S
(
T
)
.
For brevity, we do not elaborate the constraints here, but the
flow variables
f
e
S
(
q
)
must respect the standard network flow
constraints in equations (c1)-(c3). We require the following
condition to be satisfied:
∀
q
∈B
Π
.Q,
∑
e
=(
V
G→S
(
S
)
,v
)
∈
E
sys
f
e
S
(
q
)
≥
1
.
(c6)
Since the above constraint is defined from a fixed source
V
G→S
(
S
)
to target
V
G→S
(
T
)
on
S
, we assume that from
every state
(
s,q
sys
)
∈
S
sys
, there exists an edge to the
source
V
G→S
(
S
)
. For the examples of this paper with the
system specifications of the class (1), this is always the case.
In future work, we would like to prove these properties
for a larger class of specifications and transition systems.
Therefore, the bilevel optimization for synthesizing reactive
constraints is as follows,
MCF-OPT
(
λ
) :
argmin
f
e
S
→
I
f
e
I
→
T
,d
e
,t,f
e
S
(
q
)
argmax
f
e
S
→
T
t
+
λ
∑
v
:(
s
3
,v
)
∈
E
′
f
e
S
→
T
s.t.
(c1)
−
(c6)
,
(8)
where the regularization parameter
λ
penalizes the tester (and
rewards the system) on
f
S
→
T
flow. This optimization is in
the form of a convex-concave min-max Stackleberg game
with dependent constraint sets studied in[31], for which there
always exists a solution.
E. Projecting the constraints onto the physical space
The optimization returns cut edges
E
′
cuts
on the virtual
product graph
G
, which could be fractional values. Fully
constrained edges are assigned
d
e
values close to
t
, therefore
we will only consider those edges to be cut. Lower fractional
values for
d
e
still allow flow to pass through and are not
considered cut. We now need to map these cuts to the
physical space to constrain the system’s actions during the
test execution. We define the projection
P
G→T
(
g
) =
{
s
∈
S
|
g
= (
s,
(
q
sys
,q
test
)
}
,
(9)
Algorithm 1: Constraining Virtual Product Graph
G
1:
procedure
A
UTOMATA
(
T
,φ
sys
,φ
test
)
2:
B
sys
←
BA
(
φ
sys
)
.
System B
̈
uchi automaton
3:
B
test
←
BA
(
φ
test
)
.
Tester B
̈
uchi automaton
4:
B
Π
←B
sys
×B
test
.
Specification product
5:
S ←T ⊗B
sys
.
System product
6:
G ←T ⊗B
Π
.
Virtual Product Graph
7:
return
G
,
S
,
B
π
,
B
sys
,
B
test
8:
9:
procedure
C
ONSTRAINTS
(
T
,φ
sys
,φ
test
)
10:
G
,
S
,
B
π
,
B
sys
,
B
test
←
A
UTOMATA
(
T
,φ
sys
,φ
test
)
11:
Identify nodes
S
,
I
,
T
on
G
12:
Choose
λ
∗
that maximizes
F
and cuts
f
S
→
T
13:
f
∗
S
→
I
,f
∗
I
→
T
,f
∗
S
→
T
,d
∗
,t
∗
←
MCF-OPT
(
λ
∗
)
14:
E
′
cuts
=
set
()
.
To store cuts of
G
15:
for do
e
∈
E
′
16:
if then
d
∗
,e
= 1
.
Ignore fractional cuts
17:
E
′
cuts
←
E
′
cuts
∪
e
18:
Verify
G
= (
S
′
,A,E
′
\
E
′
cuts
)
has no
f
S
→
T
flow.
19:
return
E
′
cuts
which maps each state
g
in the virtual product graph
G
to
its corresponding state in the transition system
T
. This is a
many-to-one mapping where multiple states in
G
will map
to a single state in
T
. Additionally we define the projection
P
G→B
π
(
g
) =
{
(
q
sys
,q
test
)
∈B
π
.Q
|
g
= (
s,
(
q
sys
,q
test
)
}
(10)
which maps each state
g
in
G
to its corresponding state
in
B
π
. During the test execution we will keep track of the
system state in
G
, and when the system enters a state
g
in
G
with an active cut, the corresponding transition from state
P
G→T
(
g
)
in
T
will be constrained. We use the projection
defined in equation 10 to determine a change of state in
B
π
.
For each state of the system in
B
π
, the test environment
will accumulate constraints. Upon transitioning to a state
g
′
that
P
G→B
π
(
g
′
)
results in a change of state in
B
π
, the
obstacles that were placed previously will be removed and
new obstacles will be placed according the active cuts on
G
.
This procedure is outlined in Algorithm 2.
This makes our framework reactive to the system state
during the test execution, where finding static constraints on
G
results in a reactive test policy that constrains the system
actions according to the observed behavior during the test.
IV. E
XPERIMENTAL
R
ESULTS
We have implemented and validated this framework on
simulated grid world examples and hardware experiments.
In addition to these examples we have implemented the
algorithm on grid world mazes and road networks: the results
can be found in this GitHub repository
1
.
1) Robot in a Corridor:
The agent under test for the run-
ning example is controlled by a simple grid world controller
synthesized using TuLiP (Temporal Logic and Planning
Toolbox) [32]. The algorithm presented in section III-D
1
https://github.com/abadithela/Flow-Constraints
(a) Motion primitive graph.
(b) Snapshots of the hardware test execution on the Unitree A1 quadruped.
Fig. 3: Resulting test execution on the Unitree A1 quadruped generated by this framework.
Algorithm 2: Reactive Test Synthesis
1:
procedure
R
EACTIVE
T
EST
(
T
,φ
sys
,φ
test
)
2:
E
′
cuts
←
C
ONSTRAINTS
(
T
,φ
sys
,φ
test
)
3:
g
←
g
0
∈G
4:
C ←∅
.
Initialize empty set of active cuts.
5:
E
current
←
e
.
Initially all transitions from
T
.
6:
while not
q
′
sys
∈B
sys
.F
do
7:
g
′
←
update
state
(
g,
G
,E
current
)
8:
(
q
′
sys
,q
′
test
)
←
P
G→B
π
(
g
′
)
.
Find state in
B
π
.
9:
if
(
q
′
sys
,q
′
test
)
6
= (
q
sys
,q
test
)
then
10:
C ←∅
.
Remove all active cuts.
11:
if
outgoing
edge
G
(
g
′
)
∈
(
E
′
cuts
)
then
.
Add cut.
12:
C ←C∪
outgoing
edge
T
(
P
G→T
(
g
′
))
13:
E
current
←
E
\C
.
Update available transitions.
14:
g
←
g
′
results in a test execution during which the system under test
visits the two pre-determined key locations before reaching
one of the goal states at the end of the corridor. The resulting
test execution is shown in Figure 2d.
2) Hardware Experiments with Quadruped:
Next we will
find a test strategy to test an actual robotic system, the
Unitree A1 quadruped. This quadruped is controlled using
a motion primitive layer with behaviors for lying down,
standing, walking, and jumping. The underlying dynamics of
the transitions between primitives are abstracted away from
the higher-level autonomy as described in [33], and can be
commanded directly. The autonomy layer is provided by a
TuLiP controller generated on an abstraction of the transition
system of the quadruped, consisting of grid world locations
and states corresponding to the available motion primitives.
We find test strategies and execute the resulting test for two
test specifications inspired by a search and rescue mission.
a) Beaver Rescue:
The quadruped’s task is to rescue
the beaver from the hallway and return it to the lab, the
system specification is given as
φ
sys
=
♦
goal, where
goal
corresponds to the quadruped and the beaver reaching the
safe location in the lab. The test specification is given as
φ
test
=
♦
door
1
∧
♦
door
2
ensuring that the quadruped will
use a different door on the way to the beaver and back into
the lab. The resulting test execution shows the quadruped
using
door 2
to exit the lab into the hallway, after it reaches
the beaver this door is shut and the quadruped walks to
door 1
and returns the beaver to the safe location in the
lab. Snapshots of this test execution can be seen in Figure 1.
b) Search and Rescue: Motion Primitive Testing:
In
this test we want to test the motion primitives of the
quadruped shown in figure 3a. The goal for the quadruped is
reaching the beaver in the hallway. The test specification is
given as
φ
test
=
♦
jump
∧
♦
lie
∧
♦
stand, which ensures that
we will test each of these motion primitive at least once.
The test setup includes lights at different heights, which
correspond to the motion primitive which might unlock the
door. The light starts in blue and after the motion primitive
has been executed, the light will turn red - if the door
remains locked - or green - if the door is unlocked. Our
framework will decide whether the doors will be locked or
unlocked according to which motion primitives have already
been observed during the test. Snapshots of the test execution
are shown in Figure 3b.
V. C
ONCLUSIONS AND
F
UTURE
W
ORK
We outlined the problem of finding the minimally con-
strained test as a bilevel optimization. For future work, we
would like to prove that our algorithm is sound and complete,
and provide sub-optimality guarantees on the generated test
environment. The approach outlined in this paper requires
reactive placement of obstacles during a test, which can be
challenging for real-world use cases. Therefore, we aim to
extend this framework to include dynamic test agents to
constrain the system actions, and also allow for finding test
environments requiring the minimum number of test agents
to constrain a test environment for a test specification.
ACKNOWLEDGMENTS
The authors would like to acknowledge Mani Chandy,
Tichakorn Wongpiromsarn, Qiming Zhao, Michel Ingham,
Joel Burdick, Leonard Schulman, Shih-Hao Tseng, Ioannis
Filippidis, and Ugo Rosolia for insightful discussions.
R
EFERENCES
[1] S. Sankaranarayanan and G. Fainekos, “Falsification of temporal
properties of hybrid systems using the cross-entropy method,” in
Proceedings of the 15th ACM international conference on Hybrid
Systems: Computation and Control
, 2012, pp. 125–134.
[2] J. Kapinski, J. V. Deshmukh, X. Jin, H. Ito, and K. Butts, “Simulation-
based approaches for verification of embedded control systems: An
overview of traditional and advanced modeling, testing, and verifica-
tion techniques,”
IEEE Control Systems Magazine
, vol. 36, no. 6, pp.
45–64, 2016.
[3] Y. Annpureddy, C. Liu, G. Fainekos, and S. Sankaranarayanan, “S-
taliro: A tool for temporal logic falsification for hybrid systems,” in
International Conference on Tools and Algorithms for the Construction
and Analysis of Systems
. Springer, 2011, pp. 254–257.
[4] G. Chou, Y. E. Sahin, L. Yang, K. J. Rutledge, P. Nilsson, and N. Ozay,
“Using control synthesis to generate corner cases: A case study on
autonomous driving,”
IEEE Transactions on Computer-Aided Design
of Integrated Circuits and Systems
, vol. 37, no. 11, pp. 2906–2917,
2018.
[5] T. Dang and T. Nahhal, “Coverage-guided test generation for continu-
ous and hybrid systems,”
Formal Methods in System Design
, vol. 34,
no. 2, pp. 183–213, 2009.
[6] M. Hekmatnejad, B. Hoxha, and G. Fainekos, “Search-based test-case
generation by monitoring responsibility safety rules,”
arXiv preprint
arXiv:2005.00326
, 2020.
[7] E. Plaku, L. E. Kavraki, and M. Y. Vardi, “Falsification of ltl safety
properties in hybrid systems,”
International Journal on Software Tools
for Technology Transfer
, vol. 15, no. 4, pp. 305–320, 2013.
[8] A. Donz
́
e, “Breach, a toolbox for verification and parameter synthesis
of hybrid systems,” in
International Conference on Computer Aided
Verification
. Springer, 2010, pp. 167–170.
[9] G. E. Fainekos and G. J. Pappas, “Robustness of temporal logic spec-
ifications for continuous-time signals,”
Theoretical Computer Science
,
vol. 410, no. 42, pp. 4262–4291, 2009.
[10] T. Dreossi, D. J. Fremont, S. Ghosh, E. Kim, H. Ravanbakhsh,
M. Vazquez-Chanlatte, and S. A. Seshia, “Verifai: A toolkit for the
formal design and analysis of artificial intelligence-based systems,” in
International Conference on Computer Aided Verification
. Springer,
2019, pp. 432–442.
[11] “Technical
Evaluation
Criteria,”
https://archive.darpa.mil/
grandchallenge/rules.html.
[12] “DARPA Urban Challenge,” https://www.darpa.mil/about-us/timeline/
darpa-urban-challenge.
[13] H. Kress-Gazit, G. E. Fainekos, and G. J. Pappas, “Temporal-logic-
based reactive mission and motion planning,”
IEEE transactions on
robotics
, vol. 25, no. 6, pp. 1370–1381, 2009.
[14] T. Wongpiromsarn, U. Topcu, and R. M. Murray, “Receding horizon
temporal logic planning,”
IEEE Transactions on Automatic Control
,
vol. 57, no. 11, pp. 2817–2830, 2012.
[15] M. Kloetzer and C. Belta, “Temporal logic planning and control of
robotic swarms by hierarchical abstractions,”
IEEE Transactions on
Robotics
, vol. 23, no. 2, pp. 320–330, 2007.
[16] M. Lahijanian, S. Almagor, D. Fried, L. E. Kavraki, and M. Y. Vardi,
“This time the robot settles for a cost: A quantitative approach to
temporal logic planning with partial satisfaction,” in
Twenty-Ninth
AAAI Conference on Artificial Intelligence
, 2015.
[17] A. Censi, K. Slutsky, T. Wongpiromsarn, D. Yershov, S. Pendleton,
J. Fu, and E. Frazzoli, “Liability, ethics, and culture-aware behavior
specification using rulebooks,” in
2019 International Conference on
Robotics and Automation (ICRA)
. IEEE, 2019, pp. 8536–8542.
[18] S. Shalev-Shwartz, S. Shammah, and A. Shashua, “On a for-
mal model of safe and scalable self-driving cars,”
arXiv preprint
arXiv:1708.06374
, 2017.
[19] T. Wongpiromsarn, K. Slutsky, E. Frazzoli, and U. Topcu, “Minimum-
violation planning for autonomous systems: Theoretical and practical
considerations,” in
2021 American Control Conference (ACC)
. IEEE,
2021, pp. 4866–4872.
[20] G. Ernst, P. Arcaini, I. Bennani, A. Donze, G. Fainekos, G. Frehse,
L. Mathesen, C. Menghi, G. Pedrinelli, M. Pouzet
et al.
, “Arch-comp
2020 category report: Falsification,”
EPiC Series in Computing
, 2020.
[21] S. Ghosh, F. Berkenkamp, G. Ranade, S. Qadeer, and A. Kapoor,
“Verifying controllers against adversarial examples with bayesian
optimization,” in
2018 IEEE International Conference on Robotics
and Automation (ICRA)
. IEEE, 2018, pp. 7306–7313.
[22] A. Zutshi, J. V. Deshmukh, S. Sankaranarayanan, and J. Kapinski,
“Multiple shooting, cegar-based falsification for hybrid systems,”
in
Proceedings of the 14th International Conference on Embedded
Software
, 2014, pp. 1–10.
[23] A. Corso, R. Moss, M. Koren, R. Lee, and M. Kochenderfer, “A
survey of algorithms for black-box safety validation of cyber-physical
systems,”
Journal of Artificial Intelligence Research
, vol. 72, pp. 377–
428, 2021.
[24] J. B. Graebener, A. Badithela, and R. M. Murray, “Towards better test
coverage: Merging unit tests for autonomous systems,” in
NASA For-
mal Methods: 14th International Symposium, NFM 2022, Pasadena,
CA, USA, May 24–27, 2022, Proceedings
, 2022, pp. 133–155.
[25] A. Badithela, J. B. Graebener, and R. M. Murray, “Minimally con-
strained testing for autonomy with temporal logic specifications,”
2022.
[26] C. Baier and J.-P. Katoen,
Principles of model checking
. MIT press,
2008.
[27] J. R. B
̈
uchi,
On a Decision Method in Restricted Second Order
Arithmetic
. New York, NY: Springer New York, 1990, pp. 425–435.
[Online]. Available: https://doi.org/10.1007/978-1-4613-8928-6
23
[28] O. Kupferman, G. Vardi, and M. Y. Vardi, “Flow games,” in
37th
IARCS Annual Conference on Foundations of Software Technology and
Theoretical Computer Science (FSTTCS 2017)
.
Schloss Dagstuhl-
Leibniz-Zentrum fuer Informatik, 2018.
[29] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein,
Introduction
to algorithms
. MIT press, 2009.
[30] V. V. Vazirani,
Approximation algorithms
. Springer, 2001, vol. 1.
[31] D. Goktas and A. Greenwald, “Convex-concave min-max stackelberg
games,”
Advances in Neural Information Processing Systems
, vol. 34,
2021.
[32] T. Wongpiromsarn, U. Topcu, N. Ozay, H. Xu, and R. M. Murray,
“Tulip: a software toolbox for receding horizon temporal logic plan-
ning,” in
Proceedings of the 14th international conference on Hybrid
systems: computation and control
, 2011, pp. 313–314.
[33] W. Ubellacker, N. Csomay-Shanklin, T. G. Molnar, and A. D. Ames,
“Verifying safe transitions between dynamic motion primitives on
legged robots,” in
2021 IEEE/RSJ International Conference on In-
telligent Robots and Systems (IROS)
, 2021, pp. 8477–8484.