Proceedings of Machine Learning Research vol 144:1–12, 2021

3rd Annual Conference on Learning for Dynamics and Control

Stable Online Control of Linear Time-Varying Systems

Guannan Qu∗ (GQU@CALTECH.EDU)
Yuanyuan Shi∗ (YSHI@CALTECH.EDU)
Sahin Lale∗ (ALALE@CALTECH.EDU)
Anima Anandkumar (ANIMA@CALTECH.EDU)
Adam Wierman (ADAMW@CALTECH.EDU)

California Institute of Technology, Pasadena, CA

∗ Equal contribution

Editors: A. Jadbabaie, J. Lygeros, G. J. Pappas, P. A. Parrilo, B. Recht, C. J. Tomlin, M. N. Zeilinger

Abstract

Linear time-varying (LTV) systems are widely used for modeling real-world dynamical systems

due to their generality and simplicity. Providing stability guarantees for LTV systems is one of the

central problems in control theory. However, existing approaches that guarantee stability typically

lead to significantly sub-optimal cumulative control cost in online settings where only current or

short-term system information is available. In this work, we propose an efficient online control

algorithm, COvariance Constrained Online Linear Quadratic (COCO-LQ) control, that guarantees

input-to-state stability for a large class of LTV systems while also minimizing the control cost. The

proposed method incorporates a state covariance constraint into the semi-definite programming

(SDP) formulation of the LQ optimal controller. We empirically demonstrate the performance of

COCO-LQ in both synthetic experiments and a power system frequency control example.

Keywords: Time-varying systems, online linear quadratic control, stability guarantee

1. Introduction

Time-invariant systems have traditionally been the main focus of study in the linear dynamical systems community. However, real-world systems are often time-varying. For example, consider a power system that includes renewable generation (e.g., solar/wind). Due to the intermittency of renewable energy, the system dynamics for frequency regulation in the power system are time-varying. Applying a time-invariant controller in this setting may lead to frequency instability and line failures (Ulbig et al., 2014). Time-varying systems are also crucial for many other applications, such as autonomous vehicles and aircraft control (Falcone et al., 2008). While not all time-varying systems have linear dynamics, many applications can be approximated by linear time-varying (LTV) systems via a local linear approximation at each time step (Todorov and Li, 2005), e.g., the frequency control example described above. As a result, LTV systems are widely used and there is a large literature focused on designing controllers for LTV systems (Amato et al., 2010; Ouyang et al., 2017).

Perhaps the most fundamental challenge in dynamical systems is stability. While the design of stable linear time-invariant (LTI) systems is well understood, the same cannot be said for LTV systems. To this point, several notions of stability have received attention, e.g., input-to-state stability (ISS), mean-square stability and Lyapunov stability. ISS is the most widely adopted notion and aims to guarantee the boundedness of the state given bounded initial conditions (Hong et al., 2010). In most applications of LTV systems, it is crucial to guarantee ISS in order to avoid saturation and to maintain the robustness and validity of linearization (Tarbouriech et al., 2006; Khalil, 2002).

While there is considerable prior work focused on stability in LTV systems, most prior work studies stability in the offline setting where either the sequence of system parameters is known, e.g., (Amato et al., 2010; Li et al., 2019a), or the system parameters have a particular variation pattern, e.g., (Garcia et al., 2009). Maintaining ISS guarantees becomes significantly harder in the online setting where the system parameters are observed in real-time and may have arbitrary variations. This online setting is the most relevant to many applications, e.g., frequency regulation.

Though stability is crucial, it is not enough for a controller to be stable. A controller must also have low cost. For instance, in order to stabilize the dynamics, a controller may use arbitrarily large control inputs, which may result in sub-optimal cost. In classical optimal control problems, e.g., the time-varying linear quadratic (LQ) control setting, the goal is to design a stabilizing controller that minimizes the cost for a particular finite horizon while assuming access to the whole trajectory for that duration. It is possible to characterize the optimal policy in such settings (Bertsekas et al., 1995); however, in the online setting when only current or short-term system information is available, these methods may not guarantee stability, e.g., see Section 3. There have been recent efforts to provide sub-optimality guarantees on the incurred cost in the online LTV setting, e.g., (Gradu et al., 2020), but it is unclear if the proposed controllers maintain stability for all time-steps since the main focus is on minimizing the cumulative cost.

Thus, despite considerable recent work, much remains to be understood about the design of

online LTV controllers. In particular, this paper is motivated by the following question:

Is it possible for an online controller to guarantee stability and maintain low cost in LTV systems?

Contributions.

In this work, we answer the question above affirmatively. Specifically, we propose COvariance Constrained Online Linear Quadratic (COCO-LQ) control, a novel online control algorithm that aims to minimize the control cost while ensuring provable stability guarantees in LTV systems without restricting how slowly or quickly the underlying system changes. Further, we demonstrate the performance of the proposed method in various synthetic LTV systems and in the power system frequency control example that motivated our study.

The main technical contribution of the paper is a stability guarantee for COCO-LQ in LTV sys-

tems. Specifically, we show that COCO-LQ guarantees ISS in online time-varying systems. The

key technique that underpins the proposed algorithm is the addition of a novel semi-definiteness

constraint on the state covariance matrix into the standard online semi-definite programming (SDP)

formulation of linear quadratic optimal control. We show that this constraint promotes the sequen-

tial strong stability of the controllers (Cohen et al., 2018), which in turn guarantees ISS with a proper

choice of an algorithm hyperparameter. Adding this additional constraint is simple and does not re-

sult in a significant increase of computational complexity compared to the standard LQ formulation.

Moreover, we prove that if the proposed SDP is not directly feasible, short-term predictions on the

future system parameters are necessary and can be used in COCO-LQ in order to ensure ISS.

Related work.

The work in this paper builds on the design of linear time-invariant (LTI) con-

trollers to provide a new approach for the design of stable controllers for linear-time-varying (LTV)

systems. As such, we describe related work on both LTI and LTV systems below.

LTI Systems.

In the study of control of LTI systems, the linear quadratic regulator (LQR) has been considered in detail. In the classical setting where the underlying system is known, the optimal control law is given by a linear feedback controller obtained by solving Riccati equations (Bertsekas et al., 1995). Alternatively, the optimal control problem can also be posed via semi-definite programming (SDP) (Vandenberghe and Boyd, 1996), which is the approach we build on in the current paper.

Recently, there has been growing interest in online control of these linear systems when the underlying dynamics are unknown. Most of these works study the problem from a regret minimization perspective, e.g., (Abbasi-Yadkori and Szepesvári, 2011; Dean et al., 2018; Lale et al., 2020a,b). However, these methods have so far only been applied in LTI systems with time-varying costs and disturbances. Extensions to LTV dynamics, which are the focus of this paper, are not known.

LTV Systems.

As in the case of LTI systems, the optimal controller for LTV systems where the sequence of system parameters is known can be obtained by solving backward Riccati equations (Bertsekas et al., 1995). However, in the online case when the sequence of systems is unknown, the design of controllers is challenging. There are several lines of work in adaptive control and model-predictive control (MPC) that have been studied to this point. In adaptive control of LTV systems, the underlying systems are unknown and the results generally assume slow and bounded or fixed systematic variation of dynamics with bounded disturbances (Middleton and Goodwin, 1988; Marino and Tomei, 2000; Ouyang et al., 2017). In MPC of LTV systems, a finite horizon of the sequence of systems (predictions) is known and the system is again assumed to be slowly varying or open-loop stable, e.g., (Zheng and Morari, 1994; Falcone et al., 2007). Different from prior works, in the current work we consider the online problem and make no assumptions about how the system varies over time.

As in the LTI setting, the study of regret minimization in LTV systems has recently received attention. Goel and Hassibi (2020) and Gradu et al. (2020) are most related to the current paper. Goel and Hassibi (2020) consider the setting where the sequence of systems is known and provide a regret-optimal controller framework. Gradu et al. (2020) study the adaptive regret of online control in LTV systems with bounded cost. Note that when the cost is bounded, a finite regret need not guarantee stability. In contrast, we use a quadratic (unbounded) cost and we can guarantee stability.

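Notation. We denote the Euclidean norm of a vector $x$ by $\|x\|$. For a matrix $A$, $\|A\|$ is its spectral norm, $A^\top$ is its transpose, and $\mathrm{Tr}(A)$ is its trace. $\mathcal{N}(\mu, \Sigma)$ denotes the normal distribution with mean $\mu$ and covariance $\Sigma$. $A \succ B$ and $A \succeq B$ denote that $A - B$ is positive definite and positive semi-definite, respectively. $A \bullet B$ denotes the element-wise inner product of $A$ and $B$, i.e., $A \bullet B = \mathrm{Tr}(A^\top B)$.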

2. Model & Background

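We consider the following linear time-varying (LTV) system,

$$x_{t+1} = A_t x_t + B_t u_t + w_t, \qquad (1)$$

where $x_t \in \mathbb{R}^n$ is the system state, $u_t \in \mathbb{R}^m$ is the control input and $w_t \in \mathbb{R}^n$ is the disturbance at time $t$. The system is stochastic, i.e., $w_t \sim \mathcal{N}(0, W)$ for $W \succ 0$. The cost at each time-step is a quadratic function of the state and control, $x_t^\top Q x_t + u_t^\top R u_t$, where $Q, R \succ 0$.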

The decision maker operates in an online setting. That is, at each time-step $t$, the learner observes the state $x_t$ and system matrices $(A_t, B_t)$ before choosing action $u_t$ and suffering cost $x_t^\top Q x_t + u_t^\top R u_t$. We assume that the cost matrices $(Q, R)$ are time-invariant and known to the learner. However, future system matrices $(A_{t+1}, \dots, A_T)$ and $(B_{t+1}, \dots, B_T)$ are unknown to the learner and are chosen by the environment, potentially stochastically or adversarially.

Stability.

One of the most central goals for controller design is to ensure stability. In this work,

we focus on the notion of input to state stability (ISS) and strive to design controllers that provide

ISS. ISS has been the main notion of stability considered in designing stabilizing controllers both

in linear and nonlinear systems (Hong et al., 2010; Sontag, 2008; Jiang and Wang, 2001). To

formally define ISS, let $\mathcal{K}_\infty$ be the set of functions from nonnegative reals to nonnegative reals that are continuous, strictly increasing, and bijective. Then, ISS is defined as follows.

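Definition 1 (ISS) An LTV system with deterministic policy $\pi$ is said to be input-to-state stable if there exist functions $\beta : [0, \infty) \times \mathbb{N} \to [0, \infty)$ and $\gamma \in \mathcal{K}_\infty$ that satisfy $\beta(\cdot, t) \in \mathcal{K}_\infty$ for any $t \in \mathbb{N}$ and $\lim_{t \to \infty} \beta(a, t) = 0$ for any $a \ge 0$ such that, for any disturbance sequence $\{w_t\}_{t=0}^{\infty}$, any initial time $t_0$, any initial state $x_{t_0}$, and any $t \ge t_0$, we have

$$\|x_t\| \le \beta(\|x_{t_0}\|,\, t - t_0) + \gamma\Big(\sup_{t' \in \mathbb{N}} \|w_{t'}\|\Big).$$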
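To make Definition 1 concrete, the following sketch checks the ISS inequality numerically for a scalar system $x_{t+1} = r x_t + w_t$ with $|r| < 1$. The system, the value of `r`, and the helper names are illustrative assumptions and do not appear in the paper. For this system one valid choice is $\beta(a, t) = |r|^t a$ and $\gamma(s) = s / (1 - |r|)$.

```python
import numpy as np

def iss_bound(x0_norm, t, w_sup, r):
    """Right-hand side beta(|x0|, t) + gamma(sup |w|) for the scalar system."""
    beta = (abs(r) ** t) * x0_norm       # decays to 0 as t grows, increasing in |x0|
    gamma = w_sup / (1.0 - abs(r))       # continuous, strictly increasing, bijective in w_sup
    return beta + gamma

def simulate_scalar(r, x0, w):
    """Roll out x_{t+1} = r * x_t + w_t and return the trajectory [x_0, ..., x_T]."""
    xs = [x0]
    for wt in w:
        xs.append(r * xs[-1] + wt)
    return np.array(xs)

# Hypothetical numbers chosen only for illustration.
rng = np.random.default_rng(0)
r, x0 = 0.9, 5.0
w = rng.uniform(-1.0, 1.0, size=200)
xs = simulate_scalar(r, x0, w)
w_sup = np.max(np.abs(w))

# The ISS inequality should hold at every time step (small slack for rounding).
iss_holds = all(
    abs(xs[t]) <= iss_bound(abs(x0), t, w_sup, r) + 1e-12 for t in range(len(xs))
)
```

The first term captures the decaying influence of the initial state, and the second term bounds the accumulated effect of the disturbances by a geometric series.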

Cost. In addition to stability, another important objective for controller design is maintaining a small, near-optimal control cost. Here we adopt the standard linear quadratic (LQ) cost model, i.e.,

$$J(\pi) = \lim_{T \to \infty} \frac{1}{T}\, \mathbb{E}\Big[\sum_{t=1}^{T} x_t^\top Q x_t + u_t^\top R u_t\Big], \qquad (2)$$

where $u_1, \dots, u_T$ are chosen according to policy $\pi$, and the expectation is taken with respect to the randomness of the noise sequence $\{w_t\}$.
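The average LQ cost can be estimated by simulation. The sketch below rolls out an LTV system of the form $x_{t+1} = A_t x_t + B_t u_t + w_t$ with linear feedback $u_t = K_t x_t$ and averages the per-step quadratic cost. The function name, the switching matrices, and the gains are hypothetical choices made for illustration, not the paper's experimental setup.

```python
import numpy as np

def average_lq_cost(A_seq, B_seq, K_seq, Q, R, W, x0, rng):
    """Empirical average of x_t' Q x_t + u_t' R u_t along one rollout with u_t = K_t x_t."""
    x = np.asarray(x0, dtype=float)
    n = x.shape[0]
    total = 0.0
    for A, B, K in zip(A_seq, B_seq, K_seq):
        u = K @ x                                   # linear state feedback
        total += x @ Q @ x + u @ R @ u              # per-step quadratic cost
        w = rng.multivariate_normal(np.zeros(n), W) # Gaussian disturbance with covariance W
        x = A @ x + B @ u + w                       # LTV state update
    return total / len(A_seq)

# Hypothetical example: A_t alternates between 0.5*I and -0.5*I, B_t = I,
# and K_t = -A_t cancels the dynamics so that x_{t+1} = w_t.
n, T = 2, 5000
rng = np.random.default_rng(0)
A_seq = [0.5 * np.eye(n) if t % 2 == 0 else -0.5 * np.eye(n) for t in range(T)]
B_seq = [np.eye(n)] * T
K_seq = [-A for A in A_seq]
Q, R, W = np.eye(n), np.eye(n), 0.01 * np.eye(n)
cost = average_lq_cost(A_seq, B_seq, K_seq, Q, R, W, np.zeros(n), rng)
```

In this example the expected per-step cost is $1.25\,\mathrm{Tr}(W) = 0.025$, so the empirical average should land close to that value for long rollouts.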

In this work, our goal is to ensure both stability and near-optimal cost. It should be noted that

there is a trade-off between these two goals. On the one hand, a stabilizing controller without cost-

awareness may produce arbitrarily large control inputs and induce high cost, which is impractical

to implement. On the other hand, a greedy approach that merely focuses on cost minimization may

lead to instability, as we highlight in Section 3 below.

Though our focus is on LTV systems, our approach builds on the SDP formulation of the optimal

controller for LTI systems in (Vandenberghe and Boyd, 1996).

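Proposition 2 (Vandenberghe and Boyd, 1996) When $A_t = A$, $B_t = B$ and $(A, B)$ is controllable, the optimal $K^* = \mathrm{LQR}(A, B, Q, R)$, where $u_t = K^* x_t$, can be obtained by the following SDP

$$\min_{\Sigma}\ \begin{bmatrix} Q & 0 \\ 0 & R \end{bmatrix} \bullet \Sigma \quad \text{s.t.} \quad \Sigma_{xx} = \begin{bmatrix} A & B \end{bmatrix} \Sigma \begin{bmatrix} A & B \end{bmatrix}^\top + W, \quad \Sigma \succeq 0,$$

which has a unique symmetric solution $\Sigma^*$ that decomposes to the following blocks

$$\Sigma^* = \begin{bmatrix} \Sigma^*_{xx} & \Sigma^*_{xu} \\ \Sigma^*_{ux} & \Sigma^*_{uu} \end{bmatrix},$$

where $\Sigma^*_{xx} \in \mathbb{R}^{n \times n}$, $\Sigma^*_{ux} \in \mathbb{R}^{m \times n}$ and $\Sigma^*_{uu} \in \mathbb{R}^{m \times m}$. Then, the optimal controller is $K^* = \Sigma^*_{ux} (\Sigma^*_{xx})^{-1}$.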

The optimal LQR controller described above both stabilizes the system and achieves the minimum

cost. The current paper makes a step toward understanding if it is possible to extend this formulation

to the case of LTV systems.
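For comparison with the SDP route, the LTI gain can also be computed with the Riccati approach mentioned in the related work. The sketch below uses plain Riccati value iteration in NumPy; it is not the SDP of Proposition 2, and the function name and stopping rule are illustrative assumptions. The sign convention $u = Kx$ matches the proposition.

```python
import numpy as np

def lqr_gain(A, B, Q, R, iters=1000, tol=1e-10):
    """Infinite-horizon discrete-time LQR gain K (with u = K x) via Riccati value iteration."""
    P = Q.copy()
    for _ in range(iters):
        BtP = B.T @ P
        K = -np.linalg.solve(R + BtP @ B, BtP @ A)  # K = -(R + B'PB)^{-1} B'PA
        P_next = Q + A.T @ P @ (A + B @ K)          # Riccati update written in closed-loop form
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

# Scalar sanity check: A = B = Q = R = 1 gives P = (1 + sqrt(5)) / 2 and K = -P / (1 + P).
K_scalar, P_scalar = lqr_gain(np.eye(1), np.eye(1), np.eye(1), np.eye(1))

# A small 2-state example with a single input (hypothetical numbers).
A = np.array([[1.0, 0.2], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
K, P = lqr_gain(A, B, np.eye(2), np.eye(1))
closed_loop_radius = float(np.max(np.abs(np.linalg.eigvals(A + B @ K))))
```

For a controllable pair the resulting closed-loop matrix $A + BK$ has spectral radius below one, which is the LTI stability property that the naive approach of the next section tries to reuse at every time step.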

3. A Naive Approach

How to achieve stable, cost-optimal control of LTI systems is well-known; however, this is not the case in LTV systems. To illustrate the challenge of online control of LTV systems, we start by studying the performance of a naive “plug in” approach where, upon receiving $(A_t, B_t)$, an optimal controller for $(A_t, B_t)$ is computed under the assumption that the system is time-invariant. Due to its simplicity, this approach has been employed in many contexts, e.g., Li et al. (2019b) for a Markov decision process setting. In this section we provide an example which shows that such a myopic approach based on the optimal LTI control described above fails to stabilize the system even in simple settings where $A_t$ can only switch between two possible choices and $B_t$ is fixed. This highlights that one cannot naively apply LTI design approaches in LTV systems and expect to maintain stability.

Example 1

Consider a system with

= 0

, and

[

a ρ

]

′

[

ρ a

]

where

<ρ<

, and

√

. Suppose

alternates between

and

′

and

. Define the

optimal LTI controllers for

and

′

LQR

(

A,B,Q,R

)

and

′

LQR

(

′

,B,Q,R

)

To show that the optimal LTI controllers will not stabilize the system, we consider a case where

→

. In this case, one can check that

K,K

′

→

. Since

alternates between

A,A

′

also

alternates between

and

′

under the myopic design we are considering. Thus, the system state

follows

= (

)(

′

)

Notice that as

→

(

)(

′

)

→

′

[

aρ

aρ a

]

. Here,

′

is unstable since its largest eigenvalue is greater than

Tr(

′

) =

2

. Thus, for small enough

, the naive strategy that uses the LTI controller at each

time-step leads to instability.
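The mechanism behind this kind of failure is that a product of two individually stable matrices can be unstable. The sketch below illustrates this with hypothetical matrices that are not the ones in Example 1: each matrix has spectral radius about 0.5, yet alternating between them makes the state grow. It corresponds to the limiting case in which the feedback gains are negligible, so the closed loop is driven by the open-loop matrices.

```python
import numpy as np

def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

# Hypothetical matrices chosen for illustration only.
A = np.array([[0.5, 2.0], [0.0, 0.5]])
A_prime = A.T

rho_A = spectral_radius(A)                 # each matrix alone is stable
rho_A_prime = spectral_radius(A_prime)
rho_product = spectral_radius(A @ A_prime) # the two-step map is not

# Alternate between the two matrices for 20 steps with no disturbance.
x = np.array([1.0, 1.0])
for _ in range(10):
    x = A @ (A_prime @ x)
final_norm = float(np.linalg.norm(x))
```

Since stability of each frozen-time system says nothing about the product, any design that only looks at the current matrices in isolation needs an additional mechanism to couple consecutive steps.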

4. Main Result

The previous section highlights that a naive application of LTI control cannot guarantee stability for

LTV systems. We now propose a new approach, COvariance Constrained Online LQ (COCO-LQ)

control (Section 4.1). Our main technical result shows that COCO-LQ provably guarantees stability

in LTV systems (Section 4.2) when the SDP is feasible. In Section 4.3, we discuss how to handle

the situation when the SDP is infeasible, and Section 4.4 discusses the effect of model estimation error.

Detailed proofs can be found in the Appendix of our online report Qu et al. (2021).

4.1. COvariance Constrained Online LQ (COCO-LQ)

The naive approach discussed in Section 3 seeks to solve the LTI problem at every time step, which is equivalent to solving the SDP in Proposition 2 for every $(A_t, B_t)$. The reason this method fails is that it only considers cost minimization without explicitly considering stability. The main idea of COCO-LQ is to enforce stability via a state covariance constraint embedded into the SDP framework. The proposed algorithm is stated formally in Algorithm 1. COCO-LQ solves an SDP (3) at each time step that is similar to that in Proposition 2. The crucial difference is the new constraint (3d), which involves parameter $\alpha$. Plugging (3b) into constraint (3d) yields the following:

$$\Sigma_{xx} \preceq \frac{1}{1 - \alpha} W.$$

This highlights that constraint (3d) can be interpreted as an upper bound on the state covariance matrix $\Sigma_{xx}$. When $\alpha = 0$, the controller essentially cancels out the dynamics, without taking into account the cost of doing so. This ensures stability but can lead to large cost. At another extreme, when $\alpha \to 1$, the SDP solved at each time step is the same as for the LTI setting, and so COCO-LQ matches the naive approach in Section 3. Thus, $\alpha$ trades off between stability and cost. In the following section, we show that this novel state covariance constraint promotes sequential strong stability (Cohen et al., 2018), which in turn guarantees ISS with a proper choice of $\alpha$.

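Algorithm 1: COCO-LQ: COvariance Constrained Online LQ

Parameters: $\alpha \in [0, 1)$
Input: $Q, R, W$
for $t = 1, 2, \dots$ do
    Receive state $x_t$ and system parameters $(A_t, B_t)$
    Compute policy: Let $\Sigma_t \in \mathbb{R}^{(n+m) \times (n+m)}$ be an optimal solution to the SDP program:

    $$\text{minimize}_{\Sigma} \quad \begin{bmatrix} Q & 0 \\ 0 & R \end{bmatrix} \bullet \Sigma \qquad (3a)$$
    $$\text{subject to} \quad \Sigma_{xx} = \begin{bmatrix} A_t & B_t \end{bmatrix} \Sigma \begin{bmatrix} A_t & B_t \end{bmatrix}^\top + W \qquad (3b)$$
    $$\Sigma \succeq 0 \qquad (3c)$$
    $$\begin{bmatrix} A_t & B_t \end{bmatrix} \Sigma \begin{bmatrix} A_t & B_t \end{bmatrix}^\top \preceq \alpha\, \Sigma_{xx} \qquad (3d)$$

    and set $K_t = (\Sigma_t)_{ux} (\Sigma_t)_{xx}^{-1}$
    Play $u_t = K_t x_t$
    Update $x_{t+1} = A_t x_t + B_t u_t + w_t$, $w_t \sim \mathcal{N}(0, W)$
end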
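The covariance interpretation of constraint (3d) can be spelled out in two lines. The derivation below is a sketch that assumes (3b) reads $\Sigma_{xx} = [A_t\ B_t]\,\Sigma\,[A_t\ B_t]^\top + W$ and (3d) reads $[A_t\ B_t]\,\Sigma\,[A_t\ B_t]^\top \preceq \alpha \Sigma_{xx}$, with $0 \le \alpha < 1$ and $W \succ 0$.

```latex
\begin{align*}
\Sigma_{xx} - W
  &= \begin{bmatrix} A_t & B_t \end{bmatrix} \Sigma
     \begin{bmatrix} A_t & B_t \end{bmatrix}^\top
     && \text{by (3b)} \\
  &\preceq \alpha\, \Sigma_{xx}
     && \text{by (3d)} \\
\Longrightarrow \quad (1 - \alpha)\, \Sigma_{xx} &\preceq W
  \quad \Longrightarrow \quad
  \Sigma_{xx} \preceq \tfrac{1}{1 - \alpha}\, W .
\end{align*}
```

Under the same assumptions, (3b) together with $\Sigma \succeq 0$ also gives the lower bound $\Sigma_{xx} \succeq W$, so the state covariance is sandwiched between $W$ and $W / (1 - \alpha)$.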

4.2. Stability

We now state our main technical result, which provides a formal stability guarantee for COCO-LQ.

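Theorem 3 Let $0 \le \alpha < \frac{1}{\kappa_W + 1}$, and suppose (3) is feasible for all $t$. Then the resulting dynamical system satisfies ISS in the sense that for any disturbance sequence $\{w_k\}_{k=0}^{\infty}$ and for any $t \ge t_0$,

$$\|x_t\| \le \kappa \rho^{\,t - t_0} \|x_{t_0}\| + \frac{\kappa}{1 - \rho} \sup_{t_0 \le k < t} \|w_k\|,$$

for $\rho = \sqrt{\frac{\alpha \kappa_W}{1 - \alpha}} \in [0, 1)$ and $\kappa = \sqrt{\frac{\kappa_W}{1 - \alpha}}$, where $\kappa_W = \|W\| \|W^{-1}\|$ is the condition number of $W$.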

The key intuition underlying this result is that the additional state covariance constraint (3d) im-

plicitly enforces sequential strong stability (Cohen et al., 2018), which in turn ensures ISS. More

formally, sequential strong stability is defined as follows,

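Definition 4 (Sequential Strong Stability) A sequence of policies $K_1, \dots, K_T$ is $(\kappa, \gamma, \rho)$-sequentially strongly stable (for $\kappa > 0$, $0 < \gamma \le 1$ and $0 \le \rho < 1$) if there exist matrices $H_1, \dots, H_T \succ 0$ and $L_1, \dots, L_T$ such that $A_t + B_t K_t = H_t L_t H_t^{-1}$ for all $t$, with the following properties: (a) $\|L_t\| \le 1 - \gamma$; (b) $\|H_t\| \le \bar\beta$ and $\|H_t^{-1}\| \le 1/\beta$ with $\kappa = \bar\beta/\beta$; (c) $\|H_{t+1}^{-1} H_t\| \le \rho / (1 - \gamma)$.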

The following lemma formalizes the connection between (3d) and sequential strong stability.

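Lemma 5 Under the conditions in Theorem 3, the policies designed by COCO-LQ are $(\kappa, \gamma, \rho)$-sequentially strongly stable for $\kappa = \sqrt{\frac{\kappa_W}{1 - \alpha}}$, $\gamma = 1 - \sqrt{\alpha}$, $\rho = \sqrt{\frac{\alpha \kappa_W}{1 - \alpha}}$, where $\kappa_W = \|W\| \|W^{-1}\|$.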

With Lemma 5, proving the result in Theorem 3 only requires showing that sequential strong stability implies ISS. The complete proofs of Lemma 5 and Theorem 3 are given in Appendix A of our online report Qu et al. (2021). A critical assumption in Theorem 3 is that the SDP given in (3) is feasible for $\alpha$ in the range required by the theorem. The following result shows that when $B_t$ is full row rank, the problem is always feasible. The proof of Lemma 6 is postponed to Appendix B in Qu et al. (2021).

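Algorithm 2: COCO-LQ-Prediction: COvariance Constrained Online LQ with Predictions

Parameters: $\alpha \in [0, 1)$, prediction horizon $H$
Input: $Q, R, W$
for $t = 1, 2, \dots$ do
    if $t \equiv 1 \pmod{H}$ then
        Receive state $x_t$ and system parameters $(A_t, B_t), \dots, (A_{t+H-1}, B_{t+H-1})$
        Compute policy: Let $\Sigma_t \in \mathbb{R}^{(n+Hm) \times (n+Hm)}$ be a solution to the constrained SDP in (3) with $(R, A_t, B_t)$ replaced by $(\bar R, \bar A_t, \Xi_t)$, where $\bar R = \mathrm{diag}(R, \dots, R)$ ($H$ repeating blocks), $\bar A_t := A_{t+H-1} \cdots A_t$, and $\Xi_t := [B_{t+H-1},\ A_{t+H-1} B_{t+H-2},\ \dots,\ A_{t+H-1} \cdots A_{t+1} B_t]$
        Set $\bar K_t = (\Sigma_t)_{ux} (\Sigma_t)_{xx}^{-1}$ and $[u_{t+H-1}^\top, \dots, u_t^\top]^\top = \bar K_t x_t$
    end
    Play: Implement the planned control action $u_t$
    Update $x_{t+1} = A_t x_t + B_t u_t + w_t$, $w_t \sim \mathcal{N}(0, W)$
end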

Lemma 6 When $B_t$ is full row rank, the SDP (3) is always feasible.
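One way to see why full row rank suffices is to construct a feasible point explicitly. The sketch below is an illustration under the assumption that (3b) reads $\Sigma_{xx} = [A\ B]\,\Sigma\,[A\ B]^\top + W$, (3c) reads $\Sigma \succeq 0$, and (3d) reads $[A\ B]\,\Sigma\,[A\ B]^\top \preceq \alpha \Sigma_{xx}$; the authors' own proof is in their Appendix B and may proceed differently. The function names and the random test instance are hypothetical.

```python
import numpy as np

def deadbeat_feasible_point(A, B, W):
    """Build a candidate Sigma for the constraints when B has full row rank."""
    K = -np.linalg.pinv(B) @ A                  # B @ pinv(B) = I, hence A + B K = 0
    lift = np.vstack([np.eye(A.shape[0]), K])   # stacked map [I; K]
    Sigma = lift @ W @ lift.T                   # PSD by construction, with Sigma_xx = W
    return Sigma, K

def check_constraints(A, B, W, Sigma, alpha, tol=1e-8):
    """Return booleans for the assumed forms of (3b), (3c) and (3d)."""
    n = A.shape[0]
    AB = np.hstack([A, B])
    S_xx = Sigma[:n, :n]
    prop = AB @ Sigma @ AB.T                    # propagated covariance [A B] Sigma [A B]'
    prop = (prop + prop.T) / 2                  # symmetrize against rounding
    eq_3b = bool(np.allclose(S_xx, prop + W, atol=tol))
    psd_3c = bool(np.min(np.linalg.eigvalsh((Sigma + Sigma.T) / 2)) >= -tol)
    lmi_3d = bool(np.min(np.linalg.eigvalsh(alpha * S_xx - prop)) >= -tol)
    return eq_3b, psd_3c, lmi_3d

# Hypothetical instance: 3 states, 4 inputs, so a random B has full row rank.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
B = rng.normal(size=(3, 4))
W = np.eye(3)
Sigma, K = deadbeat_feasible_point(A, B, W)
results_alpha_03 = check_constraints(A, B, W, Sigma, alpha=0.3)
results_alpha_0 = check_constraints(A, B, W, Sigma, alpha=0.0)
```

The constructed point corresponds to the controller that cancels the dynamics, so it is feasible for every $\alpha \ge 0$ but generally far from cost-optimal; the SDP then searches for a cheaper point inside the same feasible set.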

Note that having $B_t$ full row rank is a sufficient but not necessary condition for feasibility of (3) of COCO-LQ. When $B_t$ is not full row rank, the feasibility assumption may still hold, and therefore our assumption is weaker than the invertibility assumption used in the literature, e.g., Lai (1986). More broadly, in Theorem 3, the upper bound on $\alpha$ is a sufficient condition for stability. For larger $\alpha$, stability may still hold for some problem instances $(A_t, B_t)$, as will be shown in the simulations in Section 5. How to provide a more refined instance-dependent threshold on $\alpha$ is an interesting future direction.

4.3. Infeasibility and the Role of Predictions

We now turn our attention to the case when the SDP given in (3) is infeasible. In this case it is necessary for the controller to use additional information in order to stabilize the system. In particular, we provide an example that shows the necessity of predictions when $B_t$ is not full row rank in Appendix C of our online report Qu et al. (2021). This example shows that when $B_t$ is not full row rank, for any (deterministic) online control algorithm that has causal access to system matrices, there exists a future sequence of $(A_t, B_t)$ in which the algorithm cannot stabilize the system. In this section, we show that using $(A_t, B_t)$ together with short-term predictions of future system matrices is enough to stabilize the system under standard controllability assumptions. Specifically, we extend COCO-LQ to include future $H$ steps of predictions in Algorithm 2. The key idea is to rewrite the dynamics as

$$x_{t+H} = A_{t+H-1} \cdots A_t\, x_t + \Xi_t \bar u_t + [I,\ A_{t+H-1},\ \dots,\ A_{t+H-1} \cdots A_{t+1}]\, \bar w_t, \qquad (4)$$

where we define $\Xi_t := [B_{t+H-1},\ A_{t+H-1} B_{t+H-2},\ \dots,\ A_{t+H-1} \cdots A_{t+1} B_t]$, $\bar u_t := [u_{t+H-1}^\top, \dots, u_t^\top]^\top$ and $\bar w_t := [w_{t+H-1}^\top, \dots, w_t^\top]^\top$. When $H$ is long enough such that $\Xi_t$ is full row rank, we can use Algorithm 1 on $\bar A_t := A_{t+H-1} \cdots A_t$ and $\Xi_t$ and avoid the infeasibility issue, and our stability guarantee is provided below. The proof of Theorem 7 can be found in Appendix D of our online report Qu et al. (2021).
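The $H$-step rewriting of the dynamics can be checked numerically. The sketch below builds the lifted matrices from a window of system matrices and confirms that they reproduce a step-by-step rollout of $x_{t+1} = A_t x_t + B_t u_t + w_t$. The function name `lifted_matrices`, the symbol names in the comments, and the random instance are assumptions made for illustration.

```python
import numpy as np

def lifted_matrices(A_list, B_list):
    """Given [A_t, ..., A_{t+H-1}] and [B_t, ..., B_{t+H-1}], return (A_bar, Xi, Phi_w).

    A_bar = A_{t+H-1} ... A_t
    Xi    = [B_{t+H-1}, A_{t+H-1} B_{t+H-2}, ..., A_{t+H-1} ... A_{t+1} B_t]
    Phi_w = [I, A_{t+H-1}, ..., A_{t+H-1} ... A_{t+1}]
    """
    n = A_list[0].shape[0]
    prefix = np.eye(n)                      # running product A_{t+H-1} ... A_{k+1}
    xi_blocks, w_blocks = [], []
    for A, B in zip(reversed(A_list), reversed(B_list)):
        xi_blocks.append(prefix @ B)        # how u_k reaches x_{t+H}
        w_blocks.append(prefix.copy())      # how w_k reaches x_{t+H}
        prefix = prefix @ A
    return prefix, np.hstack(xi_blocks), np.hstack(w_blocks)

# Hypothetical instance with a single input, so each B_t alone is not full row rank.
rng = np.random.default_rng(1)
n, m, H = 3, 1, 4
A_list = [rng.normal(size=(n, n)) for _ in range(H)]
B_list = [rng.normal(size=(n, m)) for _ in range(H)]
u_list = [rng.normal(size=m) for _ in range(H)]
w_list = [rng.normal(size=n) for _ in range(H)]
x0 = rng.normal(size=n)

x = x0.copy()
for A, B, u, w in zip(A_list, B_list, u_list, w_list):
    x = A @ x + B @ u + w                   # step-by-step rollout

A_bar, Xi, Phi_w = lifted_matrices(A_list, B_list)
u_bar = np.concatenate(u_list[::-1])        # stacked as [u_{t+H-1}; ...; u_t]
w_bar = np.concatenate(w_list[::-1])        # stacked as [w_{t+H-1}; ...; w_t]
x_lifted = A_bar @ x0 + Xi @ u_bar + Phi_w @ w_bar
xi_full_row_rank = bool(np.linalg.matrix_rank(Xi) == n)
```

In this instance a single input cannot make $B_t$ full row rank for a three-dimensional state, while the lifted input matrix over four steps is full row rank, which is the situation the prediction-based variant is designed for.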

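Theorem 7 Suppose for each $t$, the matrix $\Xi_t = [B_{t+H-1},\ A_{t+H-1} B_{t+H-2},\ \dots,\ A_{t+H-1} \cdots A_{t+1} B_t]$ satisfies $\Xi_t \Xi_t^\top \succeq \sigma I$ for some $\sigma > 0$, and $\|A_t\| \le a$, $\|B_t\| \le b$ for some $a, b > 0$. Then, the SDP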

in Algorithm 2 is always feasible. Further, when

α <

, the closed-loop system is ISS for any

‖

‖≤

′

−

‖

′

max(1

−

) sup

≤

s<t

‖

where the same as Theorem 3,

√

−

∈

and

√

−

with

‖

‖‖

−

‖

being

the condition number of

; further,

= 1 +

...

−

, and

′

−

(1 +

···

−

)

with

being the condition number of

4.4. Estimation Error

In both Algorithm 1 and Algorithm 2, exact knowledge of the system matrices $(A_t, B_t)$ or of the extended system matrices is needed when deriving the control actions. In this section, we show that COCO-LQ can still obtain a stabilizing controller in the case where only approximations are known, if the estimation error is controlled. Our main result is the following.

Theorem 8

Let

(

)

be an estimate of (

). Given

∈

)

, let

√

−

‖

‖‖

−

1

‖

√

−

and

= 1

−

√

. Let

,...

be the policies designed by COCO-LQ for

(

)

with parameter

. When the estimation error satisfies,

max

{||

−

}≤

(1 +

max

)

(5)

where

can be any number in

√

−

√

−

√

)

, and

max

is any uniform upper bound on

‖

Then, the policies

are ISS when applied to the system

(

)

‖

‖≤

(

′

)

−

0

‖

0

‖

κρ

′

−

′

sup

0

≤

k<t

‖

where

′

−

)

−

∈

. Finally, when

‖

‖ ≤

‖

‖ ≤

and

, one uniform

upper bound for

‖

max

2

(

−

) + ̄

)

with

‖

‖‖

−

‖

A proof of Theorem 8 is provided in Appendix E in our online report Qu et al. (2021). This result highlights the tradeoff between the estimation error and the algorithm performance. If we choose a small $\alpha$, the algorithm can tolerate a larger estimation error (i.e., a larger right-hand side of (5) can be obtained) but may incur high control cost due to the tight state covariance constraint. If we choose a larger $\alpha$, the algorithm tolerates a smaller estimation error while its performance improves due to the less strict state covariance constraint.

5. Experiments

The results in the previous section focus on the stability of the COCO-LQ approach. Here, we use experimental results to highlight that COCO-LQ also performs near-optimally in terms of cost while also stabilizing systems that the naive approach based on LTI control cannot. In Section 5.1, we test our method on random, synthetic linear time-varying systems, and in Section 5.2 we demonstrate the algorithm performance in real-world power system frequency control settings. Due to space limits, more experimental results on nonlinear systems via local linear approximation can be found in Appendix F of our online report Qu et al. (2021).

5.1. Synthetic Time-Varying Systems

We first consider the control of switching and time-variant systems. The cost function is set as

= 0

I,R

, and system is subject to Gaussian disturbance

∼

)

. We average the

simulation results over 5 runs and visualize the mean performance and standard deviation.

Switching systems.

We consider a switching system following Example 1 in Section 3, where

alternates between

= [[0

99]]

and

′

= [[0

99]]

, and

Time-variant systems.

We consider a system

= [[0

sin(

πt

)

]

[

cos(

πt

)

99]]

that is continually changing over time, and

Figure 1: Performance comparison of COCO-LQ and LQ on synthetic time-varying systems. The left two figures show the state evolution, and the right two figures show the normalized cost (cost of COCO-LQ divided by cost of the offline optimum) under different $\alpha$.

As we can see in Figure 1, COCO-LQ is able to quickly and effectively stabilize the system under various time-varying scenarios, which validates our theoretical findings. As $\alpha$ increases, the incurred cost of COCO-LQ first decreases and then increases (explosion of state), highlighting that $\alpha$ can explicitly control the tradeoff between cost and stability. With proper selection of $\alpha$, COCO-LQ achieves near-optimal cost (within 30% of the offline optimal for both systems (a) and (b)).

5.2. Frequency Control with Renewable Generation

We now consider a power system frequency control problem on the standard IEEE WECC 3-machine 9-bus system (Figure 2(a)), which is a widely adopted system used in frequency stability studies. The state space model of power system frequency dynamics follows Hidalgo-Gonzalez et al. (2019),

$$\underbrace{\begin{bmatrix} \dot\theta \\ \dot\omega \end{bmatrix}}_{\dot x} = \underbrace{\begin{bmatrix} 0 & I \\ -M_t^{-1} L & -M_t^{-1} D \end{bmatrix}}_{A_t} \begin{bmatrix} \theta \\ \omega \end{bmatrix} + \underbrace{\begin{bmatrix} 0 \\ M_t^{-1} \end{bmatrix}}_{B_t} \underbrace{p_{in}}_{u_t}, \qquad (6)$$

where the state variable is defined as the stacked vector of the voltage angle $\theta$ and frequency $\omega$. $M_t = \mathrm{diag}(m_{t,i})$ is the inertia matrix, where $m_{t,i}$ represents the equivalent rotational inertia at bus $i$ and time $t$. $M_t$ is time-varying and depends on the mix of online generators, since only thermal generators provide rotational inertia and renewable generation does not (Ulbig et al., 2014). $D = \mathrm{diag}(d_i)$ is the damping matrix, where $d_i$ is the generator damping coefficient. $L$ is the network susceptance matrix. The control variable $p_{in}$ corresponds to the electric power generation.

Figure 2: (a) IEEE WECC 3-machine 9-bus system schematic, where the generators at buses 1, 5, 9 are a mixture of thermal generation and renewables. (b) Frequency dynamics under offline optima, baseline H-horizon control, and COCO-LQ. The dotted grey lines ( Hz) are the safety margin of power system frequency variation.

We assume the system is changing between two states: a high renewable generation scenario where $m_{t,i} = 2$ (i.e., 80 percent renewable with zero inertia and 20 percent thermal generation with 10s inertia), and a low renewable generation scenario where $m_{t,i} = 8$ (i.e., 20 percent renewable and 80 percent thermal generation), with additional random fluctuations between

. This

setup represents the real-world situation where we have high solar output during the daytime, and

low output in the morning/evening, with intra-day variations due to clouds and weather changes.

Notice that $B_t$ is not full rank, thus we need to leverage predictions. For fair comparison, we compare against the $H$-horizon optimal control in Bertsekas et al. (1995), which is the extension of the naive LTI controller to use $H$-step predictions. In both cases, we assume the predictions are accurate and use the exact values of the predicted system matrices for computing control actions.

Figure 2(b) visualizes the power system frequency dynamics under three controllers: the offline

optimal control, the baseline $H$-horizon optimal controller and the proposed COCO-LQ-Prediction

method. We ideally desire a controller that is able to maintain the frequency variation within

Hz and eventually stabilize the system. It can be observed that our algorithm succeeds at

maintaining the frequency stability under random, time-varying renewable generations. Further-

more, the performance of COCO-LQ is very close to the offline optimal, while the system frequency

diverges under the baseline $H$-horizon optimal control.

6. Conclusion

In this paper, we study the stability of LTV systems. Our results demonstrate the challenge of

ensuring stability for LTV systems compared to LTI systems. Motivated by this challenge, we

propose a COCO-LQ/COCO-LQ-Prediction policy that can guarantee stability for LTV systems

under certain assumptions. There are many interesting open questions that remain. For example,

the upper bound on $\alpha$ in Theorem 3 is a sufficient condition, and studying how to relax the bound

and how to derive instance-dependent bounds is an interesting future question. Another important

direction is to analyze the performance (e.g. the regret) of the proposed approach in order to quantify

the tradeoff between stability and performance.

References

Yasin Abbasi-Yadkori and Csaba Szepesvári. Regret bounds for the adaptive control of linear quadratic systems. In Proceedings of the 24th Annual Conference on Learning Theory, pages 1–26, 2011.

Francesco Amato, Marco Ariola, and Carlo Cosentino. Finite-time control of discrete-time linear

systems: analysis and design conditions.

Automatica

, 46(5):919–924, 2010.

Dimitri P Bertsekas. Dynamic programming and optimal control, volume 1. Athena Scientific, Belmont, MA, 1995.

Alon Cohen, Avinatan Hassidim, Tomer Koren, Nevena Lazic, Yishay Mansour, and Kunal Talwar.

Online linear quadratic control.

arXiv preprint arXiv:1806.07104

, 2018.

Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, and Stephen Tu. Regret bounds for

robust adaptive control of the linear quadratic regulator. In

Advances in Neural Information

Processing Systems

, pages 4188–4197, 2018.

Paolo Falcone, Manuela Tufo, Francesco Borrelli, Jahan Asgari, and H Eric Tseng. A linear time

varying model predictive control approach to the integrated vehicle dynamics control problem in

autonomous systems. In

2007 46th IEEE Conference on Decision and Control

, pages 2980–2985.

IEEE, 2007.

Paolo Falcone, Francesco Borrelli, H Eric Tseng, Jahan Asgari, and Davor Hrovat. Linear time-

varying model predictive control and its application to active steering systems: Stability analysis

and experimental validation.

International Journal of Robust and Nonlinear Control: IFAC-

Affiliated Journal

, 18(8):862–875, 2008.

Germain Garcia, Sophie Tarbouriech, and Jacques Bernussou. Finite-time stabilization of linear

time-varying continuous systems.

IEEE Transactions on Automatic Control

, 54(2):364–369,

2009.

Gautam Goel and Babak Hassibi. Regret-optimal control in dynamic environments.

arXiv preprint

arXiv:2010.10473

, 2020.

Paula Gradu, Elad Hazan, and Edgar Minasyan. Adaptive regret for control of time-varying dynam-

ics.

arXiv preprint arXiv:2007.04393

, 2020.

Patricia Hidalgo-Gonzalez, Rodrigo Henriquez-Auba, Duncan S Callaway, and Claire J Tomlin.

Frequency regulation using data-driven controllers in power grids with variable inertia due to

renewable energy. In

2019 IEEE Power & Energy Society General Meeting (PESGM)

, pages

1–5. IEEE, 2019.

Yiguang Hong, Zhong-Ping Jiang, and Gang Feng. Finite-time input-to-state stability and applica-

tions to finite-time control design.

SIAM Journal on Control and Optimization

, 48(7):4395–4418,

2010.

Zhong-Ping Jiang and Yuan Wang. Input-to-state stability for discrete-time nonlinear systems.

Au-

tomatica

, 37(6):857–869, 2001.

Hassan K Khalil.

Nonlinear systems

, volume 3. 2002.

T. L. Lai. Asymptotically efficient adaptive control in stochastic regression models. Advances in Applied Mathematics, 7(1):23–45, 1986.

Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, and Anima Anandkumar.

Adaptive

control and regret minimization in linear quadratic gaussian (lqg) setting.

arXiv preprint

arXiv:2003.05999

, 2020a.

Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, and Anima Anandkumar. Explore more and

improve regret in linear quadratic regulators.

arXiv preprint arXiv:2007.12291

, 2020b.

Xiaodi Li, Xueyan Yang, and Shiji Song. Lyapunov conditions for finite-time stability of time-

varying time-delay systems.

Automatica

, 103:135–140, 2019a.

Yingying Li, Aoxiao Zhong, Guannan Qu, and Na Li. Online markov decision processes with time-

varying transition probabilities and rewards. In

ICML Real-world Sequential Decision Making

workshop

, 2019b.

Riccardo Marino and Patrizio Tomei. Robust adaptive regulation of linear time-varying systems.

IEEE Transactions on Automatic Control

, 45(7):1301–1311, 2000.

Richard H Middleton and Graham C Goodwin. Adaptive control of time-varying linear systems.

IEEE Transactions on Automatic Control

, 33(2):150–155, 1988.

Yi Ouyang, Mukul Gagrani, and Rahul Jain. Learning-based control of unknown linear systems

with thompson sampling.

arXiv preprint arXiv:1709.04047

, 2017.

Guannan Qu, Yuanyuan Shi, Sahin Lale, Anima Anandkumar, and Adam Wierman. Stable online

control of linear time-varying systems.

arXiv preprint arXiv:2104.14134

, 2021.

Eduardo D Sontag. Input to state stability: Basic concepts and results. In

Nonlinear and optimal

control theory

, pages 163–220. Springer, 2008.

Sophie Tarbouriech, Germain Garcia, and Adolf H Glattfelder.

Advanced Strategies in Control

Systems with Input and Output Constraints

, volume 346. Springer Science & Business Media,

2006.

Emanuel Todorov and Weiwei Li. A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems. In Proceedings of the 2005 American Control Conference, pages 300–306. IEEE, 2005.

Andreas Ulbig, Theodor S Borsche, and Göran Andersson. Impact of low rotational inertia on power system stability and operation. IFAC Proceedings Volumes, 47(3):7290–7297, 2014.

Lieven Vandenberghe and Stephen Boyd. Semidefinite programming.

SIAM review

, 38(1):49–95,

1996.

Alex Zheng and Manfred Morari. Robust control of linear time-varying systems with constraints. In

Proceedings of 1994 American Control Conference-ACC’94

, volume 3, pages 2416–2420. IEEE,

1994.