Published May 18, 2021 | Accepted Version
Book Section - Chapter, Open Access

Constrained Risk-Averse Markov Decision Processes

Abstract

We consider the problem of designing policies for Markov decision processes (MDPs) with dynamic coherent risk objectives and constraints. We begin by formulating the problem in a Lagrangian framework. Under the assumption that the risk objectives and constraints can be represented by a Markov risk transition mapping, we propose an optimization-based method to synthesize Markovian policies that lower-bound the constrained risk-averse problem. We demonstrate that the formulated optimization problems are in the form of difference convex programs (DCPs) and can be solved by the disciplined convex-concave programming (DCCP) framework. We show that these results generalize linear programs for constrained MDPs with total discounted expected costs and constraints. Finally, we illustrate the effectiveness of the proposed method with numerical experiments on a rover navigation problem involving conditional-value-at-risk (CVaR) and entropic-value-at-risk (EVaR) coherent risk measures.
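The abstract refers to the conditional-value-at-risk (CVaR) coherent risk measure. As a minimal illustration only (not the paper's synthesis method), the empirical CVaR of a sample of losses at level α is the mean of the worst (1 − α) fraction of the sample; the helper name `cvar` below is an assumption for this sketch:

```python
import math

def cvar(losses, alpha=0.9):
    """Empirical conditional value-at-risk (CVaR) of a sample of losses.

    CVaR_alpha is the expected loss in the worst (1 - alpha) tail, so the
    sample estimate averages the largest ceil((1 - alpha) * n) losses.
    (`cvar` is an illustrative helper, not code from the paper.)
    """
    srt = sorted(losses)
    k = max(1, math.ceil((1 - alpha) * len(srt)))  # size of the tail sample
    return sum(srt[-k:]) / k

# Losses 1..10 at alpha = 0.8: average of the two worst losses.
print(cvar(list(range(1, 11)), alpha=0.8))  # 9.5
```

For the same level α, the entropic-value-at-risk (EVaR) also mentioned in the abstract upper-bounds CVaR, so an EVaR constraint is the more conservative of the two.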

Additional Information

© 2021 Association for the Advancement of Artificial Intelligence. Published 2021-05-18.

Attached Files

Accepted Version - 2012.02423.pdf (1.5 MB, md5:8c89ec951d198d4de9e60549ea9a0da0)

Additional details

Identifiers

Eprint ID: 107610
Resolver ID: CaltechAUTHORS:20210120-165231602

Dates

Created: 2021-01-21 (from EPrint's datestamp field)
Updated: 2023-06-02 (from EPrint's last_modified field)

Caltech Custom Metadata

Caltech groups
Division of Biology and Biological Engineering (BBE)