Gradient-based inverse risk-sensitive reinforcement learning

Mazumdar, Eric and Ratliff, Lillian J. and Fiez, Tanner and Sastry, S. Shankar (2017) Gradient-based inverse risk-sensitive reinforcement learning. In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC). IEEE , Piscataway, NJ, pp. 5796-5801. ISBN 978-1-5090-2873-3.

Full text is not posted in this repository. Consult Related URLs below.

We address the problem of inverse reinforcement learning in Markov decision processes where the agent is risk-sensitive. In particular, we model risk-sensitivity in a reinforcement learning framework by making use of models of human decision-making having their origins in behavioral psychology and economics. We propose a gradient-based inverse reinforcement learning algorithm that minimizes a loss function defined on the observed behavior. We demonstrate the performance of the proposed technique on two examples, the first of which is the canonical Grid World example and the second of which is an MDP modeling passengers' decisions regarding ride-sharing. In the latter, we use pricing and travel time data from a ride-sharing company to construct the transition probabilities and rewards of the MDP.

Item Type: Book Section
Related URLs:
ORCID:
Mazumdar, Eric: 0000-0002-1815-269X
Ratliff, Lillian J.: 0000-0001-8936-0229
Additional Information: © 2017 IEEE. This work is supported by NSF CRII Award CNS-1656873, NSF US-Ignite Award CNS-1646912, NSF FORCES (Foundations Of Resilient CybEr-physical Systems) Award CNS-1238959, CNS-1238962, CNS-1239054, CNS-1239166.
Record Number: CaltechAUTHORS:20210903-222215940
Persistent URL:
Official Citation: E. Mazumdar, L. J. Ratliff, T. Fiez and S. S. Sastry, "Gradient-based inverse risk-sensitive reinforcement learning," 2017 IEEE 56th Annual Conference on Decision and Control (CDC), 2017, pp. 5796-5801, doi: 10.1109/CDC.2017.8264535
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 110738
Deposited By: George Porter
Deposited On: 07 Sep 2021 16:33
Last Modified: 07 Sep 2021 19:31
