CaltechAUTHORS
  A Caltech Library Service

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

Cheng, Richard and Orosz, Gábor and Murray, Richard M. and Burdick, Joel W. (2019) End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks. In: Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 27 January-1 February 2019, Honolulu, HI. https://resolver.caltech.edu/CaltechAUTHORS:20190410-120654801

PDF - Published Version (1MB). See Usage Policy.
PDF - Submitted Version (1MB). See Usage Policy.


Abstract

Reinforcement Learning (RL) algorithms have found limited success beyond simulated applications, and one main reason is the absence of safety guarantees during the learning process. Real-world systems would realistically fail or break before an optimal controller can be learned. To address this issue, we propose a controller architecture that combines (1) a model-free RL-based controller with (2) model-based controllers utilizing control barrier functions (CBFs) and (3) on-line learning of the unknown system dynamics, in order to ensure safety during learning. Our general framework leverages the success of RL algorithms to learn high-performance controllers, while the CBF-based controllers both guarantee safety and guide the learning process by constraining the set of explorable policies. We utilize Gaussian Processes (GPs) to model the system dynamics and its uncertainties. Our novel controller synthesis algorithm, RL-CBF, guarantees safety with high probability during the learning process, regardless of the RL algorithm used, and demonstrates greater policy exploration efficiency. We test our algorithm on (1) control of an inverted pendulum and (2) autonomous car-following with wireless vehicle-to-vehicle communication, and show that our algorithm attains much greater sample efficiency in learning than other state-of-the-art algorithms and maintains safety during the entire learning process.
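
As a concrete illustration of the safety mechanism described above, in which the CBF-based controller minimally adjusts the RL action so that the state never leaves the safe set, the following Python sketch filters an exploratory policy through a barrier-function constraint. It is not the authors' RL-CBF implementation: the scalar single-integrator dynamics, the barrier h(x) = 1 - x^2, the gain GAMMA, and the fixed margin SIGMA_BOUND (a stand-in for the GP-derived uncertainty bound) are illustrative assumptions, and for a single input with a single constraint the minimal-modification quadratic program reduces to the closed-form clamp used here.

import numpy as np

GAMMA = 1.0          # CBF class-K gain (assumed value)
SIGMA_BOUND = 0.05   # stand-in for a GP-based uncertainty margin (assumed value)

def h(x):
    # Barrier function: the safe set {x : h(x) >= 0} is the interval [-1, 1].
    return 1.0 - x**2

def cbf_filter(x, u_rl):
    # Minimally modify the RL action so that
    #   Lg_h(x) * u + GAMMA * h(x) - SIGMA_BOUND >= 0
    # holds for the single integrator x_dot = u (so Lf_h = 0). With one input
    # and one constraint, the least-squares projection is a simple clamp.
    lg_h = -2.0 * x                       # dh/dx times g(x) = 1
    margin = GAMMA * h(x) - SIGMA_BOUND   # robustified class-K term
    if lg_h > 1e-8:                       # constraint is a lower bound on u
        return max(u_rl, -margin / lg_h)
    if lg_h < -1e-8:                      # constraint is an upper bound on u
        return min(u_rl, -margin / lg_h)
    return u_rl                           # near x = 0 the condition holds for any u

# Toy rollout: an aggressive exploratory "policy" is kept inside [-1, 1].
x, dt = 0.0, 0.05
rng = np.random.default_rng(0)
for _ in range(200):
    u_rl = rng.normal(loc=1.5, scale=0.5)   # stand-in for an RL policy output
    u_safe = cbf_filter(x, u_rl)
    x = x + dt * u_safe
    assert h(x) >= 0.0, "safety violated"
print("final state:", x)

In this toy rollout the unfiltered actions would drive the state past x = 1, but the filtered trajectory settles just inside the boundary, mirroring the abstract's point that safety is maintained throughout learning regardless of the RL algorithm producing the nominal action.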


Item Type: Conference or Workshop Item (Paper)
Related URLs:
  https://guidebook.com/guide/150194/poi/11379090/ (Publisher) - Article
  https://arxiv.org/abs/1903.08792 (arXiv) - Article
ORCID:
  Orosz, Gábor: 0000-0002-9000-3736
  Murray, Richard M.: 0000-0002-5785-7481
Additional Information: © 2019, Association for the Advancement of Artificial Intelligence (www.aaai.org). The authors would like to thank Hoang Le and Yisong Yue for helpful discussions.
DOI: 10.48550/arXiv.1903.08792
Record Number: CaltechAUTHORS:20190410-120654801
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20190410-120654801
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 94639
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 10 Apr 2019 19:56
Last Modified: 02 Jun 2023 00:56
