CaltechAUTHORS
  A Caltech Library Service

Control Regularization for Reduced Variance Reinforcement Learning

Cheng, Richard and Verma, Abhinav and Orosz, Gábor and Chaudhuri, Swarat and Yue, Yisong and Burdick, Joel W. (2019) Control Regularization for Reduced Variance Reinforcement Learning. Proceedings of Machine Learning Research, 97. pp. 1141-1150. ISSN 1938-7228. https://resolver.caltech.edu/CaltechAUTHORS:20190905-154302241

PDF (Published Version) - 1183 kB - See Usage Policy.
PDF (Submitted Version) - 1822 kB - See Usage Policy.
PDF (Supplemental Material) - 810 kB - See Usage Policy.

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20190905-154302241

Abstract

Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a policy prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the policy prior has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone.
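To make the abstract's mechanism concrete, the following is a minimal, hypothetical Python sketch of functional regularization toward a control prior: the executed action interpolates between the deep policy's action and the prior's action under a regularization weight lam. The PD prior, the function names, and the fixed lam values are illustrative assumptions, not code from the paper or the linked CORE-RL repository.

```python
import numpy as np

# Sketch of the mixed policy described in the abstract: the executed action
# is a lam-weighted combination of the deep RL policy's action and a control
# prior's action. lam = 0 recovers pure model-free RL; large lam keeps the
# behavior close to the prior (lower variance across seeds, more bias).
# The PD prior and the numbers below are illustrative assumptions only.

def regularized_action(u_rl, u_prior, lam):
    """Convex combination of the RL action and the control-prior action."""
    return (u_rl + lam * u_prior) / (1.0 + lam)

def pd_prior(s, kp=1.0, kd=0.5):
    """A simple stabilizing PD controller on a (position, velocity) state."""
    pos, vel = s
    return -kp * pos - kd * vel

rng = np.random.default_rng(0)
s = np.array([0.4, -0.1])
u_rl = rng.normal()  # stand-in for the deep policy's (noisy) output
for lam in (0.0, 1.0, 10.0):
    print(f"lam={lam:>4}: u = {regularized_action(u_rl, pd_prior(s), lam):+.3f}")
```

The paper further adapts the regularization weight during training to optimize the bias-variance trade-off; the fixed values above only illustrate the interpolation between the learned policy and the prior.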


Item Type: Article

Related URLs:
URL | URL Type | Description
http://proceedings.mlr.press/v97/cheng19a.html | Publisher | Article
http://arxiv.org/abs/1905.05380 | arXiv | Discussion Paper
http://proceedings.mlr.press/v97/cheng19a/cheng19a-supp.pdf | Publisher | Supporting Information
https://github.com/rcheng805/CORE-RL | Related Item | Code
ORCID:
Author | ORCID
Cheng, Richard | 0000-0001-8301-9169
Orosz, Gábor | 0000-0002-9000-3736
Yue, Yisong | 0000-0001-9127-1989
Additional Information: Copyright 2019 by the author(s). This work was funded in part by Raytheon under the Learning to Fly program, and by DARPA under the Physics-Infused AI Program.

Funders:
Funding Agency | Grant Number
Raytheon Company | UNSPECIFIED
Defense Advanced Research Projects Agency (DARPA) | UNSPECIFIED
Record Number: CaltechAUTHORS:20190905-154302241
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20190905-154302241
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 98457
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 06 Sep 2019 14:41
Last Modified: 03 Oct 2019 21:41
