Le, Hoang M. and Voloshin, Cameron and Yue, Yisong (2019) Batch Policy Learning under Constraints. Proceedings of Machine Learning Research, 97. pp. 3703-3712. ISSN 1938-7228. https://resolver.caltech.edu/CaltechAUTHORS:20190327-085845627
PDF - Published Version. See Usage Policy. 1MB
PDF - Submitted Version. See Usage Policy. 2MB
PDF - Supplemental Material. See Usage Policy. 1MB
Abstract
When learning policies for real-world domains, two important questions arise: (i) how to efficiently use pre-collected off-policy, non-optimal behavior data; and (ii) how to mediate among different competing objectives and constraints. We thus study the problem of batch policy learning under multiple constraints, and offer a systematic solution. We first propose a flexible meta-algorithm that admits any batch reinforcement learning and online learning procedure as subroutines. We then present a specific algorithmic instantiation and provide performance guarantees for the main objective and all constraints. To certify constraint satisfaction, we propose a new and simple method for off-policy policy evaluation (OPE) and derive PAC-style bounds. Our algorithm achieves strong empirical results in different domains, including in a challenging problem of simulated car driving subject to multiple constraints such as lane keeping and smooth driving. We also show experimentally that our OPE method outperforms other popular OPE techniques on a standalone basis, especially in a high-dimensional setting.
Item Type: | Article |
---|---|
Additional Information: | © 2019 by the author(s). Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019. |
Record Number: | CaltechAUTHORS:20190327-085845627 |
Persistent URL: | https://resolver.caltech.edu/CaltechAUTHORS:20190327-085845627 |
Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
ID Code: | 94191 |
Collection: | CaltechAUTHORS |
Deposited By: | George Porter |
Deposited On: | 27 Mar 2019 22:19 |
Last Modified: | 14 Feb 2020 23:10 |