Voloshin, Cameron and Le, Hoang M. and Jiang, Nan and Yue, Yisong (2019) Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning. (Unpublished) https://resolver.caltech.edu/CaltechAUTHORS:20200109-100747650
PDF (Submitted Version), 2MB. See Usage Policy.
Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20200109-100747650
Abstract
Off-policy policy evaluation (OPE) is the problem of estimating the online performance of a policy using only pre-collected historical data generated by another policy. Given the increasing interest in deploying learning-based methods for safety-critical applications, many OPE methods have recently been proposed. Because experimental conditions differ across the recent literature, the relative performance of current OPE methods is not well understood. In this work, we present the first comprehensive empirical analysis of a broad suite of OPE methods. Based on thousands of experiments and detailed empirical analyses, we offer a summarized set of guidelines for effectively using OPE in practice, and suggest directions for future research.
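The abstract does not single out a particular estimator. As a minimal sketch of what an OPE method computes, the example below assumes ordinary (trajectory-wise) importance sampling, a standard baseline that is not necessarily one the paper recommends. All names (`Step`, `Policy`, `importance_sampling_estimate`, the toy policies and the logged data) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence, Tuple

# One logged step: (state, action, reward). Illustrative types, not from the paper.
Step = Tuple[int, int, float]
# A policy is modelled as a function returning pi(action | state).
Policy = Callable[[int, int], float]


def importance_sampling_estimate(
    trajectories: Sequence[Sequence[Step]],
    behavior: Policy,
    target: Policy,
    gamma: float = 1.0,
) -> float:
    """Ordinary importance sampling estimate of the target policy's value.

    Each logged return is reweighted by the likelihood ratio of the
    trajectory under the target policy versus the behavior policy.
    """
    total = 0.0
    for trajectory in trajectories:
        weight = 1.0    # cumulative likelihood ratio along the trajectory
        ret = 0.0       # discounted return of the logged trajectory
        discount = 1.0
        for state, action, reward in trajectory:
            weight *= target(state, action) / behavior(state, action)
            ret += discount * reward
            discount *= gamma
        total += weight * ret
    return total / len(trajectories)


# Toy data (assumed): the behavior policy picks each of two actions with
# probability 0.5; the target policy always picks action 1, which pays 1.0.
def behavior_policy(state: int, action: int) -> float:
    return 0.5


def target_policy(state: int, action: int) -> float:
    return 1.0 if action == 1 else 0.0


logged = [
    [(0, 0, 0.0)],
    [(0, 1, 1.0)],
    [(0, 1, 1.0)],
    [(0, 0, 0.0)],
]

estimate = importance_sampling_estimate(logged, behavior_policy, target_policy)
print(estimate)
```

On this toy log, trajectories that took action 0 get weight 0 and those that took action 1 get weight 2, so the estimate matches the target policy's true value of 1.0 without ever running that policy. The estimator requires the behavior policy to assign nonzero probability to every logged action.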
Item Type: | Report or Paper (Discussion Paper)
---|---
Record Number: | CaltechAUTHORS:20200109-100747650
Persistent URL: | https://resolver.caltech.edu/CaltechAUTHORS:20200109-100747650
Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: | 100590
Collection: | CaltechAUTHORS
Deposited By: | Tony Diaz
Deposited On: | 09 Jan 2020 18:15
Last Modified: | 09 Jan 2020 18:15