
Iterative Amortized Policy Optimization

Marino, Joseph and Piché, Alexandre and Ialongo, Alessandro Davide and Yue, Yisong (2020) Iterative Amortized Policy Optimization. In: 34th Conference on Neural Information Processing Systems (NeurIPS 2020). Neural Information Processing Foundation, La Jolla, CA, pp. 1-15. ISBN 9781713829546. https://resolver.caltech.edu/CaltechAUTHORS:20221222-185354379

Full text is not posted in this repository. Consult Related URLs below.


Abstract

Policy networks are a central feature of deep reinforcement learning (RL) algorithms for continuous control, enabling the estimation and sampling of high-value actions. From the variational inference perspective on RL, policy networks, when used with entropy or KL regularization, are a form of amortized optimization, optimizing network parameters rather than the policy distributions directly. However, direct amortized mappings can yield suboptimal policy estimates and restricted distributions, limiting performance and exploration. Given this perspective, we consider the more flexible class of iterative amortized optimizers. We demonstrate that the resulting technique, iterative amortized policy optimization, yields performance improvements over direct amortization on benchmark continuous control tasks.
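The contrast the abstract draws between direct and iterative amortization can be illustrated with a toy sketch. This is not the authors' implementation: the Q-function, the linear "policy network", the fixed update gain, and all function names below are hypothetical stand-ins chosen so the example runs without training. A direct amortized policy maps the state to a policy mean in one shot, while an iterative amortized optimizer repeatedly refines that estimate using the gradient of the objective. In practice the update rule would be a learned network; here it is assumed to be a fixed gain on the gradient.

```python
import math


def target_action(state):
    """Hypothetical best action for a state in this toy problem."""
    return math.sin(state)


def expected_q(state, mean, std=0.1):
    """Expected value of the toy Q(s, a) = -(a - sin(s))^2 under a ~ N(mean, std^2).

    For this quadratic Q the expectation is -((mean - target)^2 + std^2).
    With std held fixed, an entropy bonus would be a constant, so it is omitted.
    """
    return -((mean - target_action(state)) ** 2 + std ** 2)


def objective_grad(state, mean):
    """Gradient of expected_q with respect to the policy mean."""
    return -2.0 * (mean - target_action(state))


def direct_policy(state, weight=0.8):
    """Direct amortization: a one-shot map from state to policy mean.

    The linear map is an assumed stand-in for a trained policy network.
    """
    return weight * state


def iterative_policy(state, steps=5, gain=0.4, weight=0.8):
    """Iterative amortization (sketch): refine the estimate with gradient feedback.

    Starts from the direct estimate and applies `steps` updates. The fixed
    `gain` is an assumed stand-in for a learned update network that would
    take the current estimate and its gradient as inputs.
    """
    mean = direct_policy(state, weight)
    for _ in range(steps):
        mean = mean + gain * objective_grad(state, mean)
    return mean


if __name__ == "__main__":
    for s in (0.5, 1.0, 1.5):
        d, i = direct_policy(s), iterative_policy(s)
        print(f"state={s}: direct E[Q]={expected_q(s, d):.5f}, "
              f"iterative E[Q]={expected_q(s, i):.5f}")
```

In this toy setting each refinement step shrinks the gap between the policy mean and the best action by a constant factor, so the iterative estimate ends closer to the optimum than the one-shot estimate it started from. The sketch only illustrates the structural difference between the two optimizers and says nothing about the benchmark results reported in the paper.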


Item Type: Book Section
Related URLs:
URL | URL Type | Description
https://proceedings.neurips.cc/paper/2021/hash/83fa5a432ae55c253d0e60dbfa716723-Abstract.html | Publisher | Article
https://resolver.caltech.edu/CaltechAUTHORS:20201110-082336091 | Related Item | Discussion Paper
ORCID:
Author | ORCID
Marino, Joseph | 0000-0001-6387-8062
Yue, Yisong | 0000-0001-9127-1989
Additional Information: JM acknowledges Scott Fujimoto for helpful discussions. This work was funded in part by NSF #1918839 and Beyond Limits. JM is currently employed by Google DeepMind. The authors declare no other competing interests related to this work.
Funders:
Funding Agency | Grant Number
NSF | CCF-1918839
Beyond Limits | UNSPECIFIED
Record Number: CaltechAUTHORS:20221222-185354379
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20221222-185354379
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 118584
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 22 Dec 2022 23:58
Last Modified: 22 Dec 2022 23:58
