Li, Kun and Burdick, Joel W. (2017) Meta Inverse Reinforcement Learning via Maximum Reward Sharing for Human Motion Analysis. In: Workshop on Meta-Learning (MetaLearn 2017), 9 December 2017, Long Beach, CA. https://resolver.caltech.edu/CaltechAUTHORS:20190410-120626845
PDF - Published Version (135 kB). See Usage Policy.
Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20190410-120626845
Abstract
This work addresses the inverse reinforcement learning (IRL) problem in which only a small number of demonstrations are available from a demonstrator for each high-dimensional task, too few to estimate an accurate reward function. Observing that each demonstrator has an inherent reward for each state and that task-specific behaviors depend mainly on a small number of key states, we propose a meta IRL algorithm. It first models the reward function of each task as a distribution conditioned on a baseline reward function that is shared by all tasks and depends only on the demonstrator, and then finds the most likely reward function in that distribution that explains the task-specific behaviors. We test the method on path planning tasks with limited demonstrations in a simulated environment and show that the accuracy of the learned reward function is significantly improved.
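A minimal sketch of the shared-baseline idea described in the abstract: a per-task reward is modeled as a Gaussian centered at a demonstrator-level baseline reward, and its MAP estimate trades off fitting the task's few demonstrations against staying close to that baseline. This is not the authors' implementation; the per-state Boltzmann likelihood, the function name, and the toy numbers below are illustrative assumptions standing in for the paper's full IRL likelihood.

```python
# Sketch only: MAP estimate of a task reward under a Gaussian prior centered
# at a shared baseline reward r0 (assumed simplification of the paper's model).
import numpy as np

def map_task_reward(demo_counts, r0, sigma2=1.0, lr=0.1, iters=500):
    """MAP task reward from demo state-visit counts, regularized toward baseline r0."""
    n = demo_counts.sum()
    r = r0.copy()
    for _ in range(iters):
        # Simplified behavior model: p(s | r) proportional to exp(r[s]).
        p = np.exp(r - r.max())
        p /= p.sum()
        # Gradient of log-likelihood (empirical minus model counts)
        # plus the Gaussian prior term pulling r toward r0.
        grad = (demo_counts - n * p) - (r - r0) / sigma2
        r += lr * grad / max(n, 1)
    return r

# Toy usage: 5 states; baseline estimated from pooled demonstrator behavior,
# then a task with only a handful of demonstrations around one key state.
pooled = np.array([40.0, 30.0, 20.0, 30.0, 40.0])   # demonstrator-level visits
r0 = np.log(pooled / pooled.sum())                   # crude baseline reward
task_demos = np.array([0.0, 1.0, 4.0, 1.0, 0.0])     # few task-specific demos
print(np.round(map_task_reward(task_demos, r0, sigma2=0.5), 2))
```

With a small prior variance the estimate stays near the baseline where the task gives little evidence, and deviates only at the heavily demonstrated key state, which is the intended effect of sharing reward structure across tasks.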
Item Type: Conference or Workshop Item (Paper)
Record Number: CaltechAUTHORS:20190410-120626845
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20190410-120626845
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 94631
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 11 Apr 2019 14:50
Last Modified: 03 Oct 2019 21:05