Embodied Question Answering in Photorealistic Environments With Point Cloud Perception

Wijmans, Erik and Datta, Samyak and Maksymets, Oleksandr and Das, Abhishek and Gkioxari, Georgia and Lee, Stefan and Essa, Irfan and Parikh, Devi and Batra, Dhruv (2019) Embodied Question Answering in Photorealistic Environments With Point Cloud Perception. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, Piscataway, NJ, pp. 6652-6661. ISBN 978-1-7281-3293-8. https://resolver.caltech.edu/CaltechAUTHORS:20221215-789772000.17

Full text is not posted in this repository. Consult Related URLs below.

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20221215-789772000.17

Abstract

To help bridge the gap between internet vision-style problems and the goal of vision for embodied perception, we instantiate a large-scale navigation task, Embodied Question Answering [1], in photo-realistic environments (Matterport 3D). We thoroughly study navigation policies that utilize 3D point clouds, RGB images, or their combination. Our analysis of these models reveals several key findings. We find that two seemingly naive navigation baselines, forward-only and random, are strong navigators and challenging to outperform, due to the specific choice of the evaluation setting presented by [1]. We find that a novel loss-weighting scheme, which we call Inflection Weighting, is important when training recurrent models for navigation with behavior cloning, and with this technique we are able to outperform the baselines. We find that point clouds provide a richer signal than RGB images for learning obstacle avoidance, motivating the use (and continued study) of 3D deep learning models for embodied navigation.


Item Type: Book Section
Related URLs:
URL | URL Type | Description
https://doi.org/10.1109/CVPR.2019.00682 | DOI | Article
https://resolver.caltech.edu/CaltechAUTHORS:20221219-204816196 | Related Item | Discussion Paper
ORCID:
Author | ORCID
Wijmans, Erik | 0000-0003-4254-3751
Maksymets, Oleksandr | 0000-0003-3515-8839
Lee, Stefan | 0000-0001-5953-1963
Additional Information: This work was supported in part by NSF (Grant # 1427300), AFRL, DARPA, Siemens, Samsung, Google, Amazon, ONR YIPs and ONR Grants N00014-16-1-{2713,2793}. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government, or any sponsor.
Funders:
Funding Agency | Grant Number
NSF | DUE-1427300
Air Force Research Laboratory (AFRL) | UNSPECIFIED
Defense Advanced Research Projects Agency (DARPA) | UNSPECIFIED
Siemens | UNSPECIFIED
Samsung | UNSPECIFIED
Google | UNSPECIFIED
Amazon | UNSPECIFIED
Office of Naval Research (ONR) | N00014-16-1-2713
Office of Naval Research (ONR) | N00014-16-1-2793
DOI: 10.1109/cvpr.2019.00682
Record Number: CaltechAUTHORS:20221215-789772000.17
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20221215-789772000.17
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 118377
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 20 Dec 2022 23:55
Last Modified: 20 Dec 2022 23:55
