
ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation

Shen, Bokui and Jiang, Zhenyu and Choy, Christopher and Guibas, Leonidas J. and Savarese, Silvio and Anandkumar, Anima and Zhu, Yuke (2022) ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation. (Unpublished)

PDF - Submitted Version
See Usage Policy.




Manipulating volumetric deformable objects in the real world, like plush toys and pizza dough, brings substantial challenges due to infinite shape variations, non-rigid motions, and partial observability. We introduce ACID, an action-conditional visual dynamics model for volumetric deformable objects based on structured implicit neural representations. ACID integrates two new techniques: implicit representations for action-conditional dynamics and geodesics-based contrastive learning. To represent deformable dynamics from partial RGB-D observations, we learn implicit representations of occupancy and flow-based forward dynamics. To accurately identify state change under large non-rigid deformations, we learn a correspondence embedding field through a novel geodesics-based contrastive loss. To evaluate our approach, we develop a simulation framework for manipulating complex deformable shapes in realistic scenes and a benchmark containing over 17,000 action trajectories with six types of plush toys and 78 variants. Our model achieves the best performance in geometry, correspondence, and dynamics predictions over existing approaches. The ACID dynamics models are successfully employed in goal-conditioned deformable manipulation tasks, resulting in a 30% increase in task success rate over the strongest baseline. For more results and information, please visit the project website.

Item Type: Report or Paper (Discussion Paper)
Related URLs:
URL, URL Type, Description
Paper Item; Project website
ORCID:
Anandkumar, Anima: 0000-0002-6974-6797
Zhu, Yuke: 0000-0002-9198-2227
Additional Information: Work done during an internship at NVIDIA Research.
Record Number: CaltechAUTHORS:20220714-224611031
Persistent URL:
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 115595
Deposited By: George Porter
Deposited On: 15 Jul 2022 23:22
Last Modified: 15 Jul 2022 23:22
