Monte Carlo tree search with spectral expansion for planning with dynamical systems
Abstract
The ability of a robot to plan complex behaviors with real-time computation, rather than adhering to predesigned or offline-learned routines, alleviates the need for specialized algorithms or training for each problem instance. Monte Carlo tree search is a powerful planning algorithm that strategically explores simulated future possibilities, but it requires a discrete problem representation that is irreconcilable with the continuous dynamics of the physical world. We present Spectral Expansion Tree Search (SETS), a real-time, tree-based planner that uses the spectrum of the locally linearized system to construct a low-complexity and approximately equivalent discrete representation of the continuous world. We prove that SETS converges to a bound of the globally optimal solution for continuous, deterministic, and differentiable Markov decision processes, a broad class of problems that includes underactuated nonlinear dynamics, nonconvex reward functions, and unstructured environments. We experimentally validated SETS on drone, spacecraft, and ground vehicle robots and one numerical experiment, each of which is not directly solvable with existing methods. We successfully show that SETS automatically discovers a diverse set of optimal behaviors and motion trajectories in real time.
Acknowledgement (English)
We thank the DARPA LINC team at Caltech and JPL, who contributed to the full autonomy stack of the results shown in Fig. 3. We thank J. Preiss (planning), S. Lupu (control), and M. Anderson (ROS/software) for support and collaboration on the tracked vehicle experiment. Other LINC team members include J. Burdick, Y. Yue, A. Rahmani, L. Gan (localization), F. Xie (control), J. Becker (segmentation and mapping), T. Touma, and J. Alindogan (experiment). We thank J. Cho for support and collaboration on the spacecraft experiment, H. Tsukamoto for discussions on discrete contraction, F. Hadaegh for discussions on space autonomy, and the Sandia National Labs team (T. Blada, E. Lu, and D. Wood) for LINC testing support (Fig. 3). The drone experiment was conducted at Caltech’s Center for Autonomous Systems and Technologies.
Copyright and License (English)
Contributions (English)
The algorithm conceptualization, theory, and software implementation were equally developed by B.R. and J.L. under the guidance and critical reviews of S.-J.C. The quadrotor and spacecraft experiments were equally designed by B.R. and J.L. Both B.R. and J.L. contributed to the tracked vehicle experiment and the glider experiment, with J.L. leading the tracked vehicle experiment and B.R. leading the glider experiment. The manuscript and figures were equally developed by B.R. and J.L. with feedback and multiple iterations with S.-J.C. All authors reviewed the manuscript. S.-J.C. supervised the research.
Conflict of Interest (English)
California Institute of Technology filed a US nonprovisional patent application on this work on 5 August 2024.
Funding (English)
We acknowledge the funding support of DARPA LINC (J. F. Mergen), the Aerospace Corporation (J. Brader and B. Bycroft), and Supernal (R. Stefanescu and H. Park). This material is based on work supported by the NSF Graduate Research Fellowship Program under grant no. 2139433. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.
Data Availability (English)
All data and code used to produce the plots presented here and in the Supplementary Materials are available on the online repository Dryad: doi:10.5061/dryad.s7h44j1h5. Our code is also available at https://github.com/aerorobotics/sets.
Supplemental Material
The PDF file includes:
Other Supplementary Material for this manuscript includes the following:
Files
Additional details
- National Science Foundation
- Graduate Research Fellowship Program 2139433
- The Aerospace Corporation
- Defense Advanced Research Projects Agency
- Accepted
-
2024-11-06
- Available
-
2024-12-04Published online
- Journal name
- Science Robotics
- Publication Status
- Published