Published August 2021 | Submitted
Journal Article Open

Simple, low-cost and accurate data-driven geophysical forecasting with learned kernels


Modelling geophysical processes as low-dimensional dynamical systems and regressing their vector field from data is a promising approach for learning emulators of such systems. We show that when the kernel of these emulators is also learned from data (using kernel flows, a variant of cross-validation), the resulting data-driven models are not only faster than equation-based models but also easier to train than neural networks such as the long short-term memory neural network. They are also more accurate and predictive than the latter. When trained on geophysical observational data, for example the weekly averaged global sea-surface temperature, the proposed technique shows considerable gains over classical partial differential equation-based models in both forecast computational cost and accuracy. When trained on publicly available re-analysis data for the daily temperature of the North American continent, it shows significant improvements over classical baselines such as climatology and persistence-based forecast techniques. Although our experiments concern specific examples, the proposed approach is general, and our results support the viability of kernel methods (with learned kernels) for interpretable and computationally efficient geophysical forecasting across a wide range of processes.
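To make the idea concrete, the following is a minimal sketch of a kernel emulator whose kernel parameter is chosen with a kernel-flows-style criterion. It is not the authors' code (that is in the repository linked under Additional Information) and it rests on several assumptions made here for illustration: a toy two-dimensional damped rotation stands in for the low-dimensional geophysical state, the emulator regresses the one-step map x_k -> x_{k+1}, the kernel is a Gaussian with a single bandwidth sigma, and sigma is picked by grid search rather than by gradient descent. The function names (gaussian_kernel, kf_rho, select_sigma, fit, rollout) are hypothetical. The quantity rho compares the norm of the kernel interpolant built from a random half of the data with the norm of the interpolant built from all of it; a small rho suggests the kernel loses little accuracy when half the data is removed.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))


def kf_rho(X, Y, sigma, rng, nugget=1e-6):
    """Kernel-flows-style loss: 1 - ||half-data interpolant||^2 / ||full-data interpolant||^2."""
    n = X.shape[0]
    idx = rng.choice(n, size=n // 2, replace=False)  # random half of the samples
    K = gaussian_kernel(X, X, sigma) + nugget * np.eye(n)  # nugget is an assumption for stability
    Kc = K[np.ix_(idx, idx)]
    full = np.sum(Y * np.linalg.solve(K, Y))
    half = np.sum(Y[idx] * np.linalg.solve(Kc, Y[idx]))
    return 1.0 - half / full


def select_sigma(X, Y, sigmas, n_draws=20, seed=0):
    """Grid search (a simplification): pick the bandwidth with the lowest mean rho."""
    rng = np.random.default_rng(seed)
    scores = [float(np.mean([kf_rho(X, Y, s, rng) for _ in range(n_draws)]))
              for s in sigmas]
    return sigmas[int(np.argmin(scores))], scores


def fit(X, Y, sigma, nugget=1e-6):
    """Solve (K + nugget * I) alpha = Y for the regression weights."""
    K = gaussian_kernel(X, X, sigma) + nugget * np.eye(X.shape[0])
    return np.linalg.solve(K, Y)


def rollout(X_train, alpha, sigma, x0, steps):
    """Iterate the learned one-step map to produce a multi-step forecast."""
    x, out = np.asarray(x0, dtype=float), []
    for _ in range(steps):
        x = (gaussian_kernel(x[None, :], X_train, sigma) @ alpha)[0]
        out.append(x)
    return np.array(out)


# Toy data (an assumption): a slowly decaying rotation in the plane.
theta = 0.3
A = 0.98 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
traj = [np.array([1.0, 0.0])]
for _ in range(60):
    traj.append(A @ traj[-1])
traj = np.array(traj)
X, Y = traj[:-1], traj[1:]  # pairs (x_k, x_{k+1})

sigmas = [0.1, 0.3, 1.0, 3.0, 10.0]
best_sigma, scores = select_sigma(X, Y, sigmas)
alpha = fit(X, Y, best_sigma)
forecast = rollout(X, alpha, best_sigma, traj[-1], steps=10)
print("selected sigma:", best_sigma)
print("forecast shape:", forecast.shape)
```

Once sigma is fixed, training reduces to a single linear solve, which is one way to read the claim that such models are easier to train than a recurrent neural network. A richer kernel family with more parameters would normally be optimized by gradient descent on rho instead of a grid.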

Additional Information

© 2021 The Author(s). Published by the Royal Society. Manuscript received 28/04/2021; Manuscript accepted 21/07/2021; Published online 18/08/2021; Published in print 25/08/2021.

Funding: This material is based upon work supported by the US Department of Energy (DOE), Office of Science, Office of Advanced Scientific Computing Research, under contract DE-AC02-06CH11357. This research was funded in part by and used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under contract DE-AC02-06CH11357. R.M. acknowledges support from the Margaret Butler Fellowship at the Argonne Leadership Computing Facility. B.H. thanks the European Commission for funding through the Marie Curie fellowship STALDYS-792919 (Statistical Learning for Dynamical Systems). H.O. gratefully acknowledges support by the Air Force Office of Scientific Research under award nos. FA9550-18-1-0271 (Games for Computation and Learning) and MURI (FA9550-20-1-0358).

Data accessibility: The data that support the findings of this study are openly available on GitHub at https://github.com/Romit-Maulik/POD_RKHS.

Authors' contributions: B.H. prepared the code and analysis and wrote portions of the paper. R.M. designed the investigation, prepared code and documentation, performed analyses, generated visualizations and wrote the paper. H.O. prepared code and analysis and wrote portions of the paper.

Competing interests: We declare we have no competing interests.

This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the US DOE or the US government.

Attached Files

Submitted - 2103.10935.pdf