Machine Learning Approaches for Developing Potential Surfaces: Applications to OH⁻(H₂O)ₙ (n = 1–3) Complexes
Abstract
An approach for obtaining high-level ab initio potential surfaces is described. The approach takes advantage of machine learning strategies in a two-step process. In the first step, the molecular-orbital based machine learning (MOB-ML) model uses Gaussian process regression to learn the correlation energy at the CCSD(T) level using the molecular orbitals obtained from Hartree-Fock calculations. In this work, the MOB-ML approach is expanded to use orbitals obtained using a smaller basis set, aug-cc-pVDZ, as features for learning the correlation energies at the complete basis set (CBS) limit. This approach is combined with the development of a neural-network potential, where the sampled geometries and energies that provide the training data for the potential are obtained using a diffusion Monte Carlo (DMC) calculation, which was run using the MOB-ML model. Protocols are developed to make full use of the structures that are obtained from the DMC calculation in the training process. These approaches are used to develop potentials for OH⁻(H₂O) and H₃O⁺(H₂O), which are used for subsequent DMC calculations. The results of these calculations are compared to those performed using previously reported potentials. Overall, the results of the two sets of DMC calculations are in good agreement for these very floppy molecules. Potentials are also developed for OH⁻(H₂O)₂ and OH⁻(H₂O)₃, for which no potential surfaces are currently available. The results of DMC calculations for these ions are compared to those for the corresponding H₃O⁺(H₂O)₂ and H₃O⁺(H₂O)₃ ions. It is found that the level of delocalization of the shared proton is similar for a hydroxide or hydronium ion bound to the same number of water molecules. This finding is consistent with the experimental observation that these sets of ions have similar spectra.
Copyright and License
© 2025 American Chemical Society
Acknowledgement
The authors gratefully acknowledge the chemistry division of the NSF (CHE-2154126) for support of this work. This work was also facilitated through the use of advanced computational, storage, and networking infrastructure provided by the Hyak supercomputer system and funded by the STF at the University of Washington. J.S. thanks the Hongyan Scholarship for support, and V.C.B. acknowledges support from the National Science Foundation Graduate Research Fellowship Program (GRFP; grant DGE-1745301).
Supplemental Material
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpca.4c08826.
- Details of MOB-ML methods, including electronic structure calculations and complete basis set extrapolation; numerical details for MOB-ML model training and characterization, including training and test structure collection, ML training protocol, feature analysis, accuracy of the MOB-ML models, and minimum energy structures; numerical details for the DMC calculations; numerical details for NN+(MOB-ML) model training, including collection of training data, analysis of simulation parameters, learning trajectories, PES evaluation times, NN architecture details, and analysis of the NN+(MOB-ML) model errors; analysis of the NN+(MOB-ML) models, including DMC simulation details and cuts through the potentials (PDF)
Data Availability
Reference structures, results of electronic structure calculations, prediction values of the test sets for MOB-ML training, and training/test structures and energies used for refitting the NN+(MOB-ML) PESs are available at https://zenodo.org/records/14563580.
Code Availability
The code is available in a repository: https://github.com/McCoyGroup/ionic_water_mobml-nn_potential.
Additional Information
Published as part of The Journal of Physical Chemistry A special issue “Michael A. Duncan Festschrift”.
Additional details
- National Science Foundation: CHE-2154126
- National Science Foundation: Graduate Research Fellowship Program DGE-1745301
- Accepted: 2025-02-27
- Available: 2025-03-19 (published online)
- Caltech groups: Division of Chemistry and Chemical Engineering (CCE)
- Publication Status: Accepted