Modeling Groundwater Levels in California's Central Valley by Hierarchical Gaussian Process and Neural Network Regression
Abstract
Modeling groundwater levels continuously across California's Central Valley (CV) hydrological system is challenging because the available well data are of low quality, sampled sparsely and noisily in time and space. The lack of consistent well data makes it difficult to evaluate the impact of the 2017 and 2019 wet years on CV groundwater following a severe drought during 2012–2015. A novel machine learning method is formulated for modeling groundwater levels by learning from a 3D lithological texture model of the CV aquifer. The proposed formulation performs multivariate regression by combining Gaussian processes (GP) and deep neural networks (DNN). The hierarchical modeling approach consists of training the DNN to learn a lithologically informed latent space in which non‐parametric regression with GP is performed. We demonstrate the efficacy of GP‐DNN regression for modeling non‐stationary features in the well data with fast and reliable uncertainty quantification, validated to be statistically consistent with the empirical data distribution from 90 blind wells across the CV. We show how the model predictions may be used to supplement hydrological understanding of aquifer responses in basins with irregular well data. Our results indicate that, on average, the 2017 and 2019 wet years in California were largely ineffective in replenishing the groundwater lost during the preceding drought years.
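The hierarchical idea in the abstract (a neural network maps inputs to a latent space, and GP regression is carried out on the latent features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the network weights here are random rather than trained, no lithological texture data is used, and the function names (`feature_map`, `rbf_kernel`, `gp_predict`), the synthetic inputs, the squared-exponential kernel, and the hyperparameter values are all assumptions chosen only for illustration.

```python
import numpy as np

def feature_map(x, weights):
    """Stand-in for a trained DNN: a small tanh MLP mapping inputs to latent features."""
    h = x
    for W, b in weights:
        h = np.tanh(h @ W + b)
    return h

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_predict(z_train, y_train, z_test, noise_var=1e-2):
    """GP posterior mean and variance, with the kernel evaluated on latent features."""
    K = rbf_kernel(z_train, z_train) + noise_var * np.eye(len(z_train))
    K_s = rbf_kernel(z_train, z_test)
    L = np.linalg.cholesky(K)  # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(z_test, z_test)) - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative values from round-off

rng = np.random.default_rng(0)

# Hypothetical synthetic data: three input coordinates per observation.
x_train = rng.uniform(-1.0, 1.0, size=(50, 3))
y_train = np.sin(3.0 * x_train[:, 0]) + 0.1 * rng.standard_normal(50)
x_test = rng.uniform(-1.0, 1.0, size=(10, 3))

# Random (untrained) weights for a 3 -> 16 -> 2 network; a real model would learn these.
weights = [
    (rng.standard_normal((3, 16)), np.zeros(16)),
    (rng.standard_normal((16, 2)), np.zeros(2)),
]

mean, var = gp_predict(
    feature_map(x_train, weights), y_train, feature_map(x_test, weights)
)
```

Because the kernel acts on the network output rather than on the raw coordinates, the resulting covariance need not be stationary in the original input space, which is the property the abstract relies on for modeling non‐stationary features. The returned variance is what would feed any uncertainty quantification.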
Copyright and License
© 2024 The Author(s). Journal of Geophysical Research: Machine Learning and Computation published by Wiley Periodicals LLC on behalf of American Geophysical Union.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
Acknowledgement
We thank Caltech's Resnick Sustainability Institute for funding the work presented and Professor Mark Simons and Dr. Neil Fromer for helpful discussions on groundwater modeling. Professor Venkat Chandrasekaran was supported in part by Air Force Office of Scientific Research (AFOSR) Grants FA9550-23-1-0204 and FA9550-22-1-0225, and by NSF Grant DMS 2113724. Professor Andrew M. Stuart gratefully acknowledges support by the AFOSR under MURI award number FA9550-20-1-0358 (Machine Learning and Physics-Based Modeling and Simulation). We thank Dr. Kyongsik Yun and Dillon Holder for help with processing the data shown. We are also grateful to Professor Tapan Mukerji, whose expertise in geostatistics provided valuable insights supporting the presented research.
Contributions
Conceptualization: Anshuman Pradhan, Kyra H. Adams, Venkat Chandrasekaran, Zhen Liu, John T. Reager, Andrew M. Stuart, Michael J. Turmon.
Data curation: Kyra H. Adams, Zhen Liu, John T. Reager, Michael J. Turmon.
Formal analysis: Anshuman Pradhan.
Funding acquisition: Venkat Chandrasekaran, Andrew M. Stuart.
Methodology: Anshuman Pradhan, Venkat Chandrasekaran, Andrew M. Stuart.
Software: Anshuman Pradhan.
Data Availability
The CV lithologic texture data are available via the United States Geological Survey data release at https://doi.org/10.5066/P9IZRO3V (Marcelli et al., 2022), and the CV digital elevation model is available at https://doi.org/10.5067/MEaSUREs/NASADEM/NASADEM_HGT.001 (NASA JPL, 2020). The CV well water level data are attributed to Kim et al. (2021). The well data, processed as described in Section 3.1, along with the GP-DNN modeled water level trends and time series outputs, may be accessed through the Harvard Dataverse repository at https://doi.org/10.7910/DVN/23TNJO under a Creative Commons Attribution 4.0 International license (Pradhan et al., 2024). Version v0.1.0 of the Python software for GP-DNN regression, written using the open source TensorFlow (TensorFlow Developers, 2023), NumPy (Harris et al., 2020) and SciPy (Virtanen et al., 2020) libraries, is preserved at https://doi.org/10.5281/zenodo.13855361 and is available under a Creative Commons Attribution 4.0 International license (Pradhan, 2024). The normal score transformation was performed using the mGstat geostatistical MATLAB toolbox (Hansen, 2022). All data analyses were performed using the open source NumPy and SciPy Python libraries, while data visualizations were produced using the open source Matplotlib Python library (Caswell et al., 2021; Hunter, 2007) and its Basemap extension.
Additional details
- Resnick Sustainability Institute
- United States Air Force Office of Scientific Research: FA9550-23-1-0204
- United States Air Force Office of Scientific Research: FA9550-22-1-0225
- National Science Foundation: DMS-2113724
- United States Air Force Office of Scientific Research: FA9550-20-1-0358
- Accepted: 2024-10-08
- Available: 2024-10-29 (Version of Record online)
- Publication Status: Published