
The Parameter Houlihan: a solution to high-throughput identifiability indeterminacy for brutally ill-posed problems

Albers, D. J. and Levine, M. E. and Mamykina, L. and Hripcsak, G. (2019) The Parameter Houlihan: a solution to high-throughput identifiability indeterminacy for brutally ill-posed problems. Mathematical Biosciences, 316. Art. No. 108242. ISSN 0025-5564. https://resolver.caltech.edu/CaltechAUTHORS:20190826-092413189

Abstract

One way to inject knowledge into clinically impactful forecasting is to use data assimilation, a nonlinear regression that projects data onto a mechanistic physiologic model rather than onto a set of functions such as neural networks. Such regressions have the advantage of remaining useful with particularly sparse, non-stationary clinical data. However, physiologic models are often nonlinear and can have many parameters, which can lead to problems with parameter identifiability, that is, the ability to find a unique set of parameters that minimizes forecasting error. Identifiability problems can be reduced or eliminated by estimating fewer parameters, but doing so also reduces the flexibility of the model and hence increases forecasting error. We propose a method, the parameter Houlihan, that combines traditional machine learning techniques with data assimilation to select the set of model parameters that minimizes forecasting error while reducing identifiability problems. The method worked well: the data assimilation-based glucose forecasts and estimates for our cohort using the Houlihan-selected parameter sets generally had lower forecasting errors than those from other parameter selection methods, such as by-hand parameter selection. However, the forecast with the lowest forecast error does not always accurately represent physiology; further advancements of the algorithm provide a path for improving physiologic fidelity as well. Our hope is that this methodology represents a first step toward combining machine learning with data assimilation and provides a lower-threshold entry point for using data assimilation with clinical data by helping select the right parameters to estimate.
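
The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the parameter Houlihan itself, whose details are not given on this page. The toy model y(t) = a * b * exp(-c * t), the nominal values, the grid, and the tie-breaking rule are all assumptions made for illustration. Only the product a * b affects the output, so estimating a and b together is non-identifiable, while fixing one of them restores a unique fit.

```python
import itertools
import math

# Hypothetical toy model, not the glucose model from the paper:
# y(t) = a * b * exp(-c * t). Only the product a*b matters, so a and b
# cannot both be identified from data.
NOMINAL = {"a": 1.0, "b": 1.0, "c": 0.3}  # values used when a parameter is held fixed
GRID = {
    "a": [0.5, 1.0, 2.0, 4.0],
    "b": [0.5, 1.0, 2.0, 4.0],
    "c": [0.1, 0.3, 0.5, 0.7],
}


def model(p, t):
    return p["a"] * p["b"] * math.exp(-p["c"] * t)


def rmse(p, data):
    return math.sqrt(sum((model(p, t) - y) ** 2 for t, y in data) / len(data))


def fit(free, train, tol=1e-12):
    """Grid-search the free parameters; return every minimizer of training RMSE."""
    candidates = []
    for values in itertools.product(*(GRID[name] for name in free)):
        p = dict(NOMINAL, **dict(zip(free, values)))
        candidates.append((rmse(p, train), p))
    best = min(err for err, _ in candidates)
    return [p for err, p in candidates if err <= best + tol]


def select_subset(train, test):
    """Rank subsets by held-out forecast RMSE, then by the number of parameter
    sets tied on the training data, then by subset size (an assumed heuristic)."""
    scored = []
    for k in range(1, len(NOMINAL) + 1):
        for free in itertools.combinations(sorted(NOMINAL), k):
            minimizers = fit(free, train)
            scored.append((rmse(minimizers[0], test), len(minimizers), k, free))
    return min(scored)


# Noise-free synthetic data: fit on early times, forecast later times.
truth = {"a": 1.0, "b": 2.0, "c": 0.5}
train = [(t, model(truth, t)) for t in (0.0, 1.0, 2.0, 3.0)]
test = [(t, model(truth, t)) for t in (4.0, 5.0, 6.0)]

if __name__ == "__main__":
    print(len(fit(("a", "b", "c"), train)), "parameter sets tie when all three are estimated")
    print("selected subset:", select_subset(train, test)[3])
```

On this toy data, estimating all three parameters leaves four parameter sets tied on the training data, whereas estimating only a and c gives a single best fit with the same forecast error, so the smaller subset is selected.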


Item Type: Article
Related URLs:
URL                                          URL Type   Description
https://doi.org/10.1016/j.mbs.2019.108242    DOI        Article
https://arxiv.org/abs/1902.01978             arXiv      Discussion Paper
Additional Information: © 2019 Published by Elsevier. Received 5 February 2019, Revised 20 August 2019, Accepted 22 August 2019, Available online 24 August 2019.
Subject Keywords: Data assimilation; Identifiability; Machine learning; Inverse problems; Physiology; Markov Chain Monte Carlo
Record Number: CaltechAUTHORS:20190826-092413189
Persistent URL: https://resolver.caltech.edu/CaltechAUTHORS:20190826-092413189
Official Citation: David J. Albers, Matthew E. Levine, Lena Mamykina, George Hripcsak, The parameter Houlihan: A solution to high-throughput identifiability indeterminacy for brutally ill-posed problems, Mathematical Biosciences, Volume 316, 2019, 108242, ISSN 0025-5564, https://doi.org/10.1016/j.mbs.2019.108242. (http://www.sciencedirect.com/science/article/pii/S0025556419300781)
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 98225
Collection: CaltechAUTHORS
Deposited By: George Porter
Deposited On: 26 Aug 2019 16:53
Last Modified: 03 Oct 2019 21:39
