The Materials Provenance Store
We present a database resulting from high-throughput experimentation, primarily on metal oxide solid-state materials. The central relational database, the Materials Provenance Store (MPS), manages the metadata and experimental provenance from acquisition of raw materials, through synthesis, to a broad range of materials characterization techniques. Because the primary research goal is the discovery of solar fuels materials, many of the characterization experiments involve electrochemistry, along with optical, structural, and compositional characterization. The MPS is populated with all information required to execute common data queries, which typically do not involve direct queries of the raw data. The result is a database file that can be distributed to users so that they can independently execute queries and subsequently download the data of interest. We propose this strategy as an approach to managing the highly heterogeneous and distributed data that arise from materials science experiments, as demonstrated by the management of over 30 million experiments run on over 12 million samples in the present MPS release.
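The query-then-download workflow described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses an in-memory SQLite database, and the table names (`sample`, `process`), column names, sample labels, and `example.org` URLs are all hypothetical stand-ins, not the actual MPS schema or file format. The point is that provenance and metadata live in the relational store, so a user can select experiments of interest without touching the raw data, and only then fetch the files the query returns.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with a HYPOTHETICAL two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sample (
            id INTEGER PRIMARY KEY,
            label TEXT,
            composition TEXT
        );
        CREATE TABLE process (
            id INTEGER PRIMARY KEY,
            sample_id INTEGER REFERENCES sample(id),
            step_order INTEGER,   -- position in the sample's history
            technique TEXT,       -- e.g. synthesis, electrochemistry, xrd
            raw_data_url TEXT     -- pointer to raw data stored elsewhere
        );
        """
    )
    # Made-up rows for illustration only.
    conn.executemany(
        "INSERT INTO sample VALUES (?, ?, ?)",
        [(1, "plate-0001-s1", "Bi-V-O"), (2, "plate-0001-s2", "Fe-Ni-O")],
    )
    conn.executemany(
        "INSERT INTO process VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, 1, "synthesis", None),
            (2, 1, 2, "electrochemistry", "https://example.org/raw/ec_0001.zip"),
            (3, 1, 3, "xrd", "https://example.org/raw/xrd_0001.zip"),
            (4, 2, 1, "synthesis", None),
            (5, 2, 2, "optical", "https://example.org/raw/uvis_0002.zip"),
        ],
    )
    return conn


def sample_history(conn: sqlite3.Connection, label: str) -> list[str]:
    """Return the ordered list of techniques applied to one sample."""
    rows = conn.execute(
        """
        SELECT p.technique
        FROM process AS p
        JOIN sample AS s ON s.id = p.sample_id
        WHERE s.label = ?
        ORDER BY p.step_order
        """,
        (label,),
    ).fetchall()
    return [technique for (technique,) in rows]


def find_raw_data(conn: sqlite3.Connection, technique: str) -> list[tuple[str, str]]:
    """Return (sample label, raw data URL) pairs for one technique."""
    return conn.execute(
        """
        SELECT s.label, p.raw_data_url
        FROM sample AS s
        JOIN process AS p ON p.sample_id = s.id
        WHERE p.technique = ? AND p.raw_data_url IS NOT NULL
        ORDER BY s.label, p.step_order
        """,
        (technique,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(sample_history(conn, "plate-0001-s1"))
    # The URLs returned here are what a user would download afterwards.
    for label, url in find_raw_data(conn, "electrochemistry"):
        print(label, url)
```

In this sketch the raw data never enters the database; each row stores only a pointer to it, which keeps the distributable file small relative to the measurements it indexes.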
© The Author(s) 2023. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
This material is primarily based on work performed by the Liquid Sunlight Alliance, which is supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, Fuels from Sunlight Hub under Award DE-SC0021266. Development of the database schema was supported by Toyota Research Institute. Much of the underlying data was generated by research in the Joint Center for Artificial Photosynthesis, a DOE Energy Innovation Hub, supported through the Office of Science of the U.S. Department of Energy (Award No. DE-SC0004993). Storage was provided by the Open Storage Network via XSEDE allocation INI210004. Use of the Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, is supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No. DE-AC02-76SF00515.
Contributions. M.J.S., B.A.R., D.G., S.K. and J.M.G. designed the MPS schema and its ingestion of MEAD. M.J.S., B.A.R. and D.G. implemented MPS. T.E.M. facilitated implementation of DOI-based linkages between MPS and CaltechDATA. Quality checks were performed by all authors. M.J.S., B.A.R. and J.M.G. were the primary authors of the manuscript.
Code availability. The MPS database was generated using DBgen (v1.0.0a7), an open-source framework for building scientific databases and pipelines, available at https://github.com/modelyst/dbgen. A Python API, a command-line interface (CLI), and a Jupyter notebook with example queries are available in the Materials Provenance Store Client repository (https://github.com/modelyst/mps-client).
Competing interests. Modelyst LLC implements custom data management systems in a professional context.
Published - 41597_2023_Article_2107.pdf
Supplemental Material - 41597_2023_2107_MOESM1_ESM.xlsx