The materials experiment knowledge graph
Abstract
Materials knowledge is inherently hierarchical. While high-level descriptors such as composition and structure are valuable for contextualizing materials data, the data must ultimately be considered in the context of its low-level acquisition details. Graph databases offer an opportunity to represent hierarchical relationships among data, organizing semantic relationships into a knowledge graph. Herein, we establish a knowledge graph of materials experiments whose construction encodes the complete provenance of each material sample and its associated experimental data and metadata. Additional relationships among materials and experiments further encode knowledge and facilitate data exploration. We illustrate the Materials Experiment Knowledge Graph (MekG) using several use cases, demonstrating the value of modern graph databases for the enterprise of data-driven materials science.
Copyright and License
This article is licensed under a Creative Commons Attribution 3.0 Unported Licence.
Acknowledgement
This material is primarily based on work performed by the Liquid Sunlight Alliance, which is supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, Fuels from Sunlight Hub under Award DE-SC0021266. Development of the graph database schema was supported by Toyota Research Institute. Much of the underlying data was generated by research in the Joint Center for Artificial Photosynthesis, a DOE Energy Innovation Hub, supported through the Office of Science of the U.S. Department of Energy (Award No. DE-SC0004993). Storage is provided by the Open Storage Network via XSEDE allocation INI210004.
Contributions
M. J. S., B. A. R., D. G., S. K. S., and J. M. G. designed the MekG and the use cases. M. J. S. and B. A. R. implemented MekG with assistance from D. G. and J. M. G.; J. B. and D. G. implemented the design-of-experiments use case.
Data Availability
The MPS SQL database from which MekG is built and the three sub-databases are available at https://data.caltech.edu/records/aeffy-dcr62 (doi: https://doi.org/10.22002/aeffy-dcr62). The MekG Neo4j database is available at https://data.caltech.edu/records/h88fq-dk449 (doi: https://doi.org/10.22002/h88fq-dk449).
Code Availability
The code for the query time use cases and MekG migration from MPS is available at https://github.com/modelyst/MekG-migrations. The code for the design of experiments and hypothesis evaluation use cases is available at https://data.caltech.edu/records/m4mpa-4mt17 (doi: https://doi.org/10.22002/m4mpa-4mt17).
Conflict of Interest
Modelyst LLC implements custom data management systems in a professional context.
Files
Checksum | Size |
---|---|
md5:c892ce579ffadccfb4f24470367eecf5 | 430.1 kB |
md5:42f2f09e8742d701a129a41d784f5ca0 | 569.1 kB |
Additional details
- United States Department of Energy: DE-SC0021266
- Toyota Motor Corporation (United States): Toyota Research Institute
- United States Department of Energy: DE-SC0004993
- National Science Foundation: INI210004
- Caltech groups: Liquid Sunlight Alliance, JCAP