Probabilistic phase labeling and lattice refinement for autonomous materials research
Abstract
X-ray diffraction (XRD) is a powerful method for determining a material's crystal structure in high-throughput experimentation, and is increasingly being incorporated into artificially intelligent agents for autonomous scientific discovery. However, rapid, automated, and reliable analysis of XRD data at rates that match the pace of experimental measurements at a synchrotron source remains a major challenge. To address this challenge, we developed CrystalShift for rapid and efficient probabilistic XRD phase labeling employing symmetry-constrained optimization, best-first tree search, and Bayesian model comparison. The algorithm estimates probabilities for phase combinations without requiring additional phase space information or training. We demonstrate that CrystalShift provides robust probability estimates, outperforming existing methods on synthetic and experimental datasets, and can be readily integrated into high-throughput experimental workflows. In addition to efficient phase labeling, CrystalShift offers quantitative insights into materials' structural parameters, which facilitate both expert evaluation and AI-based modeling of the phase space, ultimately accelerating materials identification and discovery.
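To illustrate what "estimating probabilities for phase combinations" through Bayesian model comparison can look like in general, the sketch below normalizes per-model log evidences into posterior probabilities. It is a generic illustration and not the CrystalShift implementation: the function name, the uniform prior, and the log-evidence values are all hypothetical.

```python
import math


def posterior_from_log_evidence(log_evidence, log_prior=None):
    """Turn per-model log evidences into normalized posterior probabilities."""
    names = list(log_evidence)
    if log_prior is None:
        # Assumption: a uniform prior over the candidate phase combinations.
        log_prior = {name: 0.0 for name in names}
    scores = [log_evidence[name] + log_prior[name] for name in names]
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(scores)
    weights = [math.exp(score - top) for score in scores]
    total = sum(weights)
    return {name: weight / total for name, weight in zip(names, weights)}


# Hypothetical log evidences for three candidate phase combinations.
candidates = {"A": -120.0, "A+B": -118.0, "B": -131.0}
probs = posterior_from_log_evidence(candidates)
best = max(probs, key=probs.get)
```

With a uniform prior, the ratio of two posterior probabilities equals the ratio of the corresponding evidences (the Bayes factor), so a difference of 2 in log evidence corresponds to a factor of e² in probability.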
Copyright and License
© The Author(s) 2025.
This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Acknowledgement
This material is based upon work supported by the Air Force Office of Scientific Research under award number FA9550-18-1-0136. This work is based upon research conducted at the Materials Solutions Network at CHESS (MSN-C) which is supported by the Air Force Research Laboratory under award FA8650-19-2-5220. Experimental work was performed in part at the Cornell NanoScale Facility, an NNCI member supported by NSF Grant NNCI-2025233. Additionally, this research was conducted with support from the Cornell University Center for Advanced Computing. Use of the Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, is supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences under Contract No. DE-AC02-76SF00515.
Data Availability
The datasets used and analyzed during the current study are publicly available at https://doi.org/10.7298/kwe5-xc35. The Ca5(PO4)3F data is from https://subversion.xray.aps.anl.gov/pyGSAS/Tutorials/LabData/data/.
Code Availability
The underlying code for this study is available in the CrystalShift.jl and CrystalTree.jl repositories at https://github.com/MingChiangChang/CrystalShift.jl and https://github.com/MingChiangChang/CrystalTree.jl. A snapshot of the codebase used in this study is archived in the data repository at https://doi.org/10.7298/kwe5-xc35.
Supplemental Material
Files
File (checksum) | Size |
---|---|
md5:01e309ee4cae74818fa205e894b64d38 | 1.2 MB |
md5:df8ec4ddae05a281798016a308cbc0a4 | 2.4 MB |
Additional details
- United States Air Force Office of Scientific Research
- FA9550-18-1-0136
- United States Air Force Office of Scientific Research
- FA8650-19-2-5220
- National Science Foundation
- NNCI-2025233
- United States Department of Energy
- DE-AC02-76SF00515
- Accepted
- 2025-04-13
- Caltech groups
- Division of Engineering and Applied Science (EAS), Liquid Sunlight Alliance
- Publication Status
- Published