
Statistical Analysis and Interpolation of Compositional Data in Materials Science

Pesenson, Misha Z. and Suram, Santosh K. and Gregoire, John M. (2015) Statistical Analysis and Interpolation of Compositional Data in Materials Science. ACS Combinatorial Science, 17 (2). pp. 130-136. ISSN 2156-8952. doi:10.1021/co5001458.

Full text is not posted in this repository.

Compositional data are ubiquitous in chemistry and materials science: analysis of elements in multicomponent systems, combinatorial problems, etc., lead to data that are non-negative and sum to a constant (for example, atomic concentrations). The constant sum constraint restricts the sampling space to a simplex instead of the usual Euclidean space. Since statistical measures such as mean and standard deviation are defined for the Euclidean space, traditional correlation studies, multivariate analysis, and hypothesis testing may lead to erroneous dependencies and incorrect inferences when applied to compositional data. Furthermore, composition measurements that are used for data analytics may not include all of the elements contained in the material; that is, the measurements may be subcompositions of a higher-dimensional parent composition. Physically meaningful statistical analysis must yield results that are invariant under the number of composition elements, requiring the application of specialized statistical tools. We present specifics and subtleties of compositional data processing through discussion of illustrative examples. We introduce basic concepts, terminology, and methods required for the analysis of compositional data and utilize them for the spatial interpolation of composition in a sputtered thin film. The results demonstrate the importance of this mathematical framework for compositional data analysis (CDA) in the fields of materials science and chemistry.
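The requirement that results stay invariant when elements are omitted can be illustrated with a minimal sketch. The abstract does not name a specific method, so the log-ratio approach used here (standard in compositional data analysis) is an assumption. The function names and the three-element composition are hypothetical and chosen only for illustration.

```python
import math


def closure(parts, total=1.0):
    """Rescale non-negative parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(composition):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def log_ratio(composition, i, j):
    """Log of the ratio between parts i and j of a composition."""
    return math.log(composition[i] / composition[j])


# Hypothetical parent composition (atomic fractions), illustration only.
parent = closure([0.2, 0.3, 0.5])

# Subcomposition: drop the third element and re-close the remaining two.
sub = closure(parent[:2])

# The raw fraction of element 0 depends on which elements were measured
# (0.2 in the parent, about 0.4 in the subcomposition) ...
raw_changed = abs(parent[0] - sub[0]) > 0.1

# ... while the log ratio between elements 0 and 1 is the same in both.
ratio_preserved = math.isclose(log_ratio(parent, 0, 1), log_ratio(sub, 0, 1))

# clr coordinates sum to zero and live in ordinary Euclidean space.
clr_parent = clr(parent)
```

Statistics computed on raw fractions inherit the dependence on the measured element set, whereas statistics built from log ratios do not. This is one way to read the invariance requirement stated in the abstract.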

Item Type: Article
ORCID: Suram, Santosh K.: 0000-0001-8170-2685
ORCID: Gregoire, John M.: 0000-0002-2863-5265
Additional Information: © 2014 American Chemical Society. Received: September 17, 2014; revised: November 26, 2014. Publication Date (Web): December 29, 2014. This material is based upon work performed by the Joint Center for Artificial Photosynthesis, a DOE Energy Innovation Hub, supported through the Office of Science of the U.S. Department of Energy under Award Number DE-SC000499.
Funding Agency: Department of Energy (DOE)
Grant Number: DE-SC000499
Subject Keywords: high-throughput screening; electrocatalyst; inkjet printing; sputtering; thin-films; interpolation; compositional data; big data; complex data; statistical data analysis
Issue or Number: 2
Record Number: CaltechAUTHORS:20150120-090320534
Official Citation: Statistical Analysis and Interpolation of Compositional Data in Materials Science. Misha Z. Pesenson, Santosh K. Suram, and John M. Gregoire. ACS Combinatorial Science 2015, 17 (2), 130-136. DOI: 10.1021/co5001458
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 53859
Deposited On: 20 Jan 2015 22:16
Last Modified: 10 Nov 2021 20:08
