
Machine-assisted discovery of relationships in astronomy

Graham, Matthew J. and Djorgovski, S. G. and Mahabal, Ashish A. and Donalek, Ciro and Drake, Andrew J. (2013) Machine-assisted discovery of relationships in astronomy. Monthly Notices of the Royal Astronomical Society, 431 (3). pp. 2371-2384. ISSN 0035-8711. doi:10.1093/mnras/stt329.




High-volume, feature-rich data sets are becoming the bread and butter of 21st-century astronomy but present significant challenges to scientific discovery. In particular, identifying scientifically significant relationships between sets of parameters is non-trivial. Similar problems in the biological sciences and geosciences have led to the development of systems that can explore large parameter spaces and identify potentially interesting sets of associations. In this paper, we describe the application of automated relationship-discovery systems to astronomical data sets, focusing on an evolutionary programming technique and an information-theory technique. We demonstrate their use with classical astronomical relationships – the Hertzsprung–Russell diagram and the Fundamental Plane of elliptical galaxies. We also show how they handle binary classification, a problem relevant to the next generation of large synoptic sky surveys, such as the Large Synoptic Survey Telescope (LSST). We find that results comparable to those of more familiar techniques, such as decision trees, are achievable. Finally, we consider the reality of the relationships discovered and how this can be used for feature selection and extraction.

Item Type: Article
Graham, Matthew J. (ORCID 0000-0002-3168-0139)
Djorgovski, S. G. (ORCID 0000-0002-0603-3087)
Mahabal, Ashish A. (ORCID 0000-0003-2242-0244)
Additional Information: © 2013 The Authors. Published by Oxford University Press on behalf of the Royal Astronomical Society. Accepted 2013 February 20. Received 2013 February 19; in original form 2012 October 4. First published online: March 20, 2013. We thank Hod Lipson and Michael Schmidt for useful discussions and their kind assistance with the EUREQA software. We also thank the anonymous referee for their useful comments which helped improve this paper. This work was supported in part by the NSF grants AST-0909182 and IIS-1118041, by the W. M. Keck Institute for Space Studies, and by the US Virtual Astronomical Observatory, itself supported by the NSF grant AST-0834235. This research has made use of data obtained from or software provided by the US Virtual Astronomical Observatory, which is sponsored by the National Science Foundation and the National Aeronautics and Space Administration. This research has made use of the SIMBAD data base, operated at CDS, Strasbourg, France, and the International Variable Star Index (VSX) data base, operated at AAVSO, Cambridge, Massachusetts, USA. Funding for SDSS-III has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation and the U.S. Department of Energy Office of Science.
The SDSS-III website is SDSS-III is managed by the Astrophysical Research Consortium for the Participating Institutions of the SDSS-III Collaboration including the University of Arizona, the Brazilian Participation Group, Brookhaven National Laboratory, University of Cambridge, Carnegie Mellon University, University of Florida, the French Participation Group, the German Participation Group, Harvard University, the Instituto de Astrofisica de Canarias, the Michigan State/Notre Dame/JINA Participation Group, Johns Hopkins University, Lawrence Berkeley National Laboratory, Max Planck Institute for Astrophysics, Max Planck Institute for Extraterrestrial Physics, New Mexico State University, New York University, Ohio State University, Pennsylvania State University, University of Portsmouth, Princeton University, the Spanish Participation Group, University of Tokyo, University of Utah, Vanderbilt University, University of Virginia, University of Washington and Yale University.
Group: Keck Institute for Space Studies
Funding Agency: Grant Number
W. M. Keck Institute for Space Studies: UNSPECIFIED
US Virtual Astronomical Observatory: UNSPECIFIED
Alfred P. Sloan Foundation: UNSPECIFIED
Participating Institutions: UNSPECIFIED
Department of Energy (DOE) Office of Science: UNSPECIFIED
Subject Keywords: methods: data analysis; astronomical data bases: miscellaneous; virtual observatory tools
Issue or Number: 3
Record Number: CaltechAUTHORS:20130613-102659523
Persistent URL:
Official Citation: Matthew J. Graham, S. G. Djorgovski, Ashish A. Mahabal, Ciro Donalek, and Andrew J. Drake. Machine-assisted discovery of relationships in astronomy. MNRAS (May 21, 2013) Vol. 431, 2371-2384; first published online March 20, 2013. doi:10.1093/mnras/stt329
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 38942
Deposited By: Ruth Sustaita
Deposited On: 13 Jun 2013 18:09
Last Modified: 09 Nov 2021 23:41
