Data mining a large digital sky survey: from the challenges to the scientific results

Djorgovski, S. G. and de Carvalho, R. R. and Odewahn, S. C. and Gal, R. R. and Roden, J. and Stolorz, P. and Gray, A. (1997) Data mining a large digital sky survey: from the challenges to the scientific results. In: Applications of Digital Image Processing XX. Proceedings of SPIE. No.3164. Society of Photo-optical Instrumentation Engineers (SPIE) , Bellingham, WA, pp. 98-109. ISBN 9780819425867.

The analysis and an efficient scientific exploration of the Digital Palomar Observatory Sky Survey (DPOSS) represents a major technical challenge. The input data set consists of 3 Terabytes of pixel information, and contains a few billion sources. We describe some of the specific scientific problems posed by the data, including searches for distant quasars and clusters of galaxies, and the data-mining techniques we are exploring in addressing them. Machine-assisted discovery methods may become essential for the analysis of such multi-Terabyte data sets. New and future approaches involve unsupervised classification and clustering analysis in the Giga-object data space, including various Bayesian techniques. In addition to the searches for known types of objects in this database, these techniques may also offer the possibility of discovering previously unknown, rare types of astronomical objects.

Item Type:Book Section
ORCID:
Djorgovski, S. G.: 0000-0002-0603-3087
de Carvalho, R. R.: 0000-0002-1283-3363
Additional Information:© 1997 Society of Photo-optical Instrumentation Engineers (SPIE). This work was supported in part by funds from NASA, the Norris Foundation, and the NSF PYI award AST-9157412. We acknowledge the efforts of the POSS-II team at Palomar and the digitization team at STScI. N. Weir and U. Fayyad made important initial contributions to this project. We also thank J. Kennefick, J. Darling, and V. Desai for their contributions to the quasar search project. The DPOSS work at Caltech is a part of the CRONA international collaboration.
Funding Agency:Kenneth T. and Eileen L. Norris Foundation (Grant Number: UNSPECIFIED)
Subject Keywords:data mining, sky surveys, clustering analysis, unsupervised classification
Series Name:Proceedings of SPIE
Issue or Number:3164
Record Number:CaltechAUTHORS:20180711-101525969
Persistent URL:
Official Citation:S. George Djorgovski, Reinaldo R. de Carvalho, Steve C. Odewahn, R. R. Gal, Joe Roden, Paul Stolorz, Alex Gray, "Data mining a large digital sky survey: from the challenges to the scientific results", Proc. SPIE 3164, Applications of Digital Image Processing XX, (30 October 1997); doi: 10.1117/12.292750
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:87747
Deposited By: Tony Diaz
Deposited On:11 Jul 2018 17:28
Last Modified:15 Nov 2021 20:51
