Identification of Single Spectral Lines in Large Spectroscopic Surveys Using UMLAUT: an Unsupervised Machine-learning Algorithm Based on Unbiased Topology

Baronchelli, I. and Scarlata, C. M. and Rodríguez-Muñoz, L. and Bonato, M. and Morselli, L. and Vaccari, M. and Carraro, R. and Barrufet, L. and Henry, A. and Mehta, V. and Rodighiero, G. and Baruffolo, A. and Bagley, M. and Battisti, A. and Colbert, J. and Dai, Y. S. and De Pascale, M. and Dickinson, H. and Malkan, M. and Mancini, C. and Rafelski, M. and Teplitz, H. I. (2021) Identification of Single Spectral Lines in Large Spectroscopic Surveys Using UMLAUT: an Unsupervised Machine-learning Algorithm Based on Unbiased Topology. Astrophysical Journal Supplement Series, 257 (2). Art. No. 67. ISSN 0067-0049. doi:10.3847/1538-4365/ac250c.

The identification of an emission line is unambiguous when multiple spectral features are clearly visible in the same spectrum. In many cases, however, only one line is detected, making it difficult to determine the redshift correctly. We developed a freely available unsupervised machine-learning algorithm based on unbiased topology (UMLAUT) that can be used in a wide variety of contexts, including the identification of single emission lines. To this end, the algorithm combines different sources of information, such as the apparent magnitude, size, and color of the emitting source, and the equivalent width and wavelength of the detected line. In each specific case, the algorithm automatically identifies the most relevant parameters (i.e., those that minimize the dispersion associated with the output parameter). The outputs can be easily integrated into other algorithms, allowing supervised and unsupervised techniques to be combined and increasing the overall accuracy. We tested our software on WISP (WFC3 IR Spectroscopic Parallel) survey data. WISP is one of the closest existing analogs to the near-IR spectroscopic surveys that will be performed by the future Euclid and Roman missions. These missions will investigate the large-scale structure of the universe by surveying a large portion of the extragalactic sky in near-IR slitless spectroscopy, detecting a significant fraction of single emission lines. In our tests, UMLAUT correctly identifies real lines in 83.2% of the cases. The accuracy is slightly higher (84.4%) when our unsupervised approach is combined with a supervised approach we previously developed.
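
To illustrate why a single detected line leaves the redshift ambiguous, the following minimal Python sketch lists the redshift implied by each candidate identification, using the standard relation z = λ_obs / λ_rest − 1. It is not part of UMLAUT: the function name, the short list of candidate lines, and the approximate rest-frame wavelengths are assumptions chosen for illustration only.

```python
# Approximate rest-frame wavelengths in angstroms.
# Illustrative values only; this candidate list is an assumption,
# not the line list used by UMLAUT.
REST_WAVELENGTHS = {
    "Halpha": 6562.8,
    "[OIII]5007": 5006.8,
    "[OII]3727": 3727.0,
}


def candidate_redshifts(observed_wavelength):
    """Return {line_name: z} for every candidate line giving z >= 0.

    Hypothetical helper: applies z = lambda_obs / lambda_rest - 1
    to each candidate identification of a single observed line.
    """
    candidates = {}
    for name, rest in REST_WAVELENGTHS.items():
        z = observed_wavelength / rest - 1.0
        if z >= 0:
            candidates[name] = z
    return candidates


if __name__ == "__main__":
    # One line observed at 13000 angstroms admits several redshifts.
    for name, z in candidate_redshifts(13000.0).items():
        print(f"{name}: z = {z:.3f}")
```

Each candidate yields a different redshift for the same observed wavelength, which is why additional information such as magnitude, size, color, and equivalent width is needed to choose among them.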

Item Type:Article
Related URLs:Paper; UMLAUT code (Item); Erratum (Item)
ORCID:
Baronchelli, I. 0000-0003-0556-2929
Scarlata, C. M. 0000-0002-9136-8876
Rodríguez-Muñoz, L. 0000-0002-0192-5131
Bonato, M. 0000-0001-9139-2342
Morselli, L. 0000-0003-0753-2571
Vaccari, M. 0000-0002-6748-0577
Carraro, R. 0000-0002-6089-1947
Barrufet, L. 0000-0003-1641-6185
Henry, A. 0000-0002-6586-4446
Mehta, V. 0000-0001-7166-6035
Rodighiero, G. 0000-0002-9415-2296
Baruffolo, A. 0000-0002-1114-4355
Bagley, M. 0000-0002-9921-9218
Battisti, A. 0000-0003-4569-2285
Colbert, J. 0000-0001-6482-3020
Dai, Y. S. 0000-0002-7928-416X
De Pascale, M. 0000-0002-7854-7271
Dickinson, H. 0000-0003-0475-008X
Malkan, M. 0000-0001-6919-1237
Mancini, C. 0000-0002-4297-0561
Rafelski, M. 0000-0002-9946-4731
Teplitz, H. I. 0000-0002-7064-5424
Additional Information:© 2021. The American Astronomical Society. Received 2020 December 30; revised 2021 August 3; accepted 2021 September 6; published 2021 December 10.
Group:Infrared Processing and Analysis Center (IPAC)
Subject Keywords:Spectroscopy; Algorithms; Spectral line identification; Redshift surveys
Issue or Number:2
Classification Code:Unified Astronomy Thesaurus concepts: Spectroscopy (1558); Algorithms (1883); Spectral line identification (2073); Redshift surveys (1378)
Record Number:CaltechAUTHORS:20211210-240656000
Official Citation:I. Baronchelli et al 2021 ApJS 257 67
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:112360
Deposited By: George Porter
Deposited On:10 Dec 2021 20:20
Last Modified:22 Dec 2022 00:30
