Published February 2019 | Submitted + Published
Journal Article Open

Optimizing spectroscopic follow-up strategies for supernova photometric classification with active learning


We report a framework for designing spectroscopic follow-up that optimizes supernova photometric classification. The strategy accounts for the unavoidable mismatch between spectroscopic and photometric samples, and can be used even at the beginning of a new survey, without any initial training set. The framework falls under the umbrella of active learning (AL), a class of algorithms that aims to minimize labelling costs by identifying a few carefully chosen objects with high potential to improve the classifier predictions. As a proof of concept, we use the simulated data released after the SuperNova Photometric Classification Challenge (SNPCC) and a random forest classifier. Our results show that, using only 12 per cent of the number of training objects in the SNPCC spectroscopic sample, this approach is able to double purity results. Moreover, to take into account multiple spectroscopic observations in the same night, we propose a semisupervised batch-mode AL algorithm that selects the N = 5 most informative objects each night. In comparison with the initial state using the traditional approach, our method achieves 2.3 times higher purity and comparable figure of merit results after only 180 d of observation, or 800 queries (73 per cent of the SNPCC spectroscopic sample size). These results were obtained using the same amount of spectroscopic time needed to observe the original SNPCC spectroscopic sample, showing that this type of strategy is feasible with currently available spectroscopic resources. The code used in this work is available in the COINtoolbox.
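The batch query step described in the abstract can be illustrated with a minimal sketch. It assumes uncertainty sampling as the query strategy, which is one common AL choice; the abstract does not state which strategy the authors use, and this is not the COINtoolbox code. The function names `uncertainty` and `select_batch`, and the example probabilities, are hypothetical.

```python
from typing import List, Sequence


def uncertainty(p: float) -> float:
    """Uncertainty of a binary prediction: 1 at p = 0.5, 0 at p = 0 or p = 1."""
    return 1.0 - 2.0 * abs(p - 0.5)


def select_batch(probs: Sequence[float], batch_size: int = 5) -> List[int]:
    """Return the indices of the `batch_size` most uncertain objects.

    `probs` holds the classifier's predicted probability that each
    unlabelled object belongs to the positive class.
    """
    ranked = sorted(
        range(len(probs)),
        key=lambda i: uncertainty(probs[i]),
        reverse=True,  # most uncertain first
    )
    return ranked[:batch_size]


# Hypothetical classifier outputs for eight unlabelled objects.
probs = [0.97, 0.52, 0.10, 0.45, 0.80, 0.49, 0.03, 0.60]

# Objects to send for spectroscopic follow-up in one night.
queried = select_batch(probs, batch_size=3)
print(queried)
```

In a full AL loop, the queried objects would be labelled, added to the training set, and the classifier retrained before the next night's selection.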

Additional Information

© 2018 The Author(s) Published by Oxford University Press on behalf of the Royal Astronomical Society. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model). Accepted 2018 November 5. Received 2018 November 4; in original form 2018 April 19. Published: 06 November 2018. This work was created during the 4th COIN Residence Program (CRP#4), held in Clermont-Ferrand, France in August 2017, with support from Université Clermont-Auvergne and La Région Auvergne-Rhône-Alpes. This project is financially supported by CNRS as part of its MOMENTUM programme over the 2018–2020 period. EEOI thanks Michele Sasdelli for comments on the draft and Isobel Hook for useful discussions. AKM acknowledges the support from the Portuguese Fundação para a Ciência e a Tecnologia (FCT) through grant SFRH/BPD/74697/2010, from the Portuguese Strategic Programme UID/FIS/00099/2013 for CENTRA, the ESA contract AO/1-7836/14/NL/HB and Caltech Division of Physics, Mathematics and Astronomy for hosting a research leave during 2017–2018, when this paper was prepared. RSS thanks NASA for support under the Astrophysics Theory Program Grant 14-ATP14-0007. RB acknowledges support from the National Science Foundation (NSF) award 1616974 and the NKFI NN 114560 grant of Hungary. BQ acknowledges financial support from CNPq-Brazil under the process number 205459/2014-5. AZV acknowledges financial support from CNPq. AM thanks NSF for partial support through grants AST-0909182, AST-1313422, AST-1413600, and AST-1518308. This work has made use of the computing facilities of the Laboratory of Astroinformatics (IAG/USP, NAT/Unicsul), whose purchase was made possible by the Brazilian agency FAPESP (grant 2009/54006-4) and the INCT-A.
This work was partly supported by the Center for Advanced Computing and Data Systems (CACDS) and by the Texas Institute for Measurement, Evaluation, and Statistics (TIMES) at the University of Houston. This project has been supported by a Marie Sklodowska-Curie Innovative Training Network Fellowship of the European Commission's Horizon 2020 Programme under contract number 675440 AMVA4NewPhysics. The Cosmostatistics Initiative (COIN) is a non-profit organization whose aim is to nourish the synergy between the astrophysics, cosmology, statistics, and machine learning communities. This work benefited from the following collaborative platforms: Overleaf, GitHub, and Slack.

Attached Files

Published - sty3015.pdf

Submitted - 1804.03765.pdf



Additional details

August 19, 2023
October 20, 2023