The ZTF Source Classification Project. I. Methods and Infrastructure

van Roestel, Jan and Duev, Dmitry A. and Mahabal, Ashish A. and Coughlin, Michael W. and Mróz, Przemek and Burdge, Kevin and Drake, Andrew and Graham, Matthew J. and Hillenbrand, Lynne and Bellm, Eric C. and Kupfer, Thomas and Delacroix, Alexandre and Fremling, C. and Golkhou, V. Zach and Hale, David and Laher, Russ R. and Masci, Frank J. and Riddle, Reed and Rosnet, Philippe and Rusholme, Ben and Smith, Roger and Soumagnac, Maayane T. and Walters, Richard and Prince, Thomas A. and Kulkarni, S. R. (2021) The ZTF Source Classification Project. I. Methods and Infrastructure. Astronomical Journal, 161 (6). Art. No. 267. ISSN 1538-3881. doi:10.3847/1538-3881/abe853.

The Zwicky Transient Facility (ZTF) has been observing the entire northern sky since the start of 2018 down to a magnitude of 20.5 (5σ for 30 s exposure) in the g, r, and i filters. Over the course of two years, ZTF has obtained light curves of more than a billion sources, each with 50–1000 epochs per light curve in g and r, and fewer in i. To be able to use the information contained in the light curves of variable sources for new scientific discoveries, an efficient and flexible framework is needed to classify them. In this paper, we introduce the methods and infrastructure that will be used to classify all ZTF light curves. Our approach aims to be flexible and modular and allows the use of a dynamical classification scheme and labels, continuously evolving training sets, and the use of different machine-learning classifier types and architectures. With this setup, we are able to continuously update and improve the classification of ZTF light curves as new data become available, training samples are updated, and new classes need to be incorporated.
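The modularity described above can be illustrated with a minimal sketch. It is not the project's actual code. It assumes, purely for illustration, that each label is served by its own independent classifier behind a common interface, so that labels and models can be added or swapped without touching the rest. The names `ClassifierRegistry`, `amplitude_threshold`, and the two toy labels are hypothetical, and the "models" are simple threshold functions standing in for real machine-learning classifiers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Sequence

# Hypothetical interface: a classifier maps a feature vector to a score in [0, 1].
Classifier = Callable[[Sequence[float]], float]


@dataclass
class ClassifierRegistry:
    """Toy registry holding one independent classifier per label (an assumption)."""
    classifiers: Dict[str, Classifier] = field(default_factory=dict)

    def register(self, label: str, clf: Classifier) -> None:
        # Adding or replacing one label leaves all other classifiers untouched.
        self.classifiers[label] = clf

    def classify(self, features: Sequence[float]) -> Dict[str, float]:
        # Score the same feature vector with every registered classifier.
        return {label: clf(features) for label, clf in self.classifiers.items()}


def amplitude_threshold(threshold: float) -> Classifier:
    """Stand-in model: scores 1.0 if features[0] (amplitude) exceeds the threshold."""
    def clf(features: Sequence[float]) -> float:
        return 1.0 if features[0] > threshold else 0.0
    return clf


registry = ClassifierRegistry()
registry.register("variable", amplitude_threshold(0.1))
registry.register("periodic", lambda f: 1.0 if f[1] > 0.5 else 0.0)

# Hypothetical features: [amplitude, periodicity strength]
scores = registry.classify([0.3, 0.2])

# Later, a retrained or different model can replace an existing label's classifier.
registry.register("variable", amplitude_threshold(0.5))
updated_scores = registry.classify([0.3, 0.2])
```

In this sketch, re-registering a label models the idea of updating a classifier as training sets evolve, and registering a new label models the addition of a new class.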

Facility: ZTF.

Software: astropy (Astropy Collaboration et al. 2018), keras (Chollet & Others 2015), keras-tuner (O'Malley et al. 2019), kowalski (Duev et al. 2019), matplotlib (Hunter 2007), numpy (van der Walt et al. 2011), pandas (pandas development team 2020), tensorflow (Abadi et al. 2015), xgboost (Chen & Guestrin 2016).
