Tails: Chasing Comets with the Zwicky Transient Facility and Deep Learning

We present Tails, an open-source deep-learning framework for the identification and localization of comets in the image data of the Zwicky Transient Facility (ZTF), a robotic optical time-domain survey currently in operation at the Palomar Observatory in California, USA. Tails employs a custom EfficientDet-based architecture and is capable of finding comets in single images in near real time, rather than requiring multiple epochs as with traditional methods. The system achieves state-of-the-art performance with 99% recall, 0.01% false positive rate, and 1-2 pixel root mean square error in the predicted position. We report the initial results of the Tails efficiency evaluation in a production setting on the data of the ZTF Twilight survey, including the first AI-assisted discovery of a comet (C/2020 T2) and the recovery of a comet (P/2016 J3 = P/2021 A3).


INTRODUCTION
Comets have mesmerized humans for millennia, frequently offering some of the most spectacular sights in the night sky. Containing the original materials from when the Solar System first formed, comets provide a unique insight into the distant past of our Solar System. The recent discovery of the first interstellar comet, 2I/Borisov, by amateur astronomer Gennadiy Borisov predictably sparked much excitement and enthusiasm among astronomers and the general public alike (e.g., Bolin et al. 2020; Fitzsimmons et al. 2019; Guzik et al. 2020). Such objects could potentially provide important information on the formation of other stellar systems. It is a very exciting time to look for comets: the large-scale time-domain surveys currently in operation, such as ZTF (Bellm et al. 2019a; Graham et al. 2019), Pan-STARRS (Chambers et al. 2016), or ATLAS, and the upcoming ones, such as BlackGEM (Bloemen et al. 2016) and the Vera Rubin Observatory / LSST (Ivezić et al. 2008), offer the richest data sets ever available to mine for comets.

Corresponding author: Dmitry A. Duev (duev@caltech.edu)
Traditional comet detection algorithms rely on multiple observations of cometary objects that are linked together and used to fit an orbital solution. To the best of our knowledge, previous attempts to take the comet's morphology in the optical image data into consideration in the detection algorithms have not led to reliable and robust results.

(Preprint: arXiv:2102.13352v1 [astro-ph.IM], 26 Feb 2021)
In this work, we present Tails, a state-of-the-art deep-learning-based system for the identification and localization of comets in the image data of ZTF. Tails employs an EfficientDet-based architecture and is thus capable of finding comets in single images in near real time, rather than requiring multiple epochs as with traditional methods.
The Tails code is open-source and can be found in the "dmitryduev/tails" repository on GitHub. The version of the code aligned with this publication is archived on Zenodo at doi:10.5281/zenodo.4563226.

The Zwicky Transient Facility
The Zwicky Transient Facility (ZTF) is a state-of-the-art robotic time-domain sky survey capable of visiting the entire visible sky north of −30° declination every night. ZTF observes the sky in the g, r, and i bands at different cadences depending on the scientific program and sky region (Bellm et al. 2019a; Graham et al. 2019). The 576-megapixel camera with a 47 deg² field of view, installed on the Samuel Oschin 48-inch (1.2-m) Schmidt Telescope, can scan more than 3750 deg² per hour, to a 5σ detection limit of 20.7 mag in the r band with a 30-second exposure during new moon (Masci et al. 2019a; Dekany et al. 2020).
The ZTF Partnership has been running a specialized survey, the Twilight Survey (ZTF-TS), that operates at solar elongations down to 35 degrees with an r-band limiting magnitude of 19.5 (Ye et al. 2020; Bellm et al. 2019b). ZTF-TS has so far resulted in the discovery of a number of Atira asteroids (with orbits interior to the Earth's) as well as the first inner-Venus object, 2020 AV2. Motivated by this success, ZTF-TS will be expanded in Phase II of the project, which commenced in December 2020.
Comets become more easily detectable when close to the Sun as they become brighter and start exhibiting more pronounced coma and tails. Furthermore, it has been shown that the most detectable direction of approach of an interstellar object is from directly behind the Sun because of observational selection effects (Jedicke et al. 2016) and the fact that this direction has a greater cross section for asteroids to bend around and pass into the visibility volume (Engelhardt et al. 2017;Do et al. 2018).
Tails automates the search for comets with detectable morphology. While trained and evaluated on a large corpus of ZTF data, in this work we focus on Tails' performance when applied to the ZTF-TS data.

TAILS: A DEEP LEARNING FRAMEWORK FOR THE IDENTIFICATION AND LOCALIZATION OF COMETS
Deep learning (DL) is a subset of machine learning that employs artificial many-layer neural networks (McCulloch & Pitts 1943). DL systems are able to discover, in a highly automated manner, efficient representations of the data, simplifying the task of finding the meaningful sought-after patterns in them. We refer the reader to the excellent introduction to DL given in Géron (2019).
DL systems often reach near-optimal performance for a given task and are able to learn even very complicated, highly non-linear mappings between the input and output spaces. The art of building applied DL systems involves two major challenges: finding a suitable network architecture and, more importantly, constructing a large, labeled, representative data set for the network training. In the case of comet detection, the training set must reflect the possible variations across different seeing conditions, filters, sky location, CCDs, and include data artifacts caused by, for example, cross-talk or telescope reflections.

Data set
To build a seed sample for labeling, we first identified all potential observations of known comets conducted with ZTF from March 5, 2018 to March 4, 2020, based on their predicted positions and brightness. The code for accomplishing that is based on the Python libraries pypride (Duev et al. 2016) and solarsyslib (Jensen-Clem et al. 2018) and uses the comet ephemerides obtained from the Minor Planet Center (MPC) for a coarse search, followed by a JPL Horizons (Giorgini et al. 1996) query for precision.
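The coarse-search step can be sketched as follows. The data structures, field names, and field-of-view radius below are illustrative assumptions for this sketch, not the actual pypride/solarsyslib interfaces:

```python
import numpy as np

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine; inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    h = (np.sin((dec2 - dec1) / 2) ** 2
         + np.cos(dec1) * np.cos(dec2) * np.sin((ra2 - ra1) / 2) ** 2)
    return np.degrees(2 * np.arcsin(np.sqrt(h)))

def coarse_matches(exposures, ephemerides, fov_radius_deg=5.5):
    """Flag (exposure, comet) pairs whose predicted position falls within a
    circle circumscribing the exposure footprint (ZTF's field of view is
    ~47 deg², so a ~5.5 deg circumradius is a conservative assumption)."""
    hits = []
    for exp in exposures:
        for eph in ephemerides:
            sep = angular_sep_deg(exp["ra"], exp["dec"], eph["ra"], eph["dec"])
            if sep < fov_radius_deg:
                hits.append((exp["id"], eph["comet"]))
    return hits
```

Pairs surviving this cut would then be refined with a per-object JPL Horizons query, as described above.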
To provide more contextual information, the epochal image data are supplemented by properly aligned reference images of the corresponding patches of sky and difference (epochal minus reference) images generated with the ZOGY algorithm (Zackay et al. 2016), all produced by the ZTF Science Data System at Caltech's IPAC (Masci et al. 2019a). Finally, we generate image triplet cutouts of size 256 by 256 pixels, which in angular measure translates into 4.3′ by 4.3′ at ZTF's pixel scale of 1.01″/pixel.
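A minimal sketch of the triplet-cutout assembly; the per-channel min-max normalization is an assumption made for illustration, not necessarily the exact preprocessing used by Tails:

```python
import numpy as np

def make_triplet(science, reference, difference, x, y, size=256):
    """Stack aligned science/reference/difference cutouts into a
    (size, size, 3) array; (x, y) is the cutout center in pixels."""
    half = size // 2
    tri = np.zeros((size, size, 3), dtype=np.float32)
    for k, img in enumerate((science, reference, difference)):
        cut = img[y - half:y + half, x - half:x + half]
        # per-channel min-max normalization to [0, 1] (assumed convention)
        lo, hi = cut.min(), cut.max()
        tri[..., k] = (cut - lo) / (hi - lo) if hi > lo else 0.0
    return tri
```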
We selected over 60,000 individual observations with the total comet magnitude ranging from 10 to 23 (as reported by JPL Horizons; see Fig. 1), out of which about 20,000 were sourced for manual annotation. This resulted in an initial sample of 3,000 examples with identifiable morphology.
We also compiled a set of approximately 20,000 negative examples consisting of point-like cometary detections, patches of sky with no identified transient or variable sources, CCD-edge cases, and a wide range of real (point-source) transient and bogus (e.g. artifacts due to bright stars, optical ghosts and "dementors") samples from the Braai data set (Duev et al. 2019).
To expand the data set, we then assembled a stan-

This architecture delivers best-in-class object detection efficiency and performance across a wide range of resource constraints. This is achieved by using EfficientNet, state-of-the-art backbone networks for feature extraction; a weighted bi-directional feature pyramid network (BiFPN), which allows easy and fast multi-scale feature fusion; and a compound scaling method that simultaneously and uniformly scales the resolution, depth, and width for all backbone, feature, and location/class prediction networks.
The use of a BiFPN, which effectively represents and processes multi-scale features, makes this architecture particularly well-suited for the problem of morphologybased comet identification and localization.
A batch of triplet image stacks of size (n_b, 256, 256, 3), where n_b is the number of stacks in the batch, is passed through an EfficientNet B0 backbone. The extracted features from the last five blocks/levels of the network are passed through the BiFPN. The resulting five output tensors, denoted by colored circles in Fig. 3, are fed into the head network, which outputs the probability p_c of the image containing a comet and the predicted relative (x, y) position of its centroid.
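To make the head's inputs and outputs concrete, here is a toy stand-in: it globally pools each of the five BiFPN output tensors, concatenates them, and maps the result to p_c and (x, y). The dense mapping and all names are illustrative assumptions; the actual Tails head is a learned network:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def head(features, w_cls, w_pos):
    """Toy head: list of five (n_b, h, w, c) feature maps -> comet
    probability p_c (n_b, 1) and relative centroid (x, y) (n_b, 2),
    all squashed into [0, 1]."""
    # global average pooling over spatial axes, then channel-wise concat
    pooled = np.concatenate([f.mean(axis=(1, 2)) for f in features], axis=-1)
    p_c = sigmoid(pooled @ w_cls)   # (n_b, 1) comet probability
    xy = sigmoid(pooled @ w_pos)    # (n_b, 2) relative (x, y) position
    return p_c, xy
```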
We defined the loss function as follows:

L = w_c L_c + w_p L_p,

where L_c denotes the binary cross-entropy function for the label c (1: there is a comet in the image; 0: there is no comet) and the predicted probability p_c, and w_c and w_p denote the weights of the two terms, respectively. If c = 1, L_p is computed as an L1 loss for the relative position (x, y) and its prediction (x_p, y_p), with a small L2 regularizing term (with ε = 10^−3):

L_p = |x − x_p| + |y − y_p| + ε [(x − x_p)² + (y − y_p)²].

We note that standard object detection algorithms typically output bounding boxes and corresponding object class probabilities, i.e. sets of (4 + n_classes) numbers. Our approach allowed us to simplify the head network architecture and to both simplify and speed up the assembly of the training data set, bypassing the unnecessary complexity and potential inaccuracy of drawing bounding boxes around known comet detections.

We employed the Adam optimizer (Kingma & Ba 2014), a batch size of 32, and an 81%/9%/10% training/validation/test data split. For data augmentation, we applied random horizontal and vertical flips of the input data; no random rotations or translations were added. We note that the test/validation sets did not contain augmented data from the training set. We used standard techniques to maximize training performance: if no improvement in validation loss was observed for 10 epochs, the learning rate was reduced by a factor of 2, and training was stopped early if no improvement was observed for 30 epochs.
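The two-term loss can be sketched in numpy. The exact shape of the L2 regularizing term is an assumption consistent with the description (ε = 10^−3), and the default weights correspond to the fine-tuning stage (w_c = 1.1, w_p = 1):

```python
import numpy as np

def tails_loss(c, p_c, xy, xy_pred, w_c=1.1, w_p=1.0, eps=1e-3):
    """Weighted binary cross-entropy on the comet probability plus, for
    positive examples only (c == 1), an L1 positional loss with a small
    L2 regularizing term."""
    p = np.clip(p_c, 1e-7, 1 - 1e-7)           # numerical safety for log
    l_c = -(c * np.log(p) + (1 - c) * np.log(1 - p))
    d = np.asarray(xy) - np.asarray(xy_pred)
    l_p = c * (np.abs(d).sum() + eps * (d ** 2).sum())  # zero when c == 0
    return w_c * l_c + w_p * l_p
```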
The EfficientNet's weights were randomly initialized. We first set w_c = 10, w_p = 1 to allow for a fast convergence of the feature-extracting part of the network. To fine-tune the performance, we trained Tails on a balanced data set with w_c = 1.1, w_p = 1 and monitored the validation loss for early stopping, then bumped w_p to 2 and monitored the validation positional RMSE; finally, we added the omitted negative examples and again monitored the validation loss for early stopping.
The resulting classifiers were put through the same active-learning-like procedure as was employed in the initial data set assembly, using several months of ZTF Twilight survey data.

TAILS PERFORMANCE
Evaluated on the test set, with a score threshold of p_c = 0.5, Tails demonstrates false positive and false negative rates (FPR and FNR) of 1.7%, and a ∼1-2 pixel median RMSE of the predicted comet "centroid" position versus that acquired from JPL Horizons (see Fig. 4).
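The threshold-dependent rates plotted in Fig. 4 can be computed as follows (a generic sketch; the function and variable names are illustrative):

```python
import numpy as np

def fpr_fnr(labels, scores, threshold=0.5):
    """Empirical false positive and false negative rates at a score
    threshold, given binary labels and predicted scores p_c."""
    labels = np.asarray(labels, dtype=bool)
    pred = np.asarray(scores) >= threshold
    fpr = (pred & ~labels).sum() / max((~labels).sum(), 1)
    fnr = (~pred & labels).sum() / max(labels.sum(), 1)
    return fpr, fnr
```

Sweeping the threshold over a grid and finding where the two curves cross recovers the balance point quoted above.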
The ZTF instrument's CCD mosaic has 16 individual 6k × 6k science CCDs. The raw ZTF image data are split into four readout quadrants per CCD, and all processing is conducted independently on each CCD readout quadrant. We tessellate each 3k × 3k CCD-quadrant image into a 13 × 13 grid of overlapping 256 × 256 pixel tiles and evaluate Tails on those. Tails has been deployed in production since late June 2020. We have implemented a "sentinel" service that processes the incoming data in real time and posts plausible candidates to Fritz, the ZTF Phase II open-source science data platform (van der Walt et al. 2019; Duev et al. 2019; Kasliwal et al. 2019), for further manual inspection and vetting. The candidates are auto-annotated with detailed information on the detection, such as the score, CCD and sky positions, and cross-matches with known Solar System objects. Fig. 5 shows screenshots of the Fritz user interfaces used in the process.
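The tessellation step can be sketched as follows; evenly spaced integer tile origins are an assumption about how the overlap is distributed across the quadrant:

```python
import numpy as np

def tile_origins(width, tile=256, n=13):
    """Evenly spaced integer origins of n overlapping tiles covering
    [0, width), with the first tile at 0 and the last ending at width."""
    return np.linspace(0, width - tile, n).astype(int)

def tessellate(image, tile=256, n=13):
    """Cut a CCD-quadrant image into an n x n grid of overlapping tiles,
    returning (y_origin, x_origin, tile_pixels) triples."""
    ys = tile_origins(image.shape[0], tile, n)
    xs = tile_origins(image.shape[1], tile, n)
    return [(y, x, image[y:y + tile, x:x + tile]) for y in ys for x in xs]
```

Each of the 169 tiles is then scored by Tails independently, and per-tile centroid predictions are mapped back to quadrant coordinates via the tile origins.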
It takes about 5 hours to run inference on a typical set of nightly ZTF Twilight data (∼ 45 30-second exposures) on an e2-highcpu-32 virtual machine instance (32 vCPU, 32 GB memory, SSD disk) on the Google Cloud Platform, including I/O operations.
Consistent with the expected rate of comet observations, a typical run on nightly Twilight data yields a few dozen candidates, which, given the typical number of processed tiles, corresponds to an empirical false positive rate (FPR) of about 0.01%. The scanning results are accumulated and used to expand the training set and improve Tails' performance.
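A back-of-the-envelope check of that figure, using the numbers quoted in this section (the per-night candidate count below is an assumed illustrative value for "a few dozen"):

```python
# ~45 exposures/night, 64 CCD readout quadrants each, 13 x 13 = 169 tiles/quadrant
n_tiles = 45 * 64 * 169
false_positives = 48  # "a few dozen" candidates per night (assumed value)
fpr = false_positives / n_tiles
print(f"{n_tiles} tiles/night -> FPR ~ {fpr:.4%}")
```

This lands on the order of 1e-4, i.e. the ~0.01% quoted above.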
We have evaluated Tails' performance on a random sample of 200 observations of known comets with identifiable morphology in July-August 2020 and found an empirical recall value of 99%. Fig. 6 shows a number of comet candidates not from the training set identified by Tails, including some of the ZTF observations of the comet 2I/Borisov. Optical artifacts resembling cometary objects are the main source of contamination.

Discovery of comet C/2020 T2
On October 7, 2020, Tails discovered a candidate that was posted to MPC's Possible Comet Confirmation Page (PCCP) as ZTFDD01 (see Fig. 8). It was later confirmed to be a long-period comet and designated C/2020 T2 (Palomar), marking the first DL-assisted comet discovery (Duev 2020). The candidate was found in the Twilight survey data at 19.3 mag in the ZTF r band. The FWHM of the object was approximately 2.5″-3″, compared to ∼2″ for nearby background stars. The object showed a tail extending up to 5″ in the westward direction. Table 1 summarizes the orbital elements of C/2020 T2 provided by the MPC, and Fig. 7 shows its orbit as of the discovery date.
To determine if Tails could have discovered C/2020 T2 before October 7, 2020, we searched the ZTF archive for all Twilight Survey data covering the ephemeris position of the comet with the ZChecker software (Kelley et al. 2019). Eleven nights of data were found between 2020 June 11-20 (evening twilight) and October 7-21 (morning twilight). The comet was in conjunction with the Sun between the two sets, and not observable by ZTF. We measured the brightness of the coma in 4-pixel-radius apertures, and aperture-corrected the photometry according to the ZTF pipeline documentation. The data are shown in Fig. 9. Typical seeing was 2″. In June, the comet was very faint (r = 20.2 mag), near the single-image detection limit (r = 20.4-20.9 mag, 5σ point source), and had no morphological features for Tails to pick up. Thus October 7, 2020 was really the first opportunity for Tails to discover the comet.

Figure 4(a). False positive rate (FPR) and false negative rate (FNR) as a function of score p_c. FPR and FNR balance out at around 1.7% for a score threshold of 0.5.

Figure 5(a). Candidate scanning page. The users can inspect the candidates and save vetted objects to one or more groups. Candidates that are not saved to any group within 7 days are removed from Fritz.

Figure 5(b). Source page. It aggregates and displays in an interactive manner all kinds of information related to an object that exists on Fritz, such as photometry, spectroscopy, auto-annotations, comments, finder charts, follow-up requests, and other data.

Recovery of comet P/2016 J3 = P/2021 A3 (STEREO)
A comet candidate was identified by a combination of Tails and the ZTF Moving Object Detection Engine (Masci et al. 2019b) on 2021 January 04 UTC and submitted to the PCCP as ZTF0Ion (see Fig. 10). It was later identified as a recovery of comet P/2016 J3 (STEREO) and given the designation P/2016 J3 = P/2021 A3 (STEREO) (Bolin 2021). P/2021 A3 was identified in the evening Twilight survey data at r = 19.3 mag with a clearly extended appearance, scoring 0.9, with a coma ∼10″ wide and a tail extending past 20″ in the northeast direction.

DISCUSSION
This work demonstrates the potential of state-of-the-art deep-learning computer-vision architecture designs when applied to the problem of astronomical source detection and localization, with a specific focus on comets.
We experimented with the input data and trained a version of Tails that uses duplets (epochal/reference images, omitting the ZOGY difference images) instead of triplet image stacks. Our tests show that this version achieves essentially the same performance as the one trained on triplets without requiring image differencing, expanding the range of potential use cases of Tails. While Tails is trained only on ZTF data, with transfer learning it can be adapted to other sky surveys, including the upcoming Vera Rubin Observatory's Legacy Survey of Space and Time (LSST) (Ivezić et al. 2008).

This research has made use of data and/or services provided by the International Astronomical Union's Minor Planet Center.

Figure 8. Discovery image of C/2020 T2 (Palomar), the first DL-assisted comet discovery by Tails. Taken on October 7, 2020 with the ZTF camera on the 48-inch Schmidt telescope at Palomar. The left pane shows the epochal science exposure (256x256 pix cutout), the middle pane the reference image, and the right pane the ZOGY difference image. East is to the left, north is down.

Figure 9. Photometry of comet C/2020 T2 (Palomar) derived from ZTF-TS images (r band) versus time from perihelion. A best-fit model lightcurve is also shown: r = 9.85 + 9.54 log10(r_h) + 5 log10(Δ) − Φ(θ), where r_h is the heliocentric distance in au, Δ is the comet-observer distance in au, and Φ(θ) is the phase-angle correction from Schleicher et al. (1998). T_p denotes the time of perihelion passage (July 11, 2021).

Figure 10. The left pane shows the epochal science exposure (256x256 pix cutout), the middle pane the reference image, and the right pane the ZOGY difference image. East is to the left, north is down.
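The best-fit model lightcurve quoted in the Fig. 9 caption can be evaluated directly. The phase-angle correction Φ(θ) of Schleicher et al. (1998) is not tabulated here, so this sketch takes it as a precomputed input:

```python
import math

def model_r_mag(r_h, delta, phase_corr=0.0):
    """Best-fit r-band model for C/2020 T2 (Fig. 9 caption):
    r = 9.85 + 9.54 log10(r_h) + 5 log10(delta) - Phi(theta).
    r_h: heliocentric distance (au); delta: comet-observer distance (au);
    phase_corr: Schleicher et al. (1998) phase-angle correction in mag,
    passed in precomputed for simplicity."""
    return 9.85 + 9.54 * math.log10(r_h) + 5.0 * math.log10(delta) - phase_corr
```

At r_h = Δ = 1 au and zero phase correction the model reduces to the constant term, r = 9.85 mag, and the comet fades logarithmically with both distances.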