arXiv:2111.05929v3 [astro-ph.IM] 30 Nov 2021

Draft version December 1, 2021

Typeset using LaTeX twocolumn style in AASTeX63

COMAP Early Science: III. CO Data Processing

Marie K. Foss

Håvard T. Ihle

Jowita Borowska

Kieran A. Cleary

Hans Kristian Eriksen

Stuart E. Harper

Junhan Kim

James W. Lamb

Jonas G. S. Lunde

Liju Philip

Maren Rasmussen

Nils-Ole Stutzer

Bade D. Uzgil

Duncan J. Watts

Ingunn K. Wehus

David P. Woody

J. Richard Bond

Patrick C. Breysse

Morgan Catha

Sarah E. Church

Dongwoo T. Chung

Clive Dickinson

Delaney A. Dunne

Todd Gaier

Joshua Ott Gundersen

Andrew I. Harris

Richard Hobbs

Charles R. Lawrence

Norman Murray

Anthony C. S. Readhead

Hamsa Padmanabhan

Timothy J. Pearson

Thomas J. Rennie

(COMAP Collaboration)

Institute of Theoretical Astrophysics, University of Oslo, P.O. Box 1029 Blindern, N-0315 Oslo, Norway

California Institute of Technology, Pasadena, CA 91125, USA

Jodrell Bank Centre for Astrophysics, Alan Turing Building, Department of Physics and Astronomy, School of Natural Sciences, The

University of Manchester, Oxford Road, Manchester, M13 9PL, U.K.

Owens Valley Radio Observatory, California Institute of Technology, Big Pine, CA 93513, USA

Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109, USA

California Institute of Technology, 1200 E. California Blvd., Pasadena, CA 91125, USA

Canadian Institute for Theoretical Astrophysics, University of Toronto, 60 St. George Street, Toronto, ON M5S 3H8, Canada

Center for Cosmology and Particle Physics, Department of Physics, New York University, 726 Broadway, New York, NY, 10003, USA

Kavli Institute for Particle Astrophysics and Cosmology & Physics Department, Stanford University, Stanford, CA 94305, USA

Dunlap Institute for Astronomy and Astrophysics, University of Toronto, 50 St. George Street, Toronto, ON M5S 3H4, Canada

Department of Physics, University of Miami, 1320 Campo Sano Avenue, Coral Gables, FL 33146, USA

Department of Astronomy, University of Maryland, College Park, MD 20742, USA

Département de Physique Théorique, Université de Genève, 24 Quai Ernest-Ansermet, CH-1211 Genève 4, Switzerland

ABSTRACT

We describe the first season COMAP analysis pipeline that converts raw detector readouts to cali-

brated sky maps. This pipeline implements four main steps: gain calibration, filtering, data selection,

and map-making. Absolute gain calibration relies on a combination of instrumental and astrophys-

ical sources, while relative gain calibration exploits real-time total-power variations. High efficiency

filtering is achieved through spectroscopic common-mode rejection within and across receivers, result-

ing in nearly uncorrelated white noise within single-frequency channels. Consequently, near-optimal

but biased maps are produced by binning the filtered time stream into pixelized maps; the corre-

sponding signal bias transfer function is estimated through simulations. Data selection is performed

automatically through a series of goodness-of-fit statistics, including $\chi^2$ and multi-scale correlation

tests. Applying this pipeline to the first-season COMAP data, we produce a dataset with very low lev-

els of correlated noise. We find that one of our two scanning strategies (the Lissajous type) is sensitive

to residual instrumental systematics. As a result, we no longer use this type of scan and exclude data

taken this way from our Season 1 power spectrum estimates. We perform a careful analysis of our data

processing and observing efficiencies and take account of planned improvements to estimate our future

performance. Power spectrum results derived from the first-season COMAP maps are presented and

discussed in companion papers.

1. INTRODUCTION

Corresponding author: Marie K. Foss

m.k.foss@astro.uio.no

Understanding the evolution of galaxies and the in-

tergalactic medium (IGM) over the largest spatial and

temporal scales is one of the principal goals of cosmol-

ogy. Galaxy surveys address this challenge by resolving

and detecting individual galaxies, a technique that necessarily favors brighter galaxies and smaller cosmic volumes. Spectral line intensity mapping (LIM) (Madau

et al. 1997; Battye et al. 2004; Peterson et al. 2006;

Loeb & Wyithe 2008) is a complementary technique (see

Kovetz et al. 2017 or Kovetz et al. 2019 for a review)

that holds the potential to characterize the global prop-

erties of galaxies and their evolution by surveying the

aggregate emission from all galaxies over large volumes.

This technique uses redshifted line emission (e.g., 21-cm, Ly$\alpha$, CO, or C II) as a tracer for the underlying density field. Large volumes along a given line-of-sight may

be surveyed simultaneously with a single spectrometer

at relatively low spatial resolution, and by scanning this

spectrometer across the sky a full 3D density map may

be derived. Despite multiple different modeling efforts

(Righi et al. 2008; Visbal & Loeb 2010; Lidz et al. 2011;

Pullen et al. 2013; Breysse et al. 2014; Li et al. 2016;

Padmanabhan 2018; Moradinezhad Dizgah & Keating

2019; Sun et al. 2019; Yang et al. 2021; Moradinezhad

Dizgah et al. 2021; Chung et al. 2021a) and significant

progress on the observational front (Keating et al. 2016;

Riechers et al. 2018; Keating et al. 2020; Keenan et al.

2021), the overall level of the CO signal, especially in

the clustering regime, is still unknown.

The CO Mapping Array Project (COMAP; Cleary

et al. 2021) is an intensity mapping experiment that

aims to use emission from carbon monoxide (CO) to

trace the aggregate properties of galaxies over cosmic

time, back to the Epoch of Reionization. A Pathfinder

experiment, consisting of a 19-feed 26–34 GHz receiver,

has been fielded on a 10.4 m single-dish telescope at the

Owens Valley Radio Observatory (OVRO). In this frequency range, the receiver is sensitive to CO(1–0) at $z = 2.4$–3.4, with a fainter contribution from CO(2–1) at $z = 6$–8. The main goal of the Pathfinder is to detect

the CO(1–0) signal and use it to constrain the properties

of galaxies at the Epoch of Galaxy Assembly. A future

phase will add a second receiver at 12–20 GHz in order to

detect CO(1–0) from around $z = 5$–9, cross-correlating

with the CO(2–1) signal from the 26–34 GHz receiver

and constraining the properties of galaxies towards the

end of the Epoch of Reionization.
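The redshift ranges quoted for each band follow from $z = \nu_{\rm rest}/\nu_{\rm obs} - 1$. The following is a minimal sketch of that conversion; the function names are illustrative and not part of the COMAP software, and the rest frequencies are rounded laboratory values for the CO(1–0) and CO(2–1) transitions.

```python
# Rounded laboratory rest frequencies of the two CO transitions, in GHz.
CO_1_0_GHZ = 115.271
CO_2_1_GHZ = 230.538


def redshift(rest_ghz, obs_ghz):
    """Redshift at which a line emitted at rest_ghz is observed at obs_ghz."""
    return rest_ghz / obs_ghz - 1.0


def redshift_range(rest_ghz, band_low_ghz, band_high_ghz):
    """Return (z_min, z_max) for a band; the upper band edge maps to z_min."""
    return redshift(rest_ghz, band_high_ghz), redshift(rest_ghz, band_low_ghz)


if __name__ == "__main__":
    cases = [
        ("CO(1-0), 26-34 GHz", CO_1_0_GHZ, 26.0, 34.0),
        ("CO(2-1), 26-34 GHz", CO_2_1_GHZ, 26.0, 34.0),
        ("CO(1-0), 12-20 GHz", CO_1_0_GHZ, 12.0, 20.0),
    ]
    for label, rest, low, high in cases:
        z_min, z_max = redshift_range(rest, low, high)
        print(f"{label}: z = {z_min:.2f} to {z_max:.2f}")
```

The three cases evaluate to roughly $z = 2.4$–3.4, 5.8–7.9, and 4.8–8.6, consistent with the rounded ranges given in the text.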

The receiver’s detector chain is based on cryogeni-

cally cooled HEMT low-noise amplifiers (LNA) which

contribute to a typical system temperature of about

44 K across the full frequency range. The predicted

signal from high-redshift CO emission is expected to

be no more than a few microkelvin per COMAP spa-

tial/spectral resolution element (or “voxel”). Thus, the

raw instrumental noise must be reduced by many orders

of magnitude before a statistically significant detection

may be achieved. In practice, this is done by repeatedly

observing the same part of the sky using multiple de-

tectors, and thereby gradually increasing the sensitivity

per voxel. For this to succeed, however, it is necessary

to suppress systematic contributions from atmospheric

temperature variations, sidelobe contamination, ground

pickup, standing waves, Galactic foregrounds, etc. by a

corresponding amount.

The first season COMAP science observations started

in June 2019 and lasted until August 2020. This pa-

per describes the first season COMAP data analysis

pipeline, which aims to produce clean maps from raw

time-ordered COMAP observations. This includes cali-

bration, data selection, filtering, and map-making. The

rest of this paper is organized as follows: First, in or-

der to establish useful notation and conventions, we give

a brief introduction to the COMAP instrument in Sec-

tion 2, while referring the interested reader to Lamb

et al. (2021) for full details. Next, we provide a high-

level overview of the analysis pipeline in Section 3.1,

before specifying each step in Sections 3.3–3.6. Data selection and efficiency are discussed in Sections 4 and 5.

The results are presented in Section 6, and we summa-

rize and conclude in Section 7.

2. INSTRUMENT AND DATA MODEL

Before describing the COMAP analysis pipeline, we

provide a brief overview of the instrument itself, and

define an explicit data model. A more detailed descrip-

tion of the instrument can be found in a separate paper

(Lamb et al. 2021).

2.1. Instrument overview

The COMAP Phase I instrument observes in the Ka band, at 26–34 GHz, and is located at the Owens Valley

Radio Observatory (OVRO) in California, USA. It is

mounted on a 10.4 m telescope that was originally built

for the Millimeter Array at OVRO, then used as a part

of the Combined Array for Research in Millimeter-wave

Astronomy (CARMA) experiment, and has now been

repurposed for COMAP. The telescope’s primary and

secondary reflectors have diameters of 10.4 m and 1.1 m,

respectively, and the beam FWHM is about 4.5 arcmin

at 30 GHz.

The receiver comprises 19 independent detector

chains, called “feeds”. The signal chain of each feed

consists of individual feed horns, polarizers, low noise

amplifiers, two stages of downconversion, frequency sep-

aration and digitization. For the observations described

in this paper, 15 feeds have a two-stage polarizer, two

feeds have a single-stage polarizer, and two feeds have

no polarizer. The digitization happens in two CASPER

“ROACH-2” FPGA-based spectrometers for each sig-

nal chain, giving us four 2 GHz-wide sidebands (SB),

Figure 1. Elevation of CO (pink/purple) and calibration (orange) fields as a function of Local Sidereal Time. [Axes: Local Sidereal Time [h] vs. Elevation [deg]; curves for Field 1, Field 2, Field 3, Tau A, Cas A, and Cyg A.]

each of which has 1024 frequency channels, resulting in

a native frequency resolution of approximately 2 MHz.

The two sidebands of each band (A and B) are labelled

“lower” (LSB) or “upper” (USB). For more details on

the instrument see Lamb et al. (2021).

To support frequent and accurate gain estimation,

COMAP employs an ambient temperature load that is

directly attached to the environmental shroud housing.

This “calibration vane” is automatically moved in front

of the feed horn array at the beginning and end of each

observation (each lasting for about one hour; see Sec-

tion 2.3), fully filling the field of view of each pixel. The

temperature of the calibration vane is monitored with

sensors, allowing the system temperature to be calcu-

lated and applied to calibrate the gain (see Section 3.4

for more details).

2.2. Field Selection

COMAP observes several parts of the sky. Table 1

lists all CO science fields and calibrators. In Figure 1

we plot the elevation of the CO and calibration fields as

a function of Local Sidereal Time, indicating when the

fields are available for observation. Figures 2 and 3 show

the position of the three CO fields observed by COMAP.

These were selected to maximize the observing efficiency, avoid bright 30 GHz point sources ($\gtrsim 1$ Jy), and overlap with the coverage of the Hobby-Eberly Telescope Dark Energy eXperiment (HETDEX; Hill et al. 2008; Gebhardt et al. 2021; Hill et al. 2021), a galaxy survey targeting Ly$\alpha$ emission from galaxies in the same redshift range. Although COMAP’s observing strategy has been designed to permit the direct detection of CO fluctuations from galaxies at $z = 2.4$–3.4, cross-correlation with a galaxy survey such as HETDEX can increase the detection significance by at least a factor of two (Chung et al. 2019; Silva et al. 2021) as well as provide validation for the origin of detected signal in galaxies at the target redshift.

Footnote: Since COMAP began observing, the boundaries of the HETDEX Spring field coverage have changed, with the result that one COMAP field no longer overlaps with the main HETDEX survey, although we hope to also fill in this field with additional HETDEX observations.

In addition to the main science fields, we are also conducting a survey of the Galactic plane covering longitudes $20^\circ < l < 220^\circ$, details of which can be found in

Rennie et al. (2021).

To facilitate calibration with astrophysical sources, we

observe a handful of radio sources, including Jupiter, the

supernova remnants Taurus A (TauA) and Cassiopeia A

(CasA), and the radio galaxy Cygnus A (CygA), all of

which are somewhat extended compared to the beam

except for Jupiter.

2.3. Observation Strategy

Telescope scans of the science fields follow a harmonic motion described by

$$\mathrm{az} = A \sin(at + \phi); \quad \mathrm{el} = B \sin(bt), \qquad (1)$$

where $A$ and $B$ are amplitude parameters that define the size of the field, the ratio $a/b$ determines the shape of the curve, and $\phi$ is a phase parameter. Two different scan types were used: “constant elevation scans” (CES; $B = 0$) and “Lissajous” (varying parameters), alternating between the two on a daily basis. At the start of a

scan, the telescope is positioned at the leading edge of

the field. The telescope then executes the scan while the

field drifts through the pattern. This typically takes 3–

10 minutes, after which the telescope is repointed to the

leading edge of the field again in preparation for the next

scan. An example of the scanning path for about one

hour of continuous observations with a Lissajous scan

and a CES is shown in Figure 4. Testing the relative

performance of the CES and Lissajous scanning strate-

gies in terms of final data quality is an important goal

of the first-season COMAP survey.
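To make the two scan types concrete, the sketch below evaluates the harmonic motion of Equation (1) on a regular time grid. It is an illustration only: the function names, the sampling interval, and all parameter values are hypothetical, and az and el are treated here as offsets from the scan center, which is an assumption rather than something stated in the text.

```python
import math


def scan_offsets(t, A, B, a, b, phi):
    """Azimuth and elevation offsets at time t for the harmonic scan pattern."""
    az = A * math.sin(a * t + phi)
    el = B * math.sin(b * t)
    return az, el


def scan_path(duration_s, dt_s, A, B, a, b, phi=0.0):
    """Sample the scan pattern every dt_s seconds over duration_s seconds."""
    n_samples = int(round(duration_s / dt_s)) + 1
    return [scan_offsets(k * dt_s, A, B, a, b, phi) for k in range(n_samples)]


if __name__ == "__main__":
    # Hypothetical parameters (degrees and rad/s), chosen only for illustration.
    ces = scan_path(60.0, 0.5, A=1.0, B=0.0, a=0.6, b=0.4)
    lissajous = scan_path(60.0, 0.5, A=1.0, B=0.5, a=0.6, b=0.4, phi=math.pi / 2)
    print("CES elevation offsets all zero:", all(el == 0.0 for _, el in ces))
    print("Lissajous elevation span:",
          min(el for _, el in lissajous), max(el for _, el in lissajous))
```

Setting $B = 0$ removes all elevation motion, which is why a CES avoids the elevation-dependent atmospheric signal that a Lissajous scan picks up.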

2.4. Data model

As described by Lamb et al. (2021), the COMAP detector readout for a single frequency channel may be modelled as

$$P_{\rm out} = k_{\rm B} G \Delta\nu T_{\rm sys}, \qquad (2)$$

where $k_{\rm B}$ is the Boltzmann constant, $G$ is the gain, $\Delta\nu$ is the bandwidth, and $T_{\rm sys}$ is the system temperature of


Table 1. COMAP fields and calibrators

Field Name   RA (J2000)   Dec (J2000)   Notes
Field 1      01:41:44.4   +00:00:00.0   CO science field - lies within the HETDEX Fall field
Field 2      11:20:00.0   +52:30:00.0   CO science field - lies within the HETDEX Spring field
Field 3      15:04:00.0   +55:00:00.0   CO science field
TauA         05:34:31.9   +22:00:52.2   Pointing calibrator - supernova remnant (Crab Nebula)
CasA         23:23:24.0   +58:48:54.0   Pointing calibrator - supernova remnant
CygA         19:59:28.4   +40:44:02.1   Pointing calibrator - radio galaxy
Jupiter      ...          ...           Pointing calibrator

Figure 2. The three CO fields observed by the telescope overplotted as contours with radii of ∼…°, centered at the field centers (in Galactic coordinates) (lon, lat) = (149°, −…°), (150°, …°) and (91°, …°) for Fields 1, 2 and 3 respectively, on top of the Planck LFI 30 GHz full-mission map (downloaded from the Planck Legacy Archive; Planck Collaboration et al. 2020).

the instrument. The system temperature may be further modeled as

$$T_{\rm sys} = T_{\rm receiver} + T_{\rm atmosphere} + T_{\rm ground} + T_{\rm CMB} + T_{\rm foregrounds} + T_{\rm CO}, \qquad (3)$$

where $T_{\rm receiver}$ is the effective noise temperature of the receiver, $T_{\rm atmosphere}$ is the noise contribution from the atmosphere, $T_{\rm ground}$ is ground pickup from far sidelobes, $T_{\rm CMB}$ is the contribution from the CMB, $T_{\rm foregrounds}$ are continuum foregrounds (typically from the Galaxy), and $T_{\rm CO}$ is the line emission signal from extragalactic CO, which is the main scientific target of the COMAP instrument.

Footnote: In this section we write all the contributions to $T_{\rm sys}$ in terms of their effective noise contribution, rather than any physical temperatures. See Section 3.4 for a definition of $T_{\rm sys}$ in terms of physical quantities.

To understand the challenges involved in measuring

the cosmological CO signal, it is instructive to consider

the order of magnitude and stability of each term in

Equation (3). The largest single contribution is that

of the receiver temperature, which is usually about 10–30 K. For the COMAP receiver, with HEMT LNA technology, this is very stable.

Figure 3. The three CO fields observed. The contours, illustrating the rough coverage of each field, have radii of ∼…°. In the left and right panels respectively we have drawn in the approximate coverage of the HETDEX Fall and Spring fields presented by Gebhardt et al. (2021). The map in the background is the same Planck LFI 30 GHz full-mission map (downloaded from the PLA, Planck Collaboration et al. 2020) as seen in Figure 2. [Axes: Right Ascension (J2000) vs. Declination (J2000); left panel: Field 1 and the HETDEX Fall field; right panel: Fields 2 and 3 and the HETDEX Spring field.]

Figure 4. Movement of the telescope boresight in azimuth and elevation for an observation employing Lissajous scans (top) and an observation employing CES (bottom). Both observations consist of 15 individual scans of Field 1. [Axes: Azimuth [degrees] vs. Elevation [degrees], and Time [minutes] vs. Elevation [degrees].]

The second-largest contribution is from the atmo-

sphere, which typically adds 15–25 K. This term varies

significantly on all time scales longer than a few sec-

onds, and depends on external conditions including ele-

vation, humidity, cloud coverage, ambient temperature

and wind speed. It is also strongly correlated between

detectors and frequencies, since all feeds observe through

essentially the same atmospheric column at any given

time; fortunately, the phase structures of the atmo-

spheric fluctuations are uncorrelated on long time scales.

Next, ground pickup typically accounts for 5–6 K, and

this term can be particularly problematic because it de-

pends sensitively on the instrument pointing: If a side-

lobe happens to straddle a strong signal gradient, such

as the horizon or the Sun, several mK variations may

be measured on very short timescales and with a time-

dependency that appears nearly sky synchronous.

The fourth term represents the CMB temperature

of 2.7 K, which is both isotropic and stationary, while

the fifth term represents astrophysical foregrounds, ex-

pected to contribute at most 1 mK; for instance syn-

chrotron, free-free, and dust emission from the Galaxy.

Although these are sky synchronous, and in principle

could confuse potential CO measurements, they also

have very smooth frequency spectra (Keating et al.

2015), and are therefore relatively easy to distinguish

from the cosmological CO signal, which varies rapidly

with frequency. An important potential exception is line

emission from other molecules redshifted to our band

from galaxies at other epochs. The hydrogen cyanide

(HCN) line is expected to be one of the brightest such

lines. Emission from HCN in galaxies towards our CO

fields at redshift $z = 1.6$–2.4 will appear in our frequency

range. However, this contribution is expected to be an

order of magnitude lower than that from CO (Chung

et al. 2017).

Finally, the cosmological CO line emission signal is

expected to account for $\mathcal{O}(1\,\mu{\rm K})$. Whether it is possible

to detect such a weak signal depends directly on the sta-

bility and sensitivity of the instrument. In this respect,

the fundamental quantity of interest is the overall noise

level of the experiment, which is dominated by random

thermal noise.

The magnitude of these random thermal fluctuations is proportional to $T_{\rm sys}$, with a standard deviation that is given by the so-called radiometer equation,

$$\sigma_N = \frac{T_{\rm sys}}{\sqrt{\Delta\nu\,\tau}}, \qquad (4)$$

where $\tau$ is the integration time. Thus, since both the system temperature and the bandwidth are essentially fixed experimental parameters, the only way of reducing the total uncertainty is by increasing the integration time. As a concrete and relevant example, we note that an integration time of 45 hours is required to achieve a standard deviation of 20 $\mu$K with a system temperature of 45 K and a bandwidth of 31.25 MHz.
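The numerical example can be checked directly from the radiometer equation. The short sketch below does so; the function names are illustrative, and only the relation between system temperature, bandwidth, and integration time from Equation (4) is used.

```python
import math


def radiometer_sigma(t_sys_k, bandwidth_hz, tau_s):
    """White-noise standard deviation (K) from the radiometer equation."""
    return t_sys_k / math.sqrt(bandwidth_hz * tau_s)


def integration_time(t_sys_k, bandwidth_hz, sigma_k):
    """Integration time (s) needed to reach a target noise level sigma_k."""
    return (t_sys_k / sigma_k) ** 2 / bandwidth_hz


if __name__ == "__main__":
    t_sys = 45.0          # K
    bandwidth = 31.25e6   # Hz
    tau = 45.0 * 3600.0   # 45 hours in seconds
    sigma = radiometer_sigma(t_sys, bandwidth, tau)
    print(f"sigma = {sigma * 1e6:.1f} microkelvin")
    hours = integration_time(t_sys, bandwidth, 20e-6) / 3600.0
    print(f"time to reach 20 microkelvin = {hours:.1f} hours")
```

Because the noise scales as $\tau^{-1/2}$, halving the noise level from this point would require four times the integration time.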

In addition to the thermal and uncorrelated noise de-

scribed by the radiometer equation, there are three main

sources of correlated noise, namely gain fluctuations in

the low-noise amplifiers, atmospheric temperature fluctuations, and time-dependent standing waves. All of these are expected to have a roughly $1/f$-type spectrum, although with different particular properties. The fact

that these sources of correlated noise are also strongly

correlated between frequencies is very useful in order to

filter out this noise in the analysis.

Equation (2) describes the detector output at any given time. To connect this to the actual measurements recorded by the detector, we adopt the following data model,

$$d^i_\nu(t) = \langle d^i_\nu \rangle \left(1 + \delta^i_g(t)\right) \left[ 1 + P_{\rm cel} \left(\Delta s_{\rm cont} + \Delta s^\nu_{\rm CO}\right) + P_{\rm tel}\, \Delta s_{\rm ground} + n_{\rm corr}(t) + n_{\nu i}(t) \right]. \qquad (5)$$

Here $d^i_\nu(t)$ denotes the raw data recorded at time $t$ for frequency channel $\nu$ in feed $i$; $\langle d^i_\nu \rangle$ represents the corresponding time average and basically corresponds to $\langle T^{i\nu}_{\rm sys}(t) \rangle \langle G^i_\nu(t) \rangle$; $\delta^i_g(t)$ denotes feed dependent gain fluctuations; $P_{\rm cel}$ and $P_{\rm tel}$ are pointing matrices in celestial and telescope coordinate systems, respectively; $\Delta s_{\rm cont}$ denotes the celestial continuum source fluctuations, mainly from the CMB and Galactic foregrounds; $\Delta s^\nu_{\rm CO}$ is the CO line emission fluctuation; $\Delta s_{\rm ground}$ is the ground signal fluctuation picked up by the far sidelobes; and $n_{\rm corr}(t)$ are the correlated temperature fluctuations, mostly consisting of atmosphere fluctuations and standing waves. Factors with no feed or frequency index are assumed to be similar (or at least strongly correlated) at different frequencies and feeds, while factors with a label indicate parts of the model that are assumed to have non-smooth frequency dependence. The main purpose of the COMAP analysis pipeline is to characterize $\Delta s^\nu_{\rm CO}$ given $d^i_\nu(t)$.

Footnote: There are several different sources of standing waves; some of the main ones give rise to $1/f$-like spectra, but others do not.

2.5. Data overview

Before presenting the analysis pipeline, we provide

a preview of the raw time-ordered data (TOD) gener-

ated by the COMAP instrument, with the goal of build-

ing intuition that will be useful for understanding the

purpose of each component of the analysis pipeline de-

scribed in this paper. Figures 5 and 6 show examples

of such raw TOD from the instrument using the CES (left column) and Lissajous (right

column) scanning strategies. Perhaps the most obvious

features in these plots are step-wise changes in power as

the telescope changes elevation during repointings be-

tween scans; see Section 2.3. The Lissajous scans ad-

ditionally show oscillations in power as the telescope

changes elevation during the scan, since the telescope

looks through a thicker slab of atmosphere at lower ele-

vations, and this increases the atmospheric contribution

to the system temperature.
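The elevation dependence can be illustrated with the standard plane-parallel approximation, in which the atmospheric contribution scales as $1/\sin({\rm el})$. The sketch below is only a rough illustration under that assumption; the zenith temperature and the elevations used are hypothetical values, not COMAP measurements.

```python
import math


def atmosphere_temperature(t_zenith_k, elevation_deg):
    """Atmospheric contribution (K) for a plane-parallel slab: T_zenith / sin(el)."""
    return t_zenith_k / math.sin(math.radians(elevation_deg))


if __name__ == "__main__":
    t_zenith = 10.0  # K, hypothetical zenith contribution
    for elevation in (35.0, 45.0, 55.0, 65.0):
        t_atm = atmosphere_temperature(t_zenith, elevation)
        print(f"el = {elevation:4.1f} deg -> T_atm = {t_atm:5.2f} K")
    # Peak-to-peak change for a scan oscillating by +/- 0.5 deg about 45 deg.
    swing = (atmosphere_temperature(t_zenith, 44.5)
             - atmosphere_temperature(t_zenith, 45.5))
    print(f"peak-to-peak change over a 1 deg elevation swing: {swing * 1e3:.0f} mK")
```

Even a small elevation swing therefore produces a power modulation that is far larger than the microkelvin-level CO signal, which is why the oscillations are so prominent in the Lissajous data.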

The top panels in Figure 6 show an individual frequency channel for a single scan (i.e., stationary observation period), while the bottom panels show the corresponding power spectral density (PSD). For the CES case, the PSD is relatively featureless, with an overall shape that looks consistent with a typical $1/f$ noise spectrum. For the Lissajous case, an additional strong peak is seen around 0.07 Hz, which matches the scanning period of 14 sec, and this corresponds to the periodic atmospheric variations seen in the panels above.
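A periodic signal with a 14 s period appears in the PSD at $1/14\,{\rm s} \approx 0.071$ Hz. The sketch below demonstrates this with a plain discrete Fourier transform of a synthetic sinusoid; it is not the PSD estimator used in the pipeline, and the 1 s sampling interval and 280-sample length are hypothetical choices made so that the period falls exactly on a frequency bin.

```python
import cmath
import math


def periodogram(samples, dt_s):
    """Return (frequencies, power) for positive frequencies via a direct DFT."""
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        acc = sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        freqs.append(k / (n * dt_s))
        power.append(abs(acc) ** 2 / n)
    return freqs, power


def peak_frequency(samples, dt_s):
    """Frequency (Hz) of the strongest periodogram bin."""
    freqs, power = periodogram(samples, dt_s)
    return freqs[power.index(max(power))]


if __name__ == "__main__":
    period_s = 14.0
    tod = [math.sin(2.0 * math.pi * t / period_s) for t in range(280)]
    print(f"peak at {peak_frequency(tod, 1.0):.4f} Hz")
```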

Figure 7 shows the time averaged data for all fre-

quency channels of a single feed for one scan. The spec-

tral shape is mostly determined by the average gain as a

function of frequency, due to the combined effect of the

various components of the receiver chain. This average

gain is a purely instrumental effect, not associated with

the true sky signal, and therefore simply corresponds

to a normalization factor that should be calibrated out

before higher-level analysis. However, some of the spectral shape is also determined by the fact that the system temperature changes with frequency, and in some cases exhibits large spikes within specific frequency ranges (see Lamb et al. 2021 for more details). Separating the gain variation as a function of frequency from the system temperature as a function of frequency is a main goal of the calibration procedures described in Section 3.4.

Figure 5. Raw data from the COMAP instrument (in arbitrary digital units of power). Here we see data averaged over a single 2 GHz-wide sideband (top) and examples of data from four individual frequency channels in that sideband (bottom). These data were taken using two different scan patterns: CES (left) and Lissajous (right). [Panels: feed 2, A:USB; channels at 28.488, 28.490, 28.492, and 28.494 GHz. Axes: Time [min] vs. Power [du].]

Figure 6. Raw data from an individual frequency channel of the COMAP instrument. Power is shown as a function of time (top), and the corresponding power spectral density (PSD) is also shown (bottom). We show data from a CES scan (left) and a Lissajous scan (right). [Panels: feed 2, A:USB, 28.488 GHz. Axes: Time [s] vs. Power (top); Frequency [Hz] vs. PSD (bottom).]

Figure 7. Time-averaged raw data from each frequency channel on a single feed of the COMAP instrument. The colors represent the four 2 GHz-wide sidebands (A:LSB, A:USB, B:LSB, B:USB). Note that a few of the frequency channels at the edges and middle of sidebands tend to be unstable and are masked out in the analysis. [Axes: Frequency [GHz] vs. Power [du].]

Figure 8. Correlation between the sideband-averaged data from the 19 feeds of the COMAP instrument for a single constant elevation scan. For this observation, as for much of the observing campaign, the LNAs for feeds 4 and 7 were turned off because those feeds, as a test, did not have a polarizer and so had large standing waves due to reflections between the receiver and the secondary reflector. [Axis: Feed; color scale: Correlation.]

In Figure 8 we plot the correlation,

$$C_{ij} = \frac{\langle P_i P_j \rangle}{\sqrt{\langle P_i^2 \rangle \langle P_j^2 \rangle}}, \qquad (6)$$

between the power, $P$, recorded by any two feeds, $i$ and $j$, after averaging over all frequencies within each sideband for each radiometer. Here we first note that the data from different sidebands of the same feed are strongly correlated. This is because both main sources of correlated noise in the COMAP data, namely gain fluctuations and atmospheric fluctuations, are common for sidebands within a given feed. In contrast, sidebands for different feeds mostly share the atmospheric fluctuations (and also some standing waves), but have independent gain fluctuations, and this results in lower overall correlations, but still typically in the 10–40% range. Accounting for and mitigating such correlations will clearly be essential in order to extract robust science from these observations.
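The correlation statistic can be sketched as follows. This is an illustration rather than the code used to produce Figure 8: the function names are hypothetical, the time series are synthetic, and subtracting the time average of each series before correlating is an assumption made here so that the statistic measures fluctuations rather than the mean power level.

```python
import math
import random


def correlation(x, y):
    """Normalized correlation <xy> / sqrt(<x^2><y^2>) of mean-subtracted series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


def correlation_matrix(series):
    """Correlation between every pair of time series in a list."""
    return [[correlation(a, b) for b in series] for a in series]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic example: a common-mode term shared by all feeds plus
    # independent per-feed fluctuations.
    common = [rng.gauss(0.0, 1.0) for _ in range(500)]
    feeds = [[c + rng.gauss(0.0, 2.0) for c in common] for _ in range(3)]
    for row in correlation_matrix(feeds):
        print(" ".join(f"{value:5.2f}" for value in row))
```

In this toy model the shared term plays the role of the atmosphere, and the independent term the role of per-feed gain fluctuations; the off-diagonal correlations drop as the independent term grows.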

The quality of the COMAP data depends strongly on

the observing conditions, as illustrated in Figure 9. The

top panel shows an observation made under normal con-

ditions, while the middle panel shows an observation

made during poor weather, with thick cloud coverage.

The bottom panel shows a data segment with strong

“spikes”, a feature of some data taken in summer, pos-

sibly associated with insects flying in front of the focal

plane. Automatic identification and removal of prob-

lematic data is clearly an important and necessary com-

ponent of the pipeline.

Finally, Figure 10 shows the calibration vane observa-

tions that are made at the beginning and end of each

observation period. Since the ambient temperature is

about one order of magnitude higher than $T_{\rm sys}$, the measured power is also correspondingly about one order of magnitude higher, and this bright and known signal allows for a precise estimate of $T_{\rm sys}$. Note that these data

segments are removed prior to data analysis, as they

would otherwise compromise any filtering that may be

applied to the data.

3. COMAP ANALYSIS PIPELINE

3.1. Pipeline Overview

We are now ready to present the COMAP analysis

pipeline, which is designed to process the raw data dis-

cussed in Section 2.5 into calibrated and cleaned CO

maps. The main steps of this pipeline are schematically

illustrated in Figure 11.

Figure 9. Feed-averaged COMAP TOD recorded under various observing conditions. The top panel shows data observed under normal conditions, and is dominated by instrumental noise. The middle panel shows data observed under poor weather conditions with thick cloud coverage, resulting in large coherent power fluctuations observed by all feeds. The bottom panel shows data with strong spikes, which may for instance happen during rare periods with high insect activity. [Axes: Time [min] vs. Power [du].]

Figure 10. The calibration vane is inserted in front of the receiver at the beginning and end of one observation of a CO science field. The time between calibration vane insertions is typically about an hour, a period set by the preferred data file size for the CO field observations. [Axes: Time [s] vs. Power [du]; raw TOD with the calibration-vane (hot) segments marked.]

The processing starts with “Level 1” files, which contain raw data as recorded by the instrument, together with pointing information and house-keeping data. Each of these files typically contains about one hour of observation time, including calibration vane observations at the beginning and end. We denote each (roughly) one hour of data as one observation, and assign it an individual observation ID (abbreviated obsID). Each observation consists of several scans, where one scan is the period between two re-pointings of the telescope, during which the telescope performs the same motions around a fixed point in azimuth and elevation while the target field drifts through. The instrumental properties are consequently assumed to be stationary within each scan. The module denoted scan_detect in Figure 11 indicates a dedicated code that partitions each observation into individual scans, based on pointing information, and records information about each scan in a database.

The main processing takes place in the l2gen module, which generates calibrated and cleaned TOD and stores them in so-called “Level 2” files. This is achieved through the application of a series of filters (see Section 3.3) and a time-varying gain normalization (see Section 3.4). This stage also evaluates basic goodness-of-fit statistics and defines a frequency channel mask that excludes missing or broken data for the current scan, before reducing the spectral resolution of the data to a level suitable for map-making. In our main analysis, we reduce the resolution from $\sim$2 MHz to $\sim$31 MHz, resulting in a computational speed-up of subsequent steps and a memory saving for storing final maps by a factor of 16.
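The factor of 16 follows from the channel layout: a 2 GHz sideband split into 1024 channels gives 1.953125 MHz per channel, and co-adding 16 adjacent channels gives 31.25 MHz. The sketch below illustrates such a reduction while honoring a channel mask. The function name is hypothetical, and the unweighted mean over unmasked channels is an assumption; the text does not specify how channels are weighted when they are combined.

```python
SIDEBAND_WIDTH_MHZ = 2000.0
N_CHANNELS = 1024
FACTOR = 16
NATIVE_WIDTH_MHZ = SIDEBAND_WIDTH_MHZ / N_CHANNELS  # 1.953125 MHz


def downsample_channels(values, mask, factor=FACTOR):
    """Average groups of `factor` adjacent channels, skipping masked channels.

    `mask[k]` is True for usable channels. Returns the averaged values and
    the number of channels that contributed to each output channel.
    """
    if len(values) != len(mask) or len(values) % factor != 0:
        raise ValueError("input length must match the mask and divide by factor")
    averaged, counts = [], []
    for start in range(0, len(values), factor):
        good = [v for v, ok in zip(values[start:start + factor],
                                   mask[start:start + factor]) if ok]
        counts.append(len(good))
        averaged.append(sum(good) / len(good) if good else float("nan"))
    return averaged, counts


if __name__ == "__main__":
    spectrum = [float(k) for k in range(N_CHANNELS)]
    mask = [True] * N_CHANNELS
    mask[0] = False  # e.g., an unstable edge channel
    low_res, used = downsample_channels(spectrum, mask)
    print(len(low_res), "channels of", NATIVE_WIDTH_MHZ * FACTOR, "MHz each")
```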

Next, the accept_mod module reads in the statistics (including goodness-of-fit) and basic frequency mask produced by l2gen and produces a list of accepted observations as defined by user-specified thresholds for each statistic (see Section 4). Examples of relevant statistics used for this purpose are $\chi^2$ per observation, correlated noise knee-frequency ($f_{\rm knee}$), and Solar elongation. The output from this process is called an accept list, which determines what data to use for mapmaking.
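Threshold-based selection of this kind can be sketched as a simple filter over per-scan statistics. The example below is purely illustrative: the function name, the statistic keys, and every threshold value are hypothetical and do not reflect the actual accept_mod interface or the cuts used in the analysis.

```python
def build_accept_list(scan_stats, thresholds):
    """Return the sorted IDs of scans whose statistics all lie within bounds.

    `scan_stats` maps a scan ID to a dict of statistics; `thresholds` maps a
    statistic name to an inclusive (low, high) interval.
    """
    accepted = []
    for scan_id, stats in scan_stats.items():
        if all(low <= stats[name] <= high
               for name, (low, high) in thresholds.items()):
            accepted.append(scan_id)
    return sorted(accepted)


if __name__ == "__main__":
    # Hypothetical statistics and thresholds, for illustration only.
    scan_stats = {
        101: {"chi2": 0.3, "f_knee_hz": 0.01, "sun_elongation_deg": 80.0},
        102: {"chi2": 6.2, "f_knee_hz": 0.01, "sun_elongation_deg": 75.0},
        103: {"chi2": -0.8, "f_knee_hz": 0.50, "sun_elongation_deg": 60.0},
    }
    thresholds = {
        "chi2": (-3.0, 3.0),
        "f_knee_hz": (0.0, 0.1),
        "sun_elongation_deg": (30.0, 180.0),
    }
    print(build_accept_list(scan_stats, thresholds))
```

Keeping the thresholds in a separate mapping mirrors the description of user-specified cuts: the selection can be re-run with different thresholds without reprocessing the underlying statistics.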

Converting time-ordered data into pixel-ordered data is done by a map-maker called tod2comap (see Section 3.6). As shown in the following sections, the adopted filters result in very nearly uncorrelated white noise, and the current implementation of tod2comap accordingly adopts simple binning into voxels. Finally, from these maps we can estimate the CO power spectrum using the module comap2ps (see Ihle et al. 2021 for details).
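For uncorrelated white noise, a binned map is obtained by averaging all samples that fall in the same voxel. The sketch below shows one common form of this, using inverse-variance weights. It is not the tod2comap implementation: the function name and interface are hypothetical, voxels are addressed by a flat integer index for simplicity, and the use of inverse-variance weighting is an assumption.

```python
def bin_map(tod, voxels, sigmas, n_voxels):
    """Inverse-variance weighted binning of time-ordered data into voxels.

    `tod[t]` is the sample value, `voxels[t]` the voxel index it falls in,
    and `sigmas[t]` its white-noise standard deviation.
    Returns (map, rms, hits) as lists of length n_voxels.
    """
    weighted_sum = [0.0] * n_voxels
    weight = [0.0] * n_voxels
    hits = [0] * n_voxels
    for value, voxel, sigma in zip(tod, voxels, sigmas):
        w = 1.0 / (sigma * sigma)
        weighted_sum[voxel] += w * value
        weight[voxel] += w
        hits[voxel] += 1
    sky = [s / w if w > 0.0 else float("nan")
           for s, w in zip(weighted_sum, weight)]
    rms = [w ** -0.5 if w > 0.0 else float("inf") for w in weight]
    return sky, rms, hits


if __name__ == "__main__":
    tod = [1.0, 3.0, 5.0]
    voxels = [0, 0, 2]
    sigmas = [1.0, 1.0, 2.0]
    sky, rms, hits = bin_map(tod, voxels, sigmas, n_voxels=3)
    print(sky, rms, hits)
```

Unobserved voxels are returned as NaN with infinite uncertainty, so that they can be excluded from later steps.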

3.2. Data Segmentation

As described above, we define a scan to be the observing period between re-pointings of the telescope. The purpose of the scan_detect code is to identify all scans