New approach to template banks of gravitational waves with higher harmonics: Reducing matched-filtering cost by over an order of magnitude

New approach to template banks of gravitational waves with higher

harmonics: Reducing matched-filtering cost by over an order of magnitude

Digvijay Wadekar ,

Tejaswi Venumadhav,

2,3

Ajit Kumar Mehta ,

Javier Roulet ,

Seth Olsen ,

Jonathan Mushkin ,

Barak Zackay ,

and Matias Zaldarriaga

School of Natural Sciences,

Institute for Advanced Study

1 Einstein Drive, Princeton, New Jersey 08540, USA

Department of Physics,

University of California at Santa Barbara

, Santa Barbara, California 93106, USA

International Centre for Theoretical Sciences,

Tata Institute of Fundamental Research

Bangalore 560089, India

TAPIR, Walter Burke Institute for Theoretical Physics,

California Institute of Technology

Pasadena, California 91125, USA

Department of Physics,

Princeton University

, Princeton, New Jersey 08540, USA

Department of Particle Physics and Astrophysics,

Weizmann Institute of Science

, Rehovot 76100, Israel

(Received 8 July 2024; accepted 11 September 2024; published 15 October 2024)

Searches for gravitational wave events use models, or templates, for the signals of interest. The templates

used in current searches in the LIGO-Virgo-KAGRA data model the dominant quadrupole mode

l

;

jÞ ¼

;

of the signals and omit subdominant higher-order modes (HMs) such as

l

;

jÞ ¼ ð

;

, (4, 4),

which are predicted by general relativity. This omission reduces search sensitivity to black hole mergers in

interesting parts of parameter space, such as systems with high masses and asymmetric-mass ratios. We

develop a new strategy to include HMs in template banks: instead of making templates containing a

combination of different modes, we separately store normalized templates corresponding to (2, 2), (3, 3),

and (4, 4) modes. To model aligned-spin (3, 3), (4, 4) waveforms corresponding to a given (2, 2) waveform,

we use a combination of post-Newtonian formulas and machine learning tools. In the matched-filtering

stage, one can filter each mode separately with the data and collect the time series of signal-to-noise ratios

(SNRs). This leads to a HM template bank whose matched-filtering cost is just

≈

× that of a quadrupole-

only search (as opposed to

≈

100

× in previously proposed HM search methods). Our method is effectual

and generally applicable for template banks constructed with either stochastic or geometric placement

techniques. New gravitational wave candidate events that we detect using our HM banks and details for

combining the different SNR mode time series are presented in accompanying papers [D. Wadekar

et al.

arXiv:2312.06631

; D. Wadekar

et al.

Phys. Rev. D

110

, 044063 (2024)

]. Additionally, we discuss non-

linear compression of (2, 2)-only geometric placement template banks using machine learning algorithms.

DOI:

10.1103/PhysRevD.110.084035

I. INTRODUCTION

The bulk of the gravitational wave (GW) detections until

now have come from template-based searches, which filter

strain data from detectors with known waveforms

–

18]

These searches rely on having banks of templates with accu-

rate GW waveforms that cover the parameter space within

which one intends to search for compact object mergers.

Gravitational wave signals can be decomposed into basis

harmonics, or modes, each of which has a characteristic

spherical-harmonic angular dependence on the detectors

’

position on the source

’

s sky

[19,20]

. The quadrupole mode

l

;

jÞ ¼ ð

;

is the leading-order term in the post-

Newtonian (PN) multipole expansion, but higher-order

modes (HMs) such as

l

;

jÞ ¼ ð

;

or (4, 4) can be

excited to detectable levels close to the merger for binary

sources with large and asymmetric masses. Almost all

of the currently used template banks include waveforms

with only the dominant quadrupole mode instead of the

full gravitational wave waveform predicted by general

relativity

–

7,13,18,21]

. In addition to ignoring HMs,

current template banks also assume that the constituent

spins are aligned with the orbital angular momentum and

the binary orbit is nearly circular (i.e., they ignore the

effects of spin-orbit precession and orbital eccentricity).

While the effects of HMs and precession are ignored in

GW searches, they are now being included in most studies

estimating source parameters of high-significance triggers

obtained from the searches. One justification for this can be

that the purpose of searches is merely to identify potentially

Contact author: jayw@ias.edu

PHYSICAL REVIEW D

110,

084035 (2024)

2470-0010

2024

110(8)

084035(15)

084035-1

interesting small segments from the full data stream;

one can be more precise in the parameter estimation stage

because additional effects like HMs and precession are

not prohibitively expensive when restricting to a targeted

analysis of small data segments. However, ignoring these

additional effects can change which triggers enter the

final detection catalog, potentially missing interesting

systems when the signal-to-noise ratio (SNR) of these

candidates is not very high. Furthermore, the significance

estimate of triggers corresponding to a particular class of

physical systems (e.g., high-mass systems) can be affected

(i.e., the false alarm probability of such triggers can

increase).

In this paper, we will focus on making banks that include

the effects of HMs in their templates. HMs can contribute

significantly to SNR in the detectors

’

sensitive band for

sources with particular parameters, such as edge-on ori-

entations, unequal mass ratios, and high total mass. Hence,

searches that only model the quadrupole mode can lose

sensitivity to such events

[22

–

32]

(see Figs.

and

below).

One caveat here is that the sensitive volume of asymmetric-

mass systems (i.e., mass ratio

) is much

smaller than the corresponding equal-mass (

) sys-

tems. However, detecting even a few systems with

≲

can help improve our understanding of some of the

compact object formation channels

[33,34]

. This has

motivated previous studies (e.g.,

[22,23,35]

) to construct

template banks that include HMs (it is worth mentioning

that there have also been efforts to include HMs in searches

without using precomputed templates, but instead using

machine learning tools directly

[31,36,37]

One of the biggest hurdles in creating template banks

that include the effects of HMs (or precession or eccen-

tricity) is that the extra degrees of freedom significantly

increase the number of templates. When making aligned-

spin template banks with just the (2, 2) mode, one only

needs to sample over the masses and spins of the binary.

However, constructing HM banks requires sampling over

two additional parameters: inclination and initial phase

(

;

), defined at some reference epoch. These parameters

modulate the amplitude and phase of HM waveforms

relative to the (2, 2). Because of these additional degrees

of freedom, the number of templates increases by

∼

100

compared to the (2, 2)-only case

[22,23,35]

(with the

matched-filtering cost of the search also increasing by the

same factor).

In this paper, we develop a different strategy for

constructing HM template banks. For each set of binary

masses and spins, we store the (2,2), (3,3), (4,4) mode

waveforms separately in our bank. We also matched-filter

the data with these modes separately and store the three

SNR time series as shown in Fig.

. We can then combine

the three SNR time series to marginalize over the angles

;

and get a statistic that can be used to collect and rank

triggers. As the matched-filtering step involves three

computations for every one computation in the analogous

(2, 2)-only case, the cost of the filtering step is increased by

a factor of 3. Note that we have an additional step in our

FIG. 1. In our template banks, for each set of intrinsic parameters (i.e., masses and spins [

]), we generate and store the

normalized (2,2), (3,3), and (4,4) mode waveforms separately. We then filter the data with the individual modes, which results in a cost

increase of just

× compared to the (2,2)-only case. Note that the

× factor is significantly smaller than a factor of

∼

100

× occurring

when the templates are made with combinations of different harmonics (in the combined case, templates need to span the larger 6D

space: [

;

], where

is the inclination and

is the initial reference phase of the binary

[22,23,35]

). We store the output

SNR time series of each mode and later combine them by marginalizing over

and

ref

(details of the marginalization algorithm are

given in our companion paper

[38]

, and we also provide a brief overview in Sec.

DIGVIJAY WADEKAR

et al.

PHYS. REV. D

110,

084035 (2024)

084035-2

case: combining the different SNR time series by maxi-

mization (or marginalization) over

and

. However,

the cost of this additional step is much smaller than the

matched-filtering step. We discuss combination of the SNR

time series in detail in our first companion paper

[38]

where we also introduce new detection statistics to mitigate

the increase in the number of background triggers (which

is a result of the extra degrees of freedom introduced by

HMs in the template bank

[26]

). Our second companion

paper

[1]

details the new binary black hole (BBH) mergers

found using our search methodology in the LIGO-Virgo

third observing run (O3) data.

We discuss the formalism and simulations of waveforms

containing HMs in Sec.

. In Sec.

III

, we describe our

model for making (2,2)-only banks. In Sec.

, we describe

our model for making amplitudes and phases of HM

templates using post-Newtonian expansion and machine

learning techniques. In Sec.

, we briefly describe the

process of matched filtering the data with our template

bank. In Sec.

, we show that our HM template bank is

effectual. In Sec.

VII

, we discuss an application of our

technique to stochastic placement-based template banks,

and we conclude in Sec.

VIII

FIG. 2. Ignoring HMs in the waveform templates can cause a loss of waveform overlap with the true signal, which in turn leads to a

reduction in the detection volume of systems. In this figure, we show the fractional detection volume of systems (calculated by the cube

of the waveform overlap in a simplistic scenario) in particular regions of parameter space. Left: we include only the dominant (2, 2)

mode in the templates and see that the detection volume loss is larger for high-mass and asymmetric-mass ratio systems (see Fig.

for

the reasoning behind the high-mass behavior). Center/right: the behavior when additional

l

¼j

modes are included in addition to

(2, 2) [omitting the

l

≠

modes: (2, 1), (3, 2)]. We find that the fractional loss in volume is

≲

in the given parameter space,

showing that adding only the (3,3) and (4,4) modes to our template banks is roughly sufficient. In all the panels, we have averaged

over inclination accounting for the brightness of the 22 waveform. Note that the range of the color bar is significantly different in the

three panels.

FIG. 3. As seen in Fig.

, the fractional contribution of HMs to

SNR increases significantly for high-mass binaries. This is

because the noise amplitude spectral density (ASD) increases

at low frequencies, which leads to preferential downweighting of

the (2,2) mode SNR, especially for high masses where the (2,2)

signal is dominated by lower frequencies (while the HMs, being

at higher frequencies, are less affected). The reference noise ASD

shown here corresponds to the O3 run and has been used

throughout this paper. The parameters of the system used to

generate the mode curves are shown in the title of the plot.

NEW APPROACH TO TEMPLATE BANKS OF GRAVITATIONAL

...

PHYS. REV. D

110,

084035 (2024)

084035-3

II. GRAVITATIONAL WAVE SIGNALS

INCLUDING HIGHER HARMONICS

A. Generating waveform samples for the bank

To generate aligned-spin waveforms that include HMs, we

use the recently developed IMRPhenomXHM model

[39]

This approximant calibrates analytic PN-based expressions

of different multipoles (which are accurate in the early

inspiral regime of the binary evolution) with numerical

relativity simulations (which more accurately describe the

merger-ringdown regime).

In addition to (2, 2), we use all

the currently available modes in IMRPhenomXHM: (3, 3),

(4, 4), (2, 1), (3, 2). Hereafter, we write

l

mode as just

l

for brevity.

We only generate BBH templates in this study and leave

systems containing neutron stars for a different study. In

order to model our template banks, we simulate waveforms

in the following parameter range:

⊙

400

⊙

<q<

;

where the masses are in the detector frame (we do not

use separate cuts for redshift). The motivation for the mass

ratio (

) cut is that the IMRPhenomXHM approximant was

calibrated to numerical relativity (NR) waveforms up to

, while the cut on the individual spins is due to the

relativistic Kerr limit. To construct our template banks, we

simulate a sample of

∼

waveforms (each corresponding

to the physical parameter set

;

). These

samples are broadly distributed across the parameter

space in Eq.

(1)

. Similar to other templated searches, we

need to assume an astrophysical prior distribution in order

to give weights to the templates in our banks. We pick a

simple prior: uniform prior in

for

<q<

; flat

prior for

eff

bounded within

−

eff

(with

;

); and a power law distribution in the total

redshifted mass:

tot

∝

−

tot

(we do not assume a

separate prior over redshift).

We first quantify the fractional loss in the recovered SNR

if we ignore the HMs in our templates. To calculate the

overlap of two waveforms, we use

i¼

∞

where

is the one-sided power spectral density (PSD).

Throughout this paper, we use a reference PSD that is

obtained by applying Welch

’

s method

[40]

over 50 random

O3 LIGO data files and taking the tenth percentile of the

sample of PSDs in order to downweight the distortions

arising from spectral lines and loud glitches.

An important factor in the calculation of loss of SNR is

the inclination

of the binary. This is because HMs become

more important when

is away from 0 or

(zero corres-

ponds to a face-on binary). However, the observable

volume of systems with inclination closer to edge on is

diminished, since the brightness of the 22 mode starts to

decrease (typically the 22 mode is brighter than HMs so it

drives our choice for the inclination distribution). We

therefore simulate test waveforms using the following

probability distribution, which includes the observable

volume effect (see Fig. 4 of

[41]

Þ¼

cos

sin

Note that similar comparisons have already been done in

previous studies

[22

–

30,42]

for fixed

, whereas here we

show the case when the

dependence has been margin-

alized over. We show the results of the fractional loss in

SNR in the left panel of Fig.

(the color shows the overlap

given by

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi

where

includes all

modes: 22, 33, 44, 21, 32). Note that the detection volume

shown in the figure does not include the effect of loss in

sensitivity from the increase in background triggers due to

the larger number of waveforms

[26]

. To thoroughly study

this issue, we plan to do an injection-based study using the

IAS-HM pipeline and explicitly measuring the effect of

adding HMs on the effective volume-time (

) of the

search.

We also perform another test where we add two of the

HMs (33 and 44) alongside 22 and test the overlaps with

the full waveform. From the right panel of Fig.

, we see

that using these three modes gives fairly accurate overlap

with the full waveform throughout the parameter space.

The model for the gravitational wave strain in the

detector is given by

det

where

are the detector

’

s response functions, which depend on sky

location and polarization angle

of the binary

[20,30]

. The

waveform polarization modes are given by

−

Þ¼

l

≥

−

l

;

l

Þð

where

is the luminosity distance,

is the reference

distance at which the harmonic modes

l

were generated,

and

−

l

corresponds to the spin-weighted spherical

We also attempted generating waveforms with the IMRPhe-

nomXPHM approximant, which includes both HMs and pre-

cession. However, we found a few inconsistencies in the case

when the spins are nearly aligned (

∼

) versus exactly

aligned (

), which could be due to borderline cases

between the separate parameter space regions where the approx-

imant was calibrated. This led to anomalies in our singular value

decomposition basis for the phases (which are calculated later

in Sec.

III B

DIGVIJAY WADEKAR

et al.

PHYS. REV. D

110,

084035 (2024)

084035-4

harmonics. Explicitly expanding the expressions for

−

l

for the case of 22, 33, and 44, we get

[30]

det

Þ¼

cos

−

cos

Þþ

sin

One can see that the detector responses conveniently

factor out as an overall normalization term

[30]

. This is,

however, not the case if

l

≠

modes are included in

this expression, and we discuss this point further in

Appendix

. It is also evident that, if one only considers

the 22 waveform in Eq.

(5)

, there is a degeneracy between

;

, and

(included in the detector responses). In that

case, one does not need to make different templates

to account for these variables. However, when including

HMs, this degeneracy is broken and therefore, in order to

model the full

HM waveform, it is necessary to make

different templates for different values of

and

. This

is indeed the approach taken by previous HM template

banks

[22,23,35]

. We, however, take a different approach

in this paper: we construct templates for the 22, 33, and

44 modes separately, so our templates do not include any

information about

and

. Instead, we include the depen-

dence on

and

at a later stage in the pipeline when we

combine the SNR time series after matched filtering with

the templates (see Ref.

[38]

). We discuss the model for our

templates in the next two sections.

III. OUR QUADRUPOLE BANK MODEL

In our method, there is a template for the (3, 3) mode and

a template for the (4, 4) mode associated with every (2, 2)

template. We first create our (2,2) bank independent of

HMs and discuss our model in this section (we will discuss

modeling HM templates in the next section).

There are two broad classes of (2,2) template banks in

the literature. One is stochastic placement, where the bank

consists of waveforms that are randomly sampled (see,

e.g.,

[6,43]

). We will use the alternative approach of

geometric placement, where a regular lattice is constructed

based on the metric induced by the inner product of the

waveforms, Eq.

(2)

(see e.g.,

[44

–

46]

). It is worth noting

that our methodology of including HMs in the template

banks is equivalently applicable to 22 banks constructed

with stochastic placement (we will come back to this point

later in Sec.

VII A

). To build our 22 banks, we use a method

based on Ref.

[45]

(hereafter R19. R19 shows that

decomposing the frequency-domain waveform into an

amplitude and an unwrapped phase term is advantageous

because the amplitude

and phase

separately

depend on the parameters of the binary in a smooth way

(see Fig.

of R19).

A. Modeling amplitudes and separation

into banks

To divide the binary mass and spin samples into banks, we

first find the normalized amplitudes (i.e.,

i¼

)of

the waveforms from IMRPhenomXHM corresponding to

these samples. Our banks are made by requiring that

waveforms with similar amplitude profiles are grouped

together in a single bank. We pass

waveforms

over the entire parameter space through a KMeans algo-

rithm

[47]

. This algorithm identifies centroids in the space

of waveform amplitudes and each input waveform ampli-

tude is then associated with a particular centroid. Each

centroid corresponds to an amplitude profile which we

assign as the reference amplitude of that bank:

bank

(note that we also normalize these amplitudes such that

bank

i¼

). The choice of the centroids is made

such that the average overlap between the amplitudes

associated with that bank, given by

Þj

bank

Þi

,is

maximized. We choose the number of banks to be 17 based

on the criteria that the overlap is

for more than

99.9% of the waveforms associated with that bank. We

show the profiles of

bank

for our banks in the top panel

of Fig.

. In the bottom panel, we show the parameters

associated with each bank based on their waveform

amplitude profiles. The curved boundary seen in the bottom

left of the

–

tot

plot is because of the condition that the

secondary black hole has mass

⊙

We further choose to divide low-mass banks into sub-

banks based on

chirp

. This is because waveforms with

different

chirp

have different duration. Non-Gaussian

noise is known to affect short-duration waveforms much

more than long-duration waveforms, and hence this divi-

sion helps in isolating long-duration waveforms from the

additional background. We choose the number of subbanks

within each bank such that the duration of waveforms in

each subbank is roughly similar to within a factor of 2.

We show the

chirp

cuts for each subbank in Table

We also show the number of templates in each bank. Note

that the smaller the binary mass, the larger the number of

orbital cycles seen by current detectors, and therefore more

templates are needed to model that region (

templates

∝

−

min

[48]

). If we assume the same SNR threshold across

all the banks to collect the matched-filtering triggers, we

will run into the problem that there will be a vastly larger

number of background triggers in low-mass banks com-

pared to high-mass banks due to a difference in the number

of templates. To get around this issue, we employ different

SNR collection thresholds in different banks (e.g., we use

collection

for BBH-0, whereas for BBH-8 and above,

we use

collection

Note that the banks in R19 were divided using

chirp

values in the entire parameter range (and subbanks were

defined based on the shapes of the normalized amplitudes

of waveforms). For low-mass binaries, the waveform is

NEW APPROACH TO TEMPLATE BANKS OF GRAVITATIONAL

...

PHYS. REV. D

110,

084035 (2024)

084035-5

inspiral-dominated and thus the division by

chirp

appropriate; however, for high-mass binaries, the waveform

is dominated by the merger and the ringdown phase, hence

the waveform behavior depends primarily on

tot

instead

chirp

. Therefore, in our case, our bank division is

only based on the normalized amplitudes in the high-mass

case, and we forego further division by

chirp

. For the low-

mass case, our subbanks are roughly similar to the sub-

banks of R19.

B. Modeling phases

We model the phases of waveforms based on the singular

value decomposition (SVD) method introduced in R19.

For IMRPhenomXHM waveforms associated with each

bank, we first extract their unwrapped phases, subtract the

average phase, and also orthogonalize the phases with

respect to time offsets. We perform an SVD of the phase

profiles to obtain an orthonormal basis for the phase

functions. We find that the phase profile can be well

modeled by a low dimensional space,

model

Þ¼h

subbank

Þþ

few

SVD

;

where

corresponds to the orthonormal basis functions

from the SVD, and

subbank

denotes the average over

physical waveforms in the particular subbank. It is worth

noting that, for computing SVDs of the phases, we give a

weight to each frequency bin proportional on the fractional

SNR in the bin:

∝

bank

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi

, where

corresponds to the frequency bins and

is the bin width

(see R19 for further details). We also choose a different

lower and upper frequency cutoff for different banks. These

cutoffs are chosen such that the resulting loss in SNR

[

−

max

min

bank

ffiffiffiffiffiffiffiffiffiffiffi

] is lower than

∼

. This

gives

min

between 19 and 24 Hz for different banks,

FIG. 4. The template banks in our analysis are split according to

the normalized waveform amplitudes of the (2,2) mode. Top:

normalized amplitude profiles corresponding to our banks, where

we see that the different banks are roughly distinguished by the

cutoff frequencies of the waveforms. Bottom: physical para-

meters corresponding to the different banks.

TABLE I. Details of our 22 template banks. The banks are split

according to 22 amplitudes (see Fig.

) and the subbanks are split

according to the chirp mass (

). The lower and upper bin edges

are given in the right column. From BBH-9 onward, we

only have a single subbank per bank.

Bank

templates

in subbanks

bin edges

of subbanks

BBH-0 10442, 1702, 774, 465, 280 2.6, 5.3, 6.8, 8.4, 10.7, 19.8

BBH-1

1902, 318, 157

5.8, 11.9, 15.8, 27.3

BBH-2

755, 94, 55

8.9, 18.5, 23, 38.6

BBH-3

263, 53, 44

12, 22.4, 27.4, 49.6

BBH-4

1782, 422, 225

5.8, 10.8, 13.5, 19.5

BBH-5

174, 67, 63

14.7, 23.9, 30.3, 64.2

BBH-6

249, 115, 76

10.3, 21, 29.5, 75.9

BBH-7

145, 74

12.9, 30.6, 92.4

BBH-8

73, 31

15.1 39.9, 108.3

BBH-9

18.5, 127.3

BBH-10

21.1, 149.1

BBH-11

24.7, 168.6

BBH-12

28.4, 173.5

BBH-13

32.6, 173.8

BBH-14

37.8, 173.6

BBH-15

43.2, 173.7

BBH-16

51.8, 166

Total

20890

DIGVIJAY WADEKAR

et al.

PHYS. REV. D

110,

084035 (2024)

084035-6

but

max

varies significantly among banks as they have

different cutoff frequencies (see the top panel of Fig.

For low-mass waveforms,

is found to be approx-

imately proportional to

chirp

, while

represents a

combination of

and

eff

(corresponding to the well-

known degeneracy in the 22 waveform). The advantage of

the model in Eq.

(6)

is that, for nearby templates, the

space serves as a Euclidean metric for the mismatch

between templates in a particular subbank [R19],

c

Þj

c

δ

c

Þi

≃

−

We therefore define a grid in the

c

space and each grid point

corresponds to a 22 template in our search. The number of

dimensions to include in our grid is chosen based on

calculating the match between the model in Eq.

(6)

with the

phases of IMRPhenomXHM waveforms in a particular

subbank given by

bank

model

The histogram of the matches corresponding to the

BBH-0,0 subbank (which has the largest number of

templates among all subbanks) is shown in Fig.

.We

can see that two

dimensions are not sufficient, but three

are enough to have the match

for over 99% of the

waveforms.

It is, however, important to note that SVD is a linear

dimensional reduction tool that we used to model the

hypersurface of the physical 22 waveform phases. Using

SVD is optimal if the physical hypersurface is linear, but

we find this is not true in our case, as is evident from the

top panel of Fig.

. This can lead to overpredicting the

number of dimensions in our waveform model, which

would introduce inefficiencies in our search. Note that

linear dimensionality reduction tools like SVD or principal

component analysis are used extensively in numerous areas

of astrophysics, e.g.,

[49

–

51]

; it is important to be aware

that the linearity assumption used in these techniques can

cause them to be suboptimal.

A few possible solutions to overcome our problem are to

use the kernel trick or use nonlinear deep neural network

models like autoencoders. However, we want to preserve

the Euclidean

metric for waveform mismatches, so we

use the same SVD basis as earlier but add a machine

learning tool called random forest regressor (RF) to reduce

the redundancies in the

space. For RF, we use the

publicly available package

SCIKIT

LEARN

and train the

model to predict the higher-order SVD coefficients as a

function of the two lowest-order ones,

;

...

The result for the case of BBH-0,0 is shown as the black

line in the bottom panel of Fig.

. This helps in cutting

down the dimensionality of the

space to two dimensions

for all banks and subbanks in our case (in contrast to

R19, where a few of the banks had 3

–

4 dimensions).

Furthermore, comparing our template bank size with and

without including the RF, we find that RF helps in reducing

the number of templates by a factor of

∼

40%

while keeping

a similar level of effectualness (after including the effect of

waveform optimization as described in Sec. III H of

[52]

We leave exploration of alternative dimensionality reduc-

tion techniques like autoencoders to a future study. Finally,

along the

and

dimensions, not all the points of the

rectangular grid may describe physically viable waveforms.

We therefore use the method adopted in R19 of rejecting

FIG. 5. We model the phases of our 22 templates using a low-

dimensional basis constructed using SVD [see Eq.

(6)

]. The

dimensionality of our banks is set by the number of SVD

coefficients

needed to get effectualness above a particular

threshold. Top: the top three SVD coefficients corresponding to

physical waveforms follow a curved hypersurface (showing that

are not independent, but there is a redundancy that leads

to a spurious increase in number of dimensions of our subbanks).

Bottom: histograms of waveform matches for different case of

dimensions. Upon modeling

as a function of

using a

machine learning tool called RF, we find that all our quadrupole

subbanks can be compressed to two dimensions. Overall, using

the RF enables us to search with

∼

40%

times fewer templates

while keeping a similar level of effectualness.

Random forest:

https://scikit-learn.org/stable/modules/

generated/sklearn.ensemble.RandomForestRegressor.html

NEW APPROACH TO TEMPLATE BANKS OF GRAVITATIONAL

...

PHYS. REV. D

110,

084035 (2024)

084035-7

the grid points that do not have a physical waveform in their

vicinity (for reference, see Fig.

of R19.

It is worth mentioning that the quadrupole bank creation

method outlined here has been used recently to also create

template banks for an inspiral-only search targeting exotic

compact objects with large tidal deformabilities

[15]

and

also for a search targeting intermediate-mass-ratio BBH

mergers

[53]

IV. MODELING 33 AND 44 WAVEFORMS

USING PN EXPANSION AND

MACHINE LEARNING

In this section, we derive effective models for higher

harmonics. The Fourier domain waveforms for different

harmonics can be calculated under the stationary phase

approximation (SPA) as [see Eq. (11) of

[54]

]

l

Þ¼

ffiffiffiffiffi

−

SPA

Þþ

l

where

≡

Mf=k

is the PN parameter in Fourier

space (corresponding to the velocity of the binary) and

≡

is the symmetric mass ratio.

A. Model for amplitudes

The various PN-based amplitude terms (

l

) are given

in Appendix E of

[39]

(which is an updated version of

Sec. III of Ref.

[54]

). The lowest-order terms for the three

multipoles we are interested in are

≡

−

323

224

−

451

168

;

≡

−

ffiffiffi

−

;

≡

−

ffiffiffiffiffi

−

Combining Eqs.

(9)

and

(10)

with Eq.

(5)

, we get effective

equations for the amplitude of higher harmonics as

≃

ffiffiffi

−

tot

sin

≃

ffiffiffi

−

tot

sin

We show the performance of these equations in Fig.

for

a particular set of binary parameters and

. We see

that, although these amplitude formulas are constructed

from low-order PN expressions, they are a fairly good

approximation to the amplitude even in the quasilinear

regime (a similar behavior was also noted by Ref.

[55]

while comparing PN expressions to time-domain numerical

relativity waveforms). It is clear that maximizing over the

amplitude of the HM waveform leads to maximization over

inclination and masses, and the frequency dependence is

independent of these waveforms.

In our case, each template in a particular bank has

the same amplitude profile. Following Eq.

(11)

, one

could use

bank

∝

bank

and

bank

∝

bank

. We, however, use the following instead,

as we find that this slightly improves the model accuracy in

the nonlinear regime:

bank

Þ¼h

bank

;

bank

Þ¼h

bank

;

where the averaging is done over the normalized waveform

samples in the bank, i.e., each sample has

i¼

After averaging, we normalize the bank amplitudes such

that

bank

l

bank

l

i¼

. It is worth mentioning that, for

modeling HMs, we choose different frequency cutoffs than

the ones used for 22,

min

ll

Þ¼

l

min

, and a similar

relation for

max

ll

FIG. 6. Top: amplitude profiles of different harmonics for an

arbitrary waveform in bank BBH-8,0 with parameters

;

¼½

;

−

;

−

. Approximations to

HM amplitudes derived from low-order PN expansion [see

Eq.

(11)

] are shown with dashed lines. Interestingly, we see that

the approximate formulas are fairly accurate even in the quasi-

linear regime close to the merger. We show the ratios of the

formulas output to the actual waveforms in the bottom panels.

Our waveform model for the HM amplitudes is motivated by

these formulas and given in Eq.

(12)

DIGVIJAY WADEKAR

et al.

PHYS. REV. D

110,

084035 (2024)

084035-8

B. Model for phases

In our template banks, we model

l

l

l

, where

l

is real. Let us start by writing the phase of the

waveform in Eq.

(9)

arg

l

Þ¼

SPA

arg

l

Ignoring the constant offset for the moment, one can see

that a good approximation in the early inspiral regime is

l

≈

. However, it is important to mention

that

l

can have imaginary terms in the higher-order PN

expansion which make arg

l

≠

(see Appendix E

[39]

). Furthermore, as we go into the nonlinear regime,

NR simulations show that there is an additional offset

(compared to PN) to the above approximation (see Fig. 5

[39]

). We therefore attempt to model all these frequency-

dependent offsets by calculating

ΔΨ

l

Þ¼

l

−

directly from IMRPhenomXHM waveforms for each of the

parameter samples used to create the template banks. In

Fig.

we show an example of a high-mass-ratio case. In the

plot, we set

ΔΨ

to be 0 at 100 Hz to make it easier to

visually compare the frequency-dependent offset. We see

that, in some cases, the deviation can go up to a few radians

and thus the above approximation is not sufficient for

modeling

l

To remedy this, we try to model

ΔΨ

l

from

IMRPhenomXHM as a function of intrinsic parameters.

First, we decompose

ΔΨ

Þ¼h

ΔΨ

subbank

Þþ

SVD

;

where

ΔΨ

ið

denotes the average over physical wave-

forms in the particular subbank. Next, we predict the

coefficient

Þð

by training a random forest regressor. Note that this

approach is similar to the one used earlier in Sec.

III B

to model the phases of the 22 waveforms. The RF takes the

coefficients of a 22 template as input and outputs the first

three

coefficients (we checked empirically that includ-

ing more than three coefficients had negligible effect on the

results). Our final model for the phases becomes

model

Þ¼

þh

ΔΨ

subbank

SVD

We show the results from our model with the dashed line in

Fig.

, where we indeed see an improvement compared to

the naive approximation (see also Fig.

). In the future, if

needed, the accuracy of our phase model can be further

improved by measuring

instead of predicting it with

RF. This would entail an additional dimension correspond-

ing to the 33 phase, on top of the two existing dimensions

corresponding to the 22 phase in our template banks. We

leave further discussion of this point to Appendix

Finally, it is worth mentioning that we discussed the

formalism for the 33 case in this subsection, but an

analogous method also applies for the 44 case.

C. Model for amplitude ratios

Now that we have modeled the phase profiles

model

and the normalized amplitude profiles

bank

of the 22,

33, and 44 modes, the last ingredient in our template bank

is the prior on the expected SNR in the HMs relative to 22.

FIG. 7. Similar to Fig.

, but showing the accuracy of our model

of phases of 33 and 44 waveforms for the same binary system.

With the solid lines, we show a simplistic model motivated by

low-order PN expansion. The phase residuals are not sufficient

for our analysis, therefore we explicitly model the phase residuals

from the simple relation using machine learning (ML). To set up

our ML model, we simulate similar phase residuals for arbitrary

waveforms from IMRPhenomXHM, compress the residuals to a

few SVD components, and then model the SVD coefficients

using a random forest regressor [see Eqs.

(14)

–

(16)

]. Applying

our model to this particular binary system, we obtain the

comparatively more accurate results shown with dashed lines.

NEW APPROACH TO TEMPLATE BANKS OF GRAVITATIONAL

...

PHYS. REV. D

110,

084035 (2024)

084035-9