GW150914: First results from the search for binary black hole
coalescence with Advanced LIGO
A full list of authors and affiliations appears at the end of the article.
Abstract
On September 14, 2015 at 09:50:45 UTC the two detectors of the Laser Interferometer
Gravitational-wave Observatory (LIGO) simultaneously observed the binary black hole merger
GW150914. We report the results of a matched-filter search using relativistic models of compact-
object binaries that recovered GW150914 as the most significant event during the coincident
observations between the two LIGO detectors from September 12 to October 20, 2015.
GW150914 was observed with a matched-filter signal-to-noise ratio of 24 and a false alarm rate
estimated to be less than 1 event per 203000 years, equivalent to a significance greater than 5.1σ.
I. INTRODUCTION
On September 14, 2015 at 09:50:45 UTC the LIGO Hanford, WA, and Livingston, LA,
observatories detected a signal from the binary black hole merger GW150914 [1]. The initial
detection of the event was made by low-latency searches for generic gravitational-wave
transients [2]. We report the results of a matched-filter search using relativistic models of
compact binary coalescence waveforms that recovered GW150914 as the most significant
event during the coincident observations between the two LIGO detectors from September
12 to October 20, 2015. This is a subset of the data from Advanced LIGO’s first
observational period that ended on January 12, 2016.
The binary coalescence search targets gravitational-wave emission from compact-object
binaries with individual masses from 1 M⊙ to 99 M⊙, total mass less than 100 M⊙, and
dimensionless spins up to 0.99. The search was performed using two independently
implemented analyses, referred to as PyCBC [3–5] and GstLAL [6–8]. These analyses use a
common set of template waveforms [9, 10, 86], but differ in their implementations of
matched filtering [12, 13], their use of detector data-quality information [14], the techniques
used to mitigate the effect of non-Gaussian noise transients in the detector [6, 15], and the
methods for estimating the noise background of the search [4, 16].
GW150914 was observed in both LIGO detectors [17] within the 10 ms inter-site
propagation time, with a combined matched-filter signal-to-noise ratio (SNR) of 24. The
search reported a false alarm rate estimated to be less than 1 event per 203000 years,
equivalent to a significance greater than 5.1σ. The basic features of the GW150914 signal
point to it being produced by the coalescence of two black holes [1]. The best-fit template
parameters from the search are consistent with detailed parameter estimation that identifies
GW150914 as a near-equal mass black hole binary system with source-frame masses
$36^{+5}_{-4}\,M_\odot$ and $29^{+4}_{-4}\,M_\odot$ at the 90% credible level [18].
† Deceased, May 2015.
‡ Deceased, March 2015.
NASA Public Access
Author manuscript
Phys Rev D. Author manuscript; available in PMC 2020 August 17.
Published in final edited form as:
Phys Rev D. 2016 June 15; 93(12): . doi:10.1103/PhysRevD.93.122003.
The second most significant candidate event in the observation period (referred to as
LVT151012) was reported on October 12, 2015 at 09:54:43 UTC with a combined
matched-filter SNR of 9.6. The search reported a false alarm rate of 1 per 2.3 years and a
corresponding false alarm probability of 0.02 for this candidate event. Detector
characterization studies have not identified an instrumental or environmental artifact as
causing this candidate event [14]. However, its false alarm probability is not sufficiently low
to confidently claim this candidate event as a signal. Detailed waveform analysis of this
candidate event indicates that it is also a binary black hole merger with source-frame masses
$23^{+18}_{-5}\,M_\odot$ and $13^{+4}_{-5}\,M_\odot$, if it is of astrophysical origin.
This paper is organized as follows: Sec. II gives an overview of the compact binary
coalescence search and the methods used. Sec. III and Sec. IV describe the construction and
tuning of the two independently implemented analyses used in the search. Sec. V presents
the results of the search, and follow-up of the two most significant candidate events,
GW150914 and LVT151012.
II. SEARCH DESCRIPTION
The binary coalescence search [19–26] reported here targets gravitational waves from binary
neutron stars, binary black holes, and neutron star–black hole binaries, using matched
filtering [27] with waveforms predicted by general relativity. Both the PyCBC and GstLAL
analyses correlate the detector data with template waveforms that model the expected signal.
The analyses identify candidate events that are detected at both observatories consistent with
the 10 ms inter-site propagation time. Events are assigned a detection-statistic value that
ranks their likelihood of being a gravitational-wave signal. This detection statistic is
compared to the estimated detector noise background to determine the probability that a
candidate event is due to detector noise.
We report on a search using coincident observations between the two Advanced LIGO
detectors [28] in Hanford, WA (H1) and in Livingston, LA (L1) from September 12 to
October 20, 2015. During these 38.6 days, the detectors were in coincident operation for a
total of 18.4 days. Unstable instrumental operation and hardware failures affected 20.7 hours
of these coincident observations. These data are discarded and the remaining 17.5 days are
used as input to the analyses [14]. The analyses reduce this time further by imposing a
minimum length over which the detectors must be operating stably; this minimum differs
between the two analyses, as described in Sec. III and Sec. IV. After applying this cut, the
PyCBC analysis searched 16 days of coincident data and the GstLAL analysis searched 17 days
of coincident data. To prevent bias in the results, the configuration and tuning of the analyses
were determined using data taken prior to September 12, 2015.
A gravitational-wave signal incident on an interferometer alters its arm lengths by $\delta L_x$
and $\delta L_y$, such that their measured difference is
$\Delta L(t) = \delta L_x - \delta L_y = h(t) L$, where $h(t)$ is the
gravitational-wave metric perturbation projected onto the detector, and $L$ is the unperturbed
arm length [29]. The strain is calibrated by measuring the detector’s response to test mass
motion induced by photon pressure from a modulated calibration laser beam [30]. Changes
in the detector’s thermal and alignment state cause small, time-dependent systematic errors
in the calibration [30]. The calibration used for this search does not include these
time-dependent factors. Appendix A demonstrates that neglecting the time-dependent calibration
factors does not affect the result of this search.
The gravitational waveform $h(t)$ depends on the chirp mass of the binary,
$\mathcal{M} = (m_1 m_2)^{3/5} / (m_1 + m_2)^{1/5}$ [31, 32], the symmetric mass ratio
$\eta = (m_1 m_2)/(m_1 + m_2)^2$ [33], and the angular momentum of the compact objects
$\chi_{1,2} = c S_{1,2} / (G m_{1,2}^2)$ [34, 35] (the compact object’s dimensionless spin),
where $S_{1,2}$ is the angular momentum of the compact objects.
The effect of spin on the waveform depends also on the ratio between the component
objects’ masses. Parameters which affect the overall amplitude and phase of the signal as
observed in the detector are maximized over in the matched-filter search, but can be
recovered through full parameter estimation analysis [18]. The search parameter space is
therefore defined by the limits placed on the compact objects’ masses and spins. The
minimum component masses of the search are determined by the lowest expected neutron
star mass, which we assume to be 1 M⊙ [36]. There is no known maximum black hole mass
[37], but we limit this search to binaries with a total mass $M = m_1 + m_2 \le 100\,M_\odot$.
The LIGO detectors are nevertheless sensitive to higher-mass binaries; the results of
searches for binaries that lie outside this search space will be reported in future publications.
For binary component objects with masses less than 2 M⊙, we limit the magnitude of the
component object’s spin to 0.05, the spin of the fastest known pulsar in a double neutron star
system [38]. At current detector sensitivity, this is sufficient to detect gravitational-wave
signals from mergers of binaries with neutron star components having spins up to 0.4, the
spin of the fastest-spinning millisecond pulsar [39]. Observations of X-ray binaries indicate
that astrophysical black holes may have near extremal spins [40]. For binary components
with masses larger than 2 M⊙, we limit the spin magnitude to less than 0.9895. This is set by
our ability to generate valid template waveforms at higher spins [9]. Figure 1 shows the
boundaries of the search parameter space in the component-mass plane.
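The mass parameters and search limits described above can be illustrated with a minimal Python sketch. The function names (`chirp_mass`, `symmetric_mass_ratio`, `in_search_space`) are hypothetical and not taken from either analysis, masses are in solar masses, and treating exactly 2 M⊙ as a black-hole-like component is an assumption, since the text only specifies "less than" and "larger than" 2 M⊙.

```python
def chirp_mass(m1, m2):
    """Chirp mass (m1*m2)^(3/5) / (m1+m2)^(1/5), same units as the inputs."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def symmetric_mass_ratio(m1, m2):
    """Symmetric mass ratio eta = m1*m2 / (m1+m2)^2; at most 0.25."""
    return (m1 * m2) / (m1 + m2) ** 2


def in_search_space(m1, m2, chi1, chi2):
    """Check the mass and spin limits quoted in the text (masses in solar masses)."""
    for m, chi in ((m1, chi1), (m2, chi2)):
        if m < 1.0:                      # minimum component mass of 1 solar mass
            return False
        # Assumption: m == 2 exactly is handled with the higher spin limit.
        limit = 0.05 if m < 2.0 else 0.9895
        if abs(chi) > limit:
            return False
    return m1 + m2 <= 100.0              # total mass limit


# Sample inputs only: the median source-frame masses quoted for GW150914.
print(chirp_mass(36.0, 29.0), symmetric_mass_ratio(36.0, 29.0))
print(in_search_space(36.0, 29.0, 0.0, 0.0))
```

For an equal-mass binary with total mass 4 M⊙ the chirp mass evaluates to about 1.74 M⊙, and the symmetric mass ratio reaches its maximum of 0.25 for any equal-mass system.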
Since the parameters of signals are not known in advance, each detector’s output is filtered
against a discrete bank of templates that span the search target space [20, 41–44]. The
placement of templates depends on the shape of the power spectrum of the detector noise.
Both analyses use a low-frequency cutoff of 30 Hz for the search. The average noise power
spectral density of the LIGO detectors was measured over the period September 12 to
September 26, 2015. The harmonic mean of these noise spectra from the two detectors was
used to place a single template bank that was used for the duration of the search [4, 45]. The
templates are placed using a combination of geometric and stochastic methods [7, 46, 47, 86]
such that the loss in matched-filter SNR caused by the discrete nature of the bank is ≲ 3%.
Approximately 250,000 template waveforms are used to cover this parameter space, as
shown in Fig. 1. The performance of the template bank is tested numerically by simulating
binary black hole waveforms and determining the fraction of the total possible
matched-filter SNR recovered for each simulated signal (the fitting factor) [48]. Figure 2 shows the
resulting distribution of fitting factors obtained over the observation period. The loss in
matched-filter SNR is less than 3% for more than 99% of the $10^5$ simulated signals.
The template bank assumes that the spins of the two compact objects are aligned with the
orbital angular momentum. The resulting templates can nonetheless effectively recover
systems with misaligned spins in the parameter-space region of GW150914. Figure 3 shows
the effective fitting factor for simulated signals from a population of simulated precessing
binary black holes that are uniform in co-moving volume [49, 50]. The effective fitting
factor weights the fraction of the matched-filter SNR recovered by the amplitude of the
signal [51]. A signal that has a low fitting factor may also have a poor orientation. When its
strain is projected onto the detector, the amplitude of the signal may be too small to detect
even if there were no mismatch between the signal and the template; the weighting in the
effective fitting factor accounts for this. The effective fitting factor is lowest at high mass
ratios and low total mass, where the effects of precession are more pronounced. In the region
close to the parameters of GW150914 the aligned-spin template bank is sensitive to a large
fraction of precessing signals [50].
In addition to possible gravitational-wave signals, the detector strain contains a stationary
noise background that primarily arises from photon shot noise at high frequencies and
seismic noise at low frequencies. In the mid-frequency range, detector commissioning has
not yet reached the point where test mass thermal noise dominates, and the noise at mid
frequencies is poorly understood [14, 17, 52]. The detector strain data also exhibits
non-stationarity and non-Gaussian noise transients that arise from a variety of instrumental or
environmental mechanisms. The measured strain $s(t)$ is the sum of possible
gravitational-wave signals $h(t)$ and the different types of detector noise $n(t)$.
To monitor environmental disturbances and their influence on the detectors, each
observatory site is equipped with an array of sensors [53]. Auxiliary instrumental channels
also record the interferometer’s operating point and the state of the detector’s control
systems. Many noise transients have distinct signatures, visible in environmental or auxiliary
data channels that are not sensitive to gravitational waves. When a noise source with known
physical coupling between these channels and the detector strain data is active, a
data-quality veto is created that is used to exclude these data from the search [14]. In the GstLAL
analysis, time intervals flagged by data-quality vetoes are removed prior to the filtering. In
the PyCBC analysis, these data-quality vetoes are applied after filtering. A total of 2 hours is
removed from the analysis by data-quality vetoes. Despite these detector characterization
investigations, the data still contains non-stationary and non-Gaussian noise which can affect
the astrophysical sensitivity of the search. Both analyses implement methods to identify
loud, short-duration noise transients and remove them from the strain data before filtering.
The PyCBC and GstLAL analyses calculate the matched-filter SNR for each template and
each detector’s data [12, 54]. In the PyCBC analysis, sources with total mass less than 4 M⊙
are modeled by computing the inspiral waveform accurate to third-and-a-half
post-Newtonian order [33, 55, 56]. To model systems with total mass larger than 4 M⊙, we use
templates based on the effective-one-body (EOB) formalism [57], which combines results
from the post-Newtonian approach [33, 56] with results from black hole perturbation theory
and numerical relativity [9, 58] to model the complete inspiral, merger and ringdown
waveform. The waveform models used assume that the spins of the merging objects are
aligned with the orbital angular momentum. The GstLAL analysis uses the same waveform
families, but the boundary between post-Newtonian and EOB models is set at
$\mathcal{M} = 1.74\,M_\odot$.
Both analyses identify maxima of the matched-filter SNR (triggers) over the signal time of
arrival.
To suppress large SNR values caused by non-Gaussian detector noise, the two analyses
calculate additional tests to quantify the agreement between the data and the template. The
PyCBC analysis calculates a chi-squared statistic to test whether the data in several different
frequency bands are consistent with the matching template [15]. The value of the
chi-squared statistic is used to compute a re-weighted SNR for each maximum. The GstLAL
analysis computes a goodness-of-fit between the measured and expected SNR time series for
each trigger. The matched-filter SNR and goodness-of-fit values for each trigger are used as
parameters in the GstLAL ranking statistic.
Both analyses enforce coincidence between detectors by selecting trigger pairs that occur
within a 15 ms window and come from the same template. The 15 ms window is determined
by the 10 ms inter-site propagation time plus 5 ms for uncertainty in arrival time of weak
signals. The PyCBC analysis discards any triggers that occur during the time of data-quality
vetoes prior to computing coincidence. The remaining coincident events are ranked based on
the quadrature sum of the re-weighted SNR from both detectors [4]. The GstLAL analysis
ranks coincident events using a likelihood ratio that quantifies the probability that a
particular set of coincident trigger parameters is due to a signal versus the probability of
obtaining the same set of parameters from noise [6].
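The coincidence test described above can be sketched in a few lines of Python. This is a simplified illustration, not the code of either analysis: the function name `find_coincidences` and the tuple layout `(time_s, template_id, snr)` are hypothetical, and treating the 15 ms limit as inclusive is an assumption.

```python
def find_coincidences(h1_triggers, l1_triggers, window=0.015):
    """Pair triggers from the same template whose times differ by <= window.

    Each trigger is a (time_s, template_id, snr) tuple (hypothetical layout).
    """
    # Index the L1 triggers by template so only matching templates are compared.
    l1_by_template = {}
    for t, template_id, snr in l1_triggers:
        l1_by_template.setdefault(template_id, []).append((t, snr))

    pairs = []
    for t_h, template_id, snr_h in h1_triggers:
        for t_l, snr_l in l1_by_template.get(template_id, []):
            if abs(t_h - t_l) <= window:   # 10 ms travel time + 5 ms timing error
                pairs.append((template_id, t_h, t_l, snr_h, snr_l))
    return pairs


h1 = [(100.000, 7, 8.0), (200.000, 3, 6.0), (300.000, 5, 6.5)]
l1 = [(100.007, 7, 7.0), (200.004, 4, 6.2), (300.020, 5, 6.1)]
print(find_coincidences(h1, l1))
```

In this toy input only the first pair survives: the second pair comes from different templates and the third is separated by 20 ms.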
The significance of a candidate event is determined by the search background. This is the
rate at which detector noise produces events with a detection-statistic value equal to or
higher than the candidate event (the false alarm rate). Estimating this background is
challenging for two reasons: the detector noise is non-stationary and non-Gaussian, so its
properties must be empirically determined; and it is not possible to shield the detector from
gravitational waves to directly measure a signal-free background. The specific procedure
used to estimate the background is different for the two analyses.
To measure the significance of candidate events, the PyCBC analysis artificially shifts the
timestamps of one detector’s triggers by an offset that is large compared to the inter-site
propagation time, and a new set of coincident events is produced based on this time-shifted
data set. For instrumental noise that is uncorrelated between detectors this is an effective
way to estimate the background. To account for the search background noise varying across
the target signal space, candidate and background events are divided into three search classes
based on template length. To account for having searched multiple classes, the measured
significance is decreased by a trials factor equal to the number of classes [59].
The GstLAL analysis measures the noise background using the distribution of triggers that
are not coincident in time. To account for the search background noise varying across the
target signal space, the analysis divides the template bank into 248 bins. Signals are assumed
to be equally likely across all bins and it is assumed that noise triggers are equally likely to
produce a given SNR and goodness-of-fit value in any of the templates within a single bin.
The estimated probability density function for the likelihood statistic is marginalized over
the template bins and used to compute the probability of obtaining a noise event with a
likelihood value larger than that of a candidate event.
The results of the independent analyses are two separate lists of candidate events, with each
candidate event assigned a false alarm probability and false alarm rate. These quantities are
used to determine if a gravitational-wave signal is present in the search. Simulated signals
are added to the input strain data to validate the analyses, as described in Appendix B.
III. PYCBC ANALYSIS
The PyCBC analysis [3–5] uses fundamentally the same methods [12, 15, 60–67] as those
used to search for gravitational waves from compact binaries in the initial LIGO and Virgo
detector era [68–79], with the improvements described in Refs. [3, 4]. In this Section, we
describe the configuration and tuning of the PyCBC analysis used in this search. To prevent
bias in the search result, the configuration of the analysis was determined using data taken
prior to the observation period searched. When GW150914 was discovered by the
low-latency transient searches [1], all tuning of the PyCBC analysis was frozen to ensure that the
reported false alarm probabilities are unbiased. No information from the low-latency
transient search is used in this analysis.
Of the 17.5 days of data that are used as input to the analysis, the PyCBC analysis discards
times for which either of the LIGO detectors is in its observation state for less than 2064 s;
shorter intervals are considered to be unstable detector operation by this analysis and are
removed from the observation time. After discarding time removed by data-quality vetoes
and periods when detector operation is considered unstable, the observation time remaining
is 16 days.
For each template $h(t)$ and for the strain data from a single detector $s(t)$, the analysis
calculates the square of the matched-filter SNR defined by [12]
$$\rho^2(t) \equiv \frac{1}{\langle h | h \rangle} \left| \langle s | h \rangle (t) \right|^2, \tag{1}$$
where the correlation is defined by
$$\langle s | h \rangle (t) = 4 \int_0^\infty \frac{\tilde{s}(f)\, \tilde{h}^*(f)}{S_n(f)}\, e^{2 \pi i f t}\, df, \tag{2}$$
where $\tilde{s}(f)$ is the Fourier transform of the time domain quantity $s(t)$ given by
$$\tilde{s}(f) = \int_{-\infty}^{\infty} s(t)\, e^{-2 \pi i f t}\, dt. \tag{3}$$
The quantity $S_n(|f|)$ is the one-sided average power spectral density of the detector noise,
which is re-calculated every 2048 s (in contrast to the fixed spectrum used in template bank
construction). Calculation of the matched-filter SNR in the frequency domain allows the use
of the computationally efficient Fast Fourier Transform [80, 81]. The square of the
matched-filter SNR in Eq. (1) is normalized by
$$\langle h | h \rangle = 4 \int_0^\infty \frac{\tilde{h}(f)\, \tilde{h}^*(f)}{S_n(f)}\, df, \tag{4}$$
so that its mean value is 2, if $s(t)$ contains only stationary noise [82].
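To make the structure of Eqs. (1)–(4) concrete, the following is a deliberately simplified, real-valued toy in Python. It assumes unit-variance white noise, for which the noise-weighted inner product reduces (by Parseval's theorem, up to a constant) to a plain time-domain correlation. It omits the complex phase maximization implied by the modulus in Eq. (1) and does not use an FFT, so its normalization differs from the one quoted above. The names `matched_filter_white` and `toy_chirp` are hypothetical.

```python
import math
import random


def matched_filter_white(data, template):
    """Toy SNR time series: correlate data with a template, normalised by |h|.

    Assumes unit-variance white noise, so no PSD weighting is needed.
    """
    norm = math.sqrt(sum(h * h for h in template))
    n_out = len(data) - len(template) + 1
    return [
        sum(data[k + j] * h for j, h in enumerate(template)) / norm
        for k in range(n_out)
    ]


def toy_chirp(n=64, f0=0.05, fdot=0.002):
    """A short sinusoid with linearly increasing frequency (cycles per sample)."""
    return [math.sin(2 * math.pi * (f0 * j + 0.5 * fdot * j * j)) for j in range(n)]


random.seed(0)
template = toy_chirp()
offset, amplitude = 40, 3.0

# Simulated data: white Gaussian noise plus a scaled copy of the template.
data = [random.gauss(0.0, 1.0) for _ in range(256)]
for j, h in enumerate(template):
    data[offset + j] += amplitude * h

snr = matched_filter_white(data, template)
peak = max(range(len(snr)), key=lambda k: abs(snr[k]))
print("peak sample:", peak, "peak SNR:", round(snr[peak], 1))
```

In the absence of noise the output peaks exactly at the injection offset with value `amplitude * norm`; with noise added, the peak location and height fluctuate around those values.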
Non-Gaussian noise transients in the detector can produce extended periods of elevated
matched-filter SNR that increase the search background [4]. To mitigate this, a
time-frequency excess power (burst) search [83] is used to identify high-amplitude, short-duration
transients that are not flagged by data-quality vetoes. If the burst search generates a trigger
with a burst SNR exceeding 300, the PyCBC analysis vetoes these data by zeroing out 0.5 s
of $s(t)$ centered on the time of the trigger. The data is smoothly rolled off using a Tukey
window during the 0.25 s before and after the vetoed data. The threshold of 300 is chosen to
be significantly higher than the burst SNR obtained from plausible binary signals. For
comparison, the burst SNR of GW150914 in the excess power search is ~10. A total of 450
burst-transient vetoes are produced in the two detectors, resulting in 225 s of data removed
from the search. A time-frequency spectrogram of the data at the time of each burst-transient
veto was inspected to ensure that none of these windows contained the signature of an
extremely loud binary coalescence.
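The burst-transient veto can be pictured as multiplying the strain by a window that is zero for 0.5 s around the trigger and rises smoothly back to one over 0.25 s on either side. The Python sketch below is illustrative only: the function names `gate_window` and `apply_gate` are hypothetical, and the half-cosine taper is an assumed form of the Tukey roll-off, whose exact shape is not specified in the text.

```python
import math


def gate_window(t, t_trigger, zero_half_width=0.25, taper=0.25):
    """Window value at time t: 0 inside the vetoed 0.5 s, 1 far from the trigger."""
    dt = abs(t - t_trigger)
    if dt <= zero_half_width:
        return 0.0                       # zeroed 0.5 s centred on the trigger
    if dt >= zero_half_width + taper:
        return 1.0                       # data left untouched
    # Assumed half-cosine (Tukey-style) roll-off over the 0.25 s taper.
    return 0.5 * (1.0 - math.cos(math.pi * (dt - zero_half_width) / taper))


def apply_gate(strain, sample_rate, t_trigger):
    """Return a gated copy of the strain samples (times measured from sample 0)."""
    return [x * gate_window(i / sample_rate, t_trigger) for i, x in enumerate(strain)]


gated = apply_gate([1.0] * 4096, sample_rate=2048, t_trigger=1.0)
print(gated[0], gated[2048], gated.count(0.0) / 2048)
```

The quoted total is consistent with the per-veto duration: 450 vetoes of 0.5 s each remove 225 s of data.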
The analysis places a threshold of 5.5 on the single-detector matched-filter SNR and
identifies maxima of $\rho(t)$ with respect to the time of arrival of the signal. For each maximum
we calculate a chi-squared statistic to determine whether the data in several different
frequency bands are consistent with the matching template [15]. Given a specific number of
frequency bands $p$, the value of the reduced $\chi_r^2$ is given by
$$\chi_r^2 = \frac{p}{2p - 2}\, \frac{1}{\langle h | h \rangle} \sum_{i=1}^{p} \left| \langle s | h_i \rangle - \frac{\langle s | h \rangle}{p} \right|^2, \tag{5}$$
where $h_i$ is the sub-template corresponding to the $i$-th frequency band. Values of $\chi_r^2$ near
unity indicate that the signal is consistent with a coalescence. To suppress triggers from
noise transients with large matched-filter SNR, $\rho(t)$ is re-weighted by [62, 77]
$$\hat{\rho} = \begin{cases} \rho \Big/ \left[ \left( 1 + (\chi_r^2)^3 \right) / 2 \right]^{1/6}, & \text{if } \chi_r^2 > 1, \\ \rho, & \text{if } \chi_r^2 \le 1. \end{cases} \tag{6}$$
Triggers that have a re-weighted SNR $\hat{\rho} < 5$ or that occur during times subject to data-quality
vetoes are discarded.
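Eq. (6) and the threshold on the re-weighted SNR translate directly into code. The following Python sketch uses hypothetical function names (`reweighted_snr`, `keep_trigger`) and leaves out the data-quality veto check, which would need a list of vetoed times.

```python
def reweighted_snr(rho, chisq_r):
    """Re-weighted SNR of Eq. (6) from the matched-filter SNR and reduced chi-squared."""
    if chisq_r > 1.0:
        return rho / ((1.0 + chisq_r ** 3) / 2.0) ** (1.0 / 6.0)
    return rho


def keep_trigger(rho, chisq_r, threshold=5.0):
    """Keep a trigger unless its re-weighted SNR falls below the threshold of 5."""
    return reweighted_snr(rho, chisq_r) >= threshold


# A well-fit trigger keeps its SNR; a poorly fit one is down-weighted.
print(reweighted_snr(10.0, 0.9), reweighted_snr(10.0, 2.0))
print(keep_trigger(6.0, 3.0))
```

The two branches agree at $\chi_r^2 = 1$, where the denominator equals one, so the re-weighting is continuous across the boundary.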
The template waveforms span a wide region of time-frequency parameter space and the
susceptibility of the analysis to a particular type of noise transient can vary across the search
space. This is demonstrated in Fig. 4 which shows the cumulative number of noise triggers
as a function of re-weighted SNR for Advanced LIGO engineering run data taken between
September 2 and September 9, 2015. The response of the template bank to noise transients is
well characterized by the gravitational-wave frequency at the template’s peak amplitude,
$f_{\rm peak}$. Waveforms with a lower peak frequency have fewer cycles in the detector’s most
sensitive frequency band from 30–2000 Hz [17, 52], and so are less easily distinguished
from noise transients by the re-weighted SNR.
The number of bins in the $\chi^2$ test is a tunable parameter in the analysis [4]. Previous
searches used a fixed number of bins [84] with the most recent Initial LIGO and Virgo
searches using $p = 16$ bins for all templates [77, 78]. Investigations on data from LIGO’s
sixth science run [78, 85] showed that better noise rejection is achieved with a
template-dependent number of bins. The left two panels of Fig. 4 show the cumulative number of
noise triggers with $p = 16$ bins used in the $\chi^2$ test. Empirically, we find that choosing the
number of bins according to
$$p = 0.4 \left( f_{\rm peak} / {\rm Hz} \right)^{2/3} \tag{7}$$
gives better suppression of noise transients in Advanced LIGO data, as shown in the right
panels of Fig. 4.
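As a quick illustration of how Eq. (7) scales, the Python sketch below evaluates the bin count for a few peak frequencies. Since the number of bins must be an integer, the result is rounded down here; that rounding, like the function name `num_chisq_bins`, is an assumption of this sketch rather than something stated in the text.

```python
import math


def num_chisq_bins(f_peak_hz):
    """Number of chi-squared bins from Eq. (7), rounded down (assumed rounding)."""
    return int(math.floor(0.4 * f_peak_hz ** (2.0 / 3.0)))


for f_peak in (100.0, 220.0, 260.0):
    print(f_peak, num_chisq_bins(f_peak))
```

Under this rounding, templates with peak frequencies near 260 Hz get the 16 bins used in earlier searches, while lower-frequency templates get fewer.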
The PyCBC analysis enforces signal coincidence between detectors by selecting trigger
pairs that occur within a 15 ms window and come from the same template. We rank
coincident events based on the quadrature sum $\hat{\rho}_c$ of the $\hat{\rho}$ from both detectors [4]. The final
step of the analysis is to cluster the coincident events, by selecting those with the largest
value of $\hat{\rho}_c$ in each time window of 10 s. Any other events in the same time window are
discarded. This ensures that a loud signal or transient noise artifact gives rise to at most one
candidate event [4].
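The ranking and clustering steps can be sketched as follows in Python. This is a simplified illustration with hypothetical names (`combined_snr`, `cluster`); in particular, it uses fixed, non-overlapping 10 s windows aligned to t = 0, which is an assumption, since the text does not say how the windows are positioned.

```python
import math


def combined_snr(rho_hat_h1, rho_hat_l1):
    """Quadrature sum of the single-detector re-weighted SNRs."""
    return math.hypot(rho_hat_h1, rho_hat_l1)


def cluster(events, window=10.0):
    """Keep only the loudest (time_s, rho_c) event in each fixed time window."""
    best = {}
    for t, rho_c in events:
        key = int(t // window)           # index of the window containing t
        if key not in best or rho_c > best[key][1]:
            best[key] = (t, rho_c)
    return [best[k] for k in sorted(best)]


events = [(1.0, 8.0), (4.0, 9.5), (12.0, 8.2), (19.9, 8.1)]
print(combined_snr(3.0, 4.0))
print(cluster(events))
```

Here the four events collapse to two, one per window, so a single loud transient that triggers many templates yields a single candidate.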
The significance of a candidate event is determined by the rate at which detector noise
produces events with a detection-statistic value equal to or higher than that of the candidate
event. To measure this, the analysis creates a “background data set” by artificially shifting
the timestamps of one detector’s triggers by many multiples of 0.1 s and computing a new
set of coincident events. Since the time offset used is always larger than the time-
coincidence window, coincident signals do not contribute to this background. Under the
assumption that noise is not correlated between the detectors [14], this method provides an
unbiased estimate of the noise background of the analysis.
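A minimal Python sketch of the time-shift idea follows. It is a toy under stated assumptions, not the PyCBC implementation: trigger lists hold only arrival times (the same-template requirement and data gaps are ignored), shifts wrap around circularly at the end of the data segment, and the names `count_coincidences` and `time_slide_background` are hypothetical.

```python
def count_coincidences(times_a, times_b, window=0.015):
    """Count pairs (a, b) with |a - b| <= window using a sorted sweep."""
    a, b = sorted(times_a), sorted(times_b)
    count, j = 0, 0
    for t in a:
        # Skip entries of b that are too early to match t or any later time.
        while j < len(b) and b[j] < t - window:
            j += 1
        k = j
        while k < len(b) and b[k] <= t + window:
            count += 1
            k += 1
    return count


def time_slide_background(times_h1, times_l1, n_slides, duration,
                          step=0.1, window=0.015):
    """Coincidence counts for shifts of step, 2*step, ... applied to one detector."""
    counts = []
    for n in range(1, n_slides + 1):
        # Assumption: shifted times wrap around at the end of the segment.
        shifted = [(t + n * step) % duration for t in times_l1]
        counts.append(count_coincidences(times_h1, shifted, window))
    return counts


h1_times = [10.0, 50.0]
l1_times = [9.8, 49.995]
print(count_coincidences(h1_times, l1_times))               # zero-lag coincidences
print(time_slide_background(h1_times, l1_times, 3, 100.0))  # background per slide
```

Because every shift exceeds the 15 ms coincidence window, a true signal that is coincident at zero lag cannot remain coincident with itself in any shifted data set.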
To account for the noise background varying across the target signal space, candidate and
background events are divided into different search classes based on template length. Based
on empirical tuning using Advanced LIGO engineering run data taken between September 2
and September 9, 2015, we divide the template space into three classes according to:
(i) $\mathcal{M} < 1.74\,M_\odot$; (ii) $\mathcal{M} \ge 1.74\,M_\odot$ and $f_{\rm peak} \ge 220$ Hz;
(iii) $\mathcal{M} \ge 1.74\,M_\odot$ and $f_{\rm peak} < 220$ Hz. The
significance of candidate events is measured against the background from the same class.
For each candidate event, we compute the false alarm probability $\mathcal{F}$. This is the probability
of finding one or more noise background events in the observation time with a
detection-statistic value above that of the candidate event, given by [4, 86]
$$\mathcal{F}(\hat{\rho}_c) \equiv P(\ge 1 \text{ noise event above } \hat{\rho}_c \mid T, T_b) = 1 - \exp\left[ -T\, \frac{1 + n_b(\hat{\rho}_c)}{T_b} \right], \tag{8}$$
where $T$ is the observation time of the search, $T_b$ is the background time, and $n_b(\hat{\rho}_c)$ is the
number of noise background triggers above the candidate event’s re-weighted SNR $\hat{\rho}_c$.
Eq. (8) is derived assuming Poisson statistics for the counts of time-shifted background
events, and for the count of coincident noise events in the search [4, 86]. This assumption
requires that different time-shifted analyses (i.e. with different relative shifts between
detectors) give independent realizations of a counting experiment for noise background
events. We expect different time shifts to yield independent event counts since the 0.1 s
offset time is greater than the 10 ms gravitational-wave travel time between the sites plus the
~1 ms autocorrelation length of the templates. To test the independence of event counts over
different time shifts over this observation period, we compute the differences in the number
of background events having $\hat{\rho}_c > 9$ between consecutive time shifts. Figure 5 shows that the
measured differences on these data follow the expected distribution for the difference
between two independent Poisson random variables [87], confirming the independence of
time-shifted event counts.
If a candidate event’s detection-statistic value is larger than that of any noise background
event, as is the case for GW150914, then the PyCBC analysis places an upper bound on the
candidate’s false alarm probability. After discarding time removed by data-quality vetoes
and periods when the detector is in stable operation for less than 2064 seconds, the total
observation time remaining is $T = 16$ days. Repeating the time-shift procedure ~$10^7$ times
on these data produces a noise background analysis time equivalent to $T_b = 608000$ years.
Thus, the smallest false alarm probability that can be estimated in this analysis is
approximately $\mathcal{F} = 7 \times 10^{-8}$. Since we treat the search parameter space as 3 independent
classes, each of which may generate a false positive result, this value should be multiplied
by a trials factor or look-elsewhere effect [59] of 3, resulting in a minimum measurable false
alarm probability of $\mathcal{F} = 2 \times 10^{-7}$. The results of the PyCBC analysis are described in Sec. V.
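The numbers in this paragraph can be reproduced from Eq. (8) with a few lines of Python. The sketch below is an arithmetic check, not analysis code: the function names are hypothetical, a year is taken as 365.25 days, and the conversion of a probability to a Gaussian "sigma" uses a one-sided tail, which is an assumed convention.

```python
import math
from statistics import NormalDist

DAYS_PER_YEAR = 365.25  # assumption: Julian year


def false_alarm_probability(n_b, t_obs, t_bkg):
    """Eq. (8): 1 - exp(-T (1 + n_b) / T_b), with T and T_b in the same units."""
    # expm1 keeps precision when the exponent is very small.
    return -math.expm1(-t_obs * (1.0 + n_b) / t_bkg)


def one_sided_sigma(p):
    """Gaussian-equivalent significance for a one-sided tail probability p."""
    return -NormalDist().inv_cdf(p)


t_obs = 16.0 / DAYS_PER_YEAR   # observation time in years
t_bkg = 608000.0               # background time in years

p_single = false_alarm_probability(0, t_obs, t_bkg)  # no louder background event
p_trials = 3 * p_single                              # trials factor of 3
far_bound_years = t_bkg / 3                          # bound on years per false alarm

print(f"{p_single:.1e} {p_trials:.1e} {far_bound_years:.0f} "
      f"{one_sided_sigma(p_trials):.2f}")
```

With no background event louder than the candidate, the single-class probability is roughly 7 × 10⁻⁸, the trials factor raises it to roughly 2 × 10⁻⁷, and dividing the background time by the same factor gives the quoted bound of about one false alarm per 203000 years.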
IV. GSTLAL ANALYSIS
The GstLAL [88] analysis implements a time-domain matched filter search [6] using
techniques that were developed to perform the near real-time compact-object binary searches
[7, 8]. To accomplish this, the data $s(t)$ and templates $h(t)$ are each whitened in the frequency
domain by dividing them by an estimate of the power spectral density of the detector noise.
An estimate of the stationary noise amplitude spectrum is obtained with a combined
median–geometric-mean modification of Welch’s method [8]. This procedure is applied
piece-wise on overlapping Hann-windowed time-domain blocks that are subsequently
summed together to yield a continuous whitened time series $s_w(t)$. The time-domain
whitened template $h_w(t)$ is then convolved with the whitened data $s_w(t)$ to obtain the
matched-filter SNR time series $\rho(t)$ for each template. By the convolution theorem, $\rho(t)$
obtained in this manner is the same as the $\rho(t)$ obtained by frequency domain filtering in
Eq. (1).
Of the 17.5 days of data that are used as input to the analysis, the GstLAL analysis discards
times for which either of the LIGO detectors is in its observation state for less than 512 s
in duration. Shorter intervals are considered to be unstable detector operation by this
analysis and are removed from the observation time. After discarding time removed by
data-quality vetoes and periods when the detector operation is considered unstable, the
observation time remaining is 17 days. To remove loud, short-duration noise transients, any
excursions in the whitened data that are greater than 50σ are removed with 0.25 s padding.
The intervals of $s_w(t)$ vetoed in this way are replaced with zeros. The cleaned whitened data
is the input to the matched filtering stage.
Adjacent waveforms in the template bank are highly correlated. The GstLAL analysis takes
advantage of this to reduce the computational cost of the time-domain correlation. The
templates are grouped by chirp mass and spin into 248 bins of ~1000 templates each.
Within each bin, a reduced set of orthonormal basis functions $\hat{h}(t)$ is obtained via a
singular value decomposition of the whitened templates. We find that the ratio of the number
of orthonormal basis functions to the number of input waveforms is ~0.01–0.10, indicating
significant redundancy in each bin. The set of $\hat{h}(t)$ in each bin is convolved with the
whitened data; linear combinations of the resulting time series are then used to reconstruct
the matched-filter SNR time series for each template. This decomposition allows for
computationally efficient time-domain filtering and reproduces the frequency-domain
matched filter $\rho(t)$ to within 0.1% [6, 54, 89].
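The idea behind the basis reduction can be sketched on a toy bank. The chirp-like sinusoids below are hypothetical stand-ins for whitened templates, the truncation tolerance `tol` is an arbitrary choice, and only the zero-lag filter output is computed for brevity; because filtering is linear, the same reconstruction applies at every time lag.

```python
import numpy as np

rng = np.random.default_rng(1)
n_templates, n_samples = 50, 1024
t = np.linspace(0.0, 1.0, n_samples)

# Toy "bin" of highly correlated templates (hypothetical, not real waveforms).
f0 = np.linspace(30.0, 32.0, n_templates)
templates = np.array([np.sin(2 * np.pi * (f * t + 10.0 * t**2)) for f in f0])
templates /= np.linalg.norm(templates, axis=1, keepdims=True)

# SVD: the rows of vt are orthonormal basis functions spanning the bin.
u, s, vt = np.linalg.svd(templates, full_matrices=False)

# Keep the fewest basis functions that capture all but a fraction tol
# of the summed squared singular values.
tol = 1e-10
captured = np.cumsum(s**2) / np.sum(s**2)
n_basis = int(np.searchsorted(captured, 1.0 - tol) + 1)
basis = vt[:n_basis]
coeffs = u[:, :n_basis] * s[:n_basis]  # templates ~= coeffs @ basis

# Filter the data against the reduced basis only, then reconstruct
# each template's output as a linear combination.
data = rng.standard_normal(n_samples)
basis_output = basis @ data
reconstructed = coeffs @ basis_output
direct = templates @ data
```

The saving comes from filtering the data against `n_basis` functions instead of `n_templates`; the final linear combination is a small matrix product.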
Peaks in the matched-filter SNR for each detector and each template are identified over 1 s
windows. If the peak is above a matched-filter SNR of 4, it is recorded as a trigger. For each
trigger, the matched-filter SNR time series around the trigger is checked for consistency with
a signal by comparing the template’s autocorrelation function $R(t)$ to the matched-filter SNR
time series $\rho(t)$. The residual found after subtracting the autocorrelation function forms a
goodness-of-fit test,

$$\xi^2 = \frac{1}{\mu} \int_{t_p - \delta t}^{t_p + \delta t} dt \, \left| \rho(t_p) R(t) - \rho(t) \right|^2, \tag{9}$$

where $t_p$ is the time at the peak matched-filter SNR $\rho(t_p)$, and $\delta t$ is a tunable
parameter. A suitable value for $\delta t$ was found to be 85.45 ms (175 samples at a 2048 Hz
sampling rate). The quantity $\mu$ normalizes $\xi^2$ such that a well-fit signal has a mean
value of 1 in Gaussian noise [8]. The $\xi^2$ value is recorded with the trigger.
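A discrete version of Eq. (9) might look like the following sketch. It assumes the autocorrelation is sampled on lags measured from the peak and equals 1 at zero lag, and it takes the normalization $\mu$ as a caller-supplied number, since its form is given in Ref. [8] rather than here. The function name `xi_squared` and the toy autocorrelation are hypothetical.

```python
import numpy as np

def xi_squared(rho, autocorr, peak_index, half_width, dt, mu):
    """Riemann-sum approximation of Eq. (9) around the SNR peak.

    rho       : complex matched-filter SNR time series
    autocorr  : complex template autocorrelation on lags
                -half_width..+half_width, with value 1 at zero lag (assumed)
    mu        : normalization constant, supplied by the caller (assumed)
    """
    window = slice(peak_index - half_width, peak_index + half_width + 1)
    residual = rho[peak_index] * autocorr - rho[window]
    return float(np.sum(np.abs(residual) ** 2) * dt / mu)

fs = 2048
half_width = 175  # 85.45 ms at 2048 Hz, the value quoted in the text
lags = np.arange(-half_width, half_width + 1)
# Toy autocorrelation: Gaussian envelope times a complex phase, R(0) = 1.
R = np.exp(-(lags / 20.0) ** 2) * np.exp(2j * np.pi * 0.05 * lags)

# A perfectly matched "signal": the SNR series is a scaled copy of R.
peak = 2000
rho = np.zeros(4096, dtype=complex)
rho[peak - half_width: peak + half_width + 1] = 10.0 * R
xi2_clean = xi_squared(rho, R, peak, half_width, dt=1.0 / fs, mu=1.0)

# The same series with complex Gaussian noise added.
rng = np.random.default_rng(2)
noise = (rng.standard_normal(4096) + 1j * rng.standard_normal(4096)) / np.sqrt(2)
xi2_noisy = xi_squared(rho + noise, R, peak, half_width, dt=1.0 / fs, mu=1.0)
```

With `mu=1.0` the values are unnormalized; the point is only that a perfectly matched series leaves no residual, while noise or a mismatched transient leaves a positive one.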
Each trigger is checked for time coincidence with triggers from the same template in the
other detector. If two triggers occur from the same template within 15 ms in both detectors, a
coincident event is recorded. Coincident events are ranked according to a multidimensional
likelihood ratio $\mathcal{L}$ [16, 90], then clustered in a ±4 s time window. The likelihood
ratio ranks candidate events by the ratio of the probability of observing matched-filter SNR
and $\xi^2$ from signals (h) versus obtaining the same parameters from noise (n). Since the
orthonormal filter decomposition already groups templates into regions with high overlap, we
expect templates in each group to respond similarly to noise. We use the template group
$\theta_i$ as an additional parameter in the likelihood ratio to account for how different
regions of the compact binary parameter space are affected differently by noise processes.
The likelihood ratio is thus:

$$\mathcal{L} = \frac{p(x_H, x_L, D_H, D_L \mid \theta_i, \mathrm{h})}{p(x_H \mid \theta_i, \mathrm{n}) \, p(x_L \mid \theta_i, \mathrm{n})}, \tag{10}$$

where $x_d = \{\rho_d, \xi_d^2\}$ are the matched-filter SNR and $\xi^2$ in each detector, and
$D$ is a parameter that measures the distance sensitivity of the given detector during the
time of a trigger.
The numerator of the likelihood ratio is generated using an astrophysical model of signals
distributed isotropically in the nearby Universe to calculate the joint SNR distribution in the
two detectors [16]. The $\xi^2$ distribution for the signal hypothesis assumes that the signal
agrees to within ~90% of the template waveform and that the nearby noise is Gaussian. We
assume all $\theta_i$ are equally likely for signals.
The noise is assumed to be uncorrelated between detectors. The denominator of the
likelihood ratio therefore factors into the product of the distribution of noise triggers in each
detector, $p(x_d \mid \theta_i, \mathrm{n})$. We estimate these using a two-dimensional kernel
density estimation constructed from all of the single-detector triggers not found in
coincidence in a single bin.
The likelihood ratio $\mathcal{L}$ provides a ranking of events such that larger values of
$\mathcal{L}$ are associated with a higher probability of the data containing a signal. The
likelihood ratio itself is not the probability that an event is a signal, nor does it give the
probability that an event was caused by noise. Computing the probability that an event is a
signal requires additional prior assumptions. Instead, for each candidate event, we compute
the false alarm probability $\mathcal{F}$. This is the probability of finding one or more noise
background events with a likelihood-ratio value greater than or equal to that of the candidate
event. Assuming Poisson statistics for the background, this is given by:

$$\mathcal{F}(\mathcal{L}) \equiv P(\mathcal{L} \mid T, \mathrm{n}) = 1 - \exp\left[-\lambda(\mathcal{L} \mid T, \mathrm{n})\right]. \tag{11}$$

Instead of using time shifts, the GstLAL analysis estimates the Poisson rate of background
events $\lambda(\mathcal{L} \mid T, \mathrm{n})$ as:

$$\lambda(\mathcal{L} \mid T, \mathrm{n}) = M(T) \, P(\mathcal{L} \mid \mathrm{n}), \tag{12}$$

where $M(T)$ is the number of coincident events found above threshold in the analysis time
$T$, and $P(\mathcal{L} \mid \mathrm{n})$ is the probability of obtaining one or more events from
noise with a likelihood ratio $\geq \mathcal{L}$ (the survival function). We find this by
estimating the survival function in each template bin, then marginalizing over the bins; i.e.,
$P(\mathcal{L} \mid \mathrm{n}) = \sum_i P(\mathcal{L} \mid \theta_i, \mathrm{n}) \, P(\theta_i \mid \mathrm{n})$.
In a single bin, the survival function is

$$P(\mathcal{L} \mid \theta_i, \mathrm{n}) = 1 - \int_{S(\mathcal{L})} p'(x_H \mid \theta_i, \mathrm{n}) \, p'(x_L \mid \theta_i, \mathrm{n}) \, dx_H \, dx_L. \tag{13}$$

Here, $p'(x_d \mid \theta_i, \mathrm{n})$ are estimates of the distribution of triggers in each
detector including all of the single-detector triggers, whereas the estimate of
$p(x_d \mid \theta_i, \mathrm{n})$ includes only those triggers which were not coincident. This is
consistent with the assumption that the false alarm probability is computed assuming all
events are noise.
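Equations (11) and (12), together with the marginalization over template bins, amount to a short calculation once the per-bin survival functions are known. The sketch below uses made-up per-bin survival values and bin weights purely for illustration; only the formulas come from the text, and the function names are hypothetical.

```python
import math

def marginal_survival(survival_per_bin, bin_weights):
    """P(L|n) = sum_i P(L|theta_i, n) * P(theta_i|n)."""
    if not math.isclose(sum(bin_weights), 1.0, rel_tol=1e-9):
        raise ValueError("bin weights P(theta_i|n) must sum to 1")
    return sum(p * w for p, w in zip(survival_per_bin, bin_weights))

def false_alarm_probability(n_coincident, survival):
    """Eqs. (11)-(12): F = 1 - exp(-M(T) * P(L|n))."""
    rate = n_coincident * survival
    # expm1 keeps precision when the rate is very small.
    return -math.expm1(-rate)

# Illustrative (made-up) per-bin survival values and bin weights.
survival_per_bin = [1e-12, 5e-12, 2e-12]
bin_weights = [0.5, 0.3, 0.2]
P = marginal_survival(survival_per_bin, bin_weights)
F = false_alarm_probability(1e9, P)
```

For small rates $\mathcal{F} \approx \lambda$, so the false alarm probability is essentially the expected number of background events at or above the candidate's likelihood ratio.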
The integration region $S(\mathcal{L})$ is the volume of the four-dimensional space of $x_d$
for which the likelihood ratios are less than $\mathcal{L}$. We find this by Monte Carlo
integration of our estimates of the single-detector noise distributions
$p'(x_d \mid \theta_i, \mathrm{n})$. This is approximately equal to the number of coincidences
that can be formed from the single-detector triggers with likelihood ratios $\geq \mathcal{L}$
divided by the total number of possible coincidences. We therefore reach a minimum possible
estimate of the survival function, without extrapolation, at the $\mathcal{L}$ for which
$p'(x_H \mid \theta_i, \mathrm{n}) \, p'(x_L \mid \theta_i, \mathrm{n}) \sim 1/[N_H(\theta_i) N_L(\theta_i)]$,
where $N_d(\theta_i)$ is the total number of triggers in each detector in the $i$th bin.
GW150914 was more significant than any other combination of triggers. For that reason, we
are interested in knowing the minimum false alarm probability that can be computed by the
GstLAL analysis. All of the triggers in a template bin, regardless of the template from which
they came, are used to construct the single-detector probability density distributions $p'$
within that bin. The false alarm probability estimated by the GstLAL analysis is the
probability that noise triggers occur within a ±15 ms time window and occur in the same
template. Under the assumption that triggers are uniformly distributed over the bins, the
minimum possible false alarm probability that can be computed is
$M N_{\rm bins}/(N_H N_L)$, where $N_{\rm bins}$ is the number of bins used, $N_H$ is the total
number of triggers in H, and $N_L$ is the total number of triggers in L. For the present
analysis, $M \sim 1 \times 10^{9}$, $N_H \sim N_L \sim 1 \times 10^{11}$, and $N_{\rm bins}$ is 248,
yielding a minimum value of the false alarm probability of ~$10^{-11}$.
We cannot rule out the possibility that noise produced by the detectors violates the
assumption that it is uniformly distributed among the templates within a bin. If we consider
a more conservative noise hypothesis that does not assume that triggers are uniformly
distributed within a bin and instead considers each template as a separate $\theta_i$ bin, we
can evaluate the minimum upper bound on the false alarm probability of GW150914. This
assumption would produce a larger minimum false alarm probability value by approximately
the ratio of the number of templates to the present number of bins. Under this noise
hypothesis, the minimum value of the false alarm probability would be ~$10^{-8}$, which is
consistent with the minimum false alarm probability bound of the PyCBC analysis.
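The two order-of-magnitude estimates above follow from simple arithmetic on the quoted numbers. The sketch below reproduces them, taking ~1000 templates per bin (the figure given earlier for the bin size) as the ratio of templates to bins; the function name is hypothetical.

```python
def minimum_false_alarm_probability(n_coincident, n_bins, n_h, n_l):
    """Minimum computable false alarm probability, M * N_bins / (N_H * N_L)."""
    return n_coincident * n_bins / (n_h * n_l)

# Order-of-magnitude values quoted in the text.
M = 1e9
N_H = N_L = 1e11
N_BINS = 248
TEMPLATES_PER_BIN = 1000  # approximate ratio of templates to bins

# Triggers assumed uniformly distributed among the templates of a bin.
min_fap_binned = minimum_false_alarm_probability(M, N_BINS, N_H, N_L)

# More conservative hypothesis: each template treated as its own bin,
# which raises the minimum by roughly the templates-to-bins ratio.
min_fap_per_template = min_fap_binned * TEMPLATES_PER_BIN
```

Both results land within the stated orders of magnitude, a few times $10^{-11}$ and a few times $10^{-8}$ respectively.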
Figure 6 shows $p(x_H \mid \mathrm{n})$ and $p(x_L \mid \mathrm{n})$ in the warm colormap. The cool
colormap includes triggers that are also found in coincidence, i.e., $p'(x_H \mid \mathrm{n})$ and
$p'(x_L \mid \mathrm{n})$, which is the probability density function used to estimate
$P(\mathcal{L} \mid \mathrm{n})$. It has been masked to only show regions which are not consistent
with $p(x_H \mid \mathrm{n})$ and $p(x_L \mid \mathrm{n})$. In both cases $\theta_i$ has been
marginalized over in order to show all the data on a single figure. The positions of the two
loudest events, described in the next section, are shown. Figure 6 shows that GW150914
falls in a region without any non-coincident triggers from any bin.
V. SEARCH RESULTS
GW150914 was observed on September 14, 2015 at 09:50:45 UTC as the most significant
event by both analyses. The individual detector triggers from GW150914 occurred within
the 10 ms inter-site propagation time with a combined matched-filter SNR of 24. Both
pipelines report the same matched-filter SNR for the individual detector triggers in the
Hanford detector ($\rho_{\rm H1} = 20$) and the Livingston detector ($\rho_{\rm L1} = 13$).
GW150914 was found with the same template in both analyses with component masses
$47.9\,M_\odot$ and $36.6\,M_\odot$. The effective spin of the best-matching template is

$$\chi_{\rm eff} = \frac{c}{G} \left( \frac{\mathbf{S}_1}{m_1} + \frac{\mathbf{S}_2}{m_2} \right) \cdot \frac{\hat{\mathbf{L}}}{M} = 0.2,$$

where $\mathbf{S}_{1,2}$ are the spins of the compact objects and $\hat{\mathbf{L}}$ is the direction
of the binary’s orbital angular momentum. Due to the discrete nature of the template bank,
follow-up parameter estimation is required to accurately determine the best fit masses and
spins of the binary’s components [18, 91].
The frequency at peak amplitude of the best-matching template is $f_{\rm peak} = 144$ Hz,
placing it in noise-background class (iii) of the PyCBC analysis. Figure 7 (left) shows the
result of the PyCBC analysis for this search class. In the time-shift analysis used to create
the noise background estimate for the PyCBC analysis, a signal may contribute events to the
background through random coincidences of the signal in one detector with noise in the
other detector [86]. This can be seen in the background histogram shown by the black line.
The tail is due to coincidence between the single-detector triggers from GW150914 and
noise in the other detector. If a loud signal is in fact present, these random time-shifted
coincidences contribute to an overestimate of the noise background and a more conservative
assessment of the significance of an event. Figure 7 (left) shows that GW150914 has a
re-weighted SNR $\rho_c = 23.6$, greater than all background events in its class. This value
is also greater than all background in the other two classes. As a result, we can only place
an upper bound on the false alarm rate, as described in Sec. III. This bound is equal to the
number of classes divided by the background time $T_b$. With 3 classes and
$T_b = 608000$ years, we find the false alarm rate of GW150914 to be less than
$5 \times 10^{-6}$ yr$^{-1}$. With an observing time of 384 hr, the false alarm probability is
$\mathcal{F} < 2 \times 10^{-7}$. Converting this false alarm probability to single-sided
Gaussian standard deviations according to
$-\sqrt{2}\,\mathrm{erf}^{-1}\left[1 - 2(1 - \mathcal{F})\right]$, where $\mathrm{erf}^{-1}$ is the
inverse error function, the PyCBC analysis measures the significance of GW150914 as
greater than 5.1σ.
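The chain from background time to significance can be checked numerically. The sketch below assumes a 365.25-day year and uses the standard library's inverse normal CDF, which is algebraically equivalent to the inverse-error-function expression above. The quoted figures appear to be rounded to one significant figure, so the unrounded inputs reproduce them only approximately.

```python
import math
from statistics import NormalDist

HOURS_PER_YEAR = 365.25 * 24  # assumed year length

def far_upper_bound(n_classes, background_years):
    """Upper bound on the false alarm rate (per year): classes / T_b."""
    return n_classes / background_years

def false_alarm_probability(far_per_year, observing_hours):
    """Poisson probability of at least one false alarm in the observing time."""
    expected = far_per_year * observing_hours / HOURS_PER_YEAR
    return -math.expm1(-expected)

def single_sided_sigma(fap):
    """-sqrt(2) * erfinv(1 - 2 * (1 - F)), written via the normal quantile.

    The expression equals the (1 - F) quantile of a standard normal, which
    by symmetry is minus the F quantile; this form avoids cancellation.
    """
    return -NormalDist().inv_cdf(fap)

far = far_upper_bound(3, 608000)
fap = false_alarm_probability(far, 384)
sigma = single_sided_sigma(fap)
```

Because the false alarm rate is only an upper bound, the resulting significance is a lower bound.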
The GstLAL analysis reported a detection-statistic value for GW150914 of
$\mathcal{L} = 78$, as shown in the right panel of Fig. 7. The GstLAL analysis estimates the
false alarm probability assuming that noise triggers are equally likely to occur in any of the
templates within a background bin. However, as stated in Sec. IV, if the distribution of noise
triggers is not uniform across templates, particularly in the part of the bank where
GW150914 is observed, the minimum false alarm probability would be higher. For this
reason we quote the more conservative PyCBC bound on the false alarm probability of
GW150914 here and in Ref. [1]. However, proceeding under the assumption that the noise
triggers are equally likely to