() - 1408.6265v1.pdf

arXiv:1408.6265v1 [astro-ph.IM] 26 Aug 2014

Mon. Not. R. Astron. Soc.

000

, 1–24 (0000)

Printed 28 August 2014

(MN L

X style file v2.2)

A method for the estimation of the significance of cross-corr

elations

in unevenly sampled red-noise time series

W. Max-Moerbeck

1

2

⋆

, J. L. Richards

3

, T. Hovatta

1

4

, V. Pavlidou

1

5

6

, T. J. Pearson

1

A. C. S. Readhead

1

Cahill Center for Astronomy and Astrophysics, California I

nstitute of Technology, Pasadena, CA 91125, USA

National Radio Astronomy Observatory (NRAO), P.O. Box 0, So

corro, NM 87801, USA

Department of Physics, Purdue University, West Lafayette,

IN 47907, USA

Aalto University Mets ̈ahovi Radio Observatory, Mets ̈ahov

intie 114, 02540 Kylm ̈al ̈a, Finland

Max-Planck-Institut f ̈ur Radioastronomie, Auf dem H ̈ugel

69, 53121 Bonn, Germany

Department of Physics, University of Crete / Foundation for

Research and Technology - Hellas, Heraklion 71003, Greece

Accepted 2014 August 19. Received 2014 August 18; in origina

l form 2014 June 20

ABSTRACT

We present a practical implementation of a Monte Carlo metho

d to estimate the significance

of cross-correlations in unevenly sampled time series of da

ta, whose statistical properties

are modeled with a simple power-law power spectral density.

This implementation builds

on published methods, we introduce a number of improvements

in the normalization of the

cross-correlation function estimate and a bootstrap metho

d for estimating the significance

of the cross-correlations. A closely related matter is the e

stimation of a model for the light

curves, which is critical for the significance estimates. We

present a graphical and quan-

titative demonstration that uses simulations to show how co

mmon it is to get high cross-

correlations for unrelated light curves with steep power sp

ectral densities. This demonstra-

tion highlights the dangers of interpreting them as signs of

a physical connection. We show

that by using interpolation and the Hanning sampling window

function we are able to reduce

the effects of red-noise leakage and to recover steep simple

power-law power spectral den-

sities. We also introduce the use of a Neyman construction fo

r the estimation of the errors in

the power-law index of the power spectral density. This meth

od provides a consistent way to

estimate the significance of cross-correlations in unevenl

y sampled time series of data.

Key words:

methods: data analysis — methods: statistical — techniques

: miscellaneous

1 INTRODUCTION

Studies of the variability in astronomical sources can reve

al as-

pects that are not accessible to imaging, which is limited by

the

angular resolution of current instruments. For example, va

riabil-

ity can be used to set limits on the sizes of the emitting re-

gions through causality arguments (e.g., Abdo et al. 2011),

to de-

termine the size of the broad line region in active galactic n

clei (e.g., Peterson & Cota 1988), or to detect extrasolar pl

anets

(e.g., Charbonneau et al. 2000) among many other applicatio

ns. In

this paper we describe the practical implementation of a cro

ss-

correlation technique to determine the location of the gamm

a-ray

emission site in blazars, by studying the relation between t

he vari-

ability in the radio and gamma-ray bands. For this purpose we

are

carrying out a blazar monitoring program with the Owens Val-

ley Radio Observatory (OVRO) 40 meter telescope (Richards e

t al.

2011) and the Large Area Telescope (LAT) on board of the

Fermi

⋆

E-mail: wmax@nrao.edu

Gamma-ray Space Telescope

(

Fermi

, Atwood et al. 2009). Our ap-

proach is to search for correlated variability between thes

e two en-

ergy bands, which would enable us to determine the location o

the gamma-ray emission regions relative to the radio emissi

on re-

gions. The study of cross-correlations between two energy b

ands

presents a number of challenges from the data analysis and st

atis-

tical point of view: among these are uneven sampling, non-eq

ual

error bars, and short time duration of the light curves. The t

ech-

niques we develop here should be useful for other applicatio

ns.

Related methods have been presented in the literature, for e

ample the study of cross-correlations with unevenly sample

d light

curves has an extensive literature about its application to

rever-

beration mapping (e.g., Peterson 1993). These methods pres

ent a

detailed treatment of the estimation of cross-correlation

s and time

lags, but not of the estimation of significance of the observe

d cor-

relations, a critical aspect for the interpretation of cros

s-correlation

results. The literature abounds with claims of statistical

ly signifi-

cant correlations that are not backed up by rigorous statist

ical anal-

yses.

0000 RAS

W. Max-Moerbeck et al.

This paper presents a detailed discussion of the meth-

ods used for our investigation of time-correlation between

ra-

dio and gamma-ray activity in blazars, which is discussed in

Max-Moerbeck et al. (2014). Here we present a description of

the Monte Carlo method used to estimate the significance of

cross-correlations between unevenly sampled time series u

sing

the method of Edelson & Krolik (1988). In order to estimate

the distribution of cross-correlations in two uncorrelate

d data

streams we need a model for the light curves. A commonly used

model for time variability in blazars and other AGNs is a sim-

ple power-law power spectral density (PSD

∝

/ν

), as has

been measured for a small number of sources at various wave-

lengths (e.g., Hufnagel & Bregman 1992; Edelson et al. 1995;

Uttley et al. 2003; Ar ́evalo et al. 2008; Chatterjee et al. 20

08;

Abdo et al. 2010). The results presented in Abdo et al. (2010)

are

of particular interest for the OVRO blazar monitoring progr

am.

In their paper, they find a value of

= 1

for bright

BL Lacs and

= 1

for bright FSRQs in the gamma-

ray band. In the radio band a number of publications have mea-

sured

radio

. It has been found that

radio

= 2

for 3C279

at 14.5 GHz (Chatterjee et al. 2008) using a fit to the PSD for

an 11 year light curve. Additional indirect estimates for th

e PSD

power-law index are obtained by Hufnagel & Bregman (1992) us

ing structure function fits. For five sources, they obtain val

ues

= 0

, where

is the exponent on

the structure function

SF(

)

∝

. The same method is used

for 51 sources by Hughes et al. (1992) who found that most val-

ues of

lie between 0.6 and 1.8, while a couple are closer to

0. However, the often assumed relation between the exponent

s of

the PSD and the structure function (

+ 1

) is only valid

under special conditions, not necessarily found in real dat

a sets

(Paltani 1999; Emmanoulopoulos et al. 2010). The structure

func-

tion has been widely used in blazar variability studies but i

ts inter-

pretation is not straightforward, as has been recently disc

ussed by

Emmanoulopoulos et al. (2010). These authors used simulati

ons to

demonstrate that many of the features in the structure funct

ion are

associated with the length and sampling patterns of the ligh

t curves

rather than anything of statistical significance. For these

reasons,

values obtained from the structure function can only be take

n as a

rough measure of the properties of the time series, and there

fore

we do not use them here. Instead we fit the PSDs directly.

We start by giving a brief description of the data sets used

(Section 2), and then provide detailed descriptions of the m

ethods

in Sections 3 and 4. In Section 3 we describe our approach to th

critical problem of estimating a model for the light curves t

o use

with the Monte Carlo significance estimate. Here we describe

implementation of the method of Uttley et al. (2002) that con

tains

some important modifications. In Section 4 we provide a descr

ip-

tion of a number of modifications we propose to common methods

used to estimate the significance of cross-correlations. We

give a

justification for the use of the local normalization (Welsh 1

999)

in the Edelson & Krolik (1988) method, demonstrate the stron

dependence of the significance estimate on the model light cu

rves

and introduce a bootstrap method to estimate the error in the

cross-

correlation significance estimates. We close the paper with

a sum-

mary of our main findings and recommendations for the use of th

and related techniques (Section 5).

An important aspect of this work is the use of a statistically

well-defined data set, where long light curves are used indep

endent

of the flaring state of the object. A fatal trap that many autho

rs fall

into is that of ”cherry picking” the data by selecting small i

ntervals

of data. This approach can produce spurious levels of signifi

cance

for the cross-correlations, and hence cannot be used to draw

con-

clusions about blazar populations, or the long term behavio

ur of

individual sources.

2 THE PARAMETERS OF THE OBSERVATIONS

The methods we discuss in this paper can be adapted to use with

any data set, but since we are describing a particular implem

en-

tation, our simulated data sets are generated making some ch

oices

related to the intended application. These choices are moti

vated by

the data sets associated with our blazar monitoring program

in the

radio and gamma-ray bands. These data sets are described in d

etail

in Max-Moerbeck et al. (2014) and here we only summarize thei

main properties.

A radio observation for each one of the monitored blazars is

attempted twice per week, but because of the effects of weath

and other technical problems we obtain unevenly sampled lig

curves (Richards et al. 2011).

Fermi

observes the whole sky once

every three hours (Atwood et al. 2009), but because of the hig

hly

varying nature of blazars in gamma-rays, sources may someti

mes

fall below that detection threshold, resulting in upper lim

its for

a given integration period. We consider gamma-ray light cur

ves

with a time binning of one week, which allows us to detect abou

100 sources most of the time. We have upper limits for about

30% of the data. At this level we find that treating the upper

limits as non-observations does not have important effects

in the

measured time lags or significance of cross-correlations fo

r the

cases with interesting values of the cross-correlation sig

nificance

(Max-Moerbeck et al. 2014). This behaviour could be a result

the particular properties of the light curves we considered

in that

study, such as long time scales for the variability compared

to the

gaps created by ignoring the upper limits or the power spectr

densities, and it might not hold in other situations. We thus

obtain

unevenly sampled gamma-ray light curves as the ones we discu

here.

3 POWER SPECTRAL DENSITY ESTIMATION FOR

UNEVENLY SAMPLED TIME SERIES OF SHORT

DURATION

We begin with a brief summary of the standard methods used

for the estimation of the PSD and then move to the uneven sam-

pling and short time series cases. This discussion is based o

n the

method presented in Uttley et al. (2002) which is modified to s

uit

our dataset and the range of PSDs we fit. Additional justificat

ion of

the need for binning and interpolation of the light curves is

given

in Section 3.2.2. We also present an example of the applicati

of our method to a simulated light curve, and a number of tests

using real data sampling for simulated light curves that dem

on-

strate the accuracy of the fitting procedure under different

condi-

tions. The real data sampling is based on the data set present

ed in

Max-Moerbeck et al. (2014), which have 4 year 15 GHz radio lig

curves from the OVRO 40 meter telescope blazar monitoring pr

gram, and 3 year gamma-ray light curves from the LAT on board

Fermi

. A study of the effect of increasing the number of simu-

lations in the fitting procedure is performed to guide our cho

ice of

parameters for the data analysis. A summary of the method, wi

emphasis on the improvements we add to the original formulat

ion

is given in Section 5.

0000 RAS, MNRAS

000

, 1–24

Significance of cross-correlations for uneven sampling

3.1 The basics of power spectral density estimation

We define a time series as a time ordered sequence of triplets

(

, f

, e

)

, where

is the observation time,

is the measured

value of the quantity of interest (e.g., flux density, photon

flux,

etc.), and

is an estimate of the observational error associated

with the measurement. We assume that the time series is sorte

d in

time and

= 1

, ..., N

An estimation of the PSD can be obtained through the peri-

odogram, which is conventionally defined as the squared modu

lus

of the discrete Fourier transform:

(

) =

cos(2

πν

)

sin(2

πν

)

(1)

where the periodogram is evaluated at the discrete set of fre

quencies

k/T

for

= 1

, ..., N/

for

even, or

, ...,

(

−

for

odd,

Nyq

is the Nyquist fre-

quency and

(

−

)

(

−

, see footnote

Estimating the PSD in this way requires sampling a continu-

ous time series at discrete times for a finite amount of time. T

sampling operation is equivalent to multiplication of the t

ime se-

ries by a Dirac comb, while sampling for a finite time correspo

nds

to a multiplication by a rectangular observing window. Thes

e two

multiplications appear as convolutions in frequency space

: the

original spectrum is convolved with the Fourier transform o

f the

Dirac comb and of the rectangular window. As a final step we onl

look at a discrete set of frequencies which is equivalent to m

ulti-

plication by a Dirac comb in frequency space.

Ignoring the effect of sampling with a Dirac comb in the fre-

quency domain, and omitting normalization factors, we find t

hat

the periodogram is given by

(

) =

(

)

∗

III

∆

(

)

∗

(

)

(2)

where

(

)

is the Fourier transform of the time series

(

, f

)

III

∆

(

)

is the Fourier transform of the Dirac comb with sampling

interval

∆

, and

(

)

is the Fourier transform of the sampling

window function, which is by default a rectangular window, a

∗

denotes convolution.

As a result of the convolution with the Dirac comb, we do not

have access to the original spectrum but a modified version th

at re-

peats periodically. Another distortion comes from the conv

olution

with the sampling window function, which modifies the shape o

the original spectrum, and finally we only look at discrete se

t of

frequencies. All these factors have to be taken into account

when

analysing data and interpreting the results. The periodic r

epetition

of the spectrum gives rise to aliasing, in which high frequen

components are mistaken as low frequency components. Convo

lution with a window function can be a serious problem when th

sidelobes of the frequency window function lie on regions of

the

spectrum where the power is much higher than at the frequency

interest – this is the origin of the red-noise leakage proble

m. Hav-

ing the spectrum sampled at a number of discrete frequencies

can

be problematic if we are searching for narrow spectral compo

nents

which can be smeared or missed.

In what follows we use

for the frequency and

for time series data,

e.g., flux density, photon flux, etc.

This choice of

is consistent with the definition of the discrete Fourier

transform (Brigham 1988) and allows us to make use of the Fast

Fourier

Transform algorithm to increase the speed of the computatio

ns.

A graphical representation of these operations can help the

reader un-

derstand their effect. See Figure 6.1 in Brigham (1988) or el

sewhere.

For the case of evenly sampled time series, PSD estimation

amounts to using the discrete Fourier transform (DFT) along

with

periodogram or frequency averaging to decrease the noise wh

ich

is distributed as a

for a single frequency component. Each of

these averaging processes can reduce the variance at the pri

ce of

reduced spectral resolution. For example, in the case of fre

quency

or periodogram averaging of

components the resulting distribu-

tion is

, which reduces the variance by a factor of

with

respect to the non-averaging case.

The application of these methods is straightforward in the

case of long time series, where a good estimate of the PSD can b

obtained at the expense of reduced frequency resolution. No

nethe-

less, problems of aliasing and red-noise leakage can still c

ompli-

cate the analysis of broadband signals like the simple power

-law

PSDs we fit to our data (

(

)

∝

/ν

), for the reasons out-

lined below. For relatively flat spectra (

from 0 to 2) aliasing

can be a problem as high frequency power above the Nyquist fre

quency contaminates low frequencies. This problem is less s

erious

for steep spectra (

), that have relatively small amounts of

power at high frequencies. But in this case red-noise leakag

e can

flatten the high frequency part of the spectrum: power from lo

frequencies contaminates the low amplitude high frequency

parts

of the spectrum through sidelobes on the sampling window fun

tions. To reduce the effects of these problems a combination

filters and sampling window functions can be used (e.g., Brig

ham

1988; Press et al. 1992; Shumway and Stoffer 2011).

3.2 Power spectral density estimation for unevenly sampled

data and short time series

When working with time series data, problems often arise be-

cause the time series is unevenly sampled and relatively sho

rt.

Uneven sampling requires the use of a different estimate of t

periodogram: the best known alternatives are the Deeming pe

riodogram (Deeming 1975) and the Lomb-Scargle periodogram

(Scargle 1982). The Lomb-Scargle periodogram is well suite

d to

the detection of periodic signals in white noise, because it

s statis-

tical properties are well understood. For the analysis of br

oadband

signals, the Deeming periodogram is often used for reasons t

hat are

mainly historical as it does not present any real advantages

. These

two methods allow us to obtain an estimate of the periodogram

for

unevenly sampled time series directly, but do not provide a w

to correct for distortions produced by the sampling window f

unc-

tions, which can modify the shape of the periodogram signific

antly

as explained below.

3.2.1 Description of the method

The method we present here was originally developed and de-

scribed in detail in Uttley et al. (2002). We describe the mai

n steps

here to highlight the differences between theirs and our imp

lemen-

tation.

(i) Obtain the periodogram for the light curve and bin it in fr

quency to reduce scatter. The periodogram is given by a frequ

ency

binned version of the following expression

(

) =





cos(2

πν

)

sin(2

πν

)





(3)

where the frequencies are

k/T

for

= 1

, ..., N/

for

0000 RAS, MNRAS

000

, 1–24

W. Max-Moerbeck et al.

even, or

= 1

, ...,

(

−

for

uneven. The minimum fre-

quency is

min

= 1

, the maximum frequency is the Nyquist

frequency

Nyq

= (

2)(1

)

, and

(

−

)

(

−

The multiplicative factor is a normalization, such that the

integral

from

is equal to the variance contributed to the light curve

in this frequency range. The evenly sampled time series

(

, f

)

is obtained from the original one by interpolation onto a reg

ular

grid. This interpolated time series is first multiplied by an

appro-

priate sampling window in order to reduce red-noise leakage

. A

justification of these steps is given in Sections 3.2.2 and 3.

2.3.

(ii) Choose a PSD model to test against the data. In this case

we are fitting power-laws of the form

(

)

∝

/ν

but this can

be generalized to any functional dependence. For the given m

odel

simulate

time series, where

is a large number that allows us

to represent a variety of possible realizations of this PSD m

odel.

These and all the simulated light curves used in this work are

gen-

erated using the method described in Timmer & Koenig (1995),

which randomizes both the amplitude and phase of the Fourier

transform coefficients consistent with the statistical pro

perties of

the periodogram.

(iii) For each simulated light curve apply the same sampling

add observational noise, and interpolate into the same even

grid.

Calculate the periodogram for each one. From these

peri-

odograms determine the mean periodogram and its associated

er-

ror as the scatter at each frequency bin.

(iv) Using the mean periodogram and errors obtained in step

(iii) construct a

-like test, defined by

obs

max

min

[

sim

(

)

−

obs

(

)]

∆

sim

(

)

(4)

where

obs

(

)

is the periodogram of the observed light curve,

sim

(

)

and

∆

sim

(

)

are the mean and scatter in the peri-

odogram obtained from the simulated light curves, and

obs

the

of the observed light curve when compared to the simu-

lations for a given PSD model. This

obs

is then compared to the

simulated distribution of

for which we can obtain

samples,

sim

, i

, by replacing the

obs

(

)

term by the periodogram of the

simulated light curves,

sim

, i

(

)

, in Equation 4. The fraction of

the distribution for which

sim

, i

> χ

obs

is the significance level

at which the tested PSD model can be rejected, also known as th

-value. Thus a high value of this percentage represents a goo

d fit,

while a low one corresponds to a poor fit.

The process described above can be repeated for a number of

models with different parameters. The final step consists of

select-

ing the best model fit as the one with the highest value of

. As with

any statistical procedure, a measurement of the uncertaint

y in the

parameters of the model needs to be given. In this point we dep

art

from the original formulation and provide uncertainties ba

sed on

Monte Carlo simulations of the model fitting process (as desc

ribed

in Section 3.2.6).

The most significant differences with the original implemen

tation are the use of sampling window functions to reduce red

noise leakage and the Monte Carlo estimation of fitting uncer

tain-

ties. Another difference is that we simulate the effects of a

liasing

by simulating light curves with high frequency components w

ith a

sampling period of 1 day, instead of adding a constant noise t

erm

to the PSD of the simulated light curves as in the original for

mu-

lation. The high frequency cut at 1 day

−

is justified in our imple-

mentation by the small amount of power seen at higher frequen

cies specially in the radio band. Intraday variable sources

show

variability in short time scales (Wagner & Witzel 1995), but

even

in these cases the amplitude of the variability is only a few p

ercent

for most sources (Quirrenbach et al. 1992). At gamma-rays th

is is

not necessarily true as fast variability has been observed,

but given

that gamma-ray photon fluxes correspond to mean values of lon

integrations of at least a week for most blazars, the effects

of fast

variability are less important as they are averaged out. Oth

er ap-

plications where fast variability is expected might requir

e a higher

frequency cut, making this approach impractical. Another i

mpor-

tant difference, although less important conceptually, is

the use of

the Fast Fourier Transform to perform the computations, whi

substantially decreases computing time. Further discussi

ons of the

most important elements of the method are given below.

3.2.2 The necessity for rebinning and interpolation of the l

ight

curves

This step is very important when estimating steep PSDs. It is

easy

to get mislead by intuition developed from the behavior of wi

dow functions for evenly sampled time series, but it turns ou

t that

window functions for unevenly sampled data do not behave in t

same way. An example is presented in Figure 1, where we show th

frequency response of a uneven sampling pattern with rectan

gular

and Hanning windows, for the periodogram of power-law PSDs

with different values of

from 0 to 5. These window functions are

rect

(

) =

(

otherwise

(5)

Hanning

(

) =

(

cos(

(

−

)

otherwise

(6)

From Figure 1 it can be seen that even though we can calcu-

late the periodogram directly for an unevenly sampled time s

eries

the results we obtain are very noisy and do not vary much among

different values of

. The main problem is that all the PSDs with

look very similar, showing almost the same slope when

fitted with a linear function after a log-log transformation

. This is

problematic as the fitting procedure relies on the differenc

es be-

tween different PSD power-law indices to choose the best mod

el.

Doing the same exercise for a time series with the same time

length and number of data points but with even sampling we obt

ain

the results shown in Figure 2. In this case the results are muc

h less

noisy and the estimated PSDs look different from each other e

ven

for very steep PSDs. This allows for better discrimination a

nd is

required to find an upper limit to the source power-law expone

of the PSD.

The problems associated with the window functions become

evident when trying to apply the fitting method using unevenl

sampled data, and show up as an inability to find an upper limit

the power-law exponent

due to the lack of difference between the

estimated PSDs for the simulated data. This problem can be so

lved

by the use of interpolation and an appropriate window functi

on, a

subject that is discussed in Section 3.2.3.

Figures 1 and 2 illustrate the limited use we can make of di-

rect PSD fitting, even for the case of long time series. In this

case,

red-noise leakage makes it impossible to recover the right p

ower-

law index for steep PSDs.

The subject of windowing of unevenly sampled data is briefly

discussed in Scargle (1982). In particular figure 3 in Scargl

e (1982)

shows a few example window functions for the cases of even and

uneven sampling using the classic periodogram. That figure i

llus-

trate the very different sidelobe structure that is obtaine

d for the

0000 RAS, MNRAS

000

, 1–24

Significance of cross-correlations for uneven sampling

Figure 1.

Effect of the use of window functions for

uneven sampling

cases using the rectangular (blue) and Hanning window (gree

n). Each figure shows the

result of simulating 1000 light curves with a given simple po

wer-law PSD

∝

/ν

, with

given in each figure title. The data points are the mean PSD and

the error is the standard deviation in the simulation, while

the units of power (vertical axis) and frequency (horizonta

l axis) are arbitrary. Also included are

direct fits of the slopes of the mean PSDs for the simulated dat

a for each window (individual panels legend). Notice how the

linear fits can hardly discriminate

between different slopes and how all the estimated PSDs look

very similar.

Figure 2.

Effect of the use of window functions for

even sampling

cases using the rectangular (blue) and Hanning window (gree

n). Each figure shows the

result of simulating 1000 light curves with a given simple po

wer-law PSD

∝

/ν

, with

given in each figure title. The data points are the mean PSD and

the error bar is the standard deviation in the simulation, wh

ile the units of power (vertical axis) and frequency (horizo

ntal axis) are arbitrary. Also included

are direct fits of the slopes of the mean PSDs for the simulated

data for each window (individual panels legend). In this cas

e, the shape of the PSDs is less

noisy and the estimated PSDs for steep cases look different f

rom each other. Even in this case, direct linear fitting of the

PSD produces biased results.

0000 RAS, MNRAS

000

, 1–24

W. Max-Moerbeck et al.

uneven sampling case, which is at the root of the problem de-

scribed here.

To clarify this point, we also include the window functions f

our test data along with the results of applying the Hanning w

in-

dow. An examination of Figure 3 helps us understand the resul

described below. In conventional Fourier analysis, window

func-

tions change the frequency response of the sampling, changi

ng the

sidelobe structure and thus helping mitigate the effects of

red-noise

leakage and aliasing. This behavior can be seen when using ev

enly

sampled data sets, where the sidelobe structure is regular a

nd de-

cays as frequency increases. The case for uneven sampling is

very

different: the shapes of the window functions explains the s

trong

red-noise leakage seen in the simulations and the increased

noise.

In the case of even sampling we recover the results of conven-

tional Fourier analysis, with all the known properties of wi

ndow

functions.

For the reasons described above we use linear interpolation

and rebinning to interpolate the unevenly sampled light cur

ves to

a regular grid, thus allowing for the PSD fitting.

3.2.3 Spectral window function

One fundamental difference between the implementation of t

method of Uttley et al. (2002) and ours is that we use window

functions to reduce the effects of red-noise leakage. We fou

nd that

this is necessary when dealing with steep power spectral den

sities,

like those found in blazar studies. In our first attempts to fit

the

PSDs we found that with a rectangular window we were not able

to set an upper limit to the value of

and were only able to set a

lower limit. The upper limit on

is necessary to constrain the sig-

nificance of cross-correlations, as will be described in Sec

tion 4. In

this section we explain the origin of that problem and the sol

ution

we implemented.

For broadband time series a big problem is the leakage of

power through far sidelobes of the spectral window response

. This

problem is evident when dealing with high dynamic range PSDs

such as steep power-laws. For these power-law PSDs, it is see

as a flattening of the high frequency part of the periodogram d

to power leaking from low frequency part which has much highe

power. In practical terms, it means that after some critical

value of

the power-law index all the periodograms have a flat slope whi

does not depend strongly on the PSD (Figures 1 and 2). Most of

this high frequency power is actually coming from low freque

cies through sidelobes of the window function. One way to dea

with this problem is by using window functions with low level

sidelobes; some details about their application to our data

set are

presented below.

Spectral window functions for our data sets

There is a great

variety of window functions, which differ mainly in the widt

h of

their main lobe, the maximum level, and the fall-off rate of t

sidelobes. The ideal window function will depend on the appl

cation and some experimentation might be necessary. Proper

ties

of various window functions can be found elsewhere (e.g., Ha

rris

1978)

We tried a number of them and compared their performance

in recovering steep PSDs. We found that among the ones tested

the most suitable one was the Hanning window, which is able to

recover a steep spectrum in a range that allows us to fit our lig

curves. Among the special characteristics of this window ar

e its

low sidelobe level, more than

dB below main lobe, and the fast

Table 1.

Properties of selected window functions

Window

Sidelobe Level

Sidelobe Fall-Off

3-dB BW

(dB)

(dB/oct)

(bins)

Rectangular

−

Triangle or Bartlett

−

cos

(

)

or Hanning

−

fall-off at

−

dB/decade. As a downside the Hanning window

has a broader main lobe at 3 dB (

) when compared to

the rectangular window (

), where

is the length of the

time series.

The effect of different windows is illustrated in Figure 4,

which shows the periodogram for a series of steep PSDs. From

the figure it is also clear why other window functions fail to d

is-

tinguish between steep PSDs, and thus are not suitable to use

with

this method.

The results of Figure 4 can be understood by comparing the

properties of the window functions shown in Table 1 (Harris 1

978).

The reduction of the red-noise leakage when using the Hannin

window allows us to discriminate between different steep po

wer-

law indices of the PSD, and is due to the low level and fast fall

-off

its sidelobes.

Windowing is good for fitting a featureless PSD, but it can

be a source of problems if the goal is to find narrow spectral co

ponents. This is because of the well-known trade-off betwee

n res-

olution and sidelobe level: tapering the window function in

order

to decrease the sidelobe level must reduce the resolution. T

his has

to be considered when searching for periodic components, a c

ase

which is outside of the scope of the current analysis.

3.2.4 Filtering

The windowing technique is able to solve the problem with red

noise leakage, but another method that can be used is filterin

g in

the time domain followed by a correction to the frequency dom

ain

result. The goal of filtering is to eliminate the low-frequen

cy com-

ponents that produce the red-noise leakage before computin

g the

periodogram. Since this changes the spectrum of the time ser

ies, it

has to be compensated in the final periodogram by the applicat

ion

of a frequency filter.

One of these techniques is called pre-whitening and post-

darkening by first differencing. In this case, the original t

ime

series

(

, f

)

with even sampling is transformed to

(

, g

≡

−

)

. In the frequency domain this is equivalent to filter-

ing with

(

)

= 2[1

−

cos(2

πν

)]

. Higher order filtering is

possible, for example by the application of first order diffe

rencing

multiple times (Shumway and Stoffer 2011).

Figure 5 shows the result of applying this procedure to simu-

lated data with even sampling and a range of power-law slopes

the PSD. It can be seen that this method has problems recoveri

flat PSDs with

and very steep PSDs with

. We also

tested it with the sampling of the OVRO data set and found that

in a large number of cases it was not able to provide good upper

limits for

and was outperformed by windowing with the Han-

ning window. We therefore use Hanning windowing for the data

analysis.

0000 RAS, MNRAS

000

, 1–24