Dynamic neural reconfiguration for strategy switching during competitive
social interactions
Supplementary Materials
Ruihan Yang, et al.
Contents

1 Method S1. Time-varying Granger causality with signal-dependent noise
2 Method S2. Numerical Simulations of time-varying systems
  2.1 Time series with a Gaussian white noise
  2.2 Time-varying model with a signal-dependent noise
  2.3 Simulation results for the time-varying Granger causality with Gaussian white noise
  2.4 Simulation results for the time-varying Granger causality with signal-dependent noise
3 Method S3. Expectation-Maximization algorithm to identify the behavioral window
4 Method S4. Estimate the hemodynamic response function with the GLM model
5 Method S5. Accounting for the possible confounding effect of the hemodynamic response function on the findings
  5.1 Compare the HRF delay between two brain regions
  5.2 Numerical simulations of the HRF delay

List of Tables

S1 Characteristics of the behavioral windows identified by the hidden Markov model.
S2 Comparison of the demographics of the buyers associated with different types of behavioral windows.
S3 The identification of strategies when different intervals were chosen to calculate the observations.
List of Figures

S1 Detection of information flow in a system with time-varying coefficient and Gaussian white noise.
S2 Detection of information flow in a system with time-varying coefficient and signal-dependent noise.
S3 Comparison of behavioral classification between time-invariant and time-varying approaches.
S4 Main findings remained when the ambiguous behavior window of Subject 29 was retained.
S5 Comparison of the information flow estimated by the classic GC and GCSDN method.
S6 Behavioral correlations between the information revelation (IR) and the information flows.
S7 Behavioral correlations between the $R^2$ and the information flows.
S8 Results when different intervals were chosen to calculate the observations.
S9 Results of different thresholds of the minimum length of a stable behavioral window.
S10 Comparison of the HRFs among three ROIs.
S11 Model performance.
1. Method S1. Time-varying Granger causality with signal-dependent noise

Assuming a constant effective connectivity between brain regions, the classic approaches (both GC and GCSDN) use time-series models with constant coefficients. However, this assumption may be an oversimplification of the information processing in the brain, especially during intensive cognitive computation (e.g., the two-party bargaining game). To model the dynamic behavior, a more complicated model which can better describe the dynamic characteristics is needed. Here, we first briefly discuss the consequence of applying a classic time-invariant model to a time-varying system, and then propose a new approach, time-varying Granger causality with signal-dependent noise (time-varying GCSDN), to measure the dynamic causality.
Consider the following time series model
$$x_t = a x_{t-1} + b(t)\, y_{t-1} + \varepsilon_t,$$
where $\varepsilon_t$ is a Gaussian white noise, $a$ is a constant coefficient, and $b(t)$ is a time-varying coefficient. The classic Granger causality can be defined as
$$F_{y \to x} = \log \frac{\operatorname{var}\left(b(t)\, y_{t-1} + \varepsilon_t\right)}{\operatorname{var}\left(\varepsilon_t\right)},$$
and estimated by
$$\widehat{F}_{y \to x} = \log \frac{\sum_{s=1}^{T-1} \hat{b}^2(s)\, y_s^2 + \sum_{s=1}^{T-1} \hat{\varepsilon}_s^2}{\sum_{s=1}^{T-1} \hat{\varepsilon}_s^2},$$
where $\hat{b}(t)$ is the local classic estimate at each time step $t$ and $\hat{\varepsilon}_s$ is the residual process. If we ignore the time-varying property of the model and do the conventional least-squares estimation of the parameters, we obtain an estimate $\bar{b}$ and the corresponding model residual $\bar{\varepsilon}_s$, and the causality can be established as
$$\bar{F}_{y \to x} = \log \frac{\sum_{s=1}^{T-1} \bar{b}^2 y_s^2 + \sum_{s=1}^{T-1} \bar{\varepsilon}_s^2}{\sum_{s=1}^{T-1} \bar{\varepsilon}_s^2}.$$
For the AR model, we showed that the causality established by assuming constant coefficients in the model is upper bounded by the mean causality among time windows, and the condition under which the equality holds is discussed as follows. Define
$$o_t = \begin{pmatrix} x_t \\ y_t \end{pmatrix}, \quad \bar{\beta} = \begin{pmatrix} \bar{a} \\ \bar{b} \end{pmatrix}, \quad \beta_t = \begin{pmatrix} \hat{a} \\ \hat{b}(t) \end{pmatrix}, \quad O = \begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \\ \vdots & \vdots \\ x_{T-1} & y_{T-1} \end{pmatrix}, \quad D = \begin{pmatrix} x_2 \\ x_3 \\ \vdots \\ x_T \end{pmatrix}.$$
For the whole time series we have
$$O^\top D = O^\top O\, \bar{\beta},$$
and for each time step $t$, the following equation holds,
$$o_t x_{t+1} = o_t o_t^\top \beta_t.$$
Since $\sum_{t=1}^{T-1} o_t x_{t+1} = O^\top D$, we get
$$\sum_{t=1}^{T-1} o_t o_t^\top \beta_t = O^\top O\, \bar{\beta}.$$
Therefore, $\bar{\beta}$ is a weighted average of the $\beta_t$, as
$$\bar{\beta} = \left(O^\top O\right)^{-1} \sum_{t=1}^{T-1} o_t o_t^\top \beta_t.$$
Since what we consider here is the effect of the time-varying parameter $b(t)$, we further suppose that the constant parameter $a = 0$, and the above formula can be simplified as
$$\bar{b} = \left(\sum_{s=1}^{T-1} y_s^2\right)^{-1} \sum_{t=1}^{T-1} y_t^2\, \hat{b}(t).$$
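This weighted-average identity is easy to verify numerically. The following sketch (an illustrative Python snippet, not part of the original analysis) simulates $x_{t+1} = b(t)\, y_t + \varepsilon_t$, computes the whole-series least-squares estimate $\bar{b}$, and checks that it equals the $y_t^2$-weighted average of the local estimates $\hat{b}(t) = x_{t+1} / y_t$:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
y = rng.normal(size=T)
b = np.linspace(0.8, -0.8, T)                 # time-varying coefficient b(t)
eps = 0.1 * rng.normal(size=T)
x_next = b * y + eps                          # x_{t+1} = b(t) y_t + eps_t

b_bar = (y * x_next).sum() / (y ** 2).sum()   # whole-series OLS estimate
b_hat = x_next / y                            # local single-step estimates
weighted = (y ** 2 * b_hat).sum() / (y ** 2).sum()

# the OLS estimate is exactly the y^2-weighted average of the local estimates
assert np.isclose(b_bar, weighted)
```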
Since, by the Cauchy–Schwarz inequality,
$$\sum_{s=1}^{T-1} \bar{b}^2 y_s^2 = \left(\sum_{s=1}^{T-1} y_s^2\right)^{-2}\left(\sum_{t=1}^{T-1} y_t^2\, \hat{b}(t)\right)^2 \left(\sum_{s=1}^{T-1} y_s^2\right) = \left(\sum_{s=1}^{T-1} y_s^2\right)^{-1}\left(\sum_{t=1}^{T-1} y_t^2\, \hat{b}(t)\right)^2 \leq \left(\sum_{s=1}^{T-1} y_s^2\right)^{-1}\left(\sum_{t=1}^{T-1} \hat{b}^2(t)\, y_t^2\right)\left(\sum_{t=1}^{T-1} y_t^2\right) = \sum_{t=1}^{T-1} \hat{b}^2(t)\, y_t^2,$$
we have that
$$\bar{F}_{y \to x} \leq \widehat{F}_{y \to x},$$
and the equality holds only when $\sum_{t=1}^{T-1} \hat{b}^2(t)\, y_t^2 = c \sum_{t=1}^{T-1} y_t^2$ with a constant $c$. Assuming that the local estimation gives the exact value of the parameter, we can see that the causality established by ignoring the time-varying property of the parameters is upper bounded by the average of the local causalities (i.e., the causality detected by each sliding window). Therefore, caution must be taken when we detect the causality over a whole time series, since the causality may exist only in some time windows. If the whole time series can be divided into N time windows according to the task paradigm of the fMRI experiment, we therefore want to estimate the effective connectivity within each time window instead of over the whole time series.
In our case, to deal with the signal-dependent noise (SDN) observed in the BOLD signal [1], the proposed time-varying method is based on the Granger causality with signal-dependent noise (GCSDN) model, which has previously proved useful in detecting effective connectivity ([2] and [3]). Suppose we have two time series, $x_t$ and $y_t$. Let $p$ and $q$ be the model orders, $\left\{ A_i\ (i = 1, \cdots, p),\ B_{xy,j}, B_{yx,j}\ (j = 1, \cdots, q),\ C_{xy}, C_{yx} \right\}$ be the model coefficient matrices, and $u_{xy,t}$, $v_{yx,t}$ be the Gaussian white noise. The model can be defined as follows:
$$\begin{pmatrix} x_t \\ y_t \end{pmatrix} = \sum_{i=1}^{p} \begin{pmatrix} A_{xx,i} & A_{xy,i} \\ A_{yx,i} & A_{yy,i} \end{pmatrix} \begin{pmatrix} x_{t-i} \\ y_{t-i} \end{pmatrix} + \begin{pmatrix} r_{xy,t} \\ r_{yx,t} \end{pmatrix}, \tag{S1}$$
where
$$r_{xy,t} = H_{xy,t}^{1/2} u_{xy,t}, \quad H_{xy,t} = C_{xy} C_{xy}^\top + \sum_{j=1}^{q} B_{xy,j} \begin{pmatrix} x_{t-j} \\ y_{t-j} \end{pmatrix} \begin{pmatrix} x_{t-j} & y_{t-j} \end{pmatrix} B_{xy,j}^\top,$$
$$r_{yx,t} = H_{yx,t}^{1/2} v_{yx,t}, \quad H_{yx,t} = C_{yx} C_{yx}^\top + \sum_{j=1}^{q} B_{yx,j} \begin{pmatrix} x_{t-j} \\ y_{t-j} \end{pmatrix} \begin{pmatrix} x_{t-j} & y_{t-j} \end{pmatrix} B_{yx,j}^\top. \tag{S2}$$
For the time-varying causality, we first divide the whole time series into N time windows. At each window, the model (Eqs. S1-S2) can be fit based on the direct observations and the indirect observations. Assuming the system evolves smoothly from one time window to another, the observed time series in the current window are considered the direct observations of the model for this window, while the observed time-series data in the other windows are the indirect observations. We need the indirect observations here, since the number of data points in a given window is often too small to make a reliable estimation of the model. This is especially true for self-paced task-fMRI experiments, in which the number of scans collected in one round can be as small as one or two. Here, we propose to make use of the indirect observations by calculating the likelihood function at the $i_0$-th time window as a weighted average among all time windows. The weight for the observations in the $i$-th window is defined as follows:
$$w_{i,i_0} = \frac{K\left(\frac{i - i_0}{h}\right)}{\sum_{j=1}^{N} K\left(\frac{j - i_0}{h}\right)}. \tag{S3}$$
In the current paper, we used the Gaussian kernel, among many other choices [4]. A larger $h$ is recommended if the time window given by the task paradigm is short, and a smaller $h$ is preferred otherwise. Here, we set $h = 0.17$ for the numerical simulation and $h = 4$ for the fMRI data analysis. Define
$$\theta_{i_0} = \left\{ A_k^{i_0},\ k = 1, \cdots, p,\ B_{xy,j}^{i_0}, B_{yx,j}^{i_0},\ j = 1, \cdots, q,\ C_{xy}^{i_0}, C_{yx}^{i_0} \right\}; \tag{S4}$$
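As an illustration of the weighting scheme of Eq. S3, the sketch below (hypothetical Python, using the Gaussian kernel $K(u) = e^{-u^2/2}$) computes the weights $w_{i,i_0}$ over all N windows:

```python
import numpy as np

def window_weights(i0, N, h):
    """Gaussian-kernel weights w_{i, i0} of Eq. S3 for windows i = 1..N."""
    i = np.arange(1, N + 1)
    k = np.exp(-0.5 * ((i - i0) / h) ** 2)   # K((i - i0) / h), Gaussian kernel
    return k / k.sum()                       # normalize over all N windows

w = window_weights(i0=3, N=5, h=4.0)
```

A larger `h` spreads the weight more evenly across windows; a smaller `h` concentrates it on the current window, consistent with the recommendation above.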
the objective function (i.e., the log-likelihood function) can be rewritten for this time window as follows:
$$\mathrm{LLF}_{i_0}(\theta) = \sum_{t=\max(p,q)}^{T} \sum_{i=1}^{N} \chi_{i\text{-th window}}(t)\, w_{i,i_0}\, l_\theta(t), \tag{S5}$$
where
$$l_\theta(t) = -\frac{1}{2} \ln |H_t| - \frac{1}{2} r_t^\top H_t^{-1} r_t, \quad t = \max(p,q), \cdots, T. \tag{S6}$$
Considering the stationarity conditions of the model (Eq. S1) [1], the model parameters at the $i_0$-th window can be estimated by solving the following constrained optimization problem:
$$\hat{\theta}_{i_0} = \arg\max_{\theta}\ \mathrm{LLF}_{i_0}(\theta), \quad \text{s.t. the stability conditions hold.} \tag{S7}$$
Taking the causality $y \to x$ at the $i_0$-th time window as an example, we consider the following two models for $x_t$:
$$x_t = \sum_{k=1}^{p} A_{x,k}^{i_0} x_{t-k} + \left(H_{xx,t}^{i_0}\right)^{1/2} u_{xx,t}, \quad H_{xx,t}^{i_0} = C_{xx}^{i_0} C_{xx}^{i_0 \top} + \sum_{j=1}^{q} B_{xx,j}^{i_0}\, x_{t-j} x_{t-j}^\top\, B_{xx,j}^{i_0 \top}, \tag{S8}$$
and
$$x_t = \sum_{k=1}^{p} A_{xy,k}^{i_0} x_{t-k} + \sum_{k=1}^{p} A_{yx,k}^{i_0} y_{t-k} + \left(H_{xy,t}^{i_0}\right)^{1/2} u_{xy,t}, \quad H_{xy,t}^{i_0} = C_{xy}^{i_0} C_{xy}^{i_0 \top} + \sum_{j=1}^{q} B_{xy,j}^{i_0} \begin{pmatrix} x_{t-j} \\ y_{t-j} \end{pmatrix} \begin{pmatrix} x_{t-j} & y_{t-j} \end{pmatrix} B_{xy,j}^{i_0 \top}. \tag{S9}$$
At each time window, the model coefficients are assumed to be constants. Define
$$\theta_{\text{restricted}}^{i_0} = \left\{ A_{x,k}^{i_0},\ k = 1, \cdots, p,\ B_{xx,j}^{i_0},\ j = 1, \cdots, q,\ C_{xx}^{i_0} \right\},$$
$$\theta_{\text{full}}^{i_0} = \left\{ A_{xy,k}^{i_0}, A_{yx,k}^{i_0},\ k = 1, \cdots, p,\ B_{xy,j}^{i_0},\ j = 1, \cdots, q,\ C_{xy}^{i_0} \right\}.$$
We can obtain the estimates $\hat{\theta}_{\text{restricted}}^{i_0}$ and $\hat{\theta}_{\text{full}}^{i_0}$ by solving the constrained optimization problem defined in (S7). Now, the causal influence from $y$ to $x$ can be measured by the likelihood ratio given by the above two prediction models for $x_t$,
$$\mathrm{IF}_{y \to x}^{i_0} = \frac{L\left(\hat{\theta}_{\text{restricted}}^{i_0} \,\middle|\, \{x_t\}_{t=1}^{T}\right)}{L\left(\hat{\theta}_{\text{full}}^{i_0} \,\middle|\, \{x_t\}_{t=1}^{T}, \{y_t\}_{t=1}^{T}\right)},$$
where $L\left(\hat{\theta}_{\text{restricted}}^{i_0} \,\middle|\, \{x_t\}_{t=1}^{T}\right)$ is the likelihood function of the restricted model (Eq. S8), and $L\left(\hat{\theta}_{\text{full}}^{i_0} \,\middle|\, \{x_t\}_{t=1}^{T}, \{y_t\}_{t=1}^{T}\right)$ is the likelihood function of the full model (Eq. S9). Therefore, the likelihood ratio test can be used for causal inference,
ratio test can be used for causal inference,
lr
y
x
=
2
h
log
L

b
θ
restricted
|{
x
t
}
T
t
=1

log
L

b
θ
full
|{
x
t
}
T
t
=1
,
{
y
t
}
T
t
=1
i
.
(S10)
The test statistic $lr_{y \to x}$ is approximately chi-squared distributed with $df_{\text{full}} - df_{\text{restricted}}$ degrees of freedom, where $df_{\text{full}}$ and $df_{\text{restricted}}$ are the numbers of free parameters of the full model (Eq. S9) and the restricted model (Eq. S8), respectively. Given that the TR was 2 s, we used $p = q = 1$ in this analysis, following the literature [2], [5], and [6]. A Matlab toolbox of this algorithm is also available at https://github.com/qluo2018/GCSDN.
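The likelihood-ratio computation of Eq. S10 can be sketched as follows (hypothetical Python, not the Matlab toolbox; the closed-form chi-squared survival function used here is only valid for an even number of degrees of freedom, and the `df_diff=2` in the usage line is an illustrative value, not the count implied by any particular parameterization):

```python
import math

def chi2_sf_even(x, df):
    """Survival function P(X > x) of a chi-squared variable with even df."""
    assert df % 2 == 0 and df > 0
    k = df // 2
    term, total = 1.0, 0.0
    for i in range(k):                 # exp(-x/2) * sum_{i<k} (x/2)^i / i!
        if i > 0:
            term *= (x / 2) / i
        total += term
    return math.exp(-x / 2) * total

def lr_test(loglik_restricted, loglik_full, df_diff):
    """Eq. S10: lr = -2 [log L(restricted) - log L(full)], with its p-value."""
    lr = -2.0 * (loglik_restricted - loglik_full)
    return lr, chi2_sf_even(lr, df_diff)

lr, p = lr_test(loglik_restricted=-105.0, loglik_full=-100.0, df_diff=2)
```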
2. Method S2. Numerical Simulations of time-varying systems

To illustrate the performance of the time-varying GCSDN model, we conducted two simulation studies. Suppose we had two brain regions, X and Y. The effective connectivity between them could be described by a time series model, with the assumption of either a constant noise level or a signal-dependent noise level. The observed time-series data of the activities of these two regions could be simulated by the models, and the information flow (IF) could be detected by the classic GC, GCSDN, or time-varying GCSDN model.
2.1. Time series with a Gaussian white noise

The time-series data with 1000 time steps were generated by the following model, where $x_t$, $y_t$ were one-dimensional time series representing the activities of brain regions X and Y, respectively, and $u_{xy,t}$, $u_{yx,t}$ were Gaussian white noise with variance 0.5,
$$\begin{pmatrix} x_t \\ y_t \end{pmatrix} = \begin{pmatrix} A_{11}(t) & A_{12}(t) \\ A_{21}(t) & A_{22}(t) \end{pmatrix} \begin{pmatrix} x_{t-1} \\ y_{t-1} \end{pmatrix} + \begin{pmatrix} u_{xy,t} \\ u_{yx,t} \end{pmatrix}. \tag{S1}$$
The time-varying causality was modeled by the corresponding coefficients
$$A_{11}(t) = 0.1, \quad A_{12}(t) = -0.4\, \frac{t - t_1}{1000 - t_1}, \quad A_{21}(t) = -0.4 \left(1 - \frac{t}{t_2}\right), \quad A_{22}(t) = 0.12.$$
Setting $t_1 = 500$ in the model (Eq. S1), we had a strong effective connectivity from Y to X through the coefficient $A_{12}$ at the beginning, and this excitatory influence weakened to be undetectable as the positive coefficient $A_{12}$ decreased linearly to zero. As the coefficient $A_{12}$ became negative, an inhibitory effect strengthened as the absolute value of the coefficient $A_{12}$ increased. Similarly, the effective connectivity from X to Y evolved in the opposite pattern, from inhibitory to excitatory, by setting $t_2 = 500$.
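A minimal sketch of this simulation (hypothetical Python; the function name is ours, and the exact signs of the coefficient formulas follow our reconstruction of the equations above):

```python
import numpy as np

def simulate_tv_var(T=1000, t1=500, t2=500, seed=0):
    """Simulate the time-varying bivariate AR(1) of Eq. S1 (Method S2.1)."""
    rng = np.random.default_rng(seed)
    sd = np.sqrt(0.5)                         # Gaussian white noise, variance 0.5
    x, y = np.zeros(T), np.zeros(T)
    for t in range(1, T):
        a12 = -0.4 * (t - t1) / (1000 - t1)   # Y -> X: excitatory, then inhibitory
        a21 = -0.4 * (1 - t / t2)             # X -> Y: inhibitory, then excitatory
        x[t] = 0.1 * x[t-1] + a12 * y[t-1] + sd * rng.normal()
        y[t] = a21 * x[t-1] + 0.12 * y[t-1] + sd * rng.normal()
    return x, y

x, y = simulate_tv_var()
```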
2.2. Time-varying model with a signal-dependent noise

The time-series data with 1000 time steps were generated by the following model for two time series, representing the activities of brain regions X and Y, respectively,
$$\begin{pmatrix} x_t \\ y_t \end{pmatrix} = \begin{pmatrix} 0.1 & 0 \\ 0 & 0.12 \end{pmatrix} \begin{pmatrix} x_{t-1} \\ y_{t-1} \end{pmatrix} + \begin{pmatrix} r_{xy,t} \\ r_{yx,t} \end{pmatrix}, \quad \begin{pmatrix} r_{xy,t} \\ r_{yx,t} \end{pmatrix} = \begin{pmatrix} H_{xy,t}^{1/2} & 0 \\ 0 & H_{yx,t}^{1/2} \end{pmatrix} \begin{pmatrix} u_{xy,t} \\ u_{yx,t} \end{pmatrix},$$
$$H_{xy,t} = 1 + \begin{pmatrix} B_{xx}(t) & B_{xy}(t) \end{pmatrix} \begin{pmatrix} x_{t-1} \\ y_{t-1} \end{pmatrix} \begin{pmatrix} x_{t-1} & y_{t-1} \end{pmatrix} \begin{pmatrix} B_{xx}(t) \\ B_{xy}(t) \end{pmatrix},$$
$$H_{yx,t} = 1 + \begin{pmatrix} B_{yx}(t) & B_{yy}(t) \end{pmatrix} \begin{pmatrix} x_{t-1} \\ y_{t-1} \end{pmatrix} \begin{pmatrix} x_{t-1} & y_{t-1} \end{pmatrix} \begin{pmatrix} B_{yx}(t) \\ B_{yy}(t) \end{pmatrix}. \tag{S2}$$
(S2)
The time-varying causality was modeled by the corresponding coefficients
B
xx
(
t
) =
0
.
5
, B
xy,t
(
t
) =
8
<
:
0
.
6

1
t
t
1

t
t
1
0
t > t
1
,
B
yx,t
(
t
) =
8
<
:
0
t < t
2
0
.
6

t
t
2
1000
t
2

t
t
2
, B
yy
(
t
) =
0
.
5
.
A significant non-zero value of the coefficient $B_{xy}$ led to an effective connectivity from Y to X, while the coefficient $B_{yx}$ indicated an effective connectivity from X to Y. Setting $t_1 = 600$ and $t_2 = 400$ in the model (Eq. S2), the strong influence from Y to X at the beginning of the simulation weakened linearly to zero at $t = 600$, while at $t = 400$ an influence from X to Y began to increase linearly until the end of the simulation. With the proposed time-varying algorithm, we expected to detect an effective connectivity from Y to X, but not from X to Y, at the beginning, and a strong effective connectivity from X to Y, but not from Y to X, at the end of the simulation.
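This model can be sketched as follows (hypothetical Python; the function name is ours, and the piecewise coefficients follow our reconstruction of the formulas above):

```python
import numpy as np

def simulate_tv_sdn(T=1000, t1=600, t2=400, seed=0):
    """Simulate the time-varying signal-dependent-noise model of Eq. S2."""
    rng = np.random.default_rng(seed)
    x, y = np.zeros(T), np.zeros(T)
    for t in range(1, T):
        bxy = 0.6 * (1 - t / t1) if t <= t1 else 0.0            # Y -> X, fades by t1
        byx = 0.6 * (t - t2) / (1000 - t2) if t >= t2 else 0.0  # X -> Y, grows after t2
        h_xy = 1 + (0.5 * x[t-1] + bxy * y[t-1]) ** 2           # signal-dependent variance
        h_yx = 1 + (byx * x[t-1] + 0.5 * y[t-1]) ** 2
        x[t] = 0.1 * x[t-1] + np.sqrt(h_xy) * rng.normal()
        y[t] = 0.12 * y[t-1] + np.sqrt(h_yx) * rng.normal()
    return x, y

xs, ys = simulate_tv_sdn()
```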
2.3. Simulation results for the time-varying Granger causality with Gaussian white noise

Assuming the effective connectivity between two brain regions X and Y followed the simulation model (Eq. S1; Fig. S1A), we employed the classic GC model to infer the information flow between X and Y using the whole time-series data. In 1000 repeats, significant information flow was detected in only 0.5% and 0.8% of the simulations for X→Y and Y→X, respectively (Fig. S1B). We also applied the GCSDN model to detect the information flow on the whole time-series data, and again found little significant information flow in either direction: only 13.2% of the simulations showed significant information flow for X→Y and 16.1% for Y→X (Fig. S1B). When we divided the simulated time-series data into five time windows, both methods identified the significant information flows as specified by the simulation model (Eq. S1). In the first and the last two time windows, both the classic GC model and the GCSDN model detected more than 94% of the significant information flows in the 1000 repeated simulations. In the 3rd time window, where no significant interaction was specified by the model (Eq. S1), the false positive rate was less than 1% (Fig. S1B). Applying the proposed time-varying GCSDN method to these five time windows, similar performance was achieved (Fig. S1D). With the time-varying GCSDN method, we found strong information flow between X and Y in both directions at the beginning, and this information flow weakened as the causal coefficients in the model (Eq. S1) decreased. In the 3rd time window, the two time series stopped exchanging information. As the absolute values of the causal coefficients in the model (Eq. S1) increased from the third time window onward, stronger information flow was detected again (Fig. S1D). We also estimated the coefficients of the AR-BEKK model for each window, and found that the estimated coefficients were equal to the average of the time-varying parameters within each window (Fig. S1C). These results suggested that the time-invariant models are no longer applicable when the time-varying property is significant.
2.4. Simulation results for the time-varying Granger causality with signal-dependent noise

We repeated the simulation of the model (Eq. S2) 1000 times (Fig. S2A), and found that the classic GC method failed to detect any significant information flow in the presence of the signal-dependent noise, whether we used the whole time-series data or divided it into five time windows (Fig. S2B). Applying the GCSDN method to the whole time series, we detected significant information flows between X and Y in both directions (Fig. S2B). Across the five time windows, the time-varying GCSDN method accurately identified the significant information flows from Y to X in the first two windows and from X to Y in the last two windows, with no significant information flow in the 3rd time window (Fig. S2D). Compared with the GCSDN method, the proposed time-varying GCSDN revealed more details of the time-varying system, and estimated the time-varying coefficients accurately at each time window (Fig. S2C), which may be particularly useful when we want to investigate the evolving pattern of interaction between the key brain regions underlying time-varying behavior.
3. Method S3. Expectation-Maximization algorithm to identify the behavioral window

To learn the parameters of the HMM, we applied the Expectation-Maximization algorithm. With the same notation as in the main text, the iterative estimation of the parameters is as follows.

E-step: for all time-state pairs $(t, k)$,

1) Recursively compute the forward probabilities
$$\alpha_{t,k}(i) = P\left(O_{1:t,k}, q_t = q_i \mid \lambda_t\right),$$
and the backward probabilities
$$\beta_{t,k}(i) = P\left(O_{t+1:T,k} \mid q_t = q_i, \lambda_t\right).$$
2) Compute the probabilities of state occupation
$$P\left(q_t = q_i, O_{1:T,k} \mid \lambda_t\right) = \alpha_{t,k}(i)\, \beta_{t,k}(i),$$
$$P\left(q_t = q_i, q_{t+1} = q_j, O_{1:T,k} \mid \lambda_t\right) = \alpha_{t,k}(i)\, a_{ij}^{t}\, P\left(O_{t+1,k} \mid q_{t+1} = q_j, \lambda_t\right) \beta_{t+1,k}(j).$$
M-step: based on the estimated probabilities of state occupation, we can re-estimate the HMM parameters:
$$a_{ij,t+1} = \frac{\sum_{k=1}^{N} \sum_{t=1}^{T-1} P\left(q_t = q_i, q_{t+1} = q_j, O_{1:T,k} \mid \lambda_t\right)}{\sum_{k=1}^{N} \sum_{t=1}^{T-1} P\left(q_t = q_i, O_{1:T,k} \mid \lambda_t\right)},$$
$$\pi_{i,t+1} = \frac{\sum_{k=1}^{N} P\left(q_1 = q_i, O_{1:T,k} \mid \lambda_t\right)}{\sum_{k=1}^{N} P\left(O_{1:T,k} \mid \lambda_t\right)},$$
$$\mu_{i,t+1} = \frac{\sum_{k=1}^{N} \sum_{t=1}^{T} P\left(q_t = q_i, O_{1:T,k} \mid \lambda_t\right) \cdot o_{t,k}}{\sum_{k=1}^{N} \sum_{t=1}^{T} P\left(q_t = q_i, O_{1:T,k} \mid \lambda_t\right)},$$
$$\Sigma_{i,t+1} = \frac{\sum_{k=1}^{N} \sum_{t=1}^{T} P\left(q_t = q_i, O_{1:T,k} \mid \lambda_t\right) \cdot \left(o_{t,k} - \mu_i^{t+1}\right)\left(o_{t,k} - \mu_i^{t+1}\right)^\top}{\sum_{k=1}^{N} \sum_{t=1}^{T} P\left(q_t = q_i, O_{1:T,k} \mid \lambda_t\right)},$$
where $N$ is the number of subjects and $T$ is the length of the time-series observation. Here, we had 76 subjects with 59 observations each.
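For concreteness, the forward recursion of the E-step can be sketched for a discrete-emission HMM (hypothetical Python; the actual model used Gaussian emissions with parameters $\mu_i$ and $\Sigma_i$, and the toy matrices below are placeholders):

```python
import numpy as np

def forward(pi, A, B, obs):
    """alpha[t, i] = P(O_1..t, q_t = i): the forward probabilities of step 1)."""
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K))
    alpha[0] = pi * B[:, obs[0]]                    # initialize with pi_i * b_i(o_1)
    for t in range(1, T):
        alpha[t] = (alpha[t-1] @ A) * B[:, obs[t]]  # propagate, then emit
    return alpha

pi = np.array([0.6, 0.4])                           # initial state distribution
A = np.array([[0.7, 0.3], [0.4, 0.6]])              # transition matrix
B = np.array([[0.9, 0.1], [0.2, 0.8]])              # emission matrix
alpha = forward(pi, A, B, obs=[0, 1])               # P(O) = alpha[-1].sum()
```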
We calculated the information revelation (IR) and $R^2$ at the identified behavioral windows and plotted the clustering results in the two-dimensional feature space (the x-axis represents the information revelation (IR) and the y-axis represents $R^2$). However, the $R^2$ calculated at the incremental window $[25, 35]$ of Subject 29 was too small, making the incremental-window class and the conservative-window class overlap (Fig. S4D). Therefore, we did not consider this behavioral window an incremental window in the main analysis, and the main findings remained the same when this behavioral window was retained (Fig. S4A, B, C).
4. Method S4. Estimate the hemodynamic response function with the GLM model

As the BOLD signal could be driven by both the task design and the information exchange between regions, it was necessary to regress out the whole-brain activation from the BOLD signal before calculating the effective connectivity. Therefore, we estimated the hemodynamic response function (HRF) of the brain regions of interest by means of the GLM model [7]. Given the matrix formed by stacking $k$ basis elements, $B = [b_1, \cdots, b_k]$ (here we used the 3-HRF basis, i.e., the canonical HRF with time and dispersion derivatives, in SPM8), we first convolved the basis elements with the event train of each condition (i.e., trial onset, thinking, choice making), and then down-sampled the signal to the same sampling rate as the BOLD signal. Next, the design matrix $X_B$ was formed by the above regressors and a matrix of nuisance parameters, such as the trends and the motion parameters (i.e., three translations and three rotations). The estimated vector was then given by $\hat{\beta}_B = X_B^{\dagger} y$, where $X_B^{\dagger}$ denotes the pseudoinverse of $X_B$. We used the estimated vector $\hat{\beta}_B$ to calculate the HRF for each condition. To ensure that the residuals were white noise, the BOLD signal and the design matrix were pre-whitened before the estimation [8].
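The GLM estimation step can be sketched as follows (hypothetical Python; the regressor construction and least-squares solve mirror the description above, but the basis elements, event train, and nuisance term are synthetic placeholders rather than SPM8's basis set):

```python
import numpy as np

def glm_betas(event_train, basis, nuisance, bold, ds):
    """Build the design matrix X_B (basis elements convolved with the event
    train, down-sampled to the BOLD rate, plus nuisance regressors) and solve
    the GLM by least squares via the pseudoinverse."""
    regs = [np.convolve(event_train, b)[:len(event_train)][::ds] for b in basis]
    X = np.column_stack(regs + list(nuisance))
    return np.linalg.pinv(X) @ bold              # beta_B = pinv(X_B) y

# Synthetic check: recover known betas from a noiseless "BOLD" signal.
events = np.zeros(100)
events[5::15] = 1.0                              # toy event train
basis = [np.exp(-np.arange(8) / 2.0), np.arange(8) / 7.0]  # toy basis elements
nuisance = [np.linspace(0.0, 1.0, 10)]           # e.g., a linear trend
regs = [np.convolve(events, b)[:100][::10] for b in basis]
X = np.column_stack(regs + nuisance)
bold = X @ np.array([1.5, -0.7, 0.3])
beta = glm_betas(events, basis, nuisance, bold, ds=10)
```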
5. Method S5. Accounting for the possible confounding effect of the hemodynamic response function on the findings

Instead of the classic Granger causality, the model applied to the BOLD signal in our case was the Granger causality with signal-dependent noise (GCSDN). Therefore, it was necessary to examine whether the information flow detected by the GCSDN model could still reflect the correct underlying effective connectivity given the possible confounding effect of the hemodynamic response function. We examined the effect of the hemodynamic response function in the following aspects.
5.1. Compare the HRF delay between two brain regions

The delay between the hemodynamic response and the neuronal activity was estimated by the time required for the HRF to reach its peak value from the onset. Comparing such delays among the three ROIs with pairwise t-tests, we found no significant difference in the thinking condition (Fig. S10A-C). Therefore, the regional variation of the HRF should not be a significant problem in the current study.
5.2. Numerical simulations of the HRF delay

Since the sensitivity of the analyses of effective connectivity depends on the neuronal transmission delay between the source and the effect regions and on their relative HRF delay (i.e., the time required for the HRF to reach its peak value from the onset) [9], we assessed the performance of the tvGCSDN model at different levels of HRF delay and neuronal transmission delay.

Consider two brain regions X and Y, where the HRF delay of X was longer than that of Y and the relative HRF delay varied within 0-1 s. The underlying effective connectivity between them could be described by a time series model. Supposing X was the source region and Y was the effect region, the model was defined as follows,
$$\begin{cases} x_t = A_{xx} x_{t-l} + H_{xy,t}^{1/2} u_{xy,t} \\ y_t = A_{yx} x_{t-l} + A_{yy} y_{t-l} + H_{yx,t}^{1/2} u_{yx,t} \end{cases}, \quad H_{xy,t} = 0.1 + 0.01\, x_{t-l}^2, \quad H_{yx,t} = 0.1 + \left(B_{yx} x_{t-l} + 0.01\, y_{t-l}\right)^2, \tag{S1}$$
where $l$ represents the neuronal transmission delay. Single-cell recordings in monkeys have shown that the median latencies increase by approximately 20 ms from one brain region in the visual hierarchy to the next [10]. Considering that the human brain is larger than the monkey brain, the neuronal delay could be longer. Therefore, the neuronal delay $l$ was set to vary from 40 ms to 140 ms. Following the simulation settings in the literature [9], the autocorrelation parameters $A_{xx}$ and $A_{yy}$ were set to 0.9, and the influence from X to Y was set to 0.5 for $A_{yx}$ in the autoregressive model and to 0.005 for $B_{yx}$ in the signal-dependent noise, respectively.
We first generated two time series $x_t$ and $y_t$ from the Granger causality model with signal-dependent noise (Eq. S1); both signals were simulated for 300,000 time steps of 1 ms (300 s). Next, each signal was convolved with the hemodynamic response function of the corresponding region, generated by SPM8. Gaussian white noise was then added to the signal, representing the physiological noise in the BOLD response. Subsequently, the signal was down-sampled to the same rate as the BOLD signal (i.e., 0.5 Hz, corresponding to a TR of 2 s), and Gaussian noise was again added to represent the acquisition noise. The signals were normalized to zero mean and unit variance after each of the above steps, and the total amount of added noise was 20%.
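The simulation pipeline above can be sketched as follows (hypothetical Python; the HRF kernel is a gamma-like placeholder rather than the SPM8 HRF, the simulated duration is shortened to 60 s for speed, and the noise level is illustrative rather than calibrated to 20%):

```python
import numpy as np

def to_bold(neural, hrf, ds, noise_sd, rng):
    """Convolve a neural signal with an HRF, add physiological noise,
    down-sample, add acquisition noise; z-score after each step."""
    z = lambda s: (s - s.mean()) / s.std()
    b = z(np.convolve(neural, hrf)[:len(neural)])        # hemodynamic response
    b = z(b + noise_sd * rng.normal(size=b.size))        # physiological noise
    b = z(b[::ds])                                       # down-sample to BOLD rate
    return z(b + noise_sd * rng.normal(size=b.size))     # acquisition noise

rng = np.random.default_rng(0)
t = np.arange(0.0, 5.0, 0.001)                           # toy HRF support (s)
hrf = t ** 5 * np.exp(-t)                                # gamma-like placeholder
neural = rng.normal(size=60_000)                         # 60 s at 1 ms steps
bold = to_bold(neural, hrf, ds=2000, noise_sd=0.45, rng=rng)  # one sample per 2 s
```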
We repeated the above experiment 100 times. The GCSDN model was applied to calculate the information flow (IF) of the simulated time series in both directions. We detected the dominant direction of the information flow by the difference in the information flow, $r_{\text{diff}} = \frac{1}{2}\left(\mathrm{IF}_{X \to Y} - \mathrm{IF}_{Y \to X}\right)$, where $\mathrm{IF}_{X \to Y}$ and $\mathrm{IF}_{Y \to X}$ were two chi-squared distributed statistics with the same degrees of freedom. Therefore, the distribution function of $r_{\text{diff}}$ is
$$T_m(x) = \frac{1}{2^m \sqrt{\pi}\, \Gamma\left(m + \frac{1}{2}\right)}\, |x|^m K_m(|x|),$$
where $K_m(x)$ is a modified Bessel function, $\Gamma(\cdot)$ is the Gamma function, and $m = \frac{d_x d_y - 1}{2}$ [11]. A table of the two-sided one and five percent quantiles of this distribution can be found in [11]. In the current paper, we took $d_x = d_y = 1$. Therefore, 4.61 corresponds to a significance level of 0.01: if $r_{\text{diff}} > 4.61$, we detected an effective connectivity from X to Y; if $r_{\text{diff}} < -4.61$, we detected an effective connectivity from Y to X; otherwise, no significant effective connectivity could be detected. We counted the simulations in which the detected dominant direction was opposite to the true one, and then calculated the proportions of inverted, correct, and non-significant results.
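Under the sign convention $r_{\text{diff}} = \frac{1}{2}(\mathrm{IF}_{X \to Y} - \mathrm{IF}_{Y \to X})$, the decision rule can be sketched as (illustrative Python):

```python
def dominant_direction(if_xy, if_yx, threshold=4.61):
    """Classify the dominant information flow from r_diff = (IF_XY - IF_YX) / 2,
    using the 0.01 two-sided quantile of the T_m distribution for d_x = d_y = 1."""
    r_diff = 0.5 * (if_xy - if_yx)
    if r_diff > threshold:
        return "X->Y"
    if r_diff < -threshold:
        return "Y->X"
    return "none"
```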
We found that when the underlying effective connectivity was from X to Y, where the source region had a longer HRF delay than the target region, the number of non-significant results increased but there were few inverted results (Fig. S11A). The proportion of inverted results decreased as the neuronal delay increased. Assuming no difference in HRF delay, the proportion of inverted results was less than 1% in all conditions of neuronal delay. Together with the finding of no significant regional variation in the HRF delays of the three ROIs (Method S4), we believe that the dynamic information flow estimated by the tvGCSDN is a reliable measurement of the strength of the effective connectivity.

We also simulated time series with an underlying Y-to-X effective connectivity 100 times; in this case the effect region had a longer HRF delay than the source region. GCSDN was applied to the time series to detect the information flow, and no inverted results were found (Fig. S11B).