of 17
Parallel inverse-problem solver for time-domain optical tomography
with perfect parallel scaling
E. L. Gaggioli
a,b,
, and O. P. Bruno
c
a
Instituto de Astronom ́ıa y F ́ısica del Espacio (IAFE, CONICET–UBA), Casilla de Correo 67 - Suc. 28 (C1428ZAA), Buenos Aires,
Argentina.
b
Departamento de F ́ısica, FCEyN, Universidad de Buenos Aires.
c
Computing and Mathematical Sciences, Caltech, Pasadena, CA 91125, USA.
Abstract
This paper presents an efficient
parallel
radiative transfer-based inverse-problem solver for time-domain optical tomogra-
phy. The radiative transfer equation provides a physically accurate model for the transport of photons in biological tissue,
but the high computational cost associated with its solution has hindered its use in time-domain optical-tomography
and other areas. In this paper this problem is tackled by means of a number of computational and modeling innovations,
including 1) A spatial parallel-decomposition strategy with
perfect parallel scaling
for the forward and inverse problems
of optical tomography on parallel computer systems; and, 2) A Multiple Staggered Source method (MSS) that solves
the inverse transport problem at a computational cost that is
independent of the number of sources employed
, and which
significantly accelerates the reconstruction of the optical parameters: a six-fold MSS acceleration factor is demonstrated
in this paper. Finally, this contribution presents 3) An intuitive derivation of the adjoint-based formulation for eval-
uation of functional gradients, including the highly-relevant general Fresnel boundary conditions—thus, in particular,
generalizing results previously available for vacuum boundary conditions. Solutions of large and realistic 2D inverse
problems are presented in this paper, which were produced on a 256-core computer system. The combined parallel/MSS
acceleration approach reduced the required computing times by several orders of magnitude, from months to a few hours.
1. Introduction
This paper presents an efficient parallel inverse-problem
solver, based on the radiative transfer equation (RTE), for
time-domain optical tomography. As is well known, the
RTE provides a physically accurate model for the trans-
port of photons in biological tissue [1, 2], but its appli-
cability in the context of inverse problems has been hin-
dered by the high computational cost required for its so-
lution. In this paper this difficulty is addressed by means
of a combination of three main strategies, namely, (i) Use
of the spectral FC-DOM approach [3] for the solution of
the RTE equations; (ii) An effective parallel implementa-
tion of the FC-DOM method based on a spatial domain-
decomposition strategy; and (iii) A Multiple Staggered
Source (MSS) setup, which utilizes certain combinations of
sources operating simultaneously instead of the sequences
of single sources used in previous approaches. When imple-
mented on a 256-core computer cluster the overall paral-
lel/MSS approach results in computing-time acceleration
by several orders of magnitude—thus making it possible
to solve large 2D inverse problems, such as the problem of
imaging within a neck model considered in Section 7.
The proposed direct parallel solvers provide a num-
ber of advantages. In contrast to other algorithms, for
Email address:
egaggioli@iafe.uba.ar
(E. L. Gaggioli)
example, the approach presented in this contribution is
highly efficient independently of the number of sources
employed (cf. [4]). Alternative parallel strategies based
on GPU architectures [5] might be appropriate for optical
tomography based on the diffusion approximation. But,
unfortunately, the diffusion approximation is not accurate
for many applications, and, in view of the high storage
required by the RTE-based time-domain inverse problem,
a GPU-based RTE solver may not be viable. In the re-
view [6] of parallelization strategies for this problem, for
strong scaling tests, all parallelization strategies show an
efficiency scaling significantly below the ideal. Ref. [7]
presents the only parallel strategy known to the authors
which reports a perfect parallel efficiency, but the applica-
bility of the method is restricted by a fundamental limita-
tion to non-absorbing and non-scattering media. In con-
trast, the parallel algorithm presented in this paper enjoys
perfect parallel efficiency for general problems, including
arbitrarily prescribed scattering and absorption, and finds
no restrictions regarding the transport regime, the number
of sources, or the number of discrete ordinates employed.
A parallel efficiency of 136.7% with up to 256 cores for
the benchmark presented in this work has been obtained
by means of the proposed parallelization for the FC-DOM
algorithm (see sec. 4.3 Figure 4 and its caption). For com-
parison, reference [8, p. 153] reports a computing time
of 44.3 hours for a single 2D forward time solution of the
Preprint submitted to Elsevier
February 22, 2022
arXiv:2202.09421v1 [physics.med-ph] 16 Feb 2022
RTE, which, running on 64 cores, the algorithms presented
in this paper obtain in less than thirty minutes.
As mentioned above, an additional significant reduc-
tion in the computing time required for the solution of
the inverse problem is achieved by exploiting the proposed
MSS method—which, combining multiple sources in each
RTE solution, reduces the number of time-domain direct
and adjoint RTE solutions required. MSS acceleration by
a factor of six is demonstrated in this paper, without any
degradation in the accuracy of the inverse problem recon-
struction, relative to the time required by the Transport
Sweep method [9, 10] (TS) ubiquitously encountered in
optical tomography.
The rest of this paper is organized as follows. Sec-
tion 2 presents the classical optical-tomography forward
problem. Section 3 then presents a derivation of the ad-
joint gradient formalism under the highly relevant general
Fresnel boundary conditions (thus generalizing results pre-
viously available for vacuum boundary conditions) which,
in particular, incorporates a class of
generalized sources
inherent in the MSS method in addition to the classical
source types inherent in the TS method. A direct compar-
ison, presented in Section 3.6, between gradients for the
objective function obtained by finite differences and the
adjoint method demonstrate the validity and accuracy of
the proposed adjoint approach. Section 4, in turn, presents
the numerical methods employed for the numerical solu-
tion of the RTE, as well as the spatial domain decom-
position strategy utilized for their efficient parallelization.
Section 5 demonstrates the high-order convergence of the
overall RTE parallel solver for smooth solutions, and it
presents a demonstration for a photon density of a type
used in laboratory settings. Section 7 presents results of a
number of reconstructions obtained by the proposed meth-
ods, including large 2D inverse problems, such as the prob-
lem of imaging within a head model and an MRI-based
neck model.
2. Preliminaries
Defining the transport operator
T
[
u
] =
1
c
∂u
(
x
,θ,t
)
∂t
+
ˆ
θ
·∇
u
(
x
,θ,t
) +
a
(
x
)
u
(
x
,θ,t
)
+
b
(
x
)
[
u
(
x
,θ,t
)
S
1
η
(
ˆ
θ
·
ˆ
θ
)
u
(
x
,t
)
]
,
(1)
we consider the initial and boundary-value radiative trans-
fer problem
T
[
u
]
= 0
,
(
x
)
×
[0
,
2
π
)
u
(
x
,θ,t
= 0) = 0
,
(
x
)
×
[0
,
2
π
)
u
(
x
,θ,t
) =
f
(
ˆ
θ
·
ˆ
ν
)
u
(
x
r
,t
) +
q
(
x
,θ,t
)
,
(
x
)
Γ
(2)
for the amount of energy
u
=
u
(
x
,θ,t
) irradiated at a
point
x
, propagating with direction
ˆ
θ
S
1
, at a time
t
,
over the two-dimensional spatial domain Ω and for 0
t
T
(
T >
0), where
ˆ
θ
= (cos(
θ
)
,
sin(
θ
)) (0
θ
2
π
),
ˆ
θ
= (cos(
θ
)
,
sin(
θ
)) (0
θ
2
π
), and where
S
1
,
c
,
a
(
x
) and
b
(
x
) denote the unit circle, the average speed of
light in the medium, and the absorption and scattering
coefficients, respectively. Further, ˆ
ν
denotes the outward
unit vector normal to the boundary
Ω of the domain Ω,
Γ
±
=
{
(
x
)
×
[0
,
2
π
)
,
±
ˆ
ν
·
ˆ
θ >
0
}
—with subindex
(resp. +) denoting the set of incoming (resp. outgo-
ing) directions—and
q
(
x
,θ,t
) denotes a given source func-
tion, which can be used to model illuminating lasers that
inject radiation over some portion of Γ
. Additionally,
the function
f
(
ˆ
θ
·
ˆ
ν
) =
f
(cos(
θ
θ
ν
)) denotes the Fres-
nel coefficient, according to which radiation is reflected
into Ω and transmitted away from Ω in the directions
ˆ
θ
r
= (cos(
θ
r
)
,
sin(
θ
r
)) and
ˆ
θ
tr
= ˆ
α
(
ˆ
θ
) respectively, where,
in accordance with Snells’ law for the refractive indexes
n
and
n
0
within and outside Ω,
θ
r
=
R
(
θ
) = 2
θ
ν
θ
+
π
(mod 2
π
) and ˆ
α
(
ˆ
θ
) denote a reflection function and the
transmission angle, respectively. And, finally
η
(
ˆ
θ
·
ˆ
θ
) =
1
2
π
1
g
2
(1 +
g
2
2
g
ˆ
θ
·
ˆ
θ
)
3
/
2
(3)
denotes the widely used Henyey–Greeinstein phase func-
tion [11] which models the angle dependent redistribu-
tion probability after a photon collision event. The in-
dex
g
[
1
,
1] characterizes the scattering anisotropy:
pure forward scattering (resp. pure backward scattering),
wherein all photons emerge in the forward (resp. back-
ward) direction after a collision event, corresponds to a
value of
g
= 1 (resp.
g
=
1). Isotropic scattering, in
turn, where a photon emerges with equal probability in
any given direction after collision corresponds to the value
g
= 0. The total density of photons at a particular point
x
at a time
t
, which is considered in particular in the context
of the photon diffusion approximation, is quantified by the
scalar flux
Φ(
x
,t
) =
S
1
u
(
x
,θ,t
)
dθ.
(4)
To conclude this section we mention the window func-
tion [12]
w
(
v
) =
1
for
v
= 0
,
exp
(
2
e
1
/
|
v
|
|
v
|−
1
)
for 0
<
|
v
|
<
1
,
0
for
|
v
|≥
1
.
(5)
of the real variable
v
, which vanishes for
|
v
| ≥
1 and
smoothly transitions to one in the interval
1
< v <
1.
This function is used in multiple roles in what follows—
including modeling of both the temporal profile and the
collimated irradiation of the laser beam, as well as the
spatial dependence of the sensitivity of the detectors.
3. Inverse problem and adjoint gradient formalism
3.1. Objective functions
This section introduces the general form of the
objec-
tive functions
whose minima yield the solutions of the TS
2
and MSS inverse transport problems we consider, as well
as the
adjoint formalism
we use for the efficient evalua-
tion of the corresponding objective-function gradients un-
der general Fresnel boundary conditions—that is, using
a possibly non-vanishing Fresnel coefficient
f
in the last
line in equation (2). For
f
(
ˆ
θ
·
ˆ
ν
)
0 the gradient ex-
pression we obtain coincides with the well-known result
for homogeneous boundary conditions that are relevant in
neutron transport [13] (vacuum boundary conditions) and
that are also often assumed in the context of optical to-
mography [9, 10, 14].
The TS and MSS objective functions are based on use
of sources of two different types, both of which can ex-
pressed in the form
q
=
q
i
(
x
,θ,t
) =
N
s
k
=1
s
k,i
(
x
,θ,t
)
, i
= 1
,
2
,...,N
q
(6)
for certain integer values of
N
q
and
N
s
. Here, using the
window function (5), we define
s
k,i
(
x
,θ,t
) = exp(
−|
x
x
k,i
|
2
/
2
σ
2
x
)
w
(
δ
k,i
)
T
(
t
τ
k,i
)
,
(7)
for given laser-beam positions
x
k,i
Ω, beam diameter
σ
x
, time delays
τ
k,i
0, and angular departures
δ
k,i
=
|
θ
θ
k,i
|
θ
, where
θ
k,i
(0
θ
k,i
<
2
π
) and
σ
θ
model
the center and the angular spread of the nearly collimated
irradiation from the (
k,i
)-th laser beam. Note that, for
both the TS and MSS configurations a total of
N
b
=
N
q
N
s
(8)
laser beams are utilized.
For the TS-type sources we set
N
s
= 1 and, gener-
ally,
N
q
>
1 (that is to say, a sequence of
N
q
>
1 single-
laser sources is used, requiring the solution of
N
q
pairs of
forward/adjoint problems per gradient-descent iteration),
while in the proposed MSS-type sources we let
N
s
>
1 and
N
q
= 1 (so that
N
s
>
1 laser beams are combined in a sin-
gle “generalized” source, thus requiring the solution of only
one forward/adjoint problem pair per gradient-descent it-
eration). In each “sweep” of the TS method each one of
the
N
q
>
1 sources is applied independently of all oth-
ers, with delays
τ
1
,i
= 0 (1
i
N
q
), and readings are
recorded at all of the detectors used [9, 10]. In the pro-
posed MSS method, instead, the single generalized source
(
N
q
= 1) is used which incorporates time-staggered con-
tributions from
N
s
>
1 laser beams along
Ω, with time
delays
τ
k,
1
0. In view of the time delays it utilizes, the
single forward/adjoint solve used in the MSS method re-
quires longer computing time than each one of the
N
q
>
1
forward/adjoint solves required by the TS method. In all,
as illustrated in Section 7, the combined-source strategy
inherent in the MSS approach leads to significant gains in
the overall inversion process without detriment in recon-
struction accuracy.
Both the TS and MSS approaches rely on use of a
number
N
d
1 of detectors, where the
j
-th detector
(1
j
N
d
), which is placed at the point
x
j
Ω,
is characterized by a measurement operator
G
j
=
G
j
[
u
](
t
)
defined by
G
j
[
u
] =
ˆ
θ
·
ˆ
ν>
0
[1
f
(
ˆ
θ
·
ˆ
ν
)]
ˆ
θ
·
ˆ
ν
×
w
(
|
x
x
j
|
σ
d
)
u
(
x
,θ,t
)
dθdS
(9)
for any given function
u
=
u
(
x
,θ,t
) defined for (
x
,θ,t
)
×
[0
,
2
π
)
×
[0
,T
]. Here, using equation (5) and letting
σ
d
>
0 denote the effective area of the detectors, the factor
w
(
|
x
x
j
|
d
) characterizes the spatial sensitivity of the
j
-th detector and
dS
denotes the element of area on
Ω.
Clearly, the operator
G
j
quantifies the flux of transmitted
photons over the surface of the detector. For each gener-
alized source
q
i
we have a set of
N
d
time resolved detector
readings. The position and number of detectors remain
fixed throughout the inversion process.
3.2. Inverse problem
The reconstruction of the absorption properties in tis-
sue enables the identification of tumors [15, 16, 17], func-
tional imaging of the brain [18, 19, 20], and characteriza-
tion of different tissue constituents in medical imaging.
In this work we focus in the reconstruction of the ab-
sorption coefficient only, although the proposed approach
can be easily extended to other reconstruction problems,
such as, e.g., the problem of determining the RTE sources,
with application in the related discipline of fluorescence
optical tomography and bioluminescence tomography [1,
14, 21]. Prior information on the scattering coefficient
b
(
x
), which can be obtained from high resolution imag-
ing modalities, is generally assumed for cancer diagnosis
and treatment monitoring [22, 23]. Such prior knowledge
additionally provides (limited) information on the parame-
ter
a
(
x
), namely, the known absorption coefficient of, say,
bone and air, on one hand, as well as upper and lower
bounds on the absorption coefficient of soft tissue, which
can be used to constrain the space of functions where the
minimizer is sought. In view of these considerations, in
what follows we make explicit the dependence of the trans-
port operator
T
and the solution
u
in eq. (1) on the ab-
sorption coefficient
a
=
a
(
x
) by denoting
T
[
u
] =
T
[
u,a
] =
T
[
u,a
](
x
,θ,t
)
(10)
and
u
=
u
[
a
] =
u
[
a
](
x
,θ,t
)
,
(11)
respectively.
We express our inversion problem for the optical pa-
rameter
a
(
x
) in terms of the problem of minimization of
the objective function
Λ[
a
] =
N
q
i
=1
g
i
[
u
i
]
,
(12)
3
where, for a given absorption coefficient
a
,
u
i
=
u
i
[
a
] =
u
i
[
a
](
x
,θ,t
)
(13)
denotes the solution
u
=
u
i
of equation (2) with
q
=
q
i
(increasingly added detail is included in eq. (12) from left
to right concerning the dependence of
u
i
on
a
and the
spatial, angular and temporal variables), and where, for
a given number
N
q
×
N
d
of detector measurements
̃
G
j,i
(
N
d
detector readings
̃
G
j,i
for each of the
N
q
generalized
sources
q
i
) and using eq. (9)
g
i
, denotes the functional
g
i
[
u
] =
1
2
N
d
j
=1
T
0
(
G
j
[
u
]
̃
G
j,i
)
2
dt.
(14)
3.3. Functional derivatives
To minimize the objective function eq. (12) we rely on
a gradient descent algorithm based on use of the func-
tional derivative
d
Λ
da
[
a
;
δa
] with respect to the absorption
coefficient function
a
=
a
(
x
) in the direction
δa
. Here
d
da
denotes Gateaux differentiation [24]: for a given func-
tion
a
=
a
(
x
) and a given perturbation
δa
=
δa
(
x
), the
Gateaux derivative of a given functional
h
=
h
[
a
] in the
direction
δa
is defined by
dh
da
[
a
;
δa
] = lim
ε
0
h
[
a
+
εδa
]
h
[
a
]
ε
.
(15)
A similar definition can be given for partial Gateaux deriva-
tives for an operator
w
=
w
[
a
](
x
,θ,t
) (such as, e.g., the
operator (1), the solution
u
=
u
[
a
] =
u
[
a
](
x
,θ,t
) of equa-
tion (2), etc.):
∂w
∂a
[
a
;
δa
](
x
,θ,t
) = lim
ε
0
w
[
a
+
εδa
](
x
,θ,t
)
w
[
a
](
x
,θ,t
)
ε
.
(16)
In what follows we utilize Gateaux derivatives of com-
position of functionals and operators, for which the chain
rule is satisfied. For example, for the composition
h
w
[
a
] =
h
[
w
[
a
]
]
we have the chain-rule identity
d
(
h
w
)
da
[
a
;
δa
]
=
dh
dw
[
w
[
a
];
∂w
∂a
[
a
;
δa
]
]
.
(17)
In our context we may illustrate this relationship as fol-
lows. As a variation equal to a real number
ε
times a func-
tion
δa
=
δa
(
x
) is added to the function
a
, a perturbed
function (
a
+
εδa
) is obtained and, thus, a perturbed oper-
ator value
w
[
a
+
εδa
]. (In our case, the perturbed operator
value could be e.g. the solution
u
[
a
+
εδa
] of equation (2)
with absorption coefficient (
a
+
εδa
); cf. eq. (11).) In view
of the Gateaux-derivative definition (16) we obtain
w
[
a
+
εδa
] =
w
[
a
] +
ε
∂w
∂a
[
a
;
δa
] +
o
(
ε
)
where
o
(
ε
)
ε
0 as
ε
0. In other words, the error in
the approximation
w
[
a
+
εδa
]
w
[
a
] +
ε
∂w
∂a
[
a
;
δa
] is much
smaller than
ε
. We may thus utilize the approximation
h
[
w
[
a
+
εδa
]
]
h
[
w
[
a
] +
ε
∂w
∂a
[
a
;
δa
]
]
in the quotient of increments, of the form (16), for the
derivative of the composite function
h
[
w
[
a
]
]
, which yields
lim
ε
0
h
[
w
[
a
+
εδa
]
]
h
[
w
[
a
]
]
ε
=
lim
ε
0
h
[
w
[
a
] +
ε
∂w
∂a
[
a
;
δa
]
]
h
[
w
[
a
]
]
ε
,
and, thus, clearly, the right-hand side of (17), as desired.
The needed functional derivative of the objective func-
tion (12) is given by
d
Λ
da
=
N
q
i
=1
d
(
g
i
u
i
)
da
[
a
;
δa
]
.
(18)
In order to obtain the derivatives in the sum on the right-
hand side of this equation we apply the chain rule iden-
tity (17), which yields
d
(
g
i
u
i
)
da
[
a
;
δa
] =
dg
i
du
[
u
i
[
a
];
∂u
i
∂a
[
a
;
δa
]
]
,
(19)
or, using (9) and (14),
d
(
g
i
u
i
)
da
[
a
;
δa
] =
G
[
a
;
δa
] where
G
[
a
;
δa
]
:
=
T
0
ˆ
θ
·
ˆ
ν>
0
N
d
j
=1
(
G
j
[
u
i
[
a
]
]
̃
G
j,i
)
[1
f
(
ˆ
θ
·
ˆ
ν
)]
×
ˆ
θ
·
ˆ
νw
(
|
x
x
j
|
σ
d
)
∂u
i
∂a
[
a
;
δa
](
x
,θ,t
)
dθdSdt.
(20)
Clearly, in view of eq. (20), the gradients (18) required
by the gradient descent strategy in a fully discrete context
could be produced by evaluating and substituting in this
equation the derivative
∂u
i
∂a
[
a
;
δa
], for each (discretized)
absorption coefficient
a
in the gradient descent process,
and for all (discretized) directions
δa
. But, the evalua-
tion of these partial derivatives, say, by means of a simple
finite difference scheme, requires evaluation of one fully
spatio-temporal solution of the transport eq. (2) for each
direction
δa
, which clearly entails an extremely high, crip-
pling, computational burden. To avoid this computational
expense we rely on the adjoint-method strategy, which is
described in what follows.
3.4. Fast gradient evaluation via the adjoint method
To evaluate the derivative displayed in eq. (20) we seek
to eliminate the quantity
∂u
i
∂a
[
a
;
δa
] from the right-hand
side of this equation. As indicated in what follows, this
can be achieved by considering the initial and boundary
problem that is obtained by differentiation, for the given
a
and in the direction
δa
, of each one of the three equations
in the initial and boundary problem (2). From the first
line in (2), in particular, we obtain
0 =
d
T
da
[
u
i
[
a
]
,a
;
δa
]
=
T
∂u
[
u
i
[
a
]
,a
;
∂u
i
∂a
[
a
;
δa
]
]
+
T
∂a
[
u
i
[
a
]
,a
;
δa
]
.
(21)
4
But, by linearity of
T
we have
T
∂u
[
u
i
[
a
]
,a
;
∂u
i
∂a
[
a
;
δa
]
]
=
T
[
∂u
i
∂a
[
a
;
δa
]
,a
]
,
(22)
and, thus, in view of (2), the relation
T
∂a
[
u
i
[
a
]
,a
;
δa
]
+
T
[
∂u
i
∂a
[
a
;
δa
]
,a
]
= 0
(23)
results. This relation provides, for each relevant triple
(
x
,θ,t
), one linear equation for the two unknowns
u
i
[
a
]
and
∂u
i
∂a
[
a
;
δa
]
.
In order to eliminate
∂u
i
∂a
[
a
;
δa
] from the right-hand side
of (20) we subtract from from both sides of this identity
a “linear combination with suitable coefficients”
λ
of the
relation (23)—or, more precisely, an integral of the prod-
uct of this relation times a suitable function
λ
(
x
,θ,t
) over
(
x
,θ,t
)
×
[0
,
2
π
)
×
[0
,T
]. (Below we incorporate addi-
tional equations related to the initial and boundary con-
ditions in (2) as well.) For notational compactness we
express such integrals in terms of the scalar product nota-
tion
v,w
=
T
0
S
1
v
(
x
,θ,t
)
w
(
x
,θ,t
)
dθd
x
dt
(24)
for any two functions
v
and
w
of the variables (
x
,θ,t
).
Thus, for a given function
λ
i
=
λ
i
[
a
](
x
,θ,t
) we obtain
from (23) the equation
λ
i
,
T
∂a
[
u
i
,a
;
δa
]
+
λ
i
,
T
[
∂u
i
∂a
[
a
;
δa
]
,a
]〉
= 0
,
(25)
which, for a suitably selected function
λ
i
we intend to sub-
tract from (20) to achieve the cancellation of the challeng-
ing derivative term.
To select the function
λ
i
that attains such cancellation
we rely on an integration-by-parts calculation to express
the second summand in (25) as an integral of a product
of two functions, one of which is precisely
∂u
i
∂a
. Integra-
tion by parts of that second summand leads to a sum
of a “volumetric” integral
A
(namely, an integral over
×
[0
,
2
π
)
×
[0
,T
]) plus a sum
B
+
C
of integrals over
various portions of the boundary of this domain:
λ
i
,
T
[
∂u
i
∂a
[
a
;
δa
]
,a
]〉
=
A
+
B
+
C
(26)
where
A
[
a
;
δa
]
:
=
T
0
S
1
∂u
i
∂a
[
a
;
δa
]
[
1
c
∂λ
i
∂t
(27)
ˆ
θ
·∇
λ
i
+ (
a
+
b
)
λ
i
b
S
1
η
(
ˆ
θ
·
ˆ
θ
)
λ
i
]
dθd
x
dt,
B
[
a
;
δa
]
:
=
S
1
[
∂u
i
∂a
[
a
;
δa
]
λ
i
]
T
0
dθd
x
,
(28)
and
C
[
a
;
δa
]
:
=
T
0
S
1
ˆ
θ
·
ˆ
νλ
i
∂u
i
∂a
[
a
;
δa
]
dθdSdt.
(29)
Subtracting the linear combination (25) from (20) and us-
ing (26)-(29) we obtain
d
(
g
i
u
i
)
da
[
a
;
δa
] =
G−A−B−C
λ
i
,
T
∂a
[
u
i
,a
;
δa
]
.
(30)
Clearly, the quantity
∂u
i
∂a
in (30) will be eliminated, as
desired, if and only if
A
+
B
+
C
=
G
,
(31)
since the last term on the right-hand side of (30) does no
contain
∂u
i
∂a
. Once we select
λ
i
such that (31) is satisfied,
and using the Gateaux derivative relation
T
∂a
[
u
i
,a
;
δa
] =
δa
(
x
)
u
i
[
a
](
x
,θ,t
)
,
(32)
the expression
d
(
g
i
u
i
)
da
[
a
;
δa
] =
−〈
λ
i
[
a
]
,δau
i
[
a
]
(33)
for the functional derivative, which does not contain the
challenging term
∂u
i
∂a
, results from (30).
To obtain the solution
λ
i
=
λ
i
[
a
](
x
,θ,t
) of eq. (31) we
note that, in view of the spatial integration domains in
eqs. (20) and (27)-(29), eq. (31) is satisfied if and only if
(i)
A
= 0, (ii)
B
= 0 and (iii)
C −G
= 0. Equation (i) is
satisfied provided the term in brackets in the integrand
of (27), which will be denoted by
T
[
λ
i
[
a
]
,a
]
in what
follows, equals zero. Relation (ii) is satisfied by impos-
ing the appropriate “final”condition
λ
i
(
x
,θ,t
=
T
) = 0,
since, in view of (2), we have
∂u
i
∂a
= 0 for
t
= 0. In
order to fulfill point (iii), finally, we decompose the inte-
gral (29) into two integrals,
C
+
and
C
, where integra-
tion ranges in the
θ
variable are restricted to angular do-
mains
ˆ
θ
·
ˆ
ν >
0 and
ˆ
θ
·
ˆ
ν <
0, respectively. The inte-
grand in the difference
C
+
−G
, which is only integrated
over the angular domain
ˆ
θ
·
ˆ
ν >
0, equals the product
of the common factor
F
=
∂u
i
∂a
ˆ
θ
·
ˆ
ν
and the difference
P
=
λ
i
N
d
j
=1
(
G
j
[
u
i
]
̃
G
j,i
)
×
[1
f
(
ˆ
θ
·
ˆ
ν
)]
w
(
|
x
x
j
|
σ
d
)
. To
incorporate the summand
C
under the same
ˆ
θ
·
ˆ
ν >
0 inte-
gration range, in turn, we first utilize the Fresnel boundary
condition
∂u
i
∂a
(
x
,θ,t
) =
f
(
ˆ
θ
·
ˆ
ν
)
∂u
i
∂a
(
x
r
,t
) ((
x
)
Γ
)
that results from differentiation of the boundary condition
in eq. (2), and we thus obtain
C
[
a
;
δa
]
:
=
T
0
ˆ
θ
·
ˆ
ν<
0
ˆ
θ
·
ˆ
νλ
i
(
x
,θ,t
)
f
(
ˆ
θ
·
ˆ
ν
)
×
∂u
i
∂a
[
a
;
δa
](
x
r
,t
)
dθdSdt.
5
Then, incorporating the change of variables
θ
= 2
θ
ν
θ
r
+
π
, so that
ˆ
θ
·
ˆ
ν
= cos(
θ
θ
ν
) =
cos(
θ
r
θ
ν
) =
ˆ
θ
r
·
ˆ
ν
we
obtain
C
[
a
;
δa
]
:
=
T
0
ˆ
θ
r
·
ˆ
ν>
0
ˆ
θ
r
·
ˆ
νλ
i
(
x
,
2
θ
ν
θ
r
+
π,t
)
×
f
(
ˆ
θ
r
·
ˆ
ν
)
∂u
i
∂a
[
a
;
δa
](
x
r
,t
)
r
dSdt.
Substituting the dummy variable
θ
r
by
θ
in this equation,
the angular argument in
λ
i
becomes 2
θ
ν
θ
+
π
which
coincides with
θ
r
, and, thus, calling
Q
(
x
r
,t
) =
f
(
ˆ
θ
·
ˆ
ν
)
λ
i
(
x
r
,t
) we obtain
C
[
a
;
δa
]
:
=
T
0
ˆ
θ
·
ˆ
ν>
0
ˆ
θ
·
ˆ
νQ
(
x
r
,t
)
×
∂u
i
∂a
[
a
;
δa
](
x
,θ,t
)
dθdSdt.
Combining this result with the term
C
+
−G
we obtain
(
C−G
)[
a
;
δa
]
:
=
T
0
ˆ
θ
·
ˆ
ν>
0
F
×
[
P
(
x
,θ,t
)
Q
(
x
r
,t
)]
dθdSdt,
and, thus, (iii) is satisfied provided
P
Q
= 0.
In summary, denoting
T
[
λ
i
[
a
]
,a
]
=
1
c
∂λ
i
∂t
ˆ
θ
·∇
λ
i
+
(
a
+
b
)
λ
i
b
S
1
η
(
ˆ
θ
·
ˆ
θ
)
λ
i
, we have shown that the con-
ditions (i), (ii) and (iii) are satisfied provided the corre-
sponding “adjoint” problem
T
[
λ
i
[
a
]
,a
]
= 0
,
(
x
)
×
[0
,
2
π
)
λ
i
(
x
,θ,t
=
T
) = 0
,
(
x
)
×
[0
,
2
π
)
,
and
λ
i
(
x
,θ,t
) =
f
(
ˆ
θ
·
ˆ
ν
)
λ
i
(
x
r
,t
) +
N
d
j
=1
(
G
j
[
u
i
]
̃
G
j,i
)
×
[1
f
(
ˆ
θ
·
ˆ
ν
)]
w
(
|
x
x
j
|
σ
d
)
,
(
x
)
Γ
+
(34)
hold. Thus, the function
λ
i
(
x
,θ,t
) needed in eq. (33) can
be obtained by solving the
adjoint back transport
prob-
lem (34) in the time interval
T
t
0 with homoge-
neous final data at time
t
=
T >
0. Once the function
λ
i
has been obtained the component of the functional gradi-
ent (33) in the direction
δa
can inexpensively be obtained
by integration, which, in view of (24), may be expressed
in the form
d
(
g
u
)
da
[
a
;
δa
] =
T
0
S
1
λuδadθd
x
dt.
(35)
The correctness and accuracy of the proposed approach
for gradient evaluation are demonstrated in the following
section via comparisons with direct finite-difference gradi-
ent computations.
3.5. Verification and accuracy assessment of the functional-
derivative expression
(33)
In this section we present numerical verifications and
accuracy assessments for the functional derivative expres-
sion (35), with
λ
i
given as the solution of the adjoint
problem (34). To do this we consider a problem of the
type (2) with Fresnel boundary conditions, described in
what follows—so as to illustrate, in particular, the abil-
ity of the functional-derivative expression to produce cor-
rect gradients in this case, for which corresponding adjoint
treatments were not previously available. As a basis for
comparison we obtain numerical gradients produced by di-
rect use of the finite-difference approximation
d
(
g
u
)
da
[
a
;
δa
]
FD
g
[
u
[
a
+
εδa
]
]
g
[
u
[
a
]
]
ε
(36)
for a given direction
δa
(
x
) and a suitable small value of
ε
.
The index of refraction
n
= 1
.
4 is assumed in the spa-
tial domain Ω = [
x
min
,x
max
]
×
[
y
min
,y
max
] = [0
,
3]
×
[0
,
3]
and with
n
0
= 1 outside Ω. For simplicity we use
δa
= 1,
with given spatially constant values
a
(
x
) =
a
and
b
(
x
) =
b
of the absorption and scattering coefficients, and, without
loss of generality, we consider a single generalized source
q
1
=
q
for both a TS source case (a single laser incident
beam located at
x
s
= (1
.
5
,
0
.
0)) and an MSS source case
(assumed to consist of the combination of four laser inci-
dent beams, one located at the center of each one of the
sides of the square domain Ω). (Further details on the
modeling of sources can be found at the end of sec. 5,
and eq. (48).) For these tests we employ a single detector
placed at
x
d
= (0
.
0
,
0
.
75). The time delays required by the
MSS method are selected for these examples by enforcing
a 50ps time-shift between successive laser start times. A
total duration of 60ps was used for each single pulse, and
the system was evolved for both TS and MSS cases up to a
final time of 600ps. A mesh with
N
x
=
N
y
= 200,
M
= 32
discrete directions and
T
= 60000 time steps was used for
the solution of the RTE and its adjoint. The relative error
e
=
d
(
g
u
)
da
[
a
;
δa
]
Adj
d
(
g
u
)
da
[
a
;
δa
]
FD
|
d
(
g
u
)
da
[
a
;
δa
]
Adj
|
,
(37)
was used to quantify the quality provided by the pro-
posed adjoint gradient expression, where
d
(
g
u
)
da
[
a
;
δa
]
Adj
and
d
(
g
u
)
da
[
a
;
δa
]
FD
denote the adjoint and finite difference
derivatives, respectively; the value
ε
= 0
.
0001 was used to
produce the finite difference approximation (36).
Table (1) demonstrates the agreement observed be-
tween the values of the functional derivative produced by
the finite-difference and adjoint methods under various
transport regimes, including several values of the scatter-
ing coefficient
b
and anisotropy coefficient
g
, and under
both the TS and the MSS configurations; errors of sim-
ilar magnitudes were obtained for a wide range of val-
ues of the parameters
a
,
b
and
g
. The excellent agree-
ment observed in all cases suggests that the very large
6
Table 1: Functional derivative differences
a
[1
/cm
]
b
[1
/cm
]
g
e
TS
e
MSS
0
.
35
80
0
.
9
0
.
00008
0
.
00009
0
.
35
20
0
.
0
0
.
00008
0
.
00097
0
.
35
8
.
0
0
.
0
0
.
00027
0
.
00016
0
.
35
0
.
1
0
.
0
0
.
00009
0
.
00026
improvements in computational speed provided by the ad-
joint method, which would amount to a factor of the order
of (
N
x
+ 1)
×
(
N
y
+ 1)
'
40
,
000 for the evaluation of the
full gradient in the present example, do not impact upon
the accuracy in the gradient determination.
3.6. Numerical functional gradient calculation
All of the numerical gradients utilized in this paper
were obtained by solving forward and the adjoint prob-
lems, followed by use of a discrete version of eq. (35) for a
number of perturbation functions
δa
—each one of which
is selected to provide a variation of the absorption coeffi-
cient at and around one of the discretized spatial coordi-
nate points
x
`
1
,`
2
Ω in the discretization
x
`
1
,`
2
= (
x
min
+
[
`
1
1]∆
x
x
+ (
y
min
+ [
`
2
1]∆
y
y
,
`
1
= 1
,...,N
x
+ 1,
`
2
= 1
,...,N
y
+ 1 of the domain Ω. The perturbation
δa
is selected as a pyramid-shaped function which equals one
at a point
x
`
1
,`
2
Ω, and becomes zero at and beyond
the first neighbors in the discrete grid. At the discrete
level, each pyramid-shaped function is approximated by a
product of Kronecker delta functions
δa
`
1
,`
2
=
δ
r,`
1
×
δ
s,`
2
.
Thus, denoting by
a
g
(
x
`
1
,`
2
) the value of the functional
gradient in the direction
δa
`
1
,`
2
, the discrete version of
eq. (35) is given by
a
g
(
x
`
1
,`
2
)
∼−
m,j
λ
`
1
,`
2
,m,j
u
`
1
,`
2
,m,j
θ
x
y
t,
(38)
where
λ
`
1
,`
2
,m,j
λ
(
x
`
1
,`
2
m
,t
j
) and where
u
`
1
,`
2
,m,j
u
(
x
`
1
,`
2
m
,t
j
). Note that eq. (38) represents the func-
tional derivative for a single direction
δa
`
1
,`
2
, correspond-
ing to the (
`
1
,`
2
) component of the functional gradient. As
a result of the adjoint method, the evaluation of eq. (38)
for all (
`
1
,`
2
) requires only one forward and one adjoint
transport simulation for each generalized source
q
=
q
i
—a
calculation which, if performed by direct use of eq. (36)
requires a much larger number (
N
x
+ 1)
×
(
N
y
+ 1) of for-
ward transport simulations and associated overwhelming
computational cost. It must be noted, however, that the
adjoint method requires storage in memory of full forward
and adjoint transport solutions. The solver algorithm pro-
posed in Section 4 is well suited for parallel distributed sys-
tems, as it simultaneously provides computational speed
and distributed memory availability.
As an illustration, Figure 1 displays the full spatial
gradient
a
g
(
x
`
1
,`
2
) for 1
`
1
(
N
x
+ 1) and 1
`
2
(
N
y
+ 1), for certain assumed values
̃
G
j,i
, with a single
source and a single detector placed at
x
s
= (1
.
5
,
0) and
x
d
= (1
.
0
,
0) respectively.
Figure 1: (Color online). Full spatial gradient eq. (38) (in arbitrary
units) for a single source-detector pair, located at
x
s
= (1
.
5
,
0) and
x
d
= (0
,
1
.
0) respectively, for selected data values
̃
G
1
,
1
(one source
and one detector).
4. Parallel FC–DOM numerical implementation for
the Radiative Transfer Equation
The numerical treatment of the time dependent RTE
requires a discretization of all variables in phase space,
namely, the spatial, directional and temporal variables. In
this section we present a parallel algorithm for the numer-
ical solution of the RTE on the basis of such a discrete
grid in phase space. As evidenced by the approach used,
and demonstrated by means of numerical experiments in
Sections 4.4 and 5, the proposed algorithm enjoys high or-
der accuracy for smooth solutions as well as high parallel
efficiency.
4.1. Velocity-domain discretization
We discretize the RTE with respect to the velocity
variable
v
=
c
ˆ
θ
by means of the discrete ordinates me-
thod. To do this a set of
M
discrete directions
ˆ
θ
(
θ
m
) =
ˆ
θ
m
= (
ξ
m
m
) (1
m
M
) is utilized where the di-
rection cosines are given by
ξ
m
= ˆ
x
·
ˆ
θ
m
= cos(
θ
m
) and
η
m
= ˆ
y
·
ˆ
θ
m
= sin(
θ
m
) in terms of the Cartesian unit
vectors ˆ
x
and ˆ
y
, where
θ
m
=
2
π
(
m
1)
M
, m
= 1
,
2
,...,M.
(39)
The necessary angular integrations are produced by means
of the trapezoidal rule using the associated quadrature
weights
w
m
= 2
π/M
. Given that the specific intensity
u
is 2
π
-periodic in the angular variable
θ
, and provided
the transport solution is sufficiently smooth, the use of
the trapezoidal rule gives spectral accuracy for integration
with respect to this variable, as illustrated in Figure 5.
Letting
u
m
=
u
(
x
m
,t
), the semidiscrete version of
the differential equation in (2) translates into the system
of equations
1
c
∂u
m
∂t
+
ˆ
θ
m
·∇
u
m
+ (
a
+
b
)
u
m
b
M
m
=1
w
m
p
m,m
u
m
= 0
, m
= 1
,
2
,...,M.
7