Matrix product states with large sites
Henrik R. Larsson,
1,
a)
Huanchen Zhai,
1
Klaas Gunst,
1, 2, 3,
b)
and Garnet Kin-Lic Chan
1,
c)
1)
Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125,
USA
2)
Center for Molecular Modeling, Ghent University, Technologiepark 46, B-9052 Zwijnaarde,
Belgium
3)
Department of Physics and Astronomy, Ghent University, Krijgslaan 281, S9, B-9000 Ghent,
Belgium
We explore various ways to group orbitals into clusters in a matrix product state (MPS). We explain how a generic cluster
MPS can often lead to an increase in computational cost and instead propose a special cluster structure, involving only
the first and last orbitals/sites, with a wider scope for computational advantage. This structure is a natural formalism
to describe correlated multireference (MR) theories. We demonstrate the flexibility and usefulness of this approach by
implementing various uncontracted MR configuration interaction, perturbation and linearized coupled cluster theories
using an MPS with large cluster sites. Applications to the nitrogen dimer, the chromium dimer, and benzene, including
up to triple excitations in the external space, demonstrate the utility of an MPS with up to two large sites. We use our
results to analyze the quality of different multireference approximations.
I. INTRODUCTION
The density matrix renormalization group (DMRG) and
its associated ansatz of matrix product states (MPSs)
1–6
are
established as useful electronic structure approximations in
problems where there are a large number of correlated open
shells.
7–13
The formalism requires first mapping the orbitals
to a one-dimensional lattice of sites. One direction discussed
already in the first quantum chemistry DMRG paper
14
is the
possibility of grouping clusters of related orbitals into large
“sites” (Fig. 1), whose Hilbert space is then approximated out-
side of the truncation procedure of DMRG. Such a cluster MPS
seems attractive for interpretation, as one can group orbitals
corresponding to chemical identity, and it has been efficiently
implemented in a number of works.
15–20
However, because the
Hilbert space of the cluster site grows exponentially quickly
with the number of cluster orbitals, the computational advan-
tage is less clear. In addition, the drawbacks of clustering,
which gives rise to more complicated interactions and entan-
glement between clusters, has been understood since the ear-
liest formulations of the quantum renormalization group.
21,22
In the first part of this work, we analyze whether clustering
orbitals is a good idea in chemical problems from the view of
computational cost and accuracy.
In the second part, we discuss a specific setting where
grouping sites into large clusters has a clear theoretical compu-
tational advantage. This occurs when the clusters are at either
end of the DMRG lattice (Fig. 2). Because the clusters do not
share a common boundary, the cluster Hilbert space dimension
appears together with the MPS bond dimension in a computa-
tionally more favourable way than in a general cluster MPS. A
natural application for this type of MPS is to represent dynam-
ical correlations by clustering inactive and external orbitals.
As we demonstrate, this leads to a substantial cost reduction
a)
Electronic mail: larsson [a t] caltech . e
훿
u
b)
Present address: Quantum Simulation Technologies, Inc., Cambridge, MA
02139
c)
Electronic mail: garnetc [a t] caltech . e
훿
u
O
r
b
i
t
a
l
s
Many-body con
fi
gurations
FIG. 1. A cluster matrix product state obtained by combining multi-
ple orbitals into large sites. The Hilbert space of each of the large
sites consists of
푃
many-body configurations that may be further
approximated.
15
in MPS treatments of dynamic correlation.
The paper proceeds as follows. We start by analysing the
cost of cluster MPS (Section II A) and explain why compu-
tational gains are not expected in general settings. We then
analyse the conditions leading to the favourable cost of the
single and double cluster-site model (Section II B) and dis-
cuss its application to uncontracted multireference dynamic
correlation theories (Section II C). We next discuss the de-
tailed implementation of DMRG with large cluster sites in Sec-
tion III, taking advantage of the large-scale parallel DMRG
implementation in Ref. [23]. Finally in Section IV, we demon-
strate the cluster MPS implementation of uncontracted multi-
reference configuration interaction (MRCI), multi-reference
perturbation theory (MRPT), and multi-reference linearized
coupled cluster theories (MRLCC) in applications to the nitro-
gen dimer, chromium dimer, and the benzene molecule, using
complete active spaces (CASs) with up to 30 electrons and 30
orbitals, with up to triples in the external space, and with up
to 280 external orbitals. We conclude in Section V.
Note: while this manuscript was under review, a related
arXiv:2109.11036v2 [physics.chem-ph] 21 Jan 2022
2
FIG. 2. Panel (a), (b): Diagrammatic representation of a cluster
MPS with two large sites at either end of the lattice. This ansatz
demonstrates favourable scaling with respect to the large site Hilbert
space dimension because there is no shared boundary between the
large sites. Panel (c): Application to an uncontracted multireference
dynamic correlation wavefunction in the singles and doubles space,
where the Hilbert space of the green site covers the inactive orbitals
and the Hilbert space of the blue site covers the external orbitals.
preprint appeared that implements uncontracted MRCI by us-
ing an MPS with a single cluster site.
24
II. THEORY
A. Analysis of clustering sites in matrix product states
The benefit of clustering orbitals depends on the entangle-
ment structure of the problem. If the entanglement is such
that groups of sites are strongly entangled internally, but only
weakly entangled between the groups, then it may make sense
computationally to cluster into large sites. The critical ques-
tion is how large this difference in intra- versus inter-cluster
entanglement needs to be for a computational benefit.
To start, we recall the computational cost of the standard
DMRG algorithm, then examine the cost for multiple clusters
of sites, and then finally for large sites at the ends of the DMRG
lattice, the latter being the main focus in this work.
1. Standard matrix product states
In the standard MPS/DMRG formulation, the wavefunction
for
퐾
orbitals is mapped to a lattice of
퐾
sites and written as
a matrix product state (MPS) of bond dimension
퐷
,
1
Ψ
⟩
=
...
{
푛
푖
}
퐀
푛
1
퐀
푛
2
...
퐀
푛
퐾
푛
1
푛
2
...
푛
퐾
⟩
,
(1)
where the matrices
퐀
푛
푖
at “site”
푖
are of dimensions
퐷
푖
×
퐷
푖
+1
,
save those associated with the first and last sites, which are
vectors of size
퐷
푖
. The bond dimension of the MPS is then
defined as
퐷
=
max
푖
퐷
푖
. The local Hilbert space
{
푛
⟩
}
is that
of a spatial orbital
휙
and the dimension is
푃
= 4
(
{
푛
푖
⟩
} =
{
vac
⟩
,
휙
훼
푖
⟩
,
휙
훽
푖
⟩
,
휙
훼
푖
휙
훽
푖
⟩
}
).
The main cost of the DMRG algorithm when using elec-
tronic structure Hamiltonians stems from two steps, performed
at each site: (1) the construction and diagonalization of an
effective Hamiltonian in the product space of one or more
sites and the renormalized Hilbert space of their environment,
(2) the transformation of operators into the new renormalized
space.
14,25
Both steps contribute to the leading computational
scaling, which is usually given (per site) as
[(
퐾
2
<
+
퐾
)
퐷
3
푃
2
+
퐾
2
<
퐾
>
퐷
2
]
, assuming
퐷
푖
∼
퐷
.
퐾
<
(
퐾
>
) is the smaller (larger)
of the numbers of orbitals to the left and right of the biparti-
tion at site
푖
. In the following we assume that
퐷 ≫ 퐾
in order
to drop the second term, which stems from the transformation
of operators. In addition, as
푃
= 4
is a small constant, we
drop the
푃
dependence for a standard MPS. For the total cost,
퐾
<
∕
>
∼
퐾
and the cost per site is then multiplied by
퐾
to
obtain the leading cost for the DMRG algorithm of
[
퐾
3
퐷
3
]
.
For reference, the precise scalings for a normal MPS and dif-
ferent variations of MPSs discussed below are gathered in Ta-
ble I.
2. Cluster matrix product states
The above analysis can be repeated for a cluster MPS with
clusters of orbitals as sites. We assume that each cluster has
퐾
푐
orbitals, with a cluster Hilbert space of
푃
configurations. Then
the number of sites in the cluster MPS is reduced to
퐾
∕
퐾
푐
=
퐶
. To simplify the analysis, we assume that
퐾
푐
is similar for all
clusters. Note that to obtain the same accuracy as the standard
MPS the bond dimension used between two clusters must be
the same as the bond dimension in the standard MPS between
the sites at the boundary of the two clusters. Now consider
increasing the number of orbitals in each cluster
퐾
푐
. Assuming
the full Hilbert space of each cluster is used, then
푃
∼ 4
퐾
푐
.
Because
푃
is now potentially large, we consider it as important
as
퐷
and
퐾
in the analysis of the leading scaling.
The cost of the cluster MPS is then given by
{
퐶퐾
2
[
퐷
3
푃
+
퐷
2
poly
(
푃
)] +
퐶퐾
3
퐷
2
}
, analogous to the standard MPS cost.
We again assume that
퐷 ≫ 퐾
and drop the last term. In con-
trast to the cost given above for the standard MPS, here we have
written the first term without a
푃
2
dependence because we as-
sume the use of a tri-partition to perform the diagonalization
and renormalization. As explained below in Section III C, this
changes the cost of the first term per site from
(
퐾
2
<
퐷
3
푃
2
)
to
(
퐾
<
퐾
>
퐷
3
푃
)
. The term containing poly
(
푃
)
stems from ap-
plying operators in the cluster space onto the site and is at most
푃
2
. However, in many common situations we can use a local
basis (such as a determinantal basis) in which the Hamiltonian
is sparse. Then poly
(
푃
) ∼
푃
. Hence, the leading cost of the
cluster MPS simplifies to
(
퐶퐾
2
퐷
3
푃
)
.
While
푃
only appears linearly in the scaling, it grows
3
TABLE I. Comparison of the scaling for the computational cost of the various MPS-based DMRG formulations.
퐷
defines the bond dimension,
퐾
the number of orbitals,
푃
the number of configurations on the large site, and
퐶
the number of clusters (large sites) in a cluster MPS. For
the MPS with large sites at either ends, we here assume only one large site with
퐾
ext
orbitals, for simplicity. The number of standard (orbital-
based) sites is then
퐾
act
and
퐾
ext
+
퐾
act
=
퐾
. The first column shows the costs for carrying out the Davidson diagonalization and operator
renormalization steps, while the second column is an additional cost for the complementary operator renormalization step.
method
diagonalization & renorm. operators compl. renorm. op.
normal MPS
퐾
3
퐷
3
퐾
4
퐷
2
cluster MPS
퐶퐾
2
[
퐷
3
푃
+
퐷
2
poly
(
푃
)]
퐶퐾
3
퐷
2
single large site MPS
퐾
act
(
퐾
2
act
+
퐾
)
퐷
3
act
퐾
2
act
퐾퐷
2
+(
퐾
2
act
+
퐾
)[
퐷
2
ext
푃
+
퐷
ext
poly
(
푃
)]
MPS-MRCISD (large site MPS) as single large site MPS with
푃
=
퐾
2
ext
MPS-MRCISD
퐾
act
(
퐾
2
act
+
퐾
)
퐷
3
act
퐾
2
act
퐾퐷
2
(conventional MPS)
+
∑
퐾
푖
=
퐾
act
(
퐾
2
<,푖
+
퐾
)
퐷
3
푖
+
∑
퐾
푖
=
퐾
act
퐾
2
<,푖
퐾
>,푖
퐷
2
푖
much faster than the
1∕
퐾
푐
reduction in the number of sites
in the cluster MPS. Although there are ways to truncate the
cluster Hilbert space, e.g., via filtering determinants (“se-
lected configuration interaction”),
15
general linear subspace
projection (“Tucker decomposition”),
26
or using an additional
factorized ansatz for the MPS cluster matrix (“comb tensor
networks”),
16,27
any choice must achieve an effective exponen-
tial reduction in
푃
complexity, to compete with the standard
MPS cost. This means that in systems whose entanglement
is well described by a standard MPS with constant bond di-
mension across the lattice, a cluster MPS is unlikely to reduce
computational cost for the same accuracy.
A different limiting case is in problems described by an MPS
with highly non-uniform bond dimensions, large within a clus-
ter of sites and very small between clusters. The extreme case
is no entanglement (
퐷
= 1
) between clusters, i.e. the state
is a product state of cluster wavefunctions (such as a general-
ized valence bond wavefunction, or for infinitely separated or
noninteracting systems).
18,20,28,29
Since
퐷
= 1
, it does not ap-
pear in the scaling and we need to consider terms non-leading
in
퐷
in the analysis. Assuming determinant-like sparsity in
the local basis, the DMRG cost is then
[
퐶
(
퐾
2
푃
+
퐾
3
)]
.
Conversely, when treating the problem using a standard MPS,
퐷 >
1
when cutting across a cluster. In the worst case,
퐷
∼
(
푃
1∕2
)
, yielding
(
퐶퐾
2
푃
3∕2
)
cost (the leading
퐷
term). This is larger than the cluster MPS result because we
assumed no compressibility within the cluster, and the renor-
malized MPS basis cannot use sparsity. Thus the cluster MPS
is advantageous in this limit.
In general, chemical problems fall between these two limit-
ing cases. Sufficiently weakly interacting units are close to the
second limiting case, and thus there are computational benefits
to using the cluster MPS there. But the exponential overhead
of clustering, together with the presence of long-range inter-
actions (which introduce long-range entanglement) means that
many problems are in fact close to the first scenario. To illus-
trate this, in Appendix A we carry out numerical simulations
using both cluster MPS as well as the ordinary MPS for hydro-
gen chains at several geometries. The results show that there
can be little gain from clustering even in a regime where the
chemical identity of individual atoms or molecular units is ev-
ident.
B. Matrix product states with large sites at the ends
We now turn to the case of main interest in this work, when
there are large sites at one or both ends of the MPS lattice.
In anticipation of the multireference use-cases discussed later,
the orbitals treated in the usual MPS fashion will be denoted
active orbitals, orbitals in the left cluster will be denoted in-
active, orbitals in the right cluster will be denoted external.
The number of orbitals in each class is then
퐾
act
,
퐾
inact
,
퐾
ext
respectively. The ansatz thus takes the form (see also Fig. 2)
Ψ
⟩
=
...
{
푛
}
퐀
퐧
inact
퐀
푛
퐾
inact
+1
퐀
푛
퐾
inact
+2
...
퐀
퐧
ext
퐧
inact
푛
퐾
inact
+1
푛
퐾
inact
+2
...
퐧
ext
⟩
,
(2)
where
퐧
inact
and
퐧
ext
label the Hilbert space of the inactive and
external sites, of dimension
푃
inact
,
푃
ext
respectively.
To simplify the scaling discussion, we will ignore the left
large site, i.e.,
퐾
inact
= 0
. The main finding is easily gen-
eralized to the case of
퐾
inact
≠
0
. Following the discussion
above, the DMRG cost at sites within the active space is the
same as in standard DMRG, i.e.
[(
퐾
2
act
+
퐾
)
퐷
3
푃
2
]
(
푃
= 4
),
where the only difference is that
퐾
act
≠
퐾
. The new consid-
eration is for the site at the boundary between the active sites
and the large external site. The contraction at the boundary
has cost
[(
퐾
2
act
+
퐾
)[
퐷
2
ext
푃
ext
+
퐷
ext
poly
(
푃
ext
)]
where
퐷
ext
is
the bond dimension at the boundary. As for the cluster MPS,
we assume poly
(
푃
ext
) ∼
푃
ext
and drop the last term. Unlike
in the general cluster MPS, however, the cluster Hilbert space
dimension appears with
퐷
2
ext
, not
퐷
3
ext
. Thus, for
푃
ext
not too
large (see below) it is possible to obtain a speedup. For the
case of two large sites we must also consider the boundary be-
tween the inactive large site and the active space, but this takes
the same form where the inactive cluster Hilbert space dimen-
sion is multiplied by
퐷
2
inact
. In the limiting case of
퐾
act
= 0
,
i.e. the MPS consists of only two large sites, the cost scales
as
[(
퐾
2
Int
+
퐾
ext
)
퐷
2
ext
푃
ext
]
. This corner case may be advan-
tageous when
푃
ext
is small enough, but will not be considered
further here.
One concrete application is to use the ansatz Eq. (2) to repre-
sent orbital partitioned quantum chemistry models, such as the
restricted active space (RAS) model and other uncontracted
4
multi-reference dynamic correlation models. For example,
if we assume a singles and doubles theory where the exter-
nal space contains at most two electrons, then
푃
ext
∼
퐾
2
ext
.
Using a standard MPS to represent such a state, the exter-
nal space restriction limits the bond-dimension of the MPS to
퐷
ext
at the boundary. The cost of the standard DMRG con-
traction at the boundary site is
(
퐾
2
ext
퐷
3
ext
푃
2
)
(with
푃
= 4
),
and assuming
퐷
ext
decreases linearly across the external or-
bitals, the leading cost becomes
(
퐾
3
ext
퐷
3
ext
)
(the precise scal-
ing is detailed in Table I). However, using a large external
site, and the expression for the single boundary contraction,
then for
퐾
ext
∼
퐾 > 퐾
2
act
we obtain a speedup of
퐷
ext
rela-
tive to the standard DMRG implementation. If
퐾
ext
< 퐾
2
act
,
the speedup will be larger than
∼
퐷
ext
∕
퐾
ext
. For an external
space with only single excitations out of the active space, the
speedup is even greater, namely up to
퐾퐷
ext
. For a more gen-
eral external space, e.g., constructed by selected configuration
interaction,
30–33
a similar analysis can be applied.
C. Matrix product state formulation of uncontracted
dynamical correlation methods
We next describe how to approximate various uncontracted
multireference dynamical correlation methods using MPS. In
all these cases, the large site MPS ansatz (2) can be used, and
when the excitation degree is small (e.g., up to singles and dou-
bles, in some cases up to triples) we can expect speedup rela-
tive to the standard MPS formulation. This will be assessed in
the benchmark in Section IV.
1. Multireference configuration interaction theory
The uncontracted multi-reference CI (uc-MRCI) ansatz
takes the form
Ψ
uc-MRCI
⟩
=
푛
ref
...
퐶
푐
퐶
퐶
⟩
+
...
퐸
푐
퐸
퐸
⟩
(3)
where
퐶
⟩
denotes a configuration from the reference space
(no particles in the external space, no holes in the inactive
space) and
퐸
⟩
are configurations outside of the reference
space, classified as singles (one particle in the external space),
doubles (two particles in the external space) and so on. The
uc-MRCI coefficients
푐
퐶
and
푐
퐸
are determined by minimiz-
ing the variational energy.
MRCI does not give an extensive energy, e.g., the energy of
independent subsystems is not the sum of the energy of the sys-
tems. Defining the correlation energy as
Δ
퐸
=
퐸
MRCI
−
퐸
0
(
퐸
0
being the energy of the reference wavefunction
Ψ
0
⟩
=
∑
퐶
푐
0
퐶
퐶
⟩
, with
푐
0
퐶
determined variationally), the following
approximate size-extensivity corrections have been defined
(among others
34
):
퐸
D
= (1 −
푐
2
0
)Δ
퐸,
(4)
퐸
RD
= (1 −
푐
2
0
)∕
푐
2
0
Δ
퐸,
(5)
퐸
P
=
t
푁
2
el
+ 2
푁
el
tan(2
휃
) −
푁
el
2[sec(2
휃
) − 1]
Δ
퐸
(6)
≈ (1 − 2∕
푁
el
)
퐸
RD
, 휃
= arccos(
푐
0
)
,
퐸
M
=
푔
M
퐸
RD
, 푔
M
=
(
푁
el
− 2)(
푁
el
− 3)
푁
el
(
푁
el
− 1)
,
(7)
where
퐸
D
(
퐸
RD
) is the (renormalized) Davidson
correction,
34,35
퐸
P
is the Pople correction,
36
and
퐸
M
is
the Meissner correction.
37
푁
el
is the number of correlated
electrons and
푐
2
0
is either defined as
38
푐
2
0
=
푛
ref
...
퐶
푐
2
퐶
,
(8)
or as
39
푐
2
0
=
⟨
Ψ
0
Ψ
MRCISD
⟩
2
.
(9)
Here, we use Eq. (8), which has been found to be slightly
more accurate in many situations.
34,40
The size-consistency-
corrected MRCI methods will be referred to as MRCI+Q
푋
,
where
푋
stands for the particular correction used.
One can also define an energy functional to variationally
minimize that includes the size-extensivity correction in its
definition. This permits a simple implementation of the gra-
dients and properties. Many such functionals can be obtained
by shifting the diagonal of the MRCI Hamiltonian according
to
̂
퐻
→
̂
퐻
+ Δ
̂
푃,
Δ = (1 −
푔
)Δ
퐸,
(10)
where
̂
푃
=
...
퐸
퐸
⟩⟨
퐸
,
(11)
and the parameter
푔
defines the type of correction.
34
Here, we
will use only two variants
푔
= 1−
푔
푀
(MR averaged quadratic
coupled-cluster, MR-AQCC, method,
41
) and
푔
= 2∕
푁
el
(MR
averaged coupled pair functional, MR-ACPF).
42
MR-AQCC is
related to
퐸
푀
and MR-ACPF is related to
퐸
푃
, and MR-ACPF
is extensive for identical subsystems.
The MPS versions of the above theories are easily defined,
by constraining MPS to preserve constraints in the inactive and
external Hilbert spaces. We will refer to the MPS versions
of the above theories by prepending MPS to the name of the
method, e.g., MPS-MRCI, MPS-ACPF, MPS-AQCC, etc. For
brevity, we avoid the additional “uc” prefix and assume that
“MPS-MR
푋
” implies an uncontracted multireference formu-
lation of method
푋
.
5
2. Multireference perturbation theory and multireference
coupled cluster theory
It is straightforward to formulate uncontracted multi-
reference perturbation theory in terms of MPS. This was dis-
cussed in Ref. [43] with subsequent extensions in Refs. [44–
47]. Given a zeroth-order Hamiltonian,
̂
퐻
0
, the first order per-
turbed wavefunction,
Ψ
1
⟩
, can be obtained by minimizing the
Hylleraas functional
48
퐻
[
Ψ
1
⟩
] =
⟨
Ψ
1
̂
퐻
0
−
퐸
0
Ψ
1
⟩
+ 2
⟨
Ψ
1
̂
푄
(
̂
퐻
−
̂
퐻
0
)
Ψ
0
⟩
,
(12)
where
̂
푄
= 1 −
Ψ
0
⟩⟨
Ψ
0
. The MPS formulation corresponds
to representing both
Ψ
0
⟩
and
Ψ
1
⟩
as MPS, approximating the
uncontracted perturbation solution. The energy for third-order
MRPT can be obtained from
퐸
PT3
=
퐸
PT2
+
⟨
Ψ
1
̂
퐻
−
̂
퐻
0
Ψ
1
⟩
(13)
In Ref. [43] the above theory was implemented for the Dyall
Hamiltonian
49
to approximate the uncontracted
푛
-electron
valence-state perturbation theory (NEVPT2),
50–52
and the re-
sulting formulation was termed MPS-PT2. The Dyall Hamil-
tonian
̂
퐻
0
,퐷
is defined as
49,52
̂
퐻
0
,퐷
=
...
푖푗
∈
inactive
퐹
푖푗
̂푎
†
푖
̂푎
푗
+
...
푟푠
∈
ext
퐹
푟푠
̂푎
†
푟
̂푎
푠
+
...
푎푏
∈
act
ℎ
eff
푎푏
̂푎
†
푎
̂푎
푏
+
1
2
...
푎푏푐푑
∈
act
⟨
푎푏
푐푑
⟩
̂푎
†
푎
̂푎
†
푏
̂푎
푑
̂푎
푐
+
퐸
퐷
,
(14)
ℎ
eff
푎푏
=
ℎ
푎푏
+
...
푖
∈
inactive
[2
⟨
푎푖
푏푖
⟩
−
⟨
푎푖
푖푏
⟩
]
,
(15)
퐸
퐷
=2
...
푖
∈
inactive
(
ℎ
푖푖
−
퐹
푖푖
) +
...
푖푗
∈
inactive
[2
⟨
푖푗
푖푗
⟩
−
⟨
푖푗
푗푖
⟩
]
,
(16)
where
퐅
is the generalized Fock matrix.
ℎ
푖푗
=
⟨
푖
ℎ
푗
⟩
are
the one-particle Hamiltonian matrix elements and
⟨
푎푏
푐푑
⟩
the
electron repulsion matrix elements.
In Ref. [44], Fink’s restraining the excitation degree (RE)
Hamiltonian,
53,54
̂
퐻
0
,퐹
=
퐸
0
+
...
푝푞
;Δ
푛
ex
=0
ℎ
푝푞
̂푎
†
푝
̂푎
푞
+
1
2
...
푝푞푟푠
;Δ
푛
ex
=0
⟨
푝푞
푟푠
⟩
̂푎
†
푝
̂푎
†
푞
̂푎
푠
̂푎
푟
,
(17)
was used, leading to an MPS-based version of RE perturbation
theory (REPT).
Δ
푛
ex
= 0
indicates that excitations between
the inactive, active, and external spaces are omitted, compared
to the full Hilbert space. If there is no active space, then
Ψ
0
⟩
is a single Slater determinant, and the result from REPT2 is
identical to the linearized-coupled cluster (LCC) approxima-
tion, thus this approximation was termed MPS-LCC in Ref. [
44]. However, given a multi-reference
Ψ
0
⟩
, this equivalence
no longer holds. We will consider another linearized multi-
reference coupled cluster approximation below, thus we will
refer to this choice of
̂
퐻
0
,퐹
as MPS-MRREPT2.
Refs. [55,56] defined the first linearized multi-reference cou-
pled cluster approximation (MR-LCCM). This corresponds to
the choice
퐻
0
,퐿퐶퐶푀
=
̂
푃
̂
퐻
̂
푃
+
Ψ
0
⟩⟨
Ψ
0
퐸
0
Ψ
0
⟩⟨
Ψ
0
(18)
where
̂
푃
is defined in Eq. (11) and
Ψ
1
⟩
is solved for only in
the excited space, i.e.
Ψ
1
⟩
=
̂
푃
Ψ
1
⟩
. It differs from REPT2
in that (a)
Ψ
1
⟩
has no contributions in the reference space and
(b) the excitation spaces of degree
푛
ex
>
0
are coupled in
̂
퐻
0
.
We refer to the MPS implementation of this theory as MPS-
MRLCCM.
III. IMPLEMENTATION
We have implemented the modified MPS algorithms de-
scribed above in several ways. We have implemented un-
contracted dynamical correlation methods within a standard
MPS formulation by restricting the occupancy of different
spaces, as described in Section III A within B
LOCK
2.
23
The
large site implementation of the dynamical correlation meth-
ods is implemented in B
LOCK
2 as well, as described in Sec-
tion III B. Finally, the general cluster MPS (used in the com-
putations in Appendix A) is implemented within the DMRG
program S
CHWARZBROT
18
as described in Section III C. For
general references on DMRG implementation, we refer to
the literature.
4,23,25,57–60
The above MPS algorithms are inter-
faced with the P
Y
SCF program.
61,62
A. Restricting configurations in matrix product states
To implement the unrestricted multireference dynamical
correlation theories in Section II C in a standard MPS, we en-
force constraints on the MPS matrices. Elementary symme-
tries such as particle number or spin symmetry are usually
taken into account by introducing irreducible blocks in the
MPS site matrices
퐀
푛
푖
,
1,6,25,63
where each block corresponds
to a different symmetry (e.g., number of particles) to the left
and right of the given site. If the MPS sites are ordered accord-
ing to the orbital spaces, inactive (
퐾
inact
), active (
퐾
act
), and
external (
퐾
ext
), the same technique can be used to constrain
the MPS ansatz to a wavefunction of the form (3) with a given
excitation level. For example, restricting the particle numbers
on site
푖
≤
퐾
inact
to be
{
푖,푖
−1
,푖
−2}
and the particle numbers
on site
푖 > 퐾
Int
to be
{
푁
el
,푁
el
− 1
,푁
el
− 2}
, we approx-
imate the ansatz (3) with singles and doubles excitations.
64
Particle number restriction is sufficient to implement the un-
contracted multi-reference dynamical correlation approaches
in this work, but extensions to other symmetry sectors (e.g.,
푆
푧
and
푆
2
symmetry) is possible, and can, for example be used
to describe wavefunctions restricted by the seniority quantum
number.
63
In passing, we note that this approach is very differ-
ent from the “multilevel” DMRG,
65
where different maximal
bond dimensions are used in the three subspaces, without any
restrictions on the particle number blocks.
6
B. Matrix product states with large sites in the ends
Introducing large sites in an MPS requires significant
changes in the implementation of a DMRG code. Since for
conventional MPSs in electronic structure theory, the physical
dimension of a site is
푃
= 4
, or, with spin-orbital sites, even
2
, one typically does not optimize the DMRG implementation
around the size of the physical dimension. However, for large
sites,
푃
can become arbitrarily large. Hence care must to be
taken to avoid unfavorable costs and scaling with respect to
푃
.
In standard CI methods, the matrix representation of opera-
tors is seldom explicitly constructed, and instead matrix vector
products, such as
̂
퐻
Ψ
⟩
are evaluated on the fly. Here, how-
ever, we store all required operators that act in the large site
Hilbert space and represent them as sparse matrices (in the
determinantal basis of the large site) of size
푃
×
푃
. This is
because (a) operators in DMRG need to be accessed more of-
ten than in standard CI methods and (b) the size of the large
site basis is small (
∼ 10
6
), compared to standard CI methods.
Note that the determinantal configurations range over different
numbers of electrons (e.g., between
0
-
2
for the external site
in an implementation of the multireference singles and dou-
bles theories) and this yields extremely sparse operator matri-
ces depending on the operator, e.g., with
(
푃
)
or even
(1)
nonzeros. In our implementation, for up to two electrons in
the external space, most of the memory and runtime (includ-
ing the initialization) is spent on optimizing the regular sites
in the active space and not the large inactive or external sites.
A standard DMRG implementation uses a decomposed
form of the Hamiltonian
̂
퐻
=
∑
훼
̂
푂
훼
퐿
̂
푂
훼
푅
to carry out the
optimization of the MPS matrix at a given site, where
̂
푂
훼
퐿
,
̂
푂
훼
푅
define operators that act to the left/right(inclusive) of the
given site. There are multiple such decompositions,
23
e.g.,
we can group the Hamiltonian integrals with either the left or
right operator, resulting in a normal (no integrals) or comple-
mentary (with integrals) operator. When approaching the site
in the middle of an MPS during a sweep, it is advantageous
in the standard algorithm to swap the assignment of normal
and complementary between the left and right operators in or-
der to reduce the number of terms in the
훼
sum.
23,66,67
In the
large site implementation of multireference dynamic correla-
tion, this is not required, because we can usually assume that
퐾
ext
≫ 퐾
inact
+
퐾
act
, and thus the sweep is always over the first
"half" of the sites. In practice, this means that only the normal
operators are constructed for the inactive and active sites, and
only the complementary operators are constructed for the last
external site.
The standard DMRG algorithm extracts a renormalized ba-
sis at each site by constructing and diagonalizing a density
matrix.
2,25
Small perturbations are also added to this density
matrix to improve convergence during the optimization.
68
For
the large site, however, the density matrix would have size
(
푃
2
)
. To avoid constructing this large object, we use the sin-
gular value decomposition (SVD) of the large site,
퐀
퐧
ext/inact
,
which reduces the scaling of the memory to
(
퐷푃
)
.
69,70
Fi-
nally, standard DMRG simulations often use a two-site algo-
rithm where two adjacent sites are optimized simultaneously,
in order to improve convergence and to optimize the distri-
bution of symmetry blocks in each MPS matrix. We use a
two-site algorithm on all sites except the large sites, which are
treated using the one-site algorithm. To ensure that symmetry
sectors in each matrix are not lost during the DMRG sweep,
we always retain at least one state in each symmetry sector
(particle number, point group, and
푆
푧
) in the SVD. Our large
site implementation does not currently use
푆
2
symmetry.
To evaluate the scalar size-extensivity energy corrections,
the weight of the reference space,
푐
2
0
, is required. An eval-
uation as Eq. (9) via
⟨
Ψ
0
Ψ
MRCISD
⟩
2
is done by straight-
forward contraction of two MPSs.
1
An evaluation as Eq. (8)
via
∑
퐶
푐
2
퐶
requires first setting the inactive and external con-
figurations in the MPS ansatz (2),
퐧
inact
= {2
1
...2
퐾
inact
}
,
퐧
ext
= {0
1
...0
퐾
ext
}
, and then computing the norm by con-
tracting the resulting MPS with itself. The energy function-
als associated with the AQCC and ACPF methods are imple-
mented by modifying the diagonal of the Hamiltonian during
the optimization, as shown in Eq. (10). We shift the diago-
nal of
̂
퐻
, excluding the reference space, by constructing a ma-
trix product operator (MPO)
1,66
representation of
̂
푃
defined in
Eq. (11), giving
̂
퐻
→
̂
퐻
+Δ
̂
푃
. For
Δ
퐸
, we evaluate the cor-
relation energy using the lowest energy so far observed during
the DMRG sweep.
The perturbation-based methods MPS-MRLCCM and
MPS-MRREPT2 are implemented in the DMRG sweep algo-
rithm by solving linear equations at each site instead of eigen-
value problems.
43
For MPS-MRLCCM, Eq. (18) can be con-
structed by removing the reference space in
Ψ
1
⟩
. The zeroth-
order Hamiltonian in MPS-MRREPT can be constructed by
including only particle-number-conserving operators on the
large sites.
C. Cluster matrix product states
While the implementation of the cluster MPS follows that
of an MPS with many large sites, more care has to be taken to
avoid a
푃
2
-type of computational cost. In addition to the mod-
ifications described in Section III B, all DMRG optimization
sweeps are performed in one-site mode and explicitly blocked
operators (which act on the Hilbert space of a block enlarged
by a site) are not explicitly constructed. (For example, this
means that we always use a tripartition of the Hamiltonian
̂
퐻
=
∑
훼
̂
푂
훼
퐿
̂
푂
훼
푆
̂
푂
훼
푅
where the
푆
index denotes the site being
optimized in the sweep). This changes the
(
퐾
2
<
퐷
3
푃
2
)
term
in the scaling of operator multiplication at a site in the DMRG
sweep to
(
퐾
<
퐾
>
퐷
3
푃
)
, c.f. Section II A 2.
IV. APPLICATIONS
A. Nitrogen dimer
Here, we compare relative timings of (a) a standard DMRG
computation (approximating full configuration interaction,
FCI), (b) an MPS-MRCISD computation based on a stan-
7
dard MPS with restricted quantum numbers and (c) an MPS-
MRCISD computation based on an MPS with large sites.
Specifically, we compare timings for N
2
with MPS-MRCISD
based on a valence CAS(10e,8o), double and triple
휁
bases and
different maximal bond dimensions.
The computations are performed with shared-memory
parallelism
23
on one node with 28 Intel(R) Xeon(R) E5-2680
v4 CPUs. For each experiment, we start with a random ini-
tial state and perform two sweeps with the one-site DMRG
algorithm and with perturbative noise.
68,69
For all computa-
tions, we measure and compare the total runtimes (including
initialization steps such as the setup of the Hamiltonian matrix
product operator).
The absolute and relative timings are shown in Fig. 3.
For the same bond dimension, the MPS-MRCI simulations
(full and dashed lines) are typically faster than the FCI-based
DMRG simulations (dotted lines). This is expected as the
Hilbert space size (and thus the matrices in the MPS) are
restricted in the MPS-MRCI computation. Likewise, MPS-
MRCI simulations converge faster than conventional MPS
(FCI) simulations with respect to bond dimension (see below).
For all basis sizes the large site MPS (full lines) performs sig-
nificantly faster than the MPS based on restricted quantum
numbers (dashed lines) (right panel in Fig. 3). The speedup
is always significantly larger than 1, decreases with increasing
bond dimension, and increases with basis size. For example,
in a triple
휁
basis with 60 orbitals, a speedup of more than 50
compared to the standard MPS can be obtained. Notably, the
large site MPS computation with a triple
휁
basis (60 orbitals;
full black line) is faster than the MPS with restricted quantum
numbers with a double
휁
basis (28 orbitals; dashed green line).
The convergence of the energy versus bond dimension is
shown for N
2
in Fig. 4. We compute the energy at the equi-
librium distance (
푅
= 1
.
1208
Å) using the cc-pVDZ basis.
MRCISD is based on a full CAS(14e,10o), employing a com-
bination of natural orbitals obtained from CASSCF (for the
CAS space) and from Møller-Plesset second order PT (MP2)
48
(by diagonalizing the one-particle density matrix in the space
of the external orbitals). We use the same natural-occupation-
based ordering for the MPS-MRCISD and the standard MPS
wavefunctions. While for this example the normal MPS ap-
proaches the uc-MRCISD energy more rapidly, namely around
퐷
∼ 300
, the overall convergence with respect to
퐷
is
slower. Due to the restrictions on the wavefunction, the MPS-
MRCISD method requires a much smaller bond dimension of
less than
400
to approach an error of less than
1mE
H
. In con-
trast, a normal MPS requires a bond dimension of
∼1000
for
a similar convergence tolerance.
71
B. Chromium dimer
The chromium dimer is a prototypical correlated system
with complex bonding, which requires both a multireference
treatment and a large amount of dynamic correlation.
51,72–80
Several studies have used both internally contracted and un-
contracted MRCISD and related methods to compute the Cr
2
binding curve. Here, we will use the MPS-based formalism
to obtain results for several variants of MPS-MRCI methods.
Our purpose here is to illustrate the flexibility of the MPS for-
malism and the utility of the large site implementation which
allows us to obtain results in large basis sets and beyond dou-
bles excitations in the MRCI ansatz.
We use a CAS self-consistent field (CASSCF) reference
with a valence CAS consisting of 12 electrons and 12 orbitals
(3d and 4s shells, 28784 configuration state functions, CSFs)
and employ the spin-free exact-2-component Hamiltonian
81,82
with the cc-pV
{
D,T,Q,5
}
Z-DK basis sets (up to quintuple
휁
), which include up to
푖
-type functions.
83
To decrease the
required bond dimension, we use CASSCF natural orbitals
and Fiedler ordering in the active space.
84,85
We use stan-
dard canonical external and inactive orbitals since an MPS-
MRCISD wavefunction is invariant with respect to orbital ro-
tations in these spaces. We do not employ any BSSE cor-
rection. The uncontracted MRCI wavefunction keeps the 1s,
2s, 2p shells frozen, and includes the 3s and 3p orbitals in the
inactive space, and we correct the energies using the Pople,
and other, size-extensivity corrections. (For comparison, pre-
vious uncontracted MRACPF and MRAQCC simulations by
Dachsel et al.
86
and Müller
74
used a generalized valence bond
reference function consisting of 3088 or 1516 CSFs, using
bases with up to
ℎ
-type functions). The large site represent-
ing the external space in the MPS has up to
153
⋅
10
3
config-
urations. The PECs are generated from the binding energies
as obtained by subtracting the energy from the dimer at large
distance (to account for size consistency errors). Energy data
is given in the Supporting Information.
The MPS-MRCI+Q
P
PECs are presented in Fig. 5, to-
gether with the earlier uc-MRAQCC results from Müller,
74
and experimental curves (with a zero-point energy correc-
tion of
0
.
03eV
80
). The size consistency error at the 5
휁
level
is
0
.
185eV
, which is similar to the uc-MRCI+Q results ob-
tained by Müller.
74
As usual, we computed the size consis-
tency error by taking the difference between the dimer energy
at large distance and twice the energy of the
Cr
atom (based
on restricted open-shell Hartree-Fock orbitals). We find that
the MPS-MRCI simulations require a very large bond dimen-
sion, typically in the middle of the MPS, which is larger than
that required for the reference wavefunction.
87
The maximum
bond dimension required in the MPS representation of the
CAS(12,12) wavefunction is
∼1
,
500
. In contrast, in the MPS-
MRCI wavefunction, the additional external space leads to a
dramatic increase of the required bond dimension and with
퐷
= 15
,
000
(without spin adaptation) the Q
휁
PEC is con-
verged to the eye with accuracy of
≾
1mE
H
. As mentioned
in Section IV A, the required bond dimension still is much
smaller than that needed to represent the FCI wavefunction.
However, we could not similarly converge the simulation with
the 5
휁
basis using a maximum bond dimension of
16
,
000
. In
particular, the relative energies for the bond lengths between
2
.
1
and
2
.
5
Å and the absolute energies are not converged at
that bond dimension, thus we also show an approximate ex-
trapolation to infinite bond dimension
85
for the 5
휁
results.
Compared to the experimental curve, the MPS-MRCI+Q
P
PECs gives too narrow of a well and the
휎
-bonding around
2
.
2
Å is underestimated. The MPS-MRCI+Q results differ sig-