Externally corrected CCSD with renormalized perturbative triples (R-ecCCSD(T)) and density matrix renormalization group and selected configuration interaction external sources

We investigate the renormalized perturbative triples correction together with the externally corrected coupled-cluster singles and doubles (ecCCSD) method. We take the density matrix renormalization group (DMRG) and heatbath CI (HCI) as external sources for the ecCCSD equations. The accuracy is assessed for the potential energy surfaces of H2O, N2, and F2. We find that the triples correction significantly improves on ecCCSD and we do not see any instability of the renormalized triples with respect to dissociation. We explore how to balance the cost of computing the external source amplitudes with respect to the accuracy of the subsequent CC calculation. In this context, we find that very approximate wavefunctions (and their large amplitudes) serve as an efficient and accurate external source. Finally, we characterize the domain of correlation treatable using the externally corrected method and renormalized triples combination studied in this work via a well-known wavefunction diagnostic.


Introduction
Electronic structure methods that can efficiently handle both static and dynamic correlation remain an important area of investigation. Because there is a wide spectrum of strongly correlated problems, ranging from mildly "quasi"-degenerate scenarios (e.g. in the electronic structure of diradicals 1,2 ) to extensively near-degenerate problems (e.g. in the electronic structure of multi-centre transition metal clusters [3][4][5] ) a variety of theoretical strategies have been proposed.
For highly-degenerate problems, it is common to combine dynamic correlation methods with an explicit treatment of a multi-reference state. The representation of the multi-reference state can range from an exact complete active space (CAS) representation [6][7][8] (for small numbers of orbitals), to density matrix renormalization group (DMRG) 9-23 (e.g. when almost all orbitals are degenerate), to selected configuration interaction  and Monte Carlo approximations [45][46][47][48][49] for intermediate cases. On top of these, various flavours of perturbation theory, 50-62 configuration interaction, [63][64][65][66] and exponential approximations 58,67-69 have been explored. However, the combination of dynamic correlation with multi-reference representations is not straightforward and usually leads to added conceptual, implementation, and computational complexity.
For quasi-degenerate problems, an alternative strategy can be used, which incorporates a limited amount of static correlation on top of an existing SR method. This has been particularly popular in conjunction with SR coupled cluster methods. [70][71][72][73][74][75] Some examples include variants of tailored coupled cluster [76][77][78][79][80][81] and externally corrected coupled cluster methods. [82][83][84][85] The computational cost of such SR static correlation methods is often lower than that of true MR dynamic correlation methods and thus large active spaces become affordable. However, the approximations are limited to problems with only a modest amount of degeneracy.
In this work, we will focus on quasi-degenerate problems, and in particular, we will investigate the externally corrected coupled cluster method. [82][83][84][85] This extracts static correlation from a MR method by using the MR wavefunction as an "external" source of higher order coupled cluster amplitudes. For example, in the ecCCSD approximation, the T 3 and T 4 amplitudes are extracted from the external source and a new set of T 1 and T 2 amplitudes are computed in their presence.
Should the T 3 and T 4 amplitudes be exact, then the T 1 and T 2 amplitudes and the energy will be exact. A different, spiritually related, approximation is tailored CCSD, [76][77][78] which has been of renewed interest of late. [79][80][81] Here, instead of higher order cluster amplitudes, the large (active space) T 1 and T 2 amplitudes are fixed from an external source.
The ecCCSD method has a long history and external sources, ranging from unrestricted Hartree-Fock [86][87][88][89] to CASSCF and CASCI [90][91][92][93][94] and, most recently, full configuration interaction quantum Monte Carlo, 95 have been used. One of the more successful applications of ecCCSD is the reduced multireference (RMR) CCSD method, 96,97 which uses a MRCISD wave function as the external source. RMR CCSD(T), which incorporates some of the residual dynamic correlation through perturbative triples, has also been studied. 98 Despite the promising performance of RMRCCSD (T) in several studies, [99][100][101][102][103][104][105][106][107] it suffers from two main limitations. First, conventional MRCISD can only be applied for modest sizes of active spaces (typically, up to about 16 orbitals as limited by the exact CAS treatment). Second, the (T) correction, although not divergent like its single-reference counterpart, still overcorrects the dynamical correlation in the bond-stretched region. 108 In this work, we make two modifications to ecCCSD to overcome and ameliorate the above limitations. First, we utilize variational DMRG and HCI wave functions as external sources for ecCCSD. This allows for the use of larger quasidegenerate active spaces, of the size typically treated by DMRG and HCI. Second, we explore the renormalized perturbative triples correction.
This has been shown in the single reference setting to ameliorate the overcorrection of standard perturbative triples, 109 without affecting the computational scaling.
We describe the use of DMRG and HCI as external sources for ecCCSD in Section 2.1. It is possible to create a near-exact method by using a near-exact external source. Since near-exact external sources can be computed by DMRG and HCI for the small systems we employ as test cases in this paper, the critical question is not simply the accuracy of the method, but the balance between cost and accuracy. To this end, we explore a variety of approximate treatments of the external DMRG and HCI sources as discussed in Sec. 2.2. The various triples approximations for ecCCSD are discussed in Section 2.3. Computational details are provided in Section 3 and the accuracy of the renormalized perturbative triples correction is assessed for three potential energy surfaces (PESs) in Sections 4.1-4.4. We characterize the range of quasi-degenerate correlations captured in this work in Section 4.6. Finally we discuss the limitations of this method in Section 4.7 and summarize our findings in Section 5.

Theory
In this work, we use a reference configuration |Φ with the same occupancy as the Hartree-Fock (HF) determinant. It is useful to define a projection operator onto the space of k-tuply excited configurations relative to the reference; we denote this Q k . The external source is used to provide an important subset of the triply and quadruply excited configurations; the projector onto this subset is denoted Q ec k , and its complement is Q c k . Thus

Externally corrected CCSD with DMRG and HCI wave functions
In ecCCSD, the coupled cluster operator T is given by Here, T n , n = 1, 2, 3, 4, are the n-fold cluster operators. In this work, we extract Q ec k T k , k = 3, 4 from the DMRG or the HCI variational wave function. For the DMRG wave function, the triply and quadruply excited configurations D p that define Q ec k , k = 3, 4 are chosen to be those where the magnitude of the CI coefficient c p is above a threshold, i.e.
where s is an arbitrary scaling factor and ω is the largest discarded weight of the density matrix at the maximum bond dimension in two-dot DMRG sweeps (carried out without noise). An efficient algorithm to convert a matrix-product state to CI coefficients above a given threshold is described in Appendix A. For the HCI variational wave function, D p is included in the projectors Q ec k , k = 3, 4 using the heatbath algorithm with a threshold , 42 i.e., it is included if for at least one determinant D q which is already in the variational space. The extracted CI coefficients are then converted into cluster amplitudes.
The T 1 and T 2 amplitudes are obtained by solving the ecCCSD equations using fixed Q ec 3 T 3 and Q ec 4 T 4 , where H N is the Hamiltonian in normal-ordered form, and the subscript C denotes the connected part of the corresponding operator expression. With the relaxed T 1 and T 2 , the ecCCSD correlation energy of the ground state is obtained as

Approximations in the external source
While DMRG and HCI can, in small molecules, be a source of nearly exact T 3 and T 4 amplitudes even in the full orbital space, this provides no computational advantage as such calculations are more expensive than the subsequent coupled cluster calculation. Consequently, it is important to balance the cost of the external source calculation and that of the subsequent coupled cluster calculation by making approximations in the external source. This introduces the additional complication that one must ensure that errors introduced into the external source do not lead to unacceptable errors in the final coupled cluster calculation.
In this work, we consider six different types of approximate external sources with different sizes of active spaces and different values of parameters summarized in Table 1. Type I uses CASSCF-like external sources in minimal active spaces. Since the minimal active space of F 2 , i.e., two electrons in two orbitals (2e,2o), does not contain T 3 and T 4 , we perform minimal active space ecCC calculations for only H 2 O and N 2 (with (4e,4o) and (6e,6o), respectively). These provide amplitudes that are very close to the exact CASSCF amplitudes. However, these amplitudes lack the relaxation that comes from allowing excitations within a larger space of orbitals, and of course the amplitudes outside the minimal active space are completely absent. Type III uses CASCI-like external sources (the orbitals are not optimized to save computer time) in larger active spaces ((8e,18o), (10e,16o), and (14e,16o) for H 2 O, N 2 , and F 2 , respectively).
In either case, one can potentially introduce bad external amplitudes if the effect of relaxation on the amplitude upon going to a larger space is large relative to the size of the amplitude (e.g. changes its sign). Thus, we study also Type II and Type IV external sources which are similar to Type I and Type II external sources respectively except that they employ an additional threshold to screen out all except the largest T 3 and T 4 amplitudes. The absolute values of the T 3 and T 4 elements at the most stretched geometry of each molecule are sorted in a single large vector. The norm of the vector is computed. Only the largest elements of the vector are retained such that the resulting norm is more than 80% of the norm of the full vector. Along PESs of each molecule, we used the same set of elements of T 3 and T 4 (but with the appropriate values for each geometry) as the external sources, to maintain the smoothness of the PESs.
As discussed in Section 4.4, the type-III and type-IV sources improve upon the PESs obtained from the type-I and type-II sources, but the DMRG calculations to obtain the sources incur a higher computational cost than the subsequent CC calculations. To reduce the cost, we have also tried type-V sources, which employ loosely converged DMRG wave functions with small bond dimensions of M = 25, 50, and 100. Finally, type-VI sources employ large thresholds = 0.01, and 0.003 to obtain loosely converged HCI wave functions in the full orbital spaces.
This combination has the advantage that it can be considered a black-box method wherein a single parameter controls the tradeoff between accuracy and cost.

Perturbative triples corrections
Expressions for the standard, renormalized, and completely renormalized perturbative triples corrections can be written down in analogy with their single reference definitions. 109 We first define the completely renormalized (CR)-ecCCSD(T) correction. We use the state |Ψ ecCCSD(T) defined as where V N is the two-body part of the Hamiltonian in normal-ordered form and R 0 denotes the three-body component of the reduced resolvent operator in many-body perturbation theory, given by differences of orbital energies in the denominator. 75 The resulting formula for the CR-ecCCSD(T) energy correction is The energy corrections for renormalized (R)-ecCCSD(T) and ecCCSD(T) can be obtained by taking the lowest-order estimates of the correction and by assuming the denominator to be one, Unlike perturbative triples without external correction (as in CCSD(T)), the approximate perturbative expression T [2] 3 is only evaluated for determinants omitted in the external source. Thus it is not expected to diverge as long as the external source includes all degeneracies. Nonetheless, it can still overestimate the triples correlation. The role of the denominator in the "renormalized" triples approximations is to rescale this correction, which can be expected to reduce the overestimation.

Computational Details
All CC calculations were performed using cc-pVTZ basis sets. 110 The ecCC calculations were performed with the six different types of external sources summarized in Table 1

PESs with type-II external sources
The difference between the type-I and type-II external sources is that the former use all the amplitudes of T 3 and T 4 while the latter use only a small number of the largest amplitudes, which contribute about 80% of the total weight of T 3 and T 4 (see Table 1 and Section 2. One concern with partial use of the amplitudes in the minimal active space is the possibility of divergence in (T). Assuming a HF occupation of the natural orbitals, the recomputed orbital energies (diagonal parts of the Fock matrix) of the holes and those of the particles are such that the system retains a sizable gap. N 2 at the 10.0a 0 bond length, for example, has a gap of around

PESs with type-III and type-IV external sources
In addition, we investigated PESs of R-ecCCSD(T) using larger active spaces with all amplitudes (type-III) and 80% of the amplitudes (type-IV). We used active spaces of (8e,18o), (10e,16o), and (14e,16o) for H 2 O, N 2 , and F 2 , respectively, and obtained near exact wave functions in the spaces.
The PESs of ecCC using the type-III and type-IV external sources are shown as colored solid and dotted lines, respectively, in Figure 3. The MAE and NPE of the PESs are given in Table 2.

PESs with type-V external sources
In the previous section, we showed that the use of larger active spaces significantly improves the ecCC PESs. However, obtaining tightly converged DMRG wave functions in large active spaces requires more CPU time than the subsequent CC calculation. For example, optimizing an external wave function with 16 orbitals requires around a few minutes of CPU time for one DMRG sweep with M = 2000. Although a few minutes is not prohibitively large in many applications, it is large compared to the subsequent ecCC calculation which only takes tens of seconds, at least for the small molecules considered here. In addition, the fact that in some cases only a partial use of the amplitudes led to better results in the last section suggests that it is not a good use of computational time to tightly converge the external source. Figure 3: PESs of H 2 O, N 2 , and F 2 in the top, middle, and bottom panels, respectively, for the ecCC methods using the larger than minimal active spaces and the near exact external sources. The solid and dotted lines correspond to PESs obtained using all external amplitudes of T 3 and T 4 (type-III) and only the largest amplitudes of T 4 (type-IV), respectively. The descriptions are otherwise the same as those in Fig.2  We thus now consider loosely converged DMRG sources (type-V) in the larger active spaces.
We re-computed PESs of R-ecCCSD(T) shown in Figure 3 using a DMRG source with bond dimensions 25, 50, and 100. We used the truncation to 80% amplitude weight for H 2 O and N 2 , while we used all the external amplitudes for F 2 . Table 3 shows the corresponding MAE and NPE in the same range of geometries in Table 2.
For all cases, when we reduced the bond dimension to M = 100, the MAE increased by

PESs with type-VI external sources
Type VI sources use the full orbital space and employ the single HCI parameter, , to select the large T 3 and T 4 amplitudes. They do not require a choice for the CAS space, and we did not further threshold the T 3 and T 4 amplitudes. However, for systems where the molecule has a low-spin ground state, but the dissociated fragments have high-spin ground states, the HCI wave functions at stretched geometries are strongly spin-contaminated when is large, even when time-reversal symmetry is employed to reduce the spin contamination. (This type of spin-contamination is avoided in the small bond dimension DMRG calculations via the full use of spin symmetry).
Although this has little effect on the HCI energy, it makes the amplitudes unsuitable for ecCC calculations.
In the second part of Table 3 the MAE and NPE from type-VI sources using = 0.01 and 0.003 are shown. These are evaluated over a smaller range of geometries than those used in the first part of Table 3 for the reason mentioned above. Since N 2 is a singlet that dissociates into atoms that are quartets, it has particularly large errors. The errors improve upon going from = 0.01 to = 0.003, in particular the PEC has an unphysical dip at = 0.01 which disappears at = 0.003, but the errors are still substantial. Similarly to how using all the amplitudes could lead to larger errors than only using some of the amplitudes in the previous sections, the errors incurred from the type VI sources here emphasize that the quality of the external amplitudes cannot be judged solely from the energy obtained by the external method.

Error analysis
In this section, we present the errors of R-ecCCSD(T) for the systems in this work and analyze them using the well known CC error diagnostic D 2 , defined by the matrix 2-norm of the T 2 amplitudes. 119 The magnitude of D 2 can be used to distinguish between the SR and MR character of the different geometries on the PESs. Organic molecules are sometimes considered to have MR character when D 2 is larger than 0.18. 119 Figure 4 shows the absolute errors on a log scale versus the D 2 diagnostic. Each symbol represents a geometry on the PESs of one of the molecules, H 2 O,

Limitations
Although the work above shows that it is possible to obtain quantitative accuracy across the full potential energy surface, at a reasonable computational cost, using the combination of external sources and the renormalized (T) correction, SR ecCC approaches have a fundamental limitation when the CI coefficient of the reference configuration (the HF configuration in this work) in the external source is exactly or numerically zero. This leads to overflow due to the assumption of intermediate normalization. The smallest reference coefficient value we encountered in this work was 0.2 at the stretched 10a 0 geometry of N 2 . Although this coefficient is not numerically zero, it is difficult to converge the ecCCSD energy. We extracted initial amplitudes of T 1 and T 2 from the external source and then iteratively converged the ecCCSD energy using a damping parameter of 0.05 to update the amplitudes. (We did not use the direct inversion in the iterative subspace (DIIS) algorithm). At this geometry, the energy could be converged monotonically up to a threshold of 10 −5 Hartrees, although the norm of the amplitudes could not be converged, and we simply used amplitudes from the last iteration in the set of monotonically decreasing energies.

Conclusion
In this work, we explored externally corrected coupled cluster with a renormalized triples correction (R-ecCCSD(T)) using DMRG and HCI and external sources. The critical question is how to best balance the accuracy and cost of computing the external source with the cost of the overall method.
To this end, we considered multiple types of external sources: "exact" external sources, where the DMRG and HCI wavefunctions were tightly converged, and "approximate" external sources where they were loosely converged. We also considered both "full" usage of the T 3 and T 4 amplitudes, and "partial" usage, wherein we retained only the largest elements.
For all systems considered here, we found that R-ecCCSD(T) can significantly improve on the description of the potential energy surface given either by the external source alone or CC alone. problems, we can ask whether other non-iterative corrections such as the "completely renormalized" triples and quadruples corrections, 109 corresponding to CR-ecCCSD(T) and CR-ecCCSD(T,Q), would further improve on the present R-ecCCSD(T) method. Finally, the current approximation, with its modest computational requirements on the external source, is applicable to the same scale of systems that can be handled by single reference coupled cluster methods. Thus understanding the performance of this method in larger correlated systems is of interest.
Note: As this work was being prepared for submission, we were made aware of a related recent submission to the arxiv, 120 that also discusses selected configuration interaction as an external source and perturbative triples corrections in the context of externally corrected coupled cluster methods. only generate symmetry unique partial coefficients (e.g. if S z = 0, then the values of c α p+1 come in time-reversal pairs and only one needs to be considered). Thus using the above algorithm we can completely avoid generating any determinants with coefficients below the threshold.