6. GGA Hessian

In this section, we very briefly discuss how to evaluate the Hessian of GGA functionals. The procedure is almost identical to the RHF case; however, unlike RHF, GGA additionally involves the functional kernel. The only new ingredient needed for the GGA Hessian is the second-order Skeleton derivative of the GGA energy contribution.

In what follows, B3LYP is used as the example GGA functional.

6.1. Preparation

[1]:
%matplotlib notebook

from pyscf import gto, scf, dft, lib, grad, hessian
import numpy as np
from functools import partial
import warnings
from matplotlib import pyplot as plt
from pyxdh.Utilities import NucCoordDerivGenerator, DipoleDerivGenerator, NumericDiff, GridHelper, KernelHelper
from pyxdh.DerivOnce import GradSCF
from pyxdh.DerivTwice import HessSCF

np.einsum = partial(np.einsum, optimize=["greedy", 1024 ** 3 * 2 / 8])
np.allclose = partial(np.allclose, atol=1e-6, rtol=1e-4)
np.set_printoptions(5, linewidth=150, suppress=True)
warnings.filterwarnings("ignore")
[2]:
mol = gto.Mole()
mol.atom = """
O  0.0  0.0  0.0
O  0.0  0.0  1.5
H  1.0  0.0  0.0
H  0.0  0.7  1.0
"""
mol.basis = "6-31G"
mol.verbose = 0
mol.build()
[2]:
<pyscf.gto.mole.Mole at 0x7fd3d41ae1c0>
[3]:
def mol_to_grids(mol, atom_grid=(75, 302)):
    grids = dft.Grids(mol)
    grids.atom_grid = atom_grid
    grids.becke_scheme = dft.gen_grid.stratmann
    grids.prune = None
    grids.build()
    return grids
grids = mol_to_grids(mol)
[4]:
def mol_to_scf(mol):
    scf_eng = dft.RKS(mol)
    scf_eng.grids = mol_to_grids(mol)
    scf_eng.xc = "B3LYPg"
    scf_eng.conv_tol = 1e-10
    return scf_eng.run()
[5]:
gradh = GradSCF({"scf_eng": mol_to_scf(mol), "cphf_tol": 1e-12})
hessh = HessSCF({"deriv_A": gradh, "deriv_B": gradh})
[6]:
nmo, nao, natm, nocc, nvir = gradh.nmo, gradh.nao, gradh.natm, gradh.nocc, gradh.nvir
so, sv, sa = gradh.so, gradh.sv, gradh.sa
mol_slice = gradh.mol_slice
C, Co, Cv, e, eo, ev, D = gradh.C, gradh.Co, gradh.Cv, gradh.e, gradh.eo, gradh.ev, gradh.D
H_0_ao, S_0_ao, eri0_ao, F_0_ao = gradh.H_0_ao, gradh.S_0_ao, gradh.eri0_ao, gradh.F_0_ao
H_0_mo, S_0_mo, eri0_mo, F_0_mo = gradh.H_0_mo, gradh.S_0_mo, gradh.eri0_mo, gradh.F_0_mo
H_1_ao, S_1_ao, eri1_ao, F_1_ao = gradh.H_1_ao, gradh.S_1_ao, gradh.eri1_ao, gradh.F_1_ao
H_1_mo, S_1_mo, eri1_mo, F_1_mo = gradh.H_1_mo, gradh.S_1_mo, gradh.eri1_mo, gradh.F_1_mo
Ax0_Core, B_1, U_1, U_1_vo = gradh.Ax0_Core, gradh.B_1, gradh.U_1, gradh.U_1_vo
H_2_ao, S_2_ao, eri2_ao = hessh.H_2_ao, hessh.S_2_ao, hessh.eri2_ao
H_2_mo, S_2_mo, eri2_mo = hessh.H_2_mo, hessh.S_2_mo, hessh.eri2_mo
[7]:
grdh = GridHelper(mol, grids, D)
kerh = KernelHelper(grdh, "B3LYPg")
cx = gradh.cx

Below, the dimensions of A_rho_1 \(\rho^{A_t}\) and A_gamma_1 \(\gamma^{A_t}\) are reshaped to \((A_t, g)\). This differs from the pyxdh default, but the reshaped arrays are easier to plug into the Hessian expressions:

[8]:
ngrid = grdh.ngrid
ao_0, ao_1, ao_2, ao_3 = grdh.ao_0, grdh.ao_1, grdh.ao_2, grdh.ao_3
rho_1, rho_2 = grdh.rho_1, grdh.rho_2
A_rho_1 = grdh.A_rho_1.reshape(natm * 3, ngrid)
A_gamma_1 = grdh.A_gamma_1.reshape(natm * 3, ngrid)
fr, fg, frr, frg, fgg = kerh.fr, kerh.fg, kerh.frr, kerh.frg, kerh.fgg
[9]:
def grad_generator(mol):
    scf_eng = mol_to_scf(mol)
    config = {"scf_eng": scf_eng, "cphf_tol": 1e-12}
    return GradSCF(config)
gradn = NucCoordDerivGenerator(mol, grad_generator)

6.2. Evaluating the GGA Hessian

We have already presented the Hessian formula in the RHF section. For GGA, the expression is largely the same, but the Skeleton derivative part differs:

\[\begin{split}\begin{align} \frac{\partial^2 E_\mathrm{tot}}{\partial B_s \partial A_t} &= h_{\mu \nu}^{A_t B_s} D_{\mu \nu} + \frac{1}{2} (\mu \nu | \kappa \lambda)^{A_t B_s} D_{\mu \nu} D_{\kappa \lambda} - \frac{c_\mathrm{x}}{4} (\mu \kappa | \nu \lambda)^{A_t B_s} D_{\mu \nu} D_{\kappa \lambda} + E_\mathrm{GGA, Skeleton}^{A_t B_s} \\ &\quad - 2 S_{ij}^{A_t B_s} F_{ij} - 2 F_{ij}^{B_s} S_{ij}^{A_t} - 2 F_{ij}^{A_t} S_{ij}^{B_s} \\ &\quad + 2 (\varepsilon_i + \varepsilon_j) S_{ij}^{A_t} S_{ij}^{B_s} + 4 B_{ai}^{B_s} U_{ai}^{A_t} + S_{ij}^{A_t} A_{ij, kl} S_{kl}^{B_s} \\ &\quad + \partial_{A_t} \partial_{B_s} E_\mathrm{nuc} \end{align}\end{split}\]

We will not elaborate on the derivation here; we only point out that all U derivatives related to the GGA energy are already folded into the subsequent terms (e.g. \(F_{ij}^{A_t} S_{ij}^{B_s}\)).

Recall that the first-order derivative reads

\[\partial_{A_t} E_\mathrm{elec} = h_{\mu \nu}^{A_t} D_{\mu \nu} + \frac{1}{2} D_{\mu \nu} (\mu \nu | \kappa \lambda)^{A_t} D_{\kappa \lambda} - \frac{c_\mathrm{x}}{4} D_{\mu \nu} (\mu \kappa | \nu \lambda)^{A_t} D_{\kappa \lambda} + f_\rho \rho^{A_t} + f_\gamma \gamma^{A_t} - 2 F_{ij} S_{ij}^{A_t}\]

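As a quick consistency check, the expression above can be evaluated directly from the quantities prepared earlier and compared against gradh.E_1. This is a minimal sketch, not part of the pyxdh workflow itself: grid weights are already folded into kerh.fr and kerh.fg (as the later Skeleton code also assumes), and grad.RKS(gradh.scf_eng).grad_nuc() supplies the nuclear repulsion gradient, since gradh.E_1 is the total first derivative. The comparison should evaluate to True.

E_1_elec = (
    + np.einsum("Auv, uv -> A", H_1_ao, D)
    + 0.5 * np.einsum("uv, Auvkl, kl -> A", D, eri1_ao, D)
    - cx * 0.25 * np.einsum("uv, Aukvl, kl -> A", D, eri1_ao, D)
    + np.einsum("g, Ag -> A", fr, A_rho_1)    # f_rho rho^{At}; grid weights folded into fr
    + np.einsum("g, Ag -> A", fg, A_gamma_1)  # f_gamma gamma^{At}; grid weights folded into fg
    - 2 * np.einsum("ij, Aij -> A", F_0_mo[so, so], S_1_mo[:, so, so])
)
E_1_nuc = grad.RKS(gradh.scf_eng).grad_nuc().reshape(-1)  # nuclear repulsion gradient (PySCF)
np.allclose(E_1_elec + E_1_nuc, gradh.E_1.flatten())
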
Then, since \(f_\rho\) and \(f_\gamma\) themselves depend on \(\rho\) and \(\gamma\), the chain rule gives \(\partial_{B_s} f_\rho = f_{\rho \rho} \rho^{B_s} + f_{\rho \gamma} \gamma^{B_s}\) and \(\partial_{B_s} f_\gamma = f_{\rho \gamma} \rho^{B_s} + f_{\gamma \gamma} \gamma^{B_s}\); applying this to the GGA terms of the expression above yields the GGA-related second-order Skeleton derivative:

\[E_\mathrm{GGA, Skeleton}^{A_t B_s} = f_{\rho \rho} \rho^{A_t} \rho^{B_s} + f_{\rho \gamma} \rho^{A_t} \gamma^{B_s} + f_{\rho \gamma} \rho^{B_s} \gamma^{A_t} + f_{\gamma \gamma} \gamma^{A_t} \gamma^{B_s} + f_\rho \rho^{A_t B_s} + f_\gamma \gamma^{A_t B_s}\]

The two quantities we have not evaluated before are \(\rho^{A_t B_s}\) and \(\gamma^{A_t B_s}\). Being Skeleton derivatives, their differentiation does not produce U matrices. The expression for AB_rho_2 \(\rho^{A_t B_s}\) is:

\[\rho^{A_t B_s} = 2 D_{\mu \nu} \phi_{ts \mu_{AB}} \phi_\nu + 2 D_{\mu \nu} \phi_{t \mu_A} \phi_{s \nu_B}\]
[10]:
AB_rho_2 = np.zeros((natm, 3, natm, 3, ngrid))
for A in range(natm):
    sA = mol_slice(A)
    # same-atom term: both nuclear derivatives act on the basis functions centered on atom A
    AB_rho_2[A, :, A, :] += 2 * np.einsum("uv, tsgu, gv -> tsg", D[sA, :], ao_2[:, :, :, sA], ao_0)
    for B in range(natm):
        sB = mol_slice(B)
        # cross term: one derivative on a basis function of atom A, the other on one of atom B
        AB_rho_2[A, :, B, :] += 2 * np.einsum("uv, tgu, sgv -> tsg", D[sA, sB], ao_1[:, :, sA], ao_1[:, :, sB])
AB_rho_2.shape = (natm * 3, natm * 3, ngrid)
[11]:
np.allclose(AB_rho_2, grdh.AB_rho_2.swapaxes(1, 2).reshape(natm * 3, natm * 3, ngrid))  # pyxdh approach
[11]:
True

Since \(\gamma = \rho_r \rho_r\), differentiating \(\gamma^{A_t} = 2 \rho_r^{A_t} \rho_r\) once more gives the expression for AB_gamma_2 \(\gamma^{A_t B_s}\):

\[\gamma^{A_t B_s} = 2 \rho_r^{A_t B_s} \rho_r + 2 \rho_r^{A_t} \rho_r^{B_s}\]

Before that, we also need A_rho_2 \(\rho_r^{A_t}\) (dimension \((A_t, r, g)\)) and AB_rho_3 \(\rho_r^{A_t B_s}\) (dimension \((A_t, B_s, r, g)\)):

\[\rho_r^{A_t} = - 2 D_{\mu \nu} \phi_{tr \mu_A} \phi_\nu - 2 D_{\mu \nu} \phi_{t \mu_A} \phi_{r \nu}\]
[12]:
A_rho_2 = np.zeros((natm, 3, 3, ngrid))
for A in range(natm):
    sA = mol_slice(A)
    A_rho_2[A] -= 2 * np.einsum("uv, trgu, gv -> trg", D[sA], ao_2[:, :, :, sA], ao_0)
    A_rho_2[A] -= 2 * np.einsum("uv, tgu, rgv -> trg", D[sA], ao_1[:, :, sA], ao_1)
A_rho_2.shape = (natm * 3, 3, ngrid)
[13]:
np.allclose(A_rho_2, grdh.A_rho_2.reshape(natm * 3, 3, ngrid))  # pyxdh approach
[13]:
True
\[\rho_r^{A_t B_s} = 2 D_{\mu \nu} \phi_{tsr \mu_{AB}} \phi_\nu + 2 D_{\mu \nu} \phi_{tr \mu_A} \phi_{s \nu_B} + 2 D_{\mu \nu} \phi_{ts \mu_{AB}} \phi_{r \nu} + 2 D_{\mu \nu} \phi_{t \mu_A} \phi_{sr \nu_B}\]
[14]:
AB_rho_3 = np.zeros((natm, 3, natm, 3, 3, ngrid))
for A in range(natm):
    sA = mol_slice(A)
    # same-atom terms: both nuclear derivatives hit basis functions centered on atom A
    AB_rho_3[A, :, A] += 2 * np.einsum("uv, tsrgu, gv -> tsrg", D[sA], ao_3[:, :, :, :, sA], ao_0)
    AB_rho_3[A, :, A] += 2 * np.einsum("uv, tsgu, rgv -> tsrg", D[sA], ao_2[:, :, :, sA], ao_1)
    for B in range(natm):
        sB = mol_slice(B)
        # cross terms: the nuclear derivatives are distributed over atoms A and B
        AB_rho_3[A, :, B] += 2 * np.einsum("uv, trgu, sgv -> tsrg", D[sA, sB], ao_2[:, :, :, sA], ao_1[:, :, sB])
        AB_rho_3[A, :, B] += 2 * np.einsum("uv, tgu, srgv -> tsrg", D[sA, sB], ao_1[:, :, sA], ao_2[:, :, :, sB])
AB_rho_3.shape = (natm * 3, natm * 3, 3, ngrid)
[15]:
np.allclose(AB_rho_3, grdh.AB_rho_3.swapaxes(1, 2).reshape(natm * 3, natm * 3, 3, ngrid))  # pyxdh approach
[15]:
True

Therefore, AB_gamma_2 \(\gamma^{A_t B_s}\) (dimension \((A_t, B_s, g)\)) is:

\[\gamma^{A_t B_s} = 2 \rho_r^{A_t B_s} \rho_r + 2 \rho_r^{A_t} \rho_r^{B_s}\]
[16]:
AB_gamma_2 = (
    + 2 * np.einsum("ABrg, rg -> ABg", AB_rho_3, rho_1)
    + 2 * np.einsum("Arg, Brg -> ABg", A_rho_2, A_rho_2)
)
AB_gamma_2.shape
[16]:
(12, 12, 90600)
[17]:
np.allclose(AB_gamma_2, grdh.AB_gamma_2.swapaxes(1, 2).reshape(natm * 3, natm * 3, ngrid))  # pyxdh approach
[17]:
True

With these preparations, we can immediately obtain the GGA contribution to the second-order Skeleton derivative, E_2_Skeleton_GGA:

\[E_\mathrm{GGA, Skeleton}^{A_t B_s} = f_{\rho \rho} \rho^{A_t} \rho^{B_s} + f_{\rho \gamma} \rho^{A_t} \gamma^{B_s} + f_{\rho \gamma} \rho^{B_s} \gamma^{A_t} + f_{\gamma \gamma} \gamma^{A_t} \gamma^{B_s} + f_\rho \rho^{A_t B_s} + f_\gamma \gamma^{A_t B_s}\]
[18]:
E_2_Skeleton_GGA = (
    + np.einsum("g, Ag, Bg -> AB", frr, A_rho_1, A_rho_1)
    + np.einsum("g, Ag, Bg -> AB", frg, A_rho_1, A_gamma_1)
    + np.einsum("g, Bg, Ag -> AB", frg, A_rho_1, A_gamma_1)
    + np.einsum("g, Ag, Bg -> AB", fgg, A_gamma_1, A_gamma_1)
    + np.einsum("g, ABg -> AB", fr, AB_rho_2)
    + np.einsum("g, ABg -> AB", fg, AB_gamma_2)
)

We denote the total second-order Skeleton derivative contribution to the GGA Hessian as E_2_Skeleton:

\[\frac{\partial^2 E_\mathrm{tot}}{\partial B_s \partial A_t} \xleftarrow{\text{Skeleton derivative}} h_{\mu \nu}^{A_t B_s} D_{\mu \nu} + \frac{1}{2} (\mu \nu | \kappa \lambda)^{A_t B_s} D_{\mu \nu} D_{\kappa \lambda} - \frac{c_\mathrm{x}}{4} (\mu \kappa | \nu \lambda)^{A_t B_s} D_{\mu \nu} D_{\kappa \lambda} + E_\mathrm{GGA, Skeleton}^{A_t B_s}\]
[19]:
E_2_Skeleton = (
    + np.einsum("ABuv, uv -> AB", H_2_ao, D)
    + 0.5 * np.einsum("ABuvkl, uv, kl -> AB", eri2_ao, D, D)
    - cx * 0.25 * np.einsum("ABukvl, uv, kl -> AB", eri2_ao, D, D)
    + E_2_Skeleton_GGA
)
E_2_Skeleton.shape
[19]:
(12, 12)
[20]:
np.allclose(E_2_Skeleton, hessh._get_E_2_Skeleton())  # pyxdh approach
[20]:
True

Then the total GGA Hessian can be expressed as

\[\begin{split}\begin{align} \frac{\partial^2 E_\mathrm{tot}}{\partial B_s \partial A_t} &= h_{\mu \nu}^{A_t B_s} D_{\mu \nu} + \frac{1}{2} (\mu \nu | \kappa \lambda)^{A_t B_s} D_{\mu \nu} D_{\kappa \lambda} - \frac{c_\mathrm{x}}{4} (\mu \kappa | \nu \lambda)^{A_t B_s} D_{\mu \nu} D_{\kappa \lambda} + E_\mathrm{GGA, Skeleton}^{A_t B_s} \\ &\quad - 2 S_{ij}^{A_t B_s} F_{ij} - 2 F_{ij}^{B_s} S_{ij}^{A_t} - 2 F_{ij}^{A_t} S_{ij}^{B_s} \\ &\quad + 2 (\varepsilon_i + \varepsilon_j) S_{ij}^{A_t} S_{ij}^{B_s} + 4 B_{ai}^{B_s} U_{ai}^{A_t} + S_{ij}^{A_t} A_{ij, kl} S_{kl}^{B_s} \\ &\quad + \partial_{A_t} \partial_{B_s} E_\mathrm{nuc} \end{align}\end{split}\]
[21]:
np.allclose(
    + np.einsum("ABuv, uv -> AB", H_2_ao, D)
    + 0.5 * np.einsum("ABuvkl, uv, kl -> AB", eri2_ao, D, D)
    - cx * 0.25 * np.einsum("ABukvl, uv, kl -> AB", eri2_ao, D, D)
    + E_2_Skeleton_GGA
    - 2 * np.einsum("ABij, ij -> AB", S_2_mo[:, :, so, so], F_0_mo[so, so])
    - 2 * np.einsum("Bij, Aij -> AB", F_1_mo[:, so, so], S_1_mo[:, so, so])
    - 2 * np.einsum("Aij, Bij -> AB", F_1_mo[:, so, so], S_1_mo[:, so, so])
    + 2 * np.einsum("ij, Aij, Bij -> AB", eo[:, None] + eo[None, :], S_1_mo[:, so, so], S_1_mo[:, so, so])
    + 4 * np.einsum("Bai, Aai -> AB", B_1[:, sv, so], U_1_vo)
    + np.einsum("Bij, Aij -> AB", Ax0_Core(so, so, so, so)(S_1_mo[:, so, so]), S_1_mo[:, so, so])
    + hessian.RHF(gradh.scf_eng).hess_nuc().swapaxes(1, 2).reshape((12, 12)),
    hessh.E_2
)
[21]:
True
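As an additional sanity check (a small sketch beyond the original text), the Hessian matrix should be symmetric under exchange of \(A_t\) and \(B_s\), up to the CP-HF convergence threshold:

np.allclose(hessh.E_2, hessh.E_2.T)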

Finally, let us verify that differentiating the analytical first-order gradient once more numerically reproduces the analytical second derivative above:

[22]:
nd_E_1 = NumericDiff(gradn, lambda gradh: gradh.E_1.flatten()).derivative
np.allclose(hessh.E_2, nd_E_1, atol=1e-4)
[22]:
True

It should be pointed out that the error of the above comparison is fairly large; this is why the loosened tolerance atol=1e-4 was needed.

[23]:
np.abs(hessh.E_2 - nd_E_1).sum() / hessh.E_2.size
[23]:
8.730667769423729e-06

If we enlarge the grid, the error can be reduced further.

[24]:
def mol_to_scf_99590(mol):
    scf_eng = dft.RKS(mol)
    scf_eng.grids = mol_to_grids(mol, atom_grid=(99, 590))
    scf_eng.xc = "B3LYPg"
    scf_eng.conv_tol = 1e-10
    return scf_eng.run()
[25]:
gradh_99590 = GradSCF({"scf_eng": mol_to_scf_99590(mol), "cphf_tol": 1e-12})
hessh_99590 = HessSCF({"deriv_A": gradh_99590, "deriv_B": gradh_99590})
[26]:
def grad_generator_99590(mol):
    scf_eng = mol_to_scf_99590(mol)
    config = {"scf_eng": scf_eng, "cphf_tol": 1e-12}
    return GradSCF(config)
gradn_99590 = NucCoordDerivGenerator(mol, grad_generator_99590)
[27]:
nd_E_1_99590 = NumericDiff(gradn_99590, lambda gradh: gradh.E_1.flatten()).derivative
np.allclose(hessh_99590.E_2, nd_E_1_99590, atol=3e-5)
[27]:
True
[28]:
np.abs(hessh_99590.E_2 - nd_E_1_99590).sum() / hessh_99590.E_2.size
[28]:
3.8146258984312884e-06

In fact, the GGA calculation also has a gradient contribution arising from the displacement of the grid points themselves, which is not included in our treatment. One should expect, however, that the denser the grid, the more this grid-displacement contribution is masked by the improved integration accuracy; hence, the finer the grid, the closer the analytical derivative should come to its numerical counterpart.
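
As a side note, PySCF's own analytical RKS gradients can include this grid-displacement (grid weight derivative) contribution through the grid_response switch of the Gradients object; a minimal sketch, not used elsewhere in this notebook:

scf_grad = grad.RKS(mol_to_scf(mol))
scf_grad.grid_response = True  # include grid weight derivative contributions
scf_grad.run()
print(scf_grad.de)  # analytical gradient including grid response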