2. 背景资料

2.1. 参考文献

从这一节开始，我们就开始讨论如何求取分子能量的解析梯度，也是真正理解 pyxdh 项目的核心内容了。在此之前，我们先需要了解下述的几篇文献。在使用这些文献时，会以首作者名来代表这篇文章或书籍。

Yamaguchi

这篇书籍会是一篇计算梯度量的导引书籍。所有 RHF 方法的梯度计算都可以从这篇书籍找到。我们的大部分程序都基于这本书在 RHF 梯度的讨论上。

Yamaguchi, Y.; Goddard, J. D.; Osamura, Y. & Schaefer, H.

A New Dimension to Quantum Chemistry: Analytic Derivative Methods in Ab Initio Molecular Electronic Structure Theory (International Series of Monographs on Chemistry)

Oxford University Press, 1994
Szabo

这篇书籍是基础的量化程序书籍。这本书的内容通常不在讨论范围之内，因为它几乎不讨论能量梯度性质；但作为基础书，我们以后有可能会引用其公式。

Szabo, A. & Ostlund, N. S.

Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory (Dover Books on Chemistry)

Dover Publications, 1996
Aikens

这篇文档讨论 MP2 的一阶梯度实现。

Aikens, C. M.; Webb, S. P.; Bell, R. L.; Fletcher, G. D.; Schmidt, M. W. & Gordon, M. S.

A derivation of the frozen-orbital unrestricted open-shell and restricted closed-shell second-order perturbation theory analytic gradient expressions

Theor. Chem. Acc. 2003, 110, 233-253

doi: 10.1007/s00214-003-0453-3
Su

这篇文档讨论 XYG3 型泛函的一阶梯度实现。

Su, N. Q.; Zhang, I. Y. & Xu, X.

Analytic derivatives for the XYG3 type of doubly hybrid density functionals: Theory, implementation, and assessment

J. Comput. Chem. 2013, 34, 1759-1774

doi: 10.1002/jcc.23312
Cammi

这篇文档讨论 MP2 的二阶梯度实现。

Cammi, R.; Mennucci, B.; Pomelli, C.; Cappelli, C.; Corni, S.; Frediani, L.; Trucks, G. W. & Frisch, M. J.

Second-order Møller–Plesset second derivatives for the polarizable continuum model: theoretical bases and application to solvent effects in electrophilic bromination of ethylene

Theor. Chem. Acc. 2004, 111, 66-77

doi: 10.1007/s00214-003-0521-8
Handy

这篇文档讨论 MP2 二阶梯度中涉及到的 U 矩阵奇点项的处理。关于奇点项，我们以后会花精力讨论这个问题。

Handy, N.; Amos, R.; Gaw, J.; Rice, J. & Simandiras, E.

The elimination of singularities in derivative calculations

Chem. Phys. Lett. 1985, 120, 151-158

doi: 10.1016/0009-2614(85)87031-7

2.2. 矩阵乘法的全导数：链式法则

在讨论量化问题之前，我们先了解矩阵乘法的全导数是如何给出的。这一段的符号是纯数学的，即不包含任何物理意义。

[1]:

import numpy as np

譬如，我们现在有矩阵 \(A_{ij}\) 与 \(B_{jk}\) 相乘：

\[C_{ik} = A_{ij} B_{jk}\]

或者用粗体表示矩阵：

\[\mathbf{C} = \mathbf{A} \mathbf{B}\]

同时，矩阵 \(A_{ij}\) 与 \(B_{jk}\) 被参数 \(t\) 决定；譬如说，我们定义

\[\begin{split}\begin{align} A_{ij} &= (2i + j) \exp(2 t) \\ B_{jk} &= \sin(j t - 2 k) \end{align}\end{split}\]

[2]:

A = lambda t: (np.arange(2)[:, None] + np.arange(3)) * np.exp(2 * t)
A(1)

[2]:

array([[ 0.       ,  7.3890561, 14.7781122],
       [ 7.3890561, 14.7781122, 22.1671683]])

[3]:

B = lambda t: np.sin(np.arange(3)[:, None] * t - 2 * np.arange(4))
B(1)

[3]:

array([[ 0.        , -0.90929743,  0.7568025 ,  0.2794155 ],
       [ 0.84147098, -0.84147098, -0.14112001,  0.95892427],
       [ 0.90929743,  0.        , -0.90929743,  0.7568025 ]])

那么矩阵对参数 \(t\) 的导数就可以表示为

\[\begin{split}\begin{align} A_{ij}^t = \frac{\partial A_{ij}}{\partial t} = 2 (2i + j) \exp(t) \\ B_{jk}^t = \frac{\partial B_{jk}}{\partial t} = j \cos (jt - 2k) \end{align}\end{split}\]

[4]:

A_p = lambda t: 2 * (np.arange(2)[:, None] + np.arange(3)) * np.exp(2 * t)
A_p(1)

[4]:

array([[ 0.        , 14.7781122 , 29.5562244 ],
       [14.7781122 , 29.5562244 , 44.33433659]])

[5]:

B_p = lambda t: np.arange(3)[:, None] * np.cos(np.arange(3)[:, None] * t - 2 * np.arange(4))
B_p(1)

[5]:

array([[ 0.        , -0.        , -0.        ,  0.        ],
       [ 0.54030231,  0.54030231, -0.9899925 ,  0.28366219],
       [-0.83229367,  2.        , -0.83229367, -1.30728724]])

我们可以用三点差分法来判断，上述矩阵的导数是否计算正确：

[6]:

(A(1 + 1e-4) - A(1 - 1e-4)) / 2e-4

[6]:

array([[ 0.        , 14.7781123 , 29.55622459],
       [14.7781123 , 29.55622459, 44.33433689]])

[7]:

(B(1 + 1e-4) - B(1 - 1e-4)) / 2e-4

[7]:

array([[ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.5403023 ,  0.5403023 , -0.98999249,  0.28366218],
       [-0.83229367,  1.99999999, -0.83229367, -1.30728723]])

矩阵 \(C_{ik} = A_{ij} B_{jk}\) 可以表示为

[8]:

C = lambda t: A(t) @ B(t)
C(1)

[8]:

array([[ 19.65537571,  -6.21767631, -14.48044305,  18.26965745],
       [ 32.59190172, -19.15420232, -16.64998031,  33.01187559]])

现在我们用数值导数的方法全导数 \(C_{ik}^t = \frac{\partial C_{ik}}{\partial t}\)：

[9]:

(C(1 + 1e-4) - C(1 - 1e-4)) / 2e-4

[9]:

array([[ 31.00334575,  21.11319627, -48.57572542,  19.31607267],
       [ 54.71885701,  14.01058065, -66.37977464,  41.23688581]])

该全导数的求取方式是链式法则：

\[C_{ik}^t = \frac{\partial C_{ik}}{\partial t} = \frac{\partial A_{ij}}{\partial t} B_{jk} + A_{ij} \frac{\partial B_{jk}}{\partial t} = A_{ij}^t B_{jk} + A_{ij} B_{jk}^t\]

[10]:

A_p(1) @ B(1) + A(1) @ B_p(1)

[10]:

array([[ 31.00334618,  21.11319582, -48.57572548,  19.31607316],
       [ 54.71885761,  14.01058005, -66.37977474,  41.23688649]])

我们后期很少接触非常复杂的矩阵导数，譬如形如 \(\exp(\mathrm{A})\) 的导数。我们以后的工作会非常繁杂冗长；链式法则近乎于就是全部的数学基础了。尽管它很简单，但非常关键，并且真正活用链式法则其实还是有一定困难的。