Paper title
Molecular Dipole Moment Learning via Rotationally Equivariant Gaussian Process Regression with Derivatives in Molecular-orbital-based Machine Learning
Paper authors
Paper abstract
This study extends the accurate and transferable molecular-orbital-based machine learning (MOB-ML) approach to modeling the contribution of electron correlation to dipole moments at the cost of Hartree-Fock computations. A molecular-orbital-based (MOB) pairwise decomposition of the correlation part of the dipole moment is applied, and these pair dipole moments can then be regressed as a universal function of molecular orbitals (MOs). The dipole MOB features consist of the energy MOB features and their responses to electric fields. An interpretable and rotationally equivariant Gaussian process regression (GPR) with derivatives algorithm is introduced to learn dipole moments more efficiently. The proposed problem setup, feature design, and ML algorithm are shown to provide highly accurate models for both dipole moments and energies on water and fourteen small molecules. To demonstrate the ability of MOB-ML to function as generalized density-matrix functionals for molecular dipole moments and energies of organic molecules, we further apply the proposed MOB-ML approach to train and test on molecules from the QM9 dataset. The application of local scalable GPR with Gaussian mixture model unsupervised clustering (GMM/GPR) scales MOB-ML up to the large-data regime while retaining prediction accuracy. In addition, compared with literature results, MOB-ML provides the best test MAEs of 4.21 mDebye and 0.045 kcal/mol for the dipole moment and energy models, respectively, when trained on 110,000 QM9 molecules. The excellent transferability of the resulting QM9 models is also illustrated by accurate predictions for four different series of peptides.
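The idea of "GPR with derivatives" mentioned in the abstract can be illustrated with a minimal one-dimensional sketch. This is not the MOB-ML implementation: the function names, the RBF kernel, the length scale, and the toy sin(x) data are all assumptions chosen for illustration, and neither rotational equivariance nor MOB features are modeled here. The sketch only shows how one Gaussian process can be trained jointly on function values and derivatives, loosely analogous to learning energies together with their responses to an electric field.

```python
import numpy as np

def rbf_blocks(xa, xb, length=1.0):
    """RBF kernel and its derivative blocks between 1D point sets xa and xb."""
    r = xa[:, None] - xb[None, :]
    k = np.exp(-0.5 * (r / length) ** 2)              # cov(f(xa), f(xb))
    k_fd = k * r / length**2                          # cov(f(xa), f'(xb))
    k_df = -k * r / length**2                         # cov(f'(xa), f(xb))
    k_dd = k * (1.0 / length**2 - r**2 / length**4)   # cov(f'(xa), f'(xb))
    return k, k_fd, k_df, k_dd

def gpr_with_derivatives(x_train, y_train, dy_train, x_test,
                         length=1.0, jitter=1e-8):
    """Posterior means of f and f' at x_test, given values and derivatives."""
    k, k_fd, k_df, k_dd = rbf_blocks(x_train, x_train, length)
    # Joint covariance of the stacked observations [f(x_train), f'(x_train)].
    K = np.block([[k, k_fd], [k_df, k_dd]])
    K += jitter * np.eye(K.shape[0])                  # numerical stabilization
    alpha = np.linalg.solve(K, np.concatenate([y_train, dy_train]))
    ks, ks_fd, ks_df, ks_dd = rbf_blocks(x_test, x_train, length)
    f_mean = np.hstack([ks, ks_fd]) @ alpha           # predicted values
    df_mean = np.hstack([ks_df, ks_dd]) @ alpha       # predicted derivatives
    return f_mean, df_mean

# Toy data (assumption): f(x) = sin(x), so f'(x) = cos(x).
x_train = np.linspace(0.0, 2.0 * np.pi, 8)
x_test = np.linspace(0.5, 5.5, 20)
f_mean, df_mean = gpr_with_derivatives(
    x_train, np.sin(x_train), np.cos(x_train), x_test
)
print("max |f error|  :", np.max(np.abs(f_mean - np.sin(x_test))))
print("max |f' error| :", np.max(np.abs(df_mean - np.cos(x_test))))
```

Because the derivative of a Gaussian process is again a Gaussian process, values and derivatives share a single set of kernel hyperparameters and one linear solve, which is what makes derivative labels usable as extra training information.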