Paper Title
Sensitivity and Dimensionality of Atomic Environment Representations used for Machine Learning Interatomic Potentials
Paper Authors
Paper Abstract
Faithfully representing chemical environments is essential for describing materials and molecules with machine learning approaches. Here, we present a systematic classification of these representations and then investigate (i) the sensitivity to perturbations and (ii) the effective dimensionality of a variety of atomic environment representations over a range of material datasets. Representations investigated include Atom Centred Symmetry Functions, Chebyshev Polynomial Symmetry Functions (CHSF), Smooth Overlap of Atomic Positions, Many-body Tensor Representation and Atomic Cluster Expansion. In area (i), we show that none of the atomic environment representations are linearly stable under tangential perturbations, and that for CHSF there are instabilities for particular choices of perturbation, which we show can be removed with a slight redefinition of the representation. In area (ii), we find that most representations can be compressed significantly without loss of precision and, further, that selecting optimal subsets of a representation method improves the accuracy of regression models built for a given dataset.
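As a rough illustration of what probing sensitivity to perturbations can look like in practice, the sketch below displaces one neighbour of a central atom by a small step and measures how much a descriptor vector changes per unit displacement. It uses a toy, purely radial symmetry-function descriptor, which is not any of the five representations studied here and not the authors' procedure. The function names (`cutoff`, `radial_descriptor`, `response`), the Gaussian widths, the cutoff radius and the neighbour positions are all assumptions chosen for illustration.

```python
import math

def cutoff(r, r_cut=3.0):
    """Cosine cutoff that decays smoothly to zero at r_cut (assumed form)."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)

def radial_descriptor(neighbours, etas=(0.5, 1.0, 2.0), r_shift=0.5):
    """Toy radial descriptor for a central atom at the origin.

    Each feature sums a shifted Gaussian of the neighbour distance,
    damped by the cutoff. Parameter values are hypothetical.
    """
    features = []
    for eta in etas:
        total = 0.0
        for x, y, z in neighbours:
            r = math.sqrt(x * x + y * y + z * z)
            total += math.exp(-eta * (r - r_shift) ** 2) * cutoff(r)
        features.append(total)
    return features

def response(neighbours, direction, eps):
    """Euclidean norm of the descriptor change, divided by eps, when the
    first neighbour is displaced by eps along the unit vector `direction`."""
    base = radial_descriptor(neighbours)
    x, y, z = neighbours[0]
    dx, dy, dz = direction
    moved = [(x + eps * dx, y + eps * dy, z + eps * dz)] + list(neighbours[1:])
    perturbed = radial_descriptor(moved)
    change = math.sqrt(sum((a - b) ** 2 for a, b in zip(perturbed, base)))
    return change / eps

# Hypothetical environment: three neighbours around an atom at the origin.
neighbours = [(1.0, 0.0, 0.0), (0.0, 1.2, 0.0), (0.0, 0.0, -1.5)]
radial = (1.0, 0.0, 0.0)      # along the bond to the first neighbour
tangential = (0.0, 1.0, 0.0)  # perpendicular to that bond

for eps in (1e-1, 1e-2, 1e-3):
    print(eps,
          response(neighbours, radial, eps),
          response(neighbours, tangential, eps))
```

In this toy case the radial response stays roughly constant as `eps` shrinks, while the tangential response shrinks in proportion to `eps`, so the descriptor changes only at second order in the tangential displacement. A descriptor with angular terms would behave differently, so the sketch shows only the kind of measurement involved.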