Paper Title

A framework for a generalisation analysis of machine-learned interatomic potentials

Paper Authors

Christoph Ortner, Yangshuai Wang

Paper Abstract

Machine-learned interatomic potentials (MLIPs) and force fields (i.e. interaction laws for atoms and molecules) are typically trained on limited data-sets that cover only a very small section of the full space of possible input structures. MLIPs are nevertheless capable of making accurate predictions of forces and energies in simulations involving (seemingly) much more complex structures. In this article we propose a framework within which this kind of generalisation can be rigorously understood. As a prototypical example, we apply the framework to the case of simulating point defects in a crystalline solid. Here, we demonstrate how the accuracy of the simulation depends explicitly on the size of the training structures, on the kind of observations (e.g., energies, forces, force constants, virials) to which the model has been fitted, and on the fit accuracy. The new theoretical insights we gain partially justify current best practices in the MLIP literature and in addition suggest a new approach to the collection of training data and the design of loss functions.
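The abstract refers to fitting an MLIP against several kinds of observations (energies, forces, force constants, virials) with a tunable fit accuracy. A common form of such a composite loss in the MLIP literature is a weighted least-squares sum over training structures; the sketch below uses generic notation (weights w_E, w_F, w_V and model parameters θ are assumed labels, not taken from the paper) and is illustrative of standard practice rather than the specific functional analysed in this work:

```latex
% Generic weighted least-squares MLIP loss over a training set \mathcal{R}.
% E_\theta, F_{\theta,i}, V_\theta: model energy, force on atom i, and virial;
% superscript "ref" marks the reference (e.g. DFT) observations.
\mathcal{L}(\theta)
  = \sum_{R \in \mathcal{R}} \Big[
      w_E \, \big| E_\theta(R) - E^{\mathrm{ref}}(R) \big|^2
    + w_F \sum_{i} \big\| F_{\theta,i}(R) - F^{\mathrm{ref}}_{i}(R) \big\|^2
    + w_V \, \big\| V_\theta(R) - V^{\mathrm{ref}}(R) \big\|^2
  \Big]
```

The paper's stated contribution is to relate the simulation error for downstream predictions (here, point defects) to the training-structure size, the choice of which of these observation terms are included, and how small the fitted residuals are.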
