Paper Title

A graph representation of molecular ensembles for polymer property prediction

Authors

Matteo Aldeghi, Connor W. Coley

Abstract

Synthetic polymers are versatile and widely used materials. Similar to small organic molecules, a large chemical space of such materials is hypothetically accessible. Computational property prediction and virtual screening can accelerate polymer design by prioritizing candidates expected to have favorable properties. However, in contrast to organic molecules, polymers are often not well-defined single structures but an ensemble of similar molecules, which poses unique challenges to traditional chemical representations and machine learning approaches. Here, we introduce a graph representation of molecular ensembles and an associated graph neural network architecture that is tailored to polymer property prediction. We demonstrate that this approach captures critical features of polymeric materials, like chain architecture, monomer stoichiometry, and degree of polymerization, and achieves superior accuracy to off-the-shelf cheminformatics methodologies. While doing so, we built a dataset of simulated electron affinity and ionization potential values for >40k polymers with varying monomer composition, stoichiometry, and chain architecture, which may be used in the development of other tailored machine learning approaches. The dataset and machine learning models presented in this work pave the path toward new classes of algorithms for polymer informatics and, more broadly, introduce a framework for the modeling of molecular ensembles.
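To make the idea of a graph over a molecular ensemble more concrete, the toy sketch below encodes a copolymer as a weighted graph at the monomer level: node weights carry the monomer stoichiometry, and directed edge weights carry the probability that one monomer is followed by another along the chain. This is an illustrative simplification and not the authors' implementation. The names `EnsembleGraph`, `copolymer_graph`, and `weighted_readout`, the architecture labels, and the edge-weight rules are all assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class EnsembleGraph:
    """Hypothetical container: one node per monomer type, one directed edge per junction."""
    node_weights: dict[str, float]
    edge_weights: dict[tuple[str, str], float] = field(default_factory=dict)


def copolymer_graph(fractions: dict[str, float], architecture: str) -> EnsembleGraph:
    """Build a toy monomer-level graph for a copolymer ensemble.

    fractions: monomer name -> stoichiometric fraction (must sum to 1).
    architecture: "random" or "alternating" (labels assumed for this sketch).
    """
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("stoichiometric fractions must sum to 1")

    names = list(fractions)
    edges: dict[tuple[str, str], float] = {}
    for a in names:
        for b in names:
            if architecture == "random":
                # Assumption: the next unit is drawn according to stoichiometry.
                weight = fractions[b]
            elif architecture == "alternating":
                # Assumption: a unit is never followed by the same monomer.
                weight = 0.0 if a == b else 1.0 / (len(names) - 1)
            else:
                raise ValueError(f"unknown architecture: {architecture}")
            edges[(a, b)] = weight
    return EnsembleGraph(node_weights=dict(fractions), edge_weights=edges)


def weighted_readout(graph: EnsembleGraph, features: dict[str, float]) -> float:
    """Stoichiometry-weighted sum of a scalar per-monomer feature."""
    return sum(graph.node_weights[m] * features[m] for m in graph.node_weights)


if __name__ == "__main__":
    g = copolymer_graph({"A": 0.25, "B": 0.75}, "random")
    print(g.edge_weights)
    print(weighted_readout(g, {"A": 1.0, "B": 3.0}))
```

In this sketch, two polymers built from the same monomers but with different stoichiometry or chain architecture yield different node or edge weights, which is the kind of distinction a single-structure representation cannot express.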
