Paper Title

When does deep learning fail and how to tackle it? A critical analysis on polymer sequence-property surrogate models

Authors

Himanshu; Tarak K. Patra

Abstract

Deep learning models are gaining popularity and potency in predicting polymer properties. These models can be built using pre-existing data and are useful for the rapid prediction of polymer properties. However, the performance of a deep learning model is intricately connected to its topology and the volume of training data. There is no facile protocol available to select a deep learning architecture, and there is a lack of a large volume of homogeneous sequence-property data of polymers. These two factors are the primary bottleneck for the efficient development of deep learning models. Here we assess the severity of these factors and propose new algorithms to address them. We show that a linear layer-by-layer expansion of a neural network can help in identifying the best neural network topology for a given problem. Moreover, we map the discrete sequence space of a polymer to a continuous one-dimensional latent space using a machine learning pipeline to identify minimal data points for building a universal deep learning model. We implement these approaches for three representative cases of building sequence-property surrogate models, viz., the single-molecule radius of gyration of a copolymer, adhesive free energy of a copolymer, and copolymer compatibilizer, demonstrating the generality of the proposed strategies. This work establishes efficient methods for building universal deep learning models with minimal data and hyperparameters for predicting sequence-defined properties of polymers.
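
The abstract does not spell out the layer-by-layer expansion procedure, so the following is only a minimal sketch of one plausible reading: grow the network by one hidden layer at a time and stop once the validation error no longer improves. The function names (`expand_layer_by_layer`, `toy_validation_error`), the fixed layer width, the tolerance, and the toy error curve are all assumptions made for illustration; in practice `score_fn` would train a real model on sequence-property data and return its validation error.

```python
from typing import Callable, List, Tuple


def expand_layer_by_layer(
    score_fn: Callable[[List[int]], float],
    width: int = 32,       # assumed fixed width for every hidden layer
    max_layers: int = 10,  # assumed upper bound on depth
    tol: float = 1e-3,     # assumed minimum improvement to keep growing
) -> Tuple[List[int], float]:
    """Grow a network one hidden layer at a time.

    `score_fn` (hypothetical) takes a list of hidden-layer widths, trains a
    model with that topology, and returns its validation error.
    """
    best_topology = [width]
    best_error = score_fn(best_topology)
    while len(best_topology) < max_layers:
        candidate = best_topology + [width]
        error = score_fn(candidate)
        # Stop as soon as the extra layer fails to improve the error by > tol.
        if best_error - error <= tol:
            break
        best_topology, best_error = candidate, error
    return best_topology, best_error


def toy_validation_error(topology: List[int]) -> float:
    """Stand-in for real training: error is lowest at four hidden layers."""
    depth = len(topology)
    return (depth - 4) ** 2 / 10.0 + 0.05


topology, error = expand_layer_by_layer(toy_validation_error)
print(topology, error)
```

With the toy error curve, the search stops at the depth where adding another layer makes the validation error worse. A greedy search of this kind evaluates at most `max_layers` topologies, which is the appeal of a linear expansion over a full grid search, though it can miss better topologies that lie beyond a local plateau.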
