Paper Title

polyBERT: A chemical language model to enable fully machine-driven ultrafast polymer informatics

Authors

Christopher Kuenneth, Rampi Ramprasad

Abstract

Polymers are a vital part of everyday life. Their chemical universe is so large that it presents unprecedented opportunities as well as significant challenges to identify suitable application-specific candidates. We present a complete end-to-end machine-driven polymer informatics pipeline that can search this space for suitable candidates at unprecedented speed and accuracy. This pipeline includes a polymer chemical fingerprinting capability called polyBERT (inspired by Natural Language Processing concepts), and a multitask learning approach that maps the polyBERT fingerprints to a host of properties. polyBERT is a chemical linguist that treats the chemical structure of polymers as a chemical language. The present approach outstrips the best presently available concepts for polymer property prediction based on handcrafted fingerprint schemes in speed by two orders of magnitude while preserving accuracy, thus making it a strong candidate for deployment in scalable architectures including cloud infrastructures.
