Paper Title

Leveraging Low-Fidelity Data to Improve Machine Learning of Sparse High-Fidelity Thermal Conductivity Data via Transfer Learning

Authors

Zeyu Liu, Meng Jiang, Tengfei Luo

Abstract

The lattice thermal conductivity (TC) of semiconductors is crucial for applications ranging from microelectronics to thermoelectrics. Data-driven approaches can potentially establish the critical composition-property relationship needed for fast screening of candidates with desirable TC, but the scarcity of available data remains the main challenge. TC can be calculated efficiently using empirical models, but these are less accurate than the more resource-demanding first-principles calculations. Here, we demonstrate the use of transfer learning (TL) to improve machine learning models trained on small but high-fidelity TC data from experiments and first-principles calculations by leveraging a large but low-fidelity dataset generated from empirical TC models, where training on the high-fidelity data and training on the low-fidelity data are treated as different but related tasks. TL improves the model accuracy by as much as 23% in R² and reduces the average factor difference by as much as 30%. Using the TL model, a large semiconductor database is screened, and several candidates with room-temperature TC > 350 W/(m·K) are identified and further verified using first-principles simulations. This study demonstrates that TL can leverage big low-fidelity data as a proxy task to improve models for a target task with high-fidelity but small data. This capability of TL may have important implications for materials informatics in general.
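
The idea of pretraining on plentiful low-fidelity data and then adapting to scarce high-fidelity data can be illustrated with a minimal sketch. The abstract does not specify the model architecture, so the code below swaps in a plain linear model whose high-fidelity fit is regularized toward the low-fidelity weights; this is a stand-in for illustration, not the authors' implementation. The function names (`fit_toward_prior`, `transfer_fit`, `average_factor_difference`) and the regularization strengths are hypothetical, and the average factor difference is assumed here to be 10 raised to the mean absolute log10 error, a definition the abstract itself does not spell out.

```python
import numpy as np


def average_factor_difference(y_true, y_pred):
    """Assumed definition: 10 ** mean(|log10(pred) - log10(true)|).

    A value of 1.0 means perfect agreement; 2.0 means predictions are off
    by a factor of two on average (in the log-mean sense).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(10 ** np.mean(np.abs(np.log10(y_pred) - np.log10(y_true))))


def fit_toward_prior(X, y, w_prior=None, alpha=1.0):
    """Least squares with a penalty alpha * ||w - w_prior||^2.

    With w_prior=None this is ordinary ridge regression (shrink toward zero).
    No intercept is fitted; include a constant column in X if one is needed.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    if w_prior is None:
        w_prior = np.zeros(n_features)
    w_prior = np.asarray(w_prior, dtype=float)
    # Closed form: w = w_prior + (X^T X + alpha I)^{-1} X^T (y - X w_prior)
    A = X.T @ X + alpha * np.eye(n_features)
    b = X.T @ (y - X @ w_prior)
    return w_prior + np.linalg.solve(A, b)


def transfer_fit(X_low, y_low, X_high, y_high, alpha_low=1e-3, alpha_high=1.0):
    """Proxy task first (large, low-fidelity), then target task (small, high-fidelity)."""
    w_low = fit_toward_prior(X_low, y_low, alpha=alpha_low)
    return fit_toward_prior(X_high, y_high, w_prior=w_low, alpha=alpha_high)
```

In this sketch, `alpha_high` controls how strongly the high-fidelity fit stays anchored to the low-fidelity solution: directions in feature space that the small high-fidelity set does not constrain simply keep their low-fidelity weights. In practice the targets would likely be log10(TC), so that predictions can be exponentiated before computing the average factor difference.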
