Paper Title
From Static to Dynamic Structures: Improving Binding Affinity Prediction with Graph-Based Deep Learning
Authors
Abstract
Accurate prediction of protein-ligand binding affinities is an essential challenge in structure-based drug design. Despite recent advances in data-driven methods for affinity prediction, their accuracy remains limited, partly because they rely only on static crystal structures, whereas actual binding affinities are generally determined by the thermodynamic ensembles of the proteins and ligands. One effective way to approximate such a thermodynamic ensemble is molecular dynamics (MD) simulation. Here, an MD dataset containing 3,218 different protein-ligand complexes is curated, and Dynaformer, a graph-based deep learning model, is developed to predict binding affinities by learning the geometric characteristics of protein-ligand interactions from the MD trajectories. In silico experiments demonstrate that the model exhibits state-of-the-art scoring and ranking power on the CASF-2016 benchmark dataset, outperforming previously reported methods. Moreover, in a virtual screening on heat shock protein 90 (HSP90) using Dynaformer, 20 candidates are identified and their binding affinities are experimentally validated. Dynaformer displays promising results in virtual drug screening, revealing 12 hit compounds (two in the submicromolar range), including several novel scaffolds. Overall, these results demonstrate that the approach offers a promising avenue for accelerating the early drug discovery process.