Paper Title

Cross-Modality Protein Embedding for Compound-Protein Affinity and Contact Prediction

Authors

Yuning You, Yang Shen

Abstract

Compound-protein pairs dominate FDA-approved drug-target pairs, and predicting compound-protein affinity and contact (CPAC) could help accelerate drug discovery. In this study, we treat proteins as multi-modal data comprising 1D amino-acid sequences and (sequence-predicted) 2D residue-pair contact maps. We empirically evaluate the embeddings of the two single modalities for their accuracy and generalizability in CPAC prediction (i.e., structure-free, interpretable compound-protein affinity prediction). We rationalize their performance in terms of two challenges: embedding individual modalities and learning generalizable embedding-label relationships. We further propose two models involving cross-modality protein embedding and establish that the one with cross interaction (thus capturing correlations among modalities) outperforms SOTAs and our single-modality models in affinity, contact, and binding-site predictions for proteins never seen in the training set.
