Paper title
BLANT: Basic Local Alignment of Network Topology, Part 2: Topology-only Extension Beyond Graphlet Seeds
Authors
Abstract
BLAST is a standard tool in bioinformatics for creating local sequence alignments using a "seed-and-extend" approach. Here we introduce an analogous seed-and-extend algorithm that produces local network alignments: BLANT (Basic Local Alignment of Network Topology). In Part 1, we introduced BLANT-seed, which generates graphlet-based seeds using only topological information. Here, in Part 2, we describe BLANT-extend, which "grows" seeds into larger local alignments using only topological information. We allow the user to specify bounds that an alignment must satisfy on several measures, including edge density, edge commonality (i.e., aligned edges), and node-pair similarity if such a measure is used; the latter allows the inclusion of sequence-based similarity, if desired, as well as local topological constraints. BLANT-extend is able to enumerate all possible alignments satisfying the bounds that can be grown from each seed, within a specified CPU time or number of generated alignments. While topology-driven local network alignment has a wide variety of potential applications outside bioinformatics, here we focus on the alignment of Protein-Protein Interaction (PPI) networks. We show that BLANT is capable of finding large, high-quality local alignments when the networks are known to have high topological similarity, for example recovering hundreds of orthologs between networks of the recent Integrated Interaction Database (IID). Predictably, however, it performs less well when true topological similarity is absent, as is the case in most current experimental PPI networks, which are noisy and differ widely in edge density, resulting in low common coverage.
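To make the idea of growing a seed under user-specified bounds concrete, the following is a minimal Python sketch and not BLANT-extend itself. It assumes networks are stored as dictionaries mapping each node to a set of neighbors, that an alignment is a dictionary of node pairs, and that edge commonality is the number of aligned edges divided by the larger of the two induced edge counts; these definitions and all function names are illustrative assumptions rather than details taken from the paper. The abstract describes enumerating all alignments that can be grown from a seed, whereas this sketch grows a single alignment greedily.

```python
from itertools import combinations


def induced_edges(adj, nodes):
    """Count edges of `adj` that have both endpoints in `nodes`."""
    return sum(1 for u, v in combinations(list(nodes), 2) if v in adj[u])


def edge_density(adj, nodes):
    """Induced edges divided by the number of possible node pairs."""
    n = len(nodes)
    return induced_edges(adj, nodes) / (n * (n - 1) / 2) if n > 1 else 0.0


def edge_commonality(adj1, adj2, alignment):
    """Assumed definition: aligned edges / larger induced edge count."""
    pairs = list(alignment.items())
    common = sum(
        1
        for (u1, u2), (v1, v2) in combinations(pairs, 2)
        if v1 in adj1[u1] and v2 in adj2[u2]
    )
    most = max(induced_edges(adj1, alignment.keys()),
               induced_edges(adj2, alignment.values()))
    return common / most if most else 0.0


def within_bounds(adj1, adj2, alignment, min_density, min_commonality):
    """Check the (hypothetical) lower bounds on both sides of the alignment."""
    return (edge_density(adj1, alignment.keys()) >= min_density
            and edge_density(adj2, alignment.values()) >= min_density
            and edge_commonality(adj1, adj2, alignment) >= min_commonality)


def extend(adj1, adj2, seed, min_density, min_commonality, max_steps=100):
    """Greedily add neighboring node pairs while the bounds still hold."""
    alignment = dict(seed)
    for _ in range(max_steps):
        used = set(alignment.values())
        # Candidate pairs: unaligned neighbors of an already aligned pair.
        candidates = sorted(
            (a, b)
            for u, v in alignment.items()
            for a in adj1[u] if a not in alignment
            for b in adj2[v] if b not in used
        )
        for a, b in candidates:
            trial = {**alignment, a: b}
            if within_bounds(adj1, adj2, trial, min_density, min_commonality):
                alignment = trial
                break
        else:
            return alignment  # no candidate keeps the alignment within bounds
    return alignment
```

Under these assumptions, aligning a triangle-with-tail network to itself from a triangle seed adds the tail node when the density bound is loose and leaves the seed unchanged when the bound is tight, which is the role the user-specified bounds play in limiting growth.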