Paper title

A new algebraic approach to genome rearrangement models

Authors

Venta Terauds, Jeremy Sumner

Abstract

We present a unified framework for modelling genomes and their rearrangements in a genome algebra, as elements that simultaneously incorporate all physical symmetries. Building on previous work utilising the group algebra of the symmetric group, we explicitly construct the genome algebra for the case of unsigned circular genomes with dihedral symmetry and show that the maximum likelihood estimate (MLE) of genome rearrangement distance can be computed validly and more efficiently in this setting. We then construct the genome algebra for a more general case, that is, for genomes that may be represented by elements of an arbitrary group and symmetry group, and show that the MLE computations can be performed entirely within this framework. There is no prescribed model in this framework; that is, it allows any choice of rearrangements that preserve the set of regions, along with arbitrary weights. Further, since the likelihood function is built from path probabilities (a generalisation of path counts), the framework may be utilised for any distance measure that is based on path probabilities.
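
To make the idea of a likelihood "built from path probabilities" concrete, the following is a minimal brute-force sketch in Python, not the genome algebra construction the abstract describes. It rests on several assumptions that are not stated in the abstract: genomes are plain permutations of 5 regions (the dihedral symmetry is ignored), the rearrangement model consists of swaps of two neighbouring regions on a circle with equal weights, and the number of rearrangement events in time T is Poisson distributed with mean T. All function names are hypothetical.

```python
import math

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p), as a tuple."""
    return tuple(p[q[i]] for i in range(len(q)))

def adjacent_swaps(n):
    """Assumed model: swaps of positions i and i+1 (mod n) on a circle."""
    gens = []
    for i in range(n):
        g = list(range(n))
        j = (i + 1) % n
        g[i], g[j] = g[j], g[i]
        gens.append(tuple(g))
    return gens

def path_probabilities(n, gens, weights, max_steps):
    """probs[k][sigma] = probability that k weighted rearrangements,
    applied to the identity arrangement, produce sigma."""
    current = {tuple(range(n)): 1.0}
    probs = [current]
    for _ in range(max_steps):
        nxt = {}
        for sigma, p in current.items():
            for g, w in zip(gens, weights):
                tau = compose(sigma, g)
                nxt[tau] = nxt.get(tau, 0.0) + p * w
        probs.append(nxt)
        current = nxt
    return probs

def likelihood(T, sigma, probs):
    """Poisson-weighted sum of path probabilities, truncated at
    len(probs) - 1 events (an approximation for large T)."""
    return sum(
        math.exp(-T) * T**k / math.factorial(k) * probs[k].get(sigma, 0.0)
        for k in range(len(probs))
    )

def mle_distance(sigma, probs, grid):
    """Grid-search estimate of the T that maximises the likelihood."""
    return max(grid, key=lambda T: likelihood(T, sigma, probs))

n = 5
gens = adjacent_swaps(n)
weights = [1.0 / len(gens)] * len(gens)   # equal weights (assumption)
probs = path_probabilities(n, gens, weights, max_steps=40)

target = (1, 0, 2, 3, 4)                  # one swap away from the identity
grid = [i * 0.01 for i in range(1, 1001)] # T in (0, 10]
print(mle_distance(target, probs, grid))
```

With equal weights, each path probability is the path count divided by the number of rearrangements raised to the number of steps, which is the sense in which path probabilities generalise path counts. Enumerating permutations this way scales factorially in the number of regions, so it is only useful for illustration on very small examples.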
