Paper title
DiMA: Sequence Diversity Dynamics Analyser for Viruses
Authors
Abstract
Sequence diversity is one of the major challenges in the design of diagnostic, prophylactic and therapeutic interventions against viruses. DiMA is a novel, big data-ready tool designed to facilitate the dissection of sequence diversity dynamics for viruses, and it offers several features not found in other diversity analysis tools. DiMA provides a quantitative overview of sequence (nucleotide/protein) diversity using Shannon's entropy corrected for size bias, applied via a user-defined k-mer sliding window to an input alignment file, and each k-mer position is dissected into various diversity motifs. The motifs are defined based on the probability of the distinct sequences at a given k-mer position: the index is the predominant sequence, and all other sequences are (total) variants of the index. The total variants are sub-classified into the major (most common) variant, minor variants (occurring more than once and at a frequency lower than the major), and unique (singleton) variants. DiMA allows user-defined sequence metadata enrichment for analyses of the motifs. The application of DiMA was demonstrated on alignment data of the relatively conserved Spike protein (2,106,985 sequences) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and the relatively highly diverse Pol protein (3,874 sequences) of human immunodeficiency virus-1 (HIV-1). The tool is publicly available as a web server (https://dima.bezmialem.edu.tr), as a Python library (via PyPI) and as a command line client (via GitHub).
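The sliding-window entropy and motif classification described in the abstract can be illustrated with a minimal Python sketch. It does not use the DiMA library or reproduce its API: the function names (`kmer_positions`, `shannon_entropy`, `classify_motifs`) and the toy alignment are hypothetical, and the entropy is the plain Shannon value without the size-bias correction. Two further assumptions are made here: ties in frequency are broken by first occurrence, and a variant seen only once is always labelled unique, even if it is the most common variant.

```python
from collections import Counter
from math import log2


def kmer_positions(alignment, k):
    """Yield (start, k-mers) for every window of size k over equal-length sequences."""
    length = len(alignment[0])
    for start in range(length - k + 1):
        yield start, [seq[start:start + k] for seq in alignment]


def shannon_entropy(counts):
    """Plain Shannon entropy in bits (no size-bias correction in this sketch)."""
    total = sum(counts.values())
    return sum((n / total) * log2(total / n) for n in counts.values())


def classify_motifs(counts):
    """Label each distinct k-mer at one position as index, major, minor or unique."""
    ranked = counts.most_common()  # ties keep first-seen order (assumption)
    labels = {ranked[0][0]: "index"}  # predominant sequence
    major_assigned = False
    for kmer, n in ranked[1:]:
        if n == 1:
            labels[kmer] = "unique"  # singleton variant
        elif not major_assigned:
            labels[kmer] = "major"  # most common variant
            major_assigned = True
        else:
            labels[kmer] = "minor"  # seen more than once, after the major
    return labels


# Toy protein alignment (hypothetical data): 11 aligned sequences of length 3.
alignment = ["MKV"] * 5 + ["MRV"] * 3 + ["MKI"] * 2 + ["MTV"]

for start, kmers in kmer_positions(alignment, k=2):
    counts = Counter(kmers)
    print(start, round(shannon_entropy(counts), 3), classify_motifs(counts))
```

On real data the input would be a multiple sequence alignment file, and the entropy would additionally need the size-bias correction mentioned in the abstract, which this sketch omits.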