Paper title
A Bayesian binomial regression model with latent Gaussian processes for modelling DNA methylation
Paper authors
Abstract
Epigenetic observations are represented by the total number of reads from a given pool of cells and the number of methylated reads, so it is reasonable to model these data with a binomial distribution. Numerous factors can influence the probability of success (methylation) in a particular region. Moreover, these probabilities show strong spatial dependence along the genome. We incorporate both the dependence on covariates and the spatial dependence of the methylation probability for observations from a pool of cells by means of a binomial regression model with a latent Gaussian field and a logit link function. We apply a Bayesian approach, including prior specifications on model configurations. We run a mode jumping Markov chain Monte Carlo (MJMCMC) algorithm across different choices of covariates to obtain the joint posterior distribution of parameters and models. This also allows us to find the best set of covariates for modelling the methylation probability within the genomic region of interest, as well as the marginal inclusion probability of each covariate.
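
The model described in the abstract can be sketched in symbols as follows. The notation is an assumption made for illustration and is not taken from the paper: y_t and n_t stand for the methylated and total read counts at genomic position t, x_{tj} for the covariates, gamma_j for binary indicators of whether covariate j is included in the model, and delta_t for the latent Gaussian field. The abstract does not state the covariance structure of the latent field, so it is left unspecified here.

```latex
% Illustrative notation only; the abstract does not fix these symbols.
\begin{align*}
y_t \mid p_t &\sim \mathrm{Binomial}(n_t,\, p_t), \qquad t = 1, \dots, T, \\
\operatorname{logit}(p_t) &= \beta_0 + \sum_{j=1}^{q} \gamma_j \beta_j x_{tj} + \delta_t, \\
\boldsymbol{\delta} = (\delta_1, \dots, \delta_T) &\sim \text{latent Gaussian process along the genome}, \\
P(\gamma_j = 1 \mid \mathbf{y}) &= \sum_{\boldsymbol{\gamma}:\, \gamma_j = 1} p(\boldsymbol{\gamma} \mid \mathbf{y}).
\end{align*}
```

The last line is the standard definition of a marginal inclusion probability: the posterior mass of all model configurations that contain covariate j. In practice the sum runs over the models visited by the search algorithm, since enumerating all 2^q configurations is rarely feasible.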