Paper title
Model Selection for Maternal Hypertensive Disorders with Symmetric Hierarchical Dirichlet Processes
Paper authors
Paper abstract
Hypertensive disorders of pregnancy affect about 10% of pregnant women worldwide. Although there is evidence that hypertension impacts maternal cardiac function, the relationship between hypertension and cardiac dysfunction is only partially understood. The study of this relationship can be framed as a joint inferential problem on multiple populations, each corresponding to a different hypertensive disorder diagnosis, that combines the multivariate information provided by a collection of cardiac function indexes. A Bayesian nonparametric approach seems particularly suited to this setup, and we demonstrate it on a dataset consisting of transthoracic echocardiography results from a cohort of Indian pregnant women. We are able to perform model selection and to provide density estimates of the cardiac function indexes and a latent clustering of patients. These readily interpretable inferential outputs make it possible to single out modified cardiac function in hypertensive patients compared with healthy subjects, as well as alterations that increase progressively with the severity of the disorder. The analysis is based on a Bayesian nonparametric model that relies on a novel hierarchical structure, called the symmetric hierarchical Dirichlet process. This structure is designed so that the mean parameters are identified and used for model selection across populations, a penalization for multiplicity is enforced, and the presence of unobserved relevant factors is investigated through a latent clustering of subjects. Posterior inference relies on a suitable Markov chain Monte Carlo algorithm, and the model's behaviour is also showcased on simulated data.
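The abstract does not spell out the symmetric hierarchical Dirichlet process itself, so the sketch below illustrates only the general idea it builds on: a latent clustering of subjects that is shared across several populations. It simulates cluster labels from the prior of the standard hierarchical Dirichlet process through its Chinese restaurant franchise representation. This is a swapped-in textbook construction, not the authors' model or their Markov chain Monte Carlo algorithm. The function names, the concentration parameters `alpha` and `gamma`, and the group sizes are all assumptions made for illustration.

```python
import random
from typing import List


def sample_index(weights: List[float], rng: random.Random) -> int:
    """Draw an index with probability proportional to the given weights."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def chinese_restaurant_franchise(
    group_sizes: List[int], alpha: float, gamma: float, seed: int = 0
) -> List[List[int]]:
    """Simulate shared cluster labels from a standard HDP prior (illustrative).

    group_sizes[j] is the number of subjects in population j.
    alpha is the within-population concentration, gamma the top-level one.
    Returns labels[j][i], the shared cluster index of subject i in population j.
    """
    rng = random.Random(seed)
    tables_per_cluster: List[int] = []  # tables serving each cluster, over all populations
    labels: List[List[int]] = []
    for n_j in group_sizes:
        table_counts: List[int] = []   # subjects seated at each table of this population
        table_cluster: List[int] = []  # shared cluster served at each table
        group_labels: List[int] = []
        for _ in range(n_j):
            # Join an existing table with probability proportional to its size,
            # or open a new table with probability proportional to alpha.
            t = sample_index(table_counts + [alpha], rng)
            if t == len(table_counts):
                # A new table picks an existing cluster with probability
                # proportional to its table count, or a new cluster with
                # probability proportional to gamma.
                k = sample_index(tables_per_cluster + [gamma], rng)
                if k == len(tables_per_cluster):
                    tables_per_cluster.append(0)
                tables_per_cluster[k] += 1
                table_counts.append(0)
                table_cluster.append(k)
            table_counts[t] += 1
            group_labels.append(table_cluster[t])
        labels.append(group_labels)
    return labels


if __name__ == "__main__":
    # Hypothetical sizes for three diagnosis groups; not taken from the paper.
    for j, group in enumerate(chinese_restaurant_franchise([5, 8, 3], 1.0, 1.0, seed=1)):
        print(f"population {j}: {group}")
```

Because clusters are drawn from a pool shared by all populations, the same cluster index can appear in several groups, which is what allows subjects with different diagnoses to be clustered together. Larger values of `alpha` or `gamma` tend to produce more clusters.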