Paper title

Variance matrix priors for Dirichlet process mixture models with Gaussian kernels

Authors

Jing, Wei; Papathomas, Michail; Liverani, Silvia

Abstract

The Dirichlet Process Mixture Model (DPMM) is a Bayesian non-parametric approach widely used for density estimation and clustering. In this manuscript, we study the choice of prior for the variance or precision matrix when Gaussian kernels are adopted. Typically, in the relevant literature, the assessment of mixture models is done by considering observations in a space of only a handful of dimensions. Instead, we are concerned with more realistic problems of higher dimensionality, in a space of up to 20 dimensions. We observe that the choice of prior is increasingly important as the dimensionality of the problem increases. After identifying certain undesirable properties of standard priors in problems of higher dimensionality, we review and implement possible alternative priors. The most promising priors are identified, as well as other factors that affect the convergence of MCMC samplers. Our results show that the choice of prior is critical for deriving reliable posterior inferences. This manuscript offers a thorough overview and comparative investigation into possible priors, with detailed guidelines for their implementation. Although our work focuses on the use of the DPMM in clustering, it is also applicable to density estimation.
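
For orientation, a DPMM with Gaussian kernels is commonly written in the stick-breaking form sketched below. The notation (x_i for the d-dimensional observations, z_i for cluster allocations, psi_c for mixture weights, V_c for stick-breaking fractions, alpha for the concentration parameter, G_0 for the base distribution) is assumed here for illustration and is not taken from the manuscript.

```latex
\begin{align*}
x_i \mid z_i = c &\sim \mathcal{N}_d(\mu_c, \Sigma_c), & i &= 1, \dots, n, \\
\Pr(z_i = c \mid \psi) &= \psi_c, & c &= 1, 2, \dots, \\
\psi_c &= V_c \prod_{l < c} (1 - V_l), & V_c &\overset{\text{iid}}{\sim} \mathrm{Beta}(1, \alpha), \\
(\mu_c, \Sigma_c) &\overset{\text{iid}}{\sim} G_0 .
\end{align*}
```

In this form, the prior discussed in the abstract is the component of G_0 that governs the variance matrix \Sigma_c (or, equivalently, the precision matrix \Sigma_c^{-1}). One common conjugate choice is an inverse-Wishart distribution on \Sigma_c. It is named here only as an example and is not necessarily among the priors the manuscript examines.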
