Paper Title

Improved Modeling of Persistence Diagram

Author

Agami, Sarit

Abstract

Methods for reducing high-dimensional data are powerful tools for describing the main patterns in big data. One of these methods is topological data analysis (TDA), which models the shape of the data in terms of topological properties. This method translates the original data into a two-dimensional system, which is represented graphically by the 'persistence diagram'. Outlier points on this diagram represent patterns in the data, whereas the other points behave as random noise. To determine which points are significant outliers, replications of the original data set are needed. When only a single original data set is available, replications can be created by fitting a model to the points on the persistence diagram and then sampling from it with MCMC methods. One such model is RST (Replicating Statistical Topology). In this paper we suggest a modification of the RST model. Using a simulation study, we show that the modified RST improves on the performance of the RST in terms of goodness of fit. We use the MCMC Metropolis-Hastings algorithm to sample from the fitted model.
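The sampling step mentioned in the abstract can be illustrated with a minimal random-walk Metropolis-Hastings sketch. The abstract does not give the form of the RST density or of its modification, so the target below is a toy unnormalized density over hypothetical (birth, persistence) coordinates, chosen only so that the example runs. The names `log_target` and `metropolis_hastings`, the step size, and the starting point are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def log_target(birth, persistence):
    """Unnormalized log-density of a TOY model (an assumption, not the RST model).

    Birth is standard normal; persistence is exponential and must be positive.
    """
    if persistence <= 0:
        return -math.inf
    return -0.5 * birth ** 2 - persistence


def metropolis_hastings(log_f, start, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x = start
    log_fx = log_f(*x)
    samples = []
    for _ in range(n_samples):
        # Propose a move by perturbing both coordinates.
        proposal = (x[0] + rng.gauss(0.0, step), x[1] + rng.gauss(0.0, step))
        log_fp = log_f(*proposal)
        # The proposal is symmetric, so the acceptance probability
        # is min(1, f(proposal) / f(current)).
        accept_prob = math.exp(min(0.0, log_fp - log_fx))
        if rng.random() < accept_prob:
            x, log_fx = proposal, log_fp
        # The current state is recorded whether or not the move was accepted.
        samples.append(x)
    return samples


samples = metropolis_hastings(log_target, start=(0.0, 1.0), n_samples=20000)
```

In a real application the toy `log_target` would be replaced by the log of the fitted model's density, and an initial stretch of the chain would typically be discarded as burn-in before the draws are treated as a replicated diagram.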
