Network resampling for estimating uncertainty

Paper title

Network resampling for estimating uncertainty

Authors

Qianhua Shan, Elizaveta Levina

Abstract

With network data becoming ubiquitous in many applications, many models and algorithms for network analysis have been proposed. Yet methods for providing uncertainty estimates in addition to point estimates of network parameters are much less common. While bootstrap and other resampling procedures have been an effective general tool for estimating uncertainty from i.i.d. samples, adapting them to networks is highly nontrivial. In this work, we study three different network resampling procedures for uncertainty estimation, and propose a general algorithm to construct confidence intervals for network parameters through network resampling. We also propose an algorithm for selecting the sampling fraction, which has a substantial effect on performance. We find that, unsurprisingly, no one procedure is empirically best for all tasks, but that selecting an appropriate sampling fraction substantially improves performance in many cases. We illustrate this on simulated networks and on Facebook data.
