论文标题

旋转精确集合计数的力量

The Power of Pivoting for Exact Clique Counting

论文作者

Jain, Shweta, Seshadhri, C.

论文摘要

集团计数是网络分析中的一项基本任务,即使是最简单的$ 3 $ cliques(三角形)的设置也是最近研究的中心。由于大集团的搜索空间中的指数爆炸,因此获得较大$ k $的$ k $ cliques的计数是算法具有挑战性的。但是,许多最新应用(尤其是用于社区检测或聚类)使用较大的集团计数。此外,人们经常需要\ textit {local}计数,每个顶点/边缘的$ k $ cliques数量。 我们的主要结果是Pivoter,这是一种算法,它准确地计算了$ k $ cliques的数量,\ textit {对于$ k $}的所有值。它在实践中非常有效,并且能够获得以前作品无法触及的图表数。例如,Pivoter在社交网络中获得了所有集团的计数,该社交网络在两小时内在商品机器上具有100m边缘。以前的平行算法不会在几天内终止。对于许多具有数千万边缘的公共数据集,枢轴还可以可行地获得本地人均和每义$ k $ clique计数(对于所有$ k $)。据我们所知,这是第一个实现此类结果的算法。 主要的见解是构造简洁的集团树(SCT),该树在输入图中存储了所有集团的压缩独特表示。它是使用一种称为\ textit {旋转}的技术构建的,这是Bron-Kerbosch的经典方法,可减少最大插座的回溯算法的递归树。值得注意的是,可以在不实际列举所有集团的情况下构建SCT,并提供简洁的数据结构,从中可以有效地从中从中获得精确的集团统计信息($ K $ -Clique计数,本地计数)。

Clique counting is a fundamental task in network analysis, and even the simplest setting of $3$-cliques (triangles) has been the center of much recent research. Getting the count of $k$-cliques for larger $k$ is algorithmically challenging, due to the exponential blowup in the search space of large cliques. But a number of recent applications (especially for community detection or clustering) use larger clique counts. Moreover, one often desires \textit{local} counts, the number of $k$-cliques per vertex/edge. Our main result is Pivoter, an algorithm that exactly counts the number of $k$-cliques, \textit{for all values of $k$}. It is surprisingly effective in practice, and is able to get clique counts of graphs that were beyond the reach of previous work. For example, Pivoter gets all clique counts in a social network with a 100M edges within two hours on a commodity machine. Previous parallel algorithms do not terminate in days. Pivoter can also feasibly get local per-vertex and per-edge $k$-clique counts (for all $k$) for many public data sets with tens of millions of edges. To the best of our knowledge, this is the first algorithm that achieves such results. The main insight is the construction of a Succinct Clique Tree (SCT) that stores a compressed unique representation of all cliques in an input graph. It is built using a technique called \textit{pivoting}, a classic approach by Bron-Kerbosch to reduce the recursion tree of backtracking algorithms for maximal cliques. Remarkably, the SCT can be built without actually enumerating all cliques, and provides a succinct data structure from which exact clique statistics ($k$-clique counts, local counts) can be read off efficiently.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源