论文标题

与最小分歧的拟合指标和超特质

Fitting Metrics and Ultrametrics with Minimum Disagreements

论文作者

Cohen-Addad, Vincent, Fan, Chenglin, Lee, Euiwoong, de Mesmay, Arnaud

论文摘要

给定$ x \ in(\ mathbb {r} _ {\ geq 0})^{\ binom {[n]} {2}}} $记录成对距离,度量违规距离(MVD)问题要求计算$ \ ell_0 $ \ ell_0 $ x $ x $ x $和mentric cone;即,修改$ x $的最小条目数量以使其成为度量。由于其在各种数据分析和优化任务中的大量应用程序,最近对此问题进行了积极研究。 我们提出了MVD的$ O(\ log n)$ - 近似算法,指数级提高了Fan等人的$ O(opt^{1/3})$的先前最佳近似值。 [SODA,2018]。此外,我们算法的主要优势是它的简单和运行时间。我们还研究了超法违规距离(UMVD)的相关问题,其中的目标是计算$ \ ell_0 $的距离到超透明锥体,并达到恒定的因子近似算法。可以将UMVD视为拟合Ailon和Charikar研究的拟合超特质问题的扩展[Siam J. Computing,2011],以及Cohen-Addad等人。 [focs,2021]从$ \ ell_1 $ norm到$ \ ell_0 $ norm。我们表明,这个问题可以被有利地解释为与附加层次结构的相关聚类的实例,我们使用新的$ o(1)$ - 近似算法来求解相关聚类的近似算法,该算法具有其结构性属性,其结构性属性可以输出最佳簇的完善。满足这种财产的算法可以被视为独立利益。我们还为加权实例提供$ O(\ log n \ log \ log n)$近似算法。最后,我们调查了这些问题的互补版本,其中一个人旨在选择最大数量的条目,以$ x $形成(Ultra-)度量。与最小化版本形成鲜明对比的是,我们证明这些最大化版本在假设唯一的游戏猜想的情况下很难在任何恒定因素中近似。

Given $x \in (\mathbb{R}_{\geq 0})^{\binom{[n]}{2}}$ recording pairwise distances, the METRIC VIOLATION DISTANCE (MVD) problem asks to compute the $\ell_0$ distance between $x$ and the metric cone; i.e., modify the minimum number of entries of $x$ to make it a metric. Due to its large number of applications in various data analysis and optimization tasks, this problem has been actively studied recently. We present an $O(\log n)$-approximation algorithm for MVD, exponentially improving the previous best approximation ratio of $O(OPT^{1/3})$ of Fan et al. [ SODA, 2018]. Furthermore, a major strength of our algorithm is its simplicity and running time. We also study the related problem of ULTRAMETRIC VIOLATION DISTANCE (UMVD), where the goal is to compute the $\ell_0$ distance to the cone of ultrametrics, and achieve a constant factor approximation algorithm. The UMVD can be regarded as an extension of the problem of fitting ultrametrics studied by Ailon and Charikar [SIAM J. Computing, 2011] and by Cohen-Addad et al. [FOCS, 2021] from $\ell_1$ norm to $\ell_0$ norm. We show that this problem can be favorably interpreted as an instance of Correlation Clustering with an additional hierarchical structure, which we solve using a new $O(1)$-approximation algorithm for correlation clustering that has the structural property that it outputs a refinement of the optimum clusters. An algorithm satisfying such a property can be considered of independent interest. We also provide an $O(\log n \log \log n)$ approximation algorithm for weighted instances. Finally, we investigate the complementary version of these problems where one aims at choosing a maximum number of entries of $x$ forming an (ultra-)metric. In stark contrast with the minimization versions, we prove that these maximization versions are hard to approximate within any constant factor assuming the Unique Games Conjecture.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源