Paper Title

Coresets for Weight-Constrained Anisotropic Assignment and Clustering

Authors

Maximilian Fiedler, Peter Gritzmann

Abstract

The present paper constructs coresets for weight-constrained anisotropic assignment and clustering. In contrast to the well-studied unconstrained least-squares clustering problem, approximating the centroids of the clusters no longer suffices in the weight-constrained anisotropic case, as even the assignment of the points to best sites is involved. This assignment step is often the limiting factor in materials science, a problem that partially motivates our work. We build on a paper by Har-Peled and Kushal, who constructed coresets of size $\mathcal{O}\bigl(\frac{k^3}{ε^{d+1}}\bigr)$ for unconstrained least-squares clustering. We generalize and improve on their results in various ways, leading to even smaller coresets with a size of only $\mathcal{O}\bigl(\frac{k^2}{ε^{d+1}}\bigr)$ for weight-constrained anisotropic clustering. Moreover, we answer an open question on coreset designs in the negative, by showing that the total sensitivity can become as large as the cardinality of the original data set in the constrained case. Consequently, many techniques based on importance sampling do not apply to weight-constrained clustering.
