论文标题
对CUR分解的采样稳定性
Stability of Sampling for CUR Decompositions
论文作者
论文摘要
本文研究了如何通过主要随机抽样来形成低级矩阵的CUR分解,尽管也说明了由于先前工作引起的确定性方法。主要问题是确定等级$ k $矩阵的列列还具有排名$ k $。对于随机列采样方案,通常需要选择的列数与确定采样概率的复杂性之间存在权衡。我们讨论了几种抽样方法及其复杂性以及在概率和基础矩阵的扰动下的方法的稳定性。作为一种应用,我们可以通过CUR分解来提供高概率保证子空间聚类问题的精确解决方案时,当列根据其欧几里得长度采样时。
This article studies how to form CUR decompositions of low-rank matrices via primarily random sampling, though deterministic methods due to previous works are illustrated as well. The primary problem is to determine when a column submatrix of a rank $k$ matrix also has rank $k$. For random column sampling schemes, there is typically a tradeoff between the number of columns needed to be chosen and the complexity of determining the sampling probabilities. We discuss several sampling methods and their complexities as well as stability of the method under perturbations of both the probabilities and the underlying matrix. As an application, we give a high probability guarantee of the exact solution of the Subspace Clustering Problem via CUR decompositions when columns are sampled according to their Euclidean lengths.