Paper Title

k-Sliced Mutual Information: A Quantitative Study of Scalability with Dimension

Authors

Ziv Goldfeld, Kristjan Greenewald, Theshani Nuradha, Galen Reeves

Abstract

Sliced mutual information (SMI) is defined as an average of mutual information (MI) terms between one-dimensional random projections of the random variables. It serves as a surrogate measure of dependence to classic MI that preserves many of its properties but is more scalable to high dimensions. However, a quantitative characterization of how SMI itself and estimation rates thereof depend on the ambient dimension, which is crucial to the understanding of scalability, remains obscure. This work provides a multifaceted account of the dependence of SMI on dimension, under a broader framework termed $k$-SMI, which considers projections to $k$-dimensional subspaces. Using a new result on the continuity of differential entropy in the 2-Wasserstein metric, we derive sharp bounds on the error of Monte Carlo (MC)-based estimates of $k$-SMI, with explicit dependence on $k$ and the ambient dimension, revealing their interplay with the number of samples. We then combine the MC integrator with the neural estimation framework to provide an end-to-end $k$-SMI estimator, for which optimal convergence rates are established. We also explore asymptotics of the population $k$-SMI as dimension grows, providing Gaussian approximation results with a residual that decays under appropriate moment bounds. All our results trivially apply to SMI by setting $k=1$. Our theory is validated with numerical experiments and is applied to sliced InfoGAN, which altogether provide a comprehensive quantitative account of the scalability question of $k$-SMI, including SMI as a special case when $k=1$.
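To make the MC integration step concrete: $k$-SMI replaces the one-dimensional projections of SMI with $k$-dimensional ones, i.e., it averages $I(\mathrm{A}^\top X; \mathrm{B}^\top Y)$ over projection matrices $\mathrm{A}$, $\mathrm{B}$ drawn uniformly from the Stiefel manifolds of $k$-frames in the two ambient spaces, and the MC estimator averages MI estimates over finitely many such draws. The sketch below is a minimal illustration of this scheme, assuming a Gaussian plug-in MI estimate for each projected pair rather than the neural estimator analyzed in the paper; the function names (`random_stiefel`, `mc_k_smi`) and default parameters are hypothetical, not taken from the authors' code.

```python
import numpy as np

def random_stiefel(d, k, rng):
    """Random d x k matrix with orthonormal columns, from the QR
    decomposition of a Gaussian matrix (uniform on the Stiefel
    manifold up to column signs, which MI is invariant to)."""
    q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    return q

def gaussian_mi(u, v):
    """Plug-in MI estimate assuming (u, v) is jointly Gaussian:
    I = 0.5 * (log det C_u + log det C_v - log det C_uv)."""
    ku = u.shape[1]
    c = np.cov(np.hstack([u, v]), rowvar=False)
    logdet = lambda m: np.linalg.slogdet(m)[1]
    return 0.5 * (logdet(c[:ku, :ku]) + logdet(c[ku:, ku:]) - logdet(c))

def mc_k_smi(x, y, k=1, num_projections=128, seed=0):
    """Monte Carlo estimate of k-SMI: average MI estimates between
    projections A^T X and B^T Y over random draws of (A, B)."""
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(num_projections):
        a = random_stiefel(x.shape[1], k, rng)
        b = random_stiefel(y.shape[1], k, rng)
        vals.append(gaussian_mi(x @ a, y @ b))
    return float(np.mean(vals))

# Toy check on correlated 10-dimensional Gaussian data.
rng = np.random.default_rng(1)
x = rng.standard_normal((5000, 10))
y = x + 0.5 * rng.standard_normal((5000, 10))
print(mc_k_smi(x, y, k=2, num_projections=64))  # k=1 recovers an SMI estimate
```

In this sketch the two sources of error the paper quantifies are visible directly: the MC error from averaging over `num_projections` random subspaces, and the per-projection estimation error, here from the Gaussian plug-in estimator applied to `5000` samples (replaced by a neural MI estimator in the paper's end-to-end construction).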
