Paper Title

k-Sliced Mutual Information: A Quantitative Study of Scalability with Dimension

Authors

Ziv Goldfeld, Kristjan Greenewald, Theshani Nuradha, Galen Reeves

Abstract

Sliced mutual information (SMI) is defined as an average of mutual information (MI) terms between one-dimensional random projections of the random variables. It serves as a surrogate measure of dependence to classic MI that preserves many of its properties but is more scalable to high dimensions. However, a quantitative characterization of how SMI itself and estimation rates thereof depend on the ambient dimension, which is crucial to the understanding of scalability, remains obscure. This work provides a multifaceted account of the dependence of SMI on dimension, under a broader framework termed $k$-SMI, which considers projections to $k$-dimensional subspaces. Using a new result on the continuity of differential entropy in the 2-Wasserstein metric, we derive sharp bounds on the error of Monte Carlo (MC)-based estimates of $k$-SMI, with explicit dependence on $k$ and the ambient dimension, revealing their interplay with the number of samples. We then combine the MC integrator with the neural estimation framework to provide an end-to-end $k$-SMI estimator, for which optimal convergence rates are established. We also explore asymptotics of the population $k$-SMI as dimension grows, providing Gaussian approximation results with a residual that decays under appropriate moment bounds. All our results trivially apply to SMI by setting $k=1$. Our theory is validated with numerical experiments and is applied to sliced InfoGAN, which altogether provide a comprehensive quantitative account of the scalability question of $k$-SMI, including SMI as a special case when $k=1$.
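To make the MC integration step concrete: $k$-SMI replaces the one-dimensional projections of SMI with $k$-dimensional ones, i.e., it averages $I(\mathrm{A}^\top X; \mathrm{B}^\top Y)$ over projection matrices $\mathrm{A}$, $\mathrm{B}$ drawn uniformly from the Stiefel manifolds of $k$-frames in the two ambient spaces, and the MC estimator averages MI estimates over finitely many such draws. The sketch below is a minimal illustration of this scheme, assuming a Gaussian plug-in MI estimate for each projected pair rather than the neural estimator analyzed in the paper; the function names (`random_stiefel`, `mc_k_smi`) and default parameters are hypothetical, not taken from the authors' code.

```python
import numpy as np

def random_stiefel(d, k, rng):
    """Random d x k matrix with orthonormal columns, from the QR
    decomposition of a Gaussian matrix (uniform on the Stiefel
    manifold up to column signs, which MI is invariant to)."""
    q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    return q

def gaussian_mi(u, v):
    """Plug-in MI estimate assuming (u, v) is jointly Gaussian:
    I = 0.5 * (log det C_u + log det C_v - log det C_uv)."""
    ku = u.shape[1]
    c = np.cov(np.hstack([u, v]), rowvar=False)
    logdet = lambda m: np.linalg.slogdet(m)[1]
    return 0.5 * (logdet(c[:ku, :ku]) + logdet(c[ku:, ku:]) - logdet(c))

def mc_k_smi(x, y, k=1, num_projections=128, seed=0):
    """Monte Carlo estimate of k-SMI: average MI estimates between
    projections A^T X and B^T Y over random draws of (A, B)."""
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(num_projections):
        a = random_stiefel(x.shape[1], k, rng)
        b = random_stiefel(y.shape[1], k, rng)
        vals.append(gaussian_mi(x @ a, y @ b))
    return float(np.mean(vals))

# Toy check on correlated 10-dimensional Gaussian data.
rng = np.random.default_rng(1)
x = rng.standard_normal((5000, 10))
y = x + 0.5 * rng.standard_normal((5000, 10))
print(mc_k_smi(x, y, k=2, num_projections=64))  # k=1 recovers an SMI estimate
```

In this sketch the two sources of error the paper quantifies are visible directly: the MC error from averaging over `num_projections` random subspaces, and the per-projection estimation error, here from the Gaussian plug-in estimator applied to `5000` samples (replaced by a neural MI estimator in the paper's end-to-end construction).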
