Paper Title
Latent Transformations for Discrete-Data Normalising Flows
Paper Authors
Abstract
Normalising flows (NFs) for discrete data are challenging because parameterising bijective transformations of discrete variables requires predicting discrete/integer parameters. Having a neural network architecture predict discrete parameters requires a non-differentiable activation function (e.g., the step function), which precludes gradient-based learning. To circumvent this non-differentiability, previous work has employed biased proxy gradients, such as the straight-through estimator. We present an unbiased alternative where, rather than deterministically parameterising one transformation, we predict a distribution over latent transformations. With stochastic transformations, the marginal likelihood of the data is differentiable and gradient-based learning is possible via score function estimation. To test the viability of discrete-data NFs, we investigate performance on binary MNIST. We observe great challenges with both deterministic proxy gradients and unbiased score function estimation. Whereas the former often fails to learn even a shallow transformation, the variance of the latter could not be sufficiently controlled to admit deeper NFs.
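To make the score-function idea concrete, here is a minimal illustrative sketch (not the paper's model): a single latent binary variable `b ~ Bernoulli(theta)` stands in for a stochastic transformation, and the gradient of an expectation under it is estimated with the score-function (REINFORCE) identity, ∇θ E[f(b)] = E[f(b) ∇θ log qθ(b)]. The names `theta` and `f` are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def score_function_grad(theta, f, n_samples=10_000):
    """Unbiased score-function (REINFORCE) estimate of d/dtheta E_{b~Bern(theta)}[f(b)].

    Illustrative only: `b` plays the role of a latent transformation choice.
    """
    b = (rng.random(n_samples) < theta).astype(float)  # b ~ Bernoulli(theta)
    # Score of a Bernoulli: d/dtheta log q(b) = (b - theta) / (theta * (1 - theta))
    score = (b - theta) / (theta * (1.0 - theta))
    return np.mean(f(b) * score)

# Sanity check against the closed form: E[f(b)] = theta*f(1) + (1-theta)*f(0),
# so the exact gradient is f(1) - f(0) = 3 for the hypothetical f below.
f = lambda b: 3.0 * b + 1.0
est = score_function_grad(0.3, f)
```

Even in this one-variable toy, the per-sample terms `f(b) * score` vary widely, which hints at the variance problem the abstract reports when the estimator is applied across many stacked stochastic layers.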