Paper Title
Amortized variance reduction for doubly stochastic objectives
Paper Authors
Paper Abstract
Approximate inference in complex probabilistic models such as deep Gaussian processes requires the optimisation of doubly stochastic objective functions. These objectives incorporate randomness both from mini-batch subsampling of the data and from Monte Carlo estimation of expectations. If the gradient variance is high, the stochastic optimisation problem becomes difficult with a slow rate of convergence. Control variates can be used to reduce the variance, but past approaches do not take into account how mini-batch stochasticity affects sampling stochasticity, resulting in sub-optimal variance reduction. We propose a new approach in which we use a recognition network to cheaply approximate the optimal control variate for each mini-batch, with no additional model gradient computations. We illustrate the properties of this proposal and test its performance on logistic regression and deep Gaussian processes.
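As a rough, self-contained illustration of the approach the abstract describes, the Python sketch below pairs a toy doubly stochastic objective (mini-batch logistic loss evaluated at a reparameterised Gaussian sample of the parameters) with a per-mini-batch control variate whose coefficients are predicted by a tiny recognition network. Everything here is an assumption for illustration, not the paper's actual construction: the model, the choice of the noise eps itself as a zero-mean control variate, the batch-mean summary fed to the network, and all names (recognition_net, mc_grad, etc.) are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def recognition_net(summary, W, b):
    # Tiny one-layer recognition network (hypothetical architecture):
    # maps a cheap mini-batch summary to per-parameter coefficients a(batch).
    return np.tanh(summary @ W + b)

def mc_grad(theta, X, y, eps):
    # One-sample Monte Carlo gradient of a toy doubly stochastic objective:
    # logistic loss evaluated at the reparameterised sample theta + eps.
    z = theta + eps
    p = 1.0 / (1.0 + np.exp(-(X @ z)))
    return X.T @ (p - y) / len(y)

# Synthetic data (illustrative only).
N, D, B = 1024, 5, 32
X = rng.normal(size=(N, D))
y = (X @ rng.normal(size=D) > 0).astype(float)

theta = np.zeros(D)
W, b = 0.01 * rng.normal(size=(D, D)), np.zeros(D)  # recognition-net weights
lr, lr_rec = 0.1, 0.01

for step in range(500):
    idx = rng.choice(N, size=B, replace=False)   # mini-batch randomness
    Xb, yb = X[idx], y[idx]
    eps = rng.normal(size=D)                     # Monte Carlo randomness

    g = mc_grad(theta, Xb, yb, eps)

    # The coefficients depend only on the batch, so subtracting a * eps
    # (with E[eps] = 0 and eps independent of the batch) keeps the
    # gradient estimator unbiased for any network output.
    summary = Xb.mean(axis=0)                    # cheap batch summary
    a = recognition_net(summary, W, b)
    g_vr = g - a * eps                           # variance-reduced gradient

    theta -= lr * g_vr

    # Train the recognition network to shrink ||g - a * eps||^2, whose
    # minimiser in expectation is the per-coordinate optimal coefficient
    # Cov(g, eps) / Var(eps). This reuses the already-computed g, so no
    # additional model gradient computations are needed.
    grad_pre = (-g_vr * eps) * (1.0 - a ** 2)    # chain rule through tanh
    W -= lr_rec * np.outer(summary, grad_pre)
    b -= lr_rec * grad_pre
```

Under these assumptions, the estimator stays unbiased for any recognition-network output, while training the network drives the coefficients toward the per-batch optimal control-variate weights at the cost of one extra tiny-network update per step.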