Paper Title

AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC

Paper Authors

Ruqi Zhang, A. Feder Cooper, Christopher De Sa

Paper Abstract

Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is an efficient method for sampling from continuous distributions. It is a faster alternative to HMC: instead of using the whole dataset at each iteration, SGHMC uses only a subsample. This improves performance, but introduces bias that can cause SGHMC to converge to the wrong distribution. One can prevent this using a step size that decays to zero, but such a step size schedule can drastically slow down convergence. To address this tension, we propose a novel second-order SG-MCMC algorithm---AMAGOLD---that infrequently uses Metropolis-Hastings (M-H) corrections to remove bias. The infrequency of corrections amortizes their cost. We prove AMAGOLD converges to the target distribution with a fixed, rather than a diminishing, step size, and that its convergence rate is at most a constant factor slower than a full-batch baseline. We empirically demonstrate AMAGOLD's effectiveness on synthetic distributions, Bayesian logistic regression, and Bayesian neural networks.
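To make the amortization idea in the abstract concrete, below is a minimal Python sketch of a second-order SG-MCMC loop that runs cheap minibatch-gradient leapfrog steps and applies a Metropolis-Hastings correction only once per trajectory, so the full-batch energy evaluation is paid infrequently. This is an illustrative sketch under assumed interfaces (`grad_minibatch`, `energy_fullbatch` are hypothetical user-supplied callables), not the exact AMAGOLD update rule from the paper, which uses a specially constructed reversible kernel and accumulates energy differences along the trajectory.

```python
import numpy as np

def amortized_sgmcmc_sketch(theta0, grad_minibatch, energy_fullbatch,
                            step_size=1e-3, traj_len=50, n_outer=200,
                            rng=None):
    """Sketch of SG-MCMC with an amortized M-H correction.

    grad_minibatch(theta)   -> stochastic estimate of grad U(theta) (minibatch)
    energy_fullbatch(theta) -> exact potential energy U(theta) (full data)
    The M-H step is applied once per `traj_len` inner steps, so its O(N)
    cost is amortized over the whole trajectory.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.array(theta0, dtype=float)
    samples = []
    for _ in range(n_outer):
        # Resample momentum at the start of each amortized trajectory.
        p = rng.standard_normal(theta.shape)
        theta_prop, p_prop = theta.copy(), p.copy()
        # Inner loop: cheap leapfrog-style updates using minibatch gradients only.
        for _ in range(traj_len):
            p_prop -= 0.5 * step_size * grad_minibatch(theta_prop)
            theta_prop += step_size * p_prop
            p_prop -= 0.5 * step_size * grad_minibatch(theta_prop)
        # Infrequent M-H correction using the full-batch energy,
        # removing the bias introduced by the stochastic gradients.
        h_old = energy_fullbatch(theta) + 0.5 * np.dot(p, p)
        h_new = energy_fullbatch(theta_prop) + 0.5 * np.dot(p_prop, p_prop)
        if np.log(rng.uniform()) < h_old - h_new:
            theta = theta_prop  # accept; otherwise keep the old state
        samples.append(theta.copy())
    return np.array(samples)
```

As a quick sanity check, one could sample a standard Gaussian by passing `energy_fullbatch = lambda t: 0.5 * np.dot(t, t)` and a noisy version of its gradient as `grad_minibatch`; with a fixed step size the periodic correction keeps the sampler targeting the right distribution, which is the behavior the abstract attributes to AMAGOLD.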
