Paper Title
Improved Stein Variational Gradient Descent with Importance Weights
Paper Authors
Paper Abstract
Stein Variational Gradient Descent (SVGD) is a popular sampling algorithm used in various machine learning tasks. It is well known that SVGD arises from a discretization of the kernelized gradient flow of the Kullback-Leibler divergence $D_{KL}\left(\cdot\mid\pi\right)$, where $\pi$ is the target distribution. In this work, we propose to enhance SVGD via the introduction of importance weights, which leads to a new method for which we coin the name $\beta$-SVGD. In the continuous time and infinite particles regime, the time for this flow to converge to the equilibrium distribution $\pi$, quantified by the Stein Fisher information, depends on $\rho_0$ and $\pi$ very weakly. This is very different from the kernelized gradient flow of the Kullback-Leibler divergence, whose time complexity depends on $D_{KL}\left(\rho_0\mid\pi\right)$. Under certain assumptions, we provide a descent lemma for the population limit $\beta$-SVGD, which covers the descent lemma for the population limit SVGD when $\beta\to 0$. We also illustrate the advantages of $\beta$-SVGD over SVGD by experiments.
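For context, the baseline the abstract builds on is the standard SVGD particle update. Below is a minimal NumPy sketch of one SVGD step, assuming an RBF kernel with a fixed bandwidth h and an illustrative step size; neither choice is taken from the paper. The proposed $\beta$-SVGD additionally attaches importance weights to the particles; its precise weighting scheme is defined in the paper and is not reproduced here.

```python
import numpy as np

def svgd_step(X, score, step_size=0.1, h=1.0):
    """One standard SVGD update for a particle set X of shape (n, d).

    score(X) must return the target score grad log pi evaluated at each
    particle, also of shape (n, d). An RBF kernel with bandwidth h is used.
    """
    n = X.shape[0]
    diffs = X[:, None, :] - X[None, :, :]                     # x_i - x_j, shape (n, n, d)
    K = np.exp(-np.sum(diffs ** 2, axis=-1) / (2 * h ** 2))   # K[i, j] = k(x_i, x_j)
    # phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad log pi(x_j) + grad_{x_j} k(x_j, x_i) ]
    attraction = K @ score(X)                                  # kernel-smoothed score term
    repulsion = (K.sum(axis=1, keepdims=True) * X - K @ X) / h ** 2  # repulsive kernel-gradient term
    return X + step_size * (attraction + repulsion) / n

# Example: sample from a standard Gaussian, whose score is grad log pi(x) = -x,
# starting from a deliberately poor initialization rho_0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) * 3.0 + 5.0
for _ in range(500):
    X = svgd_step(X, lambda x: -x, step_size=0.1)
```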