Paper Title


Clarifying MCMC-based training of modern EBMs: Contrastive Divergence versus Maximum Likelihood

Authors

Léo Gagnon, Guillaume Lajoie

Abstract


The Energy-Based Model (EBM) framework is a very general approach to generative modeling that tries to learn and exploit probability distributions defined only through unnormalized scores. It has risen in popularity recently thanks to the impressive results obtained in image generation by parameterizing the distribution with Convolutional Neural Networks (CNNs). However, the motivation and theoretical foundations behind modern EBMs are often absent from recent papers, and this sometimes results in confusion. In particular, the theoretical justifications behind the popular MCMC-based learning algorithm Contrastive Divergence (CD) are often glossed over, and we find that this leads to theoretical errors in recent influential papers (Du & Mordatch, 2019; Du et al., 2020). After offering a first-principles introduction to MCMC-based training, we argue that the learning algorithm they use cannot in fact be described as CD, and we reinterpret their methods in light of a new interpretation. Finally, we discuss the implications of our new interpretation and provide some illustrative experiments.
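To make the MCMC-based training the abstract refers to concrete: the maximum-likelihood gradient of an EBM is the difference between the expected energy gradient under the data and under the model, and CD-k approximates the intractable model expectation with negative samples obtained by running only k MCMC steps initialized *at the data*. Below is a minimal illustrative sketch of a CD-k gradient estimate for a toy one-parameter quadratic energy E(x; θ) = θx², using unadjusted Langevin dynamics as the sampler. The toy energy, step sizes, and function names are all illustrative assumptions, not the setup used in the paper or in Du & Mordatch (2019).

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_x_energy(x, theta):
    # dE/dx for the toy energy E(x; theta) = theta * x**2,
    # i.e. p(x) ∝ exp(-theta * x**2), a zero-mean Gaussian.
    return 2.0 * theta * x

def langevin_step(x, theta, step=0.1):
    # One unadjusted Langevin update: move downhill in energy, add noise.
    noise = rng.standard_normal(x.shape)
    return x - step * grad_x_energy(x, theta) + np.sqrt(2.0 * step) * noise

def cd_k_gradient(data, theta, k=1):
    # CD-k: negative samples start FROM the data and run only k MCMC steps
    # (this short-run initialization is what distinguishes CD from running
    # the chain to equilibrium, as full maximum likelihood would require).
    x_neg = data.copy()
    for _ in range(k):
        x_neg = langevin_step(x_neg, theta)
    # dE/dtheta = x**2; ascent direction on the log-likelihood is
    # -(data term) + (negative-sample term).
    return -(data**2).mean() + (x_neg**2).mean()

# Data drawn from the model with theta* = 0.5, i.e. N(0, 1):
data = rng.standard_normal(1000)
g = cd_k_gradient(data, theta=0.5, k=10)  # near zero at the optimum
```

If `theta` is too small (the model distribution is too spread out), the negative samples drift away from the data and the estimated gradient is positive, pushing `theta` back up; at `theta = theta*` the chain is near stationarity and the estimate is close to zero, up to sampling noise and Langevin discretization bias.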
