Title


Training and Inference on Any-Order Autoregressive Models the Right Way

Authors

Andy Shih, Dorsa Sadigh, Stefano Ermon

Abstract


Conditional inference on arbitrary subsets of variables is a core problem in probabilistic inference with important applications such as masked language modeling and image inpainting. In recent years, the family of Any-Order Autoregressive Models (AO-ARMs) -- closely related to popular models such as BERT and XLNet -- has shown breakthrough performance in arbitrary conditional tasks across a sweeping range of domains. But, in spite of their success, in this paper we identify significant improvements to be made to previous formulations of AO-ARMs. First, we show that AO-ARMs suffer from redundancy in their probabilistic model, i.e., they define the same distribution in multiple different ways. We alleviate this redundancy by training on a smaller set of univariate conditionals that still maintains support for efficient arbitrary conditional inference. Second, we upweight the training loss for univariate conditionals that are evaluated more frequently during inference. Our method leads to improved performance with no compromises on tractability, giving state-of-the-art likelihoods in arbitrary conditional modeling on text (Text8), image (CIFAR10, ImageNet32), and continuous tabular data domains.
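The training objective described in the abstract — sample a random variable ordering, mask a suffix of it, and fit the univariate conditionals of the remaining variables, optionally upweighting conditionals that inference evaluates more often — can be sketched as follows. This is a minimal illustrative estimator in the style of standard AO-ARM training, not the paper's actual implementation; `predict_proba` and `upweight` are hypothetical placeholder interfaces standing in for a real model and for the paper's reweighting scheme.

```python
import math
import random

def ao_arm_loss(x, predict_proba, rng, upweight=None):
    """One-sample estimate of an AO-ARM training loss on a binary vector.

    x: a single data point, a list of 0/1 values.
    predict_proba: fn(masked_x, i) -> P(x[i] = 1 | observed entries),
        where masked_x uses None for masked positions.
        (hypothetical interface, for illustration only)
    upweight: optional fn(i, num_observed) -> weight, sketching the idea
        of upweighting conditionals evaluated more frequently at
        inference time (default: uniform weight 1).
    """
    D = len(x)
    order = list(range(D))
    rng.shuffle(order)                     # random ordering sigma
    t = rng.randrange(D)                   # random step along the ordering
    observed = set(order[:t])
    masked_x = [x[i] if i in observed else None for i in range(D)]

    # Average the weighted NLL over still-masked positions, scaled by D
    # so the estimator targets a bound on the full joint log-likelihood.
    masked = [i for i in range(D) if i not in observed]
    total = 0.0
    for i in masked:
        p1 = predict_proba(masked_x, i)
        p = p1 if x[i] == 1 else 1.0 - p1
        w = upweight(i, t) if upweight else 1.0
        total += -w * math.log(max(p, 1e-12))
    return D * total / len(masked)

# Usage: with a dummy model that always predicts 0.5, the loss is
# D * log(2) nats, regardless of the sampled ordering and step.
loss = ao_arm_loss([1, 0, 1], lambda mx, i: 0.5, random.Random(0))
```

In a real system, `predict_proba` would be a neural network conditioned on the mask, and `upweight` would encode the inference-time evaluation frequencies that the paper proposes matching during training.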
