Paper Title
On Uncertainty, Tempering, and Data Augmentation in Bayesian Classification
Paper Authors
Paper Abstract
Aleatoric uncertainty captures the inherent randomness of the data, such as measurement noise. In Bayesian regression, we often use a Gaussian observation model, where we control the level of aleatoric uncertainty with a noise variance parameter. By contrast, for Bayesian classification we use a categorical distribution with no mechanism to represent our beliefs about aleatoric uncertainty. Our work shows that explicitly accounting for aleatoric uncertainty significantly improves the performance of Bayesian neural networks. We note that many standard benchmarks, such as CIFAR, have essentially no aleatoric uncertainty. Moreover, we show data augmentation in approximate inference has the effect of softening the likelihood, leading to underconfidence and profoundly misrepresenting our honest beliefs about aleatoric uncertainty. Accordingly, we find that a cold posterior, tempered by a power greater than one, often more honestly reflects our beliefs about aleatoric uncertainty than no tempering -- providing an explicit link between data augmentation and cold posteriors. We show that we can match or exceed the performance of posterior tempering by using a Dirichlet observation model, where we explicitly control the level of aleatoric uncertainty, without any need for tempering.
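A minimal sketch of the two observation models discussed above, assuming the standard cold-posterior parameterization (the exact parameterizations here are assumptions, not necessarily the paper's own):

\[
p_T(\theta \mid \mathcal{D}) \;\propto\; p(\mathcal{D} \mid \theta)^{1/T}\, p(\theta),
\qquad 1/T > 1 \ \text{(equivalently } T < 1\text{, a cold posterior)},
\]

where raising the likelihood to a power greater than one sharpens it, counteracting the likelihood-softening effect of data augmentation described in the abstract. One common way to realize a Dirichlet observation model with an explicit aleatoric-noise knob (again an assumed construction for illustration) is to replace the one-hot label for class \(y\) with Dirichlet concentration parameters

\[
\alpha_c \;=\; \mathbb{1}[y = c] + \alpha_\epsilon, \qquad \alpha_\epsilon > 0,
\]

so that \(\alpha_\epsilon \to 0\) encodes a near-noiseless label model (appropriate for benchmarks like CIFAR with essentially no aleatoric uncertainty), while larger \(\alpha_\epsilon\) encodes more label noise, with no tempering required.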