Paper Title
An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their Asymptotic Overconfidence
Paper Authors
Paper Abstract
A Bayesian treatment can mitigate overconfidence in ReLU nets around the training data. But far away from them, ReLU Bayesian neural networks (BNNs) can still underestimate uncertainty and thus be asymptotically overconfident. This issue arises since the output variance of a BNN with finitely many features is quadratic in the distance from the data region. Meanwhile, Bayesian linear models with ReLU features converge, in the infinite-width limit, to a particular Gaussian process (GP) with a variance that grows cubically so that no asymptotic overconfidence can occur. While this may seem of mostly theoretical interest, in this work, we show that it can be used in practice to the benefit of BNNs. We extend finite ReLU BNNs with infinite ReLU features via the GP and show that the resulting model is asymptotically maximally uncertain far away from the data while the BNNs' predictive power is unaffected near the data. Although the resulting model approximates a full GP posterior, thanks to its structure, it can be applied post-hoc to any pre-trained ReLU BNN at a low cost.
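
The quadratic-vs-cubic variance growth mentioned in the abstract can be made concrete with a small numerical sketch. The following toy 1-D example is only illustrative and is not the paper's exact construction: it assumes ReLU features of the form max(0, x - c) with "kink" locations c drawn from the data region, a standard-normal prior on the linear weights, and the corresponding infinite-feature GP kernel k(x, x') = ∫ max(0, x - c) max(0, x' - c) dc, whose diagonal grows as x^3 / 3 for x ≥ 0.

import numpy as np

# Toy 1-D illustration of the variance-growth claim (a hedged sketch,
# not the paper's exact construction).

rng = np.random.default_rng(0)

# --- Finite ReLU features: a Bayesian linear model with weights w ~ N(0, I)
#     on top of phi_i(x) = max(0, x - c_i) for M fixed kink locations c_i.
M = 50
kinks = rng.uniform(-1.0, 1.0, size=M)          # kinks inside the "data region"

def phi(x):
    # ReLU feature map, shape (len(x), M)
    return np.maximum(0.0, x[:, None] - kinks[None, :])

def finite_prior_var(x):
    # Prior predictive variance of the finite-feature model: ||phi(x)||^2
    return np.sum(phi(x) ** 2, axis=1)

# --- Infinite ReLU features: integrating the kink location out gives a GP
#     whose prior variance on x >= 0 is k(x, x) = x^3 / 3 (cubic growth).
def gp_prior_var(x):
    return np.maximum(x, 0.0) ** 3 / 3.0

x = np.array([2.0, 4.0, 8.0, 16.0])             # move away from the data region
print(finite_prior_var(x))   # grows roughly quadratically: each feature is ~linear in x
print(gp_prior_var(x))       # grows cubically in x

Far from the kinks, each finite feature is approximately linear in x, so the summed squared features (and hence the prior variance) grow quadratically, whereas the infinite-feature GP variance grows cubically; this faster growth is what lets the GP-extended model remain maximally uncertain asymptotically.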