Paper Title
Wide Neural Networks with Bottlenecks are Deep Gaussian Processes
Paper Authors
Paper Abstract
There has recently been much work on the "wide limit" of neural networks, where Bayesian neural networks (BNNs) are shown to converge to a Gaussian process (GP) as all hidden layers are sent to infinite width. However, these results do not apply to architectures that require one or more of the hidden layers to remain narrow. In this paper, we consider the wide limit of BNNs in which some hidden layers, called "bottlenecks", are held at finite width. The result is a composition of GPs that we term a "bottleneck neural network Gaussian process" (bottleneck NNGP). Although the result is intuitive, the subtlety of the proof lies in showing that the wide limit of a composition of networks is in fact the composition of the limiting GPs. We also theoretically analyze a single-bottleneck NNGP, finding that the bottleneck induces dependence between the outputs of a multi-output network that persists even at extreme post-bottleneck depths, and prevents the network's kernel from losing discriminative power at those depths.
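To make the construction concrete, below is a minimal NumPy sketch of sampling from a single-bottleneck NNGP as the composition of two limiting GPs. It assumes ReLU activations, so each wide block's limiting kernel follows the order-1 arc-cosine recursion of Cho & Saul; the depths, bottleneck width `B`, and variance hyperparameters are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu_nngp_step(K, sigma_w=1.0, sigma_b=0.1):
    """One step of the NNGP kernel recursion for a wide ReLU layer
    (Cho & Saul's order-1 arc-cosine expectation)."""
    d = np.sqrt(np.diag(K))
    cos = np.clip(K / np.outer(d, d), -1.0, 1.0)
    theta = np.arccos(cos)
    e_phi = np.outer(d, d) * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)
    return sigma_b**2 + sigma_w**2 * e_phi

def nngp_kernel(X, depth, sigma_w=1.0, sigma_b=0.1):
    """Limiting GP kernel of a `depth`-layer fully wide ReLU block on inputs X."""
    K = sigma_b**2 + sigma_w**2 * (X @ X.T) / X.shape[1]
    for _ in range(depth):
        K = relu_nngp_step(K, sigma_w, sigma_b)
    return K

# Bottleneck NNGP sample: two wide blocks composed through a
# finite-width bottleneck of B hidden units (all sizes illustrative).
n, d, B = 50, 3, 2
X = rng.standard_normal((n, d))
jitter = 1e-8 * np.eye(n)

K1 = nngp_kernel(X, depth=2)            # kernel of the pre-bottleneck wide block
L = np.linalg.cholesky(K1 + jitter)
Z = L @ rng.standard_normal((n, B))     # B i.i.d. GP draws = bottleneck activations

K2 = nngp_kernel(Z, depth=2)            # post-bottleneck kernel, evaluated on Z
F = np.linalg.cholesky(K2 + jitter) @ rng.standard_normal((n, 2))  # two network outputs
```

Each of the `B` bottleneck units is an independent draw from the first limiting GP, and conditioned on those draws the outputs are GPs whose kernel is evaluated on the bottleneck activations: a composition of GPs, as described above. The two columns of `F` are independent given `Z` but marginally dependent because they share the same finite bottleneck, which is the kind of output dependence the abstract refers to.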