Paper Title
Non-Vacuous Generalisation Bounds for Shallow Neural Networks
Paper Authors
Paper Abstract
We focus on a specific class of shallow neural networks with a single hidden layer, namely those with $L_2$-normalised data and either a sigmoid-shaped Gaussian error function ("erf") activation or a Gaussian Error Linear Unit (GELU) activation. For these networks, we derive new generalisation bounds through PAC-Bayesian theory; unlike most existing such bounds, they apply to neural networks with deterministic rather than randomised parameters. Our bounds are empirically non-vacuous when the network is trained with vanilla stochastic gradient descent on MNIST and Fashion-MNIST.
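To make the setting concrete, here is a minimal sketch of the network class the abstract describes: inputs projected onto the unit $L_2$ sphere feeding a single hidden layer with either an erf or a GELU activation. The class name, layer widths, bias-free parameterisation, and Gaussian initialisation scale are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from scipy.special import erf

def l2_normalise(X):
    # Project each row of X onto the unit L2 sphere
    # (the data normalisation the abstract assumes).
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def gelu(z):
    # GELU(z) = z * Phi(z), with the Gaussian CDF Phi written via erf.
    return 0.5 * z * (1.0 + erf(z / np.sqrt(2.0)))

class ShallowNet:
    """Hypothetical one-hidden-layer network with an erf or GELU activation.

    Shapes, initialisation, and the absence of biases are assumptions
    made for illustration, not specifics from the paper.
    """
    def __init__(self, d_in, d_hidden, d_out, activation="erf", seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=1.0 / np.sqrt(d_in), size=(d_in, d_hidden))
        self.W2 = rng.normal(scale=1.0 / np.sqrt(d_hidden), size=(d_hidden, d_out))
        self.act = erf if activation == "erf" else gelu

    def forward(self, X):
        # X: (n_samples, d_in) -> (n_samples, d_out) class scores.
        return self.act(l2_normalise(X) @ self.W1) @ self.W2

# Usage, e.g. on flattened 28x28 MNIST images with 10 classes:
# scores = ShallowNet(784, 100, 10, activation="gelu").forward(X)
```

Both activations are smooth and bounded in slope, which is what makes this class amenable to the kind of PAC-Bayesian analysis the abstract refers to; the GELU is expressed through erf so the two variants differ only in the elementwise nonlinearity.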