Paper Title


Non-Vacuous Generalisation Bounds for Shallow Neural Networks

Paper Authors

Biggs, Felix, Guedj, Benjamin

Paper Abstract


We focus on a specific class of shallow neural networks with a single hidden layer, namely those with $L_2$-normalised data and either a sigmoid-shaped Gaussian error function ("erf") activation or a Gaussian Error Linear Unit (GELU) activation. For these networks, we derive new generalisation bounds through the PAC-Bayesian theory; unlike most existing such bounds they apply to neural networks with deterministic rather than randomised parameters. Our bounds are empirically non-vacuous when the network is trained with vanilla stochastic gradient descent on MNIST and Fashion-MNIST.
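The architecture the abstract describes, a single hidden layer acting on $L_2$-normalised inputs with an erf or GELU activation, can be sketched as follows. This is an illustrative NumPy sketch of that class of networks, not the authors' implementation; the function names, the presence of bias terms, and the weight shapes are assumptions for the example.

```python
import numpy as np
from math import erf, sqrt

_erf = np.vectorize(erf)  # elementwise Gaussian error function


def l2_normalise(x):
    # Project each input row onto the unit L2 sphere, matching the paper's
    # assumption of L2-normalised data.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def gelu(z):
    # Gaussian Error Linear Unit: z * Phi(z), where Phi is the standard
    # normal CDF, written via the error function.
    return 0.5 * z * (1.0 + _erf(z / sqrt(2.0)))


def shallow_net(x, W1, b1, W2, b2, activation=gelu):
    # One hidden layer on L2-normalised inputs; swap `activation=_erf`
    # for the sigmoid-shaped erf variant mentioned in the abstract.
    h = activation(l2_normalise(x) @ W1 + b1)
    return h @ W2 + b2
```

The deterministic-parameter aspect of the bounds is a property of the analysis, not of this forward pass; the sketch only pins down the function class the bounds are stated for.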
