Paper Title
Correlation between entropy and generalizability in a neural network
Paper Authors
Paper Abstract
Although neural networks can solve very complex machine-learning problems, the theoretical reason for their generalizability is still not fully understood. Here we use the Wang-Landau Monte Carlo algorithm to calculate the entropy (the logarithm of the volume of a part of parameter space) at a given test accuracy and a given value of the training loss function or training accuracy. Our results show that entropic forces help generalizability. Although our study concerns a very simple application of neural networks (a spiral dataset and a small, fully connected network), our approach should be useful in explaining the generalizability of more complicated neural networks in future work.
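For readers unfamiliar with the method, the sketch below illustrates the kind of Wang-Landau estimate the abstract describes: a random walk in parameter space whose acceptance rule flattens the visit histogram over loss bins, so that the accumulated `log_g` approximates the entropy S(E), the log-volume of parameter space at loss E. This is a minimal illustration under stated assumptions, not the authors' code: the toy data, the tiny four-parameter model, the bin range, and the flatness threshold are all placeholders standing in for the paper's spiral dataset and small fully connected network.

```python
import numpy as np

# Minimal Wang-Landau sketch (illustrative only, not the paper's code).
# "Energy" is a stand-in training loss; the paper instead uses a spiral
# dataset and a small fully connected neural network.

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 2))                 # toy inputs (assumption)
y = (X[:, 0] * X[:, 1] > 0).astype(float)    # toy labels (assumption)

def energy(w):
    """Training loss treated as the Wang-Landau 'energy' of parameters w."""
    logits = np.tanh(X @ w[:2]) * w[2] + w[3]  # tiny 4-parameter model
    p = 1.0 / (1.0 + np.exp(-logits))
    return float(np.mean((p - y) ** 2))

n_bins, e_lo, e_hi = 30, 0.0, 0.5            # loss bins (assumption)
log_g = np.zeros(n_bins)                     # running estimate of ln g(E)
hist = np.zeros(n_bins)                      # visit histogram (flatness check)
log_f = 1.0                                  # modification factor ln f

def bin_of(e):
    b = int((e - e_lo) / (e_hi - e_lo) * n_bins)
    return min(n_bins - 1, max(0, b))

w = rng.normal(size=4)
b = bin_of(energy(w))

for _ in range(60):                          # bounded refinement passes (sketch)
    for _ in range(5000):
        w_new = w + 0.1 * rng.normal(size=4)  # random walk in parameter space
        b_new = bin_of(energy(w_new))
        # Accept with prob min(1, g(E_old)/g(E_new)): flattens the histogram.
        if np.log(rng.random()) < log_g[b] - log_g[b_new]:
            w, b = w_new, b_new
        log_g[b] += log_f                    # update density-of-states estimate
        hist[b] += 1
    visited = hist > 0
    if hist[visited].min() > 0.8 * hist[visited].mean():  # flat enough?
        log_f /= 2.0                         # refine: f -> sqrt(f)
        hist[:] = 0
    if log_f < 1e-3:                         # a production run would go lower
        break

# log_g now approximates S(E) = ln(volume of parameter space at loss E)
# up to an additive constant, matching the abstract's definition of entropy.
print(np.round(log_g - log_g[visited].max(), 2))
```

In the paper's setting, the same bookkeeping would be done with the training loss (or accuracy) of the actual network on the spiral data as the energy coordinate; comparing the resulting S(E) across test-accuracy levels is what lets one ask whether high-generalizability regions occupy more volume, i.e., whether entropic forces favor them.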