Paper Title
Quasi-orthogonality and intrinsic dimensions as measures of learning and generalisation
Paper Authors
Paper Abstract
Finding the best architecture for a learning machine, such as a deep neural network, is a well-known technical and theoretical challenge. Recent work by Mellor et al. (2021) showed that there may exist correlations between the accuracies of trained networks and the values of some easily computable measures defined on randomly initialised networks, which could make it possible to search tens of thousands of neural architectures without training. Mellor et al. used the Hamming distance evaluated over all ReLU neurons as such a measure. Motivated by these findings, we ask whether other, perhaps more principled, measures exist that could serve as determinants of the success of a given neural architecture. In particular, we examine whether the dimensionality and quasi-orthogonality of a neural network's feature space correlate with the network's performance after training. Using the setup of Mellor et al., we show that dimensionality and quasi-orthogonality may jointly serve as discriminants of network performance. In addition to offering new opportunities to accelerate neural architecture search, our findings point to an important relationship between a network's final performance and the properties of its randomly initialised feature space: data dimension and quasi-orthogonality.
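The Hamming-distance measure attributed to Mellor et al. (2021) can be sketched as follows. This is a minimal, illustrative reconstruction rather than their reference implementation: it binarises the ReLU activation pattern of each input in a mini-batch, forms a kernel whose entries are the number of ReLU units minus the pairwise Hamming distance between those binary codes, and scores the untrained network by the log-determinant of that kernel. The toy architecture, batch shape, and function name are assumptions made for illustration.

```python
import torch
import torch.nn as nn

def naswot_style_score(net: nn.Module, batch: torch.Tensor) -> float:
    """Score a randomly initialised network, in the spirit of
    Mellor et al. (2021), from its ReLU activation patterns."""
    codes = []

    def hook(_module, _inp, out):
        # Binary code: 1 where a ReLU unit is active for this input.
        codes.append((out > 0).flatten(1).float())

    handles = [m.register_forward_hook(hook)
               for m in net.modules() if isinstance(m, nn.ReLU)]
    with torch.no_grad():
        net(batch)                          # populate `codes` via hooks
    for h in handles:
        h.remove()

    c = torch.cat(codes, dim=1)             # (batch, total ReLU units)
    n_units = c.shape[1]
    # Pairwise Hamming distance between the binary codes of inputs.
    hamming = c @ (1 - c).T + (1 - c) @ c.T
    kernel = n_units - hamming
    # log|det K|; slogdet is numerically safer than det here.
    return torch.linalg.slogdet(kernel)[1].item()

# Usage with a toy architecture and a random stand-in batch:
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256),
                    nn.ReLU(), nn.Linear(256, 64), nn.ReLU())
print(naswot_style_score(net, torch.randn(64, 3, 32, 32)))
```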
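The two measures proposed in the abstract can likewise be illustrated with simple proxies. The sketch below estimates the effective dimension of a feature matrix via the participation ratio of its covariance spectrum, and quantifies quasi-orthogonality as the mean absolute pairwise cosine similarity between per-sample feature vectors (values near 0 indicate near-orthogonal features). These are common proxies, not necessarily the exact estimators used in the paper; the network and batch are stand-ins.

```python
import torch
import torch.nn as nn

def feature_measures(features: torch.Tensor):
    """Return (effective_dimension, mean_abs_cosine) for a feature
    matrix of shape (n_samples, n_dims) taken from a randomly
    initialised network on a mini-batch of data."""
    # Centre the features before estimating dimensionality.
    X = features - features.mean(dim=0, keepdim=True)

    # Participation ratio of the covariance spectrum: one common
    # proxy for the effective (intrinsic) dimension of the data.
    cov = X.T @ X / (X.shape[0] - 1)
    eig = torch.linalg.eigvalsh(cov).clamp(min=0)
    effective_dimension = eig.sum() ** 2 / (eig ** 2).sum()

    # Quasi-orthogonality proxy: mean absolute off-diagonal cosine
    # similarity between per-sample feature vectors.
    Xn = nn.functional.normalize(features, dim=1)
    G = Xn @ Xn.T
    n = G.shape[0]
    mean_abs_cosine = (G - torch.eye(n)).abs().sum() / (n * (n - 1))

    return effective_dimension.item(), mean_abs_cosine.item()

# Usage on a randomly initialised network (no training involved):
net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512), nn.ReLU())
batch = torch.randn(128, 3, 32, 32)        # stand-in for a data mini-batch
with torch.no_grad():
    feats = net(batch)
print(feature_measures(feats))
```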