Paper title
Correlated Initialization for Correlated Data
Authors
Abstract
In spatial data, nearby points are correlated. This also holds for learnt representations across layers, but not for weights produced by commonly used initialization methods. Our theoretical analysis quantifies the learning behavior of the weights of a single spatial filter, in contrast to a large body of work that discusses statistical properties of weights. It shows that, with uncorrelated initialization, (i) convergence behavior might be poor and (ii) training of (some) parameters is likely subject to slow convergence. Empirical analysis shows that these findings for a single spatial filter extend to networks with many spatial filters. The impact of (correlated) initialization depends strongly on learning rates and L2-regularization.
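To make the distinction between uncorrelated and correlated initialization concrete, the following is a minimal sketch in plain Python. It assumes that spatial correlation is introduced by box-averaging i.i.d. Gaussian noise; this is an illustrative choice and not necessarily the construction used in the paper. The function names (`uncorrelated_filter`, `correlated_filter`, `neighbor_correlation`) and the `radius` parameter are hypothetical.

```python
import random


def uncorrelated_filter(k, rng):
    """Return a k x k filter with i.i.d. standard normal weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(k)]


def correlated_filter(k, rng, radius=1):
    """Return a k x k filter whose neighboring weights are correlated.

    Assumption (not from the paper): each weight is a scaled sum over a
    (2*radius+1)^2 window of i.i.d. noise, so adjacent weights share
    most of their noise terms.
    """
    window = 2 * radius + 1
    noise = uncorrelated_filter(k + 2 * radius, rng)
    # Dividing the sum of window^2 unit-variance terms by `window`
    # keeps the variance of each weight at 1.
    return [
        [
            sum(noise[i + di][j + dj]
                for di in range(window)
                for dj in range(window)) / window
            for j in range(k)
        ]
        for i in range(k)
    ]


def neighbor_correlation(filters):
    """Empirical Pearson correlation of horizontally adjacent weights."""
    xs, ys = [], []
    for f in filters:
        for row in f:
            for left, right in zip(row, row[1:]):
                xs.append(left)
                ys.append(right)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    plain = [uncorrelated_filter(3, rng) for _ in range(2000)]
    smooth = [correlated_filter(3, rng) for _ in range(2000)]
    print("uncorrelated init:", neighbor_correlation(plain))
    print("correlated init:  ", neighbor_correlation(smooth))
```

With `radius=1`, two horizontally adjacent weights share 6 of their 9 noise terms, so their correlation under this construction is 2/3 in expectation, whereas the i.i.d. filter has neighbor correlation near zero.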