Paper Title
Improving Systematic Generalization Through Modularity and Augmentation
Paper Authors
Paper Abstract
Systematic generalization is the ability to combine known parts into novel meaning. It is an important aspect of efficient human learning, but a weakness of neural network learning. In this work, we investigate how two well-known modeling principles, modularity and data augmentation, affect systematic generalization of neural networks in grounded language learning. We analyze how large the vocabulary needs to be to achieve systematic generalization and how similar the augmented data needs to be to the problem at hand. Our findings show that even in the controlled setting of a synthetic benchmark, achieving systematic generalization remains very difficult. After training on an augmented dataset with almost forty times more adverbs than the original problem, a non-modular baseline is not able to systematically generalize to a novel combination of a known verb and adverb. When the task is separated into cognitive processes like perception and navigation, a modular neural network is able to utilize the augmented data and generalize more systematically, achieving exact-match increases of 70% and 40% over the state of the art on two gSCAN tests that have not previously been improved. We hope that this work gives insight into the drivers of systematic generalization, and into what we still need to improve for neural networks to learn more like humans do.