Paper Title
PatchUp: A Feature-Space Block-Level Regularization Technique for Convolutional Neural Networks
Paper Authors
Paper Abstract
Large-capacity deep learning models are often prone to a high generalization gap when trained with a limited amount of labeled training data. A recent class of methods to address this problem uses various ways to construct a new training sample by mixing a pair (or more) of training samples. We propose PatchUp, a hidden state block-level regularization technique for Convolutional Neural Networks (CNNs), which is applied to selected contiguous blocks of feature maps from a random pair of samples. Our approach improves the robustness of CNN models against the manifold intrusion problem that may occur in other state-of-the-art mixing approaches. Moreover, since we mix contiguous blocks of features in the hidden space, which has more dimensions than the input space, we obtain more diverse samples for training along different dimensions. Our experiments on CIFAR10/100, SVHN, Tiny-ImageNet, and ImageNet using ResNet architectures including PreActResNet18/34, WRN-28-10, and ResNet101/152 show that PatchUp improves upon, or equals, the performance of current state-of-the-art regularizers for CNNs. We also show that PatchUp provides better generalization to deformed samples and is more robust against adversarial attacks.
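To make the block-level feature mixing described in the abstract concrete, the following is a minimal PyTorch-style sketch of the general idea: contiguous blocks of a sample's hidden feature maps are swapped with the corresponding blocks of a randomly paired sample in the batch. This is an illustration under assumed conventions, not the authors' implementation; the function name `block_mix` and the hyperparameters `block_size` and `gamma` are hypothetical placeholders for the paper's actual block-selection scheme.

```python
import torch
import torch.nn.functional as F

def block_mix(hidden, block_size=7, gamma=0.05):
    """Sketch of block-level feature mixing on hidden activations.

    hidden:     tensor of shape (B, C, H, W) from an intermediate CNN layer.
    block_size: side length of the contiguous blocks (assumed odd here).
    gamma:      per-location probability of seeding a block (illustrative).
    """
    b, c, h, w = hidden.shape

    # Random pairing: a permutation of the batch gives each sample a partner.
    perm = torch.randperm(b, device=hidden.device)

    # Sample block seeds, then dilate each seed into a contiguous block
    # with a max-pooling pass (similar in spirit to DropBlock-style masks).
    seeds = (torch.rand(b, c, h, w, device=hidden.device) < gamma).float()
    mask = F.max_pool2d(seeds, kernel_size=block_size, stride=1,
                        padding=block_size // 2)

    # Fraction of features kept from the original sample; a quantity like
    # this would also be used to interpolate the pair's labels.
    portion = 1.0 - mask.mean()

    # Replace the masked blocks with the partner sample's blocks.
    mixed = hidden * (1.0 - mask) + hidden[perm] * mask
    return mixed, perm, portion
```

In a training loop, such a function would be applied to the activations of a randomly chosen hidden layer, and the returned `perm` and `portion` would be used to form the corresponding mixed training target for the pair.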