FMix: Enhancing Mixed Sample Data Augmentation

Paper Title

FMix: Enhancing Mixed Sample Data Augmentation

Authors

Ethan Harris, Antonia Marcu, Matthew Painter, Mahesan Niranjan, Adam Prügel-Bennett, Jonathon Hare

Abstract

Mixed Sample Data Augmentation (MSDA) has received increasing attention in recent years, with many successful variants such as MixUp and CutMix. By studying the mutual information between the function learned by a VAE on the original data and on the augmented data, we show that MixUp distorts learned functions in a way that CutMix does not. We further demonstrate this by showing that MixUp acts as a form of adversarial training, increasing robustness to attacks such as DeepFool and Uniform Noise, which produce examples similar to those generated by MixUp. We argue that this distortion prevents models from learning about sample-specific features in the data, aiding generalisation performance. In contrast, we suggest that CutMix works more like a traditional augmentation, improving performance by preventing memorisation without distorting the data distribution. However, we argue that an MSDA which builds on CutMix to include masks of arbitrary shape, rather than just square, could further prevent memorisation whilst preserving the data distribution in the same way. To this end, we propose FMix, an MSDA that uses random binary masks obtained by applying a threshold to low-frequency images sampled from Fourier space. These random masks can take on a wide range of shapes and can be generated for use with one-, two-, and three-dimensional data. FMix improves performance over MixUp and CutMix, without an increase in training time, for a number of models across a range of data sets and problem settings, obtaining a new single-model state-of-the-art result on CIFAR-10 without external data. Finally, we show that a consequence of the difference between interpolating MSDA such as MixUp and masking MSDA such as FMix is that the two can be combined to improve performance even further. Code for all experiments is provided at https://github.com/ecs-vlc/FMix.
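
The mask construction described in the abstract (sample a low-frequency image from Fourier space, threshold it into a binary mask, then use the mask to combine two samples) might be sketched roughly as below. This is a minimal NumPy illustration and not the authors' implementation, which lives in the repository linked above. The function names, the power-law attenuation controlled by `decay_power`, and the mixing proportion `lam` are all assumptions introduced here for illustration.

```python
import numpy as np


def sample_low_freq_image(shape, decay_power=3.0, rng=None):
    """Sample a real 2D image dominated by low frequencies (illustrative)."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = shape
    # Frequency magnitude at every position of the 2D FFT grid.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    freq = np.sqrt(fx ** 2 + fy ** 2)
    # Avoid dividing by zero at the DC component.
    freq[0, 0] = 1.0 / max(h, w)
    # Complex Gaussian noise, attenuated more strongly at high frequencies.
    # The power-law form is an assumption, not taken from the abstract.
    spectrum = rng.standard_normal((h, w)) + 1j * rng.standard_normal((h, w))
    spectrum = spectrum / freq ** decay_power
    # Keep the real part of the inverse transform as a grey-scale image.
    return np.real(np.fft.ifft2(spectrum))


def sample_binary_mask(shape, lam, decay_power=3.0, rng=None):
    """Threshold a low-frequency image so a fraction `lam` of pixels is 1."""
    image = sample_low_freq_image(shape, decay_power, rng)
    flat = image.ravel()
    k = int(round(lam * flat.size))
    mask = np.zeros(flat.size, dtype=np.float32)
    if k > 0:
        # Indices of the k largest pixel values become the "on" region.
        top = np.argsort(flat)[::-1][:k]
        mask[top] = 1.0
    return mask.reshape(shape)


def mix(x1, x2, mask):
    """Masking MSDA: take x1 where mask is 1 and x2 where mask is 0."""
    return mask * x1 + (1.0 - mask) * x2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mask = sample_binary_mask((32, 32), lam=0.25, rng=rng)
    a = np.ones((32, 32), dtype=np.float32)
    b = np.zeros((32, 32), dtype=np.float32)
    mixed = mix(a, b, mask)
    print(mask.shape, int(mask.sum()), float(mixed.mean()))
```

Thresholding by rank rather than by a fixed value is a design choice in this sketch: it makes the fraction of pixels taken from the first sample equal to `lam` by construction, while the low-frequency spectrum keeps the selected region in a few contiguous, irregularly shaped blobs rather than scattered pixels.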
