Title
EAANet: Efficient Attention Augmented Convolutional Networks
Authors
Abstract
Humans can effectively find salient regions in complex scenes. Self-attention mechanisms were introduced into Computer Vision (CV) to achieve this. The Attention Augmented Convolutional Network (AANet) is a hybrid of convolution and self-attention that improves the accuracy of a typical ResNet. However, the computation and memory complexity of self-attention is O(n²) with respect to the number of input tokens. In this project, we propose EAANet: Efficient Attention Augmented Convolutional Networks, which incorporates efficient self-attention mechanisms into a convolution and self-attention hybrid architecture to reduce the model's memory footprint. Our best model shows performance improvements over AANet and ResNet18. We also explore different ways of augmenting a convolutional network with self-attention mechanisms and show that these methods are harder to train than ResNet. Finally, we show that augmenting ResNet with efficient self-attention mechanisms scales better with input size than augmenting it with normal self-attention. Therefore, our EAANet is better suited to working with high-resolution images.
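The complexity claim at the heart of the abstract can be illustrated with a small sketch. The abstract does not specify which efficient self-attention variant EAANet uses; the linear-complexity formulation below (normalizing queries and keys separately and computing the d×d context KᵀV first, so the n×n attention matrix is never materialized) is one common choice, and all function names here are illustrative, not from the paper.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def standard_attention(Q, K, V):
    # O(n^2) time and memory: materializes the full n x n attention matrix.
    A = softmax(Q @ K.T / np.sqrt(Q.shape[-1]), axis=-1)  # n x n
    return A @ V

def efficient_attention(Q, K, V):
    # Illustrative linear-complexity variant: softmax over the feature
    # dimension for Q and over the token dimension for K, then compute
    # the d x d context K^T V first. Time O(n d^2), extra memory O(d^2),
    # so the cost grows linearly with the number of tokens n.
    q = softmax(Q, axis=-1)   # n x d, rows sum to 1
    k = softmax(K, axis=0)    # n x d, columns sum to 1
    context = k.T @ V         # d x d, independent of n
    return q @ context        # n x d

n, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(standard_attention(Q, K, V).shape)   # (8, 4)
print(efficient_attention(Q, K, V).shape)  # (8, 4)
```

The two mechanisms are not numerically equivalent; the point is that for a high-resolution feature map (large n, moderate d), the efficient variant avoids the n×n intermediate that dominates the memory footprint of standard self-attention.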