Paper Title
Multi-Objective Matrix Normalization for Fine-grained Visual Recognition
Paper Authors
Abstract
Bilinear pooling achieves great success in fine-grained visual recognition (FGVC). Recent methods have shown that matrix power normalization can stabilize the second-order information in bilinear features, but some problems, e.g., redundant information and over-fitting, remain to be resolved. In this paper, we propose an efficient Multi-Objective Matrix Normalization (MOMN) method that can simultaneously normalize a bilinear representation in terms of square-root, low-rank, and sparsity. These three regularizers not only stabilize the second-order information, but also compact the bilinear features and promote model generalization. In MOMN, a core challenge is how to jointly optimize three non-smooth regularizers with different convexity properties. To this end, MOMN first formulates them into an augmented Lagrange formula with approximated regularizer constraints. Then, auxiliary variables are introduced to relax the different constraints, allowing each regularizer to be solved alternately. Finally, several updating strategies based on gradient descent are designed to obtain consistent convergence and an efficient implementation. Consequently, MOMN is implemented with only matrix multiplications, which is highly compatible with GPU acceleration, and the normalized bilinear features are both stable and discriminative. Experiments on five public benchmarks for FGVC demonstrate that the proposed MOMN is superior to existing normalization-based methods in terms of both accuracy and efficiency. The code is available at https://github.com/mboboGO/MOMN.
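The abstract notes that MOMN is implemented using only matrix multiplications, which suits GPU execution. As a rough illustration of that idea (not the authors' actual method, which jointly handles three regularizers), the sketch below shows bilinear pooling followed by square-root normalization computed with the Newton–Schulz iteration, a standard matmul-only approximation of the matrix square root; all function names and the iteration count are illustrative choices, not taken from the paper.

```python
import numpy as np

def bilinear_pool(X):
    """Second-order (bilinear) pooling of a flattened feature map.

    X has shape (c, hw): c channels, hw spatial positions.
    Returns a (c, c) covariance-like bilinear representation.
    """
    hw = X.shape[1]
    return X @ X.T / hw

def newton_schulz_sqrt(A, num_iters=10, eps=1e-8):
    """Approximate square root of an SPD matrix A using only
    matrix multiplications (Newton-Schulz iteration)."""
    c = A.shape[0]
    norm = np.trace(A) + eps
    Y = A / norm            # pre-scale so eigenvalues lie in (0, 1]
    Z = np.eye(c)
    I = np.eye(c)
    for _ in range(num_iters):
        T = 0.5 * (3.0 * I - Z @ Y)
        Y = Y @ T
        Z = T @ Z
    return Y * np.sqrt(norm)  # undo the pre-scaling

# Usage: normalize a random second-order representation.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 49))   # 8 channels, 7x7 spatial grid
A = bilinear_pool(X)               # (8, 8) bilinear feature
S = newton_schulz_sqrt(A)          # square-root-normalized feature
print(np.max(np.abs(S @ S - A)))   # small reconstruction error
```

The pre-scaling by the trace is what makes the iteration converge: it shrinks the eigenvalues of `A` into `(0, 1]`, the region where Newton–Schulz is guaranteed to approach the square root. The full MOMN method additionally imposes low-rank and sparsity regularizers via auxiliary variables in an augmented Lagrangian, which this sketch does not attempt.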