Augmented Parallel-Pyramid Net for Attention Guided Pose-Estimation

Paper Title

Augmented Parallel-Pyramid Net for Attention Guided Pose-Estimation

Authors

Hou, Luanxuan; Cao, Jie; Zhao, Yuan; Shen, Haifeng; Meng, Yiping; He, Ran; Ye, Jieping

Abstract

The goal of human pose estimation is to determine the body part or joint locations of each person in an image. This is a challenging problem with wide applications. To address it, this paper proposes an augmented parallel-pyramid net with an attention partial module and differentiable auto data augmentation. Technically, a parallel pyramid structure is proposed to compensate for the loss of information. We adopt a parallel structure design for reverse compensation. Meanwhile, the overall computational complexity does not increase. We further define an Attention Partial Module (APM) operator to extract weighted features from the different-scale feature maps generated by the parallel pyramid structure. Compared with refinement through an upsampling operator, APM can better capture the relationship between channels. Finally, we propose a differentiable auto data augmentation method to further improve estimation accuracy. We define a new pose search space in which the sequences of data augmentations are formulated as a trainable and operational CNN component. Experiments corroborate the effectiveness of our proposed method. Notably, our method achieves top-1 accuracy on the challenging COCO keypoint benchmark and state-of-the-art results on the MPII dataset.
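
The abstract does not say how APM computes its channel weights. As an illustration of the general idea only, the sketch below fuses feature maps of different scales and then reweights the channels with a generic squeeze-and-excitation style gate. It is not the authors' APM: the function names, the nearest-neighbour fusion, the weight matrices `w1` and `w2`, and the assumption that every map's size divides the largest map's size are all hypothetical choices made for this example.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def upsample_nearest(feature_map, factor):
    """Nearest-neighbour upsampling of a (C, H, W) array by an integer factor."""
    return feature_map.repeat(factor, axis=1).repeat(factor, axis=2)


def channel_attention(feature_map, w1, w2):
    """Reweight the channels of a (C, H, W) feature map.

    Assumed (hypothetical) parameter shapes: w1 is (C_reduced, C), w2 is (C, C_reduced).
    Returns the reweighted map and the per-channel weights.
    """
    squeezed = feature_map.mean(axis=(1, 2))   # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)    # ReLU bottleneck -> (C_reduced,)
    weights = sigmoid(w2 @ hidden)             # per-channel gate in (0, 1) -> (C,)
    return feature_map * weights[:, None, None], weights


def fuse_scales(feature_maps, w1, w2):
    """Sum multi-scale maps at the largest resolution, then apply channel attention.

    feature_maps: list of (C, H_i, W_i) arrays, largest first; each H_i is
    assumed to divide the first map's height (same for widths).
    """
    target_h = feature_maps[0].shape[1]
    fused = np.zeros_like(feature_maps[0])
    for fm in feature_maps:
        fused = fused + upsample_nearest(fm, target_h // fm.shape[1])
    return channel_attention(fused, w1, w2)
```

Calling `fuse_scales` with three maps of shapes `(8, 16, 16)`, `(8, 8, 8)` and `(8, 4, 4)` returns an `(8, 16, 16)` array together with eight channel weights. In a trained network `w1` and `w2` would be learned parameters; here they are plain arrays supplied by the caller.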
