Paper Title
Adversarial Semantic Data Augmentation for Human Pose Estimation
Paper Authors
Paper Abstract
Human pose estimation is the task of localizing body keypoints from still images. State-of-the-art methods suffer from insufficient examples of challenging cases such as symmetric appearance, heavy occlusion, and nearby persons. To enlarge the number of challenging cases, previous methods augmented images by cropping and pasting image patches with weak semantics, which leads to unrealistic appearance and limited diversity. We instead propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts at various semantic granularities. Furthermore, we propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict a tailored pasting configuration. Given an off-the-shelf pose estimation network as the discriminator, the generator seeks the most confusing transformation to increase the loss of the discriminator, while the discriminator takes the generated sample as input and learns from it. The whole pipeline is optimized in an adversarial manner. State-of-the-art results are achieved on challenging benchmarks.
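To make the adversarial pipeline concrete, the following is a minimal PyTorch-style sketch of one training iteration, not the authors' implementation. All names (Generator, paste_body_parts, the toy stand-in pose network, tensor shapes) are assumptions for illustration, and the pasting operation is reduced to a placeholder; only the alternating min-max update between the generator and the pose-estimation discriminator follows the abstract.

```python
# Minimal sketch (not the authors' code) of the adversarial augmentation loop.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Predicts a pasting configuration (scale, rotation, dx, dy) per body part."""
    def __init__(self, num_parts=16, cfg_dim=4):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, num_parts * cfg_dim),
        )
        self.num_parts, self.cfg_dim = num_parts, cfg_dim

    def forward(self, img):
        return self.head(img).view(-1, self.num_parts, self.cfg_dim)

def paste_body_parts(img, cfg):
    # Placeholder: a real implementation would warp segmented body parts by cfg
    # and composite them onto img. The zero term only keeps the graph connected.
    return img + 0.0 * cfg.sum()

# Stand-in for an off-the-shelf pose estimator (the discriminator): 17 keypoint heatmaps.
pose_net = nn.Conv2d(3, 17, 3, padding=1)
generator = Generator()
criterion = nn.MSELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(pose_net.parameters(), lr=1e-4)

img = torch.rand(2, 3, 256, 256)          # dummy image batch
heatmap_gt = torch.rand(2, 17, 256, 256)  # dummy ground-truth heatmaps

# Generator step: seek the most confusing pasting configuration (maximize pose loss).
g_loss = -criterion(pose_net(paste_body_parts(img, generator(img))), heatmap_gt)
g_opt.zero_grad(); g_loss.backward(); g_opt.step()

# Discriminator step: the pose network learns from the generated sample (minimize pose loss).
with torch.no_grad():
    aug = paste_body_parts(img, generator(img))
d_loss = criterion(pose_net(aug), heatmap_gt)
d_opt.zero_grad(); d_loss.backward(); d_opt.step()
```

The two updates implement the min-max objective stated above: the generator is rewarded when the augmented sample increases the pose network's loss, and the pose network is then trained on exactly those hard samples.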