Paper Title

Motion Capture from Internet Videos

Authors

Junting Dong, Qing Shuai, Yuanqing Zhang, Xian Liu, Xiaowei Zhou, Hujun Bao

Abstract

Recent advances in image-based human pose estimation make it possible to capture 3D human motion from a single RGB video. However, the inherent depth ambiguity and self-occlusion in a single view prohibit the recovery of as high-quality motion as multi-view reconstruction. While multi-view videos are not common, the videos of a celebrity performing a specific action are usually abundant on the Internet. Even if these videos were recorded at different time instances, they would encode the same motion characteristics of the person. Therefore, we propose to capture human motion by jointly analyzing these Internet videos instead of using single videos separately. However, this new task poses many new challenges that cannot be addressed by existing methods, as the videos are unsynchronized, the camera viewpoints are unknown, the background scenes are different, and the human motions are not exactly the same among videos. To address these challenges, we propose a novel optimization-based framework and experimentally demonstrate its ability to recover much more precise and detailed motion from multiple videos, compared against monocular motion capture methods.
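The abstract only summarizes the approach at a high level. As a rough illustration of the core idea, the sketch below jointly fits a single shared 3D motion to 2D keypoints from several videos, each with its own unknown camera, by minimizing reprojection error. This is a hypothetical toy example, not the paper's actual formulation: the function names, the toy dimensions (J, T, V), and the synthetic observations are all assumptions, and the real method must additionally handle unsynchronized videos, differing backgrounds, and motion variation across videos.

```python
# Minimal sketch (assumed, not the authors' implementation): one shared 3D motion
# is jointly optimized against 2D keypoints from several videos, each with its
# own unknown camera, by minimizing stacked reprojection errors.
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

J, T, V = 15, 10, 3  # joints, frames of the shared motion, number of videos (toy sizes)

def project(points3d, rotvec, trans):
    # Rigidly transform the shared motion into one camera's frame, then
    # drop depth (orthographic projection) to keep the sketch simple.
    R = Rotation.from_rotvec(rotvec).as_matrix()
    cam = points3d @ R.T + trans              # (T, J, 3) in camera coordinates
    return cam[..., :2]

def residuals(params, keypoints2d):
    # Unpack one shared 3D motion plus one 6-DoF camera per video and
    # return the stacked 2D reprojection errors over all videos.
    motion = params[:T * J * 3].reshape(T, J, 3)
    cams = params[T * J * 3:].reshape(V, 6)   # per-video rotation vector (3) + translation (3)
    errs = [(project(motion, cams[v, :3], cams[v, 3:]) - keypoints2d[v]).ravel()
            for v in range(V)]
    return np.concatenate(errs)

# Synthetic 2D "detections" stand in for per-video keypoint estimates.
rng = np.random.default_rng(0)
keypoints2d = rng.normal(size=(V, T, J, 2))

x0 = np.zeros(T * J * 3 + V * 6)              # start from a zero motion and identity cameras
sol = least_squares(residuals, x0, args=(keypoints2d,), max_nfev=50)
print("final reprojection cost:", sol.cost)
```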
