Developing Motion Code Embedding for Action Recognition in Videos

Paper Title

Developing Motion Code Embedding for Action Recognition in Videos

Authors

Maxat Alibayev, David Paulius, Yu Sun

Abstract

In this work, we propose a motion embedding strategy known as motion codes, which are vectorized representations of motions based on a manipulation's salient mechanical attributes. These motion codes provide a robust motion representation, and they are obtained using a hierarchy of features called the motion taxonomy. We developed and trained a deep neural network model that combines visual and semantic features to identify the features found in our motion taxonomy and thereby embed or annotate videos with motion codes. To demonstrate the potential of motion codes as features for machine learning tasks, we integrated the features extracted by the motion embedding model into the current state-of-the-art action recognition model. The resulting model achieved higher accuracy than the baseline model on the verb classification task for egocentric videos from the EPIC-KITCHENS dataset.
