Evaluation of Hidden Markov Models Using Deep CNN Features in Isolated Sign Recognition

Paper Title

Evaluation of Hidden Markov Models Using Deep CNN Features in Isolated Sign Recognition

Authors

Anil Osman Tur, Hacer Yalim Keles

Abstract

Isolated sign recognition from video streams is a challenging problem due to the multi-modal nature of signs, where both local and global hand features and facial gestures need to be attended to simultaneously. This problem has recently been studied widely using deep Convolutional Neural Network (CNN) based features and Long Short-Term Memory (LSTM) based deep sequence models. However, the current literature lacks an empirical analysis of Hidden Markov Models (HMMs) with deep features. In this study, we provide a framework composed of three modules to solve the isolated sign recognition problem using different sequence models. The dimensions of deep features are usually too large to work with HMMs. To solve this problem, we propose two alternative CNN-based architectures as the second module in our framework to reduce deep feature dimensions effectively. After extensive experiments, we show that, using pretrained ResNet50 features and one of our CNN-based dimension reduction models, HMMs can classify isolated signs with 90.15% accuracy on the Montalbano dataset using RGB and skeletal data. This performance is comparable to that of current LSTM-based models. HMMs have fewer parameters and can be trained and run quickly on commodity computers without requiring GPUs. Therefore, our analysis with deep features shows that HMMs, as well as deep sequence models, can be used for the challenging isolated sign recognition problem.
