Paper Title
A Variational Information Bottleneck Based Method to Compress Sequential Networks for Human Action Recognition
Paper Authors
Paper Abstract
In the last few years, compression of deep neural networks has become an important strand of machine learning and computer vision research. When used, for instance, for Human Action Recognition (HAR) from videos, deep models demand substantial computation and storage, making them unsuitable for deployment on edge devices. In this paper, we address this issue and propose a method to effectively compress Recurrent Neural Networks (RNNs), such as Gated Recurrent Units (GRUs) and Long Short-Term Memory units (LSTMs), used for HAR. We use a pruning approach based on Variational Information Bottleneck (VIB) theory to limit the information flow through the sequential cells of RNNs to a small subset. Further, we combine our pruning method with a specific group-lasso regularization technique that significantly improves compression. The proposed techniques reduce model parameters and memory footprint by pruning the latent representations, with little or no loss in validation accuracy, while increasing inference speed several-fold. We perform experiments on three widely used action recognition datasets, viz. UCF11, HMDB51, and UCF101, to validate our approach. Our method achieves over 70 times greater compression than the nearest competitor with comparable accuracy on the UCF11 action recognition task.
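To make the two ingredients of the abstract concrete, below is a minimal PyTorch sketch, not the authors' implementation: a VIB-style stochastic gate that scales each hidden unit of a recurrent cell so that uninformative units can be driven to zero and pruned, combined with a group-lasso penalty over weight groups. All names (VIBGate, CompressedGRUClassifier, group_lasso) and hyperparameters (hidden size, penalty coefficients) are illustrative assumptions; the paper's exact objective and training details may differ.

import torch
import torch.nn as nn

class VIBGate(nn.Module):
    """Stochastic gate over hidden units, in the spirit of VIB pruning.

    Each hidden dimension d gets a learned scale mu_d and noise level
    sigma_d; at train time the gate is z_d = mu_d + sigma_d * eps, and a
    KL-style penalty pushes unneeded dimensions toward zero so they can
    be pruned after training. (Illustrative sketch only.)
    """

    def __init__(self, dim):
        super().__init__()
        self.mu = nn.Parameter(torch.ones(dim))
        self.log_sigma = nn.Parameter(torch.full((dim,), -5.0))

    def forward(self, h):
        if self.training:
            eps = torch.randn_like(self.mu)
            z = self.mu + self.log_sigma.exp() * eps
        else:
            z = self.mu  # deterministic gate at inference
        return h * z

    def kl_penalty(self):
        # Per-dimension information term log(1 + mu^2 / sigma^2): large
        # when a unit carries signal, near zero when effectively pruned.
        alpha = self.mu.pow(2) / self.log_sigma.exp().pow(2)
        return 0.5 * torch.log1p(alpha).sum()

class CompressedGRUClassifier(nn.Module):
    def __init__(self, in_dim, hidden_dim, num_classes):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden_dim, batch_first=True)
        self.gate = VIBGate(hidden_dim)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):
        out, _ = self.gru(x)       # (B, T, H): per-frame hidden states
        h = self.gate(out[:, -1])  # gate the last hidden state
        return self.head(h)

def group_lasso(weight, dim=0):
    # Group-lasso over a weight matrix: sums the L2 norms of groups
    # (e.g., all weights feeding one unit), zeroing whole groups at once.
    return weight.norm(dim=dim).sum()

# Training-loss sketch: task loss + VIB penalty + group-lasso on the
# GRU input-to-hidden weights. Coefficients 1e-3 and 1e-4 are assumed.
model = CompressedGRUClassifier(in_dim=2048, hidden_dim=256, num_classes=11)
x = torch.randn(4, 30, 2048)  # a batch of 30-frame feature sequences
logits = model(x)
labels = torch.randint(0, 11, (4,))
loss = nn.functional.cross_entropy(logits, labels)
loss = loss + 1e-3 * model.gate.kl_penalty()
loss = loss + 1e-4 * group_lasso(model.gru.weight_ih_l0)
loss.backward()

After training, hidden dimensions whose gate scale mu is near zero can be removed from the GRU and the classifier head, shrinking both the parameter count and the memory footprint of the latent representation, which is the compression effect the abstract describes.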