Paper Title
A Passive Similarity based CNN Filter Pruning for Efficient Acoustic Scene Classification
Paper Authors
Paper Abstract
We present a method to develop low-complexity convolutional neural networks (CNNs) for acoustic scene classification (ASC). The large size and high computational complexity of typical CNNs are a bottleneck for their deployment on resource-constrained devices. We propose a passive filter pruning framework in which a few convolutional filters are eliminated from the CNN to yield a compressed CNN. Our hypothesis is that similar filters produce similar responses and give redundant information, allowing such filters to be eliminated from the network. To identify similar filters, a cosine-distance-based greedy algorithm is proposed. A fine-tuning process is then performed to regain much of the performance lost due to filter elimination. To make fine-tuning efficient, we analyze how performance varies with the number of fine-tuning training examples. An experimental evaluation of the proposed framework is performed on the publicly available DCASE 2021 Task 1A baseline network trained for ASC. The proposed method is simple, reducing computations per inference by 27% and the number of parameters by 25%, with less than 1% drop in accuracy.
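To make the core idea concrete, below is a minimal sketch of how pairwise cosine similarity between convolutional filters can be used to flag redundant filters for elimination. This is an illustrative approximation only, not the authors' exact algorithm: the filter tensor layout, the pruning ratio, and the greedy selection rule used here are assumptions for illustration.

```python
# Illustrative sketch: greedily flag convolutional filters that are highly
# cosine-similar to other filters in the same layer, so they can be pruned.
# The 25% pruning ratio and selection rule are assumptions, not the paper's
# exact procedure.
import numpy as np

def redundant_filter_indices(weights: np.ndarray, prune_ratio: float = 0.25):
    """weights: conv kernel of shape (k, k, in_channels, out_filters)."""
    num_filters = weights.shape[-1]
    # Flatten each filter into a vector and L2-normalise it.
    flat = weights.reshape(-1, num_filters).T          # (out_filters, k*k*in_channels)
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-12)
    sim = flat @ flat.T                                 # pairwise cosine similarity
    np.fill_diagonal(sim, -np.inf)                      # ignore self-similarity

    to_prune = []
    budget = int(prune_ratio * num_filters)
    while len(to_prune) < budget:
        # Greedily remove one filter from the most similar remaining pair.
        i, _ = np.unravel_index(np.argmax(sim), sim.shape)
        to_prune.append(i)
        sim[i, :] = -np.inf                              # exclude pruned filter
        sim[:, i] = -np.inf
    return to_prune
```

In practice, the returned indices would be used to drop the corresponding output channels of the layer (and the matching input channels of the following layer), after which the compressed network is fine-tuned to recover most of the lost accuracy, as described in the abstract.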