Deep Feature Learning for Medical Acoustics

Paper Title

Deep Feature Learning for Medical Acoustics

Authors

Alessandro Maria Poirè, Federico Simonetta, Stavros Ntalampiras

Abstract

This paper compares different learnable frontends on medical acoustics tasks. A framework has been implemented to classify human respiratory sounds and heartbeats into two categories: healthy or affected by pathologies. After obtaining two suitable datasets, we classified the sounds using two learnable state-of-the-art frontends, LEAF and nnAudio, plus a non-learnable baseline frontend, Mel-filterbanks. The computed features are then fed into two different CNN models, namely VGG16 and EfficientNet. The frontends are carefully benchmarked in terms of number of parameters, computational resources, and effectiveness. This work demonstrates how integrating learnable frontends into neural audio classification systems may improve performance, especially in the field of medical acoustics. However, such frontends increase the amount of data needed. Consequently, they are useful when the amount of data available for training is large enough to support the feature learning process.
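To make the non-learnable baseline concrete, the following is a minimal NumPy sketch of a fixed log-Mel filterbank frontend that turns a waveform into a 2-D feature map of the kind a CNN would consume. It is an illustration under stated assumptions, not the authors' implementation: the function names, the 4 kHz sample rate, the frame sizes, and the number of Mel bands are all hypothetical choices, and the learnable frontends (LEAF, nnAudio) and the CNN classifiers are not reproduced here.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style Mel scale formula.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Fixed triangular filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    # n_mels + 2 edge points, equally spaced on the Mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        left, center, right = hz_points[i:i + 3]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_frontend(signal, sample_rate=4000, n_fft=256, hop=128, n_mels=32):
    """Return log-Mel features of shape (n_mels, n_frames).

    All default values are illustrative assumptions, not taken from the paper.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[t * hop: t * hop + n_fft] * window for t in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_fft // 2 + 1)
    mel = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    return np.log(mel + 1e-6).T  # small offset avoids log(0)

if __name__ == "__main__":
    # One second of a synthetic 200 Hz tone stands in for a recording.
    t = np.arange(4000) / 4000.0
    features = log_mel_frontend(np.sin(2 * np.pi * 200 * t))
    print(features.shape)
```

Nothing in this frontend is trained: the filter shapes and the log compression are fixed in advance. A learnable frontend instead treats such quantities as parameters optimized jointly with the classifier, which is consistent with the abstract's observation that more training data is then needed.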
