Paper Title

Semi-supervised and Unsupervised Methods for Heart Sounds Classification in Restricted Data Environments

Authors

Unnikrishnan, Balagopal; Singh, Pranshu Ranjan; Yang, Xulei; Chua, Matthew Chin Heng

Abstract

Automated heart sounds classification is a much-needed diagnostic tool in view of the increasing incidence of heart-related diseases worldwide. In this work, we conduct a comprehensive study of heart sounds classification using various supervised, semi-supervised and unsupervised approaches on the PhysioNet/CinC 2016 Challenge dataset. Supervised approaches, including deep learning and machine learning methods, require large amounts of labelled data to train the models, and such data is challenging to obtain in most practical scenarios. Given the need to reduce the labelling burden in clinical practice, where human labelling is both expensive and time-consuming, semi-supervised or even unsupervised approaches are desirable in restricted data settings. A GAN-based semi-supervised method is therefore proposed, which allows unlabelled data samples to be used to boost the learning of the data distribution. It achieves a better AUROC than the supervised baseline when only limited data samples are available. Furthermore, several unsupervised methods are explored as an alternative by treating the problem as an anomaly detection scenario. In particular, unsupervised feature extraction using a 1D CNN autoencoder coupled with a one-class SVM obtains good performance without any data labelling. The proposed semi-supervised and unsupervised methods could lead to a future workflow tool for creating higher quality datasets.
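The abstract reports performance in terms of AUROC and frames the unsupervised setting as anomaly detection. As a reading aid, the following is a minimal sketch of how AUROC can be computed from anomaly scores, using the pairwise (Mann-Whitney) formulation. It is not the authors' evaluation code: the function name `auroc`, the label convention (0 = normal, 1 = abnormal) and the example scores are assumptions made purely for illustration.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: iterable of 0 (normal) / 1 (abnormal); this convention is an assumption.
    scores: anomaly scores, where a higher score means "more likely abnormal".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")

    # Count abnormal/normal pairs that are ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six recordings (made-up values, not results from the paper).
labels = [0, 0, 0, 1, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.90]
print(auroc(labels, scores))
```

An AUROC of 1.0 means every abnormal recording scores higher than every normal one, while 0.5 corresponds to a ranking no better than chance. In the anomaly detection framing, the scores would come from a detector fitted without labels, and labels are needed only for evaluation.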
