Paper Title

CoLoC: Conditioned Localizer and Classifier for Sound Event Localization and Detection

Authors

Sławomir Kapka, Jakub Tkaczuk

Abstract

In this article, we describe the Conditioned Localizer and Classifier (CoLoC), a novel solution for Sound Event Localization and Detection (SELD). The solution consists of two stages: localization is done first, followed by classification conditioned on the output of the localizer. To resolve the problem of the unknown number of sources, we incorporate an idea borrowed from Sequential Set Generation (SSG). The models in both stages are SELDnet-like CRNNs, but with single outputs. Our reasoning shows that two such single-output models are a good fit for the SELD task. We show that our solution improves on the baseline system in most metrics on the STARSS22 dataset.
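
The abstract does not spell out the inference procedure, so the following is only a minimal sketch of how a two-stage, SSG-style loop might be organized: a localizer is queried repeatedly, each time conditioned on the directions already found, until it signals that no sources remain, and a classifier then labels each direction. The names `seld_frame`, `toy_localize`, `toy_classify`, the `(azimuth, elevation)` representation, and the `max_sources` cap are all hypothetical; the toy functions stand in for the CRNN models and are not the authors' implementation.

```python
from typing import Callable, List, Optional, Tuple

# Assumed representation: (azimuth, elevation) in degrees.
Direction = Tuple[float, float]


def seld_frame(
    features,
    localize: Callable[[object, List[Direction]], Optional[Direction]],
    classify: Callable[[object, Direction], str],
    max_sources: int = 5,  # hypothetical safety cap on the loop
) -> List[Tuple[str, Direction]]:
    """Two-stage inference for one frame: localize first, then classify."""
    found: List[Direction] = []
    # Stage 1: sequential generation of the set of source directions.
    # Each call is conditioned on the directions found so far; None means stop.
    while len(found) < max_sources:
        nxt = localize(features, found)
        if nxt is None:
            break
        found.append(nxt)
    # Stage 2: classify each source, conditioned on its estimated direction.
    return [(classify(features, d), d) for d in found]


# Toy stand-ins for the two single-output models (assumptions, not real networks).
def toy_localize(features, found):
    remaining = [d for d in features["directions"] if d not in found]
    return remaining[0] if remaining else None


def toy_classify(features, direction):
    return features["labels"][direction]


frame = {
    "directions": [(30.0, 0.0), (-90.0, 10.0)],
    "labels": {(30.0, 0.0): "speech", (-90.0, 10.0): "door"},
}
events = seld_frame(frame, toy_localize, toy_classify)
```

The point of the structure is that neither model needs a multi-output head: the unknown number of sources is handled by the loop's stop signal rather than by a fixed number of output tracks.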
