Multi-Source DOA Estimation through Pattern Recognition of the Modal Coherence of a Reverberant Soundfield

Paper title

Multi-Source DOA Estimation through Pattern Recognition of the Modal Coherence of a Reverberant Soundfield

Authors

Fahim, A., Samarasinghe, P. N., Abhayapala, T. D.

Abstract

We propose a novel multi-source direction of arrival (DOA) estimation technique that uses a convolutional neural network to learn the modal coherence patterns of an incident soundfield from measured spherical harmonic coefficients. We train our model for individual time-frequency bins in the short-time Fourier transform spectrum by analyzing the unique snapshot of modal coherence for each desired direction. The proposed method can estimate the DOAs of multiple simultaneously active sound sources in 3D space using a single-source training scheme. This scheme reduces training time and resource requirements and allows the same trained model to be reused for different multi-source combinations. The method is evaluated in various simulated and practical noisy and reverberant environments with varying acoustic criteria and is found to outperform the baseline methods in terms of DOA estimation accuracy. Furthermore, the proposed algorithm allows azimuth and elevation to be trained independently during full DOA estimation over 3D space, which significantly improves training efficiency without affecting overall estimation accuracy.
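The abstract does not define "modal coherence", so the following is only a minimal sketch under an assumption: that the coherence for one frequency bin can be approximated by the frame-averaged cross-correlation between spherical harmonic coefficients. The function name `modal_coherence` and the sample coefficient values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def modal_coherence(frames: Sequence[Sequence[complex]]) -> List[List[complex]]:
    """Average the outer product a * a^H over STFT frames for one frequency bin.

    Each frame holds the (N + 1)^2 spherical harmonic coefficients of an
    order-N decomposition (assumed layout, not specified in the abstract).
    """
    if not frames:
        raise ValueError("need at least one frame")
    n = len(frames[0])
    coh = [[0j] * n for _ in range(n)]
    for a in frames:
        if len(a) != n:
            raise ValueError("all frames must have the same number of coefficients")
        for i in range(n):
            for j in range(n):
                # Cross-correlation between mode i and mode j in this frame.
                coh[i][j] += a[i] * a[j].conjugate()
    count = len(frames)
    return [[value / count for value in row] for row in coh]


# Made-up first-order example: (1 + 1)^2 = 4 coefficients per frame.
frames = [
    [1 + 0j, 0.5j, -0.25 + 0j, 0.1 - 0.2j],
    [0.8 + 0.1j, 0.4j, -0.2 + 0j, 0.05 - 0.1j],
]
coherence = modal_coherence(frames)
```

Under this assumption the result is a Hermitian matrix per time-frequency bin, which is the kind of fixed-size "snapshot" a convolutional network could take as input; how the paper actually builds and normalizes its input features is not stated in the abstract.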
