Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling

Paper title

Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling

Paper authors

Feng, Tiantian; Narayanan, Shrikanth

Abstract

Speech emotion recognition (SER) applications are frequently associated with privacy concerns, as they often acquire speech data on the client side and transmit it to remote cloud platforms for further processing. These speech data can reveal not only speech content and affective information but also the speaker's identity, demographic traits, and health status. Federated learning (FL) is a distributed machine learning algorithm that coordinates clients to train a model collaboratively without sharing local data. This algorithm shows enormous potential for SER applications, as sharing raw speech or speech features from a user's device is vulnerable to privacy attacks. However, a major challenge in FL is the limited availability of high-quality labeled data samples. In this work, we propose a semi-supervised federated learning framework, Semi-FedSER, that utilizes both labeled and unlabeled data samples to address the challenge of limited labeled data samples in FL. We show that our Semi-FedSER can generate desired SER performance even when the local label rate l=20 using two SER benchmark datasets: IEMOCAP and MSP-Improv.
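The abstract names two ingredients, collaborative training without sharing local data and multiview pseudo-labeling, but does not spell out either procedure. The following is only a minimal sketch under stated assumptions: `fed_avg` illustrates a generic size-weighted parameter average in the style of FedAvg, and `multiview_pseudo_label` illustrates one common pseudo-labeling rule (average the class probabilities over several views of an unlabeled sample and keep the label only above a confidence threshold). The function names, the averaging rule, and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Sequence


def fed_avg(client_weights: Sequence[Sequence[float]],
            client_sizes: Sequence[int]) -> List[float]:
    """Size-weighted average of client parameter vectors (FedAvg-style).

    Only parameters leave the clients; raw speech never does.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def multiview_pseudo_label(view_probs: Sequence[Sequence[float]],
                           threshold: float = 0.5) -> Optional[int]:
    """Assumed rule: average class probabilities over views of one
    unlabeled sample and accept the top class only if it is confident."""
    n_views = len(view_probs)
    n_classes = len(view_probs[0])
    mean = [sum(p[c] for p in view_probs) / n_views for c in range(n_classes)]
    best = max(range(n_classes), key=mean.__getitem__)
    return best if mean[best] >= threshold else None


if __name__ == "__main__":
    # Two clients with 1 and 3 local samples, respectively.
    print(fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))  # [2.5, 3.5]
    # Two views agree on class 0, so the pseudo-label is kept.
    print(multiview_pseudo_label([[0.7, 0.2, 0.1], [0.9, 0.05, 0.05]], 0.6))
    # The views disagree, so the sample stays unlabeled (None).
    print(multiview_pseudo_label([[0.4, 0.6], [0.6, 0.4]], 0.6))
```

In this sketch a client would train on its labeled samples plus any unlabeled samples that receive a pseudo-label, and the server would then combine the resulting client parameters with `fed_avg`.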

Paper title