Paper title
Symmetric Saliency-based Adversarial Attack To Speaker Identification
Authors
Abstract
To our knowledge, existing adversarial attack approaches to speaker identification either incur a high computational cost or are not very effective. To address this issue, we propose a novel generation-network-based approach, called the symmetric saliency-based encoder-decoder (SSED), to generate adversarial voice examples against speaker identification. It contains two novel components. First, it uses a saliency map decoder to learn the importance of speech samples to the decision of the targeted speaker identification system, so that the attacker focuses on adding artificial noise to the important samples. Second, it uses an angular loss function to push the speaker embedding far away from that of the source speaker. Our experimental results demonstrate that the proposed SSED yields state-of-the-art performance, i.e., over 97% targeted attack success rate and a signal-to-noise ratio of over 39 dB on both the open-set and closed-set speaker identification tasks, with a low computational cost.
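The two quantities the abstract refers to, the signal-to-noise ratio of an adversarial example and the angle between speaker embeddings, can be illustrated with a minimal sketch. It assumes the conventional SNR definition (clean-signal power over perturbation power, in dB) and uses the plain angle between two vectors; the abstract does not give the exact formulas, so the function names `snr_db` and `angle_between` are hypothetical and this is not the paper's angular loss itself.

```python
import math


def snr_db(clean, adversarial):
    """SNR in dB of an adversarial waveform relative to the clean waveform.

    Assumption: noise is the sample-wise difference adversarial - clean.
    """
    signal_power = sum(x * x for x in clean)
    noise_power = sum((a - x) ** 2 for x, a in zip(clean, adversarial))
    if noise_power == 0:
        return math.inf  # no perturbation at all
    return 10.0 * math.log10(signal_power / noise_power)


def angle_between(u, v):
    """Angle in radians between two speaker embeddings (illustrative only)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos)


# Toy waveform with a perturbation of amplitude 0.01 on every sample.
clean = [1.0, -1.0, 1.0, -1.0]
adversarial = [x + 0.01 for x in clean]
print(round(snr_db(clean, adversarial), 2))

# Orthogonal toy embeddings are separated by a right angle.
print(angle_between([1.0, 0.0], [0.0, 1.0]))
```

Under these assumptions, a perturbation whose amplitude is 1% of the signal corresponds to about 40 dB, which gives a sense of scale for the reported SNR of over 39 dB. A larger angle between the adversarial embedding and the source speaker's embedding means the embedding has been pushed further from the source speaker.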