Paper Title
PIN: A Novel Parallel Interactive Network for Spoken Language Understanding
Paper Authors
Paper Abstract
Spoken Language Understanding (SLU) is an essential part of a spoken dialogue system and typically consists of intent detection (ID) and slot filling (SF) tasks. Recently, recurrent neural network (RNN) based methods have achieved state-of-the-art performance for SLU. In existing RNN-based approaches, ID and SF are often jointly modeled to exploit the correlation between them. However, obtaining better performance by supporting bidirectional and explicit information exchange between ID and SF has so far not been well studied. In addition, few studies attempt to capture local context information to enhance the performance of SF. Motivated by these findings, in this paper a Parallel Interactive Network (PIN) is proposed to model the mutual guidance between ID and SF. Specifically, given an utterance, a Gaussian self-attentive encoder is introduced to generate a context-aware feature embedding of the utterance that is able to capture local context information. Taking this feature embedding as input, a Slot2Intent module and an Intent2Slot module are developed to capture the bidirectional information flow between the ID and SF tasks. Finally, a cooperation mechanism is constructed to fuse the information obtained from the Slot2Intent and Intent2Slot modules to further reduce prediction bias. Experiments on two benchmark datasets, SNIPS and ATIS, demonstrate the effectiveness of our approach, which achieves results competitive with state-of-the-art models. More encouragingly, by using utterance feature embeddings generated by the pre-trained language model BERT, our method achieves the best performance among all compared approaches.
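To make the described architecture concrete, below is a minimal PyTorch sketch of the overall structure the abstract outlines: a Gaussian self-attentive encoder, parallel Slot2Intent and Intent2Slot interaction modules, and a cooperation step that fuses both information flows. The abstract gives no equations, so every internal detail here (the form of the Gaussian locality bias, the BiLSTM encoder, the linear interaction heads, and fusion by logit averaging) is an assumption for illustration, not the authors' implementation.

```python
# Illustrative sketch of a PIN-style model (assumed details, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class GaussianSelfAttentiveEncoder(nn.Module):
    """BiLSTM encoder whose self-attention is biased toward local context
    by a Gaussian penalty on relative positions (assumed form)."""
    def __init__(self, vocab_size, emb_dim=128, hidden=128, sigma=2.0):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.sigma = sigma

    def forward(self, tokens):                       # tokens: (B, T)
        h, _ = self.rnn(self.emb(tokens))            # (B, T, 2H)
        scores = torch.matmul(h, h.transpose(1, 2)) / h.size(-1) ** 0.5
        T = tokens.size(1)
        pos = torch.arange(T, device=tokens.device, dtype=torch.float)
        dist = (pos.unsqueeze(0) - pos.unsqueeze(1)) ** 2
        scores = scores - dist / (2 * self.sigma ** 2)  # penalize distant positions
        attn = F.softmax(scores, dim=-1)
        return torch.matmul(attn, h)                 # context-aware features (B, T, 2H)


class PIN(nn.Module):
    """Parallel interaction: Slot2Intent lets soft slot predictions guide the
    intent head, Intent2Slot lets the soft intent prediction guide the slot
    tagger, and a cooperation step (assumed: logit averaging) fuses both."""
    def __init__(self, vocab_size, n_intents, n_slots, hidden=128):
        super().__init__()
        d = 2 * hidden
        self.encoder = GaussianSelfAttentiveEncoder(vocab_size, hidden=hidden)
        self.intent_direct = nn.Linear(d, n_intents)            # intent from utterance summary
        self.slot_direct = nn.Linear(d, n_slots)                # slots from token features
        self.slot2intent = nn.Linear(d + n_slots, n_intents)    # slot-aware intent head
        self.intent2slot = nn.Linear(d + n_intents, n_slots)    # intent-aware slot head

    def forward(self, tokens):
        feats = self.encoder(tokens)                  # (B, T, d)
        summary = feats.mean(dim=1)                   # utterance-level summary (B, d)
        slot_logits = self.slot_direct(feats)         # (B, T, n_slots)
        intent_logits = self.intent_direct(summary)   # (B, n_intents)

        # Slot2Intent: aggregate soft slot predictions and feed them to the intent head.
        slot_ctx = F.softmax(slot_logits, dim=-1).mean(dim=1)
        intent_from_slots = self.slot2intent(torch.cat([summary, slot_ctx], dim=-1))

        # Intent2Slot: broadcast the soft intent prediction to every token position.
        intent_ctx = F.softmax(intent_logits, dim=-1).unsqueeze(1).expand(-1, feats.size(1), -1)
        slots_from_intent = self.intent2slot(torch.cat([feats, intent_ctx], dim=-1))

        # Cooperation mechanism (assumed as averaging) fuses the two information flows.
        return (intent_logits + intent_from_slots) / 2, (slot_logits + slots_from_intent) / 2


# Toy usage: batch of 2 utterances, 6 tokens each, hypothetical label spaces.
model = PIN(vocab_size=1000, n_intents=7, n_slots=72)
intent_out, slot_out = model(torch.randint(0, 1000, (2, 6)))
print(intent_out.shape, slot_out.shape)  # torch.Size([2, 7]) torch.Size([2, 6, 72])
```

In this sketch, swapping the token embedding layer for BERT-generated utterance features would correspond to the abstract's final variant; the interaction and fusion structure stays the same.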