Paper Title

Exploring Multi-Modal Representations for Ambiguity Detection & Coreference Resolution in the SIMMC 2.0 Challenge

Authors

Javier Chiyah-Garcia, Alessandro Suglia, José Lopes, Arash Eshghi, Helen Hastie

Abstract

Anaphoric expressions, such as pronouns and referential descriptions, are situated with respect to the linguistic context of prior turns, as well as the immediate visual environment. However, a speaker's referential descriptions do not always uniquely identify the referent, leading to ambiguities that need to be resolved through subsequent clarificational exchanges. Thus, effective Ambiguity Detection and Coreference Resolution are key to task success in Conversational AI. In this paper, we present models for these two tasks as part of the SIMMC 2.0 Challenge (Kottur et al., 2021). Specifically, we use TOD-BERT- and LXMERT-based models, compare them to a number of baselines, and provide ablation experiments. Our results show that (1) language models are able to exploit correlations in the data to detect ambiguity; and (2) unimodal coreference resolution models can avoid the need for a vision component through the use of smart object representations.
