Paper Title

Overcoming Conflicting Data when Updating a Neural Semantic Parser

Authors

David Gaddy, Alex Kouzemtchenko, Pavankumar Reddy Muddireddy, Prateek Kolhar, Rushin Shah

Abstract

In this paper, we explore how to use a small amount of new data to update a task-oriented semantic parsing model when the desired output for some examples has changed. When making updates in this way, one potential problem that arises is the presence of conflicting data, or out-of-date labels in the original training set. To evaluate the impact of this understudied problem, we propose an experimental setup for simulating changes to a neural semantic parser. We show that the presence of conflicting data greatly hinders learning of an update, then explore several methods to mitigate its effect. Our multi-task and data selection methods lead to large improvements in model accuracy compared to a naive data-mixing strategy, and our best method closes 86% of the accuracy gap between this baseline and an oracle upper bound.
