Augmented Natural Language for Generative Sequence Labeling

Paper Title

Augmented Natural Language for Generative Sequence Labeling

Authors

Ben Athiwaratkun, Cicero Nogueira dos Santos, Jason Krone, Bing Xiang

Abstract

We propose a generative framework for joint sequence labeling and sentence-level classification. Our model performs multiple sequence labeling tasks at once using a single, shared natural language output space. Unlike prior discriminative methods, our model naturally incorporates label semantics and shares knowledge across tasks. Our framework is general purpose, performing well on few-shot, low-resource, and high-resource tasks. We demonstrate these advantages on popular named entity recognition, slot labeling, and intent classification benchmarks. We set a new state-of-the-art for few-shot slot labeling, improving substantially upon the previous 5-shot ($75.0\% \rightarrow 90.9\%$) and 1-shot ($70.4\% \rightarrow 81.0\%$) state-of-the-art results. Furthermore, our model generates large improvements ($46.27\% \rightarrow 63.83\%$) in low-resource slot labeling over a BERT baseline by incorporating label semantics. We also maintain competitive results on high-resource tasks, performing within two points of the state-of-the-art on all tasks and setting a new state-of-the-art on the SNIPS dataset.
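To make the idea of a single, shared natural language output space more concrete, the following is a minimal sketch of how a sentence, its slot labels, and its intent might be rendered as one target string and decoded again. The bracket syntax, the function names `to_augmented` and `parse_augmented`, and the example labels are illustrative assumptions; the abstract does not specify the authors' exact output format.

```python
import re
from typing import List, Optional, Tuple

# (start, end_exclusive, natural-language label); a hypothetical span type.
Span = Tuple[int, int, str]


def to_augmented(tokens: List[str], spans: List[Span], intent: str) -> str:
    """Render tokens, labeled spans, and a sentence-level intent as one string.

    Assumed format (not taken from the abstract):
    "(( intent )) word word [ span words | label ]".
    """
    out, i = [f"(( {intent} ))"], 0
    for start, end, label in sorted(spans):
        out.extend(tokens[i:start])  # copy unlabeled words unchanged
        out.append(f"[ {' '.join(tokens[start:end])} | {label} ]")
        i = end
    out.extend(tokens[i:])
    return " ".join(out)


def parse_augmented(text: str) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Recover the intent and the (span text, label) pairs from the string."""
    m = re.match(r"\(\( (.+?) \)\)\s*", text)
    intent = m.group(1) if m else None
    body = text[m.end():] if m else text
    return intent, re.findall(r"\[ (.+?) \| (.+?) \]", body)


tokens = "book a flight to new york".split()
target = to_augmented(tokens, [(4, 6, "destination city")], "book flight")
print(target)
print(parse_augmented(target))
```

Because the labels appear as ordinary words ("destination city") rather than opaque class indices, a sequence-to-sequence model trained on such targets can in principle draw on the meaning of the label text, which is the property the abstract refers to as incorporating label semantics.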
