Paper Title

Gradient Imitation Reinforcement Learning for General Low-Resource Information Extraction

Authors

Xuming Hu, Shiao Meng, Chenwei Zhang, Xiangli Yang, Lijie Wen, Irwin King, Philip S. Yu

Abstract

Information Extraction (IE) aims to extract structured information from heterogeneous sources. IE from natural language texts includes sub-tasks such as Named Entity Recognition (NER), Relation Extraction (RE), and Event Extraction (EE). Most IE systems require a comprehensive understanding of sentence structure, implied semantics, and domain knowledge to perform well; thus, IE tasks always need adequate external resources and annotations. However, obtaining more human annotations takes time and effort. Low-Resource Information Extraction (LRIE) strives to use unsupervised data, reducing the required resources and human annotation. In practice, existing systems either utilize self-training schemes to generate pseudo labels, which cause the gradual drift problem, or leverage consistency regularization methods, which inevitably suffer from confirmation bias. To alleviate the confirmation bias caused by the lack of feedback loops in existing LRIE learning paradigms, we develop a Gradient Imitation Reinforcement Learning (GIRL) method that encourages pseudo-labeled data to imitate the gradient descent direction on labeled data, which pushes the pseudo-labeled data toward optimization behavior similar to that of labeled data. Based on how well the pseudo-labeled data imitates the instructive gradient descent direction obtained from labeled data, we design a reward to quantify the imitation process and bootstrap the optimization capability of pseudo-labeled data through trial and error. GIRL is not tied to a particular learning paradigm or sub-task: we leverage it to solve all IE sub-tasks (named entity recognition, relation extraction, and event extraction) in low-resource settings (semi-supervised IE and few-shot IE).
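The gradient-imitation reward described above can be sketched numerically. The snippet below uses cosine similarity between the gradient computed on labeled data and the gradient induced by a batch of pseudo-labeled data, a natural way to score alignment of descent directions; the function and variable names are illustrative, and the paper's full method additionally trains the pseudo-labeling policy with this reward through trial and error, which is not shown here.

```python
import numpy as np

def gradient_imitation_reward(grad_labeled, grad_pseudo, eps=1e-12):
    """Reward for one batch of pseudo-labeled data: the cosine
    similarity between its gradient and the gradient computed on
    labeled data. A high reward means the pseudo-labels pull the
    model in the same descent direction as the human annotations."""
    num = float(np.dot(grad_labeled, grad_pseudo))
    denom = float(np.linalg.norm(grad_labeled) * np.linalg.norm(grad_pseudo)) + eps
    return num / denom

# Worked example with explicit (toy) gradient vectors.
g_labeled = np.array([1.0, 2.0, -1.0])

# A pseudo-labeling whose gradient aligns with the labeled one ...
print(round(gradient_imitation_reward(g_labeled, 2.0 * g_labeled), 4))  # 1.0
# ... versus one that points the opposite way and should be penalized.
print(round(gradient_imitation_reward(g_labeled, -g_labeled), 4))       # -1.0
```

Because cosine similarity ignores gradient magnitude, the reward only scores the *direction* of the update, matching the abstract's emphasis on imitating the gradient descent direction rather than its scale.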
