Paper Title

Controlling the Focus of Pretrained Language Generation Models

Paper Authors

Jiabao Ji, Yoon Kim, James Glass, Tianxing He

Paper Abstract

The finetuning of pretrained transformer-based language generation models is typically conducted in an end-to-end manner, where the model learns to attend to the relevant parts of the input by itself. However, there is no mechanism to directly control the model's focus. This work aims to develop a control mechanism by which a user can select spans of context as "highlights" for the model to focus on, and generate relevant output. To achieve this goal, we augment a pretrained model with trainable "focus vectors" that are directly applied to the model's embeddings, while the model itself is kept fixed. These vectors, trained on automatic annotations derived from attribution methods, act as indicators of context importance. We test our approach on two core generation tasks: dialogue response generation and abstractive summarization. We also collect evaluation data where the highlight-generation pairs are annotated by humans. Our experiments show that the trained focus vectors are effective in steering the model to generate outputs that are relevant to user-selected highlights.
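The focus-vector idea lends itself to a short sketch. Below is a minimal PyTorch illustration, assuming a Hugging Face-style model that exposes `get_input_embeddings()` and accepts `inputs_embeds`; the `FocusVectorWrapper` class, the single pair of additive vectors, and all names are illustrative assumptions rather than the authors' exact implementation (the paper's scheme may, for example, apply vectors at multiple layers or include multiplicative terms).

```python
import torch
import torch.nn as nn

class FocusVectorWrapper(nn.Module):
    """Hypothetical sketch: add trainable 'focus vectors' to the token
    embeddings of a frozen pretrained generation model. Tokens inside
    user-highlighted spans receive one vector; all other tokens receive
    another. The underlying model's weights are never updated."""

    def __init__(self, model, hidden_size):
        super().__init__()
        self.model = model
        for p in self.model.parameters():  # keep the pretrained model fixed
            p.requires_grad = False
        # Two trainable vectors: one for highlighted spans, one for the rest.
        self.focus_vec = nn.Parameter(torch.zeros(hidden_size))
        self.other_vec = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, input_ids, highlight_mask, **kwargs):
        # highlight_mask: (batch, seq_len), 1 on user-selected spans, else 0.
        embeds = self.model.get_input_embeddings()(input_ids)
        mask = highlight_mask.unsqueeze(-1).float()  # (batch, seq_len, 1)
        # Shift each token embedding by the vector matching its highlight status.
        embeds = embeds + mask * self.focus_vec + (1 - mask) * self.other_vec
        return self.model(inputs_embeds=embeds, **kwargs)
```

In this sketch, only `focus_vec` and `other_vec` would receive gradients during finetuning; the training signal for which spans count as important would come from the attribution-derived annotations the abstract describes.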
