Controllable Text Generation with Neurally-Decomposed Oracle

Paper title

Controllable Text Generation with Neurally-Decomposed Oracle

Authors

Tao Meng, Sidi Lu, Nanyun Peng, Kai-Wei Chang

Abstract

We propose a general and efficient framework to control auto-regressive generation models with NeurAlly-Decomposed Oracle (NADO). Given a pre-trained base language model and a sequence-level boolean oracle function, we propose to decompose the oracle function into token-level guidance to steer the base model in text generation. Specifically, the token-level guidance is approximated by a neural model trained with examples sampled from the base model, demanding no additional auxiliary labeled data. Based on posterior regularization, we present the closed-form optimal solution to incorporate the token-level guidance into the base model for controllable generation. We further provide a theoretical analysis of how the approximation quality of NADO affects the controllable generation results. Experiments conducted on two applications: (1) text generation with lexical constraints and (2) machine translation with formality control demonstrate that our framework efficiently guides the base model towards the given oracle while maintaining high generation quality.
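To make the idea of decomposing a sequence-level oracle into token-level guidance concrete, here is a minimal toy sketch in Python. It is not the authors' implementation. It assumes that the guidance at each step is the probability, under the base model, that a completion of the current prefix satisfies the oracle, and that the controlled distribution reweights each base probability by the ratio of that quantity after and before the candidate token. This form is an assumption for illustration; the abstract states only that a closed-form solution exists. The two-token vocabulary, the fixed-length `base_prob` model, and the "must contain b" oracle are all hypothetical. The success probability is computed exactly by enumeration here, whereas the abstract describes approximating the guidance with a neural model trained on samples from the base model.

```python
# Toy setting (all hypothetical): two-token vocabulary, sequences of length 3.
VOCAB = ("a", "b")
LENGTH = 3


def base_prob(prefix, token):
    """Hypothetical base model: a fixed next-token distribution."""
    return 0.7 if token == "a" else 0.3


def oracle(seq):
    """Hypothetical sequence-level boolean oracle: the sequence must contain 'b'."""
    return "b" in seq


def success_prob(prefix):
    """Probability under the base model that a completion of `prefix` satisfies the oracle.

    Computed exactly by recursion over all continuations; this is the quantity
    a learned token-level guidance model would have to approximate.
    """
    if len(prefix) == LENGTH:
        return 1.0 if oracle(prefix) else 0.0
    return sum(base_prob(prefix, t) * success_prob(prefix + (t,)) for t in VOCAB)


def guided_prob(prefix, token):
    """Assumed controlled distribution: base probability reweighted by the guidance ratio."""
    return base_prob(prefix, token) * success_prob(prefix + (token,)) / success_prob(prefix)


if __name__ == "__main__":
    prefix = ("a", "a")
    for tok in VOCAB:
        print(tok, base_prob(prefix, tok), guided_prob(prefix, tok))
```

In this toy setting, after the prefix `("a", "a")` the only way to satisfy the oracle is to emit `"b"`, so the reweighted distribution puts all of its mass there while the base model would still prefer `"a"`. Because the success probability of a prefix is the base-weighted average of the success probabilities of its one-token extensions, the reweighted probabilities sum to one at each step without a separate normalization.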
