Paper Title

DIRECTOR: Generator-Classifiers For Supervised Language Modeling

Authors

Kushal Arora, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston

Abstract

Current language models achieve low perplexity but their resulting generations still suffer from toxic responses, repetitiveness and contradictions. The standard language modeling setup fails to address these issues. In this paper, we introduce a new architecture, DIRECTOR, that consists of a unified generator-classifier with both a language modeling and a classification head for each output token. Training is conducted jointly using both standard language modeling data, and data labeled with desirable and undesirable sequences. Experiments in several settings show that the model has competitive training and decoding speed compared to standard language models while yielding superior results, alleviating known issues while maintaining generation quality. It also outperforms existing model guiding approaches in terms of both accuracy and efficiency.
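The abstract describes a decoder whose shared hidden states feed two heads: a standard language modeling head and a per-token classification head trained on desirable/undesirable sequence labels. The following is a minimal PyTorch sketch of how such a unified generator-classifier might be wired up; the class name DirectorHead, the gamma mixing weight, and the use of a per-token sigmoid classifier are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DirectorHead(nn.Module):
    """Illustrative generator-classifier head: one shared decoder state feeds
    both a language-modeling head and a per-token classification head."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.lm_head = nn.Linear(hidden_size, vocab_size)   # next-token logits
        self.cls_head = nn.Linear(hidden_size, vocab_size)  # per-token desirability logits

    def forward(self, hidden_states: torch.Tensor):
        # hidden_states: (batch, seq_len, hidden_size) from a shared decoder
        lm_logits = self.lm_head(hidden_states)    # (B, T, V)
        cls_logits = self.cls_head(hidden_states)  # (B, T, V)
        return lm_logits, cls_logits

    def combined_scores(self, hidden_states: torch.Tensor, gamma: float = 1.0):
        # At decoding time, blend LM log-probabilities with the classifier's
        # log-probability that each candidate next token is desirable.
        lm_logits, cls_logits = self.forward(hidden_states)
        lm_logprobs = F.log_softmax(lm_logits, dim=-1)
        cls_logprobs = F.logsigmoid(cls_logits)  # binary "desirable" score per token
        return lm_logprobs + gamma * cls_logprobs
```

In this sketch, the language modeling head would be trained with the usual cross-entropy objective on unlabeled text, while the classification head would receive a supervised signal from sequences labeled as desirable or undesirable; both heads share the same decoder, which is what keeps training and decoding costs close to a standard language model.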
