Paper Title

Conditioned Text Generation with Transfer for Closed-Domain Dialogue Systems

Authors

Stéphane d'Ascoli, Alice Coucke, Francesco Caltagirone, Alexandre Caulier, Marc Lelarge

Abstract

Scarcity of training data for task-oriented dialogue systems is a well-known problem that is usually tackled with costly and time-consuming manual data annotation. An alternative is to rely on automatic text generation, which, although less accurate than human supervision, has the advantage of being cheap and fast. Our contribution is twofold. First, we show how to optimally train and control the generation of intent-specific sentences using a conditional variational autoencoder. Then, we introduce a new protocol called query transfer, which makes it possible to leverage a large unlabelled dataset, possibly containing irrelevant queries, to extract relevant information. Comparison with two different baselines shows that this method, in the appropriate regime, consistently improves the diversity of the generated queries without compromising their quality. We also demonstrate the effectiveness of our generation method as a data augmentation technique for language modelling tasks.
