Paper Title
Towards Attribute-Entangled Controllable Text Generation: A Pilot Study of Blessing Generation
Authors
Abstract
Controllable Text Generation (CTG) has achieved great success owing to the fine-grained generation ability it gains by focusing on multiple attributes. However, most existing CTG research overlooks how to exploit attribute entanglement to enhance the diversity of the controlled generated texts. To address this gap, we focus on a novel CTG scenario, blessing generation, which is challenging because high-quality blessing texts require CTG models to comprehensively consider the entanglement between multiple attributes (e.g., objects and occasions). To promote research on blessing generation, we present EBleT, a large-scale Entangled Blessing Text dataset containing 293K English sentences annotated with multiple attributes. Furthermore, we propose novel evaluation metrics to measure the quality of the blessing texts generated by the baseline models we designed. Our study opens a new research direction for controllable text generation and enables the development of attribute-entangled CTG models. Our dataset and source code are available at https://github.com/huangshulin123/Blessing-Generation.