Paper Title

Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation

Paper Authors

Xingdi Yuan, Tong Wang, Yen-Hsiang Wang, Emery Fine, Rania Abdelghani, Pauline Lucas, Hélène Sauzéon, Pierre-Yves Oudeyer

Paper Abstract

Large Language Models (LLMs) have in recent years demonstrated impressive prowess in natural language generation. A common practice to improve generation diversity is to sample multiple outputs from the model. However, there is no simple and robust way of selecting the best output from these stochastic samples. As a case study framed in the context of question generation, we propose two prompt-based approaches to selecting high-quality questions from a set of LLM-generated candidates. Our method works under the constraints of 1) a black-box (non-modifiable) question generation model and 2) lack of access to human-annotated references -- both of which are realistic limitations for real-world deployment of LLMs. With automatic as well as human evaluations, we empirically demonstrate that our approach can effectively select questions of higher quality than greedy generation.
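The sample-then-select pipeline the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's method: `generate` and `score_with_prompt` are hypothetical stand-ins (here stubbed with canned data and a word-count proxy) for the black-box LLM generator and the prompt-based, reference-free scorer.

```python
import random

def generate(prompt: str, temperature: float = 1.0, seed: int = 0) -> str:
    """Stand-in for a black-box LLM call (hypothetical); returns one
    stochastically sampled candidate question for the given context."""
    candidates = [
        "What causes tides?",
        "Why do tides occur twice a day?",
        "How does the Moon's gravity create tides?",
    ]
    return random.Random(seed).choice(candidates)

def score_with_prompt(context: str, question: str) -> float:
    """Stand-in for a prompt-based scorer: in the paper's setting an LLM
    would be prompted to judge question quality without human references;
    here we use word count as a trivial proxy score."""
    return float(len(question.split()))

def select_best(context: str, n_samples: int = 5) -> str:
    # 1) Sample multiple candidates from the (non-modifiable) generator.
    candidates = {generate(context, temperature=0.9, seed=i)
                  for i in range(n_samples)}
    # 2) Re-rank the candidates with the reference-free score
    #    and return the highest-scoring question.
    return max(candidates, key=lambda q: score_with_prompt(context, q))

print(select_best("Passage about ocean tides."))
```

The key design point mirrored here is that only the generator's outputs are touched: no model weights are modified and no human-annotated references are consulted, matching the two deployment constraints stated in the abstract.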
