Instruction Induction: From Few Examples to Natural Language Task Descriptions

Paper Title

Instruction Induction: From Few Examples to Natural Language Task Descriptions

Authors

Or Honovich, Uri Shaham, Samuel R. Bowman, Omer Levy

Abstract

Large language models are able to perform a task by conditioning on a few input-output demonstrations - a paradigm known as in-context learning. We show that language models can explicitly infer an underlying task from a few demonstrations by prompting them to generate a natural language instruction that fits the examples. To explore this ability, we introduce the instruction induction challenge, compile a dataset consisting of 24 tasks, and define a novel evaluation metric based on executing the generated instruction. We discover that, to a large extent, the ability to generate instructions does indeed emerge when using a model that is both large enough and aligned to follow instructions; InstructGPT achieves 65.7% of human performance in our execution-based metric, while the original GPT-3 model reaches only 9.8% of human performance. This surprising result suggests that instruction induction might be a viable learning paradigm in and of itself, where instead of fitting a set of latent continuous parameters to the data, one searches for the best description in the natural language hypothesis space.
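
The two ideas in the abstract, prompting a model to produce an instruction from demonstrations and scoring that instruction by executing it, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the prompt wording, the function names `build_induction_prompt` and `execution_accuracy`, the exact-match scoring, and the toy `toy_execute` stand-in for a language model are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Tuple

# A demonstration is an (input, output) pair.
Demo = Tuple[str, str]


def build_induction_prompt(demos: List[Demo]) -> str:
    """Format demonstrations and ask for the instruction behind them.

    The wording is a hypothetical template, not the paper's prompt.
    """
    lines = ["Here are some input-output pairs:", ""]
    for source, target in demos:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    lines.append("The instruction that produces these outputs was:")
    return "\n".join(lines)


def execution_accuracy(
    instruction: str,
    held_out: List[Demo],
    execute: Callable[[str, str], str],
) -> float:
    """Score an instruction by running it on held-out inputs.

    `execute(instruction, x)` stands in for a model that follows the
    instruction on input x. Exact string match is an assumed scoring
    rule chosen for simplicity.
    """
    if not held_out:
        return 0.0
    correct = sum(
        1
        for source, target in held_out
        if execute(instruction, source).strip() == target.strip()
    )
    return correct / len(held_out)


def toy_execute(instruction: str, source: str) -> str:
    """Toy stand-in for an instruction-following model (assumption)."""
    if "reverse" in instruction.lower():
        return source[::-1]
    return source  # Fall back to copying the input.


if __name__ == "__main__":
    demos = [("abc", "cba"), ("hello", "olleh")]
    print(build_induction_prompt(demos))
    held_out = [("xy", "yx"), ("cat", "tac")]
    print(execution_accuracy("Reverse the input.", held_out, toy_execute))
```

In a real setup, `toy_execute` would be replaced by a call to an instruction-following model, and the candidate instruction would come from the model's completion of the induction prompt rather than being written by hand.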
