Title
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Authors
Abstract
Large language models (LMs) are able to in-context learn -- perform a new task via inference alone by conditioning on a few input-label pairs (demonstrations) and making predictions for new inputs. However, there has been little understanding of how the model learns and which aspects of the demonstrations contribute to end task performance. In this paper, we show that ground truth demonstrations are in fact not required -- randomly replacing labels in the demonstrations barely hurts performance on a range of classification and multi-choice tasks, consistently over 12 different models including GPT-3. Instead, we find that other aspects of the demonstrations are the key drivers of end task performance, including the fact that they provide a few examples of (1) the label space, (2) the distribution of the input text, and (3) the overall format of the sequence. Together, our analysis provides a new way of understanding how and why in-context learning works, while opening up new questions about how much can be learned from large language models through inference alone.
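To make the label-replacement setup concrete, the sketch below builds an in-context prompt from a few demonstrations, either with their gold labels or with labels drawn uniformly at random from the label space. The `Review:`/`Sentiment:` template, the example texts, and the function name are illustrative assumptions, not the paper's exact prompts or evaluation code.

```python
import random

def build_prompt(demos, test_input, label_space, random_labels=False, seed=0):
    """Concatenate (input, label) demonstrations into a single prompt.

    If random_labels is True, each gold label is replaced by one sampled
    uniformly from label_space -- the manipulation the paper studies.
    """
    rng = random.Random(seed)
    lines = []
    for text, label in demos:
        shown = rng.choice(label_space) if random_labels else label
        lines.append(f"Review: {text}\nSentiment: {shown}")
    # The test input is appended with an empty label slot for the LM to fill.
    lines.append(f"Review: {test_input}\nSentiment:")
    return "\n\n".join(lines)

demos = [
    ("A delightful film from start to finish.", "positive"),
    ("Dull, overlong, and forgettable.", "negative"),
]
prompt = build_prompt(
    demos,
    "Sharp writing and strong performances.",
    label_space=["positive", "negative"],
    random_labels=True,
)
print(prompt)
```

Comparing an LM's accuracy when conditioned on the `random_labels=False` prompt versus the `random_labels=True` prompt isolates how much the input-label mapping itself contributes, while holding the label space, input distribution, and format fixed.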