Paper Title

ReadOnce Transformers: Reusable Representations of Text for Transformers

Paper Authors

Shih-Ting Lin, Ashish Sabharwal, Tushar Khot

Paper Abstract

We present ReadOnce Transformers, an approach to convert a transformer-based model into one that can build an information-capturing, task-independent, and compressed representation of text. The resulting representation is reusable across different examples and tasks, thereby requiring a document shared across many examples or tasks to only be read once. This leads to faster training and evaluation of models. Additionally, we extend standard text-to-text transformer models to Representation+Text-to-text models, and evaluate on multiple downstream tasks: multi-hop QA, abstractive QA, and long-document summarization. Our one-time computed representation results in a 2x-5x speedup compared to standard text-to-text models, while the compression also allows existing language models to handle longer documents without the need for designing new pre-trained models.
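
To make the reuse pattern in the abstract concrete, here is a minimal PyTorch sketch of the general idea: encode a shared document once into a short, compressed representation, cache it, and feed that representation together with each new question to a downstream model instead of re-encoding the raw document every time. This is not the authors' implementation; all module names, layer sizes, the attention-pooling compressor, and the encoder-only reader are illustrative assumptions (a real system would use a full text-to-text model such as BART or T5).

```python
# Illustrative sketch of the "read once, reuse many times" pattern.
# Not the paper's architecture; sizes and modules are placeholders.
import torch
import torch.nn as nn

VOCAB, D_MODEL, N_REP = 30522, 256, 16  # assumed toy sizes

class DocumentEncoder(nn.Module):
    """Encodes a document ONCE into a short sequence of compressed vectors."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Learned queries that pool the full document into N_REP vectors.
        self.rep_queries = nn.Parameter(torch.randn(N_REP, D_MODEL))
        self.pool = nn.MultiheadAttention(D_MODEL, num_heads=4, batch_first=True)

    def forward(self, doc_ids):                        # (batch, doc_len)
        hidden = self.encoder(self.embed(doc_ids))     # (batch, doc_len, d)
        q = self.rep_queries.unsqueeze(0).expand(doc_ids.size(0), -1, -1)
        reps, _ = self.pool(q, hidden, hidden)         # (batch, N_REP, d)
        return reps                                    # compressed, reusable

class RepPlusTextReader(nn.Module):
    """Consumes [document representation ; question tokens] instead of raw text."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, doc_reps, question_ids):
        joint = torch.cat([doc_reps, self.embed(question_ids)], dim=1)
        return self.head(self.encoder(joint))

# Usage: the document is read once; its cached representation is reused
# for every question, so only the short question is re-encoded per example.
doc_encoder, reader = DocumentEncoder(), RepPlusTextReader()
doc_ids = torch.randint(0, VOCAB, (1, 512))
cached_reps = doc_encoder(doc_ids)                     # computed one time
for _ in range(3):                                     # many examples, no re-reading
    question_ids = torch.randint(0, VOCAB, (1, 20))
    logits = reader(cached_reps, question_ids)
```

Because the reader only sees the short cached representation plus the question, repeated examples over the same document avoid re-encoding it, which is where the reported 2x-5x speedup comes from; the compression likewise lets a fixed-length model cover documents longer than its raw input limit.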
