Paper Title

QAmeleon: Multilingual QA with Only 5 Examples

Authors

Priyanka Agrawal, Chris Alberti, Fantine Huot, Joshua Maynez, Ji Ma, Sebastian Ruder, Kuzman Ganchev, Dipanjan Das, Mirella Lapata

Abstract

The availability of large, high-quality datasets has been one of the main drivers of recent progress in question answering (QA). Such annotated datasets, however, are difficult and costly to collect, and rarely exist in languages other than English, rendering QA technology inaccessible to underrepresented languages. An alternative to building large monolingual training datasets is to leverage pre-trained language models (PLMs) under a few-shot learning setting. Our approach, QAmeleon, uses a PLM to automatically generate multilingual data upon which QA models are trained, thus avoiding costly annotation. Prompt tuning the PLM for data synthesis with only five examples per language delivers accuracy superior to translation-based baselines, bridges nearly 60% of the gap between an English-only baseline and a fully supervised upper bound trained on almost 50,000 hand-labeled examples, and always leads to substantial improvements compared to fine-tuning a QA model directly on labeled examples in low-resource settings. Experiments on the TyDiQA-GoldP and MLQA benchmarks show that few-shot prompt tuning for data synthesis scales across languages and is a viable alternative to large-scale annotation.
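To make the data-synthesis idea concrete, here is a minimal, hypothetical sketch of few-shot QA data generation in the spirit described above: five labeled (passage, question, answer) exemplars in a target language are packed into a prompt, and a PLM completes a question-answer pair for each unlabeled passage. The prompt template, the `plm_generate` callable, and the `Passage:/Question:/Answer:` markers are illustrative assumptions; the paper's actual method learns soft prompts via prompt tuning rather than using a fixed text prompt.

```python
def build_fewshot_prompt(examples, new_passage):
    """Concatenate five (passage, question, answer) exemplars and one
    unlabeled passage into a single text prompt ending at 'Question:'."""
    parts = []
    for passage, question, answer in examples:
        parts.append(f"Passage: {passage}\nQuestion: {question}\nAnswer: {answer}")
    parts.append(f"Passage: {new_passage}\nQuestion:")
    return "\n\n".join(parts)


def synthesize_qa(plm_generate, examples, unlabeled_passages):
    """Generate synthetic (passage, question, answer) triples on which a
    downstream QA model can then be fine-tuned, avoiding manual annotation.

    plm_generate is any callable mapping a prompt string to a completion of
    the assumed form "<question>\nAnswer: <answer>".
    """
    synthetic = []
    for passage in unlabeled_passages:
        prompt = build_fewshot_prompt(examples, passage)
        completion = plm_generate(prompt)
        question, _, answer = completion.partition("\nAnswer:")
        synthetic.append((passage, question.strip(), answer.strip()))
    return synthetic
```

In the full pipeline, `synthetic` would replace the large hand-labeled training set: the QA model is fine-tuned on these generated triples instead of on human annotations.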
