Paper Title

Zero-shot Cross-lingual Transfer is Under-specified Optimization

Paper Authors

Shijie Wu, Benjamin Van Durme, Mark Dredze

Paper Abstract

Pretrained multilingual encoders enable zero-shot cross-lingual transfer, but often produce unreliable models that exhibit high performance variance on the target language. We postulate that this high variance results from zero-shot cross-lingual transfer solving an under-specified optimization problem. We show that any linear-interpolated model between the source language monolingual model and the source + target bilingual model has equally low source language generalization error, yet the target language generalization error reduces smoothly and linearly as we move from the monolingual to the bilingual model, suggesting that the model struggles to identify good solutions for both source and target languages using the source language alone. Additionally, we show that the zero-shot solution lies in a non-flat region of the target language generalization error surface, causing the high variance.
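The linear interpolation between the monolingual and bilingual models described in the abstract can be sketched as below. This is a minimal illustrative sketch, not the paper's implementation: models are assumed to be represented as dictionaries mapping parameter names to values, and the function and variable names (`interpolate`, `theta_mono`, `theta_bi`) are hypothetical.

```python
def interpolate(theta_mono, theta_bi, alpha):
    """Return the interpolated parameters (1 - alpha) * theta_mono + alpha * theta_bi.

    alpha = 0.0 recovers the source-only monolingual (zero-shot) model;
    alpha = 1.0 recovers the source + target bilingual model.
    Assumes both models share the same parameter names.
    """
    return {name: (1 - alpha) * theta_mono[name] + alpha * theta_bi[name]
            for name in theta_mono}

# Toy example with scalar "parameters" (real models would hold tensors):
theta_mono = {"w": 0.0, "b": 2.0}
theta_bi = {"w": 4.0, "b": 0.0}
mid = interpolate(theta_mono, theta_bi, 0.5)
# mid == {"w": 2.0, "b": 1.0}
```

Evaluating such interpolated models on held-out source and target data is what reveals the flat source-error region and the smoothly decreasing target error the abstract refers to.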
