Paper Title


Less is More: A Lightweight and Robust Neural Architecture for Discourse Parsing

Paper Authors

Ming Li, Ruihong Huang

Paper Abstract


Complex feature extractors are widely employed for text representation building. However, these complex feature extractors make NLP systems prone to overfitting, especially when the downstream training datasets are relatively small, which is the case for several discourse parsing tasks. Thus, we propose an alternative lightweight neural architecture that removes multiple complex feature extractors and only utilizes learnable self-attention modules to indirectly exploit pretrained neural language models, in order to maximally preserve the generalizability of pretrained language models. Experiments on three common discourse parsing tasks show that, powered by recent pretrained language models, the lightweight architecture consisting of only two self-attention layers obtains much better generalizability and robustness. Meanwhile, it achieves comparable or even better system performance with fewer learnable parameters and less processing time.
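
To make the described design concrete, here is a minimal sketch of such an architecture, assuming a PyTorch implementation: a frozen pretrained language model supplies token representations, and only two self-attention layers plus a classification head are trained on top. All names and hyperparameters here (hidden size, head count, number of classes) are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class LightweightDiscourseParser(nn.Module):
    """Sketch: two learnable self-attention layers over frozen LM states."""

    def __init__(self, hidden_size=768, num_heads=8, num_classes=4):
        super().__init__()
        # Only these layers are trained; the pretrained LM itself stays
        # frozen and is queried just for its hidden states.
        self.attn1 = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.attn2 = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, lm_hidden_states):
        # lm_hidden_states: (batch, seq_len, hidden_size), e.g. the last
        # hidden layer of a frozen BERT-style encoder.
        x, _ = self.attn1(lm_hidden_states, lm_hidden_states, lm_hidden_states)
        x, _ = self.attn2(x, x, x)
        # Mean-pool over tokens and predict, e.g., a discourse relation label.
        return self.classifier(x.mean(dim=1))

# Usage sketch: compute LM states without gradients so the LM is not fine-tuned.
# with torch.no_grad():
#     states = lm(input_ids).last_hidden_state  # `lm` is a hypothetical frozen encoder
# logits = LightweightDiscourseParser()(states)
```

Keeping the pretrained encoder frozen and training only this small head is what limits the learnable parameter count and, per the abstract, helps preserve the pretrained model's generalizability.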
