Paper Title

Pre-trained Sentence Embeddings for Implicit Discourse Relation Classification

Authors

Murali Raghu Babu Balusu, Yangfeng Ji, Jacob Eisenstein

Abstract

Implicit discourse relations bind smaller linguistic units into coherent texts. Automatic sense prediction for implicit relations is hard, because it requires understanding the semantics of the linked arguments. Furthermore, annotated datasets contain relatively few labeled examples, due to the scale of the phenomenon: on average each discourse relation encompasses several dozen words. In this paper, we explore the utility of pre-trained sentence embeddings as base representations in a neural network for implicit discourse relation sense classification. We present a series of experiments using both supervised end-to-end trained models and pre-trained sentence encoding techniques: SkipThought, Sent2vec, and Infersent. The pre-trained embeddings are competitive with the end-to-end model, and the approaches are complementary, with combined models yielding significant performance improvements on two of the three evaluations.
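
As a rough illustration of the setup the abstract describes, the sketch below feeds frozen pre-trained sentence embeddings of the two discourse arguments into a small neural sense classifier. The feature combination (concatenation, elementwise product, and absolute difference), the embedding dimension, and the number of sense classes are assumptions chosen for illustration, not the authors' exact architecture.

```python
# Minimal sketch (not the paper's exact model): classify the sense of an
# implicit discourse relation from frozen pre-trained embeddings of its two
# arguments. Only the classifier on top is trained; the encoder that produced
# the embeddings (e.g. SkipThought, Sent2vec, or Infersent) is kept fixed.
import torch
import torch.nn as nn

class PairSenseClassifier(nn.Module):
    def __init__(self, emb_dim: int, num_senses: int, hidden: int = 512):
        super().__init__()
        # Input features: [arg1; arg2; arg1 * arg2; |arg1 - arg2|], a common
        # way to combine two sentence embeddings for pair classification
        # (this particular combination is an assumption for illustration).
        self.mlp = nn.Sequential(
            nn.Linear(4 * emb_dim, hidden),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(hidden, num_senses),
        )

    def forward(self, arg1: torch.Tensor, arg2: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([arg1, arg2, arg1 * arg2, (arg1 - arg2).abs()], dim=-1)
        return self.mlp(feats)  # unnormalized logits over relation senses

# Usage with placeholder embeddings standing in for pre-computed sentence
# vectors; 2048 dimensions and 11 sense classes are illustrative values.
arg1 = torch.randn(32, 2048)   # first-argument embeddings
arg2 = torch.randn(32, 2048)   # second-argument embeddings
model = PairSenseClassifier(emb_dim=2048, num_senses=11)
logits = model(arg1, arg2)
loss = nn.functional.cross_entropy(logits, torch.randint(0, 11, (32,)))
```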
