Divide and Conquer: Text Semantic Matching with Disentangled Keywords and Intents

Paper Title

Divide and Conquer: Text Semantic Matching with Disentangled Keywords and Intents

Authors

Yicheng Zou, Hongwei Liu, Tao Gui, Junzhe Wang, Qi Zhang, Meng Tang, Haixiang Li, Daniel Wang

Abstract

Text semantic matching is a fundamental task that has been widely used in various scenarios, such as community question answering, information retrieval, and recommendation. Most state-of-the-art matching models, e.g., BERT, directly perform text comparison by processing each word uniformly. However, a query sentence generally comprises content that calls for different levels of matching granularity. Specifically, keywords represent factual information such as action, entity, and event that should be strictly matched, while intents convey abstract concepts and ideas that can be paraphrased into various expressions. In this work, we propose a simple yet effective training strategy for text semantic matching in a divide-and-conquer manner by disentangling keywords from intents. Our approach can be easily combined with pre-trained language models (PLMs) without influencing their inference efficiency, achieving stable performance improvements against a wide range of PLMs on three benchmarks.
