Paper title
HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking
Authors
Abstract
Deep pre-trained language models (e.g., BERT) are effective at large-scale text retrieval tasks. Existing text retrieval systems with state-of-the-art performance usually adopt a retrieve-then-rerank architecture due to the high computational cost of pre-trained language models and the large corpus size. Under such a multi-stage architecture, previous studies have mainly focused on optimizing a single stage of the framework to improve overall retrieval performance. However, how to directly couple features from multiple stages for joint optimization has not been well studied. In this paper, we design Hybrid List Aware Transformer Reranking (HLATR) as a subsequent reranking module that incorporates both retrieval-stage and reranking-stage features. HLATR is lightweight and can be easily parallelized with existing text retrieval systems, so the full reranking process can be performed in a single, efficient pass. Empirical experiments on two large-scale text retrieval datasets show that HLATR can effectively improve the ranking performance of existing multi-stage text retrieval methods.
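The abstract describes HLATR as a lightweight, list-aware transformer that rescores the candidate list by fusing a retrieval-stage feature (each passage's first-stage rank) with a reranking-stage feature (each passage's PLM embedding). Below is a minimal PyTorch sketch of that idea; the class name `HLATRSketch`, the dimensions, and the choice of a learned rank embedding are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class HLATRSketch(nn.Module):
    """Hypothetical sketch of a hybrid list-aware transformer reranker.

    Each candidate passage is represented by two fused features:
      * its reranking-stage embedding (e.g., the PLM [CLS] vector), and
      * its retrieval-stage rank, injected as a learned position embedding.
    A small transformer encoder then attends over the whole candidate
    list and scores all passages jointly (hence "list aware").
    """

    def __init__(self, plm_dim=768, d_model=128, n_heads=4,
                 n_layers=2, max_rank=1000):
        super().__init__()
        self.proj = nn.Linear(plm_dim, d_model)          # compress PLM embeddings
        self.rank_emb = nn.Embedding(max_rank, d_model)  # retrieval-stage rank feature
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads,
            dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.score = nn.Linear(d_model, 1)               # per-candidate relevance score

    def forward(self, cand_emb, retrieval_rank):
        # cand_emb: (batch, list_len, plm_dim) reranking-stage embeddings
        # retrieval_rank: (batch, list_len) 0-based first-stage ranks (LongTensor)
        x = self.proj(cand_emb) + self.rank_emb(retrieval_rank)
        x = self.encoder(x)                   # attention across the candidate list
        return self.score(x).squeeze(-1)      # (batch, list_len) scores

if __name__ == "__main__":
    model = HLATRSketch()
    emb = torch.randn(2, 100, 768)                       # 2 queries, 100 candidates each
    ranks = torch.arange(100).unsqueeze(0).expand(2, -1) # retrieval-stage positions
    print(model(emb, ranks).shape)                       # torch.Size([2, 100])
```

Because the PLM embeddings are computed by the upstream reranker anyway, a final stage of this shape adds only a small transformer over (at most) a few hundred candidates per query, which is consistent with the abstract's claim that HLATR is lightweight and easy to run alongside an existing retrieve-then-rerank pipeline.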