Selective Weak Supervision for Neural Information Retrieval

Paper Title

Selective Weak Supervision for Neural Information Retrieval

Authors

Kaitao Zhang, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu

Abstract

This paper democratizes neural information retrieval to scenarios where large-scale relevance training signals are not available. We revisit the classic IR intuition that anchor-document relations approximate query-document relevance and propose a reinforcement weak supervision selection method, ReInfoSelect, which learns to select the anchor-document pairs that best weakly supervise the neural ranker (action), using the ranking performance on a handful of relevance labels as the reward. Iteratively, for a batch of anchor-document pairs, ReInfoSelect backpropagates the gradients through the neural ranker, gathers its NDCG reward, and optimizes the data selection network using policy gradients, until the neural ranker's performance peaks on target relevance metrics (convergence). In our experiments on three TREC benchmarks, neural rankers trained by ReInfoSelect, with only publicly available anchor data, significantly outperform feature-based learning-to-rank methods and match the effectiveness of neural rankers trained with private commercial search logs. Our analyses show that ReInfoSelect effectively selects weak supervision signals based on the stage of the neural ranker training, and intuitively picks anchor-document pairs similar to query-document pairs.
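The selection loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a logistic keep/drop policy over hand-made feature vectors in place of the paper's data selection network, omits the neural ranker entirely, and uses a plain REINFORCE update with an NDCG value as the reward. All function names (`ndcg_at_k`, `select_pairs`, `policy_gradient_step`) and parameters are hypothetical.

```python
import math
import random


def ndcg_at_k(ranked_relevance, k=10):
    """NDCG@k for relevance labels listed in ranked order (gain = 2^rel - 1)."""
    def dcg(rels):
        return sum((2.0 ** r - 1.0) / math.log2(i + 2)
                   for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(ranked_relevance, reverse=True))
    return dcg(list(ranked_relevance)) / ideal if ideal > 0 else 0.0


def keep_probability(theta, features):
    """Logistic policy (an assumption): probability of keeping one pair."""
    z = sum(t * x for t, x in zip(theta, features))
    return 1.0 / (1.0 + math.exp(-z))


def select_pairs(theta, batch_features, rng):
    """Sample a keep (1) or drop (0) action for each anchor-document pair."""
    probs = [keep_probability(theta, f) for f in batch_features]
    actions = [1 if rng.random() < p else 0 for p in probs]
    return actions, probs


def policy_gradient_step(theta, batch_features, actions, probs,
                         reward, baseline=0.0, lr=0.1):
    """REINFORCE update: theta += lr * (reward - baseline) * grad log pi.

    For a Bernoulli policy with a logistic link, the gradient of the
    log-probability of action a with respect to theta is (a - p) * features.
    """
    advantage = reward - baseline
    new_theta = list(theta)
    for feats, a, p in zip(batch_features, actions, probs):
        for j, x in enumerate(feats):
            new_theta[j] += lr * advantage * (a - p) * x
    return new_theta


# One illustrative iteration on made-up data.
rng = random.Random(0)
theta = [0.0, 0.0]
batch = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical pair features
actions, probs = select_pairs(theta, batch, rng)
# In the full method the ranker would be trained on the kept pairs here, and
# the reward would be its NDCG on the small labeled set; a fixed ranking of
# labels stands in for that step.
reward = ndcg_at_k([2, 0, 1])
theta = policy_gradient_step(theta, batch, actions, probs, reward)
```

In a fuller version, the stand-in reward line would be replaced by training the ranker on the pairs with action 1 and scoring it on the held-out relevance labels, repeated until that score stops improving.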
