Paper title
Academic information retrieval using citation clusters: In-depth evaluation based on systematic reviews
Paper authors
Paper abstract
The field of scientometrics has shown the power of citation-based clusters for literature analysis, yet this technique has barely been used for information retrieval tasks. This work evaluates the performance of citation-based clusters for information retrieval tasks. We simulated a search process using these clusters with a tree hierarchy of clusters and a cluster selection algorithm. We evaluated the task of finding the relevant documents for 25 systematic reviews. Our evaluation considered several trade-offs between recall and precision for the cluster selection, and we also replicated the Boolean queries self-reported by the systematic reviews to serve as a reference. We found that the search performance of citation-based clusters is highly variable and unpredictable, that it works best for users who prefer recall over precision at a ratio between 2 and 8, and that, when used along with query-based search, the two approaches complement each other, including by finding new relevant documents.
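The abstract does not spell out the cluster selection algorithm or how the recall-to-precision ratio is defined. As a rough illustration only, the sketch below assumes the ratio behaves like the beta parameter of the standard F-beta measure (recall weighted beta times as heavily as precision) and picks the single best-scoring node from a small cluster tree. The `Cluster` class, the `best_cluster` function, and the toy document IDs are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical node in a cluster hierarchy (not the paper's data model)."""
    name: str
    docs: frozenset
    children: list = field(default_factory=list)


def f_beta(precision, recall, beta):
    """Standard F-beta: recall counts beta times as much as precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def score(cluster, relevant, beta):
    """Score one cluster against the set of known relevant documents."""
    hits = len(cluster.docs & relevant)
    precision = hits / len(cluster.docs) if cluster.docs else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return f_beta(precision, recall, beta)


def best_cluster(root, relevant, beta):
    """Walk the whole tree and return the node with the highest F-beta."""
    best, best_score = root, score(root, relevant, beta)
    stack = list(root.children)
    while stack:
        node = stack.pop()
        s = score(node, relevant, beta)
        if s > best_score:
            best, best_score = node, s
        stack.extend(node.children)
    return best, best_score


# Toy hierarchy: a root with two children, one of which is split again.
a1 = Cluster("A1", frozenset({1, 2}))
a2 = Cluster("A2", frozenset({3, 4, 5}))
a = Cluster("A", frozenset({1, 2, 3, 4, 5}), [a1, a2])
b = Cluster("B", frozenset({6, 7, 8, 9, 10}))
root = Cluster("root", frozenset(range(1, 11)), [a, b])

relevant = frozenset({1, 2, 3})  # assumed relevant set for one review

balanced, _ = best_cluster(root, relevant, beta=1.0)
recall_heavy, _ = best_cluster(root, relevant, beta=2.0)
print(balanced.name, recall_heavy.name)
```

In this toy case a balanced weighting selects the small, precise cluster `A1`, while a recall-heavy weighting moves up the tree to the broader cluster `A`. This is the kind of recall/precision trade-off the abstract describes, under the stated assumptions.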