Paper title
Multi-stage Information Retrieval for Vietnamese Legal Texts
Paper authors
Paper abstract
This study addresses information retrieval (IR) for Vietnamese legal texts. Although IR has been well studied for many languages, it has received little attention from the Vietnamese research community. This is especially true for legal documents, which are hard to process. The study proposes a new approach to information retrieval for Vietnamese legal documents using a sentence-transformer. In addition, various experiments compare different transformer models, ranking scores, and syllable-level versus word-level training. The experimental results show that the proposed model outperforms models used in current research on information retrieval for Vietnamese documents.