Paper Title

Transformer-based Approaches for Legal Text Processing

Authors

Ha-Thanh Nguyen, Minh-Phuong Nguyen, Thi-Hai-Yen Vuong, Minh-Quan Bui, Minh-Chau Nguyen, Tran-Binh Dang, Vu Tran, Le-Minh Nguyen, Ken Satoh

Abstract

In this paper, we introduce our approaches using Transformer-based models for different problems of the COLIEE 2021 automatic legal text processing competition. Automated processing of legal documents is a challenging task because of the characteristics of legal documents as well as the limited amount of data. Through detailed experiments, we found that Transformer-based pretrained language models can perform well on automated legal text processing problems when applied with appropriate approaches. We describe in detail the processing steps for each task, such as problem formulation, data processing and augmentation, pretraining, and fine-tuning. In addition, we introduce to the community two pretrained models that take advantage of parallel translations in the legal domain, NFSP and NMSP. Among them, NFSP achieves the state-of-the-art result in Task 5 of the competition. Although the paper focuses on technical reporting, the novelty of its approaches can also serve as a useful reference for automated legal document processing using Transformer-based models.
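
The abstract mentions fine-tuning pretrained Transformer models for the competition's legal text processing tasks. As a rough illustration of what such a fine-tuning setup can look like (not the authors' actual pipeline, which is described in the paper itself), the sketch below fine-tunes a generic multilingual BERT checkpoint on toy statute/query entailment pairs with the Hugging Face `transformers` library; the checkpoint name, data, and hyperparameters are placeholder assumptions.

```python
# Hypothetical sketch: fine-tuning a pretrained Transformer for a COLIEE-style
# entailment task (statute article + query -> entail / not entail).
# Checkpoint, data, and hyperparameters are illustrative assumptions only.
import torch
from torch.utils.data import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

class LegalPairDataset(Dataset):
    """Wraps (article, query) pairs with binary entailment labels."""
    def __init__(self, pairs, labels, tokenizer, max_len=512):
        self.enc = tokenizer([a for a, _ in pairs], [q for _, q in pairs],
                             truncation=True, padding="max_length",
                             max_length=max_len)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

# Placeholder checkpoint; the paper's NFSP/NMSP models are not on this path.
checkpoint = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Toy data standing in for real article/query pairs.
train_pairs = [("A person who causes damage ...", "Is the obligor liable for damages?"),
               ("A contract concluded by a minor ...", "Is the sale automatically void?")]
train_labels = [1, 0]
train_ds = LegalPairDataset(train_pairs, train_labels, tokenizer)

args = TrainingArguments(output_dir="out", num_train_epochs=3,
                         per_device_train_batch_size=8, learning_rate=2e-5)
Trainer(model=model, args=args, train_dataset=train_ds).train()
```

In practice, steps such as data augmentation and domain-specific pretraining (as the abstract notes) would precede this fine-tuning stage; the snippet only shows the final supervised step under the stated assumptions.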
