Paper Title

Domain-specific Language Pre-training for Dialogue Comprehension on Clinical Inquiry-Answering Conversations

Authors

Zhengyuan Liu, Pavitra Krishnaswamy, Nancy F. Chen

Abstract

There is growing interest in the automated extraction of relevant information from clinical dialogues. However, it is difficult to collect and construct large annotated resources for clinical dialogue tasks. Recent developments in natural language processing suggest that large-scale pre-trained language backbones could be leveraged for such machine comprehension and information extraction tasks. Yet, due to the gap between pre-training and downstream clinical domains, it remains challenging to exploit the generic backbones for domain-specific applications. Therefore, in this work, we propose a domain-specific language pre-training, to improve performance on downstream tasks like dialogue comprehension. Aside from the common token-level masking pre-training method, according to the nature of human conversations and interactive flow of multi-topic inquiry-answering dialogues, we further propose sample generation strategies with speaker and utterance manipulation. The conversational pre-training guides the language backbone to reconstruct the utterances coherently based on the remaining context, thus bridging the gap between general and specific domains. Experiments are conducted on a clinical conversation dataset for symptom checking, where nurses inquire and discuss symptom information with patients. We empirically show that the neural model with our proposed approach brings improvement in the dialogue comprehension task, and can achieve favorable results in the low resource training scenario.
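The abstract names token-level masking and sample generation with speaker and utterance manipulation, but it does not spell out the exact procedure. The following is only a minimal sketch of how such corrupted pre-training samples might be built, not the authors' implementation. The function names, the masking rates, the `[MASK]` placeholder, whitespace tokenization, and the example dialogue are all assumptions made for illustration.

```python
import random

MASK = "[MASK]"  # assumed placeholder symbol; the paper's actual token is not given here


def mask_tokens(dialogue, rate, rng):
    """Token-level masking: replace each whitespace token with MASK with probability `rate`."""
    out = []
    for speaker, utterance in dialogue:
        tokens = [MASK if rng.random() < rate else tok for tok in utterance.split()]
        out.append((speaker, " ".join(tokens)))
    return out


def mask_speakers(dialogue, rate, rng):
    """Speaker manipulation (assumed form): hide the speaker label with probability `rate`."""
    return [
        (MASK if rng.random() < rate else speaker, utterance)
        for speaker, utterance in dialogue
    ]


def mask_utterance(dialogue, rng):
    """Utterance manipulation (assumed form): blank out one whole, randomly chosen turn."""
    idx = rng.randrange(len(dialogue))
    out = list(dialogue)
    speaker, utterance = out[idx]
    out[idx] = (speaker, " ".join([MASK] * len(utterance.split())))
    return out


def make_sample(dialogue, rng, token_rate=0.15, speaker_rate=0.1):
    """Return (corrupted, target); a model would be trained to reconstruct the target."""
    corrupted = mask_tokens(dialogue, token_rate, rng)
    corrupted = mask_speakers(corrupted, speaker_rate, rng)
    corrupted = mask_utterance(corrupted, rng)
    return corrupted, list(dialogue)


# Made-up nurse-patient exchange, for illustration only.
dialogue = [
    ("nurse", "do you have any chest pain today"),
    ("patient", "no but I feel short of breath at night"),
    ("nurse", "how many pillows do you sleep with"),
    ("patient", "two pillows"),
]

corrupted, target = make_sample(dialogue, random.Random(0))
for speaker, utterance in corrupted:
    print(f"{speaker}: {utterance}")
```

Each corrupted turn keeps the same number of tokens as the original, so the reconstruction target aligns position by position with the input. In this sketch the three corruptions are simply applied in sequence; whether they are combined or sampled separately in practice is a design choice the abstract leaves open.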
