A Neural Network Architecture for Program Understanding Inspired by Human Behaviors

Paper Title

A Neural Network Architecture for Program Understanding Inspired by Human Behaviors

Authors

Renyu Zhu, Lei Yuan, Xiang Li, Ming Gao, Wenyuan Cai

Abstract

Program understanding is a fundamental task in program language processing. Despite the success, existing works fail to take human behaviors as reference in understanding programs. In this paper, we consider human behaviors and propose the PGNN-EK model that consists of two main components. On the one hand, inspired by the "divide-and-conquer" reading behaviors of humans, we present a partitioning-based graph neural network model PGNN on the upgraded AST of codes. On the other hand, to characterize human behaviors of resorting to other resources to help code comprehension, we transform raw codes with external knowledge and apply pre-training techniques for information extraction. Finally, we combine the two embeddings generated from the two components to output code embeddings. We conduct extensive experiments to show the superior performance of PGNN-EK on the code summarization and code clone detection tasks. In particular, to show the generalization ability of our model, we release a new dataset that is more challenging for code clone detection and could advance the development of the community. Our codes and data are publicly available at https://github.com/RecklessRonan/PGNN-EK.
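
The two ideas in the abstract, partitioning an AST into smaller pieces and merging two embeddings into one, can be illustrated with a small Python sketch. This is not the authors' implementation: the partition rule (one subtree per top-level statement), the function names `partition_by_statement` and `combine`, and the use of plain concatenation are all assumptions made for illustration. The abstract does not specify how PGNN partitions the upgraded AST or how the two embeddings are combined.

```python
import ast


def partition_by_statement(source: str) -> list[list[str]]:
    """Split a module's AST into one subtree per top-level statement.

    Hypothetical partition rule, chosen only to illustrate the
    "divide-and-conquer" idea; the paper's actual rule may differ.
    Returns the node-type names found in each subtree.
    """
    tree = ast.parse(source)
    return [
        [type(node).__name__ for node in ast.walk(stmt)]
        for stmt in tree.body
    ]


def combine(graph_emb: list[float], text_emb: list[float]) -> list[float]:
    """Merge two embeddings by concatenation (an assumed combination)."""
    return graph_emb + text_emb


# Two top-level statements yield two partitions.
parts = partition_by_statement("x = 1\nfor i in range(3):\n    x += i\n")

# Placeholder vectors stand in for the graph-side and text-side embeddings.
code_embedding = combine([0.1, 0.2], [0.3, 0.4, 0.5])
```

In a real model, each partition would be encoded by a graph neural network and the text-side vector would come from a pre-trained language model; the lists here are only placeholders.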
