Paper Title
Application of Seq2Seq Models on Code Correction
Paper Authors
Paper Abstract
We apply various seq2seq models to programming language correction tasks on the Juliet Test Suite for C/C++ and Java from the Software Assurance Reference Dataset (SARD), and achieve repair rates of 75\% (for C/C++) and 56\% (for Java) on these tasks. We introduce a Pyramid Encoder into these seq2seq models, which substantially improves computational and memory efficiency while maintaining a repair rate similar to that of their non-pyramid counterparts. We also carry out an error type classification task on the ITC benchmark examples (only 685 code instances) using transfer learning with models pre-trained on the Juliet Test Suite, pointing to a novel way of handling small programming language datasets.