Paper Title
CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
Paper Authors
Paper Abstract
Program synthesis strives to generate a computer program as a solution to a given problem specification, expressed with input-output examples or natural language descriptions. The prevalence of large language models advances the state-of-the-art for program synthesis, though limited training resources and data impede open access to such models. To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER. We show the utility of the trained model by demonstrating that it is competitive with the previous state-of-the-art on zero-shot Python code generation on HumanEval. We further investigate the multi-step paradigm for program synthesis, where a single program is factorized into multiple prompts specifying subproblems. To this end, we construct an open benchmark, Multi-Turn Programming Benchmark (MTPB), consisting of 115 diverse problem sets that are factorized into multi-turn prompts. Our analysis on MTPB shows that the same intent provided to CODEGEN in multi-turn fashion significantly improves program synthesis over that provided as a single turn. We make the training library JAXFORMER and model checkpoints available as open-source contributions: https://github.com/salesforce/CodeGen.
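To make the multi-turn idea concrete, below is a minimal sketch of factorizing one program into several prompts, each specifying a subproblem, and letting the model extend the program turn by turn. It assumes the released Hugging Face checkpoints (here the small Salesforce/codegen-350M-mono) and the transformers library; the example turns and sampling settings are illustrative assumptions, not the paper's MTPB evaluation harness.

```python
# Sketch of multi-turn program synthesis with a released CodeGen checkpoint.
# Assumptions: transformers is installed; the turn prompts below are invented
# for illustration and are not drawn from MTPB.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Salesforce/codegen-350M-mono"  # a larger variant can be swapped in
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Each turn states one subproblem as a comment; the model writes the code.
turns = [
    "# Read a list of integers from standard input, one per line.",
    "# Keep only the even numbers.",
    "# Print the sum of the remaining numbers.",
]

program = ""
for turn in turns:
    prompt = program + turn + "\n"
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.2,
        pad_token_id=tokenizer.eos_token_id,
    )
    # The decoded output contains the prompt plus the new continuation,
    # so it becomes the context for the next turn.
    program = tokenizer.decode(output[0], skip_special_tokens=True) + "\n"

print(program)
```

The point of the loop is that each subproblem is conditioned on all code generated so far, which is the multi-turn setting the abstract contrasts with handing the model the entire intent as a single prompt.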