Paper Title
Learning to Find Proofs and Theorems by Learning to Refine Search Strategies: The Case of Loop Invariant Synthesis
Authors
Abstract
We propose a new approach to automated theorem proving where an AlphaZero-style agent is self-training to refine a generic high-level expert strategy expressed as a nondeterministic program. An analogous teacher agent is self-training to generate tasks of suitable relevance and difficulty for the learner. This allows leveraging minimal amounts of domain knowledge to tackle problems for which training data is unavailable or hard to synthesize. As a specific illustration, we consider loop invariant synthesis for imperative programs and use neural networks to refine both the teacher and solver strategies.
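For readers unfamiliar with the target problem, the following minimal sketch illustrates what a loop invariant is. It is not taken from the paper and does not show the authors' method: the function `sum_to` and the candidate invariant are hypothetical, chosen only for illustration, and the invariant is checked at runtime with assertions rather than proved or synthesized.

```python
def sum_to(n: int) -> int:
    """Return 0 + 1 + ... + (n - 1) for n >= 0, checking a loop invariant."""
    i, s = 0, 0
    # Hypothetical candidate invariant (an assumption for this example):
    #   0 <= i <= n  and  s == i * (i - 1) // 2
    while i < n:
        # The invariant must hold at the start of every iteration.
        assert 0 <= i <= n and s == i * (i - 1) // 2
        s += i
        i += 1
    # On exit, the invariant together with the negated guard (i >= n)
    # gives i == n, hence s == n * (n - 1) // 2.
    assert 0 <= i <= n and s == i * (i - 1) // 2
    return s


if __name__ == "__main__":
    print(sum_to(5))
```

An invariant synthesizer has to find a formula like the one in the comment automatically, strong enough to imply the desired postcondition once the loop exits. Here the runtime assertions only test the candidate on particular inputs; they do not establish it for all `n`.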