Paper Title
A Theory of Abstraction in Reinforcement Learning
Paper Author
Paper Abstract
Reinforcement learning defines the problem facing agents that learn to make good decisions through action and observation alone. To be effective problem solvers, such agents must efficiently explore vast worlds, assign credit from delayed feedback, and generalize to new experiences, all while making use of limited data, computational resources, and perceptual bandwidth. Abstraction is essential to all of these endeavors. Through abstraction, agents can form concise models of their environment that support the many practices required of a rational, adaptive decision maker. In this dissertation, I present a theory of abstraction in reinforcement learning. I first offer three desiderata for functions that carry out the process of abstraction: they should 1) preserve representation of near-optimal behavior, 2) be learned and constructed efficiently, and 3) lower planning or learning time. I then present a suite of new algorithms and analysis that clarify how agents can learn to abstract according to these desiderata. Collectively, these results provide a partial path toward the discovery and use of abstraction that minimizes the complexity of effective reinforcement learning.
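To make the notion of an abstraction function concrete, the sketch below illustrates one classic formalization from the state-abstraction literature: approximate Q*-irrelevance, which aggregates ground states whose optimal action-values agree within a tolerance epsilon. This is an illustrative example, not code from the dissertation; the toy Q-values and state names are made up.

```python
def q_star_irrelevance_abstraction(q_star, epsilon):
    """Group states whose Q*-vectors differ by at most epsilon per action.

    q_star: dict mapping state -> {action: optimal action-value}.
    Returns phi: dict mapping each ground state -> an abstract-state id.
    """
    phi = {}
    clusters = []  # one representative Q*-vector per abstract state
    for state, q_values in q_star.items():
        for cluster_id, rep in enumerate(clusters):
            # Merge into an existing abstract state if every action's
            # Q*-value is within epsilon of the representative's.
            if all(abs(q_values[a] - rep[a]) <= epsilon for a in rep):
                phi[state] = cluster_id
                break
        else:
            # No close-enough cluster: start a new abstract state.
            clusters.append(q_values)
            phi[state] = len(clusters) - 1
    return phi


# Toy data: s1 and s2 are behaviorally near-identical, s3 is not,
# so s1 and s2 collapse into one abstract state.
q = {
    "s1": {"left": 1.00, "right": 0.50},
    "s2": {"left": 0.98, "right": 0.52},
    "s3": {"left": 0.10, "right": 0.90},
}
phi = q_star_irrelevance_abstraction(q, epsilon=0.05)
```

Abstractions of this form connect directly to the first desideratum: bounding the per-action gap in Q* bounds how much value can be lost by planning over the abstract states instead of the ground states.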