Paper Title


Combining Reinforcement Learning and Tensor Networks, with an Application to Dynamical Large Deviations

Authors

Edward Gillman, Dominic C. Rose, Juan P. Garrahan

Abstract


We present a framework to integrate tensor network (TN) methods with reinforcement learning (RL) for solving dynamical optimisation tasks. We consider the RL actor-critic method, a model-free approach for solving RL problems, and introduce TNs as the approximators for its policy and value functions. Our "actor-critic with tensor networks" (ACTeN) method is especially well suited to problems with large and factorisable state and action spaces. As an illustration of the applicability of ACTeN we solve the exponentially hard task of sampling rare trajectories in two paradigmatic stochastic models, the East model of glasses and the asymmetric simple exclusion process (ASEP), the latter being particularly challenging to other methods due to the absence of detailed balance. With substantial potential for further integration with the vast array of existing RL methods, the approach introduced here is promising both for applications in physics and to multi-agent RL problems more generally.
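To make the abstract's central idea concrete, the sketch below is a minimal toy illustration of an actor-critic loop in which the policy is parameterised by a tensor-network-style (uniform matrix-product) amplitude over a small binary lattice. It is not the paper's ACTeN implementation: the Born-rule-style policy, the linear critic, the toy "empty the lattice" reward, and the finite-difference gradient of the MPS parameters are all simplifying assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
L, D = 6, 3                      # lattice length and MPS bond dimension (illustrative)
alpha_pi, alpha_v, gamma = 0.05, 0.1, 0.95

A = rng.normal(scale=0.5, size=(2, D, D))  # shared MPS site tensors A[s], s in {0, 1}
w = np.zeros(L)                            # linear value-function weights (critic)

def amplitude(x, A):
    """MPS-style amplitude psi(x) = Tr[ A[x_0] A[x_1] ... A[x_{L-1}] ]."""
    M = np.eye(D)
    for s in x:
        M = M @ A[s]
    return np.trace(M)

def flip(x, a):
    """Return x with site a flipped (actions = single-site flips)."""
    y = x.copy()
    y[a] ^= 1
    return y

def policy(x, A):
    """pi(a|x) proportional to psi(flip(x, a))^2 (Born-rule-style, an assumption)."""
    sq = np.array([amplitude(flip(x, a), A) ** 2 for a in range(L)]) + 1e-12
    return sq / sq.sum()

def value(x, w):
    return w @ x

# One actor-critic run on a toy task: reward = number of empty sites.
x = rng.integers(0, 2, size=L)
for t in range(200):
    p = policy(x, A)
    a = rng.choice(L, p=p)
    x_new = flip(x, a)
    r = float(L - x_new.sum())                            # toy reward
    delta = r + gamma * value(x_new, w) - value(x, w)     # TD error (critic)
    w += alpha_v * delta * x                              # critic update
    # Actor update: ascend delta * grad log pi(a|x). The gradient of the MPS
    # parameters is estimated by finite differences here, purely for brevity;
    # a real implementation would contract the network analytically.
    eps = 1e-4
    grad = np.zeros_like(A)
    for idx in np.ndindex(*A.shape):
        Ap = A.copy()
        Ap[idx] += eps
        grad[idx] = (np.log(policy(x, Ap)[a]) - np.log(p[a])) / eps
    A += alpha_pi * delta * grad
    x = x_new

print("mean occupation after training:", x.mean())
```

The factorised structure the abstract refers to shows up in `amplitude`: the state space of size 2^L is never enumerated, only L small matrix products per evaluation, which is what makes TN approximators attractive for large, factorisable state and action spaces.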
