Paper Title
Weakly-Supervised Reinforcement Learning for Controllable Behavior
Paper Authors

Paper Abstract
Reinforcement learning (RL) is a powerful framework for learning to take actions to solve tasks. However, in many settings, an agent must winnow down the inconceivably large space of all possible tasks to the single task that it is currently being asked to solve. Can we instead constrain the space of tasks to those that are semantically meaningful? In this work, we introduce a framework for using weak supervision to automatically disentangle this semantically meaningful subspace of tasks from the enormous space of nonsensical "chaff" tasks. We show that this learned subspace enables efficient exploration and provides a representation that captures distance between states. On a variety of challenging, vision-based continuous control problems, our approach leads to substantial performance gains, particularly as the complexity of the environment grows.
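
The abstract does not specify how the weak supervision or the learned subspace is implemented. Below is a minimal, self-contained sketch of one plausible reading: weak labels take the form of pairwise comparisons of underlying factors between two observations, an encoder is trained so each latent dimension orders states consistently with one factor, and the resulting latent space is then used both to sample exploration goals and to measure distance to a goal. All names here (`WeakEncoder`, `rank_loss`, the uniform goal box) are illustrative assumptions, not the paper's actual method or code.

```python
# Minimal sketch (assumed design, not the paper's implementation):
# 1) learn a low-dimensional latent space from weak pairwise-comparison labels,
# 2) sample goals inside that space and reward progress by latent distance.
import torch
import torch.nn as nn

class WeakEncoder(nn.Module):
    """Maps raw observations to a low-dimensional latent code z."""
    def __init__(self, obs_dim: int, latent_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

def rank_loss(encoder, obs1, obs2, labels):
    """Weak supervision via comparisons: labels[:, k] = 1 if factor k is
    larger in obs1 than in obs2. Training each latent dimension to predict
    its comparison encourages the latents to align with the factors."""
    z1, z2 = encoder(obs1), encoder(obs2)
    logits = z1 - z2  # positive logit => predict factor is larger in obs1
    return nn.functional.binary_cross_entropy_with_logits(logits, labels)

# Train on weakly-labeled observation pairs (random data as a stand-in
# for real image embeddings and human comparison labels).
obs_dim, latent_dim, batch = 64, 4, 32
enc = WeakEncoder(obs_dim, latent_dim)
opt = torch.optim.Adam(enc.parameters(), lr=1e-3)
for step in range(200):
    o1, o2 = torch.randn(batch, obs_dim), torch.randn(batch, obs_dim)
    y = torch.randint(0, 2, (batch, latent_dim)).float()  # weak labels
    loss = rank_loss(enc, o1, o2, y)
    opt.zero_grad()
    loss.backward()
    opt.step()

def sample_latent_goal() -> torch.Tensor:
    """Propose exploration goals only inside the learned subspace,
    rather than over the enormous space of raw-observation 'tasks'."""
    return torch.rand(latent_dim) * 2 - 1  # uniform over a bounded box

def latent_distance_reward(obs: torch.Tensor, goal_z: torch.Tensor) -> float:
    """Use latent distance as a shaped reward: the representation is what
    supplies the notion of 'distance between states'."""
    with torch.no_grad():
        return -torch.norm(enc(obs.unsqueeze(0)).squeeze(0) - goal_z).item()
```

A goal-conditioned RL agent would plug `sample_latent_goal` into its goal proposal step and `latent_distance_reward` into its reward; the key point the sketch illustrates is that both exploration and distance estimation operate in the small weakly-supervised latent space instead of the raw observation space.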