Paper Title
Multi-Phase Multi-Objective Dexterous Manipulation with Adaptive Hierarchical Curriculum
Paper Authors
Paper Abstract
Dexterous manipulation tasks usually have multiple objectives, and the priorities of these objectives may vary across the phases of a task. Such varying priorities make it difficult, or even impossible, for a robot to learn an optimal policy with a deep reinforcement learning (DRL) method. To solve this problem, we develop a novel Adaptive Hierarchical Reward Mechanism (AHRM) that guides the DRL agent in learning manipulation tasks with multiple prioritized objectives. The AHRM determines the objective priorities during learning and updates the reward hierarchy to adapt to the changing priorities at different phases. The proposed method is validated on a multi-objective manipulation task with a JACO robot arm, in which the robot must manipulate a target surrounded by obstacles. Simulation and physical experiment results show that the proposed method improves robot learning in both task performance and learning efficiency.