Title
Robotic Grasp Manipulation Using Evolutionary Computing and Deep Reinforcement Learning
Authors
Abstract
Intelligent object manipulation for grasping is a challenging problem for robots. Unlike robots, humans know almost immediately how to manipulate objects for grasping, thanks to years of learning. A grown woman can grasp objects more skilfully than a child because of skills developed over years; the absence of such learning in present-day robotic grasping causes it to perform well below human object-grasping benchmarks. In this paper we take up the challenge of developing learning-based pose estimation by decomposing the problem into position and orientation learning. More specifically, for grasp position estimation we explore three different methods: a Genetic Algorithm (GA) based optimization method that minimizes the error between calculated image points and the predicted end-effector (EE) position; a regression-based method (RM) in which collected data points of robot EE positions and image points are regressed with a linear model; and a PseudoInverse (PI) model formulated as a mapping matrix between robot EE positions and image points over several observations. For grasp orientation learning, we develop a deep reinforcement learning (DRL) model, which we name Grasp Deep Q-Network (GDQN), and benchmark our results against a Modified VGG16 (MVGG16). Rigorous experiments show that, owing to the GA's inherent capability of producing very high-quality solutions to optimization and search problems, the GA-based predictor performs much better than the other two models for position estimation. For orientation learning, the results indicate that off-policy learning through GDQN outperforms MVGG16, since the GDQN architecture is specifically designed for reinforcement learning. With our proposed architectures and algorithms, the robot is capable of grasping all rigid objects with regular shapes.
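The PseudoInverse (PI) model described above can be illustrated with a minimal sketch: given paired observations of image points and robot EE positions, a linear mapping matrix is fitted via the Moore-Penrose pseudo-inverse. The function names, the homogeneous-coordinate parameterization, and the synthetic data below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def fit_mapping(image_pts, ee_pts):
    """Fit a mapping matrix M such that ee ≈ [u, v, 1] @ M,
    using the Moore-Penrose pseudo-inverse (least-squares fit)."""
    # Augment image points with a constant column (homogeneous coordinates)
    X = np.hstack([image_pts, np.ones((len(image_pts), 1))])
    # M = pinv(X) @ Y solves the linear system in the least-squares sense
    return np.linalg.pinv(X) @ ee_pts

def predict(M, image_pt):
    """Map a single image point (u, v) to a predicted EE position."""
    u, v = image_pt
    return np.array([u, v, 1.0]) @ M

# Synthetic example: EE position is an affine function of image coordinates
rng = np.random.default_rng(0)
img = rng.uniform(0, 640, size=(20, 2))              # 20 image points (pixels)
true_M = np.array([[0.001, 0.0],                      # hypothetical ground-truth map
                   [0.0, 0.001],
                   [0.1, 0.2]])
ee = np.hstack([img, np.ones((20, 1))]) @ true_M      # corresponding EE positions
M = fit_mapping(img, ee)
recovered = predict(M, img[0])
```

On noise-free affine data the fitted matrix recovers the mapping exactly; with real calibration data the pseudo-inverse instead gives the least-squares-optimal linear map, which is the sense in which the PI model is formulated from several observations.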