Paper Title
Solving The Lunar Lander Problem under Uncertainty using Reinforcement Learning
Authors
Abstract
Reinforcement Learning (RL) is an area of machine learning concerned with enabling an agent to navigate an environment with uncertainty in order to maximize some notion of cumulative long-term reward. In this paper, we implement and analyze two different RL techniques, Sarsa and Deep Q-Learning, on OpenAI Gym's LunarLander-v2 environment. We then introduce additional uncertainty to the original problem to test the robustness of these techniques. With our best models, we achieve average rewards of 170+ with the Sarsa agent and 200+ with the Deep Q-Learning agent on the original problem. We also show that both techniques are able to overcome the additional uncertainties, achieving positive average rewards of 100+ with both agents. Finally, we perform a comparative analysis of the two techniques to determine which agent performs better.
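The two techniques named in the abstract differ mainly in how they form the temporal-difference target. The following is a minimal sketch of that difference in its textbook tabular form, not the implementation used in the paper: the function names, the discount `gamma`, the step size `alpha`, and the example Q-values are all illustrative assumptions. Deep Q-Learning would additionally replace the table lookup with a neural-network approximation of Q, which is not shown here.

```python
# Illustrative sketch only: textbook TD targets, not the paper's code.
# All names and default values below are assumptions made for this example.

def sarsa_target(reward, next_q_values, next_action, gamma=0.99, done=False):
    """On-policy target: bootstrap from the action the agent actually takes next."""
    if done:
        return reward  # no bootstrapping from a terminal state
    return reward + gamma * next_q_values[next_action]


def q_learning_target(reward, next_q_values, gamma=0.99, done=False):
    """Off-policy target: bootstrap from the greedy (highest-valued) next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def td_update(q_value, target, alpha=0.1):
    """Move the current estimate a fraction alpha toward the target."""
    return q_value + alpha * (target - q_value)


if __name__ == "__main__":
    # Hypothetical Q-values for the next state, one per discrete action.
    next_q = [1.0, 3.0, 2.0, 0.5]
    # Suppose an exploratory policy picked action 2 rather than the greedy action 1.
    print(sarsa_target(1.0, next_q, next_action=2, gamma=0.5))
    print(q_learning_target(1.0, next_q, gamma=0.5))
```

When the next action is exploratory rather than greedy, the Sarsa target is lower than the Q-Learning target, which is the usual intuition for why the two agents can behave differently under added noise.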