Paper Title
Deep Reinforcement Learning for Electric Vehicle Routing Problem with Time Windows
Authors
Abstract
The past decade has seen a rapid penetration of electric vehicles (EVs) into the market, and more and more logistics and transportation companies have started to deploy EVs for service provision. To model the operations of a commercial EV fleet, we use the EV routing problem with time windows (EVRPTW). In this research, we propose an end-to-end deep reinforcement learning framework to solve the EVRPTW. In particular, we develop an attention model incorporating the pointer network and a graph embedding technique to parameterize a stochastic policy for solving the EVRPTW. The model is then trained using policy gradient with a rollout baseline. Our numerical studies show that the proposed model is able to efficiently solve large EVRPTW instances that are not solvable with any existing approaches.
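The training signal named in the abstract, policy gradient with a baseline, can be illustrated on a single decision. The following is a minimal sketch, not the paper's implementation: it assumes a softmax policy over a handful of candidate next nodes, and the names `softmax`, `policy_gradient_step`, `cost`, and `baseline_cost` are hypothetical. In a rollout-baseline scheme, `baseline_cost` would typically be the route cost obtained by rolling out a separate baseline policy on the same instance; here it is simply passed in as a number.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(logits, action, cost, baseline_cost, lr=0.1):
    """One REINFORCE-with-baseline update on the logits of a softmax policy.

    The loss gradient is (cost - baseline_cost) * d/dlogits log p(action),
    where d/dlogits log p(action) = one_hot(action) - probs for a softmax.
    All argument names are illustrative assumptions, not the paper's notation.
    """
    probs = softmax(logits)
    advantage = cost - baseline_cost  # positive: sampled choice was worse than baseline
    grad = [
        advantage * ((1.0 if i == action else 0.0) - p)
        for i, p in enumerate(probs)
    ]
    # Gradient descent on expected cost: worse-than-baseline actions become less likely.
    return [x - lr * g for x, g in zip(logits, grad)]


if __name__ == "__main__":
    start = [0.0, 0.0, 0.0]
    worse = policy_gradient_step(start, action=0, cost=5.0, baseline_cost=3.0)
    better = policy_gradient_step(start, action=0, cost=2.0, baseline_cost=3.0)
    print(softmax(worse)[0], softmax(better)[0])
```

Subtracting the baseline does not change the expected gradient but reduces its variance: an action is reinforced only when it beats the baseline's cost, and discouraged when it does worse.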