Paper Title

UAV-to-Device Underlay Communications: Age of Information Minimization by Multi-agent Deep Reinforcement Learning

Authors

Fanyi Wu, Hongliang Zhang, Jianjun Wu, Lingyang Song, Zhu Han, H. Vincent Poor

Abstract

In recent years, unmanned aerial vehicles (UAVs) have found numerous sensing applications, which are expected to add billions of dollars to the world economy in the next decade. To further improve the Quality-of-Service (QoS) in such applications, the 3rd Generation Partnership Project (3GPP) has considered the adoption of terrestrial cellular networks to support UAV sensing services, also known as the cellular Internet of UAVs. In this paper, we consider a cellular Internet of UAVs, where the sensory data can be transmitted either to the base station (BS) via cellular links, or to mobile devices by underlay UAV-to-Device (U2D) communications. To evaluate the freshness of data, the age of information (AoI) is adopted, where a lower AoI implies fresher data. Since UAVs' AoIs are determined by their trajectories during sensing and transmission, we investigate the AoI minimization problem for UAVs by designing their trajectories. This problem is a Markov decision problem (MDP) with an infinite state-action space, and thus we utilize multi-agent deep reinforcement learning (DRL) to approximate the state-action space. Then, we propose a multi-UAV trajectory design algorithm to solve this problem. Simulation results show that our algorithm achieves a lower AoI than the greedy algorithm and the policy gradient algorithm.
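
To make the AoI metric concrete, the sketch below computes it with the definition commonly used in the AoI literature: the time elapsed since the freshest received update was generated. This is an illustration only. The function name `age_of_information`, the `(generated_at, received_at)` tuple layout, and the sample timestamps are assumptions, and the paper's own AoI formulation for sensing tasks may differ.

```python
from typing import Iterable, Optional, Tuple


def age_of_information(
    now: float, deliveries: Iterable[Tuple[float, float]]
) -> Optional[float]:
    """Return the AoI at time `now` (hypothetical helper, not from the paper).

    `deliveries` holds (generated_at, received_at) pairs for each update.
    AoI = now - generation time of the freshest update received by `now`.
    Returns None if no update has arrived yet.
    """
    # Only updates that have already arrived can refresh the receiver.
    generated = [g for g, r in deliveries if r <= now]
    if not generated:
        return None
    return now - max(generated)


# Assumed sample data: two updates, sensed at t=0.0 and t=3.0,
# delivered at t=2.0 and t=4.5 respectively.
updates = [(0.0, 2.0), (3.0, 4.5)]
aoi_before_second_arrival = age_of_information(4.0, updates)
aoi_after_second_arrival = age_of_information(5.0, updates)
```

In this sketch the AoI drops once the second update arrives, which matches the abstract's point that a lower AoI means fresher data: shortening the time a UAV spends sensing and transmitting reduces the gap between generation and reception.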
