Paper Title

Multiple Access in Dynamic Cell-Free Networks: Outage Performance and Deep Reinforcement Learning-Based Design

Paper Authors

Yasser Al-Eryani, Mohamed Akrout, Ekram Hossain

Paper Abstract

In future cell-free (or cell-less) wireless networks, a large number of devices in a geographical area will be served simultaneously in non-orthogonal multiple access scenarios by a large number of distributed access points (APs), which coordinate with a centralized processing pool. For such a centralized cell-free network with a static, predefined beamforming design, we first derive a closed-form expression for the uplink per-user probability of outage. To significantly reduce the complexity of joint processing of users' signals in the presence of a large number of devices and APs, we propose a novel dynamic cell-free network architecture. In this architecture, the distributed APs are partitioned (i.e., clustered) into a set of subgroups, with each subgroup acting as a virtual AP equipped with a distributed antenna system (DAS). The conventional static cell-free network is a special case of this dynamic cell-free network when the cluster size is one. For this dynamic cell-free network, we propose a successive interference cancellation (SIC)-enabled signal detection method and an inter-user-interference (IUI)-aware receive diversity combining scheme for the DAS. We then formulate the general problem of clustering APs and designing the beamforming vectors with the objective of maximizing the sum rate or maximizing the minimum rate. To this end, we propose a hybrid deep reinforcement learning (DRL) model, namely, a deep deterministic policy gradient (DDPG)-deep double Q-network (DDQN) model, to solve the optimization problem for online implementation with low complexity. The DRL model for sum-rate optimization significantly outperforms that for maximizing the minimum rate in terms of average per-user rate performance. Also, in our system setting, the proposed DDPG-DDQN scheme is found to achieve around $78\%$ of the rate achievable through an exhaustive search-based design.
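The abstract pairs a discrete decision (clustering APs into subgroups) with a continuous one (designing beamforming vectors), which is why the authors combine a DDQN (discrete actions) with DDPG (continuous actions). The following is a minimal numpy sketch of how such a hybrid action split could look at inference time; it is not the paper's implementation, and all dimensions, variable names, and the toy linear "policies" are hypothetical stand-ins for trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)

N_APS = 6       # number of distributed APs (hypothetical toy size)
N_CLUSTERS = 2  # number of AP subgroups, i.e., virtual APs
N_USERS = 3
BEAM_DIM = N_APS  # one beamforming weight per AP, per user

# DDQN-style discrete head: Q-values over cluster assignments for each AP.
# In the paper this would come from a trained deep double Q-network;
# here it is random, purely for illustration.
q_values = rng.normal(size=(N_APS, N_CLUSTERS))

def select_clustering(q, eps=0.1):
    """Epsilon-greedy discrete action: assign each AP to a cluster."""
    greedy = q.argmax(axis=1)
    explore = rng.integers(0, q.shape[1], size=q.shape[0])
    mask = rng.random(q.shape[0]) < eps
    return np.where(mask, explore, greedy)

# DDPG-style continuous head: a deterministic policy mapping channel
# state to beamforming weights (a random linear map stands in for the
# trained actor network).
W_policy = rng.normal(size=(BEAM_DIM * N_USERS, N_APS * N_USERS))

def select_beamforming(channel_state):
    """Deterministic continuous action, normalized to unit power per user."""
    raw = (W_policy @ channel_state).reshape(N_USERS, BEAM_DIM)
    return raw / np.linalg.norm(raw, axis=1, keepdims=True)

channel_state = rng.normal(size=N_APS * N_USERS)  # flattened channel estimates
clustering = select_clustering(q_values)          # discrete: shape (N_APS,)
beams = select_beamforming(channel_state)         # continuous: (N_USERS, BEAM_DIM)
print(clustering.shape, beams.shape)
```

The key design point is that the two heads act on the same observed state but emit actions in different spaces, so each can be trained with the loss appropriate to its action type (TD error with a target network for the DDQN head, deterministic policy gradient for the DDPG head).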
