Policy Gradients using Variational Quantum Circuits

Paper title

Policy Gradients using Variational Quantum Circuits

Paper authors

Sequeira, André; Santos, Luis Paulo; Barbosa, Luís Soares

Paper abstract

Variational Quantum Circuits are being used as versatile Quantum Machine Learning models. Some empirical results exhibit an advantage in supervised and generative learning tasks. However, less is known about their behavior when applied to Reinforcement Learning. In this work, we consider a Variational Quantum Circuit composed of a low-depth hardware-efficient ansatz as the parameterized policy of a Reinforcement Learning agent. We show that an $ε$-approximation of the policy gradient can be obtained using a number of samples that is logarithmic in the total number of parameters. We empirically verify that such quantum models behave similarly to, or even outperform, typical classical neural networks used in standard benchmarking environments and in quantum control, using only a fraction of the parameters. Moreover, we study the Barren Plateau phenomenon in quantum policy gradients using the Fisher Information Matrix spectrum.

Paper title