Paper Title

Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization

Paper Authors

Kaixuan Huang, Yu Wu, Xuezhou Zhang, Shenyinying Tu, Qingyun Wu, Mengdi Wang, Huazheng Wang

Abstract

Online influence maximization aims to maximize the influence spread of content in a social network with an unknown network model by selecting a few seed nodes. Recent studies follow a non-adaptive setting, where the seed nodes are selected before the diffusion process starts and the network parameters are updated when the diffusion stops. We consider an adaptive version of the content-dependent online influence maximization problem, where the seed nodes are sequentially activated based on real-time feedback. In this paper, we formulate the problem as an infinite-horizon discounted MDP under a linear diffusion process and present a model-based reinforcement learning solution. Our algorithm maintains a network model estimate and adaptively selects seed users, exploring the social network while optimistically improving toward the optimal policy. We establish an $\widetilde{O}(\sqrt{T})$ regret bound for our algorithm. Empirical evaluations on synthetic networks demonstrate the efficiency of our algorithm.
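The abstract describes the approach only at a high level: maintain an estimate of a linear diffusion model and pick seed users optimistically from real-time feedback. The snippet below is a minimal, hypothetical sketch of that general idea, not the paper's actual algorithm: it assumes each edge activates with a Bernoulli probability that is linear in a known edge feature vector, keeps a ridge-regression estimate of the diffusion parameter with a UCB-style bonus, and scores candidate seeds by their optimistic one-hop spread. The class name, the constant bonus width `beta`, and the one-hop scoring rule are all illustrative assumptions.

```python
import numpy as np

class OptimisticLinearDiffusionAgent:
    """Hypothetical sketch: a ridge-regression estimate of a linear
    diffusion parameter, with optimistic (UCB-style) edge-activation
    estimates used to score candidate seed nodes."""

    def __init__(self, dim, lam=1.0, beta=1.0):
        self.beta = beta                 # width of the optimism bonus (assumed constant)
        self.V = lam * np.eye(dim)       # regularized Gram matrix of observed edge features
        self.b = np.zeros(dim)           # feature-weighted activation outcomes

    def optimistic_prob(self, x_edge):
        """Upper-confidence estimate of one edge's activation probability."""
        theta_hat = np.linalg.solve(self.V, self.b)
        bonus = self.beta * np.sqrt(x_edge @ np.linalg.solve(self.V, x_edge))
        return float(np.clip(x_edge @ theta_hat + bonus, 0.0, 1.0))

    def select_seed(self, candidates, out_edges):
        """Pick the candidate whose optimistic one-hop spread is largest.
        `out_edges[u]` is a list of feature vectors of u's outgoing edges."""
        scores = {u: sum(self.optimistic_prob(x) for x in out_edges[u])
                  for u in candidates}
        return max(scores, key=scores.get)

    def update(self, observed_edges):
        """Incorporate real-time feedback: (edge feature, activated?) pairs."""
        for x_edge, activated in observed_edges:
            self.V += np.outer(x_edge, x_edge)
            self.b += float(activated) * x_edge
```

In this toy version the agent would be called in a loop: select a seed, observe which outgoing edges actually fired, call `update`, and repeat; the paper's formulation instead treats the whole process as an infinite-horizon discounted MDP and reasons about long-run spread rather than one-hop scores.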
