Optimising Lockdown Policies for Epidemic Control using Reinforcement Learning

Paper Title

Optimising Lockdown Policies for Epidemic Control using Reinforcement Learning

Authors

Harshad Khadilkar, Tanuja Ganu, Deva P Seetharam

Abstract

In the context of the ongoing Covid-19 pandemic, several reports and studies have attempted to model and predict the spread of the disease. There is also intense debate about policies for limiting the damage, both to health and to the economy. On the one hand, the health and safety of the population is the principal consideration for most countries. On the other hand, we cannot ignore the potential for long-term economic damage caused by strict nation-wide lockdowns. In this working paper, we present a quantitative way to compute lockdown decisions for individual cities or regions, while balancing health and economic considerations. Furthermore, these policies are learnt automatically by the proposed algorithm, as a function of disease parameters (infectiousness, gestation period, duration of symptoms, probability of death) and population characteristics (density, movement propensity). We account for realistic considerations such as imperfect lockdowns, and show that the policy obtained using reinforcement learning is a viable quantitative approach towards lockdowns.

Paper Title