Paper Title

A Deep Q-Learning/Genetic Algorithms Based Novel Methodology for Optimizing COVID-19 Pandemic Government Actions

Authors

Luis Miralles-Pechuán, Fernando Jiménez, Hiram Ponce, Lourdes Martínez-Villaseñor

Abstract

Whenever countries are threatened by a pandemic, as is the case with COVID-19, governments should take the right actions to safeguard public health and to mitigate the negative effects on the economy. Governments can take two completely different approaches: a restrictive one, in which drastic measures such as self-isolation can seriously damage the economy, and a more liberal one, in which more relaxed restrictions may put a high percentage of the population at risk. The optimal approach could lie somewhere in between, and making the right decisions requires accurately estimating the future effects of each measure. In this paper, we use the SEIR epidemiological model (Susceptible - Exposed - Infected - Recovered) for infectious diseases to represent the evolution of the COVID-19 virus in the population over time. To find the best sequences of actions governments can take, we propose a methodology with two approaches, one based on Deep Q-Learning and the other based on Genetic Algorithms. The sequences of actions (confinement, self-isolation, two-meter distance, or no restrictions) are evaluated according to a reward system focused on two objectives: first, keeping the number of infected people low so that hospitals are not overwhelmed with critical patients, and second, avoiding drastic measures that last too long and can seriously damage the economy. The experiments show that our methodology is a valid tool for discovering actions governments can take to reduce the negative effects of a pandemic in both respects. They also show that the approach based on Deep Q-Learning outperforms the one based on Genetic Algorithms for optimizing the sequences of actions.
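The abstract does not give the model equations, parameter values, or reward function, so the following is only a minimal sketch of how a sequence of actions could be scored against a discrete-time SEIR model. Everything numeric here is an assumption made for illustration: the contact factors, the economic costs, the rates `beta`, `sigma`, and `gamma`, the `health_weight`, and the seven-day action length are hypothetical and are not taken from the paper. The names `SEIRState`, `seir_step`, and `evaluate` are likewise invented for this example.

```python
from dataclasses import dataclass

# Hypothetical multipliers applied to the transmission rate for each action.
CONTACT_FACTOR = {
    "no_restrictions": 1.0,
    "two_meter_distance": 0.7,
    "self_isolation": 0.4,
    "confinement": 0.15,
}

# Hypothetical daily economic cost of each action (arbitrary units).
ECONOMIC_COST = {
    "no_restrictions": 0.0,
    "two_meter_distance": 0.1,
    "self_isolation": 0.5,
    "confinement": 1.0,
}


@dataclass
class SEIRState:
    s: float  # susceptible
    e: float  # exposed
    i: float  # infected
    r: float  # recovered

    @property
    def population(self) -> float:
        return self.s + self.e + self.i + self.r


def seir_step(state, action, beta=0.5, sigma=0.2, gamma=0.1):
    """Advance the SEIR model by one day (forward Euler) under one action."""
    effective_beta = beta * CONTACT_FACTOR[action]
    new_exposed = effective_beta * state.s * state.i / state.population
    new_infected = sigma * state.e
    new_recovered = gamma * state.i
    return SEIRState(
        s=state.s - new_exposed,
        e=state.e + new_exposed - new_infected,
        i=state.i + new_infected - new_recovered,
        r=state.r + new_recovered,
    )


def evaluate(actions, state, days_per_action=7, health_weight=10.0):
    """Return (total reward, peak infected, final state) for an action sequence.

    The reward penalizes both the infected fraction of the population and
    the economic cost of the action in force on each simulated day.
    """
    reward = 0.0
    peak = state.i
    for action in actions:
        for _ in range(days_per_action):
            state = seir_step(state, action)
            peak = max(peak, state.i)
            health_penalty = health_weight * state.i / state.population
            reward -= health_penalty + ECONOMIC_COST[action]
    return reward, peak, state


start = SEIRState(s=9990.0, e=0.0, i=10.0, r=0.0)
reward_mixed, peak_mixed, _ = evaluate(
    ["confinement"] * 4 + ["two_meter_distance"] * 16, start
)
```

Under these assumptions, an optimizer such as a Deep Q-Learning agent or a Genetic Algorithm would search over action sequences for the one that maximizes the value returned by `evaluate`; the sketch covers only the evaluation step, not either optimizer.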
