Paper title
Forecasting Covid-19 dynamics in Brazil: a data driven approach
Paper authors
Paper abstract
This paper makes a twofold contribution. The first is a data-driven approach for predicting the dynamics of the Covid-19 pandemic, based on data from countries where the pandemic is more advanced. The second is to report and discuss the results obtained with this approach for Brazilian states, as of May 4th, 2020. We start by presenting preliminary results obtained by training an LSTM SAE network, which are somewhat disappointing. Our main approach then begins with a clustering of the world regions for which data is available and where the pandemic is at an advanced stage, based on a set of manually engineered features representing a country's response to the early spread of the pandemic. A Modified Auto-Encoder (MAE) network is then trained on these clusters and learns to predict future data for Brazilian states. These predictions are used to estimate important statistics about the disease, such as peaks. Finally, curve fitting is carried out on the predictions in order to find the distribution that best fits the outputs of the MAE and to refine the estimates of the pandemic peaks. Results indicate that the pandemic is still growing in Brazil, with the infection peaks of most states estimated between April 25th and May 19th, 2020. Predicted numbers reach a total of 240 thousand infected Brazilians, distributed among the different states, with São Paulo leading with almost 65 thousand estimated confirmed cases. The estimated end of the pandemic (with 97 percent of cases reaching an outcome) starts on May 28th for some states and extends through August 14th, 2020.
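The final curve-fitting step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which distributions were tried, so a Gaussian-shaped daily-case curve is assumed here purely for illustration, fitted through a quadratic fit to the logarithm of the counts. The function names `fit_gaussian_curve` and `peak_and_end_day` are hypothetical, the input is synthetic, and the 97 percent threshold is applied to the cumulative fitted curve as a simplified stand-in for "97 percent of cases reaching an outcome".

```python
import numpy as np


def fit_gaussian_curve(days, daily_cases):
    """Fit a * exp(-(t - mu)^2 / (2 * sigma^2)) to daily case counts.

    Assumption: a Gaussian shape. The log of a Gaussian is a parabola,
    so a degree-2 polynomial fit to log(cases) recovers the parameters.
    """
    days = np.asarray(days, dtype=float)
    cases = np.asarray(daily_cases, dtype=float)
    mask = cases > 0  # the logarithm is undefined for zero counts
    c2, c1, c0 = np.polyfit(days[mask], np.log(cases[mask]), 2)
    if c2 >= 0:
        raise ValueError("data is not bell-shaped; Gaussian fit is not meaningful")
    sigma = np.sqrt(-1.0 / (2.0 * c2))
    mu = -c1 / (2.0 * c2)
    amplitude = np.exp(c0 - c1**2 / (4.0 * c2))
    return amplitude, mu, sigma


def peak_and_end_day(amplitude, mu, sigma, horizon=365, fraction=0.97):
    """Return the peak day and the first day on which the cumulative
    fitted curve reaches `fraction` of its total over the horizon."""
    t = np.arange(horizon)
    curve = amplitude * np.exp(-((t - mu) ** 2) / (2.0 * sigma**2))
    cumulative = np.cumsum(curve) / curve.sum()
    peak_day = int(t[np.argmax(curve)])
    end_day = int(t[np.searchsorted(cumulative, fraction)])
    return peak_day, end_day


if __name__ == "__main__":
    # Synthetic stand-in for a network's predicted daily cases (not real data).
    days = np.arange(80)
    predicted = 1000.0 * np.exp(-((days - 60.0) ** 2) / (2.0 * 10.0**2))
    a, mu, sigma = fit_gaussian_curve(days, predicted)
    print(peak_and_end_day(a, mu, sigma))
```

In practice one would probably compare several candidate distributions and keep the one with the lowest fitting error, which is closer to what the abstract describes; the log-quadratic fit is used here only because it needs nothing beyond NumPy.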