Paper Title

Static Seeding and Clustering of LSTM Embeddings to Learn from Loosely Time-Decoupled Events

Authors

Christian Manasseh, Razvan Veliche, Jared Bennett, Hamilton Clouse

Abstract

Humans learn from events that occur in a different place and time to predict similar trajectories of events. We define Loosely Decoupled Timeseries (LDT) phenomena as two or more events that could happen in different places and across different timelines but share similarities in the nature of the event and the properties of the location. In this work we improve on the use of Recurrent Neural Networks (RNN), in particular Long Short-Term Memory (LSTM) networks, to enable AI solutions that generate better timeseries predictions for LDT. We use similarity measures between timeseries based on their trends and introduce embeddings representing those trends. The embeddings represent properties of the event which, coupled with the LSTM structure, can be clustered to identify similar, temporally unaligned events. In this paper, we explore methods of seeding a multivariate LSTM from time-invariant data related to the geophysical and demographic phenomena being modeled by the LSTM. We apply these methods to timeseries data derived from detected COVID-19 infection and death cases. We use publicly available socio-economic data to seed the LSTM models, creating embeddings, to determine whether such seeding improves case predictions. The embeddings produced by these LSTMs are clustered to identify best-matching candidates for forecasting an evolving timeseries. Applying this method, we show an improvement in 10-day moving average predictions of disease propagation at the US county level.
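To make two of the ideas in the abstract concrete, the following is a minimal sketch, not the authors' method. It computes a 10-day moving average and then finds where a short series best matches a longer, temporally unaligned one by comparing trends. Several choices here are assumptions made only for illustration: the moving average is taken as a trailing window, the trend is represented by first differences, and cosine similarity stands in for the paper's similarity measure. The paper itself clusters LSTM embeddings; this sketch uses no LSTM at all, and the function names are hypothetical.

```python
from math import sqrt


def moving_average(series, window=10):
    """Trailing moving average (assumed form); one value per full window."""
    if window <= 0 or window > len(series):
        return []
    total = sum(series[:window])
    out = [total / window]
    for i in range(window, len(series)):
        # Slide the window forward by one step.
        total += series[i] - series[i - window]
        out.append(total / window)
    return out


def trend(series):
    """First differences: an assumed, very simple trend representation."""
    return [b - a for a, b in zip(series, series[1:])]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def best_alignment(query, reference):
    """Slide the query's trend along the reference's trend.

    Returns (offset, similarity) for the best-matching offset,
    or (None, -2.0) if the query is longer than the reference.
    """
    q, r = trend(query), trend(reference)
    best = (None, -2.0)
    for offset in range(len(r) - len(q) + 1):
        score = cosine(q, r[offset:offset + len(q)])
        if score > best[1]:
            best = (offset, score)
    return best


if __name__ == "__main__":
    # Hypothetical data: the query's growth pattern appears later in the reference.
    reference = [0, 0, 0, 1, 3, 6, 10, 10, 10]
    query = [1, 3, 6, 10]
    print(best_alignment(query, reference))
    print(moving_average(list(range(1, 13))))
```

In this toy setting the returned offset indicates where in the reference series the query's trend recurs, which is the sense in which two events can be similar while being temporally unaligned. Comparing learned embeddings rather than raw differences, as the abstract describes, would replace the `trend` and `cosine` steps.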
