Paper Title
Demystifying Deep Learning in Predictive Spatio-Temporal Analytics: An Information-Theoretic Framework
Paper Authors
Paper Abstract
Deep learning has achieved remarkable success in recent years, especially in challenging predictive spatio-temporal analytics (PSTA) tasks such as disease prediction, climate forecasting, and traffic prediction, where intrinsic dependencies among data exist and generally manifest at multiple spatio-temporal scales. However, given a specific PSTA task and its dataset, it remains unclear how to determine an appropriate configuration for a deep learning model, how to analyze the model's learning behavior theoretically, and how to characterize its learning capacity quantitatively. To demystify the power of deep learning for PSTA, this paper provides a comprehensive framework for deep learning model design and information-theoretic analysis. First, we develop and demonstrate a novel interactively- and integratively-connected deep recurrent neural network (I$^2$DRNN) model. I$^2$DRNN consists of three modules: an Input module that integrates data from heterogeneous sources; a Hidden module that captures information at different scales while allowing it to flow interactively between layers; and an Output module that models the integrative effects of information from the hidden layers to generate the output predictions. Second, to prove theoretically that the designed model can learn multi-scale spatio-temporal dependency in PSTA tasks, we provide an information-theoretic analysis that examines the information-based learning capacity (i-CAP) of the proposed model. Third, to validate the I$^2$DRNN model and confirm its i-CAP, we systematically conduct a series of experiments on both synthetic datasets and real-world PSTA tasks. The experimental results show that the I$^2$DRNN model outperforms both classical and state-of-the-art models and captures meaningful multi-scale spatio-temporal dependency.
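The three-module structure described in the abstract can be illustrated with a small forward-pass sketch. This is not the authors' implementation: the abstract gives no equations, so the class name `TinyI2DRNN`, the concatenation-based input fusion, the tanh update, the top-down connection from the layer above, and the linear readout over all hidden layers are all assumptions chosen only to make the "interactive" and "integrative" connections concrete. Weights are random and untrained.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyI2DRNN:
    """Hypothetical three-module stack (illustrative only, not the paper's model)."""

    def __init__(self, source_dims, hidden_dim, num_layers, out_dim, seed=0):
        rng = random.Random(seed)
        in_dim = sum(source_dims)
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers
        # Bottom-up weights: layer 0 reads the fused input, others read the layer below.
        self.W_up = [
            rand_matrix(hidden_dim, in_dim if l == 0 else hidden_dim, rng)
            for l in range(num_layers)
        ]
        # Recurrent weights: each layer reads its own previous state.
        self.W_rec = [rand_matrix(hidden_dim, hidden_dim, rng) for _ in range(num_layers)]
        # Top-down weights (assumed form of "interactive" flow): every layer
        # except the top one reads the previous state of the layer above it.
        self.W_down = [rand_matrix(hidden_dim, hidden_dim, rng) for _ in range(num_layers - 1)]
        # Output weights (assumed form of "integrative" readout): all layers at once.
        self.W_out = rand_matrix(out_dim, hidden_dim * num_layers, rng)

    def step(self, sources, prev):
        # Input module: fuse heterogeneous sources by simple concatenation.
        x = [v for src in sources for v in src]
        new = []
        for l in range(self.num_layers):
            below = x if l == 0 else new[l - 1]
            pre = matvec(self.W_up[l], below)
            pre = [a + b for a, b in zip(pre, matvec(self.W_rec[l], prev[l]))]
            if l + 1 < self.num_layers:
                down = matvec(self.W_down[l], prev[l + 1])
                pre = [a + b for a, b in zip(pre, down)]
            new.append([math.tanh(a) for a in pre])
        # Output module: linear readout over the concatenated hidden layers.
        y = matvec(self.W_out, [v for h in new for v in h])
        return y, new

    def run(self, sequence):
        """Process a sequence of time steps, starting from zero hidden states."""
        state = [[0.0] * self.hidden_dim for _ in range(self.num_layers)]
        outputs = []
        for sources in sequence:
            y, state = self.step(sources, state)
            outputs.append(y)
        return outputs


# Usage: two hypothetical sources (2 features and 1 feature) over three time steps.
model = TinyI2DRNN(source_dims=[2, 1], hidden_dim=4, num_layers=3, out_dim=1)
sequence = [([0.1, 0.2], [1.0]), ([0.3, 0.1], [0.0]), ([0.0, 0.5], [1.0])]
predictions = model.run(sequence)
```

In this sketch each hidden layer sees three signals per step (the layer below, its own past, and the past of the layer above), and the prediction depends on every layer rather than only the top one. Whether the paper's model uses exactly these connections cannot be determined from the abstract.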