Paper Title
Model-Free Prediction of Partially Observable Spatiotemporal Chaotic Systems
SHAPE: An Unified Approach to Evaluate the Contribution and Cooperation of Individual Modalities
Authors
Abstract
As deep learning advances, there is a growing demand for models that can synthesize information from multi-modal resources to address the complex tasks arising in real-life applications. Recently, many large multi-modal datasets have been collected, and researchers actively explore different methods of fusing multi-modal information on them. However, little attention has been paid to quantifying the contribution of each modality within the proposed models. In this paper, we propose the SHapley vAlue-based PErceptual (SHAPE) scores, which measure the marginal contribution of individual modalities and the degree of cooperation across modalities. Using these scores, we systematically evaluate different fusion methods on different multi-modal datasets for different tasks. Our experiments suggest that for some tasks where the modalities are complementary, multi-modal models still tend to use the dominant modality alone and ignore cooperation across modalities. On the other hand, models learn to exploit cross-modal cooperation when the different modalities are all indispensable for the task. In this case, the scores indicate that it is better to fuse the modalities at relatively early stages. We hope our scores can help improve the understanding of how present multi-modal models operate on different modalities and encourage more sophisticated methods of integrating multiple modalities.
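The abstract does not state the SHAPE formulas, so the following is only a minimal sketch of the textbook Shapley value applied to modalities, not the paper's exact scores. It assumes a value function that returns model accuracy when only a given subset of modalities is available. The modality names, the accuracy numbers, and the `pairwise_interaction` helper are hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(modalities, value):
    """Textbook Shapley value of each modality under value function `value`.

    `value` maps a frozenset of modality names to a real number
    (assumed here to be accuracy with only those modalities present).
    """
    n = len(modalities)
    scores = {}
    for m in modalities:
        others = [x for x in modalities if x != m]
        total = 0.0
        # Average the marginal gain of adding `m` over all subsets of the rest.
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {m}) - value(s))
        scores[m] = total
    return scores


def pairwise_interaction(a, b, value):
    """Hypothetical cooperation proxy: gain of the pair beyond the sum of
    the individual gains. This is NOT claimed to be the paper's score."""
    empty = frozenset()
    return (value(frozenset({a, b})) - value(frozenset({a}))
            - value(frozenset({b})) + value(empty))


# Hypothetical accuracies for a two-modality classifier (illustration only).
accuracy = {
    frozenset(): 0.50,                  # no modality: chance level
    frozenset({"image"}): 0.70,
    frozenset({"text"}): 0.60,
    frozenset({"image", "text"}): 0.90,
}

scores = shapley_values(["image", "text"], accuracy.__getitem__)
interaction = pairwise_interaction("image", "text", accuracy.__getitem__)
```

In this toy setting the per-modality values sum to the gain of the full model over the empty baseline, which is the efficiency property of Shapley values. A positive interaction term would suggest that the model benefits from combining the two modalities beyond what each contributes alone.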