Paper Title

Spatial Mixture-of-Experts

Authors

Nikoli Dryden, Torsten Hoefler

Abstract

Many data have an underlying dependence on spatial location; it may be weather on the Earth, a simulation on a mesh, or a registered image. Yet this feature is rarely taken advantage of, and violates common assumptions made by many neural network layers, such as translation equivariance. Further, many works that do incorporate locality fail to capture fine-grained structure. To address this, we introduce the Spatial Mixture-of-Experts (SMoE) layer, a sparsely-gated layer that learns spatial structure in the input domain and routes experts at a fine-grained level to utilize it. We also develop new techniques to train SMoEs, including a self-supervised routing loss and damping expert errors. Finally, we show strong results for SMoEs on numerous tasks, and set new state-of-the-art results for medium-range weather prediction and post-processing ensemble weather forecasts.
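
To make the idea of routing experts at a fine-grained, per-location level more concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the names `route_top_k` and `spatial_moe`, the toy experts, and the fixed gate scores are all hypothetical, and the abstract does not specify how the gate is parameterized, how expert outputs are combined, or how the routing loss and error damping work, so none of that is represented here.

```python
# Illustrative sketch only: per-location top-k expert routing on a 2D grid.
# All names and values below are hypothetical and not taken from the paper.

def route_top_k(gate_scores, k):
    """Return the indices of the k highest-scoring experts at one location."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda e: gate_scores[e],
                    reverse=True)
    return ranked[:k]


def spatial_moe(x, gate, experts, k=1):
    """Apply, at each grid location, only the k experts chosen by the gate.

    x:       H x W grid (list of lists) of input values
    gate:    H x W grid of score lists, one score per expert; a trained
             layer would learn these, here they are fixed by hand
    experts: list of functions mapping a value to a value
    """
    out = []
    for i, row in enumerate(x):
        out_row = []
        for j, value in enumerate(row):
            chosen = route_top_k(gate[i][j], k)
            # Sum the selected experts' outputs (an assumed combination rule);
            # unselected experts are never evaluated at this location.
            out_row.append(sum(experts[e](value) for e in chosen))
        out.append(out_row)
    return out


# Two toy experts and a 2 x 2 input grid.
experts = [lambda v: 2 * v, lambda v: v + 10]
x = [[1, 2], [3, 4]]
gate = [[[0.9, 0.1], [0.2, 0.8]],
        [[0.3, 0.7], [0.6, 0.4]]]

result = spatial_moe(x, gate, experts, k=1)
print(result)
```

Because the gate is indexed by location rather than computed once per sample, different grid cells can select different experts, which is the property the abstract contrasts with translation-equivariant layers.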
