Paper Title
A Co-design view of Compute in-Memory with Non-Volatile Elements for Neural Networks
Paper Authors
Paper Abstract
Deep Learning neural networks are pervasive, but traditional computer architectures are reaching the limits of their ability to execute them efficiently for today's large workloads. They are limited by the von Neumann bottleneck: the high cost in energy and latency incurred in moving data between memory and the compute engine. Today, special CMOS designs address this bottleneck. The next generation of computing hardware will need to eliminate or dramatically mitigate this bottleneck. We discuss how compute in-memory can play an important part in this development. Here, a non-volatile-memory-based cross-bar architecture forms the heart of an engine that uses an analog process to parallelize the matrix-vector multiplication operation, repeatedly used in all neural network workloads. The cross-bar architecture, at times referred to as a neuromorphic approach, can be a key hardware element in future computing machines. In the first part of this review, we take a co-design view of the design constraints and the demands they place on the new materials and memory devices that anchor the cross-bar architecture. In the second part, we review what is known about the different new non-volatile memory materials and devices suited for compute in-memory, and discuss the outlook and challenges.
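To make the analog parallelism concrete, the following is a minimal numerical sketch (in Python with NumPy; the array sizes and values are illustrative assumptions, not taken from the paper) of how a cross-bar of stored conductances performs a matrix-vector multiplication in a single parallel step:

import numpy as np

# Each cross-bar cell stores a weight as a conductance G[i, j] (siemens).
# Applying input voltages V[i] to the rows makes each cell pass a current
# I = V[i] * G[i, j] (Ohm's law); each column wire sums these currents
# (Kirchhoff's current law), so every column outputs one dot product of
# the matrix-vector product, all columns in parallel.

rng = np.random.default_rng(0)
G = rng.uniform(1e-6, 1e-4, size=(4, 3))   # 4x3 array of cell conductances
V = rng.uniform(0.0, 0.2, size=4)          # input voltages on the 4 rows

I = V @ G  # column currents = matrix-vector product, computed "in memory"
print(I)

The design point this illustrates is that the multiply-accumulate happens in the physics of the array itself, so no weight data has to move between a separate memory and compute engine.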