Accelerating ODE-Based Neural Networks on Low-Cost FPGAs

Paper Title

Accelerating ODE-Based Neural Networks on Low-Cost FPGAs

Authors

Watanabe, Hirohisa; Matsutani, Hiroki

Abstract

ODENet is a deep neural network architecture in which the stacked structure of ResNet is implemented with an ordinary differential equation (ODE) solver. It can reduce the number of parameters and strike a balance between accuracy and performance through the choice of solver. It is also possible to improve accuracy while keeping the same number of parameters on resource-limited edge devices. In this paper, using the Euler method as the ODE solver, a part of ODENet is implemented as dedicated logic on a low-cost FPGA (Field-Programmable Gate Array) board such as the PYNQ-Z2. As ODENet variants, reduced ODENets (rODENets) are proposed and analyzed for low-cost FPGA implementation; each variant heavily uses a part of the ODENet layers and reduces or eliminates some layers in a different way. The variants are evaluated in terms of parameter size, accuracy, execution time, and resource utilization on the FPGA. The results show that the overall execution time of an rODENet variant is improved by up to 2.66 times compared with pure software execution, while accuracy remains comparable to that of the original ODENet.
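
The core idea in the abstract, replacing a stack of ResNet blocks with repeated Euler steps over one shared block, can be sketched in a few lines. This is only an illustrative sketch and not the authors' implementation: `residual_fn` is a hypothetical stand-in (a tiny dense layer with tanh) for the real convolutional block, and the weights, step count, and integration interval are made-up values chosen for readability.

```python
import math

def residual_fn(z, t, weights, bias):
    """Hypothetical f(z, t): one small dense layer with tanh.
    Stands in for the real block; t is accepted but unused here."""
    out = []
    for row, b in zip(weights, bias):
        s = b + sum(w * x for w, x in zip(row, z))
        out.append(math.tanh(s))
    return out

def euler_ode_block(z0, weights, bias, t0=0.0, t1=1.0, steps=4):
    """Integrate dz/dt = f(z, t) from t0 to t1 with fixed-step Euler.
    The same (weights, bias) are reused at every step."""
    h = (t1 - t0) / steps
    z = list(z0)
    t = t0
    for _ in range(steps):
        dz = residual_fn(z, t, weights, bias)
        # Euler update: z <- z + h * f(z, t), the analogue of a residual add
        z = [zi + h * di for zi, di in zip(z, dz)]
        t += h
    return z

def count_params(weights, bias):
    """Number of scalar parameters in one block."""
    return sum(len(row) for row in weights) + len(bias)

# Assumed toy parameters, for illustration only.
weights = [[0.5, -0.25], [0.1, 0.3]]
bias = [0.0, 0.1]

z_out = euler_ode_block([1.0, -1.0], weights, bias, steps=4)

# One shared block iterated 4 times vs. 4 distinct residual blocks.
ode_params = count_params(weights, bias)
resnet_params = 4 * ode_params
print(z_out, ode_params, resnet_params)
```

With `steps=1` and a unit interval, the update reduces to `z0 + f(z0)`, which has the same form as a single residual block. Because the same parameters are reused at every step, the parameter count in this toy setup stays fixed as `steps` grows, while an equivalent stack of distinct blocks grows linearly.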
