An Empirical Study of Challenges in Converting Deep Learning Models

Title

An Empirical Study of Challenges in Converting Deep Learning Models

Authors

Moses Openja, Amin Nikanjam, Ahmed Haj Yahmed, Foutse Khomh, Zhen Ming Jiang

Abstract

Deep Learning (DL)-based software systems are increasingly deployed in real-world applications. DL models are usually developed and trained with DL frameworks that use their own internal mechanisms and formats to represent and train models, and those formats usually cannot be recognized by other frameworks. Moreover, trained models are usually deployed in environments different from the ones in which they were developed. To solve this interoperability issue and make DL models compatible with different frameworks and environments, exchange formats such as ONNX and CoreML have been introduced. However, ONNX and CoreML have never been empirically evaluated by the community to reveal the prediction accuracy, performance, and robustness of models after conversion. Poor accuracy or non-robust behavior of converted models may lead to poor quality of deployed DL-based software systems. In this paper, we conduct the first empirical study to assess ONNX and CoreML for converting trained DL models. In our systematic approach, two popular DL frameworks, Keras and PyTorch, are used to train five widely used DL models on three popular datasets. The trained models are then converted to ONNX and CoreML and transferred to two runtime environments designated for these formats, where they are evaluated. We investigate the prediction accuracy before and after conversion. Our results show that the prediction accuracy of converted models is at the same level as that of the originals. The performance (time cost and memory consumption) of converted models is studied as well. The size of the models is reduced after conversion, which can result in optimized DL-based software deployment. Converted models are generally assessed as robust at the same level as the originals. However, the results show that CoreML models are more vulnerable to adversarial attacks than ONNX models.
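The before-and-after accuracy comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the helper names (`accuracy`, `agreement`, `max_abs_diff`) and the score values are hypothetical, and the sketch assumes that each model's outputs have already been collected as per-class scores from the original framework and from the converted model's runtime.

```python
from typing import Sequence

Scores = Sequence[Sequence[float]]  # one list of class scores per input sample


def argmax(scores: Sequence[float]) -> int:
    """Index of the largest score (the first index wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy(outputs: Scores, labels: Sequence[int]) -> float:
    """Fraction of samples whose top-scoring class equals the true label."""
    correct = sum(argmax(o) == y for o, y in zip(outputs, labels))
    return correct / len(labels)


def agreement(original: Scores, converted: Scores) -> float:
    """Fraction of samples on which both models predict the same class."""
    same = sum(argmax(a) == argmax(b) for a, b in zip(original, converted))
    return same / len(original)


def max_abs_diff(original: Scores, converted: Scores) -> float:
    """Largest element-wise gap between the two models' raw scores."""
    return max(
        abs(x - y)
        for a, b in zip(original, converted)
        for x, y in zip(a, b)
    )


# Made-up scores for four samples and three classes (illustration only).
labels = [0, 1, 2, 0]
original = [[0.90, 0.05, 0.05], [0.10, 0.80, 0.10],
            [0.20, 0.30, 0.50], [0.40, 0.35, 0.25]]
converted = [[0.89, 0.06, 0.05], [0.10, 0.79, 0.11],
             [0.20, 0.31, 0.49], [0.36, 0.38, 0.26]]

print("original accuracy: ", accuracy(original, labels))
print("converted accuracy:", accuracy(converted, labels))
print("label agreement:   ", agreement(original, converted))
print("max score gap:     ", max_abs_diff(original, converted))
```

Reporting label agreement alongside accuracy is a design choice of this sketch: two models can reach the same accuracy while disagreeing on individual inputs, and small numeric drift in the raw scores can flip a prediction when two classes score close to each other, as in the last made-up sample.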
