Paper Title
DeepFD: Automated Fault Diagnosis and Localization for Deep Learning Programs
Paper Authors
Paper Abstract
As Deep Learning (DL) systems are widely deployed in mission-critical applications, debugging such systems becomes essential. Most existing works identify and repair suspicious neurons in the trained Deep Neural Network (DNN), which, unfortunately, might be a detour. Specifically, several existing studies have reported that many unsatisfactory behaviors actually originate from faults residing in DL programs. Besides, locating faulty neurons is not actionable for developers, while locating the faulty statements in DL programs provides developers with more useful debugging information. Although a few recent studies have proposed techniques to pinpoint faulty statements in DL programs or training settings (e.g., a learning rate that is too large), they are mainly based on predefined rules, leading to many false alarms or false negatives, especially when the faults are beyond their capabilities. In view of these limitations, we propose DeepFD, a learning-based fault diagnosis and localization framework that maps the fault localization task to a learning problem. In particular, it infers the suspicious fault types by monitoring runtime features extracted during DNN model training and then locates the diagnosed faults in DL programs. It overcomes the limitations by identifying the root causes of faults in DL programs instead of neurons, and by diagnosing the faults with a learning approach instead of a set of hard-coded rules. The evaluation exhibits the potential of DeepFD. It correctly diagnoses 52% of the faulty DL programs, compared with around half of that (27%) achieved by the best state-of-the-art works. For fault localization, DeepFD also outperforms existing works, correctly locating the faults in 42% of the faulty programs, which almost doubles the best result (23%) achieved by existing works.
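The pipeline described in the abstract (monitor runtime features during training, learn to map them to a fault type, then point at the responsible statement) can be illustrated with a toy sketch. This is not DeepFD's implementation: the two features, the nearest-centroid classifier, the fault labels, the keyword table, and every function name below are assumptions chosen only to keep the example small and self-contained.

```python
import math

# Toy data set (an assumption for illustration): fit y = 3x with a single weight w.
XS = [k / 16 for k in range(-16, 17)]
YS = [3.0 * x for x in XS]

def train_trace(lr, epochs=30):
    """Run plain gradient descent and return the loss recorded after each epoch."""
    w, losses = 0.0, []
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(XS, YS)) / len(XS)
        w -= lr * grad
        losses.append(sum((w * x - y) ** 2 for x, y in zip(XS, YS)) / len(XS))
    return losses

def runtime_features(losses):
    """Two hand-picked features: clipped log10 loss change, share of rising epochs."""
    change = math.log10(max(losses[-1], 1e-300) / max(losses[0], 1e-300))
    change = max(-10.0, min(10.0, change))
    rises = sum(b > a for a, b in zip(losses, losses[1:])) / (len(losses) - 1)
    return (change, rises)

def fit_centroids(labelled_lrs):
    """Learn one feature centroid per fault label from labelled training runs."""
    centroids = {}
    for label, lrs in labelled_lrs.items():
        feats = [runtime_features(train_trace(lr)) for lr in lrs]
        centroids[label] = tuple(sum(col) / len(col) for col in zip(*feats))
    return centroids

def diagnose(centroids, losses):
    """Return the label whose centroid is closest to the observed features."""
    feats = runtime_features(losses)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))

# Hypothetical mapping from a diagnosed fault type to a keyword in the program.
FAULT_KEYWORDS = {"lr_too_large": "learning_rate", "lr_too_small": "learning_rate"}

def locate(fault, program_lines):
    """Return 1-based line numbers of statements related to the diagnosed fault."""
    keyword = FAULT_KEYWORDS.get(fault)
    if keyword is None:
        return []
    return [i for i, line in enumerate(program_lines, start=1) if keyword in line]

# Labelled runs used to "train" the diagnoser (learning rates are assumptions).
CENTROIDS = fit_centroids({
    "ok": [0.1, 0.2, 0.3],
    "lr_too_large": [8.0, 12.0, 16.0],
    "lr_too_small": [1e-5, 5e-5, 1e-4],
})

if __name__ == "__main__":
    program = [
        "model = build_model()",
        "optimizer = SGD(learning_rate=10.0)",
        "model.fit(x, y, epochs=30)",
    ]
    fault = diagnose(CENTROIDS, train_trace(10.0))
    print(fault, locate(fault, program))
```

The point of the sketch is the division of labor: the diagnoser is learned from labelled training runs rather than written as threshold rules, and localization is a separate step that maps the diagnosed fault type back to a statement in the program.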