Title
Manifold-based Test Generation for Image Classifiers
Authors
Abstract
Neural networks used for image classification tasks in critical applications must be tested with sufficient realistic data to assure their correctness. To effectively test an image classification neural network, one must obtain realistic test data adequate to inspire confidence that differences between the implicit requirements and the learned model would be exposed. This raises two challenges: first, an adequate subset of the data points must be carefully chosen to inspire confidence, and second, the implicit requirements must be meaningfully extrapolated to data points beyond those in the explicit training set. This paper proposes a novel framework to address these challenges. Our approach is based on the premise that patterns in a large input data space can be effectively captured in a smaller manifold space, from which similar yet novel test cases (both the input and the label) can be sampled and generated. A variant of the Conditional Variational Autoencoder (CVAE) is used to capture this manifold with a generative function, and a search technique is applied on this manifold space to efficiently find fault-revealing inputs. Experiments show that this approach efficiently generates thousands of realistic yet fault-revealing test cases, even for well-trained models.
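The sample-decode-check loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `decode` is a hypothetical stand-in for a trained CVAE decoder, `classify` is a hypothetical stand-in for the classifier under test, and plain random sampling from a standard normal prior takes the place of the paper's search technique, which the abstract does not specify.

```python
import random


def decode(z, label):
    """Hypothetical stand-in for a trained CVAE decoder.

    Maps a latent vector and a class label to a toy 2-D "input".
    A real decoder would be a neural network producing an image.
    """
    center = -1.0 if label == 0 else 1.0
    return [center + 0.5 * z[0], z[1]]


def classify(x):
    """Hypothetical stand-in for the classifier under test."""
    return 1 if x[0] > 0.0 else 0


def find_fault_revealing(num_samples, latent_dim=2, seed=0):
    """Sample the latent space and keep cases the classifier gets wrong.

    Assumption: plain random sampling from a standard normal prior,
    used here only as a placeholder for a real search technique.
    """
    rng = random.Random(seed)
    faults = []
    for _ in range(num_samples):
        label = rng.randrange(2)  # label to condition the decoder on
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        x = decode(z, label)  # generated test input
        predicted = classify(x)
        if predicted != label:  # prediction disagrees with generated label
            faults.append((z, label, x, predicted))
    return faults


if __name__ == "__main__":
    found = find_fault_revealing(10_000)
    print(f"{len(found)} fault-revealing cases out of 10000 samples")
```

In this toy setup, each generated test case carries both an input and the label it was conditioned on, so a mismatch with the classifier's prediction can be flagged without a separate oracle. A guided search over the latent space would be expected to find such cases with fewer samples than the random sampling used here.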