Paper Title
On the Transferability of Minimal Prediction Preserving Inputs in Question Answering
Paper Authors
Paper Abstract
Recent work (Feng et al., 2018) establishes the presence of short, uninterpretable input fragments that yield high confidence and accuracy in neural models. We refer to these as Minimal Prediction Preserving Inputs (MPPIs). In the context of question answering, we investigate competing hypotheses for the existence of MPPIs, including poor posterior calibration of neural models, lack of pretraining, and "dataset bias" (where a model learns to attend to spurious, non-generalizable cues in the training data). We discover a perplexing invariance of MPPIs to random training seed, model architecture, pretraining, and training domain. MPPIs demonstrate remarkable transferability across domains, achieving significantly higher performance than comparably short queries. Additionally, penalizing over-confidence on MPPIs fails to improve either generalization or adversarial robustness. These results suggest the interpretability of MPPIs is insufficient to characterize the generalization capacity of these models. We hope this focused investigation encourages more systematic analysis of model behavior outside of the human-interpretable distribution of examples.
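To make the MPPI concept concrete, below is a minimal sketch of how such fragments can be found, in the spirit of the input-reduction procedure of Feng et al. (2018): question tokens are greedily removed as long as the model's predicted answer is unchanged. This is a simple leave-one-out variant for illustration only (Feng et al. use gradient-based word importance and beam search); the `predict(question, context)` function is a hypothetical stand-in for a QA model returning an (answer, confidence) pair, not the paper's actual implementation.

```python
# Sketch of input reduction: greedily drop question tokens while the
# model's prediction is preserved. `predict` is a hypothetical QA model
# interface returning (answer, confidence); it is an assumption here.

def reduce_input(question_tokens, context, predict):
    original_answer, _ = predict(" ".join(question_tokens), context)
    reduced = list(question_tokens)
    while len(reduced) > 1:
        best_candidate, best_confidence = None, -1.0
        # Try removing each remaining token; keep the removal that still
        # yields the original answer with the highest confidence.
        for i in range(len(reduced)):
            candidate = reduced[:i] + reduced[i + 1:]
            answer, confidence = predict(" ".join(candidate), context)
            if answer == original_answer and confidence > best_confidence:
                best_candidate, best_confidence = candidate, confidence
        if best_candidate is None:  # every removal changes the prediction
            break
        reduced = best_candidate
    return reduced  # a Minimal Prediction Preserving Input (MPPI)
```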
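The abstract also reports that penalizing over-confidence on MPPIs does not improve generalization or adversarial robustness. One common way to implement such a penalty is an entropy regularizer that discourages confident predictions on reduced inputs; the sketch below shows that general technique under assumed classification-style logits, and is not the paper's actual training objective.

```python
import torch
import torch.nn.functional as F

def loss_with_mppi_entropy_penalty(logits_full, labels, logits_mppi, beta=0.1):
    # Standard cross-entropy on full inputs, plus an entropy bonus on
    # MPPI inputs: subtracting entropy from the loss pushes the model
    # toward low-confidence (high-entropy) predictions on MPPIs.
    # `beta` is an illustrative regularization weight (an assumption).
    ce = F.cross_entropy(logits_full, labels)
    probs = F.softmax(logits_mppi, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=-1).mean()
    return ce - beta * entropy
```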