Paper title
ApacheJIT: A Large Dataset for Just-In-Time Defect Prediction
Paper authors
Paper abstract
In this paper, we present ApacheJIT, a large dataset for Just-In-Time defect prediction. ApacheJIT consists of clean and bug-inducing software changes in popular Apache projects. ApacheJIT has a total of 106,674 commits (28,239 bug-inducing and 78,435 clean commits). Having a large number of commits makes ApacheJIT a suitable dataset for machine learning models, especially deep learning models that require large training sets to effectively generalize the patterns present in the historical data to future data.
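The commit counts in the abstract imply a noticeably imbalanced label distribution, which matters when training a classifier on the dataset. The sketch below only redoes that arithmetic from the three reported numbers. The function name `class_balance` and the inverse-frequency weighting scheme are illustrative assumptions, not something the paper prescribes.

```python
# Commit counts as reported in the abstract.
BUG_INDUCING = 28_239
CLEAN = 78_435
REPORTED_TOTAL = 106_674


def class_balance(bug_inducing: int, clean: int) -> dict:
    """Summarize label balance for a two-class commit dataset.

    Hypothetical helper for illustration; not part of ApacheJIT.
    """
    total = bug_inducing + clean
    n_classes = 2
    return {
        "total": total,
        "bug_inducing_share": bug_inducing / total,
        "clean_share": clean / total,
        # Assumption: inverse-frequency class weights,
        # total / (n_classes * class_count), a common heuristic
        # for imbalanced data. The paper does not specify one.
        "weights": {
            "bug_inducing": total / (n_classes * bug_inducing),
            "clean": total / (n_classes * clean),
        },
    }


balance = class_balance(BUG_INDUCING, CLEAN)

# The two class counts should add up to the reported total.
assert balance["total"] == REPORTED_TOTAL

print(f"bug-inducing share: {balance['bug_inducing_share']:.1%}")
print(f"clean share:        {balance['clean_share']:.1%}")
print(f"class weights:      {balance['weights']}")
```

Roughly one commit in four is labeled bug-inducing, so a model trained on these labels may benefit from some form of reweighting or resampling. Whether that helps in practice depends on the model and the evaluation setup.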