Automated Detection of Typed Links in Issue Trackers

Paper title

Automated Detection of Typed Links in Issue Trackers

Authors

Clara Marie Lüders, Tim Pietz, Walid Maalej

Abstract

Stakeholders in software projects use issue trackers like JIRA to capture and manage issues, including requirements and bugs. To ease issue navigation and structure project knowledge, stakeholders manually connect issues via links of certain types that reflect different dependencies, such as Epic-, Block-, Duplicate-, or Relate- links. Based on a large dataset of 15 JIRA repositories, we study how well state-of-the-art machine learning models can automatically detect common link types. We found that a pure BERT model trained on titles and descriptions of linked issues significantly outperforms other optimized deep learning models, achieving an encouraging average macro F1-score of 0.64 for detecting 9 popular link types across all repositories (weighted F1-score of 0.73). For the specific Subtask- and Epic- links, the model achieved top F1-scores of 0.89 and 0.97, respectively. Our model does not simply learn the textual similarity of the issues. In general, shorter issue text seems to improve the prediction accuracy with a strong negative correlation of -0.70. We found that Relate-links often get confused with the other links, which suggests that they are likely used as default links in unclear cases. We also observed significant differences across the repositories, depending on how they are used and by whom.
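The abstract reports two averages of the per-type F1-scores: a macro average (0.64) and a weighted average (0.73). The sketch below illustrates how the two differ on a toy example. The link labels and predictions are invented for illustration and are not data from the paper, and the helper names `per_class_f1`, `macro_f1`, and `weighted_f1` are hypothetical. The macro average treats every link type equally, while the weighted average scales each type's F1 by its number of true instances, so frequent types count for more.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its F1-score."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of the per-class F1-scores, weighted by true-label counts."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Toy data (illustrative only): an imbalanced set of link types.
y_true = ["Relate"] * 6 + ["Epic"] * 2 + ["Block"] * 2
y_pred = ["Relate"] * 5 + ["Block"] + ["Epic"] * 2 + ["Block", "Relate"]

print(f"macro F1:    {macro_f1(y_true, y_pred):.3f}")     # 0.778
print(f"weighted F1: {weighted_f1(y_true, y_pred):.3f}")  # 0.800
```

In this toy case the rare Block type scores poorly, which pulls the macro average below the weighted one. The same pattern (macro below weighted) appears in the reported figures, although the sketch says nothing about which link types cause it there.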
