Paper Title

GitHub Repositories with Links to Academic Papers: Public Access, Traceability, and Evolution

Authors

Supatsara Wattanakriengkrai, Bodin Chinthanet, Hideaki Hata, Raula Gaikovina Kula, Christoph Treude, Jin Guo, Kenichi Matsumoto

Abstract

Traceability between published scientific breakthroughs and their implementation is essential, especially in the case of open-source scientific software that implements bleeding-edge science in its code. However, aligning the link between GitHub repositories and academic papers can prove difficult, and the current practice of establishing and maintaining such links remains unknown. This paper investigates the role of academic paper references contained in these repositories. We conduct a large-scale study of 20 thousand GitHub repositories that make references to academic papers. We use a mixed-methods approach to identify public access, traceability, and evolutionary aspects of the links. Although referencing a paper is not typical, we find that a vast majority of referenced academic papers are public access. These repositories tend to be affiliated with academic communities. More than half of the papers do not link back to any repository. We find that academic papers from top-tier SE venues are not likely to reference a repository, but when they do, they usually link to a GitHub software repository. In a network of arXiv papers and referenced repositories, we find that the most referenced papers are (i) highly cited in academia and (ii) referenced by repositories written in different programming languages.
