Title
Predicting the longevity of resources shared in scientific publications
Authors
Abstract
Research has shown that most resources shared in articles (e.g., URLs to code or data) are not kept up to date and mostly disappear from the web after a few years (Zeng et al., 2019). Little is known about the factors that differentiate and predict the longevity of these resources. This article explores a range of explanatory features related to the publication venue, the authors, the references, and where the resource is shared. We analyze an extensive repository of publications and, through web archival services, reconstruct how they looked at different points in time. We find that the most important factors relate to where and how the resource is shared, and that surprisingly little is explained by the authors' reputation or the prestige of the journal. By examining the places where long-lasting resources are shared, we suggest that it is critical to disseminate and create standards with modern technologies. Finally, we discuss implications for reproducibility and for recognizing scientific datasets as first-class citizens.