Paper title
Are your dependencies code reviewed?: Measuring code review coverage in dependency updates
Paper authors
Paper abstract
Modern software relies extensively on free, open source packages as dependencies, so developers must regularly pull in new third-party code through frequent updates. However, without a proper review of every incoming change, vulnerable and malicious code can sneak into the codebase through these dependencies. The goal of this study is to aid developers in securely accepting dependency updates by measuring whether the code changes in an update have passed through a code review process. We implement Depdive, an update audit tool for packages in the Crates.io, npm, PyPI, and RubyGems registries. Depdive first (i) identifies the files and code changes in an update that cannot be traced back to the package's source repository, i.e., phantom artifacts; and then (ii) measures what portion of the changes in the update, excluding the phantom artifacts, has passed through a code review process, i.e., code review coverage (CRC). Using Depdive, we present an empirical study across the latest ten updates of the 1,000 most downloaded packages in each of the four registries. We further evaluate our results through a maintainer agreement survey. We find that updates are typically only partially code-reviewed (52.5% of the time). Further, only 9.0% of the packages had all their updates in our data set fully code-reviewed, indicating that even the most used packages can introduce non-reviewed code into the software supply chain. We also observe that updates tend to have either high CRC or low CRC, suggesting that packages at opposite ends of the spectrum may require separate sets of treatments.
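The two measurements described in the abstract can be illustrated with a minimal sketch. This is not Depdive's implementation: the `ChangedLine` record, its fields, and both function names are hypothetical, and the sketch assumes that each changed line in an update has already been labeled as traceable to the source repository or not, and as reviewed or not. It only shows how phantom artifacts are set aside before the reviewed share of the remaining changes is computed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ChangedLine:
    """One changed line in a dependency update (hypothetical record)."""
    path: str
    in_repository: bool  # assumption: False means the change is a phantom artifact
    reviewed: bool       # assumption: True means the change passed code review


def split_phantom(
    changes: List[ChangedLine],
) -> Tuple[List[ChangedLine], List[ChangedLine]]:
    """Separate changes traceable to the source repository from phantom ones."""
    traced = [c for c in changes if c.in_repository]
    phantom = [c for c in changes if not c.in_repository]
    return traced, phantom


def code_review_coverage(changes: List[ChangedLine]) -> Optional[float]:
    """Fraction of non-phantom changed lines that were code-reviewed.

    Returns None when no changed line can be traced to the repository,
    since the ratio is undefined in that case.
    """
    traced, _ = split_phantom(changes)
    if not traced:
        return None
    return sum(1 for c in traced if c.reviewed) / len(traced)


# Illustrative data only, not taken from the study.
update = [
    ChangedLine("src/lib.py", in_repository=True, reviewed=True),
    ChangedLine("src/lib.py", in_repository=True, reviewed=True),
    ChangedLine("src/util.py", in_repository=True, reviewed=False),
    ChangedLine("dist/bundle.min.js", in_repository=False, reviewed=False),
]
traced_lines, phantom_lines = split_phantom(update)
crc = code_review_coverage(update)
```

In this toy update, one of the four changed lines is a phantom artifact and is excluded, so coverage is computed over the remaining three lines, two of which were reviewed.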