Paper Title
Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model
Paper Authors
Paper Abstract
Progress in machine learning (ML) comes with a cost to the environment, given that training ML models requires significant computational resources, energy and materials. In the present article, we aim to quantify the carbon footprint of BLOOM, a 176-billion parameter language model, across its life cycle. We estimate that BLOOM's final training emitted approximately 24.7 tonnes of~\carboneq~if we consider only the dynamic power consumption, and 50.5 tonnes if we account for all processes ranging from equipment manufacturing to energy-based operational consumption. We also study the energy requirements and carbon emissions of its deployment for inference via an API endpoint receiving user queries in real time. We conclude with a discussion regarding the difficulty of precisely estimating the carbon footprint of ML models and future research directions that can contribute towards improving carbon emissions reporting.
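The dynamic-power estimate described in the abstract reduces to a simple calculation: measured energy consumption multiplied by the carbon intensity of the electricity grid. The sketch below illustrates that arithmetic; the energy and grid-intensity values are illustrative assumptions chosen to reproduce a figure of the abstract's magnitude, not numbers taken from the paper itself.

```python
def carbon_emissions_kg(energy_kwh: float, intensity_kg_per_kwh: float) -> float:
    """Operational ("dynamic") emissions: energy used (kWh) times
    grid carbon intensity (kgCO2eq per kWh)."""
    return energy_kwh * intensity_kg_per_kwh

# Illustrative assumption: ~433,000 kWh of training energy on a
# low-carbon grid at ~0.057 kgCO2eq/kWh (not values from the abstract).
tonnes = carbon_emissions_kg(433_000, 0.057) / 1000  # kg -> tonnes
print(f"{tonnes:.1f} tonnes CO2eq")  # → 24.7 tonnes CO2eq
```

Note that this covers only the operational term; the abstract's larger 50.5-tonne figure additionally folds in embodied emissions such as equipment manufacturing, which cannot be derived from energy metering alone.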