Paper title
The Cost of Training Machine Learning Models over Distributed Data Sources
Paper authors
Paper abstract
Federated learning is one of the most appealing alternatives to the standard centralized learning paradigm, allowing a heterogeneous set of devices to train a machine learning model without sharing their raw data. However, it requires a central server to coordinate the learning process, which introduces potential scalability and security issues. In the literature, server-less federated learning approaches such as gossip federated learning and blockchain-enabled federated learning have been proposed to mitigate these issues. In this work, we present a complete overview of these three techniques and compare them according to an integral set of performance indicators, including model accuracy, time complexity, communication overhead, convergence time, and energy consumption. An extensive simulation campaign enables a quantitative analysis considering both feedforward and convolutional neural network models. Results show that gossip federated learning and the standard federated solution reach a similar level of accuracy, and that their energy consumption is influenced by the machine learning model adopted, the software library, and the hardware used. In contrast, blockchain-enabled federated learning represents a viable solution for implementing decentralized learning with a higher level of security, at the cost of extra energy usage and data sharing. Finally, we identify open issues in the two decentralized federated learning implementations and provide insights on potential extensions and possible research directions in this new research field.