Paper Title

Scalable and Provably Accurate Algorithms for Differentially Private Distributed Decision Tree Learning

Authors

Kaiwen Wang, Travis Dick, Maria-Florina Balcan

Abstract


This paper introduces the first provably accurate algorithms for differentially private, top-down decision tree learning in the distributed setting (Balcan et al., 2012). We propose DP-TopDown, a general privacy preserving decision tree learning algorithm, and present two distributed implementations. Our first method NoisyCounts naturally extends the single machine algorithm by using the Laplace mechanism. Our second method LocalRNM significantly reduces communication and added noise by performing local optimization at each data holder. We provide the first utility guarantees for differentially private top-down decision tree learning in both the single machine and distributed settings. These guarantees show that the error of the privately-learned decision tree quickly goes to zero provided that the dataset is sufficiently large. Our extensive experiments on real datasets illustrate the trade-offs of privacy, accuracy and generalization when learning private decision trees in the distributed setting.
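The NoisyCounts method described in the abstract extends the single-machine algorithm with the Laplace mechanism, which perturbs counting queries with Laplace-distributed noise. A minimal sketch of that mechanism (an illustration only, not the paper's DP-TopDown implementation; the function names are hypothetical):

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise via inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def noisy_count(true_count: int, epsilon: float) -> float:
    """Release a count under epsilon-differential privacy.

    A counting query changes by at most 1 when one record is added or
    removed (sensitivity 1), so noise with scale 1/epsilon suffices
    for the Laplace mechanism.
    """
    return true_count + laplace_noise(1.0 / epsilon)
```

Smaller values of `epsilon` give stronger privacy but noisier counts, which is the privacy-accuracy trade-off the paper's experiments explore.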
