Paper Title
Discrete Distribution Estimation under User-level Local Differential Privacy
Paper Authors
Paper Abstract
We study discrete distribution estimation under user-level local differential privacy (LDP). In user-level $\varepsilon$-LDP, each user has $m\ge1$ samples and the privacy of all $m$ samples must be preserved simultaneously. We resolve the following dilemma: While on the one hand having more samples per user should provide more information about the underlying distribution, on the other hand, guaranteeing the privacy of all $m$ samples should make the estimation task more difficult. We obtain tight bounds for this problem under almost all parameter regimes. Perhaps surprisingly, we show that in suitable parameter regimes, having $m$ samples per user is equivalent to having $m$ times more users, each with only one sample. Our results demonstrate interesting phase transitions for $m$ and the privacy parameter $\varepsilon$ in the estimation risk. Finally, connecting with recent results on shuffled DP, we show that combined with random shuffling, our algorithm leads to optimal error guarantees (up to logarithmic factors) under the central model of user-level DP in certain parameter regimes. We provide several simulations to verify our theoretical findings.
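To make the baseline setting concrete, the following is a minimal illustrative sketch (not the paper's algorithm) of the standard single-sample case ($m=1$): each user privatizes their one sample with $k$-ary randomized response, a classical $\varepsilon$-LDP mechanism, and the server debiases the empirical frequencies to obtain an unbiased estimate of the discrete distribution. All names (`k_rr`, `estimate`) are hypothetical.

```python
import numpy as np

def k_rr(sample, k, eps, rng):
    """k-ary randomized response: report the true symbol with
    probability e^eps / (e^eps + k - 1), else a uniform other symbol."""
    p_true = np.exp(eps) / (np.exp(eps) + k - 1)
    if rng.random() < p_true:
        return sample
    other = rng.integers(k - 1)      # uniform over the k-1 other symbols
    return other + (other >= sample)  # skip over the true symbol

def estimate(reports, k, eps):
    """Debias the empirical frequencies of the privatized reports."""
    p_true = np.exp(eps) / (np.exp(eps) + k - 1)
    p_other = (1 - p_true) / (k - 1)
    freq = np.bincount(reports, minlength=k) / len(reports)
    # Invert the linear channel: E[freq] = p_other + (p_true - p_other) * p
    return (freq - p_other) / (p_true - p_other)

rng = np.random.default_rng(0)
k, eps, n = 4, 1.0, 200_000        # alphabet size, privacy budget, #users
true_p = np.array([0.4, 0.3, 0.2, 0.1])
data = rng.choice(k, size=n, p=true_p)          # one sample per user
reports = np.array([k_rr(x, k, eps, rng) for x in data])
est = estimate(reports, k, eps)
```

Under user-level LDP with $m>1$, each user's single report must instead protect all $m$ of their samples at once, which is the tension the abstract describes; the debiasing step above illustrates only the $m=1$ reference point.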