论文标题
基于学习的草图的本地差异频率估计
Local Differentially Private Frequency Estimation based on Learned Sketches
论文作者
论文摘要
草图被广泛用于具有大域的数据频率估计。但是,在考虑隐私时,基于草图的频率估计面临更多的挑战。当地差异隐私(LDP)是对敏感数据进行频率估计的解决方案,同时保留隐私。 LDP使每个用户能够在客户端上扰动其数据以保护隐私,但也会对频率估计引入错误。草图中的哈希碰撞使低频物品的估计更糟。在本文中,我们提出了一个基于LDP学习的草图的大型域数据的两阶段频率估计框架,该框架将高频率和低频率的项目分开,以避免哈希碰撞引起的错误。从理论上讲,我们所提出的方法满足了最不发达国家,并且比包括Apple-CMS,Apple-HCMS和FLH在内的最先进的频率估计方法更准确。实验结果验证了我们方法的性能。
Sketches are widely used for frequency estimation of data with a large domain. However, sketches-based frequency estimation faces more challenges when considering privacy. Local differential privacy (LDP) is a solution to frequency estimation on sensitive data while preserving the privacy. LDP enables each user to perturb its data on the client-side to protect the privacy, but it also introduces errors to the frequency estimations. The hash collisions in the sketches make the estimations for low-frequent items even worse. In this paper, we propose a two-phase frequency estimation framework for data with a large domain based on an LDP learned sketch, which separates the high-frequent and low-frequent items to avoid the errors caused by hash collisions. We theoretically proved that the proposed method satisfies LDP and it is more accurate than the state-of-the-art frequency estimation methods including Apple-CMS, Apple-HCMS and FLH. The experimental results verify the performance of our method.