Paper Title
Private Convex Optimization via Exponential Mechanism
Paper Authors
Paper Abstract
In this paper, we study private optimization problems for non-smooth convex functions $F(x)=\mathbb{E}_i f_i(x)$ on $\mathbb{R}^d$. We show that modifying the exponential mechanism by adding an $\ell_2^2$ regularizer to $F(x)$ and sampling from $\pi(x)\propto \exp(-k(F(x)+\mu\|x\|_2^2/2))$ recovers both the known optimal empirical risk and population loss under $(\epsilon,\delta)$-DP. Furthermore, we show how to implement this mechanism using $\widetilde{O}(n \min(d, n))$ queries to $f_i(x)$ for DP-SCO, where $n$ is the number of samples/users and $d$ is the ambient dimension. We also give a (nearly) matching lower bound of $\widetilde{\Omega}(n \min(d, n))$ on the number of evaluation queries. Our results utilize the following tools, which are of independent interest: (1) We prove Gaussian Differential Privacy (GDP) of the exponential mechanism when the loss function is strongly convex and the perturbation is Lipschitz. Our privacy bound is \emph{optimal}, as it includes the privacy of the Gaussian mechanism as a special case, and it is proved using the isoperimetric inequality for strongly log-concave measures. (2) We show how to sample from $\exp(-F(x)-\mu\|x\|^2_2/2)$ for $G$-Lipschitz $F$ with $\eta$ error in total variation (TV) distance using $\widetilde{O}((G^2/\mu) \log^2(d/\eta))$ unbiased queries to $F(x)$. This is the first sampler whose query complexity has \emph{polylogarithmic dependence} on both the dimension $d$ and the accuracy $\eta$.
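For intuition, the following is a minimal sketch of the regularized exponential mechanism described above: it approximately samples from $\pi(x)\propto\exp(-k(F(x)+\mu\|x\|_2^2/2))$ using plain unadjusted Langevin dynamics on subgradients, which is only a stand-in and not the paper's query-efficient sampler from point (2). The toy absolute-deviation loss, the helper names, and the settings of $k$, $\mu$, the step size, and the iteration count are all illustrative placeholders; in the paper these parameters are derived from the privacy budget $(\epsilon,\delta)$, the Lipschitz constant $G$, and the sample size $n$.

```python
import numpy as np

def sample_regularized_exp_mechanism(grad_F, d, k, mu, n_steps=5000, step_size=1e-3, rng=None):
    """Approximately sample from pi(x) proportional to exp(-k*(F(x) + mu*||x||^2/2)).

    Uses unadjusted Langevin dynamics as a simple stand-in sampler;
    grad_F(x) should return a (sub)gradient of F at x.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.zeros(d)
    for _ in range(n_steps):
        # Gradient of the negative log-density k*(F(x) + mu*||x||^2/2).
        g = k * (grad_F(x) + mu * x)
        # Langevin update: gradient step plus Gaussian noise.
        x = x - step_size * g + np.sqrt(2.0 * step_size) * rng.standard_normal(d)
    return x

# Toy instance: F(x) = (1/n) * sum_i |<a_i, x> - b_i|, a non-smooth Lipschitz loss.
n, d = 200, 10
rng = np.random.default_rng(0)
A = rng.standard_normal((n, d)) / np.sqrt(d)   # rows have norm ~1, so F is roughly 1-Lipschitz
b = rng.standard_normal(n)

def grad_F(x):
    # Subgradient of the average absolute-deviation loss.
    return A.T @ np.sign(A @ x - b) / n

# k and mu are placeholder values, not the paper's calibrated choices.
x_private = sample_regularized_exp_mechanism(grad_F, d=d, k=50.0, mu=0.1, rng=rng)
print("sampled point norm:", np.linalg.norm(x_private))
```

The privacy guarantee comes entirely from the target distribution, so any sufficiently accurate sampler could replace the Langevin loop here; the paper's contribution in point (2) is a sampler whose query complexity depends only polylogarithmically on $d$ and the TV accuracy $\eta$.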