Paper Title

Differentially Private Online Submodular Maximization

Authors

Sebastian Perez-Salazar, Rachel Cummings

Abstract

In this work we consider the problem of online submodular maximization under a cardinality constraint with differential privacy (DP). A stream of $T$ submodular functions over a common finite ground set $U$ arrives online, and at each time-step the decision maker must choose at most $k$ elements of $U$ before observing the function. The decision maker obtains a payoff equal to the function evaluated on the chosen set, and aims to learn a sequence of sets that achieves low expected regret. In the full-information setting, we develop an $(\varepsilon,\delta)$-DP algorithm with expected $(1-1/e)$-regret bound of $\mathcal{O}\left( \frac{k^2\log |U|\sqrt{T \log k/\delta}}{\varepsilon} \right)$. This algorithm contains $k$ ordered experts that learn the best marginal increments for each item over the whole time horizon while maintaining privacy of the functions. In the bandit setting, we provide an $(\varepsilon,\delta + O(e^{-T^{1/3}}))$-DP algorithm with expected $(1-1/e)$-regret bound of $\mathcal{O}\left( \frac{\sqrt{\log k/\delta}}{\varepsilon} (k (|U| \log |U|)^{1/3})^2 T^{2/3} \right)$. Our algorithm contains $k$ ordered experts that learn the best marginal item to select given the items chosen by her predecessors, while maintaining privacy of the functions. One challenge for privacy in this setting is that the payoff and feedback of expert $i$ depend on the actions taken by her $i-1$ predecessors. This particular type of information leakage is not covered by post-processing, and new analysis is required. Our techniques for maintaining privacy with feedforward may be of independent interest.
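The $k$-ordered-experts structure described in the abstract can be sketched as follows. This is a minimal illustrative simplification, not the authors' algorithm: each expert runs exponential weights over the ground set, and Gaussian noise added to the observed marginal gains stands in for the paper's actual privacy mechanism (no DP guarantee is claimed for this sketch). The function name `dp_online_submodular` and all parameter values are assumptions introduced here for illustration.

```python
import math
import random


def dp_online_submodular(stream, U, k, noise_scale=10.0, eta=0.5, seed=0):
    """Sketch of k ordered experts for online submodular maximization.

    Expert i maintains noisy cumulative marginal-gain scores for every item,
    evaluated against the prefix chosen by experts 1..i-1.  Gaussian noise on
    each observed gain is a stand-in for a real DP mechanism (illustrative
    only; this sketch proves nothing about privacy or regret).
    """
    rng = random.Random(seed)
    scores = [{u: 0.0 for u in U} for _ in range(k)]  # per-expert scores
    history = []
    for f in stream:  # f: set -> payoff, revealed only after S is chosen
        # Experts choose in order; each samples from an exponential-weights
        # distribution over the items not already picked by its predecessors.
        S = []
        for i in range(k):
            avail = [u for u in U if u not in S]
            weights = [math.exp(eta * scores[i][u]) for u in avail]
            S.append(rng.choices(avail, weights=weights)[0])
        history.append(S)
        # Feedback: expert i observes noise-perturbed marginal gains of every
        # item relative to the prefix S[:i] chosen by its predecessors.
        for i in range(k):
            prefix = set(S[:i])
            for u in U:
                gain = f(prefix | {u}) - f(prefix)
                scores[i][u] += gain + rng.gauss(0.0, 1.0 / noise_scale)
    return history
```

A toy usage on a coverage-function stream: with `U = {0,1,2,3}` and `f(S)` the fraction of a universe covered by the sets assigned to the items of `S`, each round returns a set of `k` distinct items, and the per-expert scores drift toward items with large marginal coverage given their predecessors' picks.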
