Paper Title

Privacy-Preserving Dynamic Personalized Pricing with Demand Learning

Authors

Xi Chen, David Simchi-Levi, Yining Wang

Abstract

The prevalence of e-commerce has made customers' detailed personal information readily accessible to retailers, and this information has been widely used in pricing decisions. When personalized information is involved, protecting its privacy becomes a critical issue in practice. In this paper, we consider a dynamic pricing problem over $T$ time periods with an \emph{unknown} demand function of posted price and personalized information. At each time $t$, the retailer observes an arriving customer's personal information and offers a price. The customer then makes a purchase decision, which the retailer uses to learn the underlying demand function. This process raises a potentially serious privacy concern: a third-party agent might infer the personalized information and purchase decisions from price changes in the pricing system. Using the fundamental framework of differential privacy from computer science, we develop a privacy-preserving dynamic pricing policy, which tries to maximize the retailer's revenue while avoiding leakage of individual customers' information and purchasing decisions. To this end, we first introduce a notion of \emph{anticipating} $(\varepsilon, \delta)$-differential privacy that is tailored to the dynamic pricing problem. Our policy achieves both the privacy guarantee and the performance guarantee in terms of regret. Roughly speaking, for $d$-dimensional personalized information, our algorithm achieves expected regret of order $\tilde{O}(\varepsilon^{-1} \sqrt{d^3 T})$ when the customers' information is adversarially chosen. For stochastic personalized information, the regret bound can be further improved to $\tilde{O}(\sqrt{d^2T} + \varepsilon^{-2} d^2)$.
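
For readers unfamiliar with the privacy notion the abstract builds on, the standard (non-anticipating) definition of $(\varepsilon, \delta)$-differential privacy is sketched below. The symbols $\mathcal{M}$, $D$, $D'$, and $S$ are notation assumed here for illustration and do not come from the abstract. The paper's \emph{anticipating} variant is tailored to dynamic pricing and is not reproduced here, since the abstract does not state it.

```latex
% Standard (\varepsilon, \delta)-differential privacy (textbook definition,
% not the paper's "anticipating" variant).
% \mathcal{M}: randomized mechanism; D, D': neighboring datasets that differ
% in one individual's record; S: any measurable set of outputs.
\[
  \Pr\bigl[\mathcal{M}(D) \in S\bigr]
  \;\le\;
  e^{\varepsilon} \, \Pr\bigl[\mathcal{M}(D') \in S\bigr] + \delta
  \qquad \text{for all neighboring } D, D' \text{ and all } S .
\]
```

Smaller $\varepsilon$ and $\delta$ mean a stronger privacy guarantee, which is consistent with both regret bounds in the abstract growing as $\varepsilon$ shrinks.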
