Paper Title
Trustworthy Online Marketplace Experimentation with Budget-split Design
Paper Authors
Abstract
Online experimentation, also known as A/B testing, is the gold standard for measuring product impact and making business decisions in the tech industry. The validity and utility of experiments, however, hinge on unbiasedness and sufficient power. In two-sided online marketplaces, both requirements are called into question. Bernoulli randomized experiments are biased because treatment units interfere with control units through market competition, violating the "stable unit treatment value assumption" (SUTVA). The experimental power on at least one side of the market is often insufficient because of disparate sample sizes on the two sides. Despite the importance of online marketplaces to the online economy and the crucial role experimentation plays in product improvement, there is a lack of effective and practical solutions to the bias and low power problems in marketplace experimentation. Our paper fills this gap by proposing an experimental design that is unbiased in any marketplace where buyers have a defined budget, which could be finite or infinite. We show that it is more powerful than all other unbiased designs in the literature. We then provide a generalizable system architecture for deploying this design to online marketplaces. Finally, we confirm our findings with empirical performance from experiments run in two real-world online marketplaces.