Paper title
Graph-Based Methods for Discrete Choice
Authors
Abstract
Choices made by individuals have widespread impacts: for instance, people choose between political candidates to vote for, between social media posts to share, and between brands to purchase. Moreover, data on these choices are increasingly abundant. Discrete choice models are a key tool for learning individual preferences from such data. Additionally, social factors like conformity and contagion influence individual choice. Traditional methods for incorporating these factors into choice models do not account for the entire social network and require hand-crafted features. To overcome these limitations, we use graph learning to study choice in networked contexts. We identify three ways in which graph learning techniques can be used for discrete choice: learning chooser representations, regularizing choice model parameters, and directly constructing predictions from a network. We design methods in each category and test them on real-world choice datasets, including county-level 2016 US election results and Android app installation and usage data. We show that incorporating social network structure can improve the predictions of the standard econometric choice model, the multinomial logit. We provide evidence that app installations are influenced by social context, but we find no such effect on app usage among the same participants, which is instead habit-driven. In the election data, we highlight the additional insights that a discrete choice framework provides over classification or regression, the typical approaches. On synthetic data, we demonstrate the sample complexity benefit of using social information in choice models.
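To make two of the ideas named in the abstract concrete, the following is a minimal sketch of a multinomial logit with a graph-based penalty on per-chooser parameters. It is an illustration under stated assumptions, not the paper's implementation: the function names (`mnl_probs`, `laplacian_penalty`, `regularized_nll`), the per-chooser utility vectors `theta`, the weight `lam`, and the toy network are all hypothetical, and the squared-difference penalty over edges is only one plausible way to regularize choice model parameters with a network.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit: P(i) = exp(u_i) / sum_j exp(u_j)."""
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def laplacian_penalty(edges, theta):
    """Sum over edges (a, b) of ||theta[a] - theta[b]||^2.

    This is the graph Laplacian quadratic form, summed over parameter
    dimensions; it is small when connected choosers have similar parameters.
    """
    return sum(
        sum((x - y) ** 2 for x, y in zip(theta[a], theta[b]))
        for a, b in edges
    )

def regularized_nll(choices, edges, theta, lam):
    """Negative log-likelihood of observed choices plus the network penalty.

    choices: list of (chooser, index_of_chosen_item) pairs (assumed format).
    theta:   dict mapping chooser -> list of item utilities (assumed format).
    """
    nll = 0.0
    for chooser, chosen in choices:
        nll -= math.log(mnl_probs(theta[chooser])[chosen])
    return nll + lam * laplacian_penalty(edges, theta)

# Hypothetical toy data: three choosers on a path graph, two items each.
edges = [(0, 1), (1, 2)]
theta = {0: [1.0, 0.0], 1: [0.5, 0.0], 2: [-1.0, 0.0]}
choices = [(0, 0), (1, 0), (2, 1)]

loss = regularized_nll(choices, edges, theta, lam=0.1)
```

In this sketch, setting `lam=0` recovers independent per-chooser logit models, while a larger `lam` pulls the parameters of connected choosers toward each other.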