Paper Title


Chasing Your Long Tails: Differentially Private Prediction in Health Care Settings

Authors

Vinith M. Suriyakumar, Nicolas Papernot, Anna Goldenberg, Marzyeh Ghassemi

Abstract


Machine learning models in health care are often deployed in settings where it is important to protect patient privacy. In such settings, methods for differentially private (DP) learning provide a general-purpose approach to learn models with privacy guarantees. Modern methods for DP learning ensure privacy through mechanisms that censor information judged as too unique. The resulting privacy-preserving models, therefore, neglect information from the tails of a data distribution, resulting in a loss of accuracy that can disproportionately affect small groups. In this paper, we study the effects of DP learning in health care. We use state-of-the-art methods for DP learning to train privacy-preserving models in clinical prediction tasks, including x-ray classification of images and mortality prediction in time series data. We use these models to perform a comprehensive empirical investigation of the tradeoffs between privacy, utility, robustness to dataset shift, and fairness. Our results highlight lesser-known limitations of methods for DP learning in health care, models that exhibit steep tradeoffs between privacy and utility, and models whose predictions are disproportionately influenced by large demographic groups in the training data. We discuss the costs and benefits of differentially private learning in health care.
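The "censoring" the abstract refers to is, in practice, the DP-SGD mechanism (Abadi et al., 2016): each example's gradient is clipped to a fixed norm and Gaussian noise is added before the parameter update, which limits how much any single, unusually unique record can influence the model. The snippet below is a minimal, illustrative sketch of that mechanism only, not the authors' training pipeline; the toy model, synthetic data, and hyperparameters (clip_norm, noise_multiplier) are assumptions for demonstration.

```python
# Minimal DP-SGD sketch: per-example gradient clipping + Gaussian noise.
# Illustrative only; not the paper's models, data, or privacy accounting.
import torch
import torch.nn as nn

torch.manual_seed(0)

clip_norm = 1.0          # maximum per-example gradient L2 norm (C)
noise_multiplier = 1.1   # sigma; noise std is sigma * C
batch_size, n_features = 32, 10

model = nn.Linear(n_features, 1)          # stand-in for a clinical classifier
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Synthetic stand-in for one batch of a binary clinical prediction task.
X = torch.randn(batch_size, n_features)
y = torch.randint(0, 2, (batch_size, 1)).float()

summed_grads = [torch.zeros_like(p) for p in model.parameters()]

# Per-example processing: compute each example's gradient, clip it to C,
# then accumulate. Unusually large (long-tail) gradients are scaled down.
for i in range(batch_size):
    model.zero_grad()
    loss = loss_fn(model(X[i:i + 1]), y[i:i + 1])
    loss.backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = torch.clamp(clip_norm / (total_norm + 1e-6), max=1.0)
    for acc, g in zip(summed_grads, grads):
        acc += g * scale

# Add Gaussian noise calibrated to the clipping norm, average over the
# batch, and take an ordinary SGD step with the noised gradient.
for p, acc in zip(model.parameters(), summed_grads):
    noise = torch.normal(0.0, noise_multiplier * clip_norm, size=acc.shape)
    p.grad = (acc + noise) / batch_size
optimizer.step()
```

Because clipping caps the contribution of any one example and the noise is calibrated to that cap, records from the tails of the data distribution are exactly the ones whose signal is suppressed most, which is the source of the utility and fairness trade-offs the paper studies empirically.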
