Data Minimization for GDPR Compliance in Machine Learning Models

Paper Title

Data Minimization for GDPR Compliance in Machine Learning Models

Authors

Abigail Goldsteen, Gilad Ezov, Ron Shmelkin, Micha Moffie, Ariel Farkash

Abstract

The EU General Data Protection Regulation (GDPR) mandates the principle of data minimization, which requires that only the data necessary to fulfill a certain purpose be collected. However, it can often be difficult to determine the minimal amount of data required, especially in complex machine learning models such as neural networks. We present a first-of-a-kind method to reduce the amount of personal data needed to perform predictions with a machine learning model, by removing or generalizing some of the input features. Our method makes use of the knowledge encoded within the model to produce a generalization that has little to no impact on its accuracy. This enables the creators and users of machine learning models to achieve data minimization in a provable manner.
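
The idea of generalizing an input feature while checking the effect on predictions can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not specify how generalizations are derived from the model, so the example below simply assumes fixed-width bins and measures how often predictions stay unchanged. The names `generalize`, `agreement`, and `toy_model`, the sample records, and the 50,000 income threshold are all hypothetical.

```python
from typing import Callable, Sequence


def generalize(value: float, bin_width: float) -> float:
    """Replace an exact value with the midpoint of its fixed-width bin."""
    bin_start = (value // bin_width) * bin_width
    return bin_start + bin_width / 2


def agreement(
    model: Callable[[Sequence[float]], int],
    records: Sequence[Sequence[float]],
    feature_index: int,
    bin_width: float,
) -> float:
    """Fraction of records whose prediction is unchanged after
    generalizing a single feature (assumed metric, for illustration)."""
    unchanged = 0
    for record in records:
        coarse = list(record)
        coarse[feature_index] = generalize(record[feature_index], bin_width)
        if model(coarse) == model(record):
            unchanged += 1
    return unchanged / len(records)


def toy_model(record: Sequence[float]) -> int:
    """Hypothetical stand-in model: depends on income only, ignores age."""
    _age, income = record
    return int(income >= 50_000)


# Hypothetical (age, income) records.
records = [(23, 41_000), (37, 52_500), (45, 78_000), (61, 49_900), (29, 50_000)]

# Age is irrelevant to this model, so coarse 10-year ranges change nothing.
age_agreement = agreement(toy_model, records, feature_index=0, bin_width=10)

# Income matters: bins aligned with the decision threshold preserve
# predictions, while wider bins that straddle it do not.
income_fine = agreement(toy_model, records, feature_index=1, bin_width=10_000)
income_coarse = agreement(toy_model, records, feature_index=1, bin_width=20_000)

print(age_agreement, income_fine, income_coarse)
```

In this toy setting, a feature the model ignores can be generalized freely, while a feature the model relies on tolerates only generalizations that respect its decision boundary. That is the intuition behind using the model's own knowledge to decide how far each feature can be coarsened.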
