Paper Title
Privacy Preserving K-Means Clustering: A Secure Multi-Party Computation Approach
Authors
Abstract
Knowledge discovery is one of the main goals of Artificial Intelligence. This knowledge is usually stored in databases spread across different environments, which makes accessing and extracting data from them a tedious (or impossible) task. In addition, these data sources may contain private data, so the information can never leave the source. Privacy Preserving Machine Learning (PPML) helps to overcome this difficulty by employing cryptographic techniques that allow knowledge discovery while ensuring data privacy. K-means is one of the data mining techniques used to discover knowledge; it groups data points into clusters that contain similar features. This paper focuses on Privacy Preserving Machine Learning applied to K-means, using recent protocols from the field of cryptography. The algorithm is applied to different scenarios in which data may be distributed either horizontally or vertically.
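To make the horizontally distributed scenario concrete, the following is a minimal sketch of one K-means update step in which each party holds its own rows and only aggregated per-cluster sums and counts are reconstructed. It is an illustration under stated assumptions, not the protocol used in the paper: additive secret sharing is swapped in as a simple stand-in for the secure multi-party computation layer, all parties are simulated in one process, the current centroids are assumed to be public, and the names MODULUS, SCALE, share, secure_sum, local_stats and kmeans_step are hypothetical.

```python
import secrets

# Assumed parameters for this sketch (not taken from the paper).
MODULUS = 2**61 - 1   # modulus for additive sharing
SCALE = 10**6         # fixed-point scale for real-valued coordinates


def share(value, n_parties):
    """Split an integer into n additive shares modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def decode(total):
    """Map a residue back to a signed integer."""
    return total - MODULUS if total > MODULUS // 2 else total


def secure_sum(private_values):
    """Sum one private integer per party; only the total is reconstructed."""
    n = len(private_values)
    # all_shares[i][j] is the share that party i sends to party j.
    all_shares = [share(v % MODULUS, n) for v in private_values]
    # Each party j adds the shares it received and publishes that partial sum.
    partial = [sum(all_shares[i][j] for i in range(n)) % MODULUS
               for j in range(n)]
    return decode(sum(partial) % MODULUS)


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda c: sum((p - q) ** 2
                                 for p, q in zip(point, centroids[c])))


def local_stats(rows, centroids):
    """Per-cluster fixed-point coordinate sums and counts for one party."""
    k, d = len(centroids), len(centroids[0])
    sums = [[0] * d for _ in range(k)]
    counts = [0] * k
    for row in rows:
        c = nearest(row, centroids)
        counts[c] += 1
        for j, x in enumerate(row):
            sums[c][j] += round(x * SCALE)
    return sums, counts


def kmeans_step(parties, centroids):
    """One centroid update over horizontally partitioned data."""
    stats = [local_stats(rows, centroids) for rows in parties]
    new_centroids = []
    for c in range(len(centroids)):
        count = secure_sum([s[1][c] for s in stats])
        if count == 0:
            # Empty cluster: keep the previous centroid.
            new_centroids.append(list(centroids[c]))
            continue
        new_centroids.append([
            secure_sum([s[0][c][j] for s in stats]) / SCALE / count
            for j in range(len(centroids[c]))
        ])
    return new_centroids


if __name__ == "__main__":
    parties = [
        [(0.0, 0.0), (1.0, 0.0)],      # rows held by party A
        [(0.0, 1.0), (10.0, 10.0)],    # rows held by party B
        [(11.0, 10.0)],                # rows held by party C
    ]
    print(kmeans_step(parties, [(0.0, 0.0), (10.0, 10.0)]))
```

In this sketch the aggregated cluster sums and counts become known to every party, and with only two parties each one can infer the other's contribution from the total. The vertically distributed case, where each party holds different features of the same records, needs a secure distance computation and is not covered here.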