Paper Title
Clustering Binary Data by Application of Combinatorial Optimization Heuristics
Authors
Abstract
We study clustering methods for binary data, first defining aggregation criteria that measure the compactness of clusters. Five new methods are introduced, built on neighborhood-based and population-based combinatorial optimization metaheuristics: the first three are simulated annealing, threshold accepting, and tabu search, and the other two are a genetic algorithm and ant colony optimization. The methods are implemented, with the parameters of each heuristic calibrated to ensure good results. Using a set of 16 data tables generated by a quasi-Monte Carlo experiment, the methods are compared, for one of the aggregation criteria based on the L1 dissimilarity, against hierarchical clustering and a version of k-means: partitioning around medoids (PAM). Simulated annealing performs very well, especially compared to the classical methods.
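As a rough illustration of the neighborhood-based approach named in the abstract, the sketch below applies simulated annealing to a partition of binary rows. It is not the authors' implementation. The abstract does not state the exact aggregation criterion, so the code assumes one plausible choice: the sum of L1 distances from each row to its cluster's majority-vote binary center. The function names (`l1_criterion`, `anneal`), the single-object reassignment move, and the geometric cooling schedule with its default parameters are all assumptions made for illustration.

```python
import math
import random


def l1_criterion(data, labels, k):
    """Assumed compactness criterion: total L1 distance from each binary
    row to the majority-vote (binary median) center of its cluster."""
    n_cols = len(data[0])
    total = 0
    for c in range(k):
        members = [row for row, lab in zip(data, labels) if lab == c]
        if not members:
            continue
        for j in range(n_cols):
            ones = sum(row[j] for row in members)
            # The binary median minimizes the L1 distance in each column,
            # so the column's contribution is the size of the minority.
            total += min(ones, len(members) - ones)
    return total


def anneal(data, k, t0=2.0, cooling=0.95, steps_per_temp=50,
           t_min=1e-3, seed=0):
    """Simulated annealing over partitions into k >= 2 clusters.
    All schedule parameters are illustrative defaults, not calibrated."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in data]
    cost = l1_criterion(data, labels, k)
    best_labels, best_cost = labels[:], cost
    t = t0
    while t > t_min:
        for _ in range(steps_per_temp):
            # Neighborhood move: reassign one object to a different cluster.
            i = rng.randrange(len(data))
            old = labels[i]
            new = rng.randrange(k - 1)
            if new >= old:
                new += 1
            labels[i] = new
            new_cost = l1_criterion(data, labels, k)
            delta = new_cost - cost
            # Metropolis rule: always accept improvements, and accept
            # worsening moves with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best_labels, best_cost = labels[:], cost
            else:
                labels[i] = old  # reject the move and restore the label
        t *= cooling  # geometric cooling (an assumed schedule)
    return best_labels, best_cost
```

Calling `anneal(rows, k=2)` on a list of equal-length 0/1 lists returns the best partition visited and its criterion value. Recomputing the criterion from scratch at every move keeps the sketch short; a practical version would update the per-cluster column counts incrementally.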