Title
Novelty Detection for Election Fraud: A Case Study with Agent-Based Simulation Data
Authors
Abstract
In this paper, we propose a robust election simulation model and an independently developed election anomaly detection algorithm that demonstrates the simulation's utility. The simulation generates artificial elections whose properties and trends resemble those of real-world elections, while giving users control over, and knowledge of, every important component of the election. We generate a clean dataset of election results without fraud, as well as datasets with varying degrees of fraud, and then measure how well the algorithm detects the level of fraud present. The algorithm determines how closely actual election results match two predictions: one from polling and one from a regression model built on other regions with similar demographics. We use k-means to partition electoral regions into clusters such that demographic homogeneity within each cluster is maximized. We then apply a novelty detection algorithm, implemented as a one-class Support Vector Machine, in which the clean data is provided in the form of polling predictions and regression predictions. The regression predictions are built from the actual data, so that the data supervises itself. The algorithm's success in identifying fraudulent regions demonstrates the effectiveness of both the simulation technique and the machine learning model.
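
The pipeline described in the abstract (cluster regions by demographics, predict each region's result from polling and from a within-cluster regression, then flag regions whose results deviate from both) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn and NumPy, replaces the agent-based simulation with a toy linear data generator, and assumes the one-class SVM is trained on prediction residuals from a clean election. All names and parameter values (`run_election`, `residual_features`, four clusters, the 0.15 fraud shift, `nu=0.05`) are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
N_REGIONS, N_CLUSTERS = 300, 4  # illustrative sizes, not from the paper

# Toy stand-in for the simulation: demographics drive one candidate's vote share.
demographics = rng.normal(size=(N_REGIONS, 3))
true_share = 0.5 + demographics @ np.array([0.05, -0.03, 0.02])
polls = true_share + rng.normal(scale=0.02, size=N_REGIONS)


def run_election(fraud_mask=None, shift=0.15):
    """Return reported vote shares; fraudulent regions get an inflated share."""
    reported = true_share + rng.normal(scale=0.02, size=N_REGIONS)
    if fraud_mask is not None:
        reported = reported + shift * fraud_mask
    return np.clip(reported, 0.0, 1.0)


def residual_features(reported, labels):
    """Per-region gaps between the reported result and its two predictions."""
    regression_pred = np.empty(N_REGIONS)
    for c in range(N_CLUSTERS):
        idx = labels == c
        # The regression is fit on the reported results of demographically
        # similar regions, so the data supervises itself.
        model = LinearRegression().fit(demographics[idx], reported[idx])
        regression_pred[idx] = model.predict(demographics[idx])
    return np.column_stack([reported - polls, reported - regression_pred])


# Step 1: group regions with similar demographics.
labels = KMeans(n_clusters=N_CLUSTERS, n_init=10, random_state=0).fit_predict(demographics)

# Step 2: learn what residuals look like in a clean election.
clean_features = residual_features(run_election(), labels)
detector = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(clean_features)

# Step 3: score an election in which roughly 10% of regions are fraudulent.
fraud_mask = rng.random(N_REGIONS) < 0.1
fraud_features = residual_features(run_election(fraud_mask), labels)
flagged = detector.predict(fraud_features) == -1  # -1 marks a novelty

recall = flagged[fraud_mask].mean()
false_positive_rate = flagged[~fraud_mask].mean()
print(f"recall={recall:.2f}, false positive rate={false_positive_rate:.2f}")
```

Because the ground-truth `fraud_mask` is known in simulated data, detection quality can be measured directly, which is the advantage the abstract attributes to the simulation. Note that in this sketch fraudulent regions also contaminate the within-cluster regression, so clean regions in a fraudulent election may be flagged somewhat more often than `nu` alone would suggest.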