Paper Title
Recognizing Handwriting Styles in a Historical Scanned Document Using Unsupervised Fuzzy Clustering
Paper Authors
Paper Abstract
The forensic attribution of the handwriting in a digitized document to multiple scribes is a challenging, high-dimensional problem. Handwriting styles can differ in a combination of several factors, including character size, stroke width, loops, ductus, slant angles, and cursive ligatures. Previous work on labeled data with hidden Markov models, support vector machines, and semi-supervised recurrent neural networks has achieved moderate to high success. In this study, we successfully detect hand shifts in a historical manuscript through fuzzy soft clustering in combination with linear principal component analysis. This advance demonstrates the successful deployment of unsupervised methods for writer attribution of historical documents and forensic document analysis.
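To make the combination of linear principal component analysis and fuzzy soft clustering concrete, the following is a minimal sketch, not the authors' implementation. It assumes each manuscript line has already been reduced to a numeric feature vector (for instance character size, stroke width, and slant angle); the data here are synthetic, and the names `pca_project`, `fuzzy_c_means`, and `shift_points` are hypothetical. Standard fuzzy c-means is used as one plausible instance of fuzzy soft clustering.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalized to sum to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Synthetic stand-in for per-line handwriting features (assumption, not paper data):
# 40 lines from one hypothetical scribe followed by 40 lines from another.
rng = np.random.default_rng(42)
hand_a = rng.normal(0.0, 0.5, size=(40, 6))
hand_b = rng.normal(3.0, 0.5, size=(40, 6))
features = np.vstack([hand_a, hand_b])

scores = pca_project(features, n_components=2)
centers, memberships = fuzzy_c_means(scores, n_clusters=2)

# Hard labels from the soft memberships; a hand shift is a change between
# consecutive lines.
labels = memberships.argmax(axis=1)
shift_points = np.flatnonzero(np.diff(labels)) + 1
print("candidate hand shifts at line indices:", shift_points)
```

Because the memberships are soft, lines near a hand shift can be inspected by how evenly their membership is split between clusters, rather than only by the hard label.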