Paper Title
Sparse Concept Coded Tetrolet Transform for Unconstrained Odia Character Recognition
Authors
Abstract
Feature representation based on spatio-spectral decomposition is one of the robust techniques adopted in automatic handwritten character recognition systems. In this regard, we propose a new image representation approach for unconstrained handwritten alphanumeric characters using sparse concept coded Tetrolets. Unlike conventional wavelets, Tetrolets do not use fixed dyadic square blocks for spectral decomposition; instead, they preserve the localized variations in handwriting by adopting tetrominoes that capture the shape geometry. Sparse concept coding of the low-entropy Tetrolet representation is found to extract the important hidden information (concepts) for superior pattern discrimination. Large-scale experiments using ten databases in six different scripts (Bangla, Devanagari, Odia, English, Arabic and Telugu) have been performed. The proposed feature representation, combined with standard classifiers such as random forest, support vector machine (SVM), nearest neighbor and modified quadratic discriminant function (MQDF), is found to achieve state-of-the-art recognition performance on all the databases, viz. 99.40% (MNIST); 98.72% and 93.24% (IITBBS); 99.38% and 99.22% (ISI Kolkata). The proposed OCR system is shown to perform better than other sparse-based techniques such as PCA, SparsePCA and SparseLDA, as well as better than existing transforms (Wavelet, Slantlet and Stockwell).
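The idea of replacing fixed square blocks with tetrominoes can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single 4x4 block, a Haar-type 4x4 matrix applied to the four pixels of each piece, and only three hand-picked tilings (four 2x2 squares, four row bars, four column bars), whereas the full transform searches a much larger set of tetromino tilings. The names `W`, `TILINGS`, `transform_piece` and `best_tiling` are hypothetical, and the sparse concept coding stage is not shown.

```python
# Haar-type orthonormal 4x4 matrix (assumed for this sketch):
# row 0 gives the low-pass coefficient, rows 1-3 give detail coefficients.
W = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
]

# Three illustrative tilings of a 4x4 block into four 4-pixel pieces.
# Each piece is a list of (row, col) cells. This small subset is an
# assumption made to keep the example short.
TILINGS = {
    "squares": [[(r + dr, c + dc) for dr in (0, 1) for dc in (0, 1)]
                for r in (0, 2) for c in (0, 2)],
    "rows": [[(r, c) for c in range(4)] for r in range(4)],
    "columns": [[(r, c) for r in range(4)] for c in range(4)],
}


def transform_piece(block, cells):
    """Apply W to the four pixel values covered by one piece."""
    values = [block[r][c] for r, c in cells]
    return [sum(w * v for w, v in zip(row, values)) for row in W]


def best_tiling(block):
    """Return (name, lowpass, details) for the tiling whose detail
    coefficients have the smallest l1 norm, i.e. the sparsest one."""
    best = None
    for name, pieces in TILINGS.items():
        coeffs = [transform_piece(block, cells) for cells in pieces]
        lowpass = [c[0] for c in coeffs]
        details = [d for c in coeffs for d in c[1:]]
        cost = sum(abs(d) for d in details)
        if best is None or cost < best[0]:
            best = (cost, name, lowpass, details)
    return best[1:]


if __name__ == "__main__":
    # A block of horizontal stripes: the row-bar tiling follows the
    # local geometry, so all of its detail coefficients vanish.
    stripes = [
        [1, 1, 1, 1],
        [5, 5, 5, 5],
        [2, 2, 2, 2],
        [9, 9, 9, 9],
    ]
    name, lowpass, details = best_tiling(stripes)
    print(name, lowpass, details)
```

For the striped block, the fixed 2x2 squares of a conventional Haar decomposition cut across the stripes and leave non-zero details, while the row-bar tiling aligns with them. This is the sense in which an adaptive tiling can yield a lower-entropy representation.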