Unsupervised deep learning for text line segmentation

Paper title

Unsupervised deep learning for text line segmentation

Authors

Barakat, Berat Kurar; Droby, Ahmad; Alasam, Rym; Madi, Boraq; Rabaev, Irina; Shammes, Raed; El-Sana, Jihad

Abstract

We present an unsupervised deep learning method for text line segmentation that is inspired by the relative variance between text lines and spaces among text lines. Handwritten text line segmentation is important for the efficiency of further processing. A common method is to train a deep learning network for embedding the document image into an image of blob lines that are tracing the text lines. Previous methods learned such embedding in a supervised manner, requiring the annotation of many document images. This paper presents an unsupervised embedding of document image patches without a need for annotations. The number of foreground pixels over the text lines is relatively different from the number of foreground pixels over the spaces among text lines. Generating similar and different pairs relying on this principle definitely leads to outliers. However, as the results show, the outliers do not harm the convergence and the network learns to discriminate the text lines from the spaces between text lines. Remarkably, with a challenging Arabic handwritten text line segmentation dataset, VML-AHTE, we achieved superior performance over the supervised methods. Additionally, the proposed method was evaluated on the ICDAR 2017 and ICFHR 2010 handwritten text line segmentation datasets.
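
The pair-generation principle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the patch size, the fixed threshold, the random pairing strategy, and the helper names (`foreground_count`, `extract_patches`, `make_pairs`) are all assumptions chosen for illustration. It only shows how foreground pixel counts alone could label patch pairs as similar or different without any annotation.

```python
import random


def foreground_count(patch):
    """Count foreground pixels (value 1) in a binary patch given as a list of rows."""
    return sum(sum(row) for row in patch)


def extract_patches(image, size):
    """Split a binary image (list of rows of 0/1) into non-overlapping size x size patches."""
    patches = []
    for top in range(0, len(image) - size + 1, size):
        for left in range(0, len(image[0]) - size + 1, size):
            patches.append([row[left:left + size] for row in image[top:top + size]])
    return patches


def make_pairs(patches, threshold, n_pairs, seed=0):
    """Draw random patch pairs and label them without annotations.

    Label 1 (similar) if both patches fall on the same side of the
    foreground-count threshold, otherwise 0 (different).
    The threshold is a hypothetical parameter, not a value from the paper.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        a, b = rng.sample(patches, 2)
        same_side = (foreground_count(a) >= threshold) == (foreground_count(b) >= threshold)
        pairs.append((a, b, 1 if same_side else 0))
    return pairs


# Toy "page": rows 0-1 and 4-5 are dense text lines, rows 2-3 and 6-7 are empty gaps.
toy_image = [[1] * 8 if (r % 4) < 2 else [0] * 8 for r in range(8)]
patches = extract_patches(toy_image, 2)
pairs = make_pairs(patches, threshold=2, n_pairs=10)
```

On this toy page every label is correct because text rows and gaps are perfectly separated. On real handwriting, ascenders, descenders, and touching lines would put some patches on the wrong side of any fixed threshold, which is the kind of outlier the abstract says the network tolerates. The labeled pairs would then feed a pairwise (for example, siamese-style) training objective, which is outside the scope of this sketch.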
