Weakly-Supervised Arbitrary-Shaped Text Detection with Expectation-Maximization Algorithm

Paper title

Weakly-Supervised Arbitrary-Shaped Text Detection with Expectation-Maximization Algorithm

Authors

Mengbiao Zhao, Wei Feng, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu

Abstract

Arbitrary-shaped text detection is an important and challenging task in computer vision. Most existing methods require heavy data labeling efforts to produce polygon-level text region labels for supervised training. In order to reduce the cost in data labeling, we study weakly-supervised arbitrary-shaped text detection for combining various weak supervision forms (e.g., image-level tags, coarse, loose and tight bounding boxes), which are far easier for annotation. We propose an Expectation-Maximization (EM) based weakly-supervised learning framework to train an accurate arbitrary-shaped text detector using only a small amount of polygon-level annotated data combined with a large amount of weakly annotated data. Meanwhile, we propose a contour-based arbitrary-shaped text detector, which is suitable for incorporating weakly-supervised learning. Extensive experiments on three arbitrary-shaped text benchmarks (CTW1500, Total-Text and ICDAR-ArT) show that (1) using only 10% strongly annotated data and 90% weakly annotated data, our method yields comparable performance to state-of-the-art methods, (2) with 100% strongly annotated data, our method outperforms existing methods on all three benchmarks. We will make the weakly annotated datasets publicly available in the future.
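The abstract does not spell out the EM procedure, so the following is only a toy sketch of the general pattern it names: alternate between inferring hidden precise labels for weakly annotated samples (E-step) and refitting the model on strong plus inferred labels (M-step). Everything here is an illustrative assumption, not the authors' detector: the 1-D data, the nearest-class-mean "model", the hard (argmax) E-step, and the hypothetical names `fit_means`, `e_step` and `em_train`. A weak annotation is modeled as a set of allowed labels, loosely analogous to a loose bounding box constraining where the true polygon can lie.

```python
from statistics import mean


def fit_means(samples):
    """M-step: estimate one mean per class from (x, label) pairs."""
    classes = {y for _, y in samples}
    return {c: mean(x for x, y in samples if y == c) for c in classes}


def e_step(means, weak):
    """Hard E-step: for each weakly annotated x, choose the allowed label
    whose current class mean is closest to x."""
    return [
        (x, min(sorted(allowed), key=lambda c: abs(x - means[c])))
        for x, allowed in weak
    ]


def em_train(strong, weak, n_iters=10):
    """Alternate E- and M-steps, starting from the strongly labeled data.

    Assumes `strong` contains at least one sample of every class.
    """
    means = fit_means(strong)  # initialize from strong labels only
    for _ in range(n_iters):
        pseudo = e_step(means, weak)             # infer hidden labels
        new_means = fit_means(strong + pseudo)   # refit on strong + inferred
        if new_means == means:                   # stop once nothing changes
            break
        means = new_means
    return means


# Toy data: 2 strongly labeled points, 5 weakly labeled ones.
strong = [(0.0, 0), (4.0, 1)]
weak = [(0.5, {0, 1}), (1.0, {0, 1}), (3.0, {0, 1}), (3.5, {1}), (1.8, {1})]
means = em_train(strong, weak)
```

In this toy run the weak constraint `{1}` forces the point at 1.8 into class 1 even though it lies closer to the class-0 mean, which is the role a weak annotation plays: it narrows the set of labels the E-step may choose from. A real detector would replace the class means with a trained network and the label sets with box or image-level constraints.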
