PatchRefineNet: Improving Binary Segmentation by Incorporating Signals from Optimal Patch-wise Binarization

Paper Title

PatchRefineNet: Improving Binary Segmentation by Incorporating Signals from Optimal Patch-wise Binarization

Authors

Savinay Nagendra, Chaopeng Shen, Daniel Kifer

Abstract

The purpose of binary segmentation models is to determine which pixels belong to an object of interest (e.g., which pixels in an image are part of roads). The models assign a logit score (i.e., probability) to each pixel, and these scores are converted into predictions by thresholding (i.e., each pixel with logit score $\geq \tau$ is predicted to be part of a road). However, a common phenomenon in current and former state-of-the-art segmentation models is spatial bias: in some patches, the logit scores are consistently biased upwards, and in others they are consistently biased downwards. These biases cause false positives and false negatives in the final predictions. In this paper, we propose PatchRefineNet (PRN), a small network that sits on top of a base segmentation model and learns to correct its patch-specific biases. Across a wide variety of base models, PRN consistently helps them improve mIoU by 2-3%. One of the key ideas behind PRN is the addition of a novel supervision signal during training. Given the logit scores produced by the base segmentation model, each pixel is given a pseudo-label that is obtained by optimally thresholding the logit scores in each image patch. Incorporating these pseudo-labels into the loss function of PRN helps correct systematic biases and reduce false positives/negatives. Although we mainly focus on binary segmentation, we also show how PRN can be extended to saliency detection and few-shot segmentation. We also discuss how the ideas can be extended to multiclass segmentation.
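
The patch-wise pseudo-labeling idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes the per-pixel scores are probabilities in [0, 1], that "optimal" means maximizing IoU against the ground-truth mask within each patch, and that thresholds are searched over a fixed grid. The abstract does not specify any of these choices. The names `iou` and `patchwise_pseudo_labels`, the patch size, and the threshold grid are hypothetical.

```python
import numpy as np


def iou(pred, target):
    """Intersection over union of two masks; defined as 1.0 when both are empty."""
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0
    return np.logical_and(pred, target).sum() / union


def patchwise_pseudo_labels(scores, target, patch=2, thresholds=None):
    """Binarize each patch of `scores` at the grid threshold with the best IoU.

    Assumption: IoU against `target` is the selection criterion; the paper
    may use a different objective, patch layout, or threshold search.
    """
    if thresholds is None:
        thresholds = np.linspace(0.05, 0.95, 19)  # hypothetical candidate grid
    target = target.astype(bool)
    pseudo = np.zeros(scores.shape, dtype=bool)
    height, width = scores.shape
    for i in range(0, height, patch):
        for j in range(0, width, patch):
            s = scores[i:i + patch, j:j + patch]
            t = target[i:i + patch, j:j + patch]
            # Pick the first candidate threshold that maximizes patch IoU.
            best = max(thresholds, key=lambda tau: iou(s >= tau, t))
            pseudo[i:i + patch, j:j + patch] = s >= best
    return pseudo


# Toy example: the left 2x2 patch is biased upwards, the right one downwards.
scores = np.array([[0.92, 0.71, 0.41, 0.18],
                   [0.83, 0.62, 0.33, 0.12]])
target = np.array([[1, 0, 1, 0],
                   [1, 0, 1, 0]])

global_pred = scores >= 0.5  # single global threshold
pseudo = patchwise_pseudo_labels(scores, target, patch=2)

print("global IoU:    ", iou(global_pred, target))
print("patch-wise IoU:", iou(pseudo, target))
```

In this toy input, the single global threshold of 0.5 over-predicts the left patch and misses the right patch entirely, while the per-patch thresholds recover the ground-truth mask. In PRN, per the abstract, pseudo-labels of this kind serve as an additional supervision signal in the loss rather than as final predictions.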
