Sample hardness based gradient loss for long-tailed cervical cell detection

Paper title

Sample hardness based gradient loss for long-tailed cervical cell detection

Authors

Minmin Liu, Xuechen Li, Xiangbo Gao, Junliang Chen, Linlin Shen, Huisi Wu

Abstract

Because cancer samples are difficult to collect and annotate, cervical cancer datasets usually exhibit a long-tailed data distribution. When training a detector to find cancer cells in a WSI (whole slide image) captured from a TCT (ThinPrep cytology test) specimen, head categories (e.g. normal cells and inflammatory cells) typically have far more samples than tail categories (e.g. cancer cells). Most existing state-of-the-art long-tailed learning methods in object detection rely on category distribution statistics to handle the long-tailed scenario and do not consider the "hardness" of each sample. To address this problem, in this work we propose a Grad-Libra Loss that leverages gradients to dynamically calibrate the degree of hardness of each sample for different categories and to re-balance the gradients of positive and negative samples. Our loss can thus help the detector put more emphasis on hard samples in both head and tail categories. Extensive experiments on a long-tailed TCT WSI dataset show that mainstream detectors (e.g. RepPoints, FCOS, ATSS, YOLOF) trained with the proposed Grad-Libra Loss achieve a much higher mAP (by 7.8%) than those trained with the cross-entropy classification loss.
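The abstract does not give the Grad-Libra formulation, so the following is only a rough sketch of the general idea it describes, not the authors' loss. It assumes a sigmoid classifier with binary cross-entropy, for which the gradient with respect to the logit is `p - y`; the gradient magnitude `|p - y|` is used as a per-sample hardness score, and positive and negative samples are normalised separately so that neither group dominates. The function name `hardness_weighted_bce` and the `gamma` exponent are hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def hardness_weighted_bce(logits, targets, gamma=1.0):
    """Illustrative hardness-weighted BCE (NOT the Grad-Libra Loss itself).

    Returns the scalar loss and the per-sample hardness scores.
    """
    logits = np.asarray(logits, dtype=float)
    targets = np.asarray(targets, dtype=float)
    p = sigmoid(logits)

    # For sigmoid + BCE, dL/dlogit = p - y, so |p - y| is the gradient magnitude.
    hardness = np.abs(p - targets)

    eps = 1e-12
    bce = -(targets * np.log(p + eps) + (1.0 - targets) * np.log(1.0 - p + eps))

    # Harder samples (larger gradient magnitude) get larger raw weights.
    raw = hardness ** gamma

    # Normalise positives and negatives separately so each group
    # contributes the same total weight to the loss.
    pos = targets == 1
    neg = ~pos
    weights = np.zeros_like(raw)
    if pos.any():
        weights[pos] = raw[pos] / (raw[pos].sum() + eps)
    if neg.any():
        weights[neg] = raw[neg] / (raw[neg].sum() + eps)

    return float((weights * bce).sum()), hardness


# Two positives (one easy, one hard) and two negatives (one easy, one hard).
loss, hardness = hardness_weighted_bce([4.0, -0.5, -3.0, 2.0], [1, 1, 0, 0])
```

In this toy call the misclassified positive (logit -0.5) and the misclassified negative (logit 2.0) receive the largest hardness scores, regardless of how many samples their category has, which is the contrast with purely frequency-based re-weighting that the abstract draws.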
