Paper Title

Counting Varying Density Crowds Through Density Guided Adaptive Selection CNN and Transformer Estimation

Authors

Yuehai Chen, Jing Yang, Badong Chen, Shaoyi Du

Abstract

In real-world crowd counting applications, crowd density varies greatly within an image. When facing density variation, humans tend to locate and count individual targets in low-density regions and to infer the number of people in high-density regions. We observe that CNNs focus on local information correlation through fixed-size convolution kernels, whereas Transformers can effectively extract semantic crowd information through the global self-attention mechanism. Thus, a CNN can locate and estimate crowds accurately in low-density regions but struggles to perceive density properly in high-density regions. Conversely, a Transformer is highly reliable in high-density regions but fails to locate targets in sparse regions. Neither handles this kind of density variation well on its own. To address this problem, we propose a CNN and Transformer Adaptive Selection Network (CTASNet), which adaptively selects the appropriate counting branch for regions of different density. First, CTASNet generates predictions from both the CNN and the Transformer. Then, because the CNN suits low-density regions and the Transformer suits high-density regions, a density-guided adaptive selection module is designed to combine the two predictions automatically. Moreover, to reduce the influence of annotation noise, we introduce a Correntropy-based optimal transport loss. Extensive experiments on four challenging crowd counting datasets validate the proposed method.
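
The selection idea described in the abstract can be illustrated with a minimal sketch. This is not CTASNet itself: the abstract does not say how the selection module is built, so the sigmoid gate, the `threshold` and `sharpness` parameters, the averaged guide density, and every function name below are assumptions introduced only for illustration. The sketch blends two per-region density maps so that the CNN estimate dominates where density is low and the Transformer estimate dominates where it is high.

```python
import math


def select_weight(density, threshold=1.0, sharpness=4.0):
    """Hypothetical soft gate in [0, 1].

    Close to 0 for sparse regions (favor the CNN branch),
    close to 1 for dense regions (favor the Transformer branch).
    """
    return 1.0 / (1.0 + math.exp(-sharpness * (density - threshold)))


def combine_density_maps(cnn_map, transformer_map, threshold=1.0, sharpness=4.0):
    """Blend two density maps given as equally shaped lists of lists."""
    combined = []
    for cnn_row, tr_row in zip(cnn_map, transformer_map):
        row = []
        for c, t in zip(cnn_row, tr_row):
            # Assumption: the mean of both branches serves as the density guide.
            guide = 0.5 * (c + t)
            w = select_weight(guide, threshold, sharpness)
            row.append((1.0 - w) * c + w * t)
        combined.append(row)
    return combined


def total_count(density_map):
    """The predicted crowd count is the sum over the density map."""
    return sum(sum(row) for row in density_map)


if __name__ == "__main__":
    cnn_pred = [[0.0, 5.0]]          # toy values, not real model output
    transformer_pred = [[0.2, 7.0]]  # toy values, not real model output
    fused = combine_density_maps(cnn_pred, transformer_pred)
    print(fused, total_count(fused))
```

A soft gate is used here rather than a hard switch so that the blended map changes smoothly across region boundaries; a learned selection module would replace the fixed threshold.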
