Paper Title
Sequence Information Channel Concatenation for Improving Camera Trap Image Burst Classification
Paper Authors
Paper Abstract
Camera traps are widely used to observe wildlife in its natural habitat without disturbing the ecosystem. This can help in the early detection of natural or human threats to animals and support ecological conservation. A massive number of such camera traps have been deployed in ecological conservation areas around the world, collecting data for decades, which calls for automation to detect the images that contain animals. Existing systems classify whether an image contains an animal by considering that single image. However, in challenging scenes where animals are camouflaged in their natural habitat, it is sometimes difficult to identify the presence of an animal from a single image alone. We hypothesize that a short burst of images instead of a single image, assuming that the animal moves, makes it much easier for a human as well as a machine to detect the presence of animals. In this work, we explore a variety of approaches and measure the impact of using short image sequences (bursts of 3 images) on camera trap image classification. We show that concatenating masks containing sequence information with the images from the 3-image burst across channels improves the ROC AUC by 20% on a test set from unseen camera sites, compared to an equivalent model that learns from a single image.
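As a rough illustration of what concatenating sequence masks and burst images across channels could look like, the NumPy sketch below stacks the three frames of a burst together with two motion masks into one multi-channel input. The abstract does not say how the masks are computed; the thresholded frame difference used here, the function name burst_to_input, and the threshold value are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def burst_to_input(burst, threshold=0.1):
    """Stack a 3-image burst and simple motion masks along the channel axis.

    burst: array of shape (3, H, W, C) with values in [0, 1].
    Returns an array of shape (H, W, 3 * C + 2).

    The masks are a hypothetical stand-in for "sequence information":
    thresholded absolute differences between consecutive grayscale frames.
    """
    burst = np.asarray(burst, dtype=np.float32)
    if burst.ndim != 4 or burst.shape[0] != 3:
        raise ValueError("expected a burst of shape (3, H, W, C)")

    # Grayscale version of each frame, shape (3, H, W).
    gray = burst.mean(axis=-1)

    # Assumed masks: 1 where consecutive frames differ by more than threshold.
    diffs = np.abs(np.diff(gray, axis=0))            # shape (2, H, W)
    masks = (diffs > threshold).astype(np.float32)
    masks = np.moveaxis(masks, 0, -1)                # shape (H, W, 2)

    # Lay the three frames side by side in the channel axis: (H, W, 3 * C).
    frames = np.concatenate(list(burst), axis=-1)

    # Final classifier input: image channels followed by mask channels.
    return np.concatenate([frames, masks], axis=-1)


# Toy burst: a single bright pixel appears only in the middle frame.
toy_burst = np.zeros((3, 4, 5, 3), dtype=np.float32)
toy_burst[1, 1, 2, :] = 1.0
toy_input = burst_to_input(toy_burst)
```

A classifier that normally takes a C-channel image would need its first layer widened to accept 3 * C + 2 channels; the single-image baseline mentioned in the abstract corresponds to feeding only one frame.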