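Computer science
Artificial intelligence
Object detection
Computer vision
Feature (linguistics)
Pyramid (geometry)
Channel (broadcasting)
Feature extraction
Aerial imagery
Pixel
Pattern recognition (psychology)
Image (mathematics)
Mathematics
Telecommunications
Philosophy
Linguistics
Geometry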
Authors
Yongle Zhang, Zhanchuan Cai
Identifier
DOI:10.1109/tgrs.2023.3299651
Abstract
Thermal infrared (TIR) technology is crucial for wildlife detection with unmanned aerial vehicles (UAVs), allowing operators to explore and detect at night. However, the images captured by TIR cameras are unavoidably affected by challenges such as image jitter, wildlife overlap, and fog, which may drastically decrease wildlife detection ability. To overcome these challenges, we propose a high-accuracy infrared object detection method called channel enhancement RetinaNet (CE-RetinaNet). First, a new channel enhancement (CE) module is proposed to strengthen the feature extraction of infrared images. Then, a new batch-norm stochastic channel attention (BSCA) module is proposed to filter occlusion-caused anomalous activations and focus on the pixels at the same position across channels. Next, a path augmentation (PA) operation is added after the feature pyramid network (FPN) to improve the localization capability at the entire feature level. Finally, we modify the output strategy of the classification and regression subnets. Additionally, we built a TIR wildlife detection dataset called the Infrared Salient Object Detection (ISOD) dataset, comprising 2534 images, which is available at https://doi.org/10.5281/zenodo.7445307. We conduct extensive experiments on both public and ISOD datasets, and the experimental results reveal that CE-RetinaNet obtains higher average precision (AP) (e.g., 11.3% more) and Recall (e.g., 11.6% more) compared to other state-of-the-art object detectors.
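The abstract reports gains in AP and Recall but does not describe the evaluation protocol. As a rough illustration of how detection precision and recall are commonly computed for one image, the sketch below assumes axis-aligned boxes given as (x1, y1, x2, y2), greedy matching in descending score order, and an IoU threshold of 0.5. The names `iou` and `precision_recall`, the box format, and the threshold are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    detections: List[Tuple[Box, float]],
    ground_truth: List[Box],
    iou_threshold: float = 0.5,  # assumed threshold, not from the paper
) -> Tuple[float, float]:
    """Greedily match detections (highest score first) to ground-truth boxes."""
    matched = set()  # indices of ground-truth boxes already claimed
    true_pos = 0
    for box, _score in sorted(detections, key=lambda d: d[1], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx >= 0 and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_pos += 1
    # Unmatched detections are false positives; unmatched ground truth are misses.
    precision = true_pos / len(detections) if detections else 0.0
    recall = true_pos / len(ground_truth) if ground_truth else 0.0
    return precision, recall


if __name__ == "__main__":
    gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
    dets = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]
    print(precision_recall(dets, gts))
```

AP additionally summarizes precision over all recall levels as the score threshold varies, so a single precision/recall pair like this one is only one point on that curve.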