计算机科学
遥感
频道(广播)
目标检测
计算机视觉
人工智能
模式识别(心理学)
地质学
电信
作者
Guozheng Nan,Yue Zhao,Liyong Fu,Qiaolin Ye
标识
DOI:10.1109/jstars.2024.3388013
摘要
Smart satellites and unmanned aerial vehicles (UAVs) are typically equipped with visible light and infrared (IR) spectrum sensors. However, achieving real-time object detection utilizing these multimodal data on such resource-limited devices is a challenging task. This paper proposes HyperYOLO, a real-time lightweight object detection framework for multimodal remote sensing images. First, we propose a lightweight multimodal fusion module named Channel and Spatial Exchange (CSE) to effectively extract complementary information from different modalities. The CSE module consists of two stages: channel exchange and spatial exchange. Channel exchange achieves global fusion by learning global weights to better utilize cross-channel information correlation, while spatial exchange captures details by considering spatial relationships to calibrate local fusion. Second, we propose an effective auxiliary branch module based on the feature pyramid network for super resolution (FPNSR) to enhance the framework's responsiveness to small objects by learning high-quality feature representations. Moreover, we embed a coordinate attention mechanism to assist our network in precisely localizing and attending to the objects of interest. The experimental results show that on the VEDAI remote sensing dataset, HyperYOLO achieves a 76.72% mAP 50 , surpassing the SOTA SuperYOLO by 1.63%. Meanwhile, the parameter size and GFLOPs of HyperYOLO are about 1.34 million (28%) and 3.97 (22%) less than SuperYOLO, respectively. In addition, HyperYOLO has a file size of only 7.3 MB after the removal of the auxiliary FPNSR branch, which makes it easier to deploy on these resource-constrained devices.
科研通智能强力驱动
Strongly Powered by AbleSci AI