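Modal verb
Remote sensing
Computer science
Object detection
Object (grammar)
Artificial intelligence
Computer vision
Pattern recognition (psychology)
Geology
Materials science
Polymer chemistry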
Authors
Kim-Phuong Phung,Ngoc-Long Le,Quang-Uy Nguyen
Identifier
DOI:10.1109/rivf60135.2023.10471841
Abstract
Object detection in remote sensing images (RSI) poses a significant challenge, particularly in accurately detecting small objects across different scales and rotations. State-of-the-art object detection solutions often rely on complex deep neural networks, leading to high computational costs. In this paper, we propose a novel approach to enhance the performance of the YOLO series on RSI by combining the strengths of SuperYOLO, which is based on YOLOv5, with those of YOLOv8. Our approach leverages the multi-modal data fusion capabilities of SuperYOLO to extract complementary information from diverse data sources while incorporating key advancements from YOLOv8. First, to improve efficiency, we introduce two architectural modifications: we replace the Conv6x6 layer with a 3x3 layer and substitute the C3 layer of SuperYOLO with the C2f layer of YOLOv8. These modifications aim to combine high-level features with contextual information and enhance the model's detection capabilities. Additionally, we propose using soft-NMS instead of non-maximum suppression (NMS) during the post-processing stage of the model. Soft-NMS offers improved object localization and reduces duplicate detections, thereby enhancing overall detection accuracy. Experimental evaluations conducted on the widely used VEDAI RS dataset affirm the effectiveness of our approach in achieving a good balance between accuracy and computational efficiency compared to state-of-the-art YOLO variants.
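The abstract does not say which soft-NMS variant or hyperparameters the authors use, so the following is only a minimal sketch of the general idea, assuming the Gaussian score-decay form, axis-aligned boxes given as (x1, y1, x2, y2), and illustrative values for `sigma` and `score_thresh`. The function names are hypothetical and are not taken from the paper or from any YOLO codebase. Standard NMS discards every box whose overlap with a higher-scoring box exceeds a threshold, whereas soft-NMS keeps such boxes with a score reduced in proportion to the overlap.

```python
import math


def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft-NMS sketch (sigma and score_thresh are assumed values).

    Returns a list of (box_index, decayed_score) in selection order.
    """
    # Candidates still waiting to be selected, as (index, current score).
    remaining = [(i, s) for i, s in enumerate(scores) if s >= score_thresh]
    keep = []
    while remaining:
        # Select the candidate with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda p: remaining[p][1])
        best_idx, best_score = remaining.pop(best_pos)
        keep.append((best_idx, best_score))
        # Decay the others by their overlap with the selected box,
        # instead of deleting them outright as hard NMS would.
        decayed = []
        for i, s in remaining:
            overlap = iou(boxes[best_idx], boxes[i])
            s = s * math.exp(-(overlap ** 2) / sigma)
            if s >= score_thresh:
                decayed.append((i, s))
        remaining = decayed
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    for idx, score in soft_nms(boxes, scores):
        print(idx, round(score, 3))
```

In this toy input the second box overlaps the first heavily, so it survives with a lowered score and is ranked after the non-overlapping third box. Hard NMS with a typical IoU threshold of 0.5 would have removed it entirely.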