Remote Sensing Small Object Detection Network Based on Multi-Scale Feature Extraction and Information Fusion

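Computer science; Feature (linguistics); Discriminative; Benchmark (surveying); Object detection; Artificial intelligence; Feature extraction; Data mining; Backbone network; Object (grammar); Pattern recognition (psychology); Geodesy; Computer network; Linguistics; Philosophy; Geography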

Authors

Junsuo Qu,Tong Liu,Zongbing Tang,Yifei Duan,H Yao,Jiao Hu

Source

Journal: Remote Sensing [MDPI AG]
Date: 2025-03-05 Volume (Issue): 17 (5): 913-913 Citations: 2

Links

mdpi.com doaj.org doi.org

Identifier

DOI: 10.3390/rs17050913

Abstract

Object detection algorithms are now widely used in many scenarios, but some special scenarios additionally require the detection of small objects. Small objects offer fewer usable features, suffer from unbalanced samples, demand higher positioning accuracy, and are covered by fewer datasets, so small object detection is more complex than general object detection, and the detection performance of models on small objects is not ideal. This paper therefore takes YOLOXs as the benchmark network and enhances the feature information on small objects by improving the network's structure, so as to improve the model's detection of small objects. The specific research is as follows. Neck networks based on an FPN and its variants are prone to information loss when fusing features from non-adjacent layers. To address this problem, the paper proposes a feature fusion and distribution module, which replaces the deep-to-shallow information transmission path in the neck network of YOLOXs. The method first fuses and extracts the feature layers that the backbone network provides for prediction, obtaining global feature information that covers objects of multiple sizes. The global feature information is then distributed to each prediction branch so that high-level semantic information and fine-grained information are integrated more efficiently, which helps the model learn the discriminative information on small objects and classify them correctly. Finally, the method was tested on the VisDrone2021 dataset. Its images have a standard size of 1080p (1920 × 1080), and its video is usually recorded at 30 frames per second (fps), so the dataset offers high spatial and temporal resolution and suits both the detection of objects of various sizes and dynamic object detection tasks. When the module was integrated into a YOLOXs network (named the FE-YOLO network) together with three improvements concerning the feature layer, channel number, and max pooling, the mAP and APs increased by 1.0% and 0.8%, respectively. Compared with YOLOV5m, YOLOV7-Tiny, FCOS, and other advanced models, FE-YOLO obtains the best performance.
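
The fuse-then-distribute idea in the abstract can be illustrated with a minimal sketch. This is not the authors' FE-YOLO code. The function names `resize_nearest`, `fuse`, and `distribute`, the averaging used for fusion, the addition used for distribution, and the toy single-channel maps are all assumptions made for illustration; the abstract does not specify the actual operations.

```python
# Illustrative sketch only, not the authors' implementation.
# Feature maps are single-channel 2-D lists so the example needs no libraries.

def resize_nearest(fmap, out_h, out_w):
    """Resize a 2-D map to (out_h, out_w) by nearest-neighbour sampling."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

def fuse(levels):
    """Align every level to the middle level's size and average them into
    one global map (an assumed stand-in for the fusion step)."""
    mid = levels[len(levels) // 2]
    h, w = len(mid), len(mid[0])
    aligned = [resize_nearest(f, h, w) for f in levels]
    return [
        [sum(a[r][c] for a in aligned) / len(aligned) for c in range(w)]
        for r in range(h)
    ]

def distribute(levels, global_map):
    """Resize the global map to each level's size and add it back, so every
    prediction branch receives information from all scales."""
    enhanced = []
    for f in levels:
        h, w = len(f), len(f[0])
        g = resize_nearest(global_map, h, w)
        enhanced.append([[f[r][c] + g[r][c] for c in range(w)] for r in range(h)])
    return enhanced

# Hypothetical 8x8, 4x4, and 2x2 maps standing in for three backbone outputs.
shallow = [[1.0] * 8 for _ in range(8)]
middle = [[2.0] * 4 for _ in range(4)]
deep = [[3.0] * 2 for _ in range(2)]

levels = [shallow, middle, deep]
global_map = fuse(levels)
enhanced = distribute(levels, global_map)
```

In this sketch every branch receives the same global map directly, rather than information relayed layer by layer from deep to shallow, which is the path the abstract says the module replaces.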
