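Optics (focusing)
Minimum bounding box
Computer science
Block (permutation group theory)
Convolution (computer science)
Bounding overwatch
Pattern recognition (psychology)
Feature (linguistics)
Fusion
Sensor fusion
Haar wavelet transform
Computer vision
Object detection
Ranging
Wavelet
Complement (music)
Feature extraction
Wavelet transform
Gaussian distribution
Artificial intelligence
Spatial analysis
Hyperspectral imaging
Attention network
Convolutional neural network
Key (lock)
Discriminant
Spatial contextual awareness
Gaussian process
Location awareness
False alarm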
Authors
Weijie Xu, Zhenxing Ding, Ziheng Wang, Zhiqing Cui, Yifan Hu, Feng Jiang
Identifier
DOI:10.1109/tgrs.2025.3612417
Abstract
Infrared small target detection (IRSTD) remains challenging due to the extremely low signal-to-noise ratio (SNR). Existing methods struggle to balance accuracy and speed, especially under limited computational resources. To address these issues, we propose the frequency-spatial contextual fusion network (FSCFNet), built on the You Only Look Once (YOLO) v10n architecture. Specifically, a novel frequency-spatial convolution (FSConv) decomposes input features via the Haar wavelet transform. High-frequency cues focus on local details to highlight small targets, while low-frequency cues provide global information that complements the spatial features. An asymmetric cross-domain attention (ACA) module is then developed to enhance extraction of the local central feature, which reflects the typical spatial Gaussian pattern of small targets. Furthermore, we introduce a customized multi-scale receptive contextual block (MRCB) that captures long-range information by leveraging diverse dilated convolutions. In addition, the Wasserstein Distance Loss (WDL) is used to improve bounding box quality. Extensive experiments on three public datasets (IRSTD-1k, NUDT-SIRST, and NUAA-SIRST) confirm the effectiveness of FSCFNet. Notably, FSCFNet surpasses the baseline by 4.7% in precision, 3.3% in recall, and 3.9% in AP@50 on IRSTD-1k, with only a 3.6% increase in parameters. FSCFNet provides a robust solution for real-time infrared surveillance systems in resource-constrained environments. More comparisons are shown in Fig. 1.
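The frequency split that the abstract attributes to FSConv can be illustrated with a single-level 2D Haar wavelet transform. The following is only a minimal sketch in plain Python, not the authors' FSConv layer: the function name `haar_dwt2`, the sub-band names, and the orthonormal scaling are assumptions made for illustration, and the abstract does not say how the sub-bands are combined with the spatial features afterwards.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def haar_dwt2(x: Matrix) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    """Single-level 2D Haar transform with orthonormal scaling (hypothetical helper).

    Each non-overlapping 2x2 block [[a, b], [c, d]] is mapped to one
    low-frequency coefficient and three high-frequency (detail) coefficients.
    Returns (low, col_diff, row_diff, diag), each half the input size.
    """
    rows, cols = len(x), len(x[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")

    low, col_diff, row_diff, diag = [], [], [], []
    for i in range(0, rows, 2):
        r_low, r_col, r_row, r_diag = [], [], [], []
        for j in range(0, cols, 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            r_low.append((a + b + c + d) / 2)   # scaled block sum: coarse, global content
            r_col.append((a - b + c - d) / 2)   # left column minus right column
            r_row.append((a + b - c - d) / 2)   # top row minus bottom row
            r_diag.append((a - b - c + d) / 2)  # diagonal minus anti-diagonal
        low.append(r_low)
        col_diff.append(r_col)
        row_diff.append(r_row)
        diag.append(r_diag)
    return low, col_diff, row_diff, diag


# Toy "infrared" patch: flat background of 1.0 with one bright pixel.
patch = [[1.0] * 4 for _ in range(4)]
patch[1][1] = 9.0
low, col_diff, row_diff, diag = haar_dwt2(patch)
```

On this toy patch the three detail sub-bands are zero everywhere except in the block containing the bright pixel, while the low-frequency sub-band keeps a smoothed, half-resolution copy of the scene. This matches the intuition stated in the abstract that high-frequency cues highlight small targets and low-frequency cues carry global information.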