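Computer science
Feature extraction
Artificial intelligence
Pattern recognition (psychology)
Feature (linguistics)
Feature selection
Encoder
Convolutional neural network
Computer vision
Field (mathematics)
Feature learning
Face (sociological concept)
Edge detection
Feature detection (computer vision)
Enhanced Data Rates for GSM Evolution
Object detection
Segmentation
Convolutional code
Feature model
Feature vector
Channel (broadcasting)
Image segmentation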
Authors
Ruimin Huang, Jun Huang, Yong Ma, Fan Fan, Yiming Zhu
Identifier
DOI:10.1109/tgrs.2025.3638454
Abstract
Infrared small target detection (IRSTD) has significantly benefited from UNet-based neural models in recent years. However, current methodologies face challenges in achieving an optimal trade-off between missed detections and false alarms. To overcome this limitation, we rethink the role of each structural component within UNet-based architectures applied to IRSTD. Accordingly, we conceptualize the UNet’s encoder as specializing in feature extraction, the skip connections in feature selection, and the decoder in fusion-based reconstruction. Building upon these conceptualizations, we propose the DSTransNet. Within the feature extraction stage, the edge shape receptive field (ESR) module enhances edge and shape feature extraction and expands the receptive field via multiple convolutional branches, thereby reducing missed detections. At the feature selection stage, the reliable dynamic selection filtering (RDSF) module employs dynamic feature selection, leveraging encoder-based self-attention and decoder-based cross-attention of the Transformer to suppress background features resembling small targets and mitigate false alarms. During the feature fusion-based reconstruction stage, the cross-attention of spaces and channels (CSCE) module emphasizes small target features via spatial and channel cross-attention, reconstructing more accurate multi-scale detection masks. Extensive experiments on the SIRST, NUDT-SIRST, and SIRST-Aug datasets demonstrate that the proposed DSTransNet method outperforms state-of-the-art IRSTD approaches. The code is available at https://github.com/RuiminHuang/DSTransNet.