Remote sensing
Motion detection
Infrared
Computer vision
Computer science
Artificial intelligence
Motion (physics)
Optics
Geology
Physics
Authors
Yuanxin Huang, Xiyang Zhi, Jianming Hu, Lijian Yu, Qichao Han, Wenbin Chen, Wei Zhang
Identifier
DOI:10.1109/tgrs.2024.3502663
Abstract
In temporal infrared small target detection, it is crucial to leverage the disparities in spatiotemporal characteristics between the target and the background to distinguish the former. However, remote imaging and the relative motion between the detection platform and the background cause significant coupling of spatiotemporal characteristics, making target detection highly challenging. To address these challenges, we propose a network named LMAFormer. First, we introduce a local motion-aware spatiotemporal attention mechanism that aligns and enhances multiframe features to extract local spatiotemporal salient features of targets while avoiding interference from moving backgrounds. Second, we employ a multiscale fusion transformer encoder that computes self-attention weights across and within scales during encoding, to establish multiscale correlations among different regions of temporal images, enabling motion background modeling. Last, we propose a multiframe joint query decoder. The shallowest feature map after multiscale feature propagation is mapped to initial query weights, which are refined through grouped convolutions to generate grouped query vectors. These are jointly optimized to encapsulate rich multiframe details, strengthening motion background modeling and target feature representation, improving prediction accuracy. Experimental results on the NUDT-MIRSDT, IRDST, and the established TSIRMT datasets demonstrate that our network outperforms state-of-the-art (SOTA) methods. Our code and dataset will be available at https://github.com/lifier/LMAFormer.
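The abstract describes self-attention computed "across and within scales" but does not give the formulation. One way to picture the idea is to concatenate the tokens from every scale into a single sequence and apply ordinary scaled dot-product attention, so each token attends both to tokens of its own scale and to tokens of the other scales. The sketch below is only an illustration under that assumption; it is not the LMAFormer implementation. The function name `joint_self_attention` is hypothetical, and the learned query/key/value projections of a real transformer are replaced by the identity to keep the example short.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of floats.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def joint_self_attention(scales):
    """Attention over tokens pooled from all scales (illustrative assumption).

    scales: list of scales; each scale is a list of tokens; each token is a
    list of floats with the same dimension d.
    Returns (fused_tokens, attention_weights).
    """
    # Pool tokens from every scale into one sequence.
    tokens = [tok for scale in scales for tok in scale]
    d = len(tokens[0])
    fused, weights = [], []
    for q in tokens:
        # Scaled dot-product scores against every token, whatever its scale.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of token values (identity projections, an assumption).
        fused.append([sum(w[j] * tokens[j][c] for j in range(len(tokens)))
                      for c in range(d)])
    return fused, weights

# Toy input: one coarse-scale token and four fine-scale tokens, d = 2.
coarse = [[1.0, 0.0]]
fine = [[0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
fused, weights = joint_self_attention([coarse, fine])
```

Each row of `weights` spans all five tokens, so the single coarse token receives attention from, and contributes to, every fine token. A practical encoder would add learned projections, multiple heads, and positional or scale embeddings.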