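Perceptron
Computer science
Artificial intelligence
Multilayer perceptron
Pattern recognition (psychology)
Fusion
Layer (electronics)
Net (polyhedron)
Infrared
Artificial neural network
Remote sensing
Geology
Mathematics
Optics
Physics
Materials science
Philosophy
Linguistics
Composite material
Geometry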
Authors
Zhishe Wang, Chunfa Wang, Xiaosong Li, Chaoqun Xia, Jiawei Xu
Identifier
DOI:10.1109/tgrs.2024.3515648
Abstract
Infrared small target detection (IRSTD) faces various challenges such as long distances, weak features, and small scales. While methodologies based on convolutional neural networks (CNNs) have made strides, they are inherently hampered by a bias toward local reduction, limiting their global interpretive power. Conversely, Transformer-based approaches, though capable of capturing long-range dependencies, struggle with computational inefficiencies due to their quadratic complexity. To surmount these challenges, this article presents MLP-Net, a novel multilayer perceptron (MLP) fusion network for IRSTD. The architecture combines the advantages of CNNs and MLPs to capture global semantic information from local features and significantly enhance feature representation. Additionally, we develop a parallel token interaction mixer (PTIM) that processes the token representations with direction-specific interactive information across the height, width, and channel dimensions on MLPs, dynamically reinforcing the ability of long-range dependency modeling. Complementing this, we devise a contextual selection fusion module (CSFM) to gradually aggregate high-level semantics and low-level details from coarse to fine. This module integrates the complementary characteristics of different layers to promote detection accuracy. Finally, comprehensive experiments on the NUAA-SIRST, NUDT-SIRST, and IRSTD-1K benchmarks demonstrate that the proposed MLP-Net delivers promising detection performance, transcending other state-of-the-art alternatives. The relevant codes will be available at https://github.com/Zhishe-Wang/MLP-Net.
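The abstract does not give the equations behind PTIM, so the following is only a minimal sketch of the general idea it describes: mixing token features with separate linear maps along the height, width, and channel axes, then combining the three branches. The function name `axial_token_mixer`, the equal-weight averaging, the residual connection, and the use of plain NumPy are all assumptions made for illustration; this is not the authors' implementation, which may weight or gate the branches differently.

```python
import numpy as np


def axial_token_mixer(x, w_h, w_w, w_c):
    """Hypothetical three-branch MLP mixer (illustrative, not the paper's PTIM).

    x   : feature map of shape (H, W, C)
    w_h : (H, H) weights mixing tokens along the height axis
    w_w : (W, W) weights mixing tokens along the width axis
    w_c : (C, C) weights mixing features along the channel axis
    """
    # Height branch: every output row is a linear combination of all rows.
    h_branch = np.einsum("ih,hwc->iwc", w_h, x)
    # Width branch: every output column is a linear combination of all columns.
    w_branch = np.einsum("jw,hwc->hjc", w_w, x)
    # Channel branch: a per-position linear map over channels.
    c_branch = np.einsum("kc,hwc->hwk", w_c, x)
    # Assumption: average the branches and add a residual connection.
    return x + (h_branch + w_branch + c_branch) / 3.0


# Example usage with random weights on an 8 x 8 map with 16 channels.
rng = np.random.default_rng(0)
H, W, C = 8, 8, 16
features = rng.standard_normal((H, W, C))
mixed = axial_token_mixer(
    features,
    rng.standard_normal((H, H)) / H,
    rng.standard_normal((W, W)) / W,
    rng.standard_normal((C, C)) / C,
)
```

Each branch costs on the order of H·W·C·(H + W + C) multiply-adds, which grows more slowly with image size than the (H·W)² pairwise interactions of full self-attention; this is the kind of efficiency argument the abstract makes against Transformer-based approaches.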