Hyperspectral imaging
Computer science
Artificial intelligence
Bridging (networking)
Pattern recognition (psychology)
Transformer
Fusion
Computer vision
Engineering
Computer network
Linguistics
Philosophy
Voltage
Electrical engineering
Authors
Fulin Xu, Shaohui Mei, Ge Zhang, Nan Wang, Qian Du
Identifier
DOI:10.1109/tgrs.2024.3419266
Abstract
Feature representation is crucial for hyperspectral image (HSI) classification. However, existing convolutional neural network (CNN)-based methods are limited by the convolution kernel and focus only on local features, which causes them to ignore the global properties of HSIs. Transformer-based networks can make up for this limitation of CNNs because they emphasize the global features of HSIs. Combining the advantages of these two networks in feature extraction is therefore of great importance for improving classification accuracy. To this end, a cross-attention fusion network bridging CNN and Transformer (CAF-Former) is proposed, which can fully exploit the advantages of CNNs in local feature extraction and of Transformers in long-range dependency learning for hyperspectral classification. To fully explore the local and global information within an HSI, a Dynamic-CNN branch is proposed to effectively encode local features of pixels, while a Gaussian Transformer branch is constructed to accurately model global features and long-range dependencies. Moreover, to enable full interaction between local and global features, a cross-attention fusion (CAF) module is proposed as a bridge to fuse the features extracted by the two branches. Experiments on several benchmark datasets demonstrate that the proposed CAF-Former significantly outperforms both CNN-based and Transformer-based state-of-the-art networks for HSI classification.
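The abstract's central idea is cross-attention between two feature streams: tokens from one branch act as queries against keys/values from the other branch, so each stream is refined by the other. The paper does not give implementation details here, so the following is only a minimal NumPy sketch of generic bidirectional cross-attention between a "local" (CNN-style) and a "global" (Transformer-style) feature set; all function names, the single-head formulation, and the residual fusion are illustrative assumptions, not the authors' CAF module.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Single-head scaled dot-product attention where `queries` come
    from one branch and keys/values from the other (no learned
    projections, for simplicity of the sketch)."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)   # (n_q, n_kv)
    return softmax(scores, axis=-1) @ keys_values   # (n_q, d)

def caf_sketch(local_feats, global_feats):
    """Bidirectional cross-attention fusion (illustrative):
    local tokens attend to global tokens and vice versa,
    with a residual connection on each stream."""
    local_fused = local_feats + cross_attention(local_feats, global_feats)
    global_fused = global_feats + cross_attention(global_feats, local_feats)
    return local_fused, global_fused

# toy usage: 9 local tokens and 16 global tokens, 32-dim features
rng = np.random.default_rng(0)
local = rng.standard_normal((9, 32))
glob = rng.standard_normal((16, 32))
lf, gf = caf_sketch(local, glob)
print(lf.shape, gf.shape)  # (9, 32) (16, 32)
```

In the actual CAF-Former, learned query/key/value projections and the two specialized branches (Dynamic-CNN and Gaussian Transformer) would replace the raw feature matrices used here; the sketch only shows how cross-attention lets each branch's tokens aggregate information from the other.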