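Computer science
Discriminant
Artificial intelligence
Synthetic aperture radar
Complementarity (molecular biology)
Deep learning
Machine learning
Pattern recognition (psychology)
Data mining
Genetics
Biology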
Authors
Wen-Shuai Hu,Wei Li,Heng-Chao Li,Fenghua Huang,Ran Tao
Identifier
DOI:10.1109/tnnls.2024.3406735
Abstract
Multisource remote sensing data classification is a challenging research topic, in which it is crucial to address the inherent heterogeneity between multimodal data while exploiting their complementarity. Existing deep learning models usually adopt feature-level fusion designs directly, but most of them fail to overcome the impact of heterogeneity, which limits their performance. To address this, a multimodal joint classification framework, called global clue-guided cross-memory quaternion transformer network (GCCQTNet), is proposed for multisource data [i.e., hyperspectral image (HSI) and synthetic aperture radar (SAR)/light detection and ranging (LiDAR)] classification. First, a three-branch structure is built to extract the local and global features, where an independent squeeze-expansion-like fusion (ISEF) structure is designed to update the local and global representations by treating the global information as an agent, suppressing the negative impact of multimodal heterogeneity layer by layer. Second, a cross-memory quaternion transformer (CMQT) structure is constructed to model the complex inner relationships between the intramodality and intermodality features, capturing more discriminative fusion features that fully characterize multimodal complementarity. Finally, a cross-modality comparative learning (CMCL) structure is developed to impose a consistency constraint on global information learning, which, in conjunction with a classification head, guides the end-to-end training of GCCQTNet. Extensive experiments on three public multisource remote sensing datasets demonstrate the superiority of GCCQTNet over other state-of-the-art methods.
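The last step of the abstract, a consistency constraint on the global features combined with a classification head, can be illustrated with a small sketch. The abstract does not give the actual CMCL formulation, so everything below is an assumption made for illustration: the cosine-distance form of the consistency term, the function names, and the `weight` parameter are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def consistency_loss(global_a, global_b):
    """Assumed consistency term: 0 when the two modalities' global
    features point in the same direction, up to 2 when they are opposite."""
    return 1.0 - cosine_similarity(global_a, global_b)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(logits, label, global_a, global_b, weight=0.5):
    """Classification loss plus a weighted consistency term.
    `weight` is a hypothetical balancing factor, not a value from the paper."""
    return cross_entropy(logits, label) + weight * consistency_loss(global_a, global_b)


if __name__ == "__main__":
    hsi_global = [0.2, 0.9, 0.1]    # made-up global feature, modality 1
    lidar_global = [0.3, 0.8, 0.0]  # made-up global feature, modality 2
    print(joint_loss([2.0, 0.5, -1.0], 0, hsi_global, lidar_global))
```

In this sketch, minimizing the joint loss pushes the classifier toward the correct label while pulling the two modalities' global features toward agreement, which is one plausible reading of how a consistency constraint and a classification head could guide end-to-end training together.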