Computer science
Artificial intelligence
Lidar
Hyperspectral imaging
Feature extraction
Pattern recognition (psychology)
A priori and a posteriori
Feature (linguistics)
Sensor fusion
Natural language processing
Remote sensing
Philosophy
Linguistics
Epistemology
Geology
Authors
Mengxin Cao, Guixin Zhao, Guohua Lv, Aimei Dong, Ying Guo, Xiangjun Dong
Identifier
DOI:10.1109/tgrs.2023.3346935
Abstract
The fusion classification of hyperspectral image (HSI) and light detection and ranging (LiDAR) data has gained widespread attention because of its ability to provide more comprehensive spatial and spectral information. However, the heterogeneous gap between HSI and LiDAR data adversely affects classification performance. Moreover, despite the excellent performance of traditional multimodal fusion classification models, language information, which contains rich linguistic prior knowledge that can enrich visual representations, remains largely unexploited. Therefore, we design a Spectral-Spatial-Language fusion network (S2LFNet) that fuses visual and language features and broadens the semantic space using the linguistic prior knowledge shared between spectral and spatial features. First, we propose a dual-channel cascaded image fusion encoder (DCIFencoder) for visual feature extraction and progressive fusion of HSI and LiDAR features at different levels. Then, text data are designed from three aspects, and the Text encoder is used to extract linguistic prior knowledge from them. Finally, contrastive learning is utilized to construct a unified semantic space, and the resulting Spectral-Spatial-Language fusion features are used for the classification task. We evaluate the classification performance of the proposed S2LFNet on three datasets through extensive experiments, and the results show that it outperforms state-of-the-art fusion classification methods.
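The final step described in the abstract aligns the fused spectral-spatial visual features with text features in a unified semantic space via contrastive learning. Below is a minimal, hypothetical PyTorch sketch of one common way such an alignment can be set up; the module name ContrastiveAlignment, the projection heads, the feature dimensions, and the symmetric InfoNCE-style loss are all illustrative assumptions, not the authors' actual S2LFNet implementation.

```python
# Hypothetical sketch of a contrastive visual-language alignment step.
# Names, dimensions, and the loss formulation are assumptions for illustration,
# not the S2LFNet code from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContrastiveAlignment(nn.Module):
    def __init__(self, visual_dim=256, text_dim=512, embed_dim=128, temperature=0.07):
        super().__init__()
        # Linear projection heads map each modality into a shared semantic space.
        self.visual_proj = nn.Linear(visual_dim, embed_dim)
        self.text_proj = nn.Linear(text_dim, embed_dim)
        self.temperature = temperature

    def forward(self, visual_feat, text_feat):
        # Normalize projected features so the dot product is a cosine similarity.
        v = F.normalize(self.visual_proj(visual_feat), dim=-1)
        t = F.normalize(self.text_proj(text_feat), dim=-1)
        # Similarity logits between every visual sample and every text sample.
        logits = v @ t.t() / self.temperature
        # Matching (visual_i, text_i) pairs are the positives on the diagonal.
        targets = torch.arange(v.size(0), device=v.device)
        loss_v2t = F.cross_entropy(logits, targets)
        loss_t2v = F.cross_entropy(logits.t(), targets)
        return (loss_v2t + loss_t2v) / 2


if __name__ == "__main__":
    # Toy batch: 8 fused spectral-spatial vectors and 8 matching text embeddings.
    visual = torch.randn(8, 256)   # e.g. output of a fusion encoder over HSI + LiDAR
    text = torch.randn(8, 512)     # e.g. output of a text encoder over class descriptions
    loss = ContrastiveAlignment()(visual, text)
    print(loss.item())
```

In this formulation, matching visual-text pairs within a batch act as positives and all other pairings as negatives, which is one standard way to pull the two modalities into a shared semantic space before classification.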