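Hyperspectral imaging
Computer science
Artificial intelligence
Multispectral image
Context (archaeology)
Pattern recognition (psychology)
Unsupervised learning
Convolution (computer science)
Encoder
Computer vision
Feature (linguistics)
Image fusion
Fuse (electrical)
Convolutional neural network
Autoencoder
Feature extraction
Feature learning
Spatial contextual awareness
Spatial analysis
Sensor fusion
Consistency (knowledge bases)
Contextual image classification
Remote sensing
Image (mathematics)
Domain (mathematics)
Redundancy (engineering)
Image resolution
Convolutional code
Multimodal
Image processing
Image segmentation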
Authors
Yuanchao Su, Sheng Li, Yicong Zhou, Lianru Gao, Mengying Jiang, Xu Sun, Haiwei Li, Enke Hou
Identifier
DOI:10.1109/tgrs.2025.3636047
Abstract
Multimodal fusion methods widely use convolutional layers to capture local correlations and adjust feature dimensions. However, the progressive expansion of the receptive field across convolutional layers often compromises spatial context retention, leading to the loss of fine details. Furthermore, the fixed-size kernels typically used in standard convolution restrict the network’s ability to capture multiscale contextual details. To address these limitations, this paper develops a dilated transformation-guided unsupervised multimodal learning (DTUML) method that fuses a high-resolution multispectral image (HR-MSI) with a low-resolution hyperspectral image (LR-HSI) to generate a high-resolution hyperspectral image (HR-HSI). DTUML adopts a dual-stream encoder architecture to process the multimodal data: one stream focuses on preserving spectral information from the LR-HSI, while the other emphasizes the acquisition of spatial details from the HR-MSI. These complementary features are subsequently integrated to ensure spectral fidelity and retain spatial detail. A convolutional layer then restores dimensional consistency and outputs the HR-HSI. Extensive experiments demonstrate the effectiveness of DTUML, which compares favorably with state-of-the-art methods. Code: https://github.com/yuanchaosu/TGRS-DTUML.
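The abstract contrasts fixed-size kernels with a dilated transformation but does not give dilation rates or layer details. As a rough illustration of the general idea only, the sketch below implements a plain dilated 2-D convolution in pure Python and shows that a 3x3 kernel with dilation 2 covers a 5x5 window using the same nine weights. The function names, the 7x7 test image, and the averaging kernel are all hypothetical and are not taken from the DTUML code.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span (in pixels) covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv2d(image, kernel, dilation=1):
    """'Valid' 2-D cross-correlation with a square dilated kernel.

    No padding, stride 1. `image` and `kernel` are lists of lists of floats.
    """
    k = len(kernel)
    span = effective_kernel_size(k, dilation)
    rows = len(image) - span + 1
    cols = len(image[0]) - span + 1
    if rows <= 0 or cols <= 0:
        raise ValueError("image is smaller than the dilated kernel span")
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `dilation` pixels apart.
                    acc += kernel[i][j] * image[r + i * dilation][c + j * dilation]
            out[r][c] = acc
    return out


# Hypothetical 7x7 single-band image and 3x3 averaging kernel (illustrative only).
image = [[float(r * 7 + c) for c in range(7)] for r in range(7)]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]

dense = dilated_conv2d(image, kernel, dilation=1)    # taps span a 3x3 window
dilated = dilated_conv2d(image, kernel, dilation=2)  # taps span a 5x5 window
```

With no padding, the dense output is 5x5 and the dilated output is 3x3, which reflects the wider window each dilated output pixel sees. Stacking or running several dilation rates in parallel is one common way to gather multiscale context without enlarging the kernel; whether DTUML does this is not stated in the abstract.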