Land cover
Remote sensing
Modality
Cover (algebra)
Geography
Synthetic aperture radar
Pattern recognition
Artificial intelligence
Cartography
Computer science
Contextual image classification
Land use
Image (mathematics)
Engineering
Materials science
Mechanical engineering
Civil engineering
Polymer chemistry
Authors
Haodi Zhang,Anzhu Yu,Kuiliang Gao,Xuanbei Lu,Xuefeng Cao,Wenyue Guo,Weiqi Lian
Identifier
DOI:10.1080/17538947.2024.2447347
Abstract
Land cover classification (LCC) is essential for monitoring land use and changes. This study examines the integration of optical (OPT) and synthetic aperture radar (SAR) images for precise LCC. The disparity between OPT and SAR images introduces challenges in fusing high-level semantic information and utilizing multi-scale features. To address these challenges, this paper proposes a novel multi-modal capsules model (M²Caps) incorporating multi-modal capsules learning and cascaded features fusion modules. The multi-modal capsules learning module represents high-level semantic information and abstract relationships across diverse remote sensing image (RSI) modalities as vectors, thereby facilitating the induction of joint multi-modal features with high discriminability and robustness. Subsequently, the cascaded features fusion module integrates features at multiple scales, concurrently processing deep multi-modal features, shallow OPT features, and shallow SAR features at each layer. This approach ensures the precise characterization of both local details and global semantics. M²Caps outperformed state-of-the-art models, improving mean intersection over union (mIoU) by 2.86%–12.9% on the WHU-OPT-SAR dataset and 3.91%–12.3% on the GF-2 and GF-3 Pohang datasets, demonstrating its effectiveness in high-precision LCC in complex environments.
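The reported gains are measured in mean intersection over union (mIoU), the standard semantic-segmentation metric: per-class intersection of predicted and ground-truth pixels divided by their union, averaged over classes. As background, here is a minimal NumPy sketch of that computation; this is a generic reference implementation, not the authors' evaluation code, and the skip-absent-classes convention is an assumption.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes.

    pred, target: integer label maps of identical shape.
    Classes absent from both prediction and ground truth are
    skipped (an assumed convention; evaluation protocols vary).
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class appears nowhere; excluded from the mean
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))
```

For example, with `pred = [[0, 1], [1, 2]]` and `target = [[0, 1], [2, 2]]`, class 0 scores 1.0 and classes 1 and 2 each score 0.5, giving an mIoU of 2/3.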