A Multilevel Multimodal Fusion Transformer for Remote Sensing Semantic Segmentation

Computer science · Segmentation · Fusion · Transformer · Remote sensing · Image segmentation · Artificial intelligence · Computer vision · Pattern recognition · Geology · Engineering · Electrical engineering · Philosophy · Linguistics · Voltage
Authors
Xianping Ma,Xiaokang Zhang,Man-On Pun,Ming Liu
Source
Journal: IEEE Transactions on Geoscience and Remote Sensing [Institute of Electrical and Electronics Engineers]
Volume 62, pp. 1-15. Citations: 159
Identifier
DOI: 10.1109/tgrs.2024.3373033
Abstract

Accurate semantic segmentation of remote sensing data plays a crucial role in the success of geoscience research and applications. Recently, multimodal fusion-based segmentation models have attracted much attention due to their outstanding performance compared to conventional single-modal techniques. However, most of these models perform their fusion operation using either convolutional neural networks (CNNs) or the Vision Transformer (ViT), resulting in insufficient local-global contextual modeling and representative capabilities. In this work, a multilevel multimodal fusion scheme called FTransUNet is proposed to provide a robust and effective multimodal fusion backbone for semantic segmentation by integrating both CNN and ViT into one unified fusion framework. First, shallow-level features are extracted and fused through convolutional layers and shallow-level feature fusion (SFF) modules. After that, deep-level features characterizing semantic information and spatial relationships are extracted and fused by a well-designed Fusion ViT (FViT). It applies Adaptively Mutually Boosted Attention (Ada-MBA) layers and Self-Attention (SA) layers alternately in a three-stage scheme to learn cross-modality representations with high inter-class separability and low intra-class variation. Specifically, the proposed Ada-MBA computes SA and Cross-Attention (CA) in parallel to enhance intra- and cross-modality contextual information simultaneously while steering the attention distribution toward semantic-aware regions. As a result, FTransUNet fuses shallow-level and deep-level features in a multilevel manner, taking full advantage of the CNN and the Transformer to accurately characterize local details and global semantics, respectively. Extensive experiments confirm the superior performance of the proposed FTransUNet compared with other multimodal fusion approaches on two fine-resolution remote sensing datasets, namely ISPRS Vaihingen and Potsdam. The source code for this work is available at https://github.com/sstary/SSRS.
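To make the Ada-MBA idea described above concrete, the sketch below shows one plausible way a layer could compute self-attention and cross-attention in parallel over two modality streams and then combine the branches. This is a minimal illustrative sketch, not the authors' implementation (the actual code is in the linked repository); the module name `AdaMBASketch`, the per-modality projections, and the learnable gating are assumptions made for illustration.

```python
# Illustrative sketch of parallel self-attention (SA) and cross-attention (CA)
# over two modality streams, loosely following the Ada-MBA description in the
# abstract. NOT the authors' code; see https://github.com/sstary/SSRS.
import torch
import torch.nn as nn


class AdaMBASketch(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        # Separate Q/K/V projections per modality (assumed design choice).
        self.qkv_a = nn.Linear(dim, dim * 3)
        self.qkv_b = nn.Linear(dim, dim * 3)
        self.proj_a = nn.Linear(dim, dim)
        self.proj_b = nn.Linear(dim, dim)
        # Learnable scalar gates balancing the SA and CA branches (assumption,
        # standing in for the paper's adaptive boosting rule).
        self.gate_a = nn.Parameter(torch.tensor(0.5))
        self.gate_b = nn.Parameter(torch.tensor(0.5))

    def _split(self, t: torch.Tensor) -> torch.Tensor:
        b, n, _ = t.shape
        return t.reshape(b, n, self.num_heads, self.head_dim).transpose(1, 2)

    def _attend(self, q, k, v) -> torch.Tensor:
        attn = (q @ k.transpose(-2, -1)) * self.scale
        out = attn.softmax(dim=-1) @ v
        b, h, n, d = out.shape
        return out.transpose(1, 2).reshape(b, n, h * d)

    def forward(self, xa: torch.Tensor, xb: torch.Tensor):
        # xa, xb: (batch, tokens, dim) features from two modalities,
        # e.g. an optical stream and an auxiliary (DSM-like) stream.
        qa, ka, va = map(self._split, self.qkv_a(xa).chunk(3, dim=-1))
        qb, kb, vb = map(self._split, self.qkv_b(xb).chunk(3, dim=-1))

        # Self-attention within each modality (intra-modality context).
        sa_a = self._attend(qa, ka, va)
        sa_b = self._attend(qb, kb, vb)
        # Cross-attention across modalities, computed in parallel with SA.
        ca_a = self._attend(qa, kb, vb)   # modality-a queries attend to b
        ca_b = self._attend(qb, ka, va)   # modality-b queries attend to a

        # Gated fusion of the two branches (illustrative weighting only).
        out_a = self.proj_a(self.gate_a * sa_a + (1 - self.gate_a) * ca_a)
        out_b = self.proj_b(self.gate_b * sa_b + (1 - self.gate_b) * ca_b)
        return out_a, out_b


if __name__ == "__main__":
    layer = AdaMBASketch(dim=64, num_heads=4)
    xa = torch.randn(2, 256, 64)   # e.g. a 16x16 token grid from one branch
    xb = torch.randn(2, 256, 64)   # matching tokens from the other modality
    ya, yb = layer(xa, xb)
    print(ya.shape, yb.shape)      # torch.Size([2, 256, 64]) for both outputs
```

In the full FTransUNet design, layers of this kind would alternate with plain self-attention layers across the three-stage FViT, after the shallow-level CNN features have been fused by the SFF modules.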