Text-Assisted Vision Model for Medical Image Segmentation

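Computer vision; Computer science; Image segmentation; Artificial intelligence; Segmentation; Medical imaging; Image (mathematics); Computer graphics (images)
Authors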
Md. Motiur Rahman,Saeka Rahman,Smriti Bhatt,Miad Faezipour
Source
Journal: IEEE Journal of Biomedical and Health Informatics [Institute of Electrical and Electronics Engineers]
Volume/issue: not given; pages 1-14. Cited by: 1
Identifier
DOI:10.1109/jbhi.2025.3569491
Abstract

Precise medical image segmentation is important for automating diagnosis and treatment planning in healthcare. While images carry most of the information needed to segment organs with deep learning models, text reports provide complementary details that can be leveraged to improve segmentation precision. The improvement depends on making proper use of the text reports together with their corresponding images. Most attention modules compute spatial, channel, or pixel-level attention within a single modality; they are ineffective at cross-modal alignment, which causes problems in multi-modal scenarios. This study addresses these gaps by presenting a text-assisted vision (TAV) model for medical image segmentation with a novel attention computation module named the triguided attention module (TGAM). TGAM computes visual-visual, language-language, and language-visual attention, enabling the model to capture the important features of, and the correlation between, images and medical notes. The module helps the model identify relevant features within images, within text annotations, and in the interactions between text annotations and images. We incorporate an attention gate (AG) that modulates the influence of TGAM, ensuring that it does not flood the encoded features with irrelevant or redundant information while maintaining their uniqueness. We evaluated TAV on two popular datasets containing images and corresponding text annotations. We find TAV to be a new state-of-the-art model, improving performance by 2-7% compared to other models. Extensive experiments were performed to demonstrate the effectiveness of each component of the proposed model. The code and datasets are available on GitHub.
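The abstract names three attention paths (visual-visual, language-language, language-visual) and a gate that limits how much of the attended signal reaches the encoded features, but it gives no equations. The sketch below is one possible reading, not the authors' implementation. It assumes plain scaled dot-product attention for all three paths, a simple sum to fuse them, and a per-feature sigmoid gate. The function `triguided_attention_sketch` and the parameters `gate_weight` and `gate_bias` are hypothetical names introduced here for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention(query, key, value):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = query.shape[-1]
    weights = softmax(query @ key.T / np.sqrt(d))
    return weights @ value


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def triguided_attention_sketch(visual, text, gate_weight, gate_bias):
    """Hypothetical stand-in for TGAM followed by an attention gate.

    visual: (n_patches, d) encoded image features
    text:   (n_tokens, d) encoded text features
    gate_weight, gate_bias: assumed gate parameters, shapes (2d, d) and (d,)
    """
    # Visual-visual: image patches attend to each other.
    vis_vis = attention(visual, visual, visual)
    # Language-language: text tokens attend to each other.
    lang_lang = attention(text, text, text)
    # Language-visual: image patches query the self-attended text tokens.
    lang_vis = attention(vis_vis, lang_lang, lang_lang)

    # Assumed fusion: sum the two visual-shaped streams.
    guided = vis_vis + lang_vis

    # Attention gate (assumption): a per-feature sigmoid in (0, 1) that
    # scales how much of the guided signal is added to the encoder output.
    gate_input = np.concatenate([visual, guided], axis=-1)
    gate = sigmoid(gate_input @ gate_weight + gate_bias)
    return visual + gate * guided, gate
```

With `visual` of shape `(16, 8)` and `text` of shape `(5, 8)`, the function returns a fused feature map and a gate, both of shape `(16, 8)`. Driving the gate toward zero (for example with a large negative `gate_bias`) leaves the encoded visual features essentially unchanged, which is the behavior the abstract attributes to the AG: the attended signal is admitted only to the degree the gate allows.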