Image segmentation
Computer science
Artificial intelligence
Computer vision
Medical imaging
Image fusion
Image (mathematics)
Segmentation
Pattern recognition (psychology)
Authors
Xichuan Zhou, Lingfeng Yan, Rui Ding, Chukwuemeka Clinton Atabansi, Jing Nie, Lihui Chen, Yu-Jie Feng, Haijun Liu
Identifier
DOI:10.1109/jbhi.2025.3561425
Abstract
In recent times, leveraging lesion text as supplementary data to enhance the performance of medical image segmentation models has garnered attention. Previous approaches used only attention mechanisms to integrate image and text features and did not effectively exploit the highly condensed textual semantic information to improve the fused features, resulting in inaccurate lesion segmentation. This paper introduces a novel approach, the Medical Image-Text Segment Anything Model (MIT-SAM), for text-assisted medical image segmentation. Specifically, we introduce a SAM-enhanced image encoder and a BERT-based text encoder to extract heterogeneous features. To better leverage the highly condensed textual semantic information, including crucial details such as lesion position and quantity, for heterogeneous feature fusion, we propose the image-text interactive fusion (ITIF) block and the self-supervised text reconstruction (SSTR) method. The ITIF block facilitates the mutual enhancement of homogeneous information among heterogeneous features, and the SSTR method empowers the model to capture crucial details in the lesion text, including location, quantity, and other key aspects. Experimental results demonstrate that our proposed model achieves state-of-the-art performance on the QaTa-COV19 and MosMedData+ datasets. The code of MIT-SAM is available at https://github.com/jojodan514/MIT-SAM.
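As a rough illustration of the kind of image-text interactive fusion the abstract describes, the PyTorch sketch below lets image patch features and text token features mutually enhance each other through bidirectional cross-attention. This is a minimal sketch under assumptions of my own (the class name ImageTextFusionSketch, the embedding size, and the cross-attention layout are not taken from the paper); the authors' actual ITIF block is available in the linked repository.

import torch
import torch.nn as nn

class ImageTextFusionSketch(nn.Module):
    """Hypothetical image-text interactive fusion block.

    Not the official MIT-SAM ITIF implementation; it only illustrates
    mutual enhancement of heterogeneous (image and text) features via
    bidirectional cross-attention with residual connections.
    """

    def __init__(self, embed_dim=256, num_heads=8):
        super().__init__()
        # image tokens attend to text tokens, and text tokens attend to image tokens
        self.text_to_img = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.img_to_text = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm_img = nn.LayerNorm(embed_dim)
        self.norm_text = nn.LayerNorm(embed_dim)

    def forward(self, img_tokens, text_tokens):
        # img_tokens:  (B, N_img, C) flattened image patch features
        # text_tokens: (B, N_txt, C) projected text token features
        # enhance image features with condensed textual semantics
        img_upd, _ = self.text_to_img(img_tokens, text_tokens, text_tokens)
        # enhance text features with visual context
        txt_upd, _ = self.img_to_text(text_tokens, img_tokens, img_tokens)
        img_tokens = self.norm_img(img_tokens + img_upd)
        text_tokens = self.norm_text(text_tokens + txt_upd)
        return img_tokens, text_tokens

if __name__ == "__main__":
    fusion = ImageTextFusionSketch()
    img = torch.randn(2, 1024, 256)   # e.g. a 32x32 grid of patch features
    txt = torch.randn(2, 16, 256)     # e.g. 16 text tokens projected to 256 dims
    img_out, txt_out = fusion(img, txt)
    print(img_out.shape, txt_out.shape)  # torch.Size([2, 1024, 256]) torch.Size([2, 16, 256])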