Keywords
Computer Science, Artificial Intelligence, Computer Vision, Image Segmentation, Segmentation, Image (Mathematics), Pattern Recognition (Psychology), Computer Graphics (Images)
Authors
Pengchen Liang, Lei Shi, Bin Pu, Renkai Wu, Jianguo Chen, Lixin Zhou, Liming Xu, Zhuangzhuang Chen, Qing Chang, Yiwei Li
Identifier
DOI: 10.1109/jbhi.2025.3544548
Abstract
The Segment Anything Model (SAM) has shown exceptional versatility in segmentation tasks across a wide range of natural image scenarios. However, applying it to medical image segmentation poses significant challenges due to the intricate anatomical details and domain-specific characteristics of medical images. To address these challenges, we propose a novel VMamba adapter framework that integrates a lightweight, trainable Visual Mamba (VMamba) branch with the pre-trained SAM ViT encoder. The VMamba adapter accurately captures multi-scale contextual correlations, integrates global and local information, and reduces the ambiguities that arise from relying on local features alone. Specifically, we propose a novel cross-branch attention (CBA) mechanism to enable effective interaction between the SAM and VMamba branches. This mechanism allows the model to learn and adapt more efficiently to the nuances of medical images, extracting rich, complementary features that enhance its representational capacity. Beyond these architectural enhancements, we streamline the segmentation workflow by eliminating prompt-driven input mechanisms, yielding an autonomous prediction model that reduces manual input and improves operational efficiency. In addition, our method introduces only a minimal number of additional trainable parameters, offering an efficient solution for medical image segmentation. Extensive evaluations on four medical image datasets demonstrate that our VMamba adapter framework achieves state-of-the-art performance. Specifically, on the ACDC dataset with limited training data, our method improves the average Dice coefficient by 0.18 and reduces the Hausdorff distance by 20.38 mm compared to AutoSAM.
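The abstract does not include code, so the following is only a rough PyTorch sketch of what a cross-branch attention (CBA) block between the two branches might look like: queries drawn from one branch's token sequence attend over keys and values from the other, with a residual connection back onto the first branch. All names and choices here (the class CrossBranchAttention, dim=256, 8 attention heads, pre-norm layout) are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class CrossBranchAttention(nn.Module):
    """Hypothetical cross-branch attention block (illustrative sketch).

    Lets features from a frozen SAM ViT branch attend to features from
    a trainable VMamba branch; dimensions and layers are assumptions.
    """

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)

    def forward(self, x_sam: torch.Tensor, x_vmamba: torch.Tensor) -> torch.Tensor:
        # x_sam, x_vmamba: (batch, tokens, dim) token sequences from each branch.
        q = self.norm_q(x_sam)
        kv = self.norm_kv(x_vmamba)
        fused, _ = self.attn(q, kv, kv)
        # Residual connection back onto the SAM-branch features.
        return x_sam + fused

# Toy usage with assumed shapes: 2 images, 196 patch tokens, 256-dim embeddings.
sam_feats = torch.randn(2, 196, 256)
vmamba_feats = torch.randn(2, 196, 256)
cba = CrossBranchAttention(dim=256)
print(cba(sam_feats, vmamba_feats).shape)  # torch.Size([2, 196, 256])

Consistent with the adapter setting described in the abstract, one would freeze the SAM encoder and train only the VMamba branch and CBA parameters, keeping the number of additional trainable parameters small.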