Keywords
Computer vision
Image segmentation
Artificial intelligence
Computer science
Foundation (evidence)
Segmentation
Image (mathematics)
Geography
Archaeology
Authors
Jiarun Liu, Hao Yang, Hong-Yu Zhou, Lequan Yu, Yong Liang, Yizhou Yu, Shaoting Zhang, Hairong Zheng, Shanshan Wang
Identifiers
DOI: 10.1109/tmi.2024.3508698
Abstract
Vision foundation models have shown great potential in improving generalizability and data efficiency, especially for medical image segmentation, since medical image datasets are relatively small due to high annotation costs and privacy concerns. However, current research on foundation models predominantly relies on transformers, whose quadratic complexity and large parameter counts make these models computationally expensive, limiting their potential for clinical applications. In this work, we introduce Swin-UMamba†, a novel Mamba-based model for medical image segmentation that seamlessly leverages the power of the vision foundation model while remaining computationally efficient thanks to the linear complexity of Mamba. Moreover, we investigated and verified the impact of the vision foundation model on medical image segmentation, designing a self-supervised model adaptation scheme to bridge the gap between natural and medical data. Notably, Swin-UMamba† outperforms 7 state-of-the-art methods, including CNN-based, transformer-based, and Mamba-based approaches, across the AbdomenMRI, Endoscopy, and Microscopy datasets. The code and models are publicly available at: https://github.com/JiarunLiu/Swin-UMamba.
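The abstract's efficiency argument rests on cost scaling: transformer self-attention is quadratic in the number of tokens, while a Mamba-style selective scan touches each token once. The back-of-the-envelope sketch below illustrates this gap; the channel width, state size, and FLOP formulas are illustrative assumptions for the comparison, not values or code from Swin-UMamba†.

```python
# Illustrative arithmetic only (not the paper's implementation): compare how
# per-layer cost scales with sequence length L for quadratic self-attention
# versus a linear-time, Mamba-style scan. Constants below are assumptions.

def attention_flops(L: int, d: int) -> int:
    # QK^T scores plus attention-weighted values: roughly 2 * L^2 * d FLOPs.
    return 2 * L * L * d

def linear_scan_flops(L: int, d: int, state: int = 16) -> int:
    # A selective-scan-style recurrence visits each token once:
    # roughly L * d * state FLOPs for an assumed hidden state size.
    return L * d * state

d = 96  # hypothetical channel width
for L in (1024, 4096, 16384):  # token counts for growing image resolutions
    ratio = attention_flops(L, d) / linear_scan_flops(L, d)
    print(f"L={L:>5}: attention/scan FLOP ratio = {ratio:.0f}x")
```

Under these assumptions the ratio grows linearly with L (here it equals L/8), which is why the quadratic term dominates for the long token sequences produced by high-resolution medical images.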