计算机科学
图像分割
人工智能
计算机视觉
分割
医学影像学
图像(数学)
作者
Diwei Su,Jianxu Luo,Cheng Fei
标识
DOI:10.1109/jbhi.2024.3374780
摘要
Accurate medical image segmentation is an essential part of the medical image analysis process that provides detailed quantitative metrics. In recent years, extensions of classical networks such as UNet have achieved state-of-the-art performance on medical image segmentation tasks. However, the high model complexity of these networks limits their applicability to devices with constrained computational resources. To alleviate this problem, we propose a shallow hierarchical Transformer for medical image segmentation, called SHFormer. By decreasing the number of transformer blocks utilized, the model complexity of SHFormer can be reduced to an acceptable level. To improve the learned attention while keeping the structure lightweight, we propose a spatial-channel connection module. This module separately learns attention in the spatial and channel dimensions of the feature while interconnecting them to produce more focused attention. To keep the decoder lightweight, the MLP-D module is proposed to progressively fuse multi-scale features in which channels are aligned using Multi-Layer Perceptron (MLP) and spatial information is fused by convolutional blocks. We first validated the performance of SHFormer on the ISIC-2018 dataset. Compared to the latest network, SHFormer exhibits comparable performance with 15 times fewer parameters, 30 times lower computational complexity and 5 times higher inference efficiency. To test the generalizability of SHFormer, we introduced the polyp dataset for additional testing. SHFormer achieves comparable segmentation accuracy to the latest network while having lower computational overhead.
科研通智能强力驱动
Strongly Powered by AbleSci AI