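Computer science
Encoder
Segmentation
Artificial intelligence
Transformer
Image segmentation
Residual
Scale-space segmentation
Computer vision
Pattern recognition (psychology)
Algorithm
Voltage
Engineering
Operating system
Electrical engineering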
Authors
Li Li,Qiyuan Liu,Xinyi Shi,Yujia Wei,Huanqi Li,Hanguang Xiao
Identifier
DOI:10.1016/j.eswa.2023.121717
Abstract
U-Net is a widely used segmentation model that achieves good results, but it suffers from a large semantic gap between the encoder and decoder. In addition, both high-frequency information (details such as boundaries) and low-frequency information (such as shape) are critical for segmentation tasks. To address these problems, this work proposes a transformer-based model, UCFilTransNet. Specifically, UCFilTransNet places a Cross-Filter Transformer (CFTrans) block between the encoder and decoder to fuse the multi-scale features extracted by the encoder and thereby improve segmentation accuracy. To account for the characteristics of different frequencies, UCFilTransNet redesigns the transformer structure from a frequency-domain standpoint to enhance local information and build long-range dependencies. To augment global information, UCFilTransNet adds a residual pyramid squeeze-excitation (RPSA) module in the bottleneck. This work demonstrates that UCFilTransNet outperforms state-of-the-art (SOTA) methods on two datasets. More importantly, it has few model parameters (24.88M) and low computational complexity (19.71G). Extensive experiments show that the proposed CFTrans and RPSA modules are effective for CT image segmentation.
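
The distinction the abstract draws between low-frequency content (overall shape) and high-frequency content (boundaries) can be illustrated with a minimal sketch. This is not the CFTrans block or any part of UCFilTransNet: it assumes a plain box blur as a stand-in low-pass filter and takes the residual as the high-frequency part. The function names `box_blur` and `split_frequencies`, the `radius` parameter, and the toy image are all hypothetical, chosen only for this illustration.

```python
def box_blur(image, radius=1):
    """Stand-in low-pass filter: mean of each pixel's neighbourhood."""
    height, width = len(image), len(image[0])
    blurred = []
    for r in range(height):
        row = []
        for c in range(width):
            # Clamp the averaging window to the image borders.
            r0, r1 = max(0, r - radius), min(height, r + radius + 1)
            c0, c1 = max(0, c - radius), min(width, c + radius + 1)
            window = [image[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def split_frequencies(image, radius=1):
    """Return (low, high) with low + high equal to the input image."""
    low = box_blur(image, radius)
    # The residual keeps what the blur removed: edges and fine detail.
    high = [[p - l for p, l in zip(img_row, low_row)]
            for img_row, low_row in zip(image, low)]
    return low, high


# Toy 6x6 image (an assumption for illustration): a bright 2x2 square
# on a dark background, loosely resembling a small segmentation target.
image = [[0.0] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (2, 3):
        image[r][c] = 1.0

low, high = split_frequencies(image)
```

In this sketch `low` holds a smoothed version of the square (its rough shape), while `high` is non-zero only near the square's border and zero in flat regions, which is the kind of boundary detail the abstract calls high-frequency information.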