Keywords
Computer science; Segmentation; Artificial intelligence; Encoder; Upsampling; Feature learning; Discriminative model; Transformer (deep learning); Feature; Pyramid; Pattern recognition; Computer vision; Image segmentation; Image
Authors
Na Ta, Haipeng Chen, Xianzhu Liu, Nanguo Jin
Identifier
DOI:10.1007/s00530-023-01165-z
Abstract
Medical image segmentation has attracted increasing attention due to its practical clinical requirements. However, the prevalence of small targets still poses great challenges for accurate segmentation. In this paper, we propose a novel locally enhanced transformer network (LET-Net) that combines the strengths of transformers and convolution to address this issue. LET-Net uses a pyramid vision transformer as its encoder and is further equipped with two novel modules to learn more powerful feature representations. Specifically, we design a feature-aligned local enhancement module, which encourages discriminative local feature learning conditioned on adjacent-level feature alignment. Moreover, to effectively recover high-resolution spatial information, we apply a newly designed progressive local-induced decoder. This decoder contains three cascaded local reconstruction and refinement modules that dynamically guide the upsampling of high-level features with adaptive reconstruction kernels and further enhance feature representations through a split-attention mechanism. Additionally, to address the severe pixel imbalance for small targets, we design a mutual information loss that maximizes task-relevant information while suppressing task-irrelevant noise. Experimental results demonstrate that LET-Net provides more effective support for small-target segmentation and achieves state-of-the-art performance in polyp and breast lesion segmentation tasks.
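The abstract does not specify how the decoder's split-attention mechanism is implemented. As a rough illustration of the general idea only (a minimal numpy sketch in the style of ResNeSt-style split attention, not the authors' actual module: the function name, group count, and pooling choice are all assumptions), one can split a feature map along the channel dimension, weight each split by a softmax over its globally pooled descriptor, and sum:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def split_attention(x, groups=2):
    """Toy split-attention (illustrative only, not the paper's module).

    x: (C, H, W) feature map; C must be divisible by `groups`.
    Returns a (C // groups, H, W) map: a softmax-weighted combination
    of the channel splits, weighted per channel across the groups.
    """
    c, h, w = x.shape
    parts = x.reshape(groups, c // groups, h, w)       # split channels into groups
    gap = parts.mean(axis=(2, 3))                      # global average pool: (groups, C//groups)
    attn = softmax(gap, axis=0)                        # attend across the splits
    return (parts * attn[:, :, None, None]).sum(axis=0)
```

Because the weights are a softmax across groups, the output is a convex combination of the splits, so each spatial position stays within the range spanned by the group features at that position.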