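Computer science
Image segmentation
Transformer
Computer vision
Artificial intelligence
Computer graphics (images)
Segmentation
Image processing
Image (mathematics)
Electrical engineering
Engineering
Voltage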
Authors
Siyavash Shabani, Muhammad Sohaib, Sayed A. Mohamed, Bahram Parvin
Identifier
DOI:10.1109/isbi60581.2025.10981294
Abstract
Vision Transformers have outperformed traditional convolution-based frameworks across a range of visual tasks, including the segmentation of 3D medical images. To advance this area, this study introduces Coupled Swin Transformers and Multi-Apertures Networks (CSTA-Net), which integrates the output of each Swin Transformer with an Aperture Network. Each aperture network consists of a convolution and a fusion block that combines global and local feature maps. The proposed model was tested on two independent datasets to show that it delineates fine details. Trained on the Synapse multi-organ and ACDC datasets, the architecture achieved average Dice scores of 90.19±0.05 and 93.77±0.04, respectively. The code is available at https://github.com/Siyavashshabani/CSTANet.
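For readers unfamiliar with the metric reported above, the following is a minimal sketch of the Dice score, 2|A∩B| / (|A| + |B|), computed on flat label lists in plain Python. The function names `dice_score` and `mean_dice`, the toy labels, and the choice to score two empty masks as 1.0 are illustrative assumptions; this is not taken from the CSTA-Net repository, and the abstract does not state how scores are averaged across organs or runs.

```python
def dice_score(pred, target):
    """Dice overlap between two flat binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels marked foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pred_labels, target_labels, classes):
    """Average the per-class Dice score over the given class ids."""
    scores = []
    for c in classes:
        # Turn the multi-class label maps into one binary mask per class.
        pred_mask = [label == c for label in pred_labels]
        target_mask = [label == c for label in target_labels]
        scores.append(dice_score(pred_mask, target_mask))
    return sum(scores) / len(scores)


# Toy example: six voxels, background 0 and two hypothetical organ classes.
pred_labels = [0, 1, 1, 2, 2, 2]
target_labels = [0, 1, 2, 2, 2, 1]
print(mean_dice(pred_labels, target_labels, classes=[1, 2]))
```

In practice the masks would be flattened 3D volumes, and a library implementation operating on tensors would replace these loops.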