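Computer science
Boosting (machine learning)
Segmentation
Artificial intelligence
Feature learning
Convolutional neural network
Feature (linguistics)
Convolution (computer science)
Pattern recognition (psychology)
Redundancy (engineering)
Image segmentation
Artificial neural network
Philosophy
Linguistics
Operating system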
Authors
Qingqing Yan, Shu Li, Chengju Liu, Ming Liu, Qijun Chen
Identifier
DOI:10.1109/icra48891.2023.10161421
Abstract
This paper proposes a novel real-time semantic segmentation network based on frequency-domain learning, called FDLNet, which revisits the segmentation task from two critical perspectives: spatial structure description and multi-level feature fusion. We first devise an image-size convolution (IS-Conv) as a global frequency-domain learning operator to capture long-range dependencies in a single shot. To model spatial structure information, we construct the global structure representation path (GSRP) based on IS-Conv, which learns a unified edge-region representation with affordable complexity. For efficient and lightweight multi-level feature fusion, we propose the factorized stereoscopic attention (FSA) module, which alleviates semantic confusion and reduces feature redundancy by introducing level-wise attention before channel and spatial attention. Combining the above modules, we propose a concise semantic segmentation framework named FDLNet. We experimentally demonstrate the effectiveness and superiority of the proposed method. FDLNet achieves state-of-the-art performance on the Cityscapes dataset, reporting 76.32% mIoU at 150+ FPS and 79.0% mIoU at 41+ FPS. The code is available at https://github.com/qyan0131/FDLNet.
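The abstract describes IS-Conv only at a high level, so the following is not the authors' implementation. It is a minimal NumPy sketch of the general idea that a kernel as large as the image can be applied in one step as an elementwise product in the frequency domain (the convolution theorem). The function names `global_circular_conv` and `direct_circular_conv` are hypothetical, and circular boundary handling is an assumption made for illustration.

```python
import numpy as np


def global_circular_conv(image, kernel):
    """Circularly convolve a 2-D image with a kernel of the same size.

    The convolution is computed as an elementwise product of the two
    spectra, so every output pixel depends on every input pixel.
    """
    if image.shape != kernel.shape:
        raise ValueError("kernel must have the same shape as the image")
    spectrum = np.fft.rfft2(image) * np.fft.rfft2(kernel)
    # Passing s= restores the original size, including odd widths.
    return np.fft.irfft2(spectrum, s=image.shape)


def direct_circular_conv(image, kernel):
    """Slow spatial-domain reference, used only to check the FFT version."""
    h, w = image.shape
    out = np.zeros((h, w))
    for dy in range(h):
        for dx in range(w):
            # After rolling, position (y, x) holds image[(y - dy) % h, (x - dx) % w].
            out += kernel[dy, dx] * np.roll(image, shift=(dy, dx), axis=(0, 1))
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.standard_normal((6, 5))
    ker = rng.standard_normal((6, 5))
    fast = global_circular_conv(img, ker)
    slow = direct_circular_conv(img, ker)
    print("FFT and direct results agree:", np.allclose(fast, slow))
```

In a learned operator of this kind, the kernel (or its spectrum) would be a trainable parameter; here it is random data, and the direct version exists only to confirm that the frequency-domain product matches the spatial-domain definition.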