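Computer science
Segmentation
Artificial intelligence
Perception
Computer vision
Spatial frequency
Pattern recognition (psychology)
Physics
Neuroscience
Optics
Biology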
Authors
Shuli Cheng,Jiang Liu,Anyu Du
Identifier
DOI:10.1016/j.eswa.2025.127760
Abstract
Remote sensing semantic segmentation, a current research hotspot, is widely applied in scenarios such as agricultural planning and urban construction planning. Deep learning models based on spatial and frequency domain co-perception have developed rapidly because they perceive features in both domains. However, current networks primarily apply frequency domain perception within various attention mechanisms and lack complementary learning with the spatial domain. In addition, whether the encoder–decoder pairing in existing U-shaped architectures can fully exploit the advantages of dual-domain perception requires further discussion. To address these issues, this paper proposes the Spatial and Learnable Frequency Dynamic Collaborative Visual Perception Network (SFCVPNet). The encoder uses a cascade of Spatial and Frequency Domain-Aware Transformer (SFFormer) blocks and Convolutional Neural Network (CNN) blocks to perceive features, while the decoder employs simple skip connections, forming a novel way of integrating CNN and Transformer blocks in the U-shaped architecture. SFFormer includes the Spatial and Frequency Domain Collaborative Perception Module (SFCPM) and the Spatial and Frequency Guided Multi-Layer Perceptron (SF-MLP). Both SFCPM and SF-MLP use a dual-domain perception approach to extract and integrate features. Additionally, we adopt learnable operations for frequency domain feature perception, so that the frequency domain features are adapted to the network during training. On three public datasets, ISPRS Vaihingen, ISPRS Potsdam, and LoveDA, we achieve mIoU scores of 85.90%, 88.19%, and 55.6%, respectively. The code will be released at https://github.com/cslxju/SFCVPNet.
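For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of how mean intersection-over-union is commonly computed from predicted and ground-truth label maps. It is not taken from the authors' code: the function name `mean_iou`, the flattened-list input format, and the choice to average only over classes that appear in either map are assumptions made for illustration, and the abstract does not state the exact evaluation protocol (for example, which classes are ignored).

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in pred or target (hypothetical helper).

    pred and target are flat sequences of integer class labels of equal length.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel counts toward both the intersection and the union of class p.
            intersection[p] += 1
            union[p] += 1
        else:
            # Mismatch: the pixel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Skip classes absent from both maps to avoid division by zero.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: flattened label maps for a 2x3 image with 3 classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. Benchmark scores such as those in the abstract are typically accumulated over the whole test set before averaging, rather than computed per image.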