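Computer science
Segmentation
Artificial intelligence
Shot (pellet)
Natural language processing
Pattern recognition (psychology)
Organic chemistry
Chemistry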
Authors
Jiaqi Ma, Guo-Sen Xie, Fang Zhao, Zechao Li
Identifier
DOI:10.1109/tmm.2025.3535348
Abstract
Few-shot learning aims to recognize novel concepts by leveraging prior knowledge learned from a few samples. However, for visually intensive tasks such as few-shot semantic segmentation, pixel-level annotations are time-consuming and costly to obtain. In this paper, we therefore use image-level annotations, a more challenging form of supervision, and propose an adaptive frequency-aware network (AFANet) for weakly-supervised few-shot semantic segmentation (WFSS). Specifically, we first propose a cross-granularity frequency-aware module (CFM) that decouples RGB images into high-frequency and low-frequency distributions and further optimizes semantic structural information by realigning them. Most existing WFSS methods use textual information from a multi-modal language-vision model such as CLIP in an offline learning manner. In contrast, we further propose a CLIP-guided spatial-adapter module (CSM), which performs a spatial-domain adaptive transformation on textual information through online learning, thus providing enriched cross-modal semantic information for CFM. Extensive experiments on the Pascal-5i and COCO-20i datasets demonstrate that AFANet achieves state-of-the-art performance.
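The abstract does not say how CFM decouples an image into high-frequency and low-frequency parts. As a rough illustration of the general idea only, the sketch below splits an image with an ideal circular low-pass mask in the 2D Fourier domain and treats the residual as the high-frequency part. The function name `split_frequencies`, the `radius` parameter, and the choice of an FFT mask are all assumptions made for this example, not the authors' method.

```python
import numpy as np


def split_frequencies(image: np.ndarray, radius: float) -> tuple[np.ndarray, np.ndarray]:
    """Split an (H, W) or (H, W, C) image into low- and high-frequency parts.

    Assumption: an ideal circular low-pass mask in the Fourier domain.
    The decomposition used inside AFANet's CFM may be different.
    """
    h, w = image.shape[:2]
    # Integer frequency indices along each spatial axis (cycles per image).
    fy = np.fft.fftfreq(h) * h
    fx = np.fft.fftfreq(w) * w
    # Distance of every frequency bin from the zero-frequency (DC) bin.
    dist = np.hypot(fy[:, None], fx[None, :])
    mask = dist <= radius
    if image.ndim == 3:
        mask = mask[..., None]  # apply the same mask to every channel

    spectrum = np.fft.fft2(image, axes=(0, 1))
    # The mask is symmetric in frequency, so the inverse transform is real
    # up to floating-point error; .real discards that residue.
    low = np.fft.ifft2(spectrum * mask, axes=(0, 1)).real
    high = image - low  # residual carries edges and fine texture
    return low, high
```

By construction `low + high` reproduces the input, so the two parts can be processed separately and recombined. A small `radius` keeps only coarse color and layout in `low`, leaving most edge detail in `high`.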