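Computer science
Segmentation
Discriminant
Scalability
Convolutional neural network
Artificial intelligence
Transformer
Pascal (unit)
Pattern recognition (psychology)
Machine learning
Quantum mechanics
Database
Physics
Voltage
Programming language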
Authors
Chunmeng Liu,Yao Shen,Qingguo Xiao,Guangyao Li
Source
Journal: Neurocomputing
[Elsevier BV]
Date: 2024-08-01
Volume/Issue: 593: 127834-127834
Identifier
DOI:10.1016/j.neucom.2024.127834
Abstract
Generating initial seeds is an important step in weakly supervised semantic segmentation (WSSS). Our approach concentrates on generating and refining initial seeds. The convolutional neural networks (CNNs)–based initial seeds focus only on the most discriminative regions and lack global information about the target. The Vision Transformer (ViT)–based approach can capture long-range feature dependencies due to the unique advantage of the self-attention mechanism. Still, we find that it suffers from distractor object leakage and background leakage problems. Based on these observations, we propose PCSformer, which improves the model's ability to extract features through a Pair-wise Cross-scale (PC) strategy and solves the problem of distractor object leakage by further extracting potential target features through Sub-Prototypes (SP) mining. In addition, the proposed Conflict Self-Elimination (CSE) module further alleviates the background leakage problem. We validate our approach on the widely adopted Pascal VOC 2012 and MS COCO 2014, and extensive experiments demonstrate our superior performance. Furthermore, our method proves to be competitive for WSSS in medical images and challenging scenarios involving deformable and cluttered scenes. Additionally, we extend the PCSformer to weakly supervised object localization tasks, further highlighting its scalability and versatility. The code is available at https://github.com/ChunmengLiu1/PCSformer.