Breaking Barriers, Localizing Saliency: A Large-Scale Benchmark and Baseline for Condition-Constrained Salient Object Detection

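Computer science, Salient, Artificial intelligence, Object detection, Computer vision, Benchmark (surveying), Object (grammar), Decoding methods, Focus (optics), Construct (python library), Pattern recognition (psychology), Constraint (computer-aided design), Field (mathematics), Prior probability, Feature extraction, Change detection, Video tracking, Hidden Markov model, Pose, Baseline (sea), Cognitive neuroscience of visual object recognition, Machine learning, Encoding, Bridge (graph theory)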

Authors

Runmin Cong, Zhiyang Chen, Hao Fang, Sam Kwong, Wei Zhang

Source

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence [IEEE Computer Society]
Date: 2025-12-11  Volume (issue): 48 (4): 4167-4183  Citations: 1

Identifier

DOI: 10.1109/tpami.2025.3642893

Abstract

Salient Object Detection (SOD) aims to identify and segment the most prominent objects in an image. In real open environments, intelligent systems often encounter complex and challenging scenes, such as low light, rain, and snow, which we call constrained conditions. These real situations pose more severe challenges to existing SOD models. However, this field has not been explored comprehensively or in depth at either the data or the model level, and most existing work focuses on ideal situations or a single condition. To bridge this gap, we launch a new task, Condition-Constrained Salient Object Detection (CSOD), aimed at robustly and accurately locating salient objects in constrained environments. On the one hand, to compensate for the lack of datasets, we construct the first large-scale condition-constrained salient object detection dataset, CSOD10K, comprising 10,000 pixel-level annotated images and over 100 categories of salient objects. This dataset is oriented towards the real environment and includes 8 real-world constrained scenes under 3 main constraint types, making it extremely challenging. On the other hand, we abandon the paradigm of "restoration before detection" and instead introduce a unified end-to-end framework, CSSAM, that fully explores scene attributes, eliminating the need for additional ground-truth restored images and reducing computational overhead. Specifically, we design a Scene Prior-Guided Adapter (SPGA), which injects scene priors to enable the foundation model to better adapt to downstream constrained scenes. To automatically decode salient objects, we propose a Hybrid Prompt Decoding Strategy (HPDS), which can effectively integrate multiple types of prompts to achieve adaptation to the SOD task. Extensive experiments show that our model significantly outperforms state-of-the-art methods on both the CSOD10K dataset and existing standard SOD benchmarks.
