Computer science
Salient
Artificial intelligence
Object (grammar)
Natural language processing
Identifier
DOI:10.1109/tmm.2023.3325731
Abstract
Salient Object Detection (SOD) is dominated by encoder-decoder networks that involve multi-scale feature fusion and multi-resolution dense supervision. First, it is prevalent yet problematic to interpolate feature maps or pool ground truth (GT) to fit the size of decoder stages in SOD: structural properties are unavoidably damaged because pixels are discarded or changed during scaling, which makes restoration difficult and degrades predictions. Second, it is intuitive but suboptimal to treat the last layer of an encoder as the global context, even though it is widely accepted that high-level encoder features contain global information that contributes to the overall shape of a salient object. To this end, this paper aims to enhance these techniques for richer details and a more complete shape. First, we develop a Global Context Branch (GCB), which is a patch-wise supervised SOD on top of the encoder, for better global context modeling. Second, we develop a Context Refinement Module (CRM) to improve high/low-level feature fusion and enhance detail reconstruction. Lastly, we adopt Pixel Shuffle (PS) when scaling features and GT maps to preserve structural information. Experiments demonstrate that the proposed framework achieves state-of-the-art performance on all five benchmark datasets across six evaluation metrics.
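The abstract's point about scaling can be illustrated with a small sketch. The code below is not the authors' implementation; it is a generic NumPy version of the pixel-shuffle rearrangement and its inverse, written under the assumption that "scaling a GT map" means a space-to-depth reshuffle by an integer factor `r`. The function names `pixel_unshuffle` and `pixel_shuffle`, the `(C, H, W)` layout, and the toy 4x4 map are all illustrative choices. It shows that the rearrangement keeps every pixel (the round trip is exact), whereas 2x2 max pooling keeps only one value per window.

```python
import numpy as np


def pixel_unshuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Space-to-depth: (C, H, W) -> (C*r*r, H/r, W/r), keeping every pixel."""
    c, h, w = x.shape
    if h % r or w % r:
        raise ValueError("H and W must be divisible by r")
    x = x.reshape(c, h // r, r, w // r, r)   # split each spatial axis into (block, offset)
    x = x.transpose(0, 2, 4, 1, 3)           # move the two offsets next to the channel axis
    return x.reshape(c * r * r, h // r, w // r)


def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Depth-to-space: (C*r*r, H, W) -> (C, H*r, W*r), the inverse of pixel_unshuffle."""
    crr, h, w = x.shape
    if crr % (r * r):
        raise ValueError("channel count must be divisible by r*r")
    c = crr // (r * r)
    x = x.reshape(c, r, r, h, w)             # recover the two offsets from the channel axis
    x = x.transpose(0, 3, 1, 4, 2)           # interleave offsets back into the spatial axes
    return x.reshape(c, h * r, w * r)


def max_pool(x: np.ndarray, r: int) -> np.ndarray:
    """Non-overlapping r x r max pooling, shown only for comparison."""
    c, h, w = x.shape
    return x.reshape(c, h // r, r, w // r, r).max(axis=(2, 4))


# Hypothetical 1-channel, 4x4 "GT map" with distinct values so losses are visible.
gt = np.arange(16, dtype=np.float32).reshape(1, 4, 4)

small = pixel_unshuffle(gt, 2)      # all 16 values survive, spread over 4 channels
restored = pixel_shuffle(small, 2)  # exact reconstruction of the original map
pooled = max_pool(gt, 2)            # only 4 of the 16 values survive

print(small.shape, pooled.shape, np.array_equal(restored, gt))
```

The design point is that the reshuffle trades spatial resolution for channels instead of discarding samples, so a decoder stage supervised at the lower resolution still sees the full structure of the map.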