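Computer science
Discriminative model
Correctness
Salience
Schema (genetic algorithms)
Artificial intelligence
Pairwise comparison
Aggregate (composite)
Context (archaeology)
Feature (linguistics)
Object detection
Machine learning
Pattern recognition (psychology)
Algorithm
Composite material
Materials science
Paleontology
Philosophy
Biology
Linguistics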
Identifier
DOI:10.1016/j.engappai.2022.105733
Abstract
Video salient object detection (VSOD), which aims to detect the most conspicuous objects or regions in a video, has become an important research topic over the past few years. Previous studies mainly focus on spatial-temporal architectures that rely heavily on implicit attention models to aggregate complementary information from adjacent video frames. Despite remarkable improvements, existing approaches pay little attention to cross-video affinities, which are important for building an explicit attention scheme for VSOD. To this end, we propose a novel attention correctness strategy to supervise the aggregation process. Specifically, unlike previous works, we employ a pairwise training scheme that leverages both positive and negative aggregation supervision to explore inter-video affinities for VSOD. The proposed mechanism suppresses negative correspondences between video frames and reinforces discriminative feature mining for conspicuous objects. To enhance intra-video correspondence, we propose a part-aware similarity aggregation module that uses intra-video affinities to segment salient objects with video-level context. Extensive experiments are conducted on six popular benchmarks: FBMS, DAVIS, DAVSOD, SegTrack-V2, VOS and ViS. Results on challenging scenes demonstrate the effectiveness of the proposed method; for example, on DAVSOD-T we achieve improvements of 0.4% in MAE, 1.1% in maximum F-measure and 0.5% in S-measure over other competitive models.
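The abstract reports MAE and maximum F-measure without defining them. As a rough illustration, the sketch below computes both for a single flattened saliency map, assuming the definitions commonly used in salient object detection: MAE as the mean absolute difference between the predicted map and the binary ground-truth mask, and F-measure with β² = 0.3 maximised over 256 evenly spaced thresholds. The function names, the value of `beta_sq`, and the threshold sweep are assumptions made for this example and are not taken from the paper; S-measure is omitted.

```python
from typing import Sequence


def mae(pred: Sequence[float], gt: Sequence[int]) -> float:
    """Mean absolute error between a saliency map in [0, 1] and a binary mask."""
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def f_measure(pred: Sequence[float], gt: Sequence[int],
              threshold: float, beta_sq: float = 0.3) -> float:
    """F-measure of the map binarised at `threshold` (beta_sq = 0.3 is an assumption)."""
    true_pos = sum(1 for p, g in zip(pred, gt) if p >= threshold and g == 1)
    if true_pos == 0:
        return 0.0  # precision or recall is zero (or undefined)
    predicted_pos = sum(1 for p in pred if p >= threshold)
    actual_pos = sum(gt)
    precision = true_pos / predicted_pos
    recall = true_pos / actual_pos
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


def max_f_measure(pred: Sequence[float], gt: Sequence[int],
                  steps: int = 256, beta_sq: float = 0.3) -> float:
    """Largest F-measure over `steps` evenly spaced thresholds in [0, 1]."""
    return max(f_measure(pred, gt, t / (steps - 1), beta_sq) for t in range(steps))


# Toy flattened 2x2 saliency map and its ground-truth mask (made-up values).
pred_map = [0.9, 0.8, 0.2, 0.1]
gt_mask = [1, 1, 0, 0]
print(mae(pred_map, gt_mask), max_f_measure(pred_map, gt_mask))
```

Lower MAE is better, while higher F-measure is better. Benchmark scores are typically averaged over all frames of a dataset, a step this sketch leaves out.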