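Computer science
Optical flow
Artificial intelligence
Segmentation
Computer vision
Encoder
Graph
Scene graph
Parsing
Pattern recognition (psychology)
Theoretical computer science
Image (mathematics)
Rendering (computer graphics)
Operating system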
Authors
Gensheng Pei,Fumin Shen,Yazhou Yao,Tao Chen,Xian-Sheng Hua,Heng Tao Shen
Source
Journal: IEEE Transactions on Image Processing
[Institute of Electrical and Electronics Engineers]
Date: 2023-01-01
Volume and pages: 32: 5909-5920
Identifier
DOI:10.1109/tip.2023.3326395
Abstract
The optical flow guidance strategy is ideal for obtaining motion information of objects in the video. It is widely utilized in video segmentation tasks. However, existing optical flow-based methods have a significant dependency on optical flow, which results in poor performance when the optical flow estimation fails for a particular scene. The temporal consistency provided by the optical flow could be effectively supplemented by modeling in a structural form. This paper proposes a new hierarchical graph neural network (GNN) architecture, dubbed hierarchical graph pattern understanding (HGPU), for zero-shot video object segmentation (ZS-VOS). Inspired by the strong ability of GNNs in capturing structural relations, HGPU innovatively leverages motion cues (i.e., optical flow) to enhance the high-order representations from the neighbors of target frames. Specifically, a hierarchical graph pattern encoder with message aggregation is introduced to acquire different levels of motion and appearance features in a sequential manner. Furthermore, a decoder is designed for hierarchically parsing and understanding the transformed multi-modal contexts to achieve more accurate and robust results. HGPU achieves state-of-the-art performance on four publicly available benchmarks (DAVIS-16, YouTube-Objects, Long-Videos and DAVIS-17). Code and pre-trained model can be found at https://github.com/NUST-Machine-Intelligence-Laboratory/HGPU.