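Computer science
Pascal (unit)
Artificial intelligence
Machine learning
Visual reasoning
Object detection
Graph
Process (computing)
Context (archaeology)
Deep learning
Cognitive neuroscience of visual object recognition
Object (grammar)
Convolutional neural network
Pattern recognition (psychology)
Theoretical computer science
Paleontology
Operating system
Programming language
Biology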
Authors
Dingwen Zhang,Wenyuan Zeng,Jie-Ru Yao,Junwei Han
Identifier
DOI:10.1109/tpami.2020.3046647
Abstract
In recent years, weakly supervised object detection has attracted great attention in the computer vision community. Although numerous deep learning-based approaches have been proposed in the past few years, this ill-posed problem remains challenging and learning performance still falls short of expectations. In fact, most existing approaches consider only the visual appearance of each proposal region and neglect helpful context information. To this end, this paper introduces two levels of context into the weakly supervised learning framework. The first is the proposal-level context, i.e., the relationship among spatially adjacent proposals. The second is the semantic-level context, i.e., the relationship among co-occurring object categories. The proposed weakly supervised learning framework therefore contains not only a cognition process on visual appearance but also a reasoning process on the proposal- and semantic-level relationships, which leads to the novel deep multiple instance reasoning framework. Specifically, built upon a conventional CNN-based network architecture, the proposed framework is equipped with two additional graph convolutional network-based reasoning models that implement object location reasoning and multi-label reasoning within an end-to-end network training procedure. Comprehensive experiments on the widely used PASCAL VOC and MS COCO benchmarks demonstrate the superior capacity of the proposed approach compared with other state-of-the-art methods and baseline models.
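To make the idea of proposal-level context concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that "spatially adjacent" proposals can be approximated by an IoU threshold, replaces the learned graph convolutional layer with a single parameter-free neighbour-averaging step, and assumes a simple attention-style pooling to turn proposal scores into image-level scores (the only supervision available in the weakly supervised setting). The function names, the threshold, and the toy scores are all hypothetical.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def adjacency(boxes, thresh=0.1):
    """Row-normalized adjacency with self-loops.
    Assumption: two proposals are neighbours when their IoU exceeds `thresh`."""
    n = len(boxes)
    rows = []
    for i in range(n):
        row = [1.0 if i == j or iou(boxes[i], boxes[j]) > thresh else 0.0
               for j in range(n)]
        total = sum(row)
        rows.append([v / total for v in row])
    return rows

def propagate(adj, scores):
    """One parameter-free propagation step: each proposal's class scores become
    the average over itself and its neighbours (a stand-in for a learned GCN layer)."""
    n, c = len(scores), len(scores[0])
    return [[sum(adj[i][j] * scores[j][k] for j in range(n)) for k in range(c)]
            for i in range(n)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def mil_image_scores(scores):
    """Multiple instance aggregation (assumed form): for each class, weight the
    sigmoid of each proposal score by a softmax over proposals and sum."""
    n, c = len(scores), len(scores[0])
    out = []
    for k in range(c):
        col = [scores[i][k] for i in range(n)]
        weights = softmax(col)
        out.append(sum(w / (1.0 + math.exp(-s)) for w, s in zip(weights, col)))
    return out

# Toy example: two overlapping proposals and one isolated proposal, two classes.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
raw_scores = [[2.0, -1.0], [0.0, -1.0], [-3.0, 4.0]]

adj = adjacency(boxes)
refined = propagate(adj, raw_scores)
image_scores = mil_image_scores(refined)
```

In this toy setup the two overlapping proposals share evidence while the isolated one keeps its own scores. In a trained model the image-level scores would be compared against image-level labels with a classification loss; the semantic-level reasoning over co-occurring categories described in the abstract is not covered by this sketch.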