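Pascal (unit)
Computer science
Object detection
Artificial intelligence
Pattern recognition (psychology)
Segmentation
Convolutional neural network
Scalability
Image segmentation
Context (archaeology)
Database
Biology
Paleontology
Programming language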
Authors
Ross Girshick,Jeff Donahue,Trevor Darrell,Jitendra Malik
Identifier
DOI:10.1109/tpami.2015.2437384
Abstract
Object detection performance, as measured on the canonical PASCAL VOC Challenge datasets, plateaued in the final years of the competition. The best-performing methods were complex ensemble systems that typically combined multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 50 percent relative to the previous best result on VOC 2012, achieving a mAP of 62.4 percent. Our approach combines two ideas: (1) one can apply high-capacity convolutional networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data are scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, boosts performance significantly. Since we combine region proposals with CNNs, we call the resulting model an R-CNN or Region-based Convolutional Network. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.
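The relative-improvement claim in the abstract can be unpacked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical and not part of the R-CNN code release, and it assumes "more than 50 percent relative" means (new - old) / old > 0.5. Under that reading, an mAP of 62.4 percent implies a previous best below 62.4 / 1.5 = 41.6 percent; the abstract itself does not state the earlier figure.

```python
def relative_improvement(baseline: float, new_score: float) -> float:
    """Relative gain of new_score over baseline, as a fraction (0.5 == 50%)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new_score - baseline) / baseline


def baseline_upper_bound(new_score: float, relative_gain: float) -> float:
    """Baseline at which the gain is exactly relative_gain.

    Any baseline below this value gives a relative gain larger than
    relative_gain. Hypothetical helper for illustration only.
    """
    if relative_gain <= -1.0:
        raise ValueError("relative_gain must be greater than -1")
    return new_score / (1.0 + relative_gain)


# Figures taken from the abstract: 62.4 percent mAP, gain of more than 50 percent.
bound = baseline_upper_bound(62.4, 0.50)
print(round(bound, 1))  # 41.6
```

Read this as an upper bound on the earlier result rather than its actual value, since the abstract only says the gain exceeds 50 percent.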