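Computer science
Pascal (unit)
Detector
Minimum bounding box
Bounding overwatch
Image (mathematics)
Artificial intelligence
Object (grammar)
Object detection
Algorithm
Pattern recognition (psychology)
Data mining
Telecommunications
Programming language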
Authors
Hao Zhao,Jikai Wang,Deyun Dai,Shiqi Lin,Zonghai Chen
Source
Journal: Neurocomputing
[Elsevier BV]
Date: 2022-09-20
Volume: 512, pages 225-234
Citations: 10
Identifier
DOI:10.1016/j.neucom.2022.09.080
Abstract
Non-maximum suppression (NMS), which selects the best predictions among all candidate bounding boxes, is an important post-processing step in most state-of-the-art object detectors. Standard NMS applies a single fixed threshold to every input image and therefore ignores the characteristics of individual images. Several adaptive NMS methods have recently been proposed and shown to outperform standard NMS with a fixed threshold. However, their adaptability is limited because they lack a measure of the complexity of the input image. In this paper, we propose a dynamic NMS network (D-NMS net) that predicts the best NMS threshold for each input image and can be embedded into most state-of-the-art single-stage object detectors. Concretely, we first propose a unified definition of scene complexity for a single image, based on the relationship between the P-R curve and the changing NMS threshold. Second, we calculate the optimal NMS threshold for each image according to this definition and use it as the supervision label during training. Lastly, we embed the lightweight regression network, D-NMS net, into mainstream object detectors. Extensive experiments are conducted on challenging datasets. With D-NMS net, both the accuracy and the efficiency of the detectors improve. On Pascal VOC, the mean Average Precision (mAP) of RetinaNet rises from 81.60% to 84.74%, and that of FCOS from 79.12% to 84.20%. On MS-COCO, the Average Precision (AP) of RetinaNet rises from 36.4% to 38.5%, and that of FCOS from 37.2% to 39.1%. Meanwhile, inference speed increases by up to 62%.
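For readers unfamiliar with the fixed-threshold scheme the abstract refers to, the following is a minimal sketch of standard greedy NMS in Python with NumPy. It is not the authors' D-NMS net. The function names `pairwise_iou` and `greedy_nms`, the `[x1, y1, x2, y2]` box format, and the toy boxes are assumptions made for illustration. The `iou_threshold` argument is the value that standard NMS fixes for all images and that, according to the abstract, D-NMS net would instead predict per image.

```python
import numpy as np


def pairwise_iou(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Clip at zero so non-overlapping boxes get zero intersection area.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def greedy_nms(boxes, scores, iou_threshold):
    """Return indices of kept boxes, highest score first."""
    order = np.argsort(-scores)  # candidate indices, best score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        overlaps = pairwise_iou(boxes[best], boxes[rest])
        # Discard candidates that overlap the kept box more than the threshold.
        order = rest[overlaps <= iou_threshold]
    return keep


# Toy data (illustrative only): two heavily overlapping boxes and one isolated box.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])

strict = greedy_nms(boxes, scores, iou_threshold=0.5)  # suppresses the second box
loose = greedy_nms(boxes, scores, iou_threshold=0.7)   # keeps all three boxes
```

The first two toy boxes have an IoU of 81/119, roughly 0.68, so a threshold of 0.5 suppresses the lower-scoring one while a threshold of 0.7 keeps both. This shows in miniature why a single fixed threshold may suit sparse scenes and crowded scenes differently.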