Object detection
Voting
Hough transform
Object (grammar)
Segmentation
Task (project management)
Object-class detection
Cognitive neuroscience of visual object recognition
Pattern recognition (psychology)
Viola–Jones object detection framework
Coding (set theory)
Computer science
Pose
Majority rule
Image segmentation
Artificial intelligence
Image (mathematics)
Range (aeronautics)
Computer vision
High dynamic range
Facial recognition system
Face detection
Set (abstract data type)
Economics
Management
Programming language
Law
Politics
Political science
Authors
Nermin Samet, Samet Hicsonmez, Emre Akbas
Identifier
DOI: 10.1109/tpami.2022.3200413
Abstract
This paper presents HoughNet, a one-stage, anchor-free, voting-based, bottom-up object detection method. Inspired by the Generalized Hough Transform, HoughNet determines the presence of an object at a certain location by the sum of the votes cast on that location. Votes are collected from both near and long-distance locations based on a log-polar vote field. Thanks to this voting mechanism, HoughNet is able to integrate both near and long-range, class-conditional evidence for visual recognition, thereby generalizing and enhancing current object detection methodology, which typically relies on only local evidence. On the COCO dataset, HoughNet's best model achieves 46.4 AP (and 65.1 AP50), performing on par with the state-of-the-art in bottom-up object detection and outperforming most major one-stage and two-stage methods. We further validate the effectiveness of our proposal in other visual detection tasks, namely, video object detection, instance segmentation, 3D object detection and keypoint detection for human pose estimation, and an additional "labels to photo" image generation task, where the integration of our voting module consistently improves performance in all cases. Code is available at https://github.com/nerminsamet/houghnet.
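To make the log-polar voting idea concrete, below is a minimal NumPy sketch of vote accumulation for a single object class. It is an illustration under simplifying assumptions, not the authors' implementation (that is available at the linked repository): the names `log_polar_offsets` and `accumulate_votes`, the bin parameters (`num_rings`, `num_angles`, `base_radius`, `growth`), and the uniform spreading of a bin's vote over its cells are all hypothetical choices made for this sketch; in HoughNet the per-bin vote maps are predicted by a network and accumulation is done per class.

```python
import numpy as np

def log_polar_offsets(num_rings=3, num_angles=8, base_radius=2.0, growth=2.0):
    """Group integer (dy, dx) offsets into log-polar bins.

    Ring 0 is the central disk (radius < base_radius); ring r >= 1 covers
    radii [base_radius * growth**(r-1), base_radius * growth**r), split
    into num_angles angular sectors.
    """
    max_radius = base_radius * growth ** num_rings
    extent = int(np.ceil(max_radius))
    bins = {}
    for dy in range(-extent, extent + 1):
        for dx in range(-extent, extent + 1):
            radius = np.hypot(dy, dx)
            if radius >= max_radius:
                continue
            if radius < base_radius:
                key = (0, 0)  # single central bin
            else:
                ring = 1 + int(np.log(radius / base_radius) / np.log(growth))
                angle = int((np.arctan2(dy, dx) + np.pi)
                            / (2 * np.pi) * num_angles) % num_angles
                key = (ring, angle)
            bins.setdefault(key, []).append((dy, dx))
    return bins

def accumulate_votes(vote_scores, bins, height, width):
    """Sum the votes every location casts onto its log-polar neighborhood.

    vote_scores maps each (ring, angle) bin to an (H, W) map of how strongly
    each location votes for an object lying in that relative bin; the result
    is an (H, W) accumulator of the total votes *received* at each location.
    """
    acc = np.zeros((height, width))
    for key, offsets in bins.items():
        # Spread a bin's vote uniformly over its cells (a simplifying assumption).
        scores = vote_scores[key] / len(offsets)
        for dy, dx in offsets:
            # A vote cast at (y, x) for this bin lands on (y + dy, x + dx);
            # the slice bounds clip votes that would fall outside the map.
            y0, y1 = max(0, -dy), min(height, height - dy)
            x0, x1 = max(0, -dx), min(width, width - dx)
            acc[y0 + dy:y1 + dy, x0 + dx:x1 + dx] += scores[y0:y1, x0:x1]
    return acc

# Toy usage: random per-bin vote maps for one class on a 32x32 grid.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H = W = 32
    bins = log_polar_offsets()
    vote_scores = {key: rng.random((H, W)) for key in bins}
    heatmap = accumulate_votes(vote_scores, bins, H, W)
    print("peak vote location:", np.unravel_index(heatmap.argmax(), heatmap.shape))
```

In this sketch each location contributes evidence to targets up to `base_radius * growth**num_rings` pixels away, which is how a log-polar field mixes near and long-range evidence while keeping fine resolution close to the center; the actual model's field layout and normalization may differ.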