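Computer science
Inference
Object detection
Artificial intelligence
Object (grammar)
Process (computing)
Cognitive neuroscience of visual object recognition
Backbone network
Semantics (computer science)
Coding (set theory)
Deep learning
Information exchange
Pattern recognition (psychology)
Computer vision
Machine learning
Set (abstract data type)
Computer network
Programming language
Operating system
Telecommunications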
Authors
Yiqi Jiang,Zhiyu Tan,Junyan Wang,Xiuyu Sun,Ming Lin,Hao Li
Source
Journal: Cornell University - arXiv
Date: 2022-01-01
Citations: 48
Identifier
DOI:10.48550/arxiv.2202.04256
Abstract
In conventional object detection frameworks, a backbone inherited from image recognition models extracts deep latent features, and a neck module then fuses these latent features to capture information at different scales. As the resolution in object detection is much larger than in image recognition, the computational cost of the backbone often dominates the total inference cost. This heavy-backbone design paradigm is mostly a historical legacy of transferring image recognition models to object detection rather than an end-to-end optimized design for object detection. In this work, we show that such a paradigm indeed leads to sub-optimal object detection models. To this end, we propose a novel heavy-neck paradigm, GiraffeDet, a giraffe-like network for efficient object detection. GiraffeDet uses an extremely lightweight backbone and a very deep and large neck module, which encourages dense information exchange among different spatial scales as well as different levels of latent semantics simultaneously. This design paradigm allows detectors to process high-level semantic information and low-level spatial information with the same priority even in the early stages of the network, making it more effective in detection tasks. Numerical evaluations on multiple popular object detection benchmarks show that GiraffeDet consistently outperforms previous SOTA models across a wide spectrum of resource constraints. The source code is available at https://github.com/jyqi/GiraffeDet.
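The abstract's point that backbone cost grows with input resolution can be illustrated with a rough count of multiply-accumulates for a single convolution layer. This is a minimal sketch and is not taken from the GiraffeDet code: the helper `conv_macs`, the 64-channel width, and the 224x224 and 800x1333 input sizes are all assumptions chosen for illustration.

```python
def conv_macs(height, width, c_in, c_out, kernel=3, stride=1):
    """Multiply-accumulates for one 'same'-padded 2D convolution (hypothetical helper)."""
    out_h = -(-height // stride)  # ceiling division gives the output height
    out_w = -(-width // stride)   # ceiling division gives the output width
    return out_h * out_w * c_in * c_out * kernel * kernel


# Assumed input sizes: a typical recognition crop vs. a typical detection image.
recognition = conv_macs(224, 224, 64, 64)
detection = conv_macs(800, 1333, 64, 64)

# The layer is identical, so the ratio is just the ratio of pixel counts.
ratio = detection / recognition
print(f"same layer costs about {ratio:.1f}x more at detection resolution")
```

Under these assumed sizes the same layer costs roughly 21 times more at detection resolution, which is consistent with the abstract's observation that a backbone designed for recognition can dominate inference cost in detection.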