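Computer science
Encoder
Pascal (unit)
Object detection
Transformer
Computer vision
Object (grammar)
Artificial intelligence
Pattern recognition (psychology)
Voltage
Physics
Quantum mechanics
Programming language
Operating system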
Authors
Jiangnan Zhang,Kewen Xia,Zhiyi Huang,Sijie Wang,Romoke Grace Akindele
Identifier
DOI:10.1016/j.eswa.2023.119997
Abstract
Detecting small objects is critical to many applications, such as autonomous driving and lung nodule detection. However, small object detection is challenging because the available features are low-resolution. The linchpin of small object detection is therefore an effective encoder that can extract subtle features. In this paper, we present a powerful encoder, called the Ensemble Transformer with Attention Modules (ETAM) encoder, for extracting subtle small-object features without sacrificing the ability to detect larger objects. In ETAM, a Magnifying Glass (MG) module is proposed to focus on representative features of small objects. Then, Quadruple Attention (QA) is designed to enrich the small-object features with width and height attention in addition to channel and position attention. To accommodate both small and large objects, the ETAM encoder uses ensemble learning with two branches. Experimental results on PASCAL VOC, MS-COCO, VisDrone2019, and LIDC-IDRI show that ETAM significantly improves small object detection. With ETAM, the mAP for small objects is improved up to 91.7% on the four datasets.