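Pattern recognition (psychology)
Convolutional neural network
Computer vision
Feature (linguistics)
Artificial neural network
Object (grammar)
Deep learning
Classifier (UML)
Cognitive neuroscience of visual object recognition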
Authors
Cui-jin Li,Zhong Qu,Sheng-ye Wang,Ling Liu
Identifier
DOI:10.1016/j.patrec.2021.02.003
Abstract
Improving detection accuracy and speed is a prerequisite for multi-object recognition in complex traffic environments. Although object detection based on deep neural networks has made significant advances, detecting small and occluded objects remains a challenge. We address this challenge with multiscale fusion. We introduce a cross-layer fusion multi-object detection and recognition algorithm based on Faster R-CNN, in which the five-layer structure of VGG16 (Visual Geometry Group) is used to obtain more feature information. We implement this idea by laterally embedding 1×1 convolution kernels, max pooling, and deconvolution, in conjunction with a weighted balanced multi-class cross-entropy loss function and Soft-NMS to control the imbalance between difficult and easy samples. To reflect actual conditions in complex traffic environments, we manually label a mixed dataset. On the Cityscapes and KITTI datasets, experimental results show that the proposed model performs better than current mainstream object detection models.
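The abstract names Soft-NMS but gives no details of how it is configured. As a rough illustration only, the following is a minimal sketch of the Gaussian-decay variant of Soft-NMS in plain Python. The function names `iou` and `soft_nms`, the `(x1, y1, x2, y2)` box format, and the default values of `sigma` and `score_threshold` are assumptions made for this example, not the authors' implementation or settings.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS sketch (sigma and score_threshold are assumed defaults).

    Instead of deleting boxes that overlap the current best detection,
    their scores are decayed by exp(-iou^2 / sigma). Boxes whose score
    drops below score_threshold are discarded.
    Returns a list of (box, score) pairs in the order they were selected.
    """
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Select the highest-scoring box among those still under consideration.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best_idx)
        kept.append((best_box, best_score))

        # Decay the scores of the other boxes according to their overlap.
        decayed = []
        for box, score in remaining:
            overlap = iou(best_box, box)
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            if new_score >= score_threshold:
                decayed.append((box, new_score))
        remaining = decayed
    return kept
```

With hard NMS, a heavily overlapped box is removed outright, which can suppress a genuinely occluded object. In this sketch such a box only loses score, so it can still be reported at lower confidence, which is the property that makes Soft-NMS attractive for the occluded-object case the abstract describes.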