Computer science
Artificial intelligence
Encoding
Convolutional neural network
Feature extraction
Encoder
Convolution (computer science)
Feature (linguistics)
Computer vision
Pattern recognition (psychology)
Artificial neural network
Biochemistry
Chemistry
Linguistics
Philosophy
Gene
Operating system
Authors
Chuxia Yang, Wanshu Fan, Dongsheng Zhou, Qiang Zhang
Identifier
DOI:10.1109/msn57253.2022.00123
Abstract
Occluded person Re-Identification (Re-ID) aims to retrieve a target person across camera views in occlusion scenes. Because occlusion introduces interference from other objects and loses part of the person's appearance, efficient extraction of personal feature representations is crucial to the recognition accuracy of the system. Most existing methods address this problem by designing various deep networks, known as convolutional neural network (CNN)-based methods. Although these methods have a powerful ability to mine local features, they may fail to capture features containing global information due to the limitation of the Gaussian distribution property of the convolution operation. Recently, methods based on the Vision Transformer (ViT) have been successfully applied to the person Re-ID task and have achieved good performance. However, since ViT-based methods lack the capability of extracting local information from person images, the generated results may severely lose local details. To address these deficiencies, we design a convolution and self-attention aggregation network (CSNet) that combines the advantages of both CNN and ViT. The proposed CSNet consists of three parts. First, to better capture personal information, we adopt a Dual-Branch Encoder (DBE) to encode person images. Second, we embed a Local Information Aggregation Module (LIAM) in the feature map, which effectively exploits useful local information. Finally, a Multi-Head Global-to-Local Attention (MHGLA) module is designed to transmit global information to local features. Experimental results demonstrate the superiority of the proposed method over state-of-the-art (SOTA) methods on both occluded and holistic person Re-ID datasets.
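The abstract describes an MHGLA module that transmits global information to local features via multi-head attention. The paper's exact formulation is not given here, so the following is only a minimal numpy sketch of the general global-to-local attention pattern: a global feature vector serves as the query of each head and attends over local patch features, which act as keys and values. All function names, shapes, and the head count are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_to_local_attention(global_feat, local_feats, num_heads=4):
    """Hypothetical sketch of multi-head global-to-local attention.

    global_feat: (d,) global token, used as the query of each head.
    local_feats: (n, d) local patch features, used as keys and values.
    Returns a (d,) vector aggregating local features weighted by their
    relevance to the global representation.
    """
    d = global_feat.shape[0]
    assert d % num_heads == 0, "feature dim must split evenly across heads"
    dh = d // num_heads
    q = global_feat.reshape(num_heads, dh)        # (h, dh) per-head queries
    k = local_feats.reshape(-1, num_heads, dh)    # (n, h, dh) per-head keys
    v = k                                         # values share the projection here
    # scaled dot-product scores of the global query against each local patch
    scores = np.einsum('hd,nhd->hn', q, k) / np.sqrt(dh)   # (h, n)
    weights = softmax(scores, axis=-1)                     # attention over patches
    out = np.einsum('hn,nhd->hd', weights, v)              # (h, dh)
    return out.reshape(d)
```

In a full model the queries, keys, and values would pass through learned linear projections and the output would be fused back into the local feature map; this sketch only shows the attention routing itself.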