Keywords
Modality (human-computer interaction)
Computer science
Artificial intelligence
Pattern
Deep learning
Computer vision
Translation
Image (mathematics)
Image translation
Identification
Pattern recognition
Machine learning
Authors
Daoxun Xia,Haojie Liu,Lili Xu,Linna Wang
Source
Journal: Neurocomputing
[Elsevier BV]
Date: 2021-03-11
Volume: 443, pp. 35-46
Citations: 35
Identifier
DOI:10.1016/j.neucom.2021.02.088
Abstract
Visible-infrared person re-identification (VI-ReID) aims to match images of the same pedestrian across different modalities, a challenging task in video surveillance. Compared to RGB-based re-identification (Re-ID), which has sufficient single-modality training samples, VI-ReID suffers from imbalanced dual-modality data, which degrades the accuracy of deep learning classifiers. To this end, we present an image modality translation (IMT) network that learns to generate translated-modality images from given modalities. It performs image modality translation by means of a cycle-consistent adversarial network (CycleGAN) and serves as a data augmentation tool that restores balance to the imbalanced training images. Concretely, our method consists of two steps: first, we train the IMT network on real images and generate target-modality samples to enlarge the training dataset and increase its diversity. Then the source images and the modality-translated images are combined to train a Re-ID CNN model, improving cross-modality retrieval performance. To validate the effectiveness of the proposed approach, we evaluate it on the SYSU-MM01 and RegDB datasets. The experimental results indicate that our method is significantly more accurate than state-of-the-art methods.
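The abstract describes a two-step pipeline: translate images across modalities with a CycleGAN-style generator, then train a Re-ID CNN on the union of real and translated images. Below is a minimal PyTorch sketch of that flow; the Generator layers, the name G_v2i, and the toy Re-ID head are placeholder assumptions for illustration, not the authors' actual IMT architecture or training code.

# Hypothetical sketch of the two-step pipeline described in the abstract.
# Architectures below are placeholders; only the overall flow follows the text.
import torch
import torch.nn as nn

# Step 1: a CycleGAN-style generator translates images from one modality to
# the other (here: visible RGB -> pseudo-infrared). In the paper this is
# trained adversarially with cycle consistency; we assume a trained one.
class Generator(nn.Module):  # placeholder stand-in for the IMT generator
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )
    def forward(self, x):
        return self.net(x)

G_v2i = Generator()  # visible -> infrared (assumed pre-trained)

# Step 2: augment the imbalanced dual-modality training set with the
# translated images, then train a Re-ID CNN on the combined data.
visible = torch.randn(8, 3, 256, 128)           # dummy RGB batch
labels = torch.randint(0, 10, (8,))             # dummy identity labels
with torch.no_grad():
    pseudo_ir = G_v2i(visible)                  # modality-translated samples

images = torch.cat([visible, pseudo_ir], dim=0)  # source + translated images
targets = torch.cat([labels, labels], dim=0)     # translation keeps identity

reid = nn.Sequential(                            # toy Re-ID classifier
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)
loss = nn.CrossEntropyLoss()(reid(images), targets)
loss.backward()

Because modality translation preserves identity, the translated samples reuse the labels of their source images; this is what lets them rebalance the dual-modality training set rather than merely enlarge it.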