Keywords
Modal verb, Computer science, Hash function, Exploit, Artificial intelligence, Feature extraction, Modality (human-computer interaction), Transformer, Feature learning, Generative grammar, Data mining, Pattern recognition (psychology), Machine learning, Theoretical computer science, Engineering, Electrical engineering, Voltage, Chemistry, Polymer chemistry, Computer security
Authors
Weihua Ou, Jiaxin Deng, Lei Zhang, Jianping Gou, Quan Zhou
Identifier
DOI:10.1109/tits.2022.3221787
Abstract
Cross-modal hashing is an effective cross-modal retrieval approach because of its low storage cost and high efficiency. However, most existing methods rely on pre-trained networks to extract modality-specific features, ignoring position information and lacking information interaction between modalities. To address these problems, we propose a novel approach, named cross-modal generation and pair correlation alignment hashing (CMGCAH), which introduces a transformer to exploit position information and uses cross-modal generative adversarial networks (GANs) to boost cross-modal information interaction. Concretely, a cross-modal interaction network based on a conditional generative adversarial network and pair correlation alignment networks is proposed to generate cross-modal common representations. In addition, a transformer-based feature extraction network (TFEN) is designed to exploit position information, which can be propagated to the text modality and enforce the common representation to be semantically consistent. Experiments on widely used datasets with text-image modalities show that the proposed method achieves competitive performance compared with many existing methods.
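The retrieval mechanism the abstract relies on can be illustrated with a minimal sketch, assuming random projection matrices as stand-ins for the learned hashing networks (this is not the paper's CMGCAH model): modality-specific features are mapped into a common space, binarized into compact hash codes, and matched by Hamming distance. All feature dimensions and names below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bits = 32  # length of the binary hash code

img_feats = rng.normal(size=(100, 512))  # hypothetical image features
txt_feats = rng.normal(size=(100, 300))  # hypothetical text features

# Stand-ins for the learned modality-specific hashing networks.
W_img = rng.normal(size=(512, n_bits))
W_txt = rng.normal(size=(300, n_bits))

def to_hash(feats, W):
    """Project features into the common space and binarize by sign -> {0,1} codes."""
    return (feats @ W > 0).astype(np.uint8)

img_codes = to_hash(img_feats, W_img)
txt_codes = to_hash(txt_feats, W_txt)

def hamming_retrieve(query_code, db_codes, k=5):
    """Return indices of the k database codes closest in Hamming distance."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists)[:k]

# Cross-modal query: a text hash code retrieves from the image database.
top5 = hamming_retrieve(txt_codes[0], img_codes)
print(top5)
```

Binary codes and Hamming distance are what give cross-modal hashing its low storage cost and high efficiency: each item is stored in a few bytes, and distances reduce to XOR-and-popcount operations.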