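Computer science
Contrast (vision)
Artificial intelligence
Feature (linguistics)
Hash function
Modal verb
Fusion
Deep learning
Pattern recognition (psychology)
Materials science
Computer security
Linguistics
Philosophy
Polymer chemistry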
Authors
Guochang Song, Wendong Zhang, Bo Wang
Identifier
DOI:10.1109/icicml60161.2023.10424934
Abstract
In this paper, we propose CLFFH, a deep cross-modal hashing method based on contrastive learning and feature fusion. Existing deep cross-modal hashing methods have two shortcomings. First, they use high-level feature representations while ignoring the semantic loss incurred during training and the inter-modal and intra-modal semantic correlations. Second, they focus on similarity computation between individual samples without taking into account the overall role of label-similar samples. To address these problems, CLFFH fuses the middle-layer features of the image and text modalities to compensate for the semantic loss in the feature learning process, and sets up a memory bank that stores the fused hash representations so that inter-modal and intra-modal semantic correlations are taken into account. In addition, it introduces the idea of contrastive learning: training proceeds from the perspective of label-similar samples as a whole rather than individual samples, which enhances the discriminative power of the hash codes. Experimental results show that the model obtains a significant improvement in average retrieval accuracy on three public datasets: Mirflickr25k, NUS-WIDE, and COCO2014.
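The abstract does not give the loss function, so the following is only a minimal sketch of how a label-aware contrastive objective over a memory bank might look. Everything in it is an assumption made for illustration and is not taken from the paper: the function name `label_contrastive_loss`, the cosine-similarity logits, the `temperature` parameter, and the rule that two samples count as positives when they share at least one label. It uses NumPy only and treats the stored codes as relaxed (real-valued) hash representations.

```python
import numpy as np

def label_contrastive_loss(codes, labels, bank_codes, bank_labels, temperature=0.5):
    """Hypothetical label-aware contrastive loss (not the paper's exact formulation).

    codes:       (B, K) relaxed hash codes of the current batch
    labels:      (B, C) multi-hot label vectors of the batch
    bank_codes:  (N, K) fused hash representations held in the memory bank
    bank_labels: (N, C) multi-hot label vectors of the bank entries
    """
    eps = 1e-12
    # Cosine similarity between every batch code and every bank entry.
    q = codes / np.maximum(np.linalg.norm(codes, axis=1, keepdims=True), eps)
    m = bank_codes / np.maximum(np.linalg.norm(bank_codes, axis=1, keepdims=True), eps)
    logits = (q @ m.T) / temperature                      # shape (B, N)

    # Assumption: bank entries sharing at least one label are positives.
    positives = (labels @ bank_labels.T) > 0              # shape (B, N)

    # Numerically stable log-softmax over the bank dimension.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Average the log-probability over the whole positive set of each query,
    # so the group of label-similar samples is pulled together as a whole.
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        return 0.0
    per_query = -(log_prob * positives).sum(axis=1)[valid] / pos_count[valid]
    return float(per_query.mean())
```

In this sketch, a query whose code aligns with the bank entries that share its labels yields a lower loss than a query whose code aligns with entries carrying different labels, which is the behavior a group-level contrastive objective is meant to encourage.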