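Computer science
Artificial intelligence
Discriminative
Hash function
Feature learning
Deep learning
Machine learning
Pattern recognition (psychology)
Data mining
Information retrieval
Computer security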
Authors
Peiguang Jing, Hung-Min Sun, Liqiang Nie, Yun Li, Peiguang Jing
Source
Journal: IEEE Transactions on Knowledge and Data Engineering
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume/Issue: 1-12
Identifier
DOI:10.1109/tkde.2023.3337077
Abstract
The pressing need for low storage and high efficiency has significantly propelled the advancement of deep hashing techniques for large-scale search and retrieval tasks. As one of the most prevalent forms of user-generated content, micro-videos usually exhibit more complex multi-modal behavior, which makes multi-label retrieval even more challenging. Existing multi-modal hashing methods tend to prioritize complementarity and consistency in multi-modal fusion while neglecting the completeness problem. In this paper, we propose a deep multi-modal hashing with semantic enhancement (DMHSE) method that effectively integrates complete multi-modal representation learning with discriminative binary coding through collaboration between two distinct encoders, FoldCoder and HashCoder. FoldCoder recasts latent multi-modal representation learning as a degradation process that mimics data transmission. Further, it incorporates a prompt learning paradigm to maximize the utilization of multi-label semantics for guiding representation learning. HashCoder combines pairwise and central constraints to ensure more discriminative hashing results. The pairwise constraint preserves the original local relevance structure, while the central constraint tackles the problem of semantic ambiguity in multi-label data by leveraging the global label distribution. Experimental results demonstrate that DMHSE achieves superior performance in multi-label micro-video retrieval tasks.
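The abstract does not give the loss functions, so the following is only a generic sketch of how a pairwise constraint and a central constraint are often combined in multi-label hashing. It is not the DMHSE formulation. The function names, the inner-product similarity target, the label-averaged hash centers, and the weight `alpha` are all assumptions made for illustration.

```python
import numpy as np

def pairwise_loss(codes, labels):
    """Generic similarity-preserving loss (assumed form, not from the paper).

    codes:  (n, k) relaxed hash codes with entries in [-1, 1]
    labels: (n, c) multi-hot label matrix
    """
    k = codes.shape[1]
    # Two items count as similar if they share at least one label.
    sim = (labels @ labels.T > 0).astype(float)
    target = 2.0 * sim - 1.0            # map {0, 1} to {-1, +1}
    inner = codes @ codes.T / k         # normalized inner product
    return float(np.mean((inner - target) ** 2))

def central_loss(codes, labels, centers):
    """Pull each code toward the average of its labels' hash centers.

    centers: (c, k) one hypothetical hash center per label
    """
    counts = np.maximum(labels.sum(axis=1, keepdims=True), 1)
    weights = labels / counts           # each row sums to 1 for labeled items
    target = weights @ centers          # (n, k) per-item target code
    return float(np.mean((codes - target) ** 2))

def total_loss(codes, labels, centers, alpha=1.0):
    # alpha is a hypothetical trade-off weight between the two terms.
    return pairwise_loss(codes, labels) + alpha * central_loss(codes, labels, centers)
```

In this sketch the pairwise term only looks at relations between items in a batch, while the central term ties every code to label-level anchors, which is one common way to reduce ambiguity when an item carries several labels.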