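Hypersphere
Computer science
Artificial intelligence
Feature learning
Inference
Pattern recognition (psychology)
Robustness (evolution)
Feature extraction
Feature (linguistics)
Embedding
MNIST database
Machine learning
Deep learning
Philosophy
Chemistry
Gene
Biochemistry
Linguistics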
Authors
W Zhang,Jihao Li,Shuoke Li,Jialiang Chen,Wenkai Zhang,Xin Gao,Xian Sun
Identifier
DOI:10.1109/tgrs.2023.3318227
Abstract
Remote sensing cross-modal text-image retrieval (RSCTIR) is a flexible, human-centered approach to retrieving rich information from different modalities, and it has attracted considerable attention in recent years. It remains challenging because current methods usually ignore the varying difficulty levels of different sample pairs, which stem from the large image distribution difference and the high text similarity in the remote sensing (RS) field. Therefore, in this paper, we propose an innovative hypersphere-based visual semantic alignment (HVSA) network via curriculum learning. Specifically, we first design an adaptive alignment strategy based on curriculum learning, which aligns RS image-text pairs from easy to hard. Sample pairs with different levels of difficulty are treated unequally, and we obtain a better embedding representation when projecting the features onto the unit hypersphere. Then, to measure the robustness of cross-modal feature alignment on the unit hypersphere, we introduce the feature uniformity strategy. It reduces the occurrence of mismatching cases and improves generalization performance. Finally, we design the key-entity attention (KEA) mechanism to alleviate the problem of information imbalance among different modalities. KEA extracts information about the key entity that is aligned with the textual information. Despite its conciseness, our framework achieves state-of-the-art performance on classical datasets of RSCTIR tasks while enjoying faster inference. The summed recall of HVSA on RSICD and RSITMD is 120.97 and 198.94, which is 2.50 and 10.49 points ahead of the current best methods, respectively. Extensive experiments demonstrate the competitiveness of our method. The code has been released at https://github.com/ZhangWeihang99/HVSA.
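As a rough illustration of the two ideas the abstract names (projecting features onto the unit hypersphere, then checking alignment and uniformity there), the following is a minimal sketch in plain Python. It is not the HVSA implementation from the released repository. The function names, the toy vectors, and the temperature `t=2.0` are assumptions chosen for illustration, and the metrics are the commonly used alignment (mean squared distance between matched pairs) and uniformity (log of the mean Gaussian potential over all pairs) measures, which may differ from the losses used in the paper.

```python
import math


def l2_normalize(v):
    """Project a vector onto the unit hypersphere by dividing by its L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment(img_embs, txt_embs):
    """Mean squared distance between matched image/text embeddings.

    Lower values mean matched pairs sit closer together on the sphere.
    """
    pairs = list(zip(img_embs, txt_embs))
    return sum(sq_dist(a, b) for a, b in pairs) / len(pairs)


def uniformity(embs, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs.

    Lower values mean the embeddings are spread more evenly over the sphere.
    The temperature t=2.0 is an illustrative assumption.
    """
    n = len(embs)
    if n < 2:
        raise ValueError("need at least two embeddings")
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += math.exp(-t * sq_dist(embs[i], embs[j]))
            count += 1
    return math.log(total / count)


# Toy data (hypothetical): vectors spread around the circle vs. bunched together.
spread = [l2_normalize(v) for v in ([1, 0], [-1, 0], [0, 1], [0, -1])]
clustered = [l2_normalize(v) for v in ([1, 0], [1, 0.1], [1, -0.1], [1, 0.05])]
```

In this sketch, `uniformity(spread)` is lower than `uniformity(clustered)`, reflecting that evenly spread embeddings leave more room to separate mismatched pairs, while `alignment` of two embedding lists that point in the same directions is close to zero.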