Interacting-Enhancing Feature Transformer for Cross-Modal Remote-Sensing Image and Text Retrieval

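Keywords: Computer science; Feature (linguistics); Artificial intelligence; Convolutional neural network; Embedding; Block (permutation group theory); Feature vector; Feature learning; Exploit; Modal verb; Pattern recognition (psychology); Image retrieval; Computer vision; Deep learning; Image (mathematics); Philosophy; Linguistics; Chemistry; Geometry; Mathematics; Computer security; Polymer chemistry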
Authors
Xu Tang, Yijing Wang, Jingjing Ma, Xiangrong Zhang, Fang Liu, Licheng Jiao
Source
Journal: IEEE Transactions on Geoscience and Remote Sensing [Institute of Electrical and Electronics Engineers]
Volume/issue: 61: 1-15 Citations: 20
Identifier
DOI: 10.1109/tgrs.2023.3280546
Abstract

Cross-modal remote sensing image-text retrieval (CMRSITR) is a challenging topic in the remote sensing (RS) community. It has gained growing attention because it can be used flexibly in many practical applications. In the current deep learning era, with the help of deep convolutional neural networks (DCNNs), many successful CMRSITR methods have been proposed. Most of them first learn features from RS images and texts separately. The obtained visual and textual features are then mapped into a common space for the final retrieval. These operations are feasible; however, two difficulties remain. One is that the semantics within the visual and textual features are misaligned because the two are learned independently. The other is that the deep links between RS images and texts cannot be fully explored by a simple common-space mapping. To overcome these challenges, we propose a new model named interacting-enhancing feature transformer (IEFT) for CMRSITR, which treats the RS images and texts as a whole. First, a simple feature embedding module (FEM) is developed to map images and texts into the visual and textual feature spaces. Second, an information interacting-enhancing module (IIEM) is designed to simultaneously model the inner relationships between RS images and texts and enhance the visual features. IIEM consists of three feature interacting-enhancing (FIE) blocks, each of which contains an inter-modality relationship interacting (IMRI) sub-block and a visual feature enhancing (VFE) sub-block. IMRI exploits the hidden relations between cross-modal data, while VFE improves the visual features. By combining them, semantic bias can be mitigated and the complex contents of RS images can be studied. Finally, a retrieval module (RM) is constructed to generate the matching scores that decide the search results. Extensive experiments are conducted on four public RS data sets. The positive results demonstrate that our IEFT can achieve superior retrieval performance compared with many existing methods. Our source code is available at https://github.com/TangXu-Group/Cross-modal-remote-sensing-image-and-text-retrieval-models/tree/main/IEFT.
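
The data flow described in the abstract (token features, three stacked FIE blocks each made of an IMRI step and a VFE step, then a matching score) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how IMRI, VFE, or RM work internally, so the sketch assumes cross-attention for IMRI, self-attention for VFE, and cosine similarity of mean-pooled tokens for RM, and it uses random vectors in place of the FEM output. All function names (`attend`, `fie_block`, `matching_score`) are hypothetical, and there are no learned weights.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, context):
    """Scaled dot-product attention plus a residual connection (no learned weights)."""
    scale = math.sqrt(len(queries[0]))
    scores = matmul(queries, transpose(context))
    weights = [softmax([s / scale for s in row]) for row in scores]
    mixed = matmul(weights, context)
    return [[q + m for q, m in zip(q_row, m_row)] for q_row, m_row in zip(queries, mixed)]


def fie_block(visual, textual):
    """One FIE block: an assumed IMRI step followed by an assumed VFE step."""
    # IMRI stand-in (assumption): each modality attends to the other.
    new_visual = attend(visual, textual)
    new_textual = attend(textual, visual)
    # VFE stand-in (assumption): visual tokens attend to themselves.
    new_visual = attend(new_visual, new_visual)
    return new_visual, new_textual


def mean_pool(tokens):
    return [sum(col) / len(tokens) for col in zip(*tokens)]


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def matching_score(visual, textual, num_blocks=3):
    """Run the stacked FIE blocks, then score the pair (RM stand-in: cosine)."""
    for _ in range(num_blocks):
        visual, textual = fie_block(visual, textual)
    return cosine(mean_pool(visual), mean_pool(textual))


if __name__ == "__main__":
    rng = random.Random(0)
    # Random stand-ins for FEM output: 5 visual tokens and 3 text tokens, dimension 4.
    image_tokens = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]
    text_tokens = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)]
    print(matching_score(image_tokens, text_tokens))
```

The point of the sketch is the contrast drawn in the abstract: the score is computed only after the two modalities have exchanged information inside every block, instead of after two independent encoders followed by a single common-space projection.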