Computer science
Filter (signal processing)
Modal verb
Key (lock)
Modality (human–computer interaction)
Artificial intelligence
Similarity (geometry)
Pattern recognition (psychology)
Margin (machine learning)
Machine learning
Computer vision
Image (mathematics)
Computer security
Chemistry
Polymer chemistry
Authors
Yongle Huang, Zedong Liu, Shijie Sun, Ningning Cui, Jianxin Li
Identifier
DOI: 10.1109/TNNLS.2025.3577292
Abstract
Effectively bridging the gap between the visual and textual modalities has long been a key challenge in cross-modal retrieval. Fine-grained matching approaches improve performance by precisely aligning salient region features in the visual modality with word embeddings in the textual modality. However, effectively and efficiently filtering out irrelevant features (e.g., irrelevant background regions and non-meaningful prepositions) in both modalities remains a significant challenge. Furthermore, capturing key cross-modal relationships while minimizing misalignment interference is crucial for effective cross-modal retrieval. In this work, we propose a novel approach called the selective filter and alignment network (SFAN) to tackle these challenges. First, we propose modality-specific selective filter modules (SFMs) that selectively and implicitly filter out redundant information within each modality. We then propose a state-space model (SSM)-based selective alignment module (SAM) that selectively captures key correspondences and reduces the disturbance from irrelevant associations. Finally, we apply a fusion operation to combine the embeddings from the SFM and SAM into the final embeddings used for similarity computation. Extensive experiments on the Flickr30k, MS-COCO, and MSR-VTT datasets show that SFAN learns robust patterns and outperforms state-of-the-art (SOTA) cross-modal retrieval methods by a wide margin.
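To make the three-stage pipeline in the abstract concrete, the following is a minimal, hypothetical PyTorch sketch, not the authors' implementation. The gating-based filter, the cross-attention stand-in for the SSM-based alignment (the paper's SAM uses state-space models, which are not reproduced here), the averaging fusion, and all names and dimensions are illustrative assumptions.

```python
# Hypothetical SFAN-style pipeline sketch; module designs are assumptions,
# not the paper's actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelectiveFilterModule(nn.Module):
    """Assumed SFM: a learned sigmoid gate that softly suppresses
    irrelevant region/word features within one modality."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, x):            # x: (batch, tokens, dim)
        return x * self.gate(x)      # implicit, differentiable filtering

class SelectiveAlignmentModule(nn.Module):
    """Stand-in for the SSM-based SAM: plain cross-attention is used here
    as a proxy for selective cross-modal alignment."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, query, context):
        out, _ = self.attn(query, context, context)
        return out

class SFANSketch(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.sfm_img = SelectiveFilterModule(dim)   # modality-specific SFMs
        self.sfm_txt = SelectiveFilterModule(dim)
        self.sam = SelectiveAlignmentModule(dim)

    def forward(self, img_feats, txt_feats):
        # 1) filter redundant information within each modality
        v = self.sfm_img(img_feats)
        t = self.sfm_txt(txt_feats)
        # 2) selectively align each modality against the other
        v_aligned = self.sam(v, t)
        t_aligned = self.sam(t, v)
        # 3) fuse filtered and aligned embeddings, pool, and normalize
        v_emb = F.normalize((v + v_aligned).mean(dim=1), dim=-1)
        t_emb = F.normalize((t + t_aligned).mean(dim=1), dim=-1)
        return v_emb @ t_emb.T       # cosine similarity matrix

model = SFANSketch()
sims = model(torch.randn(8, 36, 256),   # e.g., 36 region features per image
             torch.randn(8, 20, 256))   # e.g., 20 word embeddings per caption
print(sims.shape)                        # (8, 8) image-text similarities
```

In retrieval, each row/column of the similarity matrix would be ranked to retrieve captions for an image or images for a caption; a contrastive or triplet loss over this matrix would be the usual training objective.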