Leverage (statistics)
Computer science
Modal verb
Artificial intelligence
Graphics
Scene graph
Structured prediction
Relation extraction
Natural language processing
Theoretical computer science
Information extraction
Chemistry
Polymer chemistry
Rendering (computer graphics)
Authors
Yu-Feng Huang, Jiji Tang, Zhuo Chen, Rongsheng Zhang, Xinfeng Zhang, Weijie Chen, Zeng Zhao, Tangjie Lv, Zhipeng Hu, Wen Zhang
Source
Journal: Cornell University - arXiv
Date: 2023-01-01
Cited by: 1
Identifier
DOI: 10.48550/arxiv.2305.06152
Abstract
Large-scale vision-language pre-training has achieved significant performance in multi-modal understanding and generation tasks. However, existing methods often perform poorly on image-text matching tasks that require structured representations, i.e., representations of objects, attributes, and relations. As illustrated in Fig.~\ref{fig:case}(a), the models cannot distinguish between "An astronaut rides a horse" and "A horse rides an astronaut". This is because they fail to fully leverage structured knowledge when learning representations in multi-modal scenarios. In this paper, we present Structure-CLIP, an end-to-end framework that integrates Scene Graph Knowledge (SGK) to enhance multi-modal structured representations. First, we use scene graphs to guide the construction of semantic negative examples, placing increased emphasis on learning structured representations. Moreover, a Knowledge-Enhance Encoder (KEE) is proposed to leverage SGK as input to further enhance structured representations. To verify the effectiveness of the proposed framework, we pre-train our model with the aforementioned approaches and conduct experiments on downstream tasks. Experimental results demonstrate that Structure-CLIP achieves state-of-the-art (SOTA) performance on the VG-Attribution and VG-Relation datasets, outperforming the previous multi-modal SOTA model by 12.5% and 4.1%, respectively. Meanwhile, results on MSCOCO indicate that Structure-CLIP significantly enhances structured representations while maintaining its ability to learn general representations. Our code is available at https://github.com/zjukg/Structure-CLIP.
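The central idea of the scene-graph-guided negative examples can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the authors' implementation (see the linked repository for that): it represents a caption's scene graph as a single (subject, relation, object) triple and swaps subject and object to produce a word-identical but semantically reversed hard negative, exactly in the spirit of the "astronaut rides a horse" example. The `Triple` type, field names, and helper functions are all hypothetical.

```python
# Illustrative sketch (not the authors' code) of scene-graph-guided
# semantic negative construction: swapping the subject and object of a
# relation triple turns "an astronaut rides a horse" into the hard
# negative "a horse rides an astronaut".

from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One scene-graph relation: (subject, relation, object)."""
    subject: str   # noun phrase, e.g. "an astronaut"
    relation: str  # predicate, e.g. "rides"
    object: str    # noun phrase, e.g. "a horse"


def to_caption(t: Triple) -> str:
    """Linearize a triple into a simple caption string."""
    return f"{t.subject} {t.relation} {t.object}"


def swapped_negative(t: Triple) -> Triple:
    """Exchange subject and object: same words, different structure."""
    return Triple(subject=t.object, relation=t.relation, object=t.subject)


if __name__ == "__main__":
    pos = Triple("an astronaut", "rides", "a horse")
    print(to_caption(pos))                    # an astronaut rides a horse
    print(to_caption(swapped_negative(pos)))  # a horse rides an astronaut
```

In a contrastive training setup, each (image, positive caption, swapped negative) group would push the image embedding toward the positive and away from the negative, forcing the model to encode relation direction rather than bag-of-words overlap; a real pipeline would obtain the triples from a scene-graph parser rather than hand-written tuples.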