Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection

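Computer science; Modal verb; Natural language processing; Artificial intelligence; Vocabulary; Object detection; Context analysis; Context (archaeology); Machine learning; Linguistics; Pattern recognition (psychology); Philosophy; Chemistry; Government (linguistics); Polymer chemistry; Paleontology; Biology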

Authors

Yifan Xu, Mengdan Zhang, Xiaoshan Yang, Changsheng Xu

Source

Journal: Cornell University - arXiv; Date: 2023-01-01

Links

arxiv.org, datacite.org, doi.org

Identifier

DOI: 10.48550/arxiv.2308.15846

Abstract

In this paper, we explore, for the first time, multi-modal contextual knowledge that helps in understanding novel categories for open-vocabulary object detection (OVD). Multi-modal contextual knowledge refers to the joint relationship across regions and words. However, it is challenging to incorporate such knowledge into OVD, because previous detection frameworks fail to jointly model it: object detectors only support vision inputs, and no caption description is provided at test time. To this end, we propose a multi-modal contextual knowledge distillation framework, MMC-Det, to transfer the learned contextual knowledge from a teacher fusion transformer with diverse multi-modal masked language modeling (D-MLM) to a student detector. D-MLM is realized by applying an object divergence constraint to traditional multi-modal masked language modeling (MLM), in order to extract fine-grained region-level visual contexts, which are vital to object detection. Extensive experiments on various detection datasets show the effectiveness of our multi-modal context learning strategy, with our approach outperforming recent state-of-the-art methods.
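The abstract describes transferring contextual knowledge from a teacher to a student detector but does not give the loss. As a rough illustration of what a feature-level distillation objective can look like, the sketch below penalizes the cosine distance between paired teacher and student region features. This is a generic distillation loss, not necessarily the one used in MMC-Det; the function names (`l2_normalize`, `distillation_loss`) and the toy feature values are assumptions made for this example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def distillation_loss(teacher_feats, student_feats):
    """Mean (1 - cosine similarity) over paired region features.

    Hypothetical helper: each argument is a list of per-region feature
    vectors, with the i-th teacher vector paired to the i-th student vector.
    """
    if len(teacher_feats) != len(student_feats):
        raise ValueError("teacher and student must describe the same regions")
    if not teacher_feats:
        return 0.0
    total = 0.0
    for t, s in zip(teacher_feats, student_feats):
        t_hat, s_hat = l2_normalize(t), l2_normalize(s)
        total += 1.0 - sum(a * b for a, b in zip(t_hat, s_hat))
    return total / len(teacher_feats)


# Toy example: two regions with 3-dimensional features (made-up numbers).
teacher = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
student_aligned = [[3.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
student_orthogonal = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]

loss_aligned = distillation_loss(teacher, student_aligned)
loss_orthogonal = distillation_loss(teacher, student_orthogonal)
```

Because the features are normalized first, the loss depends only on direction: a student whose region features point the same way as the teacher's incurs a lower loss than one whose features are orthogonal to them.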
