Recalling Unknowns Without Losing Precision: An Effective Solution to Large Model-Guided Open World Object Detection

Keywords: computer science; computer vision; artificial intelligence; object detection; object (grammar); image processing; pattern recognition (psychology); algorithm; image (mathematics)
Authors
Yulin He, Wei Chen, Siqi Wang, Tianrui Liu, Meng Wang
Source
Journal: IEEE Transactions on Image Processing [Institute of Electrical and Electronics Engineers]
Volume/Pages: 34: 729-742 · Cited by: 10
Identifier
DOI:10.1109/tip.2024.3459589
Abstract

Open World Object Detection (OWOD) aims to adapt object detection to an open-world environment, so as to detect unknown objects and learn knowledge incrementally. Existing OWOD methods typically leverage training sets with a relatively small number of known objects. Due to the absence of generic object knowledge, they fail to comprehensively perceive objects beyond the scope of the training sets. Recent advancements in large vision models (LVMs), trained on large-scale data, offer a promising opportunity to harness rich generic knowledge for the fundamental advancement of OWOD. Motivated by the Segment Anything Model (SAM), a prominent LVM lauded for its exceptional ability to segment generic objects, we first demonstrate the feasibility of employing SAM for OWOD and establish the very first SAM-guided OWOD baseline solution. Subsequently, we identify and address two fundamental challenges in SAM-guided OWOD and propose a pioneering SAM-Guided Robust Open-world Detector (SGROD), which can significantly improve the recall of unknown objects without losing precision on known objects. Specifically, the two challenges in SAM-guided OWOD are: 1) noisy labels caused by the class-agnostic nature of SAM; 2) precision degradation on known objects when more unknown objects are recalled. For the first problem, we propose a dynamic label assignment (DLA) method that adaptively selects confident labels from SAM during training, markedly reducing the impact of noise. For the second problem, we introduce cross-layer learning (CLL) and SAM-based negative sampling (SNS), which enable SGROD to avoid precision loss by learning robust decision boundaries for objectness and classification. Experiments on public datasets show that SGROD not only improves the recall of unknown objects by a large margin (~20%), but also preserves highly competitive precision on known objects. The program codes are available at https://github.com/harrylin-hyl/SGROD.
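The dynamic label assignment (DLA) idea described above can be illustrated with a minimal sketch. All names and the linear-ramp thresholding scheme here are illustrative assumptions, not the paper's actual criterion: SAM's class-agnostic box proposals (with their predicted-confidence scores) are filtered by a threshold that tightens as training progresses, so that noisy pseudo-labels are admitted early but pruned away later.

```python
def select_confident_labels(proposals, step, total_steps,
                            t_start=0.5, t_end=0.9):
    """Keep only SAM box proposals whose score clears a confidence
    threshold that ramps linearly from t_start to t_end over training.
    `proposals` is a list of (box, score) pairs. Illustrative only."""
    frac = step / max(total_steps, 1)
    threshold = t_start + (t_end - t_start) * frac
    return [box for box, score in proposals if score >= threshold]

# Hypothetical SAM proposals as (x1, y1, x2, y2) boxes with scores.
proposals = [((10, 10, 50, 50), 0.95),
             ((0, 0, 20, 20), 0.60),
             ((5, 5, 30, 30), 0.40)]

# Early in training (threshold 0.5): two proposals survive.
print(select_confident_labels(proposals, step=0, total_steps=100))
# Late in training (threshold 0.9): only the most confident survives.
print(select_confident_labels(proposals, step=100, total_steps=100))
```

The design choice sketched here is just one plausible scheduling of "adaptively selects confident labels"; the published method may condition the threshold on model predictions rather than on the training step.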