Recalling Unknowns Without Losing Precision: An Effective Solution to Large Model-Guided Open World Object Detection

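Computer science, Computer vision, Artificial intelligence, Object detection, Object (grammar), Image processing, Pattern recognition (psychology), Algorithm, Image (mathematics)
Authors
Yulin He, Wei Chen, Siqi Wang, Tianrui Liu, Meng Wang
Source
Journal: IEEE Transactions on Image Processing [Institute of Electrical and Electronics Engineers]
Volume/issue: 34: 729-742 Cited by: 10
Identifier
DOI: 10.1109/tip.2024.3459589
Abstract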

Open World Object Detection (OWOD) aims to adapt object detection to an open-world environment, so as to detect unknown objects and learn knowledge incrementally. Existing OWOD methods typically leverage training sets with a relatively small number of known objects. Lacking generic object knowledge, they fail to comprehensively perceive objects beyond the scope of the training sets. Recent advances in large vision models (LVMs), trained on large-scale data, offer a promising opportunity to harness rich generic knowledge for the fundamental advancement of OWOD. Motivated by the Segment Anything Model (SAM), a prominent LVM lauded for its exceptional ability to segment generic objects, we first demonstrate the feasibility of employing SAM for OWOD and establish the very first SAM-Guided OWOD baseline solution. Subsequently, we identify and address two fundamental challenges in SAM-Guided OWOD and propose a pioneering SAM-Guided Robust Open-world Detector (SGROD) method, which can significantly improve the recall of unknown objects without losing precision on known objects. Specifically, the two challenges in SAM-Guided OWOD are: 1) noisy labels caused by the class-agnostic nature of SAM; 2) precision degradation on known objects when more unknown objects are recalled. For the first problem, we propose a dynamic label assignment (DLA) method that adaptively selects confident labels from SAM during training, clearly reducing the impact of noise. For the second problem, we introduce cross-layer learning (CLL) and SAM-based negative sampling (SNS), which enable SGROD to avoid precision loss by learning robust decision boundaries for objectness and classification. Experiments on public datasets show that SGROD not only improves the recall of unknown objects by a large margin (~20%), but also preserves highly competitive precision on known objects. The code is available at https://github.com/harrylin-hyl/SGROD.