
Lowis3D: Language-Driven Open-World Instance-Level 3D Scene Understanding

Computer science; Artificial intelligence; Natural language processing; Computer vision
Authors
Runyu Ding, Jihan Yang, Chuhui Xue, Wenqing Zhang, Song Bai, Xiaojuan Qi
Source
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence [IEEE Computer Society]
Volume (issue): 46 (12): 8517-8533 Citations: 6
Identifier
DOI: 10.1109/tpami.2024.3410324
Abstract

Open-world instance-level scene understanding aims to locate and recognize unseen object categories that are not present in the annotated dataset. This task is challenging because the model needs to both localize novel 3D objects and infer their semantic categories. A key factor for the recent progress in 2D open-world perception is the availability of large-scale image-text pairs from the Internet, which cover a wide range of vocabulary concepts. However, this success is hard to replicate in 3D scenarios due to the scarcity of 3D-text pairs. To address this challenge, we propose to harness pre-trained vision-language (VL) foundation models that encode extensive knowledge from image-text pairs to generate captions for multi-view images of 3D scenes. This allows us to establish explicit associations between 3D shapes and semantic-rich captions. Moreover, to enhance the fine-grained visual-semantic representation learning from captions for object-level categorization, we design hierarchical point-caption association methods to learn semantic-aware embeddings that exploit the 3D geometry between 3D points and multi-view images. In addition, to tackle the localization challenge for novel classes in the open-world setting, we develop debiased instance localization, which involves training object grouping modules on unlabeled data using instance-level pseudo supervision. This significantly improves the generalization capabilities of instance grouping and, thus, the ability to accurately locate novel objects. We conduct extensive experiments on 3D semantic, instance, and panoptic segmentation tasks, covering indoor and outdoor scenes across three datasets. Our method outperforms baseline methods by a significant margin in semantic segmentation (e.g. 34.5% ∼ 65.3%), instance segmentation (e.g. 21.8% ∼ 54.0%), and panoptic segmentation (e.g. 14.7% ∼ 43.3%). Code will be available.
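
The abstract says captions generated for multi-view images are tied to 3D points through the geometry between points and views. As a rough illustration only, the sketch below shows one way a view-level association could be formed: project points through a pinhole camera and pair the caption with the points that land inside the image. This is not the authors' implementation. The function names, the camera parameters, and the assumption that points are already expressed in the camera frame are all hypothetical, and occlusion is ignored.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def visible_point_indices(
    points: Sequence[Point],
    fx: float, fy: float, cx: float, cy: float,
    width: int, height: int,
) -> List[int]:
    """Return indices of camera-frame points that project inside the image.

    Hypothetical helper: assumes a pinhole camera and ignores occlusion.
    """
    visible = []
    for i, (x, y, z) in enumerate(points):
        if z <= 0:  # point lies behind the camera plane
            continue
        u = fx * x / z + cx  # horizontal pixel coordinate
        v = fy * y / z + cy  # vertical pixel coordinate
        if 0 <= u < width and 0 <= v < height:
            visible.append(i)
    return visible


def associate_caption(
    points: Sequence[Point], caption: str, **camera: float
) -> Dict[str, object]:
    """Pair one image caption with the points seen in that image (assumed scheme)."""
    return {
        "caption": caption,
        "point_indices": visible_point_indices(points, **camera),
    }


# Toy data: four points in the camera frame and a made-up caption.
points = [(0.0, 0.0, 2.0), (5.0, 0.0, 1.0), (0.0, 0.0, -1.0), (0.1, -0.1, 1.0)]
camera = dict(fx=100.0, fy=100.0, cx=64.0, cy=48.0, width=128, height=96)
pair = associate_caption(points, "a chair next to a desk", **camera)
print(pair["point_indices"])  # [0, 3]
```

In this toy case the second point projects outside the image and the third lies behind the camera, so only the first and fourth points are paired with the caption. A pairing of this kind is what would let a caption supervise the embeddings of the points it covers.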