General 3D Vision-Language Model With Fast Rendering and Pre-Training Vision-Language Alignment

Computer science, Artificial intelligence, Computer vision, Rendering (computer graphics), Machine vision, Natural language processing
Authors
Kangcheng Liu, Yong-Jin Liu, Baoquan Chen
Source
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence [IEEE Computer Society]
Volume (Issue): 47 (9): 7352-7368
Identifier
DOI: 10.1109/tpami.2025.3566593
Abstract

Prevailing vision-language models have made remarkable progress in 3D scene understanding, but they are trained in a closed-set setting with full labels. The major bottleneck for current 3D scene recognition approaches in robotic applications is that these models cannot recognize unseen novel classes beyond the training categories in diverse real-world settings such as robot manipulation and robot navigation. Meanwhile, current state-of-the-art 3D scene understanding approaches require a large number of high-quality labels to train neural networks and perform well only in a fully supervised manner. There is therefore an urgent need for a framework that applies to both 3D point cloud segmentation and detection, particularly when labels are scarce. This work presents a generalized and straightforward framework for 3D scene understanding when labeled scenes are limited. To extract knowledge of novel categories from pre-trained vision-language models, we propose a hierarchical feature-aligned pre-training and knowledge distillation strategy that distills meaningful information from large-scale vision-language models and benefits open-vocabulary scene understanding tasks. To leverage boundary information, we propose a novel energy-based loss with boundary awareness that benefits from region-level boundary predictions. To encourage latent instance discrimination while maintaining efficiency, we propose an unsupervised region-level semantic contrastive learning scheme for point clouds, which uses confident predictions of the neural network to discriminate the intermediate feature embeddings at multiple stages.
In the limited reconstruction case, our proposed approach, termed WS3D++, ranks 1st on the large-scale ScanNet benchmark on both semantic segmentation and instance segmentation. WS3D++ also achieves state-of-the-art data-efficient learning performance on the other large-scale real-scene indoor and outdoor datasets, S3DIS and SemanticKITTI. Extensive experiments on both indoor and outdoor scenes demonstrate the effectiveness of our approach in both data-efficient learning and open-world few-shot learning.