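Point cloud
Computer science
Artificial intelligence
Fuse (electrical)
Cloud computing
Point (geometry)
Computer vision
Deep learning
Image (mathematics)
Set (abstract data type)
Machine learning
Geometry
Mathematics
Operating system
Electrical engineering
Programming language
Engineering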
Authors
Hao-Yang Peng,Baopu Li,Bo Zhang,Xin Chen,Tao Chen,Hongyuan Zhu
Identifier
DOI:10.1109/tcsvt.2023.3343495
Abstract
Point cloud based 3D deep models are widely used in areas such as autonomous driving and household robotics. Inspired by recent prompt learning in natural language processing, this work proposes a novel Multi-view Vision Fusion Network (MvNet) for few-shot 3D point cloud classification. MvNet investigates whether off-the-shelf 2D pre-trained models can be leveraged for few-shot classification, which would alleviate the over-dependence of existing baseline models on large-scale annotated 3D point cloud data. Specifically, MvNet first encodes a 3D point cloud into multi-view image features for a number of different views. Then, a novel multi-view prompt fusion module fuses information from the different views to bridge the gap between 3D point cloud data and 2D pre-trained models. A set of 2D image prompts can then be derived that better describes the suitable prior knowledge for a large-scale pre-trained image model for few-shot 3D point cloud classification. Extensive experiments on the ModelNet, ScanObjectNN, and ShapeNet datasets demonstrate that MvNet achieves new state-of-the-art performance for few-shot 3D point cloud classification. The source code of this work is available at https://github.com/invictus717/MetaTransformer.
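The abstract does not say how the multi-view images are produced, so the following is only a minimal sketch of one common way to turn a point cloud into several 2D views: rotate the cloud about the vertical axis and render an orthographic depth map per view. It is not the authors' implementation. The function names (`rotation_about_y`, `project_to_depth_image`, `multi_view_images`), the choice of orthographic projection, and the parameters `num_views` and `size` are all assumptions made for illustration.

```python
import numpy as np


def rotation_about_y(angle_rad):
    """3x3 rotation matrix about the y (vertical) axis. Hypothetical helper."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])


def project_to_depth_image(points, angle_rad, size=32):
    """Orthographic depth map of an (N, 3) point cloud seen from one view.

    Assumption: the camera looks along +z, so a smaller z means a closer point.
    """
    rotated = points @ rotation_about_y(angle_rad).T
    # Centre the cloud and scale by its bounding radius so every view
    # shares the same scale and all coordinates fall in [0, 1].
    centered = rotated - rotated.mean(axis=0)
    radius = np.linalg.norm(centered, axis=1).max()
    unit = centered / (2.0 * radius + 1e-12) + 0.5
    cols = np.clip((unit[:, 0] * size).astype(int), 0, size - 1)
    rows = np.clip(((1.0 - unit[:, 1]) * size).astype(int), 0, size - 1)
    nearness = 1.0 - unit[:, 2]  # larger value = closer to the camera
    image = np.zeros((size, size))
    # Keep the closest point per pixel (unbuffered element-wise maximum).
    np.maximum.at(image, (rows, cols), nearness)
    return image


def multi_view_images(points, num_views=6, size=32):
    """Stack depth maps from `num_views` evenly spaced azimuth angles."""
    angles = np.linspace(0.0, 2.0 * np.pi, num_views, endpoint=False)
    return np.stack([project_to_depth_image(points, a, size) for a in angles])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.normal(size=(1024, 3))  # synthetic stand-in for a real scan
    views = multi_view_images(cloud, num_views=6, size=32)
    print(views.shape)
```

In a pipeline of the kind the abstract describes, images like these would be passed to a pre-trained 2D backbone to obtain per-view features, which the fusion module would then combine. Those stages are omitted here because the abstract gives no detail about them.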