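Computer science
Artificial intelligence
Convolutional neural network
Polygon (computer graphics)
Representation (politics)
Pattern recognition (psychology)
Grid
Shape analysis (program analysis)
Context (archaeology)
Shape context
Voxel
Architecture
Solid modeling
Computer vision
Image (mathematics)
Mathematics
Static analysis
Art
Telecommunications
Paleontology
Visual arts
Geometry
Frame (networking)
Politics
Political science
Law
Biology
Programming language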
Authors
Hang Su,Subhransu Maji,Evangelos Kalogerakis,Erik Learned-Miller
Source
Journal: Cornell University - arXiv
Date: 2015-12-01
Citations: 1993
Identifiers
DOI: 10.1109/iccv.2015.114
Abstract
A longstanding question in computer vision concerns the representation of 3D shapes for recognition: should 3D shapes be represented with descriptors operating on their native 3D formats, such as voxel grid or polygon mesh, or can they be effectively represented with view-based descriptors? We address this question in the context of learning to recognize 3D shapes from a collection of their rendered views on 2D images. We first present a standard CNN architecture trained to recognize the shapes' rendered views independently of each other, and show that a 3D shape can be recognized even from a single view at an accuracy far higher than using state-of-the-art 3D shape descriptors. Recognition rates further increase when multiple views of the shapes are provided. In addition, we present a novel CNN architecture that combines information from multiple views of a 3D shape into a single and compact shape descriptor offering even better recognition performance. The same architecture can be applied to accurately recognize human hand-drawn sketches of shapes. We conclude that a collection of 2D views can be highly informative for 3D shape recognition and is amenable to emerging CNN architectures and their derivatives.
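The idea of combining information from multiple views into a single compact descriptor can be illustrated with a minimal sketch. The abstract does not say how the views are combined, so the element-wise maximum used here is an assumption made for illustration, and the function name `pool_views` and the toy feature values are hypothetical. In practice each row would be a feature vector produced by a CNN from one rendered view.

```python
import numpy as np

def pool_views(view_features: np.ndarray) -> np.ndarray:
    """Aggregate per-view features of shape (n_views, dim) into one (dim,) descriptor."""
    if view_features.ndim != 2:
        raise ValueError("expected an array of shape (n_views, dim)")
    # Element-wise max over the view axis.
    # ASSUMPTION: max is one possible aggregation; the abstract does not name one.
    return view_features.max(axis=0)

# Toy example: 3 hypothetical views with 4-dimensional features each.
views = np.array([
    [0.1, 0.9, 0.0, 0.3],
    [0.5, 0.2, 0.4, 0.3],
    [0.2, 0.1, 0.8, 0.7],
])
descriptor = pool_views(views)
print(descriptor)
```

An element-wise maximum gives a descriptor whose size does not depend on the number of views and whose value does not depend on the order in which the views are supplied.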