Computer Science
Artificial Intelligence
3D Reconstruction
Feature
Iterative Reconstruction
Computer Vision
Geometry
Embedding
Active Shape Model
Representation
Mathematics
Optics
Segmentation
Authors
Xuancheng Zhang, Rui Ma, Changqing Zhang, Minghao Zhang, Xibin Zhao, Yue Gao
Identifier
DOI: 10.1109/tpami.2021.3090917
Abstract
Reconstructing a 3D shape from a single-view image using deep learning has become increasingly popular recently. Most existing methods only focus on reconstructing the 3D shape geometry based on image constraints. The lack of explicit modeling of structure relations among shape parts yields low-quality reconstruction results for structure-rich man-made shapes. In addition, conventional 2D-3D joint embedding architecture for image-based 3D shape reconstruction often omits the specific view information from the given image, which may lead to degraded geometry and structure reconstruction. We address these problems by introducing VGSNet, an encoder-decoder architecture for view-aware joint geometry and structure learning. The key idea is to jointly learn a multimodal feature representation of 2D image, 3D shape geometry and structure so that both geometry and structure details can be reconstructed from a single-view image. To this end, we explicitly represent 3D shape structures as part relations and employ image supervision to guide the geometry and structure reconstruction. Trained with pairs of view-aligned images and 3D shapes, the VGSNet implicitly encodes the view-aware shape information in the latent feature space. Qualitative and quantitative comparisons with the state-of-the-art baseline methods as well as ablation studies demonstrate the effectiveness of the VGSNet for structure-aware single-view 3D shape reconstruction.
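The core idea of the abstract — encoding a 2D image feature together with 3D geometry and structure features into one shared latent code, from which both geometry and structure are decoded — can be sketched in a few lines. This is a minimal illustration only, not the paper's implementation: all dimensions, weight matrices, and function names below are hypothetical stand-ins for the learned networks in VGSNet.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature dimensions (not taken from the paper).
D_IMG, D_GEO, D_STR, D_LATENT = 128, 256, 64, 100

def encode_joint(img_feat, geo_feat, str_feat, W_enc):
    """Fuse image, geometry, and structure features into one shared latent code."""
    fused = np.concatenate([img_feat, geo_feat, str_feat])
    return np.tanh(W_enc @ fused)

def decode(latent, W_geo, W_str):
    """Decode the shared latent into geometry and structure predictions."""
    return W_geo @ latent, W_str @ latent

# Random stand-ins for learned weights and extracted features.
W_enc = rng.standard_normal((D_LATENT, D_IMG + D_GEO + D_STR)) * 0.01
W_geo = rng.standard_normal((D_GEO, D_LATENT)) * 0.01
W_str = rng.standard_normal((D_STR, D_LATENT)) * 0.01

img_feat = rng.standard_normal(D_IMG)
geo_feat = rng.standard_normal(D_GEO)
str_feat = rng.standard_normal(D_STR)

z = encode_joint(img_feat, geo_feat, str_feat, W_enc)
geo_out, str_out = decode(z, W_geo, W_str)
print(z.shape, geo_out.shape, str_out.shape)  # (100,) (256,) (64,)
```

Because a single latent `z` drives both decoder heads, the image constraints supervise geometry and structure jointly rather than geometry alone — the property the abstract credits for better reconstruction of structure-rich man-made shapes.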