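Computer science
Dual (grammatical number)
Encoder
Iterative reconstruction
Artificial intelligence
Image (mathematics)
Computer vision
Operating system
Literature
Art
Authors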
Aditya Vallakatla,Ayan Chaudhury
Identifier
DOI:10.1109/icip55913.2025.11084579
Abstract
Reconstructing a 3D point cloud from a single-view image is a demanding and challenging task with applications across computer vision and robotics. Although several approaches have been proposed in the last few years, the problem is still not fully solved. In this work, we introduce BicycleDualNet, a novel architecture that exploits BicycleGAN and a dual image encoder for 3D reconstruction. While existing methods overlook depth or normal information in the reconstruction process, we demonstrate that incorporating depth/normal information in the network can lead to better results. A base encoder using EfficientNet is first trained to match the latent space of a pre-trained point cloud autoencoder. To further improve its performance, BicycleGAN is then used to generate synthetic depth and normal images from the input RGB images. A dual encoder is proposed with two branches, one for RGB images and one for depth/normal images. This dual encoder is trained to predict 3D point clouds through the fixed point cloud decoder. Qualitative and quantitative experiments on both synthetic and real datasets demonstrate that the dual encoder outperforms the state of the art in 3D reconstruction in terms of Chamfer distance and Earth Mover’s distance.
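
For readers unfamiliar with the Chamfer distance mentioned in the abstract, the following is a minimal sketch of one common formulation: the mean squared nearest-neighbour distance from each predicted point to the target cloud, plus the same quantity in the reverse direction. The abstract does not state which variant the authors use (squared or unsquared, summed or averaged), so this choice is an assumption, and the function name `chamfer_distance` is illustrative rather than taken from the paper. The brute-force search is quadratic in the number of points and is meant for illustration only.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def _squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(pred: Sequence[Point], target: Sequence[Point]) -> float:
    """Symmetric Chamfer distance between two point clouds.

    Assumed variant (not confirmed by the abstract): mean of squared
    nearest-neighbour distances, summed over both directions.
    """
    if not pred or not target:
        raise ValueError("both point clouds must be non-empty")

    # Average squared distance from each predicted point to its nearest target point.
    pred_to_target = sum(
        min(_squared_distance(p, q) for q in target) for p in pred
    ) / len(pred)

    # Average squared distance from each target point to its nearest predicted point.
    target_to_pred = sum(
        min(_squared_distance(q, p) for p in pred) for q in target
    ) / len(target)

    return pred_to_target + target_to_pred
```

Identical clouds give a distance of zero, and the value grows as the predicted points drift away from the target surface or leave parts of it uncovered. Earth Mover’s distance, the other metric named in the abstract, instead requires a one-to-one matching between the two clouds and is not covered by this sketch.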