Computer science
Computer vision
Image processing
Cloud computing
Artificial intelligence
Point cloud
Diffusion
Image (mathematics)
Physics
Thermodynamics
Operating system
Authors
Yan Zhou, Dewang Ye, Huaidong Zhang, Xuemiao Xu, Huajie Sun, Yufan Xu, Xiangyu Liu, Yuexia Zhou
Identifier
DOI: 10.1109/tip.2025.3539935
Abstract
Single-image 3D shape reconstruction has attracted significant attention with the advance of generative models. Recent studies have used diffusion models to achieve unprecedented shape reconstruction quality. However, these methods perform denoising in a single forward pass at each sampling step, which leads to cumulative errors that severely degrade the geometric consistency between the generated shapes and the input targets and makes it difficult to reconstruct the rich details of complex 3D shapes. Moreover, current methods degrade significantly when only a single image is available as input at test time, since the limited information further harms the quality of 3D shape generation. In this paper, we present a recurrent diffusion framework that aims to improve generation performance during single image-to-shape inference. Rather than denoising in a single forward pass, we recursively refine the noise prediction in a self-rectified manner under the explicit guidance of the input target, thereby markedly suppressing cumulative errors and improving detail modeling. To enhance the network's geometric perception during single-image inference, we further introduce a multi-view training scheme equipped with a view-robust conditional generation mechanism, which effectively improves generation quality even when only a single image is available at inference. The effectiveness of our method is demonstrated through extensive evaluations on two public 3D shape datasets, where it surpasses state-of-the-art methods both qualitatively and quantitatively.
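The abstract gives no implementation details, so the sketch below is only a hypothetical illustration, in PyTorch-style Python, of how recursive, condition-guided noise refinement inside one reverse-diffusion step could look. All names here (ToyEpsNet, recurrent_denoise_step, refine_steps, and the model signature that feeds the previous noise estimate back in) are assumptions for illustration, not the authors' code.

import torch
import torch.nn as nn

class ToyEpsNet(nn.Module):
    # Stand-in noise predictor; the paper's actual conditional network
    # is not described in the abstract.
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 * dim + 1, 256), nn.SiLU(), nn.Linear(256, dim)
        )

    def forward(self, x_t, t, cond, prev_eps):
        t_emb = t.float().view(-1, 1) / 1000.0  # crude timestep embedding
        return self.net(torch.cat([x_t, cond, prev_eps, t_emb], dim=-1))

@torch.no_grad()
def recurrent_denoise_step(model, x_t, t, cond, alphas_bar, refine_steps=3):
    # One reverse-diffusion step in which the noise estimate is re-predicted
    # refine_steps times instead of once; each pass sees its own previous
    # output, so the condition (the input-image embedding) can rectify the
    # error accumulated in the earlier estimate before the update is applied.
    eps = torch.zeros_like(x_t)
    for _ in range(refine_steps):
        eps = model(x_t, t, cond, prev_eps=eps)  # self-rectified re-prediction
    ab_t = alphas_bar[t].view(-1, 1)
    ab_prev = alphas_bar[t - 1].view(-1, 1)
    # Deterministic DDIM-style update using the refined noise estimate.
    x0_hat = (x_t - torch.sqrt(1.0 - ab_t) * eps) / torch.sqrt(ab_t)
    return torch.sqrt(ab_prev) * x0_hat + torch.sqrt(1.0 - ab_prev) * eps

# Toy usage: `cond` stands in for an embedding of the single input image.
dim, batch = 64, 2
model = ToyEpsNet(dim)
alphas_bar = torch.linspace(0.9999, 0.01, 1000)
x_t = torch.randn(batch, dim)
cond = torch.randn(batch, dim)
t = torch.full((batch,), 500, dtype=torch.long)
x_prev = recurrent_denoise_step(model, x_t, t, cond, alphas_bar)

The contrast with a standard sampler is the inner loop: a single-pass sampler calls the model once per step, whereas here each refinement pass can correct drift in the previous noise estimate under the same conditioning signal before the sampling update is taken.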