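Computer science
Inversion (geology)
Computer vision
Artificial intelligence
Image editing
Computer graphics (images)
Algorithm
Image (mathematics)
Geology
Paleontology
Structural basin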
Authors
Jiancheng Huang,Yifan Liu,Jiaxi Lv,Shifeng Chen
Identifier
DOI:10.1109/icassp48485.2024.10448490
Abstract
Text-conditional image editing is a practical AIGC (AI-generated content) task that has recently emerged, with considerable commercial and academic value. To edit real images, most diffusion-based methods use DDIM Inversion as the first stage, but DDIM Inversion often fails to reconstruct the input image, which degrades all downstream edits. To address this problem, we first analyze mathematically why DDIM Inversion fails to reconstruct the image, and then propose a new inversion and sampling method, Entwined Inversion, that achieves satisfactory reconstruction and editing performance. It meets two requirements: 1) the edited object retains the main content of the original image; 2) the edited object conforms to the semantics of the text prompt. In addition, our method requires neither training the diffusion model on a large dataset nor fine-tuning on individual images.
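To make the reconstruction issue concrete, the following is a minimal sketch of standard DDIM inversion followed by DDIM sampling on a single scalar "pixel". It is not the Entwined Inversion method, whose details the abstract does not give. The function names, the linear `alpha_bar` schedule, and the `tanh` noise predictor are all assumptions chosen for illustration; a real pipeline would use a trained network and image tensors. The sketch shows one commonly cited source of error, which may or may not match the paper's own analysis: inversion evaluates the noise predictor at the current state, while sampling evaluates it at the next one, so the round trip is exact only when the two predictions agree.

```python
import math

def ddim_step(x, eps, a_from, a_to):
    """Deterministic DDIM update between two cumulative-alpha levels."""
    # Estimate the clean signal from the current state and predicted noise.
    x0_pred = (x - math.sqrt(1.0 - a_from) * eps) / math.sqrt(a_from)
    # Re-noise that estimate to the target level using the same noise.
    return math.sqrt(a_to) * x0_pred + math.sqrt(1.0 - a_to) * eps

def ddim_invert(x0, eps_model, alpha_bar):
    """Map a clean value to its latent by running DDIM steps forward in t."""
    x = x0
    for t in range(len(alpha_bar) - 1):
        # Approximation: noise is predicted at (x_t, t), although the exact
        # inverse of the sampling step would need it at (x_{t+1}, t+1).
        eps = eps_model(x, t)
        x = ddim_step(x, eps, alpha_bar[t], alpha_bar[t + 1])
    return x

def ddim_sample(x_T, eps_model, alpha_bar):
    """Map a latent back to a clean value by running DDIM steps backward in t."""
    x = x_T
    for t in range(len(alpha_bar) - 1, 0, -1):
        eps = eps_model(x, t)
        x = ddim_step(x, eps, alpha_bar[t], alpha_bar[t - 1])
    return x

def reconstruction_error(x0, eps_model, alpha_bar):
    """Absolute round-trip error of invert-then-sample."""
    x_T = ddim_invert(x0, eps_model, alpha_bar)
    return abs(ddim_sample(x_T, eps_model, alpha_bar) - x0)

# Hypothetical schedule: cumulative alphas falling linearly from ~1 to 0.02.
num_steps = 50
alpha_bar = [0.9999 + (0.02 - 0.9999) * i / (num_steps - 1) for i in range(num_steps)]

# A constant predictor gives the same noise at every state, so the round trip is exact.
err_const = reconstruction_error(0.5, lambda x, t: 0.3, alpha_bar)
# A state-dependent toy predictor (stand-in for a network) breaks that equality.
err_toy = reconstruction_error(0.5, lambda x, t: math.tanh(x), alpha_bar)

print("constant predictor error:", err_const)
print("state-dependent predictor error:", err_toy)
```

In this toy setting the constant predictor reconstructs the input up to floating-point rounding, while the state-dependent predictor leaves a visible gap. Any method that aims for faithful reconstruction has to close that gap in some way.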