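Computer science
Inference
Noise reduction
Fidelity
Wavelet
Artificial intelligence
Diffusion
Coding (set theory)
Image (mathematics)
Image quality
Pattern recognition (psychology)
Sampling (signal processing)
Quality (philosophy)
Speedup
Computer vision
Philosophy
Physics
Operating system
Epistemology
Filter (signal processing)
Set (abstract data type)
Programming language
Thermodynamics
Telecommunications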
Authors
Luan Thanh Trinh,Tomoki Hamagami
Source
Journal: IEEE Access
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume: 12, pp. 78161-78172
Citations: 1
Identifier
DOI:10.1109/access.2024.3406535
Abstract
Diffusion models are emerging as powerful solutions for generating high-fidelity and diverse images, often surpassing GANs. However, their slow inference speed hinders their potential for real-time applications. To address this, DiffusionGAN leveraged a conditional GAN to drastically reduce the number of denoising steps and speed up inference. Its successor, Wavelet Diffusion, further accelerated the process by converting data into wavelet space. Nonetheless, these models still fall short of GANs in terms of speed and image quality. To bridge these gaps, this paper introduces the Latent Denoising Diffusion GAN, which employs pre-trained autoencoders to compress images into a compact latent space, significantly improving inference speed and image quality. Furthermore, we propose a Weighted Learning strategy to enhance diversity and image quality. Experimental results on the CIFAR-10, CelebA-HQ, and LSUN-Church datasets show that our model achieves state-of-the-art running speed among diffusion models. Compared to its predecessors, DiffusionGAN and Wavelet Diffusion, our model shows remarkable improvements in all evaluation metrics. Code and pre-trained checkpoints: https://github.com/thanhluantrinh/LDDGAN.git