Computer science
Representation (politics)
Encoder
Semantics (computer science)
Encoding (memory)
Probabilistic logic
Artificial intelligence
Coding (set theory)
Image (mathematics)
Encoding
Pattern recognition (psychology)
Natural language processing
Set (abstract data type)
Theoretical computer science
Programming language
Biochemistry
Chemistry
Politics
Political science
Law
Gene
Operating system
Authors
Konpat Preechakul, Nattanat Chatthee, Suttisak Wizadwongsa, Supasorn Suwajanakorn
Identifier
DOI: 10.1109/cvpr52688.2022.01036
Abstract
Diffusion probabilistic models (DPMs) have achieved remarkable quality in image generation that rivals GANs'. But unlike GANs, DPMs use a set of latent variables that lack semantic meaning and cannot serve as a useful representation for other tasks. This paper explores the possibility of using DPMs for representation learning and seeks to extract a meaningful and decodable representation of an input image via autoencoding. Our key idea is to use a learnable encoder for discovering the high-level semantics, and a DPM as the decoder for modeling the remaining stochastic variations. Our method can encode any image into a two-part latent code where the first part is semantically meaningful and linear, and the second part captures stochastic details, allowing near-exact reconstruction. This capability enables challenging applications that currently foil GAN-based methods, such as attribute manipulation on real images. We also show that this two-level encoding improves denoising efficiency and naturally facilitates various downstream tasks including few-shot conditional sampling. Please visit our page: https://Diff-AE.github.io/
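The abstract describes a two-part latent code: a learned encoder produces a compact semantic code, while a conditional DPM decoder, driven by that code, absorbs the remaining stochastic detail into its noise latent. The following PyTorch sketch illustrates that training setup under simple assumptions; the class names (SemanticEncoder, ConditionalDenoiser), the toy network sizes, and the linear noise schedule are illustrative guesses, not the authors' architecture (see https://Diff-AE.github.io/ for the official code).

```python
# Hypothetical sketch only: names, network sizes, and the noise schedule are
# assumptions for illustration, not the Diff-AE authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticEncoder(nn.Module):
    """Maps an image to a compact, semantically meaningful code z_sem."""
    def __init__(self, z_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.SiLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, z_dim),
        )

    def forward(self, x):
        return self.net(x)

class ConditionalDenoiser(nn.Module):
    """Toy stand-in for the conditional DPM (a UNet in practice): predicts the
    noise in x_t given the timestep t and the semantic code z_sem."""
    def __init__(self, z_dim=512):
        super().__init__()
        self.img = nn.Conv2d(3, 64, 3, padding=1)
        self.cond = nn.Linear(z_dim + 1, 64)      # fuse z_sem with the (scalar) timestep
        self.out = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, x_t, t, z_sem):
        h = self.img(x_t)
        c = self.cond(torch.cat([z_sem, t.float().unsqueeze(1)], dim=1))
        return self.out(F.silu(h + c[:, :, None, None]))  # broadcast conditioning spatially

def training_loss(encoder, denoiser, x0, num_steps=1000):
    """One DDPM-style step: the encoder supplies z_sem, and the denoiser must recover
    the injected noise, so only residual stochastic detail is left to the noise latent."""
    b = x0.size(0)
    z_sem = encoder(x0)
    t = torch.randint(0, num_steps, (b,), device=x0.device)
    betas = torch.linspace(1e-4, 0.02, num_steps, device=x0.device)   # assumed linear schedule
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t].view(b, 1, 1, 1)
    noise = torch.randn_like(x0)
    x_t = alpha_bar.sqrt() * x0 + (1.0 - alpha_bar).sqrt() * noise
    pred_noise = denoiser(x_t, t, z_sem)
    return F.mse_loss(pred_noise, noise)

# Toy usage: one optimization step on a random "image" batch.
encoder, denoiser = SemanticEncoder(), ConditionalDenoiser()
x0 = torch.randn(4, 3, 32, 32)
loss = training_loss(encoder, denoiser, x0)
loss.backward()
```

In this reading, z_sem carries the linear, semantically meaningful part of the two-level code, while the DPM's noise latent (obtained by inverting the deterministic sampler at inference time) captures the stochastic details needed for near-exact reconstruction.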