Keywords
Diffusion
Computer science
Image quality
Image (mathematics)
Algorithm
Artificial intelligence
Anisotropic diffusion
Autoregressive model
Pattern recognition (psychology)
Encoding (memory)
Image resolution
Autoencoder
Computer vision
Mathematics
Deep learning
Physics
Statistics
Thermodynamics
Authors
Shuyang Gu, Dong Chen, Jianmin Bao, Fang Wen, Bo Zhang, Dongdong Chen, Yuan Liu, Baining Guo
Identifier
DOI: 10.1109/cvpr52688.2022.01043
Abstract
We present the vector quantized diffusion (VQ-Diffusion) model for text-to-image generation. This method is based on a vector quantized variational autoencoder (VQ-VAE) whose latent space is modeled by a conditional variant of the recently developed Denoising Diffusion Probabilistic Model (DDPM). We find that this latent-space method is well-suited for text-to-image generation tasks because it not only eliminates the unidirectional bias of existing methods but also allows us to incorporate a mask-and-replace diffusion strategy to avoid the accumulation of errors, a serious problem with existing methods. Our experiments show that VQ-Diffusion produces significantly better text-to-image generation results than conventional autoregressive (AR) models with similar numbers of parameters. Compared with previous GAN-based text-to-image methods, VQ-Diffusion can handle more complex scenes and improves the synthesized image quality by a large margin. Finally, we show that the image generation computation in our method can be made highly efficient by reparameterization. With traditional AR methods, the text-to-image generation time increases linearly with the output image resolution and hence is quite time-consuming even for normal-size images. VQ-Diffusion allows us to achieve a better trade-off between quality and speed. Our experiments indicate that the VQ-Diffusion model with reparameterization is fifteen times faster than traditional AR methods while achieving better image quality. The code and models are available at https://github.com/cientgu/VQ-Diffusion.
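To make the mask-and-replace idea concrete, below is a minimal PyTorch sketch of a single forward corruption step on discrete VQ-VAE token indices. It is an illustration only: the function name, the per-step rates `gamma` and `beta`, and the flat single-step form are assumptions for exposition, not the paper's actual transition matrices or noise schedule (see the repository linked above for the real implementation).

```python
import torch

def mask_and_replace_step(tokens, mask_id, vocab_size, gamma=0.10, beta=0.05):
    """Illustrative single forward corruption step on discrete token indices.

    Each token is independently:
      - replaced by the special [MASK] token with probability `gamma`,
      - replaced by a uniformly random codebook token with probability `beta`,
      - kept unchanged with probability 1 - gamma - beta.
    `gamma` and `beta` here are hypothetical per-step rates, not the
    paper's schedule.
    """
    u = torch.rand(tokens.shape)                    # one uniform draw per token
    noisy = tokens.clone()
    to_mask = u < gamma                             # positions sent to [MASK]
    to_replace = (u >= gamma) & (u < gamma + beta)  # positions resampled uniformly
    noisy[to_mask] = mask_id
    random_tokens = torch.randint(0, vocab_size, tokens.shape)
    noisy[to_replace] = random_tokens[to_replace]
    return noisy

# Example: corrupt two 32x32 grids of VQ-VAE codes (codebook size 1024),
# using index 1024 as the extra [MASK] token.
tokens = torch.randint(0, 1024, (2, 32 * 32))
noisy = mask_and_replace_step(tokens, mask_id=1024, vocab_size=1024)
```

The intuition behind this design, as the abstract notes, is that the explicit [MASK] token tells the denoising network which positions are corrupted, so errors are not silently propagated across refinement steps the way they can be under purely random replacement.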