Layout-Bridging Text-to-Image Synthesis

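Computer science, Image (mathematics), Bridging (networking), Leapfrog monitoring, Artificial intelligence, Consistency (knowledge base), Information retrieval, Computer network
Authors
Jiadong Liang, Wenjie Pei, Feng Liu
Source
Journal: Cornell University - arXiv. Cited by: 1
Identifier
DOI:10.48550/arxiv.2208.06162
Abstract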

The crux of text-to-image synthesis lies in the difficulty of preserving cross-modality semantic consistency between the input text and the synthesized image. Typical methods, which seek to model the text-to-image mapping directly, can only capture keywords in the text that indicate common objects or actions, and fail to learn their spatial distribution patterns. An effective way to circumvent this limitation is to generate an image layout as guidance, as a few methods have attempted. Nevertheless, these methods fail to generate practically effective layouts due to the diversity of input text and object locations. In this paper we push for effective modeling in both text-to-layout generation and layout-to-image synthesis. Specifically, we formulate text-to-layout generation as a sequence-to-sequence modeling task, and build our model upon Transformer to learn the spatial relationships between objects by modeling the sequential dependencies between them. In the layout-to-image synthesis stage, we focus on learning the textual-visual semantic alignment per object in the layout to precisely incorporate the input text into the layout-to-image synthesis process. To evaluate the quality of the generated layout, we design a new metric, dubbed Layout Quality Score, which considers both the absolute distribution errors of bounding boxes in the layout and the mutual spatial relationships between them. Extensive experiments on three datasets demonstrate the superior performance of our method over state-of-the-art methods on both predicting the layout and synthesizing the image from the given text.
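To make the sequence-to-sequence framing concrete, the sketch below shows one way a layout could be flattened into a token sequence that a Transformer decoder might predict. The abstract does not describe the actual tokenization, so everything here is an assumption for illustration: the `(label, x, y, w, h)` box format, the bin count, the `<bos>`/`<eos>` markers, and the function names `quantize` and `layout_to_tokens` are all hypothetical and not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical box format (an assumption, not from the paper):
# (label, x, y, w, h) with all coordinates normalized to [0, 1].
Box = Tuple[str, float, float, float, float]


def quantize(value: float, bins: int) -> int:
    """Map a normalized coordinate in [0, 1] to an integer bin in [0, bins - 1]."""
    clipped = min(max(value, 0.0), 1.0)
    return min(int(clipped * bins), bins - 1)


def layout_to_tokens(boxes: List[Box], bins: int = 32) -> List[str]:
    """Flatten a layout into one token sequence: label, then x, y, w, h bins per box."""
    tokens = ["<bos>"]
    for label, x, y, w, h in boxes:
        tokens.append(label)
        for name, value in zip("xywh", (x, y, w, h)):
            tokens.append(f"<{name}_{quantize(value, bins)}>")
    tokens.append("<eos>")
    return tokens


if __name__ == "__main__":
    example = [("dog", 0.5, 0.25, 0.5, 1.0)]
    print(layout_to_tokens(example, bins=4))
```

Under this assumed encoding, each object contributes five tokens, so a model that predicts the sequence left to right conditions each box on the objects already placed, which is one way sequential dependencies could carry spatial relationships.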