Learning From Text: A Multimodal Face Inpainting Network for Irregular Holes

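Inpainting, Computer science, Artificial intelligence, Face (sociological concept), Computer vision, Natural language processing, Pattern recognition (psychology), Image (mathematics), Linguistics, Philosophy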
Authors
Dandan Zhan, Jiahao Wu, Xing Luo, Zhi Jin
Source
Journal: IEEE Transactions on Circuits and Systems for Video Technology [Institute of Electrical and Electronics Engineers]
Volume (Issue): 34 (8): 7484-7497 Citations: 1
Identifier
DOI:10.1109/tcsvt.2024.3370578
Abstract

Irregular-hole face inpainting is a challenging task, since the appearance of faces varies greatly (e.g., different expressions and poses) and human vision is particularly sensitive to subtle blemishes in inpainted face images. Without external information, most existing methods struggle to generate semantically meaningful content for face components when contextual information is insufficient. Text, however, can describe the content of an image in most cases, and it is flexible and user-friendly. In this work, a concise and effective Multimodal Face Inpainting Network (MuFIN) is proposed, which simultaneously uses the information in the known regions and the descriptive text of the input image to address irregular-hole face inpainting. To fully exploit the remaining parts of the corrupted face images, a plug-and-play Multi-scale Multi-level Skip Fusion Module (MMSFM) is introduced, which extracts multi-scale features and fuses shallow features into deep features at multiple levels. Moreover, to bridge the gap between the textual and visual modalities and effectively fuse cross-modal features, a Multi-scale Text-Image Fusion Block (MTIFB) is developed, which incorporates text features into image features at both local and global scales. Extensive experiments on two commonly used datasets, CelebA and Multi-Modal-CelebA-HQ, demonstrate that our method outperforms state-of-the-art methods both qualitatively and quantitatively, and can generate realistic and controllable results.
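
The abstract does not spell out how the MTIFB combines text and image features, so the following is only a generic illustration of what fusing text into image features "at both local and global scales" could look like; it is not the authors' implementation. It assumes that "global" means adding one sentence-level vector (the mean of the word vectors) to every pixel feature, and that "local" means each pixel attending over individual word vectors with scaled dot-product attention. All function names (`global_fusion`, `local_fusion`, `fuse`) and the averaging of the two branches are hypothetical choices made for this sketch.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def global_fusion(pixels, words):
    """Assumed 'global' branch: add the mean word vector to every pixel feature."""
    dim = len(words[0])
    sentence = [sum(w[d] for w in words) / len(words) for d in range(dim)]
    return [[p[d] + sentence[d] for d in range(dim)] for p in pixels]


def local_fusion(pixels, words):
    """Assumed 'local' branch: each pixel attends over the word vectors
    (scaled dot-product attention) and adds the attended vector."""
    dim = len(words[0])
    scale = math.sqrt(dim)
    fused = []
    for p in pixels:
        weights = softmax([dot(p, w) / scale for w in words])
        attended = [sum(a * w[d] for a, w in zip(weights, words)) for d in range(dim)]
        fused.append([p[d] + attended[d] for d in range(dim)])
    return fused


def fuse(pixels, words):
    """Hypothetical combination: average the global and local branches."""
    g = global_fusion(pixels, words)
    l = local_fusion(pixels, words)
    return [[(gv + lv) / 2 for gv, lv in zip(gp, lp)] for gp, lp in zip(g, l)]


# Two pixel features and two word features, all of dimension 2.
pixels = [[1.0, 0.0], [0.0, 1.0]]
words = [[2.0, 0.0], [0.0, 2.0]]
fused = fuse(pixels, words)
```

In this toy input each pixel is most similar to a different word, so the local branch pushes each pixel toward its own word while the global branch shifts both pixels by the same sentence vector. A real model would use learned projections and operate on convolutional feature maps at several resolutions.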