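Computer science
Synchronization
Artificial intelligence
Texture (cosmology)
Computer vision
Computer graphics (images)
Track (disk drive)
Matching (statistics)
Image (mathematics)
Mathematics
Computer network
Statistics
Operating system
Channel (broadcasting)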
Authors
Supasorn Suwajanakorn, Steven M. Seitz, Ira Kemelmacher-Shlizerman
Identifier
DOI:10.1145/3072959.3073640
Abstract
Given audio of President Barack Obama, we synthesize a high-quality video of him speaking with accurate lip sync, composited into a target video clip. Trained on many hours of his weekly address footage, a recurrent neural network learns the mapping from raw audio features to mouth shapes. Given the mouth shape at each time instant, we synthesize high-quality mouth texture and composite it with proper 3D pose matching to change what he appears to be saying in a target video to match the input audio track. Our approach produces photorealistic results.