Computer science
Image translation
Artificial intelligence
Translation (biology)
Normalization (sociology)
Consistency (knowledge base)
Pattern recognition (psychology)
Data mining
Computer vision
Theoretical computer science
Image (mathematics)
Anthropology
Sociology
Messenger RNA
Gene
Chemistry
Biochemistry
Authors
Jin Liu, Huiyuan Fu, Xin Wang, Huadong Ma
Identifier
DOI:10.1109/tcsvt.2024.3353932
Abstract
Image-to-image (I2I) translation often requires establishing cycle consistency between the source and the translated images across different domains. However, cycle consistency requires redundant reconstruction and imposes an overly restrictive bijection assumption between the two domains. In this paper, we propose SwinIT, a hierarchical Swin-transformer I2I Translation framework that does not rely on cycle consistency. Specifically, we carefully design symmetrical encoders for the content and style flows and explore newly proposed adaptive denormalization and normalization strategies. This framework effectively captures and fuses content and style representations in a coarse-to-fine manner, allowing our method to achieve high performance without cycle consistency. Guided by element-wise feature adaptive denormalization, our model focuses on preserving semantic structure information. Due to the semantic mismatch between unpaired source and exemplar images, we introduce cross-attention adaptive instance normalization to achieve better alignment. However, because the original optimization objective lacks direct supervision for preserving high-frequency information, rich edge details are lost during translation. We therefore propose a wavelet transformation matching loss that recovers these details by decomposing the image into multi-frequency parts. We validate our method on various I2I translation tasks, including arbitrary style transfer, multi-modal image synthesis, and semantic image synthesis, demonstrating its effectiveness in both qualitative and quantitative evaluations.
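The abstract describes cross-attention adaptive instance normalization as a way to align unpaired exemplar (style) features with content features before transferring their statistics. Below is a minimal PyTorch-style sketch of that general idea, not the paper's actual module: the class name `CrossAttentionAdaIN`, the head count, and the epsilon value are illustrative assumptions, and the paper's hierarchical Swin-transformer encoders are omitted.

```python
import torch
import torch.nn as nn


class CrossAttentionAdaIN(nn.Module):
    """Illustrative sketch (not the paper's implementation): re-order exemplar
    features with cross-attention so they spatially correspond to the content
    features, then apply standard AdaIN using the aligned style statistics."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        # dim must be divisible by num_heads (assumed hyperparameters)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, content: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        # content, style: (B, C, H, W) feature maps from the two encoder flows
        b, c, h, w = content.shape
        q = content.flatten(2).transpose(1, 2)   # (B, HW, C) queries from content
        kv = style.flatten(2).transpose(1, 2)     # (B, HW, C) keys/values from style
        aligned, _ = self.attn(q, kv, kv)         # style features aligned to content layout
        aligned = aligned.transpose(1, 2).view(b, c, h, w)

        # AdaIN: normalize content, then re-scale with aligned style statistics
        eps = 1e-5
        c_mean = content.mean(dim=(2, 3), keepdim=True)
        c_std = content.std(dim=(2, 3), keepdim=True) + eps
        s_mean = aligned.mean(dim=(2, 3), keepdim=True)
        s_std = aligned.std(dim=(2, 3), keepdim=True) + eps
        return (content - c_mean) / c_std * s_std + s_mean
```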
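The wavelet transformation matching loss is described as decomposing images into multi-frequency parts so that high-frequency edge detail is directly supervised. The sketch below assumes a one-level 2D Haar decomposition and an L1 match between corresponding sub-bands; the function names and the `hf_weight` parameter are hypothetical and the paper may use a different wavelet, depth, or weighting.

```python
import torch
import torch.nn.functional as F


def haar_dwt(x: torch.Tensor):
    """One-level Haar decomposition of a (B, C, H, W) tensor (H, W even)
    into low-frequency (LL) and high-frequency (LH, HL, HH) sub-bands."""
    a = x[:, :, 0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[:, :, 0::2, 1::2]  # top-right
    c = x[:, :, 1::2, 0::2]  # bottom-left
    d = x[:, :, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2
    lh = (-a - b + c + d) / 2
    hl = (-a + b - c + d) / 2
    hh = (a - b - c + d) / 2
    return ll, lh, hl, hh


def wavelet_matching_loss(pred: torch.Tensor, target: torch.Tensor,
                          hf_weight: float = 1.0) -> torch.Tensor:
    """L1 matching of wavelet sub-bands; the high-frequency bands carry the
    edge details that a plain pixel loss tends to under-weight."""
    bands = [F.l1_loss(p, t) for p, t in zip(haar_dwt(pred), haar_dwt(target))]
    ll_loss, hf_loss = bands[0], sum(bands[1:])
    return ll_loss + hf_weight * hf_loss
```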