Computer science
Artificial intelligence
Feature (linguistics)
RGB color model
Deep learning
Image fusion
Computer vision
Modal verb
Feature learning
Ranging
Pattern
Constraint (computer-aided design)
Fusion
Image (mathematics)
Modality (human-computer interaction)
Pattern recognition (psychology)
Engineering
Sociology
Philosophy
Chemistry
Polymer chemistry
Mechanical engineering
Telecommunications
Linguistics
Social science
Authors
Lei Zhang, Chunyu Lin, Kang Liao, Yao Zhao
Identifiers
DOI:10.1109/icassp49357.2023.10095104
Abstract
Image outpainting technology generates visually plausible content without regard to authenticity, making it unreliable in practice. We therefore propose a reliable image outpainting task, introducing sparse depth from LiDARs (Light Detection and Ranging devices) to extrapolate authentic RGB scenes. The large field of view of LiDAR allows it to support data enhancement and further multimodal tasks. Concretely, we propose a Depth-Guided Outpainting Network that models the distinct feature representations of the two modalities and learns structure-aware cross-modal fusion. Two components are designed: 1) the Multimodal Learning Module produces distinct depth and RGB feature representations that reflect the characteristics of each modality; 2) the Depth Guidance Fusion Module leverages the complete depth modality to guide the generation of RGB content through progressive multimodal feature fusion. Furthermore, we design an additional constraint strategy, consisting of a Cross-modal Loss and an Edge Loss, to sharpen ambiguous contours and expedite reliable content generation. Extensive experiments on the KITTI and Waymo datasets demonstrate our superiority over state-of-the-art methods, both quantitatively and qualitatively.
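The depth-guided fusion idea described in the abstract can be sketched in PyTorch as follows. This is a minimal, hypothetical illustration, not the authors' released code: the module names MultimodalEncoder and DepthGuidedFusion, the sigmoid gating mechanism, and all layer sizes are assumptions made for this sketch.

```python
# Hypothetical sketch of modality-specific encoding followed by progressive,
# depth-guided fusion. All architectural choices here are assumptions; the
# paper's actual Depth-Guided Outpainting Network is not reproduced.
import torch
import torch.nn as nn


class MultimodalEncoder(nn.Module):
    """Separate shallow branches so each modality keeps its own feature space."""
    def __init__(self, channels=64):
        super().__init__()
        self.rgb_branch = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.depth_branch = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, rgb, depth):
        return self.rgb_branch(rgb), self.depth_branch(depth)


class DepthGuidedFusion(nn.Module):
    """Depth features gate the RGB features so that structure from the
    wider-FoV depth modality steers where RGB content is extrapolated."""
    def __init__(self, channels=64, stages=3):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(2 * channels, channels, 3, padding=1),
                nn.ReLU(inplace=True),
            )
            for _ in range(stages)
        )
        self.gate = nn.Conv2d(channels, channels, 1)

    def forward(self, rgb_feat, depth_feat):
        fused = rgb_feat
        for stage in self.stages:
            # Progressive fusion: every stage re-injects the depth guidance,
            # using a sigmoid gate as a soft structural attention map.
            guidance = torch.sigmoid(self.gate(depth_feat))
            fused = stage(torch.cat([fused * guidance, depth_feat], dim=1))
        return fused


if __name__ == "__main__":
    enc, fusion = MultimodalEncoder(), DepthGuidedFusion()
    rgb = torch.randn(1, 3, 128, 256)    # narrow-FoV camera image
    depth = torch.randn(1, 1, 128, 256)  # depth map projected from LiDAR
    rgb_feat, depth_feat = enc(rgb, depth)
    print(fusion(rgb_feat, depth_feat).shape)  # torch.Size([1, 64, 128, 256])
```

The sigmoid gate is one plausible way to let depth structure decide where RGB features should be trusted during extrapolation; the paper's actual fusion operator may differ.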