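Computer science
Affine transformation
Pipeline (software)
Artificial intelligence
Convolutional neural network
Computer vision
Grid
Transformation (genetics)
Algorithm
Mathematics
Geometry
Biochemistry
Gene
Chemistry
Programming language
Pure mathematics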
Authors
Michaël Gharbi, Jiawen Chen, Jonathan T. Barron, Samuel W. Hasinoff, Frédo Durand
Identifier
DOI:10.1145/3072959.3073592
Abstract
Performance is a critical challenge in mobile image processing. Given a reference imaging pipeline, or even human-adjusted pairs of images, we seek to reproduce the enhancements and enable real-time evaluation. For this, we introduce a new neural network architecture inspired by bilateral grid processing and local affine color transforms. Using pairs of input/output images, we train a convolutional neural network to predict the coefficients of a locally-affine model in bilateral space. Our architecture learns to make local, global, and content-dependent decisions to approximate the desired image transformation. At runtime, the neural network consumes a low-resolution version of the input image, produces a set of affine transformations in bilateral space, upsamples those transformations in an edge-preserving fashion using a new slicing node, and then applies those upsampled transformations to the full-resolution image. Our algorithm processes high-resolution images on a smartphone in milliseconds, provides a real-time viewfinder at 1080p resolution, and matches the quality of state-of-the-art approximation techniques on a large class of image operators. Unlike previous work, our model is trained off-line from data and therefore does not require access to the original operator at runtime. This allows our model to learn complex, scene-dependent transformations for which no reference implementation is available, such as the photographic edits of a human retoucher.
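The runtime steps described in the abstract (slice a low-resolution grid of affine transforms, then apply the result to the full-resolution image) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the network that predicts the grid is omitted, and the function names (`slice_grid`, `apply_affine`, `enhance`), the grid layout `(gh, gw, gd, 3, 4)`, and the use of the channel mean as the guide are all assumptions made for illustration.

```python
import numpy as np

def slice_grid(grid, guide):
    """Trilinearly sample a bilateral grid of affine coefficients per pixel.

    grid:  (gh, gw, gd, 3, 4) array, one 3x4 affine color matrix per cell
           (assumed layout, not taken from the paper).
    guide: (H, W) array in [0, 1] that selects the depth coordinate.
    Returns an (H, W, 3, 4) array of per-pixel affine matrices.
    """
    gh, gw, gd = grid.shape[:3]
    H, W = guide.shape
    # Continuous grid coordinates for every full-resolution pixel.
    ys = np.linspace(0, gh - 1, H)[:, None] * np.ones((1, W))
    xs = np.ones((H, 1)) * np.linspace(0, gw - 1, W)[None, :]
    zs = np.clip(guide, 0.0, 1.0) * (gd - 1)

    def corners(c, n):
        # Lower/upper neighbor indices and the fractional offset between them.
        lo = np.clip(np.floor(c).astype(int), 0, n - 1)
        hi = np.minimum(lo + 1, n - 1)
        return lo, hi, c - lo

    y0, y1, fy = corners(ys, gh)
    x0, x1, fx = corners(xs, gw)
    z0, z1, fz = corners(zs, gd)

    out = np.zeros((H, W) + grid.shape[3:])
    # Accumulate the eight trilinear corner contributions.
    for yi, wy in ((y0, 1 - fy), (y1, fy)):
        for xi, wx in ((x0, 1 - fx), (x1, fx)):
            for zi, wz in ((z0, 1 - fz), (z1, fz)):
                w = (wy * wx * wz)[..., None, None]
                out += w * grid[yi, xi, zi]
    return out

def apply_affine(coeffs, image):
    """Apply a per-pixel 3x4 affine color transform to an (H, W, 3) image."""
    linear = np.einsum('hwij,hwj->hwi', coeffs[..., :3], image)
    return linear + coeffs[..., 3]

def enhance(grid, image):
    # Assumption: the guide is the plain channel mean; the abstract does
    # not specify how the depth coordinate is chosen.
    guide = image.mean(axis=-1)
    return apply_affine(slice_grid(grid, guide), image)
```

Because the depth coordinate follows the pixel's own intensity, neighboring pixels on opposite sides of an edge read from different grid cells, which is what makes the upsampling edge-preserving. A grid filled with identity matrices leaves the image unchanged, which is a convenient sanity check.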