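Pixel
Computer science
Artificial intelligence
Pipeline (software)
Differentiable function
End-to-end principle
Computer vision
Task (project management)
Artificial neural network
Segmentation
Grid
Deep learning
Mathematics
Geometry
Engineering
Mathematical analysis
Programming language
Systems engineering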
Authors
Hendrik Sommerhoff, Shashank Agnihotri, Mohamed Saleh, Michael Moeller, Margret Keuper, Andreas Kolb
Source
Journal: Cornell University - arXiv
Date: 2023-04-28
Citations: 1
Identifier
DOI:10.48550/arxiv.2304.14736
Abstract
The success of deep learning is frequently attributed to the ability to train all parameters of a network on a specific application in an end-to-end fashion. Yet several design choices at the camera level, including the pixel layout of the sensor, are treated as predefined and fixed. High-resolution, regular pixel layouts are considered the most generic ones in computer vision and graphics, and they treat all regions of an image as equally important. While several works have considered non-uniform (e.g., hexagonal or foveated) pixel layouts in hardware and image processing, the layout has not been integrated into the end-to-end learning paradigm so far. In this work, we present the first truly end-to-end trained imaging pipeline that optimizes the size and distribution of pixels on the imaging sensor jointly with the parameters of a given neural network on a specific task. We derive an analytic, differentiable approach for the sensor layout parameterization that allows for task-specific, locally varying pixel resolutions. We present two pixel layout parameterization functions: rectangular and curvilinear grid shapes, both of which retain a regular topology. We provide a drop-in module that approximates sensor simulation given existing high-resolution images, so that our method connects directly to existing deep learning models. We show that network predictions benefit from learnable pixel layouts for two different downstream tasks, classification and semantic segmentation.
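The idea of simulating a non-uniform rectangular pixel layout from an existing high-resolution image can be illustrated with a small forward-only sketch. This is not the authors' implementation: the softmax parameterization of pixel widths, the area-averaging model, and all function names (`boundaries`, `overlap_weights`, `simulate_sensor`) are assumptions chosen for illustration, and the curvilinear layout is not covered.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def boundaries(logits):
    """Map unconstrained logits to increasing pixel edges on [0, 1].

    Assumption: pixel widths are a softmax of the logits, so they are
    positive and sum to one. The paper's parameterization may differ.
    """
    edges = [0.0]
    for width in softmax(logits):
        edges.append(edges[-1] + width)
    edges[-1] = 1.0  # guard against floating-point drift
    return edges

def overlap_weights(edges, n_src):
    """weights[i][j]: fraction of sensor pixel i covered by source pixel j."""
    weights = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        row = []
        for j in range(n_src):
            s_lo, s_hi = j / n_src, (j + 1) / n_src
            overlap = max(0.0, min(hi, s_hi) - max(lo, s_lo))
            row.append(overlap / (hi - lo))
        weights.append(row)
    return weights

def simulate_sensor(image, row_logits, col_logits):
    """Area-average a high-resolution image onto a non-uniform rectangular grid."""
    h, w = len(image), len(image[0])
    wr = overlap_weights(boundaries(row_logits), h)
    wc = overlap_weights(boundaries(col_logits), w)
    # Separable integration: first along rows, then along columns.
    tmp = [[sum(wr_i[y] * image[y][x] for y in range(h)) for x in range(w)]
           for wr_i in wr]
    return [[sum(wc_k[x] * row[x] for x in range(w)) for wc_k in wc]
            for row in tmp]

if __name__ == "__main__":
    img = [[1, 1, 3, 3], [1, 1, 3, 3], [5, 5, 7, 7], [5, 5, 7, 7]]
    # Equal logits give a uniform 2x2 layout; unequal logits skew the pixels.
    print(simulate_sensor(img, [0.0, 0.0], [0.0, 0.0]))
    print(simulate_sensor(img, [0.0, 0.0], [1.0, -1.0]))
```

Because each overlap weight is a piecewise-linear function of the pixel edges, the same computation written in an automatic-differentiation framework would in principle let gradients reach the layout logits; that step is left out here to keep the sketch dependency-free.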