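Minimum bounding box
Artificial intelligence
Computer science
Computer vision
Bounding overwatch
Point cloud
Iterative reconstruction
3D reconstruction
Voxel
Object (grammar)
RGB color model
Object detection
Pose
Key (lock)
Differentiable function
Image (mathematics)
Pattern recognition (psychology)
Mathematics
Computer security
Mathematical analysis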
Authors
Francis Engelmann, Konstantinos Rematas, Bastian Leibe, Vittorio Ferrari
Identifier
DOI:10.1109/cvpr46437.2021.00456
Abstract
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image. The key idea is to optimize for detection, alignment and shape jointly over all objects in the RGB image, while focusing on realistic and physically plausible reconstructions. To this end, we propose a key-point detector that localizes objects as center points and directly predicts all object properties, including 9-DoF bounding boxes and 3D shapes, all in a single forward pass. The proposed method formulates 3D shape reconstruction as a shape selection problem, i.e., it selects among exemplar shapes from a given database. This makes it agnostic to shape representations, which enables a lightweight reconstruction of realistic and visually pleasing shapes based on CAD models, while the training objective is formulated around point clouds and voxel representations. A collision loss promotes non-intersecting objects, further increasing the reconstruction realism. Given the RGB image, the presented approach performs lightweight reconstruction in a single stage; it is real-time capable, fully differentiable and end-to-end trainable. Our experiments compare multiple approaches for 9-DoF bounding box estimation, evaluate the novel shape-selection mechanism and compare to recent methods in terms of 3D bounding box estimation and 3D shape reconstruction quality.
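The idea of treating reconstruction as selection from a database of exemplar shapes can be illustrated with a small sketch. This is not the authors' learned selection mechanism: it assumes a plain nearest-exemplar rule under a symmetric Chamfer distance between point clouds, and the names `chamfer`, `select_shape`, `database` and `observed` are hypothetical, introduced only for illustration.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def chamfer(a, b):
    """Symmetric Chamfer distance between two non-empty lists of 3D points."""
    a_to_b = sum(min(dist(p, q) for q in b) for p in a) / len(a)
    b_to_a = sum(min(dist(q, p) for p in a) for q in b) / len(b)
    return a_to_b + b_to_a


def select_shape(target, exemplars):
    """Return the name of the exemplar whose point cloud is closest to target.

    Assumption: selection is a hard argmin over Chamfer distances. A trainable
    system would more likely predict a score per exemplar instead.
    """
    return min(exemplars, key=lambda name: chamfer(target, exemplars[name]))


# Hypothetical toy database: two exemplar "shapes" as tiny point clouds.
database = {
    "flat": [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)],
    "tall": [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3)],
}

# A noisy observation that lies close to the "flat" exemplar.
observed = [(0.1, 0.0, 0.0), (0.9, 0.1, 0.0), (0.0, 1.0, 0.1), (1.0, 0.9, 0.0)]

best = select_shape(observed, database)
print(best)
```

Because the output is only the identity of an exemplar, the selected shape can be stored in any representation (for example a CAD mesh), while the distance used for comparison is computed on point clouds.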