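Deep learning
Benchmark (surveying)
Computer science
Artificial intelligence
Representation (politics)
Transparency (behavior)
Scale (ratio)
Machine learning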
Ranging
Feature learning
Computer vision
Cartography
Geography
Telecommunications
Computer security
Politics
Political science
Law
Authors
Lu Lu,Yichen Sheng,Zhi Tu,Wentian Zhao,Xiuzhen Cheng,Kun Wan,Lantao Yu,Qi Guo,Zixun Yu,Yawen Lu,Xiuli Li,Xiechang Sun,R.L. Ashok,Amrita Mukherjee,Hao Kang,Xiangrui Kong,Gang Hua,T Zhang,Bedřich Beneš,Aniket Bera
Source
Journal: Cornell University - arXiv
Date: 2023-12-25
Identifier
DOI:10.48550/arxiv.2312.16256
Abstract
We have witnessed significant progress in deep learning-based 3D vision, ranging from neural radiance field (NeRF) based 3D representation learning to applications in novel view synthesis (NVS). However, existing scene-level datasets for deep learning-based 3D vision, limited to either synthetic environments or a narrow selection of real-world scenes, are quite insufficient. This insufficiency not only hinders a comprehensive benchmark of existing methods but also caps what could be explored in deep learning-based 3D analysis. To address this critical gap, we present DL3DV-10K, a large-scale scene dataset, featuring 51.2 million frames from 10,510 videos captured from 65 types of point-of-interest (POI) locations, covering both bounded and unbounded scenes, with different levels of reflection, transparency, and lighting. We conducted a comprehensive benchmark of recent NVS methods on DL3DV-10K, which revealed valuable insights for future research in NVS. In addition, we have obtained encouraging results in a pilot study to learn generalizable NeRF from DL3DV-10K, which manifests the necessity of a large-scale scene-level dataset to forge a path toward a foundation model for learning 3D representation. Our DL3DV-10K dataset, benchmark results, and models will be publicly accessible at https://dl3dv-10k.github.io/DL3DV-10K/.