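Computer science
Pipeline (software)
Robustness (evolution)
Kernel (algebra)
Artificial intelligence
Location
Feature extraction
Computer vision
Pattern recognition (psychology)
Mathematics
Biochemistry
Chemistry
Linguistics
Philosophy
Combinatorics
Gene
Programming language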
Authors
Bingsen Huang, Junqing Lu, Qiang Li, Qiyuan Liu, Maosong Lin, Yongqiang Cheng
Source
Journal: Communications in Computer and Information Science
Date: 2023-11-27
Volume/Issue/Pages: 113-125
Identifier
DOI: 10.1007/978-981-99-8145-8_10
Abstract
In this paper, we propose PLKA-MVSNet to address the remaining challenges in depth estimation for learning-based multi-view stereo (MVS) methods, particularly inaccurate depth estimates in challenging areas such as low-texture regions, weak lighting conditions, and non-Lambertian surfaces. We attribute this problem to the insufficient performance of the feature extractor and to the information loss incurred as data passes through the MVS pipeline, and we present an optimization scheme for both. Specifically, we introduce parallel large kernel attention (PLKA), which uses multiple small convolutions instead of a single large convolution to enhance the perception of texture and structural information and to capture a larger receptive field and long-range information. To adapt to the coarse-to-fine MVS pipeline, we employ PLKA to construct a multi-stage feature extractor. Furthermore, we propose parallel cost volume aggregation (PCVA) to enhance the robustness of the aggregated cost volume. PCVA introduces two decision-making attentions in the 2D domain to compensate for the information loss and pixel omission caused by 3D convolution compression. Our method shows the best overall performance on the DTU dataset, surpassing the transformer-based method, and achieves the best results on the challenging Tanks and Temples advanced dataset.
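The abstract's point that several small convolutions can stand in for one large convolution can be illustrated with a small calculation. The sketch below is not the PLKA module itself. It assumes stride-1 depthwise convolutions applied in sequence, and the kernel sizes and dilation are hypothetical values chosen for illustration, not the configuration used in the paper. It compares the receptive field and the per-channel weight count of a small stack against a single kernel covering the same area.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conv:
    kernel: int        # side length k of a square k x k kernel
    dilation: int = 1  # spacing between kernel taps


def receptive_field(layers):
    """Side length (in pixels) seen by stride-1 convolutions applied in sequence."""
    rf = 1
    for layer in layers:
        # Each layer widens the view by (k - 1) taps, stretched by its dilation.
        rf += (layer.kernel - 1) * layer.dilation
    return rf


def depthwise_weights(layers):
    """Weights per channel, assuming every layer is a depthwise convolution."""
    return sum(layer.kernel ** 2 for layer in layers)


# Hypothetical decomposition (an assumption, not taken from the paper):
# a 5x5 convolution followed by a 7x7 convolution with dilation 3.
small_stack = [Conv(kernel=5), Conv(kernel=7, dilation=3)]

# A single dense kernel sized to cover the same receptive field.
single_large = [Conv(kernel=receptive_field(small_stack))]

print(receptive_field(small_stack))      # 23
print(depthwise_weights(small_stack))    # 74
print(depthwise_weights(single_large))   # 529
```

Under these assumptions the stack reaches a 23 x 23 receptive field with 74 weights per channel, whereas a single 23 x 23 depthwise kernel needs 529. This is the kind of trade-off that motivates decomposing a large kernel.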