Keywords
Feature (linguistics), Computer science, Completion (oil and gas wells), Artificial intelligence, Computer vision, Pattern recognition (psychology), Geology, Linguistics, Petroleum engineering, Philosophy
Authors
Shenglun Chen, Xinzhu Ma, Hong Zhang, Haojie Li, Baoli Sun, Zhihui Wang
Identifier
DOI: 10.1109/tnnls.2025.3551903
Abstract
As a key problem in computer vision, depth completion aims to recover dense depth maps from sparse ones, generally derived from light detection and ranging (LiDAR). Most methods introduce synchronized RGB images and leverage multimodal fusion to integrate features from both modalities into a complete description of the scene. However, the inherently different characteristics of the two modalities lead to inconsistent features, which can undermine the effectiveness of multimodal feature fusion. To address this issue, we propose a feature alignment network (FANet) that introduces an alignment scheme to enhance the consistency between multimodal features. This scheme aligns the modality-invariant semantic context, which is invariant to changes in modality and represents the correlation between a pixel and its surroundings. Specifically, we first design an asymmetric context extraction (ACE) module to extract modality-invariant semantic contexts from multimodal features within limited GPU memory, and then pull these contexts closer together to improve consistency. Crucially, the alignment scheme is applied only during training, so it incurs no additional computational cost at inference. Moreover, we introduce a simple yet effective refinement module that refines the estimated results via residual learning based on the intermediate depth maps and sparse depth maps. Extensive experiments on the KITTI and VOID datasets demonstrate that our method achieves competitive performance against typical real-time methods. In addition, we embed the proposed alignment scheme and refinement module into other methods to demonstrate their effectiveness.
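The abstract sketches two transferable ideas: a training-only loss that aligns semantic contexts extracted from the RGB and depth branches, and a refinement step that learns a residual from the intermediate and sparse depth maps. The PyTorch sketch below illustrates one plausible reading of these ideas; `ContextExtractor`, `alignment_loss`, `residual_refine`, and `refine_net` are illustrative names rather than the paper's code, and the attention pooling is a generic stand-in for the ACE module, whose actual design and memory optimizations the abstract does not specify.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContextExtractor(nn.Module):
    """Generic attention pooling used as a stand-in for the paper's
    asymmetric context extraction (ACE) module (hypothetical sketch)."""

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels, kernel_size=1)
        self.key = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        b, c, h, w = feat.shape
        q = self.query(feat).flatten(2)  # (b, c, h*w)
        k = self.key(feat).flatten(2)    # (b, c, h*w)
        # Pairwise pixel affinities model how each pixel relates to its
        # surroundings, i.e., a semantic context for every location.
        attn = torch.softmax(q.transpose(1, 2) @ k / c ** 0.5, dim=-1)
        ctx = (feat.flatten(2) @ attn.transpose(1, 2)).view(b, c, h, w)
        return ctx


def alignment_loss(rgb_feat, depth_feat, rgb_ctx_net, depth_ctx_net):
    """Auxiliary loss pulling the two modalities' contexts closer.
    Applied only at training time, so inference cost is unchanged."""
    return F.l1_loss(rgb_ctx_net(rgb_feat), depth_ctx_net(depth_feat))


def residual_refine(intermediate_depth, sparse_depth, refine_net):
    """Refine an intermediate depth map via residual learning from the
    intermediate estimate and the sparse LiDAR depth."""
    residual = refine_net(torch.cat([intermediate_depth, sparse_depth], dim=1))
    return intermediate_depth + residual
```

In training, such an alignment loss would presumably be added to the main depth-regression loss with a weighting coefficient; at inference only the fusion backbone and the refinement step would run, consistent with the abstract's claim of no extra inference cost.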