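Artificial intelligence
Computer science
Computer vision
Segmentation
Monocular
Robot
Image segmentation
Stereo vision
Semantics (computer science)
Programming language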
Authors
Libo Sun,James Bockman,Changming Sun
Identifier
DOI:10.1109/tim.2023.3328708
Abstract
Semantic segmentation is a crucial task with wide-ranging applications, including autonomous driving and robot navigation. However, prevailing state-of-the-art methods primarily focus on monocular images, neglecting the untapped potential of the stereo cameras commonly fitted to autonomous vehicles and robots, which capture binocular images. In this article, we aim to introduce an innovative stereo-vision-based semantic segmentation framework that maximizes the utilization of stereo image data to enhance segmentation performance. Unlike conventional monocular approaches that use only one image, our method effectively uses both images, exploiting interimage correspondences and harnessing previously neglected information. Our core innovations encompass label generation for right images, combined with stereo-vision-based information fusion. For label generation, we propose a novel technique to accurately generate labels for the right images in stereo pairs, even in scenarios with no direct annotations. This innovative approach empowers our models to effectively learn from a complete stereo dataset, enhancing their semantic segmentation capabilities. In addition, our stereo-vision-based information fusion framework seamlessly integrates features and spatial disparities from the binocular images, enabling our models to produce more accurate and contextually enriched semantic segmentation outputs. To validate the efficacy of our proposed approach, we conduct comprehensive experiments on the Cityscapes and KITTI datasets using diverse, well-known semantic segmentation architectures. The results unequivocally demonstrate the superiority and effectiveness of our method.