Computer science
Perspective (graphical)
Artificial intelligence
Fusion
Image fusion
Sensor fusion
Image (mathematics)
Linguistics
Philosophy
Authors
Mujtaba Asad,Waqar Azeem,He Jiang,Hafiz Tayyab Mustafa,Jie Yang,Wei Liu
Identifier
DOI:10.1109/tcsvt.2025.3536475
Abstract
In the context of Industrial Anomaly Detection (IAD), ensuring the quality of manufactured products is critical. Traditional 2D-based methods often fail to capture anomalies present in complex 3D shapes. For effective anomaly detection in 3D shapes, it is essential to incorporate the global semantic context, local geometric structure, and color information of the object. To fully exploit these features, we propose a network named 2M3DF that leverages knowledge from multi-view RGB images and the corresponding point cloud information for enhanced anomaly detection performance. Our model first employs pre-trained feature extractors that generate local features from multi-view RGB images and the corresponding point clouds. The novel inter-modality feature representation and fusion module first adapts these inter-modality features and then aligns and aggregates the multimodal features on a pixel-to-point basis. To learn normality from the point-wise fused multimodal features, we fit a multivariate Gaussian distribution to model the normal feature distribution. Comprehensive experimental evaluations on the MVTec3D-AD and Eyecandies datasets validate the effectiveness of our proposed model and demonstrate significant improvements over existing state-of-the-art methods. Our model achieves a 96.6% mean I-AUROC while delivering real-time results.
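The Gaussian-fitting step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how anomaly scores are computed, so scoring by Mahalanobis distance is an assumption here, as are the function names `fit_gaussian` and `mahalanobis_scores`, the regularization term `eps`, and the synthetic 8-dimensional features that stand in for the fused pixel-to-point features.

```python
import numpy as np

def fit_gaussian(features, eps=0.01):
    """Fit a multivariate Gaussian to (N, D) features from anomaly-free samples.

    eps is a hypothetical regularizer added to the diagonal so the
    covariance matrix stays invertible.
    """
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + eps * np.eye(features.shape[1])
    return mean, np.linalg.inv(cov)

def mahalanobis_scores(features, mean, cov_inv):
    """Assumed scoring rule: Mahalanobis distance of each row to the fitted Gaussian."""
    diff = features - mean
    sq = np.einsum("nd,de,ne->n", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))

# Synthetic stand-ins for fused multimodal features (illustration only).
rng = np.random.default_rng(0)
normal_feats = rng.normal(size=(500, 8))
mean, cov_inv = fit_gaussian(normal_feats)

# First five rows resemble the normal data; last five are shifted far away.
test_feats = np.vstack([
    rng.normal(size=(5, 8)),
    rng.normal(loc=6.0, size=(5, 8)),
])
scores = mahalanobis_scores(test_feats, mean, cov_inv)
```

Under these assumptions, rows far from the fitted distribution receive larger scores, which could then be thresholded or aggregated into an object-level anomaly score.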