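Anomaly detection
Computer science
Zero (linguistics)
Projectile
Ground zero
Anomaly (physics)
Computer vision
Artificial intelligence
Physics
Linguistics
Materials science
Philosophy
Condensed matter physics
Nuclear physics
Metallurgy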
Authors
Zhuo Li,Yifei Ge,Qi Li,Lin Meng
Identifier
DOI:10.1109/icamechs63130.2024.10818831
Abstract
This paper presents an efficient zero-shot industrial anomaly detection (IAD) framework based on vision-language models. Industrial anomaly detection usually adopts an unsupervised learning approach, which achieves excellent detection performance. However, it still struggles to recognize more complicated anomalies, such as rotational defects, for which more detailed features are needed to describe the image. Building on the strong performance of contrastive language-image pretraining (CLIP), this paper proposes IAD-CLIP, a zero-shot industrial anomaly detection framework. The framework contains a pre-trained CLIP model, a training-free adaptation module, and a test-time adaptation mechanism. The pre-trained CLIP model is used for feature extraction. The training-free adaptation module, which uses a value-value attention mechanism and a state prompt space, processes the extracted features through the visual and text encoders for anomaly detection and localization. The test-time adaptation mechanism improves anomaly localization performance during the testing phase. Experimental results on the industrial anomaly detection dataset MVTec AD show that IAD-CLIP achieves 92.1% AUROC, 94.6% AUPR, and 91.9% F1Max. These results validate the effectiveness of the proposed IAD-CLIP framework for industrial anomaly detection.
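The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind CLIP-style zero-shot anomaly scoring: an image embedding is compared against text embeddings for "normal" and "abnormal" state prompts, and a softmax over the two similarities gives an anomaly probability. The function names, the toy three-dimensional vectors, and the temperature value are all assumptions made for illustration. They are not taken from IAD-CLIP, and no real CLIP model is called.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def anomaly_score(image_feat, normal_prompts, abnormal_prompts, temperature=0.07):
    """Probability that the image matches the 'abnormal' state prompts.

    Hypothetical helper: each prompt group is averaged into one prototype,
    and a two-way softmax is taken over the scaled cosine similarities.
    The temperature is an illustrative value, not one from the paper.
    """
    normal_proto = mean_vector(normal_prompts)
    abnormal_proto = mean_vector(abnormal_prompts)
    logits = [
        cosine(image_feat, normal_proto) / temperature,
        cosine(image_feat, abnormal_proto) / temperature,
    ]
    # Subtract the maximum logit for numerical stability.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    return exps[1] / sum(exps)


# Toy embeddings standing in for encoder outputs (assumed, not real features).
normal_prompts = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0]]
abnormal_prompts = [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]

good_image = [0.95, 0.05, 0.0]
defective_image = [0.1, 0.9, 0.05]

print("good image score:", anomaly_score(good_image, normal_prompts, abnormal_prompts))
print("defective image score:", anomaly_score(defective_image, normal_prompts, abnormal_prompts))
```

In a real pipeline the vectors would come from the CLIP visual and text encoders, and applying the same comparison to patch-level features would yield a localization map rather than a single image-level score.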