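Boosting (machine learning)
Anomaly detection
Shot (pellet)
Zero (linguistics)
Computer science
Physics
Pattern recognition (psychology)
Artificial intelligence
Materials science
Linguistics
Philosophy
Metallurgy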
Authors
Yuyao Liu, Qingyong Li, Zhehong Wang, Jien Kato, Jie Zhang, Wen Wang
Identifier
DOI:10.1109/tim.2025.3571124
Abstract
Zero-Shot Anomaly Detection (ZSAD) detects anomalies without any training samples from the target application, which makes it important for fields such as industrial quality control and medical image analysis. Recent work has applied Contrastive Language-Image Pre-training (CLIP) to ZSAD, exploiting its robust vision-language alignment and zero-shot learning capabilities. However, CLIP is designed primarily for natural image classification and emphasizes global visual embeddings, whereas anomaly detection requires accurate representations of anomalous regions and precise local visual embeddings. To overcome these limitations, this paper proposes the Local Enhanced CLIP (LECLIP) framework for ZSAD. LECLIP incorporates a Local Alignment Module that divides images into blocks and aligns them with learnable text embeddings, giving a more precise expression of relevance. Furthermore, a training-free Echo-Attention is proposed to complement the traditional QKV attention, enabling the model to capture both global and local image details and thus produce a more accurate and detailed image representation. Experimental results show that LECLIP achieves superior performance on 15 challenging datasets, comprising 6 industrial datasets and 9 medical datasets. Code is available at https://github.com/lyy70/LECLIP.
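The general idea of aligning local image blocks with text embeddings can be illustrated with a minimal sketch. This is not the LECLIP implementation, and it does not cover Echo-Attention, whose details the abstract does not give. It shows only the generic CLIP-style scoring step, assuming each block already has an embedding and that two text embeddings (one for "normal", one for "anomalous") are available. The function names, the toy 2-D vectors, and the temperature value are all hypothetical, and plain Python lists stand in for the tensors a real model would use.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def anomaly_scores(block_embeddings, normal_text, anomalous_text, temperature=0.07):
    """Per-block anomaly probability from a two-way softmax over cosine similarities.

    The temperature is an illustrative assumption, not a value from the paper.
    """
    scores = []
    for block in block_embeddings:
        s_normal = cosine(block, normal_text) / temperature
        s_anom = cosine(block, anomalous_text) / temperature
        # Subtract the max before exponentiating for numerical stability.
        m = max(s_normal, s_anom)
        e_normal = math.exp(s_normal - m)
        e_anom = math.exp(s_anom - m)
        scores.append(e_anom / (e_normal + e_anom))
    return scores


# Toy embeddings (hypothetical): three blocks look "normal", one looks "anomalous".
blocks = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [1.0, 0.1]]
normal_text = [1.0, 0.0]
anomalous_text = [0.0, 1.0]

scores = anomaly_scores(blocks, normal_text, anomalous_text)
image_score = max(scores)  # one simple way to get an image-level score from block scores
```

Arranged back on the block grid, the per-block scores form a coarse anomaly map. Taking the maximum as the image-level score is just one simple aggregation choice made here for illustration.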