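Bridge (graph theory)
Projectile
Anomaly detection
Anomaly (physics)
Zero (linguistics)
Computer science
Domain (mathematical analysis)
Data mining
Physics
Materials science
Mathematics
Condensed matter physics
Medicine
Mathematical analysis
Linguistics
Philosophy
Internal medicine
Metallurgy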
Authors
Zhe Zhang,Chen Shu,Jian Huang,Jie Ma
Identifier
DOI:10.1109/jsen.2025.3544407
Abstract
Visual defect detection is crucial for industrial quality control in intelligent manufacturing. Previous methods require target-specific data to train a model for each inspection task. However, because proprietary data are hard to collect and model training is time-consuming, zero-shot defect detection (ZSDD) has become an emerging topic in the field. In ZSDD, a model trained on auxiliary data detects defects on different products without any training on target data. Recently, large pretrained vision-language models (VLMs), such as the contrastive language-image pre-training (CLIP) model, have demonstrated revolutionary generality with competitive zero-shot performance across various downstream tasks. However, VLMs have limitations in defect detection because they are designed to identify the category semantics of objects rather than to sense object attributes (defective/nondefective). Current VLM-based ZSDD methods require manually crafted text prompts to guide the discovery of anomaly attributes. In this article, we propose a novel ZSDD method, namely attribute-aware CLIP, to adapt CLIP for anomaly attribute discovery without designing specific textual prompts. Its core is a textual domain bridge, which transforms simple, general textual prompt features into prompt embeddings better aligned with attribute awareness. This enables the model to perceive the attributes of objects by text-image feature matching, bridging the gap between object semantic recognition and attribute discovery. Additionally, we perform component clustering on the images to break down the overall object semantics, encouraging the model to focus on attribute awareness. Extensive experiments on 16 real-world defect datasets demonstrate that our method achieves state-of-the-art (SOTA) ZSDD performance across datasets with diverse class semantics.
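The text-image feature matching mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the textual domain bridge and component clustering are not reproduced, the embeddings are hand-written placeholders rather than CLIP outputs, and the function names, the temperature of 0.07, and the max-over-patches aggregation are all assumptions made for illustration. The sketch only shows the generic idea of scoring each image patch by comparing its feature with a "normal" and a "defective" text embedding.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def anomaly_scores(patch_features, normal_text, defect_text, temperature=0.07):
    """Per-patch probability of 'defective' from a two-way softmax.

    patch_features: list of patch embeddings (placeholders for image features).
    normal_text, defect_text: placeholder text embeddings for the two attributes.
    temperature: assumed softmax temperature, not taken from the paper.
    """
    scores = []
    for feature in patch_features:
        s_normal = cosine(feature, normal_text) / temperature
        s_defect = cosine(feature, defect_text) / temperature
        m = max(s_normal, s_defect)  # subtract the max for numerical stability
        e_normal = math.exp(s_normal - m)
        e_defect = math.exp(s_defect - m)
        scores.append(e_defect / (e_normal + e_defect))
    return scores

def image_score(patch_scores):
    """Image-level score as the maximum patch score (an assumed aggregation)."""
    return max(patch_scores)

if __name__ == "__main__":
    # Toy 2-D embeddings: the second patch points toward the "defective" text.
    patches = [[1.0, 0.0], [0.0, 1.0]]
    scores = anomaly_scores(patches, normal_text=[1.0, 0.0], defect_text=[0.0, 1.0])
    print(scores, image_score(scores))
```

In this toy setting, a patch whose feature lies closer to the "defective" embedding receives a score above 0.5, and the image-level score is driven by the most anomalous patch.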