Abstract
• We propose an improved YOLOv11-based detection framework that integrates a C2PSA_iEMA backbone, a C3k2_BFAM_EMA neck, and a four-branch FASFF detection head, designed to address the challenges of low contrast and fine-grained target detection.
• A dedicated dataset simulating the auxiliary chamber pipeline of monocrystalline furnaces was constructed, incorporating three representative types of residues to enable effective model training and evaluation.
• Extensive experiments, ablation studies, and comparisons with state-of-the-art detectors demonstrate the superiority of the proposed method in terms of accuracy, robustness, and lightweight deployment potential.
• From an engineering perspective, the proposed method functions as a quantitative inspection tool for evaluating pipeline cleanliness, providing practical support for defect detection and process optimization in monocrystalline silicon production.

To address the challenges of low contrast and fine-grained target detection in the inspection of residual substances on the inner wall of the auxiliary chamber pipeline of a monocrystalline furnace, this paper proposes a detection algorithm based on an improved YOLOv11 architecture. First, the original C2PSA module in the backbone network is integrated with an interactive enhanced multi-scale attention module (iEMA) to construct a new C2PSA_iEMA module, enhancing the representation of subtle features. Second, the C3k2 module in the neck is replaced with a C3k2_BFAM_EMA module, which incorporates an improved bidirectional feature aggregation module (BFAM-EMA), thereby improving multi-scale feature complementarity. Finally, an adaptive spatial feature fusion (ASFF) module is added before the detection heads and extended to four detection branches (FASFF), enabling more accurate object localization and classification.
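To make the fusion step concrete, the following is a minimal sketch of the general ASFF idea extended to four levels: at each spatial location, the feature vectors from the four levels are combined with softmax-normalized weights. It is not the authors' FASFF implementation. The function names `softmax` and `fuse_four_levels` are hypothetical, the features are assumed to have already been resized to a common resolution and channel count, and the weight logits are passed in directly, whereas in ASFF they are predicted by small convolution layers.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of scalars."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_four_levels(features, weight_logits):
    """Fuse four feature vectors at one spatial location.

    features: four equal-length lists, one per pyramid level (assumed to be
        already resized to the same resolution and channel count).
    weight_logits: four scalars; in ASFF these would come from learned
        1x1 convolutions, here they are supplied by the caller.
    Returns the fused vector and the normalized weights.
    """
    if len(features) != 4 or len(weight_logits) != 4:
        raise ValueError("expected exactly four levels")
    weights = softmax(weight_logits)  # weights are positive and sum to 1
    channels = len(features[0])
    fused = [
        sum(w * level[c] for w, level in zip(weights, features))
        for c in range(channels)
    ]
    return fused, weights


# Hypothetical two-channel features for one location, with equal logits,
# so the fusion reduces to a plain average of the four levels.
example_features = [[1.0, 0.0], [3.0, 0.0], [5.0, 4.0], [7.0, 4.0]]
example_fused, example_weights = fuse_four_levels(example_features, [0.0] * 4)
```

Because the weights are normalized per location, a level that carries little information about a small, low-contrast target can be suppressed at that location while still contributing elsewhere.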
To validate the proposed method, a dedicated dataset was constructed by simulating the auxiliary chamber pipeline environment using stainless steel tubes, with paraffin droplets, graphite powder, and silica powder representing crystalline residues, smoke deposits, and oxide residues, respectively. Comparative experiments against mainstream detection algorithms, including SSD, Faster R-CNN, YOLOv5, and YOLOv8, demonstrate the superiority of the proposed model, which achieves an mAP@0.5 of 86.6%, a precision of 85.5%, and a recall of 84.0%. Furthermore, ablation studies confirm the individual effectiveness and combined benefits of the proposed modules. Together, the results show that the improved YOLOv11 model significantly enhances detection accuracy and robustness, offering strong potential for practical engineering applications in complex industrial environments.
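For readers less familiar with the reported metrics, the sketch below shows the standard definitions they rest on: intersection over union (IoU) between two boxes, and precision and recall from detection counts. In mAP@0.5, a detection counts as a true positive when its IoU with a ground-truth box of the same class is at least 0.5, and average precision is then averaged over classes. The function names and all numbers in the example are hypothetical and are not taken from the paper's experiments.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def precision_recall(true_pos, false_pos, false_neg):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical boxes: the overlap is 2 and the union is 6, so this pair
# would fall below the 0.5 threshold used in mAP@0.5.
example_iou = iou((0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0))

# Hypothetical counts, chosen only to illustrate the formulas.
example_precision, example_recall = precision_recall(8, 2, 2)
```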