Abstract
Strip steel surface defect detection is challenging because small defects are easily missed, features are blurred, and defect scales vary drastically, while industrial deployment additionally demands high accuracy, real-time inference, and a lightweight model. To address these issues, we propose LSEH-YOLO, a lightweight detection framework that integrates hybrid feature fusion with attention-guided mechanisms and enhances the backbone, attention module, neck, and detection head. Specifically, the C3k2_LFEGM module adopts a three-stage design (local enhancement, gated fusion, and residual feedback) that integrates edge perception and window-level attention to strengthen feature responses. The C2PSA-Mona module embeds Mona non-local guidance into multi-head attention to enhance inter-channel modeling. In the detection head, the Detect_AS module combines AFGC lightweight attention with directional Strip convolution to refine fine-grained localization. Furthermore, we propose the HB Neck, which reconfigures the YOLOv11 neck with a bidirectional feature flow, multi-strategy fusion (Fusion), lightweight upsampling (EUCB), and a structurally enhanced C3k2_SEH module. Experiments on the NEU-DET and GC-DET datasets show that LSEH-YOLO improves mAP by 4.0% and 6.1%, respectively, over the YOLOv11 baseline, while reducing parameters by 32.9% and FLOPs by 18.8% and achieving 100.75 FPS. In the generalization evaluation on the GC-DET dataset, the model maintained high accuracy, indicating strong generalization. These results demonstrate that LSEH-YOLO effectively balances accuracy, speed, and efficiency, providing a practical solution for multi-scale defect detection in industrial settings.