Abstract

Accurate road-defect detection is critical to the safety and ride comfort of autonomous vehicles. We present an enhanced YOLOv11n detector that prioritizes accuracy and robustness while keeping efficiency changes modest. The framework integrates three complementary components: (i) C3K2-DTAB, which couples dilated convolution with global attention in the backbone to enlarge the effective receptive field and strengthen spatial–channel interaction; (ii) CAFMFusion, which connects to the detection head and fuses local texture cues with high-level semantics to improve the recall of small, low-contrast defects; and (iii) an improved C2PSA, a content-aware sparse routing mechanism that concentrates computation on salient regions, reducing redundant processing and stabilizing throughput. On the RDD2022 benchmark, the proposed model improves mAP@0.5 by 2.8 percentage points, Precision by 4.1 points, and Recall by 3.0 points over the YOLOv11n baseline. These gains come with only a small increase in parameters (from 2.58M to 2.61M) and a modest change in inference speed (from 173 to 168 FPS) at 640 × 640 on an RTX 3080. Additional evaluations on adverse-weather subsets show consistent improvements, supporting the model's robustness in practical scenarios. Overall, the method achieves a favourable accuracy–efficiency balance suitable for deployment in complex road environments with dense small-object distributions.

Keywords: Traffic safety; Road defect detection; YOLOv11; Deep learning.