Accurate identification of cotton leaf pests and diseases is essential for sustainable cultivation but is hindered by complex backgrounds, diverse pest morphologies, and varied symptoms, conditions under which existing deep learning models often lack robustness. To address these challenges, the RDL-YOLO model is proposed in this study. In the proposed model, RepViT-Atrous Convolution (RepViT-A) is employed as the backbone network to strengthen local–global interaction and improve the response intensity and extraction accuracy of key lesion features. In addition, a Dilated Dense Convolution (DDC) module is designed to provide a dynamic multi-scale receptive field, enabling the network to adapt to lesions of different shapes and sizes, and LDConv is introduced to further improve feature fusion. Experimental results showed that the mean Average Precision (mAP) of the proposed model reached 77.1%, a 3.7% improvement over the baseline YOLOv11. Compared with leading detectors such as the Real-Time Detection Transformer (RT-DETR), You Only Look Once version 11 (YOLOv11), DETRs as Fine-grained Distribution Refinement (D-FINE), and Spatial Transformer Network-YOLO (STN-YOLO), RDL-YOLO exhibited superior performance, enhanced reliability, and strong generalization on the cotton leaf dataset and on public datasets. This advancement offers a practical technical solution for improved agricultural pest and disease management.
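To make the multi-scale receptive-field idea behind a module such as DDC concrete, the following minimal PyTorch sketch runs parallel 3×3 convolutions with different dilation rates over the same input and fuses the concatenated outputs. The class name, the dilation rates, and the 1×1 fusion layer are illustrative assumptions for exposition only, not the authors' implementation of DDC.

```python
# Minimal sketch of a multi-branch dilated-convolution block (assumed design,
# not the paper's DDC implementation).
import torch
import torch.nn as nn


class DilatedMultiScaleBlock(nn.Module):
    """Parallel dilated 3x3 convolutions whose outputs are concatenated,
    giving the block access to several receptive-field sizes at once."""

    def __init__(self, in_channels: int, out_channels: int, dilations=(1, 2, 3)):
        super().__init__()
        branch_channels = out_channels // len(dilations)
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_channels, branch_channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(branch_channels),
                nn.SiLU(inplace=True),
            )
            for d in dilations
        )
        # 1x1 convolution fuses the concatenated multi-scale features.
        self.fuse = nn.Conv2d(branch_channels * len(dilations), out_channels,
                              kernel_size=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each branch sees the same input with a different dilation rate,
        # i.e. a different effective receptive field.
        feats = [branch(x) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1))


if __name__ == "__main__":
    block = DilatedMultiScaleBlock(in_channels=64, out_channels=96)
    y = block(torch.randn(1, 64, 80, 80))
    print(y.shape)  # torch.Size([1, 96, 80, 80])
```

Because padding equals the dilation rate for each 3×3 branch, spatial resolution is preserved while the effective receptive field grows, which is the property that lets such a block respond to lesions of different sizes.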