Abstract

Object detection in remote sensing imagery is challenged by extreme variations in object scale and by arbitrary orientations, both of which significantly degrade the accuracy of conventional detectors. At the same time, there is growing demand for lightweight models that can be deployed in real-world scenarios. To address these issues, we propose DMSE-YOLO, a detection framework that jointly enhances multi-scale representation and orientation awareness while maintaining computational efficiency. The network integrates cross-scale feature fusion with adaptive channel reweighting to strengthen multi-scale detection, and introduces direction-aware attention mechanisms to improve localization under orientation uncertainty. Furthermore, an efficient convolutional variant is incorporated to form a lightweight version, L-DMSE-YOLO, which preserves detection accuracy with fewer parameters and FLOPs. Experiments on the challenging DOTA-v1.0, DOTA-v1.5, and DIOR-R benchmarks show that DMSE-YOLO achieves mAP50 scores of 75.04%, 69.9%, and 77.6%, respectively, while the lightweight L-DMSE-YOLO maintains comparable accuracy at reduced computational cost, making it well suited to resource-constrained applications.