Track defect detection is crucial for ensuring safe train operation and maintaining the integrity of railway infrastructure. To address the missed detections, inaccurate localization, and weak small-object detection of traditional track defect detection methods, a track defect detection network (DSO-YOLO) based on an improved YOLOv5s is proposed. The method adds a decoupled head and a small-object detection layer to YOLOv5s, and adopts the omni-dimensional dynamic convolution module (ODConv) to improve detection performance. First, the original coupled head is replaced by a decoupled one, which improves the generalizability of YOLOv5s by learning target localization and classification separately. Second, a new small-object detection layer expands the feature maps from three groups to four, giving a multiscale detection mechanism that better handles targets of different sizes. Finally, ODConv is introduced into the neck of YOLOv5s, and its four-dimensional attention mechanism is used to accurately locate track defect feature regions and refine local fine-grained features, mitigating the effects of illumination variation and overlapping defect regions. The experimental results show that the mean average precision of the improved model is 98.6%, surpassing YOLOv5s by 3.7%. The proposed model demonstrates higher accuracy in detecting various track defects in complex environments.
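
To illustrate why expanding the feature maps from three groups to four helps with small defects, the following minimal sketch counts the candidate boxes produced per image. It is not the authors' code: the 640x640 input, the three anchors per scale, the baseline strides of 8, 16, and 32 (YOLOv5's defaults), and the extra stride-4 level are all assumptions made for illustration, and the function names are hypothetical.

```python
# Illustrative sketch only. Assumptions not taken from the source:
# a 640x640 input, 3 anchors per scale, baseline strides 8/16/32,
# and a hypothetical extra stride-4 level for small objects.

def grid_sizes(img_size, strides):
    """Return the side length of the square feature map at each stride."""
    return [img_size // s for s in strides]


def num_predictions(img_size, strides, anchors_per_scale=3):
    """Count the candidate boxes produced across all detection scales."""
    return sum(anchors_per_scale * g * g for g in grid_sizes(img_size, strides))


three_scale = (8, 16, 32)     # assumed baseline detection strides
four_scale = (4, 8, 16, 32)   # assumed strides with the added small-object level

baseline = num_predictions(640, three_scale)
extended = num_predictions(640, four_scale)

print(grid_sizes(640, four_scale))  # [160, 80, 40, 20]
print(baseline, extended)           # 25200 102000
```

Under these assumptions the added level contributes a 160x160 grid whose cells each cover only 4x4 input pixels, which is where the finer localization of small defects would come from; the cost is roughly four times as many candidate boxes to score.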