Accurately predicting the vibration response of long-span cable-stayed bridges is essential for assessing their structural safety. Traditional physical models based on vehicle-bridge interaction can simulate bridge vibrations effectively, but their results may deviate from field observations. Data-driven methods that use neural network surrogate models can achieve high prediction accuracy thanks to their strong nonlinear fitting capability, but they typically require large training datasets and may generalize poorly when data are limited or inadequately labeled. This study introduces a novel approach that integrates physical models with data-driven methods to predict the dynamic response of long-span cable-stayed bridges. The hybrid method retains the interpretability and robustness of physical models while improving predictive accuracy through data-driven techniques. Field tests on bridges validated the method’s applicability and effectiveness, demonstrating an 85% improvement in prediction accuracy for bridge vibration responses.