Abstract

OBJECTIVES
Prolonged air leak (PAL) is a common complication following pulmonary resection that contributes to increased morbidity, extended hospitalization, and higher costs. Several scoring systems can predict PAL, but their performance has not been fully validated. We compared three widely used models, the Epithor score, the Gilbert score, and PALS, in a single video-assisted thoracoscopic surgery (VATS) lobectomy cohort.

METHODS
This retrospective study included 534 patients who underwent VATS single-lobe lobectomy for primary lung cancer between 2012 and 2021. PAL was defined as an air leak persisting beyond five days. Discrimination was assessed using the area under the receiver-operating-characteristic curve (AUC), calibration using calibration plots and the Hosmer-Lemeshow test, overall prediction accuracy using the Brier score, and clinical utility using decision-curve analysis (DCA). Independent predictors were identified by multivariable logistic regression.

RESULTS
PAL occurred in 53 patients (9.9%). Male sex, body mass index < 25.5 kg/m², and pleural adhesions were independent risk factors. Among the three models, the Epithor score showed the highest discrimination (AUC = 0.735), although the differences between models were not statistically significant. PALS achieved the most accurate calibration, while Epithor had the best Brier score. In DCA, Epithor provided the greatest net benefit at low threshold probabilities (≤ 0.12), Gilbert in a narrow midrange (0.12–0.16), and PALS across the typical clinical range (0.16–0.28).

CONCLUSIONS
In this single-centre retrospective cohort, Epithor performed best overall, whereas PALS showed better calibration and greater clinical utility across the typical clinical threshold range. Combining the two may enhance individualized risk stratification for PAL, although definitive superiority of any one model cannot be established.
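
For readers less familiar with the performance measures named in the Methods, the following is a minimal sketch of how AUC, the Brier score, and the net benefit used in decision-curve analysis can be computed from observed outcomes and predicted probabilities. It is an illustration under stated assumptions, not the analysis code used in this study: the function names (`auc`, `brier_score`, `net_benefit`) and the ten-patient toy data are hypothetical, and the net-benefit formula is the standard one, TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Discrimination: probability that a randomly chosen PAL case receives a
    higher predicted risk than a randomly chosen non-case (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Overall accuracy: mean squared difference between predicted
    probability and observed outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Clinical utility at one threshold probability pt:
    TP/n - (FP/n) * pt / (1 - pt), treating patients with risk >= pt as positive."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data (NOT from the study): 1 = PAL, 0 = no PAL.
outcomes = [0, 0, 0, 0, 1, 0, 1, 0, 0, 1]
risks = [0.05, 0.10, 0.08, 0.20, 0.30, 0.12, 0.15, 0.04, 0.25, 0.40]

print("AUC:", auc(outcomes, risks))
print("Brier score:", brier_score(outcomes, risks))
# Evaluate net benefit at the boundaries of the threshold ranges quoted above.
for pt in (0.12, 0.16, 0.28):
    print(f"Net benefit at pt={pt}:", net_benefit(outcomes, risks, pt))
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds for each model and comparing the results with the treat-all and treat-none strategies. The model with the highest net benefit at a given threshold is preferred there, which is how threshold-specific rankings such as those reported in the Results arise.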