Using Machine Learning to Predict Acute Graft‐Versus‐Host Disease in Pediatric Patients Undergoing Allogeneic Hematopoietic Stem Cell Transplantation: Integration of Clinical and Genetic Factors
ABSTRACT

Background
Acute graft‐versus‐host disease (aGVHD) remains a significant cause of mortality following allogeneic hematopoietic stem cell transplantation (allo‐HSCT). This study aimed to construct and validate machine learning prediction models for aGVHD.

Methods
Pediatric patients undergoing allo‐HSCT were retrospectively enrolled, and 53 pre‐transplantation clinical factors and 104 candidate single nucleotide polymorphisms of both recipients and donors were identified. Four algorithms were used to generate prediction models: categorical boosting, light gradient boosting machine, random forest, and logistic regression.

Results
The study included 270 pediatric patients, of whom 88 (32.6%) developed grade I–IV aGVHD. The area under the curve for the grade I–IV aGVHD prediction models ranged from 0.707 to 0.804. The optimal prediction model effectively stratified patients by aGVHD risk. In the test set, the high‐risk group had a significantly higher 100‐day cumulative incidence of grade I–IV aGVHD than the low‐risk group (hazard ratio, 6.43; 95% confidence interval, 2.32–17.79; p < 0.001). Prediction models for moderate‐to‐severe (grades II–IV) and severe (grades III–IV) aGVHD were also constructed and validated.

Conclusions
aGVHD prediction models were successfully developed and validated in pediatric patients. Early identification of high‐risk patients will enable clinicians to provide personalized aGVHD prophylaxis.

Trial Registration
Chinese Clinical Trial Registry (Registration Number: ChiCTR2000040561)
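The area under the curve used to compare the models can be read as the probability that a randomly chosen patient who developed aGVHD receives a higher predicted risk than a randomly chosen patient who did not. The sketch below is a minimal illustration of that definition, not the study's code: the function names `roc_auc` and `incidence_percent` are hypothetical, and the labels and scores are invented toy values rather than study data. Only the counts 88 and 270 come from the abstract.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 = developed aGVHD, 0 = did not (toy encoding, an assumption).
    scores: predicted risk from any model; ties count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def incidence_percent(events, total):
    """Crude incidence as a percentage, rounded to one decimal place."""
    return round(100.0 * events / total, 1)


# Invented toy data for illustration only; these are NOT patient data.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.81, 0.35, 0.62, 0.48, 0.20, 0.44, 0.66, 0.15]
toy_auc = roc_auc(toy_labels, toy_scores)  # 12 of 15 pairs ranked correctly

# Counts reported in the abstract: 88 of 270 patients with grade I-IV aGVHD.
reported_incidence = incidence_percent(88, 270)

print(f"toy AUC = {toy_auc:.3f}")
print(f"reported incidence = {reported_incidence}%")
```

On this reading, an AUC of 0.804 means the best model ranks an affected patient above an unaffected one in roughly four of five such pairs. The pairwise loop is quadratic in cohort size, which is acceptable for a cohort of a few hundred patients; a rank-based computation would be the usual choice for larger data.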