The Development and Validation of Multivariable Electronic Health Record-Based Models to Predict Diabetic Ketoacidosis-Related Hospitalizations for Adults with Type 1 Diabetes
Authors
Jacob Kohlenberg, Meng Xu, Ryan Coopergard, Erika S. Helgeson, Amy C. Gross, Nestoras Mathioudakis, Craig A. Vandervelden, Mark A. Clements, Lisa S. Chow, Sisi Ma
Aim: To develop and validate models that use electronic health record (EHR) data to predict diabetic ketoacidosis (DKA)-related hospitalizations over 90 and 180 days among adults with type 1 diabetes (T1D).

Methods: We used EHR data from adults with T1D treated at an academic health system in the United States between January 1, 2017, and April 30, 2023. Models were built to predict the 90- and 180-day DKA risk using EHR data from the 2 years preceding the index date. We constructed seven predictors: (1) prior DKA event, (2) number of prior DKA events, (3) average time between DKA events in years, (4) time since the most recent DKA event in years, (5) most recent HbA1c, (6) the absence of an HbA1c result in the past 2 years, and (7) insurance type. The dataset was split into discovery and prospective validation cohorts. Logistic regression models were built using the discovery cohort and validated using the prospective validation cohort.

Results: Our dataset included 7798 adults with T1D, of whom 667 (8.6%) experienced ≥1 post-T1D diagnosis DKA event, totaling 1102 DKA events. The 90-day model achieved a mean area under the receiver operating characteristic curve (AUC) of 0.87 (standard deviation [SD] ± 0.02). The 180-day model achieved a mean AUC of 0.84 (SD ± 0.02). Among the 5% highest risk individuals, the 90-day model had a recall of 0.45, a precision of 0.11, and a specificity of 0.95, while the 180-day model had a recall of 0.42, a precision of 0.17, and a specificity of 0.96.

Conclusion: We developed EHR-based logistic regression models that effectively predict DKA-related hospitalizations in adults with T1D. Future work will enhance model performance by incorporating additional features and applying advanced machine learning methods.
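
To make the "5% highest risk" figures concrete, the following is a minimal sketch of how recall, precision, and specificity can be computed once the top 5% of individuals by predicted risk are flagged. It is not the authors' code: the function name `top_fraction_metrics`, its arguments, and the toy data in the usage lines are assumptions made for illustration, and the risk scores are assumed to come from a previously fitted model such as a logistic regression.

```python
from math import ceil


def top_fraction_metrics(risk_scores, outcomes, fraction=0.05):
    """Flag the highest-risk `fraction` of individuals and score those flags.

    risk_scores: predicted probabilities from a fitted model (hypothetical input).
    outcomes:    1 if the person had a DKA-related hospitalization in the
                 prediction window, otherwise 0.
    """
    if not risk_scores or len(risk_scores) != len(outcomes):
        raise ValueError("risk_scores and outcomes must be non-empty and equal in length")

    # Number of people flagged as high risk (at least one).
    n_flagged = max(1, ceil(fraction * len(risk_scores)))

    # Rank individuals from highest to lowest predicted risk.
    order = sorted(range(len(risk_scores)), key=lambda i: risk_scores[i], reverse=True)
    flagged = set(order[:n_flagged])

    tp = sum(1 for i in flagged if outcomes[i] == 1)  # flagged, had an event
    fp = n_flagged - tp                               # flagged, no event
    fn = sum(outcomes) - tp                           # missed events
    tn = len(outcomes) - n_flagged - fn               # correctly not flagged

    return {
        "recall": tp / (tp + fn) if (tp + fn) else float("nan"),
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy usage with made-up data: 100 people, 3 events.
toy_scores = [i / 100 for i in range(100)]
toy_outcomes = [1 if i in (50, 98, 99) else 0 for i in range(100)]
toy_metrics = top_fraction_metrics(toy_scores, toy_outcomes)
```

Because only 5% of individuals are flagged, false positives can make up at most 5% of the cohort, so specificity is necessarily high when events are rare. Recall and precision are therefore the more informative figures at this threshold.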