Authors
Y. Natalia Alfonso, Sandy Shi Shi, Nishit Patel, Abdul Bachani, Inada Haruhiko, Andrés I. Vecino-Ortiz, Qingfeng Li
Abstract
Suicide is the second leading cause of death among adolescents in the United States. Early detection of suicidal behavior is key to providing adolescents with the appropriate and timely support they need to prevent suicide. Machine learning (ML) algorithms are statistical tools that have shown great potential in public health research and practice for detecting people at risk of suicide. However, ML research on suicidal behavior has focused on adults or clinical patients. Few studies have examined the feasibility of using these tools with nonclinical community surveys to detect adolescents in the general population at risk for suicidal behavior. We created an advanced ML model to prospectively identify adolescents at risk for suicidal ideation (SI) in school settings, evaluated its accuracy, and identified the most important risk factors for SI. Restricted-use cross-sectional data from the Add Health home and school surveys for the years 1994-1995 and 1995-1996 were used. The data, consisting of 13,483 adolescents associated with 132 schools, provided a representative sample of adolescents enrolled in grades 7 through 12 in the US. Random forest (RF) algorithms were developed to generate predictions of adolescents at risk of SI in the academic year 1995-1996 based on predictors from 1994-1995. The algorithm included 143 potential predictors associated with suicidal behavior from the following domains: demographic characteristics, psychopathology (externalizing, familial, and internalizing), prior exposure to suicidal behavior, physical health, self- and parent-reported health treatment, and social and school-level factors. RF was also used to rank variables by importance in predicting SI. The accuracy and performance of the model were measured by estimating the sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). The prevalence of SI was 10.8% (n=1,460).
The RF model’s predictive power for identifying individuals at risk of SI was 0.79. Major risk factors included school-level factors (e.g., parent participation in school activities and academic performance), prior SI, depression, physical health, and demographic indicators. The statistical performance of the RF model suggests that ML models using community surveys can be an effective tool for detecting adolescents at risk of suicide in the general population. To our knowledge, this is the first study to use an advanced ML technique to assess suicidal behaviors among the general population of adolescents in the US. The AUC results are consistent with previous literature on SI prediction models for adults and for patients. The combination of advanced ML algorithms and synthesized datasets will provide new answers to important debates in the literature, with the potential to substantially enhance adolescent suicide knowledge and prevention strategies.
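
For readers unfamiliar with the three performance measures named above, the following is a minimal sketch of how sensitivity, specificity, and AUC can be computed from a model's predicted risk scores. It is not the authors' code. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration, and no Add Health data are used.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a given decision threshold.

    The threshold of 0.5 is an illustrative assumption, not a value
    taken from the study.
    """
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    # Sensitivity: share of true cases flagged; specificity: share of non-cases cleared.
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = reported SI, 0 = did not; scores are made-up risk estimates.
toy_labels = [1, 1, 0, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.35, 0.1, 0.6, 0.7, 0.2, 0.05]

sens, spec = sensitivity_specificity(toy_labels, toy_scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc(toy_labels, toy_scores):.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is one reason it is commonly reported when the outcome is relatively rare, as with the 10.8% SI prevalence here.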