Clarifying the differential factors with adolescent delinquency, abuse, self-injury, and mental health problems: Machine-learning and text mining analysis of a large sample of social work case records
Abstract This study uses machine-learning and text-mining techniques to classify social work case records, both to better distinguish among adolescents with four types of problems (delinquency, abuse, self-injury, and mental health problems) and to identify the factors that differentiate them. We selected 573 case records from a local social service organization in Shanghai, China: 279 delinquency cases, 76 abuse cases, 37 self-injury cases, and 181 mental health cases. We used the Term Frequency-Inverse Document Frequency (TF-IDF) method to extract keywords and trained three classification models: Naive Bayes, Decision Tree, and Random Forest. The Decision Tree model outperformed the other two, with a precision of 0.9295. Based on an analysis of co-occurring keywords, we further found that adolescent delinquency was associated with early dropout from school, migrant working status, lack of parental guardianship, and negative peer influence; adolescent abuse was associated with unstable family structure and weak family support; adolescent self-injury was primarily associated with depression; and mental health problems were associated with grandparenting, low socioeconomic status, and transition periods. This classification model can guide tailored services and interventions, and can be used as an early warning system to mitigate potential risks to adolescents.
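To make the keyword-extraction step concrete, the following is a minimal sketch of TF-IDF scoring in plain Python. It is an illustration under stated assumptions, not the study's implementation: the abstract does not say which TF-IDF variant was used, so the sketch assumes length-normalized term frequency and an unsmoothed natural-log inverse document frequency. The function name `tfidf_keywords` and the three toy records are invented for this example, and the records are assumed to be already tokenized (real Chinese case records would first need word segmentation).

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df), no smoothing.
    """
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        counts = Counter(doc)
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
        # Rank by descending score; break ties alphabetically for stable output.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results


# Invented toy records, for illustration only (not data from the study).
toy_records = [
    "dropout school peers dropout family".split(),
    "family conflict divorce family support".split(),
    "depression mood depression family".split(),
]

keywords = tfidf_keywords(toy_records, top_k=2)
for record_keywords in keywords:
    print(record_keywords)
```

In this toy corpus the term "family" appears in every record, so its inverse document frequency is ln(3/3) = 0 and it is never selected as a keyword, even where it is the most frequent word. Terms concentrated in a single record, such as "dropout" or "depression", rank highest. This is the property that makes TF-IDF keywords usable as features for distinguishing case types.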