Applications of Natural Language Processing and Large Language Models for Social Determinants of Health: Systematic Review

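Computer science; natural language processing; language identification; natural language; language model; artificial intelligence; linguistics; Universal Networking Language; language theory; psychology; social determinants of health; computational linguistics; social media; context model; language understanding; domain (mathematical analysis); natural (archaeology); data science; natural language understanding; topic model; social network (sociolinguistics); pragmatics; language technology; modeling language; Internet; cognitive science; diversity (cybernetics)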

Authors

Swati Rajwal, Avinash Kumar Pandey, Ziyuan Zhang, Yankai Chen, Michael X. Liu, Sudeshna Das, Hannah Rogers, Abeed Sarker, Yunyu Xiao

Source

Journal: Journal of Medical Internet Research [JMIR Publications]
Date: 2026-03-09 Volume/issue: 28: e83793

Identifier

DOI: 10.2196/83793

Abstract

Background: Social determinants of health (SDOH) are the social, economic, and environmental conditions that influence health outcomes. SDOH information is often embedded in unstructured text, such as notes in electronic health records and social media posts. Advances in natural language processing (NLP), including emerging large language models (LLMs), offer opportunities to extract, analyze, and interpret SDOH expressions from free text for inclusion in downstream analyses. Existing literature on NLP applications for SDOH is dispersed across disciplines and characterized by methodological heterogeneity and variability in study quality and scope, complicating synthesis and cross-study comparison.

Objective: This study aimed to examine the use of NLP, including LLMs, in SDOH research, and to highlight gaps and future research directions.

Methods: We conducted a systematic review following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, searching 7 major databases for publications between 2014 and November 2025. We included journal and conference proceedings papers that applied NLP methods to identify, classify, extract, or predict SDOH from text. Three reviewers independently screened studies and extracted data; conflicts were resolved by two senior reviewers. We abstracted study metadata, dataset characteristics, NLP approaches, SDOH domains addressed, and NLP performance metrics. We also conducted risk-of-bias analyses and identified influential studies based on relative citation counts.

Results: A total of 142 studies met the inclusion criteria. Nearly two-thirds (89/142, 62.7%) were published between 2023 and 2025, reflecting rapid recent growth. Most studies relied on electronic health records (93/142, 65.5%) and private datasets (81/142, 57.0%), while only 20.4% (29/142) used publicly available data. Commonly studied SDOH domains were housing instability (72/142, 50.7%), employment (65/142, 45.8%), and financial conditions (63/142, 44.4%); structural factors, such as immigration status (5/142, 3.5%), were rarely examined. Of studies that reported evaluation metrics, most focused on classification (26/83, 31.3%) or extraction (38/83, 45.8%), and used cross-sectional designs. Reported model performances were typically strong, with median F1-scores ranging roughly from 0.75 to 0.85 across model categories. Only 49 studies shared code, and fewer than half clearly described model interpretability or reproducibility practices. LLMs (including encoder-decoder models) appeared in 19.7% (28/142) of studies, highlighting emerging interest but also raising new concerns around transparency and governance.

Conclusions: This review provides a timely synthesis of NLP and LLM applications across the SDOH research spectrum, addressing an important gap in a topic receiving increasing research attention. By comparing task formulations, data sources, and performance patterns, the review clarifies the research readiness of current approaches and reveals critical gaps. Our findings advance the field by highlighting the absence of a unified SDOH framework, uneven availability of public benchmarks, and limited evaluation of real-world deployment. Addressing these gaps through transparent, inclusive dataset development and implementation-focused evaluation is essential for translating NLP advances into equitable, real-world health impact.
