Race, Sex and Age Disparities in the Performance of ECG Deep Learning Models Predicting Heart Failure

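Medicine, Heart failure, Receiver operating characteristic, Ethnic group, Race (biology), Demography, Internal medicine, Anthropology, Botany, Biology, Sociology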

Authors

Dhamanpreet Kaur,J. Weston Hughes,Albert J. Rogers,Guson Kang,Sanjiv M. Narayan,Euan A. Ashley,Marco Pérez

Source

Journal: Cold Spring Harbor Laboratory - medRxiv. Date: 2023-05-21. Citations: 1

Links

medrxiv.org, doi.org

Identifier

DOI: 10.1101/2023.05.19.23290257

Abstract

Background: Deep learning models may combat widening racial disparities in heart failure outcomes through early identification of individuals at high risk. However, demographic biases in the performance of these models have not been well studied.

Methods: This retrospective analysis used 12-lead ECGs taken between 2008 and 2018 from 290,252 patients referred for standard clinical indications to Stanford Hospital. The primary model was a convolutional neural network trained to predict incident heart failure within 5 years. Biases were evaluated on the testing set (160,312 ECGs) using the area under the receiver operating characteristic curve (AUC), stratified across the protected attributes of race, ethnicity, age, and sex.

Results: A total of 50,956 incident cases of heart failure were observed within 5 years of ECG collection. The performance of the primary model declined with age. There were no significant differences observed between racial groups overall. However, the primary model performed significantly worse in Black patients aged 0 to 40 compared to all other racial groups in this age group, with differences most pronounced among young Black women. Disparities in model performance did not improve with integration of race, ethnicity, gender, and/or age into the model architecture, with training separate models for each racial group, or with providing the model with a dataset of equal racial representation. Using probability thresholds individualized for race, age, and gender offered substantial improvements in F1-scores.

Conclusion: The biases found in this study warrant caution against perpetuating disparities through the development of machine learning tools for the prognosis and management of heart failure. Customizing the application of these models by using probability thresholds individualized by race/ethnicity, age, and sex may offer an avenue to mitigate existing algorithmic disparities.
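The two evaluation ideas named in the abstract, AUC stratified by demographic subgroup and probability thresholds chosen per subgroup, can be illustrated with a small sketch. This is not the authors' code. The abstract does not say how thresholds were selected, so picking the F1-maximizing cutoff is an assumption here, and the group labels, scores, and function names (`auc`, `f1_at`, `best_threshold`, `per_group_report`) are hypothetical toy values.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when a subgroup lacks one of the classes
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_at(labels, scores, threshold):
    """F1-score when predicting positive for every score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(labels, scores):
    """Assumed selection rule: the observed score that maximizes F1."""
    return max(sorted(set(scores)), key=lambda t: f1_at(labels, scores, t))


def per_group_report(records):
    """records: iterable of (group, label, score). Returns per-group metrics."""
    groups = defaultdict(lambda: ([], []))
    for group, label, score in records:
        groups[group][0].append(label)
        groups[group][1].append(score)
    report = {}
    for group, (labels, scores) in groups.items():
        t = best_threshold(labels, scores)
        report[group] = {
            "auc": auc(labels, scores),
            "threshold": t,
            "f1": f1_at(labels, scores, t),
        }
    return report


# Hypothetical toy data: group "B" receives systematically lower scores.
records = [
    ("A", 1, 0.9), ("A", 1, 0.8), ("A", 0, 0.3), ("A", 0, 0.2),
    ("B", 1, 0.4), ("B", 1, 0.35), ("B", 0, 0.3), ("B", 0, 0.1),
]
report = per_group_report(records)

# F1 for group "B" under a single shared cutoff of 0.5, for comparison.
b_labels = [y for g, y, s in records if g == "B"]
b_scores = [s for g, y, s in records if g == "B"]
shared_f1_b = f1_at(b_labels, b_scores, 0.5)
```

In this toy data both groups are ranked perfectly, yet a shared cutoff of 0.5 misses every positive in group "B". This shows how a per-group threshold can change F1 without changing AUC. In practice the thresholds would be chosen on a held-out validation set rather than on the data used to report the score.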
