Circulation: Heart Failure, Vol. 15, No. 1
Editorial

Unleashing the Power of Machine Learning to Predict Myocardial Recovery After Left Ventricular Assist Device: A Call for the Inclusion of Unstructured Data Sources in Heart Failure Registries

Ramsey M. Wehbe, MD, MSAI
Division of Cardiology, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL.
https://orcid.org/0000-0003-0599-7957

Correspondence to: Ramsey M. Wehbe, MD, MSAI, Division of Cardiology, Department of Medicine, Northwestern University Feinberg School of Medicine, 676 N St. Clair St, Ste 600, Chicago, IL 60611. Email: [email protected]

Originally published 24 Dec 2021. https://doi.org/10.1161/CIRCHEARTFAILURE.121.009278. Circulation: Heart Failure. 2022;15:e009278

This article is a commentary on the following: Machine Learning-Based Prediction of Myocardial Recovery in Patients With Left Ventricular Assist Device Support

“There are only patterns, patterns on top of patterns, patterns that affect other patterns. Patterns hidden by patterns. Patterns within patterns[…]What we call chaos is just patterns we haven’t recognized.
What we call random is just patterns we can’t decipher[…]”
Chuck Palahniuk, Survivor1

See Article by Topkara et al

Artificial intelligence has recently garnered significant attention in popular media as advances in machine learning (ML), and particularly deep learning (DL), have made possible groundbreaking innovations, such as self-driving cars and voice-controlled virtual assistants. Cardiovascular medicine has not been immune to the enthusiasm surrounding ML, as evidenced by the exponential growth of publications in this space over the past 10 years (Figure). ML models have been employed across the spectrum of cardiovascular disease, particularly for patients with heart failure (HF), to automate time-consuming tasks, assist in the diagnosis or detection of disease, deliver insights into new disease phenotypes and pathophysiologic mechanisms, and, perhaps the most elusive task of all, accurately predict patient outcomes.

Figure. Publication timeline showing exponential growth of publications by year for the topic of machine learning in cardiovascular medicine. Data were exported from pubmed.ncbi.nlm.nih.gov.

In this edition of Circulation: Heart Failure, Topkara et al2 take on the important task of predicting myocardial recovery after durable left ventricular assist device (LVAD) implantation. The primary end point results of the RESTAGE-HF clinical trial (Remission from Stage D Heart Failure) were recently published, showing durable recovery in a large proportion of carefully selected LVAD patients on a standardized treatment protocol.3 Appropriate selection of patients for specialized care plans to maximize the probability of myocardial recovery is, therefore, key to allocating resources efficiently.
Although statistical risk prediction models for myocardial recovery after LVAD exist,4 this study is novel in the application of ML methodology to the problem.

In a population of over 20 000 patients from the Society of Thoracic Surgeons INTERMACS (Interagency Registry for Mechanically Assisted Circulatory Support) database, the authors first used least absolute shrinkage and selection operator (LASSO) logistic regression for feature selection among 98 possible risk factors derived from discrete variables included in the database. Next, they used the resulting 28 variables (or features in ML terminology) to evaluate the discriminative ability of 5 different ML models (Bayesian logistic regression, support vector machine, gradient boosted decision tree, neural network, and random forest) in predicting myocardial recovery, which was defined as LVAD explant for myocardial recovery. The authors reported that these ML models (area under the receiver operating characteristic curve [AUC] 0.813–0.824) all outperformed established statistical regression-based recovery prediction models derived in earlier versions of the INTERMACS database (AUC 0.744–0.748) and identified a set of previously underappreciated features, including a history of noncompliance, tobacco/alcohol use, and limited social support at the time of LVAD implant, that were, seemingly paradoxically, associated with myocardial recovery. This is an important step towards early identification of patients after LVAD implantation who might benefit from intensive guideline-directed medical therapies for HF to maximize chances of myocardial recovery.
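To make the reported metrics concrete, the following is a minimal Python sketch, not the authors' code, of what an AUC measures: the probability that a randomly chosen patient who recovered receives a higher predicted score than a randomly chosen patient who did not. It also shows, via Bayes' rule, how positive predictive value depends on how common the outcome is. The function names (`auc`, `positive_predictive_value`), the toy scores, and the sensitivity, specificity, and prevalence values are illustrative assumptions and are not taken from the study.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def positive_predictive_value(
    sensitivity: float, specificity: float, prevalence: float
) -> float:
    """Bayes' rule: share of predicted positives that are true positives."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Toy cohort (hypothetical): 1 = recovered, 0 = did not recover.
    labels = [1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]
    print(f"toy AUC: {auc(labels, scores):.3f}")

    # Same hypothetical operating point, two assumed prevalences.
    for prevalence in (0.02, 0.20):
        ppv = positive_predictive_value(0.75, 0.94, prevalence)
        print(f"prevalence {prevalence:.2f} -> PPV {ppv:.2f}")
```

On the toy data the AUC is 0.875, because 7 of the 8 positive-negative pairs are ranked correctly. At the assumed operating point, positive predictive value is about 0.20 when prevalence is 2% and about 0.76 when prevalence is 20%, which illustrates why a model with good discrimination can still flag many patients who never experience a rare outcome.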
It should be noted that only one in 5 patients predicted to recover by the ML model underwent device explant for myocardial recovery at 4 years postimplant, driven, in part, by the overall low incidence of myocardial recovery in LVAD patients.

Interestingly, when the authors derived a novel risk score using a basic logistic regression model on the same contemporary INTERMACS data set used to train the ML models, the best-performing ML model only marginally outperformed this simpler statistical model (AUC 0.824 versus 0.796, P=0.046). This finding is consistent with prior risk prediction studies in HF cohorts comparing ML methods to traditional statistical risk modeling, which have demonstrated minimal incremental improvement in prediction metrics with ML.5 However, this is not necessarily a shortcoming of ML methodology, but rather of its implementation. Namely, the simple application of increasingly complex mathematical functions to the same discrete data elements has quickly diminishing returns.

As the authors point out, one limitation of the current study is that more complex, unstructured source data (eg, raw imaging data, free text from clinical notes, ECG and hemodynamic waveforms, genome sequencing, and time-series data) were not available, given that the INTERMACS database consists almost exclusively of structured, tabular data. However, one could argue that the primary advantage of modern ML models, chiefly DL architectures, is their ability to effectively model complex, unstructured data sources in ways that were not previously possible using traditional statistical or mathematical modeling. As opposed to the simple single hidden-layer neural networks utilized in the current study, deep neural networks are capable of modeling complex data inputs via a series of learned features and nonlinear transformations without the need for manual feature preprocessing, instead extracting important features for a specific task in an automated fashion.
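As a rough illustration of the distinction drawn above between a single hidden-layer network and a deeper one, the sketch below implements a forward pass through a stack of fully connected layers in plain Python. Each hidden layer applies a linear map followed by a nonlinearity (ReLU is assumed here), so depth simply means composing more such transformations. The weights are fixed, hypothetical numbers chosen for readability; in practice they would be learned from data, and this is not the architecture used in the study.

```python
import math
from typing import List, Sequence, Tuple

# One layer = (weight rows, biases); each weight row produces one output unit.
Layer = Tuple[List[List[float]], List[float]]


def dense(x: Sequence[float], weights: List[List[float]], biases: List[float]) -> List[float]:
    """Linear map: one weighted sum plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(values: Sequence[float]) -> List[float]:
    """Elementwise nonlinearity applied after each hidden layer."""
    return [max(0.0, v) for v in values]


def sigmoid(z: float) -> float:
    """Squash the output logit into a probability-like score."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x: Sequence[float], layers: Sequence[Layer]) -> float:
    """Apply every hidden layer (linear map + ReLU), then a sigmoid output unit."""
    *hidden, output = layers
    h = list(x)
    for weights, biases in hidden:
        h = relu(dense(h, weights, biases))
    out_weights, out_biases = output
    return sigmoid(dense(h, out_weights, out_biases)[0])


# Hypothetical weights: a shallow network with one hidden layer ...
hidden_1: Layer = ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0])
output_unit: Layer = ([[1.0, 1.0]], [0.1])
shallow = [hidden_1, output_unit]

# ... and a deeper one that composes a second nonlinear transformation.
hidden_2: Layer = ([[1.0, 0.0], [-1.0, 1.0]], [0.0, 0.0])
deep = [hidden_1, hidden_2, output_unit]

if __name__ == "__main__":
    x = [2.0, 1.0]  # two made-up input features
    print(f"shallow score: {forward(x, shallow):.3f}")
    print(f"deep score:    {forward(x, deep):.3f}")
```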
This characteristic of DL models has made them particularly adept at computer vision and natural language processing tasks, dramatically outperforming the previous state of the art in these domains. Unsurprisingly, prior investigators have found that including imaging,6 clinical notes,7 ECG waveforms,8 wearable sensor data,9 or longitudinal data10 in a DL framework improves the performance of risk prediction modeling compared with the use of tabular data alone. DL models that efficiently handle several of these unstructured data types at once, as inputs to a multimodal framework, have also yielded encouraging results.11

Clearly, there is promise in analyzing rich unstructured data sources towards unlocking hidden patterns in this data at scale and reducing some of the inherent stochasticity involved in HF outcomes prediction. However, there are a few considerations related to this approach that deserve mention. First, models used to predict outcomes in patients with HF must be explainable or interpretable to be clinically useful. Although the paradigm has traditionally been that more complex data sets and more complex modeling lead to decreased transparency into how a model arrived at a certain prediction (the black box of ML), significant progress has been made, and there is active research into lifting the lid on these models using methods such as heatmaps for visual explanations of model predictions.12 Second, more complex, unstructured data sources typically require larger amounts of data for training to prevent overfitting, a phenomenon that limits a model’s generalizability to external data sets. Indeed, it was the curation of large publicly available unstructured data sets such as ImageNet,13 a collection of over 14 million labeled images from natural scenes, that paved the way for the broad success of modern DL-based computer vision systems.
While we are starting to see the emergence of large, anonymized, publicly available data sets of cardiovascular imaging14 and free-text clinical reports,15 the applicability of these data sets to the task of predicting outcomes in HF is limited due to the lack of robust clinical and outcomes data of similar quality to that included in clinical registries.

Unfortunately, existing large HF registries do not routinely provide such unstructured data. While this data might be available through an individual participating site’s core lab, it is exceedingly difficult to obtain raw imaging data, for example, for an entire registry cohort. One notable exception is the recent launch of the National Heart, Lung, and Blood Institute’s HeartShare program, a goal of which is to explicitly aggregate unstructured data sources including phenotypic data, images, and omics from patients with HF with preserved ejection fraction for large-scale analysis to elucidate mechanisms of disease and identify new targets for therapeutic intervention. This should serve as an example of the standard for modern HF registries: only when our data sets begin to match the capabilities of modern ML algorithms will we unleash the true potential of these technologies for HF outcomes prediction.

Article Information

Disclosures
Dr Wehbe has received research support from Pfizer and the American Society of Nuclear Cardiology that is outside the scope of this work and not relevant to this editorial.

Footnotes
The opinions expressed in this article are not necessarily those of the editors or of the American Heart Association.

References
1. Palahniuk C. Survivor. 1st ed. Norton; 1999.
2. Topkara VK, Elias P, Jain R, Sayer G, Burkoff D, Uriel N. Machine Learning-Based Prediction of Myocardial Recovery in Patients With Left Ventricular Assist Device Support. Circ Heart Fail. 2021; 14:20–27. doi: 10.1161/CIRCHEARTFAILURE.121.008711
3. Birks EJ, Drakos SG, Patel SR, Lowes BD, Selzman CH, Starling RC, Trivedi J, Slaughter MS, Alturi P, Goldstein D, et al. Prospective multicenter study of myocardial recovery using left ventricular assist devices (RESTAGE-HF [Remission from Stage D Heart Failure]): medium-term and primary end point results. Circulation. 2020; 142:2016–2028. doi: 10.1161/CIRCULATIONAHA.120.046415
4. Topkara VK, Garan AR, Fine B, Godier-Furnémont AF, Breskin A, Cagliostro B, et al. Myocardial recovery in patients receiving contemporary left ventricular assist devices. Circ Heart Fail. 2016; 9:1–12. doi: 10.1161/CIRCHEARTFAILURE.116.003157
5. Wehbe RM, Khan SS, Shah SJ, Ahmad FS. Predicting high-risk patients and high-risk outcomes in heart failure. Heart Fail Clin. 2020; 16:387–407. doi: 10.1016/j.hfc.2020.05.002
6. Eisenberg E, McElhinney PA, Commandeur F, Chen X, Cadet S, Goeller M, Razipour A, Gransar H, Cantu S, Miller RJH, et al. Deep learning-based quantification of epicardial adipose tissue volume and attenuation predicts major adverse cardiovascular events in asymptomatic subjects. Circ Cardiovasc Imaging. 2020; 13:e009829. doi: 10.1161/CIRCIMAGING.119.009829
7. Huang K, Altosaar J, Ranganath R. ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission. arXiv 2019.
8. Raghunath S, Pfeifer JM, Ulloa-Cerna AE, Nemani A, Carbonati T, Jing L, vanMaanen DP, Hartzel DN, Ruhl JA, Lagerman BF, et al. Deep neural networks can predict new-onset atrial fibrillation from the 12-lead ECG and help identify those at risk of atrial fibrillation-related stroke. Circulation. 2021; 143:1287–1298. doi: 10.1161/CIRCULATIONAHA.120.047829
9. Ballinger B, Hsieh J, Singh A, Sohoni N, Wang J, Tison GH, Marcus GM, Sanchez JM, MacGuire C, Olgin JE, et al. DeepHeart: semi-supervised sequence learning for cardiovascular risk prediction. Proc AAAI Conf Artif Intell. 2018; 32:2079–2086.
10. Golas SB, Shibahara T, Agboola S, Otaki H, Sato J, Nakae T, Hisamitsu T, Kojima G, Felsted J, Kakarmath S, et al. A machine learning model to predict the risk of 30-day readmissions in patients with heart failure: a retrospective analysis of electronic medical records data. BMC Med Inform Decis Mak. 2018; 18:44. doi: 10.1186/s12911-018-0620-z
11. Huang SC, Pareek A, Seyyedi S, Banerjee I, Lungren MP. Fusion of medical imaging and electronic health records using deep learning: a systematic review and implementation guidelines. NPJ Digit Med. 2020; 3:136. doi: 10.1038/s41746-020-00341-z
12. Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: visual explanations from deep networks via gradient-based localization. Proc IEEE Int Conf Comput Vis. 2017;618–626. doi: 10.1109/ICCV.2017.74
13. Deng J, Dong W, Socher R, Li L-J, Kai L, Li F-F. ImageNet: a large-scale hierarchical image database. IEEE Conference on Computer Vision and Pattern Recognition, Institute of Electrical and Electronics Engineers (IEEE); 2009;248–255. doi: 10.1109/CVPR.2009.5206848
14. Ouyang D, He B, Ghorbani A, Lungren MP, Ashley EA, Liang DH, Zou JY. EchoNet-Dynamic: a Large New Cardiac Motion Video Data Resource for Medical Machine Learning. NeurIPS ML4H Workshop, 2019, p. 1–11.
15. Johnson AE, Pollard TJ, Shen L, Lehman LW, Feng M, Ghassemi M, Moody B, Szolovits P, Celi LA, Mark RG. MIMIC-III, a freely accessible critical care database. Sci Data. 2016; 3:160035. doi: 10.1038/sdata.2016.35

© 2021 American Heart Association, Inc.
PMID: 34949097
Keywords: deep learning; Editorials; artificial intelligence; heart failure; cardiovascular diseases; machine learning
Subjects: Big Data and Data Standards; Heart Failure; Machine Learning and Artificial Intelligence; Risk Factors