Abstract
This article refers to 'Validation of the HFA-PEFF score for the diagnosis of heart failure with preserved ejection fraction' by A. Barandiarán Aizpurua et al., published in this issue on pages 413–421. Unlike heart failure with reduced ejection fraction (HFrEF), the diagnosis of heart failure with preserved ejection fraction (HFpEF) remains challenging and controversial. Apart from inherent challenges with labelling a pathologic condition with a normal value [i.e. left ventricular ejection fraction (LVEF)], discrimination from non-cardiac causes of dyspnoea among patients with preserved LVEF is particularly complicated due to overlapping co-morbidities such as obesity, chronic kidney disease, and pulmonary diseases. These diagnostic challenges are reflected in the heterogeneity in inclusion criteria employed in HFpEF clinical trials, making interpretation of results and comparison between trials difficult. Professional societies such as the American College of Cardiology/American Heart Association and the European Society of Cardiology (ESC) have provided diagnostic considerations for HFpEF,1, 2 but the uptake of these recommendations has been limited. As such, contemporary diagnostic approaches to HFpEF remain unstandardized across the world. Since 2018, two independently derived algorithms for the HFpEF diagnosis have been published. The H2FPEF score from the Mayo Clinic (Rochester, MN, USA) was based on clinical and echocardiographic characteristics that were modelled using logistic regression analyses with invasive haemodynamic testing as the gold standard.3 In contrast, the ESC Heart Failure Association (HFA)-PEFF 4-step algorithm was an expert consensus recommendation by a writing committee of leaders in the field.4 In this issue of the Journal, the study by Barandiarán Aizpurua et al.5 is the first to test the HFA-PEFF score in a cohort, as the score itself was not derived directly from patient data. 
The authors are to be congratulated for this well-conducted and timely study, with findings that provide ancillary information as this algorithm is being considered for implementation in practice. An important strength is the inclusion of two independent and distinctly different cohorts: one European cohort with primarily ambulatory heart failure (HF) (early stage), and a second US cohort of patients previously hospitalized for HF (more advanced stage). The cohorts displayed demographic differences consistent with disease heterogeneity between the US (younger, higher body mass index, more coronary artery disease) and Europe (older, higher blood pressure, more atrial fibrillation and valvular disease). Overall, the HFA-PEFF score performed well in discriminating patients believed to have HFpEF from those without, with an area under the receiver operating characteristic curve (ROC AUC) of 0.90. It was reassuring to see that the results were comparable in the two cohorts examined. The positive predictive value ('rule in') was particularly good, while the negative predictive value was moderate, illustrating the challenge of ruling out HFpEF among patients with dyspnoea. This emphasizes the importance of confirmatory testing (i.e. invasive haemodynamic testing, stress echocardiography) in intermediate-risk patients, who represented a substantial proportion of the studied population. Of note, such testing is recommended by the HFA-PEFF algorithm, but was not tested in the current study as the authors' calculation of the score was restricted to Step #2 [echocardiography and natriuretic peptides (NPs)] of the algorithm. For most, this appears to be the most relevant section as the majority of patients with suspected HFpEF, and everyone in this study, fulfil the pre-test assessment (Step #1).
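The contrast between a strong positive predictive value and a moderate negative predictive value can be illustrated with a minimal sketch. It assumes a single binary cut-off and applies Bayes' theorem; the sensitivity, specificity, and prevalence values are hypothetical and are not taken from the study, and the function name `predictive_values` is introduced here for illustration only.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(disease | test positive)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | test negative)
    return ppv, npv


# Hypothetical operating point (NOT from the study): sensitivity 0.70, specificity 0.90.
for prevalence in (0.2, 0.5, 0.8):
    ppv, npv = predictive_values(0.70, 0.90, prevalence)
    print(f"prevalence={prevalence:.1f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

Under these assumed inputs, the negative predictive value falls as the prevalence of disease in the tested population rises, which is one reason a 'rule out' threshold may perform less well in cohorts enriched for HFpEF.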
As for Step #3 (functional testing) and Step #4 (final aetiology), these advanced tests were not available in the study, and access to such tests is unfortunately also limited in most health systems worldwide. Hopefully the expert consensus recommendations for the HFA-PEFF score, with support from the current study, will encourage a broader application of such tests when accessible and affordable.
The critical caveat in assessing the diagnostic performance of a test, biomarker, or score is the definition of the diagnostic reference. This is especially challenging in HFpEF given the heterogeneous clinical presentation of the syndrome,6 particularly in cases of low NPs, grossly normal cardiac structure on routine echocardiography, and/or no overt signs and symptoms of volume overload. Accordingly, 'demonstration of elevated left ventricular diastolic pressure at rest or exercise by cardiac catheterization in the presence of signs and symptoms of HF and a preserved LVEF ≥50%' has been suggested as the gold standard diagnostic test.7 In the current study, as in most other cohort studies of HF, the reference diagnosis was based on adjudication by experts who had access to all test results (which indeed included invasive haemodynamic testing in some cases). Despite efforts by the authors to apply objective thresholds for echocardiography parameters and NPs, this method of adjudication may introduce a certain degree of bias that may partially explain the favourable test characteristics of the algorithm. Additionally, evaluation of diagnostic algorithms often suffers from self-fulfilling prophecies when adjudicators rely strongly on a variable that is also part of the score. In HF, this is particularly true for concentrations of NPs, which may explain why the biomarker subsection appears to drive the predictive value of the HFA-PEFF algorithm in the current study (ROC AUC 0.89 vs. 0.90 for the total score).
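For readers less familiar with the ROC AUC figures quoted above, the following sketch shows one standard way to interpret the statistic: as the probability that a randomly chosen patient with the reference diagnosis has a higher score than a randomly chosen patient without it, with ties counted as one half. The scores are invented for illustration and the function name `roc_auc` is hypothetical; nothing here reproduces the study data or the authors' analysis.

```python
def roc_auc(scores_cases, scores_controls):
    """AUC as P(case score > control score), counting ties as 0.5."""
    wins = 0.0
    for case in scores_cases:
        for control in scores_controls:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))


# Hypothetical integer scores, invented for illustration only.
cases = [5, 6, 4, 2, 5]      # patients with the reference diagnosis
controls = [1, 2, 0, 3, 1]   # patients without it
print(f"AUC = {roc_auc(cases, controls):.2f}")
```

If the reference diagnosis was itself informed by one component of the score, the AUC of that component alone would be expected to approach that of the total score, which is the pattern the biomarker comparison above suggests.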
Subsequent validation of the HFA-PEFF algorithm is needed against objective, invasively determined rest and exercise haemodynamics, together with comparison of its performance against the simpler, 6-variable H2FPEF score.
In light of the recent results from the PARAGON-HF (Prospective Comparison of Angiotensin Receptor-Neprilysin Inhibitor with Angiotensin-Receptor Blockers Global Outcomes in Heart Failure with Preserved Ejection Fraction) trial,8 current LVEF thresholds to define HF with reduced, mid-range, and preserved LVEF are being reconsidered to more directly align with observed therapeutic responses to investigational therapies. In PARAGON-HF, prespecified subgroup analysis suggested that there was a treatment interaction by LVEF, such that patients with LVEF in the lower range (i.e. ∼45–55%) appeared to benefit more from treatment with sacubitril/valsartan compared with valsartan. Interestingly, similar observations were made for candesartan in the CHARM (Candesartan in Heart failure - Assessment of Mortality and Morbidity) programme,9 and for spironolactone in the Americas region of the TOPCAT (Treatment of Preserved Cardiac Function Heart Failure With an Aldosterone Antagonist) trial.10 Thus, the current threshold of LVEF 40% may exclude a large group of patients who may benefit from standard HFrEF therapies. Nonetheless, for HF patients with completely preserved LVEF >55%, the general lack of therapeutic response to multiple classes of therapies suggests that characterization of the HFpEF syndrome beyond LVEF thresholds alone is urgently needed.11
In the meantime, we anticipate that diagnostics and definitions related to HFpEF will continue to evolve as science and investment in this space grow. At present, application of algorithms such as the H2FPEF and HFA-PEFF scores for diagnosing HFpEF may be useful, with the understanding that the gold standard diagnosis remains subject to debate and may change (Figure 1).
Dr. Myhre is supported by a postdoctoral research grant from South-Eastern Norway Regional Health Authority and University of Oslo. Dr. Vaduganathan is supported by the KL2/Catalyst Medical Research Investigator Training award from Harvard Catalyst | The Harvard Clinical and Translational Science Center (NIH/NCATS Award UL 1TR002541). Dr. Greene is supported by a Heart Failure Society of America/Emergency Medicine Foundation Acute Heart Failure Young Investigator Award funded by Novartis.
Conflict of interest: P.L.M. has consulted for Novartis. M.V. serves on advisory boards for Amgen, AstraZeneca, Baxter Healthcare, Bayer AG, and Boehringer Ingelheim, and participates in clinical endpoint committees for studies sponsored by Novartis and the NIH. S.J.G. has received research support from Amgen, Bristol-Myers Squibb, and Novartis, and serves on an advisory board for Amgen.