Abstract
We read with interest the study by Wang and colleagues proposing an MRI-based score to predict retreatment response for viable hepatocellular carcinoma (HCC) after initial transarterial chemoembolization (TACE) [1]. The authors tackle a common and clinically meaningful challenge: how best to risk-stratify patients with residual viable disease on early post-TACE MRI and guide subsequent management. We commend the investigators for assembling a multicentre cohort and for including an external validation effort, which together enhance the potential clinical relevance of this work. We would, however, appreciate clarification on several design and reporting aspects that are central to the clinical interpretability and generalizability of the proposed score.

First, the primary endpoint, objective response at approximately 6 months, may not be fully comparable across individuals if the interval between the 1-month MRI assessment and retreatment varies substantially. In routine practice, retreatment timing can be influenced by liver function, toxicity, logistics, and physician preference, and the ‘6-month’ MRI may therefore represent different post-retreatment time windows across patients [2]. If the assessment occurs at heterogeneous time points relative to the actual retreatment, the outcome could reflect, at least in part, treatment timing rather than intrinsic tumour behaviour. Reporting the distribution of time from the 1-month MRI to retreatment, as well as from retreatment to the response assessment, and considering a sensitivity analysis anchored to a fixed post-retreatment interval (or adjusting for these intervals) would help readers interpret the score as a predictor of retreatment sensitivity rather than of follow-up timing.

Second, retreatment strategies appear heterogeneous in the derivation cohort, whereas the external validation cohort consists of patients treated with locoregional therapy alone.
Such differences in therapeutic allocation can introduce confounding by indication: imaging features that suggest aggressive residual disease may drive escalation to systemic or combination therapy, which in turn affects the likelihood of subsequent radiologic response. In this context, it becomes challenging to disentangle whether the score predicts tumour biology, treatment selection, or their interaction [3]. We wonder whether the authors could more explicitly define the intended use case (for example, decision support among candidates for locoregional retreatment) and provide stratified performance and calibration within clinically homogeneous treatment subsets [4, 5]. This would also align the validation setting more closely with the proposed point-of-care application.

Third, the lesion-level model is developed using multiple viable lesions contributed by some patients. Because lesions within the same patient are not statistically independent, analyses that treat each lesion as an independent observation can underestimate uncertainty and overstate statistical significance [6]. Clarification on whether clustered data methods were used (e.g., generalized estimating equations, mixed-effects logistic regression, or cluster-robust standard errors) would be valuable. In addition, a sensitivity analysis restricted to one lesion per patient (such as the largest viable lesion or a randomly selected lesion) could further support the robustness of the identified imaging predictors.

We would be grateful for the authors' clarification on these issues, as doing so may help refine the clinical interpretation and potential use of the proposed score. We commend the investigators for this work and appreciate the opportunity to engage in this discussion.

Yicheng Huang: conceptualization and manuscript draft. Zichen Yu: critical revision for important intellectual content, final approval.

The authors have nothing to report.

The authors declare no conflicts of interest.

Data sharing is not applicable to this article, as no datasets were generated or analysed during the current study.