A clustering‐based approach to address correlated features in predicting genitourinary toxicity from MRI‐guided prostate SBRT

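Feature selection; Artificial intelligence; Computer science; Cluster analysis; Dimensionality reduction; Regression; Pattern recognition (psychology); Statistics; Mathematics
Authors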
Pouyan Rezapoor,Jonathan Pham,Beth K. Neilsen,Hengjie Liu,Minsong Cao,Yingli Yang,Ke Sheng,Ting Martin,James Lamb,Michael L. Steinberg,Amar U. Kishan,Zachary Taylor,Dan Ruan
Source
Journal: Medical Physics [Wiley]
Identifier
DOI:10.1002/mp.17834
Abstract

Background: Outcome analysis commonly works with a large set of candidate prognostic features. However, such high‐dimensional input combined with a relatively small sample size leads to a risk of overfitting, low generalizability, and correlation bias.

Purpose: This study addresses the mitigation of correlation bias in the context of predicting genitourinary (GU) toxicity in prostate cancer patients who underwent MRI‐guided stereotactic body radiation therapy (SBRT).

Methods: Typical dimension reduction or feature selection methods rely on sparsity‐inducing regularization or information criteria. However, when (subsets of) input features are heavily correlated, the weights assigned to the correlated features can be diluted to the extent that those features no longer contribute effectively to the prediction, leading to suboptimal feature discovery and prediction. We propose to perform advanced hierarchical clustering and then apply regression modeling to the cluster centroids. This approach addresses the challenges posed by high dimensionality and ill‐conditioning, and improves the accuracy and reliability of the resulting prediction models. The proposed method was evaluated on typical regression models with intrinsic feature reduction, namely Least Absolute Shrinkage and Selection Operator (LASSO) regularized logistic regression (LR), support vector machine (SVM), and decision trees (DT).

Results: Extensive experiments show that introducing cluster‐based feature compaction and representation improves all regression models under fair hyperparameter tuning conditions. Although LASSO‐LR and LR with clustered features performed similarly during training and validation, with LASSO‐LR slightly better, the cluster‐based feature method performed significantly better on the test set, reaching an AUC of 0.91 and an accuracy of 0.86, which demonstrates its advantage in stability and robustness. The overall best test performance is achieved by combining feature clustering into five representatives with SVM. An additional correlation study identified the individual features most closely representing the cluster centroids as the exposure volumes of the rectum at 2 Gy, the trigone at 2 Gy and 41 Gy, the urethra at 42 Gy, and the rectal wall at 42 Gy. This indicates the importance of hot spot control of the urethra, trigone, and rectal wall for toxicity control.

Conclusions: These findings underscore the superiority of the clustering method in mitigating correlation bias and enhancing predictive model accuracy. The current model also achieves state‐of‐the‐art (SOTA) performance in predicting GU toxicity in MRI‐guided prostate SBRT. Correlating dose features with the feature cluster centroids reveals the importance of hot spot control on the urethra, trigone, and rectal wall to reduce toxicity risk.
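
The pipeline described in the Methods (cluster correlated features, regress on the cluster centroids, then look up the individual feature closest to each centroid) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the linkage, the distance, or how centroids are formed, so the average linkage, the `1 - correlation` distance, the mean-of-standardized-features centroid, the helper names, and the synthetic data below are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler


def cluster_features(Z, n_clusters):
    """Group the columns of Z by hierarchical clustering (assumed settings)."""
    corr = np.corrcoef(Z, rowvar=False)
    dist = 1.0 - corr                 # assumed distance: 1 - Pearson correlation
    dist = (dist + dist.T) / 2.0      # enforce exact symmetry
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def centroid_features(Z, labels):
    """Represent each cluster by the mean of its standardized member features."""
    return np.column_stack(
        [Z[:, labels == k].mean(axis=1) for k in np.unique(labels)]
    )


def closest_members(Z, labels, centroids):
    """For each centroid, return the index of its most correlated member feature."""
    reps = []
    for j, k in enumerate(np.unique(labels)):
        members = np.flatnonzero(labels == k)
        r = [np.corrcoef(Z[:, m], centroids[:, j])[0, 1] for m in members]
        reps.append(int(members[int(np.argmax(r))]))
    return reps


# Synthetic stand-in data: 3 latent factors, each observed through 4 noisy,
# highly correlated features; only the first factor drives the binary outcome.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 3))
X = np.repeat(latent, 4, axis=1) + 0.3 * rng.normal(size=(n, 12))
y = (latent[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)

# Fit scaling and clustering on the training split only, to avoid leakage.
X_train, X_test, y_train, y_test = X[:150], X[150:], y[:150], y[150:]
scaler = StandardScaler().fit(X_train)
Z_train, Z_test = scaler.transform(X_train), scaler.transform(X_test)

labels = cluster_features(Z_train, n_clusters=3)
C_train = centroid_features(Z_train, labels)
C_test = centroid_features(Z_test, labels)

model = LogisticRegression().fit(C_train, y_train)
test_accuracy = model.score(C_test, y_test)
representatives = closest_members(Z_train, labels, C_train)
```

Because each centroid pools its correlated members into a single regressor, the model receives a few nearly uncorrelated inputs instead of many redundant ones, which is the conditioning benefit the abstract describes. The `representatives` list plays the role of the abstract's correlation study by mapping each centroid back to one interpretable original feature.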
