OWL: an optimized and independently validated machine learning prediction model for lung cancer screening based on the UK Biobank, PLCO, and NLST populations

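Medicine; Receiver operating characteristic; Biobank; National Lung Screening Trial; Lung cancer; Lung cancer screening; Population; Discriminant; Risk assessment; Cancer; Machine learning; Internal medicine; Artificial intelligence; Oncology; Computer science; Bioinformatics; Environmental health; Computer security; Biology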
Authors
Zoucheng Pan, Ruyang Zhang, Sipeng Shen, Yunzhi Lin, Longyao Zhang, Xiang Wang, Qian Ye, Xuan Wang, Jiajin Chen, Yang Zhao, David C. Christiani, Yi Li, Feng Chen, Yongyue Wei
Source
Journal: EBioMedicine [Elsevier]
Volume and pages: 88: 104443
Identifier
DOI:10.1016/j.ebiom.2023.104443
Abstract

A reliable risk prediction model is critically important for identifying individuals at high risk of developing lung cancer as candidates for low-dose chest computed tomography (LDCT) screening. Leveraging a cutting-edge machine learning technique that accommodates a wide list of questionnaire-based predictors, we sought to optimize and validate a lung cancer prediction model.

We developed an Optimized early Warning model for Lung cancer risk (OWL) using the XGBoost algorithm with 323,344 participants from the England area in UK Biobank (training set), and independently validated it with 93,227 participants from the UKB Scotland and Wales area (validation set 1), 70,605 and 66,231 participants in the control and intervention subpopulations, respectively, of the Prostate, Lung, Colorectal, and Ovarian cancer screening trial (PLCO) (validation sets 2 & 3), and 23,138 and 18,669 participants in the control and intervention subpopulations, respectively, of the United States National Lung Screening Trial (NLST) (validation sets 4 & 5). By comparing with three competitive prediction models, i.e., PLCO modified 2012 (PLCOm2012), PLCO modified 2014 (PLCOall2014), and the Liverpool Lung cancer Project risk model version 3 (LLPv3), we assessed the discrimination of OWL by the area under the receiver operating characteristic curve (AUC) at the designed time point. We further evaluated calibration using the relative improvement in the ratio of expected to observed lung cancer cases (RIEO), and illustrated clinical utility by decision curve analysis.

In the general population, with validation set 1, OWL (AUC = 0.855, 95% CI: 0.829-0.880) showed better discriminative capability than PLCOall2014 (AUC = 0.821, 95% CI: 0.794-0.848) (p < 0.001); with validation sets 2 & 3, the AUC of OWL was comparable to that of PLCOall2014 (AUC of PLCOall2014 minus AUC of OWL < 1%). Among ever-smokers in validation set 1, OWL outperformed PLCOm2012 and PLCOall2014 (OWL: AUC = 0.842, 95% CI: 0.814-0.871; PLCOm2012: AUC = 0.792, 95% CI: 0.760-0.823; PLCOall2014: AUC = 0.791, 95% CI: 0.760-0.822; all p < 0.001). Among ever-smokers in validation sets 2 to 5, OWL remained comparable to PLCOm2012 and PLCOall2014 in discrimination (AUC difference from -0.014 to 0.008). In all the validation sets, OWL outperformed LLPv3 among both the general population and ever-smokers. Of note, OWL showed significantly better calibration than PLCOm2012 and PLCOall2014 (RIEO from 43.1% to 92.3%, all p < 0.001) and LLPv3 (RIEO from 41.4% to 98.7%, all p < 0.001) in most cases. For clinical utility, OWL exhibited a significant improvement in average net benefit (NB) over PLCOall2014 in validation set 1 (NB improvement: 32, p < 0.001); among ever-smokers in validation set 1, OWL (average NB = 289) retained a significant improvement over PLCOm2012 (average NB = 213) (p < 0.001). OWL had NBs equivalent to those of PLCOm2012 and PLCOall2014 in the PLCO and NLST populations, while outperforming LLPv3 in all three populations.

OWL, with a high degree of predictive accuracy and robustness, is a general framework with scientific justifications and clinical utility that can aid in screening individuals at high risk of lung cancer.

Funding: National Natural Science Foundation of China and the US NIH.
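
The three evaluation quantities named in the abstract (AUC for discrimination, the expected-to-observed ratio for calibration, and net benefit for decision curve analysis) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy data are hypothetical, the AUC here treats the outcome as a simple binary label rather than at a designed time point, and the net benefit uses the standard decision-curve formula (true positives per person minus false positives per person weighted by the threshold odds), which is assumed rather than confirmed by the abstract. RIEO is not sketched because the abstract does not state its formula.

```python
def auc(scores, outcomes):
    """Rank-based AUC: chance that a random case scores above a random
    non-case, with ties counted as one half."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def expected_observed_ratio(risks, outcomes):
    """Calibration summary: sum of predicted risks over observed case count.
    A value of 1 means predicted and observed case numbers agree."""
    return sum(risks) / sum(outcomes)


def net_benefit(risks, outcomes, threshold):
    """Standard decision-curve net benefit at one risk threshold."""
    n = len(risks)
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical toy data: predicted risks and observed lung cancer status.
risks = [0.05, 0.10, 0.20, 0.40, 0.60, 0.80]
outcomes = [0, 0, 1, 0, 1, 1]

print(auc(risks, outcomes))
print(expected_observed_ratio(risks, outcomes))
print(net_benefit(risks, outcomes, threshold=0.3))
```

A decision curve repeats the last call over a range of thresholds and compares the result with the "screen everyone" and "screen no one" strategies.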