OWL: an optimized and independently validated machine learning prediction model for lung cancer screening based on the UK Biobank, PLCO, and NLST populations

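Medicine; Receiver operating characteristic; Biobank; National Lung Screening Trial; Lung cancer; Lung cancer screening; Population; Discriminant; Risk assessment; Cancer; Machine learning; Internal medicine; Artificial intelligence; Oncology; Computer science; Bioinformatics; Environmental health; Biology; Computer security
Authors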
Zoucheng Pan, Ruyang Zhang, Sipeng Shen, Yunzhi Lin, Longyao Zhang, Xiang Wang, Qian Ye, Xuan Wang, Jiajin Chen, Yang Zhao, David C. Christiani, Yi Li, Feng Chen, Yongyue Wei
Source
Journal: EBioMedicine [Elsevier]
Volume/issue: 88: 104443-104443
Identifier
DOI:10.1016/j.ebiom.2023.104443
Abstract

A reliable risk prediction model is critically important for identifying individuals at high risk of developing lung cancer as candidates for low-dose chest computed tomography (LDCT) screening. Leveraging a cutting-edge machine learning technique that accommodates a wide list of questionnaire-based predictors, we sought to optimize and validate a lung cancer prediction model.

We developed an Optimized early Warning model for Lung cancer risk (OWL) using the XGBoost algorithm with 323,344 participants from the England area of the UK Biobank (UKB; training set), and independently validated it with 93,227 participants from the UKB Scotland and Wales areas (validation set 1), 70,605 and 66,231 participants in the control and intervention subpopulations, respectively, of the Prostate, Lung, Colorectal, and Ovarian cancer screening trial (PLCO; validation sets 2 & 3), and 23,138 and 18,669 participants in the control and intervention subpopulations, respectively, of the United States National Lung Screening Trial (NLST; validation sets 4 & 5). We assessed the discrimination of OWL by the area under the receiver operating characteristic curve (AUC) at the designed time point, comparing it with three competitive prediction models: PLCO modified 2012 (PLCOm2012), PLCO modified 2014 (PLCOall2014), and the Liverpool Lung cancer Project risk model version 3 (LLPv3). We further evaluated calibration using the relative improvement in the ratio of expected to observed lung cancer cases (RIEO), and illustrated clinical utility by decision curve analysis.

In the general population of validation set 1, OWL (AUC = 0.855, 95% CI: 0.829-0.880) showed better discrimination than PLCOall2014 (AUC = 0.821, 95% CI: 0.794-0.848) (p < 0.001); in validation sets 2 & 3, the AUC of OWL was comparable to that of PLCOall2014 (AUC of PLCOall2014 minus AUC of OWL < 1%). Among ever-smokers in validation set 1, OWL outperformed PLCOm2012 and PLCOall2014 (OWL: AUC = 0.842, 95% CI: 0.814-0.871; PLCOm2012: AUC = 0.792, 95% CI: 0.760-0.823; PLCOall2014: AUC = 0.791, 95% CI: 0.760-0.822; all p < 0.001). Among ever-smokers in validation sets 2 to 5, OWL remained comparable to PLCOm2012 and PLCOall2014 in discrimination (AUC difference from -0.014 to 0.008). In all the validation sets, OWL outperformed LLPv3 among both the general population and ever-smokers. Of note, OWL showed significantly better calibration than PLCOm2012 and PLCOall2014 (RIEO from 43.1% to 92.3%, all p < 0.001) and LLPv3 (RIEO from 41.4% to 98.7%, all p < 0.001) in most cases. For clinical utility, OWL exhibited a significant improvement in average net benefit (NB) over PLCOall2014 in validation set 1 (NB improvement: 32, p < 0.001); among ever-smokers in validation set 1, OWL (average NB = 289) retained a significant improvement over PLCOm2012 (average NB = 213) (p < 0.001). OWL had NBs equivalent to those of PLCOm2012 and PLCOall2014 in the PLCO and NLST populations, while outperforming LLPv3 in all three populations.

OWL, with a high degree of predictive accuracy and robustness, is a general framework with scientific justifications and clinical utility that can aid in screening individuals at high risk of lung cancer.

Funding: National Natural Science Foundation of China, the US NIH.
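The three kinds of evaluation named in the abstract (discrimination, calibration, clinical utility) can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes a binary outcome at a fixed horizon rather than the time-dependent AUC used in the study, it computes only the plain expected-to-observed ratio (the abstract does not define how RIEO is derived from it), and it uses the standard decision-curve net benefit formula. All function names and the toy data are hypothetical.

```python
from typing import Sequence


def auc(risk: Sequence[float], event: Sequence[int]) -> float:
    """Rank-based AUC: chance that a random case is scored above a random
    non-case, counting ties as one half. Assumes a binary outcome."""
    cases = [r for r, e in zip(risk, event) if e == 1]
    controls = [r for r, e in zip(risk, event) if e == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def expected_observed_ratio(risk: Sequence[float], event: Sequence[int]) -> float:
    """Expected cases (sum of predicted risks) divided by observed cases.
    A value of 1.0 indicates perfect calibration-in-the-large."""
    return sum(risk) / sum(event)


def net_benefit(risk: Sequence[float], event: Sequence[int], threshold: float) -> float:
    """Decision-curve net benefit at one risk threshold:
    TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(risk)
    tp = sum(1 for r, e in zip(risk, event) if r >= threshold and e == 1)
    fp = sum(1 for r, e in zip(risk, event) if r >= threshold and e == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical toy data: six people, two of whom developed lung cancer.
toy_risk = [0.9, 0.4, 0.3, 0.2, 0.1, 0.7]
toy_event = [1, 1, 0, 0, 0, 0]

if __name__ == "__main__":
    print("AUC:", auc(toy_risk, toy_event))
    print("E/O:", expected_observed_ratio(toy_risk, toy_event))
    print("NB at 0.25:", net_benefit(toy_risk, toy_event, 0.25))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result against the "screen everyone" and "screen no one" strategies.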
