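Generalizability theory
Computer science
Data science
Glioblastoma
Task (project management)
Scale (ratio)
Data sharing
Machine learning
Big data
Sample (material)
Artificial intelligence
Data mining
Medicine
Psychology
Chemistry
Management
Economics
Alternative medicine
Cancer research
Quantum mechanics
Chromatography
Pathology
Developmental psychology
Physics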
Authors
Sarthak Pati,Ujjwal Baid,Brandon Edwards,Micah Sheller,Shih‐Han Wang,G. Anthony Reina,Patrick Foley,А. Д. Груздев,Deepthi Karkada,Christos Davatzikos,Chiharu Sako,Satyam Ghodasara,Michel Bilello,Suyash Mohan,Philipp Kickingereder,Gianluca Brugnara,Chandrakanth Jayachandran Preetha,Felix Sahm,Klaus Maier‐Hein,Maximilian Zenk
Identifier
DOI:10.1038/s41467-022-33407-5
Abstract
Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even infeasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and a 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.