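Prostate cancer
Medicine
Magnetic resonance imaging
Interquartile range
Prostate
Radiology
Segmentation
Data set
Nuclear medicine
Artificial intelligence
Cancer
Computer science
Internal medicine
Authors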
Nils Netzer,Cedric Weißer,Patrick Schelb,Xianfeng Wang,Xiaoyan Qin,Magdalena Görtz,Viktoria Schütz,Jan Philipp Radtke,Thomas Hielscher,Constantin Schwab,Albrecht Stenzinger,Tristan Anselm Kuder,Regula Gnirs,Markus Hohenfellner,Heinz‐Peter Schlemmer,Klaus H. Maier‐Hein,David Bonekamp
Identifier
DOI:10.1097/rli.0000000000000791
Abstract
Background: The potential of deep learning to support radiologist prostate magnetic resonance imaging (MRI) interpretation has been demonstrated.
Purpose: The aim of this study was to evaluate the effects of increased and diversified training data (TD) on deep learning performance for detection and segmentation of clinically significant prostate cancer–suspicious lesions.
Materials and Methods: In this retrospective study, biparametric (T2-weighted and diffusion-weighted) prostate MRI acquired with multiple 1.5-T and 3.0-T MRI scanners in consecutive men was used for training and testing of prostate segmentation and lesion detection networks. Ground truth was the combination of targeted and extended systematic MRI–transrectal ultrasound fusion biopsies, with significant prostate cancer defined as International Society of Urological Pathology grade group greater than or equal to 2. U-Nets were internally validated on full, reduced, and PROSTATEx-enhanced training sets and subsequently externally validated on the institutional test set and the PROSTATEx test set. U-Net segmentation was calibrated to clinically desired levels in cross-validation, and test performance was subsequently compared using sensitivities, specificities, predictive values, and Dice coefficient.
Results: One thousand four hundred eighty-eight institutional examinations (median age, 64 years; interquartile range, 58–70 years) were temporally split into training (2014–2017, 806 examinations, supplemented by 204 PROSTATEx examinations) and test (2018–2020, 682 examinations) sets. In the test set, Prostate Imaging–Reporting and Data System (PI-RADS) cutoffs greater than or equal to 3 and greater than or equal to 4 on a per-patient basis had sensitivity of 97% (241/249) and 90% (223/249) at specificity of 19% (82/433) and 56% (242/433), respectively. The full U-Net had corresponding sensitivity of 97% (241/249) and 88% (219/249) with specificity of 20% (86/433) and 59% (254/433), not statistically different from PI-RADS (P > 0.3 for all comparisons). U-Net trained using a reduced set of 171 consecutive examinations achieved inferior performance (P < 0.001). PROSTATEx training enhancement did not improve performance. Dice coefficients were 0.90 for prostate and 0.42/0.53 for MRI lesion segmentation at PI-RADS category 3/4 equivalents.
Conclusions: In a large institutional test set, U-Net confirms similar performance to clinical PI-RADS assessment and benefits from more TD, with neither institutional nor PROSTATEx performance improved by adding multiscanner or bi-institutional TD.
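The per-patient percentages quoted in the abstract can be recomputed from the reported counts, and the Dice coefficient follows its standard overlap definition. The sketch below is illustrative only: the function names `percentage` and `dice` and the toy masks are assumptions made for this example, and the abstract does not describe how the study itself computed these metrics.

```python
def percentage(numerator: int, denominator: int) -> float:
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator


# Per-patient counts as quoted in the abstract for the institutional test set:
# 249 men with significant cancer, 433 without.
N_POSITIVE, N_NEGATIVE = 249, 433
reported_counts = {
    "PI-RADS >= 3": (241, 82),   # (true positives, true negatives)
    "PI-RADS >= 4": (223, 242),
    "U-Net (PI-RADS 3 equivalent)": (241, 86),
    "U-Net (PI-RADS 4 equivalent)": (219, 254),
}

# Sensitivity = TP / all positives; specificity = TN / all negatives.
metrics = {
    name: (percentage(tp, N_POSITIVE), percentage(tn, N_NEGATIVE))
    for name, (tp, tn) in reported_counts.items()
}


def dice(mask_a: set, mask_b: set) -> float:
    """Standard Dice coefficient 2|A ∩ B| / (|A| + |B|) on sets of voxel indices."""
    if not mask_a and not mask_b:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))


# Hypothetical toy masks (not study data): 3 voxels each, 2 shared.
predicted = {(0, 0), (0, 1), (1, 1)}
reference = {(0, 1), (1, 1), (1, 2)}
toy_dice = dice(predicted, reference)

if __name__ == "__main__":
    for name, (sens, spec) in metrics.items():
        print(f"{name}: sensitivity {sens:.1f}%, specificity {spec:.1f}%")
    print(f"toy Dice: {toy_dice:.3f}")
```

Rounded to whole percentages, the recomputed values agree with the figures stated in the abstract.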