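Computer science
Visual inspection
Scanner
Neuroimaging
Classifier (UML)
Artificial intelligence
Image quality
Reliability (semiconductor)
Data mining
Quality (philosophy)
Machine learning
Pattern recognition (psychology)
Medicine
Epistemology
Psychiatry
Image (mathematics)
Power (physics)
Philosophy
Physics
Quantum mechanics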
Authors
Gaurav Bhalerao, Grace Gillis, M Dembélé, Sana Suri, Klaus P. Ebmeier, Johannes Klein, Joshua Shulman, Clare E. Mackay, Ludovica Griffanti
Identifier
DOI:10.1101/2024.04.12.24305603
Abstract
Introduction: T1-weighted (T1w) MRI is widely used in clinical neuroimaging for studying brain structure and its changes, including those related to neurodegenerative diseases, and as an anatomical reference for analysing other modalities. Ensuring high-quality T1w scans is vital because image quality affects the reliability of outcome measures. However, visual inspection can be subjective and time-consuming, especially with large datasets. The effectiveness of automated quality control (QC) tools for clinical cohorts remains uncertain. In this study, we used T1w scans from elderly participants within ageing and clinical populations to test the accuracy of existing QC tools with respect to visual QC and to establish a new quality prediction framework for clinical research use.
Methods: Four datasets acquired from multiple scanners and sites were used (N = 2438, 11 sites, 39 scanner manufacturer models, 3 field strengths: 1.5T, 3T, 2.9T, patients and controls, average age 71 ± 8 years). All structural T1w scans were processed with two standard automated QC pipelines (MRIQC and CAT12). The agreement of the accept-reject ratings was compared between the automated pipelines and with visual QC. We then designed a quality prediction framework that combines the QC measures from the existing automated tools and is trained on clinical datasets. We tested the classifier performance using cross-validation on data from all sites together, also examining the performance across diagnostic groups. We then tested the generalisability of our approach when leaving one site out and explored how well it generalises to data from a scanner manufacturer and/or field strength different from those used for training.
Results: Our results show significant agreement between automated QC tools and visual QC (Kappa = 0.30 with MRIQC predictions; Kappa = 0.28 with CAT12's rating) when considering the entire dataset, but the agreement was highly variable across datasets. Our proposed robust undersampling boost (RUS) classifier achieved 87.7% balanced accuracy on the test data combined from different sites (with 86.6% and 88.3% balanced accuracy on scans from patients and controls respectively). This classifier was also found to generalise across different combinations of training and test datasets (leave-one-site-out = 78.2% average balanced accuracy; exploratory models = 77.7% average balanced accuracy).
Conclusion: While existing QC tools may not be robustly applicable to datasets composed of older adults who have a higher rate of atrophy, they produce quality metrics that can be leveraged to train more robust quality control classifiers for ageing and clinical cohorts.
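The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes that the reported Kappa is Cohen's kappa for two raters and that balanced accuracy is the mean of per-class recall; the function names and the toy accept/reject labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of scans given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, which is not dominated by the majority class."""
    recalls = []
    for label in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Toy labels invented for illustration (not data from the study):
# visual QC accepts 8 scans and rejects 2; the automated tool disagrees on 2.
visual = ["accept"] * 8 + ["reject"] * 2
automated = ["accept"] * 7 + ["reject"] + ["reject", "accept"]

kappa = cohen_kappa(visual, automated)            # approx. 0.375
bal_acc = balanced_accuracy(visual, automated)    # 0.6875
print(f"kappa={kappa:.3f}, balanced accuracy={bal_acc:.4f}")
```

In this toy case the raw agreement is 80%, yet kappa is only about 0.375 and balanced accuracy is 0.6875, because rejected scans are rare. This is why chance-corrected and class-balanced metrics are informative when most scans pass QC.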