Automated quality control of T1-weighted brain MRI scans for clinical research: methods comparison and design of a quality prediction classifier

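Keywords: computer science; visual inspection; scanner; neuroimaging; classifier (UML); artificial intelligence; image quality; reliability (semiconductor); data mining; quality (philosophy); machine learning; pattern recognition (psychology); medicine; epistemology; psychiatry; image (mathematics); power (physics); philosophy; physics; quantum mechanics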

Authors

Gaurav Bhalerao, Grace Gillis, M Dembélé, Sana Suri, Klaus P. Ebmeier, Johannes Klein, Joshua Shulman, Clare E. Mackay, Ludovica Griffanti

Links

medrxiv.org, doi.org

Identifier

DOI: 10.1101/2024.04.12.24305603

Abstract

Introduction. T1-weighted MRI is widely used in clinical neuroimaging for studying brain structure and its changes, including those related to neurodegenerative diseases, and as an anatomical reference for analysing other modalities. Ensuring high-quality T1-weighted (T1w) scans is vital because image quality affects the reliability of outcome measures. However, visual inspection can be subjective and time-consuming, especially with large datasets, and the effectiveness of automated quality control (QC) tools for clinical cohorts remains uncertain. In this study, we used T1w scans from elderly participants within ageing and clinical populations to test the accuracy of existing QC tools with respect to visual QC and to establish a new quality prediction framework for clinical research use.

Methods. Four datasets acquired from multiple scanners and sites were used (N = 2438; 11 sites; 39 scanner manufacturer models; 3 field strengths: 1.5T, 3T, and 2.9T; patients and controls; average age 71 ± 8 years). All structural T1w scans were processed with two standard automated QC pipelines (MRIQC and CAT12). The agreement of the accept-reject ratings was compared between the automated pipelines and with visual QC. We then designed a quality prediction framework that combines the QC measures from the existing automated tools and is trained on clinical datasets. We tested the classifier performance using cross-validation on data from all sites together, also examining the performance across diagnostic groups. We then tested the generalisability of our approach when leaving one site out and explored how well it generalises to data from a scanner manufacturer and/or field strength different from those used for training.

Results. Agreement between automated QC tools and visual QC was statistically significant when considering the entire dataset (Kappa = 0.30 with MRIQC predictions; Kappa = 0.28 with CAT12’s rating), but it was highly variable across datasets. Our proposed robust undersampling boost (RUS) classifier achieved 87.7% balanced accuracy on the test data combined from different sites (86.6% and 88.3% balanced accuracy on scans from patients and controls, respectively). The classifier also generalised across different combinations of training and test datasets (leave-one-site-out: 78.2% average balanced accuracy; exploratory models: 77.7% average balanced accuracy).

Conclusion. While existing QC tools may not be robustly applicable to datasets comprising older adults, who have a higher rate of atrophy, they produce quality metrics that can be leveraged to train more robust quality control classifiers for ageing and clinical cohorts.
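The abstract reports two kinds of figures: Cohen's kappa for agreement between automated and visual accept-reject ratings, and balanced accuracy for the classifier. As a reading aid, the minimal Python sketch below shows how each is computed from two label lists. It is not the authors' code: the function names and the ten toy ratings are invented for illustration, and only the standard textbook definitions of the two metrics are assumed.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    # Observed proportion of scans on which the two raters agree.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so a rare 'reject' class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical toy ratings (NOT data from the study): 8 accepts, 2 rejects.
visual = ["accept"] * 8 + ["reject"] * 2
automated = ["accept"] * 7 + ["reject"] + ["reject", "accept"]

print("kappa:", round(cohen_kappa(visual, automated), 3))
print("balanced accuracy:", balanced_accuracy(visual, automated))
```

On these toy labels the raw agreement is 80%, yet kappa is much lower because most of that agreement is expected by chance when accepts dominate. The same class imbalance is why balanced accuracy, rather than plain accuracy, is the natural headline metric for a QC classifier in which rejected scans are the minority.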
