Authors
Mohammad Moharrami, Elaheh Vahab, Mobina Bagherianlemraski, Ghazal Hemmati, Sonica Singhal, Carlos Quiñonez, Falk Schwendicke, Michael Glogauer
Abstract
Objectives: This systematic review aimed to evaluate the performance of deep learning (DL) models in detecting dental plaque and gingivitis from red, green, and blue (RGB) intraoral photographs.

Methods: A comprehensive literature search was conducted across the Medline, Scopus, Embase, and Web of Science databases up to January 31, 2025. The methodological characteristics and performance metrics of studies developing and validating DL models for classification, detection, or segmentation tasks were analysed. The risk of bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool, and the certainty of the evidence was evaluated with the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) framework.

Results: From 3307 identified records, 23 studies met the inclusion criteria. Of these, 10 focused on dental plaque, 11 on gingivitis, and two addressed both outcomes. The risk of bias was low in all QUADAS-2 domains for 11 studies, with low applicability concerns in nine. For dental plaque, DL models showed robust performance in the segmentation task, with intersection over union (IoU) values ranging from 0.64 to 0.86 (median 0.74). Three studies indicated that DL models outperformed dentists in identifying dental plaque when disclosing agents were not used. For gingivitis, the models showed potential but performed less well than for dental plaque, with IoU values ranging from 0.43 to 0.72 (median 0.63). The certainty of the evidence was moderate for dental plaque and low for gingivitis.

Conclusions: DL models demonstrate promising potential for detecting dental plaque and gingivitis from intraoral photographs, with superior performance in plaque detection. Leveraging accessible imaging devices such as smartphones, these models can enhance teledentistry and may facilitate early screening for periodontal disease. However, the lack of external testing, multicenter studies, and reporting consistency highlights the need for further research to ensure real-world applicability.