Authors
Sergio L. Novi, Nithya Navarathna, Marcel D’Cruz, Justin R. Brooks, Bradley A. Maron, Amal Isaiah
Abstract
Importance
Deep learning (DL), a subset of artificial intelligence, uses multilayered neural networks to uncover complex patterns in large datasets without manual feature engineering. Unlike traditional machine learning, DL autonomously learns hierarchical representations from raw data, offering distinct advantages for analyzing images (eg, stroboscopy) and physiologic signals (eg, cochlear implant optimization). Despite these advances, DL remains conceptually difficult for many clinicians to integrate into routine clinical practice. This narrative review sought to synthesize recent DL applications and propose a framework for their integration in otolaryngology.

Observations
A total of 1422 articles (2020-2025) were screened, and 327 original research studies on DL in otolaryngology were included in the analysis. The included articles were categorized into 4 domains: detection and diagnosis (179 [55%]), prediction and prognostics (16 [5%]), image segmentation (93 [28%]), and emerging applications (39 [12%]). Proof-of-concept studies have demonstrated that DL systems can achieve diagnostic performance comparable to that of experts, with models accurately identifying nasopharyngeal carcinoma (92%), laryngeal malignant neoplasms (86%), and otologic pathology (>95%). Prognostic applications included survival stratification in oropharyngeal cancer and recurrence prediction in chronic rhinosinusitis. Segmentation models reliably delineated anatomical regions. Emerging uses encompassed hearing aid optimization, surgical instrument tracking, and intraoperative landmark identification. Further progress requires multi-institutional datasets, standardized acquisition protocols, and transparent, interpretable models to improve trust and clinical adoption.

Conclusions and Relevance
This narrative review found that DL applications in otolaryngology show potential for improving diagnostic performance, predicting outcomes, and providing intraoperative guidance.
Widespread and equitable adoption will depend on harmonized, high-quality, and representative datasets, together with mitigation of algorithmic bias and robust model interpretability. Federated learning and explainability are emerging frameworks that support privacy preservation and increased clinician trust. Standardized reporting, prospective validation, human-in-the-loop models, and interdisciplinary partnerships can help balance the promise of algorithmic approaches with their clinical utility, ensuring that DL tools contribute meaningfully to patient care.