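Medicine
Skin lesion
Lesion
Dermatology
Translation (biology)
Task (project management)
Pathology
Messenger RNA
Biochemistry
Chemistry
Management
Economics
Gene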
Authors
Deval Mehta,Clare Primiero,Brigid Betz‐Stablein,Toan D. Nguyen,Yaniv Gal,Adrian Bowling,Martin Haskett,Maithili Sashindranath,C. Paul Bonnington,Victoria Mar,H. Peter Soyer,Zongyuan Ge
Abstract
Background: The surge in AI models for diagnosing skin lesions through image analysis is notable, yet their clinical implementation faces challenges. Common limitations include an overreliance on dermoscopy, a lack of real-world applicability when only a binary output (e.g. benign/malignant) is offered, and low accuracy when faced with rare skin conditions.
Objectives: To address these common constraints of limited diagnostic output and limited applicability to real-world settings.
Methods: We developed an All-In-One Hierarchical-Out of Distribution-Clinical Triage (HOT) AI model for skin lesion analysis. Trained on a large dataset of ~208,000 lesion images, our HOT AI model generates three outputs: a hierarchical three-level prediction, an alert for out-of-distribution (OOD) images, and a recommendation for dermoscopy to improve diagnostic prediction.
Results: Our hierarchical prediction output provides a binary Level 1 prediction (benign/malignant), a Level 2 prediction from eight possible categories (e.g. melanocytic and keratinocytic), and a more definitive Level 3 prediction from 44 lesion categories. The model produced high sensitivity for Level 1 prediction (88.14%, CI: 87.42–88.51) but significantly lower sensitivity for Level 3 prediction (63.90%, CI: 62.27–65.61). By relying on all three prediction levels for consensus, Level 1 false positives were reduced by 20–25% and false negatives by 11–13% of cases. In OOD detection, the model outperformed the previous landmark models it was benchmarked against. Lastly, 44% of images were recommended for dermoscopy, and with the additional image input, Level 3 sensitivity increased from 48.13% (CI: 45.08–49.57) to 52.54% (CI: 50.25–55.04).
Conclusions: Our HOT-AI model attempts to address common challenges in existing models by combining three tasks in one model to increase accuracy and clinical utility. By providing a more nuanced prediction and an alert for OOD images, the model output provides greater explainability of the AI decision process. Prospective clinical testing is required to measure how this additional output impacts user trust and how the model performs in a real-world setting.
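The abstract does not specify how the three prediction levels are combined into a consensus. The sketch below shows one plausible reading under stated assumptions: a Level 1 call is accepted only when the Level 2 and Level 3 predictions agree with it under a fixed label hierarchy, and is otherwise flagged for review. The `TAXONOMY` table, the function names, and the `"review"` outcome are all hypothetical illustrations, not the authors' 44-category hierarchy or their method.

```python
from typing import Dict, Tuple

# Hypothetical label hierarchy: Level 3 label -> (Level 2 category, Level 1 class).
# These four entries are illustrative only, not the paper's taxonomy.
TAXONOMY: Dict[str, Tuple[str, str]] = {
    "melanoma": ("melanocytic", "malignant"),
    "naevus": ("melanocytic", "benign"),
    "basal cell carcinoma": ("keratinocytic", "malignant"),
    "seborrhoeic keratosis": ("keratinocytic", "benign"),
}


def levels_agree(level1: str, level2: str, level3: str) -> bool:
    """Return True when the three predictions lie on one path of the hierarchy."""
    if level3 not in TAXONOMY:
        raise ValueError(f"unknown Level 3 label: {level3!r}")
    expected_level2, expected_level1 = TAXONOMY[level3]
    return level2 == expected_level2 and level1 == expected_level1


def consensus_call(level1: str, level2: str, level3: str) -> str:
    """Keep the Level 1 call if all levels agree; otherwise flag for review.

    Assumption: disagreement is treated as uncertainty rather than resolved
    automatically. The source does not describe its resolution rule.
    """
    return level1 if levels_agree(level1, level2, level3) else "review"


if __name__ == "__main__":
    print(consensus_call("malignant", "melanocytic", "melanoma"))
    print(consensus_call("benign", "melanocytic", "melanoma"))
```

Treating disagreement as a separate outcome is one way a consensus step could lower both false positives and false negatives at Level 1, since inconsistent predictions are set aside rather than counted as confident calls.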