Computer science
Artificial intelligence
Sleep (system call)
Computer vision
Computer security
Medical emergency
Medicine
Operating system
Authors
Wenjin Wang, Chuchu Liao, Jing-Yun Mai, Xiaoxiao He, Liping Pan, Ming Xia, Huiyi Lai, Xuhui Yang, Zhenlang Lin, Wenjin Wang
Identifier
DOI:10.1109/jbhi.2025.3542594
Abstract
Current camera-based infant monitoring focuses mainly on physiological measurement and overlooks its potential for semantic analysis, in particular for detecting accidental suffocation caused by oronasal occlusion during sleep. However, developing a robust infant suffocation risk detection model typically requires substantial labeled data, which is very difficult to obtain in real-world scenarios. To address this, we used a text-to-image diffusion model to generate diverse infant images depicting oronasal occlusion and non-occlusion scenarios controlled by text prompts. To ease the labeling process, self- and semi-supervised learning algorithms are leveraged to learn semantic information from unlabeled data, supported by a minimal amount of labeled data, to train different model architectures. To evaluate the feasibility of this solution, we conducted a clinical trial in the neonatology department, collecting video data from 22 infants under various oronasal occlusion scenarios created with breathable covers (e.g., clinical tissue). The clinical evaluation shows that most models trained on 25,000 generated images scored above 90% on accuracy, recall, and F1-score, outperforming conventional approaches that pre-train and fine-tune the model using over 90,000 labeled task-related online images. This demonstrates the feasibility of leveraging text-to-image generated data to achieve robust camera-based infant suffocation risk detection and thereby help keep infants safe during sleep. More importantly, it points to the potential of using text-based large-scale models to address the general scarcity of human data in artificial intelligence-based healthcare and clinical applications.
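For readers unfamiliar with the three metrics reported in the abstract, the following is a minimal sketch of how accuracy, recall, and F1-score can be computed for a binary occlusion classifier. The function name `binary_metrics`, the label convention (1 = oronasal occlusion, 0 = no occlusion), and the toy labels are assumptions made for illustration only; they are not taken from the paper, and the numbers are not the paper's results.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, recall, f1) for binary labels.

    Assumed convention (hypothetical, not from the paper):
    1 = oronasal occlusion, 0 = no occlusion.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # occlusion correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-occlusion correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed occlusion

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, recall, f1


# Toy labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
accuracy, recall, f1 = binary_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In a suffocation-risk setting, recall is the metric that tracks missed occlusions, which is presumably why it is reported alongside accuracy and F1-score.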