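Spotting
Computer science
Notation
Artificial intelligence
Speech recognition
Noise (video)
Pattern recognition (psychology)
Natural language processing
Mathematics
Arithmetic
Image (mathematics)
Authors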
Annalisa Baronetto,Luisa S. Graf,Sarah Fischer,Markus F. Neurath,Oliver Amft
Source
Journal: IEEE Journal of Biomedical and Health Informatics
[Institute of Electrical and Electronics Engineers]
Date: 2023-07-01
Volume (issue): 27 (7): 3164-3174
Citations: 1
Identifier
DOI:10.1109/jbhi.2023.3269910
Abstract
We analyse pretrained and non-pretrained deep neural models to detect 10-second bowel sound (BS) audio segments in continuous audio data streams. The models include MobileNet, EfficientNet, and Distilled Transformer architectures. Models were initially trained on AudioSet and then transferred and evaluated on 84 hours of labelled audio data from eighteen healthy participants. Evaluation data was recorded in a semi-naturalistic daytime setting, including movement and background noise, using a smart shirt with embedded microphones. The collected dataset was annotated for individual BS events by two independent raters with substantial agreement (Cohen's Kappa $\kappa = 0.74$). Leave-One-Participant-Out cross-validation for detecting 10-second BS audio segments, i.e. segment-based BS spotting, yielded a best F1 score of 73% and 67%, with and without transfer learning respectively. The best model for segment-based BS spotting was EfficientNet-B2 with an attention module. Our results show that pretrained models could improve the F1 score by up to 26%, in particular by increasing robustness against background noise. Our segment-based BS spotting approach reduces the amount of audio data to be reviewed by experts from 84 h to 11 h, i.e. by $\sim$87%.
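The evaluation protocol named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the helper names `lopo_splits`, `f1_score`, and `review_reduction` are hypothetical, labels are assumed to be binary per 10-second segment (1 = contains BS), and the participant IDs in the usage note are made up for illustration.

```python
from collections import defaultdict


def lopo_splits(participant_ids):
    """Leave-One-Participant-Out: yield (held_out, train_idx, test_idx).

    participant_ids[i] is the (hypothetical) participant label of segment i.
    Each participant is held out exactly once as the test set.
    """
    by_participant = defaultdict(list)
    for idx, pid in enumerate(participant_ids):
        by_participant[pid].append(idx)
    for held_out in sorted(by_participant):
        test_idx = by_participant[held_out]
        train_idx = sorted(
            i
            for pid, idxs in by_participant.items()
            if pid != held_out
            for i in idxs
        )
        yield held_out, train_idx, test_idx


def f1_score(y_true, y_pred):
    """Segment-level F1 for binary labels (1 = segment contains BS)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def review_reduction(total_hours, remaining_hours):
    """Fraction of audio that experts no longer need to review."""
    return 1.0 - remaining_hours / total_hours
```

For example, `list(lopo_splits(["p1", "p1", "p2"]))` produces one split per participant, and `review_reduction(84, 11)` reproduces the reduction of roughly 87% quoted in the abstract.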