Computer science
Deep learning
Artificial intelligence
Speech recognition
Heart sounds
Cardiology
Medicine
Authors
Bruno Oliveira,André Lobo,Corrado Costa,Ricardo Fontes‐Carvalho,Miguel Coimbra,Francesco Renna
Identifier
DOI:10.1109/embc53108.2024.10782371
Abstract
We introduce a Gradient-weighted Class Activation Mapping (Grad-CAM) methodology to assess the performance of five distinct models for binary classification (normal/abnormal) of synchronized heart sounds and electrocardiograms. The models comprise a one-dimensional convolutional neural network (1D-CNN) using solely ECG signals, a two-dimensional convolutional neural network (2D-CNN) applied separately to PCG and ECG signals, and two multimodal models that employ both signals. In the multimodal models, we implement two fusion approaches: an early fusion and a late fusion. The results indicate a performance improvement when using an early fusion model for the joint classification of both signals, as opposed to using a PCG 2D-CNN or ECG 1D-CNN alone (e.g., ROC-AUC score of 0.81 vs. 0.79 and 0.79, respectively). Although the ECG 2D-CNN achieves a higher ROC-AUC score (0.82) than the early fusion model, it exhibits a lower F1-score (0.85 vs. 0.86). Grad-CAM reveals that the models tend to yield higher gradients in the QRS complex and T/P-wave of the ECG signal, as well as between the two PCG fundamental sounds (S1 and S2), when discerning normal from abnormal recordings, showing that the models focus on clinically relevant features of the recorded data.
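To illustrate the core Grad-CAM computation described in the abstract, the sketch below implements the standard Grad-CAM aggregation for a 1D-CNN layer in numpy: channel weights are obtained by global-average-pooling the class-score gradients, feature maps are combined with those weights, and the result is ReLU-rectified. This is a minimal, self-contained illustration with synthetic activations and gradients, not the authors' implementation; the function name and array shapes are assumptions for the example.

```python
import numpy as np

def grad_cam_1d(activations, gradients):
    """Minimal Grad-CAM relevance map for a 1D-CNN layer.

    activations: (channels, length) feature maps from the chosen conv layer
    gradients:   (channels, length) gradients of the class score w.r.t. those maps
    Returns a (length,) relevance map, ReLU-rectified and max-normalized.
    """
    # Channel weights: global-average-pool the gradients over the time axis
    weights = gradients.mean(axis=1)                       # shape (channels,)
    # Weighted sum of feature maps across channels, then ReLU
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)
    if cam.max() > 0:
        cam = cam / cam.max()                              # normalize to [0, 1]
    return cam

# Toy example: a 4-channel layer over 8 time steps with synthetic values
rng = np.random.default_rng(0)
acts = rng.random((4, 8))
grads = rng.standard_normal((4, 8))
cam = grad_cam_1d(acts, grads)
print(cam.shape)
```

In practice the activations and gradients would come from backward hooks on the last convolutional layer of the trained ECG or PCG network, and the resulting map would be upsampled to the input length to highlight regions such as the QRS complex or the S1–S2 interval.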