Improving Surgical Site Infection Prediction Using Machine Learning: Addressing Challenges of Highly Imbalanced Data

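Surgical site infection, Computer science, Machine learning, Artificial intelligence, Medicine, Surgery
Authors
Salha Al-Ahmari, Farrukh Nadeem
Source
Journal: Diagnostics [Multidisciplinary Digital Publishing Institute]
Volume (Issue): 15 (4): 501-501
Identifier
DOI: 10.3390/diagnostics15040501
Abstract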

Background: Surgical site infections (SSIs) lead to higher hospital readmission rates and healthcare costs, representing a significant global healthcare burden. Machine learning (ML) has demonstrated potential in predicting SSIs; however, the challenge of addressing imbalanced class ratios remains.
Objectives: The aim of this study is to evaluate and enhance the predictive capabilities of machine learning models for SSIs by assessing the effects of feature selection, resampling techniques, and hyperparameter optimization.
Methods: Using routine SSI surveillance data from multiple hospitals in Saudi Arabia, we analyzed a dataset of 64,793 surgical patients, of whom 1632 developed SSI. Seven machine learning algorithms were created and tested: Decision Tree (DT), Gaussian Naive Bayes (GNB), Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), Stochastic Gradient Boosting (SGB), and K-Nearest Neighbors (KNN). We also improved several resampling strategies, such as undersampling and oversampling. Grid search with five-fold cross-validation was employed for comprehensive hyperparameter optimization, in conjunction with balanced sampling techniques. Features were selected using a filter method based on their relationships with the target variable.
Results: Our findings revealed that RF achieves the highest performance, with an MCC of 0.72. The synthetic minority oversampling technique (SMOTE) is the best-performing resampling technique, consistently enhancing the performance of most machine learning models, except for LR and GNB. LR struggles with class imbalance due to its linear assumptions and bias toward the majority class, while GNB's reliance on feature independence and Gaussian distribution makes it unreliable for under-represented minority classes. For computational efficiency, the Instance Hardness Threshold (IHT) offers a viable alternative undersampling technique, though it may compromise performance to some extent.
Conclusions: This study underscores the potential of ML models as effective tools for assessing SSI risk, warranting further clinical exploration to improve patient outcomes. By employing advanced ML techniques and robust validation methods, these models demonstrate promising accuracy and reliability in predicting SSI events, even in the face of significant class imbalances. In addition, using MCC in this study ensures a more reliable and robust evaluation of the model's predictive performance, particularly in the presence of an imbalanced dataset, where other metrics may fail to provide an accurate evaluation.
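The point about MCC versus other metrics can be illustrated with a minimal sketch. It is not the authors' evaluation code. It uses only the class counts reported in the abstract (64,793 patients, 1632 with SSI) and a hypothetical classifier that labels every patient as "no SSI". The function names `mcc` and `accuracy`, and the convention of returning 0 when the MCC denominator is zero, are assumptions made for this illustration.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when a marginal total is zero,
    # because the ratio is undefined in that case.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def accuracy(tp, fp, fn, tn):
    """Fraction of all cases that were labeled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Class counts taken from the abstract.
total, positives = 64_793, 1_632
negatives = total - positives

# Hypothetical classifier that predicts "no SSI" for every patient.
always_negative = dict(tp=0, fp=0, fn=positives, tn=negatives)

print(f"accuracy = {accuracy(**always_negative):.3f}")  # accuracy = 0.975
print(f"MCC      = {mcc(**always_negative):.3f}")       # MCC      = 0.000
```

Under these assumptions the trivial classifier scores about 97.5% accuracy while detecting no infections at all, whereas its MCC is 0. This gap is why a metric that uses all four confusion-matrix cells is the more informative choice on data this imbalanced.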