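Random forest
Support vector machine
Computer science
Silhouette
Artificial intelligence
Mean squared error
Cluster analysis
Feature (linguistics)
Pattern recognition (psychology)
Multilayer perceptron
Perceptron
Diabetes mellitus
Data mining
Machine learning
Artificial neural network
Statistics
Mathematics
Medicine
Endocrinology
Philosophy
Linguistics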
Authors
Md. Mehedi Hassan,Swarnali Mollick,Farhana Yasmin
Identifier
DOI:10.1016/j.health.2022.100112
Abstract
Diabetes mellitus is a hyperglycemic condition that poses a substantial threat to human health. Early detection of diabetes decreases morbidity and mortality. Because labeled data are scarce and diabetes datasets contain anomalies, it is exceedingly difficult to develop a trustworthy and accurate diabetes prognosis. The dataset and its feature groupings are clustered with K-means, with the elbow and silhouette methods used to guide the clustering. Various machine learning approaches are then applied to the cluster-based dataset to predict diabetes. We propose an unsupervised cluster-based feature grouping model for early diabetes identification using an open-source dataset containing the data of 520 diabetic patients. The maximum Accuracy (ACC) is 99.57% on the cluster-based dataset and 99.03% on the complete dataset. The best Precision (1.000) is obtained from multi-layer perceptron (MLP), random forest (RF), and k-Nearest Neighbors (KNN); the best Recall (0.984) from RF and support vector machine (SVM); the minimum mean squared error (MSE, 0.010) from RF; the maximum MSE (0.067) from KNN; and the best F1-Score (99.20%) from RF. A comparison table presents the expected outcomes and highlights the aspects of this research most likely to perform as intended. The preprocessed data and code are available in the GitHub repository at https://github.com/mhashiq/Early-stage-diabetes-risk-prediction.
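To make the clustering step in the abstract concrete, the following is a minimal sketch of how the elbow and silhouette methods are commonly used alongside K-means. It is not the authors' code (that lives in the repository linked above). It runs on synthetic two-dimensional points rather than the 520-patient dataset, uses only the Python standard library, and the helper names `kmeans`, `best_of`, and `silhouette` are hypothetical.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                updated.append(tuple(sum(axis) / len(members)
                                     for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # leave an empty cluster in place
        if updated == centroids:
            break  # converged
        centroids = updated
    inertia = sum(math.dist(p, centroids[l]) ** 2
                  for p, l in zip(points, labels))
    return labels, inertia


def best_of(points, k, n_init=5):
    """Run several seeded restarts and keep the lowest-inertia result."""
    runs = [kmeans(points, k, seed=s) for s in range(n_init)]
    return min(runs, key=lambda run: run[1])


def silhouette(points, labels):
    """Mean silhouette coefficient; assumes at least two non-empty clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Synthetic stand-in data (an assumption): two well-separated Gaussian blobs.
rng = random.Random(42)
points = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
          for cx, cy in [(0.0, 0.0), (10.0, 10.0)]
          for _ in range(30)]

results = {}
for k in range(2, 6):
    labels, inertia = best_of(points, k)
    results[k] = (inertia, silhouette(points, labels))

# Silhouette method: prefer the k with the highest mean silhouette.
best_k = max(results, key=lambda k: results[k][1])

for k, (inertia, sil) in results.items():
    print(f"k={k}  inertia={inertia:.2f}  silhouette={sil:.3f}")
print("k chosen by silhouette:", best_k)
```

For the elbow method, the inertia column is read (or plotted) against k, looking for the point where further increases in k stop reducing inertia sharply. The silhouette score gives a second, independent check on the same choice.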