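Random forest
Cluster analysis
Computer science
Benchmarking
Machine learning
Artificial intelligence
k-means clustering
Predictive modelling
Identification (biology)
Data mining
Plant
Marketing
Business
Biology
Identifier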
DOI:10.1166/jmihi.2020.3000
Abstract
Importance: Diabetes is a chronic disease that can cause long-term damage to various parts of the body. To prevent diabetic complications, there have been various attempts to combine machine learning with medicine to build models that predict whether a patient has diabetes, but prediction of this disease still has room for improvement. Hybrid prediction models offer a newer approach and in most cases achieve better results than single classical machine learning algorithms.
Objective: To develop a high-accuracy model for predicting different onsets of type 2 diabetes. Improving the integration of clustering and classification techniques can help detect diabetes at an earlier stage without deleting observations that have missing values, and can reduce insignificant features so that only the most relevant features are kept during data collection.
Methods: We implement a noise-reduction technique based on K-means clustering, followed by Random Forest and XGBoost classifiers, to extract the unknown hidden features of the dataset and obtain more accurate results.
Results: Prediction accuracy is assessed by benchmarking our model against up-to-date predictive models and common classification algorithms. With an accuracy of 97.53% under 10-fold cross-validation, our T2ML model is more accurate than other experiments reported in the literature and than various conventional classification algorithms.
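The abstract does not say how the K-means noise-reduction step works, so the following is only a minimal sketch of one common reading: cluster the records, then drop any record whose class label disagrees with the majority label of its cluster. The function names `kmeans` and `filter_noise`, the choice `k=2`, and the toy data are all hypothetical and are not taken from the paper or from the T2ML model. In a pipeline like the one described, the retained rows would then be passed to the classifiers (Random Forest, XGBoost) and scored with 10-fold cross-validation.

```python
import random
from collections import Counter


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [None] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        new_assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assign == assign:  # assignments stopped changing: converged
            break
        assign = new_assign
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def filter_noise(points, labels, k=2, seed=0):
    """Return indices of rows whose label matches their cluster's majority label.

    Assumption: "noise" means a record that sits in a cluster dominated by
    the other class. This is an illustrative rule, not the paper's stated one.
    """
    assign = kmeans(points, k, seed=seed)
    majority = {
        c: Counter(l for l, a in zip(labels, assign) if a == c).most_common(1)[0][0]
        for c in set(assign)
    }
    return [i for i, (l, a) in enumerate(zip(labels, assign)) if l == majority[a]]


# Toy data (made up for illustration): two well-separated groups,
# each containing one record whose label disagrees with its neighbours.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = [0, 0, 0, 1, 1, 1, 1, 0]

keep = filter_noise(points, labels, k=2)
print(keep)  # [0, 1, 2, 4, 5, 6]
```

Rows 3 and 7 are dropped because each carries the minority label in its cluster. Whether discarding such rows helps or removes genuinely informative hard cases depends on the dataset, so the filter is best validated inside the cross-validation loop rather than applied once to the full data.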