GA-Stacking: A New Stacking-Based Ensemble Learning Method to Forecast the COVID-19 Outbreak

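Random forest; Stacking; Ensemble learning; Decision tree; Computer science; Artificial intelligence; Machine learning; Boosting (machine learning); Ensemble forecasting; Lasso (programming language); Gradient boosting; Random subspace method; Data mining; Classifier (UML); Physics; World Wide Web; Nuclear magnetic resonance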

Authors

Walaa N. Ismail, Hessah A. Alsalamah, Ebtsam Mohamed

Source

Journal: Computers, Materials & Continua; Date: 2022-10-31; Volume (issue): 74 (2): 3945-3976; Citations: 3

Links

techscience.com doi.org

Identifier

DOI: 10.32604/cmc.2023.031194

Abstract

Given the rising number of COVID-19 cases, Ensemble Machine Learning (EML) can be an effective tool for combating the pandemic. An ensemble of classifiers, especially one built by stacking, can improve on the performance of single machine learning (ML) classifiers. Stacking trains heterogeneous base learners in parallel and combines their predictions with a meta-model to produce the final prediction. However, ensemble performance often degrades as more learners are added without proper selection. Therefore, the goal of this paper is to develop and evaluate a generic, data-independent predictive method using stacking-based ensemble learning optimized by a Genetic Algorithm (GA), called GA-Stacking, for outbreak prediction and health decision-support processes. At its first level, GA-Stacking uses five well-known classifiers: Decision Tree (DT), Random Forest (RF), Ridge regression, Least Absolute Shrinkage and Selection Operator (LASSO), and eXtreme Gradient Boosting (XGBoost). It also uses the GA to determine the number, combination, and trust of these base classifiers, with the Mean Squared Error (MSE) as the fitness function. At the second level of the stacked ensemble, a Linear Regression (LR) model produces the final prediction. The model was evaluated on a publicly available dataset from the Center for Systems Science and Engineering, Johns Hopkins University, consisting of 10,722 data samples. The experimental results indicated that the GA-Stacking model achieved outstanding performance, with an overall accuracy of 99.99% for the three selected countries. Furthermore, the proposed model achieved good performance when compared with existing bagging-based approaches. The proposed model can be used to predict the pandemic outbreak accurately and may be applied as a generic, data-independent model to predict the epidemic trend in other countries when comparing preventive and control measures.
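The GA step described in the abstract (choosing which base learners to keep and how much to trust each, scored by MSE) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes each base learner has already produced predictions on a validation set; the data in `Y_TRUE` and `BASE_PREDS`, all function names, and the GA settings (population size, mutation rate, uniform crossover) are hypothetical. For brevity, a trust-weighted average stands in for the Linear Regression meta-model.

```python
import random

# Hypothetical validation targets and per-learner predictions (illustrative only).
Y_TRUE = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0]
BASE_PREDS = [
    [11.0, 13.5, 16.0, 20.5, 25.0, 31.5],  # learner biased high
    [9.0, 11.0, 13.5, 18.0, 22.5, 29.0],   # learner biased low
    [10.5, 11.0, 16.5, 18.0, 25.5, 29.0],  # noisy learner
    [14.0, 14.0, 14.0, 14.0, 14.0, 14.0],  # weak learner (constant)
]

def mse(y_true, y_pred):
    """Mean squared error; used as the GA fitness (lower is better)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def combine(chromosome, base_preds):
    """Trust-weighted average of the selected learners, or None if none is active.

    A chromosome is a list of (selected, trust) pairs, one per base learner.
    """
    active = [(w, p) for (on, w), p in zip(chromosome, base_preds) if on and w > 0]
    if not active:
        return None
    total = sum(w for w, _ in active)
    return [sum(w * p[i] for w, p in active) / total for i in range(len(active[0][1]))]

def fitness(chromosome, base_preds, y_true):
    combined = combine(chromosome, base_preds)
    return float("inf") if combined is None else mse(y_true, combined)

def crossover(a, b, rng):
    """Uniform crossover: each gene is taken from one parent at random."""
    return [ga if rng.random() < 0.5 else gb for ga, gb in zip(a, b)]

def mutate(chromosome, rng, rate=0.2):
    """Flip the selection bit and/or perturb the trust, each with probability `rate`."""
    out = []
    for on, w in chromosome:
        if rng.random() < rate:
            on = not on
        if rng.random() < rate:
            w = min(1.0, max(0.0, w + rng.gauss(0, 0.2)))
        out.append((on, w))
    return out

def ga_select(base_preds, y_true, pop_size=30, generations=40, seed=0):
    """Search for the subset and trust values that minimise validation MSE."""
    rng = random.Random(seed)
    n = len(base_preds)
    # Seed the population with each single learner, so that (with elitism)
    # the result is never worse than the best individual learner.
    pop = [[(i == j, 1.0) for j in range(n)] for i in range(n)]
    pop += [[(rng.random() < 0.5, rng.random()) for _ in range(n)]
            for _ in range(pop_size - n)]
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(c, base_preds, y_true))
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    best = min(pop, key=lambda c: fitness(c, base_preds, y_true))
    return best, fitness(best, base_preds, y_true)

if __name__ == "__main__":
    best, best_mse = ga_select(BASE_PREDS, Y_TRUE)
    for idx, (on, w) in enumerate(best):
        print(f"learner {idx}: selected={on}, trust={w:.2f}")
    print(f"ensemble MSE: {best_mse:.4f}")
```

In a fuller pipeline, the prediction vectors would come from trained base models, and the selected learners' outputs would be passed to a fitted meta-model rather than averaged.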
