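Random forest
Artificial intelligence
AdaBoost
Support vector machine
Machine learning
Ensemble learning
Naive Bayes classifier
Feature selection
Computer science
Boosting (machine learning)
Multilayer perceptron
Cervical cancer
Oversampling
Logistic regression
Pattern recognition (psychology)
Artificial neural network
Cancer
Medicine
Bandwidth (computing)
Internal medicine
Computer network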
Authors
Khandaker Mohammad Mohi Uddin,Abdullah Al Mamun,Anamika Chakrabarti,Rafid Mostafiz,Samrat Kumar Dey
Identifier
DOI:10.1016/j.neuri.2024.100169
Abstract
Cervical cancer has recently emerged as a leading cause of premature death among women. Around 85% of cervical cancer cases occur in underdeveloped countries, and several risk factors are associated with the disease. This study describes a novel predictive model that uses early screening and risk trends from individual health records to forecast cervical cancer patients' prognoses. It applies machine learning classification techniques to investigate the risk factors for cervical cancer, and uses a voting method to evaluate all models and select the most appropriate one. The dataset used in this study contains missing values and shows a significant class imbalance, so Random Oversampling was used as the sampling method. We used Principal Component Analysis (PCA) and XGBoost feature selection to determine the most important features. For prediction, we used several machine learning classifiers: Support Vector Machines (SVM), Random Forest (RF), k-Nearest Neighbors (KNN), Decision Trees (DT), Naive Bayes (NB), Logistic Regression (LR), AdaBoost (AdB), Gradient Boosting (GB), Multilayer Perceptron (MLP), and Nearest Centroid Classifier (NCC). To demonstrate the efficacy of the proposed model, we compared the models' accuracy, sensitivity, and specificity. Combining Random Oversampling with an ensemble ML method (hard voting over RF and MLP) achieved 99.19% accuracy. The results show that the hard-voting ensemble classifier handles the classification problem better when the number of features is reduced and the severe class imbalance is addressed.
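The two techniques the abstract combines, Random Oversampling and hard voting, can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names `random_oversample` and `hard_vote`, the toy data, and the tie-breaking rule (ties go to the smallest label) are assumptions made for illustration. A real pipeline would more likely rely on library implementations and trained RF and MLP models.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement from the existing rows of this class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def hard_vote(predictions_per_model):
    """Majority vote per sample across models (hypothetical helper).

    predictions_per_model: one list of predicted labels per model,
    all of the same length. Ties go to the smallest label, which is
    an assumption of this sketch, not something the abstract states.
    """
    voted = []
    for sample_preds in zip(*predictions_per_model):
        counts = Counter(sample_preds)
        top = max(counts.values())
        voted.append(min(lbl for lbl, c in counts.items() if c == top))
    return voted


# Toy imbalanced data: four negative samples, one positive sample.
rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = [0, 0, 0, 0, 1]
bal_rows, bal_labels = random_oversample(rows, labels)

# Made-up predictions standing in for three fitted classifiers.
ensemble_pred = hard_vote([[1, 0, 1], [1, 0, 0], [0, 0, 1]])
```

Oversampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data. With only two voters, as in the RF and MLP ensemble, every disagreement is a tie, so the tie-breaking rule directly affects the result.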