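Lasso (programming language)
Gene
Discriminant
Support vector machine
Machine learning
Computational biology
Candidate gene
Cyclin-dependent kinase 1
Computer science
Biology
Artificial intelligence
Genetics
Cell cycle
World Wide Web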
Authors
Md. Al Mehedi Hasan, Md. Maniruzzaman, Jungpil Shin
Identifier
DOI:10.1109/tcbb.2023.3322753
Abstract
Biomarkers associated with hepatocellular carcinoma (HCC) are of great importance for better understanding biological response mechanisms to internal or external intervention. This study aimed to identify key candidate genes for HCC using machine learning (ML) and statistics-based bioinformatics models. Differentially expressed genes (DEGs) were identified using limma, and the genes common to the DEGs from four datasets were then selected. Protein-protein interaction networks were then constructed using STRING, and Cytoscape was used to determine hub genes, significant modules, and their associated genes. Simultaneously, three ML-based techniques, namely support vector machine (SVM), least absolute shrinkage and selection operator-logistic regression (LASSO-LR), and partial least squares-discriminant analysis (PLS-DA), were applied to determine the discriminative genes of HCC from the common DEGs. Moreover, metadata of hub genes were compiled by listing all hub genes from existing studies, so that other findings could be incorporated into the analysis. Finally, seven key candidate genes (ASPM, CCNB1, CDK1, DLGAP5, KIF20A, MT1X, and TOP2A) were identified by intersecting the hub genes, the significant module genes, the discriminative genes from SVM, LASSO-LR, and PLS-DA, and the meta hub genes from existing studies. Three further independent test datasets were used to validate these seven key candidate genes using the AUC computed from ROC curves.
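The final two steps described in the abstract, intersecting several gene lists and scoring candidates by AUC, can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `intersect_gene_sets` and `auc_from_scores`, the placeholder gene labels, and all numbers are assumptions made up for illustration, and the abstract does not say whether the AUC was computed per gene or from a classifier. The AUC here uses the standard rank interpretation (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one).

```python
from functools import reduce


def intersect_gene_sets(gene_sets):
    """Return the sorted genes that appear in every collection in `gene_sets`."""
    gene_sets = [set(g) for g in gene_sets]
    if not gene_sets:
        return []
    return sorted(reduce(lambda a, b: a & b, gene_sets))


def auc_from_scores(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical placeholder gene lists, NOT results from the study.
toy_sets = {
    "hub": {"GENE_A", "GENE_B", "GENE_C"},
    "module": {"GENE_A", "GENE_B", "GENE_D"},
    "svm": {"GENE_A", "GENE_B", "GENE_E"},
    "lasso_lr": {"GENE_A", "GENE_B", "GENE_C"},
    "pls_da": {"GENE_A", "GENE_B", "GENE_F"},
    "meta_hub": {"GENE_A", "GENE_B", "GENE_G"},
}
key_genes = intersect_gene_sets(toy_sets.values())

# Hypothetical scores for six samples (1 = tumor, 0 = normal), also made up.
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc_from_scores(toy_scores, toy_labels)

print(key_genes, round(toy_auc, 3))
```

In practice a library routine such as scikit-learn's `roc_auc_score` would normally replace the hand-written AUC; the pairwise version is used here only to keep the sketch dependency-free.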